===========================
Markup Using ASCIIMathML
===========================
:Author: Mark Nodine
:Contact: mnodine@alum.mit.edu
:Revision: $Revision: 762 $
:Date: $Date: 2006-01-27 11:47:47 -0600 (Fri, 27 Jan 2006) $
:Copyright: This document has been placed in the public domain.
.. |amml| replace:: ASCIIMathML
.. contents::
This document describes the |amml| input notation for producing
mathml markup to represent mathematical formulas. |amml| was designed
by Peter Jipsen of Chapman University, who did a Javascript
implementation which you can find at the `ASCIIMathML web page`_.
In order to use this feature of prest, your perl installation will
have to include Text::ASCIIMathML, which is available on CPAN.
.. _ASCIIMathML web page: http://www1.chapman.edu/~jipsen/mathml/asciimath.html
ASCIIMathML markup
==================
The syntax is very permissive and does not generate syntax
errors. This allows mathematically incorrect expressions to be
displayed, which is important for teaching purposes. It also causes
less frustration when previewing formulas.
.. default-role:: mathml
If you encode ``x^2`` or ``a_(mn)`` or ``a_{mn}`` or ``(x+1)/y`` or
``sqrtx``, you pretty much get what you expect, namely `x^2` or
`a_(mn)` or `a_{mn}` or `(x+1)/y` or `sqrtx`. The choice of grouping
parenthesis is up to you (they don't have to match either). If the
displayed expression can be parsed uniquely without them, they are
omitted. Most LaTeX commands are also supported, so the last two
formulas above can also be written as ``\frac{x+1}{y}`` and
``\sqrt{x}``.
The parser uses no operator precedence and only respects the grouping
brackets, subscripts, superscript, fractions and (square) roots. This
is done for reasons of efficiency and generality. The resulting MathML
code can quite easily be processed further to ensure additional
syntactic requirements of any particular application.
The grammar
-----------
Here is a definition of the grammar used to parse
ASCIIMathML expressions. In the Backus-Naur form given below, the
letter on the left of the ``::=`` represents a category of symbols that
could be one of the possible sequences of symbols listed on the right.
The vertical bar ``|`` separates the alternatives. ::
c ::= [A-z] | numbers | greek letters | other constant symbols
(see below)
u ::= 'sqrt' | 'text' | 'bb' | other unary symbols for font commands
b ::= 'frac' | 'root' | 'stackrel' | 'newcommand' | 'newsymbol'
binary symbols
l ::= ( | [ | { | (: | {: left brackets
r ::= ) | ] | } | :) | :} right brackets
S ::= c | lEr | uS | bSS | "any" simple expression
E ::= SE | S/S |S_S | S^S | S_S^S expression (fraction, sub-,
super-, subsuperscript)
The translation rules
---------------------
Each terminal symbol is translated into a corresponding MathML
node. The constants are mostly converted to their respective Unicode
symbols. The other expressions are converted as follows::
lSr -> <mrow>lSr</mrow>
(note that any pair of brackets can be used to
delimit subexpressions, they don't have to match)
sqrt S -> <msqrt>S'</msqrt>
text S -> <mtext>S'</mtext>
"any" -> <mtext>any</mtext>
frac S1 S2 -> <mfrac>S1' S2'</mfrac>
root S1 S2 -> <mroot>S2' S1'</mroot>
stackrel S1 S2 -> <mover>S2' S1'</mover>
S1/S2 -> <mfrac>S1' S2'</mfrac>
S1_S2 -> <msub>S1 S2'</msub>
S1^S2 -> <msup>S1 S2'</msup>
S1_S2^S3 -> <msubsup>S1 S2' S3'</msubsup> or
<munderover>S1 S2' S3'</munderover> (in some cases)
In the rules above, the expression ``S'`` is the same as ``S``, except that if
``S`` has an outer level of brackets, then ``S'`` is the expression inside
these brackets.
Matrices
--------
A simple syntax for matrices is also recognized::
l(S11,...,S1n),(...),(Sm1,...,Smn)r
or ::
l[S11,...,S1n],[...],[Sm1,...,Smn]r.
Here ``l`` and ``r`` stand for any of the left and right
brackets (just like in the grammar they do not have to match). Both of
these expressions are translated to ::
<mrow>l<mtable><mtr><mtd>S11</mtd>...
<mtd>S1n</mtd></mtr>...
<mtr><mtd>Sm1</mtd>...
<mtd>Smn</mtd></mtr></mtable>r</mrow>.
Note that each row must have the same number of expressions, and there
should be at least two rows.
LaTeX matrix commands are not recognized.
Tokenization
------------
The input formula is broken into tokens using a "longest matching
initial substring search". Suppose the input formula has been
processed from left to right up to a fixed position. The longest
string from the list of constants (given below) that matches the
initial part of the remainder of the formula is the next token. If
there is no matching string, then the first character of the remainder
is the next token. The symbol table at the top of the ASCIIMathML.js
script specifies whether a symbol is a math operator (surrounded by a
``<mo>`` tag) or a math identifier (surrounded by a ``<mi>``
tag). For single character tokens, letters are treated as math
identifiers, and non-alphanumeric characters are treated as math
operators. For digits, see "Numbers" below.
Spaces are significant when they separate characters and thus prevent
a certain string of characters from matching one of the
constants. Multiple spaces and end-of-line characters are equivalent
to a single space.
Numbers
-------
A string of digits, optionally followed by a decimal point (a period)
and another string of digits, is parsed as a single token and
converted to a MathML number, i.e., enclosed with the ``<mn>`` tag.
Greek letters
-------------
* Lowercase letters
``alpha`` ``beta`` ``chi`` ``delta`` ``epsilon`` ``eta`` ``gamma`` ``iota``
``kappa`` ``lambda`` ``mu`` ``nu`` ``omega`` ``phi`` ``pi`` ``psi`` ``rho``
``sigma`` ``tau`` ``theta`` ``upsilon`` ``xi`` ``zeta``
* Uppercase letters
``Delta`` ``Gamma`` ``Lambda`` ``Omega`` ``Phi`` ``Pi`` ``Psi`` ``Sigma``
``Theta`` ``Xi``
* Variants
``varepsilon`` ``varphi`` ``vartheta``
Standard functions
------------------
``sin`` ``cos`` ``tan`` ``csc`` ``sec`` ``cot`` ``sinh`` ``cosh``
``tanh`` ``log`` ``ln`` ``det`` ``dim`` ``lim`` ``mod`` ``gcd``
``lcm`` ``min`` ``max``
Operation symbols
-----------------
.. mathml:: newcommand "bs" "\\"
======== ============================================= ============== =======
Type Description Entity MathML
======== ============================================= ============== =======
``+`` ``+`` ``+`` `+`
``-`` ``-`` ``-`` `-`
``*`` Mid dot ⋅ `*`
``**`` Star ⋆ `**`
``//`` ``/`` ``/`` `//`
``\\`` ``\`` ``\`` `bs`
``xx`` Cross product × `xx`
``-:`` Divided by ÷ `-:`
``@`` Compose functions ∘ `@`
``o+`` Circle with plus ⊕ `o+`
``ox`` Circle with x ⊗ `ox`
``o.`` Circle with dot ⊙ `o.`
``sum`` Sum for sub- and superscript ∑ `sum`
``prod`` Product for sub- and superscript ∏ `prod`
``^^`` Logic "and" ∧ `^^`
``^^^`` Logic "and" for sub- and superscript ⋀ `^^^`
``vv`` Logic "or" ∨ `vv`
``vvv`` Logic "or" for sub- and superscript ⋁ `vvv`
``nn`` Logic "intersect" ∩ `nn`
``nnn`` Logic "intersect" for sub- and superscript ⋂ `nnn`
``uu`` Logic "union" ∪ `uu`
``uuu`` Logic "union" for sub- and superscript ⋃ `uuu`
======== ============================================= ============== =======
Relation symbols
----------------
======== ============================================= ============== =======
Type Description Entity MathML
======== ============================================= ============== =======
``=`` ``=`` = `=`
``!=`` Not equals ≠ `!=`
``<`` ``<`` < `<`
``>`` ``>`` > `>`
``<=`` Less than or equal ≤ `<=`
``>=`` Greater than or equal ≥ `>=`
``-lt`` Precedes ≺ `-lt`
``>-`` Succeeds ≻ `>-`
``in`` Element of ∈ `in`
``!in`` Not an element of ∉ `!in`
``sub`` Subset ⊂ `sub`
``sup`` Superset ⊃ `sup`
``sube`` Subset or equal ⊆ `sube`
``supe`` Superset or equal ⊇ `supe`
``-=`` Equivalent ≡ `-=`
``~=`` Congruent to ≅ `~=`
``~~`` Asymptotically equal to ≈ `~~`
``prop`` Proportional to ∝ `prop`
======== ============================================= ============== =======
Logical symbols
---------------
======== ============================================= ================ =======
Type Description Entity MathML
======== ============================================= ================ =======
``and`` And " and " `and`
``or`` Or " or " `or`
``not`` Not ¬ `not`
``=>`` Implies ⇒ `=>`
``if`` If " if " `if`
``iff`` If and only if ⇔ `iff`
``AA`` For all ∀ `AA`
``EE`` There exists ∃ `EE`
``_|_`` Perpendicular, bottom ⊥ `_|_`
``TT`` Top ⊤ `TT`
``|--`` Right tee ⊢ `|--`
``|==`` Double right tee ⊨ `|==`
======== ============================================= ================ =======
Grouping brackets
-----------------
======== ============================================= ============== =======
Type Description Entity MathML
======== ============================================= ============== =======
``(`` ``(`` ``(`` `(`
``)`` ``)`` ``)`` `)`
``[`` ``[`` ``[`` `[`
``]`` ``]`` ``]`` `]`
``{`` ``{`` ``{`` `{`
``}`` ``}`` ``}`` `}`
``(:`` Left angle bracket ⟨ `(:`
``:)`` Right angle bracket ⟩ `:)`
``{:`` Invisible left grouping element
``:}`` Invisible right grouping element
======== ============================================= ============== =======
Miscellaneous symbols
---------------------
=========== ====================================== ================== =======
Type Description Entity MathML
=========== ====================================== ================== =======
``int`` Integral ∫ `int`
``oint`` Countour integral ∮ `oint`
``del`` Partial derivative &del; `del`
``grad`` Gradient ∇ `grad`
``+-`` Plus or minus ± `+-`
``O/`` Null set ∅ `O/`
``oo`` Infinity ∞ `oo`
``aleph`` Hebrew letter aleph ℵ `aleph`
``/_`` Angle ∠ `/_`
``:.`` Therefore ∴ `:.`
``...`` Ellipsis ... `...`
``cdots`` Three centered dots ⋯ `cdots`
``\<sp>`` Non-breaking space (<sp> means space) `|\ |`
``quad`` Quad space `|quad|`
``diamond`` Diamond ⋄ `diamond`
``square`` Square □ `square`
``|__`` Left floor ⌊ `|__`
``__|`` Right floor ⌋ `__|`
``|~`` Left ceiling ⌈ `|~`
``~|`` Right ceiling ⌉ `~|`
``CC`` Complex numbers ℂ `CC`
``NN`` Natural numbers ℕ `NN`
``QQ`` Rational numbers ℚ `QQ`
``RR`` Real numbers ℝ `RR`
``ZZ`` Integers ℤ `ZZ`
=========== ====================================== ================== =======
Arrows
------
======== ============================================= ============== =======
Type Description Entity MathML
======== ============================================= ============== =======
``uarr`` Up arrow ↑ `uarr`
``darr`` Down arrow ↓ `darr`
``rarr`` Right arrow → `rarr`
``->`` Right arrow → `->`
``larr`` Left arrow ← `larr`
``harr`` Horizontal (two-way) arrow ↔ `harr`
``rArr`` Right double arrow ⇒ `rArr`
``lArr`` Left double arrow ⇐ `lArr`
``hArr`` Horizontal double arrow ⇔ `hArr`
======== ============================================= ============== =======
Accents
-------
========== =================== ========
Type Description MathML
========== =================== ========
``hat x`` Hat over x `hat x`
``bar x`` Bar over x `bar x`
``ul x`` Underbar under x `ul x`
``vec x`` Right arrow over x `vec x`
``dot x`` Dot over x `dot x`
``ddot x`` Double dot over x `ddot x`
========== =================== ========
Font commands
-------------
========== ======================== ========
Type Description MathML
========== ======================== ========
``bb A`` Bold A `bb A`
``bbb A`` Double-struck A `bbb A`
``cc A`` Calligraphic (script) A `cc A`
``tt A`` Teletype (monospace) A `tt A`
``fr A`` Fraktur A `fr A`
``sf A`` Sans-serif A `sf A`
========== ======================== ========
Defining new commands and symbols
---------------------------------
It is possible to define new commands and symbols using the
``newcommand`` and ``newsymbol`` binary operators. The former defines a
macro that gets expanded and reparsed as ASCIIMathML and the latter
defines a constant that gets used as a math operator (``<mo>``)
element. Both of the arguments must be text, optionally enclosed in
grouping operators.
.. mathml:: newcommand "DDX" "{:d/dx:}"
.. mathml:: newsymbol{"!le"}{"≰"}
For example, ``newcommand "DDX" "{:d/dx:}"`` would define a new
command ``DDX``. It could then be invoked like ``DDXf(x)``, which
would expand to ``{:d/dx:}f(x)`` and appear `DDXf(x)`. The text
``newsymbol{"!le"}{"≰"}`` could be used to create a symbol you
could invoke with ``!le``, as in ``a !le b``, which appears `a !le b`.
Attributes for mstyle
=====================
For point of reference, this document was compiled with ``-D
mstyle='mathcolor=red'``.
* ``displaystyle``
The displaystyle attribute for the element, if specified. One of the
values "true" or "false". If the displaystyle is false, then fractions
are represented with a smaller font size and the placement of
subscripts and superscripts of sums and integrals changes.
* ``mathvariant``
The mathvariant attribute for the element, if specified. One of the
values "normal", "bold", "italic", "bold-italic", "double-struck",
"bold-fraktur", "script", "bold-script", "fraktur", "sans-serif",
"bold-sans-serif", "sans-serif-italic", "sans-serif-bold-italic", or
"monospace".
* ``mathsize``
The mathsize attribute for the element, if specified. Either "small",
"normal" or "big", or of the form "number v-unit".
* ``mathfamily``
A string representing the font family.
* ``mathcolor``
The mathcolor attribute for the element, if specified. It be in one of
the forms "#rgb" or "#rrggbb", or should be an html-color-name.
* ``mathbackground``
The mathbackground attribute for the element, if specified. It should
be in one of the forms "#rgb" or "#rrggbb", or an html-color-name, or
the keyword "transparent".