NAME
Parse::Marpa::Doc::Internals - Marpa Internals
OVERVIEW
Knowledge of Marpa internals is not necessary to use Marpa. But you can make better use of Marpa if you understand how it works. This document assumes that you have read the other Marpa documentation, and that you have worked with another parser generator, or have a general knowledge of parsing.
EARLEY SETS
To speak pedantically, the algorithm in Earley's 1970 parsing article does not actually parse in the strict sense. It is a recognizer, not a parser.
In Marpa, Earley sets are built as tokens are recognized. If an Earley item of the correct form is in the right place in these sets after input is finished, the input is recognized as one of those described by the grammar. Once the recognition phase is done, the phase that does parsing in the strict sense uses the Earley sets as its input.
Each of Marpa's earlemes corresponds one-to-one to an Earley set. An Earley set is a set of Earley items. Each Earley item has a QDFA state, an origin earleme and a dot earleme. An Earley item's dot earleme is also called its current earleme. Here's a representation of an Earley item, as you might see it in debugging output from Marpa:
S5@6-7
This Earley item is for QDFA state 5 (that is the "S5" part). The "@6-7" part says that this Earley item originates at earleme 6, and is current at earleme 7. The number of an Earley item's current, or dot, earleme is always the same as the number of the Earley set that contains the item.
(Note to experts: Those familiar with Earley parsing will note that S5@6-7 does not look like a traditional Earley item. QDFA states are not traditional -- they are my elaboration of an invention by Aycock and Horspool. The use of the at-sign as punctuation follows Grune and Jacobs. Finally, in the traditional notation for an Earley item, the dot earleme is not included at all -- you're supposed to figure out from the context which Earley set an Earley item belongs to.)
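The notation maps naturally onto a small data structure. Here is a minimal sketch (my illustration, not Marpa's actual implementation) of an Earley item with its QDFA state, origin earleme, and dot earleme, rendered in the S5@6-7 style:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EarleyItem:
    state: int    # QDFA state number
    origin: int   # earleme at which the item originates
    dot: int      # current (dot) earleme

    def __str__(self) -> str:
        return f"S{self.state}@{self.origin}-{self.dot}"

print(EarleyItem(state=5, origin=6, dot=7))  # prints S5@6-7
```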
QDFA STATES
I will mention QDFA's (quasi-deterministic finite automata) frequently and NFA's (non-deterministic finite automata) sometimes. All you need to know about NFA's is that the QDFA's are built from them. NFA state numbers sometimes appear in the diagnostic outputs. They can be ignored.
About QDFA's, it will be necessary to know a bit more. Let's start with an example of another Earley item:
S1@2-2
This states that this item is for QDFA state 1; that it originates at earleme 2; and that it is currently at earleme 2. We can get a description of the QDFA states with the show_QDFA method. Here's what it says about QDFA state 1:
S1: predict; 1,5
e ::= . e op e
e ::= . number
<e> => S3
<number> => S4
For those who like the big picture or simply like to skim ahead, the entire show_QDFA output, along with full trace and debugging output for this example, is in the appendices.
The first line of show_QDFA's description of QDFA state 1 begins with its name: S1. After it comes the predict tag. That indicates this item was created by a prediction. That's one of the four ways an Earley item can be created. There's more on that below.
The two numbers after the "predict" tag on the first line may be ignored. They are the numbers of the NFA states that were combined into this QDFA state. Marpa uses QDFA states to track the parse. Every QDFA state has one or more LR(0) items. The second and third lines in the display represent the LR(0) items. They are rules with a dot added to indicate how far recognition has proceeded into the rule.
A dot between two symbols indicates that all symbols before the dot have been recognized, but that none of the symbols after the dot have been recognized. A dot at the beginning of an LR(0) item indicates that the rule is a prediction -- none of its symbols have yet been recognized. A dot at the end indicates that the rule is complete -- all of its symbols have been recognized. The location of the dot is called the dot position.
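A dotted rule can be sketched as a small structure. The following illustration (hypothetical, not Marpa's code) shows the dot position and the prediction and completion cases described above:

```python
class LR0Item:
    """A rule with a dot marking how far recognition has proceeded."""
    def __init__(self, lhs, rhs, dot=0):
        self.lhs, self.rhs, self.dot = lhs, tuple(rhs), dot
    def is_prediction(self):
        return self.dot == 0               # dot at the beginning
    def is_completion(self):
        return self.dot == len(self.rhs)   # dot at the end
    def __str__(self):
        syms = list(self.rhs)
        syms.insert(self.dot, ".")
        return f"{self.lhs} ::= " + " ".join(syms)

item = LR0Item("e", ["e", "op", "e"])
print(item)                  # prints: e ::= . e op e
print(item.is_prediction())  # True
```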
The last two lines in the display show transitions. The first line says that, on seeing an e, you transition to QDFA state 3. The second states that, on seeing a number, you transition to QDFA state 4.
It's important not to confuse Earley items with LR(0) items. Earley items are built from one or more LR(0) items. In traditional Earley parsing, each Earley item contained one and only one LR(0) item. This made the internals simpler, but was not as efficient. Marpa combines LR(0) items into QDFA states based on ideas in Aycock and Horspool 2002.
Each QDFA state is a statement about possible parses. The presence of a QDFA state in an Earley item means that the parses in its LR(0) items are possible at the Earley item's dot earleme. The dot position of every LR(0) item in the QDFA state corresponds to the dot earleme of the Earley item. Also, the origin of every LR(0) item is at the origin earleme of the Earley item.
In our example, the dots in all the LR(0) items in S1 (QDFA state 1) are at the beginning of their rules. This means the origin and dot earlemes must be the same, and from the "2-2" in the description of the Earley item S1@2-2, we can see that that is the case.
HOW EARLEY SETS ARE BUILT
New items are added to the Earley sets in four ways: scanning, completion, prediction, and initialization.
Scanning
Scanning adds Earley items to indicate which tokens have been seen in the input, and where. Suppose the Earley item S1@2-2 (mentioned above) is present at earleme 2. Marpa knows to instruct the lexer to look for a number token because there is a transition from S1 to S4 on number.
If the lexer finds a number token with a length (in earlemes) of 1 at earleme 2, a new Earley item S4@2-3 is added at earleme 3. (Earleme 3 because 3 is 2+1, the current earleme plus the token length.) Here (again from show_QDFA) is the description of QDFA state 4.
S4: 6
e ::= number .
We can see that there is only one LR(0) item in QDFA state 4, and that it has the dot pointer at the end of its rule. An LR(0) item with the dot pointer at the end of its rule is a completed rule.
Marpa calls the item that was looking for the scanned symbol the predecessor of the item added by the scan. In this example, S1@2-2 is the predecessor of S4@2-3. Any item which is added to an Earley set based on a QDFA state transition from a predecessor is called that predecessor's successor. In this example, S4@2-3 is the successor of S1@2-2.
An Example
Here's what S4@2-3 looks like in the Earley sets, after a successful scan of the input "2+2".
S4@2-3 [p=S1@2-2; t=2]
After the name of the Earley item is a link choice entry, shown in brackets. For brevity, and when the context makes the meaning clear, the link choice of an Earley item may simply be called a choice. This parse is not ambiguous, so all Earley items have at most one link choice entry.
In this case the link choice entry is for a token choice. Token choices are shown as [p=predecessor; t=token_value] pairs. In the above example, the token choice [p=S1@2-2; t=2] indicates that the token has a value of "2", and that, as mentioned above, the predecessor of S4@2-3 is S1@2-2.
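The scanning step can be sketched as follows. This is an illustrative model only, with assumed names: items are (state, origin) pairs, the index of an Earley set plays the role of the dot earleme, and transitions stands in for the QDFA transition table (here, S1's <number> => S4):

```python
transitions = {(1, "number"): 4}   # in S1, <number> => S4

def scan(earley_sets, earleme, symbol, value, length, links):
    """Add a successor for each item at `earleme` expecting `symbol`."""
    for predecessor in list(earley_sets[earleme]):
        state, origin = predecessor
        successor_state = transitions.get((state, symbol))
        if successor_state is not None:
            successor = (successor_state, origin)
            # the successor lands `length` earlemes further on
            earley_sets[earleme + length].add(successor)
            # record a token link choice: [p=predecessor; t=token_value]
            links.setdefault(successor, []).append(("p", predecessor, "t", value))

earley_sets = [set() for _ in range(4)]
earley_sets[2].add((1, 2))             # S1@2-2
links = {}
scan(earley_sets, 2, "number", "2", 1, links)
print(earley_sets[3])                  # {(4, 2)} -- that is, S4@2-3
```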
Completion
Whenever an Earley item with a completed rule is added to an Earley set, other Earley items may be added to that Earley set as a result. The item with the completed rule is called the cause, and any new item added as a result is called an effect.
Let's look at one example of a completion "cause and effect". As the cause in our example, we'll use S4@2-3, the Earley item in our example of a scanned item. Its state was QDFA state 4. QDFA state 4 contained the LR(0) item
e ::= number .
The dot is at the far right, so it's a completed rule whose left hand side is e.
S4@2-3 begins at earleme 2 and has a complete rule with e as its lhs, and therefore any Earley item at earleme 2 which is looking for an e is in luck. S5@0-2 is such an item. QDFA state 5 looks like this:
S5: 3
e ::= e op . e
<e> => S6
This says that in QDFA state 5, when you see an e, you transition to QDFA state 6. Here's S6@0-3, the item which is the effect of the completion:
S6@0-3 [p=S5@0-2; c=S4@2-3]
The meanings of the terms predecessor and successor are similar to what they are in token scanning. Since S6@0-3 comes from moving the dot forward in a rule in S5@0-2, we say S5@0-2 is the predecessor of S6@0-3, and that S6@0-3 is the successor of S5@0-2.
The same Earley item can be added to the same Earley set for multiple reasons. An item can be added because of one or more scanned tokens; because of one or more completions; or because of any combination of these.
Traditionally in Earley's algorithm, the same item is not added twice. Where an item comes from is not important in recognition. But parsing in the strict sense means determining the structure of the input, and that requires us to know why an Earley item was added to an Earley set. Every different reason for the existence of an Earley item in an Earley set is an alternative that can be followed when it comes time to create a parse structure.
Marpa uses the link choices to track the reasons why Earley items were added. Aycock and Horspool don't call them link choices, but the method is basically the one they give in their 2002 article.
We've already seen a token link choice in the scanning example. Earley item S6@0-3 has another example of a link choice: [p=S5@0-2; c=S4@2-3]. This is another kind of link choice, a completion link choice. This link choice says that one reason for S6@0-3 to exist was a completion, with S4@2-3 as the cause and S5@0-2 as the predecessor. Since this parse is unambiguous, there will be no other links, and this will remain the only reason for S6@0-3 to be in the Earley sets.
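The completion step can be sketched in the same illustrative model used above for scanning (assumed names; items are (state, origin) pairs): for each lhs symbol completed by the cause, look back to the cause's origin earleme for items expecting that symbol, and add their successors as effects.

```python
completed_lhs = {4: ["e"]}     # S4 holds the completed rule e ::= number .
transitions = {(5, "e"): 6}    # in S5, <e> => S6

def complete(earley_sets, earleme, cause, links):
    """Add effects for a cause item with a completed rule at `earleme`."""
    cause_state, cause_origin = cause
    for lhs in completed_lhs.get(cause_state, []):
        # look back to the cause's origin for items expecting `lhs`
        for predecessor in list(earley_sets[cause_origin]):
            effect_state = transitions.get((predecessor[0], lhs))
            if effect_state is not None:
                effect = (effect_state, predecessor[1])
                earley_sets[earleme].add(effect)
                # completion link choice: [p=predecessor; c=cause]
                links.setdefault(effect, []).append(("p", predecessor, "c", cause))

earley_sets = [set() for _ in range(4)]
earley_sets[2].add((5, 0))     # S5@0-2, the predecessor
earley_sets[3].add((4, 2))     # S4@2-3, the cause
links = {}
complete(earley_sets, 3, (4, 2), links)
print((6, 0) in earley_sets[3])   # True: S6@0-3 is the effect
```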
Prediction
A third source of Earley items is prediction. Whenever an Earley item is expecting to see an e, for example, it can also expect to see any symbols that are at the start of any rules that have an e on their left hand side.
Recall QDFA state 5:
S5: 3
e ::= e op . e
<e> => S6
QDFA state 5 is expecting an e. This means that the rules
e ::= . e op e
e ::= . number
can also be expected to originate at this earleme. Marpa is smart about grouping rules into QDFA states and both these rules are in QDFA state 1:
S1: predict; 1,5
e ::= . e op e
e ::= . number
<e> => S3
<number> => S4
This means that since S5@0-2 is at earleme 2, we should also expect to see an item for QDFA state 1 at earleme 2, and we do:
S1@2-2
Every rule in a predicted item will have the dot at the beginning. That means the origin and dot earlemes will always be the same.
Link choices are not recorded for predicted items. Prediction items are essential in the Earley sets, to get rules started. But they are not actually needed when the time comes for parsing and evaluation. That means they don't need to record link choices.
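Prediction fits the same illustrative model (assumed names again): a table mapping each QDFA state to the state holding the start rules it predicts, with the predicted item's origin set to the current earleme and no link choice recorded.

```python
predicted_state = {5: 1}   # S5 expects an e; e's start rules are grouped in S1

def predict(earley_sets, earleme):
    for state, _origin in list(earley_sets[earleme]):
        p = predicted_state.get(state)
        if p is not None:
            # a predicted item originates where it is predicted, so its
            # origin equals its dot earleme; no link choice is recorded
            earley_sets[earleme].add((p, earleme))

earley_sets = [set(), set(), set()]
earley_sets[2].add((5, 0))   # S5@0-2
predict(earley_sets, 2)
print((1, 2) in earley_sets[2])   # True: S1@2-2 has been predicted
```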
Initialization
Some Earley items are put into Earley set 0 to start things off. In our example they are
S0@0-0
S1@0-0
QDFA state 0 contains the start rules, with the dot pointer at the beginning. Only Earley set 0 will contain an Earley item for QDFA state 0. In our example, S1@0-0 contains rules predicted by S0@0-0.
HOW A SUCCESSFUL PARSE IS RECOGNIZED
As mentioned, what's usually called Earley's algorithm is just a recognizer, an algorithm to build the Earley sets. The input is recognized successfully if, in the last earleme, there is an Earley item for a completed start rule state. A completed start rule state is a QDFA state containing a completed start rule. A completed start rule is the LR(0) item for a start rule that has its dot position at the end.
At most two of the QDFA states can be completed start rule states. One is a special case. In a null parse, the start state, QDFA state 0, is also a completed start rule state.
The other completed start rule state is not hard to spot. Marpa always adds its own internal start rule, with a special internal start symbol as its lhs, and the original start symbol as its rhs. The completed start rule state for non-null parses contains the LR(0) item which has Marpa's internal start rule with the dot pointer at the end of that rule. The completed start rule state in our example is QDFA state 2:
S2: 8
e['] ::= e .
A parse is successful at earleme N if it contains an Earley item for a completed start rule state with an origin earleme of 0 and a dot earleme of N. Here's the Earley item which makes the parse in our example successful at earleme 3:
S2@0-3 [p=S0@0-0; c=S6@0-3]
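In the illustrative model used above (items as (state, origin) pairs), the recognition test is one line: is there an item at earleme N for a completed start rule state, originating at earleme 0?

```python
def parse_succeeded(earley_sets, n, completed_start_states):
    """True if earleme n holds an item for a completed start rule
    state originating at earleme 0 (the test described above)."""
    return any(state in completed_start_states and origin == 0
               for state, origin in earley_sets[n])

earley_sets = [set() for _ in range(4)]
earley_sets[3].add((2, 0))                     # S2@0-3
print(parse_succeeded(earley_sets, 3, {2}))    # True
print(parse_succeeded(earley_sets, 2, {2}))    # False: nothing at earleme 2
```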
THE PARSE BOCAGE
After a successful parse has been recognized, it can be evaluated. The main data structure of a Marpa evaluator is a parse bocage.
Each Marpa evaluator starts with parse criteria: the start and end earlemes of the parse, and its top symbol. Most often these are the defaults. The default start earleme is earleme 0. The default end earleme in offline mode is the furthest earleme reached by any token in the parse. The default top symbol of the parse is the start symbol of the grammar.
The parse bocage stores all possible parses meeting the evaluator's parse criteria. Marpa's parse bocages are adaptations of parse forests. Parse bocages, unlike parse forests, are graphs, meaning that they may contain not just parse trees, but also parses which contain cycles and are therefore infinitely ambiguous.
Parse bocages, like parse forests, consist of and-nodes and or-nodes. As in parse forests, the or-nodes represent choices between pieces of parse trees. In parse forests, the and-nodes represent productions of the original grammar. In Marpa's parse bocages, the and-nodes are binary -- they have at most two children. Marpa's and-nodes represent pieces of the productions of the original grammar. The form in which productions are found in the parse bocage's and-nodes is very similar to Chomsky Normal Form (CNF).
Marpa uses the binary, CNF-like version of the original grammar for four reasons. First, that's how it finds it in the Earley items. Every link choice contains at most two links, and so each Earley item corresponds to a production with at most two symbols on the right hand side. Second, the form in which the grammar occurs in the Earley sets is even more compact than a parse forest. Third, unlike the parse forest, it can handle parses with cycles.
Fourth, while Marpa could rewrite the grammar back into its original form at any point, binary trees are easier and faster to traverse and manipulate. If Marpa were to rewrite the parse data from the Earley sets back into the productions of the original grammar, it would be spending effort to create a slower and more cumbersome data structure.
Every and-node represents either a production of the original grammar, or a piece of one. In a parse bocage, and-nodes and or-nodes alternate. Every and-node has or-nodes as children, and every or-node has and-nodes as children. This means that many or-nodes are trivial -- they have only one and-node as a child, and represent a choice from a list containing only a single alternative.
Names of Parse Bocage Nodes
An or-node corresponds to an Earley item, together with selection criteria which guide and limit the choices which can be made from that Earley item. The name of an or-node consists of the name of an Earley item, followed by a label indicating the selection criteria. If an Earley item was reached through a predecessor link, the selection criteria are the rule currently being derived, and the current position in that rule.
For example, the or-node named S3@0-1R0:2 corresponds to Earley item S3@0-1. That Earley item was reached at position 2 in a derivation of rule 0 (e -> e . op e).
The current position in an or-node does not necessarily correspond to the dot position of the Earley item. Nulled symbols may have been skipped over. No Earley item represents a nulled symbol, but every nulled symbol is represented by a node in the parse bocage.
Or-nodes which aren't reached through predecessor links are reached via cause links, or are the top symbol. In these cases the selection criterion is a single symbol, either the completed symbol corresponding to the cause link, or whatever was chosen as the top symbol. An example of a non-predecessor or-node is S6@0-3L0. In this case the Earley item was S6@0-3, and the selection symbol was symbol 0 (e). Where a symbol is the selection criterion, only LR(0) items with that symbol on their left hand side are valid choices.
The Parse Bocage Grammar
Marpa typically displays a parse bocage as a parse bocage grammar. This is a grammar where each production has an and-node as its left hand side, and one or two symbols on the right hand side. The right hand side of a parse bocage production can be either one or two or-nodes, an or-node and a token value, or just a token value.
It may seem pointless to take a grammar, and come this far in the parsing, only to write out another grammar. But there is a big difference between the original grammar and the parse bocage grammar. The original grammar describes all parses which are allowed to occur in any possible input. The parse bocage grammar describes only parses which actually did occur with the specific input which was accepted by the recognizer. In the parse bocage grammar, the work of parsing has actually been done.
Examples of Parse Bocage Nodes
Discuss examples of Parse Bocage Nodes, Including:
At least one with a token value
At least one with cause link on the rhs
At least one with predecessor link on the rhs
At least one with a predecessor lhs (Sn@a-bRx.y)
At least one with a non-predecessor lhs (Sn@a-bLx)
THE PARSE TREES
THE EVALUATION STACK
GRAMMAR REWRITING
Marpa rewrites grammars, adding internal symbols and rules in the process. This rewriting does not affect the semantics, but it does show up when you examine the internals.
Marpa's internal symbols have tags at the end, enclosed in square brackets. This means all Marpa's internal symbols end in a right square bracket.
Adding a Start Rule
For convenience, many parsers add their own start rule and their own start symbol to grammars. Marpa is no exception. The new start symbol is the old one with "[']" suffixed. We saw a Marpa internal start symbol above: e[']. If the grammar allows a null parse, there will also be a nulling start symbol, with "['][]" suffixed.
Eliminating Proper Nullable Symbols
Nulling symbols are those which always produce the empty sentence. Nullable symbols are those which can produce the empty sentence. Non-nullable symbols are those which never produce the empty sentence.
Pedantically, nulling symbols are also nullable symbols. A proper nullable is a symbol that, in the grammar of interest, can produce either the empty sentence or a non-empty sentence. In other words, a proper nullable is any nullable symbol which is not a nulling symbol.
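The nullable symbols of a grammar can be computed with a standard fixpoint, sketched below on a tiny hypothetical grammar (my example, not Marpa's code). Rules are (lhs, rhs) pairs, and an empty rhs is an empty rule; "a" below derives both the empty sentence and "x", so it is a proper nullable, while "c2" derives only the empty sentence and is nulling.

```python
rules = [("a", []), ("a", ["x"]), ("c2", [])]   # "x" is a terminal

def nullable_symbols(rules):
    """Fixpoint: a symbol is nullable if some rule for it has an
    all-nullable (possibly empty) right hand side."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
    return nullable

print(sorted(nullable_symbols(rules)))   # ['a', 'c2']
```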
Nullable symbols have been a headache for previous versions of Earley parsers. Aycock and Horspool 2002 outlined a new approach for dealing with them. I use their ideas with modifications of my own.
Marpa rewrites its grammar to eliminate proper nullables. It does this by turning the original proper nullable into a non-nullable symbol, and creating a second symbol to serve as the original symbol's nullable variant. The non-nullable variant of the original symbol keeps the original name, but is no longer allowed to appear in places where it might be nulled. In places where the original symbol was nulled, the nulling variant is substituted. The name of the nulling variant is that of the original symbol with the nulling tag ("[]") suffixed.
The newly introduced nulling symbols will not appear on any left hand sides. Marpa marks nulling symbols internally and recognizes them directly. It does not need empty rules.
More difficult than dealing with proper nullables on the left hand side of rules are the cases where proper nullables appear on the right hand side of rules. These rules have to be replaced with new rules covering every possible combination of the non-nullable and nulling variants. That rewrite is described in the next section.
CHAF Rewriting
To deal with the splitting of rhs proper nullables into two symbols, one non-nullable and one nulling, Aycock and Horspool simply created new rules for all possible combinations on the right hand side. This factoring is exponential in the worst case. I don't like leaving exponential explosions in an algorithm, even unlikely ones. And I felt the generation of all possible combinations for an arbitrarily long right hand side might be inefficient in practice.
A result due to Chomsky shows that any grammar can be rewritten as a grammar with at most two symbols on the right hand side. Relaxing Chomsky's rewrite to allow right hand sides with any number of symbols, but at most two proper nullables, produces a rewrite I call CHAF (Chomsky-Horspool-Aycock Form).
CHAF changes the worst case to linear, and in practical cases lowers the multiplier. Here's an example of a CHAF rewrite from Marpa's own self-grammar. First, the rule:
production paragraph:
non structural production sentences,
production sentence,
non structural production sentences,
optional action sentence,
non structural production sentences.
This rule contains four proper nullables, reinforcing my fear that grammars written as test cases won't be the only ones with lots of proper nullables on the right hand side. non structural production sentences and optional action sentence are both proper nullables, and non structural production sentences appears three times.
Here's the output from show_rules, showing what Marpa did with this rule:
12: production-paragraph
-> non-structural-production-sentences
production-sentence
non-structural-production-sentences
action-sentence:optional
non-structural-production-sentences /* !useful */
99: production-paragraph ->
non-structural-production-sentences
production-sentence
production-paragraph[R12:2][x5e]
/* priority=0.3 */
100: production-paragraph ->
non-structural-production-sentences[]
production-sentence
production-paragraph[R12:2][x5e]
/* priority=0.1 */
101: production-paragraph ->
non-structural-production-sentences
production-sentence
production-paragraph[R12:2][x5e][]
/* priority=0.2 */
102: production-paragraph ->
non-structural-production-sentences[]
production-sentence
production-paragraph[R12:2][x5e][]
103: production-paragraph[R12:2][x5e] ->
non-structural-production-sentences
production-paragraph[R12:3][x60]
/* priority=0.3 */
104: production-paragraph[R12:2][x5e] ->
non-structural-production-sentences[]
production-paragraph[R12:3][x60]
/* priority=0.1 */
105: production-paragraph[R12:2][x5e] ->
non-structural-production-sentences
production-paragraph[R12:3][x60][]
/* priority=0.2 */
106: production-paragraph[R12:3][x60] ->
action-sentence:optional
non-structural-production-sentences
/* priority=0.3 */
107: production-paragraph[R12:3][x60] ->
action-sentence:optional[]
non-structural-production-sentences
/* priority=0.1 */
108: production-paragraph[R12:3][x60] ->
action-sentence:optional
non-structural-production-sentences[]
/* priority=0.2 */
Rule 12 is the original rule. Because Marpa has rewritten it, the rule is marked !useful, telling later stages in the precomputation to ignore it. Marpa breaks Rule 12 up into three pieces, each with no more than two proper nullables. Rules 99 to 102 are the first piece, with the first two symbols from Rule 12. Rules 103 to 105 are the second, with the 3rd symbol. Rules 106 to 108 are the third, with the 4th and 5th symbols from Rule 12.
Each piece is factored, so that every combination of nulling and non-nullable symbols is included. New symbols are introduced to be the left hand sides of the pieces. The tag "[R12:3]" indicates that this symbol is the left hand side for the piece of Rule 12 which begins at right hand side symbol 3 (the first symbol is symbol 0). The tags beginning with an "x", like "[x5e]", are arbitrary unique hex values, inserted to guarantee that the new symbols are unique.
This rule is a worst case for CHAF, because the last three symbols of the right hand side are all proper nullables. That means that the last two pieces of the original rule can be either empty or non-empty, and therefore that both of the newly created lhs symbols are also proper nullables.
There are now a total of 6 proper nullables: the original 4 plus the 2 symbols newly created to serve as left hand sides. This is why, in order to have only 2 proper nullables per piece, the original rule needed to be divided into 3 pieces. The newly created lhs symbols, because they are proper nullables, need to be split into nulling and non-nullable variants and factored, just like the proper nullables in the original rule.
Nonetheless this factoring can be done with 10 rules in CHAF, while the original Aycock-Horspool factoring (NNF) required 16. After more than 4 proper nullables, the advantage of CHAF becomes overwhelming. With 5 proper nullables, there would be 13 rules for CHAF versus 32 for NNF. With 6 proper nullables, 16 versus 64.
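The counts quoted above can be checked with a little arithmetic. NNF creates one rule per combination of nulled and non-nulled variants, 2**n for n proper nullables on one right hand side; the CHAF figures given here fit the linear formula 3n - 2 (my reading of the quoted numbers, not a formula taken from the Aycock-Horspool paper):

```python
def nnf_rule_count(n):
    return 2 ** n          # one rule per subset of the n proper nullables

def chaf_rule_count(n):
    return 3 * n - 2       # fits the 10/13/16 figures quoted above

for n in (4, 5, 6):
    print(n, chaf_rule_count(n), nnf_rule_count(n))
# 4 10 16
# 5 13 32
# 6 16 64
```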
The user's semantics are preserved, because Marpa, while splitting the rule into pieces and factoring the pieces, inserts logic to gather and preserve the values of child nodes. These values are presented to the user's actions as if no CHAF rewrite had occurred.
Converting Sequence Productions to BNF
Internally, Marpa converts productions specified as sequences into BNF productions. The conversion is done in a standard way. For example,
paragraphs: empty line separated paragraph sequence.
becomes
2: paragraphs -> paragraph[Seq:1-*][Sep:empty_line][x5]
3: paragraphs -> paragraph[Seq:1-*][Sep:empty_line][x5] empty-line
4: paragraph[Seq:1-*][Sep:empty_line][x5] -> paragraph
5: paragraph[Seq:1-*][Sep:empty_line][x5] ->
paragraph[Seq:1-*][Sep:empty_line][x5] empty-line paragraph
In the added symbol, the tag "[Seq:1-*]" indicates that this is a symbol for a sequence of from 1 to an infinite number of symbols, and the tag "[Sep:empty_line]" indicates that it is empty_line separated.
Here's another example, this time of a sequence without a separator:
definition paragraph: definition sequence. concatenate lines.
is written as the following BNF:
9: definition-paragraph -> definition[Seq:1-*][xa]
10: definition[Seq:1-*][xa] -> definition
11: definition[Seq:1-*][xa] -> definition[Seq:1-*][xa] definition
SELF-TUTORIALS
If you want to investigate internals more on your own, here are two "self-tutorials", which should make you pretty much an expert.
First, go through the show_earley_sets output in the appendix. For each Earley item in it, reason out how it came to be there.
Second, take the grammar used in the example and run it on the input text "2+2*3". While the parse in this document was not ambiguous, the grammar was. The grammar's ambiguity reveals itself when there is more than one operation in the input string. Get the show_earley_sets output from any one of the ambiguous parses and reason out how all of its Earley items came to be.
APPENDIX: DATA FOR THE EXAMPLE
Below are the MDL grammar and outputs for the example used in this document. The input text was "2+2".
The MDL Grammar
semantics are perl5. version is 0.211.8.
start symbol is E.
E: E, Op, E.
q{
my ($right_string, $right_value)
= ($_[2] =~ /^(.*)==(.*)$/);
my ($left_string, $left_value)
= ($_[0] =~ /^(.*)==(.*)$/);
my $op = $_[1];
my $value;
if ($op eq "+") {
$value = $left_value + $right_value;
} elsif ($op eq "*") {
$value = $left_value * $right_value;
} elsif ($op eq "-") {
$value = $left_value - $right_value;
} else {
croak("Unknown op: $op");
}
"(" . $left_string . $op . $right_string . ")==" . $value;
}.
E: Number.
q{
my $v0 = pop @_;
$v0 . "==" . $v0;
}.
Number matches qr/\d+/.
Op matches qr/[-+*]/.
the default action is q{
my $v_count = scalar @_;
return "" if $v_count <= 0;
return $_[0] if $v_count == 1;
"(" . join(";", @_) . ")";
}.
The show_symbols Output
0: e, lhs=[0 1], rhs=[0 2]
1: op, lhs=[], rhs=[0] terminal
2: number, lhs=[], rhs=[1] terminal
3: e['], lhs=[2], rhs=[]
The show_rules Output
0: e -> e op e
1: e -> number
2: e['] -> e
The show_QDFA Output
Start States: S0; S1
S0: 7
e['] ::= . e
<e> => S2
S1: predict; 1,5
e ::= . e op e
e ::= . number
<e> => S3
<number> => S4
S2: 8
e['] ::= e .
S3: 2
e ::= e . op e
<op> => S1; S5
S4: 6
e ::= number .
S5: 3
e ::= e op . e
<e> => S6
S6: 4
e ::= e op e .
The show_earley_sets Output
Current Earley Set: 4; Furthest: 3
Earley Set 0
S0@0-0
S1@0-0
Earley Set 1
S4@0-1 [p=S1@0-0; t=2]
S2@0-1 [p=S0@0-0; c=S4@0-1]
S3@0-1 [p=S1@0-0; c=S4@0-1]
Earley Set 2
S5@0-2 [p=S3@0-1; t=+]
S1@2-2
Earley Set 3
S4@2-3 [p=S1@2-2; t=2]
S6@0-3 [p=S5@0-2; c=S4@2-3]
S3@2-3 [p=S1@2-2; c=S4@2-3]
S2@0-3 [p=S0@0-0; c=S6@0-3]
S3@0-3 [p=S1@0-0; c=S6@0-3]
The show_bocage Output
package: Parse::Marpa::E_1; parse count: 0
S2@0-3L3[0] ::= S6@0-3L0
item: e['] ::= e .
argc=1; closure=sub { "DUMMY" }
S6@0-3L0[0] ::= S5@0-2R0:3 S4@2-3L0
item: e ::= e op e .
argc=3; closure=sub { "DUMMY" }
S5@0-2R0:3[0] ::= S3@0-1R0:2 '+'
item: e ::= e op . e
argc=3
S4@2-3L0[0] ::= '2'
item: e ::= number .
argc=1; closure=sub { "DUMMY" }
S3@0-1R0:2[0] ::= S4@0-1L0
item: e ::= e . op e
argc=3
S4@0-1L0[0] ::= '2'
item: e ::= number .
argc=1; closure=sub { "DUMMY" }
The show_tree Output
Tree Node: S2@0-3L3[0]; Depth=0; Argc=1
Rule: e['] ::= e .
Cause: S6@0-3L0
Closure: CODE(0x5a462a0)
Tree Node: S6@0-3L0[0]; Depth=1; Argc=3
Rule: e ::= e op e .
Predecessor: S5@0-2R0:3
Cause: S4@2-3L0
Closure: CODE(0x5a46520)
Tree Node: S4@2-3L0[0]; Depth=2; Argc=1
Rule: e ::= number .
Value: '2'
Closure: CODE(0x5a46fa0)
Tree Node: S5@0-2R0:3[0]; Depth=2; Argc=3
Rule: e ::= e op . e
Predecessor: S3@0-1R0:2
Value: '+'
Tree Node: S3@0-1R0:2[0]; Depth=3; Argc=3
Rule: e ::= e . op e
Cause: S4@0-1L0
Tree Node: S4@0-1L0[0]; Depth=4; Argc=1
Rule: e ::= number .
Value: '2'
Closure: CODE(0x5a46fa0)
The trace_values Output
Pushed value: '2'
Popping 1 values to evaluate S4@0-1L0, rule: 1: e -> number
Calculated and pushed value: '2==2'
Pushed value: '+'
Pushed value: '2'
Popping 1 values to evaluate S4@2-3L0, rule: 1: e -> number
Calculated and pushed value: '2==2'
Popping 3 values to evaluate S6@0-3L0, rule: 0: e -> e op e
Calculated and pushed value: '(2+2)==4'
Popping 1 values to evaluate S2@0-3L3, rule: 2: e['] -> e
Calculated and pushed value: '(2+2)==4'
(2+2)==4
SUPPORT
See the support section in the main module.
AUTHOR
Jeffrey Kegler
LICENSE AND COPYRIGHT
Copyright 2007 - 2008 Jeffrey Kegler
This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.0.