NAME
Marpa::Semantics - How Marpa Evaluates Parses
SYNOPSIS
my $grammar = Marpa::Grammar->new(
{ start => 'Expression',
actions => 'My_Actions',
default_action => 'first_arg',
rules => [
{ lhs => 'Expression', rhs => [qw/Term/] },
{ lhs => 'Term', rhs => [qw/Factor/] },
{ lhs => 'Factor', rhs => [qw/Number/] },
{ lhs => 'Term', rhs => [qw/Term Add Term/], action => 'do_add' },
{ lhs => 'Factor',
rhs => [qw/Factor Multiply Factor/],
action => 'do_multiply'
},
],
}
);
sub My_Actions::do_add {
my ( undef, $t1, undef, $t2 ) = @_;
return $t1 + $t2;
}
sub My_Actions::do_multiply {
my ( undef, $t1, undef, $t2 ) = @_;
return $t1 * $t2;
}
sub My_Actions::first_arg { shift; return shift; }
my $value_ref = $recce->value;
my $value = $value_ref ? ${$value_ref} : 'No Parse';
OVERVIEW
This document describes Marpa's standard semantics. Marpa's standard semantics are shared by Marpa's Single Parse Evaluator and its Multi-parse Evaluator. As of this writing, the standard semantics are the only semantics available for Marpa.
Marpa's standard semantics will be familiar to those who have used traditional methods to evaluate parses. A parse is seen as a parse tree. Nodes on the tree are evaluated recursively, bottom-up. Once the values of all its child nodes are known, a parent node is ready to be evaluated. The value of a parse is the value of the top node of the parse tree.
Marpa's semantics see a virtual parse tree with virtual nodes. These virtual structures keep ambiguity and rule rewrites invisible to the semantics. Nodes in Marpa's virtual parse trees are of three kinds:
Token Nodes
Rule Nodes
Null Nodes
Action Names and Semantic Closures
During grammar generation and input recognition, Marpa does not allow semantics to be specified directly as Perl closures. This approach to semantics follows Perl 6, the wisdom of whose example I discovered the hard way.
When a Marpa grammar is created, all non-constant semantics take the form of strings which express actions. Actions are semantics in an indirect form. Actions are given meaning later by an evaluator.
At evaluation time, actions are resolved to semantic Perl closures, and the semantic Perl closures are run to produce values. (For brevity, semantic Perl closures are sometimes called semantics closures.)
NODES
Token Nodes
For every input token, there is an associated token node. In the usual, token-stream, model of input, every token will become a leaf node in the parse tree. Tokens always have a token symbol. At lexing time, they can be assigned a token value. If no token value is assigned at lex time, the token value defaults to a Perl undef
.
Rule Nodes
Nodes which are ancestors of token nodes are called rule nodes. Rule nodes are always associated with a rule. The value of a rule node is computed at node evaluation time. Applications can specify, on a per-rule basis, semantic Perl closures to evaluate rule nodes.
The semantic closure's arguments will be a per-parse variable followed by the values of its child nodes in lexical order. The value returned by the semantic Perl closure becomes the value of the node. If there is no semantic closure for a rule node, the value of the rule node is a Perl undef
.
Sequence Rule Nodes
Some rules are sequence rules. Everything said above about rule nodes, also applies to sequence rule nodes. Specifically, the arguments to the semantic Perl closures for sequence rules are the per-parse variable followed by the values of the child nodes in lexical order.
The difference (and it is a big one) is that in an ordinary rule, the right hand side is fixed in length, and that length is known when you are writing the semantic Perl closure. In a sequence rule, the number of right hand side symbols is not known until node evaluation time. A semantic Perl closure for a sequence rule node is written in the same way that you write any Perl closure that handles a variable number of arguments.
Sequence semantics work best for sequences of items all of which have the same semantics. When that is not the case, writing the sequence using ordinary non-sequence rules should be considered as an alternative.
By default, if a sequence rule has separators, the separators are thrown away before the semantics are applied. They do not appear in the semantic Perl closure's @_
array. If the value of the keep
rule property is a Perl true, separators are kept, and do appear in the @_
array.
Null Nodes
A null node is a node which derives the zero-length, or empty string. This means that a null node cannot be the ancestor of any token nodes. In Marpa, null nodes are always leaf nodes.
Null nodes are of two kinds. A nulling symbol node corresponds to a nulling symbol. A nulled rule node represents a nulled rule.
For every null node there is a null node symbol, which is used to the determine the value of the null node. For a nulled rule node, the null node symbol is the nulled rule's left hand side symbol. For a nulling symbol node, the null node symbol is the nulling symbol.
The value of a null node is the null value of the null node symbol. The null value of a symbol comes from that symbol's null_value
property, if one is defined. Otherwise, the null value of the symbol comes from the grammar's default null value, as defined by the grammar's default_null_value
named argument. If neither the symbol null_value
property or the grammar's default_null_value
named argument is defined, a symbol's null value is a Perl undef
,
A null subtree is a subtree all of whose nodes are null. Marpa prunes all null subtrees back to their topmost null node. This means that all null nodes that remain in Marpa's virtual parse tree will be leaf nodes.
The "lost" semantics of the non-root nodes of null subtrees are usually not missed. Null subtrees cannot contain token nodes, so no token nodes are lost when null subtrees are pruned. As bushy as a null subtree might be, all of its nodes are null nodes, and the semantics of null nodes is almost always constant.
All null nodes correspond to zero-length strings, so we are literally dealing here with the "semantics of nothing". In theory the semantics of nothing can be arbitrarily complex. In practice it should be possible to keep them simple.
If any application ever actually needs it, Marpa could implement a complex, and even dynamic, "semantics of nothing". For details see below.
Null Sequence Nodes
Rule nodes for sequences were mentioned above. Sequence nodes can also be null nodes. This happens with sequence rules which have a min
rule property of 0. Such a sequence rule can contain any number of sequence items, including zero items. When a sequence contains zero items, it must derive the zero-length string, and its node is a null node.
Sequence null nodes follow the rules for null nodes. Their value is that of a symbol -- the left hand side symbol of the nulled sequence rule.
Nullable sequence rules do have a potential to surprise. When the node for a nullable sequence rule is a null node, its semantics comes from the null value for its left hand side symbol. When the node for a nullable sequence rule is not a null node, then it is a rule node and its semantics come from the rule. It's up to the application to see that these two -- the null value and the sequence rule -- "make sense" together. One way to do this is to create a special symbol which is dedicated to service as the left hand side symbol for the sequence rule. That dedicated symbol can then be given the correct semantics.
The rules for nodes in null subtrees apply with equal force to nodes for sequence rules. In a nulled subtree, the only node whose semantics matters is the root node of that subtree. If a zero-length sequence is in a nulled subtree, and that zero-length sequence is not the root node of that subtree, its semantics will be completely ignored.
SEMANTIC PHASES
This section explains the order in which the semantics specified by an application is implemented. As a reminder, the implementation of an action requires two steps. In the first step the action is resolved to a Perl closure. In the second step that Perl closure is called.
When the semantics are applied to a parse, it produces a value called a parse result. Because Marpa allows ambiguous parsing, each parse can have zero or more parse results. The series of parse results for a single set of evaluation parameters is called a parse series.
Evaluation can be reset with the Marpa::Recognizer::reset_evaluation method. When this is done, the recognizer begins a new parse series.
Recognizer Setup Phase
In the Recognizer Setup Phase, the null values of the symbols are determined. The null values of symbols never change -- they are in effect properties of the grammar.
For every Marpa Recognizer object, the Recognizer Setup Phase occurs before any Parse Series Setup Phase. The Recognizer Setup Phase occurs only once in the life of a recognizer object.
Parse Series Setup Phase
During the Parse Series Setup Phase all semantic action names are resolved to Perl closures. The semantic Perl closures are only resolved in this phase -- they will not be called until the Parse Tree Traversal Phase.
A Parse Series Setup Phase occurs in the first call of the Marpa::Recognizer::value method of each parse series. Ranking action names are not resolved in the Parse Series Setup Phase.
Ranking Phase
In the Ranking Phase, all ranking actions are resolved to Perl closures, and those Perl closures are called. The Ranking Phase is the only time during which ranking actions are resolved or called.
Ranking Phases do not occur if the ranking method is left as the default. If the ranking method is set to a value other than "none
", a Ranking Phase occurs in the first call of the Marpa::Recognizer::value method of each parse series.
Parse Result Setup Phase
In the Parse Result Setup Phase, the per-parse variable is created. If a constructor was found for the action_object
, it is run at this point, and the per-parse variable is its return value. A Parse Result Setup Phase occurs once for each parse result or, in other words, once for each call to the Marpa::Evaluator::value
method.
In every parse series, all the Parse Result Setup Phases occur after the Parse Series Setup Phase. In every parse series which has a Ranking Phase, all the Parse Result Setup Phases occur after the Ranking Phase. For each parse result, the Parse Result Setup Phase occurs before any of the Parse Tree Traversal Phases.
Parse Tree Traversal Phase
During the Parse Tree Traversal Phase, the semantic Perl closures are called. For each parse result, the Parse Tree Traversal Phase follows the Parse Result Setup Phase.
During parse tree traversal, Node Evaluation Time occurs for every node in the parse. A Parse Tree Traversal Phase occurs once for each call to the Marpa::Evaluator::value
method.
Node Evaluation Time
Node Evaluation Time is not really a separate phase. Node Evaluation Time is the Parse Tree Traversal Phase, seen from the point of view of the individual nodes of the parse tree. If a node has a semantic Perl closure, it is called at Node Evaluation Time.
SEARCHING FOR RULE SEMANTIC CLOSURES
Marpa finds the semantic Perl closure for each rule based on rule and symbol properties and on the grammar named arguments. The search for a semantic Perl closure is equivalent to following these steps:
Resolve the
action
propertyIf no
action
property is defined for a rule, the search proceeds to the next step.If the
action
property for a rule is defined and if it resolves to a Perl closure, that Perl closure becomes the rule's semantic Perl closure, and the search ends. The resolution of action names is described below.If a defined
action
property does not successfully resolve to a closure, no further attempt to find the semantic Perl closure is made. The search ends, and an exception is thrown.Resolve the
lhs
propertyIf the
lhs
property for a rule resolves to a Perl closure, that Perl closure becomes the rule's semantic Perl closure, and the search ends.If the
lhs
property fails to resolve to a Perl closure, Marpa is unusually lenient. No exception is thrown. This is the one case where failure of an action name to resolve is not treated as a fatal error. The search proceeds to the next step.Resolve the
default_action
Named ArgumentIf the grammar does not have a
default_action
defined, the search proceeds to the next step.If the grammar does have a
default_action
defined, and if it resolves to a Perl closure, that Perl closure becomes the rule's semantic Perl closure, and the search ends.If a defined
default_action
grammar named argument does not successfully resolve to a closure, no further attempt to find the semantic Perl closure is made. The search ends, and an exception is thrown.Evaluate to
undef
If no semantic Perl closure is found, the value of the rule is always a Perl
undef
. Marpa optimizes for this case, but the effect is the same as if the rule's semantic Perl closure was the following:sub { return }
RESOLVING ACTION NAMES
Action names are resolved to semantic Perl closures in the evaluator setup phase. Candidates for resolution as action names include
The
default_action
named argument of Marpa's grammar.The
action
property of Marpa's rules.The
ranking_action
property of Marpa's rules.The
ranking_action
property of Marpa's symbols.The
new
constructor in the package specified by theaction_object
named argument of the Marpa grammar;The
lhs
property of Marpa's rules.
Resolution of the action object constructor is explained below as a special case.
Explicit Resolution
The standard semantics support the closures
named argument, which allows the user to directly control the mapping from action names to semantic Perl closures. The value of the closures
named argument is a reference to a hash whose keys are action names and whose hash values are CODE refs.
If an action name is the key of an entry in the closures
hash, it resolves to the closure referenced by the value part of that hash entry. Resolution via the closures
named argument is called explicit resolution.
When explicit resolution is the only kind of resolution that is wanted, it is best to pick a name that is very unlikely to be the name of a Perl closure. Many of Marpa::HTML's action names are intended for explicit resolution only. In Marpa::HTML those action names begin with an exclamation mark ("!"), and that convention is recommended.
Fully Qualified Action Names
If explicit resolution fails, Marpa transforms the action name into a fully qualified Perl name. An action name that contains a double colon ("::
") or a single quote ("'
") is considered to be a fully qualified name. Any other action name is considered to be a bare action name.
If the action name to be resolved is already a fully qualified name, it is not further transformed. It will be resolved in the form it was received, or not at all.
For bare action names, Marpa tries to qualify them by adding a package name. If the actions
grammar named argument is defined, Marpa uses it as the package name. Otherwise, if the action_object
grammar named argument is defined, Marpa uses it as the package name. Once Marpa has fully qualified the action name, Marpa looks for a Perl closure with that name in the namespace.
If Marpa cannot fully qualify an action name, it will not attempt to resolve it. This means that for an action name to resolve successfully, one of these four things must be the case:
The
actions
named argument is defined.The
action_object
named argument is defined.The action name resolves explicitly.
The action name is fully qualified to begin with.
In all but one circumstance, failure to resolve an action name is thrown as an exception. Marpa is more lenient when it attempts to use the lhs
rule property as an action name. That is the one case in which Marpa will look at other alternaties. See the section on finding rule semantic closures.
Marpa's philosophy is that asking the programmer to be specific about action names can be a slight inconvenience, but silently executing unintended code might be a major disaster. Marpa prefers action name resolution to fail when the application's intent is not clear.
Generally it's best practice to put the semantic Perl closures into their own namespace. But if, for example, the user wants to leave the semantic closures in the main
namespace, she can specify "main"
as the value of the actions
named argument.
THE PER-PARSE VARIABLE
In the parse setup phase, Marpa creates a per-parse variable. This becomes the first argument of the semantic Perl closures for the rule nodes. If the grammar's action_object
named argument is not defined, the per-parse variable is initialized to an empty hash ref.
Most data for the rule node semantic Perl closures will be passed up the parse tree. The semantic Perl closures will see the values of their child nodes as arguments, and will return their own value to be seen as an argument by their parent node. The per-parse variable can be used for data which does not conveniently fit this model.
The per-parse variable has the same lifetime as the parse. After it is initialized, Marpa's internals never alter it -- it is reserved for use by the semantic Perl closures implementing rule node semantics.
Action Object Constructor
If the grammar's action_object
named argument has a defined value, that value is treated as the name of a class. The action object constructor is the new
method in the action_object
class.
The action object constructor is run at parse setup time. The return value of the action object constructor becomes the per-parse variable. It is a fatal error if the grammar's action_object
named argument is defined, but does not name a class with a new
method.
The fully qualified name of the action object constructor is the value of the action_object
named argument followed by the literal string "::new
". Resolution of the action object constructor is done as resolution of this fully qualified action name.
If a grammar has both the action
and the action_object
named arguments defined, all action names except for the action object constructor will be resolved in the action
package or not at all. Only the action object constructor name will be resolved using the action_object
class.
Bypass via explicit resolution applies to the action object constructor. If the fully qualified name of the action object constructor is a hash key in the evaluator's closures
named argument, then the Perl closure referred to by the value of that hash entry becomes the action object constructor.
PARSE ORDER
Duplicate Parses
The same parse result is never returned twice. A parse result is considered to be a duplicate of another if both parse results apply the same rules in the same order at the same earleme locations.
This may seem obvious, but in parsers which use LR states (of which Marpa is one) otherwise identical parses may involve a different series of LR states. Earlier more naive Marpa implementations would return parses which were different if the LR state was taken in account, but which were identical according to the more natural definition in the previous paragraph.
Default Parse Order
By calling the Marpa::Recognizer::value method repeatedly, Marpa can produce all the parse results for a given parse. The default is for the parse results to be returned in an arbitrary order. This corresponds to the "none
" value of the Recognizer's ranking_method
named argument.
A General Approach to Sorting Parses
The most general way to sort Marpa parses is for the application to take control. The application can set up the Marpa semantic actions set that parses return <rank, value>
duples. These duples can be implemented as pointers to arrays. The duples can then be sorted and the rank discarded. In the worst case, producing and sorting all the parses can take a very long time.
The Marpa interface has no special features to deal with this general case, because they would add clutter rather than value. And in the worst case, putting the logic inside Marpa would not improve efficiency.
The "Constant" Ranking Method
Marpa does support a simplified approach to sorting parses. It is a compromise between generality and efficiency. The Constant Ranking Method is general enough to handle many, perhaps even most, practical applications, and simple enough that Marpa is able to optimize it.
The Constant Ranking Method is specified by giving the Marpa Recognizer's ranking method named argument a value of "constant
".
The basic idea is to allow the user to specify constant values for rules, and to rank all other nodes according to the sum of the values of their children. Leaf nodes default to a value of 0.
The value of a rule must be "constant" in the sense that it cannot depend on the values of its children. This is a major limitation, but it greatly simplifies the logic for re-ranking parses as they are iterated. And it is less of a limitation than it may appear, because rules without ranking closures will take into account the values of their children. By strategicly mixing rules have with no ranking action, but which do take into account child values, and rules which can have a user-specified rank, but which are not allowed to take into account child values, most real-life applications can accomplish what they need to.
The ranking action of a token leaf node is specified using the token symbol's ranking_action
property. The ranking action of a nulled leaf node is specified using the null node symbol's ranking_action
property. The ranking action of a rule is specified using the rule's ranking_action
property.
Ranking actions should return a reference to the rank. Ranks must are Perl numbers. Negative values and non-integer values are allowed. The highest numeric value is considered to be the highest rank, and the lowest numeric value is considered to be the lowest rank.
As a special case, if a ranking action returns a Perl undef
, Marpa will skip that possibility, rather than ranking it. Note that any "skipped" node in a parse result causes that whole parse result to be skipped. A consequence of this is that any skipped node in an unambiguous parse will result in no parses being returned.
It may seem to be draconian, but in fact skipping the entire tree is probably the easiest and most natural way to deal with skipped nodes. The traditional semantics requires trees to have a full set of nodes. It is unclear how a semantics of partial trees would be defined, or of what use it would be if it existed.
An instance of a rule is a rule, a start location, and an end location. Ranking actions are called once for each rule instance. While ranking actions return constants in the sense that they cannot be aware of the ranks of their child nodes, the rank returned can vary based on the rule's start and end location. The ranking actions can determine their location using the context-aware static methods.
For the rank of a node to be calculated, the ranking action must first be resolved to a ranking Perl closure. Ranking actions are resolved to ranking Perl closures in the evaluator setup phase, using the same logic that resolves semantics actions to Perl semantic closures. The logic that resolves action names to closures is described above. The ranking actions are resolved to Perl closures, and the Perl closures called in the Ranking Phase.
Using Ranking for Side Effects
For every parse series, ranking actions are quaranteed to be called once and only once for each rule instance. As a reminder, a rule instance is a rule, together with a start and end location. This regularity makes it useful to use ranking actions for side effects, even when there is no interest in actually ordering the parses, and even in evaluating the parse result.
An application may wants to know all the instances of a particular rule that occur in any parse result. An practical example is an application which is looking for ambiguities -- cases in the parse series where two different rules derive the same input string. The application can create ranking actions which have the side effect of tracking instances of these two rules by location, and use this information to detect ambiguities. If there is no interest in an actual parse, ranking actions which return undef
can cause all parses to be discarded.
This is much faster than the alternative of producing all the parse results and traverse each trees, looking for nodes with those rules. The number of parse results grows much faster than the number of ambiguities -- it can be exponential in the length of the input.
CONTEXT-AWARE STATIC METHODS
sub rank_null_a {
return \( ( $MyTest::MAXIMAL ? -1 : 1 )
* 10**( 3 - Marpa::token_location() ) );
}
Ranking Perl closures have available a set of context-aware static methods. A closure can use these methods to learn about the context in which it is called. As of this writing, the context-aware static methods are tested for the ranking Perl closures only. This may change in the future.
Marpa::location
Returns the earleme location of the origin (or start) of the rule.
Marpa::token_location
Returns the earleme location of the token. Intended for use in callbacks associated with empty rules and nulling symbols.
NULL SUBTREES
In Marpa, a null node must be leaf node. Because Marpa prunes every null subtree back to its topmost null node, none of the non-root nodes in a null subtree are represented in Marpa's virtual parse tree. Here's an example:
sub L {
shift;
return 'L(' . ( join q{;}, @_ ) . ')';
}
sub R {
shift;
return 'R(' . ( join q{;}, @_ ) . ')';
}
sub S {
shift;
return 'S(' . ( join q{;}, @_ ) . ')';
}
my $grammar = Marpa::Grammar->new(
{ start => 'S',
actions => 'main',
rules => [
[ 'S', [qw/L R/] ],
[ 'L', [qw/A B X/] ],
[ 'L', [] ],
[ 'R', [qw/A B Y/] ],
[ 'R', [] ],
[ 'A', [] ],
[ 'B', [] ],
[ 'X', [] ],
[ 'Y', [] ],
],
symbols => {
L => { null_value => 'null L' },
R => { null_value => 'null R' },
A => { null_value => 'null A' },
B => { null_value => 'null B' },
X => { null_value => 'null X', terminal => 1 },
Y => { null_value => 'null Y', terminal => 1 },
},
}
);
$grammar->precompute();
my $recce = Marpa::Recognizer->new( { grammar => $grammar } );
$recce->tokens( [ [ 'X', 'x' ], ] );
If we write the unpruned parse tree one node per line in pre-order, depth-first, indenting children below their parents, we get something like this:
0: Rule Node, Rule: S := L R
1: Rule Node, Rule L := A B X
1.1: Null Node, Symbol A
1.2: Null Node, Symbol B
1.3: Token Node, Token value is 'x'
2: Rule Node, Rule R := A B Y
2.1: Null Node, Symbol A
2.2: Null Node, Symbol B
2.3: Null Node, Symbol Y
In this example, six nodes are nulled. Four of them are in a single subtree: 2, 2.1, 2.2 and 2.3. Marpa prunes every null subtree back to its null root node, which in this case is the node numbered 2.
The pruned tree looks like this
0: Rule Node, Rule: S := L R
1: Rule Node, Rule L := A B X
1.1: Null Node, Symbol A
1.2: Null Node, Symbol B
1.3: Token Node, Token value is 'x'
2: Null Node, Symbol R
Here is the output:
S(L(null A;null B;x);null R)
In the output we see
The null value for node 1.1: "
null A
".The null value for node 1.2: "
null B
".The token value for node 1.3: "
x
".An application of the semantic Perl closure for node 1.
The null value for node 2: "
null R
".An application of the semantic Perl closure for rule node 0.
We do not see any output for nodes 2.1, 2.2, or 2.3 because they were non-root nodes in the pruned subtree. We do see the null value for node 2, because after pruning it is a leaf node. We do not see an application of the semantic Perl closure for node 2, because after pruning, node 2 is not a rule node.
The Semantics of Nothing
Rarely, your application may call for a complex semantics of nothing. If the semantics of nothing, while complex, remains constant, you can handle it setting every nullable symbol's null_value
property to the value which your semantics produces when that nullable symbol is the root symbol of a null subtree.
If the values in your "semantics of nothing" are not constants, Marpa can still calculate them. Determine which of your nullable symbols have a dynamic semantics. Call these your dynamic nullables. Let the null_value
property of every dynamic nullable be a hash key. For every rule with a dynamic nullable on its right hand side, write the rule's semantic Perl closure so that it maps that hash key to a Perl closure, which it then runs to calculate the value of the dynamic nullable.
INFINITELY AMBIGUOUS GRAMMARS
Marpa handles will parse using an infinitely ambiguous grammar. (In the technical literature, an infinite ambiguity is more usually called a cycle.)
An example of an infinitely ambiguous grammar is the following:
S ::= A
A ::= B
B ::= A
B :: 'x'
Given the input 'x', this grammar will produce these parses
S -> A -> B -> x
S -> A -> B -> A -> B -> x
S -> A -> B -> A -> B -> x -> A -> B -> x
.
.
.
Because of the two rules A ::= B
and B ::= A
, this list of parses could go on forever. The two rules A ::= B
and B ::= A
form what is called a cycle.
Typically, if a user has written an grammar with an infinite cycle, it was a mistake and he wants to rewrite it before proceeding. By default, an infinitely ambiguous grammar is a fatal error. This is the behavior most users will want.
To proceed to producing parse results from an infinitely ambiguous grammar, the user must set the grammar's infinite action named argument to a value other than "fatal
". The other choices are "warn
" and "quiet
".
Cycle Length
Obviously, Marpa cannot list all of an infinite number of parse results. Marpa deals with potentially infinite parses by limiting the cycle length. Cycle length is the number of times a parse derivation goes around a potentially infinite cycle.
Marpa limits all cycles to a length of 1. There will always be a finite number of these parse results.
Above I showed a set of parses from an example of an infinitely ambiguous grammar. Here are those parses again, this time labeled with their cycle length.
Cycle length 1: S -> A -> B -> x
Cycle length 2: S -> A -> B -> A -> B -> x
Cycle length 3: S -> A -> B -> A -> B -> x -> A -> B -> x
Of the parse results in the above list, Marpa would return a value only for the first, the one whose cycle length is 1.
Caveats
Marpa allows an application to work around an infinite ambiguity. Most parsers force a programmer to make cycle removal her first priority. Marpa allows her to deal with the cycle later or, if she chooses, never.
The precise behavior of Marpa's cycle detection is somewhat loosely defined. In Marpa cycle is a return to the same rule, with the same start and end earlemes. But these are the rules as rewritten by Marpa, not as in the original grammar. The CHAF rewrite, for example, breaks rules up, and a CHAF rewrite can cause a cycle to end before it would end in terms of the rules of the original grammar.
Marpa's exact definition of a cycle is experimental and subject to change. In particular, it would be unwise to have a semantics rely in detail on the content and length of cycles. Marpa's cycle detection could be made to follow the original grammar, and more carefully defined, but far as I know nobody cares about these details. The present implementation is more than adequate as a fail-safe, and I believe it matches or exceeds the extent of present interest.
LICENSE AND COPYRIGHT
Copyright 2007-2010 Jeffrey Kegler, all rights reserved. Marpa is free software under the Perl license. For details see the LICENSE file in the Marpa distribution.