NAME
Parse::Stallion::RD - Parser for subset of grammars written for Parse::RecDescent
NOTE
This is an exercise to show how to use Parse::Stallion, the module that Parse::Stallion::RD runs atop. On some test cases, Parse::Stallion::RD runs faster than Parse::RecDescent. Rewriting a grammar directly for Parse::Stallion should be faster still.
Parsers generated with Parse::Stallion::RD and with Parse::RecDescent differ in behavior. If a behavior you need is missing here, please report it to arthur@acm.org. The implemented features, with their differences, are listed below; other features were not included in this release.
VERSION
0.41
SYNOPSIS
use Parse::Stallion::RD;
$parser = new Parse::Stallion::RD($grammar);
$parser->startrule($text);
compared with:
use Parse::RecDescent;
$parser = new Parse::RecDescent($grammar);
$parser->startrule($text);
DESCRIPTION
See Parse::RecDescent's documentation. This section lists what is similar to Parse::RecDescent and what is not.
Features implemented with noted differences
and rules
or rules (|)
rules defined on different lines
single quotes
double quotes (with interpolation)
repetition specifier (with separator, Parse::Stallion::RD allows non-raw patterns as separators)
tokens
$skip
actions
@item and %item (the values in %item are the same as the values in @item, which differs from Parse::RecDescent for some directives, namely leftop and rightop)
$return
$thisline and $prevline
$thisparser (set within an action but not after the action)
start-up actions
autoactions ($::RD_AUTOACTION)
look-ahead
<leftop>
<rightop>
<reject>
alternations (though the naming of alternations is not consistent with %item)
<commit>, <uncommit>
<error>, <error?>, <error: message>, <error?: message> (error messages are not split across lines the same way; if an <error> directive is not the last or-clause in a production, only the or-clauses that occurred before it will show up; error messages cannot contain '>' or '<')
<defer>
$text (does not reset the text back if modified)
<rulevar>
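As a minimal sketch of several features from the list above (and-rules, single-quoted literals, tokens, actions, @item, and $return), the following grammar adds two integers. The rule names sum and number and the input string are made up for illustration. Parse::RecDescent returns 5 for this input; that Parse::Stallion::RD does the same is an assumption based on the feature list, not something this document demonstrates.

```perl
use strict;
use warnings;
use Parse::Stallion::RD;

# Hypothetical grammar, written in Parse::RecDescent syntax.
my $grammar = q{
    sum:    number '+' number { $return = $item[1] + $item[3] }
    number: /\d+/
};

my $parser = Parse::Stallion::RD->new($grammar)
    or die "Could not build parser from grammar\n";

# The start rule is called as a method on the parser, as in the SYNOPSIS.
my $result = $parser->sum('2 + 3');
print defined $result ? "sum = $result\n" : "parse failed\n";
```

Swapping Parse::Stallion::RD for Parse::RecDescent in the two lines that name it gives a quick way to compare the two parsers on the same grammar.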
Differences between Parse::RecDescent and Parse::Stallion
Here are some noteworthy differences between Parse::Stallion and Parse::RecDescent that came up while developing this module.
String/Code
Parse::RecDescent takes a grammar from a string, Parse::Stallion is set up via perl code. Parse::Stallion::EBNF has a string oriented interface for Parse::Stallion.
Actions/Evaluation/Parse Forward/Parse Backtrack
Deferred actions somewhat correspond to the evaluation phase of Parse::Stallion.
In Parse::Stallion, if the evaluation is done after the parsing, the evaluation routine receives the results of the 'sub' evaluation routines among its parameters. In Parse::RecDescent, a delayed action just returns the number of delayed actions up to that point, not the result of the delayed action; an undelayed action returns either $return or the value of its last line.
In Parse::RecDescent, items in actions are similar to the parameters passed in the evaluation phase of Parse::Stallion.
Parse::Stallion also has Leaf nodes with subroutines that execute during the parsing phase: parse_forward and parse_backtrack. Those are used in Parse::Stallion::RD to mimic the actions of Parse::RecDescent.
In Parse::Stallion, if a parameter occurs more than once, it is passed in as an array reference instead of being overwritten, as is done in Parse::RecDescent. The parameters passed in during the evaluation phase correspond to all the subrules in a possibly complex rule, not just the latest items in an and-clause, as in Parse::RecDescent.
Error
<error> messages in Parse::RecDescent provide useful information, such as where an error occurred and what else was expected. This can be duplicated in Parse::Stallion, as was done in this module, but it requires recording the position with Leaf rules, though the returned parameters max_position and max_position_at may help.
MATCH_ONCE
Parse::Stallion has the option of having a rule match once and, if the parse later fails, not attempt 'variations' of that match. For example, if a multiple rule matches the string 5 times, Parse::RecDescent will not backtrack to try it with 4 matches. Parse::Stallion by default will backtrack, which may prove slower, but one can create a rule with the MATCH_ONCE option to get the non-backtracking behavior, as is done in Parse::Stallion::RD.
LEFTOP, RIGHTOP, REPETITION, AND ALIASES
Parse::RecDescent's leftop operation can include the separator in the directive's return value or not depending on how it is set up.
<leftop: 'a' 'b' 'c'>
will return an array ('a','c','c',...,'c'). Whereas
<leftop: 'a' b 'c'>
b: 'b'
will return an array ('a','b','c','b','c',...,'b','c').
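The two forms above can be tried side by side with a short script. This is only a sketch: the rule names raw_sep and rule_sep and the input string are hypothetical, and the script prints whatever Parse::RecDescent returns rather than asserting a particular list.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Rule names raw_sep and rule_sep are hypothetical; the directives are the
# two forms discussed above.
my $grammar = q{
    raw_sep:  <leftop: 'a' 'b' 'c'>
    rule_sep: <leftop: 'a' b 'c'>
    b:        'b'
};

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

# Each rule returns a reference to the array built by <leftop>.
my $raw  = $parser->raw_sep('abcbc');
my $rule = $parser->rule_sep('abcbc');

print "literal separator: @$raw\n";
print "subrule separator: @$rule\n";
```

Comparing the two printed lists shows whether the separator appears in the result for each form.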
In Parse::Stallion, one can specify aliases for subrules; those that share the same name end up in an array ref.
A({thelist => qr/a/}, M(A(qr/b/, {thelist => qr/c/})))
will result in the evaluation routine having a parameter:
$_[0]->{thelist} = ['a','c', ..., 'c']
The above cases also affect rightop's and repetition in Parse::RecDescent.
OTHER ITEMS
This module requires Text::Balanced to work. However, because this module is distributed as part of Parse::Stallion, which does not itself require Text::Balanced, Text::Balanced is not listed among the dependencies.
One can increase $Parse::Stallion::RD::__max_steps in case one runs into the 'Not enough steps to do parse...' error.
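A minimal sketch of raising the limit follows. The value shown is arbitrary, and setting the variable before the parser is built is an assumption about when the module reads it.

```perl
use strict;
use warnings;
use Parse::Stallion::RD;

# Raise the step limit before building and running the parser.
# The value 1_000_000 is an arbitrary illustration, not a recommended setting.
$Parse::Stallion::RD::__max_steps = 1_000_000;
```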
AUTHOR
Arthur Goldstein, <arthur@acm.org>
ACKNOWLEDGEMENTS
Damian Conway, Christopher Frenze, and Rene Nyffenegger.
COPYRIGHT AND LICENSE
Copyright (C) 2009 by Arthur Goldstein. All Rights Reserved.
This module is free software. It may be used, redistributed and/or modified under the terms of the Perl Artistic License (see http://www.perl.com/perl/misc/Artistic.html)
BUGS
Please email in bug reports.
TO DO AND FUTURE POSSIBLE CHANGES
Implement missing items from Parse::RecDescent. Email suggestions to arthur at acm.org.
SEE ALSO
t/rd.t is a test file that comes with the installation and has many examples. t/rdbasics.t and t/rdfullbasics.t are other test files.
Parse::RecDescent
Parse::Stallion