NAME

Parse::Stallion::EBNF - Output/Input parser in Extended Backus Naur Form.

SYNOPSIS

#Output
use Parse::Stallion;
$parser = new Parse::Stallion(...);

use Parse::Stallion::EBNF;
$ebnf_form = ebnf Parse::Stallion::EBNF($parser);

print $ebnf_form;

#Input
my $rules = '
  start = number qr/\s*\+\s*/ number
   S{return $number->[0] + $number->[1]}S;
  number = qr/\d+/;
';

my $rule_parser = ebnf_new Parse::Stallion::EBNF($rules);

my $value = $rule_parser->parse_and_evaluate('1 + 6');
#$value should be 7

DESCRIPTION

Output

Given a parser from Parse::Stallion, creates a string that is the parser's grammar in EBNF.

Input

Use Parse::Stallion for more complicated grammars.

Given a string containing simple grammar rules, ebnf_new returns a parser.

Each rule must be terminated by a semicolon.

Each rule name must consist of word characters (\w).

Format:

<rule_name> = <rule_def>;

Four types of rules: 'and', 'or', 'leaf', 'multiple'/'optional'

Rule names and aliases must start with a letter or underscore, though they may also contain digits. They are case sensitive.

AND

In an 'and' rule, the rule_def must be rule names separated by whitespace.

OR

In an 'or' rule, the rule_def must be rule names separated by single pipes (|).

LEAF

In a 'leaf' rule, the rule_def can be a 'qr' or 'q' followed by a non-space, non-word character (\W), up to a repetition of that character. What is between the characters is treated as either a regular expression (if 'qr') or a string (if 'q'). Additionally, a string within single or double quotes is treated as a string. The following are the same:

q/x\x/, q'x\x', 'x\x', "x\x",  qr/x\\x/, qr'x\\x'

The qr of a leaf is not the same as a Perl regexp declaration. Notably, one cannot escape the delimiting characters. That is,

qr/\//

is valid Perl but not valid here; one could instead use

qr+/+

which is also valid Perl.

Modifiers are allowed and are inserted into the regexp via an extended regex sequence:

qr/abc/i

internally becomes

qr/(?i)abc/
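The two Perl-side claims above (that a non-slash delimiter is valid Perl, and that a trailing modifier can be expressed as an inline sequence) can be checked in plain Perl without this module. This is only a sketch of the equivalence; it does not show how the module performs the rewrite internally, and the sample strings are arbitrary.

```perl
use strict;
use warnings;

# A non-slash delimiter avoids having to escape '/' inside the pattern.
my $slash = qr+/+;
print "slash matched\n" if 'a/b' =~ $slash;

# A trailing modifier and the inline (?i) form accept the same strings.
my $trailing = qr/abc/i;
my $inline   = qr/(?i)abc/;
for my $text ('ABC', 'abc', 'xyz') {
    my $t = $text =~ $trailing ? 1 : 0;
    my $i = $text =~ $inline   ? 1 : 0;
    print "$text: trailing=$t inline=$i\n";
}
```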

MULTIPLE/OPTIONAL

A 'multiple' rule is a rule name enclosed within curly braces {}. A minimum and maximum number of occurrences may optionally be given by following the closing brace with an asterisk and then min,max. For example:

multiple_rule = {ruleM}*5,0;

would require at least 5 occurrences of ruleM; the maximum of 0 here places no upper limit.

Optional rules can be specified within square brackets. The following are the same:

{rule_a}*0,1

[rule_a]

SUBRULES

Subrules may be specified within a rule by enclosing the subrule within parentheses.

ALIAS

An alias may be specified by writing the alias name followed by a dot in front of the item it names, i.e.,

alias.rule

alias.qr/regex/

alias.(rule1 rule2)

alias.(rule1 | rule2)
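As a sketch of why aliases are useful, the grammar below gives the two occurrences of number distinct names so that an order-sensitive operation can tell them apart. It is modeled on the SYNOPSIS; the alias names left and right and the input string are made up for illustration, and the method-call form of ebnf_new is used as the equivalent of the indirect-object form shown in the SYNOPSIS. It assumes Parse::Stallion::EBNF is installed.

```perl
use strict;
use warnings;
use Parse::Stallion::EBNF;

# 'left' and 'right' are illustrative aliases for the two number subrules.
my $rules = '
  start = left.number qr/\s*-\s*/ right.number
   S{return $left - $right}S;
  number = qr/\d+/;
';

my $rule_parser = Parse::Stallion::EBNF->ebnf_new($rules);
my $value = $rule_parser->parse_and_evaluate('9 - 4');
print defined $value ? "$value\n" : "parse failed\n";
```

Without the aliases, both values would arrive together in $number as an array reference, as in the SYNOPSIS.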

EVALUATION

For the evaluation phase (see Parse::Stallion), any rule can have a subroutine at the end of its definition, before the semicolon. The subroutine should be enclosed within S{ and }S, or else within S[ and ]S or S( and )S. The 'sub ' declaration is done internally.

Internally, a variable is created for each subrule, containing its evaluated value. If a subrule's name may occur more than once, its values are passed in an array reference. See Parse::Stallion for details on the parameters passed to an evaluation routine. This saves having to write code to read in the parameters.

Examples:

rule = number plus number S{subroutine}S;

will create and eval the following evaluation subroutine string:

sub {
  my $number = $_[0]->{number};
  my $plus = $_[0]->{plus};
  subroutine
}

$number is an array ref; $plus is the value returned from subrule plus.

number = qr/\d+/ S{subroutine}S;

is a leaf rule, which only gets one argument to its subroutine:

sub {
  my $_ = $_[0];
  subroutine
}

Evaluation is done only after parsing; the option of evaluating during parsing, found in Parse::Stallion, is not available here.

COMMENTS

Comments may be placed after the semicolon that ends a rule; each further comment line begins with a semicolon:

rule = 'xxx' ; comment
; comment 2
; comment 3

PRECEDENCE

If an alternative within an 'or' clause consists of multiple rules, it is recommended that they be grouped within parentheses:

a = (b c) | d ;

whereas

a = b c | d ;

will not work.

The last subroutine corresponds to the whole rule:

a = e.(c S{...s1...}S) | d S{...s2...}S ;

s1, if called, will get $c as an argument. s2, if called, will get either $e or $d as an argument; the other will be undef.
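The following sketch puts the grouping and the two subroutine positions together. The grammar and its rule names (pair, digit, letter, word) are hypothetical, and the behavior it relies on (the inner subroutine seeing $digit and $letter, the outer one seeing $pair or $word with the other undef) is taken from the description above rather than verified against every version of the module. It assumes Parse::Stallion::EBNF is installed.

```perl
use strict;
use warnings;
use Parse::Stallion::EBNF;

# Hypothetical grammar: either a digit followed by a letter, or a word.
# The inner S{...}S belongs to the parenthesized subrule aliased as 'pair';
# the last S{...}S belongs to the whole 'start' rule.
my $rules = '
  start = pair.(digit letter S{return "$digit$letter"}S) | word
   S{return defined $pair ? "pair:$pair" : "word:$word"}S;
  digit = qr/\d/;
  letter = qr/[a-z]/;
  word = qr/[a-z]+/;
';

my $rule_parser = Parse::Stallion::EBNF->ebnf_new($rules);

my %result;
for my $input ('7x', 'abc') {
    $result{$input} = $rule_parser->parse_and_evaluate($input);
    my $shown = defined $result{$input} ? $result{$input} : 'parse failed';
    print "$input => $shown\n";
}
```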

SEE ALSO

example/calculator_ebnf.pl

t/ebnf_in.t in the test cases for examples.

Parse::Stallion