Name
Parser::LR - Create and use an LR(1) parser.
Synopsis
Create an LR grammar from rules expressed one rule per line of a string. Each rule starts with an expandable symbol followed by one possible expansion as a mixture of expandable and terminal symbols:
my $grammar = compileGrammar(<<END);
A A + B
A B
B B * C
B C
C n
C D C
D ++
D --
C C E
E **
E //
C ( A )
C [ A ]
C { A }
C ( )
C [ ]
C { }
C D n
END
Use the grammar so created to parse a string an array of terminal symbols into a parse tree with parseWithGrammar:
my $tree = parseWithGrammar($grammar, qw{n * ( ++ -- n ** // + -- ++ n // ** )});
Print the parse tree tree, perhaps with printParseTreeAsBrackets or printParseTreeAsXml:
ok printParseTreeAsBrackets($grammar, $tree) eq <<END;
A
B
B
C n
C
B "*"
C "("
A
A
B
C "++"
C
C
C "--" n
C "**"
C "//"
C
C
B
A "+"
B
C "--"
C
C
C "++" n
C "//"
C "**"
C
C
B
A ")"
C
B
A
END
Description
Create and use an LR(1) parser.
Version 20191121.
The following sections describe the methods in each functional area of this module. For an alphabetic listing of all methods by name see Index.
Create and use an LR(1) parser.
Construct an LR(1) parser from a set of rules using compileGrammar; use the parser produced to parse sequences of terminal symbols using parseWithGrammar; print the resulting parse tree with printParseTree or printParseTreeAsXml.
compileGrammar($%)
Compile a grammar from a set of rules expressed as a string with one rule per line. Returns a "Parser::LR::Grammar Definition".
Parameter Description
1 $rules Rule definitions
2 %options Options
Example:
if (1) {
my $grammar = 𝗰𝗼𝗺𝗽𝗶𝗹𝗲𝗚𝗿𝗮𝗺𝗺𝗮𝗿(<<END);
A A + B
A B
B B * C
B C
C n
C ( A )
C [ A ]
C { A }
C ( )
C [ ]
C { }
END
ok printGrammar($grammar) eq <<END;
Rule Expandable Expansion
1 0 A A + B
2 1 A B
3 2 B B * n
4 3 B B * ( A )
5 4 B B * [ A ]
6 5 B B * { A }
7 6 B B * ( )
8 7 B B * [ ]
9 8 B B * { }
10 9 B n
11 10 B ( A )
12 11 B [ A ]
13 12 B { A }
14 13 B ( )
15 14 B [ ]
16 15 B { }
END
my $tree = parseWithGrammar($grammar, qw/( [ { } ] + [ { n } ] ) * [ n + n ] /);
ok printParseTree($grammar, $tree) eq <<END;
Rule Expandable Terminal
1 1 A
2 4 B
3 10 B
4 (
5 0 A
6 1 A
7 11 B
8 [
9 1 A
10 15 B
11 {
12 }
13 ]
14 +
15 11 B
16 [
17 1 A
18 12 B
19 {
20 1 A
21 9 B
22 n
23 }
24 ]
25 )
26 *
27 [
28 0 A
29 1 A
30 9 B
31 n
32 +
33 9 B
34 n
35 ]
END
ok printParseTreeAsXml($grammar, $tree) eq <<END;
<A rule="1">
<B rule="4">
<B rule="10">
<"(" pos="0"/>
<A rule="0">
<A rule="1">
<B rule="11">
<"[" pos="1"/>
<A rule="1">
<B rule="15">
<"{" pos="2"/>
<"}" pos="3"/>
</B>
</A>
<"]" pos="4"/>
</B>
</A>
<"+" pos="5"/>
<B rule="11">
<"[" pos="6"/>
<A rule="1">
<B rule="12">
<"{" pos="7"/>
<A rule="1">
<B rule="9">
<n pos="8"/>
</B>
</A>
<"}" pos="9"/>
</B>
</A>
<"]" pos="10"/>
</B>
</A>
<")" pos="11"/>
</B>
<"*" pos="12"/>
<"[" pos="13"/>
<A rule="0">
<A rule="1">
<B rule="9">
<n pos="14"/>
</B>
</A>
<"+" pos="15"/>
<B rule="9">
<n pos="16"/>
</B>
</A>
<"]" pos="17"/>
</B>
</A>
END
ok printGrammarAsXml($grammar) eq <<END
<grammar>
<A><A/><"+"/><B/></A>
<A><B/></A>
<B><B/><"*"/><n/></B>
<B><B/><"*"/><"("/><A/><")"/></B>
<B><B/><"*"/><"["/><A/><"]"/></B>
<B><B/><"*"/><"{"/><A/><"}"/></B>
<B><B/><"*"/><"("/><")"/></B>
<B><B/><"*"/><"["/><"]"/></B>
<B><B/><"*"/><"{"/><"}"/></B>
<B><n/></B>
<B><"("/><A/><")"/></B>
<B><"["/><A/><"]"/></B>
<B><"{"/><A/><"}"/></B>
<B><"("/><")"/></B>
<B><"["/><"]"/></B>
<B><"{"/><"}"/></B>
</grammar>
END
}
parseWithGrammar($@)
Parse, using a compiled $grammar, an array of terminals and return a parse tree.
Parameter Description
1 $grammar Compiled grammar
2 @terminals Terminals to parse
Example:
if (1) {
my $grammar = compileGrammar(<<END);
A A + B
A B
B B * C
B C
C n
C ( A )
C [ A ]
C { A }
C ( )
C [ ]
C { }
END
ok printGrammar($grammar) eq <<END;
Rule Expandable Expansion
1 0 A A + B
2 1 A B
3 2 B B * n
4 3 B B * ( A )
5 4 B B * [ A ]
6 5 B B * { A }
7 6 B B * ( )
8 7 B B * [ ]
9 8 B B * { }
10 9 B n
11 10 B ( A )
12 11 B [ A ]
13 12 B { A }
14 13 B ( )
15 14 B [ ]
16 15 B { }
END
my $tree = 𝗽𝗮𝗿𝘀𝗲𝗪𝗶𝘁𝗵𝗚𝗿𝗮𝗺𝗺𝗮𝗿($grammar, qw/( [ { } ] + [ { n } ] ) * [ n + n ] /);
ok printParseTree($grammar, $tree) eq <<END;
Rule Expandable Terminal
1 1 A
2 4 B
3 10 B
4 (
5 0 A
6 1 A
7 11 B
8 [
9 1 A
10 15 B
11 {
12 }
13 ]
14 +
15 11 B
16 [
17 1 A
18 12 B
19 {
20 1 A
21 9 B
22 n
23 }
24 ]
25 )
26 *
27 [
28 0 A
29 1 A
30 9 B
31 n
32 +
33 9 B
34 n
35 ]
END
ok printParseTreeAsXml($grammar, $tree) eq <<END;
<A rule="1">
<B rule="4">
<B rule="10">
<"(" pos="0"/>
<A rule="0">
<A rule="1">
<B rule="11">
<"[" pos="1"/>
<A rule="1">
<B rule="15">
<"{" pos="2"/>
<"}" pos="3"/>
</B>
</A>
<"]" pos="4"/>
</B>
</A>
<"+" pos="5"/>
<B rule="11">
<"[" pos="6"/>
<A rule="1">
<B rule="12">
<"{" pos="7"/>
<A rule="1">
<B rule="9">
<n pos="8"/>
</B>
</A>
<"}" pos="9"/>
</B>
</A>
<"]" pos="10"/>
</B>
</A>
<")" pos="11"/>
</B>
<"*" pos="12"/>
<"[" pos="13"/>
<A rule="0">
<A rule="1">
<B rule="9">
<n pos="14"/>
</B>
</A>
<"+" pos="15"/>
<B rule="9">
<n pos="16"/>
</B>
</A>
<"]" pos="17"/>
</B>
</A>
END
ok printGrammarAsXml($grammar) eq <<END
<grammar>
<A><A/><"+"/><B/></A>
<A><B/></A>
<B><B/><"*"/><n/></B>
<B><B/><"*"/><"("/><A/><")"/></B>
<B><B/><"*"/><"["/><A/><"]"/></B>
<B><B/><"*"/><"{"/><A/><"}"/></B>
<B><B/><"*"/><"("/><")"/></B>
<B><B/><"*"/><"["/><"]"/></B>
<B><B/><"*"/><"{"/><"}"/></B>
<B><n/></B>
<B><"("/><A/><")"/></B>
<B><"["/><A/><"]"/></B>
<B><"{"/><A/><"}"/></B>
<B><"("/><")"/></B>
<B><"["/><"]"/></B>
<B><"{"/><"}"/></B>
</grammar>
END
}
printGrammar($)
Print a $grammar.
Parameter Description
1 $grammar Grammar
Example:
if (1) {
my $grammar = compileGrammar(<<END);
A A + B
A B
B B * C
B C
C n
C ( A )
C [ A ]
C { A }
C ( )
C [ ]
C { }
END
ok 𝗽𝗿𝗶𝗻𝘁𝗚𝗿𝗮𝗺𝗺𝗮𝗿($grammar) eq <<END;
Rule Expandable Expansion
1 0 A A + B
2 1 A B
3 2 B B * n
4 3 B B * ( A )
5 4 B B * [ A ]
6 5 B B * { A }
7 6 B B * ( )
8 7 B B * [ ]
9 8 B B * { }
10 9 B n
11 10 B ( A )
12 11 B [ A ]
13 12 B { A }
14 13 B ( )
15 14 B [ ]
16 15 B { }
END
my $tree = parseWithGrammar($grammar, qw/( [ { } ] + [ { n } ] ) * [ n + n ] /);
ok printParseTree($grammar, $tree) eq <<END;
Rule Expandable Terminal
1 1 A
2 4 B
3 10 B
4 (
5 0 A
6 1 A
7 11 B
8 [
9 1 A
10 15 B
11 {
12 }
13 ]
14 +
15 11 B
16 [
17 1 A
18 12 B
19 {
20 1 A
21 9 B
22 n
23 }
24 ]
25 )
26 *
27 [
28 0 A
29 1 A
30 9 B
31 n
32 +
33 9 B
34 n
35 ]
END
ok printParseTreeAsXml($grammar, $tree) eq <<END;
<A rule="1">
<B rule="4">
<B rule="10">
<"(" pos="0"/>
<A rule="0">
<A rule="1">
<B rule="11">
<"[" pos="1"/>
<A rule="1">
<B rule="15">
<"{" pos="2"/>
<"}" pos="3"/>
</B>
</A>
<"]" pos="4"/>
</B>
</A>
<"+" pos="5"/>
<B rule="11">
<"[" pos="6"/>
<A rule="1">
<B rule="12">
<"{" pos="7"/>
<A rule="1">
<B rule="9">
<n pos="8"/>
</B>
</A>
<"}" pos="9"/>
</B>
</A>
<"]" pos="10"/>
</B>
</A>
<")" pos="11"/>
</B>
<"*" pos="12"/>
<"[" pos="13"/>
<A rule="0">
<A rule="1">
<B rule="9">
<n pos="14"/>
</B>
</A>
<"+" pos="15"/>
<B rule="9">
<n pos="16"/>
</B>
</A>
<"]" pos="17"/>
</B>
</A>
END
ok printGrammarAsXml($grammar) eq <<END
<grammar>
<A><A/><"+"/><B/></A>
<A><B/></A>
<B><B/><"*"/><n/></B>
<B><B/><"*"/><"("/><A/><")"/></B>
<B><B/><"*"/><"["/><A/><"]"/></B>
<B><B/><"*"/><"{"/><A/><"}"/></B>
<B><B/><"*"/><"("/><")"/></B>
<B><B/><"*"/><"["/><"]"/></B>
<B><B/><"*"/><"{"/><"}"/></B>
<B><n/></B>
<B><"("/><A/><")"/></B>
<B><"["/><A/><"]"/></B>
<B><"{"/><A/><"}"/></B>
<B><"("/><")"/></B>
<B><"["/><"]"/></B>
<B><"{"/><"}"/></B>
</grammar>
END
}
printParseTree($$$)
Print a parse tree.
Parameter Description
1 $grammar Grammar
2 $tree Parse tree
3 $indent Optional indent level
Example:
if (1) {
my $grammar = compileGrammar(<<END);
A A + B
A B
B B * C
B C
C n
C ( A )
C [ A ]
C { A }
C ( )
C [ ]
C { }
END
ok printGrammar($grammar) eq <<END;
Rule Expandable Expansion
1 0 A A + B
2 1 A B
3 2 B B * n
4 3 B B * ( A )
5 4 B B * [ A ]
6 5 B B * { A }
7 6 B B * ( )
8 7 B B * [ ]
9 8 B B * { }
10 9 B n
11 10 B ( A )
12 11 B [ A ]
13 12 B { A }
14 13 B ( )
15 14 B [ ]
16 15 B { }
END
my $tree = parseWithGrammar($grammar, qw/( [ { } ] + [ { n } ] ) * [ n + n ] /);
ok 𝗽𝗿𝗶𝗻𝘁𝗣𝗮𝗿𝘀𝗲𝗧𝗿𝗲𝗲($grammar, $tree) eq <<END;
Rule Expandable Terminal
1 1 A
2 4 B
3 10 B
4 (
5 0 A
6 1 A
7 11 B
8 [
9 1 A
10 15 B
11 {
12 }
13 ]
14 +
15 11 B
16 [
17 1 A
18 12 B
19 {
20 1 A
21 9 B
22 n
23 }
24 ]
25 )
26 *
27 [
28 0 A
29 1 A
30 9 B
31 n
32 +
33 9 B
34 n
35 ]
END
ok printParseTreeAsXml($grammar, $tree) eq <<END;
<A rule="1">
<B rule="4">
<B rule="10">
<"(" pos="0"/>
<A rule="0">
<A rule="1">
<B rule="11">
<"[" pos="1"/>
<A rule="1">
<B rule="15">
<"{" pos="2"/>
<"}" pos="3"/>
</B>
</A>
<"]" pos="4"/>
</B>
</A>
<"+" pos="5"/>
<B rule="11">
<"[" pos="6"/>
<A rule="1">
<B rule="12">
<"{" pos="7"/>
<A rule="1">
<B rule="9">
<n pos="8"/>
</B>
</A>
<"}" pos="9"/>
</B>
</A>
<"]" pos="10"/>
</B>
</A>
<")" pos="11"/>
</B>
<"*" pos="12"/>
<"[" pos="13"/>
<A rule="0">
<A rule="1">
<B rule="9">
<n pos="14"/>
</B>
</A>
<"+" pos="15"/>
<B rule="9">
<n pos="16"/>
</B>
</A>
<"]" pos="17"/>
</B>
</A>
END
ok printGrammarAsXml($grammar) eq <<END
<grammar>
<A><A/><"+"/><B/></A>
<A><B/></A>
<B><B/><"*"/><n/></B>
<B><B/><"*"/><"("/><A/><")"/></B>
<B><B/><"*"/><"["/><A/><"]"/></B>
<B><B/><"*"/><"{"/><A/><"}"/></B>
<B><B/><"*"/><"("/><")"/></B>
<B><B/><"*"/><"["/><"]"/></B>
<B><B/><"*"/><"{"/><"}"/></B>
<B><n/></B>
<B><"("/><A/><")"/></B>
<B><"["/><A/><"]"/></B>
<B><"{"/><A/><"}"/></B>
<B><"("/><")"/></B>
<B><"["/><"]"/></B>
<B><"{"/><"}"/></B>
</grammar>
END
}
printParseTreeAsBrackets($$$)
Print a parse tree as XML.
Parameter Description
1 $grammar Grammar
2 $tree Parse tree
3 $indent Optional indent level
Example:
if (1) {
my $grammar = compileGrammar(<<END);
A A + B
A B
B B * C
B C
C n
C D C
D ++
D --
C C E
E **
E //
C ( A )
C [ A ]
C { A }
C ( )
C [ ]
C { }
C D n
END
my $tree = parseWithGrammar($grammar, qw{n * ( ++ -- n ** // + -- ++ n // ** )});
ok 𝗽𝗿𝗶𝗻𝘁𝗣𝗮𝗿𝘀𝗲𝗧𝗿𝗲𝗲𝗔𝘀𝗕𝗿𝗮𝗰𝗸𝗲𝘁𝘀($grammar, $tree) eq <<END;
A
B
B
C n
C
B "*"
C "("
A
A
B
C "++"
C
C
C "--" n
C "**"
C "//"
C
C
B
A "+"
B
C "--"
C
C
C "++" n
C "//"
C "**"
C
C
B
A ")"
C
B
A
END
}
printParseTreeAsXml($$$)
Print a parse tree as XML.
Parameter Description
1 $grammar Grammar
2 $tree Parse tree
3 $indent Optional indent level
Example:
if (1) {
my $grammar = compileGrammar(<<END);
A A + B
A B
B B * C
B C
C n
C ( A )
C [ A ]
C { A }
C ( )
C [ ]
C { }
END
ok printGrammar($grammar) eq <<END;
Rule Expandable Expansion
1 0 A A + B
2 1 A B
3 2 B B * n
4 3 B B * ( A )
5 4 B B * [ A ]
6 5 B B * { A }
7 6 B B * ( )
8 7 B B * [ ]
9 8 B B * { }
10 9 B n
11 10 B ( A )
12 11 B [ A ]
13 12 B { A }
14 13 B ( )
15 14 B [ ]
16 15 B { }
END
my $tree = parseWithGrammar($grammar, qw/( [ { } ] + [ { n } ] ) * [ n + n ] /);
ok printParseTree($grammar, $tree) eq <<END;
Rule Expandable Terminal
1 1 A
2 4 B
3 10 B
4 (
5 0 A
6 1 A
7 11 B
8 [
9 1 A
10 15 B
11 {
12 }
13 ]
14 +
15 11 B
16 [
17 1 A
18 12 B
19 {
20 1 A
21 9 B
22 n
23 }
24 ]
25 )
26 *
27 [
28 0 A
29 1 A
30 9 B
31 n
32 +
33 9 B
34 n
35 ]
END
ok 𝗽𝗿𝗶𝗻𝘁𝗣𝗮𝗿𝘀𝗲𝗧𝗿𝗲𝗲𝗔𝘀𝗫𝗺𝗹($grammar, $tree) eq <<END;
<A rule="1">
<B rule="4">
<B rule="10">
<"(" pos="0"/>
<A rule="0">
<A rule="1">
<B rule="11">
<"[" pos="1"/>
<A rule="1">
<B rule="15">
<"{" pos="2"/>
<"}" pos="3"/>
</B>
</A>
<"]" pos="4"/>
</B>
</A>
<"+" pos="5"/>
<B rule="11">
<"[" pos="6"/>
<A rule="1">
<B rule="12">
<"{" pos="7"/>
<A rule="1">
<B rule="9">
<n pos="8"/>
</B>
</A>
<"}" pos="9"/>
</B>
</A>
<"]" pos="10"/>
</B>
</A>
<")" pos="11"/>
</B>
<"*" pos="12"/>
<"[" pos="13"/>
<A rule="0">
<A rule="1">
<B rule="9">
<n pos="14"/>
</B>
</A>
<"+" pos="15"/>
<B rule="9">
<n pos="16"/>
</B>
</A>
<"]" pos="17"/>
</B>
</A>
END
ok printGrammarAsXml($grammar) eq <<END
<grammar>
<A><A/><"+"/><B/></A>
<A><B/></A>
<B><B/><"*"/><n/></B>
<B><B/><"*"/><"("/><A/><")"/></B>
<B><B/><"*"/><"["/><A/><"]"/></B>
<B><B/><"*"/><"{"/><A/><"}"/></B>
<B><B/><"*"/><"("/><")"/></B>
<B><B/><"*"/><"["/><"]"/></B>
<B><B/><"*"/><"{"/><"}"/></B>
<B><n/></B>
<B><"("/><A/><")"/></B>
<B><"["/><A/><"]"/></B>
<B><"{"/><A/><"}"/></B>
<B><"("/><")"/></B>
<B><"["/><"]"/></B>
<B><"{"/><"}"/></B>
</grammar>
END
}
Parser::LR::Grammar Definition
LR parser produced
Output fields
dfa - DFA from grammar
expandables - Expandable symbols
expansionStates - States we can expand in
finalState - Final state at end of parse
grammar - Grammar from which the NFA was derived
longestRule - Longest rule
nfa - NFA from grammar
optionalExpandables - Expandables that can expand to nothing
reducers - The expandables an expandable can reduce to
rules - Rules
startSymbol - Start symbol
startTerminals - Terminals that start rules
terminals - Terminal symbols
Parser::LR::Rule Definition
A parsing rule
Output fields
expandable - Symbol to expand
expansion - Symbol expansion
print - Rule printer
rule - Rule number
Private Methods
printRule($)
Print a rule
Parameter Description
1 $rule Rule
longestMatchingRule($@)
Find the longest rule that completely matches the top of the stack.
Parameter Description
1 $grammar Grammar
2 @stack Stack
partialMatch($@)
Check whether we have a partial match with the top of the stack.
Parameter Description
1 $grammar Grammar
2 @stack Stack
reduceStackWithRule($$$)
Reduce by the specified rule and update the stack and parse tree to match.
Parameter Description
1 $rule Rule
2 $stack Stack
3 $tree Parse tree
parseWithGrammarAndLog($@)
Parse, using a compiled $grammar, an array of terminals and return a log of parsing actions taken.
Parameter Description
1 $grammar Compiled grammar
2 @terminals Terminals to parse
printSymbolAsXml($)
Print a symbol in a form acceptable as Xml
Parameter Description
1 $symbol Symbol
printGrammarAsXml($$)
Print a $grammar as XML.
Parameter Description
1 $grammar Grammar
2 $indent Indentation level
printParseTreeAndGrammarAsXml($$)
Print a parse tree produced from a grammar by parseWithGrammarAndLog as XML.
Parameter Description
1 $tree Parse tree
2 $grammar Grammar
Index
1 compileGrammar - Compile a grammar from a set of rules expressed as a string with one rule per line.
2 longestMatchingRule - Find the longest rule that completely matches the top of the stack.
3 parseWithGrammar - Parse, using a compiled $grammar, an array of terminals and return a parse tree.
4 parseWithGrammarAndLog - Parse, using a compiled $grammar, an array of terminals and return a log of parsing actions taken.
5 partialMatch - Check whether we have a partial match with the top of the stack.
6 printGrammar - Print a $grammar.
7 printGrammarAsXml - Print a $grammar as XML.
8 printParseTree - Print a parse tree.
9 printParseTreeAndGrammarAsXml - Print a parse tree produced from a grammar by parseWithGrammarAndLog as XML.
10 printParseTreeAsBrackets - Print a parse tree as XML.
11 printParseTreeAsXml - Print a parse tree as XML.
12 printRule - Print a rule
13 printSymbolAsXml - Print a symbol in a form acceptable as Xml
14 reduceStackWithRule - Reduce by the specified rule and update the stack and parse tree to match.
Installation
This module is written in 100% Pure Perl and, thus, it is easy to read, comprehend, use, modify and install via cpan:
sudo cpan install Parser::LR
Author
Copyright
Copyright (c) 2016-2019 Philip R Brenan.
This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.