NAME
Parse::Eyapp::YATW - Tree transformation objects
SYNOPSIS
#!/usr/bin/perl -w
use strict;
use Rule6;
use Parse::Eyapp::YATW;
my %BinaryOperation = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
sub set_terminfo {
no warnings;
*TERMINAL::info = sub { $_[0]{attr} };
}
sub is_foldable {
my ($op, $left, $right);
return 0 unless defined($op = $BinaryOperation{ref($_[0])});
return 0 unless ($left = $_[0]->child(0), $left->isa('NUM'));
return 0 unless ($right = $_[0]->child(1), $right->isa('NUM'));
my $leftnum = $left->child(0)->{attr};
my $rightnum = $right->child(0)->{attr};
$left->child(0)->{attr} = eval "$leftnum $op $rightnum";
$_[0] = $left;
}
my $parser = new Rule6();
my $input = "2*3";
my $t = $parser->Run(\$input);
&set_terminfo;
print "\n***** Before ******\n";
print $t->str;
my $p = Parse::Eyapp::YATW->new(PATTERN => \&is_foldable);
$p->s($t);
print "\n***** After ******\n";
print $t->str."\n";
Introduction
Parse::Eyapp:YATW
objects implement tree transformations. They have two attributes PATTERN
and NAME
. PATTERN
is a reference to the code implementing the transformation. NAME
is the name of the transformation.
Though usually you build a transformation by means of Treeregexp programs you can directly invoke the method new
to build a tree transformation. A transformation object can be built from a function that conforms to the YATW tree transformation call protocol
For a subroutine pattern_sub
to work as a YATW tree transformation - as subroutine is_foldable
in the SYNOPSIS section - has to conform to the following call description:
pattern_sub(
$_[0], # Node being visited
$_[1], # Father of this node
$index, # Index of this node in @Father->children
$self, # The YATW pattern object
);
The pattern_sub
must return TRUE if matched and FALSE otherwise.
The function is_foldable
in the SYNOPSIS section holds the properties to be a YATW tree transformation
1 sub is_foldable {
2 my ($op, $left, $right);
3
4 return 0 unless defined($op = $BinaryOperation{ref($_[0])});
5 return 0 unless ($left = $_[0]->child(0), $left->isa('NUM'));
6 return 0 unless ($right = $_[0]->child(1), $right->isa('NUM'));
7
8 my $leftnum = $left->child(0)->{attr};
9 my $rightnum = $right->child(0)->{attr};
10 $left->child(0)->{attr} = eval "$leftnum $op $rightnum";
11 $_[0] = $left;
12 }
First, checks that the current node is one of PLUS
, MINUS
, TIMES
or DIV
(line 4). Then checks that both children are NUM
bers (lines 5 and 6). In such case proceeds to modify its left child with the result of operating both children (line 10). The matching tree is finally substituted by its left child (line 11).
This is the output of the program in the SYNOPSIS section:
pl@nereida:~/LEyapp/examples$ eyapp Rule6.yp; foldrule6.pl
***** Before ******
TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3]))
***** After ******
NUM(TERMINAL[6])
Follows the grammar description file in Rule6.yp
:
pl@nereida:~/LEyapp/examples$ cat -n Rule6.yp
1 %{
2 use Data::Dumper;
3 %}
4 %right '='
5 %left '-' '+'
6 %left '*' '/'
7 %left NEG
8 %tree
9
10 %%
11 line: exp { $_[1] }
12 ;
13
14 exp: %name NUM
15 NUM
16 | %name VAR
17 VAR
18 | %name ASSIGN
19 VAR '=' exp
20 | %name PLUS
21 exp '+' exp
22 | %name MINUS
23 exp '-' exp
24 | %name TIMES
25 exp '*' exp
26 | %name DIV
27 exp '/' exp
28 | %name UMINUS
29 '-' exp %prec NEG
30 | '(' exp ')' { $_[2] } /* Let us simplify a bit the tree */
31 ;
32
33 %%
34
35 use Tail2;
The module Tail2
in file examples/Tail2.pm
implements the lexical analyzer plus the error
and run
methods.
Parse::Eyapp:YATW Methods
Parse::Eyapp:YATW
objects represent tree transformations. They carry the information of what nodes match and how to modify them.
Parse::Eyapp::YATW->new
Builds a treeregexp transformation object. Though usually you build a transformation by means of Treeregexp programs you can directly invoke the method to build a tree transformation. A transformation object can be built from a function that conforms to the YATW tree transformation call protocol (see the section "The YATW Tree Transformation Call Protocol"). Follows an example (file examples/12ts_simplify_with_s.pl
):
nereida:~/src/perl/YappWithDefaultAction/examples> \
sed -ne '68,$p' 12ts_simplify_with_s.pl | cat -n
1 sub is_code {
2 my $self = shift; # tree
3
4 # After the shift $_[0] is the father, $_[1] the index
5 if ((ref($self) eq 'CODE')) {
6 splice(@{$_[0]->{children}}, $_[1], 1);
7 return 1;
8 }
9 return 0;
10 }
11
12 Parse::Eyapp->new_grammar(
13 input=>$translationscheme,
14 classname=>'Calc',
15 firstline =>7,
16 );
17 my $parser = Calc->new(); # Create the parser
18
19 $parser->YYData->{INPUT} = "2*-3\n"; print "2*-3\n"; # Set the input
20 my $t = $parser->Run; # Parse it
21 print $t->str."\n";
22 my $p = Parse::Eyapp::YATW->new(PATTERN => \&is_code);
23 $p->s($t);
24 { no warnings; # make attr info available only for this display
25 local *TERMINAL::info = sub { $_[0]{attr} };
26 print $t->str."\n";
27 }
After the Parse::Eyapp::YATW
object $p
is built at line 22 the call to method $p->s($t)
applies the transformation is_code
using a bottom-up traversing of the tree $t
. The achieved effect is the elimination of CODE
references in the translation scheme tree. When executed the former code produces:
nereida:~/src/perl/YappWithDefaultAction/examples> 12ts_simplify_with_s.pl
2*-3
EXP(TIMES(NUM(TERMINAL,CODE),TERMINAL,UMINUS(TERMINAL,NUM(TERMINAL,CODE),CODE),CODE),CODE)
EXP(TIMES(NUM(TERMINAL[2]),TERMINAL[*],UMINUS(TERMINAL[-],NUM(TERMINAL[3]))))
The file foldrule6.pl
in the examples/
distribution directory gives you another example:
nereida:~/src/perl/YappWithDefaultAction/examples> cat -n foldrule6.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Rule6;
4 use Parse::Eyapp::YATW;
5
6 my %BinaryOperation = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
7
8 sub set_terminfo {
9 no warnings;
10 *TERMINAL::info = sub { $_[0]{attr} };
11 }
12 sub is_foldable {
13 my ($op, $left, $right);
14 return 0 unless defined($op = $BinaryOperation{ref($_[0])});
15 return 0 unless ($left = $_[0]->child(0), $left->isa('NUM'));
16 return 0 unless ($right = $_[0]->child(1), $right->isa('NUM'));
17
18 my $leftnum = $left->child(0)->{attr};
19 my $rightnum = $right->child(0)->{attr};
20 $left->child(0)->{attr} = eval "$leftnum $op $rightnum";
21 $_[0] = $left;
22 }
23
24 my $parser = new Rule6();
25 $parser->YYData->{INPUT} = "2*3";
26 my $t = $parser->Run;
27 &set_terminfo;
28 print "\n***** Before ******\n";
29 print $t->str;
30 my $p = Parse::Eyapp::YATW->new(PATTERN => \&is_foldable);
31 $p->s($t);
32 print "\n***** After ******\n";
33 print $t->str."\n";
when executed produces:
nereida:~/src/perl/YappWithDefaultAction/examples> foldrule6.pl
***** Before ******
TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3]))
***** After ******
NUM(TERMINAL[6])
The YATW Tree Transformation Call Protocol
For a subroutine pattern_sub
to work as a YATW tree transformation - as subroutines is_foldable
and is_code
above - has to conform to the following call description:
pattern_sub(
$_[0], # Node being visited
$_[1], # Father of this node
$index, # Index of this node in @Father->children
$self, # The YATW pattern object
);
The pattern_sub
must return TRUE if matched and FALSE otherwise.
The protocol may change in the near future. Avoid using other information than the fact that the first argument is the node being visited.
Parse::Eyapp::YATW->buildpatterns
Works as Parse::Eyapp->new but receives an array of subs conforming to the YATW Tree Transformation Call Protocol.
our @all = Parse::Eyapp::YATW->buildpatt(\&delete_code, \&delete_tokens);
$yatw->delete
The root of the tree that is currently matched by the YATW transformation $yatw
will be deleted from the tree as soon as is safe. That usually means when the processing of their siblings is finished. The following example (taken from file examples/13ts_simplify_with_delete.pl
in the Parse::Eyapp distribution) illustrates how to eliminate CODE and syntactic terminals from the syntax tree:
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ \
sed -ne '62,$p' 13ts_simplify_with_delete.pl | cat -n
1 sub not_useful {
2 my $self = shift; # node
3 my $pat = $_[2]; # get the YATW object
4
5 (ref($self) eq 'CODE') or ((ref($self) eq 'TERMINAL') and ($self->{token} eq $self->{attr}))
6 or do { return 0 };
7 $pat->delete();
8 return 1;
9 }
10
11 Parse::Eyapp->new_grammar(
12 input=>$translationscheme,
13 classname=>'Calc',
14 firstline =>7,
15 );
16 my $parser = Calc->new(); # Create the parser
17
18 $parser->YYData->{INPUT} = "2*3\n"; print $parser->YYData->{INPUT};
19 my $t = $parser->Run; # Parse it
20 print $t->str."\n"; # Show the tree
21 my $p = Parse::Eyapp::YATW->new(PATTERN => \¬_useful);
22 $p->s($t); # Delete nodes
23 print $t->str."\n"; # Show the tree
when executed we get the following output:
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ 13ts_simplify_with_delete.pl
2*3
EXP(TIMES(NUM(TERMINAL[2],CODE),TERMINAL[*],NUM(TERMINAL[3],CODE),CODE))
EXP(TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3])))
$yatw->unshift
Tha call $yatw->unshift($b)
safely unshifts (inserts at the beginning) the node $b
in the list of its siblings of the node that matched (i.e in the list of siblings of $_[0]
). The following example shows a YATW transformation insert_child
that illustrates the use of unshift
(file examples/26delete_with_trreereg.pl
):
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ \
sed -ne '70,$p' 26delete_with_trreereg.pl | cat -n
1 my $transform = Parse::Eyapp::Treeregexp->new( STRING => q{
2
3 delete_code : CODE => { $delete_code->delete() }
4
5 {
6 sub not_semantic {
7 my $self = shift;
8 return 1 if ((ref($self) eq 'TERMINAL') and ($self->{token} eq $self->{attr}));
9 return 0;
10 }
11 }
12
13 delete_tokens : TERMINAL and { not_semantic($TERMINAL) } => {
14 $delete_tokens->delete();
15 }
16
17 insert_child : TIMES(NUM(TERMINAL), NUM(TERMINAL)) => {
18 my $b = Parse::Eyapp::Node->new( 'UMINUS(TERMINAL)',
19 sub { $_[1]->{attr} = '4.5' }); # The new node will be a sibling of TIMES
20
21 $insert_child->unshift($b);
22 }
23 },
24 )->generate();
25
26 Parse::Eyapp->new_grammar(
27 input=>$translationscheme,
28 classname=>'Calc',
29 firstline =>7,
30 );
31 my $parser = Calc->new(); # Create the parser
32
33 $parser->YYData->{INPUT} = "2*3\n"; print $parser->YYData->{INPUT}; # Set the input
34 my $t = $parser->Run; # Parse it
35 print $t->str."\n"; # Show the tree
36 # Get the AST
37 our ($delete_tokens, $delete_code);
38 $t->s($delete_tokens, $delete_code);
39 print $t->str."\n"; # Show the tree
40 our $insert_child;
41 $insert_child->s($t);
42 print $t->str."\n"; # Show the tree
When is executed the program produces the following output:
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ 26delete_with_trreereg.pl
2*3
EXP(TIMES(NUM(TERMINAL[2],CODE),TERMINAL[*],NUM(TERMINAL[3],CODE),CODE))
EXP(TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3])))
EXP(UMINUS(TERMINAL[4.5]),TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3])))
Don't try to take advantage that the transformation sub receives in $_[1]
a reference to the father (see the section "The YATW Tree Transformation Call Protocol") and do something like:
unshift $_[1]->{children}, $b
it is unsafe.
$yatw->insert_before
A call to $yatw->insert_before($node)
safely inserts $node
in the list of siblings of $_[0]
just before $_[0]
(i.e. the ndoe that matched with $yatw
). The following example (file t/33moveinvariantoutofloop.t
) illustrates its use:
my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
moveinvariant: WHILE(VAR($b), BLOCK(@a, ASSIGN($x, $e), @c))
and { is_invariant($ASSIGN, $WHILE) } => {
my $assign = $ASSIGN;
$BLOCK->delete($ASSIGN);
$moveinvariant->insert_before($assign);
}
},
);
Here the ASSIGN($x, $e)
subtree - if is loop invariant - will be moved to the list of siblings of $WHILE
just before the $WHILE
.
Tree Matching and Tree Substitution
Matching Trees
Both the transformation objects in Parse::Eyapp::YATW
and the nodes in Parse::Eyapp::Node
have a method named m
for matching. For a Parse::Eyapp::YATW
object, the method -when called in a list context- returns a list of Parse::Eyapp::Node::Match
nodes.
@R = $t->m($yatw1, $yatw2, $yatw3, ...)
A Parse::Eyapp::Node::Match
object describes the nodes of the actual tree that have matched. The nodes in the returned list are organized in a hierarchy. They appear in the list sorted according to a depth-first visit of the actual tree $t
. In a scalar context m
returns the first element of the list.
Let us denote by $t
the actual tree being searched and $r
one of the Parse::Eyapp::Node::Match
nodes in the resulting forest @R
. Then we have the following methods:
The method
$r->node
return the node$t
of the actual tree that matchedThe method
$r->father
returns the father of$r
in the matching forest. The father of$r
is defined by this property:$r->father->node
is the nearest ancestor of$r->node
that matched with the treeregexp pattern. That is, there is no ancestor that matched between$r->node
and$r->father->node
. Otherwise$r->father
isundef
The method
$r->coord
returns the coordinates of$r->node
relative to$t
. For example, the coordinate".1.3.2"
denotes the node$t->child(1)->child(3)->child(2)
, where$t
is the root of the search.The method
$r->depth
returns the depth of$r->node
in$t
.When
m
was called as aParse::Eyapp::Node
method, i. e. with potentially more than oneYATW
treeregexp, the method$r->names
returns the array of names of the transformations that matched with$r->node
.
The following example illustrates a use of m
as a Parse::Eyapp:YATW
method. It solves a problem of scope analysis in a C compiler: matching each RETURN
statement with the function that surrounds it. The parsing was already done, the AST was built and left in $t
. The treeregexp used is:
retscope: /FUNCTION|RETURN/
and the code that solves the problem is:
# Associate each "return exp" with its "function"
my @returns = $retscope->m($t);
for (@returns) {
my $node = $_->node;
if (ref($node) eq 'RETURN') {
my $function = $_->father->node;
$node->{function} = $function;
$node->{t} = $function->{t};
}
}
The first line gets a list of Parse::Eyapp::Node::Match
nodes describing the actual nodes that matched /FUNCTION|RETURN/
. If the node described by $_
is a 'RETURN'
node, the expresion $_->father->node
must necessarily point to the function node that encloses it.
The second example shows the use of m
as a Parse::Eyapp::Node
method.
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ cat -n m2.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Rule6;
4 use Parse::Eyapp::Treeregexp;
5
6 Parse::Eyapp::Treeregexp->new( STRING => q{
7 fold: /times|plus|div|minus/i:bin(NUM($n), NUM($m))
8 zxw: TIMES(NUM($x), .) and { $x->{attr} == 0 }
9 wxz: TIMES(., NUM($x)) and { $x->{attr} == 0 }
10 })->generate();
11
12 # Syntax analysis
13 my $parser = new Rule6();
14 $parser->YYData->{INPUT} = "0*0*0";
15 my $t = $parser->Run;
16 print "Tree:",$t->str,"\n";
17
18 # Search
19 my $m = $t->m(our ($fold, $zxw, $wxz));
20 print "Match Node:\n",$m->str,"\n";
When executed with input 0*0*0
the program generates this output:
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ m2.pl
Tree:TIMES(TIMES(NUM(TERMINAL),NUM(TERMINAL)),NUM(TERMINAL))
Match Node:
Match[[TIMES:0:wxz]](Match[[TIMES:1:fold,zxw,wxz]])
The representation of Match
nodes by str
deserves a comment. Match
nodes have their own info
method. It returns a string containing the concatenation of the class of $r->node
(i.e. the actual node that matched), the depth ($r->depth
) and the names of the transformations that matched (as provided by the method $r->names
)
The SEVERITY
option of Parse::Eyapp::Treeregexp::new
The SEVERITY
option of Parse::Eyapp::Treeregexp::new
controls the way matching succeeds regarding the number of children. To illustrate its use let us consider the following example. The grammar used Rule6.yp
is similar to the one in the SYNOPSIS example.
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ cat -n numchildren.pl
1 #!/usr/bin/perl -w
2 use strict;
3 use Rule6;
4 use Parse::Eyapp::Treeregexp;
5
6 sub TERMINAL::info { $_[0]{attr} }
7
8 my $severity = shift || 0;
9 my $parser = new Rule6();
10 $parser->YYData->{INPUT} = shift || '0*2';
11 my $t = $parser->Run;
12
13 my $transform = Parse::Eyapp::Treeregexp->new(
14 STRING => q{
15 zero_times_whatever: TIMES(NUM($x)) and { $x->{attr} == 0 } => { $_[0] = $NUM }
16 },
17 SEVERITY => $severity,
18 FIRSTLINE => 14,
19 )->generate;
20
21 $t->s(our @all);
22
23 print $t->str,"\n";
The program gets the severity level from the command line (line 9). The specification of the term TIMES(NUM($x))
inside the transformation zero_times_whatever
does not clearly state that TIMES
must have two children. There are several interpretations of the treregexp depending on the level fixed for SEVERITY
:
0:
TIMES
must have at least one child. Don't care if it has more.1:
TIMES
must have exactly one child.2:
TIMES
must have exactly one child. When visit aTIMES
node with a different number of children issue a warning.3:
TIMES
must have exactly one child. When visit aTIMES
node with a different number of children issue an error.
Observe the change in behavior according to the level of SEVERITY
:
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ numchildren.pl 0 '0*2'
NUM(TERMINAL[0])
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ numchildren.pl 1 '0*2'
TIMES(NUM(TERMINAL[0]),NUM(TERMINAL[2]))
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ numchildren.pl 2 '0*2'
Warning! found node TIMES with 2 children.
Expected 1 children (see line 15 of ./numchildren.pl)"
TIMES(NUM(TERMINAL[0]),NUM(TERMINAL[2]))
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ numchildren.pl 3 '0*2'
Error! found node TIMES with 2 children.
Expected 1 children (see line 15 of ./numchildren.pl)"
at (eval 2) line 29
Tree Substitution: The s
methods
Both Parse::Eyapp:Node
and Parse::Eyapp::YATW
objects (i.e. nodes and tree transformations) are provided with a s
method.
In the case of a Parse::Eyapp::YATW
object the method s
applies the tree transformation using a single bottom-up traversing: the transformation is recursively applied to the children and then to the current node.
For Parse::Eyapp:Node
nodes the set of transformations is applied to each node until no transformation matches any more. The example in the "SYNOPSIS" section illustrates the use:
1 # Let us transform the tree. Define the tree-regular expressions ..
2 my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
3 { # Example of support code
4 my %Op = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
5 }
6 constantfold: /TIMES|PLUS|DIV|MINUS/:bin(NUM($x), NUM($y))
7 => {
8 my $op = $Op{ref($_[0])};
9 $x->{attr} = eval "$x->{attr} $op $y->{attr}";
10 $_[0] = $NUM[0];
11 }
12 uminus: UMINUS(NUM($x)) => { $x->{attr} = -$x->{attr}; $_[0] = $NUM }
13 zero_times_whatever: TIMES(NUM($x), .) and { $x->{attr} == 0 } => { $_[0] = $NUM }
14 whatever_times_zero: TIMES(., NUM($x)) and { $x->{attr} == 0 } => { $_[0] = $NUM }
15 },
16 OUTPUTFILE=> 'main.pm'
17 );
18 $p->generate(); # Create the tranformations
19
20 $t->s($uminus); # Transform UMINUS nodes
21 $t->s(@all); # constant folding and mult. by zero
The call at line 20 can be substituted by $uminus->s($t)
without changes.
AUTHOR
Casiano Rodriguez-Leon (casiano@ull.es)
ACKNOWLEDGMENTS
This work has been supported by CEE (FEDER) and the Spanish Ministry of Educación y Ciencia through Plan Nacional I+D+I number TIN2005-08818-C04-04 (ULL::OPLINK project http://www.oplink.ull.es/). Support from Gobierno de Canarias was through GC02210601 (Grupos Consolidados). The University of La Laguna has also supported my work in many ways and for many years.
A large percentage of code is verbatim taken from Parse::Yapp 1.05. The author of Parse::Yapp is Francois Desarmenien.
I wish to thank Francois Desarmenien for his Parse::Yapp
module, to my students at La Laguna and to the Perl Community. Special thanks to my family and Larry Wall.
LICENCE AND COPYRIGHT
Copyright (c) 2006-2007 Casiano Rodriguez-Leon (casiano@ull.es). All rights reserved.
Parse::Yapp copyright is of Francois Desarmenien, all rights reserved. 1998-2001
These modules are free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
1 POD Error
The following errors were encountered while parsing the POD:
- Around line 675:
Non-ASCII character seen before =encoding in 'I<Educación'. Assuming UTF-8