NAME

Parse::Eyapp - Extensions for Parse::Yapp

VERSION

1.069577

SYNOPSIS

use strict;
use Parse::Eyapp;
use Parse::Eyapp::Treeregexp;

sub TERMINAL::info {
  $_[0]{attr}
}

my $grammar = q{
  %right  '='     # Lowest precedence
  %left   '-' '+' # + and - have more precedence than = Disambiguate a-b-c as (a-b)-c
  %left   '*' '/' # * and / have more precedence than + Disambiguate a/b/c as (a/b)/c
  %left   NEG     # Disambiguate -a-b as (-a)-b and not as -(a-b)
  %tree           # Let us build an abstract syntax tree ...

  %%
  line: exp <%name EXPRESION_LIST + ';'>  { $_[1] } /* list of expressions separated by ';' */
  ;

  /* The %name directive defines the name of the class to which the node being built belongs */
  exp:
      %name NUM  NUM            | %name VAR   VAR         | %name ASSIGN VAR '=' exp
    | %name PLUS exp '+' exp    | %name MINUS exp '-' exp | %name TIMES  exp '*' exp
    | %name DIV     exp '/' exp | %name UMINUS '-' exp %prec NEG
    |   '(' exp ')'  { $_[2] }  /* Let us simplify a bit the tree */
  ;

  %%
  sub _Error { die "Syntax error near ".($_[0]->YYCurval?$_[0]->YYCurval:"end of file")."\n" }

  sub _Lexer {
    my($parser)=shift; # The parser object

    for ($parser->YYData->{INPUT}) { # Topicalize
      m{\G\s+}gc;
      $_ eq '' and return('',undef);
      m{\G([0-9]+(?:\.[0-9]+)?)}gc and return('NUM',$1);
      m{\G([A-Za-z][A-Za-z0-9_]*)}gc and return('VAR',$1);
      m{\G(.)}gcs and return($1,$1);
    }
  }

  sub Run {
      my($self)=shift;
      $self->YYParse( yylex => \&_Lexer, yyerror => \&_Error, );
  }
}; # end grammar

our (@all, $uminus);

Parse::Eyapp->new_grammar( # Create the parser package/class
  input=>$grammar,
  classname=>'Calc', # The name of the package containing the parser
  firstline=>7       # String $grammar starts at line 7 (for error diagnostics)
);
my $parser = Calc->new();                # Create a parser
$parser->YYData->{INPUT} = "2*-3+b*0;--2\n"; # Set the input
my $t = $parser->Run;                    # Parse it!
local $Parse::Eyapp::Node::INDENT=2;
print "Syntax Tree:",$t->str;

# Let us transform the tree. Define the tree-regular expressions ..
my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
    { #  Example of support code
      my %Op = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
    }
    constantfold: /TIMES|PLUS|DIV|MINUS/:bin(NUM($x), NUM($y))
      => {
        my $op = $Op{ref($bin)};
        $x->{attr} = eval  "$x->{attr} $op $y->{attr}";
        $_[0] = $NUM[0];
      }
    uminus: UMINUS(NUM($x)) => { $x->{attr} = -$x->{attr}; $_[0] = $NUM }
    zero_times_whatever: TIMES(NUM($x), .) and { $x->{attr} == 0 } => { $_[0] = $NUM }
    whatever_times_zero: TIMES(., NUM($x)) and { $x->{attr} == 0 } => { $_[0] = $NUM }
  },
  OUTPUTFILE=> 'main.pm'
);
$p->generate(); # Create the tranformations

$t->s($uminus); # Transform UMINUS nodes
$t->s(@all);    # constant folding and mult. by zero

local $Parse::Eyapp::Node::INDENT=0;
print "\nSyntax Tree after transformations:\n",$t->str,"\n";

 

Introduction

Parse::Eyapp (Extended yapp) is a collection of modules that extends Francois Desarmenien Parse::Yapp 1.05. Eyapp extends yacc/yapp syntax with functionalities like named attributes, EBNF-like expressions, modifiable default action, automatic syntax tree building, semi-automatic abstract syntax tree building, translation schemes, tree regular expressions, tree transformations, scope analysis support, directed acyclic graphs and a few more.

This document is divided in three parts:

The Eyapp Language

Eyapp Grammar

This section describes the syntax of the Eyapp language using its own notation. The grammar extends yacc and yapp grammars. Semicolons have been omitted to save space. Between C-like comments you can find an (informal) explanation of the language associated with the token.

eyapp: head body tail ;
symbol: LITERAL  /* A string literal like 'hello' */
    |   ident   
ident:  IDENT  /* IDENT is [A-Za-z_][A-Za-z0-9_]* */ 
head: headsec '%%'
headsec:  decl *
decl:  '\n'      
    |   SEMANTIC typedecl symlist '\n'  /* SEMANTIC  is %semantic\s+token      */
    |   SYNTACTIC typedecl symlist '\n' /* SYNTACTIC is %syntactic\s+token     */
    |   TOKEN typedecl symlist '\n'     /* TOKEN     is %token                 */
    |   ASSOC typedecl symlist '\n'     /* ASSOC     is %(left|right|nonassoc) */
    |   START ident '\n'                /* START     is %start                 */
    |   HEADCODE '\n'                   /* HEADCODE  is %{ Perl code ... %}    */
    |   UNION CODE '\n'                 /* UNION CODE  see yacc/bison          */
    |   DEFAULTACTION CODE '\n'         /* DEFAULTACTION is %defaultaction     */
    |   TREE treeclauses? '\n'          /* TREE      is %tree                  */
    |   METATREE '\n'                   /* METATREE  is %metatree              */
    |   TYPE typedecl identlist '\n'    /* TYPE      is %type                  */
    |   EXPECT NUMBER '\n'              /* EXPECT    is %expect                */
                                        /* NUMBER    is \d+                    */
typedecl:   /* empty */
    |       '<' IDENT '>'
treeclauses: BYPASS ALIAS? | ALIAS BYPASS?
symlist:    symbol + 
identlist:  ident +
body: rules * '%%'
rules: IDENT ':' rhss ';'  
rhss: rule <+ '|'>  
rule:   optname rhs (prec epscode)?
rhs:  rhseltwithid *
rhseltwithid : 
      rhselt '.' IDENT 
    | '$' rhselt  
    | rhselt
rhselt:     symbol    
    | code    
    | '(' optname rhs ')' 
    | rhselt STAR               /* STAR   is (%name\s*([A-Za-z_]\w*)\s*)?\*  */
    | rhselt '<' STAR symbol '>' 
    | rhselt OPTION             /* OPTION is (%name\s*([A-Za-z_]\w*)\s*)?\?  */
    | rhselt '<' PLUS symbol '>'
    | rhselt PLUS               /* PLUS   is (%name\s*([A-Za-z_]\w*)\s*)?\+  */
optname: (NAME IDENT)?          /* NAME is %name */
       | NOBYPASS IDENT         /* NOBYPASS is %no\s+bypass */
prec: PREC symbol               /* PREC is %prec */
epscode:  code ?   
code:   
    CODE           /* CODE     is { Perl code ... }         */
  | BEGINCODE      /* BEGINCODE is %begin { Perl code ... } */
tail:  TAILCODE ?  /* TAILCODE is { Perl code ... } */

The semantic of Eyapp agrees with the semantic of yacc and yapp for all the common constructions.

Comments

Comments are either Perl style, from # up to the end of line, or C style, enclosed between /* and */.

Syntactic Variables, Symbolic Tokens and String Literals

Two kind of symbols may appear inside a Parse::Eyapp program: Non-terminal symbols or syntactic variables, called also left-hand-side symbols and Terminal symbols, called also Tokens.

Tokens are the symbols the lexical analyzer function returns to the parser. There are two kinds: symbolic tokens and string literals.

Syntactic variables and symbolic tokens identifiers must conform to the regular expression [A-Za-z][A-Za-z0-9_]*.

When building the syntax tree (i.e. when running under the %tree directive) symbolic tokens will be considered semantic tokens (see section "Syntactic and Semantic tokens").

String literals are enclosed in single quotes and can contain almost anything. They will be received by the parser as double-quoted strings. Any special character as '"', '$' and '@' is escaped. To have a single quote inside a literal, escape it with '\'.

When building the syntax tree (i.e. when running under the %tree directive) string literals will be considered syntactic tokens (see section "Syntactic and Semantic tokens").

The Head Section

An Eyapp program has three parts:

eyapp: head body tail ;

Each part is separated by the symbol %%:

head: headsec '%%'

The head section contains a list of declarations

headsec:  decl *

There are different kinds of declarations.

This reference does not fully describes all the declarations that are shared with yacc and yapp.

Example of Head Section

In this and the next sections we will describe the basics of the Eyapp language using the file examples/Calc.eyp that accompanies this distribution. This file implements a trivial calculator. Here is the header section:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '1,11p' Calc.eyp | cat -n
 1  # examples/Calc.eyp
 2  %right  '='
 3  %left   '-' '+'
 4  %left   '*' '/'
 5  %left   NEG
 6  %right  '^'
 7  %{
 8  my %s; # symbol table
 9  %}
10
11  %%

Declarations and Precedence

Lines 2-5 declare several tokens. The usual way to declare tokens is through the %token directive. The declarations %nonassoc, %left and %right not only declare the tokens but also associate a priority with them. Tokens declared in the same line have the same precedence. Tokens declared with these directives in lines below have more precedence than those declared above. Thus, in the example above we are saying that "+" and "-" have the same precedence but higher precedence than "\=". The final effect of "-" having greater precedence than "\=" will be that an expression like:

a = 4 - 5

will be interpreted as

a = (4 - 5)

and not as

(a = 4) - 5

The use of the %left indicates that - in case of ambiguity and a match between precedences - the parser must build the tree corresponding to a left parenthesization. Thus, the expression

4 - 5 - 9

will be interpreted as

(4 - 5) - 9

Header Code

Perl code surrounded by %{ and %} can be inserted in this section. Such code will be inserted in the module generated by eyapp near the beginning. Therefore, declarations like the one of the calculator symbol table %s

7  %{
8  my %s; # symbol table
9  %}

will be visible from almost any point in the file.

The Start Symbol of the Grammar

%start IDENT declares IDENT as the start symbol of the grammar. When %start is not used, the first rule in the body section will be used.

Expect

The %expect #NUMBER directive works as in bison and suppress warnings when the number of Shift/Reduce conflicts is exactly #NUMBER. See section "Solving Ambiguities and Conflicts" to know more about Shift/Reduce conflicts.

Type and Union

C oriented declarations like %type and %union are parsed but ignored.

The Body

The body section contains the rules describing the grammar:

body:   rules * '%%'
rules:  IDENT ':' rhss ';'  
rhss:   (optname rhs (prec epscode)?) <+ '|'>  

Rules

A rule is made of a left-hand-side symbol (the syntactic variable), followed by a ':' and one or more right-hand-sides (or productions) separated by '|' and terminated by a ';' like in:

exp: 
     exp '+' exp
  |  exp '-' exp
;

A production (right hand side) may be empty:

input:   
     /* empty */
  |  input line
;

The former two productions can be abbreviated as

input: 
     line *
;

The operators *, + and ? are presented in section "Lists and Optionals".

A syntactic variable cannot appear more than once as a rule name (This differs from yacc).

Semantic Values and Semantic Actions

A production A: X_1 X_2 ... X_n can be followed by a semantic action. Such semantic action is nothing but Perl code that will be treated as an anonymous subroutine.

The Eyapp parser builds the syntax tree using a left-right bottom-up traverse of the syntax tree. Each times the Parser visits a production A: X_1 X_2 ... X_n to build new nodes the associated semantic action is called. The process of visiting the handle like subtree of the syntax tree

A 
|
|--X_1
|--X_2
.
.
.
`--X_n

and calling the associated semantic action is known - in LR terminology - as a reduction by the production A: X_1 X_2 ... X_n.

Asociated with each symbol of a Parse::Eyapp grammar there is a Semantic Value or Attribute. The semantic values of terminals are provided by the lexical analyzer. In the calculator example, the semantic value associated with an expression is its numeric value:

                      exp '+' exp { $_[1] + $_[3] }

When the semantic action/anonymous subroutine is called, the arguments are as follows:

  • $_[1] to $_[n] are the parameters just as $1 to $n in yacc,

  • $_[0] is the parser object itself. Having $_[0] beeing the parser object itself allows you to call parser methods. Actually yacc macros have been converted into parser methods. See section "Methods Available in the Generated Class".

The returned value will be the attribute associated with the left hand side of the production.

Names can be given to the attributes using the dot notation:

exp.left '+' exp.right { $left + $right }

See section "Names for attributes" for more details about the dot and dollar notations.

If no action is specified the default action

{ $_[1] }

will be called instead. See section "Default actions" to know more.

Actions in Mid-Rule

Actions can be inserted in the middle of a production like in:

block: '{'.bracket { $ids->begin_scope(); } declaration*.decs statement*.sts '}' { ... }

A middle production action is managed by inserting a new rule in the grammar and associating the semantic action with it:

Temp: /* empty */ { $ids->begin_scope(); }

Middle production actions can refer to the attributes on its left. They count as one of the components of the production. Thus the program:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '1,4p' intermediateaction2.yp
%%
S:  'a' { $_[1]x4 }.mid 'a' { print "$_[2], $mid, $_[3]\n"; }
;
%%

The auxiliar syntactica variables are named @#position-#order where #position is the position of the action in the rhs and order is an ordinal number. See the .output file for the former example:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ eyapp -v intermediateaction2.yp
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '1,5p' intermediateaction2.output
Rules:
------
0:      $start -> S $end
1:      S -> 'a' @1-1 'a'
2:      @1-1 -> /* empty */

when given input aa the execution will produce as output aaaa, aaaa, a.

Example of Body Section

Following with the calculator example, the body is:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '12,48p' Calc.eyp | cat -n
 1  start:
 2      input { \%s }
 3  ;
 4
 5  input: line *
 6  ;
 7
 8  line:
 9    '\n'         { undef }
10    | exp '\n'   { print "$_[1]\n" if defined($_[1]); $_[1] }
11    | error  '\n'
12        {
13          $_[0]->YYErrok;
14          undef
15        }
16  ;
17
18  exp:
19      NUM
20    | $VAR                   { $s{$VAR} }
21    | $VAR '=' $exp          { $s{$VAR} = $exp }
22    | exp.left '+' exp.right { $left + $right }
23    | exp.left '-' exp.right { $left - $right }
24    | exp.left '*' exp.right { $left * $right }
25    | exp.left '/' exp.right
26      {
27         $_[3] and return($_[1] / $_[3]);
28         $_[0]->YYData->{ERRMSG} = "Illegal division by zero.\n";
29         $_[0]->YYError; # Pretend that a syntactic error ocurred: _Error will be called
30         undef
31      }
32    | '-' $exp %prec NEG     { -$exp }
33    | exp.left '^' exp.right { $left ** $right }
34    | '(' $exp ')'           { $exp }
35  ;
36
37  %%

This example does not uses any of the Eyapp extensions (with the exception of the star list at line 5) and the dot and dollar notations. Please, see the Parse::Yapp pages and elsewhere documentation on yacc and bison for more information.

Solving Ambiguities and Conflicts

When Eyapp analizes a grammar like:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ cat -n ambiguities.eyp
    1  %%
    2  exp:
    3      NUM
    4    | exp '-' exp
    5  ;
    6  %%

it will produce a warning announcing the existence of shift-reduce conflicts:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ eyapp ambiguities.eyp
1 shift/reduce conflict (see .output file)
State 5: reduce by rule 2: exp -> exp '-' exp (default action)
State 5: shifts:
  to state    3 with '-'
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ ls -ltr | tail -1
-rw-rw----  1 pl users   1082 2007-02-06 08:26 ambiguities.output

when eyapp finds warnings automatically produces a .output file describing the conflict.

What the warning is saying is that an expression like exp '-' exp (rule 2) followed by a minus '-' can be worked in more than one way. If we have an input like NUM - NUM - NUM the activity of a LALR(1) parser (the family of parsers to which Eyapp belongs) consists of a sequence of shift and reduce actions. A shift action has as consequence the reading of the next token. A reduce action is finding a production rule that matches and substituting the rhs of the production by the lhs. For input NUM - NUM - NUM the activity will be as follows (the dot is used to indicate where the next input token is):

.NUM - NUM - NUM # shift
 NUM.- NUM - NUM # reduce exp: NUM 
 exp.- NUM - NUM # shift
 exp -.NUM - NUM # shift
 exp - NUM.- NUM # reduce exp: NUM
 exp - exp.- NUM # shift/reduce conflict

up this point two different decisions can be taken: the next description can be

exp.- NUM # reduce by exp: exp '-' exp (rule 2)

or:

exp - exp -.NUM # shift '-' (to state 3)

that is why it is called a shift-reduce conflict.

That is also the reason for the precedence declarations in the head section. Another kind of conflicts are reduce-reduce conflicts. They arise when more that rhs can be applied for a reduction action.

Eyapp solves the conflicts applying the following rules:

  • In a shift/reduce conflict, the default is the shift.

  • In a reduce/reduce conflict, the default is to reduce by the earlier grammar production (in the input sequence).

  • The precedences and associativities are associated with tokens in the declarations section. This is made by a sequence of lines beginning with one of the directives: %left, %right, or %nonassoc, followed by a list of tokens. All the tokens on the same line have the same precedence and associativity; the lines are listed in order of increasing precedence.

  • A precedence and associativity is associated with each grammar production; it is the precedence and associativity of the last token or literal in the right hand side of the production.

  • The %prec directive can be used when a rhs is involved in a conflict and has no tokens inside or it has but the precedence of the last token leads to an incorrect interpretation. A rhs can be followed by an optional %prec token directive giving the production the precedence of the token

    exp:   '-' exp %prec NEG { -$_[1] }
  • If there is a shift/reduce conflict, and both the grammar production and the input character have precedence and associativity associated with them, then the conflict is solved in favor of the action (shift or reduce) associated with the higher precedence. If the precedences are the same, then the associativity is used; left associative implies reduce, right associative implies shift, and nonassociating implies error.

To solve a shift-reduce conflict between a production A --> SOMETHING and a token 'a' you can follow this procedure:

1. Edit the .output file
2. Search for the state where the conflict between the production and the token is. In our example it looks like:
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '56,65p' ambiguities.output
State 5:

       exp -> exp . '-' exp    (Rule 2)
       exp -> exp '-' exp .    (Rule 2)

       '-'     shift, and go to state 3

       '-'     [reduce using rule 2 (exp)]
       $default        reduce using rule 2 (exp)
3. Inside the state there has to be a production of the type A --> SOMETHING. (with the dot at the end) indicating that a reduction must take place. There has to be also another production of the form A --> prefix . suffix, where suffix can start with the involved token 'a'.
4. Decide what action shift or reduce matches the kind of trees you want. In this example we want NUM - NUM - NUM to produce a tree like MINUS(MINUS(NUM, NUM), NUM) and not NUM, MINUS(MINUS(NUM, NUM)). We want the conflict in exp - exp.- NUM to be solved in favor of the reduction by exp: exp '-' exp. This is achieved by declaring %left '-'.

Error Recovery

The token name error is reserved for error handling. This name can be used in grammar productions; it suggests places where errors are expected, and recovery can take place:

line:
  '\n'         { undef }
  | exp '\n'   { print "$_[1]\n" if defined($_[1]); $_[1] }
  | error  '\n'
      {
        $_[0]->YYErrok;
        undef
      }

The parser pops its stack until it enters a state where the token error is legal. It then shifts the token error and proceeds to discard tokens until finding one that is acceptable. In the example all the tokens until finding a '\n' will be skipped. If no special error productions have been specified, the processing will halt.

In order to prevent a cascade of error messages, the parser, after detecting an error, remains in error state until three tokens have been successfully read and shifted. If an error is detected when the parser is already in error state, no message is given, and the input token is quietly deleted. The method YYErrok used in the example communicates to the parser that a satisfactory recovery has been reached and that it can safely emit new error messages.

You cannot have a literal 'error' in your grammar as it would confuse the driver with the error token. Use a symbolic token instead.

The Tail

The tail section contains Perl code. Usually the lexical analyzer and the Error management subroutines go here:

The Lexical Analyzer

The Lexical Analyzer is called each time the parser needs a new token. It is called with only one argument (the parser object) and returns a pair containing the next token and its associated attribute.

The fact that is a method of the parser object means that the parser methods are accesible inside the lexical analyzer. Specially interesting is the $_[0]->YYData method which provides access to the user data area.

When the lexical analyzer reaches the end of input, it must return the pair ('', undef)

See below how to write a lexical analyzer (file examples/Calc.eyp):

 1  sub make_lexer {
 2    my $input = shift;
 3
 4    return sub {
 5      my $parser = shift;
 6
 7      for ($$input) {
 8        m{\G[ \t]*}gc;
 9        m{\G([0-9]+(?:\.[0-9]+)?)}gc   and return ('NUM',$1);
10        m{\G([A-Za-z][A-Za-z0-9_]*)}gc and return ('VAR',$1);
11        m{\G\n}gc                      and do { $lineno++; return ("\n", "\n") };
12        m{\G(.)}gc                     and return ($1,$1);
13
14        return('',undef);
15      }
16    }
17  }

The subroutine make_lexer creates the lexical analyzer as a closure. The lexer returned by make_lexer is used by the YYParse method:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '90,97p' Calc.eyp | cat -n
1  sub Run {
2      my($self)=shift;
3      my $input = shift or die "No input given\n";
4
5      return $self->YYParse( yylex => make_lexer($input), yyerror => \&_Error,
6        #yydebug =>0x1F
7      );
8  }

The Error Report Subroutine

The Error Report subroutine is also a parser method, and consequently receives as parameter the parser object.

See the error report subroutine for the example in examples/Calc.eyp:

 1  %%
 2
 3  my $lineno = 1;
 4
 5  sub _Error {
 6    my $parser = shift;
 7
 8      exists $parser->YYData->{ERRMSG}
 9    and do {
10        print $parser->YYData->{ERRMSG};
11        delete $parser->YYData->{ERRMSG};
12        return;
13    };
14    my($token)=$parser->YYCurval;
15    my($what)= $token ? "input: '$token'" : "end of input";
16    my @expected = $parser->YYExpect();
17    local $" = ', ';
18    print << "ERRMSG";
19
20  Syntax error near $what (lin num $lineno).
21  Expected one of these terminals: @expected
22  ERRMSG
23  }

See the Parse::Yapp pages and elsewhere documentation on yacc and bison for more information.

Using an Eyapp Program

The following is an example of a program that uses the calculator explained in the two previous sections:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ cat -n usecalc.pl
 1  #!/usr/bin/perl -w
 2  use strict;
 3  use Calc;
 4
 5  my $parser = Calc->new();
 6  my $input = <<'EOI';
 7  a = 2*3
 8  d = 5/(a-6)
 9  b = (a+1)/7
10  c=a*3+4)-5
11  a = a+1
12  EOI
13  my $t = $parser->Run(\$input);
14  print "========= Symbol Table ==============\n";
15  print "$_ = $t->{$_}\n" for sort keys %$t;

The output for this program is (the input for each output appear as a Perl comment on the right):

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ eyapp Calc.eyp
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ usecalc.pl
6                                              # a = 2*3
Illegal division by zero.                      # d = 5/(a-6)
1                                              # b = (a+1)/7

Syntax error near input: ')' (lin num 4).      # c=a*3+4)-5
Expected one of these terminals: -, /, ^, *, +,

7                                              # a = a+1
========= Symbol Table ==============
a = 7
b = 1
c = 22

Lists and Optionals

The elements of a rhs can be one of these:

rhselt:     
      symbol    
    | code    
    | '(' optname rhs ')' 
    | rhselt STAR               /* STAR   is (%name\s*([A-Za-z_]\w*)\s*)?\*  */
    | rhselt '<' STAR symbol '>' 
    | rhselt OPTION             /* OPTION is (%name\s*([A-Za-z_]\w*)\s*)?\?  */
    | rhselt '<' PLUS symbol '>'
    | rhselt PLUS               /* PLUS   is (%name\s*([A-Za-z_]\w*)\s*)?\+  */

The STAR, OPTION and PLUS operators provide a simple mechanism to express lists:

  • In Eyapp the + operator indicates one or more repetitions of the element to the left of +, thus a rule like:

    decls:  decl +

    is the same as:

    decls:  decls decl 
         |  decl

    An additional symbol may be included to indicate lists of elements separated by such symbol. Thus

    rhss: rule <+ '|'>  

    is equivalent to:

    rhss: rhss '|' rule 
        | rule
  • The operators * and ? have their usual meaning: 0 or more for * and optionality for ?. Is legal to parenthesize a rhs expression as in:

    optname: (NAME IDENT)?

Names for attributes

Attributes can be referenced by meaningful names instead of the classic error-prone positional approach using the dot notation:

rhs:  rhseltwithid *
rhseltwithid : 
      rhselt '.' IDENT 
    | '$' rhselt  
    | rhselt

for example:

exp : exp.left '-' exp.right  { $left - $right }

By qualifying the first appearance of the syntactic variable exp with the notation exp.left we can later refer inside the actions to the associated attribute using the lexical variable $left. Thus the former code is equivalent to:

exp '-' exp
 { 
   my $lhs = shift; 
   my ($left, $right) = @_[1, 3];
   $lhs->{n} = $left->{n} - $right->{n} 
 }

The dolar notation $A can be used as an abbreviation of A.A. For example, the code:

$VAR '=' $exp
  { $lhs->{n} = $sym{$VAR->{attr}}->{n} = $exp->{n} }

is equivalent to:

VAR '=' exp
{ 
  my $lhs = shift;
  my ($VAR, $exp) = @_[1, 3];
  $lhs->{n} = $sym{$VAR->{attr}}->{n} = $exp->{n} 
}

Default actions

When no action is specified both yapp and eyapp implicitly insert the semantic action { $_[1] }. In Parse::Eyapp you can modify such behavior using the %defaultaction { Perl code } directive. The { Perl code } clause that follows the %defaultaction directive is executed when reducing by any production for which no explicit action was specified.

See an example that translates an infix expression like a=b*-3 into a postfix expression like a b 3 NEG * = :

# File Postfix.eyp (See the examples/ directory)
%right  '='
%left   '-' '+'
%left   '*' '/'
%left   NEG

%defaultaction { return  "$left $right $op"; }

%%
line: $exp  { print "$exp\n" }
;

exp:        $NUM  { $NUM }
        |   $VAR  { $VAR }
        |   VAR.left '='.op exp.right
        |   exp.left '+'.op exp.right
        |   exp.left '-'.op exp.right
        |   exp.left '*'.op exp.right
        |   exp.left '/'.op exp.right
        |   '-' $exp %prec NEG { "$exp NEG" }
        |   '(' $exp ')' { $exp }
;

%%

# Support subroutines as in the Synopsis example
...

The file containing the Eyapp program must be compiled with eyapp:

nereida:~/src/perl/YappWithDefaultAction/examples> eyapp Postfix.eyp

Next, you have to write a client program:

nereida:~/src/perl/YappWithDefaultAction/examples> cat -n usepostfix.pl
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use Postfix;
     4
     5  my $parser = new Postfix();
     6  $parser->Run;

Now we can run the client program:

nereida:~/src/perl/YappWithDefaultAction/examples> usepostfix.pl
Write an expression: -(2*a-b*-3)
2 a * b 3 NEG * - NEG

Abstract Syntax Trees : %tree and %name

Parse::Eyapp facilitates the construction of concrete syntax trees and abstract syntax trees (abbreviated AST from now on) through the %tree directive. Nodes in the AST are blessed in the production name. By default the name of a production is the concatenation of the left hand side and the production number. The production number is the ordinal number of the production as they appear in the associated .output file (see option -v of eyapp). For example, given the grammar:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '9,28p' treewithoutnames.pl
my $grammar = q{
  %right  '='     # Lowest precedence
  %left   '-' '+' # + and - have more precedence than = Disambiguate a-b-c as (a-b)-c
  %left   '*' '/' # * and / have more precedence than + Disambiguate a/b/c as (a/b)/c
  %left   NEG     # Disambiguate -a-b as (-a)-b and not as -(a-b)
  %tree           # Let us build an abstract syntax tree ...

  %%
  line: exp <+ ';'>  { $_[1] } /* list of expressions separated by ';' */
  ;

  exp:
       NUM           |   VAR       | VAR '=' exp
    | exp '+' exp    | exp '-' exp |  exp '*' exp
    | exp '/' exp
    | '-' exp %prec NEG
    |   '(' exp ')'  { $_[2] }
  ;

The tree produced by the parser when feed with input a=2*b is:

_PLUS_LIST(exp_6(TERMINAL[a],exp_9(exp_4(TERMINAL[2]),exp_5(TERMINAL[b]))))

If we want to see the correspondence between names and rules we can generate and check the corresponding file .output:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '28,42p' treewithoutnames.output
Rules:
------
0:      $start -> line $end
1:      PLUS-1 -> PLUS-1 ';' exp
2:      PLUS-1 -> exp
3:      line -> PLUS-1
4:      exp -> NUM
5:      exp -> VAR
6:      exp -> VAR '=' exp
7:      exp -> exp '+' exp
8:      exp -> exp '-' exp
9:      exp -> exp '*' exp
10:     exp -> exp '/' exp
11:     exp -> '-' exp
12:     exp -> '(' exp ')'

We can see now that the node exp_9 corresponds to the production exp -> exp '*' exp. Observe also that the Eyapp production:

        line: exp <+ ';'>
actually produces the productions:

1:      PLUS-1 -> PLUS-1 ';' exp
2:      PLUS-1 -> exp

and that the name of the class associated with the non empty list is _PLUS_LIST.

A production rule can be named using the %name IDENTIFIER directive. For each production rule a namespace/package is created. The IDENTIFIER is the name of the associated package. Therefore, by modifying the former grammar with additional %name directives:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '8,26p' treewithnames.pl
my $grammar = q{
  %right  '='     # Lowest precedence
  %left   '-' '+' # + and - have more precedence than = Disambiguate a-b-c as (a-b)-c
  %left   '*' '/' # * and / have more precedence than + Disambiguate a/b/c as (a/b)/c
  %left   NEG     # Disambiguate -a-b as (-a)-b and not as -(a-b)
  %tree           # Let us build an abstract syntax tree ...

  %%
  line: exp <%name EXPS + ';'>  { $_[1] } /* list of expressions separated by ';' */
  ;

  exp:
      %name NUM    NUM           | %name VAR   VAR         | %name ASSIGN VAR '=' exp
    | %name PLUS   exp '+' exp   | %name MINUS exp '-' exp | %name TIMES  exp '*' exp
    | %name DIV    exp '/' exp
    | %name UMINUS '-' exp %prec NEG
    |   '(' exp ')'  { $_[2] }
  ;

we are explictly naming the productions. Thus, all the node instances corresponding to the production exp: VAR '=' exp will belong to the class ASSIGN. Now the tree for a=2*b becomes:

EXPS(ASSIGN(TERMINAL[a],TIMES(NUM(TERMINAL[2]),VAR(TERMINAL[b]))))

Observe how the list has been named EXPS. The %name directive prefixes the list operator ([+*?]).

About the Encapsulation of Nodes

There is no encapsulation of nodes. The user/client knows that they are hashes that can be decorated with new keys/attributes. All nodes in the AST created by %tree are Parse::Eyapp::Node nodes. The only reserved field is children which is a reference to the array of children. You can always create a Node class by hand by inheriting from Parse::Eyapp::Node. See section "Separated Compilation with eyapp and treereg" for an example.

TERMINAL Nodes

Nodes named TERMINAL are built from the tokens provided by the lexical analyzer. Parse::Eyapp follows the same protocol than Parse::Yapp for communication between the parser and the lexical analyzer: A couple ($token, $attribute) is returned by the lexical analyzer. These values are stored under the keys token and attr. TERMINAL nodes as all Parse::Eyapp::Node nodes also have the attribute children but is - almost always - empty.

Explicit Actions Inside %tree

Explicit actions can be specified by the programmer like in this line from the "SYNOPSIS" example:

|   '(' exp ')'  { $_[2] }  /* Let us simplify a bit the tree */

Explicit actions receive as arguments the references to the children nodes already built. The programmer can influence the shape of the tree by inserting these explicit actions. In this example the programmer has decided to simplify the syntax tree: the nodes associated with the parenthesis are discarded and the reference to the subtree containing the proper expression is returned. Such manoeuvre is called bypassing. See section "The bypass clause and the %no bypass directive" to know more about automatic bypassing

Explicitly Building Nodes With YYBuildAST

Sometimes the best time to decorate a node with some attributes is just after being built. In such cases the programmer can take manual control building the node with YYBuildAST to inmediately proceed to decorate it.

The following example illustrates the situation:

Variable:
    %name  VARARRAY
    $ID ('[' binary ']') <%name INDEXSPEC +>
      {
        my $self = shift;
        my $node =  $self->YYBuildAST(@_);
        $node->{line} = $ID->[1];
        return $node;
      }

This production rule defines the expression to access an array element as an identifier followed by a non empty list of binary expressions Variable: ID ('[' binary ']')+. Furthermore, the node corresponding to the list of indices has been named INDEXSPEC.

When no explicit action is inserted a binary node will be built having as first child the node corresponding to the identifier $ID and as second child the reference to the list of binary expressions. The children corresponding to '[' and ']' are discarded since they are -by default- syntactic tokens (see section "Syntactic and Semantic tokens"). However, the programmer wants to decorate the node being built with a line attribute holding the line number in the source code where the identifier being used appears. The call to the Parse::Eyapp::Driver method YYBuildAST does the job of building the node. After that the node can be decorated and returned.

Actually, the %tree directive is semantically equivalent to:

%default action { goto &Parse::Eyapp::Driver::YYBuildAST }

Returning non References Under %tree

When a explicit user action returns s.t. that is not a reference no node will be inserted. This fact can be used to supress nodes in the AST being built. See the following example (file examples/returnnonode.yp):

nereida:~/src/perl/YappWithDefaultAction/examples> sed -ne '1,11p' returnnonode.yp | cat -n
 1  %tree
 2  %semantic token 'a' 'b'
 3  %%
 4  S:  /* empty */
 5      | S A
 6      | S B
 7  ;
 8  A : 'a'
 9  ;
10  B : 'b' { }
11  ;

since the action at line 10 returns undef the B : 'b' subtree will not be inserted in the AST:

nereida:~/src/perl/YappWithDefaultAction/examples> usereturnnonode.pl
ababa
S_2(S_3(S_2(S_3(S_2(S_1,A_4(TERMINAL[a]))),A_4(TERMINAL[a]))),A_4(TERMINAL[a]))

Observe the absence of Bs and 'b's.

Intermediate actions and %tree

Intermediate actions can be used to change the shape of the AST (prune it, decorate it, etc.) but the value returned by them is ignored. The grammar below has two intermediate actions. They modify the attributes of the node to its left and return a reference $f to such node (lines 5 and 6):

nereida:~/src/perl/YappWithDefaultAction/examples> \
         sed -ne '1,10p' intermediateactiontree.yp | cat -n
 1  %semantic token 'a' 'b'
 2  %tree bypass
 3  %%
 4  S:    /* empty */
 5      | S A.f { $f->{attr} = "A"; $f; } A
 6      | S B.f { $f->{attr} = "B"; $f; } B
 7  ;
 8  A : %name A 'a'
 9  ;
10  B : %name B 'b'

See the client program running:

nereida:~/src/perl/YappWithDefaultAction/examples> cat -n useintermediateactiontree.pl
 1  #!/usr/bin/perl -w
 2  use strict;
 3  use Parse::Eyapp;
 4  use intermediateactiontree;
 5
 6  { no warnings;
 7  *A::info = *B::info = sub { $_[0]{attr} };
 8  }
 9
10  my $parser = intermediateactiontree->new();
11  my $t = $parser->Run;
12  print $t->str,"\n";
nereida:~/src/perl/YappWithDefaultAction/examples> useintermediateactiontree.pl
aabbaa
S_2(S_4(S_2(S_1,A[A],A[a]),B[B],B[b]),A[A],A[a])

The attributes of left As have been effectively changed by the intermediate actions from 'a' to 'A'. However no further children have been inserted.

Syntactic and Semantic tokens

Parse::Eyapp diferences between syntactic tokens and semantic tokens. By default all tokens declared using string notation (i.e. between quotes like '+', '=') are considered syntactic tokens. Tokens declared by an identifier (like NUM or VAR) are by default considered semantic tokens. Syntactic tokens do not yield to nodes in the syntactic tree. Thus, the first print in the former "SYNOPSIS" example:

$parser->YYData->{INPUT} = "2*-3+b*0;--2\n"; 
my $t = $parser->Run;                    
local $Parse::Eyapp::Node::INDENT=2;
print "Syntax Tree:",$t->str;

gives as result the following output:

nereida:~/src/perl/YappWithDefaultAction/examples> synopsis.pl
Syntax Tree:
EXPRESION_LIST(
  PLUS(
    TIMES(
      NUM(
        TERMINAL[2]
      ),
      UMINUS(
        NUM(
          TERMINAL[3]
        )
      ) # UMINUS
    ) # TIMES,
    TIMES(
      VAR(
        TERMINAL[b]
      ),
      NUM(
        TERMINAL[0]
      )
    ) # TIMES
  ) # PLUS,
  UMINUS(
    UMINUS(
      NUM(
        TERMINAL[2]
      )
    ) # UMINUS
  ) # UMINUS
) # EXPRESION_LIST

TERMINAL nodes corresponding to tokens that were defined by strings like '=', '-', '+', '/', '*', '(' and ')' do not appear in the tree. TERMINAL nodes corresponding to tokens that were defined using an identifer, like NUM or VAR are, by default, semantic tokens and appear in the AST.

Changing the Status of a Token

The new token declaration directives %syntactic token and %semantic token can change the status of a token. For example (file 15treewithsyntactictoken.pl in the examples/ directory), given the grammar:

%syntactic token b
%semantic token 'a' 'c'
%tree

%%

S: %name ABC
     A B C
 | %name BC
     B C
;

A: %name A
     'a'
;

B: %name B
     b
;

C: %name C
    'c'
;
%%

the tree build for input abc will be ABC(A(TERMINAL[a]),B,C(TERMINAL[c])).

Saving the Information of Syntactic Tokens in their Father

The reason for the adjective %syntactic applied to a token is to state that the token influences the shape of the syntax tree but carries no other information. When the syntax tree is built the node corresponding to the token is discarded.

Sometimes the difference between syntactic and semantic tokens is blurred. For example the line number associated with an instance of the syntactic token '+' can be used later -say during type checking- to emit a more accurate error diagnostic. But if the node was discarded the information about that line number is no longer available. When building the syntax tree Parse::Eyapp (namely the method Parse::Eyapp::YYBuildAST) checks if the method TERMINAL::save_attributes exists and if so it will be called when dealing with a syntactic token. The method receives as argument - additionally to the reference to the attribute of the token as it is returned by the lexical analyzer - a reference to the node associated with the left hand side of the production. Here is an example (file examples/Types.eyp) of use:

sub TERMINAL::save_attributes {
  # $_[0] is a syntactic terminal
  # $_[1] is the father.
  push @{$_[1]->{lines}}, $_[0]->[1]; # save the line number
}

The bypass clause and the %no bypass directive

The shape of the tree can be also modified using some %tree clauses as %tree bypass which will produce an automatic bypass of any node with only one child at tree-construction-time.

A bypass operation consists in returning the only child of the node being visited to the father of the node and re-typing (re-blessing) the node in the name of the production (if a name was provided).

A node may have only one child at tree-construction-time for one of two reasons.

  • The first occurs when the right hand side of the production was already unary like in:

    exp:
        %name NUM  NUM 

    Here - if the bypass clause is used - the NUM node will be bypassed and the child TERMINAL built from the information provided by the lexical analyzer will be renamed/reblessed as NUM.

  • Another reason for a node to be bypassed is the fact that though the right hand side of the production may have more than one symbol, only one of them is not a syntactic token like in:

    exp: '(' exp ')'

A consequence of the global scope application of %tree bypass is that undesired bypasses may occur like in

exp : %name UMINUS
      '-' $exp %prec NEG

though the right hand side has two symbols, token '-' is a syntactic token and therefore only exp is left. The bypass operation will be applied when building this node. This bypass can be avoided applying the no bypass ID directive to the corresponding production:

exp : %no bypass UMINUS
      '-' $exp %prec NEG

The following example (file examples/bypass.pl) is the equivalent of the "SYNOPSIS" example but using the bypass clause instead:

use Parse::Eyapp;
use Parse::Eyapp::Treeregexp;

sub TERMINAL::info { $_[0]{attr} }
{ no warnings; *VAR::info = *NUM::info = \&TERMINAL::info; }

my $grammar = q{
  %right  '='     # Lowest precedence
  %left   '-' '+' 
  %left   '*' '/' 
  %left   NEG     # Disambiguate -a-b as (-a)-b and not as -(a-b)
  %tree bypass    # Let us build an abstract syntax tree ...

  %%
  line: exp <%name EXPRESION_LIST + ';'>  { $_[1] } 
  ;

  exp:
      %name NUM  NUM            | %name VAR   VAR         | %name ASSIGN VAR '=' exp
    | %name PLUS exp '+' exp    | %name MINUS exp '-' exp | %name TIMES  exp '*' exp
    | %name DIV     exp '/' exp
    | %no bypass UMINUS
      '-' $exp %prec NEG
    |   '(' exp ')'
  ;

  %%
  # sub _Error, _Lexer and Run like in the synopsis example
  # ...
}; # end grammar

our (@all, $uminus);

Parse::Eyapp->new_grammar( # Create the parser package/class
  input=>$grammar,
  classname=>'Calc', # The name of the package containing the parser
  firstline=>7       # String $grammar starts at line 7 (for error diagnostics)
);
my $parser = Calc->new();                # Create a parser
$parser->YYData->{INPUT} = "a=2*-3+b*0\n"; # Set the input
my $t = $parser->Run;                    # Parse it!

print "\n************\n".$t->str."\n************\n";

# Let us transform the tree. Define the tree-regular expressions ..
my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
  { #  Example of support code
    my %Op = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
  }
  constantfold: /TIMES|PLUS|DIV|MINUS/:bin(NUM, NUM)
    => {
      my $op = $Op{ref($_[0])};
      $NUM[0]->{attr} = eval  "$NUM[0]->{attr} $op $NUM[1]->{attr}";
      $_[0] = $NUM[0];
    }
  zero_times_whatever: TIMES(NUM, .) and { $NUM->{attr} == 0 } => { $_[0] = $NUM }
  whatever_times_zero: TIMES(., NUM) and { $NUM->{attr} == 0 } => { $_[0] = $NUM }
  uminus: UMINUS(NUM) => { $NUM->{attr} = -$NUM->{attr}; $_[0] = $NUM }
  },
  OUTPUTFILE=> 'main.pm'
);
$p->generate(); # Create the tranformations

$t->s(@all);    # constant folding and mult. by zero

print $t->str,"\n";

when running this example with input "a=2*-3+b*0\n" we obtain the following output:

nereida:~/src/perl/YappWithDefaultAction/examples> bypass.pl

************
EXPRESION_LIST(ASSIGN(TERMINAL[a],PLUS(TIMES(NUM[2],UMINUS(NUM[3])),TIMES(VAR[b],NUM[0]))))
************
EXPRESION_LIST(ASSIGN(TERMINAL[a],NUM[-6]))

As you can see the trees are more compact when using the bypass directive.

The alias clause of the %tree directive

Access to children in Parse::Eyapp is made through the child and children methods. There are occasions however where access by name to the children may be preferable. The use of the alias clause with the %tree directive creates accessors to the children with names specified by the programmer. The dot and dolar notations are used for this. When dealing with a production like:

A: 
   %name A_Node
   Node B.bum N.pum $Chip

methods bum, pum and Chip will be created for the class A_Node. Those methods wil provide access to the respective child (first, second and third in the example). The methods are build at compile-time and therefore later transformations of the AST modifying the order of the children may invalidate the use of these getter-setters.

As an example, the CPAN module Language::AttributeGrammar provides AST decorators from an attribute grammar specification of the AST. To work Language::AttributeGrammar requires named access to the children of the AST nodes. Follows an example (file examples/CalcwithAttributeGrammar.pl) of a small calculator:

use Parse::Eyapp;
use Language::AttributeGrammar;

my $grammar = q{
... # priority declarations. Like in previous examples
%tree bypass alias

%%
line: $exp  { $_[1] }
;

exp:
    %name NUM
          $NUM
        | %name VAR
          $VAR
    ............ # as in the bypass example
}; # end grammar

Parse::Eyapp->new_grammar(
  input=>$grammar, classname=>'Rule6', firstline =>7,
);
my $parser = Rule6->new();
$parser->YYData->{INPUT} = "a = -(2*3+5-1)\n";
my $t = $parser->Run;
my $attgram = new Language::AttributeGrammar <<'EOG';
# Compute the expression
NUM:    $/.val = { $<attr> }
TIMES:  $/.val = { $<left>.val * $<right>.val }
PLUS:   $/.val = { $<left>.val + $<right>.val }
MINUS:  $/.val = { $<left>.val - $<right>.val }
UMINUS: $/.val = { -$<exp>.val }
ASSIGN: $/.val = { $<exp>.val }
EOG

my $res = $attgram->apply($t, 'val');

Translation Schemes and the %metatree directive

Eyapp allows through the %metatree directive the creation of Translation Schemes as described in the Dragon's book. Instead of executing the semantic actions associated with the productions, the syntax tree is built. Semantic actions aren't executed. Instead they are inserted as nodes of the syntax tree. The main difference with ordinary nodes being that the attribute of such a CODE node is a reference to the anonymous subroutine representing the semantic action. The tree is later traversed in depth-first order using the $t->translation_scheme method: each time a CODE node is visited the action is executed.

The following example parses a tiny subset of a typical typed language and decorates the syntax tree with a new attribute t holding the type of each declared variable:

use strict; # File examples/trans_scheme_simple_decls4.pl
use Data::Dumper;
use Parse::Eyapp;
our %s; # symbol table

my $ts = q{ 
  %token FLOAT INTEGER NAME

  %{
  our %s;
  %}

  %metatree

  %%
  Dl:  D <* ';'>
  ;

  D : $T { $L->{t} = $T->{t} } $L
  ;

  T : FLOAT    { $lhs->{t} = "FLOAT" }
    | INTEGER  { $lhs->{t} = "INTEGER" }
  ;

  L : $NAME
        { $NAME->{t} = $lhs->{t}; $s{$NAME->{attr}} = $NAME }
    | $NAME { $NAME->{t} = $lhs->{t}; $L->{t} = $lhs->{t} } ',' $L
        { $s{$NAME->{attr}} = $NAME }
  ;
  %%
}; # end $ts

sub Error { die "Error sintáctico\n"; }

{ # Closure of $input, %reserved_words and $validchars
  my $input = "";
  my %reserved_words = ();
  my $validchars = "";

  sub parametrize__scanner {
    $input = shift;
    %reserved_words = %{shift()};
    $validchars = shift;
  }

  sub scanner {
    $input =~ m{\G\s+}gc;                     # skip whites
    if ($input =~ m{\G([a-z_A_Z]\w*)\b}gc) {
      my $w = uc($1);                 # upper case the word
      return ($w, $w) if exists $reserved_words{$w};
      return ('NAME', $1);            # not a reserved word
    }
    return ($1, $1) if ($input =~ m/\G([$validchars])/gc);
    die "Not valid token: $1\n" if ($input =~ m/\G(\S)/gc);
    return ('', undef); # end of file
  }
} # end closure

Parse::Eyapp->new_grammar(input=>$ts,classname=>'main',outputfile=>'Types.pm');
my $parser = main->new(yylex => \&scanner, yyerror => \&Error); 

parametrize__scanner(
  "float x,y;\ninteger a,b\n",
  { INTEGER => 'INTEGER', FLOAT => 'FLOAT'},
  ",;"
);

my $t = $parser->YYParse() or die "Syntax Error analyzing input";

$t->translation_scheme;

$Data::Dumper::Indent = 1;
$Data::Dumper::Terse = 1;
$Data::Dumper::Deepcopy  = 1;
$Data::Dumper::Deparse = 1;
print Dumper($t);
print Dumper(\%s);

Inside a Translation Scheme the lexical variable $lhs refers to the attribute of the father.

Execution Stages of a Translation Scheme

The execution of a Translation Scheme can be divided in the following stages:

1. During the first stage the grammar is analyzed and the parser is built:
Parse::Eyapp->new_grammar(input=>$ts,classname=>'main',outputfile=>'Types.pm');

This stage is called Class Construction Time

2. A parser conforming to the generated grammar is built
my $parser = main->new(yylex => \&scanner, yyerror => \&Error);

This stage is called Parser Construction Time

3. The next phase is Tree construction time. The input is set and the tree is built:
parametrize__scanner(
   "float x,y;\ninteger a,b\n",
   { INTEGER => 'INTEGER', FLOAT => 'FLOAT'},
   ",;"
 );

 my $t = $parser->YYParse() or die "Syntax Error analyzing input";
4. The last stage is Execution Time. The tree is traversed in depth first order and the CODE nodes are executed.
$t->translation_scheme;

This combination of bottom-up parsing with depth first traversing leads to a semantic behavior similar to recursive top-down parsers but with two advantages:

  • The grammar can be left-recursive

  • At the time of executing the action the syntax tree is already built, therefore we can refer to nodes on the right side of the action like in:

    D : $T { $L->{t} = $T->{t} } $L

The %begin directive

The %begin { code } directive can be used when building a translation scheme, i.e. when under the control of the %metatree directive. It indicates that such { code } will be executed at tree construction time. Therefore the code receives as arguments the references to the nodes of the branch than is being built. Usually begin code assist in the construction of the tree. Line 39 of the following code shows an example. The action { $exp } simplifies the syntax tree bypassing the parenthesis node. The example also illustrates the combined use of default actions and translation schemes.

nereida:~/src/perl/YappWithDefaultAction/examples> \
               cat -n trans_scheme_default_action.pl
  1  #!/usr/bin/perl -w
  2  use strict;
  3  use Data::Dumper;
  4  use Parse::Eyapp;
  5  use IO::Interactive qw(is_interactive);
  6
  7  my $translationscheme = q{
  8  %{
  9  # head code is available at tree construction time
 10  use Data::Dumper;
 11  our %sym; # symbol table
 12  %}
 13
 14  %defaultaction { $lhs->{n} = eval " $left->{n} $_[2]->{attr} $right->{n} " }
 15
 16  %metatree
 17
 18  %right   '='
 19  %left   '-' '+'
 20  %left   '*' '/'
 21
 22  %%
 23  line:       %name EXP
 24                exp <+ ';'> /* Expressions separated by semicolons */
 25                  { $lhs->{n} = $_[1]->Last_child->{n} }
 26  ;
 27
 28  exp:
 29              %name PLUS
 30                exp.left '+' exp.right
 31          |   %name MINUS
 32                exp.left '-' exp.right
 33          |   %name TIMES
 34                exp.left '*' exp.right
 35          |   %name DIV
 36                exp.left '/' exp.right
 37          |   %name NUM   $NUM
 38                  { $lhs->{n} = $NUM->{attr} }
 39          |   '(' $exp ')'  %begin { $exp }        # Bypass the node
 40          |   %name VAR
 41                $VAR
 42                  { $lhs->{n} = $sym{$VAR->{attr}}->{n} }
 43          |   %name ASSIGN
 44                $VAR '=' $exp
 45                  { $lhs->{n} = $sym{$VAR->{attr}}->{n} = $exp->{n} }
 46
 47  ;
 48
 49  %%
 50  # tail code is available at tree construction time
 51  sub _Error {
 52    die "Syntax error.\n";
 53  }
 54
 55  sub _Lexer {
 56      my($parser)=shift;
 57
 58      for ($parser->YYData->{INPUT}) {
 59          defined($_) or  return('',undef);
 60
 61          s/^\s*//;
 62          s/^([0-9]+(?:\.[0-9]+)?)// and return('NUM',$1);
 63          s/^([A-Za-z][A-Za-z0-9_]*)// and return('VAR',$1);
 64          s/^(.)// and return($1,$1);
 65          s/^\s*//;
 66      }
 67  }
 68
 69  sub Run {
 70      my($self)=shift;
 71      return $self->YYParse( yylex => \&_Lexer, yyerror => \&_Error );
 72  }
 73  }; # end translation scheme
 74
 75  $Data::Dumper::Indent = 1;
 76  $Data::Dumper::Terse = 1;
 77  $Data::Dumper::Deepcopy  = 1;
 78  my $p = Parse::Eyapp->new_grammar(
 79    input=>$translationscheme,
 80    classname=>'main',
 81    firstline => 6,
 82    outputfile => 'main.pm');
 83  die $p->qtables() if $p->Warnings;
 84  my $parser = main->new();
 85  print "Write a sequence of arithmetic expressions: " if is_interactive();
 86  $parser->YYData->{INPUT} = <>;
 87  my $t = $parser->Run() or die "Syntax Error analyzing input";
 88  $t->translation_scheme;
 89  my $treestring = Dumper($t);
 90  our %sym;
 91  my $symboltable = Dumper(\%sym);
 92  print <<"EOR";
 93  ***********Tree*************
 94  $treestring
 95  ******Symbol table**********
 96  $symboltable
 97  ************Result**********
 98  $t->{n}
 99
100  EOR

Compiling with eyapp

Separated Compilation with eyapp and treereg

A Treeregexp program can be isolated in a file an compiled with the program treereg. The default extension is .trg. See the following example:

nereida:~/src/perl/YappWithDefaultAction/examples> cat -n Shift.trg
 1  # File: Shift.trg
 2  {
 3    sub log2 {
 4      my $n = shift;
 5      return log($n)/log(2);
 6    }
 7
 8    my $power;
 9  }
10  mult2shift: TIMES($e, NUM($m)) and { $power = log2($m->{attr}); (1 << $power) == $m->{attr} }
11    => {
12      $_[0]->delete(1);
13      $_[0]->{shift} = $power;
14      $_[0]->type('SHIFTLEFT');
15    }

Note that auxiliary support code can be inserted at any point between transformations (lines 2-6). The code will be inserted (without the defining curly brackets) at that point. Note also that the lexical variable $power is visible inside the definition of the mult2shift transformation.

A treeregexp like $e matches any node. A reference to the node is saved in the lexical variable $e. The scope of the variable $e is the current tree transformation, i.e. mult2shift. Such kind of treeregexps are called scalar treeregexps.

The call to the delete method at line 12 deletes the second child of the node being visited (i.e. NUM($m)).

The call to type at line 14 retypes the node as a SHIFTLEFT node.

The program is compiled using the script treereg:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ eyapp Rule6
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ treereg Shift
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ ls -ltr | tail -2
-rw-rw----  1 pl users   5960 2007-01-30 09:09 Rule6.pm
-rw-rw----  1 pl users   1424 2007-01-30 09:09 Shift.pm

The Grammar Rule6.yp is similar to the one in the "SYNOPSIS" section. Module Rule6.pm contains the parser. The module Shift.pm contains the code implementing the tree transformations.

The client program follows:

nereida:~/src/perl/YappWithDefaultAction/examples> cat -n useruleandshift.pl
 1  #!/usr/bin/perl -w
 2  use strict;
 3  use Rule6;
 4  use Shift;
 5  { no warnings; *TERMINAL::info = \&TERMINAL::attr; }
 6
 7  push @SHIFTLEFT::ISA, 'Parse::Eyapp::Node';
 8  sub SHIFTLEFT::info { $_[0]{shift} }
 9
10  my $parser = new Rule6();
11  $parser->YYData->{INPUT} = <>;
12  my $t = $parser->Run;
13  print "***********\n",$t->str,"\n";
14  $t->s(@Shift::all);
15  print "***********\n",$t->str,"\n";

Lines 5 and 8 provide the node classes TERMINAL and SHIFTLEFT of info methods to be used by str (lines 13 and 14). To make SHIFTLEFT a node class it has to inherit from Parse::Eyapp::Node (line 7).

Multiplications by a power of two are substituted by the corresponding shifts:

nereida:~/src/perl/YappWithDefaultAction/examples> useruleandshift.pl
a=b*8
***********
ASSIGN(TERMINAL[a],TIMES(VAR(TERMINAL[b]),NUM(TERMINAL[8])))
***********
ASSIGN(TERMINAL[a],SHIFTLEFT[3](VAR(TERMINAL[b])))

Compiling: More Options

See files Rule9.yp, Transform4.trg and foldand0rule9_4.pl in the examples directory for a more detailed vision of this example. File Rule9.yp is very much like the grammar in the "SYNOPSIS" example. To compile the grammar Rule9.yp and the treeregexp file Transform4.trg use the commands:

eyapp -m 'Calc' Rule9.yp

That will produce a file Calc.pm containing a package Calc that implements the LALR parser. Then the command:

treereg -o T.pm -p 'R::' -m T Transform4

produces a file T.pm containing a package T that implements the tree transformation program. The -p option announces that node classes are prefixed by 'R::'.

With such parameters the client program uses the generated modules as follows:

nereida:~/src/perl/YappWithDefaultAction/examples> cat -n foldand0rule9_4.pl
 1  #!/usr/bin/perl -w
 2  # File: foldand0rule9_4.pl. Compile it with
 3  #          eyapp -m 'Calc' Rule9.yp; treereg -o T.pm -p 'R::' -m T Transform4
 4  use strict;
 5  use Calc;
 6  use T;
 7
 8  sub R::TERMINAL::info { $_[0]{attr} }
 9  my $parser = new Calc(yyprefix => "R::");
10  my $t = $parser->YYParse( yylex => \&Calc::Lexer, yyerror => \&Calc::Error);
11  print "\n***** Before ******\n";
12  print $t->str."\n";
13  $t->s(@T::all);
14  print "\n***** After ******\n";
15  print $t->str."\n";

running the program produces the following output:

nereida:~/src/perl/YappWithDefaultAction/examples> foldand0rule9_4.pl
2*3

***** Before ******
R::TIMES(R::NUM(R::TERMINAL[2]),R::TERMINAL[*],R::NUM(R::TERMINAL[3]))

***** After ******
R::NUM(R::TERMINAL[6])

Parse::Eyapp Methods

A Parse::Eyapp object holds the information about the Eyapp input grammar: parsing tables, conflicts, semantic actions, etc.

Parse::Eyapp->new_grammar

To translate an Eyapp grammar you must use either the eyapp script or call the class constructor new_grammar. The Parse::Eyapp method Parse::Eyapp->new_grammar(input=>$grammar) creates a package containing the code that implements a LALR parser for the input grammar:

my $p = Parse::Eyapp->new_grammar(
  input=>$translationscheme,
  classname=>'Grammar',
  firstline => 6,
  outputfile => 'main');
die $p->Warnings if $p->Warnings;
my $new_parser_for_grammar = Grammar->new();

The method returns a Parse::Eyapp object.

You can check the object to see if there were problems during the construction of the parser for your grammar:

die $p->qtables() if $p->Warnings;

The method Warnings returns the warnings produced during the parsing. The absence of warnings indicates the correctness of the input program.

The call to Parse::Eyapp->new_grammar generates a class/package containing the parser for your input grammar. Such package lives in the namespace determined by the classname argument of new_grammar. To create a parser for the grammar you call the constructor new of the just created class:

my $new_parser_for_grammar = Grammar->new();

The meaning of the arguments of Parse::Eyapp->new_grammar is:

- input

The string containing the input

- classname

The name of the package that will held the code for the LALR parser. The package of the caller will be used as default if none is specified.

- firstline

For error diagnostics. The line where the definition of the Eyapp grammar starts.

- linenumbers

Include/not include # line directives in the generated code

- outputfile

If defined the generated code fill be dumped in the specified filename (with extension .pm) and the LALR information ambigueties and conflicts) in the specified filename with extension .output.

$eyapp->qtables

Returns a string containing information on warnings, ambiguities, conflicts, rules and the generated DFA tables. Is the same information in file.output when using the command eyapp -v file.eyp.

my $p = Parse::Eyapp->new_grammar(
  input=>$eyappprogram,
  classname=>'SimpleC',
  outputfile => 'SimpleC.pm',
  firstline=>12,
);

print $p->qtables() if $p->Warnings;

$eyapp->outputtables

It receives two arguments

$eyapp->outputtables($path, $base)

Similar to qtables but prints the information on warnings, conflicts and rules to the specified $path/$file.

$eyapp->Warnings

Returns the warnings resulting from compiling the grammar:

my $p = Parse::Eyapp->new_grammar(
  input=>$translationscheme,
  classname=>'main',
  firstline => 6,
  outputfile => 'main'
);
die $p->Warnings if $p->Warnings;

Returns the empty string if there were no conflicts.

$eyapp->ShowDfa

Returns a string with the information about the LALR generated DFA.

$eyapp->Summary

Returns a string with summary information about the compilation of the grammar. No arguments.

$eyapp->Conflicts

Returns a string with summary information about the conflicts that arised when compiling the grammar. No arguments.

$eyapp->DfaTable

Returns a string with the parsing tables

Methods Available in the Generated Class

This section describes the methods and objects belonging to the class generated either using eyapp or new_grammar. In the incoming paragraphs we will assume that Class was the value selected for the classname argument when Parse::Eyapp->new_grammar was called. Objects belonging to Class are the actual parsers for the input grammar.

Class->new

The method Class->new returns a new LALR parser object. Here Class stands for the name of the class containing the parser. See an example of call:

my $parser = main->new(yyprefix => 'Parse::Eyapp::Node::',
                       yylex    => \&main::_Lexer,
                       yyerror  => \&main::_Error,
                       yydebug => 0x1F,
);

The meaning of the arguments used in the example are as follows:

- yyprefix

Used with %tree or %metatree. When used, the type names of the nodes of the syntax tree will be build prefixing the value associated to yyprefix to the name of the production rule. The name of the production rule is either explicitly given through a %name directive or the concatenation of the left hand side of the rule with the ordinal of the right hand side of the production. See section "Compiling with eyapp" for an example.

- yylex

Reference to the lexical analyzer subroutine

- yyerror

Reference to the error subroutine. The error subroutine receives as first argument the reference to the Class parser object. This way it can take advantage of methods like YYCurval and YYExpect (see below):

sub _Error {
  my($token)=$_[0]->YYCurval;
  my($what)= $token ? "input: '$token'" : "end of input";
  my @expected = $_[0]->YYExpect();

  local $" = ', ';
  die "Syntax error near $what. Expected one of these tokens: @expected\n";
}
- yydebug

Controls the level of debugging. Must be a number.

The package produced from the grammar has several methods.

The parser object has the following methods that work at parsing time exactly as in Parse::Yapp. These methods can be found in the module Parse::Eyapp::Driver. Assume you have in $parser the reference to your parser object:

$parser->YYParse()

It very much works Parse::Yapp::YYParse and as yacc/bison yyparse. It accepts almost the same arguments as Class->new with the exception of yyprefix which can be used only with new.

Debugging the Grammar

The integer parameter yydebug of new and YYParse controls the level of debugging. Different levels of verbosity can be obtained by setting the bits of this argument. It works as follows:

/============================================================\
| Bit Value  | Outputs                                       |
|------------+-----------------------------------------------|
|  0x01      |  Token reading (useful for Lexer debugging)   |
|------------+-----------------------------------------------|
|  0x02      |  States information                           |
|------------+-----------------------------------------------|
|  0x04      |  Driver actions (shifts, reduces, accept...)  |
|------------+-----------------------------------------------|
|  0x08      |  Parse Stack dump                             |
|------------+-----------------------------------------------|
|  0x10      |  Error Recovery tracing                       |
\============================================================/

As an example consider the grammar (file PlusList1.yp in examples/)

%%
S:      'c'+  { print "S -> 'c'+\n" }
;
%%

When the parser is called with yydebug activated:

$self->YYParse(yylex => \&_Lexer, yyerror => \&_Error , yydebug => 0x1F);

the output reports about the parser activities:

> use_pluslist1.pl
----------------------------------------
In state 0:
Stack:[0]
c
Need token. Got >c<
Shift and go to state 2.
----------------------------------------
In state 2:
Stack:[0,2]
Don't need token.
Reduce using rule 2 (PLUS-1,1): Back to state 0, then go to state 3.
----------------------------------------
In state 3:
Stack:[0,3]
Need token. Got ><
Reduce using rule 3 (S,1): S -> 'c'+
Back to state 0, then go to state 1.
----------------------------------------
In state 1:
Stack:[0,1]
Shift and go to state 4.
----------------------------------------
In state 4:
Stack:[0,1,4]
Don't need token.
Accept.

$parser->YYErrok

Works as yacc/bison yyerrok. Modifies the error status so that subsequent error messages will be emitted.

$parser->YYError

Works as yacc/bison YYERROR. Pretends that a syntax error has been detected.

$parser->YYNberr

The current number of errors

$parser->YYAccept

Works as yacc/bison YYACCEPT. The parser finishes returning the current semantic value to indicate success.

$parser->YYAbort

Works as yacc/bison YYABORT. The parser finishes returning undef to indicate failure.

$parser->YYRecovering

Works as yacc/bison YYRECOVERING. Returns TRUE if the parser is recovering from a syntax error.

$parser->YYCurtok

Gives the current token

$parser->YYCurval

Gives the attribute associated with the current token

$parser->YYExpect

Returns the list of tokens the parser expected when the failure occurred

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ \
                           sed -ne '26,33p' Postfix.eyp
sub _Error {
  my($token)=$_[0]->YYCurval;
  my($what)= $token ? "input: '$token'" : "end of input";
  my @expected = $_[0]->YYExpect();

  local $" = ', ';
  die "Syntax error near $what. Expected one of these tokens: @expected\n";
}

$parser->YYLexer

Returns a reference to the lexical analyzer

$parser->YYLhs

Returns the identifier of the left hand side of the current production (the one that is being used for reduction/antiderivation. An example of use can be found in examples/Lhs1.yp:

%defaultaction { print $_[0]->YYLhs,"\n" }

$parser->YYRuleindex

Returns the index of the production rule, counting the super rule as rule 0. To know the numbers have a look at the .output file. To get a .output file use the option -v of eyapp or the outputfile parameter when using method new_grammar (see the documentation for eyapp).

$parser->YYRightside

Returns an array of strings describing the right hand side of the rule

$parser->YYIsterm

Returns TRUE if the symbol given as argument is a terminal. Example:

 DB<0> x $self->YYIsterm('exp')
0  ''
 DB<1> x $self->YYIsterm('*')
0  1

An example of combined use of YYRightside, YYRuleindex, YYLhs and YYIsterm can be found examples/Rule3.yp:

nereida:~/src/perl/YappWithDefaultAction/examples> sed -n -e '4,22p' Rule3.yp | cat -n
 1  sub build_node {
 2    my $self = shift;
 3    my @children = @_;
 4    my @right = $self->YYRightside();
 5    my $var = $self->YYLhs;
 6    my $rule = $self->YYRuleindex();
 7
 8    for(my $i = 0; $i < @right; $i++) {
 9      $_ = $right[$i];
10      if ($self->YYIsterm($_)) {
11        $children[$i] = bless { token => $_, attr => $children[$i] },
12                                            __PACKAGE__.'::TERMINAL';
13      }
14    }
15    bless {
16            children => \@children,
17            info => "$var -> @right"
18          }, __PACKAGE__."::${var}_$rule"
19  }

when executed an output similar to this is produced:

nereida:~/src/perl/YappWithDefaultAction/examples> userule3.pl
2*3
$VAR1 = bless( {
  'info' => 'exp -> exp * exp',
  'children' => [
    bless( {
      'info' => 'exp -> NUM',
      'children' => [ bless( { 'attr' => '2', 'token' => 'NUM' }, 'Rule3::TERMINAL' ) ]
    }, 'Rule3::exp_6' ),
    bless( { 'attr' => '*', 'token' => '*' }, 'Rule3::TERMINAL' ),
    bless( {
      'info' => 'exp -> NUM',
      'children' => [ bless( { 'attr' => '3', 'token' => 'NUM' }, 'Rule3::TERMINAL' )
      ]
    }, 'Rule3::exp_6' )
  ]
}, 'Rule3::exp_11' );

$parser->YYIssemantic

Returns TRUE if the terminal is semantic. Semantics token can be declared using the directive %semantic token. The opposite of a Semantic token is a Syntactic token. Syntactic tokens can be declared using the directive %syntactic token.

When using the %tree directive all the nodes corresponding to syntactic tokens are pruned from the tree. Under this directive tokens in the text delimited by simple quotes (like '+') are, by default, considered syntactic tokens.

When using the %metatree directive all the tokens are considered, by default, semantic tokens. Thus, no nodes will be - by default- pruned when construction the code augmented tree. The exception are string tokens used as separators in the definition of lists, like in S <* ';'>. If you want the separating string token to appear include an explicit semantic declaration for it (example %semantic token ';').

$parser->YYName

Returns the name of the current rule (The production whose reduction gave place to the execution of the current semantic action).

 DB<12> x $self->YYName
0  'exp_11'

$parser->YYPrefix

Return and/or sets the yyprefix attribute. This a string that will be concatenated as a prefix to any Parse::Eyapp::Node nodes in the syntax tree.

$parser->YYBypass

Returns TRUE if running under the %tree bypass clause

$parser->YYBypassrule

Returns TRUE if the production being used for reduction was marked to be bypassed.

$parser->YYFirstline

First line of the input string describing the grammar

Parse::Eyapp::Driver::BeANode

Is not a method. Receives as input a Class name. Introduces Parse::Eyapp::Node as an ancestor class of Class. To work correctly, objects belonging to Class must be hashes with a children key whose value must be a reference to the array of children. The children must be also Parse::Eyapp::Node nodes. Actually you can circumvent this call by directly introducing Parse::Eyapp::Node in the ancestors of Class:

push @{$class."::ISA"}, "Parse::Eyapp::Node" 

$parser->YYBuildAST

Sometimes the best time to decorate a node with some attributes is just after being built. In such cases the programmer can take manual control building the node with YYBuildAST to inmediately proceed to decorate it.

The following example from examples/Types.eyp illustrates the idea:

Variable:
    %name  VARARRAY
    $ID ('[' binary ']') <%name INDEXSPEC +>
      {
        my $self = shift;
        my $node =  $self->YYBuildAST(@_);
        $node->{line} = $ID->[1];
        return $node;
      }

Actually, the %tree directive is semantically equivalent to:

%default action { goto &Parse::Eyapp::Driver::YYBuildAST }

$parser->YYBuildTS

Similar to $parser->YYBuildAST but builds nodes for translation schemes.

Parse::Eyapp::Parse objects

The parser for the Eyapp language was written and generated using Parse::Eyapp and the eyapp compiler (actually the first version was bootstrapped using the yapp compiler). The Eyapp program parsing the Eyapp language is in the file Parse/Eyapp/Parse.yp in the Parse::Eyapp distribution. (See also section "The Eyapp Language") Therefore Parse::Eyapp::Parse objects have all the methods mentioned in the section ""Methods Available in the Generated Class"". A Parse::Eyapp::Parse is nothing but a particular kind of Parse::Eyapp parser: the one that parses Eyapp grammars.

The Treeregexp Language

A Treeregexp program is made of the repetition of three kind of primitives: The treeregexp transformations, auxiliar Perl code and Transformation Families.

treeregexplist:  treeregexp* 

treeregexp: 
    IDENT ':' treereg ('=>' CODE)?  # Treeregexp 
  | CODE                            # Auxiliar code
  | IDENT '=' IDENT + ';'           # Transformation families

Treeregexp themselves follow the rule:

IDENT ':' treereg ('=>' CODE)?

Several instances of this rule can be seen in the example in the "SYNOPSIS" section. The identifier IDENT gives the name to the rule. At the time of this writing (2006) there are the following kinds of treeregexes:

treereg: 
      /* tree patterns with children */
    IDENT '(' childlist ')' ('and' CODE)? 
  | REGEXP (':' IDENT)? '(' childlist ')' ('and' CODE)? 
  | SCALAR '(' childlist ')' ('and' CODE)?  
  | '.' '(' childlist ')' ('and' CODE)? 
        /* leaf tree patterns */
  | IDENT ('and' CODE)? 
  | REGEXP (':' IDENT)? ('and' CODE)? 
  | '.' ('and' CODE)? 
  | SCALAR ('and' CODE)? 
  | ARRAY 
  | '*' 

Treeregexp rules

When seen a rule like

zero_times: TIMES(NUM($x), ., .) and { $x->{attr} == 0 } => { $_[0] = $NUM }

The Treeregexp translator creates a Parse::Eyapp:YATW object that can be later referenced in the user code by the package variable $zero_times.

The treeregexp

The first part of the rule TIMES(NUM($x), ., .) indicates that for a matching to succeed the node being visited must be of type TIMES, have a left child of type NUM and two more children.

If the first part succeeded then the following part takes the control to see if the semantic conditions are satisfied.

Semantic condition

The second part is optional and must be prefixed by the reserved word and followed by a Perl code manifesting the semantic conditions that must be hold by the node to succeed. Thus, in the example:

zero_times: TIMES(NUM($x), ., .) and { $x->{attr} == 0 } => { $_[0] = $NUM }

the semantic condition $x->{attr} == 0 states that the value of the number stored in the TERMINAL node referenced by $x must be zero.

Referencing the matching nodes

The node being visited can be referenced/modified inside the semantic actions using $_[0].

The Treeregexp translator automatically creates a set of lexical variables for us. The scope of these variables is limited to the semantic condition and the transformation code.

Thus, in the example

zero_times: TIMES(NUM($x), ., .) and { $x->{attr} == 0 } => { $_[0] = $NUM }

the node being visited $_[0] can be also referenced using the lexical variable $TIMES which is created by he Treeregexp compiler. In the same way a reference to the left child NUM will be stored in the lexical variable $NUM and a reference to the child of $NUM will be stored in $x. The semantic condition states that the attribute of the node associated with $x must be zero.

When the same type of node appears several times inside the treeregexp part the associated lexical variable is declared by the Treeregexp compiler as an array. This is the case in the constantfold transformation in the "SYNOPSIS" example, where there are two nodes of type NUM:

constantfold: /TIMES|PLUS|DIV|MINUS/(NUM($x), ., NUM($y))
   => {
  $x->{attr} = eval  "$x->{attr} $W->{attr} $y->{attr}";
  $_[0] = $NUM[0];
}

Thus variable $NUM[0] references the node that matches the first NUM term in the formula and $NUM[1] the one that matches the second.

Transformation code

The third part of the rule is also optional and comes prefixed by the big arrow =>. The Perl code in this section usually transforms the matching tree. To achieve the modification of the tree, the Treeregexp programmer must use $_[0] and not the lexical variables provided by the translator. Remember that in Perl $_[0] is an alias of the actual parameter. The constantfold example above will not work if we rewrite the code $_[0] = $NUM[0] as

{ $TIMES = $NUM }

Regexp Treeregexes

The previous constantfold example used a classic Perl linear regexp to explicit that the root node of the matching subtree must match the Perl regexp. The general syntax for REGEXP treeregexes patterns is:

treereg: REGEXP (':' IDENT)? '(' childlist ')' ('and' CODE)? 

The REGEXP must be specified between slashes (other delimiters as {} are not accepted). It is legal to specify options after the second slash (like e, i, etc.).

The operation of string oriented regexps is slightly modified when they are used inside a treeregexp: by default the option x will be assumed. The treeregexp compiler will automatically insert it. Use the new option X (upper case X) if you want to supress such behavior. There is no need also to insert \b word anchors to delimit identifiers: all the identifiers in a regexp treeregexp are automatically surrounded by \b. Use the option B (upper case B) to supress this behavior.

The optional identifier after the REGEXP indicates the name of the lexical variable that will be held a reference to the node whose type matches REGEXP. Variable $W (or @W if there are more than one REGEXP and or dot treeregexes) will be used instead if no identifier is specified.

Scalar Treeregexes

A scalar treeregxp is defined writing a Perl scalar inside the treeregexp, like $x in NUM($x). A scalar treeregxp immediately matches any node that exists and stores a reference to such node inside the Perl lexical scalar variable. The scope of the variable is limited to the semantic parts of the Treeregexp. Is illegal to use $W or $W_#num as variable names for scalar treeregexes.

Dot Treeregexes

A dot matches any node. It can be seen as an abbreviation for scalar treeregexes. The reference to the matching node is stored in the lexical variable $W. The variable @W will be used instead if there are more than one REGEXP and or dot treeregexes

Array Treeregexp Expressions

The Treeregexp language permits expressions like:

A(@a,B($x),@c)

After the matching variable @A contains the shortest prefix of $A->children that does not match B($x). The variable @c contains the remaining sufix of $A->children.

The following example uses array treereg expressions to move the assignment b = 5 out of the while loop:

 ..  ......................................................................
 93  my $program = "a =1000; c = 1; while (a) { c = c*a; b = 5; a = a-1 }\n";
 94  $parser->YYData->{INPUT} = $program;
 95  my $t = $parser->Run;
 96  my @output = split /\n/, $t->str;
 97
 98  my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
 99    moveinvariant: BLOCK(
100                     @prests,
101                     WHILE(VAR($b), BLOCK(@a, ASSIGN($x, NUM($e)), @c)),
102                     @possts
103                   )
104      => {
105           my $assign = $ASSIGN;
106           $BLOCK[1]->delete($ASSIGN);
107           $BLOCK[0]->insert_before($WHILE, $assign);
108         }
109    },
110    FIRSTLINE => 99,
111  );
112  $p->generate();

Star Treeregexp

Deprecated. Don't use it. Is still there but not to endure.

Transformation Families

Transformations created by Parse::Eyapp::Treeregexp can be grouped in families. That is the function of the rule:

treeregexp: IDENT '=' IDENT + ';' 

The next example (file examples/TSwithtreetransformations3.eyp) defines the family

algebraic_transformations = constantfold zero_times times_zero comasocfold;

Follows the code:

    my $transform = Parse::Eyapp::Treeregexp->new( STRING => q{

     uminus: UMINUS(., NUM($x), .) => { $x->{attr} = -$x->{attr}; $_[0] = $NUM }
     constantfold: /TIMES|PLUS|DIV|MINUS/:bin(NUM($z), ., NUM($y))
        => {
       $z->{attr} = eval  "$z->{attr} $W->{attr} $y->{attr}";
       $_[0] = $NUM[0];
     }
     commutative_add: PLUS($x, ., $y, .)
       => { my $t = $x; $_[0]->child(0, $y); $_[0]->child(2, $t)}
     comasocfold: TIMES(DIV(NUM($x), ., $b), ., NUM($y))
        => {
       $x->{attr} = $x->{attr} * $y->{attr};
       $_[0] = $DIV;
     }
     zero_times: TIMES(NUM($x), ., .) and { $x->{attr} == 0 } => { $_[0] = $NUM }
     times_zero: TIMES(., ., NUM($x)) and { $x->{attr} == 0 } => { $_[0] = $NUM }
     algebraic_transformations = constantfold zero_times times_zero comasocfold;
   },
   );

   $transform->generate();
   our ($uminus);
   $uminus->s($tree);

The transformations belonging to a family are usually applied toghether:

$tree->s(@algebraic_transformations);

Code Support

In between Treeregexp rules and family assignments the programmer can insert Perl code between curly brackets. That code usually gives support to the semantic conditions and transformations inside the rules. See for example test 14 in the t/ directory of the Parse::Eyapp distribution.

{
  sub not_semantic {
    my $self = shift;
    return  1 if $self->{token} eq $self->{attr};
    return 0;
  }
}

delete_tokens : TERMINAL and { not_semantic($TERMINAL) } 
                         => { $delete_tokens->delete() }

Parse::Eyapp::Node Methods

The Parse::Eyapp::Node objects represent the nodes of the syntax tree. All the node classes build by %tree and %metatree directives inherit from Parse::Eyapp::Node and consequently have acces to the methods provided in such module.

Parse::Eyapp::Node->new

Nodes are usually created using the %tree or %metatree Parse::Eyapp directives. The Parse::Eyapp::Node constructor new offers an alternative way to create forests.

This class method can be used to build multiple nodes on a row. It receives a string describing the tree and optionally a reference to a subroutine. Such subroutine (called the attribute handler) is in charge to initialize the attributes of the just created nodes. The attribute handler is called with the array of references to the nodes as they appear in the string from left to right.

Parse::Eyapp::Node->new returns an array of pointers to the nodes created as they appear in the input string from left to right. In scalar context returns a pointer to the first of these trees.

The following example (see file examples/28foldwithnewwithvars.pl) of a treeregexp transformation creates a new NUM(TERMINAL) node using Parse::Eyapp::Node->new:

my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
  {
    my %Op = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
  }
  constantfold: /TIMES|PLUS|MINUS|DIV/(NUM($x), NUM($y))
     => {
    my $op = $Op{ref($_[0])};

    my $res = Parse::Eyapp::Node->new(
      q{NUM(TERMINAL)},
      sub {
        my ($NUM, $TERMINAL) = @_;
        $TERMINAL->{attr} = eval "$x->{attr} $op $y->{attr}";
        $TERMINAL->{token} = 'NUM';
      },
    );
    $_[0] = $res;
  }
  },
);

The string can describe more than one tree like in:

my @t = Parse::Eyapp::Node->new(
         'A(C,D) E(F)', sub { my $i = 0; $_->{n} = $i++ for @_ });

The following trees will be built:

bless( { 'n' => 0,
  'children' => [
    bless( { 'n' => 1, 'children' => [] }, 'C' ),
    bless( { 'n' => 2, 'children' => [] }, 'D' )
  ]
}, 'A' );
bless( { 'n' => 3,
  'children' => [
    bless( { 'n' => 4, 'children' => [] }, 'F' )
  ]
}, 'E' );

and @t will contain 5 references to the corresponding subtrees A(C,D), C, D, E(F) and F.

Directed Acyclic Graphs with Parse::Eyapp::Node->hnew

Parse::Eyapp provides the method Parse::Eyapp::Node->hnew to build Directed Acyclic Graphs (DAGs) instead of trees. They are built using hashed consing, i.e. memoizing the creation of nodes. It works very much like Parse::Eyapp::Node->new but if one of the implied trees was previously built, hnew returns a reference to the existing one. See the following debugger session where several DAGs describing type expressions are built:

 DB<2> x $a = Parse::Eyapp::Node->hnew('F(X_3(A_3(A_5(INT)), CHAR, A_5(INT)),CHAR)')
0  F=HASH(0x85f6a20)
   'children' => ARRAY(0x85e92e4)
   |- 0  X_3=HASH(0x83f55fc)
   |     'children' => ARRAY(0x83f5608)
   |     |- 0  A_3=HASH(0x85a0488)
   |     |     'children' => ARRAY(0x859fad4)
   |     |        0  A_5=HASH(0x85e5d3c)
   |     |           'children' => ARRAY(0x83f4120)
   |     |              0  INT=HASH(0x83f5200)
   |     |                 'children' => ARRAY(0x852ccb4)
   |     |                      empty array
   |     |- 1  CHAR=HASH(0x8513564)
   |     |     'children' => ARRAY(0x852cad4)
   |     |          empty array
   |     `- 2  A_5=HASH(0x85e5d3c)
   |           -> REUSED_ADDRESS
   `- 1  CHAR=HASH(0x8513564)
         -> REUSED_ADDRESS
 DB<3> x $a->str
0  'F(X_3(A_3(A_5(INT)),CHAR,A_5(INT)),CHAR)'

The second occurrence of A_5(INT) is labelled REUSED_ADDRESS. The same occurs with the second instance of CHAR. Parse::Eyapp::Node->hnew can be more convenient than new when dealing with optimizations like common subexpressions or during type checking. See file examples/Types.eyp for a more comprehensive example.

$node->type

Returns the type of the node. It can be called as a sub when $node is not a Parse::Eyapp::Node like this:

Parse::Eyapp::Node::type($scalar)

This is the case when visiting CODE nodes.

The following session with the debugger illustrates how it works:

> perl -MParse::Eyapp::Node -de0
DB<1> @t = Parse::Eyapp::Node->new("A(B,C)") # Creates a tree
DB<2> x map { $_->type } @t # Get the types of the three nodes
0  'A'
1  'B'
2  'C'
DB<3> x Parse::Eyapp::Node::type(sub {})
0  'CODE'
DB<4> x Parse::Eyapp::Node::type("hola")
0  'Parse::Eyapp::Node::STRING'
DB<5> x Parse::Eyapp::Node::type({ a=> 1})
0  'HASH'
DB<6> x Parse::Eyapp::Node::type([ a, 1 ])
0  'ARRAY'

As it is shown in the example it can be called as a subroutine with a (CODE/HASH/ARRAY) reference or an ordinary scalar.

The words HASH, CODE, ARRAY and STRING are reserved for ordinary Perl references. Avoid naming a node with one of those words.

$node->child

Setter-getter to modify a specific child of a node. It is called like:

$node->child($i)

Returns the child with index $i. Returns undef if the child does not exists. It has two obligatory parameters: the node (since it is a method) and the index of the child. Sets teh new value if called

$node->child($i, $tree)

The method will croak if the obligatory parameters are not provided. Follows an example of use inside a Treereg program (see file examples/TSwithtreetransformations2.eyp) that swaps the children of a PLUS node:

my $transform = Parse::Eyapp::Treeregexp->new( STRING => q{
   commutative_add: PLUS($x, ., $y, .) # 1st . = '+' 2nd . = CODE
     => { my $t = $x; $_[0]->child(0, $y); $_[0]->child(2, $t)}
}

Child Access Through %tree alias

Remember that when the Eyapp program runs under the %tree alias directive The dot and dollar notations can be used to generate named getter-setters to access the children:

%tree bypass alias
.... 
%%
exp: %name PLUS
       exp.left '+' exp.right
....
%%
.... # and later
print $exp->left->str;

Here methods with names left and right will be created to access the corresponding children associated with the two instances of exp in the right hand side of the production rule.

$node->children

Returns the array of children of the node. When the tree is a translation scheme the CODE references are also included. See examples/TSPostfix3.eyp for an example of use inside a Translation Scheme:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$\
                      sed -ne '31,34p' TSPostfix3.eyp
line: %name PROG
       exp <%name EXP + ';'>
         { @{$lhs->{t}} = map { $_->{t}} ($_[1]->children()); }

The tree in a Translation Scheme contains the references to the CODE implementing the semantic actions. For example, the syntax tree built by the parser for the input a=-b*3 in TSPostfix3.eyp is:

PROG(EXP(
    ASSIGN(
      TERMINAL[a],
      TERMINAL[=],
      TIMES(
        NEG(TERMINAL[-], VAR(TERMINAL[b], CODE), CODE),
        TERMINAL[*],
        NUM(TERMINAL[3], CODE),
        CODE
      ) # TIMES,
      CODE
    ) # ASSIGN
  ) # EXP,
  CODE
) # PROG

$node->children can also be used as a setter.

$node->Children

Returns the array of children of the node. When dealing with a translation scheme, the $node->Children method (first in uppercase) returns the non CODE children of the node.

$node->last_child

Return the last child of the node. When dealing with translation schemes, the last can be a CODE node.

$node->Last_child

The $node->Last_child method returns the last non CODE child of the node. See an example:

line:       %name EXP
              exp <+ ';'> /* Expressions separated by semicolons */
          { $lhs->{n} = $_[1]->Last_child->{n} }
;

$node->descendant

The descendant method returns the descendant of a node given its coordinates. The coordinates of a node $s relative to a tree $t to which it belongs is a string of numbers separated by dots like ".1.3.2" which denotes the child path from $t to $s, i.e. $s == $t->child(1)->child(3)->child(2).

See a session with the debugger:

  DB<7> x $t->child(0)->child(0)->child(1)->child(0)->child(2)->child(1)->str
0  '
BLOCK[8:4:test]^{0}(
  CONTINUE[10,10]
)
  DB<8> x $t->descendant('.0.0.1.0.2.1')->str
0  '
BLOCK[8:4:test]^{0}(
  CONTINUE[10,10]

$node->str

The str method returns a string representation of the tree. The str method traverses the syntax tree dumping the type of the node being visited in a string. If the node being visited has a method info it will be executed and its result inserted between $DELIMITERs into the string. Thus, in the "SYNOPSIS" example, by adding the info method to the class TERMINAL:

sub TERMINAL::info {
  $_[0]{attr}
}

we achieve the insertion of attributes in the string being built by str.

The existence of some methods (like footnote) and the values of some package variables influence the behavior of str. Among the most important are:

@PREFIXES = qw(Parse::Eyapp::Node::);                                 # Prefixes to supress 
$INDENT = 0; # 0 = compact, 1 = indent, 2 = indent and include Types in closing parenthesis
$STRSEP = ',';                                # Separator between nodes, by default a comma
$DELIMITER = '[';                         # The string returned by C<info> will be enclosed 
$FOOTNOTE_HEADER = "\n---------------------------\n"; 
$FOOTNOTE_SEP = ")\n"; 
$FOOTNOTE_LEFT = '^{';                               # Left delimiter for a footnote number
$FOOTNOTE_RIGHT = '}';                              # Right delimiter for a footnote number
$LINESEP = 4;                             # When indent=2 the enclosing parenthesis will be
                                          # commented if more than $LINESEP apart

The following list defines the $DELIMITERs you can choose for attribute representation:

'[' => ']', '{' => '}', '(' => ')', '<' => '>'

If the node being visited has a method footnote, the string returned by the method will be concatenated at the end of the string as a footnote. The variables $FOOTNOTE_LEFT and $FOOTNOTE_RIGHT govern the displaying of footnote numbers.

Follows an example of output using footnotes.

nereida:~/doc/casiano/PLBOOK/PLBOOK/code/Simple-Types/script> \
                                         usetypes.pl prueba24.c
PROGRAM^{0}(FUNCTION[f]^{1}(RETURNINT(TIMES(INUM(TERMINAL[2:2]),VAR(TERMINAL[a:2])))))
---------------------------
0)
Types:
$VAR1 = {
  'CHAR' => bless( {
    'children' => []
  }, 'CHAR' ),
  'VOID' => bless( {
    'children' => []
  }, 'VOID' ),
  'INT' => bless( {
    'children' => []
  }, 'INT' ),
  'F(X_1(INT),INT)' => bless( {
    'children' => [
      bless( {
        'children' => [
          $VAR1->{'INT'}
        ]
      }, 'X_1' ),
      $VAR1->{'INT'}
    ]
  }, 'F' )
};
Symbol Table:
$VAR1 = {
  'f' => {
    'type' => 'F(X_1(INT),INT)',
    'line' => 1
  }
};

---------------------------
1)
$VAR1 = {
  'a' => {
    'type' => 'INT',
    'param' => 1,
    'line' => 1
  }
};

The first footnote was due to a call to PROGRAM:footnote. The footnote method for the PROGRAM node was defined as:

nereida:~/doc/casiano/PLBOOK/PLBOOK/code/Simple-Types/lib/Simple> \
                            sed -n -e '691,696p' Types.eyp | cat -n
    1  sub PROGRAM::footnote {
    2    return "Types:\n"
    3           .Dumper($_[0]->{types}).
    4           "Symbol Table:\n"
    5           .Dumper($_[0]->{symboltable})
    6  }

The second footnote was produced by the existence of a FUNCTION::footnote method:

nereida:~/doc/casiano/PLBOOK/PLBOOK/code/Simple-Types/lib/Simple> \
                           sed -n -e '702,704p' Types.eyp | cat -n
1  sub FUNCTION::footnote {
2    return Dumper($_[0]->{symboltable})
3  }

The source program for the example was:

1  int f(int a) {
2    return 2*a;
3  }

$node->delete

The $node->delete($child) method is used to delete the specified child of $node. The child to delete can be specified using the index or a reference. It returns the deleted child.

Throws an exception if the object can't do children or has no children. See also the delete method of treeregexes (Parse::Eyapp:YATW objects) to delete the node being visited.

The following example moves out of a loop an assignment statement assuming is an invariant of the loop. To do it, it uses the delete and insert_before methods:

nereida:~/src/perl/YappWithDefaultAction/examples> \
            sed -ne '98,113p' moveinvariantoutofloopcomplexformula.pl
my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
  moveinvariant: BLOCK(
                   @prests,
                   WHILE(VAR($b), BLOCK(@a, ASSIGN($x, NUM($e)), @c)),
                   @possts
                 )
    => {
         my $assign = $ASSIGN;
         $BLOCK[1]->delete($ASSIGN);
         $BLOCK[0]->insert_before($WHILE, $assign);
       }
  },
  FIRSTLINE => 99,
);
$p->generate();
$moveinvariant->s($t);

The example below deletes CODE nodes from the tree build for a translation scheme:

my $transform = Parse::Eyapp::Treeregexp->new( 
  STRING=>q{
    delete_code: CODE => { Parse::Eyapp::Node::delete($CODE) }
  },
)

Observe how delete is called as a subroutine.

$node->unshift($newchild)

Inserts $newchild at the beginning of the list of children of $node. See also the unshift method for Parse::Eyapp:YATW treeregexp transformation objects

$node->push($newchild)

Inserts $newchild at the end of the list of children of $node.

$node->insert_before($position, $new_child)

Inserts $newchild before $position in the list of children of $node. Variable $position can be an index or a reference.

The method throws an exception if $position is an index and is not in range. Also if $node has no children.

The method throws a warning if $position is a reference and does not define an actual child. In such case $new_child is not inserted.

See also the insert_before method for Parse::Eyapp:YATW treeregexp transformation objects

$node->insert_after($position, $new_child)

Inserts $newchild after $position in the list of children of $node. Variable $position can be an index or a reference.

The method throws an exception if $position is an index and is not in the range of $node-children>.

The method throws a warning if $position is a reference and does not exists in the list of children. In such case $new_child is not inserted.

$node->translation_scheme

Traverses $node. Each time a CODE node is visited the subroutine referenced is called with arguments the node and its children. Usually the code will decorate the nodes with new attributes or will update existing ones. Obviously this method does nothing for an ordinary AST. It is used after compiling an Eyapp program that makes use of the %metatree directive.

$node->bud

Bottom-up decorator. The tree is traversed bottom-up. The set of transformations is applied to each node in the order supplied by the user. As soon as one succeeds no more transformations are applied. For an example see the files examples/Types.eyp and examples/Trans.trg. The code below shows an extract of the type-checking phase of a toy-example compiler:

nereida:~/src/perl/YappWithDefaultAction/examples> \
                        sed -ne '600,611p' Types.eyp
 my @typecheck = (
   our $inum,
   our $charconstant,
   our $bin,
   our $arrays,
   our $assign,
   our $control,
   our $functioncall,
   our $statements,
 );

 $t->bud(@typecheck);

As an example of the appearance of the treeregexp transformations involved in the former call, see the code of the $control treeregexp transformation:

nereida:~/src/perl/YappWithDefaultAction/examples> \
                        sed -ne '183,192p' Trans.trg
control: /IF|IFELSE|WHILE/:con($bool)
  => {
    $bool = char2int($con, 0) if $bool->{t} == $CHAR;
      type_error("Condition must have integer type!", $bool->line)
    unless $bool->{t} == $INT;

    $con->{t} = $VOID;

    return 1;
  }

Transformations Objects: Parse::Eyapp:YATW Methods

Parse::Eyapp:YATW objects represent tree transformations. They carry the information of what nodes match and how to modify them.

Parse::Eyapp::Node->new

Builds a treeregexp transformation object. Though usually you build a transformation by means of Treeregexp programs you can directly invoke the method to build a tree transformation. A transformation object can be built from a function that conforms to the YATW tree transformation call protocol (see the section "The YATW Tree Transformation Call Protocol"). Follows an example (file examples/12ts_simplify_with_s.pl):

nereida:~/src/perl/YappWithDefaultAction/examples> \
       sed -ne '68,$p' 12ts_simplify_with_s.pl | cat -n
 1  sub is_code {
 2    my $self = shift; # tree
 3
 4    # After the shift $_[0] is the father, $_[1] the index
 5    if ((ref($self) eq 'CODE')) {
 6      splice(@{$_[0]->{children}}, $_[1], 1);
 7      return 1;
 8    }
 9    return 0;
10  }
11
12  Parse::Eyapp->new_grammar(
13    input=>$translationscheme,
14    classname=>'Calc',
15    firstline =>7,
16  );
17  my $parser = Calc->new();                # Create the parser
18
19  $parser->YYData->{INPUT} = "2*-3\n";  print "2*-3\n"; # Set the input
20  my $t = $parser->Run;                    # Parse it
21  print $t->str."\n";
22  my $p = Parse::Eyapp::YATW->new(PATTERN => \&is_code);
23  $p->s($t);
24  { no warnings; # make attr info available only for this display
25    local *TERMINAL::info = sub { $_[0]{attr} };
26    print $t->str."\n";
27  }

After the Parse::Eyapp::YATW object $p is built at line 22 the call to method $p->s($t) applies the transformation is_code using a bottom-up traversing of the tree $t. The achieved effect is the elimination of CODE references in the translation scheme tree. When executed the former code produces:

nereida:~/src/perl/YappWithDefaultAction/examples> 12ts_simplify_with_s.pl
2*-3
EXP(TIMES(NUM(TERMINAL,CODE),TERMINAL,UMINUS(TERMINAL,NUM(TERMINAL,CODE),CODE),CODE),CODE)
EXP(TIMES(NUM(TERMINAL[2]),TERMINAL[*],UMINUS(TERMINAL[-],NUM(TERMINAL[3]))))

The file foldrule6.pl in the examples/ distribution directory gives you another example:

nereida:~/src/perl/YappWithDefaultAction/examples> cat -n foldrule6.pl
  1  #!/usr/bin/perl -w
  2  use strict;
  3  use Rule6;
  4  use Parse::Eyapp::YATW;
  5
  6  my %BinaryOperation = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
  7
  8  sub set_terminfo {
  9    no warnings;
 10    *TERMINAL::info = sub { $_[0]{attr} };
 11  }
 12  sub is_foldable {
 13    my ($op, $left, $right);
 14    return 0 unless defined($op = $BinaryOperation{ref($_[0])});
 15    return 0 unless ($left = $_[0]->child(0), $left->isa('NUM'));
 16    return 0 unless ($right = $_[0]->child(1), $right->isa('NUM'));
 17
 18    my $leftnum = $left->child(0)->{attr};
 19    my $rightnum = $right->child(0)->{attr};
 20    $left->child(0)->{attr} = eval "$leftnum $op $rightnum";
 21    $_[0] = $left;
 22  }
 23
 24  my $parser = new Rule6();
 25  $parser->YYData->{INPUT} = "2*3";
 26  my $t = $parser->Run;
 27  &set_terminfo;
 28  print "\n***** Before ******\n";
 29  print $t->str;
 30  my $p = Parse::Eyapp::YATW->new(PATTERN => \&is_foldable);
 31  $p->s($t);
 32  print "\n***** After ******\n";
 33  print $t->str."\n";

when executed produces:

nereida:~/src/perl/YappWithDefaultAction/examples> foldrule6.pl

***** Before ******
TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3]))
***** After ******
NUM(TERMINAL[6])

The YATW Tree Transformation Call Protocol

For a subroutine pattern_sub to work as a YATW tree transformation - as subroutines is_foldable and is_code above - has to conform to the following call description:

pattern_sub(
    $_[0],  # Node being visited
    $_[1],  # Father of this node
    $index, # Index of this node in @Father->children
    $self,  # The YATW pattern object
);

The pattern_sub must return TRUE if matched and FALSE otherwise.

The protocol may change in the near future. Avoid using other information than the fact that the first argument is the node being visited.

Parse::Eyapp::YATW->buildpatterns

Works as Parse::Eyapp->new but receives an array of subs conforming to the YATW Tree Transformation Call Protocol.

our @all = Parse::Eyapp::YATW->buildpatt(\&delete_code, \&delete_tokens);

$yatw->delete

The root of the tree that is currently matched by the YATW transformation $yatw will be deleted from the tree as soon as is safe. That usually means when the processing of their siblings is finished. The following example (taken from file examples/13ts_simplify_with_delete.pl in the Parse::Eyapp distribution) illustrates how to eliminate CODE and syntactic terminals from the syntax tree:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ \
       sed -ne '62,$p' 13ts_simplify_with_delete.pl | cat -n
 1  sub not_useful {
 2    my $self = shift; # node
 3    my $pat = $_[2];  # get the YATW object
 4
 5    (ref($self) eq 'CODE') or ((ref($self) eq 'TERMINAL') and ($self->{token} eq $self->{attr}))
 6      or do { return 0 };
 7    $pat->delete();
 8    return 1;
 9  }
10
11  Parse::Eyapp->new_grammar(
12    input=>$translationscheme,
13    classname=>'Calc',
14    firstline =>7,
15  );
16  my $parser = Calc->new();                # Create the parser
17
18  $parser->YYData->{INPUT} = "2*3\n"; print $parser->YYData->{INPUT};
19  my $t = $parser->Run;                    # Parse it
20  print $t->str."\n";                      # Show the tree
21  my $p = Parse::Eyapp::YATW->new(PATTERN => \&not_useful); 
22  $p->s($t);                               # Delete nodes
23  print $t->str."\n";                      # Show the tree

when executed we get the following output:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ 13ts_simplify_with_delete.pl
2*3
EXP(TIMES(NUM(TERMINAL[2],CODE),TERMINAL[*],NUM(TERMINAL[3],CODE),CODE))
EXP(TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3])))

$yatw->unshift

Tha call $yatw->unshift($b) safely unshifts (inserts at the beginning) the node $b in the list of its siblings of the node that matched (i.e in the list of siblings of $_[0]). The following example shows a YATW transformation insert_child that illustrates the use of unshift (file examples/26delete_with_trreereg.pl):

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ \
        sed -ne '70,$p' 26delete_with_trreereg.pl | cat -n
 1  my $transform = Parse::Eyapp::Treeregexp->new( STRING => q{
 2
 3      delete_code : CODE => { $delete_code->delete() }
 4
 5      {
 6        sub not_semantic {
 7          my $self = shift;
 8          return  1 if ((ref($self) eq 'TERMINAL') and ($self->{token} eq $self->{attr}));
 9          return 0;
10        }
11      }
12
13      delete_tokens : TERMINAL and { not_semantic($TERMINAL) } => {
14        $delete_tokens->delete();
15      }
16
17      insert_child : TIMES(NUM(TERMINAL), NUM(TERMINAL)) => {
18        my $b = Parse::Eyapp::Node->new( 'UMINUS(TERMINAL)',
19          sub { $_[1]->{attr} = '4.5' }); # The new node will be a sibling of TIMES
20
21        $insert_child->unshift($b); 
22      }
23    },
24  )->generate();
25
26  Parse::Eyapp->new_grammar(
27    input=>$translationscheme,
28    classname=>'Calc',
29    firstline =>7,
30  );
31  my $parser = Calc->new();                # Create the parser
32
33  $parser->YYData->{INPUT} = "2*3\n"; print $parser->YYData->{INPUT}; # Set the input
34  my $t = $parser->Run;                # Parse it
35  print $t->str."\n";                        # Show the tree
36  # Get the AST
37  our ($delete_tokens, $delete_code);
38  $t->s($delete_tokens, $delete_code);
39  print $t->str."\n";                        # Show the tree
40  our $insert_child;
41  $insert_child->s($t);
42  print $t->str."\n";                        # Show the tree

When is executed the program produces the following output:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ 26delete_with_trreereg.pl
2*3
EXP(TIMES(NUM(TERMINAL[2],CODE),TERMINAL[*],NUM(TERMINAL[3],CODE),CODE))
EXP(TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3])))
EXP(UMINUS(TERMINAL[4.5]),TIMES(NUM(TERMINAL[2]),NUM(TERMINAL[3])))

Don't try to take advantage that the transformation sub receives in $_[1] a reference to the father (see the section "The YATW Tree Transformation Call Protocol") and do something like:

unshift $_[1]->{children}, $b

it is unsafe.

$yatw->insert_before

A call to $yatw->insert_before($node) safely inserts $node in the list of siblings of $_[0] just before $_[0] (i.e. the ndoe that matched with $yatw). The following example (file t/33moveinvariantoutofloop.t) illustrates its use:

my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
  moveinvariant: WHILE(VAR($b), BLOCK(@a, ASSIGN($x, $e), @c)) 
       and { is_invariant($ASSIGN, $WHILE) } => {
         my $assign = $ASSIGN;
         $BLOCK->delete($ASSIGN);
         $moveinvariant->insert_before($assign);
       }
  },
);

Here the ASSIGN($x, $e) subtree - if is loop invariant - will be moved to the list of siblings of $WHILE just before the $WHILE.

Matching Trees

Both the transformation objects in Parse::Eyapp::YATW and the nodes in Parse::Eyapp::Node have a method named m for matching. For a Parse::Eyapp::YATW object, the method -when called in a list context- returns a list of Parse::Eyapp::Node::Match nodes.

@R = $t->m($yatw1, $yatw2, $yatw3, ...)

A Parse::Eyapp::Node::Match object describes the nodes of the actual tree that have matched. The nodes in the returned list are organized in a hierarchy. They appear in the list sorted according to a depth-first visit of the actual tree $t. In a scalar context m returns the first element of the list.

Let us denote by $t the actual tree being searched and $r one of the Parse::Eyapp::Node::Match nodes in the resulting forest @R. Then we have the following methods:

  • The method $r->node return the node $t of the actual tree that matched

  • The method $r->father returns the father of $r in the matching forest. The father of $r is defined by this property: $r->father->node is the nearest ancestor of $r->node that matched with the treeregexp pattern. That is, there is no ancestor that matched between $r->node and $r->father->node. Otherwise $r->father is undef

  • The method $r->coord returns the coordinates of $r->node relative to $t. For example, the coordinate ".1.3.2" denotes the node $t->child(1)->child(3)->child(2), where $t is the root of the search.

  • The method $r->depth returns the depth of $r->node in $t.

  • When m was called as a Parse::Eyapp::Node method, i. e. with potentially more than one YATW treeregexp, the method $r->names returns the array of names of the transformations that matched with $r->node.

The following example illustrates a use of m as a Parse::Eyapp:YATW method. It solves a problem of scope analysis in a C compiler: matching each RETURN statement with the function that surrounds it. The parsing was already done, the AST was built and left in $t. The treeregexp used is:

retscope: /FUNCTION|RETURN/

and the code that solves the problem is:

# Associate each "return exp" with its "function"
my @returns = $retscope->m($t); 
for (@returns) {
  my $node = $_->node;
  if (ref($node) eq 'RETURN') {
    my $function = $_->father->node; 
    $node->{function}  = $function;  
    $node->{t} = $function->{t};
  }
}

The first line gets a list of Parse::Eyapp::Node::Match nodes describing the actual nodes that matched /FUNCTION|RETURN/. If the node described by $_ is a 'RETURN' node, the expresion $_->father->node must necessarily point to the function node that encloses it.

The second example shows the use of m as a Parse::Eyapp::Node method.

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ cat -n m2.pl
 1  #!/usr/bin/perl -w
 2  use strict;
 3  use Rule6;
 4  use Parse::Eyapp::Treeregexp;
 5
 6  Parse::Eyapp::Treeregexp->new( STRING => q{
 7    fold: /times|plus|div|minus/i:bin(NUM($n), NUM($m))
 8    zxw: TIMES(NUM($x), .) and { $x->{attr} == 0 }
 9    wxz: TIMES(., NUM($x)) and { $x->{attr} == 0 }
10  })->generate();
11
12  # Syntax analysis
13  my $parser = new Rule6();
14  $parser->YYData->{INPUT} = "0*0*0";
15  my $t = $parser->Run;
16  print "Tree:",$t->str,"\n";
17
18  # Search
19  my $m = $t->m(our ($fold, $zxw, $wxz));
20  print "Match Node:\n",$m->str,"\n";

When executed with input 0*0*0 the program generates this output:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ m2.pl
Tree:TIMES(TIMES(NUM(TERMINAL),NUM(TERMINAL)),NUM(TERMINAL))
Match Node:
Match[[TIMES:0:wxz]](Match[[TIMES:1:fold,zxw,wxz]])

The representation of Match nodes by str deserves a comment. Match nodes have their own info method. It returns a string containing the concatenation of the class of $r->node (i.e. the actual node that matched), the depth ($r->depth) and the names of the transformations that matched (as provided by the method $r->names)

The SEVERITY option of Parse::Eyapp::Treeregexp::new

The SEVERITY option of Parse::Eyapp::Treeregexp::new controls the way matching succeeds regarding the number of children. To illustrate its use let us consider the following example. The grammar used Rule6.yp is similar to the one in the SYNOPSIS example.

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ cat -n numchildren.pl
 1  #!/usr/bin/perl -w
 2  use strict;
 3  use Rule6;
 4  use Parse::Eyapp::Treeregexp;
 5
 6  sub TERMINAL::info { $_[0]{attr} }
 7
 8  my $severity = shift || 0;
 9  my $parser = new Rule6();
10  $parser->YYData->{INPUT} = shift || '0*2';
11  my $t = $parser->Run;
12
13  my $transform = Parse::Eyapp::Treeregexp->new(
14    STRING => q{
15      zero_times_whatever: TIMES(NUM($x)) and { $x->{attr} == 0 } => { $_[0] = $NUM }
16    },
17    SEVERITY => $severity,
18    FIRSTLINE => 14,
19  )->generate;
20
21  $t->s(our @all);
22
23  print $t->str,"\n";

The program gets the severity level from the command line (line 9). The specification of the term TIMES(NUM($x)) inside the transformation zero_times_whatever does not clearly state that TIMES must have two children. There are several interpretations of the treregexp depending on the level fixed for SEVERITY:

  • 0: TIMES must have at least one child. Don't care if it has more.

  • 1: TIMES must have exactly one child.

  • 2: TIMES must have exactly one child. When visit a TIMES node with a different number of children issue a warning.

  • 3: TIMES must have exactly one child. When visit a TIMES node with a different number of children issue an error.

Observe the change in behavior according to the level of SEVERITY:

pl@nereida:~/src/perl/YappWithDefaultAction/examples$ numchildren.pl 0 '0*2'
NUM(TERMINAL[0])
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ numchildren.pl 1 '0*2'
TIMES(NUM(TERMINAL[0]),NUM(TERMINAL[2]))
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ numchildren.pl 2 '0*2'
Warning! found node TIMES with 2 children.
Expected 1 children (see line 15 of ./numchildren.pl)"
TIMES(NUM(TERMINAL[0]),NUM(TERMINAL[2]))
pl@nereida:~/src/perl/YappWithDefaultAction/examples$ numchildren.pl 3 '0*2'
Error! found node TIMES with 2 children.
Expected 1 children (see line 15 of ./numchildren.pl)"
 at (eval 2) line 29

Tree Substitution: The s methods

Both Parse::Eyapp:Node and Parse::Eyapp::YATW objects (i.e. nodes and tree transformations) are provided with a s method.

In the case of a Parse::Eyapp::YATW object the method s applies the tree transformation using a single bottom-up traversing: the transformation is recursively applied to the children and then to the current node.

For Parse::Eyapp:Node nodes the set of transformations is applied to each node until no transformation matches any more. The example in the "SYNOPSIS" section illustrates the use:

 1  # Let us transform the tree. Define the tree-regular expressions ..
 2  my $p = Parse::Eyapp::Treeregexp->new( STRING => q{
 3    { #  Example of support code
 4      my %Op = (PLUS=>'+', MINUS => '-', TIMES=>'*', DIV => '/');
 5    }
 6    constantfold: /TIMES|PLUS|DIV|MINUS/:bin(NUM($x), NUM($y))
 7      => {
 8        my $op = $Op{ref($_[0])};
 9        $x->{attr} = eval  "$x->{attr} $op $y->{attr}";
10        $_[0] = $NUM[0];
11      }
12    uminus: UMINUS(NUM($x)) => { $x->{attr} = -$x->{attr}; $_[0] = $NUM }
13    zero_times_whatever: TIMES(NUM($x), .) and { $x->{attr} == 0 } => { $_[0] = $NUM }
14    whatever_times_zero: TIMES(., NUM($x)) and { $x->{attr} == 0 } => { $_[0] = $NUM }
15    },
16    OUTPUTFILE=> 'main.pm'
17  );
18  $p->generate(); # Create the tranformations
19 
20  $t->s($uminus); # Transform UMINUS nodes
21  $t->s(@all);    # constant folding and mult. by zero

The call at line 20 can be substituted by $uminus->s($t) without changes.

Scope Analysis with Parse::Eyapp::Scope

A scope manager helps to compute the mapping function that maps the uses (instances) of source objects to their definitions. For instance, in identifier scope analysis the problem is to associate each ocurrence of an identifier with the declaration that applies to it. Another example is loop scope analysis where the problem is to associate each occurrence of a CONTINUE or BREAK node with the shallowest LOOP that encloses it. Or label scope analysis, the problem to associate a GOTO node with the node to jump to, that is, with the STATEMENT associated with the label.

To take advantage of Parse::Eyapp::Scope, the compiler writer must mark at the appropriate time (for example a new block or new subroutine for identifier scope analysis, a new loop for loop scope analysis, etc.) the beginning of a new scope calling the method begin_scope. From that point on any ocurring instance of an object (for example, variables in expressions for identifier scope analysis, breaks and continues for loop scope analysis, etc.) must be declared calling the method scope_instance. The programmer must also mark the end of the current scope at the appropriate time.

$scope->end_scope

There are two ways of calling $scope->end_scope. The first one is for Scope Analysis Problems where a symbol table is needed (for example in identifier scope analysis and label scope analysis.

$scope->end_scope with first Arg a Symbol Table

For each ocurring instance of an object $x that occurred since the last call to begin_scope the call to

$scope->end_scope(\%symboltable, $definition_node, 'attr1', 'attr2', ... )

decorates the ocurring instance $x with several attributes:

  • An entry $x->{SCOPE_NAME} is built that will reference $definition_node.

  • An entry $x->{ENTRY_NAME} is built. That entry references $symboltable{$x->key} (to have a faster access from the instance to the attributes of the object). The instantiated nodes must have a $x->key method which provides the entry for the node in the symbol table:

    pl@nereida:~/src/perl/YappWithDefaultAction/examples$ sed -ne '651,657p' Types.eyp
    sub VAR::key {
      my $self = shift;
    
      return $self->child(0)->{attr}[0];
    }
    
    *VARARRAY::key = *FUNCTIONCALL::key = \&VAR::key;
  • For each aditional arguments attr#k an entry $x->{attr#k} will be built. That entry references $symboltable{$x->key}{attr#k}. Therefore the entry for $x in the symbol table must already have a field named attr#k.

In a list context $scope>end_scope returns two references. The first one is a reference to a list of node instantiated that weren't defined in the current scope. The second is a reference to a list of nodes that were defined in this scope. In a scalar context returns the first of these two. An instance $x is defined if, and only if, exists $symboltable{$_->key}.

$scope->end_scope for Simple Scope Analysis

Some scope analysis problems do not require the existence of a symbol table (for instance, the problem of associating a RETURN node with the FUNCTION that encloses it). For such kind of problems $scope>end_scope provides a second form of call. The second way to call $scope>end_scope is

$declared = $scopemanager->end_scope($definition_node);

The only argument is the reference to the node that controls/defines the scope. The method returns a reference to the declared nodes. Any node instanced with scope_instance since the last call to begin_scope is considered declared.

$scope->begin_scope

Marks the beginning of an scope. Example (file examples/Types.eyp):

loopPrefix:
    $WHILE '(' expression ')'
      {
        $loops->begin_scope;
        $_[3]->{line} = $WHILE->[1]; # Save the line for error diagostic
        $_[3]
      }

$scope->scope_instance

Declares the node argument to be an occurring instance of the scope:

nereida:~/doc/casiano/PLBOOK/PLBOOK/code> \
    sed -ne '375,380p' Simple6.eyp | cat -n
 1      $Variable '=' binary
 2        {
 3          my $parser = shift;
 4          $ids->scope_instance($Variable);
 5          $parser->YYBuildAST(@_); # "Manually" build the node
 6        }

Parse::Eyapp::Scope->new

Parse::Eyapp::Scope->new returns a scope managment object. The scope mapping function is implemented by Parse::Eyapp::Scope through a set of attributes that are added to the nodes involved in the scope analysis. The names of these attributes can be specified using the parameters of Parse::Eyapp::Scope->new. The arguments of new are:

  • SCOPE_NAME is the name chosen for the attribute of the node instance which will held the reference to the definition node. If not specified it will take the value "scope".

  • ENTRY_NAME is the name of the attribute of the node instance which will held the reference to the symbol table entry. By default takes the value "entry".

  • SCOPE_DEPTH is the name for an attribute of the definition node. Optional. If not specified it will not be defined.

ENVIRONMENT

Remember to set the environment variable PERL5LIB if you decide to install Parse::Eyapp at a location other than the standard. For example, on a bash or sh:

export PERL5LIB-/home/user/wherever_it_is/lib/:$PERL5LIB

on a csh or tcsh

setenv PERL5LIB /home/user/wherever_it_is/lib/:$PERL5LIB

Be sure the scripts eyapp and treereg are in the execution PATH.

DEPENDENCIES

This distribution depends on the following modules:

It seems that List::Util is in the core of Perl distributions since version 5.73:

> perl -MModule::CoreList -e 'print Module::CoreList->first_release("List::Util")'
5.007003

and Data::Dumper is also in the core since 5.5:

> perl -MModule::CoreList -e 'print Module::CoreList->first_release("Data::Dumper")'
5.005

and Pod::Usage is also in the core since 5.6:

> perl -MModule::CoreList -e 'print Module::CoreList->first_release("Pod::Usage")'
5.006

I also recommend the following modules:

The dependence on Test::Warn, Test::Pod and Test::Exception is merely for the execution of tests. If the modules aren't installed the tests depending on them will be skipped.

INSTALLATION

To install it, follow the traditional mantra:

perl Makefile.PL
make
make test
make install

Also:

  • Make a local copy of the examples/ directory in this distribution

  • Probably it will be also a good idea to make a copy of the tests in the t/ directory. They also illustrate the use of Eyapp

  • Print and read the pdf file in http://nereida.deioc.ull.es/~pl/perlexamples/Eyapp.pdf

BUGS AND LIMITATIONS

  • This distribution is an alpha version. A release will be in CPAN by February 2007. Hopefully, at that time the interface will freeze or -at least- changes in the API will be minor. In the meanwhile it will be likely to change.

  • The way Parse::Eyapp parses Perl code is verbatim the way it does Parse::Yapp 1.05. Quoting Francois Desarmenien Parse::Yapp documentation:

    "Be aware that matching braces in Perl is much more difficult than in C: inside strings they don't need to match. While in C it is very easy to detect the beginning of a string construct, or a single character, it is much more difficult in Perl, as there are so many ways of writing such literals. So there is no check for that today. If you need a brace in a double-quoted string, just quote it (\{ or \}). For single-quoted strings, you will need to make a comment matching it in th right order. Sorry for the inconvenience.

    {
        "{ My string block }".
        "\{ My other string block \}".
        qq/ My unmatched brace \} /.
        # Force the match: {
        q/ for my closing brace } /
        q/ My opening brace { /
        # must be closed: }
    }

    All of these constructs should work."

    Alternative exact solutions were tried but resulted in much slower code. Therefore, until something faster is found, I rather prefer for Parse::Eyapp to live with this limitation.

    The same limitation may appear inside header code (code between %{ and %})

  • English is not my native language. For sure this text has lexical, syntactic and semantic errors. I'll be most gratefull to know about any typos, grammar mistakes, ways to rewrite paragraphs and misconceptions you have found.

  • There are unknown bugs. Please report problems to Casiano Rodriguez-Leon (casiano@cpan.org).

SEE ALSO

REFERENCES

  • The classic Dragon's book Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman (Addison-Wesley 1986)

AUTHOR

Casiano Rodriguez-Leon (casiano@ull.es)

ACKNOWLEDGMENTS

This work has been supported by CEE (FEDER) and the Spanish Ministry of Educación y Ciencia through Plan Nacional I+D+I number TIN2005-08818-C04-04 (ULL::OPLINK project http://www.oplink.ull.es/). Support from Gobierno de Canarias was through GC02210601 (Grupos Consolidados). The University of La Laguna has also supported my work in many ways and for many years.

A large percentage of code is verbatim taken from Parse::Yapp 1.05. The author of Parse::Yapp is Francois Desarmenien.

I wish to thank Francois Desarmenien for his Parse::Yapp module, to my students at La Laguna and to the Perl Community. Special thanks to my family and Larry Wall.

LICENCE AND COPYRIGHT

Copyright (c) 2006-2007 Casiano Rodriguez-Leon (casiano@ull.es). All rights reserved.

Parse::Yapp copyright is of Francois Desarmenien, all rights reserved. 1998-2001

These modules are free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

1 POD Error

The following errors were encountered while parsing the POD:

Around line 1651:

Non-ASCII character seen before =encoding in 'sintáctico\n";'. Assuming CP1252