NAME
Language::AttributeGrammar - Attribute grammars for doing computations over trees.
SYNOPSIS
use Language::AttributeGrammar;
# Grammar to return a new tree that is just like the old one, except
# every leaf's value is the value of the minimum leaf.
my $grammar = new Language::AttributeGrammar <<'END_GRAMMAR';
# find the minimum of a tree from the leaves up
Leaf: $/.min = { $<value> }
Branch: $/.min = { List::Util::min($<left>.min, $<right>.min)) }
# find the global minimum and propagate it back down the tree
ROOT: $/.gmin = { $/.min }
Branch: $<left>.gmin = { $/.gmin }
| $<right>.gmin) = { $/.gmin }
# reconstruct the tree with every leaf replaced with the minimum value
Leaf: $/.result = { Leaf->new($/.gmin) }
Branch: $/.result = { Branch->new($<left>.result, $<right>.result) }
END_GRAMMAR
# This grammar expects that you define these classes:
# Branch (with a ->left and ->right attribute)
# Leaf (with a ->value attribute)
# Use the grammar
my $tree = Branch->new( Leaf->new(1),
Branch->new( Leaf->new(2), Leaf->new(3)));
# Apply the attribute grammar to the data structure and fetch the result
my $result = $grammar->apply($tree, 'result');
DESCRIPTION
This module implements simple (for now) Attribute Grammar support for Perl data structures. An attribute grammar is a way to specify computation over a predefined data structure, say, as generated by Parse::RecDescent. This is done by associating attributes with the nodes of the data structure.
There are two types of attributes: synthesized and inherited. Synthesized attributes propagate bottom-up, that is, they use information from the children of a node to infer the attribute's value on that node. Inherited attributes are the opposite: they use information from a node in the structure to infer attributes on its chilren.
In the example above in the synopsis, the min
attribute is synthesized, since it takes the values at the leaves and infers the minimum at a branch. The gmin
(global minimum) attribute is inherited, since it uses gmin
that was computed at the root node and propagates it downward to the leaves.
Syntax
Some special syntax is used in throughout the definitions, borrowed from the syntax for Perl 6 grammars.
$/
The current node.
$/.attr
The
attr
attribute on the current node.$<foo>
The child node named
foo
of the current node.$<child>.attr
The
attr
attribute on the child node.`arbitrary(code)`.attr
Execute
arbitrary(code)
IN LIST CONTEXT and fetch theattr
attribute from each element. So:Foo: $/.bar = { `get_child($/)`.bar } # WRONG
$/.bar
will always be 1 (the number of thingsget_child
returned). If you want to do this right, since you are only intending to use one value:Foo: $/.bar = { `get_child($/)`.bar[0] } # okay
Also, the code inside backticks must not refer to any lexical variables or any attributes. That is,
$/
and his children are the only variables you may refer to (but you may call methods on them, etc.).
The grammar definition is composed of a series of semantics definitions. An example semantic definition is:
Foo: $/.baz = { $<child>.baz }
| $<child>.quux = { $/.quux }
This specifies the implementations of the synthesized attribute baz
and the inherited attribute quux
for nodes of type Foo. That is, you can find the baz
attribute of the current node by looking at the baz attribute of its child, and you can find the quux
attribute of any node's child by looking at the quux
attribute of the node itself.
The $<child>
notation is defined to pretty much do the right thing. But, in the name of predictability, here are the semantics:
If $/
has a method named child
(for the attribute $<child>
), then that method is called with no arguments to fetch the attribute. Otherwise, if $/
is a blessed hash, then the module snoops inside the hash and pulls out the key named "child". If the hash has no such key, or the object is not a blessed hash (eg. a blessed array), then we give up.
If your tree has a different convention for extracting child nodes, you may use the backtick syntax described above:
Cons: $/.sum = { `$/->get_child('head')`.sum + `$/->get_child('tail')`.sum }
Nil: $/.sum = { 0 }
Cons: `$/->get_child('head')`.gsum = { $/.gsum }
In the future I may provide a callback that allows the user to define the meaning of $<child>
.
There is one special class name that can go to the left of the colon: ROOT
. This represents the root of the data structure you were given, and is used to avoid the common annoyance of creating a Root node class tha just bootstraps the "real" tree. So when you say:
ROOT: $/.gmin = { $/.min }
That means that when you're at the root of the data structure, the global minimum is equal to the local minimum.
Usage
After you have a grammar specification in a string, create a new grammar object:
my $grammar = Language::AttributeGrammar->new($grammar_string);
This contains a minimal data structure of the semantics definitions. The constructor also can take an options hash as its first argument:
my $grammar = Language::AttributeGrammar->new({ prefix => 'Foo::' }, $grammar_string);
The only option at the moment is prefix
, which will prepend this prefix to all the types mentioned in your grammar. However, if you need to omit this prefix, name the type in your grammar starting with a ::
, and the prefix will not be prepended.
In order to find an attribute on the root node of a data structure, apply
it to the data structure, giving the name of the attribute you wish to find.
my $attr = $grammar->apply($data, 'attr');
You may set attributes on the root of the data structure by passing a hash.
my $attr = $grammar->apply($data, 'attr', {
starting_number => 0,
});
In order to find attributes on nodes that are lower in the structure, you must concoct your attribute grammar to propagate that information up the tree somehow. Usually this is done using a synthesized attribute that mirrors the given data structure.
AUTHOR
Luke Palmer <lrpalmer gmail com>