NAME
TPath::Forester - a generator of TPath expressions for a particular class of nodes
VERSION
version 1.007
SYNOPSIS
# we apply the TPath::Forester role to a class
{
    package MyForester;
    use Moose;                       # for simplicity we omit removing Moose droppings, etc.
    use MooseX::MethodAttributes;    # needed if you're going to add some attributes
    with 'TPath::Forester';          # compose in the TPath::Forester methods and attributes

    # define abstract methods
    sub children { $_[1]->children }    # our nodes know their children
    sub parent   { $_[1]->parent }      # our nodes know their parent
    sub has_tag {                       # our nodes have a tag attribute which is
        my ($self, $node, $tag) = @_;   # their only tag
        $node->tag eq $tag;
    }
    sub matches_tag {
        my ($self, $node, $re) = @_;
        $node->tag =~ $re;
    }

    # define an attribute
    sub baz :Attr {
        # the canonical order of arguments, none of which we need
        # my ($self, $node, $index, $collection, @args) = @_;
        'baz';
    }
}
# now select some nodes from a tree
my $f = MyForester->new; # make a forester
my $path = $f->path('//foo/>bar[@depth = 4]'); # compile a path
my $root = fetch_tree(); # get a tree of interest
my @nodes = $path->select($root); # find the nodes of interest
# say our nodes have a text method that returns a string
# tests receive the forester, the node, and the index, so the node is $_[1]
$f->add_test( sub { $_[1]->text =~ /^\s+$/ } ); # ignore whitespace nodes
$f->add_test( sub { $_[1]->text =~ /^-?\d+$/ } ); # ignore integers
$f->add_test( sub { ! length $_[1]->text } ); # ignore empty nodes
# reset to ignoring nothing
$f->clear_tests;
DESCRIPTION
A TPath::Forester understands your trees and hence can translate TPath expressions into objects that will select the appropriate nodes from your trees. It can also generate an index appropriate to your trees if you're doing multiple selects on a particular tree.
TPath::Forester is a role. It provides most, but not all, methods and attributes required to construct TPath::Expression objects. You must specify how to find a node's children and its parent (you may have to rely on a TPath::Index for this), and you must define how a tag string or regex may match a node, if at all.
Why "Forester"
Foresters are people who can tell you about trees. A class with the role TPath::Forester can also tell you about trees. I think now "arborist" sounds better, but I don't feel like refactoring everything to use a new name.
ATTRIBUTES
log_stream
A TPath::LogStream required by the @log attribute from TPath::Attributes::Standard. By default it is TPath::StderrLog.
one_based
Whether to use XPath-style index predicates, with [1] being the index of the first element, or zero-based indices, with [0] being the first index. This only affects non-negative indices. This attribute is false by default.
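The effect of this setting on an index predicate can be pictured with a small stand-alone sketch. The helper resolve_index is hypothetical and is not part of TPath; it only mirrors the rule stated above. Two points are assumptions of the sketch rather than documented behavior: negative indices are taken to count from the end of the candidate list, as Perl array indices do, and [0] is taken to select nothing when one_based is true.

```perl
use strict;
use warnings;

# Hypothetical helper, not TPath code: pick the item that a predicate such
# as [1] or [-1] would address in a list of candidate nodes.
sub resolve_index {
    my ( $items, $idx, $one_based ) = @_;
    if ( $one_based && $idx >= 0 ) {
        return undef if $idx == 0;    # assumption: [0] selects nothing when one-based
        $idx--;                       # [1] addresses the first element
    }
    my $n = @$items;
    return undef if $idx >= $n || $idx < -$n;    # out of range
    return $items->[$idx];            # assumption: negative indices count from the end
}

my @candidates = qw(a b c);
print resolve_index( \@candidates, 1,  0 ), "\n";    # zero-based [1]: prints "b"
print resolve_index( \@candidates, 1,  1 ), "\n";    # one-based [1]: prints "a"
print resolve_index( \@candidates, -1, 1 ), "\n";    # negative index: prints "c"
```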
case_insensitive
Whether selectors are case-insensitive in their matching of tags. This attribute is false by default.
METHODS
add_test, has_tests, clear_tests
Add a code ref that will be used to test whether a node is ignorable. The return value of this code will be treated as a boolean value. If it is true, the node, and all its children, will be passed over as possible items to return from a select.
Example test:
$f->add_test(sub {
    my ($forester, $node, $index) = @_;
    return $forester->has_tag($node, 'foo');
});
Every test will receive the forester itself, the node, and the index as arguments. This example test will cause the forester $f to ignore foo nodes.
This method has the companion methods has_tests and clear_tests. The former says whether the list is empty and the latter clears it.
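To make the pruning behavior concrete, here is a minimal stand-alone sketch that does not use TPath at all. The hash-based node layout, the @tests list, and the visible_tags function are all hypothetical; the sketch assumes that an ignored node hides everything beneath it, which is how the description above reads.

```perl
use strict;
use warnings;

# Hypothetical node type for this sketch: a hash with a tag and a child list.
my $tree = {
    tag      => 'root',
    children => [
        { tag => 'foo', children => [ { tag => 'bar', children => [] } ] },
        { tag => 'baz', children => [] },
    ],
};

# Tests take the arguments the documentation lists: forester, node, index.
my @tests = (
    sub { my ( $forester, $node, $index ) = @_; $node->{tag} eq 'foo' },
);

# Collect the tags of all nodes that survive the tests. A node for which any
# test returns true is skipped together with everything beneath it.
sub visible_tags {
    my ( $node, $tests ) = @_;
    return () if grep { $_->( undef, $node, undef ) } @$tests;
    return ( $node->{tag}, map { visible_tags( $_, $tests ) } @{ $node->{children} } );
}

print join( ' ', visible_tags( $tree, \@tests ) ), "\n";    # prints "root baz"
```

With an empty test list the same call returns every tag, which corresponds to the state after clear_tests.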
add_attribute
Expects a name, a code reference, and possibly options. Adds the attribute to the forester.
If the attribute name is already in use, the method will croak unless you specify that this attribute should override the already named attribute. E.g.,
$f->add_attribute( 'foo', sub { ... }, -override => 1 );
If you specify the attribute as overriding and the name is *not* already in use, the method will carp. You can use the -force option to skip all this checking and just add the attribute.
Note that the code reference will receive the forester, a node, an index, a collection of nodes, and optionally any additional arguments. If you want the attribute to evaluate as undefined for a particular node, it must return undef for this node.
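The interplay of the name check, -override, and -force can be summarized in a short stand-alone sketch. This is not TPath's implementation: the %attributes table and the register_attribute function are hypothetical stand-ins that only reproduce the rules described above, and the message texts are invented.

```perl
use strict;
use warnings;
use Carp qw(carp croak);

# Hypothetical stand-in for a forester's attribute table; not TPath code.
my %attributes;

sub register_attribute {
    my ( $name, $code, %opts ) = @_;
    unless ( $opts{-force} ) {
        my $in_use = exists $attributes{$name};
        # a name clash is fatal unless the caller asked to override
        croak "attribute $name already defined" if $in_use && !$opts{-override};
        # asking to override a name that is not in use only earns a warning
        carp "attribute $name overrides nothing" if !$in_use && $opts{-override};
    }
    $attributes{$name} = $code;
    return;
}

register_attribute( 'foo', sub { 1 } );                    # new name: accepted
eval { register_attribute( 'foo', sub { 2 } ); 1 }
  or print "refused: $@";                                  # name in use, no -override
register_attribute( 'foo', sub { 3 }, -override => 1 );    # replaces the first definition
register_attribute( 'foo', sub { 4 }, -force => 1 );       # no checks at all
```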
attribute
Expects a TPath::Context, an attribute name, and an optional parameter list. Returns the value of the attribute in that context.
path
Takes a TPath expression and returns a TPath::Expression.
index
Takes a tree node and returns a TPath::Index object that TPath::Expression objects can use to cache information about the tree rooted at the given node.
parent
Expects a TPath::Context and returns the parent of the context node according to the index. If your nodes know their own parents, you probably want to override this method. See also TPath::Index.
id
Expects a node. Returns the id of the node, if any. By default this method always returns undef. Override it if your nodes have some defined notion of id.
autoload_attribute
Expects an attribute name and optionally a list of arguments. Returns a code reference instantiating the attribute. This method is required for attributes such as
//foo[@:a]
or
//bar[@:b(1)]
Note the unescaped colon preceding the attribute name.
Autoloading is useful for things such as HTML or XML trees, where nodes may have ad hoc attributes.
This method must be defined by each forester requiring attribute auto-loading. The default method will always return undef, and if one attempts to use it to autoload an attribute an error will be thrown during expression compilation.
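As an illustration of the shape such a method might take, the sketch below defines it on a stand-alone class. SketchForester is hypothetical and does not compose TPath::Forester; the nodes are assumed to be hashes that keep their ad hoc attributes under an attr key. The returned code reference reads only the first two of the arguments documented for attribute code under add_attribute, and it is called by hand here rather than by a compiled expression.

```perl
use strict;
use warnings;

# Hypothetical forester-like class; it only shows the shape of the method.
package SketchForester;

sub new { bless {}, shift }

# Given an attribute name (and any arguments from the expression), return a
# code ref that computes the attribute for a node.
sub autoload_attribute {
    my ( $self, $name, @args ) = @_;
    return sub {
        my ( $forester, $node ) = @_;            # remaining arguments ignored here
        my $attrs = $node->{attr} or return undef;
        return $attrs->{$name};                  # undef when the node lacks it
    };
}

package main;

my $f    = SketchForester->new;
my $attr = $f->autoload_attribute('a');          # as for //foo[@:a]
my $node = { tag => 'foo', attr => { a => 'x' } };
print $attr->( $f, $node ), "\n";                # prints "x"
```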
is_leaf
Expects a node and an index.
Returns whether the context node is a leaf. Override this with something more efficient where available. E.g., where the node provides an is_leaf method,
sub is_leaf { $_[1]->is_leaf }
is_root
Expects a node and an index.
Returns whether the context node is the root. Delegates to index.
Override this with something more efficient where available. E.g., where the node provides an is_root method,
sub is_root { $_[1]->is_root }
has_tag
Expects a node and a string. Returns whether the node, in whatever sense is appropriate to this sort of node, "has" the string as a tag. See the required tag method.
matches_tag
Expects a node and a compiled regex. Returns whether the node, in whatever sense is appropriate to this sort of node, has a tag that matches the regex. See the required tag method.
wrap
Expects a node and possibly an options hash. Returns a node of the type understood by the forester.
If your forester must coerce things into a tree of the right type, override this method, which otherwise just passes through its second argument.
Note, if you do need to override the default wrap, you'll have to jump through a few Moose hoops. The basic pattern is
...
use Moose;
...
with 'TPath::Forester' => { -excludes => 'wrap' };
...
{
    no warnings 'redefine';
    sub wrap {
        my ($self, $node, %opts) = @_;
        return $node if blessed $node and $node->isa('MyNode');
        # coerce
        ...
    }
}
See TPath::Forester::Ref for an example.
ROLES
TPath::Attributes::Standard, TPath::TypeCheck
REQUIRED METHODS
children
Expects a node and an index. Returns the children of the node as a list.
tag
Expects a node and returns the value selectors are matched against, or undef if the node has no tag.
If your node type cannot be so easily mapped to a particular tag, you may want to override the has_tag and matches_tag methods and supply a no-op method for tag.
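A minimal sketch of that last arrangement follows, for nodes that carry several labels rather than one tag. MultiTagSketch is a hypothetical stand-alone class that does not compose TPath::Forester, and the labels key is an assumed node layout; only the argument lists of the three methods are taken from this document.

```perl
use strict;
use warnings;

# Hypothetical forester-like class for nodes carrying several labels.
package MultiTagSketch;

sub new { bless {}, shift }

# No single tag to report, so tag is a no-op.
sub tag { return }

# A node "has" a tag if any of its labels equals the string.
sub has_tag {
    my ( $self, $node, $tag ) = @_;
    return scalar grep { $_ eq $tag } @{ $node->{labels} };
}

# A node matches if any of its labels matches the compiled regex.
sub matches_tag {
    my ( $self, $node, $re ) = @_;
    return scalar grep { $_ =~ $re } @{ $node->{labels} };
}

package main;

my $f    = MultiTagSketch->new;
my $node = { labels => [qw(noun plural)] };
print $f->has_tag( $node, 'plural' )      ? "yes\n" : "no\n";    # prints "yes"
print $f->matches_tag( $node, qr/^verb/ ) ? "yes\n" : "no\n";    # prints "no"
```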
AUTHOR
David F. Houghton <dfhoughton@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2013 by David F. Houghton.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.