NAME
XML::XPathScript::Processor - XML::XPathScript transformation engine
VERSION
version 2.00
SYNOPSIS
# OO API
use XML::XPathScript::Processor;
my $processor = XML::XPathScript::Processor->new;
$processor->set_dom( $dom );
$processor->set_template( $template );
my $transformed = $processor->apply_templates( '//foo' );
# functional API
use XML::XPathScript::Processor;
XML::XPathScript::Processor->import_functional;
set_dom( $dom );
set_template( $template );
my $transformed = apply_templates( '//foo' );
DESCRIPTION
The XML::XPathScript distribution offers XML parser glue, an embedded stylesheet language, and a way of processing an XML document into text output. This module implements the last part: it takes an already filled out $template object and an already parsed XML document (both of which are usually provided by the parent XML::XPathScript object), and provides a simple API to implement stylesheets.
Typically, the processor is encapsulated within an XML::XPathScript object, in which case all the black magic is already done for you, and the only part you have to worry about is the set of XPathScript language functions that XML::XPathScript::Processor imports into the stylesheet (see "XPATHSCRIPT LANGUAGE FUNCTIONS").
It is also possible to use a processor on its own, without using a stylesheet. This might be desirable, for example, to use XPathScript within a different templating system, like Embperl or HTML::Mason. For a discussion on how to use this module in such cases, see section "Embedding XML::XPathScript::Processor in a Templating System".
Embedding XML::XPathScript::Processor in a Templating System
It is possible to use the XPathScript processing engine without having to rely on stylesheets. This can be desirable if one wishes to use XPathScript within a different templating system, like Embperl or HTML::Mason. To do so, one simply has to directly use XML::XPathScript::Processor.
Example, with HTML::Mason:
<%perl>
use XML::XPathScript::Processor;
use XML::XPathScript::Template;
use XML::LibXML;
my $processor = XML::XPathScript::Processor->new;
# load the dom
my $dom = XML::LibXML->new->parse_string( <<'END_XML' );
<orchid>
<genus>Miltonesia</genus>
<species>spectabilis</species>
<variety>moreliana</variety>
</orchid>
END_XML
$processor->set_dom( $dom );
# load the template
my $template = XML::XPathScript::Template->new;
$processor->set_template( $template );
$template->set( orchid => { showtag => 0 } );
$template->set( genus => { rename => 'i' } );
$template->set( species => { rename => 'i' } );
$template->set( variety => { pre => 'var. ' } );
</%perl>
<p>This orchid is a <% $processor->apply_templates( '//orchid' ) %>.</p>
Same example, with Embperl:
[!
use XML::XPathScript::Processor;
use XML::XPathScript::Template;
use XML::LibXML;
!]
[-
$processor = XML::XPathScript::Processor->new;
# load the dom
$dom = XML::LibXML->new->parse_string( <<'END_XML' );
<orchid>
<genus>Miltonesia</genus>
<species>spectabilis</species>
<variety>moreliana</variety>
</orchid>
END_XML
$processor->set_dom( $dom );
# load the template
$template = XML::XPathScript::Template->new;
$processor->set_template( $template );
$template->set( orchid => { showtag => 0 } );
$template->set( genus => { rename => 'i' } );
$template->set( species => { rename => 'i' } );
$template->set( variety => { pre => 'var. ' } );
-]
<p>This orchid is a [+ $processor->apply_templates( '//orchid' ) +].</p>
XPATHSCRIPT LANGUAGE FUNCTIONS
This section covers the utility functions that are available within a stylesheet.
- processor
$processor = processor()
Returns the processor object. Useful for when XML::XPathScript::Processor is used in functional mode.
- set_dom, get_dom
set_dom( $dom )
$dom = get_dom()
Accessors for the dom the processor is to transform. $dom must be an XML::LibXML or XML::XPath document or element.
- get_parser
$parser = get_parser()
Returns the parser associated with the loaded dom as a string ( 'XML::LibXML' or 'XML::XPath'), or undef if no dom has been loaded yet.
- enable_binmode
enable_binmode()
Enables binmode for the processor's output. See "binmode" in XML::XPathScript.
- get_binmode
$mode = get_binmode()
Returns true if binmode has been enabled, false otherwise.
- set_template, get_template
set_template( $t )
$t = get_template()
Accessors for the processor's template. The template $t must be an XML::XPathScript::Template object.
- set_interpolation, get_interpolation
set_interpolation( $bool )
$bool = get_interpolation()
Sets / accesses the interpolation mode (on or off) of the processor.
- set_interpolation_regex, get_interpolation_regex
Sets / accesses the interpolation regex used by the processor.
- findnodes
@nodes = findnodes( $path )
@nodes = findnodes( $path, $context )
Returns a list of nodes found by XPath expression $path, optionally using $context as the context node (if not provided, defaults to the root node of the document). In scalar context returns a NodeSet object (but you do not want to do that, see "XPath scalar return values considered harmful" in XML::XPathScript).
- findvalue
$value = findvalue( $path )
$value = findvalue( $path, $context )
Evaluates XPath expression $path and returns the resulting value. If the path returns an object, stringification is done automatically for you using "xpath_to_string".
- xpath_to_string
$string = xpath_to_string( $blob )
Converts any XPath data type, such as "Literal", "Numeric", "NodeList", text nodes, etc. into a pure Perl string (UTF-8 tainted too - see "is_utf8_tainted"). Scalar XPath types are interpreted in the straightforward way, DOM nodes are stringified into conforming XML, and NodeLists are stringified by concatenating the stringification of their members (in this last case, the result is not guaranteed to be valid XML).
See "XPath scalar return values considered harmful" in XML::XPathScript on why this is useful.
- findvalues
@values = findvalues( $path )
@values = findvalues( $path, $context )
Evaluates XPath expression $path as a nodeset expression, just like "findnodes" would, but returns a list of UTF8-encoded XML strings instead of node objects or node sets. See also "XPath scalar return values considered harmful" in XML::XPathScript.
- findnodes_as_string
@nodes = findnodes_as_string( $path )
@nodes = findnodes_as_string( $path, $context )
Similar to "findvalues" but concatenates the XML snippets. The result obviously is not guaranteed to be valid XML.
- matches
$bool = matches( $node, $path )
$bool = matches( $node, $path, $context )
Returns true if the node matches the path (optionally in context $context), false otherwise.
- apply_templates
$transformed = apply_templates( @nodes, \%params )
$transformed = apply_templates( $xpath, $context, \%params )
$transformed = apply_templates( $xpath, \%params )
This is where the whole magic of XPathScript resides: it recursively applies the stylesheet templates to the nodes provided either literally (first invocation form) or through an XPath expression (second and third invocation forms), and returns the concatenation of all results as a string.
If called without nodes or an XPath expression, it renders the whole document (same as apply_templates('/')).
A hash of parameters, %params, can also be passed to apply_templates; it is handed on to any testcode function called from the template.
Calls to apply_templates() may occur both implicitly (at the top of the document, and for rendering subnodes when the templates choose to handle that by themselves) and explicitly (because testcode routines require the XML::XPathScript::Processor to "DO_SELF_AND_KIDS").
If appropriate care is taken in all templates (especially the testcode routines and the text() template), the string result of apply_templates need not be UTF-8 (see "binmode" in XML::XPathScript): it is thus possible to use XPathScript to produce output in any character set without an extra translation pass.
- call_template
call_template( $node, $t, $templatename )
EXPERIMENTAL - allows testcode routines to invoke a template by name, even if the selectors do not fit (e.g. one can apply template B to an element node of type A). Returns the stylesheeted string computed out of $node just like "apply_templates" would.
- is_element_node
$bool = is_element_node( $object )
Returns true if $object is an element node, false otherwise.
- is_text_node
$bool = is_text_node( $object )
Returns true if $object is a "true" text node (not a comment node), false otherwise.
- is_comment_node
$bool = is_comment_node( $object )
Returns true if $object is an XML comment node, false otherwise.
- is_pi_node
$bool = is_pi_node( $object )
Returns true if $object is a processing instruction node, false otherwise.
- is_nodelist
$bool = is_nodelist( $object )
Returns true if $object is a node list (as returned by "findnodes" in scalar context), false otherwise.
- is_utf8_tainted
$bool = is_utf8_tainted( $string )
Returns true if Perl thinks that $string is a string of characters (in UTF-8 internal representation), and false if Perl treats $string as a meaningless string of bytes.
The dangerous part of the story is when concatenating a non-tainted string with a tainted one, as it causes the whole string to be re-interpreted into UTF-8, even the part that was supposedly meaningless character-wise, and that happens in a nonportable fashion (depends on locale and Perl version). So don't do that - and use this function to prevent that from happening.
- get_xpath_of_node
$xpath = get_xpath_of_node( $node )
Returns an XPath string that points to $node, from the root. Useful to create error messages that point at some location in the original XML document.
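The mixing hazard described under "is_utf8_tainted" can be reproduced with core Perl alone. The following is a minimal sketch, assuming that is_utf8_tainted reports the same internal flag as the core utf8::is_utf8 function (the documentation above does not say how it is implemented). The helper safe_concat is hypothetical and is not part of XML::XPathScript::Processor.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Raw bytes, as they might arrive from a file or socket:
# "caf" followed by the two-byte UTF-8 encoding of e-acute.
my $bytes = "caf\xC3\xA9";

# Decoding yields a character string, which Perl flags internally as UTF-8.
my $chars = decode( 'UTF-8', $bytes );

printf "bytes flagged: %s\n", utf8::is_utf8($bytes) ? 'yes' : 'no';
printf "chars flagged: %s\n", utf8::is_utf8($chars) ? 'yes' : 'no';

# The dangerous case: concatenating flagged and unflagged strings.
# Current Perls upgrade each raw byte to its own character, so the
# result has 4 + 5 = 9 characters instead of the intended 4 + 4 = 8.
my $mixed = $chars . $bytes;
printf "mixed length: %d\n", length $mixed;

# Hypothetical helper: decode any unflagged part before joining.
# ASSUMPTION: unflagged input is UTF-8 encoded bytes.
sub safe_concat {
    my @parts = @_;
    return join '',
        map { utf8::is_utf8($_) ? $_ : decode( 'UTF-8', $_ ) } @parts;
}

printf "safe length: %d\n", length safe_concat( $chars, $bytes );
```

Inside a stylesheet the same check would presumably be written with is_utf8_tainted( $string ) in place of utf8::is_utf8.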
XPATHSCRIPT LANGUAGE CONSTANTS
$DO_SELF_AND_KIDS, $DO_SELF_ONLY, $DO_NOT_PROCESS, $DO_TEXT_AS_CHILD
DO_SELF_AND_KIDS, DO_SELF_ONLY, DO_NOT_PROCESS, DO_TEXT_AS_CHILD
These constants are used to define the action tag of an element, or the return value of a testcode function (see "Stylesheet#action" in XML::XPathScript). They are automatically exported.
The pseudo-bareword way to refer to the constants (e.g., DO_SELF_ONLY) is deprecated, and will be removed in a future release.
METHODS
- import_functional
XML::XPathScript::Processor->import_functional( $prefix )
$processor->import_functional( $prefix )
Imports the stylesheet utility functions into the current namespace. If $prefix is given, it is prepended to the function names (e.g., if $prefix is 'xps_', apply_templates will become xps_apply_templates).
If the first form is used, a new processor object is secretly created and assigned to the namespace (it can be retrieved using the function processor()). The second form uses the already existing $processor as the underlying processor object for the namespace.
Example:
use XML::XPathScript::Processor;
# import the goodies in the current namespace
XML::XPathScript::Processor->import_functional;
# set the document and template we want to use
set_dom( $xml_dom );
set_template( $template );
my @foo_nodes = findnodes( '//foo' );
# print the last foo, transformed
print apply_templates( $foo_nodes[-1] );
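The second invocation form and the $prefix argument can be combined. The following is a sketch only, assuming XML::LibXML and the XML::XPathScript distribution are installed. It uses only calls documented above, and the sample document and the 'xps_' prefix are made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::XPathScript::Processor;
use XML::XPathScript::Template;

# a small made-up document to transform
my $dom = XML::LibXML->new->parse_string('<doc><foo>bar</foo></doc>');

# a template that renames <foo> elements to <b>
my $template = XML::XPathScript::Template->new;
$template->set( foo => { rename => 'b' } );

# configure a processor through the OO API first
my $processor = XML::XPathScript::Processor->new;
$processor->set_dom( $dom );
$processor->set_template( $template );

# second form: reuse the existing processor, and prefix every
# imported function with 'xps_' to avoid name clashes
$processor->import_functional( 'xps_' );

# the imported functions operate on $processor
my $out = xps_apply_templates( '//foo' );
print $out, "\n";
```

A prefix is mostly useful when the importing package already defines functions with names such as findnodes or matches.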
AUTHORS
Yanick Champoux <yanick@cpan.org>
Dominique Quatravaux <domq@cpan.org>
Matt Sergeant <matt@sergeant.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2019, 2018, 2008, 2007 by Matt Sergeant.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.