NAME
XML::Struct::Reader - Read XML streams into XML data structures
SYNOPSIS
my $reader = XML::Struct::Reader->new( from => "file.xml" );
my $data = $reader->read;
DESCRIPTION
This module reads an XML stream (via XML::LibXML::Reader) into XML::Struct/MicroXML data structures.
METHODS
read = readNext ( [ $stream ] [, $path ] )
Read the next XML element from a stream. If no path option is specified, the reader's path option is used ("*" by default, first matching the root, then every other element).
readDocument( [ $stream ] [, $path ] )
Read an entire XML document. In contrast to read/readNext, this method always reads the entire stream. The return value is the first element (the root element by default) in scalar context and a list of elements in list context. Multiple elements can be returned, for instance, when a path was specified to select document fragments.
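The context-dependent return value might be used as in the following sketch. It assumes XML::Struct is installed; the sample document and the path "/root/a" are made up for illustration, and the constructor options are the ones listed under CONFIGURATION.

```perl
use strict;
use warnings;
use XML::Struct::Reader;

# Made-up sample document, passed as a string of XML data via "from".
my $xml = '<root><a>1</a><b>2</b><a>3</a></root>';

# Scalar context: the first matching element (the root by default).
my $root = XML::Struct::Reader->new( from => $xml )->readDocument;

# List context with a path: every element selected by the path.
my @items = XML::Struct::Reader->new(
    from => $xml,
    path => '/root/a',    # illustrative absolute path
)->readDocument;

printf "root element: %s\n", $root->[0];
printf "matched %d <a> elements\n", scalar @items;
```

A fresh reader is created for each call because readDocument consumes the stream it reads from.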
readElement( [ $stream ] )
Read an XML element from a stream and return it as an array reference with element name, attributes, and child elements. In contrast to method read, this method expects the stream to be at an element node ($stream->nodeType == 1) or bad things might happen.
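To illustrate the array reference described above, here is a minimal sketch in plain Perl that walks such a structure and collects its text. It does not call the module: the sample $element is hand-written in the [ $name => \%attributes, \@children ] form, text children are assumed to be plain strings, and text_of is a hypothetical helper, not part of XML::Struct.

```perl
use strict;
use warnings;

# Hand-written element in the [ $name => \%attributes, \@children ] form.
my $element = [
    greeting => { lang => 'en' },
    [ 'Hello, ', [ b => {}, ['world'] ], '!' ],
];

# Hypothetical helper: concatenate all text below an element.
sub text_of {
    my ($node) = @_;
    return $node unless ref $node;    # text child (assumed plain string)
    my ( $name, $attributes, $children ) = @$node;
    return join '', map { text_of($_) } @{ $children || [] };
}

my ( $name, $attributes ) = @$element;
my $text = text_of($element);
print "$name ($attributes->{lang}): $text\n";
```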
readAttributes( [ $stream ] )
Read all XML attributes from a stream and return a (possibly empty) hash reference.
readContent( [ $stream ] )
Read all child elements of an XML element and return the result as a (possibly empty) array reference. Significant whitespace is only included if option whitespace is enabled.
CONFIGURATION
- from
-
A source to read from. Possible values include a string or string reference with XML data, a filename, a URL, a file handle, instances of XML::LibXML::Document or XML::LibXML::Element, and a hash reference with options passed to XML::LibXML::Reader.
- stream
-
An XML::LibXML::Reader to read from. If no stream has been defined, one must pass a stream parameter to the read... methods. Setting a source with option from automatically sets a stream.
- attributes
-
Include attributes (enabled by default). If disabled, the representation of an XML element will be
[ $name => \@children ]
instead of
[ $name => \%attributes, \@children ]
- path
-
Optional path expression to be used as default value when calling read. Paths must either be absolute (starting with "/") or consist of a single element name. The special name "*" matches all element names.
A path is a very reduced form of an XPath expression (no axes, no "..", no node tests, // only at the start...). Namespaces are not supported yet.
- whitespace
-
Include ignorable whitespace as text elements (disabled by default).
- ns
-
Define how XML namespaces should be processed. By default (value 'keep'), this document:
<doc> <x:foo xmlns:x="http://example.org/" bar="doz" /> </doc>
is transformed to this structure, keeping namespace prefixes and declarations as unprocessed element names and attributes:
[ 'doc', {}, [ [ 'x:foo', { 'bar' => 'doz', 'xmlns:x' => 'http://example.org/' } ] ] ]
Setting this option to 'strip' will remove all namespace prefixes and namespace declaration attributes, so the result would be:
[ 'doc', {}, [ [ 'foo', { 'bar' => 'doz' } ] ] ]
Setting this option to 'disallow' results in an error when namespace prefixes or declarations are read.
Expanding namespace URIs ('expand') is not supported yet.
- simple
-
Convert XML to a simple key-value structure (SimpleXML) with XML::Struct::Simple.
- depth
-
Only transform to a given depth, starting at 0 for the root node. Negative values, non-numeric values or undef are ignored (unlimited depth as default).
XML elements below the depth are converted to SimpleXML by default or to MicroXML if option simple is enabled. This can be configured with option deep.
This option is useful, for instance, to access document-oriented XML embedded in data-oriented XML.
- deep
-
How to transform elements below the given depth. This option is experimental.
- root
-
Include root element when converting to SimpleXML. Disabled by default.
- content
-
Name of text content when converting to SimpleXML.
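As an illustration of the ns option described above, the following plain-Perl sketch reproduces the documented effect of 'strip' on the structure obtained with 'keep'. It does not use the module and is not its implementation; strip_ns is a hypothetical helper written only to show how the two structures relate.

```perl
use strict;
use warnings;

# Structure from the ns example, as documented for ns => 'keep'.
my $kept = [
    'doc', {},
    [ [ 'x:foo', { 'bar' => 'doz', 'xmlns:x' => 'http://example.org/' } ] ]
];

# Hypothetical helper mimicking the documented effect of ns => 'strip'.
sub strip_ns {
    my ($node) = @_;
    return $node unless ref $node;    # text children are left untouched
    my ( $name, $attributes, $children ) = @$node;

    ( my $local = $name ) =~ s/^[^:]+://;    # drop the element prefix

    my %plain;
    for my $key ( keys %{ $attributes || {} } ) {
        next if $key eq 'xmlns' or $key =~ /^xmlns:/;    # drop declarations
        ( my $short = $key ) =~ s/^[^:]+://;             # drop attribute prefix
        $plain{$short} = $attributes->{$key};
    }

    my @result = ( $local, \%plain );
    push @result, [ map { strip_ns($_) } @$children ] if $children;
    return \@result;
}

my $stripped = strip_ns($kept);
print "element: $stripped->[2][0][0]\n";
```

In practice one would pass ns => 'strip' to the reader instead of post-processing the result.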