NAME

XML::Struct::Reader - Read XML streams into XML data structures

SYNOPSIS

my $reader = XML::Struct::Reader->new( from => "file.xml" );
my $data   = $reader->read;

DESCRIPTION

This module reads an XML stream (via XML::LibXML::Reader) into XML::Struct/MicroXML data structures.

METHODS

read = readNext ( [ $stream ] [, $path ] )

Read the next XML element from a stream. If no path option is specified, the reader's path option is used ("*" by default, first matching the root, then every other element).
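A minimal sketch of iterating over a stream with readNext, assuming the reader is constructed from an XML string and that option path is set to the single element name "item". The sample document and variable names are illustrative only.

```perl
use strict;
use warnings;
use XML::Struct::Reader;

# Illustrative sample data; any source accepted by option "from" would do.
my $xml = '<list><item id="1">a</item><item id="2">b</item></list>';

# path => 'item' makes each call return the next <item> element.
my $reader = XML::Struct::Reader->new( from => $xml, path => 'item' );

my @ids;
while ( my $element = $reader->readNext ) {
    # With attributes enabled (the default), each element is
    # [ $name => \%attributes, \@children ].
    my ( $name, $attributes, $children ) = @$element;
    push @ids, $attributes->{id};
    print "$name $attributes->{id}: @$children\n";
}
```

The loop ends when readNext returns a false value, that is, when no further matching element is found.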

readDocument( [ $stream ] [, $path ] )

Read an entire XML document. In contrast to read/readNext, this method always reads the entire stream. The return value is the first element (that is, the root element by default) in scalar context and a list of elements in list context. Multiple elements can be returned, for instance, when a path was specified to select document fragments.
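The difference between scalar and list context can be sketched as follows. The sample document is made up for illustration, and the path "a" is just one possible way to select fragments.

```perl
use strict;
use warnings;
use XML::Struct::Reader;

# Illustrative sample data.
my $xml = '<root><a>1</a><a>2</a></root>';

# Scalar context with the default path: the first element, here the root.
my $root = XML::Struct::Reader->new( from => $xml )->readDocument;

# List context with a path: every element matching the path.
my @fragments =
    XML::Struct::Reader->new( from => $xml, path => 'a' )->readDocument;

printf "root: %s, fragments: %d\n", $root->[0], scalar @fragments;
```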

readElement( [ $stream ] )

Read an XML element from a stream and return it as an array reference with element name, attributes, and child elements. In contrast to method read, this method expects the stream to be positioned at an element node ($stream->nodeType == 1); otherwise the result is undefined.
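A sketch of calling readElement on a stream that was positioned by hand, assuming an XML::LibXML::Reader created from a string. The loop that advances to the first element node is one possible way to satisfy the precondition and is not part of this module.

```perl
use strict;
use warnings;
use XML::LibXML::Reader qw(XML_READER_TYPE_ELEMENT);
use XML::Struct::Reader;

# Illustrative sample data.
my $stream = XML::LibXML::Reader->new( string => '<doc a="1">text</doc>' );

# Advance the stream until it is positioned at an element node.
while ( $stream->read ) {
    last if $stream->nodeType == XML_READER_TYPE_ELEMENT;
}

my $reader  = XML::Struct::Reader->new( stream => $stream );
my $element = $reader->readElement($stream);

my ( $name, $attributes, $children ) = @$element;
print "$name a=$attributes->{a} content=$children->[0]\n";
```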

readAttributes( [ $stream ] )

Read all XML attributes from a stream and return a (possibly empty) hash reference.

readContent( [ $stream ] )

Read all child elements of an XML element and return the result as a (possibly empty) array reference. Significant whitespace is only included if option whitespace is enabled.

CONFIGURATION

from

A source to read from. Possible values include a string or string reference with XML data, a filename, a URL, a file handle, instances of XML::LibXML::Document or XML::LibXML::Element, and a hash reference with options passed to XML::LibXML::Reader.
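As a small illustration, the following sketch passes the same XML data once as a string and once as a string reference. The sample document is made up; the other source types listed above are not shown.

```perl
use strict;
use warnings;
use XML::Struct::Reader;

# Illustrative sample data.
my $xml = '<doc><title>Hello</title></doc>';

# XML data given as a plain string ...
my $from_string = XML::Struct::Reader->new( from => $xml )->readDocument;

# ... and as a reference to a string.
my $from_ref = XML::Struct::Reader->new( from => \$xml )->readDocument;

print "$from_string->[0] / $from_ref->[0]\n";
```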

stream

An XML::LibXML::Reader to read from. If no stream has been defined, one must pass a stream parameter to the read... methods. Setting a source with option from automatically sets a stream.

attributes

Include attributes (enabled by default). If disabled, the representation of an XML element will be

[ $name => \@children ]

instead of

[ $name => \%attributes, \@children ]

path

Optional path expression to be used as default value when calling read. Paths must either be absolute (starting with "/") or consist of a single element name. The special name "*" matches all element names.

A path is a much-reduced form of an XPath expression (no axes, no "..", no node tests, "//" only at the start, and so on). Namespaces are not supported yet.
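The two kinds of path can be sketched like this. The sample document is illustrative, and the comments describe the expected matching under the assumption that a single element name matches at any depth while an absolute path matches only at the given position.

```perl
use strict;
use warnings;
use XML::Struct::Reader;

# Illustrative sample data: one <item> below the root, one nested deeper.
my $xml = '<list><item>a</item><group><item>b</item></group></list>';

# Absolute path: expected to select only /list/item.
my @direct =
    XML::Struct::Reader->new( from => $xml, path => '/list/item' )->readDocument;

# Single element name: expected to select <item> elements at any depth.
my @all =
    XML::Struct::Reader->new( from => $xml, path => 'item' )->readDocument;

printf "absolute: %d, by name: %d\n", scalar @direct, scalar @all;
```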

whitespace

Include ignorable whitespace as text elements (disabled by default).

ns

Define how XML namespaces should be processed. By default (value 'keep'), this document:

<doc>
  <x:foo xmlns:x="http://example.org/" bar="doz" />
</doc>

is transformed to this structure, keeping namespace prefixes and declarations as unprocessed element names and attributes:

[ 'doc', {}, [
    [
      'x:foo', {
          'bar' => 'doz',
          'xmlns:x' => 'http://example.org/'
      }
    ]
  ]
]

Setting this option to 'strip' will remove all namespace prefixes and namespace declaration attributes, so the result would be:

[ 'doc', {}, [
    [
      'foo', {
          'bar' => 'doz'
      }
    ]
  ]
]

Setting this option to 'disallow' results in an error when namespace prefixes or declarations are read.

Expanding namespace URIs ('expand') is not supported yet.
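The difference between 'keep' and 'strip' can be reproduced with the document shown above. This is a sketch only; the variable names are illustrative.

```perl
use strict;
use warnings;
use XML::Struct::Reader;

my $xml = '<doc><x:foo xmlns:x="http://example.org/" bar="doz" /></doc>';

# Default behaviour: prefixes and declarations are kept as they are.
my $kept = XML::Struct::Reader->new( from => $xml )->readDocument;

# ns => 'strip': prefixes and namespace declarations are removed.
my $stripped =
    XML::Struct::Reader->new( from => $xml, ns => 'strip' )->readDocument;

# The first child of <doc> is the element in question.
my $kept_child     = $kept->[2][0];
my $stripped_child = $stripped->[2][0];

print "keep:  $kept_child->[0] (", join( ', ', sort keys %{ $kept_child->[1] } ), ")\n";
print "strip: $stripped_child->[0] (", join( ', ', sort keys %{ $stripped_child->[1] } ), ")\n";
```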

simple

Convert XML to a simple key-value structure (SimpleXML) with XML::Struct::Simple.
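A short sketch of option simple, assuming a flat sample document whose child elements contain only text. The expectation that each child becomes a hash key holds for this simple case; more complex documents may be represented differently.

```perl
use strict;
use warnings;
use XML::Struct::Reader;

# Illustrative sample data.
my $xml = '<doc><name>x</name><id>1</id></doc>';

# With simple enabled, a hash reference is returned
# instead of the [ $name, \%attributes, \@children ] structure.
my $data = XML::Struct::Reader->new( from => $xml, simple => 1 )->readDocument;

print "$_ = $data->{$_}\n" for sort keys %$data;
```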

depth

Only transform to a given depth, starting at 0 for the root node. Negative values, non-numeric values or undef are ignored (unlimited depth as default).

XML elements below the depth are converted to SimpleXML by default or to MicroXML if option simple is enabled. This can be configured with option deep.

This option is useful, for instance, to access document-oriented XML embedded in data-oriented XML.

deep

How to transform elements below the given depth. This option is experimental.

root

Include root element when converting to SimpleXML. Disabled by default.

content

Name of text content when converting to SimpleXML.