NAME

XML::Reader - Reading XML and providing path information based on a pull-parser.

SYNOPSIS

use XML::Reader;

my $text = q{<init>n <?test pi?> t<page node="400">m <!-- remark --> r</page></init>};

my $rdr = XML::Reader->new(\$text) or die "Error: $!";
while ($rdr->iterate) {
    printf "Path: %-19s, Value: %s\n", $rdr->path, $rdr->value;
}

This program produces the following output:

Path: /init              , Value: n t
Path: /init/page/@node   , Value: 400
Path: /init/page         , Value: m r
Path: /init              , Value:

DESCRIPTION

XML::Reader provides a simple, easy-to-use interface for sequentially parsing XML files (so-called "pull-mode" parsing) while keeping track of the complete XML-path.

It was developed as a wrapper on top of XML::Parser (with some basic functions copied from XML::TokeParser). Both XML::Parser and XML::TokeParser allow pull-mode parsing, but neither keeps track of the complete XML-path. Also, the interfaces of XML::Parser and XML::TokeParser require you to distinguish between start-tags, end-tags and text on separate lines, which, in my view, complicates the interface. (XML::Reader does offer the option {filter => 4, mode => 'pyx'}, which emulates start-tags, end-tags and text on separate lines, if that's what you want.)

There is also XML::TiePYX, which lets you pull-mode parse XML-Files (see http://www.xml.com/pub/a/2000/03/15/feature/index.html for an introduction to PYX). But still, with XML::TiePYX you need to account for start-tags, end-tags and text, and it does not provide the full XML-path.

By contrast, XML::Reader translates start-tags, end-tags and text into XPath-like expressions. So you don't need to worry about tags, you just get a path and a value, and that's it. (However, should you wish to operate XML::Reader in a PYX compatible mode, there is option {filter => 4, mode => 'pyx'}, as mentioned above, which allows you to parse XML in that way).

But going back to the normal mode of operation, here is an example XML in variable '$line1':

my $line1 = 
q{<?xml version="1.0" encoding="iso-8859-1"?>
  <data>
    <item>abc</item>
    <item><!-- c1 -->
      <dummy/>
      fgh
      <inner name="ttt" id="fff">
        ooo <!-- c2 --> ppp
      </inner>
    </item>
  </data>
};

This example can be parsed with XML::Reader using the methods iterate to iterate one-by-one through the XML-data, path and value to extract the current XML-path and its value.

You can also keep track of the start- and end-tags: There is a method is_start, which returns 1 or 0, depending on whether the XML-file had a start tag at the current position. There is also the equivalent method is_end.

There are also the methods tag, attr, type and level. tag gives you the current tag-name, attr returns the attribute-name, type returns either 'T' for text or '@' for attributes and level indicates the current nesting-level (a number >= 0).

Here is a sample program which parses the XML in '$line1' from above to demonstrate the principle...

use XML::Reader;

my $rdr = XML::Reader->new(\$line1) or die "Error: $!";
my $i = 0;
while ($rdr->iterate) { $i++;
    printf "%3d. pat=%-22s, val=%-9s, s=%-1s, e=%-1s, tag=%-6s, atr=%-6s, t=%-1s, lvl=%2d\n", $i,
      $rdr->path, $rdr->value, $rdr->is_start, $rdr->is_end, $rdr->tag, $rdr->attr, $rdr->type, $rdr->level;
}

...and here is the output:

 1. pat=/data                 , val=         , s=1, e=0, tag=data  , atr=      , t=T, lvl= 1
 2. pat=/data/item            , val=abc      , s=1, e=1, tag=item  , atr=      , t=T, lvl= 2
 3. pat=/data                 , val=         , s=0, e=0, tag=data  , atr=      , t=T, lvl= 1
 4. pat=/data/item            , val=         , s=1, e=0, tag=item  , atr=      , t=T, lvl= 2
 5. pat=/data/item/dummy      , val=         , s=1, e=1, tag=dummy , atr=      , t=T, lvl= 3
 6. pat=/data/item            , val=fgh      , s=0, e=0, tag=item  , atr=      , t=T, lvl= 2
 7. pat=/data/item/inner/@id  , val=fff      , s=0, e=0, tag=@id   , atr=id    , t=@, lvl= 4
 8. pat=/data/item/inner/@name, val=ttt      , s=0, e=0, tag=@name , atr=name  , t=@, lvl= 4
 9. pat=/data/item/inner      , val=ooo ppp  , s=1, e=1, tag=inner , atr=      , t=T, lvl= 3
10. pat=/data/item            , val=         , s=0, e=1, tag=item  , atr=      , t=T, lvl= 2
11. pat=/data                 , val=         , s=0, e=1, tag=data  , atr=      , t=T, lvl= 1

INTERFACE

Object creation

To create an XML::Reader object, the following syntax is used:

my $rdr = XML::Reader->new($data,
  {strip => 1, filter => 2, using => ['/path1', '/path2']})
  or die "Error: $!";

The element $data (which is mandatory) is the name of the XML-file, or a reference to a string, in which case the content of that string is taken as the text of the XML.

Alternatively, $data can also be a previously opened filehandle, such as \*STDIN, in which case that filehandle is used to read the XML.

Here is an example to create an XML::Reader object with a file-name:

my $rdr = XML::Reader->new('input.xml') or die "Error: $!";

Here is another example to create an XML::Reader object with a reference:

my $rdr = XML::Reader->new(\'<data>abc</data>') or die "Error: $!";

Here is an example to create an XML::Reader object with an open filehandle:

open(my $fh, '<', 'input.xml') or die "Error: $!";
my $rdr = XML::Reader->new($fh);

Here is an example to create an XML::Reader object with \*STDIN:

my $rdr = XML::Reader->new(\*STDIN);

One or more of the following options can be added as a hash-reference:

option {parse_ct => }

Option {parse_ct => 1} allows for comments to be parsed, default is {parse_ct => 0}

option {parse_pi => }

Option {parse_pi => 1} allows for processing-instructions and XML-Declarations to be parsed, default is {parse_pi => 0}

option {using => }

Option {using => } allows for selecting a sub-tree of the XML.

The syntax is {using => ['/path1/path2/path3', '/path4/path5/path6']}

option {filter => } and {mode => }

Option {filter => 2} or {mode => 'attr-bef-start'} shows all lines, including attributes.

Option {filter => 3} or {mode => 'attr-in-hash'} removes attribute lines (i.e. it removes lines where $rdr->type eq '@'). Instead, it returns the attributes in a hash $rdr->att_hash.

Option {filter => 4} or {mode => 'pyx'} breaks down each line into its individual start-tags, end-tags, attributes, comments and processing-instructions. This allows the parsing of XML into pyx-formatted lines.

Option {filter => 5} or {mode => 'branches'} selects only data for a given root. The elements for each root are collected in an array reference (as specified by the branch) and returned when the root is complete. This processing lies half way between option using (where all elements are returned one by one) and the function slurp_xml (where all elements are collected in a branch, and all branches are collected in an in-memory structure).

The syntax is {filter => 2|3|4|5, mode => 'attr-bef-start'|'attr-in-hash'|'pyx'|'branches'}, default is {filter => 2, mode => 'attr-bef-start'}

option {strip => }

Option {strip => 1} strips all leading and trailing spaces from text and comments. (attributes are never stripped). {strip => 0} leaves text and comments unmodified.

The syntax is {strip => 0|1}, default is {strip => 1}

Methods

A successfully created object of type XML::Reader provides the following methods:

iterate

Reads one single XML-value. It returns 1 after a successful read, or undef when it hits end-of-file.

path

Provides the complete path of the currently selected value. Attributes are represented by a leading '@'-sign.

value

Provides the actual value (i.e. the value of the current text or attribute).

Note that with {filter => 2} or {filter => 3}, the line produced for an XML declaration (i.e. $rdr->is_decl == 1) always has an empty value, so you will usually want to suppress it. A typical code fragment would be:

print $rdr->value, "\n" unless $rdr->is_decl;

The above code does *not* apply for {filter => 4}, in which case a simple "print $rdr->value;" suffices:

print $rdr->value, "\n";

comment

Provides the comment of the XML. You should check if $rdr->is_comment is true before accessing the comment.

type

Provides the type of the value: 'T' for text, '@' for attributes.

If option {filter => 4} is in effect, then the type can be: 'T' for text, '@' for attributes, 'S' for start tags, 'E' for end-tags, '#' for comments, 'D' for the XML Declaration, '?' for processing-instructions.

tag

Provides the current tag-name.

attr

Provides the current attribute name (returns the empty string for non-attribute lines).

level

Indicates the nesting level of the XPath expression (numeric value greater than or equal to zero; level 0 occurs when option {using => ...} has removed the complete path).

prefix

Shows the prefix which has been removed in option {using => ...}. Returns the empty string if option {using => ...} has not been specified.

att_hash

Returns a reference to a hash with the current attributes of a start-tag (or empty hash if it is not a start-tag).

dec_hash

Returns a reference to a hash with the current attributes of an XML-Declaration (or empty hash if it is not an XML-Declaration).

proc_tgt

Returns the target (i.e. the first part) of a processing-instruction (or an empty string if the current event is not a processing-instruction).

proc_data

Returns the data (i.e. the second part) of a processing-instruction (or an empty string if the current event is not a processing-instruction).

pyx

Returns the pyx string of the current XML-event.

The first character of each pyx string tells you what type of event you are dealing with: '(' is a start event, ')' is an end event, 'A' is an attribute, '-' is text and '?' is a processing-instruction. (See http://www.xml.com/pub/a/2000/03/15/feature/index.html for an introduction to PYX.)

The method pyx makes sense only if option {filter => 4} is selected. For any filter other than 4, undef is returned.
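The first-character convention can be sketched in plain Perl. The helper classify_pyx and the table %PYX_EVENT are hypothetical names used for illustration only; they are not part of XML::Reader. The '#' entry reflects the non-standard comment lines that XML::Reader emits with {parse_ct => 1}.

```perl
use strict;
use warnings;

# Map the first character of a PYX line to the kind of event it announces.
my %PYX_EVENT = (
    '(' => 'start',
    ')' => 'end',
    'A' => 'attribute',
    '-' => 'text',
    '?' => 'processing-instruction',
    '#' => 'comment',    # XML::Reader extension, not part of the PYX notation
);

# Hypothetical helper: split a PYX line into (event kind, payload).
sub classify_pyx {
    my ($line) = @_;
    return ('unknown', '') unless defined $line and length $line;

    my $first = substr($line, 0, 1);
    my $kind  = $PYX_EVENT{$first} // 'unknown';
    return ($kind, substr($line, 1));
}

# Classify a few sample lines
for my $line ('(dim', 'Aalter 511', '-car', '?tt dat', ')dim') {
    my ($kind, $payload) = classify_pyx($line);
    printf "%-22s %s\n", $kind, $payload;
}
```

In a real program the input would be $rdr->pyx inside the iterate loop. Because an XML declaration line also starts with '?', checking $rdr->type (or $rdr->is_decl) is the more reliable way to tell a declaration from a processing-instruction.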

is_start

Returns 1 if the XML-file had a start tag at the current position, otherwise 0 is returned.

is_end

Returns 1 if the XML-file had an end tag at the current position, otherwise 0 is returned.

is_decl

Returns 1 if the XML-file had an XML-Declaration at the current position, otherwise 0 is returned.

is_proc

Returns 1 if the XML-file had a processing-instruction at the current position, otherwise 0 is returned.

is_comment

Returns 1 if the XML-file had a comment at the current position, otherwise 0 is returned.

is_text

Returns 1 if the XML-file had text at the current position, otherwise 0 is returned.

is_attr

Returns 1 if the XML-file had an attribute at the current position, otherwise 0 is returned.

is_value

Returns 1 if the XML-file has either a text or an attribute at the current position, otherwise 0 is returned. This is mostly useful in {filter => 4, mode => 'pyx'} to see whether the method value() can be used.

rx

This is the index of the currently selected branch (only useful when {filter => 5, mode => 'branches'} was set).

rval

This is the value of the currently selected branch (only useful when {filter => 5, mode => 'branches'} was set).

rvalue

This is a reference to either a scalar or to an array of the currently selected branch (only useful when {filter => 5, mode => 'branches'} was set). rvalue is a faster, but not so convenient version of rval (with rvalue you will have to do the dereferencing yourself).

rstem

This function is a duplicate of the existing function path.

OPTION USING

Option {using => ...} allows for selecting a sub-tree of the XML.

Here is how it works in detail...

Option {using => ['/path1/path2/path3', '/path4/path5/path6']} eliminates all lines which do not start with '/path1/path2/path3' or with '/path4/path5/path6'. This effectively leaves only lines starting with one of those two paths.

The lines which are not eliminated get a shorter path, because the prefix '/path1/path2/path3' (or '/path4/path5/path6') is removed from the path. The removed prefix, however, shows up in the prefix-method.

The paths '/path1/path2/path3' and '/path4/path5/path6' are supposed to be absolute and complete. Absolute means that they have to start with a '/'-character. Complete means that the last item of the path ('path3' or 'path6') is completed internally by a trailing '/'-character, so it is treated as a full tag name rather than as the beginning of one.
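The prefix handling described above can be sketched in plain Perl. This is only an illustration of the rules as stated, not the module's actual implementation; the helper name split_using is hypothetical, and the sketch assumes that the first matching entry wins.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::Reader: split a full path into
# (prefix, remaining path) for the first matching 'using' entry.
sub split_using {
    my ($path, @using) = @_;
    for my $prefix (@using) {
        # Append '/' to both sides, so that '/data/supplier'
        # does not match '/data/suppliers'
        if (index("$path/", "$prefix/") == 0) {
            my $rest = substr($path, length $prefix);
            return ($prefix, $rest eq '' ? '/' : $rest);
        }
    }
    return;    # no match: the line would be eliminated
}

my @using = ('/data/order/database/customer', '/data/supplier');

for my $path ('/data/order/database/customer/@name', '/data/supplier', '/data/dummy') {
    my ($prefix, $rest) = split_using($path, @using);
    if (defined $prefix) { printf "prf=%-29s, pat=%s\n", $prefix, $rest; }
    else                 { print  "eliminated: $path\n"; }
}
```

The first two paths are kept with the remaining paths '/@name' and '/', and '/data/dummy' is eliminated.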

An example with option 'using'

The following program takes this XML and parses it with XML::Reader, including the option 'using' to target specific elements:

use XML::Reader;

my $line2 = q{
<data>
  <order>
    <database>
      <customer name="aaa" />
      <customer name="bbb" />
      <customer name="ccc" />
      <customer name="ddd" />
    </database>
  </order>
  <dummy value="ttt">test</dummy>
  <supplier>hhh</supplier>
  <supplier>iii</supplier>
  <supplier>jjj</supplier>
</data>
};

my $rdr = XML::Reader->new(\$line2,
  {using => ['/data/order/database/customer', '/data/supplier']});

my $i = 0;
while ($rdr->iterate) { $i++;
    printf "%3d. prf=%-29s, pat=%-7s, val=%-3s, tag=%-6s, t=%-1s, lvl=%2d\n",
      $i, $rdr->prefix, $rdr->path, $rdr->value, $rdr->tag, $rdr->type, $rdr->level;
}

This is the output of that program:

 1. prf=/data/order/database/customer, pat=/@name , val=aaa, tag=@name , t=@, lvl= 1
 2. prf=/data/order/database/customer, pat=/      , val=   , tag=      , t=T, lvl= 0
 3. prf=/data/order/database/customer, pat=/@name , val=bbb, tag=@name , t=@, lvl= 1
 4. prf=/data/order/database/customer, pat=/      , val=   , tag=      , t=T, lvl= 0
 5. prf=/data/order/database/customer, pat=/@name , val=ccc, tag=@name , t=@, lvl= 1
 6. prf=/data/order/database/customer, pat=/      , val=   , tag=      , t=T, lvl= 0
 7. prf=/data/order/database/customer, pat=/@name , val=ddd, tag=@name , t=@, lvl= 1
 8. prf=/data/order/database/customer, pat=/      , val=   , tag=      , t=T, lvl= 0
 9. prf=/data/supplier               , pat=/      , val=hhh, tag=      , t=T, lvl= 0
10. prf=/data/supplier               , pat=/      , val=iii, tag=      , t=T, lvl= 0
11. prf=/data/supplier               , pat=/      , val=jjj, tag=      , t=T, lvl= 0

An example without option 'using'

The following program takes the same XML and parses it with XML::Reader, but without the option 'using'.

use XML::Reader;

my $rdr = XML::Reader->new(\$line2);
my $i = 0;
while ($rdr->iterate) { $i++;
    printf "%3d. prf=%-1s, pat=%-37s, val=%-6s, tag=%-11s, t=%-1s, lvl=%2d\n",
     $i, $rdr->prefix, $rdr->path, $rdr->value, $rdr->tag, $rdr->type, $rdr->level;
}

As you can see in the following output, there are many more lines written, the prefix is empty and the path is much longer than in the previous program:

 1. prf= , pat=/data                                , val=      , tag=data       , t=T, lvl= 1
 2. prf= , pat=/data/order                          , val=      , tag=order      , t=T, lvl= 2
 3. prf= , pat=/data/order/database                 , val=      , tag=database   , t=T, lvl= 3
 4. prf= , pat=/data/order/database/customer/@name  , val=aaa   , tag=@name      , t=@, lvl= 5
 5. prf= , pat=/data/order/database/customer        , val=      , tag=customer   , t=T, lvl= 4
 6. prf= , pat=/data/order/database                 , val=      , tag=database   , t=T, lvl= 3
 7. prf= , pat=/data/order/database/customer/@name  , val=bbb   , tag=@name      , t=@, lvl= 5
 8. prf= , pat=/data/order/database/customer        , val=      , tag=customer   , t=T, lvl= 4
 9. prf= , pat=/data/order/database                 , val=      , tag=database   , t=T, lvl= 3
10. prf= , pat=/data/order/database/customer/@name  , val=ccc   , tag=@name      , t=@, lvl= 5
11. prf= , pat=/data/order/database/customer        , val=      , tag=customer   , t=T, lvl= 4
12. prf= , pat=/data/order/database                 , val=      , tag=database   , t=T, lvl= 3
13. prf= , pat=/data/order/database/customer/@name  , val=ddd   , tag=@name      , t=@, lvl= 5
14. prf= , pat=/data/order/database/customer        , val=      , tag=customer   , t=T, lvl= 4
15. prf= , pat=/data/order/database                 , val=      , tag=database   , t=T, lvl= 3
16. prf= , pat=/data/order                          , val=      , tag=order      , t=T, lvl= 2
17. prf= , pat=/data                                , val=      , tag=data       , t=T, lvl= 1
18. prf= , pat=/data/dummy/@value                   , val=ttt   , tag=@value     , t=@, lvl= 3
19. prf= , pat=/data/dummy                          , val=test  , tag=dummy      , t=T, lvl= 2
20. prf= , pat=/data                                , val=      , tag=data       , t=T, lvl= 1
21. prf= , pat=/data/supplier                       , val=hhh   , tag=supplier   , t=T, lvl= 2
22. prf= , pat=/data                                , val=      , tag=data       , t=T, lvl= 1
23. prf= , pat=/data/supplier                       , val=iii   , tag=supplier   , t=T, lvl= 2
24. prf= , pat=/data                                , val=      , tag=data       , t=T, lvl= 1
25. prf= , pat=/data/supplier                       , val=jjj   , tag=supplier   , t=T, lvl= 2
26. prf= , pat=/data                                , val=      , tag=data       , t=T, lvl= 1

OPTION PARSE_CT

Option {parse_ct => 1} allows for comments to be parsed. Usually, comments are ignored by XML::Reader, that is, {parse_ct => 0} is the default.

Here is an example where comments are ignored by default:

use XML::Reader;

my $text = q{<?xml version="1.0"?><dummy>xyz <!-- remark --> stu <?ab cde?> test</dummy>};

my $rdr = XML::Reader->new(\$text) or die "Error: $!";

while ($rdr->iterate) {
    if ($rdr->is_decl)    { my %h = %{$rdr->dec_hash};
                            print "Found decl     ",  join('', map{" $_='$h{$_}'"} sort keys %h), "\n"; }
    if ($rdr->is_proc)    { print "Found proc      ", "t=", $rdr->proc_tgt, ", d=", $rdr->proc_data, "\n"; }
    if ($rdr->is_comment) { print "Found comment   ", $rdr->comment, "\n"; }
    print "Text '", $rdr->value, "'\n" unless $rdr->is_decl;
}

Here is the output:

Text 'xyz stu test'

Now, the very same XML data, and the same algorithm, except for the option {parse_ct => 1}, which is now activated:

use XML::Reader;

my $text = q{<?xml version="1.0"?><dummy>xyz <!-- remark --> stu <?ab cde?> test</dummy>};

my $rdr = XML::Reader->new(\$text, {parse_ct => 1}) or die "Error: $!";

while ($rdr->iterate) {
    if ($rdr->is_decl)    { my %h = %{$rdr->dec_hash};
                            print "Found decl     ",  join('', map{" $_='$h{$_}'"} sort keys %h), "\n"; }
    if ($rdr->is_proc)    { print "Found proc      ", "t=", $rdr->proc_tgt, ", d=", $rdr->proc_data, "\n"; }
    if ($rdr->is_comment) { print "Found comment   ", $rdr->comment, "\n"; }
    print "Text '", $rdr->value, "'\n" unless $rdr->is_decl;
}

Here is the output:

Text 'xyz'
Found comment   remark
Text 'stu test'

OPTION PARSE_PI

Option {parse_pi => 1} allows for processing-instructions and XML-Declarations to be parsed. Usually, processing-instructions and XML-Declarations are ignored by XML::Reader, that is, {parse_pi => 0} is the default.

As an example, we use the very same XML data, and the same algorithm from the above paragraph, except for the option {parse_pi => 1}, which is now activated (together with option {parse_ct => 1}):

use XML::Reader;

my $text = q{<?xml version="1.0"?><dummy>xyz <!-- remark --> stu <?ab cde?> test</dummy>};

my $rdr = XML::Reader->new(\$text, {parse_ct => 1, parse_pi => 1}) or die "Error: $!";

while ($rdr->iterate) {
    if ($rdr->is_decl)    { my %h = %{$rdr->dec_hash};
                            print "Found decl     ",  join('', map{" $_='$h{$_}'"} sort keys %h), "\n"; }
    if ($rdr->is_proc)    { print "Found proc      ", "t=", $rdr->proc_tgt, ", d=", $rdr->proc_data, "\n"; }
    if ($rdr->is_comment) { print "Found comment   ", $rdr->comment, "\n"; }
    print "Text '", $rdr->value, "'\n" unless $rdr->is_decl;
}

Note the "unless $rdr->is_decl" in the above code. This is to avoid outputting any value after the XML declaration (which would be empty anyway).

Here is the output:

Found decl      version='1.0'
Text 'xyz'
Found comment   remark
Text 'stu'
Found proc      t=ab, d=cde
Text 'test'

OPTION FILTER / MODE

Option {filter => } or {mode => } selects one of several operation modes for processing the XML data.

Option {filter => 2} or {mode => 'attr-bef-start'}

With option {filter => 2} or {mode => 'attr-bef-start'}, XML::Reader produces one line for each character event. If that event is preceded by a start-tag, method is_start returns 1; if it is followed by an end-tag, method is_end returns 1. Likewise, a preceding comment makes is_comment return 1, a preceding XML-declaration makes is_decl return 1, and a preceding processing-instruction makes is_proc return 1.

Also, attribute lines are added via the special '/@...' syntax.

Option {filter => 2, mode => 'attr-bef-start'} is the default.

Here is an example...

use XML::Reader;

my $text = q{<root><test param='&lt;&gt;v"'><a><b>"e"<data id="&lt;&gt;z'">'g'&amp;&lt;&gt;</data>}.
           q{f</b></a></test>x <!-- remark --> yz</root>};

my $rdr = XML::Reader->new(\$text) or die "Error: $!";

# the following four alternatives are equivalent:
# -----------------------------------------------
#   XML::Reader->new(\$text);
#   XML::Reader->new(\$text, {filter => 2                          });
#   XML::Reader->new(\$text, {filter => 2, mode => 'attr-bef-start'});
#   XML::Reader->new(\$text, {             mode => 'attr-bef-start'});

while ($rdr->iterate) {
    printf "Path: %-24s, Value: %s\n", $rdr->path, $rdr->value;
}

This program (with implicit option {filter => 2, mode => 'attr-bef-start'} as default) produces the following output:

Path: /root                   , Value:
Path: /root/test/@param       , Value: <>v"
Path: /root/test              , Value:
Path: /root/test/a            , Value:
Path: /root/test/a/b          , Value: "e"
Path: /root/test/a/b/data/@id , Value: <>z'
Path: /root/test/a/b/data     , Value: 'g'&<>
Path: /root/test/a/b          , Value: f
Path: /root/test/a            , Value:
Path: /root/test              , Value:
Path: /root                   , Value: x yz

The same {filter => 2, mode => 'attr-bef-start'} also makes it possible to rebuild the structure of the XML with the help of the methods is_start and is_end. To make things more interesting, we have the following additional requirement: we want any text (but not tags and not attributes) to be wrapped inside a pair of "**...**" when it is displayed. Please note also that in the above output, the first line ("Path: /root, Value:") has an empty value, but it is important for the structure of the XML. Therefore we can't ignore it.

Let us now look at the same example (with option {filter => 2, mode => 'attr-bef-start'}), but with an additional algorithm to reconstruct the original XML plus the additional requirement to wrap text (but not tags and not attributes) inside "** **":

use XML::Reader;

my $text = q{<root><test param='&lt;&gt;v"'><a><b>"e"<data id="&lt;&gt;z'">'g'&amp;&lt;&gt;</data>}.
           q{f</b></a></test>x <!-- remark --> yz</root>};

my $rdr = XML::Reader->new(\$text) or die "Error: $!";

# the following four alternatives are equivalent:
# -----------------------------------------------
#   XML::Reader->new(\$text);
#   XML::Reader->new(\$text, {filter => 2                          });
#   XML::Reader->new(\$text, {filter => 2, mode => 'attr-bef-start'});
#   XML::Reader->new(\$text, {             mode => 'attr-bef-start'});

my %at;

while ($rdr->iterate) {
    my $indentation = '  ' x ($rdr->level - 1);

    if ($rdr->type eq '@')  {
        $at{$rdr->attr} = $rdr->value;
        for ($at{$rdr->attr}) {
            s{&}'&amp;'xmsg;
            s{'}'&apos;'xmsg;
            s{<}'&lt;'xmsg;
            s{>}'&gt;'xmsg;
        }
    }


    if ($rdr->is_start) {
        print $indentation, '<', $rdr->tag, join('', map{" $_='$at{$_}'"} sort keys %at), '>', "\n";
    }

    unless ($rdr->type eq '@') { %at = (); }

    if ($rdr->type eq 'T' and $rdr->value ne '') {
        my $v = $rdr->value;
        for ($v) {
            s{&}'&amp;'xmsg;
            s{<}'&lt;'xmsg;
            s{>}'&gt;'xmsg;
        }
        print $indentation, "  ** $v **\n";
    }

    if ($rdr->is_end) {
        print $indentation, '</', $rdr->tag, '>', "\n";
    }
}

...and here is the output:

<root>
  <test param='&lt;&gt;v"'>
    <a>
      <b>
        ** "e" **
        <data id='&lt;&gt;z&apos;'>
          ** 'g'&amp;&lt;&gt; **
        </data>
        ** f **
      </b>
    </a>
  </test>
  ** x yz **
</root>

...this is proof that the original structure of the XML is not lost.
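The two escaping loops in the program above repeat the same substitutions. As a possible simplification, they could be factored into one small helper; the name xml_escape and its second parameter are hypothetical and not part of XML::Reader.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Reader): escape a string for XML output.
# Pass a true second argument for attribute values that will be wrapped
# in single quotes, so that apostrophes are escaped as well.
sub xml_escape {
    my ($s, $is_attr) = @_;
    $s =~ s/&/&amp;/g;    # must come first, so later entities are not escaped twice
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/'/&apos;/g if $is_attr;
    return $s;
}

print xml_escape(q{<>z'}, 1), "\n";    # attribute value taken from the example
print xml_escape(q{'g'&<>}),  "\n";    # text value taken from the example
```

The two lines printed are &lt;&gt;z&apos; and 'g'&amp;&lt;&gt;, which correspond to the id attribute and the text of the data element in the output above.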

Option {filter => 3} or {mode => 'attr-in-hash'}

Option {filter => 3, mode => 'attr-in-hash'} works very much like {filter => 2, mode => 'attr-bef-start'}.

The difference, though, is that with option {filter => 3, mode => 'attr-in-hash'} all attribute-lines are suppressed and instead, the attributes are presented for each start-line in the hash $rdr->att_hash().

This makes it possible to dispense with the global %at variable of the previous algorithm and to use %{$rdr->att_hash} instead.

Here is the new algorithm for {filter => 3, mode => 'attr-in-hash'}. We don't need to worry about attributes (that is, we don't need to check for $rdr->type eq '@') and, as already mentioned, the %at variable is replaced by %{$rdr->att_hash}:

use XML::Reader;

my $text = q{<root><test param='&lt;&gt;v"'><a><b>"e"<data id="&lt;&gt;z'">'g'&amp;&lt;&gt;</data>}.
           q{f</b></a></test>x <!-- remark --> yz</root>};

my $rdr = XML::Reader->new(\$text, {filter => 3}) or die "Error: $!";

# the following three alternatives are equivalent:
# ------------------------------------------------
#   XML::Reader->new(\$text, {filter => 3                        });
#   XML::Reader->new(\$text, {filter => 3, mode => 'attr-in-hash'});
#   XML::Reader->new(\$text, {             mode => 'attr-in-hash'});

while ($rdr->iterate) {
    my $indentation = '  ' x ($rdr->level - 1);

    if ($rdr->is_start) {
        my %h = %{$rdr->att_hash};
        for (values %h) {
            s{&}'&amp;'xmsg;
            s{'}'&apos;'xmsg;
            s{<}'&lt;'xmsg;
            s{>}'&gt;'xmsg;
        }
        print $indentation, '<', $rdr->tag,
          join('', map{" $_='$h{$_}'"} sort keys %h),
          '>', "\n";
    }

    if ($rdr->type eq 'T' and $rdr->value ne '') {
        my $v = $rdr->value;
        for ($v) {
            s{&}'&amp;'xmsg;
            s{<}'&lt;'xmsg;
            s{>}'&gt;'xmsg;
        }
        print $indentation, "  ** $v **\n";
    }

    if ($rdr->is_end) {
        print $indentation, '</', $rdr->tag, '>', "\n";
    }
}

...the output for {filter => 3, mode => 'attr-in-hash'} is identical to the output for {filter => 2, mode => 'attr-bef-start'}:

<root>
  <test param='&lt;&gt;v"'>
    <a>
      <b>
        ** "e" **
        <data id='&lt;&gt;z&apos;'>
          ** 'g'&amp;&lt;&gt; **
        </data>
        ** f **
      </b>
    </a>
  </test>
  ** x yz **
</root>

Finally, we can (and we should) delegate the writing of XML to another module. I would suggest that we use XML::Writer for that. Here is the program that uses XML::Writer to output XML:

use XML::Reader;
use XML::Writer;

my $text = q{<root><test param='&lt;&gt;v"'><a><b>"e"<data id="&lt;&gt;z'">'g'&amp;&lt;&gt;</data>}.
           q{f</b></a></test>x <!-- remark --> yz</root>};

my $rdr = XML::Reader->new(\$text, {filter => 3}) or die "Error: $!";
my $wrt = XML::Writer->new(OUTPUT => \*STDOUT, NEWLINES => 1);

while ($rdr->iterate) {
    if ($rdr->is_start)                          { $wrt->startTag($rdr->tag, %{$rdr->att_hash}); }
    if ($rdr->type eq 'T' and $rdr->value ne '') { $wrt->characters('** '.$rdr->value.' **'); }
    if ($rdr->is_end)                            { $wrt->endTag($rdr->tag); }
}

$wrt->end();

Here is the output from XML::Writer:

<root
><test param="&lt;&gt;v&quot;"
><a
><b
>** "e" **<data id="&lt;&gt;z'"
>** 'g'&amp;&lt;&gt; **</data
>** f **</b
></a
></test
>** x yz **</root
>

The format written by XML::Writer needs some getting used to, but it is valid XML.

Option {filter => 4} or {mode => 'pyx'}

Although this is not the main purpose of XML::Reader, option {filter => 4, mode => 'pyx'} can generate individual lines for start-tags, end-tags, comments, processing-instructions and XML-Declarations. Its aim is to generate a pyx string for further processing and analysis.

Here is an example:

use XML::Reader;

my $text = q{<?xml version="1.0" encoding="iso-8859-1"?>
  <delta>
    <dim alter="511">
      <gamma />
      <beta>
        car <?tt dat?>
      </beta>
    </dim>
    dskjfh <!-- remark --> uuu
  </delta>};

my $rdr = XML::Reader->new(\$text, {filter => 4, parse_pi => 1}) or die "Error: $!";

# the following three alternatives are equivalent:
# ------------------------------------------------
#   XML::Reader->new(\$text, {filter => 4               , parse_pi => 1});
#   XML::Reader->new(\$text, {filter => 4, mode => 'pyx', parse_pi => 1});
#   XML::Reader->new(\$text, {             mode => 'pyx', parse_pi => 1});

while ($rdr->iterate) {
    printf "Type = %1s, pyx = %s\n", $rdr->type, $rdr->pyx;
}

And here is the output:

Type = D, pyx = ?xml version='1.0' encoding='iso-8859-1'
Type = S, pyx = (delta
Type = S, pyx = (dim
Type = @, pyx = Aalter 511
Type = S, pyx = (gamma
Type = E, pyx = )gamma
Type = S, pyx = (beta
Type = T, pyx = -car
Type = ?, pyx = ?tt dat
Type = E, pyx = )beta
Type = E, pyx = )dim
Type = T, pyx = -dskjfh uuu
Type = E, pyx = )delta

Be aware that comments can be produced by pyx in a non-standard way if requested by {parse_ct => 1}. In fact, comments are produced with a leading hash symbol which is not part of the pyx specification, as can be seen by the following example:

use XML::Reader;

my $text = q{
  <delta>
    <!-- remark -->
  </delta>};

my $rdr = XML::Reader->new(\$text, {filter => 4, parse_ct => 1}) or die "Error: $!";

# the following three alternatives are equivalent:
# ------------------------------------------------
#   XML::Reader->new(\$text, {filter => 4,                parse_ct => 1});
#   XML::Reader->new(\$text, {filter => 4, mode => 'pyx', parse_ct => 1});
#   XML::Reader->new(\$text, {             mode => 'pyx', parse_ct => 1});

while ($rdr->iterate) {
    printf "Type = %1s, pyx = %s\n", $rdr->type, $rdr->pyx;
}

Here is the output:

Type = S, pyx = (delta
Type = #, pyx = #remark
Type = E, pyx = )delta

Finally, when operating with {filter => 4, mode => 'pyx'}, the usual methods (value, attr, path, is_start, is_end, is_decl, is_proc, is_comment, is_attr, is_text, is_value, comment, proc_tgt, proc_data, dec_hash or att_hash) remain operational. Here is an example:

use XML::Reader;

my $text = q{<?xml version="1.0"?>
  <parent abc="def"> <?pt hmf?>
    dskjfh <!-- remark -->
    <child>ghi</child>
  </parent>};

my $rdr = XML::Reader->new(\$text, {filter => 4, parse_ct => 1, parse_pi => 1}) or die "Error: $!";

# the following three alternatives are equivalent:
# ------------------------------------------------
#   XML::Reader->new(\$text, {filter => 4,                parse_ct => 1, parse_pi => 1});
#   XML::Reader->new(\$text, {filter => 4, mode => 'pyx', parse_ct => 1, parse_pi => 1});
#   XML::Reader->new(\$text, {             mode => 'pyx', parse_ct => 1, parse_pi => 1});

while ($rdr->iterate) {
    printf "Path %-15s v=%s ", $rdr->path, $rdr->is_value;

    if    ($rdr->is_start)   { print "Found start tag ", $rdr->tag, "\n"; }
    elsif ($rdr->is_end)     { print "Found end tag   ", $rdr->tag, "\n"; }
    elsif ($rdr->is_decl)    { my %h = %{$rdr->dec_hash};
                               print "Found decl     ",  join('', map{" $_='$h{$_}'"} sort keys %h), "\n"; }
    elsif ($rdr->is_proc)    { print "Found proc      ", "t=",    $rdr->proc_tgt, ", d=", $rdr->proc_data, "\n"; }
    elsif ($rdr->is_comment) { print "Found comment   ", $rdr->comment, "\n"; }
    elsif ($rdr->is_attr)    { print "Found attribute ", $rdr->attr, "='", $rdr->value, "'\n"; }
    elsif ($rdr->is_text)    { print "Found text      ", $rdr->value, "\n"; }
}

Here is the output:

Path /               v=0 Found decl      version='1.0'
Path /parent         v=0 Found start tag parent
Path /parent/@abc    v=1 Found attribute abc='def'
Path /parent         v=0 Found proc      t=pt, d=hmf
Path /parent         v=1 Found text      dskjfh
Path /parent         v=0 Found comment   remark
Path /parent/child   v=0 Found start tag child
Path /parent/child   v=1 Found text      ghi
Path /parent/child   v=0 Found end tag   child
Path /parent         v=0 Found end tag   parent

Note that v=1 (i.e. $rdr->is_value == 1) for all text and all attributes.
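
This makes $rdr->is_value a convenient filter when only the data is of interest and the markup events are not. Here is a short sketch under that assumption, using the same XML and options as above (the array @values is an illustrative name):

```perl
use strict;
use warnings;
use XML::Reader;

my $text = q{<?xml version="1.0"?>
  <parent abc="def"> <?pt hmf?>
    dskjfh <!-- remark -->
    <child>ghi</child>
  </parent>};

my $rdr = XML::Reader->new(\$text, {mode => 'pyx', parse_ct => 1, parse_pi => 1})
  or die "Error: $!";

my @values;   # illustrative: one [path, value] pair per text or attribute

while ($rdr->iterate) {
    next unless $rdr->is_value;          # skip tags, declaration, PI and comment
    push @values, [$rdr->path, $rdr->value];
}

printf "%-15s = '%s'\n", @$_ for @values;
```

Judging from the output listed above, three pairs should remain: the attribute in /parent/@abc, the text in /parent and the text in /parent/child.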

Option {filter => 5} or {mode => 'branches'}

With option {filter => 5, mode => 'branches'}, you specify one or more roots, each with a set of branches attached. What you then get back is one record for each occurrence of a root in the XML tree. A root can start with a single slash (such as {root => '/tag1/tag2'}), in which case the path is absolute, or it can start with a double slash (such as {root => '//tag1/tag2'}), in which case the path is relative. A root with no leading slash at all (such as {root => 'tag1/tag2'}) is also relative.
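
The difference between absolute and relative roots can be pictured with a few lines of plain Perl. This is only a sketch of the rule as described here, not the code that XML::Reader uses internally; the helper name root_matches is hypothetical, and the sketch ignores how nested matches are handled:

```perl
use strict;
use warnings;

# Hypothetical helper: would $root select the element found at $path?
sub root_matches {
    my ($root, $path) = @_;

    # Exactly one leading slash: absolute root, the whole path must be equal.
    if ($root =~ m{\A /(?!/)}xms) {
        return $path eq $root ? 1 : 0;
    }

    # '//tag1/tag2' and 'tag1/tag2' are both relative: strip the leading
    # slashes and compare against the trailing components of the path.
    (my $tail = $root) =~ s{\A /+}{}xms;
    return $path =~ m{/ \Q$tail\E \z}xms ? 1 : 0;
}

for my $path ('/data/customer', '/data/zzz/customer', '/data/customer1') {
    printf "%-20s %s\n", $path, root_matches('customer', $path) ? 'match' : 'no match';
}
```

Under this reading, a relative root 'customer' selects a customer element at any depth, but not an element that merely contains the word, such as customer1.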

Each record then contains the elements that have been specified in the branches. (As a special case, the branch can be a single '*' character, in which case the complete XML is produced for the root).

To obtain the elements that have been specified in the branches, you can use either function $rdr->rvalue or $rdr->rval.

The easiest way to explain its effect is to show an example.

use XML::Reader;

my $line2 = q{
<data>
  <supplier>ggg</supplier>
  <customer name="o'rob" id="444">
    <street>pod alley</street>
    <city>no city</city>
  </customer>
  <customer1 name="troy" id="333">
    <street>one way</street>
    <city>any city</city>
  </customer1>
  <tcustomer name="nbc" id="777">
    <street>away</street>
    <city>acity</city>
  </tcustomer>
  <supplier>hhh</supplier>
  <zzz>
    <customer name='"sue"' id="111">
      <street>baker street</street>
      <city>sidney</city>
    </customer>
  </zzz>
  <order>
    <database>
      <customer name="&lt;smith&gt;" id="652">
        <street>high street</street>
        <city>boston</city>
      </customer>
      <customer name="&amp;jones" id="184">
        <street>maple street</street>
        <city>new york</city>
      </customer>
      <customer name="stewart" id="520">
        <street>  ring   road   </street>
        <city>  "'&amp;&lt;&#65;&gt;'"  </city>
      </customer>
    </database>
  </order>
  <dummy value="ttt">test</dummy>
  <supplier>iii</supplier>
  <supplier>jjj</supplier>
  <p>
    <p>b1</p>
    <p>b2</p>
  </p>
  <p>
    b3
  </p>
</data>
};

Let's say we want to read the name, the street and the city of all customers in any relative path ('customer') and we also want to read the supplier in the absolute path '/data/supplier'. Then, we want to have data for the relative customer ('//customer') supplied as pure XML ({branch => '*'}). Finally we want to have data for the relative path 'p' as pure XML.

Data for our first root ('customer') is identified by $rdr->rx == 0, data for our second root ('/data/supplier') is identified by $rdr->rx == 1, data for our third root ('//customer') is identified by $rdr->rx == 2, and data for our fourth root ('p') is identified by $rdr->rx == 3.

In the following program we will use function $rdr->rvalue to obtain the data:

my $rdr = XML::Reader->new(\$line2, {filter => 5},
  { root => 'customer',       branch => ['/@name', '/street', '/city'] },
  { root => '/data/supplier', branch => ['/']                          },
  { root => '//customer',     branch => '*' },
  { root => 'p',              branch => '*' },
);

# the following three alternatives are equivalent:
# ------------------------------------------------
#   XML::Reader->new(\$line2, {filter => 5,                   });
#   XML::Reader->new(\$line2, {filter => 5, mode => 'branches'});
#   XML::Reader->new(\$line2, {             mode => 'branches'});

my $root0 = '';
my $root1 = '';
my $root2 = '';
my $root3 = '';

my $path0 = '';

while ($rdr->iterate) {
    if ($rdr->rx == 0) {
        $path0 .= "  ".$rdr->path."\n";
        for ($rdr->rvalue) {
            $root0 .= sprintf "  Cust: Name = %-7s Street = %-12s City = %s\n", $_->[0], $_->[1], $_->[2];
        }
    }
    elsif ($rdr->rx == 1) {
        for ($rdr->rvalue) {
            $root1 .= "  Supp: Name = ".$_->[0]."\n";
        }
    }
    elsif ($rdr->rx == 2) {
        for ($rdr->rvalue) {
            $root2 .= "  Xml: ".$$_."\n";
        }
    }
    elsif ($rdr->rx == 3) {
        for ($rdr->rvalue) {
            $root3 .= "  P: ".$$_."\n";
        }
    }
}

print "root0:\n$root0\n";
print "path0:\n$path0\n";
print "root1:\n$root1\n";
print "root2:\n$root2\n";
print "root3:\n$root3\n";

This is the output:

root0:
  Cust: Name = o'rob   Street = pod alley    City = no city
  Cust: Name = "sue"   Street = baker street City = sidney
  Cust: Name = <smith> Street = high street  City = boston
  Cust: Name = &jones  Street = maple street City = new york
  Cust: Name = stewart Street = ring road    City = "'&<A>'"

path0:
  /data/customer
  /data/zzz/customer
  /data/order/database/customer
  /data/order/database/customer
  /data/order/database/customer

root1:
  Supp: Name = ggg
  Supp: Name = hhh
  Supp: Name = iii
  Supp: Name = jjj

root2:
  Xml: <customer id='444' name='o&apos;rob'><street>pod alley</street><city>no city</city></customer>
  Xml: <customer id='111' name='"sue"'><street>baker street</street><city>sidney</city></customer>
  Xml: <customer id='652' name='&lt;smith&gt;'><street>high street</street><city>boston</city></customer>
  Xml: <customer id='184' name='&amp;jones'><street>maple street</street><city>new york</city></customer>
  Xml: <customer id='520' name='stewart'><street>ring road</street><city>"'&amp;&lt;A&gt;'"</city></customer>

root3:
  P: <p><p>b1</p><p>b2</p></p>
  P: <p>b3</p>

We can also use function $rdr->rval to obtain the same data:

my $rdr = XML::Reader->new(\$line2, {filter => 5},
  { root => 'customer',       branch => ['/@name', '/street', '/city'] },
  { root => 'p',              branch => '*' },
);

# the following three alternatives are equivalent:
# ------------------------------------------------
#   XML::Reader->new(\$line2, {filter => 5,                   });
#   XML::Reader->new(\$line2, {filter => 5, mode => 'branches'});
#   XML::Reader->new(\$line2, {             mode => 'branches'});

my $out0 = '';
my $out1 = '';

while ($rdr->iterate) {
    if ($rdr->rx == 0) {
        my @rv = $rdr->rval;
        $out0 .= sprintf "  Cust: Name = %-7s Street = %-12s City = %s\n", $rv[0], $rv[1], $rv[2];
    }
    elsif ($rdr->rx == 1) {
        $out1 .= "  P: ".$rdr->rval."\n";
    }
}

print "output0:\n$out0\n";
print "output1:\n$out1\n";

This is the output:

output0:
  Cust: Name = o'rob   Street = pod alley    City = no city
  Cust: Name = "sue"   Street = baker street City = sidney
  Cust: Name = <smith> Street = high street  City = boston
  Cust: Name = &jones  Street = maple street City = new york
  Cust: Name = stewart Street = ring road    City = "'&<A>'"

output1:
  P: <p><p>b1</p><p>b2</p></p>
  P: <p>b3</p>

It is important to note that the case "root3" / "output1" ({ root => 'p', branch => '*' }) shows that relative roots always match the biggest possible XML subtree. In other words, the output of "P: <p>b1</p>" and "P: <p>b2</p>" on lines of their own is not possible, because they are already part of the bigger line "P: <p><p>b1</p><p>b2</p></p>".
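
When there are several roots, the chain of if / elsif tests on $rdr->rx can also be written as a table of handlers indexed by $rdr->rx. The following is a sketch under the assumption that $rdr->rval in list context returns the branch values in branch order, as in the example above; the names @handler and @report are illustrative only:

```perl
use strict;
use warnings;
use XML::Reader;

my $xml = q{
<data>
  <supplier>ggg</supplier>
  <customer name="smith" id="652">
    <city>boston</city>
  </customer>
  <supplier>hhh</supplier>
</data>};

my $rdr = XML::Reader->new(\$xml, {mode => 'branches'},
  { root => 'customer',       branch => ['/@name', '/city'] },
  { root => '/data/supplier', branch => ['/']               },
) or die "Error: $!";

my @report;

# One handler per root, in the same order as the root definitions above.
my @handler = (
    sub { push @report, sprintf 'Cust: %s (%s)', @_[0, 1] },
    sub { push @report, sprintf 'Supp: %s',      $_[0]    },
);

while ($rdr->iterate) {
    my @rv = $rdr->rval;            # branch values of the current record
    $handler[$rdr->rx]->(@rv);      # dispatch on the index of the root
}

print "$_\n" for @report;
```

The report should contain one 'Cust:' line and two 'Supp:' lines. The sketch makes no assumption about the order in which records for different roots arrive.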

EXAMPLES

Let's look at the following piece of XML from which we want to extract the values in <item> (by that I mean only the first 'start...'-value, not the 'end...'-value), plus the attributes "p1" and "p3". The item-tag must be exactly in the /start/param/data range (and *not* in the /start/param/dataz range).

my $text = q{
  <start>
    <param>
      <data>
        <item p1="a" p2="b" p3="c">start1 <inner p1="p">i1</inner> end1</item>
        <item p1="d" p2="e" p3="f">start2 <inner p1="q">i2</inner> end2</item>
        <item p1="g" p2="h" p3="i">start3 <inner p1="r">i3</inner> end3</item>
      </data>
      <dataz>
        <item p1="j" p2="k" p3="l">start9 <inner p1="s">i9</inner> end9</item>
      </dataz>
      <data>
        <item p1="m" p2="n" p3="o">start4 <inner p1="t">i4</inner> end4</item>
      </data>
    </param>
  </start>};

We expect exactly 4 output-lines from our parse (i.e. we don't expect the 'dataz' part - 'start9' - in the output):

item = 'start1', p1 = 'a', p3 = 'c'
item = 'start2', p1 = 'd', p3 = 'f'
item = 'start3', p1 = 'g', p3 = 'i'
item = 'start4', p1 = 'm', p3 = 'o'

Parsing XML with {filter => 2} or {mode => 'attr-bef-start'}

Here is a sample program to parse that XML with {filter => 2, mode => 'attr-bef-start'}. (Note how the prefix '/start/param/data/item' is located in the {using =>} option of new). We need two scalars ('$p1' and '$p3') to hold the parameters in '/@p1' and in '/@p3' and carry them over to the $rdr->is_start section, where they can be printed.

my $rdr = XML::Reader->new(\$text,
  {mode => 'attr-bef-start', using => '/start/param/data/item'}) or die "Error: $!";

my ($p1, $p3);

while ($rdr->iterate) {
    if    ($rdr->path eq '/@p1') { $p1 = $rdr->value; }
    elsif ($rdr->path eq '/@p3') { $p3 = $rdr->value; }
    elsif ($rdr->path eq '/' and $rdr->is_start) {
        printf "item = '%s', p1 = '%s', p3 = '%s'\n",
          $rdr->value, $p1, $p3;
    }
    unless ($rdr->is_attr) { $p1 = undef; $p3 = undef; }
}

Parsing XML with {filter => 3} or {mode => 'attr-in-hash'}

With {filter => 3, mode => 'attr-in-hash'} we can dispense with the two scalars '$p1' and '$p3', so the code becomes very simple:

my $rdr = XML::Reader->new(\$text,
  {mode => 'attr-in-hash', using => '/start/param/data/item'}) or die "Error: $!";

while ($rdr->iterate) {
    if ($rdr->path eq '/' and $rdr->is_start) {
        printf "item = '%s', p1 = '%s', p3 = '%s'\n",
          $rdr->value, $rdr->att_hash->{p1}, $rdr->att_hash->{p3};
    }
}

Parsing XML with {filter => 4} or {mode => 'pyx'}

With {filter => 4, mode => 'pyx'}, however, the code becomes slightly more complicated again: As already shown for {filter => 2, mode => 'attr-bef-start'}, we need again two scalars ('$p1' and '$p3') to hold the parameters in '/@p1' and in '/@p3' and carry them over. In addition to that, we also need a way to count text-values (see scalar '$count'), so that we can distinguish between the first value 'start...' (that we want to print) and the second value 'end...' (that we do not want to print).

my $rdr = XML::Reader->new(\$text,
  {mode => 'pyx', using => '/start/param/data/item'}) or die "Error: $!";

my ($count, $p1, $p3);

while ($rdr->iterate) {
    if    ($rdr->path eq '/@p1') { $p1 = $rdr->value; }
    elsif ($rdr->path eq '/@p3') { $p3 = $rdr->value; }
    elsif ($rdr->path eq '/') {
        if    ($rdr->is_start) { $count = 0; $p1 = undef; $p3 = undef; }
        elsif ($rdr->is_text) {
            $count++;
            if ($count == 1) {
                printf "item = '%s', p1 = '%s', p3 = '%s'\n",
                  $rdr->value, $p1, $p3;
            }
        }
    }
}

Parsing XML with {filter => 5} or {mode => 'branches'}

You could combine {mode => 'branches'} and regular expressions to parse the XML:

my $rdr = XML::Reader->new(\$text, {mode => 'branches'},
  { root => '/start/param/data/item', branch => '*' },
) or die "Error: $!";

while ($rdr->iterate) {
    if ($rdr->rval =~ m{\A <item
        (?:\s+ p1='([^']*)')?
        (?:\s+ p2='([^']*)')?
        (?:\s+ p3='([^']*)')?
        \s* > ([^<]*) <}xms) {
        printf "item = '%s', p1 = '%s', p3 = '%s'\n", $4, $1, $3;
    }
}
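
The regular expression above relies on the attributes appearing in the order p1, p2, p3. A variant that does not depend on attribute order first captures the start tag and then collects the attributes into a hash. The following plain-Perl sketch shows the idea; the helper name parse_item is hypothetical, and the sample string is only an assumption about what $rdr->rval might return for the first item, modelled on the "Xml:" lines shown earlier:

```perl
use strict;
use warnings;

# Hypothetical helper: pull the leading text and the attributes out of one
# <item> fragment. Entities such as &amp; are left as they are.
sub parse_item {
    my ($xml) = @_;

    # Capture the attribute part of the start tag and the text up to the next tag.
    my ($attrs, $first_text) = $xml =~ m{\A <item \b ([^>]*) > ([^<]*) <}xms
      or return;

    # Collect name='value' pairs, whatever their order.
    my %att = $attrs =~ m{([\w:.-]+) \s* = \s* '([^']*)'}xmsg;

    $first_text =~ s{\A \s+ | \s+ \z}{}xmsg;   # trim surrounding whitespace
    return { text => $first_text, att => \%att };
}

# Assumed sample, not actual module output.
my $sample = q{<item p1='a' p2='b' p3='c'>start1 <inner p1='p'>i1</inner> end1</item>};

my $item = parse_item($sample) or die "no match";
printf "item = '%s', p1 = '%s', p3 = '%s'\n",
  $item->{text}, $item->{att}{p1}, $item->{att}{p3};
```

For the assumed sample this prints item = 'start1', p1 = 'a', p3 = 'c'.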

FUNCTIONS

Function slurp_xml

The function slurp_xml reads an XML file and slurps it into an array-ref. Here is an example where we want to slurp the name, the street and the city of all customers in the path '/data/order/database/customer' and we also want to slurp the supplier in '/data/supplier':

use XML::Reader qw(slurp_xml);

my $line2 = q{
<data>
  <supplier>ggg</supplier>
  <supplier>hhh</supplier>
  <order>
    <database>
      <customer name="smith" id="652">
        <street>high street</street>
        <city>boston</city>
      </customer>
      <customer name="jones" id="184">
        <street>maple street</street>
        <city>new york</city>
      </customer>
      <customer name="stewart" id="520">
        <street>ring road</street>
        <city>dallas</city>
      </customer>
    </database>
  </order>
  <dummy value="ttt">test</dummy>
  <supplier>iii</supplier>
  <supplier>jjj</supplier>
</data>
};

my $aref = slurp_xml(\$line2,
  { root => '/data/order/database/customer', branch => ['/@name', '/street', '/city'] },
  { root => '/data/supplier',                branch => '*'                            },
);

for (@{$aref->[0]}) {
    printf "Cust: Name = %-7s Street = %-12s City = %s\n", $_->[0], $_->[1], $_->[2];
}

print "\n";

for (@{$aref->[1]}) {
    printf "S: %s\n", $$_;
}

The first parameter to slurp_xml is the filename (or scalar reference, or open filehandle) of the XML that will be slurped. In this case we read from a scalar ref \$line2. The next parameter is a hash-ref with the root of the sub-tree that we want to slurp (in our case that's '/data/order/database/customer') and the branches, a list of the elements that we want to slurp, relative to the sub-tree. In this case it is ['/@name', '/street', '/city']. The next parameter is our second root/branch definition, in this case it is root => '/data/supplier' with branch => '*'.

Here is the output:

Cust: Name = smith   Street = high street  City = boston
Cust: Name = jones   Street = maple street City = new york
Cust: Name = stewart Street = ring road    City = dallas

S: <supplier>ggg</supplier>
S: <supplier>hhh</supplier>
S: <supplier>iii</supplier>
S: <supplier>jjj</supplier>

slurp_xml works similarly to XML::Simple, in that it reads all required information in one go into an in-memory data structure. The difference, however, is that slurp_xml lets you be specific about what you actually want before you do the slurping, so that in most cases your in-memory data structure is smaller and less complicated.
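
Because the result is an ordinary array-ref, it can be reshaped with the usual Perl tools. Here is a small sketch that turns the slurped rows into a lookup hash; the XML is a shortened version of the example above, and the hash name %city_of is illustrative only:

```perl
use strict;
use warnings;
use XML::Reader qw(slurp_xml);

my $xml = q{
<data>
  <order>
    <database>
      <customer name="smith" id="652">
        <street>high street</street>
        <city>boston</city>
      </customer>
      <customer name="jones" id="184">
        <street>maple street</street>
        <city>new york</city>
      </customer>
    </database>
  </order>
</data>};

my $aref = slurp_xml(\$xml,
  { root => '/data/order/database/customer', branch => ['/@name', '/city'] },
);

# $aref->[0] holds the rows for the first (and only) root;
# each row is an array-ref in branch order: [name, city].
my %city_of = map { $_->[0] => $_->[1] } @{$aref->[0]};

printf "%-6s lives in %s\n", $_, $city_of{$_} for sort keys %city_of;
```

Only the name and the city of each customer are held in memory; the street and the id are never stored, since they were not requested as branches.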

AUTHOR

Klaus Eichner, March 2009

COPYRIGHT AND LICENSE

All rights reserved. This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0, see http://www.opensource.org/licenses/artistic-license-2.0.php

RELATED MODULES

If you also want to write XML, have a look at XML::Writer. This module provides a simple interface for writing XML. (If you are writing non-mixed content XML, consider setting DATA_MODE => 1 and DATA_INDENT => 2, which allows for proper indentation in your XML output file.)
