NAME
XML::DT::Sequence - Down Translator (XML::DT) for sequence XMLs
SYNOPSIS
Many XML files nowadays are just catalogues: simple sequences of small chunks that repeat over and over. These files can get enormous, which makes DOM processing hard. SAX processing is interesting, but not always the best approach.
This module splits the XML file into a header, a sequence of the repeating blocks, and a footer. Each one of these chunks can then be processed as a DOM, using XML::DT technology.
    use XML::DT::Sequence;
    my $dt = XML::DT::Sequence->new();
    $dt->process("file.xml",
        -tag  => 'item',
        -head => sub {
            my ($self, $xml) = @_;
            # do something with $xml
        },
        -body => {
            item => sub {
                # XML::DT like handler
            }
        },
        -foot => sub {
            my ($self, $xml) = @_;
            # do something with $xml
        },
    );
EXPLANATION
There are four options, of which only two are mandatory: -tag and -body. -tag is the element name that repeats in the XML file, and that you want to process one at a time. -body is the handler to process each one of these elements.
-head is the handler to process the XML that appears before the first instance of the repeating element, and -foot is the handler to process the XML that appears after the last instance of the repeating element.
Each one of these handlers can be a code reference that receives the XML::DT::Sequence object and the XML string, or a hash reference with XML::DT handlers to process each XML snippet.
Note that when processing the header or the footer, the XML is incomplete (for instance, the header opens the root element but does not close it), and the parser can recover in weird ways.
The process method returns a hash reference with three keys: -head is the return value of the -head handler, -foot is the return value of the -foot handler, and -body is the number of elements of the sequence that were processed.
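As a minimal sketch of how the returned hash reference might be used, the script below writes a small sample file and processes it with code-reference handlers. The file contents, the element names (catalogue, title, note) and the choice of returning string lengths from the handlers are assumptions made for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::DT::Sequence;

# Hypothetical catalogue: a header, three repeating <item> elements, a footer.
my ($fh, $file) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} <<'XML';
<catalogue>
<title>Sample</title>
<item>one</item>
<item>two</item>
<item>three</item>
<note>end</note>
</catalogue>
XML
close $fh;

my $dt = XML::DT::Sequence->new();
my $result = $dt->process($file,
    -tag  => 'item',
    # Code-reference handlers receive the object and the XML string.
    -head => sub { my ($self, $xml) = @_; return length $xml },
    -body => sub { my ($self, $xml) = @_; return },
    -foot => sub { my ($self, $xml) = @_; return length $xml },
);

# -head and -foot hold whatever the respective handlers returned;
# -body holds the number of elements that were processed.
printf "header length:   %d\n", $result->{-head};
printf "items processed: %d\n", $result->{-body};
printf "footer length:   %d\n", $result->{-foot};
```

Returning a value from -head or -foot is simply a convenient way to carry information (a title, a total, a checksum) out of the handler without using global variables.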
METHODS
new
Constructor.
process
Processor. See the EXPLANATION section above.
break
Forces the process to finish. Useful when you have already processed enough elements. Note that if you break the process, the -foot code will not be run.
If you are using a code reference as a handler, call break on the first argument (the reference to the object). If you are using an XML::DT handler, $u holds the object, so just call break on it.
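The following sketch shows one way break might be called from a code-reference handler to stop after a few elements. The sample file, the counter variable $seen and the limit of three are hypothetical and chosen only for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::DT::Sequence;

# Hypothetical input: ten <item> elements inside a <list> root.
my ($fh, $file) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} "<list>\n", (map { "<item>$_</item>\n" } 1 .. 10), "</list>\n";
close $fh;

my $seen = 0;
my $dt   = XML::DT::Sequence->new();
$dt->process($file,
    -tag  => 'item',
    -body => sub {
        my ($self, $xml) = @_;
        $seen++;
        # The first argument is the XML::DT::Sequence object,
        # so break is called on it once enough elements were seen.
        $self->break if $seen >= 3;
    },
);

print "handled $seen elements\n";
```

Since the -foot code is not run after a break, anything the footer handler would normally compute has to be obtained some other way in this situation.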
AUTHOR
Alberto Simões, <ambs at cpan.org>
BUGS
Please report any bugs or feature requests to bug-xml-dt-sequence at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=XML-DT-Sequence. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc XML::DT::Sequence
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
KNOWN BUGS AND LIMITATIONS
Spaced tags
Whitespace inside element tags (for instance, between the < and the element name) is unusual, but it is NOT supported.
Multiple usage tags
If the same tag is used at different levels of the XML hierarchy, it is likely that the implemented algorithm will not work.
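Before processing a file, it may help to check whether it falls under either limitation. The helper below is a rough, regex-based sketch and not part of the module: the name sequence_problems is hypothetical, and it assumes that attribute values do not contain ">" and that the tag does not appear inside comments or CDATA sections.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of problems found for $tag in $xml.
sub sequence_problems {
    my ($xml, $tag) = @_;
    my @problems;

    # Spaced tags: whitespace between '<' (or '</') and the element name.
    push @problems, 'spaced tag'
        if $xml =~ m{<(?:\s+/?\s*|/\s+)\Q$tag\E\b};

    # Multiple usage: track nesting depth of the tag; depth > 1 means
    # the tag appears inside another element with the same name.
    my $depth = 0;
    while ($xml =~ m{<(/?)\Q$tag\E(?=[\s/>])([^>]*)>}g) {
        my ($closing, $rest) = ($1, $2);
        next if !$closing && $rest =~ m{/\s*$};    # self-closing element
        if ($closing) {
            $depth--;
        }
        elsif (++$depth > 1) {
            push @problems, 'nested tag';
            last;
        }
    }
    return @problems;
}

# Usage with in-memory samples; a real check would read the file instead.
for my $sample ('<c><item>a</item><item>b</item></c>',
                '<c><item><item>x</item></item></c>',
                '<c>< item>a</item></c>') {
    my @found = sequence_problems($sample, 'item');
    print @found ? "problems: @found\n" : "looks fine\n";
}
```

A check like this reads the whole document into memory, so for very large files it would make more sense to run it on a sample or line by line.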
LICENSE AND COPYRIGHT
Copyright 2012 Alberto Simões.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.