NAME
WebService::ReutersConnect::XMLDocument - A decoration of XML::LibXML::Document with extra gizmos
SYNOPSIS
This acts as an XML::LibXML::Document, except that it has the following extra attributes and methods:
xml_namespaces
Returns an array reference of all the XML::LibXML::Namespace nodes included in this document. This is mainly for internal use.
usage:
foreach my $ns_node ( @{ $this->xml_namespaces() } ){
## Print some stuff.
}
xml_xpath
A ready-to-use instance of XML::LibXML::XPathContext with the document's namespaces preregistered.
NOTE: The default namespace is registered under the prefix 'rcx' (rEUTERS cONNECT xML).
Usage:
print( $this->xml_xpath->findvalue('//rcx:headline') );
print( $this->xml_xpath->findvalue('//rcx:description') );
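The role of a registered prefix can be illustrated with plain XML::LibXML. This is a minimal sketch, not this module's implementation: the namespace URI and the sample document are hypothetical and stand in for a real Reuters Connect response. It shows that an element in a default namespace is only matched by XPath once a prefix has been registered for that namespace.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical namespace URI and document, for illustration only.
my $ns_uri = 'http://example.com/hypothetical-rcx';
my $doc = XML::LibXML->load_xml( string => <<"XML" );
<results xmlns="$ns_uri">
  <result><headline>Sample headline</headline></result>
</results>
XML

# Build an XPath context and bind the 'rcx' prefix to the default namespace.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( 'rcx', $ns_uri );

# The prefixed query matches; the unprefixed one looks for an element
# in no namespace and finds nothing.
my $with_prefix    = $xpc->findvalue('//rcx:headline');
my $without_prefix = $xpc->findvalue('//headline');

print "with prefix:    '$with_prefix'\n";
print "without prefix: '$without_prefix'\n";
```

xml_xpath saves you this setup: the context it returns already has the prefixes registered.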
get_subjects
Returns a list of WebService::ReutersConnect::DB::Result::Concept objects representing the subjects of this Reuters news document.
Usage:
my @subjects = $this->get_subjects();
foreach my $subject ( @subjects ){
print $subject->name_main()."\n";
...
}
get_html_body
Get the XML::LibXML::Element that is the HTML Body of this rich document.
In list context, directly returns the non-blank children of the body as a list. This is useful for displaying the body content without outputting the 'body' element again.
Usage:
if( my $body = $this->get_html_body() ){
print $body->toString(1);
}
if( my @body_parts = $this->get_html_body() ){
print join("\n" , map{ $_->toString(1) } @body_parts );
}
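The scalar/list behaviour described for get_html_body can be sketched with Perl's wantarray. The helper html_body_sketch below is hypothetical and is not how this module is implemented; it also assumes a body element in no namespace, which may not hold for real documents. It only illustrates the calling convention.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper, not the module's actual implementation.
# Scalar context: returns the body element (or undef).
# List context: returns the non-blank children of the body (or an empty list).
sub html_body_sketch {
    my ($doc) = @_;
    my ($body) = $doc->findnodes('//body');
    return unless defined $body;
    return wantarray ? $body->nonBlankChildNodes() : $body;
}

my $doc = XML::LibXML->load_xml(
    string => '<html><body> <p>One</p> <p>Two</p> </body></html>' );

my $body  = html_body_sketch($doc);    # scalar context
my @parts = html_body_sketch($doc);    # list context

print ref($body), "\n";
print scalar(@parts), " non-blank children\n";
```

Because a bare return yields undef in scalar context and an empty list in list context, both forms of the if test shown in the usage above behave as expected when there is no body.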