NAME

XML::Atom::OWL - parse an Atom file into RDF

SYNOPSIS

use XML::Atom::OWL;

$parser = XML::Atom::OWL->new($xml, $baseuri);
$graph  = $parser->graph;

VERSION

0.100

DESCRIPTION

Constructor

$p = XML::Atom::OWL->new($xml, $baseuri, \%options, $storage)

This method creates a new XML::Atom::OWL object and returns it.

The $xml variable may contain an XML (Atom) string, an XML::LibXML::Document object, or undef. If a string, the document is parsed using XML::LibXML::Parser, which will throw an exception if it is not well-formed. XML::Atom::OWL does not catch the exception.

The base URI is used to resolve relative URIs found in the document. If $xml was undef, the URI will be fetched using LWP::UserAgent.

Currently only one option is defined, 'no_fetch_content_src', a boolean indicating whether <content src> URLs should be automatically fetched and added to the model as if inline content had been provided. They are fetched by default, but it's pretty rare for feeds to include this attribute.

$storage is an RDF::Trine::Storage object. If undef, then a new temporary store is created.

Public Methods

<$p-uri >>

Returns the base URI of the document being parsed. This will usually be the same as the base URI provided to the constructor.

Optionally it may be passed a parameter - an absolute or relative URI - in which case it returns the same URI which it was passed as a parameter, but as an absolute URI, resolved relative to the document's base URI.

This seems like two unrelated functions, but if you consider the consequence of passing a relative URI consisting of a zero-length string, it in fact makes sense.

$p->dom

Returns the parsed XML::LibXML::Document.

$p->graph

This method will return an RDF::Trine::Model object with all statements of the full graph.

This method automatically calls consume.

$p->root_identifier

Returns the blank node or URI for the root element of the Atom document as an RDF::Trine::Node

Calls consume automatically.

$p->set_callbacks(\%callbacks)

Set callback functions for the parser to call on certain events. These are only necessary if you want to do something especially unusual.

$p->set_callbacks({
  'pretriple_resource' => sub { ... } ,
  'pretriple_literal'  => sub { ... } ,
  'ontriple'           => undef ,
  });

For details of the callback functions, see the section CALLBACKS. set_callbacks must be used before consume. set_callbacks itself returns a reference to the parser object itself.

$p->consume

The document is parsed. Triples extracted from the document are passed to the callbacks as each one is found; triples are made available in the model returned by the graph method.

This function returns the parser object itself, making it easy to abbreviate several of XML::Atom::OWL's functions:

my $iterator = XML::Atom::OWL->new(undef, $uri)
               ->consume->graph->as_stream;

You probably only need to call this explicitly if you're using callbacks.

CALLBACKS

Several callback functions are provided. These may be set using the set_callbacks function, which taskes a hashref of keys pointing to coderefs. The keys are named for the event to fire the callback on.

pretriple_resource

This is called when a triple has been found, but before preparing the triple for adding to the model. It is only called for triples with a non-literal object value.

The parameters passed to the callback function are:

  • A reference to the XML::Atom::OWL object

  • A reference to the XML::LibXML::Element being parsed

  • Subject URI or bnode (string)

  • Predicate URI (string)

  • Object URI or bnode (string)

  • Graph URI or bnode (string or undef)

The callback should return 1 to tell the parser to skip this triple (not add it to the graph); return 0 otherwise.

pretriple_literal

This is the equivalent of pretriple_resource, but is only called for triples with a literal object value.

The parameters passed to the callback function are:

  • A reference to the XML::Atom::OWL object

  • A reference to the XML::LibXML::Element being parsed

  • Subject URI or bnode (string)

  • Predicate URI (string)

  • Object literal (string)

  • Datatype URI (string or undef)

  • Language (string or undef)

  • Graph URI or bnode (string or undef)

Beware: sometimes both a datatype and a language will be passed. This goes beyond the normal RDF data model.)

The callback should return 1 to tell the parser to skip this triple (not add it to the graph); return 0 otherwise.

ontriple

This is called once a triple is ready to be added to the graph. (After the pretriple callbacks.) The parameters passed to the callback function are:

  • A reference to the XML::Atom::OWL object

  • A reference to the XML::LibXML::Element being parsed

  • An RDF::Trine::Statement object.

The callback should return 1 to tell the parser to skip this triple (not add it to the graph); return 0 otherwise. The callback may modify the RDF::Trine::Statement object.

BUGS

Please report any bugs to http://rt.cpan.org/.

SEE ALSO

RDF::Trine, RDF::RDFa::Parser.

http://www.perlrdf.org/.

AUTHOR

Toby Inkster <tobyink@cpan.org>.

COPYRIGHT

Copyright 2010 Toby Inkster

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.