NAME
XRD::Parser - parse XRD and host-meta files into RDF::Trine models
SYNOPSIS
use RDF::Query;
use XRD::Parser;
my $parser = XRD::Parser->new(undef, "http://example.com/foo.xrd");
my $results = RDF::Query->new(
"SELECT * WHERE {?who <http://spec.example.net/auth/1.0> ?auth.}")
->execute($parser->graph);
while (my $result = $results->next)
{
print $result->{'auth'}->uri . "\n";
}
or maybe:
my $data = XRD::Parser->hostmeta('gmail.com')
->graph
->as_hashref;
VERSION
0.100
DESCRIPTION
While XRD has a rather different history, it turns out it can mostly be thought of as a serialisation format for a limited subset of RDF.
This package ignores the order of <Link> elements, as RDF is a graph format with no concept of statements coming in an "order". The XRD spec says that grokking the order of <Link> elements is only a SHOULD. That said, if you're concerned about the order of <Link> elements, the callback routines allowed by this package may be of use.
This package aims to be roughly compatible with RDF::RDFa::Parser's interface.
Constructors
$p = XRD::Parser->new($content, $uri, \%options, $store)
-
This method creates a new XRD::Parser object and returns it.
The $content variable may contain an XML string, or a XML::LibXML::Document. If a string, the document is parsed using XML::LibXML::Parser, which may throw an exception. XRD::Parser does not catch the exception.
$uri the supposed URI of the content; it is used to resolve any relative URIs found in the XRD document. Also, if $content is undef, then XRD::Parser will attempt to retrieve $uri using LWP::UserAgent.
Options [default in brackets]:
* default_subject - If no <Subject> element. [undef] * link_prop - How to handle <Property> in <Link>? [0] 0=skip, 1=reify, 2=subproperty, 3=both. * loose_mime - Accept text/plain & app/octet-stream. [0] * tdb_service - thing-described-by.org when possible. [0]
$storage is an RDF::Trine::Storage object. If undef, then a new temporary store is created.
$p = XRD::Parser->hostmeta($uri)
-
This method creates a new XRD::Parser object and returns it.
The parameter may be a URI (from which the hostname will be extracted) or just a bare host name (e.g. "example.com"). The resource "/.well-known/host-meta" will then be fetched from that host using an appropriate HTTP Accept header, and the parser object returned.
Public Methods
$p->uri($uri)
-
Returns the base URI of the document being parsed. This will usually be the same as the base URI provided to the constructor.
Optionally it may be passed a parameter - an absolute or relative URI - in which case it returns the same URI which it was passed as a parameter, but as an absolute URI, resolved relative to the document's base URI.
This seems like two unrelated functions, but if you consider the consequence of passing a relative URI consisting of a zero-length string, it in fact makes sense.
$p->dom
-
Returns the parsed XML::LibXML::Document.
$p->graph
-
This method will return an RDF::Trine::Model object with all statements of the full graph.
This method will automatically call
consume
first, if it has not already been called. - $p->set_callbacks(\%callbacks)
-
Set callback functions for the parser to call on certain events. These are only necessary if you want to do something especially unusual.
$p->set_callbacks({ 'pretriple_resource' => sub { ... } , 'pretriple_literal' => sub { ... } , 'ontriple' => undef , });
Either of the two pretriple callbacks can be set to the string 'print' instead of a coderef. This enables built-in callbacks for printing Turtle to STDOUT.
For details of the callback functions, see the section CALLBACKS.
set_callbacks
must be used beforeconsume
.set_callbacks
itself returns a reference to the parser object itself.NOTE: the behaviour of this function was changed in version 0.05.
$p->consume
-
This method processes the input DOM and sends the resulting triples to the callback functions (if any).
It called again, does nothing.
Returns the parser object itself.
Utility Functions
$uri = XRD::Parser::host_uri($uri)
-
Returns a URI representing the host. These crop up often in graphs gleaned from host-meta files.
$uri can be an absolute URI like 'http://example.net/foo#bar' or a host name like 'example.com'.
$uri = XRD::Parser::template_uri($relationship_uri)
-
Returns a URI representing not a normal relationship, but the relationship between a host and a template URI literal.
CALLBACKS
Several callback functions are provided. These may be set using the set_callbacks
function, which taskes a hashref of keys pointing to coderefs. The keys are named for the event to fire the callback on.
pretriple_resource
This is called when a triple has been found, but before preparing the triple for adding to the model. It is only called for triples with a non-literal object value.
The parameters passed to the callback function are:
A reference to the
XRD::Parser
objectA reference to the
XML::LibXML::Element
being parsedSubject URI or bnode (string)
Predicate URI (string)
Object URI or bnode (string)
The callback should return 1 to tell the parser to skip this triple (not add it to the graph); return 0 otherwise.
pretriple_literal
This is the equivalent of pretriple_resource, but is only called for triples with a literal object value.
The parameters passed to the callback function are:
A reference to the
XRD::Parser
objectA reference to the
XML::LibXML::Element
being parsedSubject URI or bnode (string)
Predicate URI (string)
Object literal (string)
Datatype URI (string or undef)
Language (string or undef)
The callback should return 1 to tell the parser to skip this triple (not add it to the graph); return 0 otherwise.
ontriple
This is called once a triple is ready to be added to the graph. (After the pretriple callbacks.) The parameters passed to the callback function are:
A reference to the
XRD::Parser
objectA reference to the
XML::LibXML::Element
being parsedAn RDF::Trine::Statement object.
The callback should return 1 to tell the parser to skip this triple (not add it to the graph); return 0 otherwise. The callback may modify the RDF::Trine::Statement object.
SEE ALSO
RDF::Trine, RDF::Query, RDF::RDFa::Parser.
AUTHOR
Toby Inkster, <tobyink@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2009-2010 by Toby Inkster
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8 or, at your option, any later version of Perl 5 you may have available.