NAME

RDF::RDFa::Parser::Config - configuration sets for RDFa parser

DESCRIPTION

The third argument to the constructor for RDF::RDFa::Parser objects is a configuration set. This module provides such configuration sets.

Configuration sets tell the parser how to handle features that vary between host languages or between RDFa versions.

All you need to know about is the constructor:

$config = RDF::RDFa::Parser::Config->new($host, $version, %options);

$host is the host language; the default is HOST_XHTML. Generally you would supply one of the constants listed below, but an Internet media type (e.g. 'text/html' or 'image/svg+xml') is also accepted.

  • RDF::RDFa::Parser::Config->HOST_ATOM

  • RDF::RDFa::Parser::Config->HOST_DATARSS

  • RDF::RDFa::Parser::Config->HOST_HTML4

  • RDF::RDFa::Parser::Config->HOST_HTML5

  • RDF::RDFa::Parser::Config->HOST_OPENDOCUMENT_XML

  • RDF::RDFa::Parser::Config->HOST_OPENDOCUMENT_ZIP

  • RDF::RDFa::Parser::Config->HOST_SVG

  • RDF::RDFa::Parser::Config->HOST_XHTML

  • RDF::RDFa::Parser::Config->HOST_XML
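As a small sketch of the two ways of naming a host language described above, the following builds one configuration set from a constant and one from a media type. The variable names are illustrative only, and whether the two resulting sets are identical in every option is not checked here.

```perl
use strict;
use warnings;
use RDF::RDFa::Parser;

# Host language given as a constant.
my $by_constant = RDF::RDFa::Parser::Config->new(
    RDF::RDFa::Parser::Config->HOST_SVG,
    RDF::RDFa::Parser::Config->RDFA_11,
);

# Host language given as an Internet media type instead.
my $by_media_type = RDF::RDFa::Parser::Config->new(
    'image/svg+xml',
    RDF::RDFa::Parser::Config->RDFA_11,
);
```

The media type form can be convenient when the host language is only known from an HTTP Content-Type header.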

$version is the RDFa version. Generally you would supply one of the following constants; the default is RDFA_LATEST.

  • RDF::RDFa::Parser::Config->RDFA_10

  • RDF::RDFa::Parser::Config->RDFA_11

  • RDF::RDFa::Parser::Config->RDFA_GUESS

  • RDF::RDFa::Parser::Config->RDFA_LATEST

(RDFA_GUESS is currently just an alias for RDFA_LATEST. Version guessing is a planned feature.)
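To show where a configuration set ends up, here is a minimal sketch that builds one for XHTML+RDFa 1.0 and passes it as the third argument to the parser constructor. The markup, the base URI and the variable names are made up for illustration, and the sketch assumes the first two constructor arguments are the markup string and the base URI.

```perl
use strict;
use warnings;
use RDF::RDFa::Parser;

# Illustrative document; any XHTML+RDFa markup would do.
my $markup = <<'XHTML';
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/terms/">
  <head><title>Example</title></head>
  <body>
    <p about="" property="dc:title">An example page</p>
  </body>
</html>
XHTML

# Host language and RDFa version chosen explicitly.
my $config = RDF::RDFa::Parser::Config->new(
    RDF::RDFa::Parser::Config->HOST_XHTML,
    RDF::RDFa::Parser::Config->RDFA_10,
);

# The configuration set is the third constructor argument.
# 'http://example.com/page' is a placeholder base URI.
my $parser = RDF::RDFa::Parser->new($markup, 'http://example.com/page', $config);
my $model  = $parser->graph;

print $model->size, " triple(s) found\n";
```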

%options is a hash of additional options which override the defaults. While many of these are useful, enabling them may reduce conformance to the official RDFa specifications. The following options exist; defaults are shown in brackets. Where two values are given, the first is the default for XHTML+RDFa 1.0 and the second the default for XHTML+RDFa 1.1.

  • alt_stylesheet - magic rel="alternate stylesheet". [0]

  • atom_elements - process <feed> and <entry> specially. [0]

  • atom_parser - extract Atom 1.0 native semantics. [0]

  • auto_config - see section "Auto Config" [0]

  • bookmark_start, bookmark_end, bookmark_name - elements to treat like OpenDocument's <text:bookmark-start> and <text:bookmark-end> elements, and the associated text:name attribute. All three must be set to use this feature. Use Clark Notation to specify namespaces. [all undef]

  • default_profiles - whitespace-separated list of profiles to load by default. [undef]

  • dom_parser - parser to use to turn a markup string into a DOM. 'html', 'opendocument' (i.e. zipped XML) or 'xml'. ['xml']

  • embedded_rdfxml - find plain RDF/XML chunks within document. 0=no, 1=handle, 2=skip. [0]

  • full_uris - support full URIs in CURIE-only attributes. [0, 1]

  • graph - enable support for named graphs. [0]

  • graph_attr - attribute to use for named graphs. Use Clark Notation to specify a namespace. ['graph']

  • graph_type - graph attribute behaviour ('id' or 'about'). ['id']

  • graph_default - default graph name. ['_:RDFaDefaultGraph']

  • keyword_bundles - space-separated list of bundles of keywords ('rdfa', 'html32', 'html4', 'html5', 'xhtml-role', 'aria-role', 'iana', 'xhv') ['rdfa']

  • lwp_ua - an LWP::UserAgent to use for HTTP requests. [undef]

  • ns - namespace for RDFa attributes. [undef]

  • prefix_attr - support @prefix rather than just @xmlns:*. [0, 1]

  • prefix_bare - support CURIEs with no colon+suffix. [0]

  • prefix_default - URI for default prefix (e.g. rel="foo"). [undef]

  • prefix_empty - URI for empty prefix (e.g. rel=":foo"). ['http://www.w3.org/1999/xhtml/vocab#']

  • prefix_nocase - DEPRECATED - shortcut for prefix_nocase_attr and prefix_nocase_xmlns.

  • prefix_nocase_attr - ignore case-sensitivity of CURIE prefixes defined via @prefix attribute. [0, 1]

  • prefix_nocase_xmlns - ignore case-sensitivity of CURIE prefixes defined via xmlns. [0, 1]

  • profiles - support RDFa profiles. [0, 1]

  • profile_pi - support DataRSS-style profile processing instruction. [0]

  • role_attr - support for XHTML @role [0]

  • safe_anywhere - allow Safe CURIEs in @rel/@rev/etc. [0, 1]

  • tdb_service - use thing-described-by.org to name some bnodes. [0]

  • user_agent - a User-Agent header to use for HTTP requests. Ignored if lwp_ua is provided. [undef]

  • use_rtnlx - use RDF::Trine::Node::Literal::XML. 0=no, 1=if available. [0]

  • vocab_attr - support @vocab from RDFa 1.1. [0, 1]

  • xhtml_base - process <base> element. 0=no, 1=yes, 2=use it for RDF/XML too. [1]

  • xhtml_elements - process <head> and <body> specially. [1]

  • xhtml_lang - support @lang rather than just @xml:lang. [0]

  • xml_base - support for 'xml:base' attribute. 0=only RDF/XML; 1=except @href/@src; 2=always. [0]

  • xml_lang - support for 'xml:lang' attribute. [1]

  • xmlns_attr - support for 'xmlns:foo' to define CURIE prefixes. [1]
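As a sketch of how several of the options above combine, the following configuration set enables named graphs, names the graph attribute in Clark Notation (the namespace URI in braces, followed by the local name), and supplies a custom LWP::UserAgent. The namespace 'http://example.com/ns#' and the agent string are placeholders, not values the module defines.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use RDF::RDFa::Parser;

# A user agent with a short timeout; the agent string is made up.
my $ua = LWP::UserAgent->new(
    timeout => 10,
    agent   => 'ExampleRDFaClient/0.1',
);

my $config = RDF::RDFa::Parser::Config->new(
    RDF::RDFa::Parser::Config->HOST_XHTML,
    RDF::RDFa::Parser::Config->RDFA_11,
    graph      => 1,                                # enable named graphs
    graph_attr => '{http://example.com/ns#}graph',  # Clark Notation, placeholder namespace
    graph_type => 'about',                          # 'id' is the default
    lwp_ua     => $ua,                              # user_agent is ignored when this is set
);
```

Options left out of the hash keep the defaults for the chosen host language and RDFa version.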

EXAMPLES

The following complete example parses RDFa 1.1 in an Atom document. It also enables the non-default 'atom_parser' option, which adds the native Atom semantics to the graph.

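use RDF::RDFa::Parser;

# Placeholder address; substitute the Atom document you want to parse.
my $url = 'http://example.com/feed.atom';

my $config = RDF::RDFa::Parser::Config->new(
  RDF::RDFa::Parser::Config->HOST_ATOM,
  RDF::RDFa::Parser::Config->RDFA_11,
  atom_parser => 1,   # also extract native Atom semantics
  );
my $parser = RDF::RDFa::Parser->new_from_url($url, $config);
my $data   = $parser->graph;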

The following configuration set parses XHTML+RDFa 1.1 while also parsing any RDF/XML chunks that are embedded in the document.

use RDF::RDFa::Parser::Config qw(HOST_XHTML RDFA_11);
my $config = RDF::RDFa::Parser::Config->new(
  HOST_XHTML, RDFA_11, embedded_rdfxml => 1);

SEE ALSO

RDF::RDFa::Parser.

AUTHOR

Toby Inkster <tobyink@cpan.org>.

COPYRIGHT

Copyright 2008-2010 Toby Inkster

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.