NAME
Catmandu::Importer::OAI - Package that imports OAI-PMH feeds
SYNOPSIS
# From the command line
$ catmandu convert OAI --url http://myrepo.org/oai
$ catmandu convert OAI --url http://myrepo.org/oai --metadataPrefix didl --handler raw
# In perl
use Catmandu::Importer::OAI;
my $importer = Catmandu::Importer::OAI->new(
    url            => "...",
    metadataPrefix => "...",
    from           => "...",
    until          => "...",
    set            => "...",
    handler        => "...",
);
my $n = $importer->each(sub {
    my $hashref = $_[0];
    # ...
});
CONFIGURATION
- url
-
OAI-PMH Base URL.
- metadataPrefix
-
Metadata prefix to specify the metadata format. Set to oai_dc by default.
- handler( sub {} | $object | 'NAME' | '+NAME' )
-
Handler to transform each record from XML DOM (XML::LibXML::Element) into a Perl hash.
Handlers can be provided as a function reference, an instance of a Perl package that implements 'parse', or a package NAME. A full package name must be prepended by +; a name without + is resolved within the Catmandu::Importer::OAI::Parser namespace. E.g. foobar will create a Catmandu::Importer::OAI::Parser::foobar instance.
By default the handler Catmandu::Importer::OAI::Parser::oai_dc is used for metadataPrefix oai_dc, Catmandu::Importer::OAI::Parser::marcxml for marcxml, and Catmandu::Importer::OAI::Parser::struct for other formats. In addition there is Catmandu::Importer::OAI::Parser::raw to return the XML as it is.
- set
-
An optional set for selective harvesting.
- from
-
An optional datetime value (YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ) as lower bound for datestamp-based selective harvesting.
- until
-
An optional datetime value (YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ) as upper bound for datestamp-based selective harvesting.
- listIdentifiers
-
Harvest identifiers instead of full records.
- resumptionToken
-
An optional resumptionToken to start harvesting from.
- dry
-
Don't do any HTTP requests but return URLs that data would be queried from.
- xslt
-
Preprocess XML records with XSLT script(s) given as a comma-separated list or array reference. Requires Catmandu::XML.
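As a rough illustration of how these options combine, the sketch below sets up a selective harvest of identifiers and turns on dry so that no HTTP request is made. The set name and the dates are made-up placeholders. The exact shape of the items returned in dry mode is not specified here, so they are simply dumped for inspection.

```perl
use strict;
use warnings;
use Data::Dumper;
use Catmandu::Importer::OAI;

my $importer = Catmandu::Importer::OAI->new(
    url             => 'http://myrepo.org/oai',  # base URL from the synopsis
    metadataPrefix  => 'oai_dc',
    set             => 'physics',                # hypothetical set name
    from            => '2020-01-01',             # illustrative lower bound
    until           => '2020-12-31',             # illustrative upper bound
    listIdentifiers => 1,                        # identifiers instead of full records
    dry             => 1,                        # do not perform HTTP requests
);

# Collect whatever the dry run yields (the URLs that would be queried).
my @planned;
$importer->each(sub { push @planned, $_[0] });

print Dumper(\@planned);
```

Removing dry from the argument list would turn the same configuration into a real harvest against the repository.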
DESCRIPTION
Every Catmandu::Importer is a Catmandu::Iterable; all its methods are inherited. The Catmandu::Importer::OAI methods are not idempotent: OAI-PMH feeds can only be read once.
METHOD
In addition to methods inherited from Catmandu::Iterable, this module provides the following public methods:
handle_record( $dom )
Process an XML DOM with the configured xslt and handler, and return the result.
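A minimal sketch of calling handle_record directly, assuming that a handler given as a function reference is invoked with the DOM element as its only argument. The XML snippet and the titles key are invented for illustration, and constructing the importer is not expected to contact the server.

```perl
use strict;
use warnings;
use XML::LibXML;
use Catmandu::Importer::OAI;

my $importer = Catmandu::Importer::OAI->new(
    url     => 'http://myrepo.org/oai',  # base URL from the synopsis
    # Hypothetical handler: collect the text of all <title> elements.
    handler => sub {
        my ($dom) = @_;
        my @titles = map { $_->textContent }
            $dom->getElementsByLocalName('title');
        return { titles => \@titles };
    },
);

# Build a small in-memory record instead of harvesting one.
my $xml = '<record><title>First</title><title>Second</title></record>';
my $dom = XML::LibXML->load_xml(string => $xml)->documentElement;

# No xslt is configured, so only the handler is applied.
my $result = $importer->handle_record($dom);

print join(', ', @{ $result->{titles} }), "\n";
```

Calling handle_record this way can be handy for trying out a handler on a saved record before running it against a live feed.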