NAME

Catmandu::Importer::RDF - parse RDF data

SYNOPSIS

Command line client catmandu:

catmandu convert RDF --url http://d-nb.info/gnd/4151473-7 to YAML

catmandu convert RDF --file rdfdump.ttl to JSON

# Parse the input into one JSON document per triple. This is the
# most memory-efficient (and fastest) way to parse RDF input.
catmandu convert RDF --triples 1 --file rdfdump.ttl to JSON

# Transform back into NTriples (conversions to and from triples are the
# most efficient way to process RDF)
catmandu convert RDF --triples 1 --file rdfdump.ttl to RDF --type NTriples

# Query a SPARQL endpoint
catmandu convert RDF --url http://dbpedia.org/sparql \
                     --sparql "SELECT ?film WHERE { ?film dct:subject <http://dbpedia.org/resource/Category:French_films> }"

catmandu convert RDF --url http://example.org/sparql --sparql query.rq

# Query a Linked Data Fragment endpoint
catmandu convert RDF --url http://fragments.dbpedia.org/2014/en \
                     --sparql "SELECT ?film WHERE { ?film dct:subject <http://dbpedia.org/resource/Category:French_films> }"

In Perl code:

use Catmandu::Importer::RDF;
my $url = "http://dx.doi.org/10.2474/trol.7.147";
my $rdf = Catmandu::Importer::RDF->new( url => $url )->first;

DESCRIPTION

This Catmandu::Importer can be used to import RDF data from URLs, files or input streams, SPARQL endpoints, and Linked Data Fragment endpoints.

By default an RDF graph is imported as a single item in aREF format (see RDF::aREF).

CONFIGURATION

url

URL to retrieve RDF from.

type

RDF serialization type (e.g. ttl for RDF/Turtle).

base

Base URL. By default derived from the URL or file name.

ns

Use default namespace prefixes as provided by RDF::NS to abbreviate predicate and datatype URIs. Set to 0 to disable abbreviating URIs. Set to a specific date to get stable namespace prefix mappings.

triples

If enabled, import each RDF triple as its own aREF subject map (default) or predicate map (with option predicate_map). This is the most efficient way to process large input files, since all processing can be streamed.

predicate_map

Import RDF as aREF predicate map, if possible.

file
fh
encoding
fix

Default configuration options of Catmandu::Importer.

sparql

The SPARQL query to be executed on the URL endpoint (currently only SELECT is supported). The query can be supplied as a string or as a filename. The importer tries to automatically add missing PREFIX statements from the default namespace prefixes.

sparql_result

Encoding of SPARQL result values. With aref, query results are encoded in aREF format, with URIs enclosed in < and > (no qNames) and literal nodes followed by @ and an optional language code. By default (value simple), all RDF nodes are simplified to their literal form.
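The aref encoding described above can be unpacked with a few lines of plain Perl. The following is a minimal sketch, not part of this module: decode_aref_value is a hypothetical helper, the example row is made up, and only the two cases named above (URIs in angle brackets, literals ending in @ plus an optional language code) are handled. Any other value is passed through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by Catmandu::Importer::RDF): classify
# one value from a result row obtained with sparql_result => 'aref'.
sub decode_aref_value {
    my ($value) = @_;

    # URIs are wrapped in angle brackets, e.g. <http://example.org/>
    if ($value =~ /^<(.*)>$/s) {
        return { type => 'uri', value => $1 };
    }

    # Literals end with "@" and an optional language code, e.g. "Amelie@fr"
    if ($value =~ /^(.*)@([a-zA-Z]+(?:-[a-zA-Z0-9]+)*)?$/s) {
        my %node = (type => 'literal', value => $1);
        $node{lang} = $2 if defined $2;
        return \%node;
    }

    # Anything else is returned as-is; this sketch does not interpret it.
    return { type => 'other', value => $value };
}

# Made-up result row for illustration only.
my %row = (
    film  => '<http://example.org/film/1>',
    label => 'Amelie@fr',
);

for my $name (sort keys %row) {
    my $node = decode_aref_value($row{$name});
    printf "%s: %s %s\n", $name, $node->{type}, $node->{value};
}
```

With the default value simple no such decoding is needed, at the cost of losing the distinction between URIs and literals.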

cache

Set to a true value to cache repeated URL responses in a CHI-based backend.

cache_options

Provide the CHI options for caching result sets. By default a memory store of 1 MB is used. This is equivalent to:

Catmandu::Importer::RDF->new( ...,
    cache => 1,
    cache_options => {
        driver => 'Memory',
        global => 1,
        max_size => 1024*1024
    });

speed

If set to a true value, write the RDF file processing speed to STDERR as the number of triples parsed per second.

METHODS

See Catmandu::Importer.

SEE ALSO

RDF::Trine::Store, RDF::Trine::Parser