NAME
RDF::Flow::Source - Source of RDF data
VERSION
version 0.178
SYNOPSIS
    $src = rdflow( "mydata.ttl", name => "RDF file as source" );
    $src = rdflow( "mydirectory", name => "directory with RDF files as source" );
    $src = rdflow( \&mysource, name => "code reference as source" );
    $src = rdflow( $model, name => "RDF::Trine::Model as source" );

    package MySource;
    use parent 'RDF::Flow::Source';

    sub retrieve_rdf {
        my ($self, $env) = @_;
        my $uri = $env->{'rdflow.uri'};
        # ... your logic here ...
        return $model;
    }
DESCRIPTION
Each RDF::Flow::Source provides a retrieve method, which returns RDF data on request. RDF data is always returned as an instance of RDF::Trine::Model or as an instance of RDF::Trine::Iterator with simple statements. The request format is specified below. Sources can access RDF that is, for instance, parsed from a file or from multiple files in a directory, retrieved via HTTP, taken from an RDF::Trine::Store, or produced by a custom method. All sources share a set of common configuration options.
METHODS
new ( $from {, %configuration } )
Create a new RDF source by wrapping a code reference or an RDF::Trine::Model, or by loading RDF data from a file or URL.
If you pass an existing RDF::Flow::Source object, it will not be wrapped.
A source returns RDF data as an instance of RDF::Trine::Model or RDF::Trine::Iterator when queried by a PSGI request. This is similar to PSGI applications, which return HTTP responses instead of RDF data. RDF::Light supports three types of sources: code references, instances of RDF::Flow, and instances of RDF::Trine::Model.
This constructor is exported as function rdflow by RDF::Flow:

    use RDF::Flow qw(rdflow);
    $src = rdflow( @args );                  # short form
    $src = RDF::Flow::Source->new( @args );  # explicit constructor
init
Called from the constructor. Can be used in your sources.
retrieve
Retrieve RDF data. Always returns an instance of RDF::Trine::Model or RDF::Trine::Iterator. You can use the method "empty_rdf" to check whether the RDF data contains some triples or not.
retrieve_rdf
Internal method to retrieve RDF data. You should define this method when subclassing RDF::Flow::Source; it is called by the method retrieve.
trigger_retrieved ( $source, $result [, $message ] )
Creates a logging event at trace level to log that some result has been retrieved from a source. Returns the result. By default the logging message is constructed from the source's name and the result's size. This function is automatically called at the end of the method retrieve, so you do not have to call it if your source only implements the method retrieve_rdf.
name
Returns the name of the source.
about
Returns a string with short information (name and size) of the source.
size
Returns the number of inputs (for multi-part sources, such as RDF::Flow::Source::Union).
inputs
Returns a list of inputs (unstable).
id
Returns a unique id of the source, based on its memory address.
pipe_to
Pipes the source to another source (RDF::Flow::Pipeline). $a->pipe_to($b) is equivalent to RDF::Flow::Pipeline->new($a,$b).
timestamp
Returns an ISO 8601 timestamp and possibly sets it in the rdflow.timestamp environment variable.
trigger_error
Triggers an error and possibly sets the rdflow.error environment variable.
graphviz
Purely experimental method for visualizing nets of sources.
graphviz_addnode
Purely experimental method for visualizing nets of sources.
CONFIGURATION
- name
  Name of the source. Defaults to "anonymous source".
- from
  Filename, URL, directory, RDF::Trine::Model or code reference to retrieve RDF from. This option is not supported by all source types.
- match
  Optional regular expression or code reference to match and/or map request URIs. For instance, you can rewrite URNs to HTTP URIs like this:

    match => sub { $_[0] =~ s{^urn:isbn:}{http://example.org/isbn/}; }

  The URI in rdflow.uri is set back to its original value after retrieval.
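As a minimal sketch of how such a code reference behaves, assuming the request URI is passed as the first argument and rewritten in place (the ISBN value is purely illustrative):

```perl
use strict;
use warnings;

# Rewrites the URI in place through the @_ alias and returns true on a match.
# Curly-brace delimiters avoid having to escape the slashes in the replacement.
my $match = sub { $_[0] =~ s{^urn:isbn:}{http://example.org/isbn/} };

my $uri     = 'urn:isbn:0123456789';    # illustrative value
my $matched = $match->($uri);

print "$uri\n" if $matched;
```

Because the substitution returns a false value when nothing was replaced, the same code reference can serve both as a filter and as a mapping.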
REQUEST FORMAT
A valid request can either be a URI (as byte string) or a hash reference, called an environment. The environment must be a specific subset of a PSGI environment with the following variables:
- rdflow.uri
  A request URI as byte string. If this variable is provided, no other variables are needed and the following variables will not modify this value.
- psgi.url_scheme
  A string http (assumed if not set) or https.
- HTTP_HOST
  The base URL of the host for constructing a URI. This or SERVER_NAME is required unless rdflow.uri is set.
- SERVER_NAME
  Name of the host for constructing a URI. Only used if HTTP_HOST is not set.
- SERVER_PORT
  Port of the host for constructing a URI. By default 80 is used, but not kept as part of an HTTP-URI due to URI normalization.
- SCRIPT_NAME
  Path for constructing a URI. Must start with / if given.
- QUERY_STRING
  Portion of the request URI that follows the ?, if any.
- rdflow.ignorepath
  If this variable is set, no query part is used when constructing a URI.
The method reuses code from Plack::Request by Tatsuhiko Miyagawa. Note that the environment variable REQUEST_URI is not included. When this method constructs a request URI from a given environment hash, it always sets the variable rdflow.uri, so the variable is guaranteed to be set afterwards. However, it may be the empty string if an environment without HTTP_HOST or SERVER_NAME was provided.
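The rules above can be approximated in plain Perl. The following is only a sketch under stated assumptions, not the module's own code: build_request_uri is a hypothetical helper, it performs no escaping or URI normalization, and it appends a port only when SERVER_PORT is set to something other than 80.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's rdflow_uri): builds a request URI
# from a PSGI-like environment following the rules listed above.
sub build_request_uri {
    my ($env) = @_;
    return $env unless ref $env;                 # already a URI string
    return $env->{'rdflow.uri'} if defined $env->{'rdflow.uri'};

    my $scheme = $env->{'psgi.url_scheme'} || 'http';
    my $host   = $env->{HTTP_HOST};
    if ( !defined $host || $host eq '' ) {
        $host = $env->{SERVER_NAME};
        # without any host information the URI is the empty string
        return $env->{'rdflow.uri'} = ''
            unless defined $host && $host ne '';
        # assumption: a port is appended only if set and not the default 80
        my $port = $env->{SERVER_PORT};
        $host .= ":$port" if defined $port && $port ne '' && $port ne '80';
    }

    my $path = ( $env->{SCRIPT_NAME} || '' ) . ( $env->{PATH_INFO} || '' );
    $path = '/' if $path eq '';

    my $uri   = "${scheme}://${host}${path}";
    my $query = $env->{QUERY_STRING};
    $uri .= "?$query"
        if defined $query && $query ne '' && !$env->{'rdflow.ignorepath'};

    return $env->{'rdflow.uri'} = $uri;          # always set afterwards
}

my %env = (
    SERVER_NAME  => 'example.org',
    SERVER_PORT  => 80,
    SCRIPT_NAME  => '/data',
    QUERY_STRING => 'format=ttl',
);
print build_request_uri( \%env ), "\n";
```

With this environment the sketch prints http://example.org/data?format=ttl and stores the same value in $env{'rdflow.uri'}.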
FUNCTIONS
The following functions are defined to be used in custom source types.
rdflow_uri ( $env | $uri )
Prepares and returns a request URI, as given by an environment hash or by an existing URI. Sets rdflow.uri if an environment has been given. URI construction is based on code from Plack, as described in "REQUEST FORMAT". The following environment variables are used: psgi.url_scheme, HTTP_HOST or SERVER_NAME, SERVER_PORT, SCRIPT_NAME, PATH_INFO, QUERY_STRING, and rdflow.ignorepath.
sourcelist_args ( @_ )
Parses a list of inputs (code or other references) mixed with key-value pairs and returns both, separated into an array and a hash.
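The idea can be illustrated in plain Perl. This is a sketch only: split_inputs_and_args is a hypothetical helper, not the module's implementation, and it assumes that every reference in key position is an input, that every plain string is a key followed by its value, and that the results come back as an array reference and a hash reference.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the idea; not the module's own code.
# References are collected as inputs; a plain string is taken as a key
# whose value is the next list element.
sub split_inputs_and_args {
    my ( @inputs, %args );
    while (@_) {
        my $item = shift;
        if ( ref $item ) {
            push @inputs, $item;
        }
        elsif ( defined $item ) {
            $args{$item} = shift;
        }
    }
    return ( \@inputs, \%args );
}

my ( $inputs, $args ) = split_inputs_and_args(
    sub { }, name => 'my union', sub { },
);
printf "%d inputs, name: %s\n", scalar @$inputs, $args->{name};
```

Since a key consumes the element that follows it, a reference passed as a value (for example a code reference for match) ends up in the hash rather than in the list of inputs.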
iterator_to_model ( [ $iterator ] [, $model ] )
Adds all statements from an RDF::Trine::Iterator to a (possibly new) RDF::Trine::Model and returns the model.
empty_rdf ( $rdf )
Returns true if the argument is an empty RDF::Trine::Model, an empty RDF::Trine::Iterator, or no RDF data at all.
AUTHOR
Jakob Voß <voss@gbv.de>
COPYRIGHT AND LICENSE
This software is copyright (c) 2011 by Jakob Voß.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.