NAME
RDF::Core::Query - Implementation of query language
SYNOPSIS
use RDF::Core::Query;
use RDF::Core::Evaluator;
use RDF::Core::Function;

# $model, $schema and $factory are assumed to exist already.
my %namespaces = (
    Default => 'http://myApp.gingerall.org/ns#',
    ns      => 'http://myApp.gingerall.org/ns#',
);

# Row handler: prints the label of each value, or NULL for undef.
sub printRow {
    my (@row) = @_;
    foreach (@row) {
        my $label = defined($_) ? $_->getLabel : 'NULL';
        print $label, ' ';
    }
    print "\n";
}

my $functions = RDF::Core::Function->new(
    Data    => $model,
    Schema  => $schema,
    Factory => $factory,
);
my $evaluator = RDF::Core::Evaluator->new(
    Model      => $model,       # an instance of RDF::Core::Model
    Factory    => $factory,     # an instance of RDF::Core::NodeFactory
    Functions  => $functions,
    Namespaces => \%namespaces,
    Row        => \&printRow,
);
my $query = RDF::Core::Query->new(Evaluator => $evaluator);
$query->query("Select ?x->title
               From store->book{?x}->author{?y}
               Where ?y = 'Lewis'");
DESCRIPTION
The Query module, together with RDF::Core::Evaluator and RDF::Core::Function, implements a query language. The result of a query is a series of handler calls, each call corresponding to one row of returned data.
Interface
new(%options)
Available options are:
Evaluator
RDF::Core::Evaluator object.
query($queryString)
Evaluates $queryString. By default, returns an array reference in which each item holds one resulting row. RDF::Core::Evaluator has a Row option that takes a function to handle each row returned from the query. If that handler is set, it is called once for each row of the result and no result array is returned. The handler's parameters are RDF::Core::Resource objects, RDF::Core::Literal objects, or undef values.
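As a minimal sketch of a Row handler that collects rows instead of printing them, the following runs without a model. Local::Node and collect_row are hypothetical names introduced for illustration; Local::Node only stands in for the getLabel method of RDF::Core::Resource and RDF::Core::Literal, and the two direct calls simulate what the evaluator would do for two result rows.

```perl
use strict;
use warnings;

# Hypothetical stand-in for RDF::Core::Resource / RDF::Core::Literal,
# so that the sketch runs without a model. Only getLabel is mimicked.
package Local::Node;
sub new      { my ($class, $label) = @_; return bless { label => $label }, $class }
sub getLabel { return $_[0]{label} }

package main;

my @rows;    # collected rows, one array ref of labels per row

# Row handler: receives one node (or undef) per selected column
# and stores the labels, keeping undef for missing values.
sub collect_row {
    my @row = @_;
    push @rows, [ map { defined $_ ? $_->getLabel : undef } @row ];
    return;
}

# Simulate two handler calls, one per result row.
collect_row(Local::Node->new('Alice in Wonderland'), Local::Node->new('Lewis'));
collect_row(Local::Node->new('Untitled'), undef);

for my $row (@rows) {
    print join(' | ', map { defined $_ ? $_ : 'NULL' } @$row), "\n";
}
```

With a real evaluator, the handler would be passed as Row => \&collect_row in the RDF::Core::Evaluator options, in the same way as printRow in the SYNOPSIS.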
prepare($queryString)
Parses $queryString and prepares the query for execution. The string can contain external variables, written as names prefixed with a hash sign (#name), which are bound to values in execute().
execute(\%bindings,$parsedQuery)
Executes a prepared query. If $parsedQuery is not supplied, the last prepared, executed, or queried query is executed. The binding hash must contain a value for each external variable used. Each value is an RDF::Core::Resource or RDF::Core::Literal object.
Query language
The query language has three major parts, introduced by the select, from and where keywords. The select part specifies which "columns" of data should be returned. The from part defines the pattern or path in the graph to search for and binds variables to specific points on the path. The where part specifies conditions that each path found must satisfy.
Let's start in the middle, with the from part:
Select ?x from ?x->ns:author
This finds all resources that have the property ns:author. Properties can be chained:
Select ?x from ?x->ns:author->ns:name
This finds all resources that have the property ns:author whose value in turn has the property ns:name. We can bind values to variables so that we can refer back to them:
Select ?x, ?authorName from ?x->ns:author{?authorID}->ns:name{?authorName}
This finds the same paths as the previous example and binds the ?authorID variable to the author value and ?authorName to the name value. A variable is bound to the value of a property, not to the property itself. If a second variable is given, it is bound to the property itself:
Select ?x from ?x->ns:author{?authorID}->ns:name{?authorName,?prop}
The variable ?authorName will contain the name of an author, while the ?prop variable will contain the URI of the ns:name property. This kind of binding can be useful with function calls (see below).
If more than one path is specified, the result must satisfy all of them. A variable that appears in several paths represents the same value in each, which describes how the paths are joined together. If two paths have no variables in common, their cartesian product is produced.
Select ?x
From ?x->ns:author{?author}->ns:name{?name},
?author->ns:birth{?birth}
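The joining rule can be sketched in plain Perl, independently of RDF::Core. In this illustration each path match is a hash of variable bindings, values are plain strings compared with ne, and join_bindings and the sample data are hypothetical; the module's own evaluation strategy is not described here and may differ.

```perl
use strict;
use warnings;

# Combine the matches of two paths. Two matches are compatible when every
# variable they share has the same value. With no shared variables, every
# pair is compatible, which yields the cartesian product.
sub join_bindings {
    my ($left, $right) = @_;
    my @joined;
    for my $l (@$left) {
        RIGHT: for my $r (@$right) {
            for my $var (keys %$r) {
                next RIGHT if exists $l->{$var} && $l->{$var} ne $r->{$var};
            }
            push @joined, { %$l, %$r };
        }
    }
    return \@joined;
}

# Hypothetical matches for the two paths of the query above.
my @path1 = (
    { '?x' => 'book1', '?author' => 'author1', '?name' => 'Lewis' },
    { '?x' => 'book2', '?author' => 'author2', '?name' => 'Smith' },
);
my @path2 = (
    { '?author' => 'author1', '?birth' => 'birth1' },
);

my $joined = join_bindings(\@path1, \@path2);
print scalar(@$joined), " combined match(es)\n";
```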
Target element. The value of the last property in the path can be specified:
Select ?x from ?x->ns:author->ns:name=>'Lewis'
Class expression. Class of the starting element in the path can be specified:
Select ?x from ns:Book::?x->ns:author
which is equivalent to
Select ?x from ?x->ns:author, ?x->rdf:type=>ns:Book
assuming the namespace prefix rdf is defined as 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'. (See the Names and URIs section below.)
Condition. Now that we have described the data we are interested in, we can put further conditions on it in the where section:
Select ?x
From ?x->ns:author{?author}->ns:name{?name}, ?author->ns:birth{?birth}
Where ?name = 'Lewis' And ?birth->ns:year < '1900'
This means: get all paths in the graph described in the from section and exclude those that don't satisfy the condition. Only variables declared in the from section can be used; binding is not allowed in a condition.
In a condition, each element (resource, predicate or value) can be replaced with a list of variants. So we may ask:
Select ?x
From ?x->ns:author{?author}
Where ?author->(ns:book,ns:booklet,ns:article)->ns:published < '1938'
and it means
Select ?x
From ?x->ns:author{?author}
Where ?author->ns:book->ns:published < '1938'
Or ?author->ns:booklet->ns:published < '1938'
Or ?author->ns:article->ns:published < '1938'
The list of variants can be combined with class expression:
Select ?x
From ?x->ns:author{?author}
Where (clss:Writer, clss:Teacher)::?author->ns:birth < '1900'
and it means
...
Where (?author->rdf:type = clss:Writer
Or ?author->rdf:type = clss:Teacher)
And ?author->ns:birth < '1900'
Resultset. The select section describes how to output each path found. We can think of a path as an n-tuple of values bound to variables.
Select ?x->ns:title, ?author->ns:name
From ?x->ns:author{?author}
Where (clss:Writer, clss:Teacher)::?author->ns:birth < '1900'
For each n-tuple [?x, ?author] satisfying the query, ?x->ns:title and ?author->ns:name are evaluated and the pair of values is returned as one row of the result. If there is no value for ?x->ns:title, undef is returned in its place. If there are several values for one particular ?x->ns:title, all of them are returned, in cartesian product with the values of ?author->ns:name.
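The row-building rule for a single n-tuple can be illustrated with a small standalone sketch. The helper result_rows and the sample values are hypothetical; the code only mirrors the rule stated above (undef for a missing value, cartesian product for multiple values) and is not taken from the module.

```perl
use strict;
use warnings;

# Build result rows from the value lists of the select expressions.
# An empty list contributes a single undef; several values multiply out
# as a cartesian product with the other columns.
sub result_rows {
    my @columns = @_;        # one array ref of values per select expression
    my @rows = ([]);         # start with a single empty row
    for my $values (@columns) {
        my @candidates = @$values ? @$values : (undef);
        my @next;
        for my $row (@rows) {
            push @next, [ @$row, $_ ] for @candidates;
        }
        @rows = @next;
    }
    return @rows;
}

# Hypothetical data: ?x has two titles, ?author has one name.
my @result = result_rows([ 'Alice', 'Alice in Wonderland' ], [ 'Lewis' ]);
for my $row (@result) {
    print join(' | ', map { defined $_ ? $_ : 'NULL' } @$row), "\n";
}
```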
Names and URIs
'ns:name' is a shortcut for a URI. Each prefix:name is expanded to a URI by concatenating the value of the prefix with the name. If no prefix is present, the prefix Default is used. There are two ways to assign a value to a namespace prefix. You can specify the prefix and its value in the Evaluator's Namespaces option. This is a global setting, which applies to all queries evaluated by the Query object. Locally, you can set namespaces in each select with a Use clause, which overrides the global settings for the current select. URIs can also be written out explicitly in square brackets. The following two queries are equivalent (the second one spans two lines):
Select ?x from ?x->[http://myApp.gingerall.org/ns#name]
Select ?x from ?x->ns:name
Use ns For [http://myApp.gingerall.org/ns#]
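The expansion rule can be sketched in a few lines of Perl. The helper expand_name is hypothetical and not part of RDF::Core; it only restates the rule above, using a prefix table of the same shape as the Namespaces option.

```perl
use strict;
use warnings;

# Global prefix table, in the shape of the Evaluator's Namespaces option.
my %namespaces = (
    Default => 'http://myApp.gingerall.org/ns#',
    ns      => 'http://myApp.gingerall.org/ns#',
);

# Expand "prefix:name" or a bare "name" to a full URI.
# Illustrative helper only, not part of RDF::Core.
sub expand_name {
    my ($qname, $ns) = @_;
    my ($prefix, $name) = $qname =~ /^(?:(\w+):)?(\w+)$/
        or die "Not a valid name: $qname\n";
    $prefix = 'Default' unless defined $prefix;    # no prefix: use Default
    exists $ns->{$prefix} or die "Unknown prefix: $prefix\n";
    return $ns->{$prefix} . $name;
}

print expand_name('ns:name', \%namespaces), "\n";
print expand_name('name',    \%namespaces), "\n";
```

Because Default and ns map to the same namespace in this table, both calls produce the same URI.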
Functions
Functions can be used to obtain custom values for a resource. They accept resources or literals as parameters and return a set of resources or literals. They can be used in place of a URI or name. If a function is at the position of a property, it gets the resource as a special parameter, and what it returns is taken as the value of the expression rather than as a 'real' property.
Suppose we have a function foo() that always returns the resource with URI http://myApp.gingerall.org/ns#foo. The expression
?x->foo()
evaluates to
[http://myApp.gingerall.org/ns#foo],
not
?x->[http://myApp.gingerall.org/ns#foo]
Now we can restate the condition with variants as a condition with a function call.
Select ?x
From ?x->ns:author{?author}
Where ?author->subproperty(ns:publication)->ns:published < '1938'
We assume an appropriate schema in which book, booklet, article etc. are (directly or indirectly) rdfs:subPropertyOf publication.
The function above searches the schema for subproperties of publication and returns the values of those subproperties. Sometimes we'd like to know not only the value of that "hidden" property, but also the property itself. Again, we can use a multiple binding. In the following example we get the URI of the publication in ?publication and the URI of the property (book, booklet, article, ...) in ?property.
Select ?publication, ?property
From ?author->subproperty(ns:publication){?publication, ?property}
Where ?publication->ns:published < '1938'
Comments.
Comments either start with two dashes and run to the end of the line or string, or are enclosed in /*...*/.
Select ?publication, ?property --the rest of line is a comment
From ?author->subproperty(publication){?publication, ?property}
Where /*another
comment*/ ?publication->published < '1938'
A BNF diagram for query language
<query> ::= "Select" <resultset> "From" <source> ["Where" <condition>]
            ["Use" <namespaces>]
<resultset> ::= <elementpath>{","<elementpath>}
<source> ::= <sourcepath>{","<sourcepath>}
<sourcepath> ::= [<element>[ "{" <variable> "}" ]"::"]
                 <element>[ "{" <variable> "}" ]
                 {"->"<element>[ "{" <variable> ["," <variable>] "}" ]}
                 ["=>"<element> | <expression>]
<condition> ::= <match> | <condition> <connection> <condition>
                {<connection> <condition>}
                | "(" <condition> ")"
<namespaces> ::= <name> ["For"] "["<uri>"]" { "," <name> ["For"] "["<uri>"]"}
<match> ::= <path> [<relation> <path>]
<path> ::= [<elements>"::"]<elements>{"->"<elements>} | <expression>
<elements> ::= <element> | "(" <element> {"," <element>} ")"
<elementpath> ::= <element>{"->"<element>} | <expression>
<element> ::= <variable> | <node> | <function>
<function> ::= <name> "(" <elementpath>["," <elementpath>] ")"
<node> ::= "[" <uri> "]" | "[" "_:" <name> "]" | [<name>":"]<name>
<variable> ::= "?"<name>
<name> ::= [a-zA-Z_][a-zA-Z0-9_]*
<expression> ::= <literal> | <expression> <operation> <expression>
                 {<operation> <expression>}
                 | "(" <expression> ")"
<connection> ::= "and" | "or"
<relation> ::= "=" | "<" | ">"
<operation> ::= "|"
<literal> ::= """{any_character}""" | "'"{any_character}"'"
<uri> ::= absolute uri resource, see uri specification
LICENSE
This package is subject to the MPL (or the GPL alternatively).
AUTHOR
Ginger Alliance, rdf@gingerall.cz
SEE ALSO
RDF::Core::Evaluator, RDF::Core::Function