NAME

Webservice::InterMine::Cookbook::Recipe2 - Adding Constraints

SYNOPSIS

# Get a list of the drosophilids in the database

use Webservice::InterMine ('www.flymine.org');

my $query = Webservice::InterMine->new_query;

# Specifying a name and a description is purely optional
$query->name('Tutorial 2 Query');
$query->description('A list of all the drosophilids in the database');
$query->add_view(qw/Organism.name Organism.taxonId/);

$query->add_constraint(
    path  => 'Organism.name',
    op    => 'CONTAINS',
    value => 'Drosophila',
);

print $query->results(as => 'string');

# Get the authors, titles and PubMed IDs of all publications on drosophilid
# genes between 2004 and 2010

my $query2 = Webservice::InterMine->new_query;

$query2->add_view(qw/
    Gene.publications.firstAuthor
    Gene.publications.pubMedId
    Gene.publications.title
/);

$query2->add_constraint(
    path  => 'Gene.organism.name',
    op    => '=',
    value => '*Drosophila*',
);
$query2->add_constraint(
    path  => 'Gene.publications.year',
    op    => '>=',
    value => 2004,
);
$query2->add_constraint(
    path  => 'Gene.publications.year',
    op    => '<=',
    value => 2010,
);

print $query2->results(as => 'string');

DESCRIPTION

Queries become far more useful once you can constrain the results they return. Constraints are added with add_constraint, which works much like the add_view command we have already seen. Whereas add_view just takes a list of paths, add_constraint takes a wider and more variable set of arguments:

  • path - the path representing the attribute to be constrained

  • op - the 'operator', which defines how to constrain the path

  • value - the value to be applied to the operator

The second query in the synopsis uses three different operators: =, >=, and <=, each of which takes a value and uses it to constrain its path. The = operator permits the use of wildcards (* in that query) to search on a basic pattern.
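To get a feel for what a wildcard value such as '*Drosophila*' selects, the pattern can be pictured as a regular expression. The sketch below runs locally and does not contact a webservice; the helper wildcard_to_regex is hypothetical and is not part of Webservice::InterMine. The real matching happens on the server and may differ in details such as case sensitivity.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Webservice::InterMine): turn a wildcard
# value into a Perl regex. Every character is matched literally except '*',
# which stands for any run of characters.
sub wildcard_to_regex {
    my ($pattern) = @_;
    my $body = join '.*', map { quotemeta($_) } split /\*/, $pattern, -1;
    return qr/\A$body\z/;
}

my $re = wildcard_to_regex('*Drosophila*');

for my $name ('Drosophila melanogaster', 'Homo sapiens', 'Drosophila yakuba') {
    printf "%-25s %s\n", $name, ($name =~ $re ? 'matches' : 'does not match');
}
```

Without any wildcard the pattern only matches the whole value, which is why '=' with a plain value behaves as an exact comparison in this sketch.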

It is also possible to specify constraints using a list of their parameters:

$query->add_constraint('Gene.symbol', '=', 'eve');
$query->add_constraint('Gene.homologue', 'IS NOT NULL');
$query->add_constraint('Gene.name', 'IN', ['Even skipped', 'Zerknullt']);

This works for all constraint types.

For the simpler constraints (Binary and Unary constraints only), it is also possible to add constraints using the following pattern:

$query->add_constraint('Gene.organism.name = "Drosophila Yakuba"');

Note the quoting: the constraint here is a single string, which is parsed for the path, operator and value.
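As an illustration of what parsing such a string involves, here is a minimal sketch that splits a constraint string into its path, operator and value. The function parse_constraint_string and its regular expressions are assumptions made for this example; the module's own parser may accept a different set of operators and quoting rules.

```perl
use strict;
use warnings;

# Hypothetical parser, not the module's own: splits a constraint string into
# path, operator and (for binary constraints) value.
sub parse_constraint_string {
    my ($string) = @_;

    # Unary constraints: a path followed by IS NULL or IS NOT NULL.
    if ($string =~ /\A\s*([\w.]+)\s+(IS(?:\s+NOT)?\s+NULL)\s*\z/i) {
        return (path => $1, op => uc($2));
    }

    # Binary constraints: path, operator, then a quoted or bare value.
    if ($string =~ /\A\s*([\w.]+)\s*(!=|>=|<=|=|<|>)\s*(?:"([^"]*)"|(\S+))\s*\z/) {
        my $value = defined $3 ? $3 : $4;
        return (path => $1, op => $2, value => $value);
    }

    die "Cannot parse constraint: $string\n";
}

my %con = parse_constraint_string('Gene.organism.name = "Drosophila Yakuba"');
print "$_ => $con{$_}\n" for qw/path op value/;
```

The double quotes inside the string are what keep a value containing spaces together as a single value.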

In the examples above the constraints are all cumulative: we only get results back if an item satisfies every one of them. We say that the logic for $query2 is "A and B and C". It is also possible, however, to 'or' your constraints together:

# Get the authors, titles and PubMed IDs of all publications
# since 2004 on genes in D. Yakuba or D. Melanogaster

my $query3 = Webservice::InterMine->new_query;

$query3->add_view(qw/
    Gene.publications.firstAuthor
    Gene.publications.pubMedId
    Gene.publications.title
/);

my $con1 = $query3->add_constraint(
    path  => 'Gene.organism.name',
    op    => '=',
    value => 'Drosophila Yakuba',
);
my $con2 = $query3->add_constraint(
    path  => 'Gene.organism.name',
    op    => '=',
    value => 'Drosophila Melanogaster',
);
my $con3 = $query3->add_constraint(
    path  => 'Gene.publications.year',
    op    => '>=',
    value => 2004,
);

$query3->logic(($con1 | $con2) & $con3);

my $publications_results = $query3->results(as => 'string');
print $publications_results;

Note that here we keep the constraint objects returned by add_constraint, which we would normally just ignore. Then these are combined to create the logic for the query using the | and & operators. You can always inspect the logic for a query by calling $query->logic->code, which here would return "(A or B) and C". It is also possible to use string parsing to define the logic:

$query3->logic('(A or B) and C');

The letters used here are the 'codes' associated with each constraint - to find a constraint's code you can always call $con->code, and to find out what it does you can call $con->to_string, which for constraint A would return:

'Gene.organism.name = "Drosophila Yakuba"'
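Combining constraint objects with | and & relies on Perl's operator overloading. The following sketch shows one way such overloading could produce a logic string from constraint codes. The package Sketch::Logic and its methods are hypothetical and only mimic the behaviour described here; they are not the classes used by Webservice::InterMine.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a constraint or a group of constraints.
# It only carries the text of its logic expression.
package Sketch::Logic;

use overload
    '|' => sub { Sketch::Logic->combine('or',  $_[0], $_[1]) },
    '&' => sub { Sketch::Logic->combine('and', $_[0], $_[1]) };

sub new {
    my ($class, $code, $is_group) = @_;
    return bless { code => $code, is_group => $is_group || 0 }, $class;
}

# Groups are wrapped in parentheses when nested inside another expression.
sub wrapped {
    my ($self) = @_;
    return $self->{is_group} ? "($self->{code})" : $self->{code};
}

sub combine {
    my ($class, $op, $left, $right) = @_;
    return $class->new(join(" $op ", $left->wrapped, $right->wrapped), 1);
}

sub code { return $_[0]{code} }

package main;

my ($con1, $con2, $con3) = map { Sketch::Logic->new($_) } qw/A B C/;
my $logic = ($con1 | $con2) & $con3;
print $logic->code, "\n";    # prints "(A or B) and C"
```

Because the parentheses in the Perl expression decide which overloaded call runs first, the grouping in the resulting string mirrors the grouping you wrote in code.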

Normally the constraint codes are simply a series that increments for each constraint that is added, but if you really want to rely on a specific constraint having a specific code you can call:

$query->add_constraint(
    path  => 'Organism.name',
    op    => '=',
    value => 'Drosophila Melanogaster',
    code  => 'Q',
);

and then you know that this constraint has code 'Q'. Generally, however, you shouldn't need to handle these codes directly, as they will be generated automatically for you.
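To make the interplay between automatic and explicit codes concrete, here is a small sketch of an allocator that hands out incrementing letters and skips any code that has been claimed explicitly. The package Sketch::CodeAllocator is hypothetical, and how the module itself handles clashes between automatic and explicit codes is an assumption here.

```perl
use strict;
use warnings;

# Hypothetical allocator: hands out 'A', 'B', 'C', ... and skips any code
# that has already been claimed explicitly.
package Sketch::CodeAllocator;

sub new {
    my ($class) = @_;
    return bless { used => {}, next_code => 'A' }, $class;
}

sub allocate {
    my ($self, $requested) = @_;

    # An explicit code is honoured as long as nobody else holds it.
    if (defined $requested) {
        die "Code '$requested' is already in use\n" if $self->{used}{$requested};
        $self->{used}{$requested} = 1;
        return $requested;
    }

    # Otherwise take the next free letter in the series.
    $self->{next_code}++ while $self->{used}{ $self->{next_code} };
    my $code = $self->{next_code}++;
    $self->{used}{$code} = 1;
    return $code;
}

package main;

my $codes = Sketch::CodeAllocator->new;
my @given = map { $codes->allocate($_) } undef, 'Q', undef, 'C', undef;
print join(', ', @given), "\n";    # prints "A, Q, B, C, D"
```

In this sketch the automatic series continues past explicitly chosen codes, so an explicit 'C' simply causes the next automatic code to be 'D'.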

CONCLUSION

A query can be made very powerful with a few basic constraints and some simple logic. There are several different kinds of constraints as well, which are detailed in Recipe3.

AUTHOR

Alex Kalderimis <dev@intermine.org>

BUGS

Please report any bugs or feature requests to dev@intermine.org.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Webservice::InterMine

You can also look for information at:

COPYRIGHT AND LICENSE

Copyright 2006 - 2010 FlyMine, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.