NAME
CracTools::Interval::Query - Store and query genomic intervals.
VERSION
version 1.24
SYNOPSIS
use CracTools::Interval::Query;
my $interval_query = CracTools::Interval::Query->new();
$interval_query->addInterval("chr1",1,12,1,"geneA");
$interval_query->addInterval("chr2",5,14,1,"geneB");
# Query methods return an array reference, so dereference it
my @results = @{$interval_query->fetchByRegion("chr1",12,15,1)};
foreach my $gene (@results) {
  print STDERR "Found overlapping gene $gene\n";
}
DESCRIPTION
This module stores and queries genomic intervals associated with values. It is based on the interval tree data structure provided by Set::IntervalTree.
All CracTools::Interval::Query query methods return an array reference containing the scalars associated with the retrieved intervals. They can also return an ArrayRef with the intervals (start,end) themselves; see "_processReturnValues" for more information.
All CracTools::Interval::Query methods can be called without the strand argument (or with undef). In that case, only the forward strand is considered.
This class can easily be customized by overriding the "_processReturnValue" hook method.
SEE ALSO
You may want to check CracTools::Interval::Query::File, an implementation of CracTools::Interval::Query that retrieves intervals directly from standard files (BED, SAM, GTF, GFF) and returns the lines associated with the queried intervals.
METHODS
new
Example : my $intervalQuery = CracTools::Interval::Query->new();
Description : Create a new CracTools::Interval::Query object
ReturnType : CracTools::Interval::Query
Exceptions : none
addInterval
Arg [1] : String - Chromosome
Arg [2] : Integer - Start position
Arg [3] : Integer - End position
Arg [4] : (Optional) Integer - Strand
Arg [5] : Scalar - The value to be held by this interval. It can
be anything: an Integer, a String, a hash
reference, an array reference, ...
Example : $interval_query->addInterval("chr1",12,30,-1,"geneA")
Description : Adds a new genomic interval, with an associated value, to the interval query object.
fetchByRegion
Arg [1] : String - Chromosome
Arg [2] : Integer - Start position
Arg [3] : Integer - End position
Arg [4] : (Optional) Integer - Strand
Arg [5] : (Optional) Boolean - Windowed query, only return intervals which
are completely contained in the queried region.
Example : my @values = @{$interval_query->fetchByRegion('1',298345,309209,1)};
Description : Retrieves the intervals that overlap the region (or, for a windowed query, that are completely contained in it).
ReturnType : ArrayRef of Scalar
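The difference between a default query and a windowed query can be sketched in plain Perl. This is only an illustration: filter_region is a hypothetical helper that scans a list linearly instead of using an interval tree, and it assumes inclusive (closed) start and end coordinates, which this documentation does not state explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of CracTools: filters a list of
# { start, end, value } hashes against a closed query region.
sub filter_region {
    my ($intervals, $q_start, $q_end, $windowed) = @_;
    my @hits;
    foreach my $iv (@{$intervals}) {
        if ($windowed) {
            # Windowed: the interval must lie entirely inside the region.
            next unless $iv->{start} >= $q_start && $iv->{end} <= $q_end;
        } else {
            # Default: any shared position counts as an overlap.
            next unless $iv->{start} <= $q_end && $iv->{end} >= $q_start;
        }
        push @hits, $iv->{value};
    }
    return \@hits;
}

my @intervals = (
    { start => 1,  end => 12, value => "geneA" },
    { start => 13, end => 14, value => "geneB" },
    { start => 20, end => 30, value => "geneC" },
);

my $overlapping = filter_region(\@intervals, 12, 15, 0);  # geneA and geneB
my $contained   = filter_region(\@intervals, 12, 15, 1);  # geneB only
print "overlap: @{$overlapping}\n";
print "windowed: @{$contained}\n";
```

With the region 12-15, geneA (1-12) shares position 12 with the query and is reported by the default query, but it is dropped by the windowed query because it starts before the region.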
fetchByLocation
Arg [1] : String - Chromosome
Arg [2] : Integer - Position
Arg [3] : (Optional) Integer - Strand
Example : my @values = @{$interval_query->fetchByLocation('1',298345,1)};
Description : Retrieves the values of the intervals that overlap the given location.
ReturnType : ArrayRef of Scalar
fetchNearestDown
Arg [1] : String - Chromosome
Arg [2] : Integer - Position
Arg [3] : (Optional) Integer - Strand
Example : my $value = $interval_query->fetchNearestDown('1',298345,1);
Description : Searches for the closest downstream interval that does not contain the query position
and returns the value associated with this interval.
ReturnType : Scalar
fetchNearestUp
Arg [1] : String - Chromosome
Arg [2] : Integer - Position
Arg [3] : (Optional) Integer - Strand
Example : my $value = $interval_query->fetchNearestUp('1',298345,1);
Description : Searches for the closest upstream interval that does not contain the query position
and returns the value associated with this interval.
ReturnType : Scalar
fetchAllNearestDown
Arg [1] : String - Chromosome
Arg [2] : Integer - Position
Arg [3] : (Optional) Integer - Strand
Example : my @values = @{$interval_query->fetchAllNearestDown('1',298345,1)};
Description : Searches for all the closest downstream intervals that do not contain the query position
and returns the values associated with these intervals.
ReturnType : ArrayRef of Scalar
fetchAllNearestUp
Arg [1] : String - Chromosome
Arg [2] : Integer - Position
Arg [3] : (Optional) Integer - Strand
Example : my @values = @{$interval_query->fetchAllNearestUp('1',298345,1)};
Description : Searches for all the closest upstream intervals that do not contain the query position
and returns the values associated with these intervals.
ReturnType : ArrayRef of Scalar
PRIVATE METHODS
_getIntervalTree
Arg [1] : String - Chromosome
Arg [2] : (Optional) Integer - Strand
Description : Returns the Set::IntervalTree reference for the chromosome and strand (default: 1)
ReturnType : Set::IntervalTree
_addIntervalTree
Arg [1] : String - Chromosome
Arg [2] : (Optional) Integer - Strand
Arg [3] : Set::IntervalTree - Interval tree
Description : Adds a Set::IntervalTree object for a specific ("chr","strand") pair.
Strand is set to 1 if none (or undef) is provided.
_getIntervalTreeKey
Arg [1] : String - Chromosome
Arg [2] : (Optional) Integer - Strand
Description : Static method that returns a unique key for the ("chr","strand") pair passed as arguments.
Strand is set to 1 if none (or undef) is provided.
ReturnType : String
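As a rough sketch of how one container per ("chr","strand") pair can be kept behind a single string key, with the strand defaulting to 1: tree_key and its "chr@strand" format are hypothetical, since the real key format is not documented here, and plain arrays stand in for the Set::IntervalTree objects.

```perl
use strict;
use warnings;

# Hypothetical key builder; the real key format is not documented here.
sub tree_key {
    my ($chr, $strand) = @_;
    $strand = 1 unless defined $strand;   # forward strand by default
    return "$chr\@$strand";
}

my %trees;   # one container per (chromosome, strand) pair
push @{ $trees{ tree_key("chr1", 1) } },  "geneA";
push @{ $trees{ tree_key("chr1") } },     "geneB";   # same key as strand 1
push @{ $trees{ tree_key("chr1", -1) } }, "geneC";

foreach my $key (sort keys %trees) {
    print "$key: @{ $trees{$key} }\n";
}
```

Omitting the strand and passing 1 explicitly land in the same container, while the reverse strand gets its own.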
_processReturnValues
Arg [1] : ArrayRef - Values returned by Set::IntervalTree
Example : # Either get only the values held by the retrieved intervals
my @values = @{$interval_query->_processReturnValues($interval_results)};
# Or also get the intervals themselves
my ($intervals,$values) = $interval_query->_processReturnValues($interval_results);
Description : Calls the _processReturnValue() method on each value of the array ref passed as a parameter.
ReturnType : Array(ArrayRef({start => .., end => ..}),ArrayRef(Scalar))
(
[ { start => 12, end => 20 }, ... ],
[ "geneA", ...]
)
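The two call styles in the example above can be reproduced with a small stand-alone sketch. It shows one way a single method can serve both styles in Perl, not necessarily how the module does it: split_results is a hypothetical stand-in, and the shape of the raw results ({ start, end, value } hashes) is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for _processReturnValues; the shape of the raw
# results ({ start, end, value } hashes) is an assumption of this sketch.
sub split_results {
    my ($raw) = @_;
    my (@intervals, @values);
    foreach my $r (@{$raw}) {
        push @intervals, { start => $r->{start}, end => $r->{end} };
        push @values, $r->{value};
    }
    # In list context both array refs are returned; in scalar context the
    # comma operator yields the last element, i.e. the values.
    return (\@intervals, \@values);
}

my $raw = [
    { start => 12, end => 20, value => "geneA" },
    { start => 18, end => 25, value => "geneB" },
];

# Values only: @{ ... } evaluates the call in scalar context.
my @values = @{ split_results($raw) };

# Intervals and values together: list assignment.
my ($intervals, $vals) = split_results($raw);
print "$values[0] spans $intervals->[0]{start}-$intervals->[0]{end}\n";
```

The two returned arrays are parallel: the interval at index i belongs to the value at index i.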
_processReturnValue
Arg [1] : Scalar - Value held by an interval
Description : This method processes the value contained in each interval that
matches a query before returning it. It is designed to be
overridden by daughter classes.
ReturnType : Scalar (ArrayRef,HashRef,String,Integer...)
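The hook pattern described here can be sketched with stand-in classes so that the example runs on its own. My::BaseQuery is NOT the real CracTools::Interval::Query: it only mimics the documented behaviour of passing every stored value through _processReturnValue before returning it. The add and fetchAll methods and both class names are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the parent class (NOT the real CracTools::Interval::Query);
# it only mimics the documented hook so the sketch runs on its own.
package My::BaseQuery;
sub new { my $class = shift; return bless { values => [] }, $class; }
sub add { my ($self, $value) = @_; push @{$self->{values}}, $value; return; }
# Default hook: return the stored value unchanged.
sub _processReturnValue { my ($self, $value) = @_; return $value; }
# Every fetched value goes through the hook before being returned.
sub fetchAll {
    my $self = shift;
    return [ map { $self->_processReturnValue($_) } @{$self->{values}} ];
}

# Hypothetical daughter class: stored values are hash references, but
# callers only want the gene name.
package My::GeneNameQuery;
our @ISA = ('My::BaseQuery');
sub _processReturnValue {
    my ($self, $value) = @_;
    return $value->{name};
}

package main;

my $query = My::GeneNameQuery->new();
$query->add({ name => "geneA", biotype => "protein_coding" });
$query->add({ name => "geneB", biotype => "lincRNA" });
my $names = $query->fetchAll();
print join(",", @{$names}), "\n";
```

Only the hook is redefined in the daughter class; storage and querying are inherited unchanged.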
AUTHORS
Nicolas PHILIPPE <nphilippe.research@gmail.com>
Jérôme AUDOUX <jaudoux@cpan.org>
Sacha BEAUMEUNIER <sacha.beaumeunier@gmail.com>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2016 by IRMB/INSERM (Institute for Regenerative Medicine and Biotherapy / Institut National de la Santé et de la Recherche Médicale) and AxLR/SATT (Languedoc Roussillon / Societe d'Acceleration de Transfert de Technologie).
This is free software, licensed under:
The GNU Affero General Public License, Version 3, November 2007