NAME

CracTools::Interval::Query - Store and query genomic intervals.

VERSION

version 1.24

SYNOPSIS

my $interval_query = CracTools::Interval::Query->new();

$interval_query->addInterval("chr1",1,12,1,"geneA");
$interval_query->addInterval("chr2",5,14,1,"geneB");

my @results = @{$interval_query->fetchByRegion("chr1",12,15,1)};

foreach my $gene (@results) {
  print STDERR "Found overlapping gene $gene\n";
}

DESCRIPTION

This module stores and queries genomic intervals associated with values. It is based on the interval tree data structure provided by Set::IntervalTree.

All CracTools::Interval::Query query methods return an array reference holding the scalars associated with the retrieved intervals. They can also return an ArrayRef of the intervals (start,end) themselves; see "_processReturnValues" for more information.

All CracTools::Interval::Query methods can be used without the strand argument (or with undef). In this case, only the forward strand is considered.

This class can easily be extended by overriding the "_processReturnValue" hook method in a daughter class.

SEE ALSO

You may want to check CracTools::Interval::Query::File, an implementation of CracTools::Interval::Query that retrieves intervals directly from standard files (BED, SAM, GTF, GFF) and returns the lines associated with the queried intervals.

METHODS

new

Example     : my $intervalQuery = CracTools::Interval::Query->new();
Description : Create a new CracTools::Interval::Query object
ReturnType  : CracTools::Interval::Query
Exceptions  : none

addInterval

Arg [1] : String              - Chromosome
Arg [2] : Integer             - Start position
Arg [3] : Integer             - End position
Arg [4] : (Optional) Integer  - Strand
Arg [5] : Scalar              - The value to be held by this interval. It can
                                be anything: an Integer, a String, a hash
                                reference, an array reference, ...

Example     : $interval_query->addInterval("chr1",12,30,-1,"geneA")
Description : Adds a new genomic interval, with an associated value, to the interval query.

fetchByRegion

Arg [1] : String              - Chromosome
Arg [2] : Integer             - Start position
Arg [3] : Integer             - End position
Arg [4] : (Optional) Integer  - Strand
Arg [5] : (Optional) Boolean  - Windowed query, only return intervals which
                                are completely contained in the queried region.

Example     : my @values = @{$interval_query->fetchByRegion('1',298345,309209,1)};
Description : Retrieves the values of the intervals that overlap the region or, for a
              windowed query, that are completely contained in it.
ReturnType  : ArrayRef of scalar
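
The difference between the default query and a windowed query can be sketched in plain Perl. This is only an illustration of the two tests, not the module's implementation: the helper names overlaps and is_contained are hypothetical, and inclusive start and end positions are an assumption.

```perl
use strict;
use warnings;

# Hypothetical helpers; inclusive start/end coordinates are an assumption.

# True if interval [$s,$e] shares at least one position with query [$qs,$qe].
sub overlaps {
  my ($s, $e, $qs, $qe) = @_;
  return $s <= $qe && $e >= $qs;
}

# True if interval [$s,$e] lies entirely inside query [$qs,$qe].
sub is_contained {
  my ($s, $e, $qs, $qe) = @_;
  return $s >= $qs && $e <= $qe;
}

my %intervals = (geneA => [1, 12], geneB => [13, 14], geneC => [20, 30]);
my ($qs, $qe) = (12, 15);

# Default query: every interval overlapping the region.
my @overlapping = sort { $a cmp $b }
                  grep { overlaps(@{ $intervals{$_} }, $qs, $qe) } keys %intervals;

# Windowed query: only intervals completely inside the region.
my @windowed = sort { $a cmp $b }
               grep { is_contained(@{ $intervals{$_} }, $qs, $qe) } keys %intervals;

print "overlap:  @overlapping\n";
print "windowed: @windowed\n";
```

Under these assumptions, geneA and geneB overlap the region 12-15, but only geneB is completely contained in it.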

fetchByLocation

Arg [1] : String              - Chromosome
Arg [2] : Integer             - Position
Arg [3] : (Optional) Integer  - Strand

Example     : my @values = @{$interval_query->fetchByLocation('1',298345,1)};
Description : Retrieves the values of the intervals that overlap the given location.
ReturnType  : ArrayRef of Scalar

fetchNearestDown

Arg [1] : String              - Chromosome
Arg [2] : Integer             - Position
Arg [3] : (Optional) Integer  - Strand

Example     : my $value = $interval_query->fetchNearestDown('1',298345,1);
Description : Searches for the closest downstream interval that does not contain the query
              position and returns the value associated with this interval.
ReturnType  : Scalar

fetchNearestUp

Arg [1] : String             - Chromosome
Arg [2] : Integer            - Position
Arg [3] : (Optional) Integer - Strand

Example     : my $value = $interval_query->fetchNearestUp('1',298345,1);
Description : Searches for the closest upstream interval that does not contain the query
              position and returns the value associated with this interval.
ReturnType  : Scalar

fetchAllNearestDown

Arg [1] : String             - Chromosome
Arg [2] : Integer            - Position
Arg [3] : (Optional) Integer - Strand

Example     : my @values = @{$interval_query->fetchAllNearestDown('1',298345,1)};
Description : Searches for all the closest downstream intervals that do not contain the query
              position and returns the values associated with these intervals.
ReturnType  : ArrayRef of Scalar

fetchAllNearestUp

Arg [1] : String             - Chromosome
Arg [2] : Integer            - Position
Arg [3] : (Optional) Integer - Strand

Example     : my @values = @{$interval_query->fetchAllNearestUp('1',298345,1)};
Description : Searches for all the closest upstream intervals that do not contain the query
              position and returns the values associated with these intervals.
ReturnType  : ArrayRef of Scalar

PRIVATE METHODS

_getIntervalTree

Arg [1] : String             - Chromosome
Arg [2] : (Optional) Integer - Strand

Description : Returns the Set::IntervalTree reference for the given chromosome and strand (default strand: 1)
ReturnType  : Set::IntervalTree

_addIntervalTree

Arg [1] : String             - Chromosome
Arg [2] : (Optional) Integer - Strand
Arg [3] : Set::IntervalTree  - Interval tree

Description : Adds a Set::IntervalTree object for a specific ("chr","strand") pair.
              Strand is set to 1 if none (or undef) is provided

_getIntervalTreeKey

Arg [1] : String             - Chromosome
Arg [2] : (Optional) Integer - Strand

Description : Static method that returns a unique key for the ("chr","strand") pair passed as arguments.
              Strand is set to 1 if none (or undef) is provided
ReturnType  : String
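
The defaulting rule for the strand can be illustrated with a small stand-alone function. This is a sketch under stated assumptions rather than the module's code: the function name interval_tree_key and the "@" separator are hypothetical, and only the "strand defaults to 1" behaviour comes from the description above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the key construction; the "@" separator is an
# assumption made for this sketch, not taken from the module.
sub interval_tree_key {
  my ($chr, $strand) = @_;
  $strand = 1 unless defined $strand;   # missing or undef strand => forward strand
  return join '@', $chr, $strand;
}

my @keys = (
  interval_tree_key("chr1"),          # no strand given
  interval_tree_key("chr1", undef),   # explicit undef
  interval_tree_key("chr1", -1),      # reverse strand
);
print "$_\n" for @keys;
```

The first two calls produce the same key, so intervals stored without a strand end up in the same tree as intervals stored with strand 1.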

_processReturnValues

Arg [1] : ArrayRef - Values returned by Set::IntervalTree

Example     : # Either get only the values held by the retrieved intervals
              my @values = @{$interval_query->_processReturnValues($interval_results)};
              # Or also get the intervals themselves
              my ($intervals,$values) = $interval_query->_processReturnValues($interval_results);
Description : Calls the _processReturnValue() method on each value of the array ref passed as parameter.
ReturnType  : Array(ArrayRef({start => .., end => ..}),ArrayRef(Scalar))
              (
                [ { start => 12, end => 20 }, ... ],
                [ "geneA", ...]
              )

_processReturnValue

Arg [1] : Scalar - Value held by an interval

Description : This method processes the value held by each interval that
              matches a query before returning it. It is designed to be
              overridden by daughter classes.
ReturnType  : Scalar (ArrayRef,HashRef,String,Integer...)
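
The hook pattern described above can be sketched with a self-contained stand-in. Everything in this example is hypothetical: My::MiniQuery only mimics the two hook-related methods so the code runs without CracTools installed, and the {start, end, value} layout of a hit is an assumption. Only the return order (intervals first, then values) follows the ReturnType documented for "_processReturnValues".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the base class, reduced to the two hook-related
# methods so the sketch runs without CracTools installed.
package My::MiniQuery;

sub new { my $class = shift; return bless {}, $class; }

# Default hook: hand the stored value back unchanged.
sub _processReturnValue {
  my ($self, $value) = @_;
  return $value;
}

# Apply the hook to each hit; the {start, end, value} hit layout is assumed.
sub _processReturnValues {
  my ($self, $hits) = @_;
  my (@intervals, @values);
  foreach my $hit (@{$hits}) {
    push @intervals, { start => $hit->{start}, end => $hit->{end} };
    push @values, $self->_processReturnValue($hit->{value});
  }
  return (\@intervals, \@values);
}

# Daughter class: only the hook is overridden.
package My::UpperCaseQuery;
our @ISA = ('My::MiniQuery');

sub _processReturnValue {
  my ($self, $value) = @_;
  return uc $value;
}

package main;

my $hits = [ { start => 12, end => 20, value => "geneA" } ];
my ($intervals, $values) = My::UpperCaseQuery->new->_processReturnValues($hits);
print "$values->[0] at $intervals->[0]{start}-$intervals->[0]{end}\n";
```

With the real module, a daughter class would presumably inherit from CracTools::Interval::Query instead of the stand-in and override only _processReturnValue, as CracTools::Interval::Query::File is described as doing to return file lines.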

AUTHORS

  • Nicolas PHILIPPE <nphilippe.research@gmail.com>

  • Jérôme AUDOUX <jaudoux@cpan.org>

  • Sacha BEAUMEUNIER <sacha.beaumeunier@gmail.com>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2016 by IRMB/INSERM (Institute for Regenerative Medicine and Biotherapy / Institut National de la Santé et de la Recherche Médicale) and AxLR/SATT (Languedoc Roussillon / Societe d'Acceleration de Transfert de Technologie).

This is free software, licensed under:

The GNU Affero General Public License, Version 3, November 2007