LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::DBSQL::BaseFeatureAdaptor - An Abstract Base class for all FeatureAdaptors

SYNOPSIS

Abstract class - should not be instantiated. Implementation of abstract methods must be performed by subclasses.

DESCRIPTION

This is a base adaptor for feature adaptors. This base class is simply a way of eliminating code duplication through the implementation of methods common to all feature adaptors.

METHODS

new

Arg [1]    : list of args @args
             Superclass constructor arguments
Example    : none
Description: Constructor which warns if caching has been switched off
Returntype : Bio::EnsEMBL::DBSQL::BaseFeatureAdaptor
Exceptions : none
Caller     : implementing subclass constructors
Status     : Stable

start_equals_end

Arg [1]    : (optional) boolean $newval
Example    : $bfa->start_equals_end(1);
Description: Getter/Setter for the start_equals_end flag.  If set
             to true, _slice_fetch will use simplified SQL to retrieve 1 bp slices.
Returntype : boolean
Exceptions : none
Caller     : EnsemblGenomes variation DB build
Status     : Stable

clear_cache

Args      : None
Example   : my $sa =
              $registry->get_adaptor( 'Mus musculus', 'Core',
                                      'Slice' );
            my $ga =
              $registry->get_adaptor( 'Mus musculus', 'Core',
                                      'Gene' );

            my $slice =
              $sa->fetch_by_region( 'Chromosome', '1', 1e8,
                                    1.05e8 );

            my $genes = $ga->fetch_all_by_Slice($slice);

            $ga->clear_cache();

Description   : Empties the feature cache associated with this
                feature adaptor.
Return type   : None
Exceptions    : None
Caller        : General
Status        : At risk (under development)

_slice_feature_cache

Description: Returns the feature cache if we are allowed to cache and
             will build it if we need to. We will never return a reference
             to the hash to avoid unintentional auto-vivifying caching
Returntype : Bio::EnsEMBL::Utils::Cache
Exceptions : None
Caller     : Internal

fetch_all_by_Slice

Arg [1]    : Bio::EnsEMBL::Slice $slice
             the slice from which to obtain features
Arg [2]    : (optional) string $logic_name
             the logic name of the type of features to obtain
Example    : $fts = $a->fetch_all_by_Slice($slice, 'Swall');
Description: Returns a listref of features created from the database 
             which are on the Slice defined by $slice. If $logic_name is 
             defined only features with an analysis of type $logic_name 
             will be returned. 
             NOTE: only features that are entirely on the slice's seq_region
             will be returned (i.e. if they hang off the start/end of a
             seq_region they will be discarded). Features can extend over the
             slice boundaries though (in cases where you have a slice that
             doesn't span the whole seq_region).
Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
Exceptions : none
Caller     : Bio::EnsEMBL::Slice
Status     : Stable

fetch_Iterator_by_Slice_method

  Arg [1]    : CODE ref of Slice fetch method
  Arg [2]    : ARRAY ref of parameters for Slice fetch method
  Arg [3]    : Optional int: Slice index in parameters array
  Arg [4]    : Optional int: Slice chunk size. Default=500000
  Example    : my $slice_iter = $feature_adaptor->fetch_Iterator_by_Slice_method
                 ( $feature_adaptor->can('fetch_all_by_Slice_Arrays'),
                   \@fetch_method_params,
                   0,    # Slice idx
                 );

               while ( defined( my $feature = $slice_iter->next ) ) {
                 # Do something here
               }

  Description: Creates an Iterator which chunks the query Slice to facilitate
               large Slice queries which would have previously run out of memory
  Returntype : Bio::EnsEMBL::Utils::Iterator
  Exceptions : Throws if mandatory params not valid
  Caller     : general
  Status     : at risk

fetch_Iterator_by_Slice

Arg [1]    : Bio::EnsEMBL::Slice
Arg [2]    : Optional string: logic name of analysis
Arg [3]    : Optional int: Chunk size to iterate over. Default is 500000
Example    : my $slice_iter = $feature_adaptor->fetch_Iterator_by_Slice($slice);

             while ( defined( my $feature = $slice_iter->next ) ) {
               # Do something here
             }

Description: Creates an Iterator which chunks the query Slice to facilitate
             large Slice queries which would have previously run out of memory
Returntype : Bio::EnsEMBL::Utils::Iterator
Exceptions : None
Caller     : general
Status     : at risk
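
The chunking idea behind the iterator can be illustrated without a database. The following is a minimal sketch, not the module's implementation: chunk_ranges is a hypothetical helper that splits a 1-based inclusive coordinate range into consecutive chunks, using the default chunk size of 500000 quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): split the 1-based
# inclusive range [$start, $end] into chunks of at most $chunk_size bp.
sub chunk_ranges {
    my ( $start, $end, $chunk_size ) = @_;
    $chunk_size ||= 500_000;    # default chunk size quoted in the docs above
    my @chunks;
    for ( my $s = $start; $s <= $end; $s += $chunk_size ) {
        my $e = $s + $chunk_size - 1;
        $e = $end if $e > $end;    # the final chunk may be shorter
        push @chunks, [ $s, $e ];
    }
    return \@chunks;
}

# A 1.2 Mb region yields three chunks:
# 1-500000, 500001-1000000 and 1000001-1200000
printf "%d-%d\n", @$_ for @{ chunk_ranges( 1, 1_200_000 ) };
```

Fetching features one chunk at a time keeps only one chunk's worth of objects in memory, which is the motivation given in the description above.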

fetch_all_by_Slice_and_score

Arg [1]    : Bio::EnsEMBL::Slice $slice
             the slice from which to obtain features
Arg [2]    : (optional) float $score
             lower bound of the score of the features retrieved
Arg [3]    : (optional) string $logic_name
             the logic name of the type of features to obtain
Example    : $fts = $a->fetch_all_by_Slice_and_score($slice,90,'Swall');
Description: Returns a listref of features created from the database which
             are on the Slice defined by $slice and which have a score 
             greater than $score. If $logic_name is defined, 
             only features with an analysis of type $logic_name will be 
             returned. 
Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
Exceptions : none
Caller     : Bio::EnsEMBL::Slice
Status     : Stable

fetch_all_by_Slice_constraint

Arg [1]    : Bio::EnsEMBL::Slice $slice
             the slice from which to obtain features
Arg [2]    : (optional) string $constraint
             An SQL query constraint (i.e. part of the WHERE clause)
Arg [3]    : (optional) string $logic_name
             the logic name of the type of features to obtain
Example    : $fs = $a->fetch_all_by_Slice_constraint($slc, 'perc_ident > 5');
Description: Returns a listref of features created from the database which 
             are on the Slice defined by $slice and fulfill the SQL 
             constraint defined by $constraint. If logic name is defined, 
             only features with an analysis of type $logic_name will be 
             returned. 
Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
Exceptions : thrown if $slice is not defined
Caller     : Bio::EnsEMBL::Slice
Status     : Stable

fetch_all_by_logic_name

Arg [1]    : string $logic_name
             the logic name of the type of features to obtain
Example    : $fs = $a->fetch_all_by_logic_name('foobar');
Description: Returns a listref of features created from the database.
             Only features with an analysis of type $logic_name will
             be returned.  If the logic name is invalid (not in the
             analysis table), a reference to an empty list will be
             returned.
Returntype : listref of Bio::EnsEMBL::SeqFeatures
Exceptions : thrown if no $logic_name
Caller     : General
Status     : Stable

fetch_all_by_stable_id_list

Arg [1]    : listref of stable ID strings
             the stable IDs of the features to obtain
Arg [2]    : (optional) Bio::EnsEMBL::Slice $slice
             the slice to which results are constrained
Example    : $fs = $a->fetch_all_by_stable_id_list(["ENSG00001","ENSG00002", ...]);
Description: Returns a listref of features identified by their stable IDs.
             This method only fetches features of the same type as the calling
             adaptor. 
             Results are constrained to a slice if the slice is provided.
Returntype : listref of Bio::EnsEMBL::Feature
Exceptions : thrown if no stable ID list is provided.
Caller     : General
Status     : Stable

count_by_Slice_constraint

Arg [1]     : Bio::EnsEMBL::Slice
Arg [2]     : String Custom SQL constraint
Arg [3]     : String Logic name to search by
Description : Finds all features with at least partial overlap to the given
              slice and counts them.
              Explanation of workings with projections:
              
              |-------------------------Constraint Slice---------------------------------|
              |             |                                           |                |
              |--Segment 1--|                                           |                |
              |             |--Segment 2, on desired Coordinate System--|                |
              |             |                                           |---Segment 3----|
          #Feature 1#    #Feature 2#                               #Feature 3#           |
              |         #####################Feature 4####################               |
              | #Feature 5# |                                           |                |
              
              Feature 1 overlaps the original constraint slice and is counted in Segment 1.
              Features 2, 3 and 4 are counted when inspecting Segment 2.
              Feature 5 is counted in Segment 1.
              
Returntype  : Integer

_get_and_filter_Slice_projections

Arg [1]     : Bio::EnsEMBL::Slice
Description : Delegates onto SliceAdaptor::fetch_normalized_slice_projection() 
              with filtering on
Returntype  : ArrayRef Bio::EnsEMBL::ProjectionSegment; Returns an array
              of projected segments

_generate_feature_bounds

Arg [1]     : Bio::EnsEMBL::Slice
Description : Performs a projection of Slice and records the bounds
              of that projection. This can be used later on to restrict
              Features which overlap into unwanted areas such as
              regions which exist on another HAP/PAR region.
              
              Bounds are defined as projection_start - slice_start + 1.
Example     : my $bounds = $self->_generate_feature_bounds($slice);
Returntype  : ArrayRef Integer; Returns the location of the bounds.
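
As a worked instance of the bound formula above, the sketch below converts absolute projection starts into slice-relative bounds. The helper name and the input coordinates are made up for illustration; only the formula projection_start - slice_start + 1 comes from the description.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): apply
# projection_start - slice_start + 1 to each projection start.
sub bounds_from_projection_starts {
    my ( $slice_start, @projection_starts ) = @_;
    return [ map { $_ - $slice_start + 1 } @projection_starts ];
}

# A slice starting at 1000 with projections starting at 1000, 1500 and 2250
my $bounds = bounds_from_projection_starts( 1_000, 1_000, 1_500, 2_250 );
print join( ',', @$bounds ), "\n";    # prints 1,501,1251
```

A projection that starts exactly at the slice start therefore has a bound of 1, matching 1-based slice coordinates.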

_get_by_Slice

Arg [0]    : Bio::EnsEMBL::Slice to find all the features within
Arg [1]    : SQL constraint string
Arg [2]    : Type of query to run. Default behaviour is to select, but
             'count' is also valid
Description: Abstracted logic from _slice_fetch
Returntype : Listref of Bio::EnsEMBL::Feature, or integers for counting mode

store

Arg [1]    : list of Bio::EnsEMBL::SeqFeature
Example    : $adaptor->store(@feats);
Description: ABSTRACT  Subclasses are responsible for implementing this 
             method.  It should take a list of features and store them in 
             the database.
Returntype : none
Exceptions : thrown if method is not implemented by subclass
Caller     : general
Status     : At Risk
           : throws if called.

remove

Arg [1]    : A feature $feature 
Example    : $feature_adaptor->remove($feature);
Description: This removes a feature from the database.  The table the
             feature is removed from is defined by the abstract method
             _tablename, and the primary key of the table is assumed
             to be _tablename() . '_id'.  The feature argument must 
             be an object implementing the dbID method, and for the
             feature to be removed from the database a dbID value must
             be returned.
Returntype : none
Exceptions : thrown if $feature arg does not implement dbID(), or if
             $feature->dbID is not a true value
Caller     : general
Status     : Stable
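
The primary key convention described above (_tablename() . '_id') can be made concrete with a small sketch. This only builds a statement string; delete_sql_for is a hypothetical helper and the statement the module actually prepares may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): build a parameterised
# DELETE statement, assuming the primary key column is "<table>_id".
sub delete_sql_for {
    my ($table) = @_;
    return sprintf 'DELETE FROM %s WHERE %s_id = ?', $table, $table;
}

print delete_sql_for('simple_feature'), "\n";
# prints DELETE FROM simple_feature WHERE simple_feature_id = ?
```

The placeholder would be bound to the feature's dbID, which is why remove requires the dbID to be a true value.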

remove_by_Slice

Arg [1]    : Bio::EnsEMBL::Slice $slice
Example    : $feature_adaptor->remove_by_Slice($slice);
Description: This removes features from the database which lie on a region
             represented by the passed in slice.  Only features which are
             fully contained by the slice are deleted; features which overlap
             the edge of the slice are not removed.
             The table the features are removed from is defined by
             the abstract method _tablename.
Returntype : none
Exceptions : thrown if no slice is supplied
Caller     : general
Status     : Stable

fetch_nearest_by_Feature

Arg [1]    : Reference Feature to start the search from
Description: Searches iteratively outward from the starting feature until a nearby Feature is found.
             If you require more than one result or more control of which features are returned, see
             fetch_all_nearest_by_Feature and fetch_all_by_outward_search. fetch_nearest_by_Feature
             is a convenience method.
Returntype : Bio::EnsEMBL::Feature

fetch_all_by_outward_search

Arguments the same as fetch_all_nearest_by_Feature
Arg [0]    : -MAX_RANGE : Set an upper limit on the search range, defaults to 10000 bp 
Arg [1]    : -FEATURE ,Bio::EnsEMBL::Feature : 'Source' Feature to anchor the search for nearest Features
Arg [2]    : -SAME_STRAND, Boolean (optional)  : Respect the strand of the source Feature with ref, only 
             returning Features on the same strand.
Arg [3]    : -OPPOSITE_STRAND, Boolean (optional) : Find features on the opposite strand of the same
Arg [4]    : -DOWNSTREAM/-UPSTREAM, (optional) : Search ONLY downstream or upstream from the source Feature.
             Can be omitted for searches in both directions.
Arg [5]    : -RANGE, Int     : The size of the space to search for Features. Defaults to 1000 as a sensible starting point
Arg [6]    : -NOT_OVERLAPPING, Boolean (optional) : Do not return Features that overlap the source Feature
Arg [7]    : -FIVE_PRIME, Boolean (optional) : Determine range to a Feature by the 5' end, respecting strand
Arg [8]    : -THREE_PRIME, Boolean (optional): Determine range to a Feature by the 3' end, respecting strand
Arg [9]    : -LIMIT, Int     : The maximum number of Features to return, defaulting to one. Equally near features are all returned

Description: Searches for features within the suggested -RANGE, and if it finds none, expands the search area
             until it satisfies -LIMIT or hits -MAX_RANGE. Useful if you don't know how far away the features
             might be, or if dealing with areas of high feature density. In the case of Variation Features, it is
             conceivable that a 2000 bp window might contain very many features, resulting in a slow and bloated
             response, thus the ability to explore outward in smaller sections can be useful.
Returntype : Listref of [$feature,$distance]
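
The expanding search described above can be sketched on plain integer positions. This is an illustration under stated assumptions, not the module's code: positions_within stands in for the database query, both helper names are hypothetical, and doubling the window on each pass is an assumption (the documentation does not say how the range grows). The defaults of 1000 for the range, 10000 for the maximum range and 1 for the limit are taken from the argument list above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the database query: return the positions that
# lie within $range bp of $origin.
sub positions_within {
    my ( $origin, $range, $positions ) = @_;
    return [ grep { abs( $_ - $origin ) <= $range } @$positions ];
}

# Widen the window until at least $limit hits are found or the window has
# reached $max_range. Doubling the window is an assumption of this sketch.
sub outward_search {
    my ( $origin, $positions, %opt ) = @_;
    my $range     = $opt{range}     || 1_000;
    my $max_range = $opt{max_range} || 10_000;
    my $limit     = $opt{limit}     || 1;

    while (1) {
        my $hits = positions_within( $origin, $range, $positions );
        return $hits if @$hits >= $limit || $range >= $max_range;
        $range *= 2;
        $range = $max_range if $range > $max_range;    # never exceed the cap
    }
}

# Windows of 1000 and 2000 bp find nothing; the 4000 bp window finds 53500.
my $hits = outward_search( 50_000, [ 10_000, 53_500, 70_000 ] );
print "@$hits\n";    # prints 53500
```

Starting small and widening only when needed avoids pulling back a very large result set in dense regions, which is the rationale given for Variation Features above.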

fetch_all_nearest_by_Feature

Arg [1]    : -FEATURE ,Bio::EnsEMBL::Feature : 'Source' Feature to anchor the search for nearest Features
Arg [2]    : -SAME_STRAND, Boolean (optional): Respect the strand of the source Feature with ref, only 
                                               returning Features on the same strand
Arg [3]    : -OPPOSITE_STRAND, Boolean (optional) : Find features on the opposite strand of the same
Arg [4]    : -DOWNSTREAM/-UPSTREAM, (optional) : Search ONLY downstream or upstream from the source Feature.
             Can be omitted for searches in both directions.
Arg [5]    : -RANGE, Int     : The size of the space to search for Features. Defaults to 1000 as a sensible starting point
Arg [6]    : -NOT_OVERLAPPING, Boolean (optional) : Do not return Features that overlap the source Feature
Arg [7]    : -FIVE_PRIME, Boolean (optional) : Determine range to a Feature by its 5' end, respecting strand
Arg [8]    : -THREE_PRIME, Boolean (optional): Determine range to a Feature by its 3' end, respecting strand
Arg [9]    : -LIMIT, Int     : The maximum number of Features to return, defaulting to one. Equally near features are all returned
Example    : #To fetch the gene(s) with the nearest 5' end:
             $genes = $gene_adaptor->fetch_all_nearest_by_Feature(-FEATURE => $feat, -FIVE_PRIME => 1);

Description: Gets the nearest Features to a given 'source' Feature. The Features returned and the format of the result
             are non-obvious; please read on.

             When looking beyond the boundaries of the source Feature, the distance is measured from the nearest end
             of the source Feature to the nearest end of the nearby Feature.
             If Features overlap the source Feature, then they are given a distance of zero but ordered by
             their proximity to the centre of the Feature.
             
             Features are found and prioritised within 1000 base pairs unless a -RANGE is given to the method. Any overlap with
             the search region is included, and the results can be restricted to upstream, downstream, forward strand or reverse

             The -FIVE_PRIME and -THREE_PRIME options allow searching for specific ends of nearby features, but still need
             a -DOWNSTREAM/-UPSTREAM value and/or -NOT_OVERLAPPING to fulfil their most common application.


Returntype : Listref containing an Arrayref of Bio::EnsEMBL::Feature objects and the distance
             [ [$feature, $distance] ... ]
Caller     : general

select_nearest

Arg [1]    : Bio::EnsEMBL::Feature, a Feature to find the nearest neighbouring feature to.
Arg [2]    : Listref of Features to be considered for nearness.
Arg [3]    : Integer, limited number of Features to return. Equally near features are all returned in spite of this limit
Arg [4]    : Boolean, prohibit overlaps. Features that overlap the reference Feature are discarded
Arg [5]    : Boolean, use the 5' ends of the nearby features for distance calculation
Arg [6]    : Boolean, use the 3' ends of the nearby features for distance calculation
Example    : $feature_list = $feature_adaptor->select_nearest($ref_feature,\@candidates,$limit,$not_overlapping)
Description: Take a list of possible features, and determine which is nearest. Nearness is a
             tricky concept. Beware of using the distance between Features, as it may not be the number you think
             it should be.
Returntype : listref of Features ordered by proximity
Caller     : BaseFeatureAdaptor->fetch_all_nearest_by_Feature
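
The rule that equally near features are all returned in spite of the limit can be sketched as follows. This is a simplified illustration, not the module's implementation: nearest_with_ties is a hypothetical helper, and it assumes the candidates have already been reduced to [ $feature, $distance ] pairs (plain strings stand in for Feature objects).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API). $pairs is a listref of
# [ $feature, $distance ]. Returns the $limit nearest pairs, plus any pairs
# whose distance ties with the last pair kept.
sub nearest_with_ties {
    my ( $pairs, $limit ) = @_;
    $limit ||= 1;
    my @sorted = sort { $a->[1] <=> $b->[1] } @$pairs;
    return [] unless @sorted;

    # Distance of the last pair that falls inside the limit
    my $last   = $limit > @sorted ? $#sorted : $limit - 1;
    my $cutoff = $sorted[$last][1];

    return [ grep { $_->[1] <= $cutoff } @sorted ];
}

# With a limit of 1, both features at distance 40 are returned.
my $nearest = nearest_with_ties(
    [ [ 'geneA', 120 ], [ 'geneB', 40 ], [ 'geneC', 40 ] ], 1 );
print scalar(@$nearest), " features returned\n";    # prints 2 features returned
```

Callers that need exactly one result therefore have to handle the tie themselves, for example by taking the first element.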

_compute_nearest_end

Arg [1]    : Reference feature start
Arg [2]    : Reference feature mid-point
Arg [3]    : Reference feature end
Arg [4]    : Considered feature start
Arg [5]    : Considered feature mid-point
Arg [6]    : Considered feature end
Example    : $distance = $feature_adaptor->_compute_nearest_end($ref_start,$ref_midpoint,$ref_end,$f_start,$f_midpoint,$f_end)
Description: For a given feature, calculate the smallest legitimate distance to a reference feature
             Calculate by mid-points to accommodate overlaps
Returntype : Integer distance in base pairs
Caller     : BaseFeatureAdaptor->select_nearest()
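
A simplified view of an end-to-end distance is sketched below. It is an assumption-laden illustration rather than the method itself: nearest_end_distance is a hypothetical helper that takes only start and end coordinates, ignores the mid-point arguments listed above, and returns zero for any overlap, as described for overlapping Features under fetch_all_nearest_by_Feature.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): gap in bp between the
# nearest ends of two intervals, or 0 if they overlap.
sub nearest_end_distance {
    my ( $ref_start, $ref_end, $f_start, $f_end ) = @_;
    return $f_start - $ref_end if $f_start > $ref_end;    # candidate lies to the right
    return $ref_start - $f_end if $f_end < $ref_start;    # candidate lies to the left
    return 0;                                             # intervals overlap
}

print nearest_end_distance( 100, 200, 250, 300 ), "\n";    # prints 50
print nearest_end_distance( 100, 200, 10,  60 ),  "\n";    # prints 40
print nearest_end_distance( 100, 200, 150, 400 ), "\n";    # prints 0
```

Because every overlapping candidate gets a distance of zero here, a second measure such as the mid-point separation is needed to rank overlaps against each other.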

_compute_prime_distance

Arg [1]    : Reference feature start
Arg [2]    : Reference feature mid-point
Arg [3]    : Reference feature end
Arg [4]    : Considered feature start
Arg [5]    : Considered feature mid-point
Arg [6]    : Considered feature end
Arg [7]    : Considered feature strand
Example    : ($distance, $weighted_centre_distance) = $feature_adaptor->_compute_prime_distance($ref_start,$ref_midpoint,$ref_end,$f_start,$f_midpoint,$f_end,$f_strand)
Description: Calculate the smallest distance to the 5' end of the considered feature
Returntype : Integer distance in base pairs or a string warning that the result doesn't mean anything.
             Nearest 5' and 3' features shouldn't reside inside the reference Feature
Caller     : BaseFeatureAdaptor->select_nearest()

_compute_midpoint

Arg [1]    : Bio::EnsEMBL::Feature
Example    : $middle = $feature_adaptor->_compute_midpoint($feature);
Description: Calculate the mid-point of a Feature. Used for comparing Features that overlap each other
             and determining a canonical distance between two Features for the majority of use cases.
Returntype : Integer coordinate rounded down.
Caller     : BaseFeatureAdaptor->select_nearest()
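
A rounded-down mid-point can be computed from plain coordinates as in the sketch below. The helper name is hypothetical and it takes a start and an end rather than a Feature object; the arithmetic used by the module itself may be written differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): integer mid-point of
# the inclusive range [$start, $end]. int() truncates, which rounds down
# for the positive coordinates used here.
sub midpoint {
    my ( $start, $end ) = @_;
    return int( ( $start + $end ) / 2 );
}

print midpoint( 100, 201 ), "\n";    # prints 150
print midpoint( 100, 200 ), "\n";    # prints 150
```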