The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.

LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

  Please email comments or questions to the public Ensembl
  developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

  Questions may also be sent to the Ensembl help desk at
  <http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::DBSQL::BaseFeatureAdaptor - An Abstract Base class for all FeatureAdaptors

SYNOPSIS

Abstract class - should not be instantiated. Implementation of abstract methods must be performed by subclasses.

DESCRIPTION

This is a base adaptor for feature adaptors. This base class is simply a way of eliminating code duplication through the implementation of methods common to all feature adaptors.

METHODS

new

  Arg [1]    : list of args @args
               Superclass constructor arguments
  Example    : none
  Description: Constructor which warns if caching has been switched off
  Returntype : Bio::EnsEMBL::BaseFeatureAdaptor
  Exceptions : none
  Caller     : implementing subclass constructors
  Status     : Stable

start_equals_end

  Arg [1]    : (optional) boolean $newval
  Example    : $bfa->start_equals_end(1);
  Description: Getter/Setter for the start_equals_end flag.  If set
               to true sub _slice_fetch will use a simplified sql to retrieve 1bp slices.
  Returntype : boolean
  Exceptions : none
  Caller     : EnsemblGenomes variation DB build
  Status     : Stable
  

clear_cache

  Args      : None
  Example   : my $sa =
                $registry->get_adaptor( 'Mus musculus', 'Core',
                                        'Slice' );
              my $ga =
                $registry->get_adaptor( 'Mus musculus', 'Core',
                                        'Gene' );

              my $slice =
                $sa->fetch_by_region( 'Chromosome', '1', 1e8,
                                      1.05e8 );

              my $genes = $ga->fetch_all_by_Slice($slice);

              $ga->clear_cache();

  Description   : Empties the feature cache associated with this
                  feature adaptor.
  Return type   : None
  Exceptions    : None
  Caller        : General
  Status        : At risk (under development)

_slice_feature_cache

  Description   : Returns the feature cache if we are allowed to cache and
                will build it if we need to. We will never return a reference
                to the hash to avoid unintentional auto-vivfying caching
  Returntype    : Bio::EnsEMBL::Utils::Cache
  Exceptions    : None
  Caller        : Internal

fetch_all_by_Slice

  Arg [1]    : Bio::EnsEMBL::Slice $slice
               the slice from which to obtain features
  Arg [2]    : (optional) string $logic_name
               the logic name of the type of features to obtain
  Example    : $fts = $a->fetch_all_by_Slice($slice, 'Swall');
  Description: Returns a listref of features created from the database 
               which are on the Slice defined by $slice. If $logic_name is 
               defined only features with an analysis of type $logic_name 
               will be returned. 
               NOTE: only features that are entirely on the slice's seq_region
               will be returned (i.e. if they hang off the start/end of a
               seq_region they will be discarded). Features can extend over the
               slice boundaries though (in cases where you have a slice that
               doesn't span the whole seq_region).
  Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
  Exceptions : none
  Caller     : Bio::EnsEMBL::Slice
  Status     : Stable

fetch_Iterator_by_Slice_method

  Arg [1]    : CODE ref of Slice fetch method
  Arg [2]    : ARRAY ref of parameters for Slice fetch method
  Arg [3]    : Optional int: Slice index in parameters array
  Arg [4]    : Optional int: Slice chunk size. Default=500000
  Example    : my $slice_iter = $feature_adaptor->fetch_Iterator_by_Slice_method
                                      ($feature_adaptor->can('fetch_all_by_Slice_Arrays'),
                                           \@fetch_method_params,
                                           0,#Slice idx
                                          );

               while(my $feature = $slice_iter->next && defined $feature){
                 #Do something here
               }

  Description: Creates an Iterator which chunks the query Slice to facilitate
               large Slice queries which would have previously run out of memory
  Returntype : Bio::EnsEMBL::Utils::Iterator
  Exceptions : Throws if mandatory params not valid
  Caller     : general
  Status     : at risk

fetch_Iterator_by_Slice

  Arg [1]    : Bio::EnsEMBL::Slice
  Arg [2]    : Optional string: logic name of analysis
  Arg [3]    : Optional int: Chunk size to iterate over. Default is 500000
  Example    : my $slice_iter = $feature_adaptor->fetch_Iterator_by_Slice($slice);

               while(my $feature = $slice_iter->next && defined $feature){
                 #Do something here
               }

  Description: Creates an Iterator which chunks the query Slice to facilitate
               large Slice queries which would have previously run out of memory
  Returntype : Bio::EnsEMBL::Utils::Iterator
  Exceptions : None
  Caller     : general
  Status     : at risk

fetch_all_by_Slice_and_score

  Arg [1]    : Bio::EnsEMBL::Slice $slice
               the slice from which to obtain features
  Arg [2]    : (optional) float $score
               lower bound of the the score of the features retrieved
  Arg [3]    : (optional) string $logic_name
               the logic name of the type of features to obtain
  Example    : $fts = $a->fetch_all_by_Slice_and_score($slice,90,'Swall');
  Description: Returns a list of features created from the database which are 
               are on the Slice defined by $slice and which have a score 
               greater than $score. If $logic_name is defined, 
               only features with an analysis of type $logic_name will be 
               returned. 
  Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
  Exceptions : none
  Caller     : Bio::EnsEMBL::Slice
  Status     : Stable

fetch_all_by_Slice_constraint

  Arg [1]    : Bio::EnsEMBL::Slice $slice
               the slice from which to obtain features
  Arg [2]    : (optional) string $constraint
               An SQL query constraint (i.e. part of the WHERE clause)
  Arg [3]    : (optional) string $logic_name
               the logic name of the type of features to obtain
  Example    : $fs = $a->fetch_all_by_Slice_constraint($slc, 'perc_ident > 5');
  Description: Returns a listref of features created from the database which 
               are on the Slice defined by $slice and fulfill the SQL 
               constraint defined by $constraint. If logic name is defined, 
               only features with an analysis of type $logic_name will be 
               returned. 
  Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
  Exceptions : thrown if $slice is not defined
  Caller     : Bio::EnsEMBL::Slice
  Status     : Stable

fetch_all_by_logic_name

  Arg [1]    : string $logic_name
               the logic name of the type of features to obtain
  Example    : $fs = $a->fetch_all_by_logic_name('foobar');
  Description: Returns a listref of features created from the database.
               only features with an analysis of type $logic_name will
               be returned.  If the logic name is invalid (not in the
               analysis table), a reference to an empty list will be
               returned.
  Returntype : listref of Bio::EnsEMBL::SeqFeatures
  Exceptions : thrown if no $logic_name
  Caller     : General
  Status     : Stable

fetch_all_by_stable_id_list

  Arg [1]    : string $logic_name
               the logic name of the type of features to obtain
  Arg [2]    : Bio::EnsEMBL::Slice $slice
               the slice from which to obtain features
  Example    : $fs = $a->fetch_all_by_stable_id_list(["ENSG00001","ENSG00002", ...]);
  Description: Returns a listref of features identified by their stable IDs.
               This method only fetches features of the same type as the calling
               adaptor. 
               Results are constrained to a slice if the slice is provided.
  Returntype : listref of Bio::EnsEMBL::Feature
  Exceptions : thrown if no stable ID list is provided.
  Caller     : General
  Status     : Stable

count_by_Slice_constraint

    Arg [1]     : Bio::EnsEMBL::Slice
    Arg [2]     : String Custom SQL constraint
    Arg [3]     : String Logic name to search by
    Description : Finds all features with at least partial overlap to the given
                  slice and sums them up.
                  Explanation of workings with projections:
                  
                  |-------------------------Constraint Slice---------------------------------|
                  |             |                                           |                |
                  |--Segment 1--|                                           |                |
                  |             |--Segment 2, on desired Coordinate System--|                |
                  |             |                                           |---Segment 3----|
              #Feature 1#    #Feature 2#                               #Feature 3#           |
                  |         #####################Feature 4####################               |
                  | #Feature 5# |                                           |                |
                  
                  Feature 1 is overlapping the original constraint. Counted in Segment 1
                  Feature 2,3 and 4  are counted when inspecting Segment 2
                  Feature 5 is counted in Segment 1
                  
    Returntype  : Integer

_get_and_filter_Slice_projections

    Arg [1]     : Bio::EnsEMBL::Slice
    Description : Delegates onto SliceAdaptor::fetch_normalized_slice_projection() 
                  with filtering on
    Returntype  : ArrayRef Bio::EnsEMBL::ProjectionSegment; Returns an array
                  of projected segments

_generate_feature_bounds

    Arg [1]     : Bio::EnsEMBL::Slice
    Description : Performs a projection of Slice and records the bounds
                  of that projection. This can be used later on to restrict
                  Features which overlap into unwanted areas such as
                  regions which exist on another HAP/PAR region.
                  
                  Bounds are defined as projection_start - slice_start + 1.
    Example     : my $bounds = $self->_generate_feature_bounds($slice);
    Returntype  : ArrayRef Integer; Returns the location of the bounds.

_get_by_Slice Arg [0] : Bio::EnsEMBL::Slice to find all the features within Arg [1] : SQL constraint string Arg [2] : Type of query to run. Default behaviour is to select, but 'count' is also valid Description: Abstracted logic from _slice_fetch Returntype : Listref of Bio::EnsEMBL::Feature, or integers for counting mode

store

  Arg [1]    : list of Bio::EnsEMBL::SeqFeature
  Example    : $adaptor->store(@feats);
  Description: ABSTRACT  Subclasses are responsible for implementing this 
               method.  It should take a list of features and store them in 
               the database.
  Returntype : none
  Exceptions : thrown method is not implemented by subclass
  Caller     : general
  Status     : At Risk
             : throws if called.

remove

  Arg [1]    : A feature $feature 
  Example    : $feature_adaptor->remove($feature);
  Description: This removes a feature from the database.  The table the
               feature is removed from is defined by the abstract method
               _tablename, and the primary key of the table is assumed
               to be _tablename() . '_id'.  The feature argument must 
               be an object implementing the dbID method, and for the
               feature to be removed from the database a dbID value must
               be returned.
  Returntype : none
  Exceptions : thrown if $feature arg does not implement dbID(), or if
               $feature->dbID is not a true value
  Caller     : general
  Status     : Stable

remove_by_Slice

  Arg [1]    : Bio::Ensembl::Slice $slice
  Example    : $feature_adaptor->remove_by_Slice($slice);
  Description: This removes features from the database which lie on a region
               represented by the passed in slice.  Only features which are
               fully contained by the slice are deleted; features which overlap
               the edge of the slice are not removed.
               The table the features are removed from is defined by
               the abstract method_tablename.
  Returntype : none
  Exceptions : thrown if no slice is supplied
  Caller     : general
  Status     : Stable

fetch_nearest_by_Feature

  Arg [1]    : Reference Feature to start the search from
  Description: Searches iteratively outward from the starting feature until a nearby Feature is found
               If you require more than one result or more control of which features are returned, see
               fetch_all_nearest_by_Feature and fetch_all_by_outward_search. fetch_nearest_by_Feature
               is a convenience method.
  ReturnType : Bio::EnsEMBL::Feature
  Arguments the same as fetch_all_nearest_by_Feature
  Arg [0]    : -MAX_RANGE : Set an upper limit on the search range, defaults to 10000 bp 
  Arg [1]    : -FEATURE ,Bio::EnsEMBL::Feature : 'Source' Feature to anchor the search for nearest Features
  Arg [2]    : -SAME_STRAND, Boolean (optional)  : Respect the strand of the source Feature with ref, only 
               returning Features on the same strand.
  Arg [3]    : -OPPOSITE_STRAND, Boolean (optional) : Find features on the opposite strand of the same
  Arg [4]    : -DOWNSTREAM/-UPSTREAM, (optional) : Search ONLY downstream or upstream from the source Feature.
               Can be omitted for searches in both directions.
  Arg [5]    : -RANGE, Int     : The size of the space to search for Features. Defaults to 1000 as a sensible starting point
  Arg [6]    : -NOT_OVERLAPPING, Boolean (optional) : Do not return Features that overlap the source Feature
  Arg [7]    : -FIVE_PRIME, Boolean (optional) : Determine range to a Feature by the 5' end, respecting strand
  Arg [8]    : -THREE_PRIME, Boolean (optional): Determine range to a Feature by the 3' end, respecting strand
  Arg [9]    : -LIMIT, Int     : The maximum number of Features to return, defaulting to one. Equally near features are all returned

  Description: Searches for features within the suggested -RANGE, and if it finds none, expands the search area
               until it satisfies -LIMIT or hits -MAX_RANGE. Useful if you don't know how far away the features
               might be, or if dealing with areas of high feature density. In the case of Variation Features, it is
               conceivable that a 2000 bp window might contain very many features, resulting in a slow and bloated
               response, thus the ability to explore outward in smaller sections can be useful.
  Returntype : Listref of [$feature,$distance]

fetch_all_nearest_by_Feature

  Arg [1]    : -FEATURE ,Bio::EnsEMBL::Feature : 'Source' Feature to anchor the search for nearest Features
  Arg [2]    : -SAME_STRAND, Boolean (optional): Respect the strand of the source Feature with ref, only 
                                                 returning Features on the same strand
  Arg [3]    : -OPPOSITE_STRAND, Boolean (optional) : Find features on the opposite strand of the same
  Arg [4]    : -DOWNSTREAM/-UPSTREAM, (optional) : Search ONLY downstream or upstream from the source Feature.
               Can be omitted for searches in both directions.
  Arg [5]    : -RANGE, Int     : The size of the space to search for Features. Defaults to 1000 as a sensible starting point
  Arg [6]    : -NOT_OVERLAPPING, Boolean (optional) : Do not return Features that overlap the source Feature
  Arg [7]    : -FIVE_PRIME, Boolean (optional) : Determine range to a Feature by its 5' end, respecting strand
  Arg [8]    : -THREE_PRIME, Boolean (optional): Determine range to a Feature by its 3' end, respecting strand
  Arg [9]    : -LIMIT, Int     : The maximum number of Features to return, defaulting to one. Equally near features are all returned
  Example    : #To fetch the gene(s) with the nearest 5' end:
               $genes = $gene_adaptor->fetch_all_nearest_by_Feature(-FEATURE => $feat, -FIVE_PRIME => 1);

  Description: Gets the nearest Features to a given 'source' Feature. The Feature returned and the format of the result
               are non-obvious, please read on.

               When looking beyond the boundaries of the source Feature, the distance is measured to the nearest end 
               of that Feature to the nearby Feature's nearest end.
               If Features overlap the source Feature, then they are given a distance of zero but ordered by
               their proximity to the centre of the Feature.
               
               Features are found and prioritised within 1000 base pairs unless a -RANGE is given to the method. Any overlap with
               the search region is included, and the results can be restricted to upstream, downstream, forward strand or reverse

               The -FIVE_PRIME and -THREE_PRIME options allow searching for specific ends of nearby features, but still needs
               a -DOWN/UPSTREAM value and/or -NOT_OVERLAPPING to fulfil its most common application.


  Returntype : Listref containing an Arrayref of Bio::EnsEMBL::Feature objects and the distance
               [ [$feature, $distance] ... ]
  Caller     : general

select_nearest

  Arg [1]    : Bio::Ensembl::Feature, a Feature to find the nearest neighbouring feature to.
  Arg [2]    : Listref of Features to be considered for nearness.
  Arg [3]    : Integer, limited number of Features to return. Equally near features are all returned in spite of this limit
  Arg [4]    : Boolean, Overlapping prohibition. Overlapped Features are forgotten
  Arg [5]    : Boolean, use the 5' ends of the nearby features for distance calculation
  Arg [6]    : Boolean, use the 3' ends of the nearby features for distance calculation
  Example    : $feature_list = $feature_adaptor->select_nearest($ref_feature,\@candidates,$limit,$not_overlapping)
  Description: Take a list of possible features, and determine which is nearest. Nearness is a
               tricky concept. Beware of using the distance between Features, as it may not be the number you think
               it should be.
  Returntype : listref of Features ordered by proximity
  Caller     : BaseFeatureAdaptor->fetch_all_nearest_by_Feature

_compute_nearest_end

  Arg [1]    : Reference feature start
  Arg [2]    : Reference feature mid-point
  Arg [3]    : Reference feature end
  Arg [4]    : Considered feature start
  Arg [5]    : Considered feature mid-point
  Arg [6]    : Considered feature end
  Example    : $distance = $feature_adaptor->_compute_nearest_end($ref_start,$ref_midpoint,$ref_end,$f_start,$f_midpoint,$f_end)
  Description: For a given feature, calculate the smallest legitimate distance to a reference feature
               Calculate by mid-points to accommodate overlaps
  Returntype : Integer distance in base pairs
  Caller     : BaseFeatureAdaptor->select_nearest()

_compute_prime_distance

  Arg [1]    : Reference feature start
  Arg [2]    : Reference feature mid-point
  Arg [3]    : Reference feature end
  Arg [4]    : Considered feature start
  Arg [5]    : Considered feature mid-point
  Arg [6]    : Considered feature end
  Arg [7]    : Considered feature strand
  Example    : $distance,$weighted_centre_distance = $feature_adaptor->_compute_prime_distance($ref_start,$ref_midpoint,$ref_end,$f_start,$f_midpoint,$f_end,$f_strand)
  Description: Calculate the smallest distance to the 5' end of the considered feature
  Returntype : Integer distance in base pairs or a string warning that the result doesn't mean anything.
               Nearest 5' and 3' features shouldn't reside inside the reference Feature
  Caller     : BaseFeatureAdaptor->select_nearest()

_compute_midpoint

  Arg [1]    : Bio::EnsEMBL::Feature
  Example    : $middle = $feature_adaptor->_compute_midpoint($feature);
  Description: Calculate the mid-point of a Feature. Used for comparing Features that overlap each other
               and determining a canonical distance between two Features for the majority of use cases.
  Returntype : Integer coordinate rounded down.
  Caller     : BaseFeatureAdaptor->select_nearest()