LICENSE
Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
CONTACT
Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.
Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.
NAME
Bio::EnsEMBL::DBSQL::BaseFeatureAdaptor - An Abstract Base class for all FeatureAdaptors
SYNOPSIS
Abstract class - should not be instantiated. Implementation of abstract methods must be performed by subclasses.
DESCRIPTION
This is a base adaptor for feature adaptors. This base class is simply a way of eliminating code duplication through the implementation of methods common to all feature adaptors.
METHODS
new
Arg [1] : list of args @args
Superclass constructor arguments
Example : none
Description: Constructor which warns if caching has been switched off
Returntype : Bio::EnsEMBL::DBSQL::BaseFeatureAdaptor
Exceptions : none
Caller : implementing subclass constructors
Status : Stable
start_equals_end
Arg [1] : (optional) boolean $newval
Example : $bfa->start_equals_end(1);
Description: Getter/Setter for the start_equals_end flag. If set
to true, _slice_fetch uses simplified SQL to retrieve 1bp slices.
Returntype : boolean
Exceptions : none
Caller : EnsemblGenomes variation DB build
Status : Stable
clear_cache
Args : None
Example : my $sa =
$registry->get_adaptor( 'Mus musculus', 'Core',
'Slice' );
my $ga =
$registry->get_adaptor( 'Mus musculus', 'Core',
'Gene' );
my $slice =
$sa->fetch_by_region( 'Chromosome', '1', 1e8,
1.05e8 );
my $genes = $ga->fetch_all_by_Slice($slice);
$ga->clear_cache();
Description : Empties the feature cache associated with this
feature adaptor.
Return type : None
Exceptions : None
Caller : General
Status : At risk (under development)
_slice_feature_cache
Description : Returns the feature cache if we are allowed to cache and
will build it if we need to. We will never return a reference
to the hash, to avoid unintentional auto-vivifying caching
Returntype : Bio::EnsEMBL::Utils::Cache
Exceptions : None
Caller : Internal
fetch_all_by_Slice
Arg [1] : Bio::EnsEMBL::Slice $slice
the slice from which to obtain features
Arg [2] : (optional) string $logic_name
the logic name of the type of features to obtain
Example : $fts = $a->fetch_all_by_Slice($slice, 'Swall');
Description: Returns a listref of features created from the database
which are on the Slice defined by $slice. If $logic_name is
defined only features with an analysis of type $logic_name
will be returned.
NOTE: only features that are entirely on the slice's seq_region
will be returned (i.e. if they hang off the start/end of a
seq_region they will be discarded). Features can extend over the
slice boundaries though (in cases where you have a slice that
doesn't span the whole seq_region).
Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
Exceptions : none
Caller : Bio::EnsEMBL::Slice
Status : Stable
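The relationship between seq_region coordinates and Slice coordinates mentioned above can be sketched as follows. This is only an illustration, assuming a forward-strand slice; to_slice_coords is a hypothetical helper and not part of the adaptor API.

```perl
use strict;
use warnings;

# Hypothetical helper: convert seq_region coordinates to coordinates
# relative to a forward-strand slice that starts at $slice_start.
sub to_slice_coords {
    my ( $slice_start, $feat_start, $feat_end ) = @_;
    return ( $feat_start - $slice_start + 1, $feat_end - $slice_start + 1 );
}

# A slice covering 1001..2000 and a feature at 990..1010 on the seq_region:
# the feature extends over the slice start, so its slice start is negative.
my ( $s, $e ) = to_slice_coords( 1001, 990, 1010 );
print "$s..$e\n";    # prints "-10..10"
```

A feature like this one can still be returned, because it lies entirely on the seq_region even though it extends over the slice boundary.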
fetch_Iterator_by_Slice_method
Arg [1] : CODE ref of Slice fetch method
Arg [2] : ARRAY ref of parameters for Slice fetch method
Arg [3] : Optional int: Slice index in parameters array
Arg [4] : Optional int: Slice chunk size. Default=500000
Example : my $slice_iter = $feature_adaptor->fetch_Iterator_by_Slice_method
($feature_adaptor->can('fetch_all_by_Slice_Arrays'),
\@fetch_method_params,
0,#Slice idx
);
while ( defined( my $feature = $slice_iter->next ) ) {
#Do something here
}
Description: Creates an Iterator which chunks the query Slice to facilitate
large Slice queries which would have previously run out of memory
Returntype : Bio::EnsEMBL::Utils::Iterator
Exceptions : Throws if mandatory params not valid
Caller : general
Status : at risk
fetch_Iterator_by_Slice
Arg [1] : Bio::EnsEMBL::Slice
Arg [2] : Optional string: logic name of analysis
Arg [3] : Optional int: Chunk size to iterate over. Default is 500000
Example : my $slice_iter = $feature_adaptor->fetch_Iterator_by_Slice($slice);
while ( defined( my $feature = $slice_iter->next ) ) {
#Do something here
}
Description: Creates an Iterator which chunks the query Slice to facilitate
large Slice queries which would have previously run out of memory
Returntype : Bio::EnsEMBL::Utils::Iterator
Exceptions : None
Caller : general
Status : at risk
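The chunking idea behind both iterator methods can be sketched as below. This is a minimal illustration of splitting a region into windows, assuming inclusive coordinates; chunk_ranges is a hypothetical helper, and the sketch does not cover how features spanning a window boundary are handled.

```perl
use strict;
use warnings;

# Hypothetical helper: split the inclusive range $start..$end into
# consecutive windows of at most $chunk_size bases.
sub chunk_ranges {
    my ( $start, $end, $chunk_size ) = @_;
    $chunk_size ||= 500_000;    # default chunk size quoted in the Arg lists
    my @chunks;
    for ( my $s = $start ; $s <= $end ; $s += $chunk_size ) {
        my $e = $s + $chunk_size - 1;
        $e = $end if $e > $end;    # clip the final window
        push @chunks, [ $s, $e ];
    }
    return \@chunks;
}

# A 1.2 Mb region is covered by three windows.
for my $chunk ( @{ chunk_ranges( 1, 1_200_000 ) } ) {
    printf "%d-%d\n", @{$chunk};
}
```

Fetching one window at a time keeps only a single window's features in memory, which is the point of iterating instead of fetching the whole Slice at once.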
fetch_all_by_Slice_and_score
Arg [1] : Bio::EnsEMBL::Slice $slice
the slice from which to obtain features
Arg [2] : (optional) float $score
lower bound of the score of the features retrieved
Arg [3] : (optional) string $logic_name
the logic name of the type of features to obtain
Example : $fts = $a->fetch_all_by_Slice_and_score($slice,90,'Swall');
Description: Returns a listref of features created from the database which
are on the Slice defined by $slice and which have a score
greater than $score. If $logic_name is defined,
only features with an analysis of type $logic_name will be
returned.
Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
Exceptions : none
Caller : Bio::EnsEMBL::Slice
Status : Stable
fetch_all_by_Slice_constraint
Arg [1] : Bio::EnsEMBL::Slice $slice
the slice from which to obtain features
Arg [2] : (optional) string $constraint
An SQL query constraint (i.e. part of the WHERE clause)
Arg [3] : (optional) string $logic_name
the logic name of the type of features to obtain
Example : $fs = $a->fetch_all_by_Slice_constraint($slc, 'perc_ident > 5');
Description: Returns a listref of features created from the database which
are on the Slice defined by $slice and fulfill the SQL
constraint defined by $constraint. If logic name is defined,
only features with an analysis of type $logic_name will be
returned.
Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
Exceptions : thrown if $slice is not defined
Caller : Bio::EnsEMBL::Slice
Status : Stable
fetch_all_by_logic_name
Arg [1] : string $logic_name
the logic name of the type of features to obtain
Example : $fs = $a->fetch_all_by_logic_name('foobar');
Description: Returns a listref of features created from the database.
Only features with an analysis of type $logic_name will
be returned. If the logic name is invalid (not in the
analysis table), a reference to an empty list will be
returned.
Returntype : listref of Bio::EnsEMBL::SeqFeatures
Exceptions : thrown if no $logic_name
Caller : General
Status : Stable
fetch_all_by_stable_id_list
Arg [1] : listref of strings
the stable IDs of the features to obtain
Arg [2] : (optional) Bio::EnsEMBL::Slice $slice
the slice from which to obtain features
Example : $fs = $a->fetch_all_by_stable_id_list(["ENSG00001","ENSG00002", ...]);
Description: Returns a listref of features identified by their stable IDs.
This method only fetches features of the same type as the calling
adaptor.
Results are constrained to a slice if the slice is provided.
Returntype : listref of Bio::EnsEMBL::Feature
Exceptions : thrown if no stable ID list is provided.
Caller : General
Status : Stable
count_by_Slice_constraint
Arg [1] : Bio::EnsEMBL::Slice
Arg [2] : String Custom SQL constraint
Arg [3] : String Logic name to search by
Description : Counts all features that overlap the given slice at least
partially.
Explanation of workings with projections:
    |-------------------------Constraint Slice---------------------------------|
    |             |                                           |                |
    |--Segment 1--|                                           |                |
    |             |--Segment 2, on desired Coordinate System--|                |
    |             |                                           |---Segment 3----|
#Feature 1#    #Feature 2#                               #Feature 3#           |
    |         #####################Feature 4####################               |
    | #Feature 5# |                                           |                |
Feature 1 is overlapping the original constraint. Counted in Segment 1
Feature 2,3 and 4 are counted when inspecting Segment 2
Feature 5 is counted in Segment 1
Returntype : Integer
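The "at least partial overlap" test used when counting can be sketched as below. This is a simplified illustration on plain coordinate pairs, assuming inclusive coordinates; count_overlapping is a hypothetical helper, and the sketch ignores the per-segment projection handling described above.

```perl
use strict;
use warnings;

# Hypothetical helper: count [start, end] features that overlap the
# inclusive region $region_start..$region_end by at least one base.
sub count_overlapping {
    my ( $region_start, $region_end, $features ) = @_;
    my $count = grep {
        $_->[0] <= $region_end && $_->[1] >= $region_start
    } @{$features};
    return $count;
}

# Three of these four features touch the region 1..100.
my @features = ( [ -5, 10 ], [ 50, 60 ], [ 95, 120 ], [ 200, 300 ] );
print count_overlapping( 1, 100, \@features ), "\n";    # prints 3
```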
_get_and_filter_Slice_projections
Arg [1] : Bio::EnsEMBL::Slice
Description : Delegates to SliceAdaptor::fetch_normalized_slice_projection()
with filtering switched on
Returntype : ArrayRef Bio::EnsEMBL::ProjectionSegment; Returns an array
of projected segments
_generate_feature_bounds
Arg [1] : Bio::EnsEMBL::Slice
Description : Performs a projection of Slice and records the bounds
of that projection. This can be used later on to restrict
Features which overlap into unwanted areas such as
regions which exist on another HAP/PAR region.
Bounds are defined as projection_start - slice_start + 1.
Example : my $bounds = $self->_generate_feature_bounds($slice);
Returntype : ArrayRef Integer; Returns the location of the bounds.
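The bound formula above (projection_start - slice_start + 1) can be worked through with a small sketch. It assumes the projection starts are given in the same coordinate system as the slice start; feature_bounds is a hypothetical helper for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: given the slice start and the start of each
# projection segment, return each bound in slice coordinates.
sub feature_bounds {
    my ( $slice_start, @projection_starts ) = @_;
    return [ map { $_ - $slice_start + 1 } @projection_starts ];
}

# A slice starting at 1001 whose projection segments start at 1001 and 1501.
my $bounds = feature_bounds( 1001, 1001, 1501 );
print "@{$bounds}\n";    # prints "1 501"
```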
_get_by_Slice
Arg [0] : Bio::EnsEMBL::Slice to find all the features within
Arg [1] : SQL constraint string
Arg [2] : Type of query to run. Default behaviour is to select, but
'count' is also valid
Description: Abstracted logic from _slice_fetch
Returntype : Listref of Bio::EnsEMBL::Feature, or integers for counting mode
store
Arg [1] : list of Bio::EnsEMBL::SeqFeature
Example : $adaptor->store(@feats);
Description: ABSTRACT Subclasses are responsible for implementing this
method. It should take a list of features and store them in
the database.
Returntype : none
Exceptions : thrown if the method is not implemented by the subclass
Caller : general
Status : At Risk
: throws if called.
remove
Arg [1] : A feature $feature
Example : $feature_adaptor->remove($feature);
Description: This removes a feature from the database. The table the
feature is removed from is defined by the abstract method
_tablename, and the primary key of the table is assumed
to be _tablename() . '_id'. The feature argument must
be an object implementing the dbID method, and for the
feature to be removed from the database a dbID value must
be returned.
Returntype : none
Exceptions : thrown if $feature arg does not implement dbID(), or if
$feature->dbID is not a true value
Caller : general
Status : Stable
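The primary key convention described above can be illustrated with a short sketch. It only shows how a statement could be assembled from a table name, assuming a placeholder for the dbID; delete_statement is a hypothetical helper and the table name is just an example.

```perl
use strict;
use warnings;

# Hypothetical helper: build the DELETE statement implied by the
# "<table>_id" primary key convention, with a placeholder for the dbID.
sub delete_statement {
    my ($table) = @_;
    return "DELETE FROM $table WHERE ${table}_id = ?";
}

print delete_statement('simple_feature'), "\n";
# prints "DELETE FROM simple_feature WHERE simple_feature_id = ?"
```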
remove_by_Slice
Arg [1] : Bio::EnsEMBL::Slice $slice
Example : $feature_adaptor->remove_by_Slice($slice);
Description: This removes features from the database which lie on a region
represented by the passed in slice. Only features which are
fully contained by the slice are deleted; features which overlap
the edge of the slice are not removed.
The table the features are removed from is defined by
the abstract method _tablename.
Returntype : none
Exceptions : thrown if no slice is supplied
Caller : general
Status : Stable
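The "fully contained" rule can be sketched on plain coordinate pairs. This is an illustration only, assuming inclusive coordinates; fully_contained is a hypothetical helper and says nothing about how the adaptor issues the deletion.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the [start, end] features that lie
# entirely within the inclusive region $region_start..$region_end.
sub fully_contained {
    my ( $region_start, $region_end, $features ) = @_;
    my @inside = grep {
        $_->[0] >= $region_start && $_->[1] <= $region_end
    } @{$features};
    return \@inside;
}

# Features at 90..110 and 190..210 overlap an edge of 100..200 and survive.
my @features = ( [ 90, 110 ], [ 120, 180 ], [ 190, 210 ] );
my $to_remove = fully_contained( 100, 200, \@features );
print scalar( @{$to_remove} ), "\n";    # prints 1: only 120..180 qualifies
```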
fetch_nearest_by_Feature
Arg [1] : Reference Feature to start the search from
Description: Searches iteratively outward from the starting feature until a nearby Feature is found.
If you require more than one result or more control of which features are returned, see
fetch_all_nearest_by_Feature and fetch_all_by_outward_search. fetch_nearest_by_Feature
is a convenience method.
Returntype : Bio::EnsEMBL::Feature
fetch_all_by_outward_search
Arguments the same as fetch_all_nearest_by_Feature
Arg [0] : -MAX_RANGE : Set an upper limit on the search range, defaults to 10000 bp
Arg [1] : -FEATURE ,Bio::EnsEMBL::Feature : 'Source' Feature to anchor the search for nearest Features
Arg [2] : -SAME_STRAND, Boolean (optional) : Respect the strand of the source Feature, only
returning Features on the same strand.
Arg [3] : -OPPOSITE_STRAND, Boolean (optional) : Find Features on the opposite strand to the source Feature
Arg [4] : -DOWNSTREAM/-UPSTREAM, (optional) : Search ONLY downstream or upstream from the source Feature.
Can be omitted for searches in both directions.
Arg [5] : -RANGE, Int : The size of the space to search for Features. Defaults to 1000 as a sensible starting point
Arg [6] : -NOT_OVERLAPPING, Boolean (optional) : Do not return Features that overlap the source Feature
Arg [7] : -FIVE_PRIME, Boolean (optional) : Determine range to a Feature by the 5' end, respecting strand
Arg [8] : -THREE_PRIME, Boolean (optional): Determine range to a Feature by the 3' end, respecting strand
Arg [9] : -LIMIT, Int : The maximum number of Features to return, defaulting to one. Equally near features are all returned
Description: Searches for features within the suggested -RANGE, and if it finds none, expands the search area
until it satisfies -LIMIT or hits -MAX_RANGE. Useful if you don't know how far away the features
might be, or if dealing with areas of high feature density. In the case of Variation Features, it is
conceivable that a 2000 bp window might contain very many features, resulting in a slow and bloated
response, thus the ability to explore outward in smaller sections can be useful.
Returntype : Listref of [$feature,$distance]
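The expanding search described above can be sketched with a callback standing in for the database query. Everything here is illustrative: outward_search is a hypothetical helper, and doubling the range on each round is an assumption, since the text does not state how the search area grows.

```perl
use strict;
use warnings;

# Hypothetical sketch of an expanding search. $search is a callback
# taking a range in bp and returning a listref of hits within it.
# Doubling the range each round is an assumption made for illustration.
sub outward_search {
    my ( $search, $range, $max_range, $limit ) = @_;
    $range     ||= 1000;     # -RANGE default
    $max_range ||= 10000;    # -MAX_RANGE default
    $limit     ||= 1;        # -LIMIT default
    my $hits = [];
    while ( $range <= $max_range ) {
        $hits = $search->($range);
        last if @{$hits} >= $limit;    # enough results, stop expanding
        $range *= 2;
    }
    return $hits;
}

# Toy data: distances (bp) of candidate features from the source Feature.
my @distances = ( 3500, 7200 );
my $found = outward_search( sub {
    my ($r) = @_;
    return [ grep { $_ <= $r } @distances ];
} );
print "@{$found}\n";    # prints "3500"
```

Starting small and growing the window avoids pulling back a large, dense region when a near neighbour already satisfies the limit.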
fetch_all_nearest_by_Feature
Arg [1] : -FEATURE ,Bio::EnsEMBL::Feature : 'Source' Feature to anchor the search for nearest Features
Arg [2] : -SAME_STRAND, Boolean (optional): Respect the strand of the source Feature, only
returning Features on the same strand
Arg [3] : -OPPOSITE_STRAND, Boolean (optional) : Find Features on the opposite strand to the source Feature
Arg [4] : -DOWNSTREAM/-UPSTREAM, (optional) : Search ONLY downstream or upstream from the source Feature.
Can be omitted for searches in both directions.
Arg [5] : -RANGE, Int : The size of the space to search for Features. Defaults to 1000 as a sensible starting point
Arg [6] : -NOT_OVERLAPPING, Boolean (optional) : Do not return Features that overlap the source Feature
Arg [7] : -FIVE_PRIME, Boolean (optional) : Determine range to a Feature by its 5' end, respecting strand
Arg [8] : -THREE_PRIME, Boolean (optional): Determine range to a Feature by its 3' end, respecting strand
Arg [9] : -LIMIT, Int : The maximum number of Features to return, defaulting to one. Equally near features are all returned
Example : #To fetch the gene(s) with the nearest 5' end:
$genes = $gene_adaptor->fetch_all_nearest_by_Feature(-FEATURE => $feat, -FIVE_PRIME => 1);
Description: Gets the nearest Features to a given 'source' Feature. Which Features are returned and the format
of the result are non-obvious, so please read on.
When looking beyond the boundaries of the source Feature, the distance is measured from the nearest end
of the source Feature to the nearest end of the nearby Feature.
If Features overlap the source Feature, then they are given a distance of zero but ordered by
their proximity to the centre of the Feature.
Features are found and prioritised within 1000 base pairs unless a -RANGE is given to the method. Any overlap with
the search region is included, and the results can be restricted to upstream, downstream, forward strand or reverse
strand.
The -FIVE_PRIME and -THREE_PRIME options allow searching for specific ends of nearby features, but still need
a -DOWNSTREAM/-UPSTREAM value and/or -NOT_OVERLAPPING to fulfil their most common application.
Returntype : Listref of arrayrefs, each holding a Bio::EnsEMBL::Feature object and its distance
[ [$feature, $distance] ... ]
Caller : general
select_nearest
Arg [1] : Bio::EnsEMBL::Feature, a Feature to find the nearest neighbouring feature to.
Arg [2] : Listref of Features to be considered for nearness.
Arg [3] : Integer, limited number of Features to return. Equally near features are all returned in spite of this limit
Arg [4] : Boolean, Overlapping prohibition. Overlapped Features are forgotten
Arg [5] : Boolean, use the 5' ends of the nearby features for distance calculation
Arg [6] : Boolean, use the 3' ends of the nearby features for distance calculation
Example : $feature_list = $feature_adaptor->select_nearest($ref_feature,\@candidates,$limit,$not_overlapping)
Description: Take a list of possible features, and determine which is nearest. Nearness is a
tricky concept. Beware of using the distance between Features, as it may not be the number you think
it should be.
Returntype : listref of Features ordered by proximity
Caller : BaseFeatureAdaptor->fetch_all_nearest_by_Feature
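The rule that equally near features are all returned in spite of the limit can be sketched as follows. This is an illustration on plain [name, distance] pairs; nearest_with_ties is a hypothetical helper, and breaking ties by name is an assumption made only to keep the output order predictable.

```perl
use strict;
use warnings;

# Hypothetical helper: from [name, distance] pairs, return the $limit
# nearest, plus any further pairs tied with the last one kept.
sub nearest_with_ties {
    my ( $pairs, $limit ) = @_;
    # Sort by distance, then by name so that tied entries have a fixed order.
    my @sorted =
      sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] } @{$pairs};
    my @kept;
    for my $pair (@sorted) {
        last if @kept >= $limit && $pair->[1] != $kept[-1][1];
        push @kept, $pair;
    }
    return \@kept;
}

my $nearest = nearest_with_ties(
    [ [ 'a', 40 ], [ 'b', 10 ], [ 'c', 10 ], [ 'd', 25 ] ], 1 );
print join( ',', map { $_->[0] } @{$nearest} ), "\n";    # prints "b,c"
```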
_compute_nearest_end
Arg [1] : Reference feature start
Arg [2] : Reference feature mid-point
Arg [3] : Reference feature end
Arg [4] : Considered feature start
Arg [5] : Considered feature mid-point
Arg [6] : Considered feature end
Example : $distance = $feature_adaptor->_compute_nearest_end($ref_start,$ref_midpoint,$ref_end,$f_start,$f_midpoint,$f_end)
Description: For a given feature, calculate the smallest legitimate distance to a reference feature
Calculate by mid-points to accommodate overlaps
Returntype : Integer distance in base pairs
Caller : BaseFeatureAdaptor->select_nearest()
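A nearest-end distance of the kind described for fetch_all_nearest_by_Feature can be sketched as below. This is a simplified illustration, assuming inclusive coordinates and taking only starts and ends; nearest_end_distance is a hypothetical helper, and the value computed by the real method may differ, for instance in how it uses the mid-points.

```perl
use strict;
use warnings;

# Hypothetical helper: gap in bp between the nearest ends of two
# inclusive ranges; overlapping ranges are given a distance of zero.
sub nearest_end_distance {
    my ( $ref_start, $ref_end, $f_start, $f_end ) = @_;
    return $f_start - $ref_end if $f_start > $ref_end;    # feature to the right
    return $ref_start - $f_end if $f_end < $ref_start;    # feature to the left
    return 0;                                             # ranges overlap
}

print nearest_end_distance( 100, 200, 250, 300 ), "\n";    # prints 50
print nearest_end_distance( 100, 200, 10,  60 ),  "\n";    # prints 40
print nearest_end_distance( 100, 200, 150, 400 ), "\n";    # prints 0
```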
_compute_prime_distance
Arg [1] : Reference feature start
Arg [2] : Reference feature mid-point
Arg [3] : Reference feature end
Arg [4] : Considered feature start
Arg [5] : Considered feature mid-point
Arg [6] : Considered feature end
Arg [7] : Considered feature strand
Example : ($distance, $weighted_centre_distance) = $feature_adaptor->_compute_prime_distance($ref_start,$ref_midpoint,$ref_end,$f_start,$f_midpoint,$f_end,$f_strand)
Description: Calculate the smallest distance to the 5' end of the considered feature
Returntype : Integer distance in base pairs or a string warning that the result doesn't mean anything.
Nearest 5' and 3' features shouldn't reside inside the reference Feature
Caller : BaseFeatureAdaptor->select_nearest()
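What "respecting strand" means for a 5' end can be sketched as below. Both helpers are hypothetical, and returning zero when the 5' end falls inside the reference range is a simplification; the text above says the method instead flags that case as not meaningful.

```perl
use strict;
use warnings;

# Hypothetical helper: coordinate of a feature's 5' end. On the forward
# strand (1) that is its start; on the reverse strand (-1), its end.
sub five_prime_coord {
    my ( $start, $end, $strand ) = @_;
    return $strand >= 0 ? $start : $end;
}

# Hypothetical helper: distance from a reference range to a feature's
# 5' end; zero means the 5' end falls inside the reference range.
sub five_prime_distance {
    my ( $ref_start, $ref_end, $f_start, $f_end, $f_strand ) = @_;
    my $five = five_prime_coord( $f_start, $f_end, $f_strand );
    return $five - $ref_end   if $five > $ref_end;
    return $ref_start - $five if $five < $ref_start;
    return 0;
}

print five_prime_distance( 100, 200, 250, 300, 1 ),  "\n";    # prints 50
print five_prime_distance( 100, 200, 250, 300, -1 ), "\n";    # prints 100
```

The same feature is 50 bp or 100 bp away depending on its strand, which is why the strand is passed alongside the coordinates.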
_compute_midpoint
Arg [1] : Bio::EnsEMBL::Feature
Example : $middle = $feature_adaptor->_compute_midpoint($feature);
Description: Calculate the mid-point of a Feature. Used for comparing Features that overlap each other
and determining a canonical distance between two Features for the majority of use cases.
Returntype : Integer coordinate rounded down.
Caller : BaseFeatureAdaptor->select_nearest()
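Rounding the mid-point down can be sketched as follows. This is an illustration on plain coordinates rather than Feature objects; midpoint is a hypothetical helper.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: mid-point of an inclusive range, rounded down.
sub midpoint {
    my ( $start, $end ) = @_;
    return $start + floor( ( $end - $start ) / 2 );
}

print midpoint( 100, 200 ), "\n";    # prints 150
print midpoint( 100, 201 ), "\n";    # prints 150
```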