LICENSE
Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
CONTACT
Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.
Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.
NAME
Bio::EnsEMBL::IdMapping::Cache - a cache to hold data objects used by the IdMapping application
DESCRIPTION
METHODS
new
Arg [LOGGER]: Bio::EnsEMBL::Utils::Logger $logger - a logger object
Arg [CONF] : Bio::EnsEMBL::Utils::ConfParser $conf - a configuration object
Example : my $cache = Bio::EnsEMBL::IdMapping::Cache->new(
-LOGGER => $logger,
-CONF => $conf,
);
Description : constructor
Return type : Bio::EnsEMBL::IdMapping::Cache object
Exceptions : thrown on wrong or missing arguments
Caller : general
Status : At Risk
: under development
build_cache_by_slice
Arg[1] : String $dbtype - db type (source|target)
Arg[2] : String $slice_name - the name of a slice (format as returned by
Bio::EnsEMBL::Slice->name)
Example : my ($num_genes, $filesize) = $cache->build_cache_by_slice(
'source', 'chromosome:NCBI36:X:1:1000000:-1');
Description : Builds a cache of genes, transcripts, translations and exons
needed by the IdMapping application and serialises the resulting
cache object to a file, one slice at a time.
Return type : list of the number of genes processed and the size of the
serialised cache file
Exceptions : thrown on invalid slice name
Caller : general
Status : At Risk
: under development
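The slice name in the example above has six colon-separated fields (coordinate system, version, seq_region name, start, end, strand). The following is a minimal sketch of how such a name could be checked before use. The helper parse_slice_name is hypothetical and only counts the fields; the validation actually performed by the Ensembl API may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::EnsEMBL::IdMapping::Cache.
# Splits a slice name into its six colon-separated fields and dies
# if the name does not have exactly six fields.
sub parse_slice_name {
    my ($name) = @_;

    # A limit of -1 keeps trailing empty fields, so "a:b:c:::" still
    # yields six elements.
    my @fields = split /:/, $name, -1;
    die "Invalid slice name: $name\n" unless @fields == 6;

    my %slice;
    @slice{qw(coord_system version seq_region start end strand)} = @fields;
    return \%slice;
}

my $slice = parse_slice_name('chromosome:NCBI36:X:1:1000000:-1');
print "$slice->{coord_system} $slice->{seq_region}, strand $slice->{strand}\n";
```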
build_cache_all
Arg[1] : String $dbtype - db type (source|target)
Example : my ($num_genes, $filesize) = $cache->build_cache_all('source');
Description : Builds a cache of genes, transcripts, translations and exons
needed by the IdMapping application and serialises the
resulting cache object to a file. All genes across the genome
are processed in one go. This method should be used when
build_cache_by_slice can't be used due to a large number
of toplevel seq_regions (e.g. low-coverage 2x genomes).
Return type : list of the number of genes processed and the size of the
serialised cache file
Exceptions : thrown on invalid slice name
Caller : general
Status : At Risk
: under development
build_cache_from_genes
Arg[1] : String $type - cache type
Arg[2] : Listref of Bio::EnsEMBL::Gene objects $genes - genes to build
cache from
Arg[3] : Boolean $need_project - whether exons need to be projected to
a common coordinate system
Example : $cache->build_cache_from_genes(
'source.chromosome:NCBI36:X:1:100000:1', \@genes);
Description : Builds the cache by fetching transcripts, translations and exons
for a list of genes from the database, and creating lightweight
Bio::EnsEMBL::IdMapping::TinyFeature objects containing only the
data needed by the IdMapping application. These objects are
attached to a named cache in this cache object. Exons only need
to be projected to a common coordinate system if their native
coordinate system isn't itself common to the source and target
assemblies.
Return type : int - number of genes after filtering
Exceptions : thrown on wrong or missing arguments
Caller : internal
Status : At Risk
: under development
filter_biotypes
Arg[1] : Listref of Bio::EnsEMBL::Gene objects $genes - the genes to filter
Example : my @filtered = @{ $cache->filter_biotypes(\@genes) };
Description : Filters a list of genes by biotype. Biotypes are
taken from the IdMapping configuration parameter
'biotypes_include' or 'biotypes_exclude'.
If the configuration parameter 'biotypes_exclude' is
defined, the method instead returns the genes whose
biotype is *not* listed in that parameter.
It is an error to define both these configuration
parameters.
The old parameter 'biotypes' is equivalent to
'biotypes_include'.
Return type : Listref of Bio::EnsEMBL::Gene objects (or empty list)
Exceptions : none
Caller : internal
Status : At Risk
: under development
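The include/exclude rule described for filter_biotypes can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: filter_by_biotype is a hypothetical function, genes are plain hashrefs with a 'biotype' key rather than Bio::EnsEMBL::Gene objects, and the configuration is a plain hashref rather than a ConfParser. Keeping every gene when no list is configured is also an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the filtering rule; not the real method.
sub filter_by_biotype {
    my ($genes, $conf) = @_;

    # The old parameter 'biotypes' is treated like 'biotypes_include'.
    my @include = @{ $conf->{biotypes_include} // $conf->{biotypes} // [] };
    my @exclude = @{ $conf->{biotypes_exclude} // [] };

    # Defining both lists is an error, as the description states.
    die "Define only one of biotypes_include and biotypes_exclude\n"
        if @include && @exclude;

    my @filtered;
    if (@include) {
        # Keep only genes whose biotype is listed.
        my %keep = map { $_ => 1 } @include;
        @filtered = grep { $keep{ $_->{biotype} } } @$genes;
    }
    else {
        # Keep genes whose biotype is NOT listed. With an empty
        # exclude list this keeps everything (assumption).
        my %skip = map { $_ => 1 } @exclude;
        @filtered = grep { !$skip{ $_->{biotype} } } @$genes;
    }
    return \@filtered;
}

my @genes = (
    { id => 1, biotype => 'protein_coding' },
    { id => 2, biotype => 'pseudogene' },
    { id => 3, biotype => 'lincRNA' },
);

my $included = filter_by_biotype(\@genes, { biotypes_include => ['protein_coding'] });
my $excluded = filter_by_biotype(\@genes, { biotypes_exclude => ['pseudogene'] });

print scalar(@$included), " gene(s) kept by the include list\n";
print scalar(@$excluded), " gene(s) kept by the exclude list\n";
```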
add
Arg[1] : String $name - a cache name (e.g. 'genes_by_id')
Arg[2] : String $type - a cache type (e.g. "source.$slice_name")
Arg[3] : String $key - key of this entry (e.g. a gene dbID)
Arg[4] : Bio::EnsEMBL::IdMapping::TinyFeature $val - value to cache
Example : $cache->add('genes_by_id',
'source.chromosome:NCBI36:X:1:1000000:1', '1234', $tiny_gene);
Description : Adds a TinyFeature object to a named cache.
Return type : Bio::EnsEMBL::IdMapping::TinyFeature
Exceptions : thrown on wrong or missing arguments
Caller : internal
Status : At Risk
: under development
add_list
Arg[1] : String $name - a cache name (e.g. 'genes_by_id')
Arg[2] : String $type - a cache type (e.g. "source.$slice_name")
Arg[3] : String $key - key of this entry (e.g. a gene dbID)
Arg[4] : List of Bio::EnsEMBL::IdMapping::TinyFeature @val - values
to cache
Example : $cache->add_list('transcripts_by_exon_id',
'source.chromosome:NCBI36:X:1:1000000:1', '1234',
$tiny_transcript1, $tiny_transcript2);
Description : Adds a list of TinyFeature objects to a named cache.
Return type : Listref of Bio::EnsEMBL::IdMapping::TinyFeature objects
Exceptions : thrown on wrong or missing arguments
Caller : internal
Status : At Risk
: under development
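The name/type/key addressing used by add and add_list can be pictured as a nested hash. The sketch below is a simplified model, not the module's code: the package TinyCacheSketch is hypothetical, the nesting order name -> type -> key is an assumption chosen to mirror the argument order, and plain hashrefs stand in for TinyFeature objects.

```perl
use strict;
use warnings;

# Hypothetical minimal cache used only to illustrate the addressing
# scheme; the internal layout of the real class is an assumption.
package TinyCacheSketch;

sub new { return bless { cache => {} }, shift }

# Store a single value under name/type/key and return it.
sub add {
    my ($self, $name, $type, $key, $val) = @_;
    die "name, type and key are required\n"
        unless defined $name && defined $type && defined $key;
    $self->{cache}{$name}{$type}{$key} = $val;
    return $val;
}

# Append values to the list stored under name/type/key and return
# a reference to that list.
sub add_list {
    my ($self, $name, $type, $key, @vals) = @_;
    die "name, type and key are required\n"
        unless defined $name && defined $type && defined $key;
    push @{ $self->{cache}{$name}{$type}{$key} }, @vals;
    return $self->{cache}{$name}{$type}{$key};
}

package main;

my $cache = TinyCacheSketch->new;
my $type  = 'source.chromosome:NCBI36:X:1:1000000:1';

my $gene = $cache->add('genes_by_id', $type, '1234', { id => 1234 });
my $list = $cache->add_list('transcripts_by_exon_id', $type, '1234',
    { id => 1 }, { id => 2 });

print scalar(@$list), " transcripts cached for exon 1234\n";
```

Keeping the cache type as a separate level means source and target data, or data from different slices, can share the same cache names without colliding.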