LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::IdMapping::Cache - a cache to hold data objects used by the IdMapping application

DESCRIPTION

METHODS

new

Arg [LOGGER]: Bio::EnsEMBL::Utils::Logger $logger - a logger object
Arg [CONF]  : Bio::EnsEMBL::Utils::ConfParser $conf - a configuration object
Example     : my $cache = Bio::EnsEMBL::IdMapping::Cache->new(
                -LOGGER => $logger,
                -CONF   => $conf,
              );
Description : constructor
Return type : Bio::EnsEMBL::IdMapping::Cache object
Exceptions  : thrown on wrong or missing arguments
Caller      : general
Status      : At Risk
            : under development

build_cache_by_slice

Arg[1]      : String $dbtype - db type (source|target)
Arg[2]      : String $slice_name - the name of a slice (format as returned by
              Bio::EnsEMBL::Slice->name)
Example     : my ($num_genes, $filesize) = $cache->build_cache_by_slice(
                'source', 'chromosome:NCBI36:X:1:1000000:-1');
Description : Builds a cache of genes, transcripts, translations and exons
              needed by the IdMapping application and serialises the resulting
              cache object to a file, one slice at a time.
Return type : list of the number of genes processed and the size of the
              serialised cache file
Exceptions  : thrown on invalid slice name
Caller      : general
Status      : At Risk
            : under development
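
The slice name passed as Arg[2] follows the colon-separated layout returned by Bio::EnsEMBL::Slice->name (coordinate system, version, seq_region, start, end, strand). The sketch below only illustrates how such a name could be split and validated before use. parse_slice_name is a hypothetical helper written for this example, not a method of this module, and it does not reproduce the module's own validation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL::IdMapping::Cache): split a
# slice name into its six colon-separated fields and reject malformed input.
sub parse_slice_name {
    my ($slice_name) = @_;

    # A limit of -1 keeps trailing empty fields, so an empty strand or
    # version still counts as a field.
    my @fields = split /:/, $slice_name, -1;
    die "Invalid slice name: '$slice_name'\n" unless @fields == 6;

    my %slice;
    @slice{qw(coord_system version seq_region start end strand)} = @fields;
    return \%slice;
}

my $slice = parse_slice_name('chromosome:NCBI36:X:1:1000000:-1');
printf "%s %s, %d-%d\n", @{$slice}{qw(coord_system seq_region start end)};
```

Raising an exception on a malformed name mirrors the documented behaviour ("thrown on invalid slice name"); the exact checks the module performs may differ.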

build_cache_all

Arg[1]      : String $dbtype - db type (source|target)
Example     : my ($num_genes, $filesize) = $cache->build_cache_all('source');
Description : Builds a cache of genes, transcripts, translations and exons
              needed by the IdMapping application and serialises the
              resulting cache object to a file. All genes across the genome
              are processed in one go. This method should be used when
              build_cache_by_slice can't be used due to a large number
              of toplevel seq_regions (e.g. 2x genomes).
Return type : list of the number of genes processed and the size of the
              serialised cache file
Exceptions  : thrown on invalid slice name
Caller      : general
Status      : At Risk
            : under development

build_cache_from_genes

Arg[1]      : String $type - cache type
Arg[2]      : Listref of Bio::EnsEMBL::Gene objects $genes - genes to build
              the cache from
Arg[3]      : Boolean $need_project - indicate if we need to project exons to
              common coordinate system
Example     : $cache->build_cache_from_genes(
                'source.chromosome:NCBI36:X:1:100000:1', \@genes);
Description : Builds the cache by fetching transcripts, translations and exons
              for a list of genes from the database, and creating lightweight
              Bio::EnsEMBL::IdMapping::TinyFeature objects containing only the
              data needed by the IdMapping application. These objects are
              attached to a named cache in this cache object. Exons only need
              to be projected to a common coordinate system if their native
              coordinate system is not itself common to the source and target
              assemblies.
Return type : int - number of genes after filtering
Exceptions  : thrown on wrong or missing arguments
Caller      : internal
Status      : At Risk
            : under development

filter_biotypes

Arg[1]      : Listref of Bio::EnsEMBL::Gene objects $genes - the genes to filter
Example     : my @filtered = @{ $cache->filter_biotypes(\@genes) };

Description : Filters a list of genes by biotype. The biotypes are
              taken from the IdMapping configuration parameter
              'biotypes_include' or 'biotypes_exclude'.

              If 'biotypes_include' is defined, the method returns
              the genes whose biotype is listed in it. If
              'biotypes_exclude' is defined instead, the method
              returns the genes whose biotype is *not* listed in
              that parameter.

              It is an error to define both these configuration
              parameters.

              The old parameter 'biotypes' is equivalent to
              'biotypes_include'.

Return type : Listref of Bio::EnsEMBL::Gene objects (or empty list)
Exceptions  : none
Caller      : internal
Status      : At Risk
            : under development
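
The include/exclude rule described above can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: genes are plain hashrefs with a 'biotype' field, the two lists are passed in directly instead of being read from the configuration object, and filter_by_biotype as well as the stable IDs are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for filter_biotypes(): genes are plain hashrefs and
# the include/exclude lists are arguments rather than configuration values.
sub filter_by_biotype {
    my ($genes, $include, $exclude) = @_;

    # Defining both lists is treated as an error, as the description states.
    die "Define only one of biotypes_include and biotypes_exclude\n"
        if @$include && @$exclude;

    # In exclude mode a gene is kept when its biotype is NOT in the list.
    my $exclude_mode = @$exclude ? 1 : 0;
    my %listed = map { $_ => 1 } ($exclude_mode ? @$exclude : @$include);

    my @filtered = grep {
        my $is_listed = exists $listed{ $_->{biotype} } ? 1 : 0;
        $exclude_mode ? !$is_listed : $is_listed;
    } @$genes;

    # With both lists empty nothing is "listed", so the result is empty.
    return \@filtered;
}

my @genes = (
    { stable_id => 'G1', biotype => 'protein_coding' },
    { stable_id => 'G2', biotype => 'pseudogene' },
    { stable_id => 'G3', biotype => 'lincRNA' },
);

my $kept = filter_by_biotype(\@genes, [], ['pseudogene']);
print join(',', map { $_->{stable_id} } @$kept), "\n";    # prints "G1,G3"
```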

add

Arg[1]      : String $name - a cache name (e.g. 'genes_by_id')
Arg[2]      : String $type - a cache type (e.g. "source.$slice_name")
Arg[3]      : String $key - key of this entry (e.g. a gene dbID)
Arg[4]      : Bio::EnsEMBL::IdMapping::TinyFeature $val - value to cache
Example     : $cache->add('genes_by_id',
                'source.chromosome:NCBI36:X:1:1000000:1', '1234', $tiny_gene);
Description : Adds a TinyFeature object to a named cache.
Return type : Bio::EnsEMBL::IdMapping::TinyFeature
Exceptions  : thrown on wrong or missing arguments
Caller      : internal
Status      : At Risk
            : under development
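
One way to picture the name/type/key addressing used by add is a three-level hash. The sketch below assumes that layout purely for illustration; the module's internal storage is not described here and may differ. cache_add and %cache are hypothetical names, and a plain hashref stands in for the TinyFeature object.

```perl
use strict;
use warnings;

# Hypothetical minimal cache: a three-level hash keyed by name, type and key.
my %cache;

sub cache_add {
    my ($name, $type, $key, $val) = @_;

    # Reject missing arguments, matching the documented exception.
    die "You must provide a cache name\n" unless defined $name && length $name;
    die "You must provide a cache type\n" unless defined $type && length $type;
    die "You must provide a cache key\n"  unless defined $key  && length $key;

    # Intermediate hash levels are created on demand (autovivification).
    $cache{$name}{$type}{$key} = $val;
    return $val;
}

# A plain hashref stands in for a TinyFeature object in this example.
my $tiny_gene = { dbID => 1234 };
my $type      = 'source.chromosome:NCBI36:X:1:1000000:1';

cache_add('genes_by_id', $type, '1234', $tiny_gene);
print $cache{genes_by_id}{$type}{1234}{dbID}, "\n";    # prints "1234"
```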

add_list

Arg[1]      : String $name - a cache name (e.g. 'genes_by_id')
Arg[2]      : String $type - a cache type (e.g. "source.$slice_name")
Arg[3]      : String $key - key of this entry (e.g. a gene dbID)
Arg[4]      : List of Bio::EnsEMBL::IdMapping::TinyFeature @val - values
              to cache
Example     : $cache->add_list('transcripts_by_exon_id',
                'source.chromosome:NCBI36:X:1:1000000:1', '1234',
                $tiny_transcript1, $tiny_transcript2);
Description : Adds a list of TinyFeature objects to a named cache.
Return type : Listref of Bio::EnsEMBL::IdMapping::TinyFeature objects
Exceptions  : thrown on wrong or missing arguments
Caller      : internal
Status      : At Risk
            : under development
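
The difference from add is that one key maps to several objects and a listref is returned. A minimal sketch, assuming the values are appended to an array stored under the key (whether the module appends or overwrites is an assumption of this example); cache_add_list and %cache are hypothetical names, and plain hashrefs stand in for TinyFeature objects.

```perl
use strict;
use warnings;

# Hypothetical minimal cache, as a hash keyed by name, type and key, where
# each key holds an array reference instead of a single object.
my %cache;

sub cache_add_list {
    my ($name, $type, $key, @vals) = @_;

    die "You must provide a cache name\n" unless defined $name && length $name;
    die "You must provide a cache type\n" unless defined $type && length $type;
    die "You must provide a cache key\n"  unless defined $key  && length $key;

    # push creates the array on first use and appends on later calls.
    push @{ $cache{$name}{$type}{$key} }, @vals;
    return $cache{$name}{$type}{$key};
}

# Plain hashrefs stand in for TinyFeature transcript objects.
my ($tiny_transcript1, $tiny_transcript2) = ({ dbID => 1 }, { dbID => 2 });
my $type = 'source.chromosome:NCBI36:X:1:1000000:1';

my $list = cache_add_list('transcripts_by_exon_id', $type, '1234',
    $tiny_transcript1, $tiny_transcript2);
print scalar(@$list), "\n";    # prints "2"
```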