
LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

  Please email comments or questions to the public Ensembl
  developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

  Questions may also be sent to the Ensembl help desk at
  <http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::IdMapping::Cache - a cache to hold data objects used by the IdMapping application

DESCRIPTION

METHODS

new

  Arg [LOGGER]: Bio::EnsEMBL::Utils::Logger $logger - a logger object
  Arg [CONF]  : Bio::EnsEMBL::Utils::ConfParser $conf - a configuration object
  Example     : my $cache = Bio::EnsEMBL::IdMapping::Cache->new(
                  -LOGGER => $logger,
                  -CONF   => $conf,
                );
  Description : constructor
  Return type : Bio::EnsEMBL::IdMapping::Cache object
  Exceptions  : thrown on wrong or missing arguments
  Caller      : general
  Status      : At Risk
              : under development

build_cache_by_slice

  Arg[1]      : String $dbtype - db type (source|target)
  Arg[2]      : String $slice_name - the name of a slice (format as returned by
                Bio::EnsEMBL::Slice->name)
  Example     : my ($num_genes, $filesize) = $cache->build_cache_by_slice(
                  'source', 'chromosome:NCBI36:X:1:1000000:-1');
  Description : Builds a cache of genes, transcripts, translations and exons
                needed by the IdMapping application and serialises the resulting
                cache object to a file, one slice at a time.
  Return type : list of the number of genes processed and the size of the
                serialised cache file
  Exceptions  : thrown on invalid slice name
  Caller      : general
  Status      : At Risk
              : under development
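
  The slice name and cache type conventions used above can be illustrated with a minimal sketch. It assumes the six-field colon-separated format returned by Bio::EnsEMBL::Slice->name and the "source.$slice_name" cache type shown in the examples in this document. The helpers parse_slice_name and cache_type are hypothetical and are not part of the Ensembl API.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::EnsEMBL): split a slice name of the
# form coord_system:version:seq_region:start:end:strand into named fields.
sub parse_slice_name {
    my ($slice_name) = @_;

    # A limit of -1 keeps empty fields, e.g. a missing version.
    my @fields = split /:/, $slice_name, -1;
    die "Invalid slice name: $slice_name\n" unless @fields == 6;

    my %slice;
    @slice{qw(coord_system version seq_region start end strand)} = @fields;
    return \%slice;
}

# Hypothetical helper: build the cache type string that namespaces entries
# by db type and slice, following the "source.$slice_name" convention.
sub cache_type {
    my ($dbtype, $slice_name) = @_;
    die "dbtype must be 'source' or 'target'\n"
        unless $dbtype eq 'source' || $dbtype eq 'target';
    return "$dbtype.$slice_name";
}

my $name  = 'chromosome:NCBI36:X:1:1000000:-1';
my $slice = parse_slice_name($name);
print "$slice->{seq_region} $slice->{start}-$slice->{end}\n";
print cache_type('source', $name), "\n";
```

  Rejecting a malformed name up front mirrors the documented behaviour of throwing on an invalid slice name. The real method delegates that check to the database adaptors.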

build_cache_all

  Arg[1]      : String $dbtype - db type (source|target)
  Example     : my ($num_genes, $filesize) = $cache->build_cache_all('source');
  Description : Builds a cache of genes, transcripts, translations and exons
                needed by the IdMapping application and serialises the
                resulting cache object to a file. All genes across the genome
                are processed in one go. This method should be used when
                build_cache_by_slice can't be used due to a large number
                of toplevel seq_regions (e.g. 2x genomes).
  Return type : list of the number of genes processed and the size of the
                serialised cache file
  Exceptions  : thrown on invalid slice name
  Caller      : general
  Status      : At Risk
              : under development

build_cache_from_genes

  Arg[1]      : String $type - cache type
  Arg[2]      : Listref of Bio::EnsEMBL::Genes $genes - genes to build cache
                from
  Arg[3]      : Boolean $need_project - indicate if we need to project exons to
                common coordinate system
  Example     : $cache->build_cache_from_genes(
                  'source.chromosome:NCBI36:X:1:100000:1', \@genes);
  Description : Builds the cache by fetching transcripts, translations and exons
                for a list of genes from the database, and creating lightweight
                Bio::EnsEMBL::IdMapping::TinyFeature objects containing only the
                data needed by the IdMapping application. These objects are
                attached to a name cache in this cache object. Exons only need
                to be projected to a common coordinate system if their native
                coordinate system isn't itself common to the source and target
                assemblies.
  Return type : int - number of genes after filtering
  Exceptions  : thrown on wrong or missing arguments
  Caller      : internal
  Status      : At Risk
              : under development

filter_biotypes

  Arg[1]      : Listref of Bio::EnsEMBL::Genes $genes - the genes to filter
  Example     : my @filtered = @{ $cache->filter_biotypes(\@genes) };

  Description : Filters a list of genes by biotype. Biotypes are
                taken from the IdMapping configuration parameter
                'biotypes_include' or 'biotypes_exclude'.

                If 'biotypes_include' is defined, the method returns
                the genes whose biotype is listed in it. If
                'biotypes_exclude' is defined instead, the method
                returns the genes whose biotype is *not* listed in it.

                It is an error to define both these configuration
                parameters.

                The old parameter 'biotypes' is equivalent to
                'biotypes_include'.

  Return type : Listref of Bio::EnsEMBL::Genes (or empty list)
  Exceptions  : none
  Caller      : internal
  Status      : At Risk
              : under development
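
  The include/exclude rule described above can be sketched in plain Perl. This is an illustration only: genes are modelled as hashrefs with a 'biotype' key, the configuration as a plain hashref, and the function name filter_by_biotype is hypothetical. Passing all genes through when no parameter is set, and dying when both are set, are assumptions of this sketch and not confirmed behaviour of the real method.

```perl
use strict;
use warnings;

# Sketch only: $genes is a listref of hashrefs with a 'biotype' key and
# $conf is a plain hashref standing in for the configuration object.
sub filter_by_biotype {
    my ($genes, $conf) = @_;

    # The old 'biotypes' parameter is treated as 'biotypes_include'.
    my $include = $conf->{biotypes_include} // $conf->{biotypes};
    my $exclude = $conf->{biotypes_exclude};

    # Assumption: defining both parameters is reported by dying.
    die "Define only one of biotypes_include and biotypes_exclude\n"
        if $include && $exclude;

    # Assumption: with no filter configured, all genes pass through.
    return $genes unless $include || $exclude;

    my %listed = map { $_ => 1 } @{ $exclude || $include };

    my @kept = grep {
        $exclude ? !$listed{ $_->{biotype} } : $listed{ $_->{biotype} }
    } @$genes;

    return \@kept;
}

my @genes = map { +{ biotype => $_ } } qw(protein_coding pseudogene lincRNA);
my $kept  = filter_by_biotype(\@genes, { biotypes_exclude => ['pseudogene'] });
print join(',', map { $_->{biotype} } @$kept), "\n";
```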

add

  Arg[1]      : String $name - a cache name (e.g. 'genes_by_id')
  Arg[2]      : String $type - a cache type (e.g. "source.$slice_name")
  Arg[3]      : String $key - key of this entry (e.g. a gene dbID)
  Arg[4]      : Bio::EnsEMBL::IdMapping::TinyFeature $val - value to cache
  Example     : $cache->add('genes_by_id',
                  'source.chromosome:NCBI36:X:1:1000000:1', '1234', $tiny_gene);
  Description : Adds a TinyFeature object to a named cache.
  Return type : Bio::EnsEMBL::IdMapping::TinyFeature
  Exceptions  : thrown on wrong or missing arguments
  Caller      : internal
  Status      : At Risk
              : under development
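
  One way to picture the name/type/key addressing is a three-level nested hash. The sketch below assumes that layout. The class My::SketchCache and its get accessor are hypothetical, and the internal storage of the real cache object may differ.

```perl
use strict;
use warnings;

package My::SketchCache;    # hypothetical class, not the Ensembl one

sub new { return bless { cache => {} }, shift }

# Store one value under name -> type -> key and return the stored value.
sub add {
    my ($self, $name, $type, $key, $val) = @_;

    die "You must provide a cache name\n" unless $name;
    die "You must provide a cache type\n" unless $type;
    die "You must provide a cache key\n"  unless $key;

    return $self->{cache}{$name}{$type}{$key} = $val;
}

# Hypothetical accessor used only to show the lookup path.
sub get {
    my ($self, $name, $type, $key) = @_;
    return $self->{cache}{$name}{$type}{$key};
}

package main;

my $cache = My::SketchCache->new;
my $type  = 'source.chromosome:NCBI36:X:1:1000000:1';

# A plain hashref stands in for a TinyFeature object here.
$cache->add('genes_by_id', $type, '1234', { id => 1234 });
print $cache->get('genes_by_id', $type, '1234')->{id}, "\n";
```

  Keying by cache type keeps source and target entries, and entries from different slices, apart even when dbIDs collide.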

add_list

  Arg[1]      : String $name - a cache name (e.g. 'genes_by_id')
  Arg[2]      : String $type - a cache type (e.g. "source.$slice_name")
  Arg[3]      : String $key - key of this entry (e.g. a gene dbID)
  Arg[4]      : List of Bio::EnsEMBL::IdMapping::TinyFeature @vals - values
                to cache
  Example     : $cache->add_list('transcripts_by_exon_id',
                  'source.chromosome:NCBI36:X:1:1000000:1', '1234',
                  $tiny_transcript1, $tiny_transcript2);
  Description : Adds a list of TinyFeature objects to a named cache.
  Return type : Listref of Bio::EnsEMBL::IdMapping::TinyFeature objects
  Exceptions  : thrown on wrong or missing arguments
  Caller      : internal
  Status      : At Risk
              : under development
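
  A list-valued cache suits one-to-many lookups such as 'transcripts_by_exon_id', where several transcripts can share one exon. The following is a minimal sketch, assuming values are appended to an array reference stored under name, type and key. The standalone function add_list_sketch and the plain hash are illustrative stand-ins and do not reproduce the module's implementation.

```perl
use strict;
use warnings;

# Sketch only: append values to the list stored under name -> type -> key
# and return a reference to the accumulated list.
sub add_list_sketch {
    my ($cache, $name, $type, $key, @vals) = @_;

    die "name, type and key are all required\n"
        unless $name && $type && $key;

    # Autovivification creates the array reference on first use.
    push @{ $cache->{$name}{$type}{$key} }, @vals;
    return $cache->{$name}{$type}{$key};
}

my %cache;
my $type = 'source.chromosome:NCBI36:X:1:1000000:1';

# Strings stand in for TinyFeature transcript objects.
add_list_sketch(\%cache, 'transcripts_by_exon_id', $type, '1234', 'tr1', 'tr2');
my $list = add_list_sketch(\%cache, 'transcripts_by_exon_id', $type, '1234', 'tr3');
print scalar(@$list), " transcripts share exon 1234\n";
```

  Repeated calls with the same key extend the existing list in this sketch, so entries can be accumulated as features are discovered.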