LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::IdMapping::SyntenyFramework - framework representing syntenic regions across the genome

SYNOPSIS

# build the SyntenyFramework from unambiguous gene mappings
my $sf = Bio::EnsEMBL::IdMapping::SyntenyFramework->new(
  -DUMP_PATH  => $dump_path,
  -CACHE_FILE => 'synteny_framework.ser',
  -LOGGER     => $self->logger,
  -CONF       => $self->conf,
  -CACHE      => $self->cache,
);
$sf->build_synteny($gene_mappings);

# use it to rescore the genes
$gene_scores = $sf->rescore_gene_matrix_lsf($gene_scores);

DESCRIPTION

The SyntenyFramework is a set of SyntenyRegions. These are pairs of locations, closely analogous to the information in the assembly table (although the two locations don't have to be the same length). They are built from genes that map uniquely between source and target.

Once built, the SyntenyFramework is used to score source and target gene pairs to determine whether they are similar. This process is slow because it involves testing all gene pairs against all SyntenyRegions, so this module has built-in support for running it in parallel via LSF.
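
To make the "pairs of locations" idea concrete, here is a minimal sketch that uses plain hashes in place of SyntenyRegion objects. The hash keys, the overlaps() and count_supporting_regions() helpers, and the coordinates are all hypothetical and are not part of this module's API; the sketch only illustrates testing one gene pair against every region.

```perl
use strict;
use warnings;

# Hypothetical plain-hash stand-ins for SyntenyRegions: a source location
# paired with a target location. The two sides may differ in length.
my @regions = (
  {
    source => { seq_region => '1', start => 1_000,  end => 5_000 },
    target => { seq_region => '1', start => 1_200,  end => 6_500 },
  },
  {
    source => { seq_region => '2', start => 10_000, end => 20_000 },
    target => { seq_region => '2', start => 10_500, end => 19_000 },
  },
);

# True if a feature overlaps a location on the same sequence region.
sub overlaps {
  my ($feature, $loc) = @_;
  return $feature->{seq_region} eq $loc->{seq_region}
      && $feature->{start} <= $loc->{end}
      && $feature->{end}   >= $loc->{start};
}

# Count the regions whose source side overlaps the source gene and whose
# target side overlaps the target gene.
sub count_supporting_regions {
  my ($source_gene, $target_gene, $regions) = @_;
  return scalar grep {
    overlaps($source_gene, $_->{source}) && overlaps($target_gene, $_->{target})
  } @$regions;
}

my $source_gene = { seq_region => '1', start => 2_000, end => 3_000 };
my $target_gene = { seq_region => '1', start => 2_300, end => 3_400 };

# Only the first region supports this gene pair, so this prints 1.
print count_supporting_regions($source_gene, $target_gene, \@regions), "\n";
```

Because every gene pair is compared with every region, the cost grows with the product of the two counts, which is why the work is split across LSF jobs.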

METHODS

new
build_synteny
_by_overlap
add_SyntenyRegion
get_all_SyntenyRegions
rescore_gene_matrix_lsf
rescore_gene_matrix
logger
conf
cache

new

Arg [LOGGER]: Bio::EnsEMBL::Utils::Logger $logger - a logger object
Arg [CONF]  : Bio::EnsEMBL::Utils::ConfParser $conf - a configuration object
Arg [CACHE] : Bio::EnsEMBL::IdMapping::Cache $cache - a cache object
Arg [DUMP_PATH] : String - path for object serialisation
Arg [CACHE_FILE] : String - filename of serialised object
Example     : my $sf = Bio::EnsEMBL::IdMapping::SyntenyFramework->new(
                -DUMP_PATH    => $dump_path,
                -CACHE_FILE   => 'synteny_framework.ser',
                -LOGGER       => $self->logger,
                -CONF         => $self->conf,
                -CACHE        => $self->cache,
              );
Description : Constructor.
Return type : Bio::EnsEMBL::IdMapping::SyntenyFramework
Exceptions  : thrown on wrong or missing arguments
Caller      : InternalIdMapper plugins
Status      : At Risk
            : under development

build_synteny

Arg[1]      : Bio::EnsEMBL::IdMapping::MappingList $mappings - gene mappings
              to build the SyntenyFramework from
Example     : $synteny_framework->build_synteny($gene_mappings);
Description : Builds the SyntenyFramework from unambiguous gene mappings.
              SyntenyRegions are allowed to overlap. At most two overlapping
              SyntenyRegions are merged (otherwise we'd get too large
              SyntenyRegions with little information content).
Return type : none
Exceptions  : thrown on wrong or missing argument
Caller      : InternalIdMapper plugins
Status      : At Risk
            : under development
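
The merging rule can be illustrated on simple one-dimensional intervals. This is a simplified sketch under stated assumptions, not the code used by build_synteny(): real SyntenyRegions carry a source and a target location, and merge_pairs() is a hypothetical helper. It shows only that a merged interval is never extended by a third one, so neighbouring results may still overlap.

```perl
use strict;
use warnings;

# Merge overlapping neighbours pairwise. Each interval is [start, end].
# Once two intervals have been merged, the result is emitted as-is and
# the next interval starts a fresh candidate.
sub merge_pairs {
  my @sorted = sort { $a->[0] <=> $b->[0] } @_;
  my @out;
  while (my $cur = shift @sorted) {
    my $next = $sorted[0];
    if ($next && $next->[0] <= $cur->[1]) {
      shift @sorted;    # consume the partner interval
      my $end = $cur->[1] > $next->[1] ? $cur->[1] : $next->[1];
      push @out, [ $cur->[0], $end ];
    }
    else {
      push @out, $cur;
    }
  }
  return @out;
}

my @merged = merge_pairs([1, 10], [8, 20], [18, 30], [50, 60]);

# prints: [1,20] [18,30] [50,60]
print join(' ', map { sprintf('[%d,%d]', @$_) } @merged), "\n";
```

In the output, [1,20] and [18,30] still overlap: chaining them would produce one large interval that says little about local gene order.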

add_SyntenyRegion

Arg[1]      : Bio::EnsEMBL::IdMapping::SyntenyRegion - SyntenyRegion to add
Example     : $synteny_framework->add_SyntenyRegion($synteny_region);
Description : Adds a SyntenyRegion to the framework. For speed reasons (and
              since this is an internal method), no argument check is done.
Return type : none
Exceptions  : none
Caller      : internal
Status      : At Risk
            : under development

get_all_SyntenyRegions

Example     : foreach my $sr (@{ $sf->get_all_SyntenyRegions }) {
                # do something with the SyntenyRegion
              }
Description : Get a list of all SyntenyRegions in the framework.
Return type : Arrayref of Bio::EnsEMBL::IdMapping::SyntenyRegion
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

rescore_gene_matrix_lsf

Arg[1]      : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix $matrix - gene
              scores to rescore
Example     : my $new_scores = $sf->rescore_gene_matrix_lsf($gene_scores);
Description : This method runs rescore_gene_matrix() (via the
              synteny_rescore.pl script) in parallel with LSF, then combines
              the results to return a single rescored scoring matrix.
              Parallelisation is done by chunking the scoring matrix into
              several pieces (determined by the --synteny_rescore_jobs
              configuration option).
Return type : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix
Exceptions  : thrown on wrong or missing argument
              thrown on filesystem I/O error
              thrown on failure of one or more LSF jobs
Caller      : InternalIdMapper plugins
Status      : At Risk
            : under development
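
The chunking step can be sketched as follows. This is an illustration only: it assumes the matrix can be flattened into a list of [source_id, target_id, score] entries, and chunk_entries() is a hypothetical helper rather than a method of this module. Job submission and result merging are left out.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Split a list of entries into at most $num_jobs chunks of near-equal size.
sub chunk_entries {
  my ($entries, $num_jobs) = @_;
  die "num_jobs must be a positive number\n"
    unless defined $num_jobs && $num_jobs > 0;

  my @pending = @$entries;
  return () unless @pending;

  my $size = ceil(scalar(@pending) / $num_jobs);
  my @chunks;
  push @chunks, [ splice(@pending, 0, $size) ] while @pending;
  return @chunks;
}

# Ten hypothetical matrix entries split for three jobs.
my @entries = map { [ "src$_", "tgt$_", 0.5 ] } 1 .. 10;
my @chunks  = chunk_entries(\@entries, 3);

# prints: 4,4,2
print join(',', map { scalar @$_ } @chunks), "\n";
```

Each chunk would then be handed to one job, and the per-job results combined into a single matrix afterwards.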

rescore_gene_matrix

Arg[1]      : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix $matrix - gene
              scores to rescore
Example     : my $new_scores = $sf->rescore_gene_matrix($gene_scores);
Description : Rescores a gene matrix. Retains 70% of the old score and
              derives the remaining 30% from the synteny match.
Return type : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix
Exceptions  : thrown on wrong or missing argument
Caller      : InternalIdMapper plugins
Status      : At Risk
            : under development
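
The 70/30 weighting amounts to a weighted sum. The sketch below assumes the synteny match is expressed as a score between 0 and 1; that range, the variable names and the rescore() helper are assumptions made for illustration, not taken from this module.

```perl
use strict;
use warnings;

# Weights as stated in the description above.
my $old_weight     = 0.7;
my $synteny_weight = 0.3;

# Combine the existing score with a synteny score (assumed to lie in 0..1).
sub rescore {
  my ($old_score, $synteny_score) = @_;
  return $old_weight * $old_score + $synteny_weight * $synteny_score;
}

printf "%.2f\n", rescore(0.9, 1.0);    # full synteny support: prints 0.93
printf "%.2f\n", rescore(0.9, 0.0);    # no synteny support:   prints 0.63
```

Under this assumption, synteny evidence can shift a score by at most 0.3, so it refines the existing ranking rather than replacing it.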

logger

Arg[1]      : (optional) Bio::EnsEMBL::Utils::Logger - the logger to set
Example     : $object->logger->info("Starting ID mapping.\n");
Description : Getter/setter for logger object
Return type : Bio::EnsEMBL::Utils::Logger
Exceptions  : none
Caller      : constructor
Status      : At Risk
            : under development

conf

Arg[1]      : (optional) Bio::EnsEMBL::Utils::ConfParser - the configuration
              to set
Example     : my $basedir = $object->conf->param('basedir');
Description : Getter/setter for configuration object
Return type : Bio::EnsEMBL::Utils::ConfParser
Exceptions  : none
Caller      : constructor
Status      : At Risk
            : under development

cache

Arg[1]      : (optional) Bio::EnsEMBL::IdMapping::Cache - the cache to set
Example     : $object->cache->read_from_file('source');
Description : Getter/setter for cache object
Return type : Bio::EnsEMBL::IdMapping::Cache
Exceptions  : none
Caller      : constructor
Status      : At Risk
            : under development
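
The logger, conf and cache accessors are all documented as combined getter/setters with an optional argument. A minimal sketch of that pattern follows, using a hypothetical Sketch::Holder class and an internal 'logger' hash key that are not taken from this module.

```perl
use strict;
use warnings;

{
  package Sketch::Holder;    # hypothetical class, for illustration only

  sub new {
    my ($class, %args) = @_;
    return bless { logger => $args{-LOGGER} }, $class;
  }

  # With an argument, store it; in either case return the current value.
  sub logger {
    my $self = shift;
    $self->{logger} = shift if @_;
    return $self->{logger};
  }
}

my $holder = Sketch::Holder->new(-LOGGER => 'first logger');
print $holder->logger, "\n";    # prints: first logger

$holder->logger('second logger');
print $holder->logger, "\n";    # prints: second logger
```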