LICENSE
Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
CONTACT
Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.
Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.
NAME
Bio::EnsEMBL::IdMapping::SyntenyFramework - framework representing syntenic regions across the genome
SYNOPSIS
# build the SyntenyFramework from unambiguous gene mappings
my $sf = Bio::EnsEMBL::IdMapping::SyntenyFramework->new(
  -DUMP_PATH  => $dump_path,
  -CACHE_FILE => 'synteny_framework.ser',
  -LOGGER     => $self->logger,
  -CONF       => $self->conf,
  -CACHE      => $self->cache,
);
$sf->build_synteny($gene_mappings);
# use it to rescore the genes
$gene_scores = $sf->rescore_gene_matrix_lsf($gene_scores);
DESCRIPTION
The SyntenyFramework is a set of SyntenyRegions. These are pairs of locations, closely analogous to the entries in the assembly table, except that the two locations do not have to be the same length. They are built from genes that map uniquely between source and target.
Once built, the SyntenyFramework is used to score source and target gene pairs to determine whether they are similar. This process is slow because it involves testing all gene pairs against all SyntenyRegions, so this module has built-in support for running it in parallel via LSF.
METHODS
new
build_synteny
_by_overlap
add_SyntenyRegion
get_all_SyntenyRegions
rescore_gene_matrix_lsf
rescore_gene_matrix
logger
conf
cache
new
Arg [LOGGER]: Bio::EnsEMBL::Utils::Logger $logger - a logger object
Arg [CONF] : Bio::EnsEMBL::Utils::ConfParser $conf - a configuration object
Arg [CACHE] : Bio::EnsEMBL::IdMapping::Cache $cache - a cache object
Arg [DUMP_PATH] : String - path for object serialisation
Arg [CACHE_FILE] : String - filename of serialised object
Example : my $sf = Bio::EnsEMBL::IdMapping::SyntenyFramework->new(
-DUMP_PATH => $dump_path,
-CACHE_FILE => 'synteny_framework.ser',
-LOGGER => $self->logger,
-CONF => $self->conf,
-CACHE => $self->cache,
);
Description : Constructor.
Return type : Bio::EnsEMBL::IdMapping::SyntenyFramework
Exceptions : thrown on wrong or missing arguments
Caller : InternalIdMapper plugins
Status : At Risk
: under development
build_synteny
Arg[1] : Bio::EnsEMBL::IdMapping::MappingList $mappings - gene mappings
to build the SyntenyFramework from
Example : $synteny_framework->build_synteny($gene_mappings);
Description : Builds the SyntenyFramework from unambiguous gene mappings.
SyntenyRegions are allowed to overlap. At most two overlapping
SyntenyRegions are merged (otherwise the merged SyntenyRegions
would become too large and carry little information).
Return type : none
Exceptions : thrown on wrong or missing argument
Caller : InternalIdMapper plugins
Status : At Risk
: under development
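The "at most two" merge rule described for build_synteny can be illustrated with a minimal sketch. This is not the module's own code: the regions here are plain hashes with a single interval (a real SyntenyRegion pairs a source and a target location), and the helper names overlaps and merge_pairwise are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical, simplified regions: one interval per region. The real
# SyntenyRegion holds both a source and a target location.
sub overlaps {
    my ($x, $y) = @_;
    return $x->{start} <= $y->{end} && $y->{start} <= $x->{end};
}

# Merge a region with its successor when they overlap, but never chain
# more than two input regions into one output region.
sub merge_pairwise {
    my @sorted = sort { $a->{start} <=> $b->{start} } @_;
    my @out;
    while (my $cur = shift @sorted) {
        my $next = $sorted[0];
        if ($next && overlaps($cur, $next)) {
            shift @sorted;    # consume the partner so it is not merged again
            my $end = $cur->{end} > $next->{end} ? $cur->{end} : $next->{end};
            push @out, { start => $cur->{start}, end => $end };
        }
        else {
            push @out, { start => $cur->{start}, end => $cur->{end} };
        }
    }
    return @out;
}

my @merged = merge_pairwise(
    { start => 1,   end => 100 },
    { start => 50,  end => 150 },
    { start => 120, end => 200 },
    { start => 500, end => 600 },
);
printf "%d-%d\n", $_->{start}, $_->{end} for @merged;
```

In this sketch the first two inputs are merged, while the third stays separate even though it overlaps the merged result, so the output regions may still overlap one another.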
add_SyntenyRegion
Arg[1] : Bio::EnsEMBL::IdMapping::SyntenyRegion - SyntenyRegion to add
Example : $synteny_framework->add_SyntenyRegion($synteny_region);
Description : Adds a SyntenyRegion to the framework. For speed reasons (and
since this is an internal method), no argument check is done.
Return type : none
Exceptions : none
Caller : internal
Status : At Risk
: under development
get_all_SyntenyRegions
Example : foreach my $sr (@{ $sf->get_all_SyntenyRegions }) {
# do something with the SyntenyRegion
}
Description : Get a list of all SyntenyRegions in the framework.
Return type : Arrayref of Bio::EnsEMBL::IdMapping::SyntenyRegion
Exceptions : none
Caller : general
Status : At Risk
: under development
rescore_gene_matrix_lsf
Arg[1] : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix $matrix - gene
scores to rescore
Example : my $new_scores = $sf->rescore_gene_matrix_lsf($gene_scores);
Description : This method runs rescore_gene_matrix() (via the
synteny_rescore.pl script) in parallel with LSF, then combines
the results to return a single rescored scoring matrix.
Parallelisation is done by chunking the scoring matrix into
several pieces (determined by the --synteny_rescore_jobs
configuration option).
Return type : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix
Exceptions : thrown on wrong or missing argument
thrown on filesystem I/O error
thrown on failure of one or more LSF jobs
Caller : InternalIdMapper plugins
Status : At Risk
: under development
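The chunking step can be sketched as follows. This is an illustration under stated assumptions rather than the module's implementation: the matrix is reduced to a flat list of entries, the helper chunk_entries is hypothetical, and $num_jobs stands in for the value of the --synteny_rescore_jobs option. Job submission and result merging are not shown.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Split a list of entries into at most $num_jobs roughly equal chunks.
# Hypothetical helper; the real method works on a ScoredMappingMatrix.
sub chunk_entries {
    my ($entries, $num_jobs) = @_;
    die "num_jobs must be positive\n" unless $num_jobs && $num_jobs > 0;

    my @copy = @$entries;          # do not modify the caller's list
    return [] unless @copy;

    my $size = ceil(@copy / $num_jobs);
    my @chunks;
    push @chunks, [ splice(@copy, 0, $size) ] while @copy;
    return \@chunks;
}

my @entries = map { "pair_$_" } 1 .. 10;
my $chunks  = chunk_entries(\@entries, 4);
printf "chunk %d: %d entries\n", $_ + 1, scalar @{ $chunks->[$_] }
    for 0 .. $#$chunks;
```

Each chunk would then be handed to one job, and the per-chunk results combined into a single matrix once all jobs have finished.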
rescore_gene_matrix
Arg[1] : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix $matrix - gene
scores to rescore
Example : my $new_scores = $sf->rescore_gene_matrix($gene_scores);
Description : Rescores a gene matrix. The new score retains 70% of the
old score and derives the other 30% from the synteny match.
Return type : Bio::EnsEMBL::IdMapping::ScoredMappingMatrix
Exceptions : thrown on wrong or missing argument
Caller : InternalIdMapper plugins
Status : At Risk
: under development
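The 70/30 weighting can be written out as a small sketch. The function name rescore is hypothetical, both inputs are assumed to lie between 0 and 1, and how the synteny score for a gene pair is obtained from the SyntenyRegions is outside the scope of this example.

```perl
use strict;
use warnings;

# Combine the previous score with a synteny score using the 70/30 split
# from the description. Hypothetical helper; inputs assumed in [0, 1].
sub rescore {
    my ($old_score, $synteny_score) = @_;
    return 0.7 * $old_score + 0.3 * $synteny_score;
}

# A pair with a strong synteny match gains score ...
printf "%.2f\n", rescore(0.9, 1.0);
# ... while the same pair with no synteny support loses score.
printf "%.2f\n", rescore(0.9, 0.0);
```

With these weights, synteny can shift a score by at most 0.3, so it refines the existing ranking rather than replacing it.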
logger
Arg[1] : (optional) Bio::EnsEMBL::Utils::Logger - the logger to set
Example : $object->logger->info("Starting ID mapping.\n");
Description : Getter/setter for logger object
Return type : Bio::EnsEMBL::Utils::Logger
Exceptions : none
Caller : constructor
Status : At Risk
: under development
conf
Arg[1] : (optional) Bio::EnsEMBL::Utils::ConfParser - the configuration
to set
Example : my $basedir = $object->conf->param('basedir');
Description : Getter/setter for configuration object
Return type : Bio::EnsEMBL::Utils::ConfParser
Exceptions : none
Caller : constructor
Status : At Risk
: under development
cache
Arg[1] : (optional) Bio::EnsEMBL::IdMapping::Cache - the cache to set
Example : $object->cache->read_from_file('source');
Description : Getter/setter for cache object
Return type : Bio::EnsEMBL::IdMapping::Cache
Exceptions : none
Caller : constructor
Status : At Risk
: under development