LICENSE
Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
CONTACT
Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.
Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.
NAME
Bio::EnsEMBL::Mapper
SYNOPSIS
    my $mapper = Bio::EnsEMBL::Mapper->new( 'rawcontig', 'chromosome' );

    # add a coordinate mapping - supply two pairs of coordinates
    $mapper->add_map_coordinates(
      $contig_id, $contig_start, $contig_end, $contig_ori,
      $chr_name,  $chr_start,    $chr_end
    );

    # map from one coordinate system to another
    my @coordlist =
      $mapper->map_coordinates( 627012, 2, 5, -1, "rawcontig" );
DESCRIPTION
Generic mapper to provide coordinate transforms between two disjoint coordinate systems. This mapper is intended to be 'context neutral', in that it does not contain any code relating to any particular coordinate system. Code that is specific to a coordinate system is provided in, for example, Bio::EnsEMBL::AssemblyMapper.
Mappings consist of pairs of 'to-' and 'from-' contigs with coordinates on each. Orientation is abbreviated to 'ori'.
The contig pair hash is divided into mappings per seq_region. The code makes assumptions about how to filter these results, so the comparisons for some properties are absent from the code but implied by the data structure.
The assembly mapping hash '_pair_last' orders itself by the target seq region and looks like this:
    1 => ARRAY(0x1024c79c0)
       0  Bio::EnsEMBL::Mapper::Pair=HASH(0x1024d6198)
          'from' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025edf98)
             'end' => 4
             'id' => 4
             'start' => 1
          'ori' => 1
          'to' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025edf68)
             'end' => 4
             'id' => 1
             'start' => 1
       1  Bio::EnsEMBL::Mapper::Pair=HASH(0x1026c20f0)
          'from' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025ee3a0)
             'end' => 12
             'id' => 4
             'start' => 9
          'ori' => 1
          'to' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025ee370)
             'end' => 4
             'id' => 1
             'start' => 1
    2 => ARRAY(0x1025ee460)
       0  Bio::EnsEMBL::Mapper::Pair=HASH(0x1025ee400)
          'from' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025ee2c8)
             'end' => 8
             'id' => 4
             'start' => 5
          'ori' => 1
          'to' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025ee2b0)
             'end' => 4
             'id' => 2
             'start' => 1
       1  Bio::EnsEMBL::Mapper::Pair=HASH(0x1025ee658)
          'from' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025eea48)
             'end' => 16
             'id' => 4
             'start' => 13
          'ori' => 1
          'to' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025eea18)
             'end' => 4
             'id' => 2
             'start' => 1
The other mapping hash available is the reverse sense, putting the 'from' seq_region as the sorting key. Here is an excerpt.
    0  HASH(0x102690bb8)
       4 => ARRAY(0x1025ee028)
          0  Bio::EnsEMBL::Mapper::Pair=HASH(0x1024d6198)
             'from' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025edf98)
                'end' => 4
                'id' => 4
                'start' => 1
             'ori' => 1
             'to' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025edf68)
                'end' => 4
                'id' => 1
                'start' => 1
          1  Bio::EnsEMBL::Mapper::Pair=HASH(0x1025ee400)
             'from' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025ee2c8)
                'end' => 8
                'id' => 4
                'start' => 5
             'ori' => 1
             'to' => Bio::EnsEMBL::Mapper::Unit=HASH(0x1025ee2b0)
                'end' => 4
                'id' => 2
                'start' => 1
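The two dumps above show the same Pair objects reachable through two indexes, one keyed by the 'to' id and one keyed by the 'from' id. The following is a minimal sketch of that idea only, assuming plain hashes in place of Bio::EnsEMBL::Mapper::Pair and Bio::EnsEMBL::Mapper::Unit objects. The names %pair_by_to and %pair_by_from are hypothetical and are not the module's internal keys.

```perl
use strict;
use warnings;

# Plain-hash stand-ins for Pair/Unit objects (an assumption for illustration),
# filled with the coordinates shown in the dumps above.
my @pairs = (
    { from => { id => 4, start => 1,  end => 4  }, to => { id => 1, start => 1, end => 4 }, ori => 1 },
    { from => { id => 4, start => 9,  end => 12 }, to => { id => 1, start => 1, end => 4 }, ori => 1 },
    { from => { id => 4, start => 5,  end => 8  }, to => { id => 2, start => 1, end => 4 }, ori => 1 },
    { from => { id => 4, start => 13, end => 16 }, to => { id => 2, start => 1, end => 4 }, ori => 1 },
);

# Index the same pair references twice: by target id and by source id.
my ( %pair_by_to, %pair_by_from );
for my $pair (@pairs) {
    push @{ $pair_by_to{ $pair->{to}{id} } },     $pair;
    push @{ $pair_by_from{ $pair->{from}{id} } }, $pair;
}

# Keep each 'from' list ordered by start, as in the excerpt above.
@$_ = sort { $a->{from}{start} <=> $b->{from}{start} } @$_
  for values %pair_by_from;

printf "to ids: %s\n", join ',', sort { $a <=> $b } keys %pair_by_to;
printf "pairs under from id 4: %d\n", scalar @{ $pair_by_from{4} };
```

Because both indexes hold references to the same hashes, a change made through one index is visible through the other.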
METHODS
new
Arg [1] : string $from
The name of the 'from' coordinate system
Arg [2] : string $to
The name of the 'to' coordinate system
Arg [3] : (optional) Bio::EnsEMBL::CoordSystem $from_cs
The 'from' coordinate system
Arg [4] : (optional) Bio::EnsEMBL::CoordSystem $to_cs
The 'to' coordinate system
Example : my $mapper = Bio::EnsEMBL::Mapper->new('FROM', 'TO');
Description: Constructor. Creates a new Bio::EnsEMBL::Mapper object.
Returntype : Bio::EnsEMBL::Mapper
Exceptions : none
Caller : general
flush
Args : none
Example : none
Description: Removes all cached information from this mapper.
Returntype : none
Exceptions : none
Caller : AssemblyMapper, ChainedAssemblyMapper
map_coordinates
Arg 1 string $id
id of 'source' sequence
Arg 2 int $start
start coordinate of 'source' sequence
Arg 3 int $end
end coordinate of 'source' sequence
Arg 4 int $strand
raw contig orientation (+/- 1)
Arg 5 string $type
nature of transform - gives the type of
coordinates to be transformed *from*
Arg 6 boolean (0 or 1) $include_original_region
option to include original input coordinate region mappings in the result
Arg 7 int $cdna_coding_start
cdna coding start
Function generic map method
Returntype if $include_original_region == 0
array of mapped Bio::EnsEMBL::Mapper::Coordinate
and/or Bio::EnsEMBL::Mapper::Gap
if $include_original_region == 1
hash of mapped and original Bio::EnsEMBL::Mapper::Coordinate
and/or Bio::EnsEMBL::Mapper::Gap
Exceptions none
Caller Bio::EnsEMBL::Mapper
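The return value described above mixes mapped coordinates with gaps for the parts of the query that no pair covers. The following is a simplified sketch of that splitting arithmetic, not the module's implementation. It assumes plain-hash pairs sorted by 'from' start and a single source sequence, and the function name map_range and the 'type' key are hypothetical.

```perl
use strict;
use warnings;

# Map the query range [$start, $end] through pairs sorted by 'from' start.
# Returns a list of hashes tagged 'coord' (mapped) or 'gap' (unmapped).
sub map_range {
    my ( $pairs, $start, $end, $strand ) = @_;
    my @result;
    my $pos = $start;    # first query base not yet accounted for

    for my $p (@$pairs) {
        my ( $from, $to ) = @{$p}{qw(from to)};
        next if $from->{end} < $pos;      # pair lies before the remaining query
        last if $from->{start} > $end;    # pair lies after the query

        # An uncovered stretch before this pair becomes a gap.
        if ( $from->{start} > $pos ) {
            push @result, { type => 'gap', start => $pos, end => $from->{start} - 1 };
            $pos = $from->{start};
        }

        # Clip the query to this pair and convert offsets to target coordinates.
        my $stop = $from->{end} < $end ? $from->{end} : $end;
        my ( $off_s, $off_e ) = ( $pos - $from->{start}, $stop - $from->{start} );
        my ( $t_start, $t_end ) =
            $p->{ori} == 1
          ? ( $to->{start} + $off_s, $to->{start} + $off_e )
          : ( $to->{end} - $off_e, $to->{end} - $off_s );

        push @result,
          { type => 'coord', id => $to->{id}, start => $t_start, end => $t_end,
            strand => $strand * $p->{ori} };
        $pos = $stop + 1;
    }

    # Anything left after the last overlapping pair is also a gap.
    push @result, { type => 'gap', start => $pos, end => $end } if $pos <= $end;
    return @result;
}

my @pairs = (
    { from => { start => 1,  end => 10 }, to => { id => 'chr1', start => 101, end => 110 }, ori => 1 },
    { from => { start => 21, end => 30 }, to => { id => 'chr1', start => 201, end => 210 }, ori => -1 },
);

my @mapped = map_range( \@pairs, 5, 25, 1 );
for my $m (@mapped) {
    print join( ' ', $m->{type}, $m->{start}, $m->{end} ), "\n";
}
```

For the query 5..25 this yields a forward coordinate, a gap for the uncovered bases 11..20, and a reverse-strand coordinate. Gap positions stay in the source coordinate system because there is nothing to map them to.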
map_insert
Arg [1] : string $id
Arg [2] : int $start - start coord. Since this is an insert, it should always
be one greater than end.
Arg [3] : int $end - end coord. Since this is an insert, it should always
be one less than start.
Arg [4] : int $strand (0, 1, -1)
Arg [5] : string $type - the coordinate system name the coords are from.
Arg [6] : boolean $fastmap - if specified, this is being called from
the fastmap call. The mapping done is not any faster for
inserts, but the return value is different.
Example :
Description: This is an internal function which handles the special mapping
case for inserts (start = end + 1). This function will be called
automatically by the map function so there is no reason to
call it directly.
Returntype : list of Bio::EnsEMBL::Mapper::Coordinate and/or Gap objects
Exceptions : none
Caller : map_coordinates()
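The insert convention (start = end + 1) can be illustrated by mapping the two bases that flank the insertion point and then restoring the insert form. This is a minimal sketch under the assumption that both flanking bases fall inside one ungapped pair. The helpers map_base and map_insert_sketch are hypothetical and use plain hashes rather than Mapper objects.

```perl
use strict;
use warnings;

# Map a single base through one ungapped pair.
sub map_base {
    my ( $pair, $pos ) = @_;
    my $offset = $pos - $pair->{from}{start};
    return $pair->{ori} == 1
      ? $pair->{to}{start} + $offset
      : $pair->{to}{end} - $offset;
}

# An insert between bases 10 and 11 is written as start = 11, end = 10.
sub map_insert_sketch {
    my ( $pair, $start, $end ) = @_;
    die "not an insert: start must equal end + 1\n" unless $start == $end + 1;

    # Map the two flanking bases (the 2 bp region end..start) and order them.
    my @flanks = sort { $a <=> $b }
      ( map_base( $pair, $end ), map_base( $pair, $start ) );

    # Convert back to insert form: start is one greater than end.
    return ( $flanks[1], $flanks[0] );
}

my $pair = {
    from => { start => 1,   end => 20 },
    to   => { start => 101, end => 120 },
    ori  => -1,
};
my ( $ins_start, $ins_end ) = map_insert_sketch( $pair, 11, 10 );
print "insert maps to start=$ins_start end=$ins_end\n";
```

The mapped result keeps start one greater than end, so it still denotes a zero-length position between two target bases regardless of orientation.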
fastmap
Arg 1 string $id
id of 'source' sequence
Arg 2 int $start
start coordinate of 'source' sequence
Arg 3 int $end
end coordinate of 'source' sequence
Arg 4 int $strand
raw contig orientation (+/- 1)
Arg 5 string $type
nature of transform - gives the type of
coordinates to be transformed *from*
Function simplified map method. Will only do ungapped, unsplit mapping.
Will return id, start, end and strand in a list.
Returntype list of results
Exceptions none
Caller Bio::EnsEMBL::AssemblyMapper
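An ungapped, unsplit mapping means the whole query must lie inside a single pair. Otherwise nothing is returned. The following is a minimal sketch of that behaviour, assuming plain-hash pairs for one source sequence. The name fastmap_sketch is hypothetical and the sketch returns only the four values listed above.

```perl
use strict;
use warnings;

# Return (id, start, end, strand) if [$start, $end] fits inside one pair,
# or an empty list if the query would need splitting or contains a gap.
sub fastmap_sketch {
    my ( $pairs, $start, $end, $strand ) = @_;
    for my $pair (@$pairs) {
        my ( $from, $to ) = @{$pair}{qw(from to)};
        next if $start < $from->{start} || $end > $from->{end};

        my ( $off_s, $off_e ) = ( $start - $from->{start}, $end - $from->{start} );
        return $pair->{ori} == 1
          ? ( $to->{id}, $to->{start} + $off_s, $to->{start} + $off_e, $strand )
          : ( $to->{id}, $to->{end} - $off_e, $to->{end} - $off_s, -$strand );
    }
    return ();
}

my @pairs = (
    { from => { start => 1, end => 10 }, to => { id => 'chr1', start => 101, end => 110 }, ori => -1 },
);

my @hit  = fastmap_sketch( \@pairs, 2, 5,  1 );    # fully inside the pair
my @miss = fastmap_sketch( \@pairs, 8, 12, 1 );    # runs past the pair end
print "hit: @hit\n";
print "miss is empty\n" unless @miss;
```

Callers of such a method need to treat the empty list as "not mappable this way" and fall back to the general mapping if partial results are wanted.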
add_map_coordinates
Arg 1 int $id
id of 'source' sequence
Arg 2 int $start
start coordinate of 'source' sequence
Arg 3 int $end
end coordinate of 'source' sequence
Arg 4 int $strand
relative orientation of source and target (+/- 1)
Arg 5 int $id
id of 'target' sequence
Arg 6 int $start
start coordinate of 'target' sequence
Arg 7 int $end
end coordinate of 'target' sequence
Function Stores details of mapping between
'source' and 'target' regions.
Returntype none
Exceptions none
Caller Bio::EnsEMBL::Mapper
add_indel_coordinates
Arg 1 int $id
id of 'source' sequence
Arg 2 int $start
start coordinate of 'source' sequence
Arg 3 int $end
end coordinate of 'source' sequence
Arg 4 int $strand
relative orientation of source and target (+/- 1)
Arg 5 int $id
id of 'target' sequence
Arg 6 int $start
start coordinate of 'target' sequence
Arg 7 int $end
end coordinate of 'target' sequence
Function stores details of mapping between two regions:
'source' and 'target'. Returns 1 if the pair was added, 0 if it
was already present. Used when adding an indel.
Returntype int 0,1
Exceptions none
Caller Bio::EnsEMBL::Mapper
map_indel
Arg [1] : string $id
Arg [2] : int $start - start coord. Since this is an indel, it should always
be one greater than end.
Arg [3] : int $end - end coord. Since this is an indel, it should always
be one less than start.
Arg [4] : int $strand (0, 1, -1)
Arg [5] : string $type - the coordinate system name the coords are from.
Example : @coords = $mapper->map_indel();
Description: This is an internal function which handles the special mapping
case for indels (start = end + 1). It will be used to map from
a coordinate system with a gap to another that contains an
insertion. It will be mainly used by the Variation API.
Returntype : Bio::EnsEMBL::Mapper::Unit objects
Exceptions : none
Caller : general
add_Mapper
Arg 1 Bio::EnsEMBL::Mapper $mapper2
Example $mapper->add_Mapper($mapper2)
Function add all the map coordinates from $mapper2 to this mapper.
This object will contain mapping pairs from both the old
object and $mapper2.
Returntype int 0,1
Exceptions throw if 'to' and 'from' from both Bio::EnsEMBL::Mappers
are incompatible
Caller $mapper->methodname()
list_pairs
Arg 1 int $id
id of 'source' sequence
Arg 2 int $start
start coordinate of 'source' sequence
Arg 3 int $end
end coordinate of 'source' sequence
Arg 4 string $type
nature of transform - gives the type of
coordinates to be transformed *from*
Function list all pairs of mappings in a region
Returntype list of Bio::EnsEMBL::Mapper::Pair
Exceptions none
Caller Bio::EnsEMBL::Mapper
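Listing the pairs in a region amounts to an overlap filter over pairs ordered by source start. The following is a minimal sketch under that assumption, with plain-hash pairs for a single source sequence. The name list_pairs_sketch is hypothetical.

```perl
use strict;
use warnings;

# Plain-hash pairs for one source sequence, sorted by 'from' start.
my @pairs = (
    { from => { start => 1,  end => 4  }, to => { id => 1 } },
    { from => { start => 5,  end => 8  }, to => { id => 2 } },
    { from => { start => 9,  end => 12 }, to => { id => 1 } },
    { from => { start => 13, end => 16 }, to => { id => 2 } },
);

# Return every pair whose 'from' range overlaps [$start, $end].
sub list_pairs_sketch {
    my ( $pairs, $start, $end ) = @_;
    my @hits;
    for my $pair (@$pairs) {
        next if $pair->{from}{end} < $start;    # entirely before the region
        last if $pair->{from}{start} > $end;    # sorted, so nothing later overlaps
        push @hits, $pair;
    }
    return @hits;
}

my @hits = list_pairs_sketch( \@pairs, 7, 10 );
print "$_->{from}{start}-$_->{from}{end}\n" for @hits;
```

The early exit relies on the list being sorted by source start. With unsorted input, the `last` would have to become `next`.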
to
Arg 1 Bio::EnsEMBL::Mapper::Unit $id
id of 'source' sequence
Function accessor method for the 'source'
and 'target' in a Mapper::Pair
Returntype Bio::EnsEMBL::Mapper::Unit
Exceptions none
Caller Bio::EnsEMBL::Mapper
from
Arg 1 Bio::EnsEMBL::Mapper::Unit $id
id of 'source' sequence
Function accessor method for the 'source'
and 'target' in a Mapper::Pair
Returntype Bio::EnsEMBL::Mapper::Unit
Exceptions none
Caller Bio::EnsEMBL::Mapper