LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::StableIdHistoryTree - object representing a stable ID history tree

SYNOPSIS

my $registry = "Bio::EnsEMBL::Registry";
my $archiveStableIdAdaptor =
  $registry->get_adaptor( 'human', 'core', 'ArchiveStableId' );

my $stable_id = 'ENSG00000068990';
my $history =
  $archiveStableIdAdaptor->fetch_history_tree_by_stable_id($stable_id);

print "Unique stable IDs in this tree:\n";
print join( ", ", @{ $history->get_unique_stable_ids } ), "\n";

print "\nReleases in this tree:\n";
print join( ", ", @{ $history->get_release_display_names } ), "\n";

print "\nCoordinates of nodes in the tree:\n\n";
foreach my $arch_id ( @{ $history->get_all_ArchiveStableIds } ) {
  print "  Stable ID: " . $arch_id->stable_id . "." . $arch_id->version . "\n";
  print "  Release: "
    . $arch_id->release . " ("
    . $arch_id->assembly . ", "
    . $arch_id->db_name . ")\n";
  print "  coords: "
    . join( ', ', @{ $history->coords_by_ArchiveStableId($arch_id) } )
    . "\n\n";
}

DESCRIPTION

This object represents a stable ID history tree graph.

The graph is implemented as a collection of nodes (ArchiveStableId objects) and links (StableIdEvent objects) which have positions on an (x,y) grid. The x axis is used for releases, the y axis for stable_ids. The idea is to create a plot similar to this (the numbers shown on the nodes are the stable ID versions):

ENSG001   1-------------- 2--
                              \
ENSG003                         1-----1
                              /
ENSG002   1-------2----------

         38      39      40    41    42

The grid coordinates of the ArchiveStableId objects in this example would be (note that coordinates are zero-based):

ENSG001.1               (0, 0)
ENSG001.2               (2, 0)
ENSG003.1 (release 41)  (3, 1) 
ENSG003.1 (release 42)  (4, 1) 
ENSG002.1               (0, 2)
ENSG002.2               (1, 2)

The tree contains only those nodes at which the stable ID version changed. In the example above, ENSG001 was present in release 39 with version 1, but it is not drawn there, to keep the output uncluttered.

The API calculates the grid positions and tries to untangle the tree (i.e. to avoid overlapping lines).
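
As a rough illustration of how positions in the example plot map to coordinates, the standalone sketch below derives (x, y) pairs from a list of releases and a row order. It does not use the Ensembl API: the data structures and the helper name grid_coords are hypothetical and simply mirror the example above.

```perl
use strict;
use warnings;

# Releases (x axis) and stable IDs (y axis) as they appear in the example plot.
my @releases   = ( 38, 39, 40, 41, 42 );
my @stable_ids = qw(ENSG001 ENSG003 ENSG002);

# Zero-based column index per release and row index per stable ID.
my ( %x_of, %y_of );
@x_of{@releases}   = 0 .. $#releases;
@y_of{@stable_ids} = 0 .. $#stable_ids;

# Hypothetical helper: map a (stable ID, release) pair to grid coordinates.
sub grid_coords {
  my ( $stable_id, $release ) = @_;
  return [ $x_of{$release}, $y_of{$stable_id} ];
}

# Nodes from the example: [ stable ID, version, release ].
my @nodes = (
  [ 'ENSG001', 1, 38 ],
  [ 'ENSG001', 2, 40 ],
  [ 'ENSG003', 1, 41 ],
  [ 'ENSG003', 1, 42 ],
  [ 'ENSG002', 1, 38 ],
  [ 'ENSG002', 2, 39 ],
);

foreach my $node (@nodes) {
  my ( $stable_id, $version, $release ) = @$node;
  my ( $x, $y ) = @{ grid_coords( $stable_id, $release ) };
  print "$stable_id.$version (release $release)  ($x, $y)\n";
}
```

The printed pairs should match the coordinate table above, since the x value is the index of the release and the y value is the index of the stable ID's row.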

METHODS

new
add_ArchiveStableIds
add_ArchiveStableIds_for_events
remove_ArchiveStableId
flush_ArchiveStableIds
add_StableIdEvents
remove_StableIdEvent
flush_StableIdEvents
get_all_ArchiveStableIds
get_all_current_ArchiveStableIds
get_all_StableIdEvents
get_latest_StableIdEvent
get_release_display_names
get_release_db_names
get_unique_stable_ids
optimise_tree
coords_by_ArchiveStableId
calculate_coords
consolidate_tree
reset_tree
current_dbname
current_release
current_assembly
is_incomplete

RELATED MODULES

Bio::EnsEMBL::ArchiveStableId
Bio::EnsEMBL::DBSQL::ArchiveStableIdAdaptor
Bio::EnsEMBL::StableIdEvent

new

Arg [CURRENT_DBNAME]   : (optional) name of current db
Arg [CURRENT_RELEASE]  : (optional) current release number
Arg [CURRENT_ASSEMBLY] : (optional) current assembly name
Example     : my $history = Bio::EnsEMBL::StableIdHistoryTree->new;
Description : object constructor
Return type : Bio::EnsEMBL::StableIdHistoryTree
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

add_ArchiveStableIds

Arg[1..n]   : Bio::EnsEMBL::ArchiveStableId objects @archive_ids
              The ArchiveStableIds to add to the history tree
Example     : my $archive_id = $archiveStableIdAdaptor->fetch_by_stable_id(
                'ENSG00024808');
              $history->add_ArchiveStableIds($archive_id);
Description : Adds ArchiveStableIds (nodes) to the history tree. No grid
              coordinates are calculated at this point; you need to
              trigger this manually with calculate_coords().
              ArchiveStableIds are only added once for each release (to avoid
              duplicates).
Return type : none
Exceptions  : thrown on invalid or missing argument
Caller      : Bio::EnsEMBL::DBSQL::ArchiveStableIdAdaptor::fetch_history_by_stable_id, general
Status      : At Risk
            : under development

add_ArchiveStableIds_for_events

Example     : my $history = Bio::EnsEMBL::StableIdHistoryTree->new;
              $history->add_StableIdEvents($event1, $event2);
              $history->add_ArchiveStableIds_for_events;
Description : Convenience method that adds all ArchiveStableIds for all
              StableIdEvents attached to this object to the tree.
Return type : none
Exceptions  : none
Caller      : Bio::EnsEMBL::DBSQL::ArchiveStableIdAdaptor::fetch_history_by_stable_id, general
Status      : At Risk
            : under development

remove_ArchiveStableId

Arg[1]      : Bio::EnsEMBL::ArchiveStableId $archive_id
              the ArchiveStableId to remove from the tree
Example     : $history->remove_ArchiveStableId($archive_id);
Description : Removes an ArchiveStableId from the tree.
Return type : none
Exceptions  : thrown on missing or invalid argument
Caller      : Bio::EnsEMBL::DBSQL::ArchiveStableIdAdaptor::fetch_history_by_stable_id, general
Status      : At Risk
            : under development

flush_ArchiveStableIds

Example     : $history->flush_ArchiveStableIds;
Description : Removes all ArchiveStableIds from the tree.
Return type : none
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

add_StableIdEvents

Arg[1..n]   : Bio::EnsEMBL::StableIdEvent objects @events
              The StableIdEvents to add to the history tree
Example     : $history->add_StableIdEvents($event);
Description : Adds StableIdEvents (links) to the history tree. Note that
              ArchiveStableIds attached to the StableIdEvent aren't added to
              the tree automatically; you'll need to call
              add_ArchiveStableIds_for_events afterwards.
Return type : none
Exceptions  : thrown on invalid or missing argument
Caller      : Bio::EnsEMBL::DBSQL::ArchiveStableIdAdaptor::fetch_history_by_stable_id, general
Status      : At Risk
            : under development
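
The descriptions above imply a sequence of calls when a tree is assembled by hand: add the events, add the nodes attached to them, then calculate the coordinates. A minimal sketch of that sequence follows. The helper name populate_history_tree is hypothetical; the three methods it calls are the ones documented in this module, and $history is assumed to be a Bio::EnsEMBL::StableIdHistoryTree (or any object offering the same methods).

```perl
use strict;
use warnings;

# Hypothetical helper: populate a tree from a list of StableIdEvent objects.
sub populate_history_tree {
  my ( $history, @events ) = @_;

  # 1. Register the links.
  $history->add_StableIdEvents(@events);

  # 2. Nodes attached to the events are not added automatically.
  $history->add_ArchiveStableIds_for_events;

  # 3. Grid coordinates are not calculated on insertion either.
  $history->calculate_coords;

  return $history;
}
```

A tree fetched through ArchiveStableIdAdaptor should not need this, as the adaptor is listed as the caller of these methods.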

remove_StableIdEvent

Arg[1]      : Bio::EnsEMBL::StableIdEvent $event
              the StableIdEvent to remove from the tree
Example     : $history->remove_StableIdEvent($event);
Description : Removes a StableIdEvent from the tree.
Return type : none
Exceptions  : thrown on missing or invalid arguments
Caller      : Bio::EnsEMBL::DBSQL::ArchiveStableIdAdaptor::fetch_history_by_stable_id, general
Status      : At Risk
            : under development

flush_StableIdEvents

Example     : $history->flush_StableIdEvents; 
Description : Removes all StableIdEvents from the tree.
Return type : none
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

get_all_ArchiveStableIds

Example     : foreach my $arch_id (@{ $history->get_all_ArchiveStableIds }) {
                print $arch_id->stable_id, '.', $arch_id->version, "\n";
              }
Description : Gets all ArchiveStableIds (nodes) in this tree.
Return type : Arrayref of Bio::EnsEMBL::ArchiveStableId objects
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

get_all_current_ArchiveStableIds

Example     : foreach my $arch_id (@{ $history->get_all_current_ArchiveStableIds }) {
                print $arch_id->stable_id, '.', $arch_id->version, "\n";
              }
Description : Convenience method to get all current ArchiveStableIds in this
              tree.
              
              Note that no lazy loading of "current" status is done at this
              stage; as long as you retrieve your StableIdHistoryTree object
              from ArchiveStableIdAdaptor, you'll get the right answer. In
              other use cases, if you want to make sure you really get all
              current stable IDs, loop over the result of
              get_all_ArchiveStableIds() and call
              ArchiveStableId->current_version() on all of them.
Return type : Arrayref of Bio::EnsEMBL::ArchiveStableId objects
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development
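
The fallback described above (looping over all nodes and asking each one for its current version) might look like the sketch below. The helper name current_archive_ids is hypothetical, and the code assumes that current_version() returns a true value only for stable IDs that exist in the current release.

```perl
use strict;
use warnings;

# Hypothetical helper: collect current ArchiveStableIds by asking each node
# directly instead of relying on the tree's own bookkeeping.
sub current_archive_ids {
  my ($history) = @_;

  return [
    grep { $_->current_version }
      @{ $history->get_all_ArchiveStableIds }
  ];
}
```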

get_all_StableIdEvents

Example     : foreach my $event (@{ $history->get_all_StableIdEvents }) {
                print "Old stable ID: ",
                  ($event->get_attribute('old', 'stable_id') or 'none'), "\n";
                print "New stable ID: ",
                  ($event->get_attribute('new', 'stable_id') or 'none'), "\n";
                print "Mapping score: ", $event->score, "\n";
              }
Description : Gets all StableIdEvents (links) in this tree.
Return type : Arrayref of Bio::EnsEMBL::StableIdEvent objects
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

get_latest_StableIdEvent

Arg[1]      : Bio::EnsEMBL::ArchiveStableId $arch_id - the stable ID to get
              the latest Event for
Example     : my $arch_id = Bio::EnsEMBL::ArchiveStableId->new(
                -stable_id => 'ENSG00001'
              );
              my $event = $history->get_latest_StableIdEvent($arch_id);
Description : Returns the latest StableIdEvent found in the tree where a given
              stable ID is the new stable ID. If more than one is found (e.g.
              in a merge scenario in the latest mapping), preference is given
              to self-events.
Return type : Bio::EnsEMBL::StableIdEvent
Exceptions  : thrown on missing or wrong argument
Caller      : Bio::EnsEMBL::DBSQL::ArchiveStableIdAdaptor::add_all_current_to_history, general
Status      : At Risk
            : under development

get_release_display_names

Example     : print "Unique release display_names in this tree:\n";
              foreach my $name (@{ $history->get_release_display_names }) {
                print "  $name\n";
              }
Description : Returns a chronologically sorted list of unique release
              display_names in this tree.

              This method can be used to determine the number of columns when
              plotting the history tree.
Return type : Arrayref of strings.
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

get_release_db_names

Example     : print "Unique release db_names in this tree:\n";
              foreach my $name (@{ $history->get_release_db_names }) {
                print "  $name\n";
              }
Description : Returns a chronologically sorted list of unique release
              db_names in this tree.
Return type : Arrayref of strings.
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

get_unique_stable_ids

Example     : print "Unique stable IDs in this tree:\n";
              foreach my $id (@{ $history->get_unique_stable_ids }) {
                print "  $id\n";
              }
Description : Returns a list of unique stable IDs in this tree. Version is not
              taken into account here. This method can be used to determine
              the number of rows when plotting the history with each stable ID
              occupying one line.

              The sort order depends on the algorithm chosen when the sorted
              tree was generated. This ranges from a simple alphanumeric sort
              to algorithms that try to untangle the history tree. If no
              pre-sorted data is found, an alphanumerically sorted list is
              returned by default.
Return type : Arrayref of strings.
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

optimise_tree

Arg [1]     : (optional) Float $time_limit
              optimise_tree normally runs until it reaches a minimised
              state, which can take a very long time. To bail out of the
              optimisation early, specify a time limit in seconds.
              Floating point values are supported, should you require
              sub-second limits.
Example     : $history->optimise_tree;
Description : This method sorts the history tree so that the number of
              overlapping branches is minimised (thus "untangling" the tree).
              
              It uses a clustering algorithm which iteratively moves the
              nodes with the largest vertical distance next to each other
              and looks for a minimum in total branch length. This might not
              produce the global optimum but usually converges on a local
              optimum very quickly.
Return type : none
Exceptions  : none
Caller      : calculate_coords
Status      : At Risk
            : under development
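
To make the idea of "total branch length" concrete, the standalone sketch below compares two row orders for the example from the DESCRIPTION section. It is not the module's implementation: the cost function (sum of vertical distances between linked rows), the helper name total_branch_length, and the link list are assumptions made for illustration.

```perl
use strict;
use warnings;

# Links between stable IDs in the example plot: [ old ID, new ID ].
my @links = ( [ 'ENSG001', 'ENSG003' ], [ 'ENSG002', 'ENSG003' ] );

# Hypothetical cost function: sum of vertical distances between the rows
# of linked stable IDs for a given top-to-bottom row order.
sub total_branch_length {
  my ( $row_order, $links ) = @_;

  my %row_of;
  @row_of{@$row_order} = 0 .. $#{$row_order};

  my $total = 0;
  foreach my $link (@$links) {
    my ( $old, $new ) = @$link;
    $total += abs( $row_of{$old} - $row_of{$new} );
  }
  return $total;
}

my $alphabetical =
  total_branch_length( [qw(ENSG001 ENSG002 ENSG003)], \@links );
my $untangled =
  total_branch_length( [qw(ENSG001 ENSG003 ENSG002)], \@links );

print "alphabetical order: $alphabetical\n";    # prints 3
print "untangled order:    $untangled\n";       # prints 2
```

Under this cost function, placing ENSG003 between its two predecessors gives the shorter total, which is the row order used in the example plot.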

coords_by_ArchiveStableId

Arg[1]      : Bio::EnsEMBL::ArchiveStableId $archive_id
              The ArchiveStableId to get tree grid coordinates for
Example     : my ($x, $y) =
                @{ $history->coords_by_ArchiveStableId($archive_id) };
              print $archive_id->stable_id, " coords: $x, $y\n";
Description : Returns the coordinates of an ArchiveStableId in the history
              tree grid. If the ArchiveStableId isn't found in this tree, an
              empty list is returned.
              
              Coordinates are zero-based (i.e. the top leftmost element in
              the grid has coordinates [0, 0], not [1, 1]). This is to
              facilitate using them to create a matrix as a two-dimensional
              array of arrays.
Return type : Arrayref (x coordinate, y coordinate)
Exceptions  : thrown on wrong argument type
Caller      : general
Status      : At Risk
            : under development
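
Because the coordinates are zero-based, they can be used directly as array indices. The sketch below builds a two-dimensional array of stable ID versions from a tree, using only methods documented in this module. The helper name history_matrix and the row-major layout ($matrix->[$y][$x]) are choices made for this example, not part of the API.

```perl
use strict;
use warnings;

# Hypothetical helper: lay out the nodes of a history tree in a
# two-dimensional array, indexed as $matrix->[$y][$x].
sub history_matrix {
  my ($history) = @_;

  my $columns = scalar @{ $history->get_release_display_names };
  my $rows    = scalar @{ $history->get_unique_stable_ids };

  # Start with an empty grid: one row per stable ID, one column per release.
  my @matrix = map { [ (undef) x $columns ] } 1 .. $rows;

  foreach my $archive_id ( @{ $history->get_all_ArchiveStableIds } ) {
    my $coords = $history->coords_by_ArchiveStableId($archive_id);

    # Skip nodes for which no coordinate pair is available.
    next unless $coords && @$coords == 2;

    my ( $x, $y ) = @$coords;
    $matrix[$y][$x] = $archive_id->version;
  }

  return \@matrix;
}
```

Cells without a node stay undefined, so a renderer can tell empty grid positions apart from positions holding a version number.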

calculate_coords

Arg [1]     : (optional) Float $time_limit
              Tree optimisation normally runs until it reaches a minimised
              state, which can take a very long time. To bail out of the
              optimisation early, specify a time limit in seconds.
              Floating point values are supported, should you require
              sub-second limits.
Example     : $history->calculate_coords;
Description : Pre-calculates the grid coordinates of all nodes in the tree.
Return type : none
Exceptions  : none
Caller      : ArchiveStableIdAdaptor::fetch_history_by_stable_id
Status      : At Risk
            : under development

consolidate_tree

Example     : $history->consolidate_tree;
Description : Consolidate the history tree. This means removing nodes where
              there wasn't a change and bridging gaps in the history. The end
              result will be a sparse tree which only contains the necessary
              information.
Return type : none
Exceptions  : none
Caller      : ArchiveStableIdAdaptor->fetch_history_tree_by_stable_id
Status      : At Risk
            : under development

reset_tree

Example     : $history->reset_tree;
Description : Resets all pre-calculated tree grid data. Mostly used internally
              by methods that modify the tree.
Return type : none
Exceptions  : none
Caller      : internal
Status      : At Risk
            : under development

current_dbname

Arg[1]      : (optional) String $dbname - the dbname to set
Example     : my $dbname = $history->current_dbname;
Description : Getter/setter for current dbname.
Return type : String
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

current_release

Arg[1]      : (optional) Int $release - the release to set
Example     : my $release = $history->current_release;
Description : Getter/setter for current release.
Return type : Int
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

current_assembly

Arg[1]      : (optional) String $assembly - the assembly to set
Example     : my $assembly = $history->current_assembly;
Description : Getter/setter for current assembly.
Return type : String
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development

is_incomplete

Arg[1]      : (optional) Boolean $incomplete 
Example     : if ($history->is_incomplete) {
                print "Returned tree is incomplete due to too many mappings "
                  . "in the database.\n";
              }
Description : Getter/setter for incomplete flag. This is used by
              ArchiveStableIdAdaptor to indicate that it finished building
              the tree prematurely due to too many mappings in the db and can
              be used by applications to print warning messages.
Return type : Boolean
Exceptions  : none
Caller      : general
Status      : At Risk
            : under development