

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


  Please email comments or questions to the public Ensembl
  developers list at <>.

  Questions may also be sent to the Ensembl help desk at


Bio::EnsEMBL::Gene - Object representing a gene


  my $gene = Bio::EnsEMBL::Gene->new(
    -START  => 123,
    -END    => 1045,
    -STRAND => 1,
    -SLICE  => $slice
  );

  # print gene information
  print("gene start:end:strand is "
      . join( ":", map { $gene->$_ } qw(start end strand) )
      . "\n" );

  # set some additional attributes
  $gene->description('This is the gene description');


A representation of a Gene within the Ensembl system. A gene is a set of one or more alternative transcripts.



  Arg [-START]  : 
       int - start position of the gene
  Arg [-END]    : 
       int - end position of the gene
  Arg [-STRAND] : 
       int - 1,-1 the strand the gene is on
  Arg [-SLICE]  : 
       Bio::EnsEMBL::Slice - the slice the gene is on
  Arg [-STABLE_ID] :
        string - the stable identifier of this gene
  Arg [-VERSION] :
        int - the version of the stable identifier of this gene
  Arg [-EXTERNAL_NAME] :
        string - the external name associated with this gene
  Arg [-EXTERNAL_DB] :
        string - the name of the database the external name is from
  Arg [-EXTERNAL_STATUS]:
        string - the status of the external identifier
  Arg [-DISPLAY_XREF]:
        Bio::EnsEMBL::DBEntry - The external database entry that is used
        to label this gene when it is displayed.
  Arg [-TRANSCRIPTS]:
        Listref of Bio::EnsEMBL::Transcript objects - this gene's transcripts
  Arg [-CREATED_DATE]:
        string - the date the gene was created
  Arg [-MODIFIED_DATE]:
        string - the date the gene was last modified
  Arg [-DESCRIPTION]:
        string - the gene's description
  Arg [-BIOTYPE]:
        string - the biotype e.g. "protein_coding"
  Arg [-SOURCE]:
        string - the gene's source, e.g. "ensembl"
  Arg [-IS_CURRENT]:
        Boolean - specifies if this is the current version of the gene
  Arg [-CANONICAL_TRANSCRIPT]:
        Bio::EnsEMBL::Transcript - the canonical transcript of this gene
  Arg [-CANONICAL_TRANSCRIPT_ID]:
        integer - the canonical transcript dbID of this gene, if the
        transcript object itself is not available.

  Example    : $gene = Bio::EnsEMBL::Gene->new(...);
  Description: Creates a new gene object
  Returntype : Bio::EnsEMBL::Gene
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String - the external name to set
  Example    : $gene->external_name('BRCA2');
  Description: Getter/setter for attribute external_name.
  Returntype : String or undef
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String - the source to set
  Example    : $gene->source('ensembl');
  Description: Getter/setter for attribute source
  Returntype : String
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String - name of external db to set
  Example    : $gene->external_db('HGNC');
  Description: Getter/setter for attribute external_db. The db is the one that 
               belongs to the external_name.  
  Returntype : String
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String - status of the external db
  Example    : $gene->external_status('KNOWNXREF');
  Description: Getter/setter for attribute external_status. This is the
               status of the external db that the external_name belongs to.
  Returntype : String
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String - the description to set
  Example    : $gene->description('This is the gene\'s description');
  Description: Getter/setter for gene description
  Returntype : String
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]       : Bio::EnsEMBL::Gene gene
  Example       : if ($geneA->equals($geneB)) { ... }
  Description   : Compares two genes for equality.
                  The test for equality goes through the following list
                  and terminates at the first true match:

                  1. If Bio::EnsEMBL::Feature::equals() returns false,
                     then the genes are *not* equal.
                  2. If the biotypes differ, then the genes are *not*
                     equal.
                  3. If both genes have stable IDs: if these are the
                     same, the genes are equal, otherwise not.
                  4. If both genes have the same number of transcripts
                     and if these are (when compared pair-wise sorted by
                     start-position and length) the same, then they are
                     equal, otherwise not.

  Return type   : Boolean (0, 1)

  Exceptions    : Thrown if a non-gene is passed as the argument.
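
  The comparison order above can be sketched in plain Perl. This is only an
  illustration on hash references, not the library's implementation: the hash
  keys and the helper names (genes_equal_sketch, _sorted_transcripts) are
  hypothetical, and step 1 is reduced to a location check as a stand-in for
  Bio::EnsEMBL::Feature::equals().

```perl
use strict;
use warnings;

# Hypothetical plain-hash stand-ins for genes; the real method works on
# Bio::EnsEMBL::Gene objects and delegates step 1 to Feature::equals().
sub genes_equal_sketch {
    my ( $x, $y ) = @_;

    # Step 1: stand-in for the Feature-level comparison (location only here).
    for my $key (qw(start end strand)) {
        return 0 if $x->{$key} != $y->{$key};
    }

    # Step 2: differing biotypes mean the genes are not equal.
    return 0 if $x->{biotype} ne $y->{biotype};

    # Step 3: if both have stable IDs, these alone decide the result.
    if ( defined $x->{stable_id} && defined $y->{stable_id} ) {
        return $x->{stable_id} eq $y->{stable_id} ? 1 : 0;
    }

    # Step 4: compare transcripts pair-wise after sorting by start, then length.
    my @tx = _sorted_transcripts($x);
    my @ty = _sorted_transcripts($y);
    return 0 if @tx != @ty;
    for my $i ( 0 .. $#tx ) {
        return 0
            if $tx[$i]{start} != $ty[$i]{start}
            || $tx[$i]{end} != $ty[$i]{end};
    }
    return 1;
}

# Sort a gene's transcripts by start position, then by length.
sub _sorted_transcripts {
    my ($gene) = @_;
    return sort {
        $a->{start} <=> $b->{start}
            || ( $a->{end} - $a->{start} ) <=> ( $b->{end} - $b->{start} )
    } @{ $gene->{transcripts} || [] };
}
```

  Note how step 3 short-circuits: two genes with matching stable IDs are
  reported as equal in this sketch without their transcripts being compared.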


  Arg [1]    : (optional) Bio::EnsEMBL::Transcript - canonical_transcript object
  Example    : $gene->canonical_transcript($canonical_transcript);
  Description: Getter/setter for the canonical_transcript
  Returntype : Bio::EnsEMBL::Transcript
  Exceptions : Throws if argument is not a transcript object.
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String $attrib_code
               The code of the attribute type to retrieve values for
  Example    : my ($author) = @{ $gene->get_all_Attributes('author') };
               my @gene_attributes = @{ $gene->get_all_Attributes };
  Description: Gets a list of Attributes of this gene.
               Optionally just get Attributes for given code.
  Returntype : Listref of Bio::EnsEMBL::Attribute
  Exceptions : warning if gene does not have attached adaptor and attempts lazy
               load.
  Caller     : general
  Status     : Stable


  Arg [1-N]  : List of Bio::EnsEMBL::Attribute objects @attribs
               Attribute(s) to add
  Example    : my $attrib = Bio::EnsEMBL::Attribute->new(...);
  Description: Adds an Attribute to the Gene. If you add an attribute before
               you retrieve any from database, lazy loading will be disabled.
  Returntype : none
  Exceptions : throw on incorrect arguments
  Caller     : general
  Status     : Stable


  Arg [1]    : Bio::EnsEMBL::DBEntry $dbe
               The dbEntry to be added
  Example    : my $dbe = Bio::EnsEMBL::DBEntry->new(...);
  Description: Associates a DBEntry with this gene. Note that adding DBEntries
               will prevent future lazy-loading of DBEntries for this gene
               (see get_all_DBEntries).
  Returntype : none
  Exceptions : thrown on incorrect argument type
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String, external database name,
               SQL wildcard characters (_ and %) can be used to
               specify patterns.

  Arg [2]    : (optional) String, external_db type, e.g. 'ENSEMBL'.
               SQL wildcard characters (_ and %) can be used to
               specify patterns.

  Example    : my @dbentries = @{ $gene->get_all_DBEntries() };
               @dbentries = @{ $gene->get_all_DBEntries('Uniprot%') };
               @dbentries = @{ $gene->get_all_DBEntries('%', 'ENSEMBL') };

  Description: Retrieves DBEntries (xrefs) for this gene.  This does
               *not* include DBEntries that are associated with the
               transcripts and corresponding translations of this
               gene (see get_all_DBLinks()).

               This method will attempt to lazy-load DBEntries
               from a database if an adaptor is available and no
               DBEntries are present on the gene (i.e. they have not
               already been added or loaded).

  Return type: Listref of Bio::EnsEMBL::DBEntry objects
  Exceptions : none
  Caller     : get_all_DBLinks, GeneAdaptor::store
  Status     : Stable
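
  To make the wildcard semantics concrete, here is a small local emulation of
  SQL LIKE matching in plain Perl. It is a sketch only: the helper name
  like_to_regex and the sample database names are hypothetical, the sketch is
  case-sensitive, and when the entries are loaded from a database the matching
  is presumably done by the database rather than by code like this.

```perl
use strict;
use warnings;

# Translate an SQL LIKE pattern into an anchored Perl regex:
# '%' matches any run of characters, '_' matches exactly one character,
# and everything else is matched literally.
sub like_to_regex {
    my ($pattern) = @_;
    my $re = join '', map {
          $_ eq '%' ? '.*'
        : $_ eq '_' ? '.'
        :             quotemeta($_)
    } split //, $pattern;
    return qr/\A$re\z/s;
}

# Hypothetical external database names, for illustration only.
my @db_names = qw(Uniprot/SWISSPROT Uniprot/SPTREMBL HGNC);
my $re       = like_to_regex('Uniprot%');
my @matches  = grep { $_ =~ $re } @db_names;
print join( ", ", @matches ), "\n";
```

  A bare '%' therefore matches every name, which is why it can be used as a
  placeholder for the first argument when only the type is of interest.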


  Arg [1]    : (optional) String, external database name

  Arg [2]    : (optional) String, external_db type

  Example    : @oxrefs = @{ $gene->get_all_object_xrefs() };

  Description: Retrieves xrefs for this gene.  This does *not*
               include xrefs that are associated with the
               transcripts or corresponding translations of this
               gene (see get_all_xrefs()).

               This method will attempt to lazy-load xrefs from a
               database if an adaptor is available and no xrefs are
               present on the gene (i.e. they have not already been
               added or loaded).

                NB: This method is an alias for the
                    get_all_DBEntries() method.

  Return type: Listref of Bio::EnsEMBL::DBEntry objects

  Status     : Stable


  Arg [1]    : String database name (optional)
               SQL wildcard characters (_ and %) can be used to
               specify patterns.

  Arg [2]    : (optional) String, external database type, e.g. 'ENSEMBL'.
               SQL wildcard characters (_ and %) can be used to
               specify patterns.

  Example    : @dblinks = @{ $gene->get_all_DBLinks() };
               @dblinks = @{ $gene->get_all_DBLinks('Uniprot%') };
               @dblinks = @{ $gene->get_all_DBLinks('%', 'ENSEMBL') };

  Description: Retrieves *all* related DBEntries for this gene. This
               includes all DBEntries that are associated with the
               transcripts and corresponding translations of this
               gene.

               If you only want to retrieve the DBEntries
               associated with the gene (and not the transcript
               and translations) then you should use the
               get_all_DBEntries() call instead.

               Note: Each entry may be listed more than once.  No
               uniqueness checks are done.  Also if you put in an
               incorrect external database name no checks are done
               to see if this exists, you will just get an empty
               list.

  Return type: Listref of Bio::EnsEMBL::DBEntry objects
  Exceptions : none
  Caller     : general
  Status     : Stable
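
  Because no uniqueness checks are done, callers that need each entry once
  have to filter the list themselves. A minimal sketch, assuming that a
  database name plus a primary ID is an adequate identity for an entry; the
  entries are plain hashes here, and the helper name unique_entries and the
  hash keys are hypothetical.

```perl
use strict;
use warnings;

# Keep the first occurrence of each (database name, primary ID) pair.
# Entries are plain hashes in this sketch; with real DBEntry objects the
# key would be built from accessor calls instead of hash lookups.
sub unique_entries {
    my ($entries) = @_;
    my %seen;
    return [
        grep { !$seen{ join "\t", $_->{dbname}, $_->{primary_id} }++ }
            @$entries
    ];
}

# Illustrative data only.
my $entries = [
    { dbname => 'HGNC',    primary_id => '1101' },
    { dbname => 'Uniprot', primary_id => 'P51587' },
    { dbname => 'HGNC',    primary_id => '1101' },
];
my $unique = unique_entries($entries);
print scalar(@$unique), " unique entries\n";
```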


  Arg [1]    : String database name (optional)
               SQL wildcard characters (_ and %) can be used to
               specify patterns.

  Example    : @xrefs = @{ $gene->get_all_xrefs() };
               @xrefs = @{ $gene->get_all_xrefs('Uniprot%') };

  Description: Retrieves *all* related xrefs for this gene.  This
               includes all xrefs that are associated with the
               transcripts and corresponding translations of this
               gene.

               If you want to retrieve the xrefs associated
               with only the gene (and not the transcript
               or translations) then you should use the
               get_all_object_xrefs() method instead.

               Note: Each entry may be listed more than once.  No
               uniqueness checks are done.  Also if you put in an
               incorrect external database name no checks are done
               to see if this exists, you will just get an empty
               list.

                NB: This method is an alias for the
                    get_all_DBLinks() method.

  Return type: Listref of Bio::EnsEMBL::DBEntry objects

  Status     : Stable


  Example    : my @exons = @{ $gene->get_all_Exons };
  Description: Returns a set of all the exons associated with this gene.
  Returntype : Listref of Bio::EnsEMBL::Exon objects
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : none
  Example    : my @introns = @{$gene->get_all_Introns()};
  Description: Returns a listref of the introns in this gene in order,
               i.e. the first intron in the listref is the 5'-most intron in
               the gene.
  Returntype : listref to Bio::EnsEMBL::Intron objects
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg[1]     : String The compara synonym to use when looking for a database in the
               registry. If not provided we will use the very first compara database
               we find.
  Description: Queries the Ensembl Compara database and retrieves all
               Genes from other species that are orthologous.
               Requires a properly set up Registry configuration file, meaning that
               one of the aliases for each core db has to be "Genus species"
               e.g. "Homo sapiens" (as in the name column in genome_db table
               in the compara database).

               The data is cached in this object for faster re-retrieval.
  Returntype : listref [
                        string $species # needed as cannot get spp from Gene 
  Exceptions : none
  Caller     : general
  Status     : Stable


  Description: Removes any cached homologues from the Gene which could have been
               fetched from the get_all_homologous_Genes() call.
  Returntype : none
  Exceptions : none
  Caller     : general


  Arg [1]    : Bio::EnsEMBL::Transcript $trans
               The transcript to add to the gene
  Example    : my $transcript = Bio::EnsEMBL::Transcript->new(...);
  Description: Adds another Transcript to the set of alternatively
               spliced Transcripts of this gene. If it shares exons 
               with another Transcript, these should be object-identical.
  Returntype : none
  Exceptions : none
  Caller     : general
  Status     : Stable


  Example    : my @transcripts = @{ $gene->get_all_Transcripts };
  Description: Returns the Transcripts in this gene.
  Returntype : Listref of Bio::EnsEMBL::Transcript objects
  Warning    : This method returns the internal transcript array 
               used by this object. Avoid any modification
               of this array. We class use of shift and 
               reassignment of the loop variable when iterating
               this array as modification.

               Dereferencing the structure as shown in the example is
               a safe way of using this data structure.
  Exceptions : none
  Caller     : general
  Status     : Stable
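
  The warning above follows from how Perl array references behave. The sketch
  below uses a plain hash and a hypothetical get_all() helper in place of a
  real gene, and shows which operations leave the shared array alone and which
  change it.

```perl
use strict;
use warnings;

# Hypothetical stand-in: a hash holding an internal array reference, as a
# getter that returns its own storage would expose it.
my $gene_like = { transcripts => [ 't1', 't2', 't3' ] };
sub get_all { return $_[0]{transcripts} }

# Safe: dereferencing into a new array copies the elements.
my @copy = @{ get_all($gene_like) };
shift @copy;
my $after_copy = scalar @{ get_all($gene_like) };    # still 3

# Unsafe: shifting through the reference changes the shared array.
my $ref = get_all($gene_like);
shift @$ref;
my $after_shift = scalar @{ get_all($gene_like) };    # now 2

# Also unsafe: the foreach loop variable aliases each element, so assigning
# to it overwrites the contents of the shared array.
for my $t ( @{ get_all($gene_like) } ) { $t = uc $t }
```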


  Example    : my @alt_genes = @{ $gene->get_all_alt_alleles };
               foreach my $alt_gene (@alt_genes) {
                 print "Alternate allele: " . $alt_gene->stable_id() . "\n";
               }
  Description: Returns a listref of Gene objects that represent this Gene on
               an alternative haplotype. Empty list if there is no such
               Gene (e.g. there is no overlapping haplotype).
  Returntype : listref of Bio::EnsEMBL::Gene objects
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) Int
               A version number for the stable_id
  Example    : $gene->version(2);
  Description: Getter/setter for version number
  Returntype : Int
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String - the stable ID to set
  Example    : $gene->stable_id("ENSG0000000001");
  Description: Getter/setter for stable id for this gene.
  Returntype : String
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String - the stable ID with version to set
  Example    : $gene->stable_id_version("ENSG0000000001.3");
  Description: Getter/setter for stable id with version for this gene.
  Returntype : String
  Exceptions : none
  Caller     : general
  Status     : Stable
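
  A versioned stable ID combines the stable ID and the version in one string.
  The following sketch shows one way to take such a string apart; it assumes
  that a trailing period followed by digits is the version, which may not hold
  for every identifier scheme, and the helper name split_versioned_id is
  hypothetical.

```perl
use strict;
use warnings;

# Split "ID.version" into its two parts. If there is no trailing
# ".<digits>", the whole string is treated as the ID and the version
# is returned as undef.
sub split_versioned_id {
    my ($versioned) = @_;
    if ( $versioned =~ /\A(.+)\.(\d+)\z/ ) {
        return ( $1, $2 );
    }
    return ( $versioned, undef );
}

my ( $id, $version ) = split_versioned_id('ENSG0000000001.3');
print "id=$id version=$version\n";
```

  When ambiguity is a concern, setting the stable ID and the version through
  their separate accessors avoids the parsing step altogether.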


  Arg [1]    : Boolean $is_current
  Example    : $gene->is_current(1);
  Description: Getter/setter for is_current state of this gene.
  Returntype : Int
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) String - created date to set (as a UNIX time int)
  Example    : $gene->created_date('1141948800');
  Description: Getter/setter for attribute created_date
  Returntype : String
  Exceptions : none
  Caller     : general
  Status     : Stable
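
  Since the date is passed around as a UNIX time value, it usually needs
  formatting before display. A short sketch using the core POSIX module; the
  variable names are illustrative and the value is the one from the example
  above.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# The value from the example above, as seconds since the UNIX epoch.
my $created = '1141948800';

# Format in UTC so the result does not depend on the local time zone.
my $formatted = strftime( '%Y-%m-%d', gmtime($created) );
print "$formatted\n";    # 2006-03-10
```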


  Arg [1]    : (optional) String - modified date to set (as a UNIX time int)
  Example    : $gene->modified_date('1141948800');
  Description: Getter/setter for attribute modified_date
  Returntype : String
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : String - coordinate system name to transform to
  Arg [2]    : String - coordinate system version
  Example    : my $new_gene = $gene->transform('supercontig');
  Description: Moves this gene to the given coordinate system. If this gene has
               Transcripts attached, they move as well.
  Returntype : Bio::EnsEMBL::Gene
  Exceptions : throw on wrong parameters
  Caller     : general
  Status     : Stable


  Arg [1]    : Bio::EnsEMBL::Slice $destination_slice
  Example    : my $new_gene = $gene->transfer($slice);
  Description: Moves this Gene to given target slice coordinates. If Transcripts
               are attached they are moved as well. Returns a new gene.
  Returntype : Bio::EnsEMBL::Gene
  Exceptions : none
  Caller     : general
  Status     : Stable


  Arg [1]    : (optional) Bio::EnsEMBL::DBEntry - the display xref to set
  Example    : $gene->display_xref($db_entry);
  Description: Getter/setter display_xref for this gene.
  Returntype : Bio::EnsEMBL::DBEntry
  Exceptions : none
  Caller     : general
  Status     : Stable


  Example    : print $gene->display_id();
  Description: This method returns a string that is considered to be
               the 'display' identifier. For genes this is (depending on
               availability and in this order) the stable Id, the dbID or an
               empty string.
  Returntype : String
  Exceptions : none
  Caller     : web drawing code
  Status     : Stable
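
  The fallback order described above can be written as a short chain of
  checks. This is a sketch on a plain hash, not the library's code; the
  function name display_id_sketch is hypothetical, and using defined-ness to
  decide "availability" is an assumption.

```perl
use strict;
use warnings;

# Return the first available identifier: stable ID, then dbID, then ''.
sub display_id_sketch {
    my ($gene) = @_;
    return $gene->{stable_id} if defined $gene->{stable_id};
    return $gene->{dbID}      if defined $gene->{dbID};
    return '';
}

print display_id_sketch( { stable_id => 'ENSG0000000001', dbID => 42 } ), "\n";
```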


  Example    : $gene->recalculate_coordinates;
  Description: Called when transcript added to the gene, tries to adapt the
               coords for the gene.
  Returntype : none
  Exceptions : none
  Caller     : internal
  Status     : Stable
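
  One plausible reading of "adapting the coords" is that the gene should span
  all of its transcripts. The sketch below computes such a span from plain
  hashes; it is an assumption about the intent rather than the library's
  implementation, it ignores slice and strand handling, and the helper name
  spanning_coordinates is hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Return the smallest start and largest end over all transcripts,
# or an empty list when there are no transcripts.
sub spanning_coordinates {
    my ($transcripts) = @_;
    return unless @$transcripts;
    my $start = min( map { $_->{start} } @$transcripts );
    my $end   = max( map { $_->{end} } @$transcripts );
    return ( $start, $end );
}

# Illustrative coordinates only.
my @transcripts = (
    { start => 150, end => 900 },
    { start => 123, end => 700 },
    { start => 300, end => 1045 },
);
my ( $gene_start, $gene_end ) = spanning_coordinates( \@transcripts );
print "gene spans $gene_start-$gene_end\n";
```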


  Example    : $dasref = $prot->get_all_DASFactories
  Description: Retrieves a listref of registered DAS objects
              TODO: Abstract to a DBLinkContainer obj
  Returntype : [ DAS_objects ]
  Exceptions : none
  Caller     : general
  Status     : Stable


  Example    : $features = $prot->get_all_DAS_Features;
  Description: Retrieves a hash reference to a hash of DAS feature
               sets, keyed by the DSN. Note that each value of this hash
               is an anonymous array containing:
                (1) a reference to an array of features
                (2) a reference to the DAS stylesheet
  Returntype : hashref of Bio::SeqFeatures
  Exceptions : none
  Caller     : webcode
  Status     : Stable


  Arg [1]       : Boolean $load_xrefs
                  Load (or don't load) xrefs.  Default is to load xrefs.
  Example       : $gene->load();
  Description   : The Ensembl API makes extensive use of
                  lazy-loading.  Under some circumstances (e.g.,
                  when copying genes between databases), all data of
                  an object needs to be fully loaded.  This method
                  loads the parts of the object that are usually
                  lazy-loaded.  It will also call the equivalent
                  method on all the transcripts of the gene.
  Returns       : 
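
  For readers unfamiliar with the pattern, lazy-loading and a "load everything
  now" method can be sketched in a few lines of plain Perl. The LazyThing
  class, its loader callback and its attributes() accessor are all
  hypothetical and only illustrate the idea; they are not part of the Ensembl
  API.

```perl
use strict;
use warnings;

package LazyThing;

sub new {
    my ( $class, %args ) = @_;
    return bless { loader => $args{loader} }, $class;
}

# Lazy accessor: the loader runs on first access only, then the
# result is served from the object's cache.
sub attributes {
    my ($self) = @_;
    $self->{attributes} //= $self->{loader}->();
    return $self->{attributes};
}

# Force everything that is normally lazy-loaded to be fetched now.
sub load {
    my ($self) = @_;
    $self->attributes;
    return;
}

package main;

my $calls = 0;
my $thing = LazyThing->new( loader => sub { $calls++; return ['author'] } );

$thing->load;    # triggers the loader once
```

  After load() has run, the object no longer depends on its data source, which
  is the property needed when copying objects between databases.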


  Description : Empties out caches and unsets fields of this Gene.
                Beware of further actions without adding some new transcripts.
  Example     : $gene->flush_Transcripts();


  Description: Getter/setter for the gene attribute is_ref
  Arg [1]    : (optional) 1 or 0
  Returntype : Boolean


  Example       : $gene_summary = $gene->summary_as_hash();
  Description   : Extends Feature::summary_as_hash
                  Retrieves a summary of this Gene object.
  Returns       : hashref of arrays of descriptive strings
  Status        : Intended for internal use


  Example       : $havana_gene = $gene->havana_gene();
  Description   : Locates the corresponding havana gene
  Returns       : Bio::EnsEMBL::DBEntry


  Example    : my $biotype = $gene->get_Biotype;
  Description: Returns the Biotype object of this gene.
               When no biotype exists, defaults to 'protein_coding'.
               When used to set to a biotype that does not exist in
               the biotype table, a biotype object is created with
               the provided argument as name and object_type gene.
  Returntype : Bio::EnsEMBL::Biotype
  Exceptions : none


  Arg [1]    : String - the biotype name to set
  Example    : my $biotype = $gene->set_Biotype('protein_coding');
  Description: Sets the Biotype of this gene to the provided biotype name.
               Returns the Biotype object of this gene.
               When no biotype exists, defaults to 'protein_coding' name.
               When setting a biotype that does not exist in
               the biotype table, a biotype object is created with
               the provided argument as name and object_type gene.
  Returntype : Bio::EnsEMBL::Biotype
  Exceptions : Throws if no argument is provided