LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::Gene - Object representing a gene

SYNOPSIS

my $gene = Bio::EnsEMBL::Gene->new(
  -START  => 123,
  -END    => 1045,
  -STRAND => 1,
  -SLICE  => $slice
);

# print gene information
print("gene start:end:strand is "
    . join( ":", map { $gene->$_ } qw(start end strand) )
    . "\n" );

# set some additional attributes
$gene->stable_id('ENSG000001');
$gene->description('This is the gene description');

DESCRIPTION

A representation of a Gene within the Ensembl system. A gene is a set of one or more alternative transcripts.

METHODS

new

Arg [-START]  : 
     int - start position of the gene
Arg [-END]    : 
     int - end position of the gene
Arg [-STRAND] : 
     int - 1,-1 the strand the gene is on
Arg [-SLICE]  : 
     Bio::EnsEMBL::Slice - the slice the gene is on
Arg [-STABLE_ID] :
      string - the stable identifier of this gene
Arg [-VERSION] :
      int - the version of the stable identifier of this gene
Arg [-EXTERNAL_NAME] :
      string - the external database name associated with this gene
Arg [-EXTERNAL_DB] :
      string - the name of the database the external name is from
Arg [-EXTERNAL_STATUS]:
      string - the status of the external identifier
Arg [-DISPLAY_XREF]:
      Bio::EnsEMBL::DBEntry - The external database entry that is used
      to label this gene when it is displayed.
Arg [-TRANSCRIPTS]:
      Listref of Bio::EnsEMBL::Transcript objects - this gene's transcripts
Arg [-CREATED_DATE]:
      string - the date the gene was created
Arg [-MODIFIED_DATE]:
      string - the date the gene was last modified
Arg [-DESCRIPTION]:
      string - the gene's description
Arg [-BIOTYPE]:
      string - the biotype e.g. "protein_coding"
Arg [-SOURCE]:
      string - the gene's source, e.g. "ensembl"
Arg [-IS_CURRENT]:
      Boolean - specifies if this is the current version of the gene
Arg [-CANONICAL_TRANSCRIPT]:
      Bio::EnsEMBL::Transcript - the canonical transcript of this gene
Arg [-CANONICAL_TRANSCRIPT_ID]:
      integer - the canonical transcript dbID of this gene, if the
      transcript object itself is not available.

Example    : $gene = Bio::EnsEMBL::Gene->new(...);
Description: Creates a new gene object
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller     : general
Status     : Stable

external_name

Arg [1]    : (optional) String - the external name to set
Example    : $gene->external_name('BRCA2');
Description: Getter/setter for attribute external_name.
Returntype : String or undef
Exceptions : none
Caller     : general
Status     : Stable

source

Arg [1]    : (optional) String - the source to set
Example    : $gene->source('ensembl');
Description: Getter/setter for attribute source
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable

external_db

Arg [1]    : (optional) String - name of external db to set
Example    : $gene->external_db('HGNC');
Description: Getter/setter for attribute external_db. The db is the one that 
             belongs to the external_name.  
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable

external_status

Arg [1]    : (optional) String - status of the external db
Example    : $gene->external_status('KNOWNXREF');
Description: Getter/setter for attribute external_status. The status of
             the external db that belongs to the external_name.
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable

description

Arg [1]    : (optional) String - the description to set
Example    : $gene->description('This is the gene\'s description');
Description: Getter/setter for gene description
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable

equals

Arg [1]       : Bio::EnsEMBL::Gene gene
Example       : if ($geneA->equals($geneB)) { ... }
Description   : Compares two genes for equality.
                The test for equality goes through the following list
                and terminates at the first true match:

                1. If Bio::EnsEMBL::Feature::equals() returns false,
                   then the genes are *not* equal.
                2. If the biotypes differ, then the genes are *not*
                   equal.
                3. If both genes have stable IDs: if these are the
                   same, the genes are equal, otherwise not.
                4. If both genes have the same number of transcripts
                   and if these are (when compared pair-wise sorted by
                   start-position and length) the same, then they are
                   equal, otherwise not.

Return type   : Boolean (0, 1)

Exceptions    : Thrown if a non-gene is passed as the argument.
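
The decision order above can be sketched in plain Perl. This is only an illustration: the genes and transcripts are hypothetical hash refs rather than Bio::EnsEMBL objects, and feature_equals() is a simplified stand-in (start, end and strand only) for Bio::EnsEMBL::Feature::equals().

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::EnsEMBL::Feature::equals():
# two features are "equal" here if start, end and strand all match.
sub feature_equals {
    my ( $x, $y ) = @_;
    return ( $x->{start} == $y->{start}
          && $x->{end} == $y->{end}
          && $x->{strand} == $y->{strand} ) ? 1 : 0;
}

# Sort transcripts by start position, then by length.
sub sorted_transcripts {
    my ($gene) = @_;
    return sort {
             $a->{start} <=> $b->{start}
          || ( $a->{end} - $a->{start} ) <=> ( $b->{end} - $b->{start} )
    } @{ $gene->{transcripts} || [] };
}

sub gene_equals {
    my ( $g1, $g2 ) = @_;

    # Step 1: the feature-level comparison must succeed.
    return 0 unless feature_equals( $g1, $g2 );

    # Step 2: the biotypes must match.
    return 0 unless $g1->{biotype} eq $g2->{biotype};

    # Step 3: if both genes have stable IDs, those decide the outcome.
    if ( defined $g1->{stable_id} && defined $g2->{stable_id} ) {
        return $g1->{stable_id} eq $g2->{stable_id} ? 1 : 0;
    }

    # Step 4: same number of transcripts, compared pair-wise once sorted.
    my @t1 = sorted_transcripts($g1);
    my @t2 = sorted_transcripts($g2);
    return 0 unless @t1 == @t2;
    for my $i ( 0 .. $#t1 ) {
        return 0 unless feature_equals( $t1[$i], $t2[$i] );
    }
    return 1;
}

my %loc = ( start => 100, end => 500, strand => 1, biotype => 'protein_coding' );
my $gene_a = { %loc, stable_id => 'ENSG_A' };
my $gene_b = { %loc, stable_id => 'ENSG_B' };
my $gene_c = { %loc, transcripts => [ { start => 100, end => 300, strand => 1 } ] };
my $gene_d = { %loc, transcripts => [ { start => 100, end => 300, strand => 1 } ] };

my $ab = gene_equals( $gene_a, $gene_b );    # 0: stable IDs differ
my $cd = gene_equals( $gene_c, $gene_d );    # 1: transcripts match
print "a vs b: $ab, c vs d: $cd\n";
```

Note that step 3 short-circuits: once both genes carry stable IDs, their transcripts are never compared.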

canonical_transcript

Arg [1]    : (optional) Bio::EnsEMBL::Transcript - canonical_transcript object
Example    : $gene->canonical_transcript($canonical_transcript);
Description: Getter/setter for the canonical_transcript
Returntype : Bio::EnsEMBL::Transcript
Exceptions : Throws if argument is not a transcript object.
Caller     : general
Status     : Stable

get_all_Attributes

Arg [1]    : (optional) String $attrib_code
             The code of the attribute type to retrieve values for
Example    : my ($author) = @{ $gene->get_all_Attributes('author') };
             my @gene_attributes = @{ $gene->get_all_Attributes };
Description: Gets a list of Attributes of this gene.
             Optionally just get Attributes for given code.
Returntype : Listref of Bio::EnsEMBL::Attribute
Exceptions : warning if gene does not have attached adaptor and attempts lazy
             load.
Caller     : general
Status     : Stable

add_Attributes

Arg [1-N]  : list of Bio::EnsEMBL::Attribute objects @attribs
             Attribute(s) to add
Example    : my $attrib = Bio::EnsEMBL::Attribute->new(...);
             $gene->add_Attributes($attrib);
Description: Adds an Attribute to the Gene. If you add an attribute before
             you retrieve any from database, lazy loading will be disabled.
Returntype : none
Exceptions : throw on incorrect arguments
Caller     : general
Status     : Stable

add_DBEntry

Arg [1]    : Bio::EnsEMBL::DBEntry $dbe
             The dbEntry to be added
Example    : my $dbe = Bio::EnsEMBL::DBEntry->new(...);
             $gene->add_DBEntry($dbe);
Description: Associates a DBEntry with this gene. Note that adding DBEntries
             will prevent future lazy-loading of DBEntries for this gene
             (see get_all_DBEntries).
Returntype : none
Exceptions : thrown on incorrect argument type
Caller     : general
Status     : Stable

get_all_DBEntries

Arg [1]    : (optional) String, external database name,
             SQL wildcard characters (_ and %) can be used to
             specify patterns.

Arg [2]    : (optional) String, external_db type, can be one of
             ('ARRAY','ALT_TRANS','ALT_GENE','MISC','LIT','PRIMARY_DB_SYNONYM','ENSEMBL'),
             SQL wildcard characters (_ and %) can be used to
             specify patterns.

Example    : my @dbentries = @{ $gene->get_all_DBEntries() };
             @dbentries = @{ $gene->get_all_DBEntries('Uniprot%') };
             @dbentries = @{ $gene->get_all_DBEntries('%', 'ENSEMBL') };

Description: Retrieves DBEntries (xrefs) for this gene.  This does
             *not* include DBEntries that are associated with the
             transcripts and corresponding translations of this
             gene (see get_all_DBLinks()).

             This method will attempt to lazy-load DBEntries
             from a database if an adaptor is available and no
             DBEntries are present on the gene (i.e. they have not
             already been added or loaded).

Return type: Listref of Bio::EnsEMBL::DBEntry objects
Exceptions : none
Caller     : get_all_DBLinks, GeneAdaptor::store
Status     : Stable
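
To illustrate how the SQL wildcards in Arg [1] select database names, here is a small stand-alone sketch. The helper like_to_regex() is hypothetical and not part of the API; the real method hands the pattern to the database, whose collation decides case sensitivity. The list of names is illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper: translate an SQL LIKE pattern into a Perl regex.
# '%' matches any run of characters, '_' matches exactly one character.
# Escaped wildcards are not handled in this sketch.
sub like_to_regex {
    my ($pattern) = @_;
    my $re = '';
    for my $char ( split //, $pattern ) {
        $re .= $char eq '%' ? '.*'
             : $char eq '_' ? '.'
             :                quotemeta($char);
    }
    return qr/\A$re\z/s;
}

# Illustrative external database names.
my @db_names = qw(Uniprot/SWISSPROT Uniprot/SPTREMBL HGNC EntrezGene);

my $re      = like_to_regex('Uniprot%');
my @matches = grep { $_ =~ $re } @db_names;
print join( ", ", @matches ), "\n";    # Uniprot/SWISSPROT, Uniprot/SPTREMBL
```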

get_all_object_xrefs

Arg [1]    : (optional) String, external database name

Arg [2]    : (optional) String, external_db type

Example    : @oxrefs = @{ $gene->get_all_object_xrefs() };

Description: Retrieves xrefs for this gene.  This does *not*
             include xrefs that are associated with the
             transcripts or corresponding translations of this
             gene (see get_all_xrefs()).

             This method will attempt to lazy-load xrefs from a
             database if an adaptor is available and no xrefs are
             present on the gene (i.e. they have not already been
             added or loaded).

              NB: This method is an alias for the
                  get_all_DBEntries() method.

Return type: Listref of Bio::EnsEMBL::DBEntry objects

Status     : Stable

get_all_DBLinks

Arg [1]    : String database name (optional)
             SQL wildcard characters (_ and %) can be used to
             specify patterns.

Arg [2]    : (optional) String, external database type, can be one of
             ('ARRAY','ALT_TRANS','ALT_GENE','MISC','LIT','PRIMARY_DB_SYNONYM','ENSEMBL'),
             SQL wildcard characters (_ and %) can be used to
             specify patterns.

Example    : @dblinks = @{ $gene->get_all_DBLinks() };
             @dblinks = @{ $gene->get_all_DBLinks('Uniprot%') };
             @dblinks = @{ $gene->get_all_DBLinks('%', 'ENSEMBL') };

Description: Retrieves *all* related DBEntries for this gene. This
             includes all DBEntries that are associated with the
             transcripts and corresponding translations of this
             gene.

             If you only want to retrieve the DBEntries
             associated with the gene (and not the transcript
             and translations) then you should use the
             get_all_DBEntries() call instead.

             Note: Each entry may be listed more than once.  No
             uniqueness checks are done.  Also if you put in an
             incorrect external database name no checks are done
             to see if this exists, you will just get an empty
             list.

Return type: Listref of Bio::EnsEMBL::DBEntry objects
Exceptions : none
Caller     : general
Status     : Stable
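
Because no uniqueness checks are done, callers may want to de-duplicate the returned list themselves. A minimal sketch, assuming that two entries count as the same when database name and primary ID both match; the hash refs below are hypothetical stand-ins for DBEntry objects and the IDs are placeholders.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for DBEntry objects: plain hash refs.
my @dblinks = (
    { dbname => 'HGNC',       primary_id => 'ID_A' },
    { dbname => 'EntrezGene', primary_id => 'ID_B' },
    { dbname => 'HGNC',       primary_id => 'ID_A' },    # duplicate
);

# Keep the first occurrence of each (dbname, primary_id) pair.
sub unique_links {
    my (@links) = @_;
    my %seen;
    return grep { !$seen{ join "\t", $_->{dbname}, $_->{primary_id} }++ } @links;
}

my @unique = unique_links(@dblinks);
printf "%d links, %d unique\n", scalar @dblinks, scalar @unique;    # 3 links, 2 unique
```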

get_all_xrefs

Arg [1]    : String database name (optional)
             SQL wildcard characters (_ and %) can be used to
             specify patterns.

Example    : @xrefs = @{ $gene->get_all_xrefs() };
             @xrefs = @{ $gene->get_all_xrefs('Uniprot%') };

Description: Retrieves *all* related xrefs for this gene.  This
             includes all xrefs that are associated with the
             transcripts and corresponding translations of this
             gene.

             If you want to retrieve the xrefs associated
             with only the gene (and not the transcript
             or translations) then you should use the
             get_all_object_xrefs() method instead.

             Note: Each entry may be listed more than once.  No
             uniqueness checks are done.  Also if you put in an
             incorrect external database name no checks are done
             to see if this exists, you will just get an empty
             list.

              NB: This method is an alias for the
                  get_all_DBLinks() method.

Return type: Listref of Bio::EnsEMBL::DBEntry objects

Status     : Stable

get_all_Exons

Example    : my @exons = @{ $gene->get_all_Exons };
Description: Returns a set of all the exons associated with this gene.
Returntype : Listref of Bio::EnsEMBL::Exon objects
Exceptions : none
Caller     : general
Status     : Stable

get_all_Introns

Arg [1]    : none
Example    : my @introns = @{$gene->get_all_Introns()};
Description: Returns a listref of the introns in this gene, in order,
             i.e. the first intron in the listref is the 5-prime-most
             intron in the gene.
Returntype : listref to Bio::EnsEMBL::Intron objects
Exceptions : none
Caller     : general
Status     : Stable

get_all_homologous_Genes

Arg [1]    : String The compara synonym to use when looking for a database in the
             registry. If not provided we will use the very first compara database
             we find.
Description: Queries the Ensembl Compara database and retrieves all
             Genes from other species that are orthologous.
             REQUIRES properly setup Registry conf file. Meaning that
             one of the aliases for each core db has to be "Genus species"
             e.g. "Homo sapiens" (as in the name column in genome_db table
             in the compara database).

             The data is cached in this Object for faster re-retrieval.
Returntype : listref [
                      Bio::EnsEMBL::Gene,
                      Bio::EnsEMBL::Compara::Homology,
                      string $species # needed as cannot get spp from Gene 
                     ]
Exceptions : none
Caller     : general
Status     : Stable

_clear_homologues

Description: Removes any cached homologues from the Gene which could have been
             fetched from the get_all_homologous_Genes() call.
Returntype : none
Exceptions : none
Caller     : general

add_Transcript

Arg [1]    : Bio::EnsEMBL::Transcript $trans
             The transcript to add to the gene
Example    : my $transcript = Bio::EnsEMBL::Transcript->new(...);
             $gene->add_Transcript($transcript);
Description: Adds another Transcript to the set of alternatively
             spliced Transcripts of this gene. If it shares exons 
             with another Transcript, these should be object-identical.
Returntype : none
Exceptions : none
Caller     : general
Status     : Stable

get_all_Transcripts

Example    : my @transcripts = @{ $gene->get_all_Transcripts };
Description: Returns the Transcripts in this gene.
Returntype : Listref of Bio::EnsEMBL::Transcript objects
Warning    : This method returns the internal transcript array
             used by this object. Avoid any modification
             of this array. Using shift on it, or reassigning
             the loop variable while iterating over it, counts
             as modification.

             Dereferencing the structure as shown in the example is
             a safe way of using this data structure.
Exceptions : none
Caller     : general
Status     : Stable
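
The warning above can be demonstrated without the API. ToyGene below is a hypothetical minimal class that, like the method described here, hands out a reference to its internal array; it is not the real Gene implementation.

```perl
use strict;
use warnings;

# Hypothetical minimal class returning a reference to its internal array.
package ToyGene;
sub new { my ( $class, @t ) = @_; return bless { _transcripts => [@t] }, $class }
sub get_all_Transcripts { return $_[0]->{_transcripts} }

package main;

my $gene = ToyGene->new(qw(T1 T2 T3));

# Safe: dereferencing into a new array copies the elements,
# so changes to @copy do not reach the gene.
my @copy = @{ $gene->get_all_Transcripts };
shift @copy;
my $after_copy = scalar @{ $gene->get_all_Transcripts };    # still 3

# Unsafe: shift on the returned reference changes the gene itself.
my $ref = $gene->get_all_Transcripts;
shift @{$ref};
my $after_shift = scalar @{ $gene->get_all_Transcripts };   # now 2

print "after copy: $after_copy, after shift on ref: $after_shift\n";
```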

get_all_alt_alleles

Example    : my @alt_genes = @{ $gene->get_all_alt_alleles };
             foreach my $alt_gene (@alt_genes) {
               print "Alternate allele: " . $alt_gene->stable_id() . "\n";
             }
Description: Returns a listref of Gene objects that represent this Gene on
             an alternative haplotype. Empty list if there is no such
             Gene (e.g. there is no overlapping haplotype).
Returntype : listref of Bio::EnsEMBL::Gene objects
Exceptions : none
Caller     : general
Status     : Stable

version

Arg [1]    : (optional) Int
             A version number for the stable_id
Example    : $gene->version(2);
Description: Getter/setter for version number
Returntype : Int
Exceptions : none
Caller     : general
Status     : Stable

stable_id

Arg [1]    : (optional) String - the stable ID to set
Example    : $gene->stable_id("ENSG0000000001");
Description: Getter/setter for stable id for this gene.
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable

stable_id_version

Arg [1]    : (optional) String - the stable ID with version to set
Example    : $gene->stable_id_version("ENSG0000000001.3");
Description: Getter/setter for stable id with version for this gene.
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable
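
How the versioned form relates to stable_id and version can be sketched as follows. This assumes that the version is whatever follows the last period in the string; split_stable_id_version() is a hypothetical helper, not an API method.

```perl
use strict;
use warnings;

# Hypothetical helper: split "ID.version" at the last period.
# Returns (id, undef) when no version part is present.
sub split_stable_id_version {
    my ($versioned) = @_;
    my $idx = rindex( $versioned, '.' );
    return ( $versioned, undef ) if $idx <= 0;
    return ( substr( $versioned, 0, $idx ), substr( $versioned, $idx + 1 ) );
}

my ( $id, $version ) = split_stable_id_version('ENSG0000000001.3');
print "$id version $version\n";    # ENSG0000000001 version 3
```

When identifiers may themselves contain periods, setting stable_id and version separately avoids the ambiguity.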

is_current

Arg [1]    : Boolean $is_current
Example    : $gene->is_current(1);
Description: Getter/setter for is_current state of this gene.
Returntype : Int
Exceptions : none
Caller     : general
Status     : Stable

created_date

Arg [1]    : (optional) String - created date to set (as a UNIX time int)
Example    : $gene->created_date('1141948800');
Description: Getter/setter for attribute created_date
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable
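
Since the date is held as a UNIX time, it can be turned into a readable date with core Perl. A short sketch using the value from the example above, formatted in UTC:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# UNIX time as used in the example above.
my $created = '1141948800';

# gmtime() interprets the value as seconds since the epoch, in UTC.
my $readable = strftime( '%Y-%m-%d', gmtime($created) );
print "$readable\n";    # 2006-03-10
```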

modified_date

Arg [1]    : (optional) String - modified date to set (as a UNIX time int)
Example    : $gene->modified_date('1141948800');
Description: Getter/setter for attribute modified_date
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable

transform

Arg [1]    : String - coordinate system name to transform to
Arg [2]    : String - coordinate system version
Example    : my $new_gene = $gene->transform('supercontig');
Description: Moves this gene to the given coordinate system. If this gene has
             Transcripts attached, they move as well.
Returntype : Bio::EnsEMBL::Gene
Exceptions : throw on wrong parameters
Caller     : general
Status     : Stable

transfer

Arg [1]    : Bio::EnsEMBL::Slice $destination_slice
Example    : my $new_gene = $gene->transfer($slice);
Description: Moves this Gene to given target slice coordinates. If Transcripts
             are attached they are moved as well. Returns a new gene.
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller     : general
Status     : Stable

display_xref

Arg [1]    : (optional) Bio::EnsEMBL::DBEntry - the display xref to set
Example    : $gene->display_xref($db_entry);
Description: Getter/setter for the display_xref of this gene.
Returntype : Bio::EnsEMBL::DBEntry
Exceptions : none
Caller     : general
Status     : Stable

display_id

Example    : print $gene->display_id();
Description: This method returns a string that is considered to be
             the 'display' identifier. For genes this is (depending on
             availability and in this order) the stable ID, the dbID or an
             empty string.
Returntype : String
Exceptions : none
Caller     : web drawing code
Status     : Stable
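
The fallback order described above can be sketched like this. It assumes that "availability" means the value is defined; the hash refs are hypothetical stand-ins for gene objects.

```perl
use strict;
use warnings;

# Hypothetical helper: stable ID first, then dbID, then an empty string.
sub display_id_for {
    my ($gene) = @_;
    return $gene->{stable_id} // $gene->{dbID} // '';
}

my $with_stable_id = display_id_for( { stable_id => 'ENSG0000000001', dbID => 42 } );
my $with_dbid_only = display_id_for( { dbID => 42 } );
my $with_neither   = display_id_for( {} );
print "[$with_stable_id] [$with_dbid_only] [$with_neither]\n";
```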

recalculate_coordinates

Example    : $gene->recalculate_coordinates;
Description: Called when a transcript is added to the gene; tries to adapt
             the coordinates of the gene.
Returntype : none
Exceptions : none
Caller     : internal
Status     : Stable
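
As a rough illustration of what adapting the coordinates can mean, the sketch below assumes the gene should span all of its transcripts, from the smallest start to the largest end. The helper span_of() and the hash refs are hypothetical; slice and strand handling are left out.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: the smallest start and largest end among transcripts.
# Returns an empty list when there are no transcripts.
sub span_of {
    my (@transcripts) = @_;
    return unless @transcripts;
    return (
        min( map { $_->{start} } @transcripts ),
        max( map { $_->{end} } @transcripts ),
    );
}

my @transcripts = (
    { start => 200, end => 900 },
    { start => 150, end => 700 },
    { start => 300, end => 1045 },
);

my ( $start, $end ) = span_of(@transcripts);
print "gene spans $start-$end\n";    # gene spans 150-1045
```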

get_all_DASFactories

Example    : $dasref = $gene->get_all_DASFactories;
Description: Retrieves a listref of registered DAS objects
            TODO: Abstract to a DBLinkContainer obj
Returntype : [ DAS_objects ]
Exceptions : none
Caller     : general
Status     : Stable

get_all_DAS_Features

Example    : $features = $gene->get_all_DAS_Features;
Description: Retrieves a hash reference to a hash of DAS feature
             sets, keyed by the DSN. NOTE: the values of this hash
             are an anonymous array containing:
              (1) a pointer to an array of features
              (2) a pointer to the DAS stylesheet
Returntype : hashref of Bio::SeqFeatures
Exceptions : none
Caller     : webcode
Status     : Stable

load

Arg [1]       : Boolean $load_xrefs
                Load (or don't load) xrefs.  Default is to load xrefs.
Example       : $gene->load();
Description   : The Ensembl API makes extensive use of
                lazy-loading.  Under some circumstances (e.g.,
                when copying genes between databases), all data of
                an object needs to be fully loaded.  This method
                loads the parts of the object that are usually
                lazy-loaded.  It will also call the equivalent
                method on all the transcripts of the gene.
Returns       : 

flush_Transcripts

Description : Empties out caches and unsets fields of this Gene.
              Beware of further actions without adding some new transcripts.
Example     : $gene->flush_Transcripts();

is_ref

Description: Getter/setter for the gene attribute is_ref
Arg [1]    : (optional) 1 or 0
Returntype : Boolean

summary_as_hash

Example       : $gene_summary = $gene->summary_as_hash();
Description   : Extends Feature::summary_as_hash
                Retrieves a summary of this Gene object.

Returns       : hashref of arrays of descriptive strings
Status        : Intended for internal use

havana_gene

Example       : $havana_gene = $gene->havana_gene();
Description   : Locates the corresponding havana gene
Returns       : Bio::EnsEMBL::DBEntry

get_Biotype

Example    : my $biotype = $gene->get_Biotype;
Description: Returns the Biotype object of this gene.
             When no biotype exists, defaults to 'protein_coding'.
             When used to set to a biotype that does not exist in
             the biotype table, a biotype object is created with
             the provided argument as name and object_type gene.
Returntype : Bio::EnsEMBL::Biotype
Exceptions : none

set_Biotype

Arg [1]    : String - the biotype name to set
Example    : my $biotype = $gene->set_Biotype('protein_coding');
Description: Sets the Biotype of this gene to the provided biotype name.
             Returns the Biotype object of this gene.
             When no biotype exists, defaults to 'protein_coding' name.
             When setting a biotype that does not exist in
             the biotype table, a biotype object is created with
             the provided argument as name and object_type gene.
Returntype : Bio::EnsEMBL::Biotype
Exceptions : Throws if no argument is provided