LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::Exon - A class representing an Exon

SYNOPSIS

  my $exon = Bio::EnsEMBL::Exon->new(
    -START     => 100,
    -END       => 200,
    -STRAND    => 1,
    -SLICE     => $slice,
    -DBID      => $dbID,
    -ANALYSIS  => $analysis,
    -STABLE_ID => 'ENSE000000123',
    -VERSION   => 2
  );

  # seq() returns a Bio::Seq
  my $seq = $exon->seq->seq();

  # Peptide only makes sense within transcript context
  my $pep = $exon->peptide($transcript)->seq();

  # Normal feature operations can be performed:
  $exon = $exon->transform('clone');
  $exon->move( $new_start, $new_end, $new_strand );
  print $exon->slice->seq_region_name();

DESCRIPTION

This class represents an exon that is part of a transcript. See Bio::EnsEMBL::Transcript.

METHODS

new

Arg [-SLICE]: Bio::EnsEMBL::Slice - Represents the sequence that this
              feature is on. The coordinates of the created feature are
              relative to the start of the slice.
Arg [-START]: The start coordinate of this feature relative to the start
              of the slice it is sitting on.  Coordinates start at 1 and
              are inclusive.
Arg [-END]  : The end coordinate of this feature relative to the start of
              the slice it is sitting on.  Coordinates start at 1 and are
              inclusive.
Arg [-STRAND]: The orientation of this feature.  Valid values are 1,-1,0.
Arg [-SEQNAME] : (optional) A seqname to be used instead of the default name
              of the slice.  Useful for features that do not have an
              attached slice such as protein features.
Arg [-dbID]   : (optional) internal database id
Arg [-ADAPTOR]: (optional) Bio::EnsEMBL::DBSQL::BaseAdaptor
Arg [-PHASE]    : the phase. 
Arg [-END_PHASE]: the end phase
Arg [-STABLE_ID]: (optional) the stable id of the exon
Arg [-VERSION]  : (optional) the version
Arg [-CREATED_DATE] : (optional) the created date
Arg [-MODIFIED_DATE]: (optional) the last modified date

Example    : none
Description: create an Exon object
Returntype : Bio::EnsEMBL::Exon
Exceptions : thrown if phase is not valid (i.e. not 0, 1, 2 or -1)
Caller     : general
Status     : Stable

end_phase

Arg [1]    : (optional) int $end_phase
Example    : $end_phase = $feat->end_phase;
Description: Gets/Sets the end phase of the exon.
             end_phase = number of bases from the last incomplete codon of 
             this exon.
             Usually, end_phase = (phase + exon_length)%3
             but end_phase could be -1 if the exon is half-coding and its 3 
             prime end is UTR.
Returntype : int
Exceptions : warning if end_phase is called without an argument and the
             value is not set.
Caller     : general
Status     : Stable
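
The usual relationship given in the description can be checked with a little arithmetic. The following is a minimal sketch, not the Ensembl implementation: expected_end_phase is a hypothetical helper, and it assumes the exon is coding along its whole length, so the -1 case is deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::EnsEMBL::Exon.
# Assumes the exon is coding along its whole length, so the
# -1 (3' end is UTR) case described above is not handled here.
sub expected_end_phase {
    my ($phase, $exon_length) = @_;
    return ($phase + $exon_length) % 3;
}

# An exon of 101 bases (for example start 100, end 200) in phase 0
# leaves two bases of an incomplete codon at its 3' end.
print expected_end_phase(0, 101), "\n";    # prints 2
```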

phase

Arg [1]    : (optional) int $phase
Example    :  my $phase = $exon->phase;
              $exon->phase(2);
Description: Gets/Sets the phase of the exon.
Returntype : int
Exceptions : throws if phase is not 0, 1, 2 or -1.
Caller     : general
Status     : Stable

Gets or sets the phase of the exon. The phase tells the translation machinery, which makes a peptide from the DNA, where to start.

The Ensembl phase convention can be thought of as "the number of bases of the first codon which are on the previous exon". It is therefore 0, 1 or 2 (or -1 if the exon is non-coding). In ASCII art, with alternate codons represented by ### and +++:

   Previous Exon   Intron   This Exon
...-------------            -------------...

5'                    Phase                3'
...#+++###+++###          0 +++###+++###+...
...+++###+++###+          1 ++###+++###++...
...++###+++###++          2 +###+++###+++...

Here is another explanation from Ewan:

Phase means the place where the intron lands inside the codon: 0 between codons, 1 between the first and second base, 2 between the second and third base. Exons therefore have a start phase and an end phase, but introns have just one phase.
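
One way to read the diagram above: if the phase counts the bases of the split codon that lie on the previous exon, the remaining bases of that codon open this exon. The sketch below works that out in plain Perl. The helper name bases_completing_split_codon is made up for illustration and is not part of the Ensembl API.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Ensembl API.
# Under the convention above, `phase` bases of the first codon sit on the
# previous exon, so the remaining bases of that codon open this exon.
sub bases_completing_split_codon {
    my ($phase) = @_;
    die "phase must be 0, 1, 2 or -1\n"
        unless defined $phase && $phase =~ /\A(?:0|1|2|-1)\z/;
    return undef if $phase == -1;    # non-coding exon: no codon to complete
    return (3 - $phase) % 3;
}

for my $phase (0, 1, 2) {
    printf "phase %d: %d base(s) complete the split codon\n",
        $phase, bases_completing_split_codon($phase);
}
```

For phases 0, 1 and 2 this gives 0, 2 and 1 bases, matching the leading +++ fragments in the "This Exon" column of the diagram.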

frame

Arg [1]    : none
Example    : $frame = $exon->frame;
Description: Gets the frame of this exon
Returntype : int
Exceptions : thrown if an arg is passed
             thrown if frame cannot be calculated due to a bad phase value
Caller     : general
Status     : Stable

start

Arg [1]    : int $start (optional)
Example    : $start = $exon->start();
Description: Getter/Setter for the start of this exon.  The superclass
             implementation is overridden to flush the internal sequence
             cache if this value is altered.
Returntype : int
Exceptions : none
Caller     : general
Status     : Stable

end

Arg [1]    : int $end (optional)
Example    : $end = $exon->end();
Description: Getter/Setter for the end of this exon.  The superclass
             implementation is overridden to flush the internal sequence
             cache if this value is altered.
Returntype : int
Exceptions : none
Caller     : general
Status     : Stable

strand

Arg [1]    : int $strand (optional)
Example    : $strand = $exon->strand();
Description: Getter/Setter for the strand of this exon.  The superclass
             implementation is overridden to flush the internal sequence
             cache if this value is altered.
Returntype : int
Exceptions : none
Caller     : general
Status     : Stable

cdna_start

Arg [1]     : Bio::EnsEMBL::Transcript $transcript
              The transcript that the cDNA coordinates should be
              relative to.
Example     : $cdna_start = $exon->cdna_start($transcript);
Description : Returns the start position of the exon in cDNA
              coordinates.
              Since an exon may be part of one or more transcripts,
              the relevant transcript must be given as argument to
              this method.
Return type : Integer
Exceptions  : Throws if the given argument is not a transcript.
              Throws if the first part of the exon maps into a gap.
              Throws if the exon can not be mapped at all.
Caller      : General
Status      : Stable

cdna_end

Arg [1]     : Bio::EnsEMBL::Transcript $transcript
              The transcript that the cDNA coordinates should be
              relative to.
Example     : $cdna_end = $exon->cdna_end($transcript);
Description : Returns the end position of the exon in cDNA
              coordinates.
              Since an exon may be part of one or more transcripts,
              the relevant transcript must be given as argument to
              this method.
Return type : Integer
Exceptions  : Throws if the given argument is not a transcript.
              Throws if the last part of the exon maps into a gap.
              Throws if the exon can not be mapped at all.
Caller      : General
Status      : Stable

cdna_coding_start

Arg [1]     : Bio::EnsEMBL::Transcript $transcript
              The transcript that the cDNA coordinates should be
              relative to.
Example     : $cdna_coding_start = $exon->cdna_coding_start($transcript);
Description : Returns the start position of the coding region of the
              exon in cDNA coordinates.  Returns undef if the whole
              exon is non-coding.
              Since an exon may be part of one or more transcripts,
              the relevant transcript must be given as argument to
              this method.
Return type : Integer or undef
Exceptions  : Throws if the given argument is not a transcript.
Caller      : General
Status      : Stable

cdna_coding_end

Arg [1]     : Bio::EnsEMBL::Transcript $transcript
              The transcript that the cDNA coordinates should be
              relative to.
Example     : $cdna_coding_end = $exon->cdna_coding_end($transcript);
Description : Returns the end position of the coding region of the
              exon in cDNA coordinates.  Returns undef if the whole
              exon is non-coding.
              Since an exon may be part of one or more transcripts,
              the relevant transcript must be given as argument to
              this method.
Return type : Integer or undef
Exceptions  : Throws if the given argument is not a transcript.
Caller      : General
Status      : Stable

coding_region_start

Arg [1]     : Bio::EnsEMBL::Transcript $transcript
Example     : $coding_region_start =
                $exon->coding_region_start($transcript);
Description : Returns the start position of the coding region
              of the exon in slice-relative coordinates on the
              forward strand.  Returns undef if the whole exon is
              non-coding.
              Since an exon may be part of one or more transcripts,
              the relevant transcript must be given as argument to
              this method.
Return type : Integer or undef
Exceptions  : Throws if the given argument is not a transcript.
Caller      : General
Status      : Stable

coding_region_end

Arg [1]     : Bio::EnsEMBL::Transcript $transcript
Example     : $coding_region_end =
                $exon->coding_region_end($transcript);
Description : Returns the end position of the coding region of
              the exon in slice-relative coordinates on the
              forward strand.  Returns undef if the whole exon is
              non-coding.
              Since an exon may be part of one or more transcripts,
              the relevant transcript must be given as argument to
              this method.
Return type : Integer or undef
Exceptions  : Throws if the given argument is not a transcript.
Caller      : General
Status      : Stable

rank

Arg [1]     : Bio::EnsEMBL::Transcript $transcript
              The transcript for which the exon rank
              is requested.
Example     : $rank = $exon->rank($transcript);
Description : Returns the rank of the exon relative to
              the transcript.
              Since an exon may be part of one or more transcripts,
              the relevant transcript must be given as argument to
              this method.
Return type : Integer
Exceptions  : Throws if the given argument is not a transcript.
              Throws if the exon does not belong to the transcript.
Caller      : General
Status      : Stable

slice

Arg [1]    : Bio::EnsEMBL::Slice
Example    : $slice = $exon->slice();
Description: Getter/Setter for the slice this exon is on.  The superclass
             implementation is overridden to flush the internal sequence
             cache if this value is altered.
Returntype : Bio::EnsEMBL::Slice
Exceptions : none
Caller     : general
Status     : Stable

equals

Arg [1]       : Bio::EnsEMBL::Exon exon
Example       : if ($exonA->equals($exonB)) { ... }
Description   : Compares two exons for equality.
                The test for equality goes through the following list
                and terminates at the first true match:

                1. If Bio::EnsEMBL::Feature::equals() returns false,
                   then the exons are *not* equal.
                2. If both exons have stable IDs: if these are the
                   same, the exons are equal, otherwise not.
                3. If the exons have the same start, end, strand, and
                   phase, then they are equal, otherwise not.

Return type   : Boolean (0, 1)

Exceptions    : Thrown if a non-exon is passed as the argument.
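
The three-step test can be sketched in plain Perl. This is an illustration of the order of the checks only, not the real method: exons are modelled as hashrefs, exons_equal is a hypothetical name, and the seq_region comparison is merely a stand-in for Bio::EnsEMBL::Feature::equals(), whose actual checks are not reproduced here.

```perl
use strict;
use warnings;

# Exons are modelled as plain hashrefs. The `seq_region` comparison is an
# assumed stand-in for step 1, Bio::EnsEMBL::Feature::equals().
sub exons_equal {
    my ($x, $y) = @_;

    # Step 1 (stand-in): a failed feature-level comparison means "not equal".
    return 0 unless $x->{seq_region} eq $y->{seq_region};

    # Step 2: if both exons have stable IDs, those alone decide.
    if (defined $x->{stable_id} && defined $y->{stable_id}) {
        return $x->{stable_id} eq $y->{stable_id} ? 1 : 0;
    }

    # Step 3: otherwise compare start, end, strand and phase.
    for my $field (qw(start end strand phase)) {
        return 0 unless $x->{$field} == $y->{$field};
    }
    return 1;
}

my %base = (seq_region => '1', start => 100, end => 200, strand => 1, phase => 0);
my $with_id  = { %base, stable_id => 'ENSE000000123' };
my $other_id = { %base, stable_id => 'ENSE000000456' };
my $no_id    = { %base };

print exons_equal($with_id, $other_id), "\n";    # 0: stable IDs differ
print exons_equal($with_id, $no_id),    "\n";    # 1: decided on coordinates
```

Note that in this reading, two exons with identical coordinates but different stable IDs are not equal, because step 2 returns before the coordinates are ever compared.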

move

Arg [1]    : int start
Arg [2]    : int end
Arg [3]    : (optional) int strand
Example    : None
Description: Sets the start, end and strand in one call rather than in
             3 separate calls to the start(), end() and strand() methods.
             This is for convenience and for speed when this needs to be
             done within a tight loop.  This overrides the superclass
             move() method so that the internal sequence cache can be
             flushed if the exon is moved.
Returntype : none
Exceptions : Thrown if invalid arguments are provided
Caller     : general
Status     : Stable

transform

Arg  1     : String $coordinate_system_name
Arg [2]    : String $coordinate_system_version
Description: Moves this exon to the given coordinate system. If this exon has
             attached supporting evidence, it is moved as well.
Returntype : Bio::EnsEMBL::Exon
Exceptions : wrong parameters
Caller     : general
Status     : Stable

transfer

Arg [1]    : Bio::EnsEMBL::Slice $destination_slice
Example    : none
Description: Moves this Exon to the given target slice coordinates. If
             Features are attached, they are moved as well. Returns a new exon.
Returntype : Bio::EnsEMBL::Exon
Exceptions : none
Caller     : general
Status     : Stable

add_supporting_features

Arg [1]    : Bio::EnsEMBL::Feature $feature
Example    : $exon->add_supporting_features(@features);
Description: Adds a list of supporting features to this exon.
             Duplicate features are not added.
             If supporting features are added manually in this way
             before get_all_supporting_features is called, that call
             will not retrieve supporting features from the database.
Returntype : none
Exceptions : throws if any of the features is not a Feature
             throws if any of the features is not in the same coordinate
             system as the exon
Caller     : general
Status     : Stable

flush_supporting_features

Example     : $exon->flush_supporting_features;
Description : Removes all supporting evidence from the exon.
Return type : (Empty) listref
Exceptions  : none
Caller      : general
Status      : Stable

get_all_supporting_features

Arg [1]    : none
Example    : @evidence = @{$exon->get_all_supporting_features()};
Description: Retrieves any supporting features added manually by
             calls to add_supporting_features. If no features have been
             added manually and this exon is in a database (i.e. it has
             an adaptor), they are fetched from the database.
Returntype : list reference of Bio::EnsEMBL::BaseAlignFeature objects
Exceptions : none
Caller     : general
Status     : Stable

find_supporting_evidence

# This method is only for genebuild backwards compatibility.
# Avoid using it if possible

Arg [1]    : listref of Bio::EnsEMBL::Feature $features
             The list of features to search for supporting (i.e. overlapping)
             evidence.
Arg [2]    : (optional) boolean $sorted
             Used to speed up the calculation of overlapping features.  
             Should be set to true if the list of features is sorted in 
             ascending order on their start coordinates.
Example    : $exon->find_supporting_evidence(\@features);
Description: Looks through all the similarity features and
             stores as supporting features any feature
             that overlaps with an exon.  
Returntype : none
Exceptions : none
Caller     : general
Status     : Medium Risk

stable_id

Arg [1]    : string $stable_id
Example    : none
Description: get/set for attribute stable_id
Returntype : string
Exceptions : none
Caller     : general
Status     : Stable

created_date

Arg [1]    : string $created_date
Example    : none
Description: get/set for attribute created_date
Returntype : string
Exceptions : none
Caller     : general
Status     : Stable

modified_date

Arg [1]    : string $modified_date
Example    : none
Description: get/set for attribute modified_date
Returntype : string
Exceptions : none
Caller     : general
Status     : Stable

version

Arg [1]    : string $version
Example    : none
Description: get/set for attribute version
Returntype : string
Exceptions : none
Caller     : general
Status     : Stable

stable_id_version

Arg [1]    : (optional) String - the stable ID with version to set
Example    : $exon->stable_id_version("ENSE0000000001.3");
Description: Getter/setter for stable id with version for this exon.
Returntype : String
Exceptions : none
Caller     : general
Status     : Stable
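
The combined form in the example appears to be a stable ID, a dot, and an integer version. A minimal sketch of taking such a string apart is shown below. The helper split_stable_id_version is hypothetical, and how the real method parses its argument is an assumption that is not confirmed by this documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: split "ID.version" at the final dot.
# Returns (stable_id, version); version is undef when no numeric
# suffix is present.
sub split_stable_id_version {
    my ($string) = @_;
    if ($string =~ /\A(.+)\.(\d+)\z/) {
        return ($1, $2);
    }
    return ($string, undef);
}

my ($id, $version) = split_stable_id_version('ENSE0000000001.3');
print "$id version $version\n";    # prints "ENSE0000000001 version 3"
```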

is_current

Arg [1]    : Boolean $is_current
Example    : $exon->is_current(1)
Description: Getter/setter for is_current state of this exon.
Returntype : Int
Exceptions : none
Caller     : general
Status     : Stable

is_constitutive

Arg [1]    : Boolean $is_constitutive
Example    : $exon->is_constitutive(0)
Description: Getter/setter for is_constitutive state of this exon.
Returntype : Int
Exceptions : none
Caller     : general
Status     : Stable

is_coding

Arg [1]    : Bio::EnsEMBL::Transcript
Example    : $exon->is_coding($transcript)
Description: Says if the exon is within the translation or not
Returntype : Int
Exceptions : none
Caller     : general
Status     : Stable

adjust_start_end

Arg  1     : int $start_adjustment
Arg  2     : int $end_adjustment
Example    : none
Description: Returns a new Exon whose start and end are shifted by the
             given adjustments.
Returntype : Bio::EnsEMBL::Exon
Exceptions : none
Caller     : Transcript->get_all_translateable_Exons()
Status     : Stable

peptide

Arg [1]    : Bio::EnsEMBL::Transcript $tr
Example    : my $pep_str = $exon->peptide($transcript)->seq; 
Description: Retrieves the portion of the transcript's peptide
             encoded by this exon.  The transcript argument is necessary
             because outside of the context of a transcript it is not
             possible to correctly determine the translation.  Note that
             an entire amino acid will be present at the exon boundaries
             even if only a partial codon is present.  Therefore the
             concatenation of all of the peptides of a transcript's exons
             is not the same as the transcript's translation, because the
             concatenation may contain duplicated amino acids at splice
             sites.  If this exon is entirely UTR, a Bio::Seq object
             with an empty sequence string is returned.
Returntype : Bio::Seq
Exceptions : thrown if transcript argument is not provided
Caller     : general
Status     : Stable

_merge_ajoining_coords

Arg [1]     : ArrayRef of Bio::EnsEMBL::Mapper::Coordinate objects
Example     : 
Description : Merges coords which are adjoining or overlapping
Returntype  : Bio::EnsEMBL::Mapper::Coordinate or undef if it cannot happen
Exceptions  : Exception if the coords cannot be condensed into one location
Caller      : internal
Status      : Development
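
The idea of condensing adjoining or overlapping coordinates into one location can be sketched as follows. This is not the internal implementation: merge_adjoining is a hypothetical function, coordinates are plain [start, end] pairs rather than Bio::EnsEMBL::Mapper::Coordinate objects, and "adjoining" is assumed to mean that the next start is at most one past the current end.

```perl
use strict;
use warnings;

# Hypothetical stand-in: coordinates are [start, end] pairs, not
# Bio::EnsEMBL::Mapper::Coordinate objects.
# Returns one merged [start, end] pair, undef for an empty list,
# and dies if a gap prevents merging into a single location.
sub merge_adjoining {
    my @coords = sort { $a->[0] <=> $b->[0] } @_;
    return undef unless @coords;

    my ($start, $end) = @{ shift @coords };
    for my $c (@coords) {
        # Assumed rule: adjoining means the next start is at most
        # one past the current end.
        die "coords cannot be condensed into one location\n"
            if $c->[0] > $end + 1;
        $end = $c->[1] if $c->[1] > $end;
    }
    return [ $start, $end ];
}

my $merged = merge_adjoining([ 1, 10 ], [ 11, 20 ], [ 15, 30 ]);
print "$merged->[0]-$merged->[1]\n";    # prints "1-30"
```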

seq

Arg [1]    : none
Example    : my $seq_str = $exon->seq->seq;
Description: Retrieves the dna sequence of this Exon.
             Returned in a Bio::Seq object.  Note that the sequence may
             include UTRs (or even be entirely UTR).
Returntype : Bio::Seq or undef
Exceptions : warning if argument passed,
             warning if exon does not have attached slice,
             warning if exon strand is not defined (or 0)
Caller     : general
Status     : Stable

hashkey

Arg [1]    : none
Example    : if(exists $hash{$exon->hashkey}) { do_something(); }
Description: Returns a unique hashkey that can be used to uniquely identify
             this exon.  Exons are considered to be identical if they share
             the same seq_region, start, end, strand, phase, end_phase.
             Note that this will consider two exons on different slices
             to be different, even if they actually are not. 
Returntype : string formatted as slice_name-start-end-strand-phase-end_phase
Exceptions : thrown if not all the necessary attributes needed to generate
             a unique hash value are set
Caller     : general
Status     : Stable
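
The documented key layout, slice_name-start-end-strand-phase-end_phase, can be illustrated with a small sketch. The function exon_hashkey and the hashref it reads are assumptions made for this example, and 'example_slice' is a placeholder rather than a real slice name.

```perl
use strict;
use warnings;

# Hypothetical helper building a key in the documented layout from a
# plain hashref; it dies when a needed attribute is missing.
sub exon_hashkey {
    my ($exon) = @_;
    my @fields = qw(slice_name start end strand phase end_phase);
    for my $field (@fields) {
        die "attribute '$field' must be set to build a hashkey\n"
            unless defined $exon->{$field};
    }
    return join '-', @{$exon}{@fields};
}

my $exon = {
    slice_name => 'example_slice',    # placeholder, not a real slice name
    start      => 100,
    end        => 200,
    strand     => 1,
    phase      => 0,
    end_phase  => 2,
};

my %seen;
$seen{ exon_hashkey($exon) } = 1;
print join(', ', keys %seen), "\n";    # prints "example_slice-100-200-1-0-2"
```

Because the slice name is part of the key, the same genomic exon described on two different slices produces two different keys, as the description warns.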

display_id

Arg [1]    : none
Example    : print $exon->display_id();
Description: This method returns a string that is considered to be
             the 'display' identifier. For exons this is (depending on
             availability and in this order) the stable ID, the dbID or an
             empty string.
Returntype : string
Exceptions : none
Caller     : web drawing code
Status     : Stable
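
The fallback order described above (stable ID, then dbID, then an empty string) maps naturally onto Perl's defined-or operator. The sketch below assumes a plain hashref with stable_id and dbID keys; display_id_for is a hypothetical name and this is not the actual method body.

```perl
use strict;
use warnings;

# Hypothetical helper showing the documented fallback order.
# Requires Perl 5.10 or later for the defined-or operator (//).
sub display_id_for {
    my ($exon) = @_;
    return $exon->{stable_id} // $exon->{dbID} // '';
}

print display_id_for({ stable_id => 'ENSE000000123', dbID => 42 }), "\n";
print display_id_for({ dbID => 42 }), "\n";
print "[", display_id_for({}), "]\n";
```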

load

Args          : None
Example       : $exon->load();
Description   : The Ensembl API makes extensive use of
                lazy-loading.  Under some circumstances (e.g.,
                when copying genes between databases), all data of
                an object needs to be fully loaded.  This method
                loads the parts of the object that are usually
                lazy-loaded.
Returns       : Nothing.

summary_as_hash

Example       : $exon_summary = $exon->summary_as_hash();
Description   : Extends Feature::summary_as_hash
                Retrieves a summary of this Exon.
Returns       : hashref of descriptive strings
Status        : Intended for internal use