LICENSE

Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
Copyright [2016-2024] EMBL-European Bioinformatics Institute

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

CONTACT

Please email comments or questions to the public Ensembl
developers list at <http://lists.ensembl.org/mailman/listinfo/dev>.

Questions may also be sent to the Ensembl help desk at
<http://www.ensembl.org/Help/Contact>.

NAME

Bio::EnsEMBL::DBSQL::GeneAdaptor - Database adaptor for the retrieval and storage of Gene objects

SYNOPSIS

use Bio::EnsEMBL::Registry;

Bio::EnsEMBL::Registry->load_registry_from_db(
  -host => 'ensembldb.ensembl.org',
  -user => 'anonymous',
);

$gene_adaptor =
  Bio::EnsEMBL::Registry->get_adaptor( "human", "core", "gene" );

$gene = $gene_adaptor->fetch_by_dbID(1234);

$gene = $gene_adaptor->fetch_by_stable_id('ENSG00000184129');

@genes = @{ $gene_adaptor->fetch_all_by_external_name('BRCA2') };

$slice_adaptor =
  Bio::EnsEMBL::Registry->get_adaptor( "human", "core", "slice" );

$slice =
  $slice_adaptor->fetch_by_region( 'chromosome', '1', 1, 1000000 );

@genes = @{ $gene_adaptor->fetch_all_by_Slice($slice) };

DESCRIPTION

This is a database-aware adaptor for the retrieval and storage of Gene objects.

METHODS

list_dbIDs

Example    : @gene_ids = @{$gene_adaptor->list_dbIDs()};
Description: Gets an array of internal ids for all genes in the current db
Arg [1]    : (optional) int. If non-zero, the ids are sorted by seq_region.
Returntype : Listref of Ints
Exceptions : none
Caller     : general
Status     : Stable
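
A minimal sketch of walking over every gene by combining list_dbIDs with fetch_by_dbID from the synopsis. The helper name for_each_gene and its callback argument are illustrative and not part of the adaptor; the adaptor is passed in so that the snippet does not assume any particular registry setup.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback once per gene in the database.
# $gene_adaptor is expected to be a Bio::EnsEMBL::DBSQL::GeneAdaptor.
sub for_each_gene {
    my ( $gene_adaptor, $callback ) = @_;

    # A non-zero argument asks for the ids sorted by seq_region.
    my $ids = $gene_adaptor->list_dbIDs(1);

    my $seen = 0;
    foreach my $dbID ( @{$ids} ) {
        my $gene = $gene_adaptor->fetch_by_dbID($dbID);
        next unless defined $gene;    # skip ids that do not resolve
        $callback->($gene);
        $seen++;
    }
    return $seen;                     # number of genes handed to the callback
}
```

For example: for_each_gene( $gene_adaptor, sub { print $_[0]->stable_id(), "\n" } );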

list_stable_ids

Example    : @stable_gene_ids = @{$gene_adaptor->list_stable_ids()};
Description: Gets a listref of stable ids for all genes in the current db
Returntype : reference to a list of strings
Exceptions : none
Caller     : general
Status     : Stable

fetch_by_display_label

Arg [1]    : String $label - display label of gene to fetch
Example    : my $gene = $geneAdaptor->fetch_by_display_label("BRCA2");
Description: Returns the gene which has the given display label, or undef if
             there is none. If there is more than one, the gene on the
             reference slice is returned; if none is on the reference,
             the first one is returned.
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller     : general
Status     : Stable

fetch_all_by_display_label

Arg [1]    : String $label - display label of genes to fetch
Example    : my @genes = @{$geneAdaptor->fetch_all_by_display_label("PPP1R2P1")};
Description: Returns all genes which have the given display label or undef if
             there are none. 
Returntype : listref of Bio::EnsEMBL::Gene objects
Exceptions : none
Caller     : general
Status     : Stable

fetch_by_stable_id

Arg [1]    : String $id 
             The stable ID of the gene to retrieve
Example    : $gene = $gene_adaptor->fetch_by_stable_id('ENSG00000148944');
Description: Retrieves a gene object from the database via its stable id.
             The gene will be retrieved in its native coordinate system (i.e.
             in the coordinate system it is stored in the database). It may
             be converted to a different coordinate system through a call to
             transform() or transfer(). If the gene or exon is not found
             undef is returned instead.
Returntype : Bio::EnsEMBL::Gene or undef
Exceptions : if we can't get the gene in the given coord system
Caller     : general
Status     : Stable
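
The undef handling and the coordinate-system conversion described above can be sketched as follows. The helper name fetch_gene_on_coord_system is hypothetical, and the sketch assumes that transform() accepts a coordinate system name and returns undef when the gene cannot be placed on it.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch a gene by stable id and return it on the
# requested coordinate system, or undef if either step fails.
sub fetch_gene_on_coord_system {
    my ( $gene_adaptor, $stable_id, $cs_name ) = @_;

    my $gene = $gene_adaptor->fetch_by_stable_id($stable_id);
    return undef unless defined $gene;    # stable id not found

    # Assumption: transform() yields undef when the gene cannot be
    # expressed in the target coordinate system.
    return $gene->transform($cs_name);
}
```

A caller would typically test the result before use, e.g. my $gene = fetch_gene_on_coord_system( $gene_adaptor, 'ENSG00000148944', 'chromosome' ) or warn "gene not available\n";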

fetch_by_stable_id_version

Arg [1]    : String $id 
             The stable ID of the gene to retrieve
Arg [2]    : Integer $version
             The version of the stable_id to retrieve
Example    : $gene = $gene_adaptor->fetch_by_stable_id_version('ENSG00000148944', 14);
Description: Retrieves a gene object from the database via its stable id and version.
             The gene will be retrieved in its native coordinate system (i.e.
             in the coordinate system it is stored in the database). It may
             be converted to a different coordinate system through a call to
             transform() or transfer(). If the gene or exon is not found
             undef is returned instead.
Returntype : Bio::EnsEMBL::Gene or undef
Exceptions : if we can't get the gene in the given coord system
Caller     : general
Status     : Stable

fetch_all_by_source

Arg [1]    : String $source
             listref of $sources
             The source of the gene to retrieve. You can have as an argument a reference
             to a list of sources
Example    : $genes = $gene_adaptor->fetch_all_by_source('havana'); 
             $genes = $gene_adaptor->fetch_all_by_source(['ensembl', 'vega']);
Description: Retrieves an array reference of gene objects from the database via its source or sources.
             The gene will be retrieved in its native coordinate system (i.e.
             in the coordinate system it is stored in the database). It may
             be converted to a different coordinate system through a call to
             transform() or transfer(). If the gene or exon is not found
             undef is returned instead.
Returntype : listref of Bio::EnsEMBL::Gene
Exceptions : if we can't get the gene in the given coord system
Caller     : general
Status     : Stable

source_constraint

Arg [1]    : String $source
             listref of $sources
             The source of the gene to retrieve. You can have as an argument a reference
             to a list of sources
Description: Used internally to generate a SQL constraint to restrict a gene query by source
Returntype : String
Exceptions : If source is not supplied
Caller     : general
Status     : Stable
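
To make the "string or listref" argument convention concrete, here is an illustrative sketch of how such a constraint could be assembled. It is not the adaptor's code: the function name, the column name g.source, and the naive quoting are all assumptions made for the example.

```perl
use strict;
use warnings;

# Illustrative only: build a WHERE-clause fragment from one source or a
# listref of sources. The column name "g.source" is an assumption.
sub sketch_source_constraint {
    my ($sources) = @_;
    die "source must be supplied\n" unless defined $sources;

    # Accept either a plain string or a reference to a list of strings.
    my @list = ref($sources) eq 'ARRAY' ? @{$sources} : ($sources);
    die "source must be supplied\n" unless @list;

    # Naive quoting for the sketch; production code should rely on the
    # database handle's own quote() method instead.
    my @quoted = map { my $s = $_; $s =~ s/'/''/g; "'$s'" } @list;

    return @quoted == 1
      ? "g.source = $quoted[0]"
      : 'g.source IN (' . join( ', ', @quoted ) . ')';
}
```

The same pattern would apply to a biotype constraint, with the column name swapped accordingly.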

count_all_by_source

Arg [1]     : String $source
              listref of $source
              The source of the gene to retrieve. You can have as an argument a reference
              to a list of sources
Example     : $cnt = $gene_adaptor->count_all_by_source('ensembl'); 
              $cnt = $gene_adaptor->count_all_by_source(['havana', 'vega']);
Description : Retrieves count of gene objects from the database via its source or sources.
Returntype  : integer
Caller      : general
Status      : Stable

fetch_all_by_biotype

Arg [1]    : String $biotype 
             listref of $biotypes
             The biotype of the gene to retrieve. You can have as an argument a reference
             to a list of biotypes
Example    : $genes = $gene_adaptor->fetch_all_by_biotype('protein_coding'); 
             $genes = $gene_adaptor->fetch_all_by_biotype(['protein_coding', 'sRNA', 'miRNA']);
Description: Retrieves an array reference of gene objects from the database via their biotype or biotypes.
             The genes will be retrieved in their native coordinate system (i.e.
             in the coordinate system it is stored in the database). It may
             be converted to a different coordinate system through a call to
             transform() or transfer(). If the gene or exon is not found
             undef is returned instead.
Returntype : listref of Bio::EnsEMBL::Gene
Exceptions : if we can't get the gene in the given coord system
Caller     : general
Status     : Stable

biotype_constraint

Arg [1]    : String $biotypes 
             listref of $biotypes
             The biotype of the gene to retrieve. You can have as an argument a reference
             to a list of biotypes
Description: Used internally to generate a SQL constraint to restrict a gene query by biotype
Returntype : String
Exceptions : If biotype is not supplied
Caller     : general
Status     : Stable

count_all_by_biotype

Arg [1]     : String $biotype 
              listref of $biotypes
              The biotype of the gene to retrieve. You can have as an argument a reference
              to a list of biotypes
Example     : $cnt = $gene_adaptor->count_all_by_biotype('protein_coding'); 
              $cnt = $gene_adaptor->count_all_by_biotype(['protein_coding', 'sRNA', 'miRNA']);
Description : Retrieves count of gene objects from the database via its biotype or biotypes.
Returntype  : integer
Caller      : general
Status      : Stable
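
As a small usage sketch, the counts for several biotypes can be collected into a hash by calling count_all_by_biotype once per biotype. The helper name count_genes_per_biotype is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return a hashref mapping each requested biotype
# to the number of genes with that biotype.
sub count_genes_per_biotype {
    my ( $gene_adaptor, @biotypes ) = @_;

    my %count;
    foreach my $biotype (@biotypes) {
        # One query per biotype, so each is reported separately rather
        # than as a single combined total.
        $count{$biotype} = $gene_adaptor->count_all_by_biotype($biotype);
    }
    return \%count;
}
```

Passing a listref to count_all_by_biotype instead would give one combined count for all listed biotypes.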

fetch_all_versions_by_stable_id

Arg [1]     : String $stable_id 
              The stable ID of the gene to retrieve
Example     : $gene = $gene_adaptor->fetch_all_versions_by_stable_id
                ('ENSG00000148944');
Description : Similar to fetch_by_stable_id, but retrieves all versions of a
              gene stored in the database.
Returntype  : listref of Bio::EnsEMBL::Gene
Exceptions  : if we can't get the gene in the given coord system
Caller      : general
Status      : At Risk

fetch_by_exon_stable_id

Arg [1]    : String $id
             The stable id of an exon of the gene to retrieve
Example    : $gene = $gene_adaptor->fetch_by_exon_stable_id('ENSE00000148944');
Description: Retrieves a gene object from the database via an exon stable id.
             The gene will be retrieved in its native coordinate system (i.e.
             in the coordinate system it is stored in the database). It may
             be converted to a different coordinate system through a call to
             transform() or transfer(). If the gene or exon is not found
             undef is returned instead.
Returntype : Bio::EnsEMBL::Gene or undef
Exceptions : none
Caller     : general
Status     : Stable

fetch_all_by_domain

Arg [1]    : String $domain
             The domain to fetch genes from
Example    : my @genes = @{ $gene_adaptor->fetch_all_by_domain($domain) };
Description: Retrieves a listref of genes whose translations contain the
             InterPro domain $domain. The genes are returned in their native
             coord system (i.e. the coord_system they are stored in). If the
             coord system needs to be changed, then transform or transfer
             should be called on the individual objects returned.
Returntype : listref of Bio::EnsEMBL::Gene
Exceptions : none
Caller     : domainview
Status     : Stable

fetch_all_by_Slice_and_external_dbname_link

Arg [1]    : Bio::EnsEMBL::Slice $slice
             The slice to fetch genes on.
Arg [2]    : (optional) string $logic_name
             the logic name of the type of features to obtain
Arg [3]    : (optional) boolean $load_transcripts
             if true, transcripts will be loaded immediately
             rather than lazy loaded later.
Arg [4]    : String
             Name of the external database to fetch the Genes by
Example    : @genes = @{
               $ga->fetch_all_by_Slice_and_external_dbname_link(
                                        $slice, undef, undef, "HGNC" ) };
Description: Overrides superclass method to optionally load
             transcripts immediately rather than lazy-loading them
             later.  This is more efficient when there are a lot
             of genes whose transcripts are going to be used. The
             genes are then filtered to return only those with
             external database links of the type specified
Returntype : reference to list of genes
Exceptions : thrown if exon cannot be placed on transcript slice
Caller     : 
Status     : Stable

fetch_all_by_Slice

Arg [1]    : Bio::EnsEMBL::Slice $slice
             The slice to fetch genes on.
Arg [2]    : (optional) string $logic_name
             the logic name of the type of features to obtain
Arg [3]    : (optional) boolean $load_transcripts
             if true, transcripts will be loaded immediately rather than
             lazy loaded later.
Arg [4]    : (optional) string $source
             the source name of the features to obtain.
Arg [5]    : (optional) string $biotype
             the biotype of the features to obtain.
Example    : @genes = @{$gene_adaptor->fetch_all_by_Slice($slice)};
Description: Overrides superclass method to optionally load transcripts
             immediately rather than lazy-loading them later.  This
             is more efficient when there are a lot of genes whose
             transcripts are going to be used.
Returntype : reference to list of genes 
Exceptions : thrown if exon cannot be placed on transcript slice
Caller     : Slice::get_all_Genes
Status     : Stable
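
The positional arguments are easy to misplace, so here is a sketch that preloads transcripts and filters by biotype while leaving the logic name and source unset. The helper name coding_gene_transcript_counts is hypothetical; stable_id() and get_all_Transcripts() are assumed to be available on the returned Gene objects.

```perl
use strict;
use warnings;

# Hypothetical helper: map the stable id of each protein-coding gene on
# a slice to its number of transcripts.
sub coding_gene_transcript_counts {
    my ( $gene_adaptor, $slice ) = @_;

    # Argument order: slice, logic_name, load_transcripts, source, biotype.
    # Transcripts are loaded up front because every gene's transcripts
    # are inspected below.
    my $genes =
      $gene_adaptor->fetch_all_by_Slice( $slice, undef, 1, undef,
        'protein_coding' );

    my %transcripts_for;
    foreach my $gene ( @{$genes} ) {
        $transcripts_for{ $gene->stable_id() } =
          scalar @{ $gene->get_all_Transcripts() };
    }
    return \%transcripts_for;
}
```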

count_all_by_Slice

Arg [1]    : Bio::EnsEMBL::Slice $slice
             The slice to count genes on.
Arg [2]    : (optional) biotype(s) string or arrayref of strings 
             the biotype of the features to count.
Arg [3]    : (optional) string $source
             the source name of the features to count.
Example    : $cnt = $gene_adaptor->count_all_by_Slice($slice);
Description: Method to count genes on a given slice, filtering by biotype and source
Returntype : integer
Exceptions : thrown if exon cannot be placed on transcript slice
Status     : Stable
Caller     : general

fetch_by_transcript_id

Arg [1]    : Int $trans_id
             Unique database identifier for the transcript whose gene should
             be retrieved. The gene is returned in its native coord
             system (i.e. the coord_system it is stored in). If the coord
             system needs to be changed, then transform or transfer should
             be called on the returned object. undef is returned if the
             gene or transcript is not found in the database.
Example    : $gene = $gene_adaptor->fetch_by_transcript_id(1241);
Description: Retrieves a gene from the database via the database identifier
             of one of its transcripts.
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller     : general
Status     : Stable

fetch_by_transcript_stable_id

Arg [1]    : string $trans_stable_id
             transcript stable ID whose gene should be retrieved
Example    : my $gene = $gene_adaptor->fetch_by_transcript_stable_id
               ('ENST0000234');
Description: Retrieves a gene from the database via the stable ID of one of
             its transcripts
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller     : general
Status     : Stable

fetch_by_translation_stable_id

Arg [1]    : String $translation_stable_id
             The stable id of a translation of the gene to be obtained
Example    : my $gene = $gene_adaptor->fetch_by_translation_stable_id
               ('ENSP00000278194');
Description: Retrieves a gene via the stable id of one of its translations.
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller     : general
Status     : Stable

fetch_all_by_external_name

Arg [1]    : String $external_name
             The external identifier for the gene to be obtained
Arg [2]    : (optional) String $external_db_name
             The name of the external database from which the
             identifier originates.
Arg [3]    : Boolean override. Force SQL regex matching for users
             who really do want to find all 'NM%'
Example    : @genes = @{$gene_adaptor->fetch_all_by_external_name('BRCA2')};
             @many_genes = @{$gene_adaptor->fetch_all_by_external_name('BRCA%')};
Description: Retrieves a list of genes with an external database
             identifier $external_name. The genes returned are in
             their native coordinate system, i.e. in the coordinate
             system in which they are stored in the database.  If another
             coordinate system is required then the Gene::transfer or
             Gene::transform method can be used.
             SQL wildcards % and _ are supported in the $external_name,
             but their use is somewhat restricted for performance reasons.
             Users that really do want % and _ in the first three characters
             should use argument 3 to prevent optimisations
Returntype : listref of Bio::EnsEMBL::Gene
Exceptions : none
Caller     : goview, general
Status     : Stable
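
The wildcard rule above can be checked on the caller's side before deciding whether to pass argument 3. The predicate below is a sketch of that rule as stated in the description; the name has_early_wildcard is hypothetical and the adaptor's own check may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical predicate: true if an SQL wildcard (% or _) occurs within
# the first three characters of the external name.
sub has_early_wildcard {
    my ($external_name) = @_;
    return substr( $external_name, 0, 3 ) =~ /[%_]/ ? 1 : 0;
}
```

A caller who wants such patterns matched could then write $gene_adaptor->fetch_all_by_external_name( $name, undef, has_early_wildcard($name) ). Note that under this rule an accession such as 'NM_000059' counts as having an early wildcard because of its underscore.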

fetch_all_by_description

Arg [1]    : String of description
Example    : $gene_list = $gene_adaptor->fetch_all_by_description('RNA%');
Description: Fetches genes by their textual description. Fully supports SQL
             wildcards, since getting an exact hit is unlikely.
Returntype : listref of Bio::EnsEMBL::Gene

fetch_all_by_GOTerm

Arg [1]   : Bio::EnsEMBL::OntologyTerm
            The GO term for which genes should be fetched.

Example:  @genes = @{
            $gene_adaptor->fetch_all_by_GOTerm(
              $go_adaptor->fetch_by_accession('GO:0030326') ) };

Description   : Retrieves a list of genes that are associated with
                the given GO term, or with any of its descendent
                GO terms.  The genes returned are in their native
                coordinate system, i.e. in the coordinate system
                in which they are stored in the database.  If
                another coordinate system is required then the
                Gene::transfer or Gene::transform method can be
                used.

Return type   : listref of Bio::EnsEMBL::Gene
Exceptions    : Throws of argument is not a GO term
Caller        : general
Status        : Stable

fetch_all_by_ontology_linkage_type

Arg [1]   : (optional) string $db_name
            The database name to search for. Defaults to GO
Arg [2]   : string $linkage_type
            Linkage type to search for e.g. IMP

Example:    my $genes = $gene_adaptor->fetch_all_by_ontology_linkage_type('GO', 'IMP');
            my $genes = $gene_adaptor->fetch_all_by_ontology_linkage_type(undef, 'IMP');

Description   : Retrieves a list of genes that are associated with
                the given ontology linkage type.  The genes returned 
                are in their native coordinate system, i.e. in the 
                coordinate system in which they are stored in the database.
Return type   : listref of Bio::EnsEMBL::Gene
Exceptions    : Throws if a linkage type is not given
Caller        : general
Status        : Stable

fetch_all_by_GOTerm_accession

Arg [1]   : String
            The GO term accession for which genes should be
            fetched.

Example   :

  @genes =
    @{ $gene_adaptor->fetch_all_by_GOTerm_accession(
      'GO:0030326') };

Description   : Retrieves a list of genes that are associated with
                the given GO term, or with any of its descendent
                GO terms.  The genes returned are in their native
                coordinate system, i.e. in the coordinate system
                in which they are stored in the database.  If
                another coordinate system is required then the
                Gene::transfer or Gene::transform method can be
                used.

Return type   : listref of Bio::EnsEMBL::Gene
Exceptions    : Throws if argument is not a GO term accession
Caller        : general
Status        : Stable

fetch_all_alt_alleles

Arg [1]    : Bio::EnsEMBL::Gene $gene
             The gene to fetch alternative alleles for
Arg [2]    : Boolean (optional)
             Ask the method to warn about any gene without an alt allele 
             group. Defaults to false
Example    : my @alt_genes = @{ $gene_adaptor->fetch_all_alt_alleles($gene) };
             foreach my $alt_gene (@alt_genes) {
               print "Alternate allele: " . $alt_gene->stable_id() . "\n" ;
             }
Description: Retrieves genes which are alternate alleles to a provided gene.
             Alternate alleles in Ensembl are genes which are similar and are
             on an alternative haplotype of the same region. There are not 
             currently very many of these. This method will return a 
             reference to an empty list if no alternative alleles are found.
Returntype : ArrayRef of Bio::EnsEMBL::Gene objects
Exceptions : throw if incorrect arg provided
             warning if gene arg does not have an entry in an alt allele and if
             the warn flag is true
Caller     : Gene::get_all_alt_alleles
Status     : Stable

is_ref

Arg [1]    : Gene dbID
Description: Used to determine whether a given Gene is the representative 
             Gene of an alt allele group. If it does not have an alternative
             allele that is more representative, then this ID will be said to
             be representative.
Returntype : Boolean - true if the gene is the representative of its group
             or has no alternatives

store_alt_alleles

Arg [1]    : reference to list of Bio::EnsEMBL::Genes $genes
Example    : $gene_adaptor->store_alt_alleles([$gene1, $gene2, $gene3]);
Description: This method creates a group of alternative alleles (i.e. locus)
             from a set of genes. The genes should be genes from alternate
             haplotypes which are similar. The genes must already be stored
             in this database. WARNING - now that more fine-grained support
             for alt_alleles has been implemented, this method is rather coarse.
             Consider working directly with AltAlleleGroup and 
             AltAlleleGroupAdaptor.
Returntype : int alt_allele_group_id or undef if no alt_alleles were stored
Exceptions : throw on incorrect arguments
             throw on sql error (e.g. duplicate unique id)
Caller     : general
Status     : Stable

store

Arg [1]    : Bio::EnsEMBL::Gene $gene
             The gene to store in the database
Arg [2]    : (optional) ignore_release in xrefs [default 1]
             set to 0 to use release info in external database references
Arg [3]    : (optional) flag to prevent coordinate recalculation if you
             are persisting transcripts with this gene
Arg [4]    : (optional) flag to prevent copying supporting features across
             exons; this trades accuracy for speed
Example    : $gene_adaptor->store($gene);
Description: Stores a gene in the database.
Returntype : the database identifier (dbID) of the newly stored gene
Exceptions : thrown if the $gene is not a Bio::EnsEMBL::Gene or if 
             $gene does not have an analysis object
Caller     : general
Status     : Stable

remove

Arg [1]    : Bio::EnsEMBL::Gene $gene
             the gene to remove from the database
Example    : $gene_adaptor->remove($gene);
Description: Removes a gene completely from the database. All associated
             transcripts, exons, stable_identifiers, descriptions, etc.
             are removed as well. Use with caution!
Returntype : none
Exceptions : throw on incorrect arguments 
             warning if gene is not stored in this database
Caller     : general
Status     : Stable

get_Interpro_by_geneid

Arg [1]    : String $gene_stable_id
             The stable ID of the gene to obtain
Example    : @i = @{
                $gene_adaptor->get_Interpro_by_geneid(
                  $gene->stable_id() ) };
Description: Gets interpro accession numbers by gene stable id. A hack really
             - we should have a much more structured system than this.
Returntype : listref of strings (Interpro_acc:description)
Exceptions : none 
Caller     : domainview
Status     : Stable

update

Arg [1]    : Bio::EnsEMBL::Gene $gene
             The gene to update
Example    : $gene_adaptor->update($gene);
Description: Updates the type, analysis, display_xref, is_current and
             description of a gene in the database.
Returntype : None
Exceptions : thrown if the $gene is not a Bio::EnsEMBL::Gene
Caller     : general
Status     : Stable
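
Since update() writes the description among other fields, a typical use is to change the value on the Gene object and then persist it. The helper name set_gene_description is hypothetical, and the sketch assumes the Gene object's description() accessor also acts as a setter.

```perl
use strict;
use warnings;

# Hypothetical helper: change a gene's description in memory and then
# write the change back through the adaptor.
sub set_gene_description {
    my ( $gene_adaptor, $gene, $description ) = @_;

    $gene->description($description);    # modify the in-memory object
    $gene_adaptor->update($gene);        # persist the updated fields
    return $gene;
}
```

Coordinates are not touched by update(); see update_coords for that case.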

update_coords

Arg [1]    : Bio::EnsEMBL::Gene $gene
             The gene to update
Example    : $gene_adaptor->update_coords($gene);
Description: In the event of a transcript being removed, coordinates for the Gene
             need to be reset, but update() does not do this. update_coords 
             fills this niche
Returntype : None
Exceptions : thrown if the $gene is not supplied
Caller     : general

cache_gene_seq_mappings

Example    : $gene_adaptor->cache_gene_seq_mappings();
Description: caches all the assembly mappings needed for genes
Returntype : None
Exceptions : None
Caller     : general
Status     : At Risk
           : New experimental code

fetch_all_by_exon_supporting_evidence

Arg [1]    : String $hit_name
             Name of supporting feature
Arg [2]    : String $feature_type 
             one of "dna_align_feature" or "protein_align_feature"
Arg [3]    : (optional) Bio::EnsEMBL::Analysis
Example    : $genes = $gene_adaptor->fetch_all_by_exon_supporting_evidence(
                'XYZ', 'dna_align_feature');
Description: Gets all the genes with transcripts with exons which have a
             specified hit on a particular type of feature. Optionally filter
             by analysis.
Returntype : Listref of Bio::EnsEMBL::Gene
Exceptions : If feature_type is not of correct type.
Caller     : general
Status     : Stable
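
Because the method throws on an unrecognised feature type, callers may prefer to validate the value first and fail with their own message. The wrapper below is a sketch; the name genes_supported_by_hit and the lookup table are hypothetical, and only the two feature types listed above are accepted.

```perl
use strict;
use warnings;

# The two feature types named in the documentation above.
my %VALID_FEATURE_TYPE =
  map { $_ => 1 } qw(dna_align_feature protein_align_feature);

# Hypothetical wrapper: validate $feature_type, then delegate.
sub genes_supported_by_hit {
    my ( $gene_adaptor, $hit_name, $feature_type, $analysis ) = @_;

    die "feature type must be dna_align_feature or protein_align_feature\n"
      unless defined $feature_type && $VALID_FEATURE_TYPE{$feature_type};

    # Only pass the optional analysis when the caller supplied one.
    my @args = ( $hit_name, $feature_type );
    push @args, $analysis if defined $analysis;

    return $gene_adaptor->fetch_all_by_exon_supporting_evidence(@args);
}
```

The same wrapper shape would work for fetch_all_by_transcript_supporting_evidence, which takes the same arguments.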

fetch_all_by_transcript_supporting_evidence

Arg [1]    : String $hit_name
             Name of supporting feature
Arg [2]    : String $feature_type 
             one of "dna_align_feature" or "protein_align_feature"
Arg [3]    : (optional) Bio::EnsEMBL::Analysis
Example    : $genes = $gene_adaptor->fetch_all_by_transcript_supporting_evidence('XYZ', 'dna_align_feature');
Description: Gets all the genes with transcripts with evidence for a
             specified hit on a particular type of feature. Optionally filter
             by analysis.
Returntype : Listref of Bio::EnsEMBL::Gene
Exceptions : If feature_type is not of correct type.
Caller     : general
Status     : Stable