NAME
Bio::ORA - Olfactory Receptor family Assigner (ORA) [bioperl module]
SYNOPSIS
Take a sequence object from, say, an inputstream, and find an Olfactory Receptor gene. HMM profiles are used in order to identify location, frame and orientation of such gene.
Creating the ORA object, eg:
my $inputstream = Bio::SeqIO->new( -file => 'seqfile', -format => 'fasta' );
my $seqobj = $inputstream->next_seq();
my $ORA_obj = Bio::ORA->new( $seqobj );
Obtain an array holding the start point, the stop point, the DNA sequence and amino-acid sequence, eg:
my $array_ref = $ORA_obj->{'_result'} if ( $ORA_obj->find() );
Display result in genbank format, eg:
$ORA_obj->show( 'genbank' );
DESCRIPTION
Bio::ORA is a featherweight object for identifying mammalian olfactory receptor genes. The sequences should not be longer than 40kb. The returned array include location, sequence and statistic for the putative olfactory receptor gene. Fully functional with DNA and EST sequence, no intron supported.
See Synopsis above for the object creation code.
DRIVER SCRIPT
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Seq;
use Bio::ORA;
my $inseq = Bio::SeqIO->new( '-file' => q{<} . $ARGV[0], -format => 'fasta' );
while (my $seq = $inseq->next_seq) {
my $ORA_obj = Bio::ORA->new( $seq );
if ( $ORA_obj->find() ) {
$ORA_obj->show( 'genbank' );
} else {
print {*STDOUT} " no hit!\n";
}
}
REQUIREMENTS
To use this module you may need: * Bioperl (http://bioperl.org/) modules, * HMMER v3+ distribution (http://hmmer.org/) and * FASTA 36+ distribution (ftp://ftp.ebi.ac.uk/pub/software/unix/fasta/).
LOCAL ADAPTATION
This module uses three softwares. If HMMER or FASTA are updated make sure that HMMER's hmmscan and FASTA's tfastx36 and fastx36 still exists under these names. You change the call my editing the "Default softwares" section.
# Default softwares
my $hmmscan = 'hmmscan';
my $tfastx = 'tfastx36';
my $fastx = 'fastx36';
FEEDBACK
If you have any problems with or questions about the scripts, please contact us through a GitHub issue (https://github.com/pseudogene/ora/issues). You are invited to contribute new features, fixes, or updates, large or small; we are always thrilled to receive pull requests, and do our best to process them as fast as we can.
AUTHOR
Michael Bekaert (michael.bekaert@stir.ac.uk)
Address: Institute of Aquaculture University of Stirling Stirling Scotland, FK9 4LA UK
SEE ALSO
perl(1), bioperl web site
LICENSE
Copyright 2007-2016 - Michael Bekaert
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _
_findexec
Title : _findexec
Usage : my $path = $self->_findexec( $exec );
Function: Find an executable file in the $PATH.
Returns : The full path to the executable.
Args : $exec (mandatory) executable to be find.
new
Title : new
Usage : my $nb = Bio::ORA->new( $seqobj, $table, $aug, $hmm );
Function: Initialize ORA object.
Returns : An ORA object.
Args : $seqobj (mandatory) PrimarySeqI object (dna or rna),
$table (optional) translation table/genetic code number,
the default value is 1,
$aug (optional) use other start codon than AUG (default 0),
$hmm (optional) path to hmm profiles by default ORA looks at
./or.hmm.
find
Title : find
Usage : my $bool = $ORI_obj->find( $evalue, $strand, $start, $end );
Function: Identify an olfactory receptor protein.
Returns : boolean.
Args : $evalue (optional) set the E-value (expected) threshold.
Default is 1e-30,
$strand(optional) strand where search should be done (1 direct,
-1 reverse or 0 both). Default is 0,
$start (optional) coordinate of the first nucleotide. Useful
for coordinate calculation when first is not 1. Default is 1,
$end (optional) coordinate of the last nucleotide. Default is
the sequence length.
_what_or
Title : _what_or
Usage : my $bool = $self->_what_or( $strand );
Function: Use HMM profiles to identify an olfactory receptor gene.
Returns : boolean.
Args : $strand (optional) strand where search should be done
(1 direct, -1 reverse or 0 both). Default is 0.
_find_orf
Title : _find_orf
Usage : my $bool = $self->_find_or( $strand, $start, $end );
Function: Retrieve the olfactory receptor ORF.
Returns : boolean.
Args : $strand (mandatory) strand where ORA have been found
(1 direct or -1 reverse),
$start (mandatory) coordinate of the first nucleotide,
$end (mandatory) coordinate of the last nucleotide.
getHits
Title : getHits
Usage : my @hits = Bio::ORA->getHits( $seq, $evalue, $ref );
Function: Quick localization of ORs (use FASTA).
Returns : Array of hits start/stop positions.
Args : $seq (mandatory) primarySeqI object (dna or rna),
$evalue (mandatory) det the E-value threshold,
$ref (optional) path to fasta reference file, by default ORA
look at ./or.fasta.
fastScan
Title : fastScan
Usage : my @hits = Bio::ORA->fastScan( $seq, $ref );
Function: Quick localization of ORs (use FASTA).
Returns : Array of hits start/stop positions.
Args : $seq (mandatory) primarySeqI object (dna or rna),
$ref (optional) path to fasta reference file, by default ORA
look at ./or.fasta.
show
Title : show
Usage : $ORA_obj->show( $outstyle );
Function: Print result in various style.
Returns : none.
Args : $outstyle (mandatory) 'fasta', 'genbank', 'cvs', 'xml' or 'tsv' style.
_translation
Title : _translation
Usage : my ( $start, $end ) = $self->_translation();
Function: format initiation and stop codons for regex.
Returns : array with initiation and stop codons.
Args : none.