NAME
Bio::ProteinFeatures - Deriving features of amino acid sequences
SYNOPSIS
use Bio::ProteinFeatures;
my $pf = Bio::ProteinFeatures->new;
$pf->sequence($sequence_string);
# you may use Data::Dumper to see the result.
use Data::Dumper;
print Dumper $pf->features();
DESCRIPTION
This module applies several statistical methods to amino acid sequences to derive features that help identify sequences and measure the similarity between them. You can also use it for coarse matching before running BLAST.
METHODS
new
You can set the sequence when invoking the constructor.
$pf = Bio::ProteinFeatures->new(sequence => $sequence_string);
Or set it afterwards with the sequence method.
sequence
Sets or gets the sequence string.
# set the sequence
$pf->sequence($sequence);
# return the sequence
$pf->sequence();
features
The features this module deals with are listed below.
composition
Amino acids are grouped into three categories: polar, neutral, and hydrophobic. The method calculates the composition of each of the three groups.
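The following is a minimal standalone sketch of a group composition calculation, not the module's own code. The group membership in %GROUP_OF (a common hydrophobicity classification) and the helper name group_composition are assumptions made for illustration; the module may assign residues differently.

```perl
use strict;
use warnings;

# ASSUMPTION: residue-to-group assignment; the module may use another table.
my %GROUP_OF;
$GROUP_OF{$_} = 'polar'       for split //, 'RKEDQN';
$GROUP_OF{$_} = 'neutral'     for split //, 'GASTPHY';
$GROUP_OF{$_} = 'hydrophobic' for split //, 'CLVIMFW';

# Hypothetical helper: fraction of residues falling into each group.
sub group_composition {
    my ($seq) = @_;
    my %count = (polar => 0, neutral => 0, hydrophobic => 0);
    my $total = 0;
    for my $aa (split //, uc $seq) {
        my $group = $GROUP_OF{$aa} or next;    # skip unknown symbols
        $count{$group}++;
        $total++;
    }
    return {} unless $total;
    return { map { $_ => $count{$_} / $total } keys %count };
}

my $comp = group_composition('MKVLAAGRKE');
printf "%-12s %.2f\n", $_, $comp->{$_} for sort keys %$comp;
```

Symbols outside the table are skipped rather than counted, so the three fractions always sum to one for a non-empty sequence.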
transition probability
Characterizes the percent frequency with which an amino acid of group A is followed by one of group B, or one of group B by one of group A.
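As an illustration of the idea, the sketch below counts adjacent residue pairs whose groups differ and reports each unordered group pair as a percentage of all adjacent pairs. The grouping table, the helper name group_transitions, and the choice of denominator (number of adjacent pairs) are assumptions, not taken from the module.

```perl
use strict;
use warnings;

# ASSUMPTION: residue-to-group assignment; the module may use another table.
my %GROUP_OF;
$GROUP_OF{$_} = 'polar'       for split //, 'RKEDQN';
$GROUP_OF{$_} = 'neutral'     for split //, 'GASTPHY';
$GROUP_OF{$_} = 'hydrophobic' for split //, 'CLVIMFW';

# Hypothetical helper: percent of adjacent pairs that switch between two groups.
sub group_transitions {
    my ($seq) = @_;
    # Translate residues to groups, dropping unknown symbols.
    my @groups = grep { defined } map { $GROUP_OF{$_} } split //, uc $seq;
    my %freq;
    return \%freq if @groups < 2;
    for my $i (0 .. $#groups - 1) {
        my ($from, $to) = @groups[$i, $i + 1];
        next if $from eq $to;                   # not a transition
        # A->B and B->A share one key, e.g. "hydrophobic-polar".
        my $key = join '-', sort { $a cmp $b } ($from, $to);
        $freq{$key}++;
    }
    my $pairs = @groups - 1;
    $_ = 100 * $_ / $pairs for values %freq;    # counts to percentages
    return \%freq;
}

my $trans = group_transitions('MKVLAAGRKE');
printf "%-22s %.2f%%\n", $_, $trans->{$_} for sort keys %$trans;
```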
accumulative distribution
The sequence is cut into five sections, and the accumulative probability of each group is calculated within each section.
per-amino-acid probability
Calculates the probability of each individual amino acid in the sequence.
first order energy
The sum of prob(i)**2 over every amino acid i.
first order entropy
The sum of -prob(i)*log(prob(i)) over every amino acid i.
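The three per-residue statistics above can be sketched together. This is a standalone illustration with hypothetical helper names, taking energy as the sum of squared probabilities and entropy with the natural logarithm (Perl's built-in log); the module's own base and normalization may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: relative frequency of each residue symbol.
sub residue_probabilities {
    my ($seq) = @_;
    my %count;
    $count{$_}++ for split //, uc $seq;
    my $total = length $seq;
    return { map { $_ => $count{$_} / $total } keys %count };
}

# Sum of squared probabilities.
sub first_order_energy {
    my ($prob) = @_;
    my $sum = 0;
    $sum += $_ ** 2 for values %$prob;
    return $sum;
}

# Sum of -p * log(p); ASSUMPTION: natural logarithm.
sub first_order_entropy {
    my ($prob) = @_;
    my $sum = 0;
    $sum -= $_ * log($_) for values %$prob;
    return $sum;
}

my $prob = residue_probabilities('AABB');
printf "energy  %.4f\n", first_order_energy($prob);
printf "entropy %.4f\n", first_order_entropy($prob);
```

A sequence dominated by one residue gives energy near 1 and entropy near 0, while a uniform mix of residues gives low energy and high entropy.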
histogram difference
Calculates the difference between the counts of two neighboring amino acids.
AA pair probability
Probabilities of amino acid bigrams.
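A minimal sketch of bigram probabilities follows, assuming overlapping pairs and normalization by the number of pairs in the sequence. The helper name bigram_probabilities is hypothetical and the module may normalize differently.

```perl
use strict;
use warnings;

# Hypothetical helper: relative frequency of each overlapping residue pair.
sub bigram_probabilities {
    my ($seq) = @_;
    $seq = uc $seq;
    my $pairs = length($seq) - 1;       # number of overlapping bigrams
    return {} if $pairs < 1;
    my %count;
    $count{ substr($seq, $_, 2) }++ for 0 .. $pairs - 1;
    return { map { $_ => $count{$_} / $pairs } keys %count };
}

my $bigrams = bigram_probabilities('AAAB');
printf "%s %.3f\n", $_, $bigrams->{$_} for sort keys %$bigrams;
```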
average separation between two amino acids of the same group
Counts the average number of characters between two amino acids of the same group.
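One plausible reading of this feature is sketched below: for each group, take every pair of consecutive occurrences and average the number of characters lying between them. The grouping table, the helper name average_separation, and this interpretation are all assumptions for illustration.

```perl
use strict;
use warnings;

# ASSUMPTION: residue-to-group assignment; the module may use another table.
my %GROUP_OF;
$GROUP_OF{$_} = 'polar'       for split //, 'RKEDQN';
$GROUP_OF{$_} = 'neutral'     for split //, 'GASTPHY';
$GROUP_OF{$_} = 'hydrophobic' for split //, 'CLVIMFW';

# Hypothetical helper: mean gap between consecutive residues of one group.
sub average_separation {
    my ($seq) = @_;
    my (%last_pos, %gap_sum, %gap_count);
    my @residues = split //, uc $seq;
    for my $i (0 .. $#residues) {
        my $group = $GROUP_OF{ $residues[$i] } or next;   # skip unknown symbols
        if (exists $last_pos{$group}) {
            # Characters strictly between this residue and the previous one.
            $gap_sum{$group} += $i - $last_pos{$group} - 1;
            $gap_count{$group}++;
        }
        $last_pos{$group} = $i;
    }
    # Groups seen fewer than twice have no gap and are omitted.
    return { map { $_ => $gap_sum{$_} / $gap_count{$_} } keys %gap_count };
}

my $sep = average_separation('MKVLAAGRKE');
printf "%-12s %.2f\n", $_, $sep->{$_} for sort keys %$sep;
```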
COPYRIGHT
xern <xern@cpan.org>
This module is free software; you can redistribute it or modify it under the same terms as Perl itself.