NAME
Bio::Tools::Run::Phylo::Phyml - Wrapper for rapid reconstruction of phylogenies using Phyml
SYNOPSIS
use Bio::Tools::Run::Phylo::Phyml;
# Make a Phyml factory
$factory = Bio::Tools::Run::Phylo::Phyml->new(-verbose => 2);
# it defaults to protein alignment
# change parameters
$factory->model('Dayhoff');
# Pass the factory an alignment and run
$inputfilename = 't/data/protpars.phy';
$tree = $factory->run($inputfilename); # $tree is a Bio::Tree::Tree object.
# or set parameters at object creation
my %args = (
-data_type => 'dna',
-model => 'HKY',
-kappa => 4,
-invar => 'e',
-category_number => 4,
-alpha => 'e',
-tree => 'BIONJ',
-opt_topology => '0',
-opt_lengths => '1',
);
$factory = Bio::Tools::Run::Phylo::Phyml->new(%args);
# if you need the output files do
$factory->save_tempfiles(1);
$factory->tempdir($workdir);
# and get a Bio::Align::AlignI (SimpleAlign) object from somewhere
$tree = $factory->run($aln);
DESCRIPTION
This is a wrapper for running the phyml application by Stephane Guindon and Olivier Gascuel. You can download it from: http://atgc.lirmm.fr/phyml/
Installing
After downloading, rename the copy of the program that runs under your operating system (e.g. phyml_linux) to phyml.
You will need to help this Phyml wrapper find the phyml program. This can be done in (at least) three ways:
Make sure the Phyml executable is in your path: copy it to, or create a symbolic link to it from, a directory that is in your path.
Define an environment variable PHYMLDIR that points to the directory containing the 'phyml' application. In bash:
export PHYMLDIR=/home/username/phyml_v2.4.4/exe
In csh/tcsh:
setenv PHYMLDIR /home/username/phyml_v2.4.4/exe
Set the environment variable PHYMLDIR in every script that uses this Phyml wrapper module, before the module is loaded, e.g.:
BEGIN { $ENV{PHYMLDIR} = '/home/username/phyml_v2.4.4/exe' }
use Bio::Tools::Run::Phylo::Phyml;
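If the wrapper cannot find the program, it can help to check the PHYMLDIR setting by hand. The following is only a minimal sketch: find_phyml_in() is a hypothetical helper, not part of this module, and it does not reproduce the wrapper's own lookup logic. It simply tests whether a file named 'phyml' exists in the given directory and is executable.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of the wrapper): return the full path to
# 'phyml' inside $dir if it is a plain, executable file; otherwise undef.
sub find_phyml_in {
    my ($dir) = @_;
    return unless defined $dir && length $dir;
    my $path = File::Spec->catfile($dir, 'phyml');
    return unless -f $path && -x _;
    return $path;
}

my $found = find_phyml_in($ENV{PHYMLDIR});
print defined $found
    ? "phyml found at $found\n"
    : "phyml not found via PHYMLDIR; it must then be in your path\n";
```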
Running
This wrapper has been tested with PHYML v2.4.4 and v3.0.
In its current state, the wrapper supports only input of one MSA and output of one tree. It can easily be extended to support more advanced capabilities of phyml.
Two convenience methods have been added on top of the standard BioPerl WrapperBase ones: stats() and tree_string(). You can call them after running the phyml program to retrieve, as strings, the run statistics and the tree in Newick format.
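As an illustration of how run(), stats() and tree_string() fit together, here is a small sketch. The helper name run_and_collect() is hypothetical; it only assumes that the object passed in behaves like a configured Phyml factory, that is, it provides the three methods documented here.

```perl
use strict;
use warnings;

# Hypothetical helper: run a configured factory on one alignment and
# collect the tree object together with the two convenience strings.
sub run_and_collect {
    my ($factory, $input) = @_;
    my $tree = $factory->run($input);               # Bio::Tree::Tree object
    return {
        tree   => $tree,
        stats  => scalar $factory->stats,           # statistics about the run
        newick => scalar $factory->tree_string,     # tree in Newick format
    };
}

# Typical use (requires BioPerl-Run and a phyml executable):
#   use Bio::Tools::Run::Phylo::Phyml;
#   my $factory = Bio::Tools::Run::Phylo::Phyml->new(-data_type => 'protein');
#   my $result  = run_and_collect($factory, 't/data/protpars.phy');
#   print $result->{newick}, "\n";
```

Both strings are fetched only after run() has returned, because both methods return undef before a run.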
FEEDBACK
Mailing Lists
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Support
Please direct usage questions or support issues to the mailing list:
bioperl-l@bioperl.org
rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.
Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:
http://redmine.open-bio.org/projects/bioperl/
AUTHOR - Heikki Lehvaslaiho
heikki at bioperl dot org
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded by an underscore (_).
new
Title : new
Usage : $factory = Bio::Tools::Run::Phylo::Phyml->new(@params)
Function: creates a new Phyml factory
Returns : Bio::Tools::Run::Phylo::Phyml
Args : Optionally, provide any of the following (default in []):
-data_type => 'dna' or 'protein', [protein]
-dataset_count => integer, [1]
-model => 'HKY'... , [HKY|JTT]
-kappa => 'e' or float, [e]
-invar => 'e' or float, [e]
-category_number => integer, [1]
-alpha => 'e' or float (int v3),[e]
-tree => 'BIONJ' or your own, [BIONJ]
-opt_topology => boolean [y]
-opt_lengths => boolean [y]
program_name
Title : program_name
Usage : $factory->program_name()
Function: holds the program name
Returns : string
Args : None
program_dir
Title : program_dir
Usage : $factory->program_dir()
Function: returns the program directory, obtained from ENV variable.
Returns : string
Args : None
version
Title : version
Usage : exit if $prog->version < 1.8
Function: Determine the version number of the program
Example :
Returns : float or undef
Args : none
Phyml versions before 3.0 did not display the version number. The wrapper assumes 2.44 when it cannot determine the version.
run
Title : run
Usage : $factory->run($aln_file);
$factory->run($align_object);
Function: Runs Phyml to generate a tree
Returns : Bio::Tree::Tree object
Args : file name for your input alignment in a format
recognised by AlignIO, OR Bio::Align::AlignI
compliant object (e.g. Bio::SimpleAlign).
stats
Title : stats
Usage : $factory->stats;
Function: Returns the contents of the phyml '_phyml_stat.txt' output file
Returns : string with statistics about the run, undef before run()
Args : none
tree_string
Title : tree_string
Usage : $factory->tree_string;
Function: Returns the contents of the phyml '_phyml_tree.txt' output file
Returns : string with tree in Newick format, undef before run()
Args : none
Getsetters
These methods are used to set and get program parameters before running.
data_type
Title : data_type
Usage : $phyml->data_type('nt');
Function: Sets sequence alphabet to 'dna' (nt in v3) or 'aa'
If left unset, it will be set automatically
Returns : set value, defaults to 'protein'
Args : None to get, 'dna' ('nt') or 'aa' to set.
data_format
Title : data_format
Usage : $phyml->data_format('s');
Function: Sets PHYLIP format to 'i' interleaved or
's' sequential
Returns : set value, defaults to 'i'
Args : None to get, 'i' or 's' to set.
dataset_count
Title : dataset_count
Usage : $phyml->dataset_count(3);
Function: Sets the number of datasets to analyse
Returns : set value, defaults to 1
Args : None to get, positive integer to set.
model
Title : model
Usage : $phyml->model('HKY');
Function: Choose the substitution model to use. One of
JC69 | K2P | F81 | HKY | F84 | TN93 | GTR (DNA)
JTT | MtREV | Dayhoff | WAG (amino acids)
v3.0:
HKY85 (default) | JC69 | K80 | F81 | F84 |
TN93 | GTR (DNA)
WAG (default) | JTT | MtREV | Dayhoff | DCMut |
RtREV | CpREV | VT | Blosum62 | MtMam | MtArt |
HIVw | HIVb (amino acids)
Returns : Name of the model, defaults to {HKY|JTT}
Args : None to get, string to set.
kappa
Title : kappa
Usage : $phyml->kappa(4);
Function: Sets transition/transversion ratio, leave unset to estimate
Returns : set value, defaults to 'e'
Args : None to get, float or integer to set.
invar
Title : invar
Usage : $phyml->invar(.3);
Function: Sets proportion of invariable sites, leave unset to estimate
Returns : set value, defaults to 'e'
Args : None to get, float or integer to set.
category_number
Title : category_number
Usage : $phyml->category_number(4);
Function: Sets number of relative substitution rate categories
Returns : set value, defaults to 1
Args : None to get, integer to set.
alpha
Title : alpha
Usage : $phyml->alpha(1.0);
Function: Sets gamma distribution parameter, leave unset to estimate
Returns : set value, defaults to 'e'
Args : None to get, float or integer to set.
tree
Title : tree
Usage : $phyml->tree('/tmp/tree.nwk');
Function: Sets starting tree, leave unset to estimate a distance tree
Returns : set value, defaults to 'BIONJ'
Args : None to get, newick tree file name to set.
v2 options
These methods can be used with PhyML v2* only.
opt_topology
Title : opt_topology
Usage : $factory->opt_topology(1);
Function: Choose to optimise the tree topology
Returns : {y|n} (default y)
Args : None to get, boolean to set.
v2.* only
opt_lengths
Title : opt_lengths
Usage : $factory->opt_lengths(0);
Function: Choose to optimise branch lengths and rate parameters
Returns : {y|n} (default y)
Args : None to get, boolean to set.
v2.* only
v3 options
These methods can be used with PhyML v3* only.
freq
Title : freq
Usage : $phyml->freq('e');
$phyml->freq("0.2, 0.6, 0.6, 0.2");
Function: Sets nucleotide frequencies, or asks for residue frequencies to be
estimated according to one of two models: e or d
Returns : set value
Args : None to get, string to set.
v3 only.
opt
Title : opt
Usage : $factory->opt('tlr');
Function: Optimise tree parameters: tlr|tl|tr|l|n
Returns : {value|n} (default n)
Args : None to get, string to set.
v3.* only
search
Title : search
Usage : $factory->search('SPR');
Function: Tree topology search operation algorithm: NNI|SPR|BEST
Returns : string (defaults to NNI)
Args : None to get, string to set.
v3.* only
rand_start
Title : rand_start
Usage : $factory->rand_start(1);
Function: Sets the initial SPR tree to random.
Returns : boolean (defaults to false)
Args : None to get, boolean to set.
v3.* only; only meaningful if $prog->search is 'SPR'
rand_starts
Title : rand_starts
Usage : $factory->rand_starts(10);
Function: Sets the number of initial random SPR trees
Returns : integer (defaults to 1)
Args : None to get, integer to set.
v3.* only; only valid if $prog->search is 'SPR'
rand_seed
Title : rand_seed
Usage : $factory->rand_seed(1769876);
Function: Seeds the random number generator
Returns : random integer
Args : None to get, integer to set.
v3.* only; only valid if $prog->search is 'SPR'
Uses perl rand() to initialize if not explicitly set.
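The SPR-related options above depend on each other, so they are usually set together. The sketch below shows one way to do that. configure_random_spr() is a hypothetical helper, not part of this module; it assumes only the version(), search(), rand_start(), rand_starts() and rand_seed() methods documented here.

```perl
use strict;
use warnings;

# Hypothetical helper: switch a factory to SPR search from several random
# starting trees. Changes nothing and returns false if the detected
# PhyML version is undefined or older than 3.
sub configure_random_spr {
    my ($factory, $starts, $seed) = @_;
    my $version = $factory->version;
    return 0 unless defined $version && $version >= 3;

    $factory->search('SPR');                        # required by the rand_* options
    $factory->rand_start(1);                        # start from random trees
    $factory->rand_starts($starts);                 # number of random starting trees
    $factory->rand_seed($seed) if defined $seed;    # fixed seed, for repeatable runs
    return 1;
}

# Typical use (requires BioPerl-Run and PhyML v3):
#   my $factory = Bio::Tools::Run::Phylo::Phyml->new(-data_type => 'dna');
#   configure_random_spr($factory, 10, 1769876)
#       or warn "random SPR starts need PhyML v3\n";
```

Passing an explicit seed is optional; without it the wrapper falls back to perl rand(), as described for rand_seed.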
Internal methods
These methods are private and should not be called outside this class.
_setparams
Title : _setparams
Usage : Internal function, not to be called directly
Function: Creates a string of params to be used in the command string
Returns : string of params
Args : none
_write_phylip_align_file
Title : _write_phylip_align_file
Usage : $obj->_write_phylip_align_file($aln)
Function: Internal (not to be used directly)
Writes the alignment into the tmp directory
in PHYLIP interleaved format
Returns : filename
Args : Bio::Align::AlignI