NAME

Molevol::Complex - A Perl module for the evolutionary analysis of protein contact interfaces

VERSION

Version 0.30

SYNOPSIS

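use Coevolution::Compcoe;

# Create the object
my $coe = Coevolution::Compcoe->new(
    pdb       => 'pdb/files/path/2occ.pdb',
    alignment => 'aln/files/path/chainM.aln',
    chain     => 'M',
);

# Run the whole analysis
$coe->run;

# Write the HTML report
open my $rep, '>', 'report.html' or die "Cannot open report.html: $!";
print {$rep} $coe->run_report;
close $rep or die "Cannot close report.html: $!";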

DESCRIPTION

This module integrates a workflow that addresses the evolvability of the contact interfaces within a protein complex. The method Coevolution::Compcoe->run launches the whole analysis. The workflow is divided into the following steps:

Step 1, Structural analysis: Coevolution::Compcoe->run_struct
Step 2, Sub-sets filtering: Coevolution::Compcoe->run_filtering
Step 3, Sub-alignments building: Coevolution::Compcoe->run_subalign
Step 4, Evolutionary calculations: Coevolution::Compcoe->run_yang
Step 5, Statistical analysis: Coevolution::Compcoe->run_stats1

A detailed explanation of these methods is given below.
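
The step methods can also be called one after another instead of through run. The following is a minimal sketch of that sequence, assuming each step returns a true value on success as documented under MAIN METHODS. The helper run_steps and the Mock::Compcoe class are hypothetical and exist only so that the example runs without the real module or its external programs.

```perl
use strict;
use warnings;

# Stand-in class (NOT the real module): every step simply succeeds.
package Mock::Compcoe;
sub new           { return bless {}, shift }
sub run_struct    { return 1 }
sub run_filtering { return 1 }
sub run_subalign  { return 1 }
sub run_yang      { return 1 }
sub run_stats1    { return 1 }

package main;

# Documented step methods, in workflow order.
my @STEPS = qw(run_struct run_filtering run_subalign run_yang run_stats1);

# Hypothetical helper: call each step in order and stop at the first failure.
# Returns the names of the steps that completed.
sub run_steps {
    my ($obj) = @_;
    my @done;
    for my $step (@STEPS) {
        my $method = $obj->can($step)
            or die "Object does not implement $step\n";
        $obj->$method() or die "Step $step failed\n";
        push @done, $step;
    }
    return @done;
}

my @done = run_steps(Mock::Compcoe->new);
print "Completed: @done\n";
```

With the real module, the same helper would be given the object returned by Coevolution::Compcoe->new.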

REQUIREMENTS

Bioperl
Bioperl-run
PAML
DSSP

CONSTRUCTOR

new()

$obj = Coevolution::Compcoe->new(%input_data);

The new class method constructs a new Coevolution::Compcoe object. The returned object can be used to perform several evolutionary analyses. new accepts the following parameters, passed as a hash (%input_data above):

  • pdbfilepath (required if contactfile is missing)

    A valid pdb file path to be opened for reading.

  • contactfilepath (required if pdb is missing)

    A valid contact file path. This file must contain the structural information retrieved by a previous analysis of the same chain.

  • alignfilepath (required)

    A valid DNA alignment file path. The alignment must correspond to the specified chain and must be at least three times as long as the PDB chain (three nucleotides per residue).

  • chain (required)

    The identifier of a given subunit within the studied complex.

  • pth (default 4 Angstroms)

    Proximity threshold. The maximum distance between two residues for them to be considered a contact pair.

  • sth (default 0.05)

    Exposure threshold. The maximum exposure fraction for a residue to be considered buried.

  • sthmargin (default 0)

    An error margin for sth. For instance, if it is set to 0.01 (with sth at 0.05), residues with exposure higher than 0.06 are considered exposed, those with exposure lower than 0.04 are considered buried, and residues with exposure between 0.04 and 0.06 are not considered.

  • contactwith

    A string of valid chain identifiers separated by commas:

    $contactwith = "A,B,D";

    If it is set, the program will only consider as contact residues those in close proximity to the specified chains. The others will be excluded.

  • informat (default fasta)

    Specify the format of the input alignment file. Supported formats include fasta, genbank, embl, swiss (SwissProt), Entrez Gene and tracefile formats such as abi (ABI) and scf. There are many more, for a complete listing see the SeqIO HOWTO (http://bioperl.open-bio.org/wiki/HOWTO:SeqIO).

    If no format is specified and a filename is given then the module will attempt to deduce the format from the filename suffix. If there is no suffix that Bioperl understands then it will attempt to guess the format based on file content. If this is unsuccessful then SeqIO will throw a fatal error.

    The format name is case-insensitive: 'FASTA', 'Fasta' and 'fasta' are all valid.

    Currently, the tracefile formats (except for SCF) require installation of the external Staden "io_lib" package, as well as the Bio::SeqIO::staden::read package available from the bioperl-ext repository.

  • oformat (default clustalw)

    Specify the format of the output sub-alignments. The same format names as for informat apply.

  • gc (default 0)

    The genetic code. The attribute must be one of the following integers:

    0: Standard
    1: Mammalian mitochondrial
    2: Yeast mitochondrial
    3: Mold mitochondrial
    4: Invertebrate mitochondrial
    5: Ciliate nuclear
    6: Echinoderm mitochondrial
    7: Euplotid mitochondrial
    8: Alternative yeast nuclear
    9: Ascidian mitochondrial
    10: Blepharisma nuclear

    These codes correspond to transl_table 1 to 11 of GenBank.

  • ocontact (default ocontact)

    A valid file path to write the structural results

  • dsspbin (default dssp)

    The path to the DSSP binary
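
As an illustration of the parameter list above, the sketch below merges user input over the documented defaults and maps a gc code to its GenBank transl_table number. The helpers with_defaults and gc_to_transl_table are hypothetical names used only for this example; how the module applies its defaults internally is an assumption.

```perl
use strict;
use warnings;

# Defaults as documented in the parameter list above.
my %DEFAULTS = (
    pth       => 4,            # Angstroms
    sth       => 0.05,
    sthmargin => 0,
    informat  => 'fasta',
    oformat   => 'clustalw',
    gc        => 0,
    ocontact  => 'ocontact',
    dsspbin   => 'dssp',
);

# Hypothetical helper: merge user input over the defaults and check gc.
sub with_defaults {
    my (%input) = @_;
    my %params = (%DEFAULTS, %input);
    die "gc must be an integer from 0 to 10\n"
        unless $params{gc} =~ /\A(?:[0-9]|10)\z/;
    return %params;
}

# gc codes 0..10 correspond to GenBank transl_table 1..11.
sub gc_to_transl_table {
    my ($gc) = @_;
    return $gc + 1;
}

my %params = with_defaults(chain => 'M', gc => 1);
printf "pth=%s gc=%s transl_table=%s\n",
    $params{pth}, $params{gc}, gc_to_transl_table($params{gc});
```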

MAIN METHODS

run()

Title   : run
Usage   : $obj->run
Function: Launch the whole analysis workflow
Returns : 
Args    :

run_struct()

Title   : run_struct
Usage   : $obj->run_struct
Function: Launch the structural analysis and store the result in the attribute:
          "structdata"
Returns : True if success
Args    :
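
The proximity threshold pth decides which residue pairs count as contacts. As an illustration only, the sketch below treats two residues as a contact pair when any two of their atoms lie within pth of each other. Whether the module uses this closest-atom definition or another one is an assumption, and the coordinates are made up.

```perl
use strict;
use warnings;

# Euclidean distance between two atoms given as [x, y, z] (Angstroms).
sub atom_distance {
    my ($p, $q) = @_;
    return sqrt( ($p->[0] - $q->[0])**2
               + ($p->[1] - $q->[1])**2
               + ($p->[2] - $q->[2])**2 );
}

# Assumed definition: two residues are in contact when at least one pair of
# their atoms is no farther apart than the threshold.
sub is_contact {
    my ($res_a, $res_b, $pth) = @_;
    for my $atom_a (@$res_a) {
        for my $atom_b (@$res_b) {
            return 1 if atom_distance($atom_a, $atom_b) <= $pth;
        }
    }
    return 0;
}

# Made-up residues, each a list of atom coordinates.
my $res1 = [ [0, 0, 0], [1, 0, 0] ];
my $res2 = [ [4, 0, 0], [9, 0, 0] ];
my $res3 = [ [10, 0, 0] ];

print "res1-res2: ", is_contact($res1, $res2, 4), "\n";
print "res1-res3: ", is_contact($res1, $res3, 4), "\n";
```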

run_filtering()

Title   : run_filtering
Usage   : $obj->run_filtering
Function: Build different categories of sets (Contact, NonContact ...)
          and set the attribute "lists" with the result
Returns : True if success
Args    :
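
The following sketch shows how residues might be sorted into categories from the sth, sthmargin and contactwith parameters. The record layout, the exact category definitions, the boundary handling at the edges of the margin, and the treatment of residues that only touch unlisted chains are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Classify a residue by its exposure fraction. The comparison operators at
# the boundaries (<= and >) are an assumption.
sub exposure_class {
    my ($exposure, $sth, $margin) = @_;
    return 'buried'  if $exposure <= $sth - $margin;
    return 'exposed' if $exposure >  $sth + $margin;
    return 'ignored';
}

# Hypothetical per-residue records: number, exposure, chains in contact.
my @residues = (
    { num => 10, exposure => 0.02, contacts => ['A'] },
    { num => 11, exposure => 0.30, contacts => []    },
    { num => 12, exposure => 0.05, contacts => []    },
    { num => 13, exposure => 0.40, contacts => ['C'] },
);

# contactwith restricts contacts to the listed chains.
my %wanted = map { $_ => 1 } split /,/, 'A,B,D';

my %lists = (Contact => [], NonContact => [], ExposedNonContact => []);
for my $res (@residues) {
    my @chains = @{ $res->{contacts} };
    if (grep { $wanted{$_} } @chains) {
        push @{ $lists{Contact} }, $res->{num};
        next;
    }
    # Assumption: residues touching only unlisted chains are left out.
    next if @chains;
    push @{ $lists{NonContact} }, $res->{num};
    push @{ $lists{ExposedNonContact} }, $res->{num}
        if exposure_class($res->{exposure}, 0.05, 0.01) eq 'exposed';
}

print "$_: @{ $lists{$_} }\n" for sort keys %lists;
```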

run_subalign()

Title   : run_subalign
Usage   : $obj->run_subalign
Function: Build new alignments from the input chain alignment and the categories
          built by run_filtering method. Stores the result into "subalns" attribute
Returns : True if success
Args    :
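
A sub-alignment can be pictured as the input alignment reduced to the codons of one residue category. The sketch below is a simplified illustration, assuming that residue N of the chain maps to nucleotide columns 3*(N-1) to 3*N-1 of the alignment (1-based residue numbers, no offset). The helper sub_alignment and the sequences are hypothetical, and the real method works on BioPerl alignment objects rather than plain strings.

```perl
use strict;
use warnings;

# Keep only the codons of the listed residues in every aligned sequence.
sub sub_alignment {
    my ($aln, $residues) = @_;
    my %sub;
    for my $id (keys %$aln) {
        my $codons = '';
        for my $num (@$residues) {
            $codons .= substr($aln->{$id}, 3 * ($num - 1), 3);
        }
        $sub{$id} = $codons;
    }
    return \%sub;
}

# Made-up DNA alignment of four codons.
my %aln = (
    human => 'ATGGCCAAATTT',
    mouse => 'ATGGCTAAGTTC',
);

# Sub-alignment for a category holding residues 2 and 4.
my $contact = sub_alignment(\%aln, [2, 4]);
print "$_ $contact->{$_}\n" for sort keys %$contact;
```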

run_yang()

Title   : run_yang
Usage   : $obj->run_yang
Function: Launch PAML for each alignment stored in the "subalns" attribute and
          store the results in "pamlres"
Returns : True if success
Args    :

run_stats1()

Title   : run_stats1
Usage   : $obj->run_stats1
Function: Run a Z-test on the evolutionary data obtained and store the
          results in the "stats" attribute
Returns : True if success
Args    :
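
For readers unfamiliar with the Z-test, the sketch below computes a Z statistic for the difference between two independent estimates. Which quantities the module actually compares, and how it obtains their standard errors, is not stated here, so the function z_score and the numbers are purely illustrative assumptions.

```perl
use strict;
use warnings;

# Z statistic for the difference between two independent estimates,
# each given with its standard error.
sub z_score {
    my ($est1, $se1, $est2, $se2) = @_;
    my $se_diff = sqrt($se1**2 + $se2**2);
    die "Standard errors must not both be zero\n" if $se_diff == 0;
    return ($est1 - $est2) / $se_diff;
}

# Made-up estimates for two residue categories.
my $z = z_score(0.10, 0.03, 0.20, 0.04);
printf "z = %.2f, significant at 5%% (two-sided): %s\n",
    $z, (abs($z) > 1.96 ? 'yes' : 'no');
```

The cutoff of 1.96 is the usual two-sided 5% critical value of the standard normal distribution.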

run_report()

Title   : run_report
Usage   : $obj->run_report
Function: Build an HTML report
Returns : [String] HTML report with the results and input data
Args    :

PROCESSED DATA STORAGE

Once each analysis has been performed, the resulting data is stored in other settable attributes:

  • structdata

    [Array] A table with the structural information calculated by Coevolution::Contact.pm and DSSP

  • lists

    [Hash] Each item contains a list of residue numbers belonging to one category of residues. The key of an item is the name of its category:

    Contact
    NonContact
    ExposedNonContact
    ContactWith_$specified_chains [...]

  • subalns

    [Hash] Each item contains a sub-alignment for a given category (see above)

  • pamlres

    [Hash] Results of the evolutionary analysis. Each item contains the results for a given sub-alignment (see above)

  • stats

    [Hash] Statistical results

SECONDARY METHODS (but no less important)

All attributes are accessible and mutable through methods named get_<attribute> and set_<attribute>, respectively. For example, to get the value of the proximity threshold:

$proximity_threshold = $obj->get_pth;
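
To make the naming convention concrete, here is a minimal sketch of how get_ and set_ accessors of this kind can be generated in Perl. Sketch::Params is a hypothetical class, not part of this module, and the way the module actually implements its accessors (and what its setters return) is an assumption.

```perl
use strict;
use warnings;

package Sketch::Params;   # hypothetical class, not part of the module

sub new {
    my ($class, %args) = @_;
    return bless { pth => 4, sth => 0.05, %args }, $class;
}

# Generate get_<attribute> and set_<attribute> for each attribute name.
for my $attr (qw(pth sth)) {
    no strict 'refs';
    *{ __PACKAGE__ . "::get_$attr" } = sub { return $_[0]{$attr} };
    *{ __PACKAGE__ . "::set_$attr" } = sub { $_[0]{$attr} = $_[1]; return };
}

package main;

my $obj = Sketch::Params->new;
$obj->set_pth(5);
print "pth is now ", $obj->get_pth, "\n";
```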

AUTHOR - Hector Valverde

Hector Valverde, <hvalverde@uma.es>

CONTRIBUTORS

Juan Carlos Aledo, <caledo@uma.es>

BUGS

Please report any bugs or feature requests to bug-coevolution-compcoe at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Coevolution-Compcoe. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the UNIX man command.

man Coevolution::Compcoe

LICENSE AND COPYRIGHT

Copyright 2013 Hector Valverde and Juan Carlos Aledo.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.