NAME
RankEnumeratedStructures
VERSION
Version 0.05
SYNOPSIS
This module ranks all the enumerated structures using a composite energy function that consists of four components: (1) radius of gyration, (2) solvation potential, (3) hydrogen bond potential, and (4) statistical potential.
use RankEnumeratedStructures;
rank_structures ($pdbcode,$stericlimit,@indices);
EXPORT
rank_structures pre_rank_structures get_energy
pre_rank_structures
Subroutine to prepare for rank_structures
rank_structures
This subroutine ranks the structures generated by the full enumeration of the candidate smotif combinations. The ranking takes place in two parts: the full set is ranked using a 'coarse' scoring function, and the top 1000 structures are re-ranked using a 'refined' scoring function. Both functions use 4 scoring component values: radius of gyration, statistical pairwise contact potential, implicit solvation potential, and long range H bond potential.
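The two-part ranking described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the name two_stage_rank, the use of code references for the coarse and refined scoring functions, and the convention that a lower score is better are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sketch of a two-stage ranking.
# $coarse and $refined are code references mapping a structure to a score.
# Assumption: lower scores are better.
sub two_stage_rank {
    my ($structures, $coarse, $refined, $top_n) = @_;
    $top_n //= 1000;    # size of the subset passed to the refined stage

    # Stage 1: score the full set with the coarse function and sort.
    my @by_coarse = sort { $a->[0] <=> $b->[0] }
                    map  { [ $coarse->($_), $_ ] } @$structures;

    # Keep at most $top_n of the best coarse-ranked structures.
    my $last = $#by_coarse < $top_n - 1 ? $#by_coarse : $top_n - 1;
    my @top  = map { $_->[1] } @by_coarse[ 0 .. $last ];

    # Stage 2: re-rank only that subset with the refined function.
    my @by_refined = sort { $a->[0] <=> $b->[0] }
                     map  { [ $refined->($_), $_ ] } @top;

    return map { $_->[1] } @by_refined;
}
```

Scoring each structure once and sorting on the cached score keeps the scoring functions from being re-evaluated inside the sort comparator.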
INPUT ARGUMENTS
1) $pdbcode - the 4-character name of the folder to store input and output data
2) $sterlimit - number of allowable steric clashes (these clashes are calculated during the enumeration and are part of the input file; they are not calculated directly by this script)
3) @st - list of 4 numbers corresponding to the indices of the scoring function components in the tab-delimited output file from the full enumeration. Index numbering starts from 0 (not 1). See "INPUT FILES" for further details.
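As an illustration of how the steric limit might be applied, the sketch below filters enumeration lines by their clash count. It assumes that the clash count is the final field of each line (as in the sample under "INPUT FILES") and that a structure is kept when its count does not exceed the limit; the helper name within_steric_limit is hypothetical and not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the clash count on an enumeration line
# is within the allowed limit.
# Assumptions: the clash count is the last whitespace-separated field,
# and "allowable" means less than or equal to the limit.
sub within_steric_limit {
    my ($line, $sterlimit) = @_;
    my @fields = split ' ', $line;
    return $fields[-1] <= $sterlimit ? 1 : 0;
}

# Keep only the lines that pass the filter.
my @lines = ("1.4 0.7 3", "1.1 0.9 0");
my @kept  = grep { within_steric_limit($_, 1) } @lines;
print scalar(@kept), "\n";    # prints 1
```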
REQUIRED FILES (all to be found in the <pdbcode> directory)
<pdbcode>.out - file containing a list of start and end points of smotifs in the query protein, as well as secondary structure and loop lengths. This is one of the standard output files of the generate_shift_files.pl script.
<pdbcode>_motifs_best.csv - file containing a list of candidates for each putative smotif. This is one of the standard output files of the findranks.pl script.
INPUT FILES
In the <pdbcode> directory, a set of files containing the results of the full enumeration. These are the standard output files from the all_enum.pl script, and have the following format:
Sample line for a structure with 4 smotifs:
1.437 0.740 1.867 8.377 224162 148918 54194 127698 1.7483 0.9973 0.9616 1.2306 8.8294 58.8240 12 0 0 0 0
Explanation:
1.437 0.740 1.867 8.377 : RMSDs of the 4 smotif components individually
224162 148918 54194 127698 : Nids of the 4 smotif components
1.7483 : Per-residue radius of gyration z-score
0.9973 : Per-residue pairwise contact potential z-score
0.9616 : Per-residue solvation potential z-score
1.2306 : Long-range H-bond potential z-score
8.8294 : Overall structure RMSD (from solved structure)
58.8240 : Overall structure GDT_TS score
12 0 0 0 : List of indices of smotifs, as found in the <pdbcode>_motifs_best.csv file
0 : Number of steric clashes
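The field layout above can be turned into a small parser. This is a sketch under stated assumptions: parse_enum_line and the hash keys are hypothetical names chosen for illustration, and fields are split on any whitespace so that both the tab-delimited files and the space-separated sample line are accepted.

```perl
use strict;
use warnings;

# Hypothetical helper: split one enumeration line into named fields for a
# structure built from $n smotifs. Key names are illustrative only.
sub parse_enum_line {
    my ($line, $n) = @_;
    my @f = split ' ', $line;

    # n RMSDs + n Nids + 4 scores + overall RMSD + GDT_TS + n indices + clashes
    my $expected = 3 * $n + 7;
    die "expected $expected fields, got " . scalar(@f) . "\n"
        unless @f == $expected;

    return {
        smotif_rmsds  => [ @f[ 0 .. $n - 1 ] ],
        nids          => [ @f[ $n .. 2 * $n - 1 ] ],
        scores        => [ @f[ 2 * $n .. 2 * $n + 3 ] ],
        overall_rmsd  => $f[ 2 * $n + 4 ],
        gdt_ts        => $f[ 2 * $n + 5 ],
        motif_indices => [ @f[ 2 * $n + 6 .. 3 * $n + 5 ] ],
        clashes       => $f[ 3 * $n + 6 ],
    };
}

my $sample = "1.437 0.740 1.867 8.377 224162 148918 54194 127698 "
           . "1.7483 0.9973 0.9616 1.2306 8.8294 58.8240 12 0 0 0 0";
my $rec = parse_enum_line($sample, 4);
print join(' ', @{ $rec->{scores} }), "\n";    # prints 1.7483 0.9973 0.9616 1.2306
```

Checking the field count up front catches a wrong smotif count before any column is misread.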
In this case, the indices for the scoring function components are 8, 9, 10, and 11. In general, the indices run from 2*n through 2*n+3 inclusive, where n is the number of smotifs.
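The index rule can be computed rather than worked out by hand. The helper below is a hypothetical convenience, not part of this module; it returns a list suitable for passing as the third argument of rank_structures.

```perl
use strict;
use warnings;

# Hypothetical helper: column indices of the four scoring components
# for a structure built from $n smotifs.
sub score_indices {
    my ($n) = @_;
    # Columns 0..n-1 hold the RMSDs and n..2n-1 hold the Nids,
    # so the four scores occupy columns 2n..2n+3.
    return map { 2 * $n + $_ } 0 .. 3;
}

my @indices = score_indices(4);
print join(',', @indices), "\n";    # prints 8,9,10,11
```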
OUTPUT FILES
In the <pdbcode> directory:
1) <pdbcode>_ranked_coarse.csv : Top 5000 structures as ranked by the coarse scoring function. The format of each line is the same as in the enumeration output files (see INPUT FILES, above), with an additional final entry representing the scoring function output for each line.
2) <pdbcode>_ranked_refined.csv : Same as 1), but for the top structures re-ranked using the refined scoring function.
rank_energies
Subroutine to calculate energy and rank structures, given a list of energy function component scores and weights
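One plausible way to combine component scores and weights is a weighted sum, sketched below. The source does not state the functional form, so the linear combination, the lower-is-better ordering, and the names weighted_energy and rank_by_energy are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical: energy as a weighted sum of component scores.
sub weighted_energy {
    my ($scores, $weights) = @_;
    die "need one weight per component\n" unless @$scores == @$weights;
    my $energy = 0;
    $energy += $scores->[$_] * $weights->[$_] for 0 .. $#$scores;
    return $energy;
}

# Hypothetical: return the positions of the structures, ordered by
# ascending energy (assumption: lower energy ranks first).
sub rank_by_energy {
    my ($score_rows, $weights) = @_;
    my @energies = map { weighted_energy($_, $weights) } @$score_rows;
    return sort { $energies[$a] <=> $energies[$b] } 0 .. $#energies;
}

my @rows  = ([1, 1, 1, 1], [0, 0, 0, 0], [2, 0, 0, 0]);
my @order = rank_by_energy(\@rows, [1, 1, 1, 1]);
print "@order\n";    # prints 1 2 0
```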
AUTHOR
Fiserlab Members, <andras at fiserlab.org>
BUGS
Please report any bugs or feature requests to bug-. at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=.. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc RankEnumeratedStructures
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
ACKNOWLEDGEMENTS
LICENSE AND COPYRIGHT
Copyright 2015 Fiserlab Members.
This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:
http://www.perlfoundation.org/artistic_license_2_0
Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.
If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.
This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.
This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.
Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.