NAME

Bio::Structure::SecStr::DSSP::Res - Module for parsing/accessing dssp output

SYNOPSIS

my $dssp_obj = Bio::Structure::SecStr::DSSP::Res->new('-file'=>'filename.dssp');

# or

my $dssp_obj = Bio::Structure::SecStr::DSSP::Res->new('-fh'=>\*STDIN);

# get DSSP defined Secondary Structure for residue 20
$sec_str = $dssp_obj->resSecStr( 20 );

# get DSSP defined secondary structure summary for PDB residue 10 of chain A

$sec_str = $dssp_obj->resSecStrSum( '10:A' );

DESCRIPTION

DSSP::Res is a module for objectifying DSSP output. Methods are then available for extracting all the information within the output file and convenient subsets of it. The principal purpose of DSSP is to determine secondary structural elements of a given structure.

( Kabsch W, Sander C. Dictionary of protein secondary structure:
  pattern recognition of hydrogen-bonded and geometrical features.
  Biopolymers. 1983 Dec;22(12):2577-637. )

The DSSP program is available from: http://www.cmbi.kun.nl/swift/dssp

This information is available on a per residue basis ( see resSecStr and resSecStrSum methods ) or on a per chain basis ( see secBounds method ).

resSecStr() & secBounds() return one of the following:

'H' = alpha helix
'B' = residue in isolated beta-bridge
'E' = extended strand, participates in beta ladder
'G' = 3-helix (3/10 helix)
'I' = 5 helix (pi helix)
'T' = hydrogen bonded turn
'S' = bend
' ' = no assignment

A more general classification is returned using the resSecStrSum() method. The purpose of this is to have a method for DSSP and STRIDE derived output whose range is the same. Its output is one of the following:

'H' = helix         ( => 'H', 'G', or 'I' from above )
'B' = beta          ( => 'B' or 'E' from above )
'T' = turn          ( => 'T' or 'S' from above )
' ' = no assignment ( => ' ' from above )
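
The mapping above can be sketched as a simple lookup table. This is only an illustration of the table, not the module's own code: summarize_sec_str is a hypothetical helper, and the fallback to ' ' for an unrecognised code is an assumption.

```perl
use strict;
use warnings;

# Summary class for each DSSP code, following the table above.
my %SUMMARY_OF = (
    H   => 'H', G => 'H', I => 'H',    # helices
    B   => 'B', E => 'B',              # beta
    T   => 'T', S => 'T',              # turns and bends
    ' ' => ' ',                        # no assignment
);

# Hypothetical helper, not part of the module's API.
# Assumption: any unrecognised code is treated as "no assignment".
sub summarize_sec_str {
    my ($code) = @_;
    return exists $SUMMARY_OF{$code} ? $SUMMARY_OF{$code} : ' ';
}

print summarize_sec_str('G'), "\n";    # prints "H"
```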

The methods are roughly divided into 3 sections:

1. Global features of this structure (PDB ID, total surface area, etc.). These methods do not require an argument.

2. Residue specific features ( amino acid, secondary structure, solvent exposed surface area, etc. ). These methods do require an argument. The argument is supposed to uniquely identify a residue described within the structure. It can be of any of the following forms:

  ('#A:B') or ( #, 'A', 'B' )
    || |
    || - Chain ID (blank for single chain)
    |--- Insertion code for this residue.  Blank for most residues.
    |--- Numeric portion of residue ID.

  (#)
   |
   --- Numeric portion of residue ID.  If there is only one chain and
       it has no ID AND there is no residue with an insertion code at this
       number, then this can uniquely specify a residue.

  ('#:C') or ( #, 'C' )
    | |
    | -Chain ID
    ---Numeric portion of residue ID.

If a residue is incompletely specified then the first residue that
fits the arguments is returned.  For example, if 19 is the argument
and there are three chains, A, B, and C with a residue whose number
is 19, then 19:A will be returned (assuming it is listed first).

Since neither DSSP nor STRIDE correctly handles alt-loc codes, they
are not supported by these modules.

3. Value-added methods. Return values are not verbatim strings parsed from DSSP or STRIDE output.
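
The string forms of the residue identifier described in section 2 can be illustrated with a small parser. This is a rough sketch and not the module's own parsing code: split_residue_id is a hypothetical helper, and the regular expression assumes a single-letter insertion code and a single-character chain ID.

```perl
use strict;
use warnings;

# Hypothetical helper: split a residue ID string such as '10B:A', '10:A'
# or '10' into ( number, insertion code, chain ID ).  Missing parts come
# back as empty strings; an unparsable string yields an empty list.
sub split_residue_id {
    my ($id) = @_;
    my ( $num, $ins, $chain ) = $id =~ /^(-?\d+)([A-Za-z]?)(?::(.?))?$/
        or return;
    $chain = '' unless defined $chain;
    return ( $num, $ins, $chain );
}

printf "%s|%s|%s\n", split_residue_id('10B:A');    # prints "10|B|A"
```

A bare number such as '19' yields an empty insertion code and chain ID, which is the incompletely specified case described above.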

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.

bioperl-l@bioperl.org                  - General discussion
http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

Support

Please direct usage questions or support issues to the mailing list:

bioperl-l@bioperl.org

rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:

http://bugzilla.open-bio.org/

AUTHOR - Ed Green

Email ed@compbio.berkeley.edu

APPENDIX

The rest of the documentation details each method. Internal methods are preceded with a _

CONSTRUCTOR

new

Title         : new
Usage         : makes new object of this class
Function      : Constructor
Example       : $dssp_obj = Bio::Structure::SecStr::DSSP::Res->new( '-file' => 'filename.dssp' )
Returns       : object (ref)
Args          : '-file' => filename or '-fh' => FILEHANDLE
                ( must be proper DSSP output )

ACCESSORS

totSurfArea

Title         : totSurfArea
Usage         : returns total accessible surface area in square Ang.
Function      :
Example       : $surArea = $dssp_obj->totSurfArea();
Returns       : scalar
Args          : none

numResidues

Title         : numResidues
Usage         : returns the total number of residues in all chains or
                just the specified chain if a chain is specified
Function      :
Example       : $num_res = $dssp_obj->numResidues();
Returns       : scalar int
Args          : none, or a chain id ( 'A', for example )

pdbID

Title         : pdbID
Usage         : returns pdb identifier ( 1FJM, e.g.)
Function      :
Example       : $pdb_id = $dssp_obj->pdbID();
Returns       : scalar string
Args          : none

pdbAuthor

Title         : pdbAuthor
Usage         : returns author field
Function      :
Example       : $auth = $dssp_obj->pdbAuthor()
Returns       : scalar string
Args          : none

pdbCompound

Title         : pdbCompound
Usage         : returns pdbCompound given in PDB file
Function      :
Example       : $cmpd = $dssp_obj->pdbCompound();
Returns       : scalar string
Args          : none

pdbDate

Title         : pdbDate
Usage         : returns date given in PDB file
Function      :
Example       : $pdb_date = $dssp_obj->pdbDate();
Returns       : scalar
Args          : none

pdbHeader

Title         : pdbHeader
Usage         : returns header info from PDB file
Function      :
Example       : $header = $dssp_obj->pdbHeader();
Returns       : scalar
Args          : none

pdbSource

Title         : pdbSource
Usage         : returns pdbSource information from PDBSOURCE line
Function      :
Example       : $pdbSource = $dssp_obj->pdbSource();
Returns       : scalar
Args          : none

resAA

Title         : resAA
Usage         : fetches the 1 char amino acid code, given an id
Function      :
Example       : $aa = $dssp_obj->resAA( '20:A' ); # pdb id as arg
Returns       : 1 character scalar string
Args          : RESIDUE_ID

resPhi

Title         : resPhi
Usage         : returns phi angle of a single residue
Function      : accessor
Example       : $phi = $dssp_obj->resPhi( RESIDUE_ID )
Returns       : scalar
Args          : RESIDUE_ID

resPsi

Title         : resPsi
Usage         : returns psi angle of a single residue
Function      : accessor
Example       : $psi = $dssp_obj->resPsi( RESIDUE_ID )
Returns       : scalar
Args          : RESIDUE_ID

resSolvAcc

Title         : resSolvAcc
Usage         : returns solvent exposed area of this residue in
                square Angstroms
Function      :
Example       : $solv_acc = $dssp_obj->resSolvAcc( RESIDUE_ID );
Returns       : scalar
Args          : RESIDUE_ID

resSurfArea

Title         : resSurfArea
Usage         : returns solvent exposed area of this residue in
                square Angstroms
Function      :
Example       : $solv_acc = $dssp_obj->resSurfArea( RESIDUE_ID );
Returns       : scalar
Args          : RESIDUE_ID

resSecStr

Title         : resSecStr
Usage         : $ss = $dssp_obj->resSecStr( RESIDUE_ID );
Function      : returns the DSSP secondary structural designation of this residue
Example       :
Returns       : a character ( 'B', 'E', 'G', 'H', 'I', 'S', 'T', or ' ' )
Args          : RESIDUE_ID
NOTE          : The range of this method differs from that of the
   resSecStr method in the STRIDE SecStr parser.  That is because of the
   slightly different format for STRIDE and DSSP output.  The resSecStrSum
   method exists to map these different ranges onto an identical range.

resSecStrSum

Title         : resSecStrSum
Usage         : $ss = $dssp_obj->resSecStrSum( $id );
Function      : returns what secondary structure group this residue belongs
                to.  One of:  'H': helix ( H, G, or I )
                              'B': beta  ( B or E )
                              'T': turn  ( T or S )
                              ' ': none  ( ' ' )
                This method is similar to resSecStr, but the information
                it returns is less specific.
Example       :
Returns       : a character ( 'H', 'B', 'T', or ' ' )
Args          : dssp residue number or pdb residue identifier

hBonds

Title         : hBonds
Usage         : returns number of 14 different types of H Bonds
Function      :
Example       : $hb = $dssp_obj->hBonds
Returns       : pointer to 14 element array of ints
Args          : none
NOTE          : The different type of H-Bonds reported are, in order:
   TYPE O(I)-->H-N(J)
   IN PARALLEL BRIDGES
   IN ANTIPARALLEL BRIDGES
   TYPE O(I)-->H-N(I-5)
   TYPE O(I)-->H-N(I-4)
   TYPE O(I)-->H-N(I-3)
   TYPE O(I)-->H-N(I-2)
   TYPE O(I)-->H-N(I-1)
   TYPE O(I)-->H-N(I+0)
   TYPE O(I)-->H-N(I+1)
   TYPE O(I)-->H-N(I+2)
   TYPE O(I)-->H-N(I+3)
   TYPE O(I)-->H-N(I+4)
   TYPE O(I)-->H-N(I+5)
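
Because the counts come back as a bare 14 element array, it can help to pair each count with its label. The sketch below does this for any array reference with the layout listed in the NOTE; label_hbonds is a hypothetical helper, and the counts passed to it here are placeholders rather than real DSSP output.

```perl
use strict;
use warnings;

# Labels in the order listed in the NOTE above.
my @HBOND_TYPES = ( 'O(I)-->H-N(J)', 'IN PARALLEL BRIDGES', 'IN ANTIPARALLEL BRIDGES' );
push @HBOND_TYPES, sprintf( 'O(I)-->H-N(I%+d)', $_ ) for -5 .. 5;

# Hypothetical helper: pair each count with its label.
sub label_hbonds {
    my ($counts) = @_;    # reference to a 14 element array, as described for hBonds
    die "expected 14 counts\n" unless @$counts == @HBOND_TYPES;
    my %by_type;
    @by_type{@HBOND_TYPES} = @$counts;
    return \%by_type;
}

# Placeholder counts for illustration only, not real DSSP output.
my $labelled = label_hbonds( [ 1 .. 14 ] );
print $labelled->{'O(I)-->H-N(I+4)'}, "\n";    # prints "13"
```

With a real object, the array reference returned by $dssp_obj->hBonds would be passed in place of the placeholder list.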

numSSBr

Title         : numSSBr
Usage         : returns info about number of SS-bridges
Function      :
Example       : @SS_br = $dssp_obj->numSSBr();
Returns       : 3 element scalar int array
Args          : none

resHB_O_HN

Title         : resHB_O_HN
Usage         : returns pointer to a 4 element array
                consisting of: relative position of binding
                partner #1, energy of that bond (kcal/mol),
                relative position of binding partner #2,
                energy of that bond (kcal/mol).  If the bond
                is not bifurcated, the second bond is reported
                as 0, 0.0
Function      : accessor
Example       : $oBonds_ptr = $dssp_obj->resHB_O_HN( RESIDUE_ID )
Returns       : pointer to 4 element array
Args          : RESIDUE_ID

resHB_NH_O

Title         : resHB_NH_O
Usage         : returns pointer to a 4 element array
                consisting of: relative position of binding
                partner #1, energy of that bond (kcal/mol),
                relative position of binding partner #2,
                energy of that bond (kcal/mol).  If the bond
                is not bifurcated, the second bond is reported
                as 0, 0.0
Function      : accessor
Example       : $nhBonds_ptr = $dssp_obj->resHB_NH_O( RESIDUE_ID )
Returns       : pointer to 4 element array
Args          : RESIDUE_ID
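
The 4 element layout shared by resHB_O_HN and resHB_NH_O can be unpacked into one record per bonding partner. This is a minimal sketch under stated assumptions: hbond_partners is a hypothetical helper, a "0, 0.0" pair is taken to mean "no bond" in either slot, and the sample values are placeholders, not real DSSP output.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: turn the 4 element array
# into a list of { offset, energy } records, skipping "0, 0.0" pairs.
sub hbond_partners {
    my ($bonds) = @_;    # [ offset1, energy1, offset2, energy2 ]
    my @partners;
    for my $i ( 0, 2 ) {
        my ( $offset, $energy ) = @{$bonds}[ $i, $i + 1 ];
        next if $offset == 0 && $energy == 0;
        push @partners, { offset => $offset, energy => $energy };
    }
    return @partners;
}

# Placeholder values for illustration only, not real DSSP output.
my @partners = hbond_partners( [ -4, -2.3, 0, 0.0 ] );
printf "%d partner(s), first at offset %d\n", scalar @partners, $partners[0]{offset};
```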

resTco

Title         : resTco
Usage         : returns tco angle around this residue
Function      : accessor
Example       : $tco = $dssp_obj->resTco( RESIDUE_ID )
Returns       : scalar
Args          : RESIDUE_ID

resKappa

Title         : resKappa
Usage         : returns kappa angle around this residue
Function      : accessor
Example       : $kappa = $dssp_obj->resKappa( RESIDUE_ID )
Returns       : scalar
Args          : RESIDUE_ID ( dssp or PDB )

resAlpha

Title         : resAlpha
Usage         : returns alpha angle around this residue
Function      : accessor
Example       : $alpha = $dssp_obj->resAlpha( RESIDUE_ID )
Returns       : scalar
Args          : RESIDUE_ID ( dssp or PDB )

secBounds

Title         : secBounds
Usage         : gets residue ids of boundary residues in each
                contiguous secondary structural element of specified
                chain
Function      : returns pointer to array of 3 element arrays.  First
                two elements are the PDB IDs of the start and end points,
                respectively and inclusively.  The last element is the
                DSSP secondary structural assignment code,
                i.e. one of : ('B', 'E', 'G', 'H', 'I', 'S', 'T', or ' ')
Example       : $ss_elements_pts = $dssp_obj->secBounds( 'A' );
Returns       : pointer to array of arrays
Args          : chain id ( 'A', for example ).  No arg => no chain id
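
The array-of-arrays structure described for secBounds can be walked as in the sketch below, which picks out the elements of one secondary structure type. elements_of_type is a hypothetical helper, the sample data are placeholders, and the exact form of the ID strings (shown here as '3:A') is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the elements with the given DSSP code
# and render each one as "start..end".
sub elements_of_type {
    my ( $elements, $code ) = @_;
    my @spans;
    for my $element (@$elements) {
        my ( $start, $end, $type ) = @$element;
        push @spans, "$start..$end" if $type eq $code;
    }
    return @spans;
}

# Placeholder data for illustration only; ID strings are assumed to
# look like '3:A'.
my $bounds = [
    [ '3:A',  '12:A', 'H' ],
    [ '13:A', '15:A', 'T' ],
    [ '20:A', '26:A', 'E' ],
];
print join( ', ', elements_of_type( $bounds, 'H' ) ), "\n";    # prints "3:A..12:A"
```

With a real object, the reference returned by $dssp_obj->secBounds( 'A' ) would take the place of $bounds.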

chains

Title         : chains
Usage         : returns pointer to array of chain I.D.s (characters)
Function      :
Example       : $chains_pnt = $dssp_obj->chains();
Returns       : pointer to array of characters, one of which may be ' '
Args          : none

residues

Title         : residues
Usage         : returns array of residue identifiers for all residues in
                the output file, or in a specific chain
Function      :
Example       : @residues_ids = $dssp_obj->residues()
Returns       : array of residue identifiers
Args          : if none => returns residue ids of all residues of all
                chains (in order); if chain id is given, returns just the
                residue ids of residues in that chain

getSeq

Title         : getSeq
Usage         : returns a Bio::PrimarySeq object which represents a good
                guess at the sequence of the given chain
Function      : For most chains of most entries, the sequence returned by
                this method will be very good.  However, it is inherently
                unsafe to rely on DSSP to extract sequence information about
                a PDB entry.  More reliable information can be obtained from
                the PDB entry itself.
Example       : $pso = $dssp_obj->getSeq( 'A' );
Returns       : (pointer to) a PrimarySeq object
Args          : Chain identifier.  If none given, ' ' is assumed.  If no ' '
                chain, the first chain is used.

INTERNAL METHODS

_pdbChain

Title         : _pdbChain
Usage         : returns the pdb chain id of given residue
Function      :
Example       : $chain_id = $dssp_obj->_pdbChain( DSSP_KEY );
Returns       : scalar
Args          : DSSP_KEY ( dssp or pdb )

_resAA

Title         : _resAA
Usage         : fetches the 1 char amino acid code, given a dssp id
Function      :
Example       : $aa = $dssp_obj->_resAA( dssp_id );
Returns       : 1 character scalar string
Args          : dssp_id

_pdbNum

Title        : _pdbNum
Usage        : fetches the numeric portion of the identifier for a given
               residue as reported by the pdb entry.  Note, this DOES NOT
               uniquely specify a residue.  There may be an insertion code
               and/or chain identifier differences.
Function     :
Example      : $pdbNum = $self->_pdbNum( DSSP_ID );
Returns      : a scalar
Args         : DSSP_ID

_pdbInsCo

Title        : _pdbInsCo
Usage        : fetches the Insertion Code for this residue, if it has one.
Function     :
Example      : $pdbInsCo = $self->_pdbInsCo( DSSP_ID );
Returns      : a scalar
Args         : DSSP_ID

_toPdbId

Title        : _toPdbId
Usage        : Takes a dssp key and builds the corresponding
               PDB identifier string
Function     :
Example      : $pdbId = $self->_toPdbId( DSSP_ID );
Returns      : scalar
Args         : DSSP_ID

_contSegs

Title         : _contSegs
Usage         : find the endpoints of continuous regions of this structure
Function      : returns pointer to array of 3 element array.
                Elements are the dssp keys of the start and end points of each
                continuous element and its PDB chain id (may be blank).
                Note that it is common to have several
                continuous elements with the same chain id.  This occurs
                when an internal region is disordered and no structural
                information is available.
Example       : $cont_seg_ptr = $dssp_obj->_contSegs();
Returns       : pointer to array of arrays
Args          : none

_numResLines

Title         : _numResLines
Usage         : returns the total number of residue lines in this
                dssp file.
                This number is DIFFERENT than the number of residues in
                the pdb file because dssp has chain termination and chain
                discontinuity 'residues'.
Function      :
Example       : $num_res = $dssp_obj->_numResLines();
Returns       : scalar int
Args          : none

_toDsspKey

Title         : _toDsspKey
Usage         : returns the unique dssp integer key given a pdb residue id.
                All accessor methods require (internally)
                the dssp key.   This method is very useful in converting
                pdb keys to dssp keys so the accessors can accept pdb keys
                as argument.  PDB Residue IDs are inherently
                problematic since they have multiple parts of
                overlapping function and ill-defined or observed
                convention in form.  Input can be in any of the formats
                described in the DESCRIPTION section above.
Function      :
Example       : $dssp_id = $dssp_obj->_toDsspKey( '10B:A' )
Returns       : scalar int
Args          : pdb residue identifier: num[insertion code]:[chain]

_parse

Title         : _parse
Usage         : parses dssp output
Function      :
Example       : used by the constructor
Returns       :
Args          : input source ( handled by Bio::Root::IO )

_parseResLine

Title         : _parseResLine
Usage         : parses a single residue line
Function      :
Example       : used internally
Returns       :
Args          : residue line ( string )