NAME

Bio::Structure::SecStr::STRIDE::Res - Module for parsing/accessing stride output

SYNOPSIS

my $stride_obj = new Bio::Structure::SecStr::STRIDE::Res( '-file' => 'filename.stride' );

# or

my $stride_obj = new Bio::Structure::SecStr::STRIDE::Res( '-fh' => \*STDIN );

# Get secondary structure assignment for PDB residue 20 of chain A
$sec_str = $stride_obj->resSecStr( '20:A' );

# same
$sec_str = $stride_obj->resSecStr( 20, 'A' );

DESCRIPTION

STRIDE::Res is a module for objectifying STRIDE output. STRIDE is a program (similar to DSSP) for assigning secondary structure to individual residues of a PDB structure file.

( Knowledge-Based Protein Secondary Structure Assignment,
PROTEINS: Structure, Function, and Genetics 23:566-579 (1995) )

STRIDE is available here: http://www.embl-heidelberg.de/argos/stride/down_stride.html

Once the output has been parsed, methods are available for extracting all of the information present within it, or convenient subsets of it.

Although they are very similar in function, DSSP and STRIDE differ somewhat in output format. These differences are reflected in the return values of some methods of these modules. For example, both the STRIDE and DSSP parsers have resSecStr() methods for returning the secondary structure of a given residue. However, the range of return values for DSSP is ( H, B, E, G, I, T, and S ) whereas the range of values for STRIDE is ( H, G, I, E, B, b, T, and C ). See individual methods for details.

The methods are roughly divided into 3 sections:

 1.  Global features of this structure (PDB ID, total surface area,
     etc.).  These methods do not require an argument. 
 2.  Residue specific features ( amino acid, secondary structure,
     solvent exposed surface area, etc. ).  These methods do require an
     argument.  The argument is supposed to uniquely identify a
     residue described within the structure.  It can be of any of the
     following forms:
     ('#A:B') or ( #, 'A', 'B' )
       || |
       || --- Chain ID (blank for single chain)
       |----- Insertion code for this residue.  Blank for most residues.
       ------ Numeric portion of residue ID.

     (#)
      |
      --- Numeric portion of residue ID.  If there is only one chain and
          it has no ID AND there is no residue with an insertion code at this
          number, then this can uniquely specify a residue.

     ('#:C') or ( #, 'C' )
       | |
       | --- Chain ID
       ----- Numeric portion of residue ID.

    If a residue is incompletely specified then the first residue that
    fits the arguments is returned.  For example, if 19 is the argument
    and there are three chains, A, B, and C with a residue whose number
    is 19, then 19:A will be returned (assuming it is listed first).

    Since neither DSSP nor STRIDE correctly handles alt-loc codes, they
    are not supported by these modules.

 3.  Value-added methods.  Return values are not verbatim strings
     parsed from DSSP or STRIDE output.
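
The residue identifier forms listed above can be illustrated with a minimal sketch. The helper name split_res_id and its regular expression are assumptions made for illustration; they are not part of this module and may not match how the module parses its arguments internally.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): split a residue
# identifier into ( number, insertion code, chain ID ).
sub split_res_id {
    my @args = @_;

    # Single-string forms: '19', '20:A' or '20A:B'
    if ( @args == 1 ) {
        my ( $num, $ins, $chain ) =
            $args[0] =~ /^\s*(-?\d+)([A-Za-z]?)(?::(.?))?\s*$/
            or die "Cannot parse residue ID '$args[0]'\n";
        return ( $num, $ins, defined $chain ? $chain : '' );
    }

    # List forms: ( #, 'C' ) or ( #, 'A', 'B' )
    return ( $args[0], '', $args[1] ) if @args == 2;
    return @args[ 0 .. 2 ]           if @args == 3;
    die "Expected between one and three arguments\n";
}

# Print each form as number|insertion code|chain
print join( '|', split_res_id('20A:B') ), "\n";
print join( '|', split_res_id( 20, 'A' ) ), "\n";
```

An empty insertion code or chain ID is returned as an empty string, so callers can tell which parts of the identifier were actually supplied.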

FEEDBACK

Mailing Lists

User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one of the Bioperl mailing lists. Your participation is much appreciated.

bioperl-l@bioperl.org          - General discussion
http://bio.perl.org/MailList.html             - About the mailing lists

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via email or the web:

bioperl-bugs@bio.perl.org
http://bugzilla.bioperl.org/

AUTHOR - Ed Green

Email ed@compbio.berkeley.edu

APPENDIX

The rest of the documentation details each method. Internal methods are preceded with a _.

new

 Title         : new
 Usage         : makes new object of this class
 Function      : Constructor
 Example       : $stride_obj = Bio::Structure::SecStr::STRIDE::Res->new( '-file' => filename );
                 # or
                 $stride_obj = Bio::Structure::SecStr::STRIDE::Res->new( '-fh'   => FILEHANDLE );
 Returns       : object (ref)
 Args          : filename or filehandle( must be proper STRIDE output )

totSurfArea

Title         : totSurfArea
Usage         : returns sum of surface areas of all residues of all
                chains considered.  Result is memoized.
Function      :
Example       : $tot_SA = $stride_obj->totSurfArea();
Returns       : scalar
Args          : none
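
The note that the result is memoized can be read as "computed on the first call and cached afterwards". The following is a minimal sketch of that pattern; the package Demo::Surface, its fields, and the method name tot_surf_area are hypothetical and are not taken from this module's source.

```perl
use strict;
use warnings;

{
    package Demo::Surface;    # hypothetical class, not part of Bioperl

    sub new {
        my ( $class, @areas ) = @_;
        return bless { areas => [@areas] }, $class;
    }

    sub tot_surf_area {
        my ($self) = @_;

        # Compute the sum once, then reuse the cached value on later calls.
        unless ( defined $self->{total} ) {
            my $sum = 0;
            $sum += $_ for @{ $self->{areas} };
            $self->{total} = $sum;
        }
        return $self->{total};
    }
}

my $demo = Demo::Surface->new( 10.5, 20.25, 0 );
print $demo->tot_surf_area(), "\n";
```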

numResidues

Title         : numResidues
Usage         : returns total number of residues in all chains or
                just the specified chain
Function      : 
Example       : $tot_res = $stride_obj->numResidues();
Returns       : scalar int
Args          : none or chain id

pdbID

Title         : pdbID
Usage         : returns pdb identifier ( 1FJM, e.g. )
Function      : 
Example       : $pdb_id = $stride_obj->pdbID();
Returns       : scalar string
Args          : none

pdbAuthor

Title         : pdbAuthor
Usage         : returns author of this PDB entry
Function      : 
Example       : $auth = $stride_obj->pdbAuthor();
Returns       : scalar string
Args          : none

pdbCompound

Title         : pdbCompound
Usage         : returns string of what was found on the  
                CMP lines
Function      : 
Example       : $cmp = $stride_obj->pdbCompound();
Returns       : string
Args          : none

pdbDate

Title         : pdbDate
Usage         : returns date given in PDB file
Function      :
Example       : $pdb_date = $stride_obj->pdbDate();
Returns       : scalar
Args          : none

pdbHeader

Title         : pdbHeader
Usage         : returns string of characters found on the PDB header line
Function      :
Example       : $head = $stride_obj->pdbHeader();
Returns       : scalar
Args          : none

pdbSource

Title         : pdbSource
Usage         : returns string of what was found on SRC lines
Function      : 
Example       : $src = $stride_obj->pdbSource();
Returns       : scalar
Args          : none

resAA

Title         : resAA
Usage         : returns 1 letter abbr. of the amino acid specified by
                the arguments
Function      : 
Examples      : $aa = $stride_obj->resAA( RESIDUE_ID );
Returns       : scalar character
Args          : RESIDUE_ID

resPhi

Title         : resPhi
Usage         : returns phi angle of specified residue
Function      :
Example       : $phi = $stride_obj->resPhi( RESIDUE_ID );
Returns       : scalar
Args          : RESIDUE_ID

resPsi

Title         : resPsi
Usage         : returns psi angle of specified residue
Function      :
Example       : $psi = $stride_obj->resPsi( RESIDUE_ID );
Returns       : scalar
Args          : RESIDUE_ID

resSolvAcc

Title         : resSolvAcc
Usage         : returns stride calculated surface area of specified residue
Function      : 
Example       : $sa = $stride_obj->resSolvAcc( RESIDUE_ID );
Returns       : scalar
Args          : RESIDUE_ID

resSurfArea

Title         : resSurfArea
Usage         : returns stride calculated surface area of specified residue
Function      : 
Example       : $sa = $stride_obj->resSurfArea( RESIDUE_ID );
Returns       : scalar
Args          : RESIDUE_ID

resSecStr

Title         : resSecStr 
Usage         : gives one letter abbr. of stride determined secondary
                structure of specified residue
Function      : 
Example       : $ss = $stride_obj->resSecStr( RESIDUE_ID );
Returns       : one of: 'H' => Alpha Helix
                        'G' => 3-10 helix
                        'I' => PI-helix
                        'E' => Extended conformation
                        'B' or 'b' => Isolated bridge
                        'T' => Turn
                        'C' => Coil
                        ' ' => None
               # NOTE:  This range is slightly DIFFERENT from the
               #        DSSP method of the same name
Args          : RESIDUE_ID

resSecStrSum

Title         : resSecStrSum
Usage         : gives one letter summary of secondary structure of
                specified residue.  More general than resSecStr()
Function      :
Example       : $ss_sum = $stride_obj->resSecStrSum( RESIDUE_ID );
Returns       : one of: 'H' (helix), 'B' (beta), 'T' (turn), or 'C' (coil)
Args          : residue identifier(s) ( SEE INTRO NOTE )
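
One way to picture the relationship between resSecStr() and resSecStrSum() is as a lookup from the detailed STRIDE codes to the four summary codes. The grouping below is an assumption made for illustration (helices to 'H', strands and bridges to 'B'); the module's actual mapping may differ, and the names %SUMMARY_OF and summarize_ss are hypothetical.

```perl
use strict;
use warnings;

# Assumed grouping, for illustration only; the module's actual mapping
# may differ.
my %SUMMARY_OF = (
    H => 'H', G => 'H', I => 'H',    # helices
    E => 'B', B => 'B', b => 'B',    # extended conformation and bridges
    T => 'T',                        # turn
    C => 'C',                        # coil
);

# Hypothetical helper: reduce a detailed code to a summary code.
sub summarize_ss {
    my ($code) = @_;

    # Anything unrecognised (including ' ') is treated as coil here.
    return exists $SUMMARY_OF{$code} ? $SUMMARY_OF{$code} : 'C';
}

print join( '', map { summarize_ss($_) } split //, 'HHGGEEbTC' ), "\n";
```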

resSecStrName

Title         : resSecStrName
Usage         : gives full name of the secondary structural element
                classification of the specified residue
Function      : 
Example       : $ss_name = $stride_obj->resSecStrName( RESIDUE_ID );
Returns       : scalar string
Args          : RESIDUE_ID

strideLocs

Title         : strideLocs
Usage         : returns stride determined contiguous secondary
                structural elements as specified on the LOC lines
Function      : 
Example       : $loc_pnt = $stride_obj->strideLocs();
Returns       : reference to an array of 5 element arrays.
                0 => stride name of structural element
                1 => first residue pdb key (including insertion code, if app.)
                2 => first residue chain id
                3 => last residue pdb key (including insertion code, if app.)
                4 => last residue chain id
                NOTE the differences between this range and the range of secBounds()
Args          : none
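
A short sketch of walking the returned structure follows. The data here is mock data laid out in the five-element shape described above; the element names and residue keys are invented for illustration, and format_loc is a hypothetical helper. With a real object the array reference would come from $stride_obj->strideLocs().

```perl
use strict;
use warnings;

# Mock data in the documented shape (values are invented).
my $locs = [
    [ 'AlphaHelix', '4',   'A', '15', 'A' ],
    [ 'Strand',     '20',  'A', '26', 'A' ],
    [ 'TurnI',      '27A', 'A', '30', 'A' ],
];

# Hypothetical helper: render one 5 element entry as a single line.
sub format_loc {
    my ($loc) = @_;
    my ( $name, $first, $first_chain, $last, $last_chain ) = @$loc;
    return sprintf '%s %s:%s-%s:%s',
        $name, $first, $first_chain, $last, $last_chain;
}

print format_loc($_), "\n" for @$locs;
```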

secBounds

Title         : secBounds
Usage         : gets residue ids of boundary residues in each
                contiguous secondary structural element of specified
                chain 
Function      : 
Example       : $ss_bound_pnt = $stride_obj->secBounds( 'A' );
Returns       : reference to an array of 3 element arrays.  The first two
                elements are the PDB IDs of the start and end points,
                respectively and inclusively.  The last element is the
                STRIDE secondary structural element code (same range as
                resSecStr).
Args          : chain identifier ( one character ).  If none, '-' is assumed
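
The shape of the return value can be illustrated by collapsing per-residue codes into contiguous runs. This is a sketch under stated assumptions, not the module's implementation: runs_from_residues is a hypothetical helper and the input pairs are mock data.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse an ordered list of [ pdb_id, ss_code ]
# pairs into [ start_id, end_id, ss_code ] runs.
sub runs_from_residues {
    my @residues = @_;
    my @bounds;
    for my $res (@residues) {
        my ( $id, $ss ) = @$res;
        if ( @bounds && $bounds[-1][2] eq $ss ) {
            $bounds[-1][1] = $id;    # extend the current run
        }
        else {
            push @bounds, [ $id, $id, $ss ];    # start a new run
        }
    }
    return \@bounds;
}

my $bounds = runs_from_residues(
    [ 1, 'C' ], [ 2, 'H' ], [ 3, 'H' ], [ 4, 'H' ], [ 5, 'T' ], [ 6, 'T' ],
);
print join( ' ', @$_ ), "\n" for @$bounds;
```

Each inner array holds the inclusive start and end IDs followed by the code shared by every residue in that run.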

chains

Title         : chains
Usage         : gives array of chain IDs (characters)
Function      :
Example       : @chains = $stride_obj->chains();
Returns       : array of characters
Args          : none

getSeq

 Title         : getSeq
 Usage         : returns a Bio::PrimarySeq object which represents an
                 approximation of the sequence of the specified chain.
 Function      : For most chains of most entries, the sequence returned by
                 this method will be very good.  However, it is inherently
                 unsafe to rely on STRIDE to extract sequence information about
                 a PDB entry.  More reliable information can be obtained from
                 the PDB entry itself.  If a second argument is given
                 (and evaluates to true), the sequence generated will
                 have 'X' in positions where the PDB residue numbers are
                 discontinuous.  In some cases this results in a
                 better sequence object (when the discontinuity is
                 due to regions which were present, but could not be
                 resolved).  In other cases, it will result in a WORSE
                 sequence object (when the discontinuity is due to
                 historical sequence numbering and all sequence is
                 actually resolved).
 Example       : $pso = $stride_obj->getSeq( 'A' );
 Returns       : (reference to) a PrimarySeq object
 Args          : Chain identifier.  If none given, '-' is assumed.  
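
The effect of the optional second argument can be sketched on a plain string. The helper seq_with_gaps and its input format are assumptions for illustration only; the real method returns a Bio::PrimarySeq object and may treat insertion codes and numbering differently.

```perl
use strict;
use warnings;

# Hypothetical helper: build a sequence string from ordered
# [ residue_number, one_letter_aa ] pairs, optionally padding
# gaps in the numbering with 'X'.
sub seq_with_gaps {
    my ( $residues, $fill_gaps ) = @_;
    my $seq = '';
    my $prev;
    for my $res (@$residues) {
        my ( $num, $aa ) = @$res;
        if ( $fill_gaps && defined $prev && $num > $prev + 1 ) {
            $seq .= 'X' x ( $num - $prev - 1 );    # one X per missing number
        }
        $seq .= $aa;
        $prev = $num;
    }
    return $seq;
}

# Residues 3 and 4 are missing from this mock chain.
my @chain = ( [ 1, 'M' ], [ 2, 'K' ], [ 5, 'V' ], [ 6, 'L' ] );
print seq_with_gaps( \@chain, 0 ), "\n";
print seq_with_gaps( \@chain, 1 ), "\n";
```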

INTERNAL METHODS

_pdbNum

Title        : _pdbNum
Usage        : fetches the numeric portion of the identifier for a given
               residue as reported by the pdb entry.  Note, this DOES NOT
               uniquely specify a residue.  There may be an insertion code
               and/or chain identifier differences.
Function     : 
Example      : $pdbNum = $self->_pdbNum( 3, 'A' );
Returns      : a scalar
Args         : valid ordinal num / chain combination

_resAA

Title         : _resAA
Usage         : returns 1 letter abbr. of the amino acid specified by
                the arguments
Function      : 
Examples      : $aa = $stride_obj->_resAA( 3, '-' );
Returns       : scalar character
Args          : ( ord. num, chain )

_pdbInsCo

Title        : _pdbInsCo
Usage        : fetches the Insertion code for this residue.
Function     : 
Example      : $pdb_ins_co = $self->_pdbInsCo( 15, 'B' );
Returns      : a scalar
Args         : ordinal number and chain

_toOrdChain

Title         : _toOrdChain
Usage         : takes any set of residue identifying parameters and
                wrestles them into a two element array: the ordinal
                number of this residue and its chain.  This two element
                array can then be efficiently used as keys in many of the
                above accessor methods.

  ('#A:B') or ( #, 'A', 'B' )
    || |
    || --- Chain ID (blank for single chain)
    |----- Insertion code for this residue.  Blank for most residues.
    ------ Numeric portion of residue ID.

  (#)
   |
   --- Numeric portion of residue ID.  If there is only one chain and
       it has no ID AND there is no residue with an insertion code at this
       number, then this can uniquely specify a residue.

  ('#:C') or ( #, 'C' )
    | |
    | --- Chain ID
    ----- Numeric portion of residue ID.

  If a residue is incompletely specified then the first residue that
  fits the arguments is returned.  For example, if 19 is the argument
  and there are three chains, A, B, and C with a residue whose number
  is 19, then 19:A will be returned (assuming it is listed first).

Function      :
Example       : my ( $ord, $chain ) = $self->_toOrdChain( @args );
Returns       : two element array
Args          : valid set of residue identifier(s) ( SEE NOTE ABOVE )
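
The "first residue that fits" rule can be sketched as a linear scan over residues in file order. The table layout and the helper first_match are hypothetical and use mock data; the module's internal data structures are not shown in this documentation and may be organised differently.

```perl
use strict;
use warnings;

# Mock residue table in file order:
# [ ordinal, chain, pdb number, insertion code ]
my @table = (
    [ 1, 'A', 18, '' ],
    [ 2, 'A', 19, '' ],
    [ 1, 'B', 19, '' ],
    [ 2, 'B', 19, 'A' ],
);

# Hypothetical helper: return ( ordinal, chain ) of the first entry that
# matches every part of the identifier that was actually supplied.
# Pass undef for an insertion code or chain that was not given.
sub first_match {
    my ( $num, $ins, $chain ) = @_;
    for my $row (@table) {
        my ( $ord, $row_chain, $row_num, $row_ins ) = @$row;
        next unless $row_num == $num;
        next if defined $ins   && $row_ins ne $ins;
        next if defined $chain && $row_chain ne $chain;
        return ( $ord, $row_chain );
    }
    return;    # no residue fits
}

my ( $ord, $chain ) = first_match(19);
print "$ord $chain\n";
```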

_parse

Title         : _parse
Usage         : as name suggests, parses stride output, creating object
Function      :
Example       : $self->_parse( $io );
Returns       : 
Args          : valid Bio::Root::IO object

_parseTop

Title         : _parseTop
Usage         : makes sure this looks like stride output
Function      :
Example       : 
Returns       :
Args          :

_parseHead

Title         : _parseHead
Usage         : parses HDR, CMP, SRC, and AUT lines
Function      :
Example       :
Returns       :
Args          :

_parseSummary

Title         : _parseSummary
Usage         : parses LOC lines
Function      :
Example       :
Returns       :
Args          :

_parseASG

Title         : _parseASG
Usage         : parses ASG lines
Function      :
Example       :
Returns       :
Args          :