NAME

Bio::ToolBox::db_helper::wiggle

DESCRIPTION

This module collects dataset scores from a binary wig file (.wib) that is referenced in the database. Typically, a single feature representing the dataset spans each chromosome. The feature should carry an attribute ('wigfile') that gives the location of the binary file holding the dataset scores. The file is read using the Bio::Graphics::Wiggle module, and the values are extracted from the region of interest.

Scores may be restricted to strand by specifying the desired strandedness. For example, to collect transcription data over a gene, pass the strandedness value 'sense'. If the strand of the region database object (representing the gene) matches the strand of the wig file data feature, then the data is collected.
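The strand test described above can be sketched roughly as follows. This is only an illustration, not the module's code: strand_matches() is a hypothetical helper, and both the treatment of "antisense" as the opposite strand and the choice to always accept unstranded data (strand 0) are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::ToolBox: decide whether scores from
# a wig data feature should be collected for a region.
sub strand_matches {
    my ($region_strand, $data_strand, $stranded) = @_;

    # Strand is ignored when no strandedness is requested
    return 1 if ($stranded eq 'none' || $stranded eq 'no');

    # Assumption: unstranded data (strand 0) is always collected
    return 1 if $data_strand == 0;

    # Sense: region and data feature lie on the same strand
    return 1 if ($stranded eq 'sense' && $region_strand == $data_strand);

    # Assumption: antisense means the data lies on the opposite strand
    return 1 if ($stranded eq 'antisense' && $region_strand != $data_strand);

    return 0;
}

# A forward-strand gene compared with data features on each strand
for my $data_strand (1, -1) {
    printf "data strand %2d: sense=%d antisense=%d\n",
        $data_strand,
        strand_matches(1, $data_strand, 'sense'),
        strand_matches(1, $data_strand, 'antisense');
}
```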

For loading wig files into a Bio::DB database, see the perl script 'wiggle2gff3.pl' included with the Bio::Graphics distribution, as well as Bio::Graphics::Wiggle::Loader.

To speed up the program and avoid repetitive opening and closing of the files, the opened wig file object is stored in a global hash in case it is needed again.
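The caching idea can be illustrated with a minimal sketch. The names %OPENED, open_wig() and get_wig() are hypothetical, and open_wig() is a stand-in that returns a plain hash reference where the real module would presumably build a Bio::Graphics::Wiggle object.

```perl
use strict;
use warnings;

my %OPENED;            # path => opened object, kept for the life of the program
my $open_count = 0;    # counts how often a file is really opened

# Hypothetical stand-in for opening a .wib file
sub open_wig {
    my $path = shift;
    $open_count++;
    return { path => $path };
}

# Return the cached object if present, otherwise open and remember it
sub get_wig {
    my $path = shift;
    $OPENED{$path} //= open_wig($path);
    return $OPENED{$path};
}

get_wig('chr1.wib') for 1 .. 3;    # opened once, reused twice
get_wig('chr2.wib');               # a different file, opened once
print "files opened: $open_count\n";
```

Keying the hash on the file path means that repeated requests for the same chromosome file reuse a single object, at the cost of keeping every opened file in memory until the program exits.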

USAGE

The module requires Lincoln Stein's Bio::Graphics to be installed.

Load the module at the beginning of your program.

use Bio::ToolBox::db_helper::wiggle;

The module automatically exports the names of its subroutines.

collect_wig_scores

This subroutine will collect only the score values from a binary wig file for the specified database region. The positional information of the scores is not retained, and the values are best further processed through some statistical method (mean, median, etc.).

The subroutine is passed six or more arguments in the following order:

1) The start position of the segment to collect from
2) The stop or end position of the segment to collect from
3) The strand of the original feature (or region), -1, 0, or 1.
4) A scalar value representing the desired strandedness of the data 
   to be collected. Acceptable values include "sense", "antisense", 
   "none" or "no". Only those scores which match the indicated 
   strandedness are collected.
5) The method or type of data collected. 
   Acceptable values include 'score' (returns the score), 
   'count' (the number of defined positions with scores), or 
   'length' (the wig step is used here).  
6) One or more database feature objects that contain the reference 
   to the wib file. They should contain the attribute 'wigfile'.

The subroutine returns an array of the defined dataset values found within the region of interest.
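Because the returned values carry no positions, they are usually reduced to a summary statistic. The sketch below shows one way to do that using only core Perl. The @scores values are made-up example data standing in for the returned array, and mean() and median() are hypothetical helpers, not part of this module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Made-up values standing in for the array returned by collect_wig_scores()
my @scores = (2.5, 4.0, 1.5, 3.0, 9.0);

# Arithmetic mean; returns undef for an empty list
sub mean {
    return unless @_;
    return sum(@_) / scalar(@_);
}

# Median; averages the two middle values for an even count
sub median {
    return unless @_;
    my @sorted = sort { $a <=> $b } @_;
    my $mid    = int(scalar(@sorted) / 2);
    return scalar(@sorted) % 2
        ? $sorted[$mid]
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

printf "mean = %.2f, median = %.2f\n", mean(@scores), median(@scores);
```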

collect_wig_position_scores

This subroutine will collect the score values from a binary wig file for the specified database region keyed by position.

The subroutine is passed the same arguments as collect_wig_scores().

The subroutine returns a hash of the defined dataset values found within the region of interest keyed by position. Note that only one value is returned per position, regardless of the number of dataset features passed.
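Since Perl hashes are unordered, a position-keyed result generally has to be sorted numerically before it is walked along the region. A minimal sketch follows, where %pos2score holds made-up example data standing in for the returned hash.

```perl
use strict;
use warnings;

# Made-up data standing in for the hash returned by
# collect_wig_position_scores(): position => score
my %pos2score = (1050 => 3.2, 1000 => 1.1, 1025 => 2.4);

# Sort the positions numerically and pair each with its score
my @ordered = map { [ $_, $pos2score{$_} ] }
              sort { $a <=> $b } keys %pos2score;

# Print one tab-separated position and score per line
printf "%d\t%.1f\n", @$_ for @ordered;
```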

AUTHOR

Timothy J. Parnell, PhD
Howard Hughes Medical Institute
Dept of Oncological Sciences
Huntsman Cancer Institute
University of Utah
Salt Lake City, UT, 84112

This package is free software; you can redistribute it and/or modify it under the terms of the Artistic License 2.0.