NAME

Bio::Root::Utilities - general-purpose utilities

SYNOPSIS

Object Creation

# Using the supplied singleton object:
use Bio::Root::Utilities qw(:obj);
$Util->some_method();

# Create an object manually:
use Bio::Root::Utilities;
my $util = Bio::Root::Utilities->new();
$util->some_method();

$date_stamp = $Util->date_format('yyyy-mm-dd');

$clean = $Util->untaint($dirty);

$compressed = $Util->compress('/home/me/myfile.txt');

my ($mean, $stdev) = $Util->mean_stdev( @data );

$Util->authority("me@example.com");
$Util->mail_authority("Something you should know about...");

...and a host of other methods. See below.

DESCRIPTION

Provides general-purpose utilities of potential interest to any Perl script.

The :obj tag is a convenience that imports a $Util symbol into your namespace; $Util holds a Bio::Root::Utilities object. This saves you from creating your own object via Bio::Root::Utilities->new() or from prefixing every method call with Bio::Root::Utilities, though you are free to do either. Since a script should not normally need more than one Bio::Root::Utilities object, this module comes with its own singleton.

INSTALLATION

This module is included with the central Bioperl distribution:

http://www.bioperl.org/wiki/Getting_BioPerl
ftp://bio.perl.org/pub/DIST

Follow the installation instructions included in the README file.

DEPENDENCIES

Inherits from Bio::Root::Root, and uses Bio::Root::IO and Bio::Root::Exception.

Relies on external executables for file compression/uncompression and for sending mail. Paths to these executables are not hard-coded; they are located as needed.

SEE ALSO

http://bioperl.org  - Bioperl Project Homepage

ACKNOWLEDGEMENTS

This module was originally developed under the auspices of the Saccharomyces Genome Database: http://www.yeastgenome.org/

AUTHOR Steve Chervitz

date_format

Title     : date_format
Usage     : $Util->date_format( [FMT], [DATE])
Purpose   : -- Get a string containing the formatted date or time
          :    taken when this routine is invoked.
          : -- Provides a way to avoid using `date`.
          : -- Provides an interface to localtime().
          : -- Interconverts some date formats.
          :
          : (For additional functionality, use Date::Manip or
          :  Date::DateCalc available from CPAN).
Example   : $Util->date_format();
          : $date = $Util->date_format('yyyy-mmm-dd', '11/22/92');
Returns   : String (unless 'list' is provided as argument, see below)
          :
          :   'yyyy-mm-dd'  = 1996-05-03    # default format.
          :   'yyyy-dd-mm'  = 1996-03-05
          :   'yyyy-mmm-dd' = 1996-May-03
          :   'd-m-y'       = 3-May-1996
          :   'd m y'       = 3 May 1996
          :   'dmy'         = 3may96
          :   'mdy'         = May 3, 1996
          :   'ymd'         = 96may3
          :   'md'          = may3
          :   'year'        = 1996
          :   'hms'         = 23:01:59  # when not converting a format, 'hms' can be
          :                             # tacked on to any of the above options
          :                             # to add the time stamp: eg 'dmyhms'
          :   'full' | 'unix' = UNIX-style date: Tue May  5 22:00:00 1998
          :   'list'          = the contents of localtime(time) in an array.
Argument  : (all are optional)
          : FMT  = yyyy-mm-dd | yyyy-dd-mm | yyyy-mmm-dd |
          :        mdy | ymd | md | d-m-y | hms | hm
          :        ('hms' may be appended to any of these to
          :        add a time stamp)
          :
          : DATE = String containing date to be converted.
          :        Acceptable input formats:
          :           12/1/97 (for 1 December 1997)
          :           1997-12-01
          :           1997-Dec-01
Throws    :
Comments  : If you don't care about formatting or using backticks, you can
          : always use: $date = `date`;
          :
          : For more features, use Date::Manip.pm, (which I should
          : probably switch to...)

See Also : file_date(), month2num()
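
A few of the formats listed above can be approximated with the core POSIX module. The following is only a sketch: format_date() and %PATTERN are hypothetical names, not part of Bio::Root::Utilities, and only a subset of the formats is covered. Month names produced by %b depend on the current locale.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical mapping from some of the format names above to
# strftime() patterns. Not the module's own implementation.
my %PATTERN = (
    'yyyy-mm-dd'  => '%Y-%m-%d',
    'yyyy-dd-mm'  => '%Y-%d-%m',
    'yyyy-mmm-dd' => '%Y-%b-%d',
    'year'        => '%Y',
    'hms'         => '%H:%M:%S',
);

sub format_date {
    my ($fmt, @time) = @_;
    $fmt  = 'yyyy-mm-dd' unless defined $fmt;    # default format
    @time = localtime(time) unless @time;        # default: now
    my $pattern = $PATTERN{$fmt};
    die "Unsupported format: $fmt\n" unless defined $pattern;
    return strftime($pattern, @time);
}

# 3 May 1996, 23:01:59 as a localtime()-style list:
# (sec, min, hour, mday, mon [0-based], year [since 1900])
my @may3 = (59, 1, 23, 3, 4, 96);

print format_date('yyyy-mm-dd',  @may3), "\n";   # 1996-05-03
print format_date('yyyy-mmm-dd', @may3), "\n";   # month name is locale-dependent
print format_date('hms',         @may3), "\n";   # 23:01:59
```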

month2num

Title      : month2num
Purpose    : Converts a string containing the name of a month to an integer
           : representing the number of the month in the year.
Example    : $Util->month2num("march");  # returns 3
Argument   : The string argument must contain at least the first
           : three characters of the month's name. Case insensitive.
Throws     : Exception if the conversion fails.

num2month

Title   : num2month
Purpose : Does the opposite of month2num.
        : Converts a number into a string containing a name of a month.
Example : $Util->num2month(3);  # returns 'Mar'
Throws  : Exception if supplied number is out of range.
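
The two conversions can be sketched with a lookup table. The names month_to_num() and num_to_month() are hypothetical, and the sketch assumes the month name starts at the beginning of the string; the module's own matching rules may differ.

```perl
use strict;
use warnings;

my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Build a lowercase three-letter lookup: jan => 1 ... dec => 12.
my %NUM_FOR;
for my $i (0 .. $#MONTHS) {
    $NUM_FOR{ lc $MONTHS[$i] } = $i + 1;
}

# Hypothetical counterpart of month2num(): uses the first three
# characters of the name, ignoring case.
sub month_to_num {
    my ($name) = @_;
    $name = '' unless defined $name;
    my $num = $NUM_FOR{ lc substr($name, 0, 3) };
    die "Cannot convert '$name' to a month number\n" unless defined $num;
    return $num;
}

# Hypothetical counterpart of num2month(): 1..12 only.
sub num_to_month {
    my ($num) = @_;
    die "Month number out of range\n"
        unless defined $num && $num =~ /^\d+$/ && $num >= 1 && $num <= 12;
    return $MONTHS[$num - 1];
}

print month_to_num('march'), "\n";   # 3
print num_to_month(3), "\n";         # Mar
```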

compress

Title     : compress
Usage     : $Util->compress(full-path-filename);
          : $Util->compress(<named parameters>);
Purpose   : Compress a file.
Example   : $Util->compress("/usr/people/me/data.txt");
          : $Util->compress(-file=>"/usr/people/me/data.txt",
          :                 -tmp=>1,
          :                 -outfile=>"/usr/people/share/data.txt.gz",
          :                 -exe=>"/usr/local/bin/fancyzip");
Returns   : String containing full, absolute path to compressed file
Argument  : Named parameters (case-insensitive):
          :   -FILE => String (name of file to be compressed, full path).
          :            If the supplied filename ends with '.gz' or '.Z',
          :            that extension will be removed before attempting to compress.
          : Optional:
          :   -TMP  => boolean. If true, (or if user is not the owner of the file)
          :            the file is compressed to a temp file. If false, file may be
          :            clobbered with the compressed version (if using a utility like
          :            gzip, which is the default)
          :   -OUTFILE => String (name of the output compressed file, full path).
          :   -EXE  => Name of executable for compression utility to use.
          :            Will supersede those in @COMPRESSION_UTILS defined by
          :            this module. If the absolute path to the executable is not provided,
          :            it will be searched in the PATH env variable.
Throws    : Exception if file cannot be compressed.
          : If user is not owner of the file, generates a warning and compresses to
          : a tmp file. To avoid this warning, use the -o file test operator
          : and call this function with -TMP=>1.
Comments  : Attempts to compress using utilities defined in the @COMPRESSION_UTILS
          : defined by this module, in the order defined. The first utility that is
          : found to be executable will be used. Any utility defined in optional -EXE param
          : will be tested for executability first.
          : To minimize security risks, the -EXE parameter value is untainted using
          : the untaint() method of this module (in 'relaxed' mode to permit path separators).

See Also : uncompress()

uncompress

Title     : uncompress
Usage     : $Util->uncompress(full-path-filename);
          : $Util->uncompress(<named parameters>);
Purpose   : Uncompress a file.
Example   : $Util->uncompress("/usr/people/me/data.txt.gz");
          : $Util->uncompress(-file=>"/usr/people/me/data.txt.gz",
          :                   -tmp=>1,
          :                   -outfile=>"/usr/people/share/data.txt",
          :                   -exe=>"/usr/local/bin/fancyzip");
Returns   : String containing full, absolute path to uncompressed file
Argument  : Named parameters (case-insensitive):
          :   -FILE => String (name of file to be uncompressed, full path).
          :            If the supplied filename ends with '.gz' or '.Z',
          :            that extension will be removed before attempting to uncompress.
          : Optional:
          :   -TMP  => boolean. If true, (or if user is not the owner of the file)
          :            the file is uncompressed to a temp file. If false, file may be
          :            clobbered with the uncompressed version (if using a utility like
          :            gzip, which is the default)
          :   -OUTFILE => String (name of the output uncompressed file, full path).
          :   -EXE  => Name of executable for uncompression utility to use.
          :            Will supersede those in @UNCOMPRESSION_UTILS defined by
          :            this module. If the absolute path to the executable is not provided,
          :            it will be searched in the PATH env variable.
Throws    : Exception if file cannot be uncompressed.
          : If user is not owner of the file, generates a warning and uncompresses to
          : a tmp file. To avoid this warning, use the -o file test operator
          : and call this function with -TMP=>1.
Comments  : Attempts to uncompress using utilities defined in the @UNCOMPRESSION_UTILS
          : defined by this module, in the order defined. The first utility that is
          : found to be executable will be used. Any utility defined in optional -EXE param
          : will be tested for executability first.
          : To minimize security risks, the -EXE parameter value is untainted using
          : the untaint() method of this module (in 'relaxed' mode to permit path separators).

See Also : compress()

file_date

Title    : file_date
Usage    : $Util->file_date( filename [,date_format])
Purpose  : Obtains the date of a given file.
         : Provides flexible formatting via date_format().
Returns  : String = date of the file as: yyyy-mm-dd (e.g., 1997-10-15)
Argument : filename = string, full path name for file
         : date_format = string, desired format for date (see date_format()).
         :               Default = yyyy-mm-dd
Throws   : Exception if no file is provided or does not exist.
Comments : Uses the mtime field as obtained by stat().
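
The mtime lookup described above can be sketched with stat() and POSIX::strftime(). The function name mtime_date() is hypothetical, and only the default yyyy-mm-dd format is shown.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use File::Temp qw(tempfile);

# Hypothetical helper: returns a file's modification date as yyyy-mm-dd.
sub mtime_date {
    my ($file) = @_;
    die "No file provided\n" unless defined $file && length $file;
    die "File does not exist: $file\n" unless -e $file;
    my $mtime = (stat $file)[9];    # element 9 of stat() is mtime
    return strftime('%Y-%m-%d', localtime $mtime);
}

# Demonstrate on a freshly created temporary file.
my ($tfh, $tfile) = tempfile(UNLINK => 1);
close $tfh;
my $stamp = mtime_date($tfile);
print "$tfile was last modified on $stamp\n";
```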

untaint

Title   : untaint
Purpose : To remove nasty shell characters from untrusted data
        : and allow a script to run with the -T switch.
        : Potentially dangerous shell meta characters:  &;`'\"|*?!~<>^()[]{}$\n\r
        : Accept only the first block of contiguous characters:
        :  Default allowed chars = "-\w.', ()"
        :  If $relax is true  = "-\w.', ()\/=%:^<>*"
Usage   : $Util->untaint($value, $relax)
Returns : String containing the untainted data.
Argument: $value = string
        : $relax = boolean
Comments:
    This general untaint() function may not be appropriate for every situation.
    To allow only a more restricted subset of special characters
    (for example, when untainting a regular expression), use a custom
    untainting mechanism, which permits more control.

    Note that special trusted vars (like $0) require untainting.
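
The "first block of contiguous allowed characters" rule can be sketched with a regular expression capture, which is also how Perl's taint mode clears the taint flag. The function name untaint_sketch() is hypothetical; the character classes are copied from the lists above, but the module's actual code may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical re-implementation of the rule described above.
sub untaint_sketch {
    my ($value, $relax) = @_;
    return '' unless defined $value && length $value;
    my $allowed = $relax
        ? qr{([-\w.', ()/=%:^<>*]+)}    # relaxed set, permits path separators
        : qr{([-\w.', ()]+)};           # default set
    # A captured group is considered clean under -T.
    my ($clean) = $value =~ $allowed;
    return defined $clean ? $clean : '';
}

print untaint_sketch('data.txt; rm -rf /'), "\n";     # data.txt
print untaint_sketch('/usr/bin/gzip; ls', 1), "\n";   # /usr/bin/gzip
```

Note that the match is not anchored in this sketch, so leading disallowed characters are skipped rather than rejected.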

mean_stdev

Title    : mean_stdev
Usage    : ($mean, $stdev) = $Util->mean_stdev( @data )
Purpose  : Calculates the mean and standard deviation given a list of numbers.
Returns  : 2-element list (mean, stdev)
Argument : list of numbers (ints or floats)
Throws   : n/a
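
The calculation can be sketched as follows. The text above does not say whether the sample (n-1) or population (n) form of the standard deviation is used; this sketch assumes the sample form, and the name mean_and_stdev() is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (mean, sample standard deviation).
# Assumes the n-1 denominator; returns (undef, undef) for an empty list.
sub mean_and_stdev {
    my @data = @_;
    return (undef, undef) unless @data;
    my $n   = @data;
    my $sum = 0;
    $sum += $_ for @data;
    my $mean = $sum / $n;
    return ($mean, 0) if $n == 1;    # a single value has no spread
    my $sum_sq = 0;
    $sum_sq += ($_ - $mean) ** 2 for @data;
    return ($mean, sqrt($sum_sq / ($n - 1)));
}

my ($mean, $stdev) = mean_and_stdev(2, 4, 4, 4, 5, 5, 7, 9);
printf "mean=%.3f stdev=%.3f\n", $mean, $stdev;   # mean=5.000 stdev=2.138
```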

count_files

Title    : count_files
Purpose  : Counts the number of files/directories within a given directory.
         : Also reports the number of text and binary files in the dir
         : as well as names of these files and directories.
Usage    : count_files(\%data)
         :   $data{-DIR} is the directory to be analyzed. Default is ./
         :   $data{-PRINT} = 0|1; if 1, prints results to STDOUT, (default=0).
Argument : Hash reference (empty)
Returns  : n/a;
         : Modifies the hash ref passed in as the sole argument.
         :  $$href{-TOTAL}            scalar
         :  $$href{-NUM_TEXT_FILES}   scalar
         :  $$href{-NUM_BINARY_FILES} scalar
         :  $$href{-NUM_DIRS}         scalar
         :  $$href{-T_FILE_NAMES}     array ref
         :  $$href{-B_FILE_NAMES}     array ref
         :  $$href{-DIRNAMES}         array ref
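
The calling convention (fill a hash, pass a reference, read results back from the same hash) can be sketched with Perl's file test operators. count_entries() is a hypothetical stand-in, not the module's code; it assumes text and binary files are told apart with -T and -B.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-in for count_files(): tallies the entries of
# $href->{-DIR} and stores the results back into the same hash.
sub count_entries {
    my ($href) = @_;
    my $dir = defined $href->{-DIR} ? $href->{-DIR} : './';
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!\n";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    $href->{$_} = 0  for qw(-TOTAL -NUM_TEXT_FILES -NUM_BINARY_FILES -NUM_DIRS);
    $href->{$_} = [] for qw(-T_FILE_NAMES -B_FILE_NAMES -DIRNAMES);

    for my $name (sort @entries) {
        my $path = "$dir/$name";
        $href->{-TOTAL}++;
        if (-d $path) {
            $href->{-NUM_DIRS}++;
            push @{ $href->{-DIRNAMES} }, $name;
        }
        elsif (-T $path) {    # heuristic text test
            $href->{-NUM_TEXT_FILES}++;
            push @{ $href->{-T_FILE_NAMES} }, $name;
        }
        elsif (-B $path) {    # heuristic binary test
            $href->{-NUM_BINARY_FILES}++;
            push @{ $href->{-B_FILE_NAMES} }, $name;
        }
    }
    printf "%d entries in %s\n", $href->{-TOTAL}, $dir if $href->{-PRINT};
    return;
}

# Demonstrate on a temporary directory with one subdirectory and one text file.
my $tmp = tempdir(CLEANUP => 1);
mkdir "$tmp/sub" or die "mkdir failed: $!\n";
open my $out, '>', "$tmp/notes.txt" or die "open failed: $!\n";
print {$out} "plain text\n";
close $out;

my %data = (-DIR => $tmp);
count_entries(\%data);
print "total=$data{-TOTAL} dirs=$data{-NUM_DIRS} text=$data{-NUM_TEXT_FILES}\n";
```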

file_info

Title   : file_info
Purpose : Obtains a variety of data for a given file.
        : Provides an interface to Perl's stat().
Status  : Under development. Not ready. Don't use!

delete

Title   : delete
Purpose :

create_filehandle

Usage     : $object->create_filehandle(<named parameters>);
Purpose   : Create a FileHandle object from a file or STDIN.
          : Mainly used as a helper method by read() and get_newline().
Example   : $data = $object->create_filehandle(-FILE =>'/usr/people/me/data.txt');
Argument  : Named parameters (case-insensitive):
          :  (all optional)
          :    -CLIENT  => object reference for the object submitting
          :                the request. Default = $Util.
          :    -FILE    => string (full path to file) or a reference
          :                to a FileHandle object or typeglob. This is an
          :                optional parameter (if not defined, STDIN is used).
Returns   : Reference to a FileHandle object.
Throws    : Exception if cannot open a supplied file or if supplied with a
          : reference that is not a FileHandle ref.
Comments  : If given a FileHandle reference, this method simply returns it.
          : This method assumes the user wants to read ascii data. So, if
          : the file is binary, it is treated as a compressed (gzipped)
          : file and accessed using gzip -ce. The problem here is that not
          : all binary files are necessarily compressed. Therefore,
          : this method should probably have a -mode parameter to
          : specify ascii or binary.

See Also : get_newline()

get_newline

Usage     : $object->get_newline(<named parameters>);
Purpose   : Determine the character(s) used for newlines in a given file or
          : input stream. Delegates to Bio::Root::Utilities::get_newline()
Example   : $data = $object->get_newline(-CLIENT => $anObj,
          :                                   -FILE =>'usr/people/me/data.txt')
Argument  : Same arguments as for create_filehandle().
Returns   : Reference to a FileHandle object.
Throws    : Propagates any exceptions thrown by Bio::Root::Utilities::get_newline().

See Also : taste_file(), create_filehandle()

taste_file

Usage     : $object->taste_file( <FileHandle> );
          : Mainly a utility method for get_newline().
Purpose   : Sample a filehandle to determine the character(s) used for a newline.
Example   : $char = $Util->taste_file($FH)
Argument  : Reference to a FileHandle object.
Returns   : String containing an octal representation of the newline character string.
          :   Unix = "\012"  ("\n")
          :   Win32 = "\015\012" ("\r\n")
          :   Mac = "\015"  ("\r")
Throws    : Exception if no input is read within $TIMEOUT_SECS seconds.
          : Exception if argument is not FileHandle object reference.
          : Warning if cannot determine newline char(s).
Comments  : Based on code submitted by Vicki Brown (vlb@deltagen.com).

See Also : get_newline()
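
Sampling a filehandle for its newline convention can be sketched as below. detect_newline() is a hypothetical name; the sketch reads a fixed-size sample and omits the $TIMEOUT_SECS handling mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: reads a sample from $fh and returns the newline
# string found in it, or nothing if none can be determined.
sub detect_newline {
    my ($fh) = @_;
    binmode $fh;                 # avoid any CRLF translation layer
    my $buf = '';
    my $n = read($fh, $buf, 4096);
    die "Cannot read from filehandle\n" unless defined $n;
    return "\015\012" if $buf =~ /\015\012/;   # Win32: CR LF
    return "\012"     if $buf =~ /\012/;       # Unix: LF
    return "\015"     if $buf =~ /\015/;       # old Mac: CR
    return;
}

# Demonstrate on an in-memory filehandle holding DOS-style text.
my $dos_text = "line one\015\012line two\015\012";
open my $fh, '<', \$dos_text or die "open failed: $!\n";
my $nl = detect_newline($fh);
close $fh;
printf "newline bytes (octal): %s\n",
    join ' ', map { sprintf '%03o', ord($_) } split //, $nl;
```

The CR LF pattern is tested first because a sample containing "\015\012" also matches the single-character patterns.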

file_flavor

Usage     : $object->file_flavor( <filename> );
Purpose   : Returns the 'flavor' of a given file (unix, dos, mac)
Example   : print "$file has flavor: ", $Util->file_flavor($file);
Argument  : filename = string, full path name for file
Returns   : String describing flavor of file and handy info about line endings.
          : One of these is returned:
          :   unix (\n or 012 or ^J)
          :   dos (\r\n or 015,012 or ^M^J)
          :   mac (\r or 015 or ^M)
          :   unknown
Throws    : Exception if argument is not a file
          : Propagates any exceptions thrown by Bio::Root::Utilities::get_newline().

See Also : get_newline(), taste_file()

mail_authority

Title    : mail_authority
Usage    : $Util->mail_authority( $message )
Purpose  : Syntactic sugar to send email to $Bio::Root::Global::AUTHORITY

See Also : send_mail()

authority

Title    : authority
Usage    : $Util->authority('admin@example.com');
Purpose  : Set/get the email address that should be notified by mail_authority()

See Also : mail_authority()

send_mail

Title    : send_mail
Usage    : $Util->send_mail( named_parameters )
Purpose  : Provides an interface to mail or sendmail, if available
Returns  : n/a
Argument : Named parameters:  (case-insensitive)
         :  -TO   => e-mail address to send to
         :  -SUBJ => subject for message  (optional)
         :  -MSG  => message to be sent   (optional)
         :  -CC   => cc: e-mail address   (optional)
Throws   : Exception if TO: address appears bad or is missing.
         : Exception if mail cannot be sent.
Comments : Based on  TomC's tip at:
         :   http://www.perl.com/CPAN/doc/FMTEYEWTK/safe_shellings
         :
         : Using default 'From:' information.
         :   sendmail options used:
         :      -t: ignore the address given on the command line and
         :          get To:address from the e-mail header.
         :     -oi: prevents sendmail from ending the message when it
         :          finds a line containing only a period.

See Also : mail_authority()
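
How a message might be handed to sendmail with the -t and -oi options can be sketched as follows. compose_message() and send_via_sendmail() are hypothetical names, the address check is a deliberately simple assumption, and parameter names must be uppercase in this sketch. The list form of open() is used so that no shell is involved.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the header and body that sendmail -t reads.
sub compose_message {
    my (%arg) = @_;
    my $to = $arg{-TO};
    die "Missing or bad To: address\n"
        unless defined $to && $to =~ /^[^\s\@]+\@[^\s\@]+$/;
    my $msg = "To: " . $to . "\n";
    $msg .= "Cc: " . $arg{-CC} . "\n"        if defined $arg{-CC};
    $msg .= "Subject: " . $arg{-SUBJ} . "\n" if defined $arg{-SUBJ};
    $msg .= "\n";                             # blank line ends the header
    $msg .= defined $arg{-MSG} ? $arg{-MSG} : '';
    $msg .= "\n" unless $msg =~ /\n\z/;
    return $msg;
}

# Hypothetical helper: pipes the message to a sendmail executable.
# -t takes recipients from the header; -oi stops a lone "." line from
# ending the message. Defined here but not called in this demonstration.
sub send_via_sendmail {
    my ($sendmail, %arg) = @_;
    my $msg = compose_message(%arg);
    open(my $pipe, '|-', $sendmail, '-t', '-oi')
        or die "Cannot run $sendmail: $!\n";
    print {$pipe} $msg;
    close $pipe or die "sendmail failed\n";
    return;
}

my $text = compose_message(
    -TO   => 'admin@example.com',
    -SUBJ => 'Something you should know about...',
    -MSG  => 'Disk usage is high.',
);
print $text;
```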

find_exe

Title     : find_exe
Usage     : $Util->find_exe(name);
Purpose   : Locate an executable (for use in a system() call, e.g.)
Example   : $Util->find_exe("gzip");
Returns   : String containing executable that passes the -x test.
          : Returns undef if an executable of the supplied name cannot be found.
Argument  : Name of executable to be found.
          : Can be a full path. If supplied name is not executable, an executable
          : of that name will be searched in all directories in the currently
          : defined PATH environment variable.
Throws    : No exceptions, but issues a warning if multiple paths are found
          : for a given name. The first one is used.
Comments  : TODO: Confirm functionality on all bioperl-supported platforms.
          : May get tripped up by variation in path separator character used
          : for splitting ENV{PATH}.
See Also   :
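
The PATH search can be sketched with File::Spec, whose path() method splits the PATH variable using the platform's own separator. locate_exe() is a hypothetical name; the added -f test (to skip directories, which also pass -x) is an assumption that goes beyond the description above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: returns the first executable matching $name,
# or undef if none is found.
sub locate_exe {
    my ($name) = @_;
    return undef unless defined $name && length $name;
    return $name if -f $name && -x _;        # already a usable path
    my @found;
    for my $dir (File::Spec->path) {         # directories listed in PATH
        my $candidate = File::Spec->catfile($dir, $name);
        push @found, $candidate if -f $candidate && -x _;
    }
    warn "Multiple paths found for '$name'; using $found[0]\n" if @found > 1;
    return $found[0];                        # undef when nothing was found
}

my $gzip = locate_exe('gzip');
print defined $gzip ? "gzip found at $gzip\n" : "gzip not found\n";
```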

yes_reply

Title   : yes_reply()
Usage   : $Util->yes_reply( [query_string]);
Purpose : To test an STDIN input value for affirmation.
Example : print +( $Util->yes_reply('Are you ok') ? "great!\n" : "sorry.\n" );
        : $Util->yes_reply('Continue') || die;
Returns : Boolean, true (1) if input string begins with 'y' or 'Y'
Argument: query_string = string to be used to prompt user (optional)
        : If not provided, 'Yes or no' will be used.
        : Question mark is automatically appended.
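
The prompt-and-test logic can be sketched as below. ask_yes_no() is a hypothetical name, the "(y/n)" hint in the prompt is an assumption, and the optional filehandle arguments exist only so the sketch can be run without a terminal.

```perl
use strict;
use warnings;

# Hypothetical helper: prompts on $out, reads one line from $in, and
# returns 1 if the reply begins with 'y' or 'Y', otherwise 0.
sub ask_yes_no {
    my ($query, $in, $out) = @_;
    $query = 'Yes or no' unless defined $query;
    $in  ||= \*STDIN;
    $out ||= \*STDOUT;
    print {$out} "$query? (y/n) ";     # question mark appended to the query
    my $reply = <$in>;
    return (defined $reply && $reply =~ /^y/i) ? 1 : 0;
}

# Demonstrate with in-memory filehandles standing in for the keyboard
# and the screen.
my $typed  = "Yes please\n";
my $prompt = '';
open my $in,  '<', \$typed  or die "open failed: $!\n";
open my $out, '>', \$prompt or die "open failed: $!\n";
my $ok = ask_yes_no('Continue', $in, $out);
close $out;
close $in;
print "prompt shown: '$prompt', answer: $ok\n";
```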

request_data

Title   : request_data()
Usage   : $Util->request_data( [value_name]);
Purpose : To request data from a user to be entered via keyboard (STDIN).
Example : $name = $Util->request_data('Name');
        : # User will see: % Enter Name:
Returns : String, (data entered from keyboard, sans terminal newline.)
Argument: value_name = string to be used to prompt user.
        : If not provided, 'data' will be used, (not very helpful).
        : Question mark is automatically appended.

quit_reply

Title   : quit_reply
Usage   :
Purpose :

verify_version

Purpose : Checks the version of Perl used to invoke the script.
        : Aborts program if version is less than the given argument.
Usage   : verify_version('5.000')
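
A version check of this kind can be sketched with Perl's built-in $] variable, which holds the interpreter version as a number. require_perl_version() is a hypothetical name; it uses die(), which a caller could trap, whereas the method above is described as aborting the program.

```perl
use strict;
use warnings;

# Hypothetical helper: dies if the running Perl is older than $min.
sub require_perl_version {
    my ($min) = @_;
    die "No minimum version given\n" unless defined $min;
    if ($] < $min) {
        die "Perl " . $min . " or later is required; this is " . $] . "\n";
    }
    return 1;
}

require_perl_version('5.000');
print "Running Perl ", $], "\n";
```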