NAME
Bio::Util::DNA - Basic DNA utilities
SYNOPSIS
use Bio::Util::DNA qw(:all);
my $seq_ref = randomDNA(100);
my $clean_ref = cleanDNA($seq_ref);
my $rev_ref = reverse_complement($seq_ref);
DESCRIPTION
Provides a set of functions and predefined variables which are handy when working with DNA.
VARIABLES
BASIC VARIABLES
Basic nucleotide variables that could be useful. Each variable name is built from a prefix and a suffix:
Prefixes
Suffixes
- ${prefix}s: String of the different nucleotides
- @{prefix}s: Array of the different nucleotides
- ${prefix}_match: Precompiled regular expression which matches nucleotide characters
- ${prefix}_fail: Precompiled regular expression which matches non-nucleotide characters
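To make the suffix scheme concrete, here is a minimal sketch using a made-up prefix, `example`. The names and the four-letter alphabet are assumptions for illustration only; they are not taken from the module, whose real prefixes are not listed here.

```perl
use strict;
use warnings;

# Hypothetical prefix "example"; the module's real variable names may differ.
my $example_nucleotides = 'ACGT';                           # ${prefix}s
my @example_nucleotides = split //, $example_nucleotides;   # @{prefix}s
my $example_match = qr/[$example_nucleotides]/i;            # ${prefix}_match
my $example_fail  = qr/[^$example_nucleotides]/i;           # ${prefix}_fail

for my $seq (qw(ACGTTGCA ACGXTGCA)) {
    if ( $seq =~ $example_fail ) {
        print "$seq contains a non-nucleotide character\n";
    }
    else {
        print "$seq contains only nucleotides\n";
    }
}
```

A "fail" pattern of this kind is convenient for validation, because a single match is enough to reject a sequence.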
%degenerate2nucleotides
Hash of degenerate nucleotide definitions. Each entry contains a reference to an array of DNA nucleotides that each degenerate nucleotide stands for.
%nucleotides2degenerate
Reverse of %degenerate2nucleotides. Keys are alphabetically-sorted DNA nucleotides and values are the degenerate nucleotide that can represent those nucleotides.
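The two hashes can be pictured with a small self-contained sketch built from the standard IUPAC degenerate codes. The hash names below are hypothetical, and the choice of a joined string such as 'AGT' as the reverse-lookup key is an assumption; the module's own key format may differ.

```perl
use strict;
use warnings;

# Standard IUPAC degenerate codes, each mapped to the bases it stands for.
my %iupac_to_bases = (
    R => [qw(A G)],   Y => [qw(C T)],   S => [qw(C G)],
    W => [qw(A T)],   K => [qw(G T)],   M => [qw(A C)],
    B => [qw(C G T)], D => [qw(A G T)], H => [qw(A C T)],
    V => [qw(A C G)], N => [qw(A C G T)],
);

# Reverse lookup: alphabetically sorted bases, joined into one key (assumed format).
my %bases_to_iupac;
for my $code ( keys %iupac_to_bases ) {
    my $key = join '', sort @{ $iupac_to_bases{$code} };
    $bases_to_iupac{$key} = $code;
}

print join( ',', @{ $iupac_to_bases{D} } ), "\n";    # A,G,T
print $bases_to_iupac{AGT}, "\n";                     # D
```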
%degenerate_hierarchy
Contains the hierarchy of degenerate nucleotides. N contains all the other degenerates, and each of the four degenerates that stand for three different bases contains three of the two-base degenerates.
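One way to see where such a hierarchy comes from is to derive it from the base sets: a degenerate code contains another when the second code's bases form a proper subset of the first's. The sketch below does this for the standard IUPAC codes. The names `%iupac_to_bases`, `%hierarchy_sketch`, and `is_proper_subset` are hypothetical, and the module's actual data structure may be laid out differently.

```perl
use strict;
use warnings;

my %iupac_to_bases = (
    R => [qw(A G)],   Y => [qw(C T)],   S => [qw(C G)],
    W => [qw(A T)],   K => [qw(G T)],   M => [qw(A C)],
    B => [qw(C G T)], D => [qw(A G T)], H => [qw(A C T)],
    V => [qw(A C G)], N => [qw(A C G T)],
);

# True when every base in $small is also in $big and $small has fewer bases.
sub is_proper_subset {
    my ( $small, $big ) = @_;
    return 0 if @$small >= @$big;
    my %in_big = map { $_ => 1 } @$big;
    my $missing = grep { !$in_big{$_} } @$small;
    return $missing == 0;
}

# For each code, collect the other degenerate codes it contains.
my %hierarchy_sketch;
for my $parent ( sort keys %iupac_to_bases ) {
    $hierarchy_sketch{$parent} = [
        grep { is_proper_subset( $iupac_to_bases{$_}, $iupac_to_bases{$parent} ) }
        sort keys %iupac_to_bases
    ];
}

print "D: @{ $hierarchy_sketch{D} }\n";            # D: K R W
print scalar( @{ $hierarchy_sketch{N} } ), "\n";   # 10
```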
FUNCTIONS
cleanDNA
my $clean_ref = cleanDNA($seq_ref);
Cleans the sequence for use: strips out comments (lines starting with '>') and whitespace, converts uracil to thymine, and capitalizes all characters.
Examples:
my $clean_ref = cleanDNA($seq_ref);
my $seq_ref = cleanDNA(\'actg');
my $seq_ref = cleanDNA(\'act tag cta');
my $seq_ref = cleanDNA(\'>some mRNA
acugauauagau
uauagacgaucc');
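The cleaning steps described above can be sketched without the module as follows. `clean_dna_sketch` is a hypothetical stand-in, not the module's implementation, and it assumes the result is returned as a new reference with the input left unchanged.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not the module's own code.
sub clean_dna_sketch {
    my ($seq_ref) = @_;
    my $clean = $$seq_ref;      # work on a copy (assumed behaviour)
    $clean =~ s/^>.*$//mg;      # drop lines starting with '>'
    $clean =~ s/\s+//g;         # remove all whitespace, including newlines
    $clean = uc $clean;         # capitalize
    $clean =~ tr/U/T/;          # uracil to thymine
    return \$clean;
}

my $raw = ">some mRNA\nacugauauagau\nuauagacgaucc";
my $clean_ref = clean_dna_sketch( \$raw );
print "$$clean_ref\n";          # ACTGATATAGATTATAGACGATCC
```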
randomDNA
my $seq_ref = randomDNA($length);
Generate random DNA for testing this module or your own scripts. Default length is 100 nucleotides.
Examples:
my $seq_ref = randomDNA();
my $seq_ref = randomDNA(600);
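For readers who want to see the idea in isolation, a random sequence generator with the same default length might look like the sketch below. `random_dna_sketch` is a hypothetical name, and drawing each base uniformly from A, C, G, and T is an assumption; the module may generate its sequences differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in; assumes uniform sampling over the four DNA bases.
sub random_dna_sketch {
    my ($length) = @_;
    $length = 100 unless defined $length;    # default documented above
    my @bases = qw(A C G T);
    my $seq = join '', map { $bases[ int( rand( scalar @bases ) ) ] } 1 .. $length;
    return \$seq;
}

my $seq_ref = random_dna_sketch(12);
print length($$seq_ref), "\n";    # 12
```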
reverse_complement
rev_comp
my $reverse_ref = reverse_complement($seq_ref);
Finds the reverse complement of the sequence and handles degenerate nucleotides.
Example:
$reverse_ref = reverse_complement(\'act');
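A reverse complement that also handles degenerate nucleotides can be sketched with Perl's `tr///` operator and the standard IUPAC complement pairs (R/Y, K/M, B/V, D/H; S, W, and N are their own complements). `reverse_complement_sketch` is a hypothetical stand-in, and preserving the input's letter case is an assumption about behaviour, not something the module is known to do.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not the module's own code.
sub reverse_complement_sketch {
    my ($seq_ref) = @_;
    my $rev = scalar reverse $$seq_ref;    # reverse the string
    # Complement each base; S, W and N map to themselves and are left alone.
    $rev =~ tr/ACGTRYKMBDHVacgtrykmbdhv/TGCAYRMKVHDBtgcayrmkvhdb/;
    return \$rev;
}

print ${ reverse_complement_sketch( \'act' ) },    "\n";    # agt
print ${ reverse_complement_sketch( \'ACSTAD' ) }, "\n";    # HTASGT
```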
unrollDNA
my $seq_arrayref = unrollDNA( $seq_ref );
Unroll a DNA string containing degenerate nucleotides. The first entry of the arrayref will be the actual sequence.
Example:
my $seq_arrayref = unrollDNA( \'ACSTAD' );
# $seq_arrayref now refers to:
# [
#   'ACSTAD', 'ACCTAD', 'ACGTAD',
#   'ACSTAR', 'ACCTAR', 'ACGTAR',
#   'ACSTAW', 'ACCTAW', 'ACGTAW',
#   'ACSTAK', 'ACCTAK', 'ACGTAK',
#   'ACSTAA', 'ACCTAA', 'ACGTAA',
#   'ACSTAG', 'ACCTAG', 'ACGTAG',
#   'ACSTAT', 'ACCTAT', 'ACGTAT'
# ]
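The expansion in this example can be reproduced with a small Cartesian-product sketch. `unroll_dna_sketch` and `%alternatives` are hypothetical; the table covers only the two degenerate codes in 'ACSTAD', with alternatives ordered as they appear in the example, and is not the module's own data.

```perl
use strict;
use warnings;

# Alternatives per degenerate code, ordered as in the example output.
# Only S and D are listed; a full table would cover every IUPAC code.
my %alternatives = (
    S => [qw(S C G)],
    D => [qw(D R W K A G T)],
);

sub unroll_dna_sketch {
    my ($seq_ref) = @_;
    my @results = ('');
    # Walk right to left so the leftmost degenerate position varies fastest.
    for my $char ( reverse split //, $$seq_ref ) {
        my @options = @{ $alternatives{$char} || [$char] };
        @results = map {
            my $tail = $_;
            map { $_ . $tail } @options
        } @results;
    }
    return \@results;
}

my $unrolled = unroll_dna_sketch( \'ACSTAD' );
print scalar(@$unrolled), "\n";         # 21
print "@{$unrolled}[0 .. 3]\n";         # ACSTAD ACCTAD ACGTAD ACSTAR
```

Because each position keeps the degenerate code itself as its first option, the first entry of the result is always the input sequence.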
AUTHOR
Kevin Galinsky, <first initial last name plus cpan at gmail dot com>
COPYRIGHT AND LICENSE
Copyright (c) 2010-2011, Broad Institute.
Copyright (c) 2008-2009, J. Craig Venter Institute.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.