NAME

Seeder - Motif discovery in DNA sequences

VERSION

Version 0.01

DESCRIPTION

This module is a base class and is not meant to be instantiated directly.

Seeder is a framework for DNA motif discovery. It is designed for efficient and reliable prediction of regulatory motifs in eukaryotic promoters. To generate DNA motifs, you need a positive set of DNA sequences in FASTA format (believed to contain a similar cis-regulatory element) and a background set of DNA sequences in FASTA format.

To discover motifs in DNA sequences, follow these steps:

(1) Generation of the index. This structure speeds up the calculation of Hamming distances (HD). Restrict the seed width to a value between 6 and 8.

    use Seeder::Index;
    my $index = Seeder::Index->new(
        seed_width => "6",
        out_file   => "6.index",
    );
    $index->get_index;

(2) Generation of the background distributions.

    use Seeder::Background;
    my $background = Seeder::Background->new(
        seed_width    => "6",
        strand        => "revcom",
        hd_index_file => "6.index",
        seq_file      => "seqs.fasta",
        out_file      => "seqs.bkgd",
    );
    $background->get_background;

(3) Motif discovery.

    use Seeder::Finder;
    my $finder = Seeder::Finder->new(
        seed_width    => "6",
        strand        => "revcom",
        motif_width   => "12",
        n_motif       => "1",
        hd_index_file => "6.index",
        seq_file      => "prom.fasta",
        bkgd_file     => "seqs.bkgd",
        out_file      => "prom.finder",
    );
    $finder->find_motifs;

EXPORT

None by default

FUNCTIONS

read_hd_index

Title   : read_hd_index
Usage   : $self->read_hd_index;
Function: read the index file
Returns : reference to a 2D array of positive integers
Args    : none

generate_oligo

Title   : generate_oligo
Usage   : $self->generate_oligo;
Function: generate all combinations of nucleotides for a given word length,
          represented by numbers (0=>A, 1=>C, 2=>G, 3=>T)
Returns : reference to a 2D array of oligos
Args    : none
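
The enumeration described above can be sketched as follows. This is an illustrative stand-alone version, not the module's actual code: the name all_oligos_sketch and the explicit width argument are assumptions, and only the numeric coding (0=>A, 1=>C, 2=>G, 3=>T) comes from the documentation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Seeder): enumerate all 4**$width words,
# each represented as an array of digits (0=>A, 1=>C, 2=>G, 3=>T).
sub all_oligos_sketch {
    my ($width) = @_;
    my @oligos = ( [] );    # start from the single empty word
    for ( 1 .. $width ) {
        my @next;
        for my $prefix (@oligos) {
            # Extend every existing prefix by each of the four nucleotides
            push @next, [ @{$prefix}, $_ ] for 0 .. 3;
        }
        @oligos = @next;
    }
    return \@oligos;
}

my @nt     = qw(A C G T);
my $oligos = all_oligos_sketch(2);
print scalar @{$oligos}, "\n";                                 # prints 16
print join( q{}, map { $nt[$_] } @{ $oligos->[6] } ), "\n";    # prints CG
```

With this ordering, the position of a word in the list equals its value when read as a base-4 number.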

lookup_coord

Title   : lookup_coord
Usage   : $self->lookup_coord;
Function: generate indices for lookup (HD calculation)
Returns : reference to 2D arrays of begin/end lookup indices
Args    : none

bc_factor

Title   : bc_factor
Usage   : my $bc_factor_ref = $self->bc_factor;
Function: generate the number of neighbors as a function of Hamming distance
          and seed width
Returns : reference to a 2D array of number of neighbors
Args    : none
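
As a sketch of the underlying count: over a four-letter alphabet, a word of width w has C(w, d) * 3**d neighbors at exactly Hamming distance d (choose the d mismatching positions, then one of 3 alternative nucleotides at each). Whether bc_factor stores exactly this quantity is an assumption; the helper names below are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: binomial coefficient C(n, k), computed multiplicatively
sub choose_sketch {
    my ( $n, $k ) = @_;
    my $c = 1;
    $c = $c * ( $n - $k + $_ ) / $_ for 1 .. $k;
    return $c;
}

# Number of words at exactly Hamming distance $d from a word of width $w
# (assumed formula: C(w, d) * 3**d for a 4-letter alphabet)
sub n_neighbors_sketch {
    my ( $w, $d ) = @_;
    return choose_sketch( $w, $d ) * 3**$d;
}

printf "%d\t%d\n", $_, n_neighbors_sketch( 6, $_ ) for 0 .. 6;
```

For a seed width of 6, the counts over all distances sum to 4**6 = 4096, the total number of words of that width.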

_factorial

Title   : _factorial
Usage   : my $r = _factorial($n);
Function: calculate the product of all positive integers less than or equal
          to a given number
Returns : positive integer
Args    : non-negative integer
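
A minimal iterative sketch of such a function is shown below. The name factorial_sketch and the input check are assumptions for illustration, not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for _factorial: product of 1 .. $n, with 0! = 1
sub factorial_sketch {
    my ($n) = @_;
    die "n must be a non-negative integer\n" if $n !~ /\A\d+\z/;
    my $r = 1;
    $r *= $_ for 2 .. $n;    # empty range for $n < 2, so the result stays 1
    return $r;
}

print factorial_sketch(6), "\n";    # prints 720
```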

generate_hd_index

Title   : generate_hd_index
Usage   : my $hd_index = $self->generate_hd_index( $oligo_ref );
Function: generate oligo indices for increasing Hamming distances
Returns : array of indices
Args    : reference to oligo

execution_time

Title   : execution_time
Usage   : my $time_string = $self->execution_time;
Function: convert a lapse of time, measured in "seconds since epoch", into a
          human-readable format
Returns : lapse of time (string)
Args    : none
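
A sketch of the conversion is given below. The hh:mm:ss output format and the use of Perl's $^T (the time at which the script started) as the reference point are assumptions; the module may measure and format the lapse differently.

```perl
use strict;
use warnings;

# Hypothetical formatter: turn a number of elapsed seconds into hh:mm:ss
sub format_elapsed_sketch {
    my ($seconds) = @_;
    my $h = int( $seconds / 3600 );
    my $m = int( ( $seconds % 3600 ) / 60 );
    my $s = $seconds % 60;
    return sprintf '%02d:%02d:%02d', $h, $m, $s;
}

print format_elapsed_sketch(3725), "\n";    # prints 01:02:05

# Elapsed time since the script started ($^T holds the start time)
print format_elapsed_sketch( time - $^T ), "\n";
```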

encode

Title   : encode
Usage   : my $representation = encode($value, $base, $depth);
Function: convert value (base 10) to representation in specified base
Returns : array of integers from 0 to 3
Args    : base 10 number to be converted, base to which the number is
          converted, width of the representation

decode

Title   : decode
Usage   : my $value = decode($representation, $base);
Function: convert representation to value (base 10)
Returns : integer
Args    : representation, base of the representation
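
The round trip between a base-10 value and its fixed-width representation can be sketched as follows. The function names, the choice of returning an array reference, and the most-significant-digit-first order are assumptions; the module's encode and decode may differ in these details.

```perl
use strict;
use warnings;

# Hypothetical stand-in for encode: convert a non-negative base-10 integer
# into $depth digits in $base, most significant digit first
sub encode_sketch {
    my ( $value, $base, $depth ) = @_;
    my @digits;
    for ( 1 .. $depth ) {
        unshift @digits, $value % $base;
        $value = int( $value / $base );
    }
    return \@digits;
}

# Hypothetical stand-in for decode: convert the digits back to base 10
sub decode_sketch {
    my ( $digits_ref, $base ) = @_;
    my $value = 0;
    $value = $value * $base + $_ for @{$digits_ref};
    return $value;
}

my $digits = encode_sketch( 27, 4, 6 );
print join( q{}, @{$digits} ), "\n";        # prints 000123
print decode_sketch( $digits, 4 ), "\n";    # prints 27
```

With the coding 0=>A, 1=>C, 2=>G, 3=>T, the representation 000123 corresponds to the 6-mer AAACGT.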

AUTHOR

François Fauteux, <ffauteux at cpan.org>

BUGS

Please report any bugs or feature requests to bug-Seeder at rt.cpan.org, or at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Seeder. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Seeder

ACKNOWLEDGEMENTS

This algorithm was developed by François Fauteux, Mathieu Blanchette and Martina Strömvik. We thank the Perl Monks <http://www.perlmonks.org/> for their support.

COPYRIGHT & LICENSE

Copyright 2008 François Fauteux, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
