NAME

Bio::Align::Subset - A BioPerl module to generate new alignments as subsets of larger alignments

VERSION

Version 1.25

SYNOPSIS

use strict;
use warnings;
use Data::Dumper;

use Bio::Align::Subset;

# The alignment in a file
my $filename = "alignmentfile.fas";
# The format
my $format = "fasta";

# The subset of codons
my $subset = [1,12,25,34,65,100,153,156,157,158,159,160,200,201,202,285];

# Create the object
my $obj = Bio::Align::Subset->new(
                                  file => $filename,
                                  format => $format
                                );

# View the result
# This function returns a Bio::SimpleAlign object
print Dumper($obj->build_subset($subset));

DESCRIPTION

Given an alignment and an array of codon positions, the method Bio::Align::Subset->build_subset returns a new alignment that contains only the codons at those positions in the original alignment.
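
To make the idea of a codon subset concrete, here is a minimal pure-Perl sketch that works on plain strings and does not call the module. The helper codons_at is hypothetical, and it assumes that positions are 1-based codon indices over in-frame aligned nucleotide sequences; the module's own indexing convention should be checked before relying on this.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Align::Subset): pick the codons at
# the given positions from one aligned nucleotide string.
# Assumption: positions are 1-based codon indices.
sub codons_at {
    my ($seq, $positions) = @_;
    return join '', map { substr($seq, ($_ - 1) * 3, 3) } @$positions;
}

# A toy alignment: two in-frame sequences of five codons each.
my %alignment = (
    seq1 => 'ATGGCCAAATTTGGG',
    seq2 => 'ATGGCTAAGTTCGGA',
);
my @positions = (1, 3, 5);

# Apply the same codon positions to every sequence in the alignment.
my %subset = map { $_ => codons_at($alignment{$_}, \@positions) } keys %alignment;

print "$_\t$subset{$_}\n" for sort keys %subset;
# seq1	ATGAAAGGG
# seq2	ATGAAGGGA
```

Every sequence is cut at the same codon positions, so the result is still an alignment of equal-length sequences.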

CONSTRUCTOR

Bio::Align::Subset->new()

$Obj = Bio::Align::Subset->new(file => 'filename', format => 'format')

The new class method constructs a new Bio::Align::Subset object. The returned object can be used to retrieve, print and generate subsets from alignment objects. new accepts the following parameters:

file

A file path to be opened for reading or writing. The usual Perl conventions apply:

'file'       # open file for reading
'>file'      # open file for writing
'>>file'     # open file for appending
'+<file'     # open file read/write
'command |'  # open a pipe from the command
'| command'  # open a pipe to the command

format

Specifies the format of the file. Supported formats include fasta, genbank, embl, swiss (SwissProt), Entrez Gene and tracefile formats such as abi (ABI) and scf. There are many more; for a complete listing, see the SeqIO HOWTO (http://bioperl.open-bio.org/wiki/HOWTO:SeqIO).

If no format is specified and a filename is given, the module attempts to deduce the format from the filename suffix. If there is no suffix that BioPerl understands, it attempts to guess the format from the file content. If this is unsuccessful, SeqIO throws a fatal error.

The format name is case-insensitive: 'FASTA', 'Fasta' and 'fasta' are all valid.

Currently, the tracefile formats (except for SCF) require installation of the external Staden "io_lib" package, as well as the Bio::SeqIO::staden::read package available from the bioperl-ext repository.

OBJECT METHODS

build_subset($index_list)

my $subset = $obj->build_subset([1,12,25,34,65,100,153,156,157,158,159]);

Builds a new alignment from the codons listed in $index_list, an array reference of codon positions. It returns a Bio::SimpleAlign object.
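
Before calling build_subset it can be useful to check that the requested positions fit the alignment. The following is only a sketch: check_codon_positions is a hypothetical helper, not part of this module, and it assumes that positions are 1-based codon indices and that the length passed in is measured in nucleotides (for example, a value obtained from get_seq_length, if that is what it reports).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Align::Subset): split a list of codon
# positions into those that fit an alignment of $length nucleotides and those
# that do not. Assumption: positions are 1-based codon indices.
sub check_codon_positions {
    my ($positions, $length) = @_;
    my $n_codons = int($length / 3);    # number of complete codons
    my (@ok, @bad);
    for my $p (@$positions) {
        if ($p =~ /^\d+$/ && $p >= 1 && $p <= $n_codons) {
            push @ok, $p;
        }
        else {
            push @bad, $p;
        }
    }
    return (\@ok, \@bad);
}

# 300 nucleotides hold 100 codons, so 0 and 400 are out of range.
my ($ok, $bad) = check_codon_positions([1, 12, 25, 0, 400], 300);
print "usable:   @$ok\n";     # usable:   1 12 25
print "rejected: @$bad\n";    # rejected: 0 400
```

The usable list could then be passed on as $obj->build_subset($ok).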

ACCESSOR METHODS

get_count

Title   : get_count
Usage   : $instance_no = $obj->get_count
Function: 
Returns : Number of instances of this class
Args    :

get_file

Title   : get_file
Usage   : $file_path = $obj->get_file
Function:
Returns : The file name of the alignment
Args    :

get_format

Title   : get_format
Usage   : $format = $obj->get_format
Function:
Returns : The alignment format (fasta, phylip, etc.)
Args    :

get_identifiers

Title   : get_identifiers
Usage   : $identifiers = $obj->get_identifiers
Function:
Returns : An array reference with all the identifiers in an alignment
Args    :

get_seq_length

Title   : get_seq_length
Usage   : $long = $obj->get_seq_length
Function:
Returns : The length of the sequences in an alignment
Args    :

get_sequences

Title   : get_sequences
Usage   : $sequences = $obj->get_sequences
Function:
Returns : An array reference with all the sequences in an alignment
Args    :
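
The two array references returned by get_identifiers and get_sequences can be combined into a lookup table. The sketch below uses literal stand-in data instead of a real object, and it rests on two assumptions that the module documentation does not confirm: that both arrays are in the same order, and that the sequences are plain strings.

```perl
use strict;
use warnings;

# Stand-ins for the values that $obj->get_identifiers and
# $obj->get_sequences would return (sample data, not real output).
my $identifiers = ['seq1', 'seq2'];
my $sequences   = ['ATGGCCAAA', 'ATGGCTAAG'];

# Assumption: both arrays are parallel, so a hash slice pairs them up.
my %seq_for;
@seq_for{@$identifiers} = @$sequences;

# Print each identifier with its sequence and length.
for my $id (@$identifiers) {
    printf "%-6s %s (%d nt)\n", $id, $seq_for{$id}, length $seq_for{$id};
}
```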

MUTATOR METHODS

set_file

Title   : set_file
Usage   : $obj->set_file('filename')
Function: Set the file path for an alignment
Returns : 
Args    : String

set_format

Title   : set_format
Usage   : $obj->set_format('fasta')
Function: Set the file format for an alignment
Returns :
Args    : String

set_identifiers

Title   : set_identifiers
Usage   : $obj->set_identifiers(\@array_ids)
Function: Change the identifiers for all the sequences in the alignment
Returns :
Args    : Array reference

set_sequences

Title   : set_sequences
Usage   : $obj->set_sequences(\@array_seqs)
Function: Change the sequences in the alignment
Returns :
Args    : Array reference

AUTHOR - Hector Valverde

Hector Valverde, <hvalverde@uma.es>

CONTRIBUTORS

Juan Carlos Aledo, <caledo@uma.es>

BUGS

Please report any bugs or feature requests to bug-bio-align-subset at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Bio-Align-Subset. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Bio::Align::Subset

LICENSE AND COPYRIGHT

Copyright 2012 Hector Valverde and Juan Carlos Aledo.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.