NAME
Bio::Gonzales::Seq::IO - fast utility functions for sequence IO
SYNOPSIS
    use Bio::Gonzales::Seq::IO qw( faslurp faspew fahash fasubseq faiterate );
DESCRIPTION
SUBROUTINES
- @seqs = faslurp(@filenames)
- $seqsref = faslurp(@filenames)
-
faslurp reads in all sequences from @filenames and returns them as an array in list context or as an array reference in scalar context. The sequences are stored as FAlite2::Entry objects.
- $iterator = faiterate($filename)
-
Creates an iterator for the FASTA file $filename. The iterator can be used to loop over the sequence file without reading in all of its content at once. Iterator usage:

    while ( my $sequence_object = $iterator->() ) {
        # do something with the sequence object
    }
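The iterator pattern described above can be sketched with a plain closure. This is only an illustration and not the module's implementation: make_fasta_iterator is a hypothetical helper, and the entries are simple hash references with id, desc and seq keys rather than FAlite2::Entry objects.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Gonzales::Seq::IO: returns a closure
# that yields one { id, desc, seq } hash reference per call and undef at EOF.
sub make_fasta_iterator {
    my ($fh) = @_;
    my $pending;    # header line already read while finishing the previous record

    return sub {
        my $header = defined $pending ? $pending : <$fh>;
        undef $pending;

        # skip anything before the next header line (e.g. blank lines)
        $header = <$fh> while defined $header && $header !~ /^>/;
        return unless defined $header;

        chomp $header;
        my ( $id, $desc ) = $header =~ /^>(\S*)\s*(.*)$/;

        my $seq = '';
        while ( my $line = <$fh> ) {
            if ( $line =~ /^>/ ) { $pending = $line; last; }
            $line =~ s/\s+//g;    # drop newlines and spaces inside the sequence
            $seq .= $line;
        }
        return { id => $id, desc => $desc, seq => $seq };
    };
}

# Demo on an in-memory filehandle, so no file on disk is needed.
my $fasta = ">seq1 first\nGGCAAAGGA\nATGATGGTGT\n>seq2\nACGT\n";
open my $fh, '<', \$fasta or die "cannot open in-memory handle: $!";

my $next = make_fasta_iterator($fh);
while ( my $entry = $next->() ) {
    printf "%s\t%d\n", $entry->{id}, length $entry->{seq};
}
close $fh;
```

Only one record is held in memory at a time, which is the point of iterating instead of slurping.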
- $seqs = fasubseq($file, \@ids_with_locations, \%c)
- $seqs = fasubseq($file, \@id_list, \%c)
-
    # ARRAY OF ARRAYS
    @ids_with_locations = ( [ $id, $begin, $end, $strand ], ... );

Config options can be:

    %c = (
        keep_id       => 1,    # keeps the original id of the sequence
        wrap          => 1,    # see further down
        relaxed_range => 1,    # substitute 0 or undef for $begin with '^' and for $end with '$'
    );

There are several possibilities for $begin and $end:

    GGCAAAGGA ATGATGGTGT GCAGGCTTGG CATGGGAGAC
    ^..........^                                 (1,11) OR ('^', 11)
       ^.....................................^   (4,'$')
                          ^..............^       (21,35) { with wrap on: OR (-19,35) OR (-19, -5) }
                          ^..................^   (21,'$') { with wrap on: OR (-19,'$') }
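The coordinate forms in the diagram can be made concrete with a small sketch. It encodes one reading of the rules in this section and is not the module's code: resolve_range is a hypothetical helper, it ignores $strand, and it assumes that with wrap on a negative position n maps to length + n + 1 (so -19 becomes 21 for the 39-residue example), while without wrap negative values are limited to the sequence boundaries.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Gonzales::Seq::IO. Maps ($begin, $end)
# onto 1-based inclusive coordinates for a sequence of length $len.
sub resolve_range {
    my ( $len, $begin, $end, $c ) = @_;
    $c ||= {};

    # relaxed_range: 0 or undef stand for the sequence boundaries
    if ( $c->{relaxed_range} ) {
        $begin = '^' if !defined $begin || $begin eq '0';
        $end   = '$' if !defined $end   || $end eq '0';
    }

    $begin = 1    if $begin eq '^';
    $end   = $len if $end eq '$';

    if ( $c->{wrap} ) {
        # assumption: negative positions count back from the end
        $begin += $len + 1 if $begin < 0;
        $end   += $len + 1 if $end < 0;
    }
    else {
        # default: limit negative values to the sequence boundaries
        $begin = 1    if $begin < 0;
        $end   = $len if $end < 0;
    }
    return ( $begin, $end );
}

my $len = 39;    # length of the example sequence in the diagram
my @cases = (
    [ '^', 11,  {} ],
    [ 4,   '$', {} ],
    [ -19, -5,  { wrap => 1 } ],
    [ -19, '$', { wrap => 1 } ],
    [ -19, -5,  {} ],
);
for my $case (@cases) {
    my ( $begin, $end, $c ) = @$case;
    my ( $from, $to ) = resolve_range( $len, $begin, $end, $c );
    print "($begin, $end) -> $from..$to\n";
}
```

With wrap on, (-19, -5) resolves to 21..35, matching the third marker line in the diagram; without wrap the same pair falls back to the whole sequence.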
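wrap: With wrap on, negative values are counted from the end of the sequence, as in the examples above (-19 corresponds to position 21 of the 39-residue sequence). The default, with wrap off, is to limit all negative values to the sequence boundaries, so a negative begin would be equal to 1 or '^' and a negative end would be equal to '$'.
- $sref = fahash(@filenames)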
- %seqs = fahash(@filenames)
-
Does the same as faslurp, but returns a hash with the sequence ids as keys and the sequence objects as values.
- faspew($file, $seq1, $seq2, ...)
-
"Spews" out the given sequences to a file. Every $seqN argument can be a hash reference with FAlite2::Entry objects as values, an array reference of FAlite2::Entry objects, or a plain FAlite2::Entry object.
- $iterator = faspew_iterate($filename)
- $iterator = faspew_iterate($fh)
-
Creates an iterator that writes the sequences to the given $filename or $fh.

    for my $sequence_object (@sequences) {
        $iterator->($sequence_object);
    }
    # DO NOT FORGET THIS, THIS CALL WILL CLOSE THE FILEHANDLE
    $iterator->();

    # this is equal to:
    $iterator->(@sequences);
    $iterator->();

    # or
    $iterator->(\@sequences);
    $iterator->();

    # DO NOT DO THIS:
    $iterator->();

The filehandle will not be closed if a $fh handle is supplied instead of a $filename.
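The close-on-empty-call behavior can be sketched as a closure that remembers whether it opened the file itself. This is an assumption-laden illustration, not the module's code: make_fasta_writer is a hypothetical helper, entries are plain hash references with id and seq keys, and anything that is a reference is treated as an already opened filehandle.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Gonzales::Seq::IO. Returns a closure:
#   $write->($entry, ...) or $write->(\@entries)  writes entries,
#   $write->()                                    finishes, closing the handle
#                                                 only if this helper opened it.
sub make_fasta_writer {
    my ($target) = @_;

    my ( $fh, $owns_fh );
    if ( ref $target ) {    # caller passed a filehandle and keeps ownership
        $fh = $target;
    }
    else {
        open $fh, '>', $target or die "cannot open $target: $!";
        $owns_fh = 1;
    }

    return sub {
        unless (@_) {
            if ($owns_fh) { close $fh or die "cannot close $target: $!"; }
            return;
        }
        # flatten array references so both calling styles behave the same
        for my $entry ( map { ref($_) eq 'ARRAY' ? @$_ : $_ } @_ ) {
            print {$fh} ">$entry->{id}\n$entry->{seq}\n";
        }
        return;
    };
}

my @entries = (
    { id => 'seq1', seq => 'GGCAAAGGA' },
    { id => 'seq2', seq => 'ACGT' },
);

# Demo with an in-memory filehandle owned by the caller.
my $out = '';
open my $out_fh, '>', \$out or die "cannot open in-memory handle: $!";

my $write = make_fasta_writer($out_fh);
$write->($_) for @entries;    # one entry per call
$write->( \@entries );        # or a whole array reference at once
$write->();                   # finish; $out_fh stays open, the caller owns it
close $out_fh;

print $out;
```

Tracking ownership in the closure keeps the caller's handle usable after the final empty call, which mirrors the rule stated above for $fh arguments.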
ADVANCED
- change the output format
-
    $Bio::Gonzales::Seq::IO::WIDTH = 60;    # sequence width in fasta output
    # but only if set to 'all_pretty' ('all' is default)
    $Bio::Gonzales::Seq::IO::SEQ_FORMAT = 'all_pretty';
SEE ALSO
AUTHOR
jw bargsten, <joachim.bargsten at wur.nl>