NAME
Bio::BPWrapper::SeqManipulations - Functions for bioseq
SYNOPSIS
use Bio::BPWrapper::SeqManipulations;
# Set options hash ...
initialize(\%opts);
write_out(\%opts);
SUBROUTINES
initialize()
Sets up most of the actions to be performed on an alignment.
Call this right after setting up an options hash.
Sets package variables: $in, $in_format, $filename, $out_format, and $out.
write_out()
Writes out the sequence file.
Call this after calling initialize(\%opts) and processing those options.
reading_frame_ops()
Translates in 1, 3, or 6 frames based on the value of $opts set via initialize(\%opts). Wraps Bio::Seq->translate(), Bio::SeqUtils->translate_3frames(), and Bio::SeqUtils->translate_6frames().
restrict_digest()
Predicts the fragments produced by digestion with the restriction enzyme specified in $opts{restrict} set via initialize(\%opts).
An input file with a single sequence is expected. Wraps Bio::Restriction::Analysis->cut().
anonymize()
Replaces sequence IDs with serial IDs n characters long, as specified in $opts{'anonymize'} set via initialize(\%opts). For example, if $opts{'anonymize'} is 5, the first ID will be S0001. The length of the serial ID includes the leading 'S'.
A sed script file is produced with a .sed suffix that may be used with sed's '-f' argument. If the filename is '-', the sed file is named STDOUT.sed instead. A message containing the sed filename is written to STDERR.
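The ID numbering described above can be sketched in core Perl. This is a minimal illustration, not the module's code: serial_id() is a hypothetical helper, the input IDs are made up, and both the "width counts the leading 'S'" rule and the direction of the sed substitutions (anonymous ID back to original ID) are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: build the i-th serial ID, $n characters long in total,
# counting the leading 'S' (an assumption based on the S0001 example).
sub serial_id {
    my ($i, $n) = @_;
    return sprintf("S%0*d", $n - 1, $i);
}

my @orig_ids = ("seqA", "seqB");    # made-up IDs for illustration
my $i = 0;
for my $id (@orig_ids) {
    my $anon = serial_id(++$i, 5);
    # One sed substitution per sequence; the direction is an assumption.
    print "s/$anon/$id/\n";
}
```

Lines of this form, saved to a file, could then be applied with sed's '-f' argument.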
shred_seq()
Breaks the input into individual sequences, writing a separate FASTA file for each sequence.
count_codons()
Count codons for coding sequences (e.g., a genome file consisting of CDS sequences). Wraps Bio::Tools::SeqStats->count_codons().
print_gb_gene_feats()
Prints gene sequences in FASTA format from a GenBank file of a bacterial genome. Won't work for a eukaryotic GenBank file.
count_leading_gaps()
Count and print the number of leading gaps in each sequence.
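As a rough illustration of the counting step, the following core-Perl sketch measures the run of gap characters at the start of a sequence string. leading_gaps() is a hypothetical helper, and treating '-' as the only gap character is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: count gap characters ('-', by assumption) at the
# start of a sequence string.
sub leading_gaps {
    my ($seq) = @_;
    my ($run) = $seq =~ /^(-*)/;    # always matches, possibly the empty string
    return length $run;
}

# Made-up sequences for illustration.
for my $seq ("--AC-GT", "ACGT") {
    print leading_gaps($seq), "\t$seq\n";
}
```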
hydroB()
Return the mean Kyte-Doolittle hydropathicity for protein sequences. Wraps Bio::Tools::SeqStats->hydropathicity().
linearize()
Linearizes FASTA, printing one sequence per line.
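The idea can be sketched in core Perl by joining the wrapped lines of each record. linearize_fasta() is a hypothetical helper, and the tab-separated "ID, then sequence" output layout is an assumption; the module's actual output format may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: turn FASTA text into one "id<TAB>sequence" string
# per record (the output layout is an assumption).
sub linearize_fasta {
    my ($text) = @_;
    my @records;    # each element is [id, sequence]
    for my $line (split /\n/, $text) {
        if ($line =~ /^>(\S+)/) {
            push @records, [$1, ''];        # start a new record
        }
        elsif (@records) {
            $line =~ s/\s+//g;              # drop stray whitespace
            $records[-1][1] .= $line;       # append wrapped sequence line
        }
    }
    return map { "$_->[0]\t$_->[1]" } @records;
}

# Made-up input for illustration.
print "$_\n" for linearize_fasta(">s1 demo\nACGT\nAC\n>s2\nGG\n");
```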
reloop_at()
Re-circularizes a bacterial genome so that it starts at the position given in $opts{"reloop"} set via initialize(\%opts).
For example, for the sequence "ABCDE", bioseq -R'2' .. would generate "BCDEA".
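The rotation in this example can be sketched in core Perl. rotate_at() is a hypothetical helper operating on a plain string, not the module's implementation; the position is taken to be 1-based, as the example suggests.

```perl
use strict;
use warnings;

# Hypothetical helper: rotate a circular sequence so that it starts at the
# 1-based position $start.
sub rotate_at {
    my ($seq, $start) = @_;
    my $len = length $seq;
    die "position out of range\n" if $start < 1 || $start > $len;
    # Tail from $start onward, followed by the head that preceded it.
    return substr($seq, $start - 1) . substr($seq, 0, $start - 1);
}

print rotate_at("ABCDE", 2), "\n";    # prints BCDEA
```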
remove_stop()
Remove stop codons.