NAME

Bioscripts 1.3 - Bioperl scripts

SYNOPSIS

A list of the scripts in the Bioperl package

DESCRIPTION

These scripts have been contributed by the developers and users of Bioperl. They are organized into directories roughly mirroring those in the Bioperl Bio/ directory. There are two directories for these scripts, scripts/ and examples/. The scripts in scripts/ are production-quality scripts that have POD documentation and accept command-line arguments; the scripts in examples/ are useful examples of Bioperl code.

You can install the scripts in the scripts/ directory if you'd like; simply follow the instructions on 'make install'. The installation directory is specified by the INSTALLSCRIPT variable in the Makefile, and the default directory is /usr/bin. Installation will copy the scripts to the specified directory, change the 'PLS' suffix to 'pl', and prepend 'bp_' to the script name if it isn't so named already.
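
The renaming rule can be sketched as follows. This is a minimal illustration in Python, not the code of install_bioperl_scripts.PLS; the function name installed_name is hypothetical, and leaving names without the 'PLS' suffix otherwise unchanged is an assumption.

```python
def installed_name(filename):
    """Return the name a script would get under the rule described above.

    Hypothetical helper: swaps a trailing '.PLS' for '.pl' and prepends
    'bp_' unless the name already starts with it.
    """
    name = filename
    if name.endswith(".PLS"):
        # Replace the 'PLS' suffix with 'pl'.
        name = name[: -len(".PLS")] + ".pl"
    if not name.startswith("bp_"):
        # Prepend 'bp_' only when the script is not already so named.
        name = "bp_" + name
    return name


if __name__ == "__main__":
    for script in ("gccalc.PLS", "bp_fetch.PLS"):
        print(script, "->", installed_name(script))
```

Under this sketch, gccalc.PLS becomes bp_gccalc.pl and bp_fetch.PLS becomes bp_fetch.pl.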

Please contact bioperl-l at bioperl.org if you are interested in contributing your own script.

PRODUCTION SCRIPTS

scripts/install_bioperl_scripts.PLS

This script installs scripts from the scripts/ directory on 'make install'.

scripts/biblio/biblio.PLS

A fully-featured script that uses Bio::Biblio, a module for accessing and querying bibliographic repositories like Medline.

scripts/Bio-DB-GFF/bulk_load_gff.PLS

This script loads a MySQL Bio::DB::GFF database with the features contained in a list of GFF files. It cannot do incremental loads.

scripts/Bio-DB-GFF/bp_genbank2gff.PLS

This script loads a Bio::DB::GFF database with the features contained in either a local Genbank file or an accession that is fetched from Genbank.

scripts/Bio-DB-GFF/fast_load_gff.PLS

This script does a rapid load of a MySQL Bio::DB::GFF database using files as source. It probably works only on Unix, as it relies on pipes.

scripts/Bio-DB-GFF/generate_histogram.PLS

Create a GFF-formatted histogram of the density of the indicated set of feature types.

scripts/Bio-DB-GFF/load_gff.PLS

This script loads a Bio::DB::GFF database with the features contained in a list of GFF files. It will work with all database adaptors supported by Bio::DB::GFF (MySQL, Oracle, Postgres).

scripts/Bio-DB-GFF/pg_bulk_load_gff.PLS

Bulk-load a PostgreSQL Bio::DB::GFF database from GFF files.

scripts/Bio-DB-GFF/process_gadfly.PLS

Transforms Gadfly GFF files into the correct format.

scripts/Bio-DB-GFF/process_ncbi_human.PLS

Transforms NCBI's chromosome annotations into the correct format.

scripts/Bio-DB-GFF/process_sgd.PLS

Transform SGD format annotations into GFF format.

scripts/Bio-DB-GFF/process_wormbase.PLS

Transforms Wormbase's GFF files into the correct format. Requires Ace.pm.

scripts/DB/bioflat_index.pl

Create or update a biological sequence database indexed with the Bio::DB::Flat indexing scheme.

scripts/DB/flanks.PLS

Fetch a sequence and find the sequences flanking a variant or SNP, given its position in the sequence.

scripts/DB/biofetch_genbank_proxy.PLS

A CGI script that queries NCBI's eutils to provide database access according to the BioFetch protocol. Requires Cache::FileCache.

scripts/DB/biogetseq.PLS

Sequence retrieval using the OBDA registry.

scripts/graphics/feature_draw.PLS

Script that accepts files in GFF or tab-delimited format and creates corresponding PNG image files. See Bio::Graphics and Bio::Graphics::FeatureFile for more information.

scripts/graphics/frend.PLS

Create a PNG file on the Web using Bio::Graphics - accepts a file containing sequence and feature coordinates.

scripts/graphics/search_overview.PLS

Create a simple overview graphic of the hits. Color is based on the hit score, much like the NCBI overview graphic in a BLAST report.

scripts/index/bp_fetch.PLS

Fetch sequences from a local indexed database or over the network and reformat them using Bio::Index* and Bio::DB*.

scripts/index/bp_index.PLS

Indexes local databases. Partners with bp_fetch.pl.

scripts/seq/extract_feature_seq.PLS

Extract the sequence for a specified feature type.

scripts/popgen/composite_LD.PLS

An easy way to calculate composite linkage disequilibrium (LD).

scripts/popgen/heterogeneity.PLS

A test for distinguishing between selection and population expansion.

scripts/searchio/filter_search.PLS

Simple script to filter by SearchIO criteria and print.

scripts/seq/seqconvert.PLS

Bioperl sequence format converter.

scripts/seq/split_seq.PLS

Split a sequence in a file into chunks of equal size.
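
As a rough illustration of the chunking idea (a Python sketch, not the code of split_seq.PLS), the hypothetical function below cuts a sequence string into pieces of a given size. Letting the last piece be shorter when the length is not a multiple of the chunk size is an assumption.

```python
def split_into_chunks(sequence, size):
    """Split a sequence string into consecutive chunks of `size` residues.

    Hypothetical helper: the final chunk may be shorter than `size`.
    """
    if size <= 0:
        raise ValueError("chunk size must be a positive integer")
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


if __name__ == "__main__":
    for chunk in split_into_chunks("ACGTACGTAC", 4):
        print(chunk)
```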

scripts/seq/translate_seq.PLS

A simple Bioperl translator.

scripts/seqstats/aacomp.PLS

Prints out the count of amino acids over all protein sequences in the input file.

scripts/seqstats/chaos_plot.PLS

Produce a PNG or JPEG chaos plot given a DNA sequence using GD.pm.

scripts/seqstats/gccalc.PLS

Prints out the GC content for every nucleotide sequence in the input file.
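
For readers unfamiliar with the measure, GC content is the fraction of bases that are G or C. The Python sketch below is illustrative only and is not taken from gccalc.PLS; the function name gc_content is hypothetical, and dividing by the full sequence length (so ambiguity codes such as N count toward the total but not toward G or C) is an assumption.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a nucleotide sequence.

    Hypothetical helper: case-insensitive, returns 0.0 for an empty string.
    """
    if not sequence:
        return 0.0
    upper = sequence.upper()
    gc = upper.count("G") + upper.count("C")
    return gc / len(upper)


if __name__ == "__main__":
    print(gc_content("ATGC"))
```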

scripts/seqstats/oligo_count.PLS

Use this script to determine what primers would be useful for frequent priming of nucleic acid for random labeling.

scripts/taxa/local_taxonomydb_query.PLS

Script that accesses a local Taxonomy database and retrieves species or taxon ids.

scripts/taxa/taxid4species.PLS

Retrieve the NCBI Taxa ID for a given species.

scripts/tree/blast2tree.PLS

Builds a phylogenetic tree based on a sequence search (Fasta, BLAST, HMMER).

scripts/utilities/bp_mrtrans.PLS

Perl implementation of Bill Pearson's mrtrans to project protein alignment back into cDNA coordinates.

scripts/utilities/bp_nrdb.PLS

Make a non-redundant database based on sequence, not id. Requires Digest::MD5.
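
The idea of keying on sequence rather than id can be sketched with an MD5 digest of each sequence. This Python example is an illustration under stated assumptions, not the logic of bp_nrdb.PLS: the function name nonredundant is hypothetical, records are assumed to be (id, sequence) pairs, comparison is assumed to be case-insensitive, and the first id seen for each distinct sequence is the one kept.

```python
import hashlib


def nonredundant(records):
    """Keep one record per distinct sequence, ignoring the ids.

    `records` is an iterable of (id, sequence) pairs (an assumed layout).
    """
    seen = set()
    kept = []
    for seq_id, seq in records:
        # Identical sequences produce identical digests, whatever their ids.
        digest = hashlib.md5(seq.upper().encode("ascii")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append((seq_id, seq))
    return kept


if __name__ == "__main__":
    data = [("a", "MKV"), ("b", "MKV"), ("c", "MKL")]
    print(nonredundant(data))
```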

scripts/utilities/bp_sreformat.PLS

Perl implementation of Sean Eddy's sreformat, a sequence and alignment converter.

scripts/utilities/dbsplit.PLS

Splits one or more sequence files into subfiles with specified numbers of sequences. Works with any sequence format.

scripts/utilities/mask_by_search.PLS

Masks parts of a sequence based on significant matches to that sequence as contained in a SearchIO-compatible report file.

scripts/utilities/mutate.PLS

Randomly mutagenize a single protein or DNA sequence. Specify percentage mutated and number of resulting mutagenized sequences.

scripts/utilities/pairwise_kaks.PLS

Takes DNA sequences as input, aligns them as proteins, projects the alignment back into DNA and estimates the Ka (non-synonymous) and Ks (synonymous) substitutions.

scripts/utilities/remote_blast.PLS

This script executes a remote Blast search using RemoteBlast. See Bio::Tools::Run::RemoteBlast for more information.

scripts/utilities/search2BSML.PLS

Turns SearchIO-compatible reports into a BSML report.

scripts/utilities/search2alnblocks.PLS

Turns SearchIO-compatible reports into alignments in formats supported by AlignIO.

scripts/utilities/search2tribe.PLS

This script will turn a protein SearchIO-compatible report (BLASTP, FASTP, SSEARCH) into a Markov Matrix for TribeMCL clustering.

scripts/utilities/search2gff.PLS

Turn SearchIO-parseable report(s) into a GFF report.

scripts/utilities/seq_length.PLS

Reports the total number of residues and total number of individual sequences in a specified sequence database file.

EXAMPLE SCRIPTS

examples/align/align_on_codons.pl

Aligns nucleotide sequences based on codons in a specified reading frame.

examples/align/aligntutorial.pl

Examples using EMBOSS, pSW, Clustalw, TCoffee, and Blast to align sequences.

examples/align/clustalw.pl

A demonstration of the various uses of Alignment::Clustalw. See Bio::Tools::Run::Alignment::Clustalw for more.

examples/align/simplealign.pl

A script that demonstrates some uses of AlignIO. Please see Bio::AlignIO for more information.

examples/biblio/biblio.pl

A script that shows how to query bibliographic databases, such as Medline, using ids, keywords, and other fields. See Bio::Biblio for details.

examples/biblio/biblio_soap.pl

Connect to and test a SOAP server using a Bio::Biblio object.

examples/biographics/all_glyphs.pl

Creates an image showing all possible glyphs.

examples/biographics/dynamic_glyphs.pl

Creates a complex image of a gene with confirmed and predicted exons, transcripts, and text labels.

examples/biographics/lots_of_glyphs.pl

Creates a complex image of a gene with confirmed and predicted exons, transcripts, and text labels.

examples/bioperl.pl

A Bioperl shell!

examples/cluster/dbsnp.pl

How to parse a dbSNP XML file. See Bio::ClusterIO for details.

examples/contributed/nmrpdb_parse.pl

Extracts individual conformers from an NMR-derived PDB file.

examples/contributed/prosite2perl.pl

Convert Prosite motifs to Perl regular expressions.
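
A conversion of this kind can be sketched in a few string substitutions. The Python example below is an illustration, not the code of prosite2perl.pl. It assumes the usual Prosite pattern notation ('-' between elements, 'x' for any residue, '[...]' for allowed residues, '{...}' for excluded residues, '(n)' or '(n,m)' for repeats, '<' and '>' for the N- and C-terminal anchors, and a trailing '.'), and it does not handle anchors written inside brackets. The function name prosite_to_regex is hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a regular expression string."""
    out = pattern.rstrip(".")          # drop the terminating period
    out = out.replace("-", "")         # element separators carry no meaning
    out = out.replace("x", ".")        # x matches any residue
    out = out.replace("{", "[^").replace("}", "]")   # excluded residues
    out = out.replace("(", "{").replace(")", "}")    # repeat counts
    out = out.replace("<", "^").replace(">", "$")    # terminal anchors
    return out


if __name__ == "__main__":
    regex = prosite_to_regex("[AC]-x-V-x(4)-{ED}.")
    print(regex)
    print(bool(re.search(regex, "MACVGGGGK")))
```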

examples/contributed/rebase2list.pl

Script to convert rebase file to format compatible with Bio::Tools::RestrictionEnzyme.

examples/contributed/expression_analysis*

A set of scripts that accept microarray data as input and perform statistical analyses, including t test, U test, Mann-Whitney, and Pearson correlation coefficient.

examples/db/dbfetch

Creates a Web page to query a local SRS server and fetch sequences.

examples/db/est_tissue_query.pl

Fetch EST sequences from local files or Genbank filtered by tissue using Bio::DB* or Bio::Index*.

examples/db/gb2features.pl

Shows how to extract all the features from a Genbank file. See Bio::Seq for more information on features.

examples/db/getGenBank.pl

Retrieve Genbank entries over the Web using DB::GenBank. See Bio::DB::GenBank for more information.

examples/db/get_seqs.pl

Fetches and formats sequences from GenBank, EMBL, or SwissProt over the network using Bio::DB*.

examples/db/gff/*

Scripts that reformat sequence to GFF and load GFF format files into an indexed database - see Bio::DB::GFF for more information.

examples/db/rfetch.pl

A script that uses Bio::DB::Registry to retrieve sequences from EMBL, reformat them, and print them. See Bio::DB::Registry.

examples/db/use_registry.pl

Script that shows how to use Bio::DB::Registry, part of Bioperl's integration with OBDA, the Open Bio Database Access registry scheme. See Bio::DB::Registry for more information.

examples/exceptions/*

Scripts that demonstrate how to throw and catch Error.pm objects.

examples/generate_random_seq.pl

Writes random RNA, DNA, or protein sequence of given length.

examples/biographics/render_sequence.pl

This script fetches a sequence from a remote database, extracts its features (CDS, gene, tRNA), and creates a graphic representation of the sequence in PNG or GIF format. See Bio::DB::BioFetch and Bio::Graphics.

examples/liveseq/change_gene.pl

A script showing how to use LiveSeq::Mutator and LiveSeq::Mutation. Please see Bio::LiveSeq::Mutator and Bio::LiveSeq::Mutation for more information.

examples/longorf.pl

A script that finds the longest ORF in one or more nucleotide sequences.
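
One simple way to search for a longest ORF is sketched below in Python. It is not the algorithm of longorf.pl. The assumptions are that an ORF runs from an ATG to the first in-frame stop codon (TAA, TAG, or TGA), that only the three forward reading frames are scanned, and that the first ORF found wins a tie. The function name longest_orf is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(sequence):
    """Return the longest ATG-to-stop ORF in the three forward frames.

    Returns an empty string when no complete ORF is found.
    """
    seq = sequence.upper()
    best = ""
    for frame in range(3):
        start = None
        # Walk the frame one codon at a time.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGAAATAGGG"))
```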

examples/make_mrna_protein.pl

Translate a cDNA or ORF to protein using Bio::Seq's translate() method.

examples/make_primers.pl

Design PCR primers given a sequence and the positions of the start and stop codons in the sequence's ORF.

examples/popgen/parse_calc_stats.pl

Shows how to read data from a Bio::PopGen::IO object.

examples/rev_and_trans.pl

Examples using Bio::Seq.pm for reversing and translating sequences. See Bio::Seq for more information.

examples/revcom_dir.pl

Return reverse complement sequences of all sequences in the current directory and save them in the same directory.
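
For reference, the reverse complement of a DNA sequence is obtained by complementing each base and reversing the result. The Python sketch below shows the operation on a single string; it is not the code of revcom_dir.pl, the function name reverse_complement is hypothetical, and passing characters other than A, C, G, T through unchanged is an assumption.

```python
# Translation table for the four unambiguous bases, in both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence string."""
    return sequence.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```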

examples/sirna/rnai_finder.cgi

CGI script for RNAi reagent design. See Bio::Tools::SiRNA for more information.

examples/root_object/*

Scripts that demonstrate uses of the Bio::Root modules.

examples/searchio/blast_example.pl

Print out all parsed values from a BLAST report.

examples/searchio/custom_writer.pl

Demonstrates how to extract data from BLAST reports and output as tab-delimited data.

examples/searchio/hitwriter.pl

Demonstrates how to extract data from BLAST reports and output as tab-delimited data.

examples/searchio/hspwriter.pl

Demonstrates how to extract data from BLAST reports and output as tab-delimited data.

examples/searchio/htmlwriter.pl

Demonstrates how to extract data from BLAST reports and output as HTML.

examples/searchio/psiblast_features.pl

Illustrates how to grab a set of SeqFeatures from a PSI-BLAST report.

examples/searchio/psiblast_iterations.pl

Demonstrates the use of a SearchIO parser for processing the iterations within a PSI-BLAST report.

examples/searchio/rawwriter.pl

Shows how to print out raw BLAST alignment data for each HSP.

examples/searchio/resultwriter.pl

Demonstrates how to extract data from BLAST reports and output as tab-delimited data.

examples/searchio/waba2gff.pl

Convert raw WABA output to one type of GFF.

examples/seq/*

Example code for working with multiple sequence files, including formatting, printing, and filtering based on length or description or ID.

examples/seq/extract_cds.pl

Extract the CDS features from a Genbank file.

examples/seqstats/aacomp.pl

Calculate amino acid composition of a protein using Tools::CodonTable and Tools::IUPAC. See Bio::Tools::IUPAC and Bio::Tools::CodonTable for more information.

examples/structure/struct_example*

Scripts that show how to examine details of the 3D structure of a protein by parsing a PDB file. See Bio::Structure::IO for more information.

examples/subsequence.cgi

CGI script to fetch a sequence from Genbank and extract a subsequence using DB::GenBank. See Bio::DB::GenBank.

examples/tk/gsequence.pl

Create a Protein Sequence Control Panel GUI with Gtk.

examples/tk/hitdisplay.pl

Create a GUI for displaying Blast results using Tk::HitDisplay. Please see Bio::Tk::HitDisplay for more information.

examples/tools/gb_to_gff.pl

Extracts top-level sequence features from Genbank-formatted sequence files using Tools::GFF. See Bio::Tools::GFF.

examples/tools/gff2ps.pl

Takes an input file in GFF format and draws its genes and features as Postscript using Tools::GFF. See Bio::Tools::GFF.

examples/tools/parse_codeml.pl

Script that parses output from codeml, one of the PAML programs. See Bio::Tools::Phylo::PAML.

examples/tools/psw.pl

Example code for using the XS extensions for comparing proteins using Smith-Waterman.

examples/tools/restriction.pl

Example code for using the RestrictionEnzyme module. See Bio::Tools::RestrictionEnzyme for more information (note that Bio::Tools::RestrictionEnzyme has been superseded by Bio::Restriction::*).

examples/tools/run_genscan.pl

Run GENSCAN on multiple sequences and create output sequence files using Tools::Genscan. Please see Bio::Tools::Genscan for more information.

examples/tools/seq_pattern.pl

A script that shows how to use sequences as regular expressions using Tools::SeqPattern. Please see Bio::Tools::SeqPattern for more information.

examples/tools/standaloneblast.pl

The many uses of StandAloneBlast, including BLAST and PSIBLAST.

examples/tools/state-machine.pl

A demonstration of how to create a state machine using StateMachine::AbstractStateMachine. Please see Bio::Tools::StateMachine::AbstractStateMachine for more information.

examples/tools/test-genscan.pl

Script for testing and demonstrating Bio::Tools::Genscan.

examples/tree/paup2phylip.pl

Convert a PAUP tree block to Phylip format.