NAME
Bio::Tools::Blast::Run::Webblast.pm - Bioperl module for running Blast analyses using an HTTP interface.
SYNOPSIS
# Run a Blast
use Bio::Tools::Blast::Run::Webblast qw(&blast_remote);
@out_file_names = &blast_remote($object, %named_parameters);
blast_remote is the only exported method of this module. It returns a list of local file names containing the Blast reports. $object
is a reference to a Bio::Root::Object.pm object or subclass. See blast_remote() for a description of available parameters.
# Obtain a list of available databases
use Bio::Tools::Blast::Run::Webblast qw(@Blast_dbp_remote
@Blast_dbn_remote);
@amino_dbs = @Blast_dbp_remote;
@nucleotide_dbs = @Blast_dbn_remote;
INSTALLATION
This module is included with the central Bioperl distribution:
http://bio.perl.org/Core/Latest
ftp://bio.perl.org/pub/DIST
Follow the installation instructions included in the README file.
DESCRIPTION
Bio::Tools::Blast::Run::Webblast.pm contains methods and data necessary for running Blast sequence analyses using a remote server and saving the results locally.
Bio::Tools::Blast::run() provides an interface to Webblast.pm, so ideally you should not use Webblast.pm directly but go through Blast.pm.
FEATURES:
Supports NCBI Blast1, Blast2, and PSI-Blast servers as well as WashU-Blast servers.
Can operate through a proxy server enabling operation from behind a firewall.
Can save reports with and without HTML formatting.
Uses LWP.
In principle, this module can be customized to use other servers, remote or local, that provide a Blast interface in the style of the NCBI or WashU servers. This has not been well tested, however.
DEPENDENCIES
Bio::Tools::Blast::Run::Webblast.pm is used by Bio::Tools::Blast.pm. Development of this module is thus linked to that of Blast.pm.
SEE ALSO
Bio::Tools::Blast.pm - Blast object.
Bio::Tools::Blast::Run::LocalBlast.pm - Utility module for running Blasts locally.
Bio::Tools::Blast::HTML.pm - Blast HTML-formatting utility class.
Bio::Seq.pm - Biosequence object
Bio::Root::Object.pm - Bioperl base object class.
http://bio.perl.org/Projects/modules.html - Online module documentation
http://bio.perl.org/Projects/Blast/ - Bioperl Blast Project
http://bio.perl.org/ - Bioperl Project Homepage
FEEDBACK
Mailing Lists
User feedback is an integral part of the evolution of this and other Bioperl modules. Please send your comments and suggestions to one of the Bioperl mailing lists. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bio.perl.org/MailList.html - About the mailing lists
Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via email or the web:
bioperl-bugs@bioperl.org
http://bio.perl.org/bioperl-bugs/
AUTHOR
- Steve A. Chervitz <sac@genome.stanford.edu>: Webblast.pm modularized version of webblast script.
- Alex Dong Li <ali@genet.sickkids.on.ca>: original webblast script.
- Ross N. Crowhurst <RCrowhurst@hort.cri.nz>: modified Webblast.pm to use LWP to give proxy server support.
VERSION
Bio::Tools::Blast::Run::Webblast.pm, 1.24
COPYRIGHT
Copyright (c) 1998, 1999 Steve A. Chervitz, Alex Dong Li, Ross N. Crowhurst. All Rights Reserved. This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
APPENDIX
Methods beginning with a leading underscore are considered private and are intended for internal use by this module. They are not considered part of the public interface and are described here for documentation purposes only.
blast_remote
Usage : @files = blast_remote( $blast_object, %namedParameters);
: This method is exported.
Purpose : Run a remote Blast analysis on one or more sequences.
: NOTE: The name of this method is potentially misleading
: since a local server could be specified.
: Probably should be called blast_http.
Returns : Array containing a list of filenames of the Blast reports.
Argument : First argument should be a Bio::Tools::Blast.pm object reference.
: This object is primarily used for error reporting
: Remaining arguments are named parameters:
: (PARAMETER TAGS CAN BE UPPER OR LOWER CASE).
:
: -ALIGN => integer, number of alignments (B, 100)
: -ALIGN_VIEW => alignment view option (see below)
: -CUTOFF => Blast score cutoff (60-110 or 'default')
: -DATABASE => name of database (see below)
: -DESCR => integer, number of on-line descriptions (V, 100)
: -EXPECT => expect value cutoff
: -EXPECT_PSI => expect value for inclusion in PSI-BLAST iteration 1
: -FILTER => sequence complexity filter ('default' or 'none')
: -GAP => 'on' or 'off'
: -GAP_CREATE => gap creation penalty (G, 5)
: -GAP_EXTEND => gap extension penalty (E, 2)
: -GEN_CODE => integer for special genetic code (see below) blastx only
: -GRAPH => 'on' or 'off' (graphical overview not yet supported)
: -HISTOGRAM => 'on' or 'off' or 'both'
: -HTML => 'on' or 'off' or 'both'
: -INPUT_TYPE => 'Sequence in FASTA format' or 'Accession or GI'
: -MATRIX => substitution scoring matrix (blast1 only for NCBI server)
: -NCBI_GI => 'on' or 'off'
: -MATCH => match reward (r, 1) (blastn only)
: -MAX_LEN => max query sequence length to blast
: -MIN_LEN => min query sequence length to blast
: -MISMATCH => mismatch penalty (q, -3) (blastn only)
: -ORGANISM => organism name to limit Blast2 search.
: -ORGANISM_CUSTOM => custom organism or taxon name.
: -OUT_DIR => output directory to store blast result files
: -PROG => name of blast program (blastp, blastx, etc.)
: -SEQS => ref to an array of Bio::Seq.pm objects.
: -SERVER => blast server to use (default is NCBI Blast2)
: -STRAND => Default = 'Both' (not used by NCBI servers)
: -VERSION => blast version (1, 2, PSI, WashU)
: -WORD => word size (W, 11 for blastn, 3 for all others)
#rnc: LIST_ORG
# valid list_org entries for blast2 are a string of 50 chars max, default is empty string
:
Throws : Exception if:
: - Cannot obtain parameters by calling _rearrange() on the
: first argument, which should be a Bio::Tools::Blast.pm object ref.
: - No sequences are provided.
: - Sequence type is incompatible with Blast program type.
: - Database name is not one of the valid names.
: - Supplied e-mail address looks invalid.
Comments :
-------------------------------------------------------------
Available programs: blastn, blastx, dbest, blastp, tblastn, tblastx
Program versions: 1, 2, PSI, WashU (or WU)
-------------------------------------------------------------
Available databases:
nr, month, swissprot, dbest, dbsts,
est_mouse, est_human, est_others, pdb, vector, kabat,
mito, alu, epd, yeast, ecoli, gss, htgs.
These are exported by this module in the @Blast_dbp_remote
and @Blast_dbn_remote arrays.
-------------------------------------------------------------
Available Genetic Codes are (blastx only):
(1) Standard (2) Vertebrate Mitochondrial
(3) Yeast Mitochondrial (4) Mold Mitochondrial; ...
(5) Invertebrate Mitochondrial (6) Ciliate Nuclear; ...
(9) Echinoderm Mitochondrial (10) Euplotid Nuclear
(11) Bacterial (12) Alternative Yeast Nuclear
(13) Ascidian Mitochondrial (14) Flatworm Mitochondrial
(15) Blepharisma Macronuclear
-------------------------------------------------------------
Available values for organism (Blast2):
(None) (DEFAULT; note that the parentheses are required.)
Arabidopsis thaliana
Bacillus subtilis
Bos taurus
Caenorhabditis elegans
Danio rerio
Dictyostelium discoideum
Drosophila melanogaster
Escherichia coli
Gallus gallus
Homo sapiens
Human immunodeficiency virus type 1
Mus musculus
Oryctolagus cuniculus
Oryza sativa
Ovis aries
Plasmodium falciparum
Rattus norvegicus
Saccharomyces cerevisiae
Schizosaccharomyces pombe
Simian immunodeficiency virus
Xenopus laevis
Zea mays
-------------------------------------------------------------
Available values for align_view (Blast2):
0 Pairwise (DEFAULT)
1 master-slave with identities
2 master-slave without identities
3 flat master-slave with identities
4 flat master-slave without identities
-------------------------------------------------------------
Available substitution scoring matrices (NCBI):
BLAST2 matrices: BLOSUM80, BLOSUM62, BLOSUM45, PAM30, PAM70
BLAST1 matrices: BLOSUM62, PAM40, PAM120, PAM250, IDENTITY.
Other members of the BLOSUM and PAM family of matrices
may be available as well.
These are exported by this module in the @Blast_matrix_remote array.
Note that certain combinations of matrices and gap creation/extension
penalties are disallowed (E.g., PAM250 will work with 12/2 but not 11/1).
--------------------------------------------------------------
Limited values for gap creation and extension are supported for
blastp, blastx, tblastn. Some supported and suggested values are:
Creation Extension
10 1
10 2
11 1
8 2
9 2
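
The table above can be turned into a simple lookup, shown here as a minimal sketch. The helper name is_suggested_gap_pair is hypothetical and is not part of Webblast.pm. Because the table lists only "some supported and suggested values", a pair that is missing from the lookup is not necessarily rejected by a server.

```perl
use strict;
use warnings;

# Gap creation/extension pairs copied from the table above.
# This lookup and the helper below are illustrative only; they are
# not part of Webblast.pm.
my %SUGGESTED_GAP_PAIRS =
    map { $_ => 1 } ('10/1', '10/2', '11/1', '8/2', '9/2');

# Returns 1 if the creation/extension pair appears in the table, else 0.
sub is_suggested_gap_pair {
    my ($create, $extend) = @_;
    return exists $SUGGESTED_GAP_PAIRS{"$create/$extend"} ? 1 : 0;
}

for my $pair ([11, 1], [12, 5]) {
    my ($create, $extend) = @$pair;
    printf "%2d/%d %s\n", $create, $extend,
        is_suggested_gap_pair($create, $extend) ? 'listed' : 'not listed';
}
```

A check like this could run before a request is sent, to warn early about a -GAP_CREATE/-GAP_EXTEND combination that is not among the suggested values.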
-------------------------------------------------------------
Available sequence complexity filters:
SEG, SEG+XNU, XNU, dust, none.
See Also : _set_options(), _adjust_options(), _validate_options(), _blast(), Bio::Tools::Blast.pm
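
Two of the rules stated above, that parameter tags may be upper or lower case and that the sequence type must be compatible with the Blast program, can be illustrated with a small stand-alone sketch. The helpers normalize_tags and program_accepts are hypothetical names introduced here. The module itself obtains its parameters through _rearrange(), whose internals are not shown in this document. The program-to-query-type table reflects standard Blast semantics and covers only the five standard programs.

```perl
use strict;
use warnings;

# Query sequence type expected by each standard Blast program.
my %QUERY_TYPE = (
    blastn  => 'nucleotide',
    blastx  => 'nucleotide',
    tblastx => 'nucleotide',
    blastp  => 'amino',
    tblastn => 'amino',
);

# Hypothetical helper: upper-case every tag so that '-prog' and
# '-PROG' end up under the same key.
sub normalize_tags {
    my %params = @_;
    my %normalized;
    $normalized{ uc $_ } = $params{$_} for keys %params;
    return %normalized;
}

# Hypothetical helper: returns 1 if the program expects a query of
# the given type ('amino' or 'nucleotide'), else 0.
sub program_accepts {
    my ($prog, $seq_type) = @_;
    my $expected = $QUERY_TYPE{ lc $prog };
    return ( defined($expected) && $expected eq $seq_type ) ? 1 : 0;
}

my %params = normalize_tags(
    -prog     => 'blastp',
    -database => 'swissprot',
    -version  => 2,
    -expect   => 10,
    -html     => 'off',
);

printf "%s accepts an amino query: %s\n", $params{'-PROG'},
    program_accepts( $params{'-PROG'}, 'amino' ) ? 'yes' : 'no';
```

A parameter list prepared this way has the same shape as the %namedParameters argument shown in the Usage line for blast_remote().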
APPENDIX 2: Parameter listings
Parameters for Blast (NCBI ungapped; no longer supported by NCBI, so use of ungapped Blast should be discontinued), Blast2 (NCBI), and PSI-Blast2 (NCBI). WashU-Blast2 has yet to be added, as does PHI-Blast2 (NCBI).
These lists of parameters for posting to blast servers were obtained directly from the respective WWW forms for each server.
Basic ungapped BLAST Search Server Parameters
PROGRAM [default value]:blastn blastp tblastn tblastx blastx
DATALIB [default value]:nr month swissprot dbest dbsts pdb vector kabat mito alu epd yeast gss htgs ecoli
INPUT_TYPE [default value]:Sequence in FASTA format Accession or GI
SEQUENCE
EXPECT [default value]:default 0.0001 0.01 1 10 100 1000
CUTOFF [default value]:default 60 70 80 90 100 110
MATRIX [default value]:default BLOSUM62 PAM40 PAM120 PAM250 IDENTITY
STRAND [default value]:both top bottom
FILTER [default value]:default none dust SEG SEG+XNU XNU
HISTOGRAM [default value]:'' HISTOGRAM
NCBI_GI [default value]:"" NCBI_GI
DESCRIPTIONS [default value]:default 0 10 50 100 250 500
ALIGNMENTS [default value]:default 0 10 50 100 250 500
ADVANCED [default value]:""
EMAIL [default value]:'' IS_SET
PATH [default value]:""
HTML [default value]:'' HTML
Basic Blast 2
PROGRAM [default value]:blastn blastp blastx tblastn tblastx
DATALIB [default value]:nr month swissprot dbest dbsts est_mouse est_human est_others pdb pat vector kabat mito alu epd yeast ecoli gss htgs
UNGAPPED_ALIGNMENT [default value]:'' is_set
FSET [default value]:is_set ''
OVERVIEW [default value]:is_set ''
INPUT_TYPE [default value]:Sequence in FASTA format Accession or GI
SEQUENCE
EMAIL [default value]:'' IS_SET
PATH [default value]:""
HTML [default value]:'' IS_SET
BLAST2 ADVANCED
PROGRAM [default value]:blastn blastp blastx tblastn tblastx
DATALIB [default value]:nr month swissprot dbest dbsts est_mouse est_human est_others pdb pat vector kabat mito alu epd yeast ecoli gss htgs
UNGAPPED_ALIGNMENT [default value]:"" is_set
INPUT_TYPE [default value]:Sequence in FASTA format Accession or GI
SEQUENCE
GI_LIST [default value]:(None) Arabidopsis thaliana Bacillus subtilis Bos taurus Caenorhabditis elegans Danio rerio Dictyostelium discoideum Drosophila melanogaster Escherichia coli Gallus gallus Homo sapiens Human immunodeficiency virus type 1 Mus musculus Oryctolagus cuniculus Oryza sativa Ovis aries Plasmodium falciparum Rattus norvegicus Saccharomyces cerevisiae Schizosaccharomyces pombe Simian immunodeficiency virus Xenopus laevis Zea mays
LIST_ORG
EXPECT [default value]:10 0.0001 0.01 1 10 100 1000
FILTER [default value]:default none
NCBI_GI [default value]:'' is_set
OVERVIEW [default value]:is_set ''
DESCRIPTIONS [default value]:500 0 10 50 100 250 500
ALIGNMENTS [default value]:500 0 10 50 100 250 500
ALIGNMENT_VIEW [default value]:0 #Pairwise 1 #master-slave with identities 2 #master-slave without identities 3 #flat master-slave with identities 4 #flat master-slave without identities
GENETIC_CODE [default value]:Standard (1) Vertebrate Mitochondrial (2) Yeast Mitochondrial (3) Mold Mitochondrial; ... (4) Invertebrate Mitochondrial (5) Ciliate Nuclear; ... (6) Echinoderm Mitochondrial (9) Euplotid Nuclear (10) Bacterial (11) Alternative Yeast Nuclear (12) Ascidian Mitochondrial (13) Flatworm Mitochondrial (14) Blepharisma Macronuclear (15)
MAT_PARAM [default value]:BLOSUM62 11 1 PAM30 9 1 PAM70 10 1 BLOSUM80 10 1 BLOSUM62 11 1 BLOSUM45 14 2 PAM30 7 2 PAM30 6 2 PAM30 5 2 PAM30 10 1 PAM30 9 1 #recommended PAM30 8 1 PAM70 8 2 PAM70 7 2 PAM70 6 2 PAM70 11 1 PAM70 10 1 #recommended PAM70 9 1 BLOSUM80 8 2 BLOSUM80 7 2 BLOSUM80 6 2 BLOSUM80 11 1 BLOSUM80 10 1 #recommended BLOSUM80 9 1 BLOSUM62 9 2 BLOSUM62 8 2 BLOSUM62 7 2 BLOSUM62 12 1 BLOSUM62 11 1 #recommended BLOSUM62 10 1 BLOSUM45 13 3 BLOSUM45 12 3 BLOSUM45 11 3 BLOSUM45 10 3 BLOSUM45 15 2 BLOSUM45 14 2 #recommended BLOSUM45 13 2 BLOSUM45 12 2 BLOSUM45 19 1 BLOSUM45 18 1 BLOSUM45 17 1 BLOSUM45 16 1
OTHER_ADVANCED [default value]:""
EMAIL [default value]:'' IS_SET
PATH [default value]:""
HTML [default value]:'' IS_SET
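
The MAT_PARAM values above form a flat list that can be hard to read. The following sketch splits such a list into one record per entry. It assumes that each entry is a matrix name followed by the gap creation and gap extension penalties, and that a "#recommended" marker applies to the entry just before it. The helper name parse_mat_param is hypothetical and not part of Webblast.pm.

```perl
use strict;
use warnings;

# A short excerpt of the MAT_PARAM value list above.
my $mat_param =
    'BLOSUM62 11 1 #recommended BLOSUM62 10 1 PAM30 9 1 #recommended PAM30 8 1';

# Hypothetical helper: split the flat list into one hash per entry.
# Assumes "MATRIX create extend" triples, each optionally followed
# by a "#recommended" marker.
sub parse_mat_param {
    my ($text) = @_;
    my @entries;
    while ( $text =~ /([A-Z]+\d+)\s+(\d+)\s+(\d+)(\s+#recommended)?/g ) {
        push @entries, {
            matrix      => $1,
            gap_create  => $2,
            gap_extend  => $3,
            recommended => defined $4 ? 1 : 0,
        };
    }
    return @entries;
}

for my $entry ( parse_mat_param($mat_param) ) {
    printf "%-9s %2d/%d%s\n",
        $entry->{matrix}, $entry->{gap_create}, $entry->{gap_extend},
        $entry->{recommended} ? ' (recommended)' : '';
}
```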
PSI BLAST2
PROGRAM [default value]:blastp
DATALIB [default value]:nr month swissprot pdb kabat alu yeast ecoli
GAPPED_ALIGNMENT [default value]:is_set ''
INPUT_TYPE [default value]:Sequence in FASTA format Accession or GI
SEQUENCE
EXPECT [default value]:10 0.0001 0.01 1 10 100 1000
FILTER [default value]:default none
NCBI_GI [default value]:'' is_set
GRAPHIC_OVERVIEW [default value]:is_set ''
DESCRIPTIONS [default value]:500 0 10 50 100 250 500
ALIGNMENTS [default value]:500 0 10 50 100 250 500
E_THRESH [default value]:0.001 #max value is 10
MAT_PARAM [default value]:BLOSUM62 11 1 PAM30 9 1 PAM70 10 1 BLOSUM80 10 1 BLOSUM62 11 1 BLOSUM45 14 2 PAM30 7 2 PAM30 6 2 PAM30 5 2 PAM30 10 1 PAM30 9 1 PAM30 8 1 PAM70 8 2 PAM70 7 2 PAM70 6 2 PAM70 11 1 PAM70 10 1 PAM70 9 1 BLOSUM80 8 2 BLOSUM80 7 2 BLOSUM80 6 2 BLOSUM80 11 1 BLOSUM80 10 1 BLOSUM80 9 1 BLOSUM62 9 2 BLOSUM62 8 2 BLOSUM62 7 2 BLOSUM62 12 1 BLOSUM62 11 1 BLOSUM62 10 1 BLOSUM45 13 3 BLOSUM45 12 3 BLOSUM45 11 3 BLOSUM45 10 3 BLOSUM45 15 2 BLOSUM45 14 2 BLOSUM45 13 2 BLOSUM45 12 2 BLOSUM45 19 1 BLOSUM45 18 1 BLOSUM45 17 1 BLOSUM45 16 1
OTHER_ADVANCED [default value]:""
WashU BLAST2
WU-Blast2 Database Searches http://www2.ebi.ac.uk/blast2/
email ""
title Sequence
srchtype interactive email
database swall swissprot swnew trembl tremblnew pdb gpcrdb prints HLAprot embl emnew est igvec emvec imgt HLAnuc
program WU-blastp WU-blastx WU-blastn
matrix blosum62 blosum30 blosum35 blosum40 blosum45 blosum50 blosum65 blosum70 blosum75 blosum80 blosum85 blosum90 blosum100 GONNET pam10 pam20 pam30 pam40 pam50 pam60 pam70 pam80 pam90 pam100 pam110 pam120 pam130 pam140 pam150 pam160 pam170 pam180 pam190 pam200 pam210 pam220 pam230 pam240 pam250 pam260 pam270 pam280 pam290 pam300 pam310 pam320 pam330 pam340 pam350 pam360 pam370 pam380 pam390 pam400 pam410 pam420 pam430 pam440 pam450 pam460 pam470 pam480 pam490 pam500
strand default top bottom
exp default 1.0 10 100 1000
filter none seg xnu seg+xnu dust
echofilter no yes
histogram no yes
stats sump poisson
sort pvalue count highscore totalscore
scores default 5 10 20 50 100 150 200 250
numal default 5 10 20 50 100 150 200 250
sequence