NAME
Bio::Grep::Backend::Vmatch - Vmatch back-end
SYNOPSIS
use Bio::Grep;
my $sbe = Bio::Grep->new('Vmatch');
# generate a Vmatch suffix array. you have to do this only once.
$sbe->generate_database({
    file          => 'ATH1.cdna',
    description   => 'AGI Transcripts',
    datapath      => 'data',
    prefix_length => 3,
});
# search for the reverse complement and allow 4 mismatches
# parse the description (max. 100 chars) directly out of the
# Vmatch output instead of calling vsubseqselect for every
# search result
$sbe->search({
    query              => 'UGAACAGAAAGCUCAUGAGCC',
    reverse_complement => 1,
    mismatches         => 4,
    showdesc           => 100,
    database           => 'ATH1.cdna',
});
# output the search results with nice alignments
while ( my $res = $sbe->next_res ) {
    print $res->sequence->id . "\n";
    print $res->mark_subject_uppercase() . "\n";
    print $res->alignment_string() . "\n\n";
    # sequence_id now contains the gene id (e.g. At1g1234),
    # not the Vmatch internal id
    # To retrieve the complete sequences, one has to
    # call get_sequences for every gene id
    my $seq_io = $sbe->get_sequences([$res->sequence_id]);
    my $sequence = $seq_io->next_seq;
}
# for retrieving up- and downstream regions,
# Vmatch internal sequence ids are required
# (no showdesc possible)
$sbe->search({
    query              => 'AGAGCCCT',
    reverse_complement => 1,
    mismatches         => 1,
    upstream           => 30,
    downstream         => 30,
});
my @internal_ids;
while ( my $res = $sbe->next_res ) {
    # vsubseqselect is called now for every result ...
    push @internal_ids, $res->sequence_id;
}
# ... but one can retrieve all complete sequences with
# just one call of vseqselect
my $seq_io = $sbe->get_sequences(\@internal_ids);
DESCRIPTION
Bio::Grep::Backend::Vmatch searches for a query in a Vmatch
suffix array.
NOTE 1: When "maxhits" is defined, this back-end returns the maxhits best hits (those with smallest E-values).
METHODS
See Bio::Grep::Backend::BackendI for inherited methods.
Bio::Grep::Backend::Vmatch->new()
This method constructs a Vmatch back-end object and should not be used directly. Rather, a back-end should be constructed by the main class Bio::Grep:
my $sbe = Bio::Grep->new('Vmatch');
$sbe->available_sort_modes()
Returns all available sort modes as a hash. Keys are sort modes, values are short descriptions.
$sbe->sort('ga');
Available sort modes in Vmatch:
ga  : 'ascending order of dG'
gd  : 'descending order of dG'
la  : 'ascending order of length'
ld  : 'descending order of length'
ia  : 'ascending order of first position'
id  : 'descending order of first position'
ja  : 'ascending order of second position'
jd  : 'descending order of second position'
ea  : 'ascending order of Evalue'
ed  : 'descending order of Evalue'
sa  : 'ascending order of score'
sd  : 'descending order of score'
ida : 'ascending order of identity'
idd : 'descending order of identity'
Note that 'ga' and 'gd' require that search results have dG set. Bio::Grep::RNA ships with filters for free energy calculation. Also note that these two sort options require that we load all results in memory.
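The following is a minimal sketch of how a caller might validate a sort mode before passing it to sort(). It does not call Bio::Grep: the %sort_modes hash is a hand-copied stand-in for what available_sort_modes() is described as returning (a flat hash is assumed here), and check_sort_mode() is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Stand-in for the documented sort modes. In real code this data would
# come from $sbe->available_sort_modes() (return shape is an assumption).
my %sort_modes = (
    ga  => 'ascending order of dG',
    gd  => 'descending order of dG',
    la  => 'ascending order of length',
    ld  => 'descending order of length',
    ia  => 'ascending order of first position',
    id  => 'descending order of first position',
    ja  => 'ascending order of second position',
    jd  => 'descending order of second position',
    ea  => 'ascending order of Evalue',
    ed  => 'descending order of Evalue',
    sa  => 'ascending order of score',
    sd  => 'descending order of score',
    ida => 'ascending order of identity',
    idd => 'descending order of identity',
);

# Hypothetical helper: dies on an unknown mode, otherwise returns 1 if
# the mode needs dG set on every result ('ga' and 'gd'), else 0.
sub check_sort_mode {
    my ($mode) = @_;
    die "Unknown sort mode '$mode'\n" if !exists $sort_modes{$mode};
    return $mode =~ m{\A g[ad] \z}xms ? 1 : 0;
}

# List every mode and flag the two that depend on dG.
for my $mode ( sort keys %sort_modes ) {
    printf "%-3s  %s%s\n", $mode, $sort_modes{$mode},
        check_sort_mode($mode) ? ' (requires dG)' : q{};
}
```

Checking the mode up front lets a script fail early, before a search is run, when 'ga' or 'gd' is requested without a free energy filter in place.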
$sbe->get_sequences()
Takes an array reference as its argument. If the first array element is an integer, this method assumes that the specified sequence ids are Vmatch internal ids. Otherwise it uses the first array element as a query.
# get sequences 0, 2 and 4 out of the suffix array
$sbe->get_sequences([0,2,4]);
# get sequences that start with At1g1
$sbe->get_sequences(['At1g1', 'ignored']);
The internal ids are stored in $res->sequence_id. If you have specified showdesc, then sequence_id will contain the gene id (e.g. At1g1234), NOT the Vmatch internal id.
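The rule that only the first array element decides how the ids are interpreted can be sketched as below. This is an illustration, not the module's implementation: id_lookup_mode() is a hypothetical helper, and treating "integer" as "a plain run of digits" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::Grep. It mimics the documented
# behaviour: inspect the FIRST element only and pick a lookup mode.
sub id_lookup_mode {
    my ($ids_ref) = @_;
    die "Expected a non-empty array reference\n"
        if ref($ids_ref) ne 'ARRAY' || !@{$ids_ref};

    my $first = $ids_ref->[0];

    # Assumption: an "integer" is a string consisting only of digits.
    if ( $first =~ m{\A \d+ \z}xms ) {
        return { mode => 'internal_ids', ids => [ @{$ids_ref} ] };
    }

    # Anything else: the first element is the query, the rest is ignored.
    return { mode => 'query', query => $first };
}

for my $ids ( [ 0, 2, 4 ], [ 'At1g1', 'ignored' ] ) {
    my $lookup = id_lookup_mode($ids);
    print join( q{,}, @{$ids} ), ' -> ', $lookup->{mode}, "\n";
}
```

One consequence worth keeping in mind: ids collected after a search with showdesc are gene ids, so they fall into the query branch and only the first one is used per call.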
DIAGNOSTICS
See Bio::Grep::Backend::BackendI for other diagnostics.
mkvtree call failed. Cannot generate suffix array. Command was: ...
It was not possible to generate a suffix array in generate_database(). Check permissions and paths. Bio::Root::SystemException.
Unsupported alphabet of file.
The method generate_database() could not determine the alphabet (DNA or Protein) of the specified Fasta file. Bio::Root::BadParameter.
Vmatch call failed. Command was: ...
It was not possible to run Vmatch in function search(). Check the search settings. When you get the Vmatch error
vmatch: searchlength=x must be >= y=prefixlen
the number of mismatches is too high or the query is too short. You can rebuild the index with generate_database() and a smaller prefix_length, or you can try the online option. Bio::Root::SystemException.
vseqselect call failed. Cannot fetch sequences. Command was: ...
It was not possible to get some sequences out of the suffix array in get_sequences(). Check sequence ids. Bio::Root::SystemException.
You can't combine qspeedup and complete.
The Vmatch parameters -complete and -qspeedup cannot be combined. See the Vmatch documentation. Bio::Root::BadParameter.
You can't use showdesc() with upstream or downstream.
We need the tool vsubseqselect of the Vmatch package for the upstream and downstream regions. This tool requires an internal Vmatch sequence id as a parameter, which is not shown in the Vmatch output when showdesc is on. Bio::Root::BadParameter.
You have to specify complete or querylength. ...
The Vmatch parameters -complete and -l cannot be combined. See the Vmatch documentation. Bio::Root::BadParameter.
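Two of the diagnostics above describe option combinations that are rejected. A script could check for them before calling search(), roughly as sketched here. find_setting_conflicts() is a hypothetical helper and not part of Bio::Grep; the hash keys simply mirror the option names used in this document, and the messages are paraphrased rather than the module's exact exception text.

```perl
use strict;
use warnings;

# Hypothetical pre-flight check, not part of Bio::Grep. Takes the same
# kind of hash reference that is passed to search() and returns a list
# of human-readable conflicts (empty list if none were found).
sub find_setting_conflicts {
    my ($settings) = @_;
    my @conflicts;

    # showdesc hides the internal id that vsubseqselect needs.
    if ( $settings->{showdesc}
        && ( $settings->{upstream} || $settings->{downstream} ) )
    {
        push @conflicts,
            'showdesc cannot be used with upstream or downstream';
    }

    # Vmatch does not accept -qspeedup together with -complete.
    if ( $settings->{qspeedup} && $settings->{complete} ) {
        push @conflicts, 'qspeedup and complete cannot be combined';
    }

    return @conflicts;
}

my @conflicts = find_setting_conflicts({
    query    => 'AGAGCCCT',
    showdesc => 100,
    upstream => 30,
});
print "$_\n" for @conflicts;
```

The module raises its own exceptions for these cases, so a check like this is only a convenience for reporting all problems at once.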
SEE ALSO
Bio::Grep::Backend::BackendI Bio::Grep::SearchSettings Bio::SeqIO
AUTHOR
Markus Riester, <mriester@gmx.de>
LICENCE AND COPYRIGHT
Based on Weigel::Search v0.13
Copyright (C) 2005-2006 by Max Planck Institute for Developmental Biology, Tuebingen.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
DISCLAIMER OF WARRANTY
BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.