NAME

Bio::Grep::SearchSettings - Data structure for all search settings

SYNOPSIS

use Bio::Grep;

# configure our search back-end, in this case VMATCH
my $sbe = Bio::Grep->new('Vmatch');

# tell the back-end the location of the executables
# (only required if they are not in path)
$sbe->settings->execpath('/usr/local/virtual.distrib');

# maybe an alternative path for temporary files
$sbe->settings->tmppath($tmp_path);

# some query options
$sbe->settings->mismatches(4);
$sbe->settings->datapath($path_to_data_directory);
$sbe->settings->database("ATH1.cdna");

# retrieve upstream and downstream regions
# see perldoc Bio::Grep::SearchResult
# how to get them out of the search results
$sbe->settings->upstream(3);
$sbe->settings->downstream(3);

# search for the reverse complement 
$sbe->settings->query("UGAACAGAAAGCUCAUGAGCC");
$sbe->settings->reverse_complement(1);

# Alternatively, you can specify the options in the search call:
$sbe->search({
   query => $query,
   reverse_complement => 1,
});

DESCRIPTION

Bio::Grep::SearchSettings is the data structure for all search settings. Not all back-ends will support every option.

METHODS

new()

This function constructs an Bio::Grep::SearchSettings object. The back-end adds an object of this module into $sbe->settings, so you should never have to call this constructor directly. If you want to reset all settings to their default values, call

$sbe->settings->set({});

See set().

set($hash_ref)

Sets all settings in the hash reference:

set( { query => $query, reverse_complement => 1 } );

This function resets everything except the paths and the database to default values.

query()

Get/Set the query, a simple string. Queries are DNA, RNA, Protein or regular expressions but not all back-ends support all three query types.

Maybe this will change in future versions to allow multiple queries.

$sbe->settings->query('tgacagaagagagtgagcac');
query_file()

Get/set query_file. The back-ends can create a query file based on the search settings ($sbe->query()) or you can define one with $sbe->query_file.

$sbe->settings->query_file('oligos.fasta');

Note that all settings that affect the creation of a query file (like reverse_ complement) may be ignored (depends on your back-end).

Currently only available in the Vmatch and GUUGle back-end.

reverse_complement()

Get/set reverse_complement.

#  search for the reverse complement
$sbe->settings->reverse_complement(1)

#  don't search for the reverse complement (default)
$sbe->settings->reverse_complement(0)
direct_and_rev_com()

Get/set direct_and_rev_com. Searches for direct matches and the reverse complement.

Currently only available in Vmatch (-d AND -p flag), GUUGle (query file with two queries), and RE (regex $query|$rev_com_query).

mismatches()

Get/Set allowed mismatches

$sbe->settings->mismatches(5)

Not available in the GUUGle backend.

editdistance()

Get/Set allowed edit distance.

Only available in the Vmatch and Agrep back-end.

deletions()

Get/Set allowed number of deletions. Not supported by any back-end.

insertions()

Get/Set allowed number of insertions. Not supported by any back-end.

upstream()

Get/set upstream. This is the number of bases upstream the match.

$sbe->settings->upstream( 10 );

Not available in the Agrep backend.

downstream()

Get/set downstream. This is the number of bases downstream the match.

$sbe->settings->downstream( 10 );

Not available in the Agrep backend.

filters()

Get/set the filters. This is an array of modules based on Bio::Grep::Filter::FilterI.

# display only possible targets of the miRNA query   
my $filter1 = Bio::Grep::Filter::MIRNAFilter->new();

# and display every gene id only once
my $filter2 = Bio::Grep::Filter::FilterRemoveDuplicates->new();

$sbe->settings->filters( ( $filter1, $filter2 ) );

Not available in the Agrep backend.

datapath()

Get/set the datapath. This is the path to the databases for the back-end. Default is current directory ('./').

database()

Get/set the database. This is the name of the database in $self->datapath

tmppath()

Get/set the tmppath. This is a path were the back-end can store temporary files.

execpath()

Get/set the execpath. This is a path to the back-end executable. If you don't set it, make sure that the required executables are in path.

no_alignments()

Get/set the no_alignments. Some back-ends like Agrep don't output alignments. The EMBOSS Smith-Waterman implementation will automatically generate alignments for a search result. For performance reasons, you can turn turn this feature off:

# if things need to be fast, turn alignments off
$sbe->settings->no_alignments(1);

Not available (meaning not necessary) in the Vmatch and GUUGle back-end.

online()

Get/set online. Vmatch back-end allows searching without using the index. When you allow many mismatches, this could be faster.

Only available in the Vmatch back-end.

sort()

Get set sort mode. You can get an hash with all available sort modes (including a description from the back-end:

my %sort_modes = $sbe->available_sort_modes();
maxhits()

Get/set maxhits. Tells the back-end that it should output only the best n hits.

Only available in the Vmatch and GUUGle back-end.

gumismatches()

Get/set gumismatches. Tells the back-end how it should count GU mismatches. Valid values in GUUGle are 0, in all other back-ends 1 (default).

Only available in the GUUGle back-end.

query_length()

Get/Set the query length. Initialized with length of the query string. If this member variable is smaller then the query string, then the back-end will search for all substrings of that size.

Only available in the Vmatch and GUUGle back-end.

to_string()

This function returns a string representation of this object

Vmatch only

complete()

Get/set complete. Specify that query sequences must match completely.

$sbe->settings->complete(1);
hxdrop()

Specifies the xdrop value for hamming distance extension.

exdrop()

Specifies the xdrop value for edit distance extension.

showdesc()

Get/Set showdesc. This Vmatch command line option makes the Vmatch parser fetch the $sbe->result->sequence data directly out of the Vmatch output instead of calling vseqsubselect. Because of that, it is much faster with many search results. You can't use this option if you want to retrieve up- or downstream regions or if you are interested in the Vmatch internal sequence id.

# get the first 10 characters of the sequence description
# (sequence id + annotation)
$sbe->settings->showdesc(10);
qspeedup()

Get/Set qspeedup. Specify speedup level when matching queries (0: fast, 2: faster; default is 2). Beware of time/space tradeoff.

NOTE: You can use the hash features from the back-end to check if some feature is available or not. See Bio::Grep::Backend::BackendI for details.

SEE ALSO

Bio::Grep::Filter::FilterI Bio::Grep::SearchResult n<Bio::Grep::Backend::BackendI>

AUTHOR

Markus Riester, <mriester@gmx.de>

LICENCE AND COPYRIGHT

Based on Weigel::Search v0.13

Copyright (C) 2005-2006 by Max Planck Institute for Developmental Biology, Tuebingen.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.