NAME

KinoSearch::QueryParser - Transform a string into a Query object.

SYNOPSIS

my $query_parser = KinoSearch::QueryParser->new(
    schema => $searcher->get_schema,
    fields => ['body'],
);
my $query = $query_parser->parse( $query_string );
my $hits  = $searcher->hits( query => $query );

DESCRIPTION

QueryParser accepts search strings as input and produces KinoSearch::Search::Query objects, suitable for feeding into KinoSearch::Searcher and other Searchable subclasses.

The following syntactical constructs are recognized by QueryParser:

* Boolean operators 'AND', 'OR', and 'AND NOT'.
* Prepended +plus and -minus, indicating that the labeled entity should
  be either required or forbidden -- be it a single word, a phrase, or
  a parenthetical group.
* Logical groups, delimited by parentheses.  
* Phrases, delimited by double quotes.
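
As a rough illustration of the constructs listed above, the sketch below feeds one sample string per construct to a parser. The helper name parse_examples() and the sample terms are made up for this example; it assumes $query_parser is an already-constructed KinoSearch::QueryParser, as in the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch): parse one sample string
# per syntactic construct.  Assumes $query_parser was built as in the
# SYNOPSIS.
sub parse_examples {
    my ($query_parser) = @_;
    my @samples = (
        'foo AND bar',             # both terms required
        'foo OR bar',              # either term
        'foo AND NOT bar',         # foo, excluding matches for bar
        '+foo -bar',               # required foo, forbidden bar
        '(foo OR bar) AND baz',    # parenthetical group
        '"foo bar"',               # phrase
    );
    return map { $query_parser->parse($_) } @samples;
}
```

Each returned element is whatever parse() produces for the corresponding string, in the same order as the samples.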

Additionally, the following syntax can be enabled via set_heed_colons():

* Field-specific constructs, in the form of 'fieldname:termtext' or
  'fieldname:(foo bar)'.  (The field specified by 'fieldname:' will be
  used instead of the QueryParser's default fields.)
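
A minimal sketch of enabling the field-specific syntax before parsing. The helper name parse_with_field() and the 'title' field in the usage comment are hypothetical, and passing 1 as the "enable" value is an assumption based on the method's boolean-style name.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch): turn on fieldname:
# parsing, then parse a string that may restrict terms to one field.
sub parse_with_field {
    my ( $query_parser, $query_string ) = @_;
    $query_parser->set_heed_colons(1);    # assumed: a true value enables
    return $query_parser->parse($query_string);
}

# Usage, with a made-up 'title' field:
#   my $query = parse_with_field( $query_parser, 'title:(foo bar)' );
```

Note that the setting stays enabled on the parser after the call, so later parse() calls on the same object will also heed colons.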

CONSTRUCTORS

new( [labeled params] )

my $query_parser = KinoSearch::QueryParser->new(
    schema         => $searcher->get_schema,    # required
    analyzer       => $analyzer,                # overrides schema
    fields         => ['bodytext'],             # default: indexed fields
    default_boolop => 'AND',                    # default: 'OR'
);

Constructor.

  • schema - A Schema.

  • analyzer - An Analyzer. Ordinarily, the analyzers specified by each field's definition will be used, but if analyzer is supplied, it will override and be used for all fields. This can lead to mismatches between what is in the index and what is being searched for, so use caution.

  • fields - The names of the fields which will be searched against. Defaults to those fields which are defined as indexed in the supplied Schema.

  • default_boolop - Two possible values: 'AND' and 'OR'. The default is 'OR', which means: return documents which match any of the query terms. If you want only documents which match all of the query terms, set this to 'AND'.

METHODS

parse(query_string)

Build a Query object from the contents of a query string. At present, implemented internally by calling tree(), expand(), and prune().

  • query_string - The string to be parsed. May be undef.

Returns: a Query.
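
The three stages named above can be chained by hand, which is handy when you want to inspect the intermediate Query objects. This is only a sketch of the sequence described in the text, not the actual internals of parse(); the helper name parse_in_stages() is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch): run the three documented
# stages explicitly instead of calling parse().
sub parse_in_stages {
    my ( $query_parser, $query_string ) = @_;
    my $tree     = $query_parser->tree($query_string);    # logical structure
    my $expanded = $query_parser->expand($tree);          # LeafQuery expansion
    return $query_parser->prune($expanded);               # drop unsafe branches
}
```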

tree(query_string)

Parse the logical structure of a query string, building a tree composed of Query objects. Leaf nodes in the tree will most often be LeafQuery objects but might be MatchAllQuery or NoMatchQuery objects as well. Internal nodes will be objects which subclass PolyQuery: ANDQuery, ORQuery, NOTQuery, and RequiredOptionalQuery.

The output of tree() is an intermediate form which must be passed through expand() before being used to feed a search.

  • query_string - The string to be parsed.

Returns: a Query.

expand(query)

Walk the hierarchy of a Query tree, descending through all PolyQuery nodes and calling expand_leaf() on any LeafQuery nodes encountered.

  • query - A Query object.

Returns: A Query -- usually the same one that was supplied after in-place modification, but possibly another.

expand_leaf(query)

Convert a LeafQuery into either a TermQuery, a PhraseQuery, or an ORQuery joining multiple TermQueries/PhraseQueries to accommodate multiple fields. LeafQuery text will be passed through the relevant Analyzer for each field. Quoted text will be transformed into PhraseQuery objects. Unquoted text will be converted to either a TermQuery or a PhraseQuery depending on how many tokens are generated.

  • query - A Query. Only LeafQuery objects will be processed; others will be passed through.

Returns: A Query.

prune(query)

Prevent certain Query structures from returning too many results. Query objects built via tree() and expand() can generate "return the world" result sets, such as in the case of NOT a_term_not_in_the_index; prune() walks the hierarchy and eliminates such branches.

'NOT foo'                => [NOMATCH]
'foo OR NOT bar'         => 'foo'
'foo OR (-bar AND -baz)' => 'foo'

prune() also eliminates some double-negative constructs -- even though such constructs may not actually return the world:

'foo AND -(-bar)'      => 'foo'

In this example, safety is taking precedence over logical consistency. If you want logical consistency instead, call tree() then expand(), skipping prune().

  • query - A Query.

Returns: a Query; in most cases, the supplied Query after in-place modification.
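
For callers who prefer logical consistency over safety, the tree()-then-expand() path mentioned above might look like the following sketch. The helper name parse_unpruned() is hypothetical; be aware that the resulting Query may match very large result sets.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch): build a Query without
# the prune() step, keeping constructs such as 'foo AND -(-bar)' intact.
sub parse_unpruned {
    my ( $query_parser, $query_string ) = @_;
    my $tree = $query_parser->tree($query_string);
    return $query_parser->expand($tree);    # prune() deliberately skipped
}
```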

set_heed_colons(heed_colons)

Enable/disable parsing of fieldname:foo constructs.

make_term_query( [labeled params] )

Factory method creating a TermQuery.

  • field - Field name.

  • term - Term text.

Returns: A Query.

make_phrase_query( [labeled params] )

Factory method creating a PhraseQuery.

  • field - Field that the phrase must occur in.

  • terms - Ordered array of terms that must match.

Returns: A Query.

make_and_query(children)

Factory method creating an ANDQuery.

  • children - Array of child Queries.

Returns: A Query.

make_or_query(children)

Factory method creating an ORQuery.

  • children - Array of child Queries.

Returns: A Query.

make_not_query(negated_query)

Factory method creating a NOTQuery.

  • negated_query - Query to be inverted.

Returns: A Query.

make_req_opt_query( [labeled params] )

Factory method creating a RequiredOptionalQuery.

  • required_query - Query which must match.

  • optional_query - Query which should match.

Returns: A Query.
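
The factory methods can also be combined to assemble a Query without going through a query string. The sketch below uses only the labeled parameters documented above; the helper name build_req_opt() and its argument list are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch): build a
# RequiredOptionalQuery from two terms in the same field.
sub build_req_opt {
    my ( $query_parser, $field, $required_term, $optional_term ) = @_;

    # One TermQuery per term, both against the same field.
    my $required = $query_parser->make_term_query(
        field => $field,
        term  => $required_term,
    );
    my $optional = $query_parser->make_term_query(
        field => $field,
        term  => $optional_term,
    );

    # Documents must match $required; $optional should match.
    return $query_parser->make_req_opt_query(
        required_query => $required,
        optional_query => $optional,
    );
}

# Usage, with a made-up 'body' field:
#   my $query = build_req_opt( $query_parser, 'body', 'foo', 'bar' );
```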

INHERITANCE

KinoSearch::QueryParser isa KinoSearch::Obj.

COPYRIGHT

Copyright 2005-2009 Marvin Humphrey

LICENSE, DISCLAIMER, BUGS, etc.

See KinoSearch version 0.30.