NAME

App::ElasticSearch::Utilities::QueryString - CLI query string fixer

VERSION

version 5.4

SYNOPSIS

This class provides a pluggable architecture to expand query strings on the command-line into complex Elasticsearch queries.

ATTRIBUTES

search_path

An array reference of additional namespaces to search for loading the query string processing plugins. Example:

$qs->search_path([qw(My::Company::QueryString)]);

This will search:

App::ElasticSearch::Utilities::QueryString::*
My::Company::QueryString::*

For query processing plugins.

plugins

Array reference of ordered query string processing plugins, lazily assembled.

METHODS

expand_query_string(@tokens)

This method takes a list of tokens, often from the command line via @ARGV, and expands them into a query. It uses a plugin infrastructure to allow customization.

Returns: App::ElasticSearch::Utilities::Query object

TOKENS

A token expansion plugin can return undef, which leaves the token untouched and available to the remaining plugins. Alternatively, the plugin can return a hash reference, which marks the token as handled so that no other plugin receives it. The hash reference may contain:

query_string

These are the rewritten pieces that will be reassembled into the final query string.

condition

This is usually a hash reference representing the condition going into the bool query. For instance:

{ terms => { field => [qw(alice bob charlie)] } }

Or

{ prefix => { user_agent => 'Go ' } }

These conditions end up in the must or must_not section of the bool query, depending on the state of the invert flag.

invert

This is used by the bareword "not" to track whether the token invoked a flip from the must to the must_not state. After each token is processed, if it didn't set this flag, the flag is reset.

dangles

This is used for bare words like "not", "or", and "and" to denote that these terms cannot dangle from the beginning or end of the query_string. This allows the final pass of the query_string builder to strip these words to prevent syntax errors.
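The contract described above can be sketched as a small dispatch loop. This is a minimal, self-contained illustration and not the module's actual implementation: the plugins are plain code references, and the names expand_tokens, %OPERATORS, and @plugins are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical plugins for illustration only. Each one receives a token and
# returns undef (not handled) or a hash reference (handled).
my %OPERATORS = ( and => 'AND', or => 'OR', not => 'NOT' );
my @plugins = (
    sub {    # barewords: uppercase the operator, mark it as dangling
        my ($token) = @_;
        my $op = $OPERATORS{ lc $token } or return;
        return {
            query_string => $op,
            dangles      => 1,
            invert       => ( $op eq 'NOT' ? 1 : 0 ),
        };
    },
    sub {    # _prefix_:field:text becomes a prefix condition
        my ($token) = @_;
        my ( $field, $text ) = $token =~ /^_prefix_:([^:]+):(.+)$/ or return;
        return { condition => { prefix => { $field => $text } } };
    },
);

sub expand_tokens {
    my @tokens = @_;
    my ( @qs, @must, @must_not );
    my $invert = 0;

    for my $token (@tokens) {
        my $result;
        for my $plugin (@plugins) {
            $result = $plugin->($token);
            last if defined $result;    # first hash reference wins
        }
        $result ||= { query_string => $token };    # unhandled: pass through

        if ( my $cond = $result->{condition} ) {
            if ($invert) {
                # The "NOT" that set the flag is consumed by must_not.
                pop @qs if @qs && $qs[-1]{dangles};
                push @must_not, $cond;
            }
            else {
                push @must, $cond;
            }
        }
        elsif ( defined $result->{query_string} ) {
            push @qs, { text => $result->{query_string}, dangles => $result->{dangles} };
        }

        # Reset the flag unless this token set it.
        $invert = $result->{invert} ? 1 : 0;
    }

    # Final pass: strip operators dangling at either end of the query string.
    shift @qs while @qs && $qs[0]{dangles};
    pop @qs   while @qs && $qs[-1]{dangles};

    return {
        query_string => join( ' ', map { $_->{text} } @qs ),
        must         => \@must,
        must_not     => \@must_not,
    };
}

my $expanded = expand_tokens(qw(not _prefix_:useragent:Go and status:200 or));
print "query_string: $expanded->{query_string}\n";
printf "must: %d, must_not: %d\n",
    scalar @{ $expanded->{must} }, scalar @{ $expanded->{must_not} };
```

In this sketch the leading "not" sends the prefix condition to must_not, and the "and" and "or" left dangling at the ends are stripped, so only status:200 remains in the query string.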

Extended Syntax

The search string is pre-analyzed before being sent to Elasticsearch. The following plugins manipulate the query string to provide a richer, more complete syntax for CLI applications.

App::ElasticSearch::Utilities::QueryString::BareWords

The following barewords are transformed:

or => OR
and => AND
not => NOT

App::ElasticSearch::Utilities::QueryString::IP

If a field's value is an IP address wildcard, it is transformed into a range:

src_ip:10.* => src_ip:[10.0.0.0 TO 10.255.255.255]
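The transformation above can be sketched in a few lines. This is an illustration under stated assumptions, not the plugin's actual code: the helper name expand_ip_wildcard is hypothetical, and the sketch only handles IPv4 values that end in a single trailing ".*".

```perl
use strict;
use warnings;

# Hypothetical helper: expand "field:10.*" into an inclusive IPv4 range.
# Returns undef when the token is not an IPv4 wildcard.
sub expand_ip_wildcard {
    my ($token) = @_;
    my ( $field, $value ) = split /:/, $token, 2;
    return unless defined $value;
    return unless $value =~ /^(\d{1,3}(?:\.\d{1,3}){0,2})\.\*$/;

    my $prefix = $1;
    my @octets = split /\./, $prefix;
    return if grep { $_ > 255 } @octets;

    # Pad the missing octets with 0 for the low end and 255 for the high end.
    my $missing = 4 - @octets;
    my $low     = join '.', @octets, (0) x $missing;
    my $high    = join '.', @octets, (255) x $missing;

    return "$field:[$low TO $high]";
}

print expand_ip_wildcard('src_ip:10.*'), "\n";
```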

App::ElasticSearch::Utilities::QueryString::Underscored

This plugin translates some special underscore surrounded tokens into the Elasticsearch Query DSL.

Implemented:

_prefix_

Example query string:

_prefix_:useragent:'Go '

Translates into:

{ prefix => { useragent => 'Go ' } }

App::ElasticSearch::Utilities::QueryString::FileExpansion

If the value ends in .dat, .txt, or .csv, the plugin attempts to read a file with that name and ORs the values it contains into the condition:

$ cat test.dat
50  1.2.3.4
40  1.2.3.5
30  1.2.3.6
20  1.2.3.7

Or

$ cat test.csv
50,1.2.3.4
40,1.2.3.5
30,1.2.3.6
20,1.2.3.7

Or

$ cat test.txt
1.2.3.4
1.2.3.5
1.2.3.6
1.2.3.7

We can source that file:

src_ip:test.dat => src_ip:(1.2.3.4 1.2.3.5 1.2.3.6 1.2.3.7)

This makes it simple to use the --data-file output options and build queries based on previous queries. For .txt and .dat files, the column delimiter must be a tab, a comma, or a semicolon. For files ending in .csv, Text::CSV_XS is used to parse the file format accurately.

You can also specify which column of the data file to use. The default is the last column (-1). Column indexes are zero-based: the first column is index 0, the second is 1, and so on. The previous example can be rewritten as:

src_ip:test.dat[1]

Or

src_ip:test.dat[-1]

This option iterates through the whole file and deduplicates the values in the list. They are then transformed into an appropriate terms query.
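The column selection and deduplication described above can be sketched as follows. This is a simplified illustration, not the plugin's implementation: the helper name column_values is hypothetical, only .dat and .txt files are handled (the module uses Text::CSV_XS for .csv), and the file in the usage example is a temporary file created for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: split a spec such as "test.dat[1]" into a file name
# and a column index, then return the unique values of that column in
# first-seen order. Returns nothing when the spec does not match.
sub column_values {
    my ($spec) = @_;
    my ( $file, $col ) = $spec =~ /^(.+\.(?:dat|txt))(?:\[(-?\d+)\])?$/
        or return;
    $col = -1 unless defined $col;    # default: last column

    open my $fh, '<', $file or die "Cannot read $file: $!";
    my ( %seen, @values );
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;
        my @cols  = split /[\t,;]/, $line;    # tab, comma, or semicolon
        my $value = $cols[$col];
        next unless defined $value && length $value;
        push @values, $value unless $seen{$value}++;
    }
    close $fh;
    return \@values;
}

# Usage: write a small tab-delimited file, then build a terms condition.
my ( $tmp, $path ) = tempfile( SUFFIX => '.dat', UNLINK => 1 );
print {$tmp} "50\t1.2.3.4\n40\t1.2.3.5\n30\t1.2.3.4\n";
close $tmp;

my $values    = column_values( $path . '[1]' );
my $condition = { terms => { src_ip => $values } };
print join( ' ', @{ $condition->{terms}{src_ip} } ), "\n";
```

Keeping the values in first-seen order is a choice made for this sketch; the source only states that the list is made unique.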

AUTHOR

Brad Lhotsky <brad@divisionbyzero.net>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2012 by Brad Lhotsky.

This is free software, licensed under:

The (three-clause) BSD License
