NAME
App::ElasticSearch::Utilities::Query - Object representing ES Queries
VERSION
version 5.4
ATTRIBUTES
query_stash
Hash reference containing replaceable query elements. See stash.
must
The must section of a bool query as an array reference. See add_bool. Can be set using set_must and is a valid init_arg.
must_not
The must_not section of a bool query as an array reference. See add_bool. Can be set using set_must_not and is a valid init_arg.
should
The should section of a bool query as an array reference. See add_bool. Can be set using set_should and is a valid init_arg.
filter
The filter section of a bool query as an array reference. See add_bool. Can be set using set_filter and is a valid init_arg.
from
Integer representing the offset the query should start returning documents from. The default is undefined, which falls back on the Elasticsearch default of 0, or from the beginning. Can be set with set_from. Cannot be an init_arg.
size
The number of documents to return in the query. The default size is 50. Can be set with set_size. Cannot be an init_arg.
fields
An array reference containing the names of the fields to retrieve with the query. The default is undefined, which falls back on the Elasticsearch default of empty, or no fields retrieved. The _source is still retrieved. Can be set with set_fields. Cannot be an init_arg.
sort
An array reference of sorting keys/directions. The default is undefined, which falls back on the Elasticsearch default of score:desc. Can be set with set_sort. Cannot be an init_arg.
aggregations
A hash reference of aggregations to perform. The default is undefined, which means no aggregations are performed. Can be set with set_aggregations, which is aliased as set_aggs. Cannot be an init_arg. Aliased as aggs.
scroll
An Elasticsearch time constant. The default is undefined, which means scroll will not be set on a query. Can be set with set_scroll. Cannot be an init_arg. See also: set_scan_scroll.
timeout
An Elasticsearch time constant. The default is undefined, which means it will default to the connection timeout. Can be set with set_timeout. Cannot be an init_arg.
terminate_after
The number of documents after which the search is cancelled. This generally shouldn't be used except for large queries where you are protecting against OOM errors. The size attribute is more accurate because its truncation occurs after the reduce operation, whereas terminate_after takes effect during the map phase of the query. Can be set with set_terminateafter. Cannot be an init_arg.
METHODS
uri_params()
Retrieves the URI parameters for the query as a hash reference. Undefined parameters will not be represented in the hash.
request_body()
Builds and returns a hash reference representing the request body for the Elasticsearch query. Undefined elements will not be represented in the hash.
query()
Builds and returns a hash reference representing the bool query section of the request body. This function is called by the request_body function, but is useful and distinct enough to expose as its own method. Undefined elements of the query will not be represented in the hash it returns.
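As a rough illustration of the shape described above, the following plain-Perl sketch assembles a bool query from the four sections and leaves out any section that is undefined or empty. The helper build_bool_query is hypothetical and is not part of this module; treating an empty list the same as an undefined one is an assumption made for the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of App::ElasticSearch::Utilities::Query.
# It models the documented shape: a bool query whose undefined (and, by
# assumption, empty) sections are omitted from the resulting hash.
sub build_bool_query {
    my (%sections) = @_;
    my %bool;
    for my $name (qw(must must_not should filter)) {
        my $conditions = $sections{$name};
        next unless defined $conditions && @{$conditions};
        $bool{$name} = [ @{$conditions} ];    # shallow copy of the list
    }
    return { bool => \%bool };
}

my $query = build_bool_query(
    must   => [ { term  => { src_ip => '1.2.3.4' } } ],
    filter => [ { range => { attack_score => { gt => 10 } } } ],
    should => [],    # empty, so it is left out
);
```

Only the must and filter keys appear under bool in the result; should and must_not are absent rather than present with empty values.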
add_aggregations( name => { ... } )
Takes one or more key-value pairs. The key is the name of the aggregation and the value is the hash reference representing the aggregation itself. If an aggregation with the same name already exists, it is silently replaced by the most recent call.
Calling this function overrides the size element to 0 and scroll to undef.
Aliased as add_aggs.
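The side effects described above can be pictured with a small plain-Perl model. This is only a sketch of the documented behavior on an ordinary hash; the %state hash and add_aggs_model are hypothetical names and do not reflect how the module stores its attributes.

```perl
use strict;
use warnings;

# Hypothetical model, NOT the module's implementation. It mirrors the
# documented side effects of add_aggregations on a plain hash.
my %state = ( size => 50, scroll => '1m', aggregations => {} );

sub add_aggs_model {
    my (%new) = @_;
    # A later definition with the same name replaces the earlier one.
    $state{aggregations}{$_} = $new{$_} for keys %new;
    $state{size}   = 0;        # documented override: size becomes 0
    $state{scroll} = undef;    # documented override: scroll is unset
    return;
}

add_aggs_model( ip => { terms => { field => 'src_ip' } } );
add_aggs_model( ip => { terms => { field => 'dst_ip' } } );    # replaces "ip"
```

After both calls the model holds a single aggregation named ip, defined by the second call.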
wrap_aggregations( name => { ... } )
Use this to wrap an aggregation in another aggregation. For example:
$q->add_aggregations( ip => { terms => { field => 'src_ip' } } );
Creates:
{
    "aggs": {
        "ip": {
            "terms": {
                "field": "src_ip"
            }
        }
    }
}
This gives you the top IPs for the whole query set. To wrap that aggregation to get top IPs per hour, you could:
$q->wrap_aggregations( hourly => { date_histogram => { field => 'timestamp', interval => '1h' } } );
Which translates the query into:
{
    "aggs": {
        "hourly": {
            "date_histogram": {
                "field": "timestamp",
                "interval": "1h"
            },
            "aggs": {
                "ip": {
                    "terms": {
                        "field": "src_ip"
                    }
                }
            }
        }
    }
}
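The wrapping step can also be expressed on plain Perl data structures. The sketch below is not the module's implementation; wrap_aggs is a hypothetical helper that reproduces the documented effect, in which the existing aggregations become the aggs child of the new outer aggregation.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's implementation. It models the
# documented effect of wrap_aggregations: the existing aggregations
# become the "aggs" child of the new, outer aggregation.
sub wrap_aggs {
    my ($existing, $name, $definition) = @_;
    return { $name => { %{$definition}, aggs => $existing } };
}

my $aggs = { ip => { terms => { field => 'src_ip' } } };
$aggs = wrap_aggs(
    $aggs,
    hourly => { date_histogram => { field => 'timestamp', interval => '1h' } },
);
```

The resulting $aggs has hourly as its only top-level key, with the original ip aggregation nested under its aggs key.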
set_scan_scroll($ctxt_life)
This function emulates the old scan scroll feature found in early versions of Elasticsearch. It takes an optional Elasticsearch time constant, which defaults to '1m'. It is the same as calling:
$self->set_sort( [qw(_doc)] );
$self->set_scroll( $ctxt_life );
set_match_all()
This method clears all filters and query elements and sets must to match_all. It will not reset other parameters like size, sort, and aggregations.
add_bool( section => condition )
Appends a search condition to a section in the query body. Valid sections are: must, must_not, should, and filter.
stash( section => condition )
Allows a replaceable query element to exist in the query body sections: must, must_not, should, and/or filter. This is useful for moving through a data set while preserving everything in a query except one piece that shifts. Imagine:
my $query = App::ElasticSearch::Utilities::Query->new();
$query->add_bool(must => { terms => { src_ip => [qw(1.2.3.4)] } });
$query->add_bool(must => { range => { attack_score => { gt => 10 } } });
# now() and make_es_request() are placeholders for your own code
while( 1 ) {
    # Each call replaces the previously stashed must condition
    $query->stash( must => { range => { timestamp => { gt => now() } } } );
    my @results = make_es_request( $query->request_body, $query->uri_params );
    # Long processing
}
This allows re-use of the query object inside of loops like this.
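To make the replace-in-place behavior concrete, here is a small plain-Perl model. It is a sketch under assumptions, not the module's code: the names add_fixed, stash_cond, and section_conditions are hypothetical, and placing the stashed condition after the fixed ones is an arbitrary choice for the illustration.

```perl
use strict;
use warnings;

# Hypothetical model, NOT the module's implementation: fixed conditions
# accumulate per section, while each section holds at most one stashed
# condition that the next stash call replaces.
my %fixed   = ();
my %stashed = ();

sub add_fixed  { my ($section, $cond) = @_; push @{ $fixed{$section} }, $cond; return }
sub stash_cond { my ($section, $cond) = @_; $stashed{$section} = $cond; return }

sub section_conditions {
    my ($section) = @_;
    my @conds = @{ $fixed{$section} || [] };
    push @conds, $stashed{$section} if exists $stashed{$section};
    return \@conds;
}

add_fixed( must => { terms => { src_ip => ['1.2.3.4'] } } );
add_fixed( must => { range => { attack_score => { gt => 10 } } } );

# Illustrative timestamps; each pass replaces the stashed condition.
for my $since (qw(2024-01-01 2024-01-02)) {
    stash_cond( must => { range => { timestamp => { gt => $since } } } );
}
my $must = section_conditions('must');
```

However many times the loop runs, the must section in this model holds the two fixed conditions plus exactly one timestamp range, taken from the latest stash call.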
AUTHOR
Brad Lhotsky <brad@divisionbyzero.net>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2012 by Brad Lhotsky.
This is free software, licensed under:
The (three-clause) BSD License