NAME
Elastic::Manual::QueryDSL - How to use the Elasticsearch Query DSL
VERSION
version 0.29_2
INTRODUCTION
Elasticsearch provides a rich query language, known as the Query DSL, which exposes much of the power of Lucene through a simple JSON interface. It is tuned for full text search, but is in no way limited to that. It also provides very fast and flexible filters, ranges, geo-location and more.
ElasticSearch::SearchBuilder is a more concise, more Perlish version of the Query DSL, similar to SQL::Abstract. Both syntaxes are fully supported by Elastic::Model. In Elastic::Model::View, query, filter and post_filter expect the native Query DSL, while queryb, filterb and post_filterb (note the extra "b") expect the SearchBuilder syntax.
FULL TEXT VS EXACT MATCHING
There are two broad ways to match values in Elasticsearch:
Exact matching
Documents where status eq 'active'
Documents with any of the tags "perl", "python" or "ruby"
Documents published between 2012-01-01 and 2012-12-31
Documents within 50km of geo-point Lat,Lon
Documents that have a value in the name field
Full text matching
Documents which are relevant to the search terms "quick brown fox"
Documents about "Big Mac", but not about "Apple Mac"
Documents about "flying kites", where the text may include "fly", "flying", "kite" or "kites"
The most relevant auto-complete terms which match the partial phrase "arnold schwa"
You can only find what is actually stored in Elasticsearch. For this reason, exact matching is easy. Full text matching is made easy by the analysis process, which you can read more about in Elastic::Manual::Analysis.
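The exact matching cases listed above can be sketched as native Query DSL filters, written here as plain Perl data structures. This is a minimal sketch, not taken from the original manual: the field names (status, tags, published, location, name) and the coordinates are hypothetical, and the filter names (term, terms, range, geo_distance, exists) are those of the Elasticsearch filter DSL of that era.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical field names and values, one filter per exact matching case.
my %filter = (
    # Documents where status eq 'active'
    status => { term => { status => 'active' } },

    # Documents with any of the tags "perl", "python" or "ruby"
    tags => { terms => { tags => [ 'perl', 'python', 'ruby' ] } },

    # Documents published between 2012-01-01 and 2012-12-31 (inclusive)
    published => {
        range => { published => { gte => '2012-01-01', lte => '2012-12-31' } }
    },

    # Documents within 50km of a geo-point (placeholder coordinates)
    nearby => {
        geo_distance => {
            distance => '50km',
            location => { lat => 51.5, lon => -0.12 },
        }
    },

    # Documents that have a value in the name field
    has_name => { exists => { field => 'name' } },
);

# Print the JSON that each Perl structure corresponds to.
print JSON::PP->new->canonical->pretty->encode( \%filter );
```

Each value in the hash is a single filter; the hash keys are only labels for this example.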
QUERIES VS FILTERS
Every search has a single query, but that query can contain multiple other queries and filters. Choosing the right tool for the job is important:
- Filters:
    - are for "Exact matching" only
    - are boolean: a doc either matches or it doesn't. There is no scoring phase
    - are faster
    - are cacheable
- Queries:
    - can be used for "Exact matching" or for "Full text matching"
    - score each document by "relevance" (see http://www.lucenetutorial.com/advanced-topics/scoring.html for a summary of how scoring works)
    - are slower, because of the scoring phase
    - are not cacheable
In summary: you should use filters for any part of the query that does not require relevance scoring.
Note: a search for documents which have the exact tags "perl" or "python" may use a filter or a query. If all you care about is that each document has at least one of those tags, then use a filter. If a document that has BOTH tags should be considered more relevant than a document with only one tag, then you need a query.
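The two options for the tag search can be sketched as follows. This is an illustration under assumptions rather than the manual's own example: the filter uses a terms filter on a hypothetical tags field, and the query uses a bool query with one should clause per tag, which is one common way to let a document that matches both tags score higher than a document that matches only one.

```perl
use strict;
use warnings;

my @tags = ( 'perl', 'python' );

# Filter: a document either has at least one of the tags or it does not.
my $tag_filter = { terms => { tags => \@tags } };

# Query: one "should" clause per tag. Each matching clause contributes
# to the score, so a document with both tags ranks above a document
# with only one of them.
my $tag_query = {
    bool => {
        should => [ map { +{ term => { tags => $_ } } } @tags ],
    }
};
```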
USING QUERIES AND FILTERS WITH ELASTIC::MODEL::VIEW
Elastic::Model::View gives you a "view" across your data. It is the class you use to build your searches.
Just a query:
$view->query( text => { title => 'object models' })->search;
Just a filter:
$view->filter( terms => { tags => [ 'perl', 'python' ] })->search;
A query and filter combined:
$view->query( text => { title => 'object models' })
     ->filter( terms => { tags => [ 'perl', 'python' ] })->search;
Or, with the SearchBuilder syntax:
$view->queryb( title => 'object models' )
     ->filterb( tags => [ 'perl', 'python' ] )->search;
Note: Elasticsearch only accepts a query parameter, so the query and filter attributes are combined at search time. This means that you can quite happily specify queries with nested filters using just the query attribute.
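As a rough sketch of what such a combination can look like in the native Query DSL, the query and filter from the examples above can be wrapped in a single filtered query. The use of the filtered query here is an assumption for illustration; the exact structure that Elastic::Model builds at search time is not stated in this document.

```perl
use strict;
use warnings;
use JSON::PP;

my $query  = { text  => { title => 'object models' } };
my $filter = { terms => { tags  => [ 'perl', 'python' ] } };

# Assumption: the query and the filter are nested inside one "filtered"
# query, so that only a single query parameter is sent to Elasticsearch.
my $combined = { filtered => { query => $query, filter => $filter } };

print JSON::PP->new->canonical->pretty->encode( { query => $combined } );
```

A structure like $combined is also an example of a query with a nested filter that could be written by hand and passed through the query attribute alone.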
Note: the post_filter works in exactly the same way as filter, but it only filters the results AFTER the facets (like GROUP BY) have been calculated.
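The order of operations can be illustrated with a small pure-Perl simulation. It does not talk to Elasticsearch and uses hypothetical documents; it only shows that facet counts are calculated over everything the query matched, while the post_filter narrows the results that are returned.

```perl
use strict;
use warnings;

# Hypothetical documents that the query has already matched.
my @matched = (
    { title => 'Perl object models',     tags => ['perl'] },
    { title => 'Python object models',   tags => ['python'] },
    { title => 'Object models compared', tags => [ 'perl', 'python' ] },
);

# Step 1: facet counts are calculated over all matched documents.
my %tag_counts;
$tag_counts{$_}++ for map { @{ $_->{tags} } } @matched;

# Step 2: the post_filter (here: tag "perl") is applied afterwards,
# so it changes the returned results but not the facet counts.
my @results = grep {
    my $doc = $_;
    grep { $_ eq 'perl' } @{ $doc->{tags} }
} @matched;

printf "%s: %d\n", $_, $tag_counts{$_} for sort keys %tag_counts;
print "$_->{title}\n" for @results;
```

Had the same restriction been applied as a filter instead, the "python" count in step 1 would have covered only the documents that also carry the "perl" tag.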
QUERIES
See Elastic::Manual::QueryDSL::Queries for examples of commonly used queries.
FILTERS
See Elastic::Manual::QueryDSL::Filters for examples of commonly used filters.
AUTHOR
Clinton Gormley <drtech@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Clinton Gormley.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.