NAME
Elastic::Model::View - Views to query your docs in ElasticSearch
VERSION
version 0.02
SYNOPSIS
$view = $model->view(); # all domains and types known to the model
$view = $domain->view(); # just $domain->name, and its types
$posts = $view->type( 'post' ); # just type post
10 most relevant posts containing 'perl' or 'moose':
$results = $posts->queryb( content => 'perl moose' )->search;
10 most relevant posts containing 'perl' or 'moose', published since 1 Jan 2012, sorted by timestamp, with highlighted snippets from the content field:
$results = $posts
->queryb ( 'content' => 'perl moose' )
->filterb ( 'created' => { gte => '2012-01-01' } )
->sort ( 'timestamp' )
->highlight ( 'content' )
->search;
The same as the above, but in one step:
$results = $domain->view(
type => 'post',
sort => 'timestamp',
queryb => { content => 'perl moose' },
filterb => { created => { gte => '2012-01-01' } },
highlight => 'content',
)->search;
Efficiently retrieve all posts, unsorted:
$results = $posts->size(100)->scan;
while ( my $result = $results->shift_result ) {
    do_something_with($result);
}
DESCRIPTION
Elastic::Model::View is used to query your docs in ElasticSearch.
Views are "chainable". In other words, you get a clone of the current view every time you set an attribute. For instance, you could do:
$all_types = $domain->view;
$users = $all_types->type('user');
$posts = $all_types->type('post');
$recent_posts = $posts->filterb({ published => { gt => '2012-05-01' }});
Alternatively, you can set all or some of the attributes when you create a view:
$recent_posts = $domain->view(
type => 'post',
filterb => { published => { gt => '2012-05-01' }}
);
Views are also reusable. They only hit the database when you call one of the methods, eg:
$results = $recent_posts->search; # retrieve $size results
$scroll = $recent_posts->scroll; # keep pulling results
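The clone-on-set behaviour described above can be illustrated with a minimal sketch. The class My::SketchView and its two attributes are hypothetical stand-ins written for this example; this is not how Elastic::Model::View is implemented, only a demonstration of why the original view stays reusable after chaining.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a view. NOT Elastic::Model::View itself.
package My::SketchView;

sub new {
    my ( $class, %attrs ) = @_;
    return bless {%attrs}, $class;
}

# Each accessor returns the current value when called without arguments,
# or a modified shallow clone (leaving the original untouched) otherwise.
for my $name (qw(type size)) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$name" } = sub {
        my $self = shift;
        return $self->{$name} unless @_;
        my $value = shift;
        return bless { %$self, $name => $value }, ref $self;
    };
}

package main;

my $all   = My::SketchView->new( size => 10 );
my $posts = $all->type('post');    # new object, $all is unchanged
my $big   = $posts->size(100);     # new object, $posts is unchanged

for my $view ( $all, $posts, $big ) {
    printf "type=%s size=%d\n", $view->type // 'any', $view->size;
}
```

Because every setter hands back a new object, a base view such as $all can be kept around and specialised many times without the variants interfering with each other.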
METHODS
Calling one of the methods listed below executes your query and returns the results. Your view is unchanged and can be reused later.
See Elastic::Manual::Searching for a discussion about when and how to use "search()", "scroll()" or "scan()".
search()
$results = $view->search();
Executes a search and returns an Elastic::Model::Results object with at most "size" results.
This is useful for returning finite results, ie where you know how many results you want. For instance: "give me the 10 best results".
scroll()
$scroll_timeout = '1m';
$scrolled_results = $view->scroll( $scroll_timeout );
Executes a search and returns an Elastic::Model::Results::Scrolled object which will pull "size" results from ElasticSearch as required until either (1) no more results are available or (2) more than $scroll_timeout (default 1 minute) elapses between requests to ElasticSearch.
Scrolling allows you to return an unbounded result set. This is useful if you're not sure whether to expect 2 results or 2000.
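The idea of pulling "size" results at a time, only when they are needed, can be sketched in plain Perl. Both make_batch_source() and make_scroller() are hypothetical helpers invented for this illustration; they simulate the behaviour with an in-memory list and do not talk to ElasticSearch or model the scroll timeout.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the server: hands out docs in batches of $size.
sub make_batch_source {
    my ( $size, @docs ) = @_;
    return sub { return splice @docs, 0, $size };
}

my $requests = 0;

# Hypothetical iterator: refills its buffer only when it has run dry.
sub make_scroller {
    my ($fetch) = @_;
    my @buffer;
    return sub {
        unless (@buffer) {
            @buffer = $fetch->();
            $requests++;
        }
        return shift @buffer;    # undef once no more results are available
    };
}

my @index = map( "doc_$_", 1 .. 5 );
my $next  = make_scroller( make_batch_source( 2, @index ) );

my @pulled;
while ( defined( my $doc = $next->() ) ) {
    push @pulled, $doc;
}

printf "pulled %d docs in %d requests\n", scalar @pulled, $requests;
```

The caller never needs to know the total up front: it simply keeps asking for the next result until none is returned.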
scan()
$timeout = '1m';
$scrolled_results = $view->scan($timeout);
"scan()" is a special type of "scroll()" request, intended for efficient handling of large numbers of unsorted docs (eg when you want to reindex all of your data).
first()
$result = $view->first();
$object = $view->first->object;
Executes the search and returns just the first result. All other metadata is thrown away.
total()
$total = $view->total();
Executes the search and returns the total number of matching docs. All other metadata is thrown away.
delete()
$results = $view->delete();
Deletes all docs matching the query and returns a hashref indicating success. Any docs that are stored in a live scope or are cached somewhere are not removed.
This should really only be used once you are sure that the matching docs are out of circulation. Also, it is more efficient to just delete a whole index (if possible), rather than deleting large numbers of docs.
Note: The only attributes relevant to "delete()" are "domain", "type", "query", "routing", "consistency" and "replication".
CORE ATTRIBUTES
domain
$new_view = $view->domain('my_index');
$new_view = $view->domain('index_one','alias_two');
\@domains = $view->domain;
Specify one or more domains (indices or aliases) to query. By default, a view created from a domain will query just that domain's name. A view created from the model will query all the main domains (ie the "name" in Elastic::Model::Namespace) and fixed domains known to the model.
type
$new_view = $view->type('user');
$new_view = $view->type('user','post');
\@types = $view->type;
By default, a view will query all types known to all the domains specified in the view. You can specify one or more types.
query
queryb
# native query DSL
$new_view = $view->query( text => { title => 'interesting words' } );
# SearchBuilder DSL
$new_view = $view->queryb( title => 'interesting words' );
\%query = $view->query;
Specify the query to run in the native ElasticSearch query DSL, or use queryb() to specify your query with the more Perlish ElasticSearch::SearchBuilder query syntax.
By default, the query will match all docs.
filter
filterb
# native query DSL
$new_view = $view->filter( term => { tag => 'perl' } );
# SearchBuilder DSL
$new_view = $view->filterb( tag => 'perl' );
\%filter = $view->filter;
You can specify a filter to apply to the query results using the native ElasticSearch query DSL, or use filterb() to specify your filter with the more Perlish ElasticSearch::SearchBuilder DSL. If a filter is specified, it will be combined with the "query" as a filtered query, or (if no query is specified) as a constant score query.
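The combination rule above can be made concrete with a small sketch. The helper combine_query_and_filter() is hypothetical and is not taken from the module; the filtered, constant_score and match_all keys are assumed to be the native query DSL names for the three cases, so check them against the DSL reference for your ElasticSearch version.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: shows the shape of the combined query only.
sub combine_query_and_filter {
    my ( $query, $filter ) = @_;

    if ( !$filter ) {
        # No filter: use the query as-is, or match all docs by default.
        return $query ? $query : { match_all => {} };
    }

    # Filter only: wrap it in a constant score query.
    return { constant_score => { filter => $filter } } unless $query;

    # Query and filter: wrap both in a filtered query.
    return { filtered => { query => $query, filter => $filter } };
}

$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;

my $query  = { text => { title => 'interesting words' } };
my $filter = { term => { tag   => 'perl' } };

print Dumper( combine_query_and_filter( $query, $filter ) );
print Dumper( combine_query_and_filter( undef,  $filter ) );
print Dumper( combine_query_and_filter() );
```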
post_filter
post_filterb
# native query DSL
$new_view = $view->post_filter( term => { tag => 'perl' } );
# SearchBuilder DSL
$new_view = $view->post_filterb( tag => 'perl' );
\%filter = $view->post_filter;
Post-filters filter the results AFTER any "facets" have been calculated. In the above example, the facets would be calculated on all values of tag, but the results would then be limited to just those docs where tag == perl.
You can specify a post_filter using the native ElasticSearch query DSL, or use post_filterb() to specify it with the more Perlish ElasticSearch::SearchBuilder DSL.
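The ordering of facets and post-filters can be illustrated with an in-memory simulation. The docs below are made-up sample data, and the two steps only mimic what ElasticSearch does on the server; nothing here calls the module.

```perl
use strict;
use warnings;

# Made-up sample docs standing in for everything the query matched.
my @docs = (
    { title => 'A', tag => 'perl' },
    { title => 'B', tag => 'moose' },
    { title => 'C', tag => 'perl' },
);

# Step 1: facet counts are calculated over every matching doc.
my %tag_counts;
$tag_counts{ $_->{tag} }++ for @docs;

# Step 2: the post-filter then narrows the hits that are returned.
my @hits = grep { $_->{tag} eq 'perl' } @docs;

print "facet $_: $tag_counts{$_}\n" for sort keys %tag_counts;
print "hit: $_->{title}\n" for @hits;
```

The facet still reports the moose docs even though none of them appear in the returned hits, which is what you want when the facet drives a "narrow your search" menu.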
sort
$new_view = $view->sort( '_score' ); # _score desc
$new_view = $view->sort( 'timestamp' ); # timestamp asc
$new_view = $view->sort( { timestamp => 'asc' } ); # timestamp asc
$new_view = $view->sort( { timestamp => 'desc' } ); # timestamp desc
$new_view = $view->sort(
'_score', # _score desc
{ timestamp => 'desc' } # then timestamp desc
);
\@sort = $view->sort;
By default, results are sorted by "relevance" (_score => 'desc'). You can specify multiple sort arguments, which are applied in order, and can include scripts or geo-distance. See http://www.elasticsearch.org/guide/reference/api/search/sort.html for more information.
Note: Sorting cannot be combined with "scan()".
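The default directions shown in the comments above (descending for _score, ascending for any other field) can be spelled out with a short sketch. The helper normalize_sort() is hypothetical and written only for this illustration; the module and ElasticSearch apply these defaults themselves.

```perl
use strict;
use warnings;

# Hypothetical helper: makes the implied sort direction explicit.
sub normalize_sort {
    my @clauses;
    for my $arg (@_) {
        if ( ref $arg ) {
            # Already an explicit clause such as { timestamp => 'desc' }.
            push @clauses, $arg;
        }
        else {
            # Bare field name: _score sorts descending, others ascending.
            my $order = ( $arg eq '_score' ) ? 'desc' : 'asc';
            push @clauses, { $arg => $order };
        }
    }
    return \@clauses;
}

my $sort = normalize_sort( '_score', 'timestamp', { created => 'desc' } );
for my $clause (@$sort) {
    my ($field) = keys %$clause;
    print "$field => $clause->{$field}\n";
}
```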
from
$new_view = $view->from( 10 );
$from = $view->from;
By default, results are returned from the first result. If you would like to start at a later result (eg for paging), you can set "from".
size
$new_view = $view->size( 100 );
$size = $view->size;
The number of results returned in a single "search()", which defaults to 10.
Note: See "scan()" for a slightly different application of the "size" value.
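For paging, "from" and "size" are usually derived from a page number. The helper page_window() below is a hypothetical convenience function, not part of the module; it assumes pages are numbered from 1 and falls back to the default size of 10.

```perl
use strict;
use warnings;

# Hypothetical helper: translate a 1-based page number into from/size.
sub page_window {
    my ( $page, $size ) = @_;
    $size = 10 unless defined $size;
    die "page must be >= 1\n" if $page < 1;
    return ( from => ( $page - 1 ) * $size, size => $size );
}

my %window = page_window( 3, 20 );
print "from=$window{from} size=$window{size}\n";
```

The two values could then be passed along the lines of $view->from( $window{from} )->size( $window{size} ), using the attributes documented above.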
facets
$new_view = $view->facets(
facet_one => {
terms => {
field => 'field.to.facet',
size => 10
},
facet_filterb => { status => 'active' },
},
facet_two => {....}
);
$new_view = $view->add_facet( facet_three => {...} );
$new_view = $view->remove_facet('facet_three');
\%facets = $view->facets;
\%facet = $view->get_facet('facet_one');
Facets allow you to aggregate data from a query, for instance: most popular terms, number of blog posts per day, average price etc. Facets are calculated from the query generated from "query" and "filter". If you want to filter your query results down further after calculating your facets, you can use "post_filter".
See http://www.elasticsearch.org/guide/reference/api/search/facets/ for an explanation of what facets are available.
highlight
$new_view = $view->highlight(
'field_1',
'field_2' => \%field_2_settings,
'field_3'
);
Specify which fields should be used for highlighted snippets in your search results. You can pass just a list of fields, or fields with their field-specific settings. These values are used to set the fields parameter in "highlighting".
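One way to read the mixed argument list above is that a hashref following a field name supplies that field's settings, while a bare field name gets empty settings. The helper highlight_fields() is a hypothetical sketch of that interpretation and may differ from what the module does internally.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a mixed list into a { field => settings } hash.
sub highlight_fields {
    my @args = @_;
    my %fields;
    while (@args) {
        my $name = shift @args;

        # A hashref directly after a field name holds that field's settings.
        my $settings = {};
        if ( @args && ref( $args[0] ) eq 'HASH' ) {
            $settings = shift @args;
        }
        $fields{$name} = $settings;
    }
    return \%fields;
}

my $fields = highlight_fields(
    'field_1',
    'field_2' => { number_of_fragments => 2 },    # example setting only
    'field_3',
);

for my $name ( sort keys %$fields ) {
    printf "%s has %d setting(s)\n", $name, scalar keys %{ $fields->{$name} };
}
```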
highlighting
$new_view = $view->highlighting(
pre_tags => [ '<em>', '<b>' ],
post_tags => [ '</em>', '</b>' ],
encoder => 'html'
...
);
The "highlighting" attribute is used to pass any highlighting parameters which should be applied to all of the fields set in "highlight" (although you can override these settings for individual fields by passing field settings to "highlight").
See http://www.elasticsearch.org/guide/reference/api/search/highlighting.html for more about how highlighting works, and "highlight" in Elastic::Model::Result for how to retrieve the highlighted snippets.
OTHER ATTRIBUTES
fields
$new_view = $view->fields('title','content');
By default, searches will return the _source field which contains the whole document, allowing Elastic::Model to inflate the original object without having to retrieve the document separately. If you would like to just retrieve a subset of fields, you can specify them in "fields". See http://www.elasticsearch.org/guide/reference/api/search/fields.html.
Note: If you do specify any fields, and you DON'T include '_source', then the _source field won't be returned, and you won't be able to retrieve the original object without requesting it from ElasticSearch in a separate (but automatic) step.
script_fields
$new_view = $view->script_fields(
distance => {
script => q{doc['location'].distance(lat,lon)},
params => { lat => $lat, lon => $lon }
},
$name => \%defn,
...
);
$new_view = $view->add_script_field( $name => \%defn );
$new_view = $view->remove_script_field($name);
\%fields = $view->script_fields;
\%defn = $view->get_script_field($name);
Script fields can be generated using the mvel scripting language. (You can also use Javascript, Python and Java.)
routing
$new_view = $view->routing( 'routing_val' );
$new_view = $view->routing( 'routing_1', 'routing_2' );
Search queries are usually directed at all shards. If you are using routing (eg to store related docs on the same shard) then you can limit the search to just the relevant shard(s). Note: if you are searching on aliases that have routing configured, then specifying a "routing" manually will override those values.
See Elastic::Manual::Scaling for more.
index_boosts
$new_view = $view->index_boosts(
index_1 => 4,
index_2 => 2
);
$new_view = $view->add_index_boost( $index => $boost );
$new_view = $view->remove_index_boost( $index );
\%boosts = $view->index_boosts;
$boost = $view->get_index_boost( $index );
Make results from one index more relevant than those from another index.
min_score
$new_view = $view->min_score( 2 );
$min_score = $view->min_score;
Exclude results whose score (relevance) is less than the specified number.
preference
$new_view = $view->preference( '_local' );
Control which node should return search results. See http://www.elasticsearch.org/guide/reference/api/search/preference.html for more.
timeout
$new_view = $view->timeout( 10 ); # 10 ms
$new_view = $view->timeout( '10s' ); # 10 sec
$timeout = $view->timeout;
Sets an upper limit on the time to wait for search results. When the limit is reached, the search returns whatever results it has managed to receive up until that point.
track_scores
$new_view = $view->track_scores( 1 );
$track = $view->track_scores;
By default, if you sort on a field other than _score, ElasticSearch does not return the calculated relevance score for each doc. If "track_scores" is true, these scores will be returned regardless.
DEBUGGING ATTRIBUTES
explain
$new_view = $view->explain( 1 );
$explain = $view->explain;
Set "explain" to true to return debugging information explaining how each document's score was calculated. See "explain" in Elastic::Model::Result to view the output.
stats
$new_view = $view->stats( 'group_1', 'group_2' );
\@groups = $view->stats;
The statistics for each search can be aggregated by group. These stats can later be retrieved using "index_stats()" in ElasticSearch.
search_builder
$new_view = $view->search_builder( $search_builder );
$builder = $view->search_builder;
If you would like to use a different search builder than the default ElasticSearch::SearchBuilder for "queryb", "filterb" or "post_filterb", then you can set a value for "search_builder".
DELETE ATTRIBUTES
These parameters are only used with "delete()".
consistency
$new_view = $view->consistency( 'quorum' | 'all' | 'one' );
$consistency = $view->consistency;
At least one, all or a quorum (default) of nodes must be present for the delete to take place.
replication
$new_view = $view->replication( 'sync' | 'async' );
$replication = $view->replication;
Should a delete be done synchronously (ie it waits until all nodes within the replication group have run the delete) or asynchronously (ie it returns immediately, and performs the delete in the background)?
TODO
Possibly support partial fields
AUTHOR
Clinton Gormley <drtech@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2012 by Clinton Gormley.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.