NAME

Search::Elasticsearch::Client::0_90::Direct - Thin client with full support for Elasticsearch APIs

VERSION

version 1.12

SYNOPSIS

Create a client:

use Search::Elasticsearch;
my $e = Search::Elasticsearch->new(
    client => '0_90::Direct'          # for the 0.90 branch
);

Index a doc:

$e->index(
    index   => 'my_index',
    type    => 'blog_post',
    id      => 123,
    body    => {
        title   => "Elasticsearch clients",
        content => "Interesting content...",
        date    => "2013-09-23"
    }
);

Get a doc:

$e->get(
    index   => 'my_index',
    type    => 'blog_post',
    id      => 123
);

Search for docs:

$results = $e->search(
    index   => 'my_index',
    body    => {
        query => {
            match => {
                title => "elasticsearch"
            }
        }
    }
);

Index-level requests:

$e->indices->create( index => 'my_index' );
$e->indices->delete( index => 'my_index' );

Cluster-level requests:

$state = $e->cluster->state;
$stats = $e->cluster->node_stats;

DESCRIPTION

The Search::Elasticsearch::Client::0_90::Direct class provides the client for the 0.90 branch of Elasticsearch. It is returned by:

$e = Search::Elasticsearch->new( client => '0_90::Direct' );

It is intended to be as close as possible to the native REST API that Elasticsearch uses, so that it is easy to translate the Elasticsearch reference documentation for an API to the equivalent in this client.

This class provides the methods for document CRUD, bulk document CRUD and search. It also provides access to clients for managing indices and the cluster.

ELASTICSEARCH VERSION

This module is for use with the 0.90 branch of Elasticsearch and should be used as follows:

$es = Search::Elasticsearch->new(
    client => '0_90::Direct'
);

See Search::Elasticsearch::Client::Direct for the default client.

CONVENTIONS

Parameter passing

Parameters can be passed to any request method as a list or as a hash reference. The following two statements are equivalent:

$e->search( size => 10 );
$e->search({size => 10});

Path parameters

Any values that should be included in the URL path, e.g. /{index}/{type}, should be passed as top-level parameters:

$e->search( index => 'my_index', type => 'my_type' );

Alternatively, you can specify a path parameter directly:

$e->search( path => '/my_index/my_type' );

Query-string parameters

Any values that should be included in the query string should be passed as top level parameters:

$e->search( size => 10 );

If you pass in a \%params hash, then it will be included in the query string parameters without any error checking. The following:

$e->search( size => 10, params => { from => 5, size => 5 });

would result in this query string, with the top-level size taking precedence over the value in params:

?from=5&size=10
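
The precedence in that example can be pictured as a plain hash merge in which the top-level values are applied last. This is only a sketch of the documented outcome, not the client's internal code; merge_query_string is a hypothetical helper invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: merge the raw params hash with the top-level
# query-string parameters. Later keys override earlier ones, so the
# top-level values win.
sub merge_query_string {
    my ( $raw_params, %top_level ) = @_;
    my %qs = ( %{ $raw_params || {} }, %top_level );
    return join '&', map { $_ . '=' . $qs{$_} } sort keys %qs;
}

print '?', merge_query_string( { from => 5, size => 5 }, size => 10 ), "\n";
# prints "?from=5&size=10"
```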

Body parameter

The request body should be passed in the body key:

$e->search(
    body => {
        query => {...}
    }
);

The body can also be a UTF8-decoded string, which will be converted into UTF-8 bytes and passed as is:

$e->indices->analyze( body => "The quick brown fox");

Ignore parameter

Normally, any HTTP status code outside the 200-299 range will result in an error being thrown. To suppress these errors, you can specify which status codes to ignore in the ignore parameter.

$e->indices->delete(
    index  => 'my_index',
    ignore => 404
);

This is most useful for Missing errors, which are triggered by a 404 status code when some requested resource does not exist.

Multiple error codes can be specified with an array:

$e->indices->delete(
    index  => 'my_index',
    ignore => [404,409]
);
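
One place where ignore tends to be handy is index setup that should succeed whether or not the index already exists. The following is a minimal sketch: recreate_index is a hypothetical helper, and $e is assumed to be a client created as shown in the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical helper: drop an index if it exists, then create it afresh.
# $e is assumed to come from
#   Search::Elasticsearch->new( client => '0_90::Direct' )
sub recreate_index {
    my ( $e, $index ) = @_;

    # A 404 here only means the index was not there to begin with.
    # Any other status outside 200-299 still throws an error.
    $e->indices->delete( index => $index, ignore => 404 );

    return $e->indices->create( index => $index );
}

# Usage, given a running cluster:
#   recreate_index( $e, 'my_index' );
```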

CONFIGURATION

bulk_helper_class

The class to use for the "bulk_helper()" method. Defaults to Search::Elasticsearch::Bulk.

scroll_helper_class

The class to use for the "scroll_helper()" method. Defaults to Search::Elasticsearch::Scroll.

GENERAL METHODS

info()

$info = $e->info;

Returns information about the version of Elasticsearch that the responding node is running.

ping()

$e->ping

Pings a node in the cluster and returns 1 if it receives a 200 response, otherwise it throws an error.
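
Because ping() throws rather than returning false, a health check that wants a plain boolean has to trap the error. A minimal sketch, with node_is_up as a hypothetical wrapper name:

```perl
use strict;
use warnings;

# Hypothetical wrapper: turn ping()'s "1 or exception" behaviour into a
# simple 1/0 result. $e is assumed to be a Search::Elasticsearch client.
sub node_is_up {
    my ($e) = @_;
    my $ok = eval { $e->ping };    # undef if ping() threw
    return $ok ? 1 : 0;
}

# Usage, given a client in $e:
#   print "cluster reachable\n" if node_is_up($e);
```

Note that this discards the reason for the failure; keep the error from $@ if you need to log it.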

indices()

$indices_client = $e->indices;

Returns a Search::Elasticsearch::Client::0_90::Direct::Indices object which can be used for managing indices, e.g. creating and deleting indices, managing mappings, index settings etc.

cluster()

$cluster_client = $e->cluster;

Returns a Search::Elasticsearch::Client::0_90::Direct::Cluster object which can be used for managing the cluster, e.g. cluster-wide settings, cluster health, node information and stats.

DOCUMENT CRUD METHODS

These methods allow you to perform create, index, update and delete requests for single documents:

index()

$response = $e->index(
    index   => 'index_name',        # required
    type    => 'type_name',         # required
    id      => 'doc_id',            # optional, otherwise auto-generated

    body    => { document }         # required
);

The index() method is used to index a new document or to reindex an existing document.

Query string parameters: consistency, op_type, parent, percolate, refresh, replication, routing, timeout, timestamp, ttl, version, version_type

See the index docs for more information.

create()

$response = $e->create(
    index   => 'index_name',        # required
    type    => 'type_name',         # required
    id      => 'doc_id',            # optional, otherwise auto-generated

    body    => { document }         # required
);

The create() method works exactly like the "index()" method, except that it will throw a Conflict error if a document with the same index, type and id already exists.

Query string parameters: consistency, op_type, parent, percolate, refresh, replication, routing, timeout, timestamp, ttl, version, version_type

See the create docs for more information.

get()

$response = $e->get(
    index   => 'index_name',        # required
    type    => 'type_name',         # required
    id      => 'doc_id',            # required
);

The get() method will retrieve the document with the specified index, type and id, or will throw a Missing error.

Query string parameters: _source, _source_exclude, _source_include, fields, parent, preference, realtime, refresh, routing

See the get docs for more information.
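
When a missing document is an expected outcome rather than a failure, the Missing error can be trapped and turned into undef. The sketch below assumes that Missing errors are objects of the class Search::Elasticsearch::Error::Missing; get_or_undef is a hypothetical helper name.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: fetch a document, returning undef when it does
# not exist. Assumes Missing errors are objects of the class
# Search::Elasticsearch::Error::Missing.
sub get_or_undef {
    my ( $e, %args ) = @_;

    my $doc;
    return $doc if eval { $doc = $e->get(%args); 1 };

    my $err = $@;
    return
        if blessed($err)
        && $err->isa('Search::Elasticsearch::Error::Missing');

    die $err;    # rethrow anything that is not a Missing error
}

# Usage, given a client in $e:
#   my $doc = get_or_undef( $e, index => 'my_index', type => 'my_type', id => 123 );
```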

get_source()

$response = $e->get_source(
    index   => 'index_name',        # required
    type    => 'type_name',         # required
    id      => 'doc_id',            # required
);

The get_source() method works just like the "get()" method, except that it returns only the _source field (the value of the body parameter in the "index()" method) rather than the _source field plus the document metadata, i.e. the _index, _type etc.

Query string parameters: _source_exclude, _source_include, parent, preference, realtime, refresh, routing

See the get_source docs for more information.

exists()

$response = $e->exists(
    index   => 'index_name',        # required
    type    => 'type_name',         # required
    id      => 'doc_id',            # required
);

The exists() method returns 1 if a document with the specified index, type and id exists, or an empty string if it doesn't.

Query string parameters: parent, preference, realtime, refresh, routing

See the exists docs for more information.

delete()

$response = $e->delete(
    index   => 'index_name',        # required
    type    => 'type_name',         # required
    id      => 'doc_id',            # required
);

The delete() method will delete the document with the specified index, type and id, or will throw a Missing error.

Query string parameters: consistency, parent, refresh, replication, routing, timeout, version, version_type

See the delete docs for more information.

update()

$response = $e->update(
    index   => 'index_name',        # required
    type    => 'type_name',         # required
    id      => 'doc_id',            # required

    body    => { update }           # required
);

The update() method updates a document with the corresponding index, type and id if it exists. Updates can be performed either by:

  • providing a partial document to be merged in to the existing document:

    $response = $e->update(
        ...,
        body => {
            doc => { new_field => 'new_value'},
        }
    );

  • or with a script:

    $response = $e->update(
        ...,
        body => {
            script => "ctx._source.counter += incr",
            params => { incr => 5 }
        }
    );

Query string parameters: consistency, fields, lang, parent, percolate, realtime, refresh, replication, retry_on_conflict, routing, script, timeout, timestamp, ttl, version, version_type

See the update docs for more information.
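
As an illustration of combining a scripted update with one of the query string parameters listed above, the following sketch increments a counter and sets retry_on_conflict so that a version conflict is retried on the server. increment_counter and its by argument are hypothetical names; the script and params keys are taken from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: add $args{by} (default 1) to the "counter" field
# of a document. $e is assumed to be a Search::Elasticsearch client.
sub increment_counter {
    my ( $e, %args ) = @_;
    my $by = defined $args{by} ? $args{by} : 1;

    return $e->update(
        index             => $args{index},
        type              => $args{type},
        id                => $args{id},
        retry_on_conflict => 3,             # query string parameter
        body              => {
            script => 'ctx._source.counter += incr',
            params => { incr => $by },
        },
    );
}

# Usage, given a client in $e:
#   increment_counter( $e, index => 'my_index', type => 'my_type', id => 123, by => 5 );
```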

BULK DOCUMENT CRUD METHODS

The bulk document CRUD methods are used for running multiple CRUD actions within a single request. By reducing the number of network requests that need to be made, bulk requests greatly improve performance.

bulk()

$response = $e->bulk(
    index   => 'index_name',        # required if type specified
    type    => 'type_name',         # optional

    body    => [ actions ]          # required
);

See Search::Elasticsearch::Bulk and "bulk_helper()" for a helper module that makes bulk indexing simpler to use.

The bulk() method can perform multiple "index()", "create()", "delete()" or "update()" actions with a single request. The body parameter expects an array containing the list of actions to perform.

An action consists of an initial metadata hash ref containing the action type, plus the associated metadata, e.g.:

{ delete => { _index => 'index', _type => 'type', _id => 123 }}

The index and create actions then expect a hashref containing the document itself:

{ create => { _index => 'index', _type => 'type', _id => 123 }},
{ title => "A newly created document" }

And the update action expects a hashref containing the update commands, eg:

{ update => { _index => 'index', _type => 'type', _id => 123 }},
{ script => "ctx._source.counter+=1" }

Each action can include the same parameters that you would pass to the equivalent "index()", "create()", "delete()" or "update()" request, except that _index, _type and _id must be specified with the leading underscore. All other parameters can be specified with or without the underscore.

For instance:

$response = $e->bulk(
    index   => 'index_name',        # default index name
    type    => 'type_name',         # default type name
    body    => [

        # create action
        { create => {
            _index => 'not_the_default_index',
            _type  => 'not_the_default_type',
            _id    => 123
        }},
        { title => 'Foo' },

        # index action
        { index => { _id => 124 }},
        { title => 'Foo' },

        # delete action
        { delete => { _id => 125 }},

        # update action
        { update => { _id => 126 }},
        { script => "ctx._source.counter+=1" }
    ]
);

Each action is performed separately. One failed action will not cause the others to fail as well.

Query string parameters: consistency, refresh, replication, timeout, type

See the bulk docs for more information.
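
Since one failed action does not fail the whole request, the response has to be inspected item by item. The sketch below assumes that the response contains an items array of one-key hashes (keyed by action type) and that a failed item carries an error key; failed_bulk_items is a hypothetical helper, and the exact response layout should be checked against your Elasticsearch version.

```perl
use strict;
use warnings;

# Hypothetical helper: return the items of a bulk response that failed.
# Assumes each entry in $response->{items} looks like
#   { index => { _id => ..., error => '...' } }
# with the "error" key present only on failure.
sub failed_bulk_items {
    my ($response) = @_;
    my @failed;
    for my $item ( @{ $response->{items} || [] } ) {
        my ( $action, $result ) = %$item;    # one-key hash
        push @failed, { action => $action, %$result }
            if $result->{error};
    }
    return @failed;
}

# Usage, given a client in $e and a list of actions in @actions:
#   my @failed = failed_bulk_items( $e->bulk( body => \@actions ) );
#   warn "$_->{action} of $_->{_id} failed: $_->{error}\n" for @failed;
```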

bulk_helper()

$bulk_helper = $e->bulk_helper( @args );

Returns a new instance of the class specified in the "bulk_helper_class", which defaults to Search::Elasticsearch::Bulk.

mget()

$results = $e->mget(
    index   => 'default_index',     # optional, required when type specified
    type    => 'default_type',      # optional

    body    => { docs or ids }      # required
);

The mget() method will retrieve multiple documents with a single request. The body contains a docs array listing the documents to retrieve:

$results = $e->mget(
    index   => 'default_index',
    type    => 'default_type',
    body    => {
        docs => [
            { _id => 1},
            { _id => 2, _type => 'not_the_default_type' }
        ]
    }
);

You can also pass any of the other parameters that the "get()" request accepts.

If you have specified an index and type, you can just include the ids of the documents to retrieve:

$results = $e->mget(
    index   => 'default_index',
    type    => 'default_type',
    body    => {
        ids => [ 1, 2, 3]
    }
);

Query string parameters: _source, _source_exclude, _source_include, fields, preference, realtime, refresh

See the mget docs for more information.
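
When the documents to fetch are spread across types, building the docs array by hand gets repetitive. A small sketch of one way to do it, where mget_body is a hypothetical helper that accepts either a bare id (default type) or an [ id, type ] pair:

```perl
use strict;
use warnings;

# Hypothetical helper: build an mget body from a list of ids. Each
# element is either a plain id, which uses the default type, or an
# array ref of [ id, type ].
sub mget_body {
    my (@refs) = @_;
    my @docs = map {
        my ( $id, $type ) = ref($_) ? @$_ : ($_);
        defined $type ? { _id => $id, _type => $type } : { _id => $id };
    } @refs;
    return { docs => \@docs };
}

# Usage, given a client in $e:
#   $results = $e->mget(
#       index => 'default_index',
#       type  => 'default_type',
#       body  => mget_body( 1, [ 2, 'not_the_default_type' ] ),
#   );
```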

delete_by_query()

$result = $e->delete_by_query(
    index => 'index' | \@indices,   # optional
    type  => 'type'  | \@types,     # optional

    body  => { query }              # required

);

The delete_by_query() method deletes all documents which match the query. For instance, to delete all documents from 2012:

$result = $e->delete_by_query(
    body  => {
        query => {
            range => {
                date => {
                    gte => '2012-01-01',
                    lt  => '2013-01-01'
                }
            }
        }
    }
);

Query string parameters: allow_no_indices, analyzer, consistency, default_operator, df, expand_wildcards, ignore_indices, ignore_unavailable, q, replication, routing, source, timeout

See the delete_by_query docs for more information.

SEARCH METHODS

The search methods are used for querying documents in one, several or all indices, and of one, several or all types:

search()

$results = $e->search(
    index   => 'index' | \@indices,     # optional
    type    => 'type'  | \@types,       # optional

    body    => { search params }        # optional
);

The search() method searches for matching documents in one or more indices. It is just as easy to search a single index as it is to search all the indices in your cluster. It can also return facets (aggregations on particular fields), highlighted snippets and did-you-mean or search-as-you-type suggestions.

The lite version of search allows you to specify a query string in the q parameter, using the Lucene query string syntax:

$results = $e->search( q => 'title:(elasticsearch clients)');

However, the preferred way to search is by using the Query DSL to create a query, and passing that query in the request body:

$results = $e->search(
    body => {
        query => {
            match => { title => 'Elasticsearch clients'}
        }
    }
);

Query string parameters: _source, _source_exclude, _source_include, allow_no_indices, analyze_wildcard, analyzer, default_operator, df, expand_wildcards, explain, fields, from, ignore_indices, ignore_unavailable, lenient, lowercase_expanded_terms, preference, q, routing, scroll, search_type, size, sort, source, stats, suggest_field, suggest_mode, suggest_size, suggest_text, timeout, version

See the search reference for more information.

Also see "send_get_body_as" in Search::Elasticsearch::Transport.
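
The value returned by search() is the decoded response, so the matching documents can be read from its hits structure. A minimal sketch, assuming the usual Elasticsearch response layout ( hits.hits, each with _id and _source ) and documents with a title field as in the examples above; hit_titles is a hypothetical helper:

```perl
use strict;
use warnings;

# Hypothetical helper: return one "id: title" string per hit.
# Assumes $results->{hits}{hits} is an array of hits, each carrying
# _id and _source, and that documents have a "title" field.
sub hit_titles {
    my ($results) = @_;
    my @lines;
    for my $hit ( @{ $results->{hits}{hits} || [] } ) {
        push @lines, $hit->{_id} . ': ' . $hit->{_source}{title};
    }
    return @lines;
}

# Usage, given a client in $e:
#   my $results = $e->search(
#       index => 'my_index',
#       body  => { query => { match => { title => 'elasticsearch' } } },
#   );
#   print "$_\n" for hit_titles($results);
```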

count()

$results = $e->count(
    index   => 'index' | \@indices,     # optional
    type    => 'type'  | \@types,       # optional

    body    => { query }                # optional
);

The count() method returns the total count of all documents matching the query:

$results = $e->count(
    body => {
        query => {
            match => { title => 'Elasticsearch clients' }
        }
    }
);

Query string parameters: allow_no_indices, expand_wildcards, ignore_indices, ignore_unavailable, min_score, preference, routing, source

See the count docs for more information.

scroll()

$results = $e->scroll(
    scroll      => '1m',
    scroll_id   => $id
);

When a "search()" has been performed with the scroll parameter, the scroll() method allows you to keep pulling more results until the results are exhausted.

NOTE: you will almost always want to set the search_type to scan in your original search() request.

See "scroll_helper()" and Search::Elasticsearch::Scroll for a helper utility which makes managing scroll requests much easier.

Query string parameters: scroll, scroll_id

See the scroll docs and the search_type docs for more information.
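
The search-then-scroll cycle described above can be written as a simple loop. This is a sketch rather than a replacement for the scroll helper: scan_all is a hypothetical function, and it assumes that each response carries a _scroll_id key and that the initial scan request returns no hits of its own.

```perl
use strict;
use warnings;

# Hypothetical helper: run a scan search and pass every hit to
# $callback. Assumes responses contain _scroll_id and hits.hits, and
# that the first (scan) response has no hits.
sub scan_all {
    my ( $e, $callback, %search ) = @_;

    my $results = $e->search(
        %search,
        search_type => 'scan',
        scroll      => '1m',
    );
    my $scroll_id = $results->{_scroll_id};

    while (1) {
        $results = $e->scroll( scroll => '1m', scroll_id => $scroll_id );
        my @hits = @{ $results->{hits}{hits} || [] };
        last unless @hits;    # results exhausted

        $callback->($_) for @hits;
        $scroll_id = $results->{_scroll_id};
    }
    return;
}

# Usage, given a client in $e:
#   scan_all( $e, sub { print $_[0]{_id}, "\n" }, index => 'my_index' );
```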

clear_scroll()

$response = $e->clear_scroll(
    scroll_id => $id | \@ids    # required
);

The clear_scroll() method can clear unfinished scroll requests, freeing up resources on the server.

scroll_helper()

$scroll_helper = $e->scroll_helper( @args );

Returns a new instance of the class specified in the "scroll_helper_class", which defaults to Search::Elasticsearch::Scroll.

msearch()

$results = $e->msearch(
    index   => 'default_index' | \@indices,     # optional
    type    => 'default_type'  | \@types,       # optional

    body    => [ searches ]                     # required
);

The msearch() method allows you to perform multiple searches in a single request. Similar to the "bulk()" request, each search request in the body consists of two hashes: the metadata hash then the search request hash (the same data that you'd specify in the body of a "search()" request). For instance:

$results = $e->msearch(
    index   => 'default_index',
    type    => ['default_type_1', 'default_type_2'],
    body => [
        # uses defaults
        {},
        { query => { match_all => {} }},

        # uses a custom index
        { index => 'not_the_default_index' },
        { query => { match_all => {} }}
    ]
);

Query string parameters: search_type

See the msearch docs for more information.

explain()

$response = $e->explain(
    index   => 'my_index',  # required
    type    => 'my_type',   # required
    id      => 123,         # required

    body    => { search }   # required
);

The explain() method explains why the specified document did or did not match a query, and how the relevance score was calculated. For instance:

$response = $e->explain(
    index   => 'my_index',
    type    => 'my_type',
    id      => 123,
    body    => {
        query => {
            match => { title => 'Elasticsearch clients' }
        }
    }
);

Query string parameters: _source, _source_exclude, _source_include, analyze_wildcard, analyzer, default_operator, df, fields, lenient, lowercase_expanded_terms, parent, preference, q, routing, source

See the explain docs for more information.

percolate()

$results = $e->percolate(
    index   => 'my_index',      # required
    type    => 'my_type',       # required

    body    => { percolation }  # required
);

Percolation is search inverted: instead of finding docs which match a particular query, it finds queries which match a particular document, e.g. for alert-me-when functionality.

The percolate() method runs a percolation request to find the queries matching a particular document. In the body you should pass the _source field of the document under the doc key:

$results = $e->percolate(
    index   => 'my_index',
    type    => 'my_type',
    body    => {
        doc => {
            title => 'Elasticsearch rocks'
        }
    }
);

Query string parameters: prefer_local

See the percolate docs for more information.

suggest()

$results = $e->suggest(
    index   => 'index' | \@indices,     # optional
    type    => 'type'  | \@types,       # optional

    body    => { suggest request }      # required
);

The suggest() method is used to run did-you-mean or search-as-you-type suggestion requests, which can also be run as part of a "search()" request.

$results = $e->suggest(
    index   => 'my_index',
    type    => 'my_type',
    body    => {
        my_suggestions => {
            phrase  => {
                text    => 'johnny walker',
                field   => 'title'
            }
        }
    }
);

Query string parameters: allow_no_indices, expand_wildcards, ignore_indices, ignore_unavailable, preference, routing, source

mlt()

$results = $e->mlt(
    index   => 'my_index',  # required
    type    => 'my_type',   # required
    id      => 123,         # required

    body    => { search }   # optional
);

The mlt() method runs a more-like-this query to find other documents which are similar to the specified document.

Query string parameters: boost_terms, max_doc_freq, max_query_terms, max_word_len, min_doc_freq, min_term_freq, min_word_len, mlt_fields, percent_terms_to_match, routing, search_from, search_indices, search_query_hint, search_scroll, search_size, search_source, search_type, search_types, stop_words

See the mlt docs for more information.

AUTHOR

Clinton Gormley <drtech@cpan.org>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2014 by Elasticsearch BV.

This is free software, licensed under:

The Apache License, Version 2.0, January 2004