NAME
ElasticSearch::Transport - Base class for communicating with ElasticSearch
DESCRIPTION
ElasticSearch::Transport is a base class for the modules which communicate with the ElasticSearch server.
It handles failover to the next node in case the current node closes the connection.
All requests are round-robin'ed to all live servers as returned by /_cluster/nodes, except we shuffle the server list when we retrieve it, and thus avoid having all our instances make their first request to the same server.
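The shuffle-then-rotate idea can be sketched in a few lines of plain Perl. This is only an illustration, not the module's own code: make_round_robin is a hypothetical helper name, and the addresses are sample values.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper (not part of ElasticSearch::Transport): returns a
# closure that hands out servers in turn.
sub make_round_robin {
    my @servers = shuffle @_;    # shuffle once, when the list is retrieved
    die "no servers given\n" unless @servers;
    my $i = 0;
    return sub {
        my $server = $servers[ $i % @servers ];
        $i++;
        return $server;
    };
}

my $next_server = make_round_robin(
    '192.168.1.1:9200', '192.168.1.2:9200', '192.168.1.3:9200',
);

# Six requests visit every server twice, in the same shuffled order.
my @picked = map { $next_server->() } 1 .. 6;
print "$_\n" for @picked;
```

Because each process shuffles independently, two instances started at the same moment will usually begin with different servers.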
On the first request, and every max_requests requests after that (default 10,000), the list of live nodes is automatically refreshed. This can be disabled by setting max_requests to 0.
Regardless of the max_requests setting, a list of live nodes will still be retrieved on the first request. This may not be desirable behaviour if, for instance, you are connecting to remote servers which use internal IP addresses, or which don't allow remote nodes() requests.
If you want to disable this behaviour completely, set no_refresh to 1, in which case the transport module will round-robin through the servers list only. Failed nodes will be removed from the list (but added back every max_requests requests, or when all nodes have failed).
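The handling of failed nodes can be pictured with a small stand-alone sketch. My::ServerPool and its methods are hypothetical names chosen for illustration and are not the module's internals; the periodic max_requests refresh is left out to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical pool class, for illustration only.
package My::ServerPool;

sub new {
    my ( $class, @seed ) = @_;
    return bless { seed => [@seed], live => [@seed] }, $class;
}

# Hand out the next live server, rotating the list.
sub next_server {
    my $self = shift;

    # When every node has failed, start again from the seed list.
    $self->{live} = [ @{ $self->{seed} } ] unless @{ $self->{live} };

    my $server = shift @{ $self->{live} };
    push @{ $self->{live} }, $server;
    return $server;
}

# Drop a node that closed the connection.
sub mark_failed {
    my ( $self, $server ) = @_;
    $self->{live} = [ grep { $_ ne $server } @{ $self->{live} } ];
    return;
}

package main;

my $pool = My::ServerPool->new( 'a:9200', 'b:9200' );

$pool->mark_failed('a:9200');
my @after_one_failure = map { $pool->next_server } 1 .. 2;    # only b:9200 is left

$pool->mark_failed('b:9200');
my $after_all_failed = $pool->next_server;    # seed list is restored first
```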
Currently, the available backends are:
http (default)
Uses LWP to communicate using HTTP. See ElasticSearch::Transport::HTTP
httplite
Uses HTTP::Lite to communicate using HTTP. See ElasticSearch::Transport::HTTPLite
httptiny
Uses HTTP::Tiny to communicate using HTTP. See ElasticSearch::Transport::HTTPTiny
curl
Uses WWW::Curl and thus libcurl to communicate using HTTP. See ElasticSearch::Transport::Curl
aehttp
Uses AnyEvent::HTTP to communicate asynchronously using HTTP. See ElasticSearch::Transport::AEHTTP
aecurl
Uses AnyEvent::Curl::Multi (and thus libcurl) to communicate asynchronously using HTTP. See ElasticSearch::Transport::AECurl
thrift
Uses thrift to communicate using a compact binary protocol over sockets. See ElasticSearch::Transport::Thrift. You need to have the transport-thrift plugin installed on your ElasticSearch server for this to work.
You shouldn't need to talk to the transport modules directly - everything happens via the main ElasticSearch class.
SYNOPSIS
use ElasticSearch;
my $e = ElasticSearch->new(
    servers    => 'search.foo.com:9200',
    transport  => 'httplite',
    timeout    => '10',
    no_refresh => 0 | 1,
    deflate    => 0 | 1,
);
my $t = $e->transport;
$t->max_requests(5);            # refresh_servers every 5 requests
$t->protocol;                   # eg 'http'
$t->next_server;                # next node to use
$t->current_server;             # eg '127.0.0.1:9200' ie last used node
$t->default_servers;            # seed servers passed in to new()
$t->servers;                    # eg ['192.168.1.1:9200','192.168.1.2:9200']
$t->servers(@servers);          # set new 'live' list
$t->refresh_servers;            # refresh list of live nodes
$t->clear_clients;              # clear all open clients
$t->no_refresh(0|1);            # don't retrieve the live node list
                                # instead, use just the nodes specified
$t->deflate(0|1);               # should ES deflate its responses
                                # useful if ES is on a remote network.
                                # ES needs compression enabled with
                                # http.compression: true
$t->register('foo',$class);     # register new Transport backend
WHICH TRANSPORT SHOULD YOU USE
Although the thrift interface has the right buzzwords (binary, compact, sockets), the generated Perl code is very slow. Until that is improved, I recommend one of the http backends instead.
The HTTP backends in increasing order of speed are:
http - LWP based
httplite - HTTP::Lite based, about 30% faster than http
httptiny - HTTP::Tiny based, about 1% faster than httplite
curl - WWW::Curl based, about 60% faster than httptiny!
See also: http://www.elasticsearch.org/guide/reference/modules/http.html and http://www.elasticsearch.org/guide/reference/modules/thrift.html
SUBCLASSING TRANSPORT
If you want to add a new transport backend, then these are the methods that you should subclass:
init()
$t->init($params)
By default, a no-op. Receives a HASH ref with the parameters passed in to new(), less servers, transport and timeout.
Any parameters specific to your module should be deleted from $params.
send_request()
$json = $t->send_request($server,$params)
where $params = {
    method => 'GET',
    cmd    => '/_cluster',
    qs     => { pretty => 1 },
    data   => '{ "foo": "bar"}',
}
This must be overridden in the subclass - it is the method called to actually talk to the server.
See ElasticSearch::Transport::HTTP for an example implementation.
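As a rough illustration of what an HTTP-style send_request() has to do with $params, the sketch below turns the server and the parameters into a method, URL and body. build_http_request is a hypothetical helper, not part of this distribution, and the mapping it shows (cmd as path, qs as query string, data as body) is an assumption about how a backend might use these fields. HTTP::Tiny is used only for its www_form_urlencode method.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: map ($server, $params) onto the pieces of an
# HTTP request. A real backend would then send these with its client.
sub build_http_request {
    my ( $server, $params ) = @_;

    my $url = "http://$server$params->{cmd}";

    # Encode the query string hash, if any (keys come out sorted).
    my $qs = HTTP::Tiny->new->www_form_urlencode( $params->{qs} || {} );
    $url .= "?$qs" if length $qs;

    return ( $params->{method} || 'GET', $url, $params->{data} );
}

my ( $method, $url, $body ) = build_http_request(
    '127.0.0.1:9200',
    {   method => 'GET',
        cmd    => '/_cluster',
        qs     => { pretty => 1 },
        data   => '{ "foo": "bar"}',
    },
);
print "$method $url\n";
```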
protocol()
$t->protocol
This must return the protocol in use, eg "http" or "thrift". It is used to extract the list of bound addresses from ElasticSearch, eg http_address or thrift_address.
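To show why the protocol name matters, here is a sketch of picking the "<protocol>_address" field out of a nodes response. live_servers is a hypothetical helper, and both the response shape and the "inet[/host:port]" address format are assumptions made for illustration; check what your ElasticSearch version actually returns.

```perl
use strict;
use warnings;

# Hypothetical helper: collect "host:port" from each node's
# "<protocol>_address" field, skipping nodes that do not expose it.
sub live_servers {
    my ( $protocol, $nodes ) = @_;
    my $key = $protocol . '_address';

    my @servers;
    for my $node ( values %{ $nodes->{nodes} } ) {
        my $address = $node->{$key};
        next unless defined $address;

        # Assumed format: inet[/192.168.1.1:9200]
        push @servers, $1 if $address =~ m{/([^\]]+)\]};
    }
    return sort @servers;
}

# Sample data only; the structure is an assumption.
my $response = {
    nodes => {
        node_1 => {
            http_address   => 'inet[/192.168.1.1:9200]',
            thrift_address => 'inet[/192.168.1.1:9500]',
        },
        node_2 => { http_address => 'inet[/192.168.1.2:9200]' },
    },
};

my @http   = live_servers( 'http',   $response );
my @thrift = live_servers( 'thrift', $response );
```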
client()
$client = $t->client($server)
Returns the client object used in "send_request()". The server param will look like "192.168.5.1:9200". Your implementation should store its clients in a PID specific slot in $t->{_client}, as clear_clients() deletes this key.
See "client()" in ElasticSearch::Transport::HTTP and "client()" in ElasticSearch::Transport::Thrift for an example implementation.
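The PID specific caching described for client() might look something like the following. My::Transport::Sketch is a hypothetical stand-alone class so that the example runs without the distribution installed; a real backend would subclass ElasticSearch::Transport and build an actual HTTP or Thrift client where _build_client (also a made-up name) returns a placeholder. Keying on the PID presumably keeps a forked child from reusing its parent's connection.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only.
package My::Transport::Sketch;

sub new { return bless {}, shift }

# Cache one client per process ($$) and per server.
sub client {
    my ( $self, $server ) = @_;
    return $self->{_client}{$$}{$server} ||= $self->_build_client($server);
}

# Placeholder: a real backend would construct its client object here.
sub _build_client {
    my ( $self, $server ) = @_;
    return { server => $server, pid => $$ };
}

# Follows the documented behaviour: clear_clients() deletes the _client key.
sub clear_clients {
    my $self = shift;
    delete $self->{_client};
    return;
}

package main;

my $t      = My::Transport::Sketch->new;
my $first  = $t->client('192.168.5.1:9200');
my $second = $t->client('192.168.5.1:9200');    # same cached object
$t->clear_clients;
my $third = $t->client('192.168.5.1:9200');     # rebuilt after clearing
```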
Registering your Transport backend
You can register your Transport backend as follows:
BEGIN {
    ElasticSearch::Transport->register('mytransport',__PACKAGE__);
}
SEE ALSO
LICENSE AND COPYRIGHT
Copyright 2010 - 2011 Clinton Gormley.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.