NAME
Elastic::Model::Index - Create and administer indices in Elasticsearch
VERSION
version 0.29_2
SYNOPSIS
$index = $model->namespace('myapp')->index;
$index = $model->namespace('myapp')->index('index_name');
$index->create( settings => \%settings );
$index->reindex( 'old_index' );
See also "SYNOPSIS" in Elastic::Model::Role::Index.
DESCRIPTION
Elastic::Model::Index objects are used to create and administer indices in an Elasticsearch cluster.
See Elastic::Model::Role::Index for more about usage. See Elastic::Manual::Scaling for more about how indices can be used in your application.
METHODS
create()
$index = $index->create();
$index = $index->create( settings => \%settings, types => \@types );
Creates an index called name (which defaults to $namespace->name).
The type mapping is automatically generated from the attributes of your doc classes listed in the namespace. Similarly, any custom analyzers required by your classes are added to the index \%settings that you pass in:
$index->create( settings => {number_of_shards => 1} );
To create an index with a subset of the types known to the namespace, pass in a list of @types:
$index->create( types => ['user','post' ]);
reindex()
# reindex $domain_name to $index->name
$index->reindex( $domain_name );
# more options
$index->reindex(
$domain,
repoint_uids => 1,
size => 1000,
bulk_size => 1000,
scan => '2m',
quiet => 0,
transform => sub {...},
on_conflict => sub {...} | 'IGNORE'
on_error => sub {...} | 'IGNORE'
uid_on_conflict => sub {...} | 'IGNORE'
uid_on_error => sub {...} | 'IGNORE'
);
While you can add to the mapping of an index, you can't change what is already there. Especially during development, you will need to reindex your data to a new index.
"reindex()" reindexes your data from domain $domain_name into an index called $index->name. The new index is created if it doesn't already exist.
See Elastic::Manual::Reindex for more about reindexing strategies. The documentation below explains what each parameter does:
- size
The size parameter defaults to 1,000 and controls how many documents are pulled from $domain in each request. See "size" in Elastic::Model::View.
Note: documents are pulled from the domain/view using "scan()" in Elastic::Model::View, which can pull a maximum of size * number_of_primary_shards documents in a single request. If you have large docs or underpowered servers, you may want to lower the size parameter.
- bulk_size
The bulk_size parameter defaults to size and controls how many documents are indexed into the new domain in a single bulk-indexing request.
- scan
scan is the same as "scan" in Elastic::Model::View: it controls how long Elasticsearch should keep the "scroll" live between requests. Defaults to '2m'. Increase this if the reindexing process is slow and you get scroll timeouts.
- repoint_uids
If true (the default), "repoint_uids()" will be called automatically to update any UIDs (which point at the old index) in indices other than the ones currently being reindexed.
- transform
If you need to change the structure/data of your doc while reindexing, you can pass a transform coderef. This will be called before any changes have been made to the doc, and should return the new doc. For instance, to convert the single-value tag field to an array of tags:
$index->reindex(
    'new_index',
    transform => sub {
        my $doc = shift;
        $doc->{_source}{tags} = [ delete $doc->{_source}{tag} ];
        return $doc;
    }
);
- on_conflict / on_error
If you are indexing to the new index at the same time as you are reindexing, you may get document conflicts. You can handle the conflicts with a coderef callback, or ignore them by setting on_conflict to 'IGNORE':
$index->reindex( 'myapp_v2', on_conflict => 'IGNORE' );
Similarly, you can pass an on_error handler which will handle other errors, or all errors if no on_conflict handler is defined.
See "Error handlers" in Search::Elasticsearch::Compat for more.
- uid_on_conflict / uid_on_error
These work in the same way as the on_conflict or on_error handlers, but are passed to "repoint_uids()" if repoint_uids is true.
- quiet
By default, "reindex()" prints out progress information. To silence this, set quiet to true:
$index->reindex( 'myapp_v2', quiet => 1 );
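A transform callback like the one described above can be exercised on its own before running a real reindex. The following is a minimal sketch, assuming the doc passed to the callback is a plain hash ref with a _source key, as in the transform example; the sample document and the guard for docs without a tag field are illustrative additions, not part of the module.

```perl
use strict;
use warnings;

# Convert the single-value "tag" field into a "tags" array.
# Docs without a "tag" field are returned unchanged.
my $transform = sub {
    my $doc    = shift;
    my $source = $doc->{_source};
    $source->{tags} = [ delete $source->{tag} ]
        if exists $source->{tag};
    return $doc;
};

# Hypothetical sample doc, shaped like the doc in the transform example.
my $doc = {
    _index  => 'myapp_v1',
    _type   => 'post',
    _id     => 1,
    _source => { title => 'Hello', tag => 'perl' },
};

my $new = $transform->($doc);
print join( ',', @{ $new->{_source}{tags} } ), "\n";
```

Once the callback behaves as expected on sample data, the same coderef can be passed as the transform parameter to "reindex()".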
repoint_uids()
$index->repoint_uids(
uids => { myapp_v1 => { user => { 10 => 1, 12 => 1, ... } } },
exclude => ['myapp_v2'],
scan => '2m',
size => 1000,
bulk_size => 1000,
quiet => 0,
on_conflict => sub {...} | 'IGNORE'
on_error => sub {...} | 'IGNORE'
);
The purpose of "repoint_uids()" is to update stale UID attributes to point to a new index. It is called automatically from "reindex()".
Parameters:
- uids
uids is a hash ref of the stale UIDs which should be updated.
For instance: you have reindexed myapp_v1 to myapp_v2, but domain other has documents with UIDs which point to myapp_v1. You can update these by passing a hash of the old UIDs, as follows:
$index = $namespace->index('myapp_v2');
$index->repoint_uids(
    uids => {
        # index
        myapp_v1 => {
            # type
            user => {
                1 => 1,    # ids
                2 => 1,
            }
        }
    }
);
- exclude
By default, all indices known to the model are updated. You can exclude indices with:
$index->repoint_uids( uids => \%uids, exclude => ['index_1', ...] );
- size
This is the same as the size parameter to "reindex()".
- bulk_size
This is the same as the bulk_size parameter to "reindex()".
- scan
This is the same as the scan parameter to "reindex()".
- quiet
This is the same as the quiet parameter to "reindex()".
- on_conflict / on_error
These are the same as the uid_on_conflict and uid_on_error handlers in "reindex()".
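The nested hash described for the uids parameter can be tedious to write by hand when the stale UIDs are available as a flat list of [ index, type, id ] triples. The following is a minimal sketch of building that structure; uids_from_triples is a hypothetical helper written for this example, not a function provided by Elastic::Model.

```perl
use strict;
use warnings;

# Hypothetical helper: turn [ index, type, id ] triples into the
# nested { index => { type => { id => 1 } } } structure.
sub uids_from_triples {
    my @triples = @_;
    my %uids;
    for my $triple (@triples) {
        my ( $index, $type, $id ) = @$triple;
        $uids{$index}{$type}{$id} = 1;
    }
    return \%uids;
}

my $uids = uids_from_triples(
    [ 'myapp_v1', 'user', 10 ],
    [ 'myapp_v1', 'user', 12 ],
);

# The result could then be passed on, for example:
# $index->repoint_uids( uids => $uids, exclude => ['myapp_v2'] );
```

Using a hash keyed by ID also removes duplicates, since repeating a triple only sets the same key again.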
doc_updater()
$coderef = $index->doc_updater( $doc_updater, $uid_updater );
"doc_updater()" is used by "reindex()" and "repoint_uids()" to update the top-level doc and any UID attributes with callbacks.
The $doc_updater receives the $doc as its only argument, and should return the $doc after making any changes:
$doc_updater = sub {
my ($doc) = @_;
$doc->{_index} = 'foo';
return $doc
};
The $uid_updater receives the UID as its only argument:
$uid_updater = sub {
my ($uid) = @_;
$uid->{index} = 'foo'
};
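To illustrate how the two callbacks might cooperate, the following sketch combines them into a single coderef. It is not the module's implementation: make_updater is a hypothetical stand-in for "doc_updater()", and it assumes that UIDs appear in the doc's _source as hash refs stored under a key named uid, which may not match how Elastic::Model actually serializes them.

```perl
use strict;
use warnings;

# Hypothetical stand-in for doc_updater(); NOT the library's code.
sub make_updater {
    my ( $doc_updater, $uid_updater ) = @_;

    my $walk;
    $walk = sub {
        my $node = shift;
        if ( ref $node eq 'HASH' ) {
            # Assumption: a UID is a hash ref stored under the key "uid".
            $uid_updater->( $node->{uid} ) if ref $node->{uid} eq 'HASH';
            $walk->($_) for grep { ref } values %$node;
        }
        elsif ( ref $node eq 'ARRAY' ) {
            $walk->($_) for grep { ref } @$node;
        }
    };

    return sub {
        # Update the top-level doc first, then any nested UIDs.
        my $doc = $doc_updater->(shift);
        $walk->( $doc->{_source} );
        return $doc;
    };
}

my $update = make_updater(
    sub { my ($doc) = @_; $doc->{_index} = 'foo'; return $doc },
    sub { my ($uid) = @_; $uid->{index} = 'foo' },
);

# Hypothetical sample doc with one nested UID.
my $doc = $update->(
    {   _index  => 'old',
        _source => {
            author => { uid => { index => 'old', type => 'user', id => 1 } },
        },
    }
);
print "$doc->{_index} $doc->{_source}{author}{uid}{index}\n";
```

The $doc_updater and $uid_updater passed in here are the same callbacks shown above; only the traversal logic is assumed.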
IMPORTED ATTRIBUTES
Attributes imported from Elastic::Model::Role::Index
namespace
name
IMPORTED METHODS
Methods imported from Elastic::Model::Role::Index
close()
open()
refresh()
delete()
update_analyzers()
update_settings()
delete_mapping()
is_alias()
is_index()
SEE ALSO
AUTHOR
Clinton Gormley <drtech@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Clinton Gormley.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.