NAME
es-copy-index.pl - Copy an index from one cluster to another
VERSION
version 5.3
SYNOPSIS
es-copy-access.pl [options] [query to select documents]
Options:
--source (Required) The source index name for the copy
--destination Destination index name, assumes source
--from (Required) A server in the cluster where the index lives
--to A server in the cluster where the index will be copied to
--block How many docs to process in one batch, default: 1,000
--mapping JSON mapping to use instead of the source mapping
--settings JSON index settings to use instead of those from the source
--append Instead of creating the index, add the documents to the destination
--help print help
--manual print full manual
From App::ElasticSearch::Utilities:
--local Use localhost as the elasticsearch host
--host ElasticSearch host to connect to
--port HTTP port for your cluster
--proto Defaults to 'http', can also be 'https'
--http-username HTTP Basic Auth username
--http-password HTTP Basic Auth password (if not specified, and --http-user is, you will be prompted)
--password-exec Script to run to get the users password
--noop Any operations other than GET are disabled, can be negated with --no-noop
--timeout Timeout to ElasticSearch, default 30
--keep-proxy Do not remove any proxy settings from %ENV
--index Index to run commands against
--base For daily indexes, reference only those starting with "logstash"
(same as --pattern logstash-* or logstash-DATE)
--datesep Date separator, default '.' also (--date-separator)
--pattern Use a pattern to operate on the indexes
--days If using a pattern or base, how many days back to go, default: all
See also the "CONNECTION ARGUMENTS" and "INDEX SELECTION ARGUMENTS" sections from App::ElasticSearch::Utilities.
From CLI::Helpers:
--data-file Path to a file to write lines tagged with 'data => 1'
--color Boolean, enable/disable color, default use git settings
--verbose Incremental, increase verbosity (Alias is -v)
--debug Show developer output
--debug-class Show debug messages originating from a specific package, default: main
--quiet Show no output (for cron)
--syslog Generate messages to syslog as well
--syslog-facility Default "local0"
--syslog-tag The program name, default is the script name
--syslog-debug Enable debug messages to syslog if in use, default false
DESCRIPTION
This script allows you to copy data from one index to another on the same cluster or on a separate cluster. It handles index creation, either directly copying the mapping and settings from the source index or from mapping/settings JSON files.
This script could also be used to split up an index into smaller indexes for any number of reasons.
This uses the reindex API to copy data from one cluster to another
NAME
es-copy-index.pl - Copy an index from one cluster to another
OPTIONS
- from
-
REQUIRED: hostname or IP of the source cluster
- to
-
Hostname or IP of the destination cluster, defaults to the same host unless otherwise specified.
- source
-
REQUIRED: name of the source index for the copy
- destination
-
Optional: change the name of the index on the destination cluster
- block
-
Batch size of docs to process in one retrieval, default is 1,000
- mapping
-
Path to a file containing JSON mapping to use on the destination index instead of the mapping directly from the source index.
- settings
-
Path to a file containing JSON settings to use on the destination index instead of the settings directly from the source index.
- append
-
This mode skips the index mapping and settings configuration and just being indexing documents from the source into the destination.
- help
-
Print this message and exit
- manual
-
Print detailed help with examples
EXAMPLES
Copy to different cluster
es-copy-index.pl --from localhost --to remote.cluster.com --source logstash-2013.01.11
Rename an existing index
es-copy-index.pl --from localhost --source logstash-2013.01.11 --destination logs-2013.01.11
Subset an existing index
es-copy-index.pl --from localhost \
--source logstash-2013.01.11 \
--destination secure-2013.01.11 \
category:'(authentication authorization)'
Changing settings and mappings
es-copy-index.pl --from localhost \
--source logstash-2013.01.11 \
--destination testing-new-settings-old-data-2013.01.11 \
--settings new_settings.json \
--mappings new_mappings.json
Building an Incident Index using append
Let's say we were investigating an incident and wanted to have an index that contained the data we were interested in. We could use different retention rules for incident indexes and we could arbitrarily add data to them based on searches being performed on the source index.
Here's our initial query, a bad actor on our admin login page.
es-copy-index.pl --from localhost \
--source logstash-2013.01.11 \
--destination incident-rt1234-2013.01.11 \
src_ip:1.2.3.4 dst:admin.exmaple.com and file:'\/login.php'
Later on, we discover there was another actor:
es-copy-index.pl --from localhost \
--source logstash-2013.01.11 \
--destination incident-rt1234-2013.01.11 \
--append \
src_ip:4.3.2.1 dst:admin.exmaple.com and file:'\/login.php'
The incident-rt1234-2013.01.11 index will now hold all the data from both of those queries.
Query Syntax Extensions
The search string is pre-analyzed before being sent to ElasticSearch. The following plugins work to manipulate the query string and provide richer, more complete syntax for CLI applications.
App::ElasticSearch::Utilities::Barewords
The following barewords are transformed:
or => OR
and => AND
not => NOT
App::ElasticSearch::Utilities::QueryString::IP
If a field is an IP address wild card, it is transformed:
src_ip:10.* => src_ip:[10.0.0.0 TO 10.255.255.255]
App::ElasticSearch::Utilities::Underscored
This plugin translates some special underscore surrounded tokens into the Elasticsearch Query DSL.
Implemented:
_prefix_
Example query string:
_prefix_:useragent:'Go '
Translates into:
{ prefix => { useragent => 'Go ' } }
App::ElasticSearch::Utilities::QueryString::FileExpansion
If the match ends in .dat, .txt, or .csv, then we attempt to read a file with that name and OR the condition:
$ cat test.dat
50 1.2.3.4
40 1.2.3.5
30 1.2.3.6
20 1.2.3.7
Or
$ cat test.csv
50,1.2.3.4
40,1.2.3.5
30,1.2.3.6
20,1.2.3.7
Or
$ cat test.txt
1.2.3.4
1.2.3.5
1.2.3.6
1.2.3.7
We can source that file:
src_ip:test.dat => src_ip:(1.2.3.4 1.2.3.5 1.2.3.6 1.2.3.7)
This make it simple to use the --data-file output options and build queries based off previous queries. For .txt and .dat file, the delimiter for columns in the file must be either a tab, comma, or a semicolon. For files ending in .csv, Text::CSV_XS is used to accurate parsing of the file format.
You can also specify the column of the data file to use, the default being the last column or (-1). Columns are zero-based indexing. This means the first column is index 0, second is 1, .. The previous example can be rewritten as:
src_ip:test.dat[1]
or: src_ip:test.dat[-1]
This option will iterate through the whole file and unique the elements of the list. They will then be transformed into an appropriate terms query.
AUTHOR
Brad Lhotsky <brad@divisionbyzero.net>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2012 by Brad Lhotsky.
This is free software, licensed under:
The (three-clause) BSD License