NAME
file-to-elasticsearch.pl - A simple utility to tail a file and index each line as a document in ElasticSearch
VERSION
version 0.010
SYNOPSIS
To see available options, run:
file-to-elasticsearch.pl --help
Create a config file and run the utility:
file-to-elasticsearch.pl --config config.yaml --log4perl logging.conf --debug
This will run a single-threaded POE instance that tails the log files you've requested, performs the requested transformations, and sends the resulting documents to the ElasticSearch cluster and index you've specified.
CONFIGURATION
Configuration
ElasticSearch Settings
The elasticsearch section of the config controls the settings passed to POE::Component::ElasticSearch::Indexer.
    ---
    elasticsearch:
      servers: [ "localhost:9200" ]
      flush_interval: 30
      flush_size: 1_000
      index: logstash-%Y.%m.%d
      type: log
The settings available are:
- servers
    An array of servers used to send bulk data to ElasticSearch. The default is just localhost on port 9200.
- flush_interval
    Every flush_interval seconds, the queued documents are sent to the Bulk API of the cluster.
- flush_size
    If this many documents are received, regardless of the time since the last flush, force a flush of the queued documents to the Bulk API.
- index
    A strftime-compatible string to use as the DefaultIndex parameter if a file doesn't pass one along.
- type
    Mostly useless as Elastic is abandoning "types", but this will be set as the DefaultType for documents being indexed.
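
The interaction between flush_interval, flush_size, and the strftime-style index setting can be sketched as follows. This is an illustration in Python, not the Perl code used by POE::Component::ElasticSearch::Indexer; the names index_name, BulkQueue, add, tick, and flush are hypothetical, and the injectable clock exists only to make the sketch easy to exercise.

```python
import time
from datetime import datetime


def index_name(pattern, when):
    """Expand a strftime-style index pattern for the given timestamp."""
    return when.strftime(pattern)


class BulkQueue:
    """Queue documents and flush on size or on elapsed time (hypothetical helper)."""

    def __init__(self, flush_interval=30, flush_size=1000, clock=time.monotonic):
        self.flush_interval = flush_interval
        self.flush_size = flush_size
        self.clock = clock
        self.docs = []
        self.last_flush = clock()

    def add(self, doc):
        """Queue a document; return the flushed batch if flush_size was reached."""
        self.docs.append(doc)
        if len(self.docs) >= self.flush_size:
            return self.flush()
        return None

    def tick(self):
        """Call periodically; flush once flush_interval seconds have elapsed."""
        if self.docs and self.clock() - self.last_flush >= self.flush_interval:
            return self.flush()
        return None

    def flush(self):
        """Hand back the queued batch; a real indexer would send it to the Bulk API."""
        batch, self.docs = self.docs, []
        self.last_flush = self.clock()
        return batch


# With the example config, a document seen on 2018-05-01 lands in this index:
todays_index = index_name("logstash-%Y.%m.%d", datetime(2018, 5, 1))
```

Whichever condition is met first triggers the flush, so a quiet log still gets its documents indexed within flush_interval seconds, while a busy one never queues more than flush_size documents.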
Tail Section
The tail section contains the list of files to tail and the rules to use to index them.
    ---
    tail:
      - file: '/var/log/osquery/result.log'
        index: "osquery-result-%Y.%m.%d"
        decode: json
        extract:
          - by: split
            from: name
            when: '^pack'
            into: 'pack'
            split_on: '/'
            split_parts: [ null, "name", "report" ]
        mutate:
          prune: true
          remove: [ "calendarTime", "epoch", "counter", "_raw" ]
          rename:
            unixTime: _epoch
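
One plausible reading of the split rule in the example above, sketched in Python rather than the tool's own Perl. This section does not spell out the semantics, so everything here is an assumption: that when is a regular expression tested against the source field, that split_parts names the pieces by position with null skipping a piece, and that into nests the result under a sub-key. The function extract_split and the sample name value are hypothetical.

```python
import re


def extract_split(doc, rule):
    """Apply one 'by: split' rule to doc in place and return doc (assumed semantics)."""
    value = doc.get(rule["from"])
    if not isinstance(value, str):
        return doc
    # Assumption: 'when' is a regex the source value must match.
    if "when" in rule and not re.search(rule["when"], value):
        return doc
    parts = value.split(rule["split_on"])
    # Pair each piece with its name by position; None (YAML null) skips a piece.
    fields = {
        name: part
        for name, part in zip(rule["split_parts"], parts)
        if name is not None
    }
    if "into" in rule:
        # Assumption: 'into' nests the extracted fields under that key.
        doc.setdefault(rule["into"], {}).update(fields)
    else:
        doc.update(fields)
    return doc


rule = {
    "by": "split",
    "from": "name",
    "when": "^pack",
    "into": "pack",
    "split_on": "/",
    "split_parts": [None, "name", "report"],
}
# Hypothetical osquery-style value; the first piece ("pack") is skipped.
doc = extract_split({"name": "pack/incident-response/processes"}, rule)
```

Under these assumptions doc gains a pack key holding name and report, and a document whose name does not start with "pack" is left unchanged.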
Each element is a hash containing the following information.
- file
    Required: The path to the file on the filesystem.
- decode
    This may be a single element, or an array, containing one or more of the implemented decoders.
    - json
        Decode the discovered JSON in the document to a hash reference. This finds the first occurrence of a { in the string and assumes everything to the end of the string is JSON. Decoding is done by JSON::MaybeXS.
    - syslog
        Parses each line as a standard UNIX syslog message. Parsing is provided by Parse::Syslog::Line, which isn't a hard requirement of this package but will be loaded if available.
- index
    A strftime-compatible string to use as the index for documents created from this file. If not specified, the defaults from the ElasticSearch section will be used, and failing that, the default as specified in POE::Component::ElasticSearch::Indexer.
- type
    The type to use for documents sourced from this file.
- extract
    Extraction of fields from the document by one of the supported methods.
    - by
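
The json decoder described under decode above (find the first { and treat the rest of the line as JSON) can be sketched as follows. This is a Python illustration of the stated rule, not the JSON::MaybeXS-based Perl code; the function name decode_json is hypothetical, and returning None when nothing decodes is a choice made for this sketch, not documented behavior of the tool.

```python
import json


def decode_json(line):
    """Decode everything from the first '{' to the end of the line as JSON.

    Returns the decoded dict, or None if there is no '{' or the
    remainder is not valid JSON (sketch behavior, an assumption).
    """
    start = line.find("{")
    if start < 0:
        return None
    try:
        decoded = json.loads(line[start:])
    except json.JSONDecodeError:
        return None
    return decoded if isinstance(decoded, dict) else None


# A syslog-style prefix before the JSON payload is simply skipped.
event = decode_json('May  1 12:00:00 host osqueryd: {"name": "pack/a/b", "unixTime": 1525176000}')
```

Because the rule assumes the JSON runs to the end of the string, a line with trailing text after the closing brace does not decode in this sketch.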
AUTHOR
Brad Lhotsky <brad@divisionbyzero.net>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2018 by Brad Lhotsky.
This is free software, licensed under:
The (three-clause) BSD License