NAME

App::ElasticSearch::Utilities::QueryString::FileExpansion - Build a terms query from unique values in a column of a file

VERSION

version 8.7

SYNOPSIS

App::ElasticSearch::Utilities::QueryString::FileExpansion

If the match ends in .dat, .txt, .csv, or .json then we attempt to read a file with that name and OR the condition:

$ cat test.dat
50  1.2.3.4
40  1.2.3.5
30  1.2.3.6
20  1.2.3.7

$ cat test.csv
50,1.2.3.4
40,1.2.3.5
30,1.2.3.6
20,1.2.3.7

$ cat test.txt
1.2.3.4
1.2.3.5
1.2.3.6
1.2.3.7

$ cat test.json
{ "ip": "1.2.3.4" }
{ "ip": "1.2.3.5" }
{ "ip": "1.2.3.6" }
{ "ip": "1.2.3.7" }

We can source that file:

src_ip:test.dat      => src_ip:(1.2.3.4 1.2.3.5 1.2.3.6 1.2.3.7)
src_ip:test.json[ip] => src_ip:(1.2.3.4 1.2.3.5 1.2.3.6 1.2.3.7)

This make it simple to use the --data-file output options and build queries based off previous queries. For .txt and .dat file, the delimiter for columns in the file must be either a tab or a null. For files ending in .csv, Text::CSV_XS is used to accurate parsing of the file format. Files ending in .json are considered to be newline-delimited JSON.

You can also specify the column of the data file to use, the default being the last column or (-1). Columns are zero-based indexing. This means the first column is index 0, second is 1, .. The previous example can be rewritten as:

src_ip:test.dat[1]

or: src_ip:test.dat[-1]

For newline delimited JSON files, you need to specify the key path you want to extract from the file. If we have a JSON source file with:

{ "first": { "second": { "third": [ "bob", "alice" ] } } }
{ "first": { "second": { "third": "ginger" } } }
{ "first": { "second": { "nope":  "fred" } } }

We could search using:

actor:test.json[first.second.third]

Which would expand to:

{ "terms": { "actor": [ "alice", "bob", "ginger" ] } }

This option will iterate through the whole file and unique the elements of the list. They will then be transformed into an appropriate terms query.

Wildcards

We can also have a group of wildcard or regexp in a file:

$ cat wildcards.dat
*@gmail.com
*@yahoo.com

To enable wildcard parsing, prefix the filename with a *.

es-search.pl to_address:*wildcards.dat

Which expands the query to:

{
  "bool": {
    "minimum_should_match":1,
    "should": [
       {"wildcard":{"to_outbound":{"value":"*@gmail.com"}}},
       {"wildcard":{"to_outbound":{"value":"*@yahoo.com"}}}
    ]
  }
}

No attempt is made to verify or validate the wildcard patterns.

Regular Expressions

If you'd like to specify a file full of regexp, you can do that as well:

$ cat regexp.dat
.*google\.com$
.*yahoo\.com$

To enable regexp parsing, prefix the filename with a ~.

es-search.pl to_address:~regexp.dat

Which expands the query to:

{
  "bool": {
    "minimum_should_match":1,
    "should": [
      {"regexp":{"to_outbound":{"value":".*google\\.com$"}}},
      {"regexp":{"to_outbound":{"value":".*yahoo\\.com$"}}}
    ]
  }
}

No attempt is made to verify or validate the regexp expressions.

AUTHOR

Brad Lhotsky <brad@divisionbyzero.net>

COPYRIGHT AND LICENSE

This is free software, licensed under:

The (three-clause) BSD License

To install App::ElasticSearch::Utilities, copy and paste the appropriate command in to your terminal.

cpanm

cpanm App::ElasticSearch::Utilities

CPAN shell

perl -MCPAN -e shell
install App::ElasticSearch::Utilities

For more information on module installation, please visit the detailed CPAN module installation guide.

	Global
`s`	Focus search bar
`?`	Bring up this help dialog

	GitHub
`g` `p`	Go to pull requests
`g` `i`	go to github issues (only if github is preferred repository)

	POD
`g` `a`	Go to author
`g` `c`	Go to changes
`g` `i`	Go to issues
`g` `d`	Go to dist
`g` `r`	Go to repository/SCM
`g` `s`	Go to source
`g` `b`	Go to file browse

	Search terms
module: (e.g. module:Plugin)
distribution: (e.g. distribution:Dancer auth)
author: (e.g. author:SONGMU Redis)
version: (e.g. version:1.00)