NAME

Log::Parallel::Paths - variable expansion, capture, globs, regex on filenames

SYNOPSIS

use Log::Parallel::Paths;

$filename = path_to_filename($spec, %data);

$glob = path_to_shell_glob($spec);

($regex, $closure) = path_to_regex($spec);

DESCRIPTION

Within the batch log processing system, Log::Parallel, filenames are specified with magic cookies embedded in them. For example:

path: '%DATADIR%/%YYYY%/%MM%/%DD%/%JOBNAME%.%DURATION%.%BUCKET%.%SOURCE_BKT%.gz'

These magic cookies need to be expanded in various ways: to make a new filename (path_to_filename()); to hand to a shell as a glob that finds matching files (path_to_shell_glob()); and to build a Perl regular expression that extracts these parameters from a filename (path_to_regex()).
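The first two expansions can be illustrated with a minimal, self-contained sketch. It does not use this module: sketch_filename(), sketch_glob(), and the %FORMAT table are hypothetical names, the formats are based on the cookie list in this document, and replacing every cookie with * is an assumed simplification of what path_to_shell_glob() might produce. The %%% cookie and user-specified cookies are not handled.

```perl
use strict;
use warnings;

# Hypothetical format table, based on the formats in the cookie list.
my %FORMAT = (
    BUCKET     => '%05d',
    SOURCE_BKT => '%05d',
    YYYY       => '%04d',
    MM         => '%02d',
    DD         => '%02d',
    DURATION   => '%s',
);

# Hypothetical stand-in for path_to_filename(): fill each %NAME% cookie
# from %data, formatting it with the cookie's sprintf format ('%s' if the
# cookie has no entry in the table).
sub sketch_filename {
    my ($spec, %data) = @_;
    $spec =~ s{%(\w+)%}{
        exists $data{$1}
            ? sprintf($FORMAT{$1} // '%s', $data{$1})
            : die "no value for cookie $1\n"
    }ge;
    return $spec;
}

# Hypothetical stand-in for path_to_shell_glob(): every cookie becomes
# a shell wildcard.
sub sketch_glob {
    my ($spec) = @_;
    $spec =~ s{%\w+%}{*}g;
    return $spec;
}

my $spec = '%YYYY%/%MM%/%DD%/%JOBNAME%.%BUCKET%.gz';

print sketch_filename($spec,
    YYYY => 2009, MM => 3, DD => 7, JOBNAME => 'hits', BUCKET => 12), "\n";
# prints 2009/03/07/hits.00012.gz

print sketch_glob($spec), "\n";
# prints */*/*/*.*.gz
```

The sketch dies on a cookie with no supplied value rather than leaving it unexpanded; the module's own behavior in that case is not described here.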

The magic cookies that are recognized are:

BUCKET

Format: %05d. The bucket number for this file.

SOURCE_BKT

Format: %05d. When one job writes to buckets, the next job will process each bucket separately, often in parallel. The new bucket for a bit of data may be different than the old bucket. The SOURCE_BKT is the old bucket number.

YYYY

Format: %04d. Year part of the end date for this data.

MM

Format: %02d. Month part of the end date for this data.

DD

Format: %02d. Day part of the end date for this data.

HH

Format: %02d. Hour part of the time.

FROM_YYYY

Format: %04d. Year part of the beginning date for this data.

FROM_MM

Format: %02d. Month part of the beginning date for this data.

FROM_DD

Format: %02d. Day part of the beginning date for this data.

DURATION

Format: %s. The duration label: day, daily, week, weekly, etc.

%%%

The % character.

%word=regex%

The specification can contain user-specified cookies of this form. For path_to_regex(), the text matched by regex is stored under the key word.

The path_to_regex() function returns both a regular expression and a closure that translates the positional matches ($1, $2, etc.) into key/value pairs.
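The regex-plus-closure pairing can be sketched as follows. This is a self-contained illustration, not the module's code: sketch_regex(), the %PATTERN table, and the fallback pattern for unlisted cookies are all hypothetical, and the closure here takes the captures as arguments, which may differ from how the real closure is called. A user-specified regex that contains its own capturing groups would shift the positions and is not handled.

```perl
use strict;
use warnings;

# Hypothetical per-cookie patterns; the real module's patterns may differ.
my %PATTERN = (
    YYYY   => '\d{4}',
    MM     => '\d{2}',
    DD     => '\d{2}',
    BUCKET => '\d{5}',
);

# Hypothetical stand-in for path_to_regex(): returns a compiled regex and
# a closure that maps positional captures onto cookie names.
sub sketch_regex {
    my ($spec) = @_;
    my @keys;
    my $re = '';
    # The capturing group in the split pattern keeps the cookies in the
    # result list, interleaved with the literal text between them.
    for my $part (split /(%\w+(?:=[^%]+)?%)/, $spec) {
        if ($part =~ m{\A%(\w+)(?:=([^%]+))?%\z}) {
            push @keys, $1;
            # A %word=regex% cookie supplies its own pattern.
            my $pat = defined $2 ? $2 : ($PATTERN{$1} // '[^/]+?');
            $re .= "($pat)";
        } else {
            $re .= quotemeta $part;
        }
    }
    my $closure = sub {
        my (@matches) = @_;
        my %found;
        @found{@keys} = @matches;
        return %found;
    };
    return (qr/\A$re\z/, $closure);
}

my ($regex, $closure)
    = sketch_regex('%YYYY%/%MM%/%DD%/%host=[a-z]+%.%BUCKET%.gz');

if (my @m = '2009/03/07/web.00012.gz' =~ $regex) {
    my %got = $closure->(@m);
    print join(', ', map { "$_=$got{$_}" } sort keys %got), "\n";
    # prints BUCKET=00012, DD=07, MM=03, YYYY=2009, host=web
}
```

Returning the closure alongside the regex keeps the capture order and the key names in one place, so a caller never has to know which position belongs to which cookie.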

LICENSE

This package may be used and redistributed under the terms of either the Artistic 2.0 or LGPL 2.1 license.