NAME

Data::Tubes::Plugin::Writer

DESCRIPTION

This module contains factory functions that generate tubes for writing records to files and filehandles.

FUNCTIONS

Functions starting with write_ have an equivalent form without this prefix.

dispatch_to_files

my $tube = dispatch_to_files($filename, %args); # OR
my $tube = dispatch_to_files(%args); # OR
my $tube = dispatch_to_files(\%args);

A composition of dispatch from Data::Tubes::Plugin::Plumbing and "to_files" that handles multiple output channels, selected on the basis of the contents of the input record. This is the most flexible mechanism available to relate the output channel to the input record, while still taking advantage of the automatic segmentation of output into multiple files (as provided by "to_files").

Accepts the same arguments as "to_files", although it always overrides the filename parameter (for obvious reasons!). Here, filename can be set either to a sub reference that generates a file name or a handle each time it is invoked (as filename_factory), or to a string holding a template filename (as filename_template), so it is a handy shortcut for both. For this reason, it is also the default parameter when passed as the first, unnamed option.

The function also accepts all options from dispatch in Data::Tubes::Plugin::Plumbing, plus the following ones:

filename

handy shortcut for either filename_factory or filename_template, so this is NOT passed over directly to "to_files";

filename_factory

a sub reference that will emit anything valid for filename in "to_files". It will be fed with the key and the record, see dispatch in Data::Tubes::Plugin::Plumbing for details;

filename_template

a meta-template string, i.e. a Template::Perlish template that will be expanded based on a hash with the following keys:

key

whatever passed by dispatch in Data::Tubes::Plugin::Plumbing;

record

the current record.

This field is used only if a filename_factory is not available.

The expansion should return anything valid for "to_files".

As an example, suppose you want to generate your filenames based on the key passed by dispatch, and on one additional field foo in the first record for that key. You might have a filename_template like the following:

$template = 'output-[% key %]-[% record.foo %].%03n.txt';

After the expansion, you can get the following templates:

output-bar-whatever.%03n.txt
output-baz-yuppie.%03n.txt
...

i.e. templates that can be further expanded according to a policy.

tp_opts

options for Template::Perlish, e.g. if you want to change the delimiters.

An example is due at this point:

my $dtf_tube = dispatch_to_files(
   # options for `dispatch_to_files` directly
   filename_template => 'output-[% key %]-%02n.txt',

   # options for `Data::Tubes::Plugin::Plumbing::dispatch`. This is
   # used to automatically generate the "key" from the input record,
   # i.e. the key will be $record->{structured}{class}
   key => [qw< structured class >],

   # options for `to_files`
   policy => { records_threshold => 10 },
   header => '{{{',
   footer => '}}}',
);
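
The two-stage expansion behind filename_template can be pictured with a small sketch. It does not use Template::Perlish: expand_meta_template is a hypothetical stand-in that only resolves dotted paths inside [% ... %] against nested hashes, which is enough to show how one meta-template yields one filename template per key. The %03n sequence is left untouched, ready for the later expansion by "to_files".

```perl
use strict;
use warnings;

# Hypothetical stand-in for Template::Perlish (NOT the real module):
# resolves only dotted paths such as "record.foo" against nested hashes.
sub expand_meta_template {
   my ($template, $vars) = @_;
   $template =~ s{\[%\s*([\w.]+)\s*%\]}{
      my $value = $vars;
      $value = $value->{$_} for split /\./, $1;
      $value;
   }ge;
   return $template;
}

my $meta = 'output-[% key %]-[% record.foo %].%03n.txt';

# One expansion per key, using the first record seen for that key
my @templates = map { expand_meta_template($meta, $_) } (
   {key => 'bar', record => {foo => 'whatever'}},
   {key => 'baz', record => {foo => 'yuppie'}},
);

print "$_\n" for @templates;
```

Each resulting string is itself a filename template, so a policy can still split the output for every key into multiple numbered files.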

to_files

my $tube = to_files($filename, %args); # OR
my $tube = to_files(%args); # OR
my $tube = to_files(\%args);

generate a tube for writing to files.

In this context, file is something quite broad, ranging from one single file, to filehandles, to families of files that share a common way to derive their filename.

This factory uses Data::Tubes::Util::Output, so you might want to take a look there too.

The central argument is filename, which can also be set as an initial unnamed parameter in the arguments list. You can set it in different ways:

filehandle

which will be used as-is. No operations will be performed on it apart from printing (so no binmode, no close, etc.);

CORE::open-compliant thingie

i.e. a string with the name of a file or a reference to a string;

filename template

i.e. a template that is ready for expansion (via sprintffy in Data::Tubes::Util). This is useful if your output should be segmented into multiple files based on a policy (another argument to the factory). The name can contain sprintf-like sequences: most notably, %n represents the increasing id of the file, and %02n is the same, but printed in at least two characters and zero-padded;

sub reference

that is supposed to return either a filehandle or a filename at each call. This is how you can gain maximum flexibility at the expense of more coding on your side.

Most of the time you'll probably be interested in the filename template, so here's an example:

$template = 'my-output-%02n.txt';

expands to

my-output-00.txt
my-output-01.txt
...

The following expansions are available:

%(\d*)n

expands to the current index for a file, always increasing and starting from 0. The optional digits are handled like an integer expansion in CORE::sprintf;

%Y

expands to the year (four digits);

%m

expands to the month (two digits, zero-padded on the left, starting from 1);

%d

expands to the day (two digits, zero-padded on the left, starting from 1);

%H

expands to the hour (two digits, zero-padded on the left, starting from 0);

%M

expands to the minute (two digits, zero-padded on the left, starting from 0);

%S

expands to the second (two digits, zero-padded on the left, starting from 0);

%z

expands to the time zone (in the format [-+]\d\d:\d\d);

%D

expands to the date without separators, same as %Y%m%d;

%T

expands to the time without separators and including the time zone, same as %H%M%S%z;

%t

expands to a full timestamp without separators and including the time zone, same as %Y%m%d%H%M%S%z;

%%

expands to a literal percent sign, in case you were wondering.
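
To make the index expansion concrete, here is a minimal sketch that handles only %n, its padded variants and %%. The helper name expand_index is hypothetical and this is not the code used by sprintffy; it just shows how the optional digits map onto an integer conversion in CORE::sprintf.

```perl
use strict;
use warnings;

# Hypothetical helper: expands only %n, %<digits>n and %% in a template,
# leaving any other sequence (e.g. %Y) untouched.
sub expand_index {
   my ($template, $index) = @_;
   $template =~ s{%(?:(\d*)n|%)}{
      defined $1 ? sprintf('%' . $1 . 'd', $index) : '%'
   }ge;
   return $template;
}

# Prints my-output-00.txt, my-output-01.txt and my-output-02.txt
print expand_index('my-output-%02n.txt', $_), "\n" for 0 .. 2;
```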

NOTE: if you want a full timestamp, use %t instead of combining %D and %T. The two separate expansions rely on two different calls to CORE::localtime, so there is a very slight chance that the day changes between them: you would get the date of the previous day with the time of the next one, i.e. a timestamp that is off by a whole day. Using %t reads all the fields in one single call, so it always provides a consistent reading.
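
The idea of a single clock reading can be sketched in plain Perl. The helper date_and_time is hypothetical and leaves out the time zone for brevity; the point is only that both parts come from the same call to CORE::localtime, so they cannot disagree about the day.

```perl
use strict;
use warnings;
use POSIX qw< strftime >;

# Hypothetical helper: derives date and time from ONE reading of the clock.
sub date_and_time {
   my ($epoch) = @_;
   my @fields = localtime($epoch);    # single call, shared by both parts
   my $date = strftime('%Y%m%d', @fields);
   my $time = strftime('%H%M%S', @fields);
   return ($date, $time);
}

my ($date, $time) = date_and_time(time());
print "$date$time\n";
```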

If you provide a string filename that has no expansion, but at the same time set a policy that leads to generating multiple files, the first file will be named exactly as specified in filename, and each following one will have an underscore character and a number (starting from 1, without padding) appended to the name. So, the following filename:

$template = 'my-output.txt';

expands to:

my-output.txt
my-output.txt_1
my-output.txt_2
...

If you don't set a policy, or your thresholds are not hit, then only the first filename will be used of course.
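
The naming rule for templates without expansions is simple enough to write down as a sketch. The function fallback_name is hypothetical and only mirrors the rule described above; it is not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented rule: the first file keeps
# the name as given, later ones get "_<number>" appended, without padding.
sub fallback_name {
   my ($filename, $index) = @_;
   return $index == 0 ? $filename : $filename . '_' . $index;
}

# Prints my-output.txt, my-output.txt_1 and my-output.txt_2
print fallback_name('my-output.txt', $_), "\n" for 0 .. 2;
```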

The following arguments are accepted:

binmode

value to set via CORE::binmode to opened filehandles (not to provided ones though). See Data::Tubes::Util::Output;

filename

see above. Defaults to standard output;

footer

data to be inserted as footer when closing/releasing a file, ultimately passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can be either a string (as required by Data::Tubes::Util::Output) or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;

header

data to be inserted as header when opening/starting to use a file, ultimately passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can be either a string (as required by Data::Tubes::Util::Output) or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;

input

input field in the record. This is what will actually be printed. Defaults to rendered, in compliance with the output of tubes from Data::Tubes::Plugin::Renderer.

interlude

data to be inserted between records printed out, ultimately passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can be either a string (as required by Data::Tubes::Util::Output) or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;

name

name of the tube, useful when debugging;

policy

a policy object where you can set thresholds for limiting the content/size of generated files. See Data::Tubes::Util::Output.

write_to_files

Alias for "to_files".

BUGS AND LIMITATIONS

Report bugs either through RT or GitHub (patches welcome).

AUTHOR

Flavio Poletti <polettix@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2016 by Flavio Poletti <polettix@cpan.org>

This module is free software. You can redistribute it and/or modify it under the terms of the Artistic License 2.0.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.