NAME
Data::Tubes::Plugin::Writer
DESCRIPTION
This module contains functions to ease using tubes.
FUNCTIONS
Functions starting with write_ have an equivalent form without this prefix.
dispatch_to_files
my $tube = dispatch_to_files($filename, %args); # OR
my $tube = dispatch_to_files(%args); # OR
my $tube = dispatch_to_files(\%args);
A composition of dispatch from Data::Tubes::Plugin::Plumbing and "to_files" that allows handling multiple output channels, selected on the basis of the contents of the input record. This is the most flexible mechanism available to relate the output channel to the input record, while at the same time taking advantage of the automatic segmentation of output into multiple files (as provided by "to_files").
Accepts the same arguments as "to_files", although it will always override the parameter filename (for obvious reasons!). Here, filename can be set either to a sub reference that is supposed to generate a file name or a handle each time it is invoked (as filename_factory), or to a string holding a template filename (as filename_template), so it is a handy shortcut for both. For this reason, it is also the default parameter when passed as the first, unnamed option.
The function also accepts all options from dispatch in Data::Tubes::Plugin::Plumbing, plus the following ones:
filename - handy shortcut for either filename_factory or filename_template, so this is NOT passed over directly to "to_files";
filename_factory - a sub reference that will emit anything valid for filename in "to_files". It will be fed with the key and the record, see dispatch in Data::Tubes::Plugin::Plumbing for details;
filename_template - a meta-template string, i.e. a Template::Perlish template that will be expanded based on a hash with the following keys:
    key - whatever is passed by dispatch in Data::Tubes::Plugin::Plumbing;
    record - the current record.
This field is used only if a filename_factory is not available. The expansion should return anything valid for filename in "to_files".
As an example, suppose you want to generate your filenames based on the key passed by dispatch, and on one additional field foo in the first record for that key. You might have a filename_template like the following:
$template = 'output-[% key %]-[% record.foo %].%03n.txt';
After the expansion, you can get the following templates:
output-bar-whatever.%03n.txt
output-baz-yuppie.%03n.txt
...
i.e. templates that can be further expanded according to a policy.
tp_opts - options for Template::Perlish, e.g. if you want to change the delimiters.
An example is due at this point:
my $tube = dispatch_to_files(
    # options for `dispatch_to_files` directly
    filename_template => 'output-[% key %]-%02n.txt',
    # options for `Data::Tubes::Plugin::Plumbing::dispatch`. This is
    # used to automatically generate the "key" from the input record,
    # i.e. the key will be $record->{structured}{class}
    key => [qw< structured class >],
    # options for `to_files`
    policy => { records_threshold => 10 },
    header => '{{{',
    footer => '}}}',
);
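The first stage of the filename_template expansion described above can be sketched with core Perl only. This is a minimal illustration, not the module's implementation: expand_meta_template is a hypothetical helper that imitates just the "[% name %]" and "[% a.b %]" lookups of Template::Perlish with a regular expression, and the key/record values are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Template::Perlish step: replaces
# "[% name %]" and "[% a.b %]" with values looked up in a hash.
sub expand_meta_template {
    my ($template, $vars) = @_;
    $template =~ s{\[%\s*([\w.]+)\s*%\]}{
        my $path  = $1;
        my $value = $vars;
        $value = $value->{$_} for split /\./, $path;   # walk "a.b" paths
        $value;
    }ge;
    return $template;
}

my $meta = 'output-[% key %]-[% record.foo %].%03n.txt';
my $per_key = expand_meta_template($meta,
    {key => 'bar', record => {foo => 'whatever'}});
print "$per_key\n";    # output-bar-whatever.%03n.txt
```

The result still contains the %03n sequence, which is left for the second expansion performed on the "to_files" side.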
to_files
my $tube = to_files($filename, %args); # OR
my $tube = to_files(%args); # OR
my $tube = to_files(\%args);
Generate a tube for writing to files.
In this context, file is something quite broad, ranging from one single file, to filehandles, to families of files that share a common way to derive their filename.
This factory uses Data::Tubes::Util::Output, so you might want to take a look there too.
The central argument is filename, which can also be set as an initial unnamed parameter in the arguments list. You can set it in different ways:
- filehandle - and this will be used. No operations will be performed on it, apart from printing (so, no binmode, no close, etc.);
- CORE::open-compliant thingie - i.e. a string with the name of a file or a reference to a string;
- filename template - i.e. a template that is ready for expansion (via sprintffy in Data::Tubes::Util). This is useful if your output should be segmented into multiple files based on a policy (another argument to the factory), where the name can contain sprintf-like sequences (most notably, %n represents the increasing id of the file, and %02n is the same, but printed in at least two characters and zero-padded);
- sub reference - that is supposed to return either a filehandle or a filename at each call. This is how you can gain maximum flexibility at the expense of more coding on your side.
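As a rough idea of the sub reference option, the sketch below builds a closure that returns a fresh filename at each call. The helper name make_filename_factory and the naming scheme are assumptions made for illustration; the arguments that "to_files" passes to the sub are not described here, so the closure ignores them.

```perl
use strict;
use warnings;

# Hypothetical factory: returns a closure producing a new filename per call
sub make_filename_factory {
    my ($prefix) = @_;
    my $n = 0;    # private counter, kept alive by the closure
    return sub { return sprintf '%s-%03d.txt', $prefix, $n++ };
}

my $factory = make_filename_factory('chunk');
print $factory->(), "\n" for 1 .. 2;    # chunk-000.txt, then chunk-001.txt
```

A sub like this could presumably be passed as filename; anything smarter (dates, lookups, already opened handles) goes inside the closure.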
Most of the time you'll probably be interested in the filename template, so here's an example:
$template = 'my-output-%02n.txt';
expands to:
my-output-00.txt
my-output-01.txt
...
The following expansions are available:
%(\d*)n - expands to the current index for a file, always increasing and starting from 0. The optional digits are handled like an integer expansion in CORE::sprintf;
%Y - expands to the year (four digits);
%m - expands to the month (two digits, zero-padded on the left, starting from 1);
%d - expands to the day (two digits, zero-padded on the left, starting from 1);
%H - expands to the hour (two digits, zero-padded on the left, starting from 0);
%M - expands to the minute (two digits, zero-padded on the left, starting from 0);
%S - expands to the second (two digits, zero-padded on the left, starting from 0);
%z - expands to the time zone (in the format [-+]\d\d:\d\d);
%D - expands to the date without separators, same as %Y%m%d;
%T - expands to the time without separators and including the time zone, same as %H%M%S%z;
%t - expands to a full timestamp without separators and including the time zone, same as %Y%m%d%H%M%S%z;
%% - expands to a literal percent sign, in case you were wondering.
NOTE: if you want to put a timestamp, use %t instead of %D and %T. The latter two expansions rely on two different calls to CORE::localtime, so there is a very slight chance that they straddle a day change: you would get the date of the previous day with the time of the next one, i.e. a timestamp that is off by one day. Using %t takes all the values in one single call, so it always provides a consistent read.
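The consistency argument can be illustrated with a small core-Perl sketch: take one broken-down time and derive both the date and the time from it. The helper consistent_stamp is hypothetical and not part of the module; it uses POSIX::strftime and leaves the time zone out for simplicity.

```perl
use strict;
use warnings;
use POSIX qw< strftime >;

# Format date and time from ONE broken-down time, so they cannot disagree
sub consistent_stamp {
    my @tm = @_;    # fields as returned by localtime or gmtime
    my $date = strftime('%Y%m%d', @tm);
    my $time = strftime('%H%M%S', @tm);
    return ($date, $time, $date . $time);
}

# One second before midnight UTC on 1970-01-01; a fixed instant keeps
# the example reproducible
my ($date, $time, $full) = consistent_stamp(gmtime(86399));
print "$full\n";    # 19700101235959
```

Calling localtime (or gmtime) twice, once for the date and once for the time, is what opens the window for the mismatch described in the note.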
If you provide a string filename field that has no expansion, but at the same time set a policy that will lead to generating multiple files, the first file will be called exactly as specified in filename, and the following ones will have an underscore character and a number (starting from 1, without padding) appended to the name. So, the following filename:
$template = 'my-output.txt'
expands to:
my-output.txt
my-output.txt_1
my-output.txt_2
...
If you don't set a policy, or your thresholds are not hit, then only the first filename will be used of course.
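The two naming rules (index expansion when the template contains %n, underscore suffix otherwise) can be summarized in a short sketch. The function nth_filename is a hypothetical helper written for illustration, not the code used by Data::Tubes::Util::Output; it handles only the %n family and ignores the date/time expansions and %%.

```perl
use strict;
use warnings;

# Hypothetical helper: name of the $n-th output file (first file is $n == 0)
sub nth_filename {
    my ($filename, $n) = @_;
    my $expanded = $filename;
    # "%n" and "%02n" behave like "%d" and "%02d" in sprintf
    my $hits = ($expanded =~ s{%(\d*)n}{sprintf "%${1}d", $n}ge);
    return $expanded if $hits;                   # template had an index
    return $n ? "${filename}_$n" : $filename;    # plain name: _1, _2, ...
}

print nth_filename('my-output-%02n.txt', $_), "\n" for 0 .. 1;
print nth_filename('my-output.txt',      $_), "\n" for 0 .. 2;
```

This prints my-output-00.txt and my-output-01.txt for the template, then my-output.txt, my-output.txt_1 and my-output.txt_2 for the plain name.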
The following arguments are accepted:
binmode - value to set via CORE::binmode to opened filehandles (not to provided ones though). See Data::Tubes::Util::Output;
filename - see above. Defaults to standard output;
footer - data to be inserted as footer when closing/releasing a file, ultimately passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can either be a string (as required by Data::Tubes::Util::Output), or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;
header - data to be inserted as header when opening/starting to use a file, ultimately passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can either be a string (as required by Data::Tubes::Util::Output), or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;
input - input field in the record. This is what will actually be printed. Defaults to rendered, in compliance with the output of tubes from Data::Tubes::Plugin::Renderer;
interlude - data to be inserted between records printed out, ultimately passed to Data::Tubes::Util::Output. This parameter is initially passed to "read_file_maybe" in Data::Tubes::Util, so it can either be a string (as required by Data::Tubes::Util::Output), or an array reference that is expanded into a list passed to "read_file" in Data::Tubes::Util for reading the text;
name - name of the tube, useful when debugging;
policy - a policy object where you can set thresholds for limiting the content/size of generated files. See Data::Tubes::Util::Output.
write_to_files
Alias for "to_files".
BUGS AND LIMITATIONS
Report bugs either through RT or GitHub (patches welcome).
AUTHOR
Flavio Poletti <polettix@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2016 by Flavio Poletti <polettix@cpan.org>
This module is free software. You can redistribute it and/or modify it under the terms of the Artistic License 2.0.
This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.