NAME
dbpipeline - allow db commands to be assembled as pipelines in Perl
SYNOPSIS
use Fsdb::Filter::dbpipeline qw(:all);
dbpipeline(
dbrow(qw(name test1)),
dbroweval('_test1 += 5;')
);
Or for more customized versions, see "dbpipeline_filter", "dbpipeline_sink", "dbpipeline_open2", and "dbpipeline_close2_hash".
DESCRIPTION
This module makes it easy to create pipelines in Perl using separate processes. (In the past we used to use perl threads.)
By default (as with all Fsdb modules), input is from STDIN and output to STDOUT. Two helper functions, fromfile and tofile, redirect a pipeline's endpoints to and from files.
Dbpipeline differs in several ways from all other Fsdb::Filter modules: it has no corresponding Unix command (it is used only from within Perl), and it does not log its presence to the output stream (arguably a bug, but since dbpipeline itself does nothing to the data, there is nothing to log).
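For instance, a pipeline that reads from and writes to named files might look like the following sketch (it assumes the Fsdb distribution is installed and that scores.fsdb has columns name and test1; the file names are illustrative):

```perl
use Fsdb::Filter::dbpipeline qw(:all);

# fromfile/tofile replace the default STDIN/STDOUT endpoints.
dbpipeline(
    fromfile('scores.fsdb'),
    dbcol(qw(name test1)),
    dbroweval('_test1 += 5;'),
    tofile('out.fsdb')
);
```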
OPTIONS
Unlike most Fsdb modules, dbpipeline defaults to --autorun.
This module also supports the standard fsdb options:
- -d
Enable debugging output.
- -i or --input InputSource
Read from InputSource, typically a file name, or - for standard input, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
- -o or --output OutputDestination
Write to OutputDestination, typically a file name, or - for standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
- --autorun or --noautorun
By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl.
- --help
Show help.
- --man
Show full manual.
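Because dbpipeline defaults to --autorun, a pipeline normally runs as soon as it is constructed. To build a pipeline and defer execution, --noautorun can be passed and the object run explicitly (a sketch, assuming the Fsdb distribution is installed):

```perl
use Fsdb::Filter::dbpipeline qw(:all);

# Construct the pipeline but do not run it yet.
my $pipeline = new Fsdb::Filter::dbpipeline('--noautorun',
    dbcol(qw(name test1)),
    dbroweval('_test1 += 5;'));
# ... later, run it to completion:
$pipeline->setup_run_finish;
```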
CLASS FUNCTIONS
dbpipeline
dbpipeline(@modules);
This shorthand routine creates a dbpipeline object and then immediately runs it.
Thus Perl code becomes nearly as terse as shell code:
dbpipeline(
dbcol(qw(name test1)),
dbroweval('_test1 += 5;'),
);
The following commands currently have shorthand aliases:
- cgi_to_db(1)
- combined_log_format_to_db(1)
- csv_to_db(1)
- db_to_csv(1)
- db_to_html_table(1)
- dbcol(1)
- dbcolcopylast(1)
- dbcolcreate(1)
- dbcoldefine(1)
- dbcolhisto(1)
- dbcolmerge(1)
- dbcolmovingstats(1)
- dbcolneaten(1)
- dbcolpercentile(1)
- dbcolrename(1)
- dbcolscorrelate(1)
- dbcolsplittocols(1)
- dbcolsplittorows(1)
- dbcolsregression(1)
- dbcolstats(1)
- dbcolstatscores(1)
- dbfilealter(1)
- dbfilecat(1)
- dbfilediff(1)
- dbfilepivot(1)
- dbfilestripcomments(1)
- dbfilevalidate(1)
- dbformmail(1)
- dbjoin(1)
- dbmapreduce(1)
- dbmerge(1)
- dbmerge2(1)
- dbmultistats(1)
- dbrow(1)
- dbrowaccumulate(1)
- dbrowcount(1)
- dbrowdiff(1)
- dbroweval(1)
- dbrowuniq(1)
- dbrvstatdiff(1)
- dbsort(1)
- html_table_to_db(1)
- kitrace_to_db(1)
- mysql_to_db(1)
- tabdelim_to_db(1)
- tcpdump_to_db(1)
- xml_to_db(1)
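Each alias constructs the corresponding Fsdb::Filter object, so the aliases compose freely within a pipeline; for example, converting CSV input to fsdb format and sorting it (a sketch, assuming a file data.csv with a numeric column x; the file and column names are illustrative):

```perl
use Fsdb::Filter::dbpipeline qw(:all);

# Convert CSV to fsdb, then sort numerically by column x.
dbpipeline(
    fromfile('data.csv'),
    csv_to_db(),
    dbsort(qw(-n x)),
    tofile('data.fsdb')
);
```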
dbpipeline_filter
my($result_reader, $fred) = dbpipeline_filter($source, $result_reader_aref, @modules);
Set up a pipeline of @MODULES that filters data pushed through it, where the data comes from $SOURCE (anything acceptable to Fsdb::Filter::parse_io_option, such as an Fsdb::IO::Reader object, a queue, or a filename).
Returns $RESULT_READER, an Fsdb::IO::Reader object created with $RESULT_READER_AREF as options, which will produce the filtered data, and a $FRED that must be joined to guarantee output has completed.
Or, if $RESULT_READER_AREF is [-raw_fh, 1], it just returns the IO::Handle to the pipe.
As an example, this code uses dbpipeline_filter to ensure the input (from $in, which is a filename or Fsdb::IO::Reader) is sorted numerically by column x:
use Fsdb::Filter::dbpipeline qw(dbpipeline_filter dbsort);
my($new_in, $new_fred) = dbpipeline_filter($in,
[-comment_handler => $self->create_delay_comments_sub],
dbsort(qw(--nolog -n x)));
while (my $fref = $new_in->read_rowobj()) {
    # do something
}
$new_in->close;
$new_fred->join();
dbpipeline_sink
my($fsdb_writer, $fred) = dbpipeline_sink($writer_arguments_aref, @modules);
Set up a pipeline of @MODULES that is a data "sink", where the output is given by a --output argument or goes to standard output (by default). The caller generates input into the pipeline by writing to a newly created $FSDB_WRITER, whose configuration is specified by the mandatory first argument $WRITER_ARGUMENTS_AREF. (These arguments should include the schema.) Returns this writer, and a $FRED that must be joined to guarantee output has completed.
If the first argument to @MODULES is "--fred_exit_sub", then the second is taken as a CODE block that runs when the fred exits (and the two are not passed on to the modules).
If the first argument to @MODULES is "--fred_description", then the second is taken as a text description of the fred.
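A minimal use of dbpipeline_sink might push rows into a sort and write the result to a file. The following is a sketch, assuming the Fsdb distribution is installed; the column names, data, and output file are illustrative:

```perl
use Fsdb::Filter::dbpipeline qw(dbpipeline_sink dbsort);

# The writer arguments give the schema of the rows we will push.
my($writer, $fred) = dbpipeline_sink(
    [ -cols => [qw(name test1)] ],
    dbsort(qw(--nolog -n test1), '--output' => 'sorted.fsdb'));
$writer->write_row_from_aref([ 'alice', 92 ]);
$writer->write_row_from_aref([ 'bob', 87 ]);
$writer->close;
$fred->join();    # guarantee output has completed
```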
dbpipeline_open2
my($fsdb_reader_fh, $fsdb_writer, $fred) =
dbpipeline_open2($writer_arguments_aref, @modules);
Set up a pipeline of @MODULES that is both a data sink and a data source. The caller generates input into the pipeline by writing to a newly created $FSDB_WRITER, whose configuration is specified by the mandatory argument $WRITER_ARGUMENTS_AREF. (These arguments should include the schema.) The output of the pipeline comes out on the newly created $FSDB_READER_FH. Returns this read handle and writer, and a $FRED that must be joined to guarantee output has completed.
(Unfortunately the interface is asymmetric, with a read queue but a write Fsdb::IO object, because Fsdb::IO::Reader blocks on input of the header.)
Like IPC::Open2, this has all of the corresponding pros and cons, including potential deadlock.
dbpipeline_close2_hash
my($href) = dbpipeline_close2_hash($fsdb_read_fh, $fsdb_writer, $pid);
Reads and returns one row of output (as a hash reference) from $FSDB_READ_FH, after closing $FSDB_WRITER and joining $PID.
Useful, for example, to get dbcolstats output cleanly.
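Together, these two calls can push rows through dbcolstats and retrieve its one-row summary as a hash. The following is a sketch, assuming the Fsdb distribution is installed; the field name mean follows dbcolstats's output schema:

```perl
use Fsdb::Filter::dbpipeline
    qw(dbpipeline_open2 dbpipeline_close2_hash dbcolstats);

# Push a column of numbers through dbcolstats...
my($read_fh, $writer, $fred) = dbpipeline_open2(
    [ -cols => [qw(x)] ],
    dbcolstats(qw(--nolog x)));
$writer->write_row_from_aref([ $_ ]) foreach (1 .. 5);
# ...then close the writer, join, and read back the summary row.
my $href = dbpipeline_close2_hash($read_fh, $writer, $fred);
print "mean: ", $href->{mean}, "\n";
```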
new
$filter = new Fsdb::Filter::dbpipeline(@arguments);
set_defaults
$filter->set_defaults();
Internal: set up defaults.
parse_options
$filter->parse_options(@ARGV);
Internal: parse options.
_reap
$filter->_reap();
Internal: reap any forked threads.
setup
$filter->setup();
Internal: setup, parse headers.
run
$filter->run();
Internal: run over all IO.
finish
$filter->finish();
Internal: we would write a trailer, but we don't because we depend on the last command in the pipeline to do that. We don't actually have a valid output stream.
AUTHOR and COPYRIGHT
Copyright (C) 1991-2016 by John Heidemann <johnh@isi.edu>
This program is distributed under terms of the GNU general public license, version 2. See the file COPYING with the distribution for details.