NAME
Fsdb::Filter - base class for Fsdb filters
DESCRIPTION
Fsdb::Filter is the virtual base class for Fsdb filters.
Users will typically invoke individual programs via the command line (for example, see dbcol(1)) or string together several in a Perl program as described in dbpipeline(3).
For new Filter developers, internal processing is:
new
set_defaults
parse_options
autorun if desired
parse_options # optionally called additional times
setup # does IO on header
run # does IO on data
finish # any shutdown
In addition, the info method returns metadata about a given filter.
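The call order above can be illustrated with a minimal sketch. ToyFilter is a hypothetical stand-in written for this example; it does not inherit from Fsdb::Filter, does no real IO, and only records the order in which each step would be invoked. Its handling of --autorun is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a filter; NOT a subclass of Fsdb::Filter.
package ToyFilter;

sub new {
    my ($class, @argv) = @_;
    my $self = bless { _trace => [] }, $class;
    $self->set_defaults;
    $self->parse_options(@argv);
    # Autorun, if requested, happens once new is otherwise done.
    $self->setup_run_finish if $self->{_autorun};
    return $self;
}

# Record the name of each lifecycle step as it happens.
sub _note { my ($self, $step) = @_; push @{ $self->{_trace} }, $step; }

sub set_defaults {
    my ($self) = @_;
    $self->{_autorun} = 0;
    $self->_note('set_defaults');
}

sub parse_options {
    my ($self, @argv) = @_;
    $self->{_autorun} = 1 if grep { $_ eq '--autorun' } @argv;
    $self->_note('parse_options');
}

sub setup  { $_[0]->_note('setup'); }     # a real filter reads the header here
sub run    { $_[0]->_note('run'); }       # a real filter processes rows here
sub finish { $_[0]->_note('finish'); }    # a real filter closes output here

sub setup_run_finish {
    my ($self) = @_;
    $self->setup;
    $self->run;
    $self->finish;
}

package main;

my $filter = ToyFilter->new('--autorun');
print join(' -> ', @{ $filter->{_trace} }), "\n";
# prints: set_defaults -> parse_options -> setup -> run -> finish
```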
FUNCTIONS
new
$fsdb = new Fsdb::Filter;
Create a new filter object, calling set_defaults and parse_options. A user program will call a specific filter (say Fsdb::Filter::dbcol) to do processing. See also dbpipeline for aliases that remove the wordiness.
post_new
$filter->post_new();
Called when the subclass is done with new, giving Fsdb::Filter a chance to autorun.
set_defaults
$filter->set_defaults();
Set up object defaults. Called once during new.
Fsdb::Filter::set_defaults does some general setup, tracking module invocation and preparing for one input and output stream.
set_default_tmpdir
$filter->set_default_tmpdir
Figure out a tmpdir, from environment variables if necessary.
parse_options
$filter->parse_options(@ARGV);
parse_options is called one or more times to parse ARGV-style options. It should not do any IO or take any irreversible actions; defer those to setup.
Fsdb::Filter::parse_options does no work; the subclass is expected to call Fsdb::Filter::get_options() for all arguments.
Most modules implement certain common fsdb options. This module supports the standard ones, listed below:
- -d
Enable debugging output.
- -i or --input InputSource
Read from InputSource, typically a file name, or - for standard input, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
- -o or --output OutputDestination
Write to OutputDestination, typically a file name, or - for standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.
- --autorun or --noautorun
By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl.
- --noclose
By default, programs close their output when done. For some cases where objects are used internally, --noclose may be used to leave output open for further I/O. (This option is only supported by some filters.)
- --saveoutput $OUT_REF
By default, programs close their output when done. With this option, programs in Perl can have a subprogram create an output reference and return it to the caller in $OUT_REF. The caller can then use it for further I/O. (This option is only supported by some filters.)
- --help
Show help.
- --man
Show full manual.
parse_target_column
$self->parse_target_column(\@argv);
A helper function: allow one column to be specified as the _target_column.
get_options
$success = $filter->get_options(\@argv, "v+" => \$verbose, ...)
get_options is just like Getopt::Long's GetOptions, but takes the argument list as the first argument. This list is modified and any non-options are returned. It also saves _orig_argv in itself.
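For comparison, Getopt::Long itself provides GetOptionsFromArray, which also takes the argument list first and leaves non-options behind in it. The sketch below uses only Getopt::Long, not Fsdb::Filter; the option names and sample arguments are made up for illustration, and it is not a claim about how get_options is implemented.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Sample ARGV-style list: two -v flags, an input option, one non-option.
my @argv = ('-v', '-v', '--input', 'data.fsdb', 'colname');
my ($verbose, $input) = (0, undef);

my $success = GetOptionsFromArray(
    \@argv,
    'v+'        => \$verbose,    # each -v increments the counter
    'i|input=s' => \$input,      # -i or --input takes a string value
);

# Recognized options are removed from @argv; non-options remain.
print "verbose=$verbose input=$input remaining=@argv\n";
# prints: verbose=2 input=data.fsdb remaining=colname
```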
parse_sort_option
$fsdb_io = $filter->parse_sort_option($option_name, $target);
This helper function handles sorting options and column names as described in dbsort(1). We normalize long sort options to unbundled short options and accumulate them in $self->{_sort_argv}.
parse_io_option
$fsdb_io = $filter->parse_io_option($io_direction, $option_name, $target);
This helper function handles --input or --output options, without doing any setup.
It fills in $self->{_$IO_DIRECTION} with the resulting object, which is either a file handle or an Fsdb::Filter::Pipeline object, and expects finish_io_option to convert this token into a full Fsdb::IO object.
$IO_DIRECTION is usually input or output, but it can also be inputs (with an "s") when multiple input sources are allowed.
finish_one_io_option
$fsdb_io = $filter->finish_one_io_option($io_direction, $token, @fsdb_args);
This helper function finishes setting up a Fsdb::IO object in $IO_DIRECTION, using $TOKEN as information and @FSDB_ARGS as parameters. It creates the actual Fsdb::IO object, opens the file (or whatever), and reads the header. It returns the $FSDB_IO object.
$IO_DIRECTION must be "input" or "output".
Since it does IO, finish_one_io_option should only be called from setup, not parse_options.
Can be called once per IO stream.
finish_io_option
$filter->finish_io_option($io_direction, @fsdb_args);
This helper function finishes setting up a Fsdb::IO object in $IO_DIRECTION, using @FSDB_ARGS as parameters. It creates the actual Fsdb::IO objects, opens the files (or whatever), and reads the headers. The resulting Fsdb::IO objects are built from $self->{_$IO_DIRECTION} and are left in $self->{_in} (or _out or @_ins).
$IO_DIRECTION must be "input", "inputs" or "output".
Since it does IO, finish_io_option should only be called from setup, not parse_options.
Can be called once per IO stream.
No return value.
direction_to_stdio
$fh = direction_to_stdio($direction)
Private internal routine. Returns a filehandle for STDIN or STDOUT, depending on whether $DIRECTION is 'input' or 'output'.
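A minimal sketch of such a mapping follows. The name stdio_for_direction is hypothetical, and the choice to die on any other direction is an assumption of this example; the source does not say how the real routine treats other values.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the direction-to-filehandle mapping.
sub stdio_for_direction {
    my ($direction) = @_;
    return \*STDIN  if $direction eq 'input';
    return \*STDOUT if $direction eq 'output';
    # Assumption: anything else is treated as a caller error.
    die "unknown direction: $direction\n";
}

my $out = stdio_for_direction('output');
print {$out} "written via the handle chosen for 'output'\n";
```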
finish_fh_io_option
$filter->finish_fh_io_option($io_direction);
This helper function creates a filehandle in $IO_DIRECTION. Compare to finish_io_option, which creates a Fsdb::IO object. It creates the actual IO::File objects and opens the files (or whatever). The filehandle is built from $self->{_$IO_DIRECTION} and is left in $self->{_in} (or _out).
$IO_DIRECTION must be "input" or "output".
This function does no IO.
No return value.
setup
$filter->setup();
Do any setup that requires minimal IO (for example, reading and parsing headers).
Called exactly once.
run
$filter->run();
Execute the body, typically iterating over the input rows.
Called exactly once.
compute_program_log
$log = $filter->compute_program_log();
Compute and return the log entry for a program.
finish
$filter->finish();
Write out any trailing comments and close output.
setup_run_finish
$filter->setup_run_finish();
Shorthand for doing everything needed to run a command straightaway.
info
$filter->info($INFOTYPE)
Return information about what the filter does. Infotypes:
- input_type Types of input accepted. Raw types are: "fsdbtext", "fsdbobj", "fsdb*", "text", or "none".
- output_type Type of output produced. Same format as input_type.
- input_count Number of input streams (usually 1).
- output_count Number of output streams (usually 1).
CLASS-SPECIFIC UTILITY ROUTINES
Filter has some class-specific utility routines in it. (I.e., they know about $self.)
create_pass_comments_sub
$filter->create_pass_comments_sub
or
$filter->create_pass_comments_sub('_VALUE');
Creates a code block suitable for passing to Fsdb::IO::Readers -comment_handler that passes comments through to $self->{_out}. Or, with the optional argument, through $self->{_VALUE}.
create_tolerant_pass_comments_sub
$filter->create_tolerant_pass_comments_sub
or
$filter->create_tolerant_pass_comments_sub('_VALUE');
Like $self->create_pass_comments_sub, but this version tolerates the output not being opened. In those cases, comments are discarded. Warning: use carefully to guarantee consistent results.
A symptom requiring tolerance is to get an error like "Can't call method "write_raw" on an undefined value at /usr/lib/perl5/vendor_perl/5.10.0/Fsdb/Filter.pm line 678." (which will be the sub create_pass_comments_sub ($;$) line in create_pass_comments.)
create_delay_comments_sub
$filter->create_delay_comments_sub($optional_value);
Creates a code block suitable for passing to Fsdb::IO::Readers -comment_handler that buffers comments for automatic output (from $self->finish) after all other IO. No output occurs until finish() is called, at which time $self->{_out} must be a live Fsdb object.
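The buffering idea can be sketched with a plain closure. This is an illustration only: the arrays @output and @delayed stand in for the output stream and the filter's internal buffer, and nothing here uses Fsdb::IO.

```perl
use strict;
use warnings;

my @output;     # stands in for the output stream
my @delayed;    # comments held back until the end

# Comment handler that buffers each comment instead of emitting it.
my $delay_sub = sub { push @delayed, $_[0] };

# Comments arrive interleaved with data rows.
$delay_sub->('# first comment');
push @output, 'row 1';
$delay_sub->('# second comment');
push @output, 'row 2';

# What finish() would do in this sketch: flush comments after all data.
push @output, @delayed;

print "$_\n" for @output;
```

With this arrangement the data rows come out first and the comments follow in their original order.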
create_compare_code
$filter->create_compare_code($a_fsdb, $b_fsdb, $a_fref_name, $b_fref_name);
Write compare code based on sort-style options stored in $self->{_sort_argv}. $A_FSDB and $B_FSDB are the Fsdb::IO objects that define the schemas for the two objects. We assume the variables $a and $b point to arefs; these names can be overridden by specifying $A_FREF_NAME and $B_FREF_NAME.
Returns undef if there are no fields in $self->{_sort_argv}.
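As a rough illustration of generating compare code from sort-style arguments, consider the sketch below. The helper build_compare_code and the column-index hash are hypothetical, and the sketch assumes that the dbsort(1) options -n/-N (numeric/lexical) and -r/-R (descending/ascending) apply to the column names that follow them. It is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: turn sort-style arguments into the source text of
# a comparison over the array refs $a and $b.
sub build_compare_code {
    my ($col_index, @sort_argv) = @_;
    my ($numeric, $reverse) = (0, 0);
    my @terms;
    for my $arg (@sort_argv) {
        if    ($arg eq '-n') { $numeric = 1 }
        elsif ($arg eq '-N') { $numeric = 0 }
        elsif ($arg eq '-r') { $reverse = 1 }
        elsif ($arg eq '-R') { $reverse = 0 }
        else {
            my $i = $col_index->{$arg};
            die "unknown column $arg\n" unless defined $i;
            my $op = $numeric ? '<=>' : 'cmp';
            # Swapping the operands reverses the sort direction.
            my ($l, $r) = $reverse ? ('$b', '$a') : ('$a', '$b');
            push @terms, sprintf('(%s->[%d] %s %s->[%d])', $l, $i, $op, $r, $i);
        }
    }
    return undef unless @terms;    # no fields: nothing to compare
    return join(' || ', @terms);
}

my %cols = (name => 0, score => 1);
my $code = build_compare_code(\%cols, '-n', '-r', 'score', '-N', '-R', 'name');
my $cmp  = eval "sub { $code }" or die $@;

my @in   = (['ann', 3], ['bob', 7], ['cat', 3]);
my @rows = sort $cmp @in;
print join(' ', map { $_->[0] } @rows), "\n";
# prints: bob ann cat
```

Rows are ordered by descending numeric score, with ties broken by ascending name.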
numeric_formatting
$out = $self->numeric_formatting($x)
Display a floating point number $x using $self->{_format}, handling possible non-numeric "-" as a special case.
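A minimal sketch of this behavior, written as a standalone function rather than a method: format_number is a hypothetical name, the format is passed in explicitly instead of being read from $self->{_format}, and returning "-" unchanged is an assumption about what the special case does.

```perl
use strict;
use warnings;

# Hypothetical standalone version; the real method uses $self->{_format}.
sub format_number {
    my ($format, $x) = @_;
    return $x if $x eq '-';         # assumption: pass the placeholder through
    return sprintf($format, $x);    # otherwise apply the printf-style format
}

print format_number('%.3f', 3.14159), "\n";    # prints 3.142
print format_number('%.3f', '-'), "\n";        # prints -
```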
setup_exactly_two_inputs
$self->setup_exactly_two_inputs
Ensure that there are exactly two input streams. Common to dbmerge and dbjoin.
NON-CLASS UTILITY ROUTINES
Filter also has some utility routines that are not part of the class structure. They are not exported.
(none currently)