NAME

Fsdb::Filter - base class for Fsdb filters

DESCRIPTION

Fsdb::Filter is the virtual base class for Fsdb filters.

Users will typically invoke individual programs via the command line (for example, see dbcol(1)) or string together several in a Perl program as described in dbpipeline(3).

For new Filter developers, internal processing is:

    new
	set_defaults
	parse_options
	autorun if desired
    parse_options   # optionally called additional times
    setup           # does IO on header
    run             # does IO on data
    finish          # any shutdown
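
The call order above can be sketched with a toy class. This is a minimal illustration, not Fsdb's implementation: ToyFilter, its _log field, and its hand-rolled option loop are hypothetical, and only the method names come from the list above.

```perl
use strict;
use warnings;

# ToyFilter is a hypothetical stand-in; it is not part of Fsdb.
package ToyFilter;

sub new {
    my ($class, @argv) = @_;
    my $self = bless {}, $class;
    $self->set_defaults;
    $self->parse_options(@argv);
    $self->setup_run_finish if $self->{_autorun};   # autorun if desired
    return $self;
}

sub set_defaults {
    my ($self) = @_;
    $self->{_autorun} = 0;
    $self->{_log}     = [];    # records which stages ran (illustration only)
    return;
}

sub parse_options {
    my ($self, @argv) = @_;
    # No IO here: only record what was asked for.
    for my $arg (@argv) {
        $self->{_autorun} = 1 if $arg eq '--autorun';
        $self->{_autorun} = 0 if $arg eq '--noautorun';
    }
    return;
}

sub setup  { my ($self) = @_; push @{ $self->{_log} }, 'setup';  return }
sub run    { my ($self) = @_; push @{ $self->{_log} }, 'run';    return }
sub finish { my ($self) = @_; push @{ $self->{_log} }, 'finish'; return }

sub setup_run_finish {
    my ($self) = @_;
    $self->setup;
    $self->run;
    $self->finish;
    return;
}

package main;

# Without autorun, the caller drives the stages.
my $manual = ToyFilter->new('--noautorun');
$manual->setup_run_finish;

# With autorun, the stages have already run by the time new returns.
my $auto = ToyFilter->new('--autorun');
print join(',', @{ $auto->{_log} }), "\n";    # prints "setup,run,finish"
```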

In addition, the info method returns metadata about a given filter.

FUNCTIONS

new

$fsdb = new Fsdb::Filter;

Create a new filter object, calling set_defaults and parse_options. A user program will call a specific filter (say Fsdb::Filter::dbcol) to do processing. See also dbpipeline for aliases that remove the wordiness.

post_new

$filter->post_new();

Called when the subclass is done with new, giving Fsdb::Filter a chance to autorun.

set_defaults

$filter->set_defaults();

Set up object defaults. Called once during new.

Fsdb::Filter::set_defaults does some general setup, tracking module invocation and preparing for one input and output stream.

set_default_tmpdir

$filter->set_default_tmpdir

Figure out a tmpdir, from environment variables if necessary.
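
A rough sketch of this kind of fallback follows. The text above does not say which environment variables are consulted, so the use of TMPDIR here is an assumption, and pick_tmpdir is a hypothetical helper rather than the module's own code.

```perl
use strict;
use warnings;
use File::Spec;

# pick_tmpdir is hypothetical; the lookup order is an assumption.
sub pick_tmpdir {
    my ($explicit) = @_;
    # An explicitly configured directory wins.
    return $explicit if defined $explicit && length $explicit;
    # Otherwise consult the environment (assumed variable: TMPDIR).
    return $ENV{TMPDIR} if defined $ENV{TMPDIR} && length $ENV{TMPDIR};
    # Finally fall back to the core module's platform default.
    return File::Spec->tmpdir;
}

print pick_tmpdir('/var/tmp'), "\n";    # prints "/var/tmp"
```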

parse_options

$filter->parse_options(@ARGV);

parse_options is called one or more times to parse ARGV-style options. It should not do any IO or take any irreversible actions; defer those to setup.

Fsdb::Filter::parse_options does no work; the subclass is expected to call Fsdb::Filter::get_options() for all arguments.

Most modules, including this one, implement the standard fsdb options listed below:

-d

Enable debugging output.

-i or --input InputSource

Read from InputSource, typically a file name, or - for standard input, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.

-o or --output OutputDestination

Write to OutputDestination, typically a file name, or - for standard output, or (if in Perl) an IO::Handle, Fsdb::IO, or Fsdb::BoundedQueue object.

--autorun or --noautorun

By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl.

--noclose

By default, programs close their output when done. For some cases where objects are used internally, --noclose may be used to leave output open for further I/O. (This option is only supported by some filters.)

--saveoutput $OUT_REF

By default, programs close their output when done. With this option, programs in Perl can have a subprogram create an output reference and return it to the caller in $OUT_REF. The caller can then use it for further I/O. (This option is only supported by some filters.)

--help

Show help.

--man

Show full manual.

parse_target_column

$self->parse_target_column(\@argv);

A helper function: allow one column to be specified as the _target_column.

get_options

$success = $filter->get_options(\@argv, "v+" => \$verbose, ...)

get_options is just like Getopt::Long's GetOptions, but takes the argument list as the first argument. This list is modified and any non-options are returned. It also saves _orig_argv in itself.
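
The list-first calling style can be illustrated with Getopt::Long's own GetOptionsFromArray, which likewise takes the argument array as its first parameter and leaves the non-options in it. This sketch shows only that behavior; it is not get_options itself, and the option names and values are made up for the example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Example arguments (invented for illustration).
my @argv = ('-v', '-v', '--input', 'data.fsdb', 'colname');
my ($verbose, $input) = (0, undef);

# Options are consumed from @argv; anything left over is a non-option.
my $success = GetOptionsFromArray(
    \@argv,
    'v+'        => \$verbose,
    'i|input=s' => \$input,
);

print "success=$success verbose=$verbose input=$input rest=@argv\n";
# prints "success=1 verbose=2 input=data.fsdb rest=colname"
```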

parse_sort_option

$fsdb_io = $filter->parse_sort_option($option_name, $target);

This helper function handles sorting options and column names as described in dbsort(1). We normalize long sort options to unbundled short options and accumulate them in $self->{_sort_argv}.

parse_io_option

$fsdb_io = $filter->parse_io_option($io_direction, $option_name, $target);

This helper function handles --input or --output options, without doing any setup.

It fills in $self->{_$IO_DIRECTION} with the resulting object, which is either a file handle or an Fsdb::Filter::Pipeline object, and expects finish_io_option to convert this token into a full Fsdb::IO object.

$IO_DIRECTION is usually input or output, but it can also be inputs (with an "s") when multiple input sources are allowed.

finish_one_io_option

$fsdb_io = $filter->finish_one_io_option($io_direction, $token, @fsdb_args);

This helper function finishes setting up a Fsdb::IO object in $IO_DIRECTION, using $TOKEN as information and @FSDB_ARGS as parameters. It creates the actual Fsdb::IO objects, opens the files (or whatever), and reads the headers. It returns the resulting $FSDB_IO object.

$IO_DIRECTION must be "input" or "output".

Since it does IO, finish_one_io_option should only be called from setup, not parse_options.

Can be called once per IO stream.

finish_io_option

$filter->finish_io_option($io_direction, @fsdb_args);

This helper function finishes setting up a Fsdb::IO object in $IO_DIRECTION, using @FSDB_ARGS as parameters. It creates the actual Fsdb::IO objects, opens the files (or whatever), and reads the headers. The resulting Fsdb::IO objects are built from $self->{_$IO_DIRECTION} and are left in $self->{_in} (or _out or @_ins).

$IO_DIRECTION must be "input", "inputs" or "output".

Since it does IO, finish_io_option should only be called from setup, not parse_options.

Can be called once per IO stream.

No return value.

direction_to_stdio

$fh = direction_to_stdio($direction)

Private internal routine. Gives a filehandle for STDIN or STDOUT based on whether $DIRECTION is 'input' or 'output'.

finish_fh_io_option

$filter->finish_fh_io_option($io_direction);

This helper function creates a filehandle in $IO_DIRECTION. Compare to finish_io_option, which creates a Fsdb::IO object. It creates the actual IO::File objects and opens the files (or whatever). The filehandle is built from $self->{_$IO_DIRECTION} and is left in $self->{_in} (or _out).

$IO_DIRECTION must be "input" or "output".

This function does no IO.

No return value.

setup

$filter->setup();

Do any setup that requires minimal IO (for example, reading and parsing headers).

Called exactly once.

run

$filter->run();

Execute the body, typically iterating over the input rows.

Called exactly once.

compute_program_log

$log = $filter->compute_program_log();

Compute and return the log entry for a program.

finish

$filter->finish();

Write out any trailing comments and close output.

setup_run_finish

$filter->setup_run_finish();

Shorthand for doing everything needed to run a command straightaway.

info

$filter->info($INFOTYPE)

Return information about what the filter does. Infotypes:

input_type Types of input accepted. Raw types are: "fsdbtext", "fsdbobj", "fsdb*", "text", or "none".
output_type Type of output produced. Same format as input_type.
input_count Number of input streams (usually 1).
output_count Number of output streams (usually 1).

CLASS-SPECIFIC UTILITY ROUTINES

Filter has some class-specific utility routines in it. (I.e., they know about $self.)

create_pass_comments_sub

$filter->create_pass_comments_sub
or
$filter->create_pass_comments_sub('_VALUE');

Creates a code block suitable for passing to Fsdb::IO::Reader's -comment_handler that passes comments through to $self->{_out}, or, with the optional argument, to $self->{_VALUE}.

create_tolerant_pass_comments_sub

$filter->create_tolerant_pass_comments_sub
or
$filter->create_tolerant_pass_comments_sub('_VALUE');

Like $self->create_pass_comments_sub, but this version tolerates the output not being opened. In those cases, comments are discarded. Warning: use carefully to guarantee consistent results.

A symptom requiring tolerance is to get an error like "Can't call method "write_raw" on an undefined value at /usr/lib/perl5/vendor_perl/5.10.0/Fsdb/Filter.pm line 678." (which will be the sub create_pass_comments_sub ($;$) line in create_pass_comments.)
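
The difference between the strict and tolerant handlers can be sketched with plain closures. Everything here is illustrative: ToySink stands in for an output object with a write_raw method, and the two make_* helpers are hypothetical names, not the module's own code.

```perl
use strict;
use warnings;

# ToySink is a hypothetical stand-in for an opened output object.
package ToySink;
sub new { my ($class) = @_; return bless { lines => [] }, $class }
sub write_raw { my ($self, $text) = @_; push @{ $self->{lines} }, $text; return }

package main;

# Strict handler: dies if the output slot is still undef when a comment arrives.
sub make_pass_comments_sub {
    my ($self, $slot) = @_;
    $slot //= '_out';
    return sub { $self->{$slot}->write_raw(@_); return };
}

# Tolerant handler: silently discards comments while the slot is undef.
sub make_tolerant_pass_comments_sub {
    my ($self, $slot) = @_;
    $slot //= '_out';
    return sub {
        $self->{$slot}->write_raw(@_) if defined $self->{$slot};
        return;
    };
}

my $filter   = { _out => undef };
my $tolerant = make_tolerant_pass_comments_sub($filter);
$tolerant->("# early comment\n");    # discarded: output not open yet
$filter->{_out} = ToySink->new;
$tolerant->("# late comment\n");     # passed through
print scalar @{ $filter->{_out}{lines} }, "\n";    # prints "1"
```

The discarded early comment is the inconsistency the warning above refers to: whether a comment survives depends on when the output happens to be opened.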

create_delay_comments_sub

$filter->create_delay_comments_sub($optional_value);

Creates a code block suitable for passing to Fsdb::IO::Reader's -comment_handler that buffers comments for automatic output (from $self->finish) after all other IO. No output occurs until finish() is called, at which time $self->{_out} must be a live Fsdb object.
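
The buffering idea can be sketched as a closure that collects comments and a finish step that flushes them. This is only an illustration under assumptions: ToySink, the _delayed field, make_delay_comments_sub, and toy_finish are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# ToySink is a hypothetical stand-in for a live output object.
package ToySink;
sub new { my ($class) = @_; return bless { lines => [] }, $class }
sub write_raw { my ($self, $text) = @_; push @{ $self->{lines} }, $text; return }

package main;

# Returns a handler that only buffers; nothing is written yet.
sub make_delay_comments_sub {
    my ($self) = @_;
    $self->{_delayed} = [];
    return sub { push @{ $self->{_delayed} }, @_; return };
}

# Flushes the buffered comments; the output must be live by now.
sub toy_finish {
    my ($self) = @_;
    $self->{_out}->write_raw($_) for @{ $self->{_delayed} };
    $self->{_delayed} = [];
    return;
}

my $filter  = { _out => ToySink->new };
my $handler = make_delay_comments_sub($filter);
$handler->("# seen while reading\n");         # buffered
$filter->{_out}->write_raw("data row\n");     # data goes out first
toy_finish($filter);                          # comment follows the data
print @{ $filter->{_out}{lines} };
```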

create_compare_code

$filter->create_compare_code($a_fsdb, $b_fsdb, $a_fref_name, $b_fref_name);

Write compare code based on sort-style options stored in $self->{_sort_argv}. $A_FSDB and $B_FSDB are the Fsdb::IO objects that define the schemas for the two objects. We assume the variables $a and $b point to arefs; these names can be overridden by specifying $A_FREF_NAME and $B_FREF_NAME.

Returns undef if there are no fields in $self->{_sort_argv}.
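
The general technique of generating comparison code as a string and compiling it can be sketched as follows. This is a simplified illustration, not the module's code: the key specification (column index, type, descending flag) is a hypothetical format, whereas the real method works from sort options and the schemas of the two Fsdb::IO objects.

```perl
use strict;
use warnings;

# Hypothetical key spec: [column index, 'numeric' or 'lexical', descending?].
sub make_compare_code {
    my ($a_name, $b_name, @keys) = @_;
    return undef unless @keys;    # no sort fields: nothing to compare
    my @terms;
    for my $key (@keys) {
        my ($col, $type, $descending) = @$key;
        my $op = $type eq 'numeric' ? '<=>' : 'cmp';
        # Swap the operands to reverse the order.
        my ($l, $r) = $descending ? ($b_name, $a_name) : ($a_name, $b_name);
        push @terms, sprintf('($%s->[%d] %s $%s->[%d])', $l, $col, $op, $r, $col);
    }
    # Later keys only matter when earlier ones compare equal.
    return join(' || ', @terms);
}

# Sort by column 1 numerically descending, then column 0 lexically.
my $code = make_compare_code('a', 'b', [1, 'numeric', 1], [0, 'lexical', 0]);
my $compare = eval "sub { $code }" or die $@;

my @rows   = (['bob', 3], ['al', 7], ['cy', 3]);
my @sorted = sort $compare @rows;
print join(' ', map { $_->[0] } @sorted), "\n";    # prints "al bob cy"
```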

numeric_formatting

$out = $self->numeric_formatting($x)

Display a floating point number $x using $self->{_format}, handling possible non-numeric "-" as a special case.
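
A minimal sketch of this kind of formatting follows. It assumes the "-" placeholder is passed through unchanged, which the description above does not spell out, and format_number is a hypothetical free function that takes the format explicitly instead of reading $self->{_format}.

```perl
use strict;
use warnings;

# format_number is hypothetical; pass-through of "-" is an assumption.
sub format_number {
    my ($format, $x) = @_;
    return $x if $x eq '-';         # non-numeric placeholder: leave as-is
    return sprintf($format, $x);    # ordinary printf-style formatting
}

print format_number('%.2f', 3.14159), "\n";    # prints "3.14"
print format_number('%.2f', '-'), "\n";        # prints "-"
```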

setup_exactly_two_inputs

$self->setup_exactly_two_inputs

Ensure that there are exactly two input streams. Common to dbmerge and dbjoin.

NON-CLASS UTILITY ROUTINES

Filter also has some utility routines that are not part of the class structure. They are not exported.

(none currently)