Name
Schedule::Depend
Synopsis
Single argument is assumed to be a schedule, either newline delimited text or array referent:
my $q = Scheduler->prepare( "newline delimited schedule" );
my $q = Scheduler->prepare( [ qw(array ref of schedule lines) ] );
Multiple items are assumed to be a hash, which must include the "depend" argument.
my $q = Scheduler->prepare( depend => "foo:bar", verbose => 1 );
Object can be saved and used to execute the schedule or the schedule can be executed (or debugged) directly:
$q->debug;
$q->execute;
Scheduler->prepare( depend => $depend)->debug;
Scheduler->prepare( depend => $depend, verbose => 1 )->execute;
Since the debugger exits nonzero on a bogus queue, debug and execute can be chained:
Scheduler->prepare( depend => $depend)->debug->execute;
The "unalias" method can be safely overloaded for specialized command construction at runtime; precheck can be overloaded in cases where the status of a job can be determined easily (e.g., via /proc). A "cleanup" method may be provided, and will be called after the job is complete.
Arguments
- sched
The dependencies are described much like a Makefile, with targets waiting for other jobs to complete on the left and the dependencies on the right. Schedule lines can have single dependencies like:
waits_for : depends_on
or multiple dependencies:
wait1 wait2 : dep1 dep2 dep3
or no dependencies:
runs_immediately :
Dependencies without a wait_for argument are an error (e.g., ": foo" will croak during prepare).
The schedule can be passed as a single argument (string or referent) or with the "depend" key as a hash value:
depend => [ schedule as separate lines in an array ]
or
depend => "newline delimited schedule, one item per line";
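The rule syntax above can be sketched as a small parser. This is only an illustration of the line format, not the module's own code: the name parse_schedule and the returned hash layout are assumptions, and alias lines are simply skipped.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical helper, not part of the module: turn schedule text into
# a hash mapping each job to the list of jobs it waits for.
sub parse_schedule
{
    my $text = shift;
    my %waits_for;

    for my $line ( split /\n/, $text )
    {
        $line =~ s/#.*//;               # strip comments
        next unless $line =~ /\S/;      # skip blank lines
        next unless $line =~ /:/;       # this sketch only handles rules

        # limit of 2 keeps the empty right side of "runs_immediately :"
        my ( $left, $right ) = split /:/, $line, 2;
        my @targets = split ' ', $left;
        my @depends = split ' ', $right;

        @targets or croak "Bogus rule, no target: '$line'";

        # dependencies become runnable jobs even without their own rule.
        $waits_for{$_} ||= [] for @depends;
        push @{ $waits_for{$_} }, @depends for @targets;
    }

    \%waits_for
}
```

A rule such as "wait1 wait2 : dep1 dep2 dep3" yields entries for both targets listing all three dependencies, while ": foo" croaks because nothing is waiting on the left side.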
- verbose
Turns on verbose execution for preparation and execution.
All output controlled by verbosity is written to STDOUT; errors, roadkill, etc., are written to STDERR.
verbose == 0 only displays a few fixed preparation and execution messages. This is mainly intended for production systems with large numbers of jobs, where searching a large output would be troublesome.
verbose == 1 displays the input schedule contents during preparation and fork/reap messages as jobs are started.
verbose == 2 is intended for monitoring automatically generated queues and debugging new schedules. It displays the input lines as they are processed, forks/reaps, exit status and results of unalias calls before the jobs are exec-ed.
verbose can also be specified in the schedule, with schedule settings overriding the args. If no verbose setting is made then debug runs with verbose == 1, execution with 0.
- debug
Runs the full prepare but does not fork any jobs; pidfiles get a "Debugging $job" entry in them and an exit of 1. This can be used to test the schedule or debug side-effects of overloaded methods. See also: verbose, above.
Description
Parallel scheduler with simplified make syntax for job dependencies and substitutions. Like make, targets have dependencies that must be completed before they can be run. Unlike make, there are no statements for the targets; the targets are themselves executables.
The use of pidfiles with status information allows running the queue in "restart" mode. This skips any jobs with zero exit status in their pidfiles, stops and re-runs or waits for any running jobs and launches anything that wasn't started. This should allow a schedule to be re-run with a minimum of overhead.
Each job is executed via fork/exec. The parent writes out a pidfile that initially holds two lines: the pid and the command line. It then closes the pidfile. After the parent detects the child process exiting, the exit status is written to the file and the file is closed.
The pidfile serves three purposes:
- On restart any leftover pidfiles with a zero exit status in them can be skipped.
- Any process used to monitor the result of a job can simply perform a blocking I/O for the exit status to know when the job has completed. This avoids the monitoring system having to poll the status.
- Tracking the empty pidfiles gives a list of the pending jobs. This is mainly useful with large queues where running in verbose mode would generate excessive output.
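The pidfile lifecycle described above can be sketched roughly as follows. The sub names are hypothetical and the three-line layout (pid, command line, exit status) is an assumption taken from the description; the module's own file handling may differ.

```perl
use strict;
use warnings;

# Hypothetical helpers; the names do not come from the module.

# Written by the parent right after the fork: pid and command line.
sub write_pidfile_start
{
    my ( $path, $pid, $cmd ) = @_;
    open my $fh, '>', $path or die "$path: $!";
    print $fh "$pid\n$cmd\n";
    close $fh or die "$path: $!";
}

# Appended by the parent once the child has been reaped.
sub write_pidfile_exit
{
    my ( $path, $exit ) = @_;
    open my $fh, '>>', $path or die "$path: $!";
    print $fh "$exit\n";
    close $fh or die "$path: $!";
}

# True if the job finished with a zero exit and can be skipped on restart.
sub can_skip
{
    my $path = shift;
    open my $fh, '<', $path or return 0;
    chomp( my @lines = <$fh> );
    close $fh;
    @lines >= 3 && $lines[2] =~ /^\d+$/ && $lines[2] == 0 ? 1 : 0
}
```

A restart pass under these assumptions would call can_skip on each job's pidfile and re-run anything for which it returns false, including jobs whose pidfile is missing or has no exit status yet.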
The configuration syntax is make-like. The two sections give aliases and the schedule itself. Aliases and targets look like make rules:
target = expands_to
target : dependency
Example:
a = /somedir/abjob.ksh
b = /somedir/another.ksh
c = /somedir/loader
a : /somedir/startup.ksh
b : /somedir/startup.ksh
c : a b
/somedir/validate : a b c
This uses the path expansions for "a", "b" and "c" in the targets and rules: /somedir/abjob.ksh runs only after /somedir/startup.ksh has exited zero, and the same holds for /somedir/another.ksh. The file /somedir/loader runs only after both abjob.ksh and another.ksh have finished, and the validate program runs only after the other three have finished.
The main use of aliases is to simplify re-use of scripts. One example is the case where the same code gets run multiple times with different arguments:
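To make the ordering concrete, the example schedule can be walked by hand. The sketch below is not how the module runs jobs (it starts each job as soon as its own dependencies have exited, rather than in fixed batches); it only groups the jobs into successive "waves" to show which ones become runnable together. The names expand and waves, and the hard-coded hashes, are assumptions for illustration.

```perl
use strict;
use warnings;

# The aliases and rules from the example above, written out by hand.
my %alias =
(
    a => '/somedir/abjob.ksh',
    b => '/somedir/another.ksh',
    c => '/somedir/loader',
);

my %waits_for =
(
    '/somedir/startup.ksh' => [],
    a                      => [ '/somedir/startup.ksh' ],
    b                      => [ '/somedir/startup.ksh' ],
    c                      => [ qw( a b ) ],
    '/somedir/validate'    => [ qw( a b c ) ],
);

# Use the alias if there is one, otherwise the job name is the command.
sub expand
{
    my $job = shift;
    exists $alias{$job} ? $alias{$job} : $job
}

# Group jobs into waves: each wave holds the jobs whose dependencies
# have all completed in earlier waves.
sub waves
{
    my $rules   = shift;
    my %pending = %$rules;
    my %done;
    my @waves;

    while( %pending )
    {
        my @ready = grep
        {
            my $job = $_;
            ! grep { ! $done{$_} } @{ $pending{$job} };
        }
        sort keys %pending;

        @ready or die "Deadlock: circular dependency";

        delete @pending{ @ready };
        $done{$_} = 1 for @ready;
        push @waves, [ map { expand( $_ ) } @ready ];
    }

    @waves
}

print join( ' ', @$_ ), "\n" for waves( \%waits_for );
```

With the example data this gives four waves: startup.ksh alone, then abjob.ksh and another.ksh together, then loader, then validate.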
# comments are introduced by '#', as usual.
# blank lines are also ignored.
a = /somedir/process 1
b = /somedir/process 2
c = /somedir/process 3
d = /somedir/process 4
e = /somedir/process 5
f = /somedir/process 6
a : /otherdir/startup # startup isn't aliased
b : /otherdir/startup
c : /otherdir/startup
d : a b
e : b c
f : d e
cleanup : a b c d e f
This allows any variety of arguments to be run for the a-f code simply by changing the aliases; the dependencies remain the same.
Another example is a case of loading fact tables after the dimensions complete:
fact1 fact2 fact3 : dim1 dim2 dim3
This loads all of the dimensions at once and the facts afterward. Note that stub entries are not required for the dimensions; they are added as runnable jobs when the rule is read.
Overloading the "unalias" method to properly select the shell command for loading the files would leave this as the entire schedule. An example overloaded method would look like:
# $tmudir and $logdir are assumed to be set elsewhere in the package;
# croak comes from Carp.
sub unalias
{
    my $que      = shift;
    my $diskfile = shift;

    my $tmufile  = "$tmudir/$diskfile.tmu";
    -e $tmufile or croak "$$: Missing: $tmufile";

    my $logfile  = "$logdir/$diskfile.log";

    # hand back the completed tmu command.
    "rb_tmu $tmufile \$RB_USER < $diskfile > $logfile 2>&1"
}
A more flexible unalias might decide if the file should be unzipped and piped or simply redirected and whether to zip the logfile as it is processed.
Since the executed code is fork-execed it can contain any useful environment variables also:
a = process --seq 1 --foo=$BAR
will interpolate $BAR at fork-time in the child process (i.e., by the shell handling the exec portion).
The scheduling module provides methods for managing the preparation, validation and execution of schedule objects. Since these are separated they can be manipulated by the caller as necessary.
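The effect can be seen outside the scheduler with a small sketch. It assumes a POSIX shell is available and uses echo as a stand-in for the real job; run_with_bar is a hypothetical name. The point is that Perl leaves the single-quoted command alone and the shell expands $BAR from the environment when the command runs.

```perl
use strict;
use warnings;

# Hypothetical demonstration, not part of the module.
sub run_with_bar
{
    my $value = shift;

    # visible to the child through the environment.
    local $ENV{BAR} = $value;

    # single quotes: Perl does not interpolate $BAR here.
    my $cmd = 'echo --foo=$BAR';

    # the '$' forces the command through the shell, which expands it.
    my $out = qx{$cmd};
    chomp $out;
    $out
}

print run_with_bar( 'baz' ), "\n";
```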
One example would be to read in a set of schedules, run the first one to completion, then modify the second one based on the output of the first. This might happen when jobs are used to load data that is not always present. The first schedule would run the data extract/import/tally graphs. Code could then check if the tally shows any work for the intermittent data and stub out the processing of it by aliasing the job to "/bin/true":
/somedir/somejob.ksh = /bin/true
prepare = /somedir/extract.ksh
load = /somedir/batchload.ksh
/somedir/somejob.ksh : prepare
/somedir/ajob.ksh : prepare
/somedir/bjob.ksh : prepare
load : /somedir/somejob.ksh /somedir/ajob.ksh /somedir/bjob.ksh
In this case /somedir/somejob.ksh will be stubbed to exit zero immediately. This will not interfere with any of the scheduling patterns, just reduce any delays in the schedule.
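One way to add such stubs from the calling code is to prepend alias lines to the schedule text before it is prepared. The helper below is a hypothetical sketch (stub_jobs is not a module method), and it assumes the schedule does not already define an alias for the same job.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the schedule text with the
# given jobs aliased to /bin/true so that they exit zero immediately.
sub stub_jobs
{
    my ( $schedule, @jobs ) = @_;
    join "\n", ( map { "$_ = /bin/true" } @jobs ), $schedule
}

my $schedule = join "\n",
    'prepare = /somedir/extract.ksh',
    '/somedir/somejob.ksh : prepare',
    'load : /somedir/somejob.ksh';

print stub_jobs( $schedule, '/somedir/somejob.ksh' ), "\n";
```

The resulting text could then be handed to prepare via the "depend" argument in the same way as any other newline delimited schedule.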
Known Bugs
Running $q->debug then $q->execute( ... restart => 1 ) will result in nothing being executed. The restart option will check, find that all of the
Author
Steven Lembark, Knightsbridge Solutions slembark@knightsbridge.com
Copyright
(C) 2001-2002 Steven Lembark, Knightsbridge Solutions
This code is released under the same terms as Perl itself. Please see the Perl-5.6.1 distribution (or later) for a full description.
In any case, this code is released as-is, with no implied warranty of fitness for a particular purpose or warranty of merchantability.
SEE ALSO
perl(1).