NAME
Parallel::ForkControl - Finer grained control of processes on a Unix System
SYNOPSIS
    use Parallel::ForkControl;

    my $forker = Parallel::ForkControl->new(
        WatchCount => 1,
        MaxKids    => 50,
        MinKids    => 5,
        WatchLoad  => 1,
        MaxLoad    => 8.00,
        Name       => 'My Forker',
        Accounting => 1,    # required for get_results()
        TrackArgs  => 1,    # optional, records the 'signature' field
        Code       => \&mysub,
    );

    my @hosts  = qw/host1 host2 host3 host4 host5/;
    my $altSub = sub { my $t = shift; ... };

    foreach my $host (@hosts) {
        if ( $host eq 'alternateHost' ) {
            $forker->run( $altSub, $host );
        }
        else {
            $forker->run($host);
        }
    }

    $forker->waitforkids();    # wait for all children to finish

    my $results = $forker->get_results();    # Get the Return Codes from Children
    # $results = {
    #     '29786' => {    # Kid PID
    #         'status'    => 'string',
    #         'exitcode'  => int,
    #         'return'    => $scalarCopyofReturnValue,
    #         'signature' => $scalarFreezeOfArguments,
    #     }, ...
    # };

    $forker->clear_results();    # Reset the Results Tracker
    .....
DESCRIPTION
Parallel::ForkControl introduces a new and simple way to deal with fork()ing. The subroutine given as the 'Code' parameter is run in a child process every time the run() method is called on the fork object. Any parameters passed to the run() method are passed on to that subroutine. This allows a developer to spend less time worrying about the underlying fork() system, and just write code.
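To make the description above concrete, here is a minimal sketch of the kind of fork() bookkeeping the module is meant to take off your hands. It uses only core Perl and is not the module's implementation; the helper name run_throttled and its arguments are hypothetical.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper, not part of Parallel::ForkControl: run $code once per
# argument in a child process, keeping at most $max_kids children alive.
sub run_throttled {
    my ($code, $max_kids, @args) = @_;
    die "max_kids must be at least 1\n" if $max_kids < 1;
    my (%running, %exit_code);

    # Reap one finished child and record its exit code.
    my $reap_one = sub {
        my $pid = waitpid(-1, 0);
        if ($pid <= 0) { %running = (); return }    # nothing left to wait for
        $exit_code{$pid} = $? >> 8 if delete $running{$pid};
    };

    for my $arg (@args) {
        # Block until a slot frees up.
        $reap_one->() while keys(%running) >= $max_kids;

        my $pid = fork();
        die "fork failed: $!\n" unless defined $pid;
        if ($pid == 0) {
            # Child: run the code, then leave without running parent cleanup.
            $code->($arg);
            POSIX::_exit(0);
        }
        $running{$pid} = 1;
    }

    # Wait for the stragglers.
    $reap_one->() while %running;
    return \%exit_code;
}

my $codes = run_throttled(sub { my $host = shift; return length $host },
                          2, qw/host1 host2 host3/);
printf "%d children finished\n", scalar keys %$codes;
```

With the module, the loop, the slot counting, and the reaping are handled by run() and waitforkids().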
METHODS
- new([ Option => Value ... ])
-
Constructor. Creates a Parallel::ForkControl object. Ideally, all options should be set here and not changed afterwards, though the accessors and mutators do allow changes, even while the run() method is being executed.
- Options
-
- Name
-
Process name that will show up in 'ps' output. Mostly cosmetic, but it serves as an easy way to distinguish children and parent in a ps listing.
- ProcessTimeOut
-
The maximum time any given process is allowed to run before it is interrupted. Default : 120 seconds
- WatchCount
-
Enforce count (MaxKids) limits on new processes. Default : 1
- WatchLoad
-
Enforce load-based (MaxLoad) limits on process creation. NOTE: This MUST be a true value to enable throttling based on load averages. Default : 0
- WatchMem ***
-
(unimplemented)
- WatchCPU ***
-
(unimplemented)
- Method
-
May be 'block' or 'cycle'. Block forks off MaxKids processes, waits for all of them to die, then forks off MaxKids more. Cycle continually replaces processes as the limits allow. Cycle is almost ALWAYS the preferred method. Default : Cycle
- MaxKids
-
The maximum number of children that may be running at any given time. Default : 5
- MinKids
-
The minimum number of kids to keep running regardless of load/memory/CPU throttling. Default : 1
- MaxLoad
-
The maximum one-minute load average. Make sure to set WatchLoad, since load throttling is off by default. Default : 4.50
- MaxMem ***
-
(unimplemented)
- MaxCPU ***
-
(unimplemented)
- Code
-
This should be a subroutine reference. If you intend to pass arguments to this subroutine, it is imperative that you NOT include () in the reference. All code inside the subroutine will be run in the child process. The module provides all the necessary checks and safety nets, so your subroutine may just "return". It is neither necessary nor good practice to have exit()s in this subroutine, as return codes are eventually stored and made available to the parent process after completion. Examples:

    my $code = sub {
        # do something useful
        my $t = shift;
        return $t;
    };

    my $forker = Parallel::ForkControl->new(
        Name    => 'me',
        MaxKids => 10,
        Code    => $code,
        # or
        # Code  => \&mysub,
    );

    sub mysub {
        my $t = shift;
        return $t;
    }

Alternatively, you may pass the sub reference as the first argument of the run() method.
- Accounting
-
By default this is turned off. If you would like to keep track of the exit codes, subroutine return values, and current status of the children forked by the run() routine, enable this option:
Accounting => 1
- TrackArgs
-
By setting this to a true value, the fork controller will keep track of the arguments passed to each of the children. Using this you can see which arguments yielded which results. This option only makes sense if you've enabled the Accounting option.
- Check_At
-
This sets how many child processes are started between checks of the module's internal process table. It shouldn't be necessary to modify this value, but since the default is low, someone using this module on a large number of data sets might want to check at larger intervals. Default : 2
- Debug
-
A number 0-4. The higher the number, the more debugging information you'll see. 0 means nothing. Default : 0
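The throttling options above interact, so a small sketch may help. The function below is one plausible reading of how WatchCount, MaxKids, MinKids, WatchLoad, and MaxLoad combine. It is hypothetical, not the module's code, and the order of the checks and the 'running' and 'load' keys are assumptions.

```perl
use strict;
use warnings;

# Hypothetical decision function; the option names mirror the constructor
# options above, but the ordering of the checks is an assumption.
sub may_spawn {
    my (%s) = @_;
    my $running = $s{running};

    # WatchCount: never exceed MaxKids.
    return 0 if $s{WatchCount} && $running >= $s{MaxKids};

    # MinKids: keep this many running regardless of load throttling.
    return 1 if $running < $s{MinKids};

    # WatchLoad: hold back while the one-minute load average is above MaxLoad.
    return 0 if $s{WatchLoad} && $s{load} > $s{MaxLoad};

    return 1;
}

my %opts = (
    WatchCount => 1, MaxKids => 5, MinKids => 1,
    WatchLoad  => 1, MaxLoad => 4.50,
);

# Below MinKids, so the high load is ignored.
print may_spawn(%opts, running => 0, load => 9.0) ? "spawn\n" : "wait\n";
# Above MinKids with the load over MaxLoad, so creation is held back.
print may_spawn(%opts, running => 3, load => 9.0) ? "spawn\n" : "wait\n";
```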
- run([ @ARGS ])
-
This method calls the subroutine passed as the Code option. It handles process throttling, creation, monitoring, and reaping. The subroutine in the Code option runs in the child process, and control is returned to the parent as soon as the child is successfully created. run() will block until it is allowed to create a process or process creation fails completely. run() returns the PID of the child on success, or undef on failure. NOTE: This is not the return code of your subroutine. I will eventually provide mapping to argument sets passed to run() with success/failure options and (idea) a "Report" option to enable some form of reporting based on that API.
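Because run() returns a PID or undef, the caller can tell which work items never got a child. The sketch below shows that check. The helper dispatch_all and the stand-in class Local::FakeForker are hypothetical and exist only so the example runs without forking; in real use a Parallel::ForkControl object would be passed instead.

```perl
use strict;
use warnings;

# Hypothetical helper: hand every item to $forker->run() and collect the
# items for which run() returned undef (no child could be created).
sub dispatch_all {
    my ($forker, @items) = @_;
    my (%pid_for, @failed);
    for my $item (@items) {
        my $pid = $forker->run($item);
        if (defined $pid) { $pid_for{$item} = $pid }
        else              { push @failed, $item }
    }
    return (\%pid_for, \@failed);
}

# Stand-in object so the sketch runs without forking. It only imitates the
# documented return value of run(): a PID on success, undef on failure.
{
    package Local::FakeForker;
    sub new { return bless { next_pid => 1000 }, shift }
    sub run {
        my ($self, $arg) = @_;
        return undef if $arg eq 'bad';    # simulate a failed fork
        return $self->{next_pid}++;
    }
}

my ($started, $failed) = dispatch_all(Local::FakeForker->new, qw/host1 bad host2/);
printf "started %d, failed %d\n", scalar keys %$started, scalar @$failed;
```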
- waitforkids()
-
This method blocks until all children have finished processing.
- cleanup()
-
Alias for waitforkids(), provided for legacy applications.
- get_results( [ $pid ])
-
This method returns a hash reference of the arguments and return codes of the children:
    $hashref = {
        '2975' => {    # PID of Child
            exitcode  => 0,
            status    => 'done',
            signature => $FrozenScalar,
            return    => $ReferenceToReturnValue,
        },
        ....
    };
The $pid is optional, but if specified, will return:
    $hashref = {
        exitcode  => 0,
        status    => 'done',
        signature => $FrozenScalar,
        return    => $ReferenceToReturnValue,
    };

Requires Accounting => 1 and optionally TrackArgs => 1.
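As an illustration of working with the structure described above, the following sketch splits a results hash into children that exited cleanly and children that did not. The helper name split_results is hypothetical and the sample data is made up; only the 'exitcode' and 'status' keys come from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: split a get_results()-style hash reference into
# PIDs that exited with code 0 and PIDs that did not.
sub split_results {
    my ($results) = @_;
    my (@ok, @failed);
    for my $pid (sort { $a <=> $b } keys %$results) {
        my $r = $results->{$pid};
        if (defined $r->{exitcode} && $r->{exitcode} == 0) { push @ok, $pid }
        else                                               { push @failed, $pid }
    }
    return (\@ok, \@failed);
}

# Sample data shaped like the documented structure; the values are made up.
my $results = {
    2975 => { exitcode => 0, status => 'done' },
    2976 => { exitcode => 2, status => 'done' },
    2977 => { exitcode => 0, status => 'done' },
};

my ($ok, $failed) = split_results($results);
print "ok: @$ok\n";
print "failed: @$failed\n";
```

In real use the hash reference would come from $forker->get_results() after waitforkids() has returned.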
- clear_results()
-
This method clears the results hash.
- kids()
-
In list context, this method returns the PIDs of all the children still alive. In scalar context, it returns the number of children still running.
- kid_time( $PID )
-
This method returns the start time, in epoch seconds, of the child with the given PID.
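kids() and kid_time() can be combined to find children that have been running for a long time. The sketch below assumes only the behaviour described above. The helper long_running_kids and the stand-in class Local::FakeForker are hypothetical and are there so the example runs without a real forker object.

```perl
use strict;
use warnings;

# Hypothetical helper: list children that have been running longer than
# $limit seconds, using only kids() and kid_time() as described above.
sub long_running_kids {
    my ($forker, $limit, $now) = @_;
    $now = time unless defined $now;
    return grep { $now - $forker->kid_time($_) > $limit } $forker->kids;
}

# Stand-in object imitating the two documented methods.
{
    package Local::FakeForker;
    sub new { my ($class, %start) = @_; return bless { start => {%start} }, $class }
    sub kids {
        my $self = shift;
        my @pids = sort { $a <=> $b } keys %{ $self->{start} };
        return wantarray ? @pids : scalar @pids;    # list of PIDs or a count
    }
    sub kid_time { my ($self, $pid) = @_; return $self->{start}{$pid} }
}

my $now    = 1_000_000;
my $forker = Local::FakeForker->new(101 => $now - 300, 102 => $now - 10);
my @slow   = long_running_kids($forker, 120, $now);
my $count  = $forker->kids;    # scalar context: number of children
print "running: $count, over the limit: @slow\n";
```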
EXPORT
None by default.
KNOWN ISSUES
- 01/08/2004 - brad@divisionbyzero.net
-
For some reason, I'm having to throttle process creation, as a slew of processes starting and ending at the same time seems to be causing problems on my machine. I've adjusted Check_At down to 2, which seems to catch any processes whose $SIG{CHLD} gets lost in the mess of spawning. I'm looking into a more permanent, professional solution.
SEE ALSO
perldoc -f fork, search CPAN for Parallel::ForkManager
AUTHOR
Brad Lhotsky <brad@divisionbyzero.net>
CONTRIBUTIONS BY
Mark Thomas <mark@ackers.net>
COPYRIGHT AND LICENSE
Copyright 2003 by Brad Lhotsky
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.