NAME
Forks::Super - extensions and convenience methods for managing background processes.
VERSION
Version 0.13
SYNOPSIS
use Forks::Super;
use Forks::Super MAX_PROC => 5, DEBUG => 1;
# familiar use - parent returns PID>0, child returns zero
$pid = fork();
die "fork failed" unless defined $pid;
if ($pid > 0) {
# parent code
} else {
# child code
}
# wait for a child process to finish
$w = wait; # blocking wait on any child, $? holds child exit status
$w = waitpid $pid, 0; # blocking wait on specific child
$w = waitpid $pid, WNOHANG; # non-blocking wait, use with POSIX ':sys_wait_h'
$w = waitpid 0, $flag; # wait on any process in current process group
waitall; # block until all children are finished
# -------------- helpful extensions ---------------------
# fork directly to a shell command. Child doesn't return.
$pid = fork { cmd => "./myScript 17 24 $n" };
$pid = fork { exec => [ "/bin/prog" , $file, "-x", 13 ] };
# fork directly to a Perl subroutine. Child doesn't return.
$pid = fork { sub => $methodNameOrRef , args => [ @methodArguments ] };
$pid = fork { sub => \&subroutine, args => [ @args ] };
$pid = fork { sub => sub { "anonymous sub" }, args => [ @args ] };
# put a time limit on the child process
$pid = fork { cmd => $command, timeout => 30 }; # kill child if not done in 30s
$pid = fork { sub => $subRef , expiration => 1260000000 }; # complete by 8AM Dec 5, 2009 UTC
# obtain standard filehandles for the child process
$pid = fork { child_fh => "in,out,err" };
if ($pid == 0) { # child process
sleep 1;
$x = <STDIN>; # read from parent's $Forks::Super::CHILD_STDIN{$pid}
print rand() > 0.5 ? "Yes\n" : "No\n" if $x eq "Clean your room\n";
sleep 2;
$i_can_haz_ice_cream = <STDIN>;
if ($i_can_haz_ice_cream !~ /you can have ice cream/ && rand() < 0.5) {
print STDERR '@#$&#$*&#$*&',"\n";
}
exit 0;
} # else parent process
$child_stdin = $Forks::Super::CHILD_STDIN{$pid};
print $child_stdin "Clean your room\n";
sleep 2;
$child_stdout = $Forks::Super::CHILD_STDOUT{$pid};
$child_response = <$child_stdout>; # -or- = Forks::Super::read_stdout($pid);
if ($child_response eq "Yes\n") {
print $child_stdin "Good boy. You can have ice cream.\n";
} else {
print $child_stdin "Bad boy. No ice cream for you.\n";
sleep 2;
$child_err = Forks::Super::read_stderr($pid);
# -or- $child_err = readline($Forks::Super::CHILD_STDERR{$pid});
print $child_stdin "And no back talking!\n" if $child_err;
}
# ---------- manage jobs and system resources ---------------
# runs 100 tasks but the fork call blocks when there are already 5 jobs running
$Forks::Super::MAX_PROC = 5;
$Forks::Super::ON_BUSY = 'block';
for ($i=0; $i<100; $i++) {
$pid = fork { cmd => $task[$i] };
}
# jobs fail (without blocking) if the system is too busy
$Forks::Super::MAX_PROC = 5;
$Forks::Super::ON_BUSY = 'fail';
$pid = fork { cmd => $task };
if ($pid > 0) { print "'$task' is running\n" }
elsif ($pid < 0) { print "5 or more jobs running -- didn't start '$task'\n"; }
# $Forks::Super::MAX_PROC setting can be overridden. Start job immediately if < 3 jobs running
$pid = fork { sub => 'MyModule::MyMethod', args => [ @b ], max_proc => 3 };
# try to fork no matter how busy the system is
$pid = fork { force => 1, sub => \&MyMethod, args => [ @my_args ] };
# when system is busy, queue jobs. When system is not busy, some jobs on the queue will start.
# if job is queued, return value from fork() is a very negative number
$Forks::Super::ON_BUSY = 'queue';
$pid = fork { cmd => $command };
$pid = fork { cmd => $useless_command, queue_priority => -5 };
$pid = fork { cmd => $important_command, queue_priority => 5 };
$pid = fork { cmd => $future_job, delay => 20 }; # keep job on queue for at least 20s
# assign descriptive names to tasks
$pid1 = fork { cmd => $command, name => "my task" };
$pid2 = waitpid "my task", 0;
# run callbacks at various points of job life-cycle
$pid = fork { cmd => $command, callback => \&on_complete };
$pid = fork { sub => $sub, callback => { start => 'on_start', finish => \&on_complete,
queue => sub { print "Job $_[1] queued.\n" } } };
# set up dependency relationships
$pid1 = fork { cmd => $job1 };
$pid2 = fork { cmd => $job2, depend_on => $pid1 }; # put on queue until job 1 is complete
$pid4 = fork { cmd => $job4, depend_start => [$pid2,$pid3] }; # put on queue until jobs 2,3 have started
$pid5 = fork { cmd => $job5, name => "group C" };
$pid6 = fork { cmd => $job6, name => "group C" };
$pid7 = fork { cmd => $job7, depend_on => "group C" }; # wait for jobs 5 & 6 to complete
# manage OS settings on jobs -- not available on all systems
$pid1 = fork { os_priority => 10 }; # like nice(1) on Un*x
$pid2 = fork { cpu_affinity => 0x5 }; # background task will prefer CPUs #0 and #2
# job information
$state = Forks::Super::state($pid); # 'ACTIVE', 'DEFERRED', 'COMPLETE', 'REAPED'
$status = Forks::Super::status($pid); # exit status for completed jobs
# --- evaluate long running expressions in the background
$result = bg_eval { a_long_running_calculation() };
# sometime later ...
print "Result was $$result\n";
DESCRIPTION
This package provides new definitions for the Perl functions fork, wait, and waitpid with richer functionality. The new features are designed to make it more convenient to spawn and manage background processes and to get the most out of your system's resources.
$pid = fork( \%options )
The new fork call attempts to spawn a new process. With no arguments, it behaves the same as the Perl system call fork():
creating a new process running the same program at the same point
returning the process id (PID) of the child process to the parent (on Windows, this is a pseudo-process ID)
returning 0 to the child process
returning undef if the fork call was unsuccessful
Options for instructing the child process
The fork call supports three options, cmd, exec, and sub (or sub/args), that will instruct the child process to carry out a specific task. Using any of these options causes the child process not to return from the fork call.
$child_pid = fork { cmd => $shell_command }
$child_pid = fork { cmd => \@shell_command }
On successful launch of the child process, runs the specified shell command in the child process with the Perl system() function. When the system call is complete, the child process exits with the same exit status that was returned by the system call.
Returns the PID of the child process to the parent process. Does not return from the child process, so you do not need to check the fork() return value to determine whether code is executing in the parent or child process.
$child_pid = fork { exec => $shell_command }
$child_pid = fork { exec => \@shell_command }
Like the cmd option, but the background process launches the shell command with exec instead of with system.
Using exec instead of cmd will spawn one fewer process, but note that the timeout and expiration options cannot be used with the exec option (see "Options for simple job management").
$child_pid = fork { sub => $subroutineName [, args => \@args ] }
$child_pid = fork { sub => \&subroutineReference [, args => \@args ] }
$child_pid = fork { sub => sub { ... code ... } [, args => \@args ] }
On successful launch of the child process, fork invokes the specified Perl subroutine with the specified set of method arguments (if provided). If the subroutine completes normally, the child process exits with a status of zero. If the subroutine exits abnormally (i.e., if it dies, or if the subroutine invokes exit with a non-zero argument), the child process exits with non-zero status.
Returns the PID of the child process to the parent process. Does not return from the child process, so you do not need to check the fork() return value to determine whether code is running in the parent or child process.
If none of the cmd, exec, or sub options is provided to the fork call, then the fork() call behaves like a Perl fork() call, returning the child PID to the parent and also returning zero to the child.
Options for simple job management
fork { timeout => $delay_in_seconds }
fork { expiration => $timestamp_in_seconds_since_epoch_time }
Puts a deadline on the child process and causes the child to die if it has not completed by the deadline. With the timeout option, you specify that the child process should not survive longer than the specified number of seconds. With expiration, you are specifying an epoch time (like the one returned by the time function) as the child process's deadline.
If the setpgrp() system call is implemented on your system, then this module will reset the process group ID of the child process. On timeout, the module will attempt to kill off all subprocesses of the expiring child process.
If the deadline is some time in the past (if the timeout is not positive, or the expiration is earlier than the current time), then the child process will die immediately after it is created.
Note that this feature uses the Perl alarm call with a handler for SIGALRM. If you use this feature and also specify a sub to invoke, and that subroutine also tries to use the alarm feature or set a handler for SIGALRM, the results will be undefined.
The timeout and expiration options cannot be used with the exec option, since the child process will not be able to generate a SIGALRM after an exec call.
fork { delay => $delay_in_seconds }
fork { start_after => $timestamp_in_epoch_time }
Causes the child process to be spawned at some time in the future. The return value from a fork call that uses these features will not be a process id, but it will be a very negative number called a job ID. See the section on "Deferred processes" for information on what to do with a job ID.
A deferred job will start no earlier than its appointed time in the future. Depending on when the queue of deferred jobs is examined, the actual start time of the job could be significantly later than the appointed time.
A job may have both a minimum start time (through delay or start_after options) and a maximum end time (through timeout and expiration). Jobs with inconsistent times (end time is not later than start time) will be killed off as soon as they are created.
fork { child_fh => $fh_spec }
fork { child_fh => [ @fh_spec ] }
Note: API change since v0.10.
Launches a child process and makes the child process's STDIN, STDOUT, and/or STDERR filehandles available to the parent process in the scalar variables $Forks::Super::CHILD_STDIN{$pid}, $Forks::Super::CHILD_STDOUT{$pid}, and/or $Forks::Super::CHILD_STDERR{$pid}, where $pid is the PID return value from the fork call. This feature makes it possible, even convenient, for a parent process to communicate with a child, as this contrived example shows.
    $pid = fork { sub => \&pig_latinize, timeout => 10, child_fh => "all" };
    # in the parent, $Forks::Super::CHILD_STDIN{$pid} is an *output* filehandle
    print {$Forks::Super::CHILD_STDIN{$pid}} "The blue jay flew away in May\n";
    sleep 2; # give child time to start up and get ready for input
    # and $Forks::Super::CHILD_STDOUT{$pid} is an *input* handle
    $result = readline($Forks::Super::CHILD_STDOUT{$pid});
    print "Pig Latin translator says: $result\n"; # ==> eThay ueblay ayjay ewflay awayay inay ayMay\n
    @errors = readline($Forks::Super::CHILD_STDERR{$pid});
    print "Pig Latin translator complains: @errors\n" if @errors > 0;
    sub pig_latinize {
        for (;;) {
            while (<STDIN>) {
                foreach my $word (split /\s+/) {
                    if ($word =~ /^qu/i) {
                        print substr($word,2) . substr($word,0,2) . "ay"; # STDOUT
                    } elsif ($word =~ /^([b-df-hj-np-tv-z][b-df-hj-np-tv-xz]*)/i) {
                        my $prefix = $1; # leading consonant cluster captured by the match
                        $word =~ s/[b-df-hj-np-tv-z][b-df-hj-np-tv-xz]*//i;
                        print $word . $prefix . "ay";
                    } elsif ($word =~ /^[aeiou]/i) {
                        print $word . "ay";
                    } else {
                        print STDERR "Didn't recognize this word: $word\n";
                    }
                    print " ";
                }
                print "\n";
            }
        }
    }
The set of filehandles to make available is specified either as a string of words separated by non-alphanumeric delimiters, or as a list reference. This spec may contain one or more of the words in, out, err, join, all, or socket.
in, out, and err mean that the child's STDIN, STDOUT, and STDERR, respectively, will be available in the parent process through the filehandles in $Forks::Super::CHILD_STDIN{$pid}, $Forks::Super::CHILD_STDOUT{$pid}, and $Forks::Super::CHILD_STDERR{$pid}, where $pid is the child's process ID. all is a convenient way to specify in, out, and err. join specifies that the child's STDOUT and STDERR will be returned through the same filehandle, specified as both $Forks::Super::CHILD_STDOUT{$pid} and $Forks::Super::CHILD_STDERR{$pid}.
If socket is specified, then local sockets will be used to pass data between parent and child instead of temporary files.
Socket handles vs. file handles
Here are some things to keep in mind when deciding whether to use sockets or regular files for parent-child IPC:
Sockets have a performance advantage, especially at child process start-up.
Socket input buffers have limited capacity. Write operations can block if the socket reader is not vigilant.
On Windows, sockets are blocking, and care must be taken to prevent your script from reading on an empty socket.
Socket and file handle gotchas
Some things to keep in mind when using socket or file handles to communicate with a child process.
Care should be taken before closing a socket handle. The same socket handle can be used for both reading and writing. Don't close a handle when you are only done with one half of the socket operations.
The test defined getsockname($handle) can determine whether $handle is a socket handle or a regular filehandle.
The following idiom is safe to use on both socket handles and regular filehandles:
    shutdown($handle,2) || close $handle;
IPC in this module is asynchronous. In general, you cannot tell whether the parent/child has written anything to be read in the child/parent. So getting undef when reading from the $Forks::Super::CHILD_STDOUT{$pid} handle does not necessarily mean that the child has finished (or even started!) writing to its STDOUT. Check out the seek HANDLE,0,1 trick in the perlfunc documentation for seek about reading from a handle after you have already read past the end. You may find it useful for your parent and child processes to follow some convention (for example, a special word like "__END__") to denote the end of input.
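The seek HANDLE,0,1 trick and the end-of-input convention mentioned above can be tried without this module. What follows is a minimal sketch in core Perl only, with a temporary file standing in for a child's output file. The helper name read_available and the use of "__END__" as a sentinel are assumptions made for illustration, not part of Forks::Super.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Return every line currently readable from $fh, then clear the EOF
# condition so that a later call can pick up data appended in the meantime.
sub read_available {
    my ($fh) = @_;
    my @lines = <$fh>;
    seek($fh, 0, 1);    # seek to the current position: clears EOF, does not move
    return @lines;
}

# A temporary file plays the role of the child's output file.
my ($writer, $path) = tempfile(UNLINK => 1);
$writer->autoflush(1);
open(my $reader, '<', $path) or die "Cannot open $path: $!";

print {$writer} "first line\n";
my @early = read_available($reader);    # the reader has now reached end of file

print {$writer} "second line\n", "__END__\n";
my @late = read_available($reader);     # the appended lines are still visible

# Assumed convention for this sketch: "__END__" marks the end of input.
my $finished = (@late && $late[-1] eq "__END__\n") ? 1 : 0;
print "early: @early", "late: @late", "finished: $finished\n";
```

A parent that polls a child's output this way can stop as soon as it sees the sentinel, rather than guessing from an undef read whether the child is done.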
Options for complicated job management
The fork() call from this module supports options that help to manage child processes or groups of child processes in ways to better manage your system's resources. For example, you may have a lot of tasks to perform in the background, but you don't want to overwhelm your (possibly shared) system by running them all at once. There are features to control how many, how, and when your jobs will run.
fork { name => $name }
Attaches a string identifier to the job. The identifier can be used for several purposes:
to obtain a Forks::Super::Job object representing the background task through the Forks::Super::Job::get or Forks::Super::Job::getByName methods.
as the first argument to waitpid to wait on a job or jobs with specific names.
to identify and establish dependencies between background tasks. See the depend_on and depend_start parameters below.
if supported by your system, the name attribute will change the argument area used by the ps(1) program and change the way the background process is displayed in your process viewer. (See $PROGRAM_NAME in perlvar about overriding the special $0 variable.)
$Forks::Super::MAX_PROC = $max_simultaneous_jobs
fork { max_proc => $max_simultaneous_jobs }
Specifies the maximum number of background processes that you want to run. If a fork call is attempted while there are already the maximum number of child processes running, then the fork() call will either block (until some child processes complete), fail (return a negative value without spawning the child process), or queue the job (returning a very negative value called a job ID), according to the specified "on_busy" behavior (see the next item). See the "Deferred processes" section for information about how queued jobs are handled.
On any individual fork call, the maximum number of processes may be overridden by also specifying max_proc or force options.
    $Forks::Super::MAX_PROC = 8;
    # launch 2nd job only when the system is not busy at all
    $pid1 = fork { sub => 'method1' };
    $pid2 = fork { sub => 'method2', max_proc => 1 };
    $pid3 = fork { sub => 'method3' };
Setting $Forks::Super::MAX_PROC to zero or a negative number will disable the check for too many simultaneous processes.
$Forks::Super::ON_BUSY = "block" | "fail" | "queue"
fork { on_busy => "block" | "fail" | "queue" }
Dictates the behavior of fork in the event that the module is not allowed to launch the specified job for whatever reason.
"block"
If the system cannot create a new child process for the specified job, it will wait and periodically retry to create the child process until it is successful. Unless a system fork call is attempted and fails, fork calls that use this behavior will return a positive PID.
"fail"
If the system cannot create a new child process for the specified job, the fork call will immediately return with a small negative value.
"queue"
If the system cannot create a new child process for the specified job, the job will be deferred, and an attempt will be made to launch the job at a later time. See "Deferred processes" below. The return value will be a very negative number (job ID).
On any individual fork call, the default launch failure behavior specified by $Forks::Super::ON_BUSY can be overridden by specifying an on_busy option:
    $Forks::Super::ON_BUSY = "fail";
    $pid1 = fork { sub => 'myMethod' };
    $pid2 = fork { sub => 'yourMethod', on_busy => "queue" };
fork { force => $bool }
If the force option is set, the fork call will disregard the usual criteria for deciding whether a job can spawn a child process, and will always attempt to create the child process.
fork { queue_priority => $priority }
In the event that a job cannot immediately create a child process and is put on the job queue (see "Deferred processes"), the queue_priority option specifies the relative priority of the job on the job queue. In general, eligible jobs with high priority values will be started before jobs with lower priority values.
fork { depend_on => $id }
fork { depend_on => [ $id_1, $id_2, ... ] }
fork { depend_start => $id }
fork { depend_start => [ $id_1, $id_2, ... ] }
Indicates a dependency relationship between the job in this fork call and one or more other jobs. The identifiers may be process/job IDs or name attributes (see above) from earlier fork calls.
If a fork call specifies a depend_on option, then that job will be deferred until all of the child processes specified by the process or job IDs have completed. If a fork call specifies a depend_start option, then that job will be deferred until all of the child processes specified by the process or job IDs have started.
Invalid process and job IDs in a depend_on or depend_start setting will produce a warning message but will not prevent a job from starting.
Dependencies are established at the time of the fork call and can only apply to jobs that already exist at that time. So for example, in this code,
    $job1 = fork { cmd => $cmd, name => "job1", depend_on => "job2" };
    $job2 = fork { cmd => $cmd, name => "job2", depend_on => "job1" };
at the time the first job is created, the job named "job2" has not been created yet, so the first job will not have a dependency (and a warning will be issued when the job is created). This may be a limitation but it also guarantees that there will be no circular dependencies.
When a dependency identifier is a name attribute that applies to multiple jobs, the job will be dependent on all existing jobs with that name:
    # Job 3 will not start until BOTH job 1 and job 2 are done
    $job1 = fork { name => "Sally", ... };
    $job2 = fork { name => "Sally", ... };
    $job3 = fork { depend_on => "Sally", ... };
    # all of these jobs have the same name and depend on ALL previous jobs
    $job4 = fork { name => "Ralph", depend_start => "Ralph", ... }; # no dependencies
    $job5 = fork { name => "Ralph", depend_start => "Ralph", ... }; # depends on Job 4
    $job6 = fork { name => "Ralph", depend_start => "Ralph", ... }; # depends on #4 and #5
fork { can_launch => \&methodName }
fork { can_launch => sub { ... anonymous sub ... } }
Supply a user-specified function to determine when a job is eligible to be started. The function supplied should return 0 if a job is not eligible to start and non-zero if it is eligible to start.
During a fork call or when the job queue is being examined, the user's can_launch method will be invoked with a single Forks::Super::Job argument containing information about the job to be launched. User code may make use of the default launch determination method by invoking the _can_launch method of the job object:
    # Running on a BSD system with the uptime(1) call.
    # Want to block jobs when the current CPU load
    # (1 minute) is greater than 4 and respect all other criteria:
    fork { cmd => $my_command,
           can_launch => sub {
               my $job = shift;                            # a Forks::Super::Job object
               return 0 if !$job->_can_launch;             # default
               my $cpu_load = (split /\s+/,`uptime`)[-3];  # get 1 minute avg CPU load
               return 0 if $cpu_load > 4.0;                # system too busy. let's wait
               return 1;
           }
    };
fork { callback => $subroutineName }
fork { callback => sub { BLOCK } }
fork { callback => { start => ..., finish => ..., queue => ..., fail => ... } }
Install callbacks to be run when and if certain events in the life cycle of a background process occur. The first two forms of this option are equivalent to
    fork { callback => { finish => ... } }
and specify code that will be executed when a background process is complete and the module has received its SIGCHLD event. A start callback is executed just after a new process is spawned. A queue callback is run if the job is deferred for any reason (see "Deferred processes") and the job is placed onto the job queue for the first time. And the fail callback is run if the job is not going to be launched (that is, a case where the fork call would return -1).
Callbacks are invoked with two arguments when they are triggered: the Forks::Super::Job object that was created with the original fork call, and the job's ID (the return value from fork).
You should keep your callback functions short and sweet, like you do for your signal handlers. Sometimes callbacks are invoked from the signal handler, and the processing of other signals could be delayed if the callback functions take too long to run.
fork { os_priority => $priority }
On supported operating systems, and after the successful creation of the child process, attempt to set the operating system priority of the child process.
On unsupported systems, this option is ignored.
fork { cpu_affinity => $bitmask }
On supported operating systems with multiple cores, and after the successful creation of the child process, attempt to set the process's CPU affinity. Each bit of the bitmask represents one processor. Set a bit to 1 to allow the process to use the corresponding processor, and set it to 0 to disallow the corresponding processor. There may be additional restrictions on the valid range of values imposed by the operating system.
As of version 0.07, supported systems are Cygwin, Win32, Linux, and possibly BSD.
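To make the bitmask encoding concrete, here is a small sketch in core Perl that converts between a list of zero-based CPU numbers and a mask. The helper names cpu_mask and cpus_in_mask are hypothetical and are not provided by this module.

```perl
use strict;
use warnings;

# Build an affinity bitmask from a list of zero-based CPU indices.
sub cpu_mask {
    my @cpus = @_;
    my $mask = 0;
    $mask |= (1 << $_) for @cpus;
    return $mask;
}

# List the zero-based CPU indices whose bits are set in a bitmask.
sub cpus_in_mask {
    my ($mask) = @_;
    my @cpus;
    for (my $cpu = 0; $mask >> $cpu; $cpu++) {
        push @cpus, $cpu if ($mask >> $cpu) & 1;
    }
    return @cpus;
}

printf "mask for CPUs 0 and 2: 0x%x\n", cpu_mask(0, 2);    # 0x5
print  "CPUs in 0x5: @{[ cpus_in_mask(0x5) ]}\n";           # 0 2
```

With such a helper, the SYNOPSIS example could be written as fork { cpu_affinity => cpu_mask(0, 2) }, which is the same value as 0x5.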
fork { debug => $bool }
fork { undebug => $bool }
Overrides the value in $Forks::Super::DEBUG (see "MODULE VARIABLES") for this specific job. If specified, the debug parameter controls only whether the module will output debugging information related to the job created by this fork call.
Normally, the debugging settings of the parent, including the job-specific settings, are inherited by child processes. If the undebug option is specified with a non-zero parameter value, then debugging will be disabled in the child process.
Deferred processes
Whenever some condition exists that prevents a fork() call from immediately starting a new child process, an option is to defer the job. Deferred jobs are placed on a queue. At periodic intervals, in response to periodic events, or whenever you invoke the Forks::Super::run_queue method in your code, the queue will be examined to see if any deferred jobs are eligible to be launched.
Job ID
When a fork() call fails to spawn a child process but instead defers the job by adding it to the queue, the fork() call will return a unique, large negative number called the job ID. The number will be negative and large enough (<= -100000) so that it can be distinguished from any possible PID, Windows pseudo-process ID, process group ID, or fork() failure code.
Although the job ID is not the actual ID of a system process, it may be used like a PID as an argument to waitpid, as a dependency specification in another fork call's depend_on or depend_start option, or in the other module methods used to retrieve job information (see "Obtaining job information" below). Once a deferred job has been started, it will be possible to obtain the actual PID (or on Windows, the actual pseudo-process ID) of the process running that job.
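As an illustration of how calling code might tell a job ID apart from other fork return values, the following sketch applies the threshold quoted above (<= -100000). The function name looks_like_job_id and the constant are hypothetical, and the sketch relies only on the numeric range given in this section, not on any module internals.

```perl
use strict;
use warnings;

# Threshold taken from the text above: job IDs are <= -100000.
use constant JOB_ID_THRESHOLD => -100000;

# True if a numeric fork() return value falls in the job ID range.
sub looks_like_job_id {
    my ($value) = @_;
    return (defined $value && $value <= JOB_ID_THRESHOLD) ? 1 : 0;
}

for my $value (-100042, -1, 0, 4321) {
    printf "%8d => %s\n", $value,
        looks_like_job_id($value) ? "deferred job ID" : "not a job ID";
}
```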
Job priority
Every job on the queue will have a priority value. A job's priority may be set explicitly by including the queue_priority option in the fork() call, or it will be assigned a default priority near zero. Every time the queue is examined, the queue will be sorted by this priority value and an attempt will be made to launch each job in this order. Note that different jobs may have different criteria for being launched, and it is possible that an eligible low priority job may be started before an ineligible higher priority job.
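The ordering rule can be pictured with a toy model. This sketch is not the module's queue implementation; the hash fields (name, priority, eligible) and the function examine_queue are invented for illustration. It shows one pass over a queue sorted from highest to lowest priority, in which an ineligible high priority job stays queued while lower priority jobs start.

```perl
use strict;
use warnings;

# Hypothetical queue entries: a name, a priority, and whether the job's
# own launch criteria are currently satisfied.
my @queue = (
    { name => 'cleanup', priority => -5, eligible => 1 },
    { name => 'report',  priority =>  5, eligible => 0 },  # e.g. waiting on a dependency
    { name => 'backup',  priority =>  0, eligible => 1 },
);

# Examine the queue once: visit jobs from highest to lowest priority,
# start each eligible job, and leave ineligible jobs on the queue.
sub examine_queue {
    my @jobs = @_;
    my (@started, @waiting);
    for my $job (sort { $b->{priority} <=> $a->{priority} } @jobs) {
        if ($job->{eligible}) { push @started, $job->{name} }
        else                  { push @waiting, $job->{name} }
    }
    return (\@started, \@waiting);
}

my ($started, $waiting) = examine_queue(@queue);
print "started: @$started\n";   # backup cleanup
print "waiting: @$waiting\n";   # report
```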
Queue examination
Certain events in the SIGCHLD handler or in the wait, waitpid, and/or waitall methods will cause the list of deferred jobs to be evaluated and to start eligible jobs. But this configuration does not guarantee that the queue will be examined on a timely or frequent enough basis. The user may invoke the Forks::Super::run_queue() method at any time to cause the queue to be examined.
Special tips for Windows systems
On POSIX systems (including Cygwin), programs using the Forks::Super module are interrupted when a child process completes. A callback function performs some housekeeping and may perform other duties like trying to dispatch things from the list of deferred jobs.
Windows systems do not have the signal handling capabilities of other systems, and so, other things being equal, a script running on Windows will not perform the housekeeping tasks as frequently as a script on other systems.
The method Forks::Super::pause can be used as a drop-in replacement for the Perl sleep call. In a pause function call, the program will check on active child processes, reap the ones that have completed, and attempt to dispatch jobs on the queue.
Calling pause with an argument of 0 is also a valid way of invoking the child handler function on Windows. When used this way, pause returns immediately after running the child handler.
Child processes are implemented differently in Windows than in POSIX systems. The CORE::fork and Forks::Super::fork calls will usually return a pseudo-process ID to the parent process, and this will be a negative value. The Unix idiom of testing whether a fork call returns a positive number needs to be modified on Windows systems by testing whether Forks::Super::isValidPid($pid) returns true, where $pid is the return value from a Forks::Super::fork call.
OTHER FUNCTIONS
$reaped_pid = wait
Like the Perl wait system call, blocks until a child process terminates and returns the PID of the deceased process, or -1 if there are no child processes remaining to reap. The exit status of the child is returned in $?.
$reaped_pid = waitpid $pid, $flags
Waits for a child with a particular PID or a child from a particular process group to terminate and returns the PID of the deceased process, or -1 if there is no suitable child process to reap. If the return value contains a PID, then $? is set to the exit status of that process.
A valid job ID (see "Deferred processes") may be used as the $pid argument to this method. If the waitpid call reaps the process associated with the job ID, the return value will be the actual PID of the deceased child.
Note that the waitpid function can wait on a job ID even when the job associated with that ID is still in the job queue, waiting to be started.
A $pid value of -1 waits for the first available child process to terminate and returns its PID.
A $pid value of 0 waits for the first available child from the same process group as the calling process.
A negative $pid that is not recognized as a valid job ID will be interpreted as a process group ID, and the waitpid function will return the PID of the first available child from that process group.
On some^H^H^H^H every modern system that I know about, a $flags value of POSIX::WNOHANG is supported to perform a non-blocking wait. See the Perl waitpid documentation.
waitall
Blocking wait for all child processes, including deferred jobs that have not started at the time of the waitall call.
Forks::Super::isValidPid( $pid )
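Tests whether the return value of a fork call indicates that a background process was successfully created or not. On POSIX systems it is sufficient to check whether $pid is a positive integer, but isValidPid is a more portable test: on Windows, a successful fork call will usually return a negative pseudo-process ID (see "Special tips for Windows systems").
Forks::Super::pause($delay)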
A productive drop-in replacement for the Perl sleep system call (or Time::HiRes::sleep, if available). On systems like Windows that lack a proper method for handling SIGCHLD events, the Forks::Super::pause method will occasionally reap child processes that have completed and attempt to dispatch jobs on the queue.
On other systems, using Forks::Super::pause is less vulnerable than sleep to interruptions from this module (see "BUGS AND LIMITATIONS" below).
$status = Forks::Super::status($pid)
Returns the exit status of a completed child process represented by process ID or job ID $pid. Aside from being a permanent store of the exit status of a job, using this method might be a more reliable indicator of a job's status than checking $? after a wait or waitpid call. It is possible for this module's SIGCHLD handler to temporarily corrupt the $? value while it is checking for deceased processes.
$line = Forks::Super::read_stdout($pid)
@lines = Forks::Super::read_stdout($pid)
$line = Forks::Super::read_stderr($pid)
@lines = Forks::Super::read_stderr($pid)
For jobs that were started with the get_child_stdout and get_child_stderr options enabled, read data from the STDOUT and STDERR filehandles of child processes.
Aside from the more readable syntax, these functions may be preferable to
    @lines = readline($Forks::Super::CHILD_STDOUT{$pid});
    $line = readline($Forks::Super::CHILD_STDERR{$pid});
because they will automatically handle clearing the EOF condition on the filehandles if the parent is reading on the filehandles faster than the child is writing on them.
These functions work in both scalar and list context. If there is no data to read on the filehandle, but the child process is still active and could put more data on the filehandle, these functions return "" in scalar and list context. If there is no more data on the filehandle and the child process is finished, the functions return undef.
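The three possible return values (a line, "", or undef) suggest a simple polling loop. The sketch below is written against any reader that follows that convention, so it can be run with a stand-in reader and without a child process. The name collect_lines and its two callback parameters are assumptions for illustration.

```perl
use strict;
use warnings;

# Collect lines from a reader that follows the convention described above:
#   undef -> the child has finished and there is no more data
#   ""    -> nothing to read right now, but the child is still active
#   other -> one line of output
sub collect_lines {
    my ($read_line, $wait) = @_;
    my @lines;
    while (1) {
        my $line = $read_line->();
        last unless defined $line;
        if ($line eq '') {
            $wait->();      # give the child time to produce more output
            next;
        }
        push @lines, $line;
    }
    return @lines;
}

# Stand-in reader for demonstration: two lines with one idle poll between them.
my @script = ("alpha\n", "", "beta\n", undef);
my $waits  = 0;
my @got    = collect_lines(sub { shift @script }, sub { $waits++ });
print "got ", scalar(@got), " lines after $waits wait(s)\n";
```

In a real program the two callbacks would presumably be sub { Forks::Super::read_stdout($pid) } and sub { Forks::Super::pause(1) }, so that the parent keeps reaping children and dispatching queued jobs while it waits.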
Obtaining job information
$job = Forks::Super::Job::get($pid)
Returns a Forks::Super::Job object associated with process ID or job ID $pid. See Forks::Super::Job for information about the methods and attributes of these objects.
@jobs = Forks::Super::Job::getByName($name)
Returns zero or more Forks::Super::Job objects with the specified job name. A job receives a name if a name parameter was provided in the Forks::Super::fork call.
$reference = bg_eval { BLOCK }
$reference = bg_eval { BLOCK } { option => value, ... }
Evaluates the specified block of code in a background process. When the parent process dereferences the result, it uses interprocess communication to retrieve the result from the child process, waiting until the child finishes if necessary.
# Example 1: must wait until job finishes before $$result is available $result = bg_eval { sleep 3 ; return 42 }; print "Result is $$result\n"; # Example 2: $$result is probably available immediately $result = bg_eval { sleep 3 ; return 42 }; &do_something_that_takes_about_5_seconds(); print "Result is $$result\n";The code block is always evaluated in scalar context, though it is acceptable to return a reference:
$result = bg_eval {
    @files = ();
    # find() has no useful return value; criteria() is assumed to push
    # each matching file name onto @files
    File::Find::find(\&criteria, @lots_of_dirs);
    return \@files;
};
# ... do something else while that job runs ...
foreach my $matching_file (@$$result) { # note double dereference
    # ... do something with $matching_file
}
The background job will be spawned with the Forks::Super::fork call, and the command will block, fail, or defer a background job in accordance with all of the other rules of this module. Additional options may be passed to bg_eval that will be provided to the fork call. For example:
$result = bg_eval {
    return get_from_teh_Internet($something, $where);
} { timeout => 60, priority => 3 };
will return a reference to undef if the operation takes longer than 60 seconds. Most valid options for the fork call are also valid options for bg_eval, including timeouts, delays, job dependencies, names, and callbacks. The only invalid options for bg_eval are cmd, sub, exec, and child_fh.
@result = bg_eval { BLOCK }
@result = bg_eval { BLOCK } { option => value, ... }
Evaluates the specified block of code in a background process and in list context. The parent process retrieves the result from the child through interprocess communication the first time that an element of the array is referenced; the parent will wait for the child to finish if necessary.
The background job will be spawned with the Forks::Super::fork call, and the command will block, fail, or defer a background job in accordance with all of the rules of this module. Additional options may be passed to the bg_eval function that will be provided to the Forks::Super::fork call. For example:
@result = bg_eval { count_words($a_huge_file) } { timeout => 60 };
will return an empty list if the operation takes longer than 60 seconds. Any valid options for the fork call are also valid options for bg_eval, except for exec, cmd, sub, and child_fh.
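The general idea behind bg_eval (run a block in a child, fetch its value over interprocess communication only when the parent asks for it) can be sketched in plain Perl. This is a minimal sketch under stated assumptions, not the module's implementation: lazy_bg_eval is a hypothetical helper, it uses a pipe and the core Storable module for transport, and it hands back a closure to call rather than a reference to dereference.

```perl
use strict;
use warnings;
use POSIX qw(_exit);
use Storable qw(nfreeze thaw);

# Hypothetical helper, not part of Forks::Super: run $code in a child
# process and return a closure that fetches the scalar result on demand.
sub lazy_bg_eval {
    my ($code) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();    # Perl's built-in fork
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: evaluate the block in scalar context and ship the value.
        close $reader;
        binmode $writer;
        my $value = eval { $code->() };
        print {$writer} nfreeze(\$value);
        close $writer;
        _exit(0);
    }

    # Parent: return immediately; block only when the value is requested.
    close $writer;
    binmode $reader;
    my ($fetched, $value);
    return sub {
        unless ($fetched) {
            my $frozen = do { local $/; <$reader> };    # read until EOF
            close $reader;
            waitpid $pid, 0;
            $value = (defined $frozen && length $frozen)
                ? ${ thaw($frozen) }
                : undef;
            $fetched = 1;
        }
        return $value;
    };
}

my $result = lazy_bg_eval(sub { select undef, undef, undef, 0.2; return 6 * 7 });
# ... the parent is free to do other work here ...
print "Result is ", $result->(), "\n";
```

The first call to the closure waits for the child if it has not finished yet; later calls return the cached value.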
MODULE VARIABLES
Module variables may be initialized on the use Forks::Super line
# set max simultaneous procs to 5, allow children to call CORE::fork()
use Forks::Super MAX_PROC => 5, CHILD_FORK_OK => -1;
or they may be set explicitly in the code:
$Forks::Super::ON_BUSY = 'queue';
$Forks::Super::FH_DIR = "/home/joe/temp-ipc-files";
Previous sections discussed the use of $Forks::Super::MAX_PROC and $Forks::Super::ON_BUSY. Those and other module variables that may be of interest include:
$Forks::Super::MAX_PROC
The maximum number of simultaneous background processes that can be spawned by Forks::Super. If a fork call is attempted while there are already at least this many active background processes, the behavior of the fork call will be determined by the value in $Forks::Super::ON_BUSY or by the on_busy option passed to the fork call.
This value will be ignored during a fork call if the force option is passed to fork with a non-zero value. The value might also not be respected if the user supplies a code reference in the can_launch option and the user-supplied code does not test whether there are already too many active processes.
$Forks::Super::ON_BUSY = 'block' | 'fail' | 'queue'
Determines the behavior of a fork call when the system is too busy to create another background process.
If this value is set to block, then fork will wait until the system is no longer too busy and then launch the background process. The return value will be a normal process ID value (assuming there was no system error in creating a new process).
If the value is set to fail, the fork call will return immediately without launching the background process. The return value will be -1. A Forks::Super::Job object will not be created.
If the value is set to queue, then the fork call will create a "deferred" job that will be queued and run at a later time. Also see the queue_priority option to fork to set the urgency level of a job in case it is deferred. The return value will be a large and negative job ID.
This value will be ignored in favor of an on_busy option supplied to the fork call.
$Forks::Super::CHILD_FORK_OK = -1 | 0 | +1
Spawning a child process from another child process with this module has its pitfalls, and this capability is disabled by default: you will get a warning message and the fork() call will fail if you try it.
To override this behavior, set $Forks::Super::CHILD_FORK_OK to a non-zero value. Setting it to a positive value will allow you to use all the functionality of this module from a child process (with the obvious caveat that you cannot wait on a child process of a child process from the main process).
Setting $Forks::Super::CHILD_FORK_OK to a negative value will disable the functionality of this module but will reenable the classic Perl fork() system call from child processes.
$Forks::Super::DEBUG, Forks::Super::DEBUG
To see the internal workings of the Forks::Super module, set $Forks::Super::DEBUG to a non-zero value. Information messages will be written to the Forks::Super::DEBUG filehandle. By default Forks::Super::DEBUG is aliased to STDERR, but it may be reset by the module user at any time.
Debugging behavior may be overridden for specific jobs if the debug or undebug option is provided to fork.
%Forks::Super::CHILD_STDIN
%Forks::Super::CHILD_STDOUT
%Forks::Super::CHILD_STDERR
In jobs that request access to the child process filehandles, these hashes contain filehandles to the standard input and output streams of the child. The filehandles for a particular job may be looked up in these tables by process ID, or by job ID for jobs that were deferred.
Remember that from the perspective of the parent process, $Forks::Super::CHILD_STDIN{$pid} is an output filehandle (what you print to this filehandle can be read in the child's STDIN), and $Forks::Super::CHILD_STDOUT{$pid} and $Forks::Super::CHILD_STDERR{$pid} are input filehandles (for reading what the child wrote to STDOUT and STDERR).
As with any asynchronous communication scheme, you should be aware of how to clear the EOF condition on filehandles that are being simultaneously written to and read from by different processes. A scheme like this works on most systems:
# in parent, reading STDOUT of a child
for (;;) {
    # readline() is needed here: <...> around a hash element is parsed as a glob
    while (defined(my $line = readline($Forks::Super::CHILD_STDOUT{$pid}))) {
        print "Child $pid said: $line";
    }
    # EOF reached, but child may write more to filehandle later.
    sleep 1;
    seek $Forks::Super::CHILD_STDOUT{$pid}, 0, 1;
}
@Forks::Super::ALL_JOBS, %Forks::Super::ALL_JOBS
List of all Forks::Super::Job objects that were created from fork() calls, including deferred and failed jobs. Both process IDs and job IDs (for jobs that were deferred at one time) can be used to look up Job objects in the %Forks::Super::ALL_JOBS table.
$Forks::Super::QUEUE_INTERRUPT
On systems with mostly-working signal frameworks, this module installs a signal handler the first time that a task is deferred. The signal that is trapped is defined in the variable $Forks::Super::QUEUE_INTERRUPT. The default value is USR1, and it may be overridden directly or set on module import:
use Forks::Super QUEUE_INTERRUPT => 'TERM';
$Forks::Super::QUEUE_INTERRUPT = 'USR2';
You would only worry about resetting this variable if you (including other modules that you import) are making use of an existing SIGUSR1 handler.
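One way to decide whether the default needs changing is to inspect %SIG before the first job is deferred. The following is a minimal sketch, assuming a hypothetical helper pick_queue_signal (not part of Forks::Super) and assuming that a signal counts as "in use" when its %SIG entry is anything other than undefined, empty, or DEFAULT. Forks::Super itself is not loaded here; only the documented package variable is assigned.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first candidate signal that has no
# custom handler installed in this process, or undef if all are taken.
sub pick_queue_signal {
    my @candidates = @_;
    for my $name (@candidates) {
        my $handler = $SIG{$name};
        return $name
            if !defined $handler || $handler eq '' || $handler eq 'DEFAULT';
    }
    return;
}

# Simulate a program that already relies on SIGUSR1 for its own purposes.
$SIG{USR1} = sub { print "application-level USR1 handler\n" };

my $signal = pick_queue_signal(qw(USR1 USR2));
die "no free signal for the deferred-job queue" unless defined $signal;

{
    no warnings 'once';    # Forks::Super is not loaded in this sketch
    $Forks::Super::QUEUE_INTERRUPT = $signal;
}
print "queue interrupt signal: $signal\n";
```

In a real program the assignment would have to happen before the first task is deferred, since that is when the handler is installed.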
DIAGNOSTICS
fork() not allowed in child process ...
Forks::Super::fork() call not allowed in child process ...
When the package variable $Forks::Super::CHILD_FORK_OK is zero, this package does not allow the fork() method to be called from a child process. Set $Forks::Super::CHILD_FORK_OK to a non-zero value to change this behavior.
quick timeout
A job was configured with a timeout/expiration time such that the deadline for the job occurred before the job was even launched. The job was killed immediately after it was spawned.
Job start/Job dependency <nnn> for job <nnn> is invalid. Ignoring.
A process id or job id that was specified as a depend_on or depend_start option did not correspond to a known job.
Job <nnn> reaped before parent initialization.
A child process finished quickly and was reaped by the parent process's SIGCHLD handler before the parent process could even finish initializing the job state. The state of the job in the parent process might be unavailable or corrupt for a short time, but eventually it should be all right.
interprocess filehandles not available
could not open filehandle to provide child STDIN/STDOUT/STDERR
child was not able to detect STDIN file ... Child may not have any input to read.
could not open filehandle to write child STDIN
could not open filehandle to read child STDOUT/STDERR
Initialization of filehandles for a child process failed. The child process will continue, but it will be unable to receive input from the parent through the $Forks::Super::CHILD_STDIN{$pid} filehandle, or pass output to the parent through the filehandles $Forks::Super::CHILD_STDOUT{$pid} and $Forks::Super::CHILD_STDERR{$pid}.
exec option used, timeout option ignored
A fork call was made using the incompatible options exec and timeout.
INCOMPATIBILITIES
Some features use the alarm function and custom SIGALRM handlers in the child processes. Using other modules that employ this functionality may cause undefined behavior. Systems and versions that do not implement the alarm function (like MSWin32 prior to Perl v5.7) will not be able to use these features.
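The conflict arises because a process has only one alarm timer. The short sketch below uses plain Perl with no Forks::Super calls, and the durations are arbitrary assumptions. It shows that a second alarm call replaces the first, which is how another module's timer could silently cancel a timeout armed in a child process. How this module arms its own timer is not shown in this document.

```perl
use strict;
use warnings;

# Pretend this timer was set on behalf of a job timeout (assumed 30 s).
alarm 30;

# Another piece of code now sets its own timer. alarm returns the number
# of seconds that were left on the timer it has just replaced.
my $replaced = alarm 5;

# Cancel the remaining timer so that this sketch exits promptly.
my $cancelled = alarm 0;

print "second alarm call replaced a timer with $replaced seconds left\n";
print "cancelled a timer with $cancelled seconds left\n";
```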
The first time that a task is deferred, by default this module will try to install a SIGUSR1 handler. See the description of $Forks::Super::QUEUE_INTERRUPT under "MODULE VARIABLES" for changing this behavior if you intended to use a SIGUSR1 handler for something else.
DEPENDENCIES
The bg_eval function requires YAML.
Otherwise, there are no hard dependencies on non-core modules. Some features, especially operating-system specific functions, depend on some modules (Win32::API and Win32::Process for Wintel systems, for example), but the module will compile without those modules. Attempts to use these features without the required modules will be silently ignored.
BUGS AND LIMITATIONS
A typical script using this module will have a lot of behind-the-scenes signal handling as child processes finish and are reaped. These frequent interruptions can affect the execution of your program. For example, in this script:
1: use Forks::Super;
2: fork(sub => sub { sleep 2 });
3: sleep 5;
4: # ... program continues ...
the sleep call in line 3 is probably going to get interrupted before 5 seconds have elapsed as the end of the child process spawned in line 2 will interrupt execution and invoke the SIGCHLD handler. In some cases there are tedious workarounds:
3a: $stop_sleeping_at = time + 5;
3b: sleep 1 while time < $stop_sleeping_at;
It should be noted that signal handling in Perl is much improved with version 5.7.3, and the problems caused by such interruptions are much more tractable than they used to be.
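The workaround in lines 3a and 3b can be wrapped in a reusable helper. The following is a minimal sketch, assuming a hypothetical function named sleep_fully and using the core Time::HiRes module for sub-second timing. The SIGCHLD handler and the built-in fork stand in for what the module would do behind the scenes.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h _exit);
use Time::HiRes ();

# Hypothetical helper: keep sleeping until the full interval has elapsed,
# even if signals cut individual sleep calls short.
sub sleep_fully {
    my ($seconds) = @_;
    my $deadline = Time::HiRes::time() + $seconds;
    while ((my $remaining = $deadline - Time::HiRes::time()) > 0) {
        Time::HiRes::sleep($remaining);
    }
    return;
}

# A SIGCHLD handler, standing in for the one installed by the module.
$SIG{CHLD} = sub { 1 while waitpid(-1, WNOHANG) > 0 };

my $pid = fork();    # Perl's built-in fork
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    Time::HiRes::sleep(0.2);    # child exits early and triggers SIGCHLD
    _exit(0);
}

my $start = Time::HiRes::time();
sleep_fully(0.6);
my $elapsed = Time::HiRes::time() - $start;
printf "slept for %.2f seconds\n", $elapsed;
```

Recomputing the remaining time on every pass means that any number of interruptions is tolerated without oversleeping by a whole second, as the one-second loop in line 3b can.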
The system implementation of fork'ing and wait'ing varies from platform to platform. It is possible that this module or certain features will not work as advertised. Please report any problems you encounter to <mob@cpan.org> and I'll see what I can do about it.
SEE ALSO
There are reams of other modules on CPAN for managing background processes. See Parallel::*, Proc::Parallel, Proc::Fork, Proc::Launcher.
Inspiration for bg_eval function from Acme::Fork::Lazy.
AUTHOR
Marty O'Brien, <mob@cpan.org>
LICENSE AND COPYRIGHT
Copyright (c) 2009-2010, Marty O'Brien.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.