NAME
IPC::Exe - Execute processes or Perl subroutines & string them via IPC. Think shell pipes.
SYNOPSIS
use IPC::Exe qw(exe bg);
my @pids = &{
exe sub { "2>#" }, qw( ls /tmp a.txt ),
bg exe qw( sort -r ),
exe sub { print "[", shift, "] 2nd cmd: @_\n"; print "three> $_" while <STDIN> },
bg exe 'sort',
exe "cat", "-n",
exe sub { print "six> $_" while <STDIN>; print "[", shift, "] 5th cmd: @_\n" },
};
is like doing the following in a modern Unix shell:
ls /tmp a.txt 2> /dev/null | { sort -r | [perlsub] | { sort | cat -n | [perlsub] } & } &
except that [perlsub]
is really a perl child process with access to main program variables in scope.
DESCRIPTION
This module was written to provide a secure and highly flexible way to execute external programs with an intuitive syntax. In addition, more info is returned with each string of executions, such as the list of PIDs and $?
of the last external pipe process (see "RETURN VALUES"). Commands are run directly with exec( ), so the shell is never invoked.
The two exported subroutines perform all the heavy lifting of forking and executing processes. In particular, exe( )
implements the KID_TO_READ
version of
http://perldoc.perl.org/perlipc.html#Safe-Pipe-Opens
while bg( )
implements the double-fork technique illustrated at
http://perldoc.perl.org/perlfaq8.html#How-do-I-start-a-process-in-the-background?
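For orientation, the KID_TO_READ pattern that exe( ) builds on can be sketched with core Perl alone. This is a minimal sketch of the technique from perlipc, not the module's own source; the child command (the running perl, $^X, printing three lines) and the names $kid_to_read and @lines are chosen for illustration only.

```perl
use strict;
use warnings;

# Fork a child whose STDOUT is connected to $kid_to_read in the parent.
my $pid = open(my $kid_to_read, '-|');
defined($pid) or die("cannot fork: $!");

my @lines;
if ($pid) {
    # parent: read everything the child writes
    @lines = <$kid_to_read>;
    close($kid_to_read);    # waits for the child and sets $?
}
else {
    # child: exec with a LIST, so no shell is involved
    exec($^X, '-e', 'print "line $_\n" for 1 .. 3')
        or die("cannot exec: $!");
}

my $exitval = $? >> 8;
print "read_in> $_" for @lines;
```

Because exec( ) is given a LIST, the arguments reach the program unparsed, which is the property the description above relies on.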
EXAMPLES
Let's dive right away into some examples. To begin:
my $exit = system( "myprog $arg1 $arg2" );
can be replaced with
my $exit = &{ exe 'myprog', $arg1, $arg2 };
exe( )
returns a LIST of PIDs, the last item of which is $?
(of default &READER
). To get the actual exit value, shift $? right by eight bits:
my $exitval = $? >> 8;
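As a reminder of how the wait status in $? is laid out, here is a small core-Perl sketch. It uses system( ) in LIST form rather than exe( ), and the child command (the running perl exiting with status 3) is an assumption made purely for illustration.

```perl
use strict;
use warnings;

# Run a child perl that exits with status 3 (LIST form, so no shell is used).
system($^X, '-e', 'exit 3');

my $status  = $?;             # raw wait status
my $exitval = $status >> 8;   # high byte: the exit value
my $signal  = $status & 127;  # low 7 bits: signal that ended the child, if any

print "exit value: $exitval, signal: $signal\n";
```

The same decoding applies to the $? item returned by the default &READER.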
Extending the previous example,
my $exit = system( "myprog $arg1 $arg2 $arg3 > out.txt" );
can be replaced with
my $exit = &{ exe sub { [ '>', 'out.txt' ] }, 'myprog', $arg1, $arg2, $arg3 };
The previous two examples will wait for 'myprog' to finish executing before continuing the main program.
Extending the previous example again,
# cannot obtain $exit of 'myprog' because it is in background
system( "myprog $arg1 $arg2 $arg3 > out.txt &" );
can be replaced with
# just add 'bg' before 'exe' in previous example
my $bg_pid = &{ bg exe sub { [ '>', 'out.txt' ] }, 'myprog', $arg1, $arg2, $arg3 };
Now, 'myprog' will be put in background and the main program will continue without waiting.
To monitor the exit value of a background process:
my $bg_pid = &{
bg sub {
# same as 2nd previous example
my ($pid) = &{
exe sub { [ '>', 'out.txt' ] }, 'myprog', $arg1, $arg2, $arg3,
};
# check if exe() was successful
defined($pid) or die("Failed to fork process in background");
# handle exit value here
print STDERR "background exit value: " . ($? >> 8) . "\n";
}
};
# check if bg() was successful
defined($bg_pid) or die("Failed to send process to background");
Instead of using backquotes or qx( )
,
# slurps entire STDOUT into memory
my @stdout = (`$program @ARGV`);
# handle STDOUT here
for my $line (@stdout)
{
print "read_in> $line";
}
we can read the STDOUT
of one process with:
my ($pid) = &{
# execute $program with arguments
exe $program, @ARGV,
# handle STDOUT here
sub {
while (my $line = <STDIN>)
{
print "read_in> $line";
}
# set exit status of main program
waitpid($_[0], 0);
},
};
# check if exe() was successful
defined($pid) or die("Failed to fork process");
# exit value of $program
my $exitval = $? >> 8;
Perform tar copy of an entire directory:
use Cwd qw(chdir);
my @pids = &{
exe sub { chdir $source_dir or die $! }, qw(/bin/tar cf - .),
exe sub { chdir $target_dir or die $! }, qw(/bin/tar xBf -),
};
# check if exe()'s were successful
defined($pids[0]) && defined($pids[1])
or die("Failed to fork processes");
# was un-tar successful?
my $error = pop(@pids);
Here is an elaborate example to pipe STDOUT
of one process to the STDIN
of another, consecutively:
my @pids = &{
# redirect STDERR to STDOUT
exe sub { "2>&1" }, $program, @ARGV,
# 'perl' receives STDOUT of $program via STDIN
exe sub {
my ($pid) = &{
exe qw(perl -e), 'print "read_in> $_" while <STDIN>; exit 123',
};
# check if exe() was successful
defined($pid) or die("Failed to fork process");
# handle exit value here
print STDERR "in-between exit value: " . ($? >> 8) . "\n";
# this is executed in child process
# no need to return
},
# 'sort' receives STDOUT of 'perl'
exe qw(sort -n),
# [perlsub] receives STDOUT of 'sort'
exe sub {
# find out command of previous pipe process
# if @_[1..$#_] is an empty list, previous process was a [perlsub]
my ($child_pid, $prog, @args) = @_;
# output: "last_pipe[12345]> sort -n"
print STDERR "last_pipe[$child_pid]> $prog @args\n";
# print sorted, 'perl' filtered, output of $program
print while <STDIN>;
# find out exit value of previous 'sort' pipe process
waitpid($_[0], 0);
warn("Bad exit for: @_\n") if $?;
return $?;
},
};
# check if exe()'s were successful
defined($pids[0]) && defined($pids[1]) && defined($pids[2])
or die("Failed to fork processes");
# obtain exit value of last process on pipeline
my $exitval = pop(@pids) >> 8;
Shown below is an example of how to capture STDERR
and STDOUT
after sending some input to STDIN
of the child process:
# reap child processes 'xargs' when done
local $SIG{CHLD} = 'IGNORE';
# like IPC::Open3, filehandles are generated on-the-fly
my ($pid, $TO_STDIN, $FROM_STDOUT, $FROM_STDERR) = &{
exe +{ stdin => 1, stdout => 1, stderr => 1 }, qw(xargs ls -ld),
};
# check if exe() was successful
defined($pid) or die("Failed to fork process");
# ask 'xargs' to 'ls -ld' three files
print $TO_STDIN "/bin\n";
print $TO_STDIN "does_not_exist\n";
print $TO_STDIN "/etc\n";
# cause 'xargs' to flush its stdout
close($TO_STDIN);
# print captured outputs
print "stderr> $_" while <$FROM_STDERR>;
print "stdout> $_" while <$FROM_STDOUT>;
# close filehandles
close($FROM_STDOUT);
close($FROM_STDERR);
Of course, more exe( )
calls may be chained together as needed:
# reap child processes 'xargs' when done
local $SIG{CHLD} = 'IGNORE';
# like IPC::Open2, except filehandles are generated on-the-fly
my ($pid1, $TO_STDIN, $pid2, $FROM_STDOUT) = &{
exe +{ stdin => 1 }, sub { "2>&1" }, qw(perl -ne), 'print STDERR "360.0 / $_"',
exe +{ stdout => 1 }, qw(bc -l),
};
# check if exe()'s were successful
defined($pid1) && defined($pid2)
or die("Failed to fork processes");
# ask 'bc -l' results of "360 divided by given inputs"
print $TO_STDIN "$_\n" for 2 .. 8;
# we redirect stderr of 'perl' to stdout
# which, in turn, is fed into stdin of 'bc'
# print captured outputs
print "360 / $_ = " . <$FROM_STDOUT> for 2 .. 8;
# close filehandles
close($TO_STDIN);
close($FROM_STDOUT);
Important: Some non-Unix platforms, such as Win32, require interactive processes (shown above) to know when to quit, and can neither rely on close($TO_STDIN)
, nor on kill(TERM => $pid).
SUBROUTINES
Both exe( )
and bg( )
are optionally exported. They each return CODE references that need to be called.
exe( )
exe \%EXE_OPTIONS, &PREEXEC, LIST, &READER
exe \%EXE_OPTIONS, &PREEXEC, &READER
exe \%EXE_OPTIONS, &PREEXEC
exe &READER
\%EXE_OPTIONS
is an optional hash reference to instruct exe( )
to return STDIN
/ STDERR
/ STDOUT
filehandle(s) of the executed child process. See "SETTING OPTIONS".
LIST
is passed to exec( )
in the child process after the parent forks; the child's stdout is redirected to &READER
's stdin.
&PREEXEC
is called right before exec( )
in the child process, so we may reopen filehandles or do some child-only operations beforehand.
Optionally, &PREEXEC
could return a LIST of strings to perform common filehandle redirections and/or modify binmode
settings (which are performed in-order). The following are preset actions:
"2>#" or "2>null" silence stderr
">#" or "1>null" silence stdout
"2>&1" redirect stderr to stdout
"1>&2" or ">&2" redirect stdout to stderr
"1><2" or "2><1" swap stdout and stderr
"0:crlf" does binmode(STDIN, ":crlf")
"1:raw" or "1:" does binmode(STDOUT, ":raw")
"2:utf8" does binmode(STDERR, ":utf8")
&PREEXEC
could also return array references in the mix to perform open
operations. If open
fails, IPC::Exe
will die. Minimal validation is done for the array items, so be careful. Examples:
[ ">", "/path/file" ] does open(STDOUT, ">", "/path/file")
[ ">>", "/path/file" ] does open(STDOUT, ">>", "/path/file")
[ "2>", "/path/file" ] does open(STDERR, ">", "/path/file")
[ *FH, "+>>", $file ] does open(FH, "+>>", $file)
If references to array refs are returned by &PREEXEC
, then sysopen
will be used instead:
\[ *FH, $file, O_RDWR ] does sysopen(FH, $file, O_RDWR)
\[ *FH, $file, O_WRONLY, 0644 ] does sysopen(FH, $file, O_WRONLY, 0644)
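To make the preset actions more concrete, the following core-Perl sketch performs by hand what "2>&1" is described as doing: STDERR is duplicated onto STDOUT in the child before exec. It is an assumption about the equivalent plain-Perl steps, not the module's actual implementation, and the child command and the names $from_child and @captured are illustrative.

```perl
use strict;
use warnings;

# Fork a child whose STDOUT is piped back to the parent.
my $pid = open(my $from_child, '-|');
defined($pid) or die("cannot fork: $!");

if ($pid == 0) {
    # child: the "2>&1" step, done by hand before exec
    open(STDERR, '>&', \*STDOUT) or die("cannot dup STDOUT: $!");
    exec($^X, '-e', 'print STDERR "to stderr\n"')
        or die("cannot exec: $!");
}

# parent: the child's stderr output now arrives on the pipe
my @captured = <$from_child>;
close($from_child);

print "captured> $_" for @captured;
```

Since the redirection happens in the child only, the parent's own STDERR is left untouched.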
Note that both the actions and the return value of &PREEXEC
matter, since they may redirect filehandles before the child process running &PREEXEC
calls exec.
&PREEXEC
is called with arguments passed to the CODE reference returned by exe( )
.
&READER
is called with ($child_pid, LIST)
as its arguments. LIST
corresponds to the positional arguments passed in-between &PREEXEC
and &READER
.
If exe( )
's are chained, &READER
calls itself as the next exe( )
in line, which in turn, calls the next &PREEXEC
, LIST
, etc.
&READER
is always called in the parent process.
&PREEXEC
is always called in the child process.
&PREEXEC
and &READER
are very similar and may be treated the same.
Call waitpid( $_[0], 0 )
in &READER
to set exit status $?
of previous process executing on the pipe. close( $IPC::Exe::PIPE )
can also be used to close the input filehandle and set $?
at the same time (for Unix platforms only).
If LIST
is not provided, &PREEXEC
will still be called.
If &PREEXEC
is not provided, LIST
will still exec.
If &READER
is not provided, it defaults to
sub { print while <STDIN>; waitpid($_[0], 0); return $? } # $_[0] is the $child_pid
exe( &READER )
returns &READER
.
exe( )
with no arguments returns an empty list.
bg( )
bg \%BG_OPTIONS, &BACKGROUND
bg &BACKGROUND
\%BG_OPTIONS
is an optional hash reference to instruct bg( )
to wait a certain amount of time for &PREEXEC to complete (for non-Unix platforms only). See "SETTING OPTIONS".
&BACKGROUND
is called after it is sent to the init process.
If &BACKGROUND
is not a CODE reference, an empty list is returned upon execution.
bg( )
with no arguments returns an empty list.
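The double-fork technique that bg( ) is described as implementing can be sketched in core Perl as follows. This is a simplified illustration for Unix, not the module's code: the background command (the running perl sleeping for one second) is a placeholder, and POSIX::_exit( ) is used here so that the intermediate child skips END blocks.

```perl
use strict;
use warnings;
use POSIX ();

my $pid = fork();
defined($pid) or die("cannot fork: $!");

if ($pid == 0) {
    # first child: fork again, then exit immediately
    my $grandchild = fork();
    defined($grandchild) or POSIX::_exit(1);
    if ($grandchild == 0) {
        # grandchild: orphaned once the first child exits, so init adopts it
        exec($^X, '-e', 'sleep 1') or POSIX::_exit(1);
    }
    POSIX::_exit(0);
}

# parent: reap the short-lived first child; never waits for the grandchild
waitpid($pid, 0);
my $first_child_status = $? >> 8;
print "background job started\n" if $first_child_status == 0;
```

The parent only ever waits for the first child, which exits at once, so the main program continues without leaving a zombie behind.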
This experimental feature is not enabled by default:
If sending &BACKGROUND to the init process fails, bg( ) can fall back to calling &BACKGROUND in the parent or child process when $IPC::Exe::bg_fallback is true. To enable the fallback feature, set
$IPC::Exe::bg_fallback = 1;
SETTING OPTIONS
exe( )
\%EXE_OPTIONS
is a hash reference that can be provided as the first argument to exe( )
to control returned values. It may be used to return or assign STDIN
/ STDERR
/ STDOUT
filehandle(s) of the child process to emulate IPC::Open2 and IPC::Open3 behavior.
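For comparison, this is the kind of IPC::Open2 usage that the stdin and stdout options are meant to emulate. The sketch uses only the core IPC::Open2 module; the child command (the running perl upper-casing its input) and the handle names are illustrative assumptions.

```perl
use strict;
use warnings;
use IPC::Open2;

# Start a child perl that upper-cases every line it reads.
# open2() takes the read handle first, then the write handle.
my $pid = open2(my $from_stdout, my $to_stdin, $^X, '-ne', 'print uc');

print $to_stdin "hello\n";
close($to_stdin);       # EOF lets the child finish

my @output = <$from_stdout>;
close($from_stdout);
waitpid($pid, 0);       # reap the child and set $?

print "stdout> $_" for @output;
```

With exe( ), the corresponding handles come back in the returned LIST instead of being passed in as arguments.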
The default values are:
%EXE_OPTIONS = (
pid => undef,
stdin => 0,
stdout => 0,
stderr => 0,
autoflush => 1,
binmode_io => undef,
);
These are the effects of setting the following options:
- pid => \$pid
Given a SCALAR reference, set $pid to the child process PID. The PID will then not be returned as part of the return values of exe( ).
- stdin => 1 or stdin => \$TO_STDIN
Return a WRITEHANDLE to STDIN of the child process. The filehandle will be set to autoflush on write if $EXE_OPTIONS{autoflush} is true.
If given a SCALAR reference, set $TO_STDIN to the WRITEHANDLE described above. The WRITEHANDLE will then not be returned as part of the return values of exe( ).
- stdout => 1 or stdout => \$FROM_STDOUT
Return a READHANDLE from STDOUT of the child process, so output to stdout may be captured. When this option is set and &READER is not provided, the default &READER subroutine will NOT be called.
If given a SCALAR reference, set $FROM_STDOUT to the READHANDLE described above. The READHANDLE will then not be returned as part of the return values of exe( ).
- stderr => 1 or stderr => \$FROM_STDERR
Return a READHANDLE from STDERR of the child process, so output to stderr may be captured.
If given a SCALAR reference, set $FROM_STDERR to the READHANDLE described above. The READHANDLE will then not be returned as part of the return values of exe( ).
- autoflush => 0
Disable autoflush on the WRITEHANDLE to STDIN of the child process. This option only has effect when $EXE_OPTIONS{stdin} is true.
- binmode_io => ":raw", ":crlf", ":bytes", ":encoding(utf8)", etc.
Set binmode of STDIN and STDOUT of the child process for layer $EXE_OPTIONS{binmode_io}. This is automatically done for subsequently chained exe( ) calls. To stop this, set it to an empty string "" or to another layer to bring a different mode into effect.
bg( )
NOTE: This only applies to non-Unix platforms.
\%BG_OPTIONS
is a hash reference that can be provided as the first argument to bg( )
to set wait time (in seconds) before relinquishing control back to the parent thread. See "CAVEAT" for reasons why this is necessary.
The default value is:
%BG_OPTIONS = (
wait => 2, # Win32 option
);
RETURN VALUES
By chaining exe( )
and bg( )
statements, calling the single returned CODE reference sets off the chain of executions. This returns a LIST in which each element corresponds to each exe( )
or bg( )
call.
exe( )
When exe( ) executes an external process, the PID for that process is returned, or an EMPTY LIST if exe( ) failed in any operation prior to forking. If an EMPTY LIST is returned, the chain of execution stops there and the next &READER is not called, guaranteeing the final return LIST to be truncated at that point. Failure after forking causes die( ) to be called.
When exe( ) executes a &READER subroutine, the subroutine's return value is returned. If there is no explicit &READER, the implicit default &READER subroutine is called instead:
sub { print while <STDIN>; waitpid($_[0], 0); return $? } # $_[0] is the $child_pid
It returns $?, which is the status of the last pipe process close. This allows code to be written like:
my $exit = &{ exe 'myprog', $myarg };
# $exit = ($myprog_pid, $myprog_exit_status);
When non-default \%EXE_OPTIONS are specified, each exe( ) returns additional filehandles in the following LIST:
(
$PID, # undef if exec failed
$STDIN_WRITEHANDLE, # only if $EXE_OPTIONS{stdin} is true
$STDOUT_READHANDLE, # only if $EXE_OPTIONS{stdout} is true
$STDERR_READHANDLE, # only if $EXE_OPTIONS{stderr} is true
)
The positional LIST form return allows code to be written like:
my ($pid, $TO_STDIN, $FROM_STDOUT) = &{ exe +{ stdin => 1, stdout => 1 }, '/usr/bin/bc' };
SCALAR references may be passed in \%EXE_OPTIONS for their scalars to be assigned in-place, instead of returning them in the positional LIST:
my ($pid, $FROM_STDOUT);
my ($TO_STDIN) = &{ exe +{ pid => \$pid, stdin => 1, stdout => \$FROM_STDOUT }, '/usr/bin/bc' };
Note: It is necessary to disambiguate \%EXE_OPTIONS (also \%BG_OPTIONS) as a hash reference by including a unary + before the opening curly bracket:
+{ stdin => 1, autoflush => 0 }
+{ wait => 2.5 }
bg( )
Calling the CODE reference returned by bg( )
returns the PID of the background process, or an EMPTY LIST
if bg( )
failed in any operation prior to forking. Failure after forking causes die( )
to be called.
ERROR CHECKING
To determine if either exe( )
or bg( )
was successful until the point of forking, check whether the returned $PID
is defined.
See "EXAMPLES" for examples on error checking.
WARNING: This may get slightly complicated for chained exe( )
's when non-default \%EXE_OPTIONS
cause the positions of $PID
in the overall returned LIST to be non-uniform (caveat emptor). Remember, the chain of executions is doing a lot for just a single CODE call, so due diligence is required for error checking.
A minimum count of items (PIDs and/or filehandles) can be expected in the returned LIST to determine whether forks were initiated for the entire exe( )
/ bg( )
chain.
Failures after forking are reported with die( )
. To handle these errors, use eval
.
TAINT CHECKING
In taint mode, exe( )
will die if it is called with tainted arguments or environment variables. By default, the following environment variables are checked:
PATH PATHEXT IFS CDPATH ENV BASH_ENV PERL5SHELL
We may add to this list with:
BEGIN { push @IPC::Exe::TAINT_ENV, qw(PATH_LOCALE TERMINFO TERMPATH) }
SYNTAX
It is highly recommended to avoid unnecessary parentheses ( )'s when using exe( )
and bg( )
.
IPC::Exe
relies on Perl's LIST parsing magic in order to provide the clean intuitive syntax.
As a guide, the following syntax should be used:
my @pids = &{ # call CODE reference
[ bg ] exe [ sub { ... }, ] $prog1, $arg1, @ARGV, # end line with comma
exe [ sub { ... }, ] $prog2, $arg2, $arg3, # end line with comma
[ bg ] exe sub { ... }, # this bg() acts on last exe() only
sub { ... },
};
where brackets [ ]'s denote optional syntax.
Note that Perl sees
my @pids = &{
bg exe $prog1, $arg1, @ARGV,
bg exe sub { "2>#" }, $prog2, $arg2, $arg3,
exe sub { 123 },
sub { 456 },
};
as
my @pids = &{
bg( exe( $prog1, $arg1, @ARGV,
bg( exe( sub { "2>#" }, $prog2, $arg2, $arg3,
exe( sub { 123 },
sub { 456 }
)
)
)
)
);
};
CAVEAT
END { } blocks
Code declared in END blocks will be called upon exit, whether it be after &PREEXEC
sub without a LIST command, from a die
failure, or even a failed exec
call.
The user should make provisions to handle this situation. This is desirable when END blocks must only be called in the main process (or thread).
$IPC::Exe::is_forked
is set to true after the code forks in &PREEXEC
and &BACKGROUND
. It can be used to tell the main process/thread apart from child processes/threads:
END {
# only run in main process/thread
return if $IPC::Exe::is_forked;
### REST OF THE CODE GOES HERE ###
...
}
PLATFORMS
This module is targeted for Unix environments, using techniques described in perlipc and perlfaq8. Development is done on FreeBSD, Linux, and Win32 platforms. It may not work well on other non-Unix systems, let alone Win32.
MSWin32
Some care was taken to rely on Perl's Win32 threaded implementation of fork( )
. To get things to work almost like Unix, redirections of filehandles have to be performed in a certain order. More specifically: let's say STDOUT of a child process (read: thread) needs to be redirected elsewhere (anywhere, it doesn't matter). It is important that the parent process (read: thread) does not use STDOUT until after the child is exec'ed. At the point after exec, the parent must restore STDOUT to a previously dup'ed original and may then proceed along as usual. If this order is violated, deadlocks may occur, often manifesting as an apparent stall in execution when the parent tries to use STDOUT.
exe( )
Since fork( )
is emulated with threads, &PREEXEC
and &READER
really do begin their lives in the same process, but in separate threads. This imposes limitations on how they can be used. One limitation is that, as separate threads, either one MUST NOT block, or else the other thread will not be able to continue.
Writing to, or reading from a pipe will block when the pipe buffer is full or empty, respectively.
Putting the facts together, it means that a pipe writer and reader should not function (as separate threads or otherwise) in the same process for fear that one may block and not let the other continue (a deadlock).
For example, this code below will block:
&{
exe sub { print "a" x 9000, "\n" for 1 .. 3 }, # sub is &PREEXEC
sub { @result = <STDIN> } # sub is &READER
};
The execution stalls, and the program just hangs there. &PREEXEC
is writing out more data than the pipe buffer can fit. Once the buffer is full, print
will block to wait for the buffer to be emptied. However, &READER
is not able to continue and read off some data from the pipe buffer because it is in the same blocked process. If it were in a separate process (as in a real fork
), then a blocking &PREEXEC
could not affect &READER
.
The way to ensure exe( )
works smoothly on Win32 is to exec
processes on the pipeline chain. This code will work instead:
&{
exe qw(perl -e), 'print "a" x 9000, "\n" for 1 .. 3', # &PREEXEC exec'ed perl
sub { @result = <STDIN> } # sub is &READER
};
Now, &PREEXEC
is no longer running in the same process, and cannot affect &READER
. If the new perl
process blocks, &READER
in the original process can still continue to read the pipe.
Writing and reading small amounts of data (to not cause blocking) between &PREEXEC
and &READER
is possible, but not recommended.
bg( )
On Win32, bg( )
unfortunately has to substantially rely on timer code to wait for &PREEXEC
to complete in order to work properly with exe( )
. The example shown below illustrates that bg( )
has to wait at least until $program
is exec'ed. Hence, $wait_time > $work_time
must hold true and this requires a priori knowledge of how long &PREEXEC
will take.
&{
bg +{ wait => $wait_time }, exe sub { sleep($work_time) }, $program
};
This essentially renders bg &BACKGROUND
useless if &BACKGROUND
does not exec any programs (Win32).
In summary: (on Win32)
- Only use bg( ) to exec programs into the background.
- Keep &PREEXEC as short-running as possible, or make sure the $BG_OPTIONS{wait} time is longer.
- No &PREEXEC (or code running in a parallel thread) == no problems.
Some useful information:
http://perldoc.perl.org/perlfork.html#CAVEATS-AND-LIMITATIONS
http://www.nntp.perl.org/group/perl.perl5.porters/2003/11/msg85488.html
http://www.nntp.perl.org/group/perl.perl5.porters/2003/08/msg80311.html
http://www.perlmonks.org/?node_id=684859
http://www.perlmonks.org/?node_id=225577
http://www.perlmonks.org/?node_id=742363
DEPENDENCIES
Perl v5.6.0+ is required.
No non-core modules are required.
AUTHOR
Gerald Lai <glai at cpan dot org>