NAME
ModPerl2::Tools - a few hopefully useful tools
SYNOPSIS
use ModPerl2::Tools;
ModPerl2::Tools::spawn +{keep_fd=>[3,4,7], survive=>1}, sub {...};
ModPerl2::Tools::spawn +{keep_fd=>[3,4,7], survive=>1}, qw/bash -c .../;
ModPerl2::Tools::safe_die $status;
$r->safe_die($status);
$f->safe_die($status);
$content=ModPerl2::Tools::fetch_url $url;
$content=$r->fetch_url($url);
DESCRIPTION
This module is a collection of functions and methods that I found useful when working with mod_perl. I work mostly under Linux, so I don't expect all of these functions to work on other operating systems.
Forking off long running processes
Sometimes one needs to spawn a long-running process as the result of a request. Under mod_perl this is not as simple as calling fork, because that way the child would inherit all open file descriptors and, more subtly, the long-running process would be killed when the administrator shuts down the web server. The former is usually considered a security issue, the latter a design decision.
There is already $r->spawn_proc_prog, which serves a similar purpose to the spawn function. However, spawn_proc_prog is not usable for long-running processes because it kills the children after a certain timeout.
Solution
$pid=ModPerl2::Tools::spawn \%options, $subroutine, @parameters;
or
$pid=ModPerl2::Tools::spawn \%options, @command_line;
spawn expects an options hash reference as the first parameter. The second parameter may be a code reference or a string.
If it is a code reference, no other program is executed; the subroutine is called instead, and the remaining parameters are passed to it. If it is a string, it and the remaining parameters are executed as a command line.
Note that the Perl environment under mod_perl differs in certain ways from a normal Perl environment. For example, %ENV is not bound to the C-level environ. This module does not undo these modifications. So it is generally better to execute another Perl interpreter than to use the $subroutine feature.
The options parameter accepts these options:
- keep_fd => \@fds
  An array of file descriptor numbers (not file handles) is expected here. All file descriptors except the listed ones and file descriptor 2 (STDERR) are closed before calling $subroutine or executing @command_line.
- survive => $boolean
  If false, the created process is killed when Apache shuts down. If true, it survives an Apache restart.
On success, spawn returns the PID of the process. On failure, it returns undef or an empty string.
The created process is not a child process of the current Apache child.
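The general technique behind a function like this can be sketched in plain Perl: fork twice so that the worker is no longer a child of the calling process, start a new session, and close every descriptor that is not on the keep list. This is an illustration under assumptions, not the module's actual implementation. The names fds_to_close and detach_and_run are hypothetical, the upper descriptor bound of 1023 is an arbitrary choice, and unlike spawn the sketch returns only a success flag rather than the PID.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: list the descriptors to close, given the ones to keep.
# Descriptor 2 (STDERR) is always kept, mirroring the behaviour described above.
sub fds_to_close {
    my ($keep, $max_fd) = @_;
    my %keep = map { ($_ => 1) } 2, @$keep;
    return grep { !$keep{$_} } 0 .. $max_fd;
}

# Hypothetical sketch of the double-fork idiom.
sub detach_and_run {
    my ($keep, $code, @args) = @_;
    defined(my $pid = fork) or return;
    if ($pid) {                        # caller: reap the short-lived middle process
        waitpid $pid, 0;
        return $? == 0;
    }
    POSIX::setsid();                   # middle process: start a new session
    defined(my $pid2 = fork) or POSIX::_exit(1);
    POSIX::_exit(0) if $pid2;          # middle process exits, worker is orphaned
    POSIX::close($_) for fds_to_close($keep, 1023);
    $code->(@args);
    POSIX::_exit(0);
}
```

Because the middle process exits immediately, the worker is adopted by init (or a subreaper) and is not tied to the lifetime of the process that started it.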
Serving ErrorDocuments
Triggering ErrorDocuments from a registry script, or even more so from an output filter, is not simple. The normal way in a handler is
return Apache2::Const::STATUS;
This does not work for registry scripts. An output filter, even if it returns a status, can trigger only a SERVER_ERROR.
The main interface for entering standard error processing in Apache is ap_die() at C level. Its Perl interface is hidden in Apache2::HookRun.
There is one case when an error message cannot be sent to the user. This happens if the HTTP headers are already on the wire. Then it is too late.
The various flavors of safe_die() take this into account.
safe_die does not return. Instead, it calls ModPerl::Util::exit(0), which raises an exception.
- ModPerl2::Tools::safe_die $status
  This function is designed to be called from registry scripts. It uses Apache2::RequestUtil->request to fetch the current request object, so PerlOption +GlobalRequest must be enabled.
  Usage example:
  ModPerl2::Tools::safe_die 401;
- $r->safe_die($status)
- $f->safe_die($status)
  These 2 methods are to be used if a request object or a filter object is available.
  Usage from within a filter:
  package My::Filter;
  use strict;
  use warnings;
  use ModPerl2::Tools;
  use base 'Apache2::Filter';

  sub handler : FilterRequestHandler {
      my ($f, $bb)=@_;
      $f->safe_die(410);
  }
  The filter flavor removes the current filter from the request's output filter chain.
- $r->headers_sent
  This method checks whether the HTTP_HEADER output filter is still present. If it is, the method returns an empty list; otherwise it returns true.
  The presence of this filter means that no output has been written to the client yet, so the HTTP status code and header fields can still be modified.
Fetching the content of another document
Sometimes a handler or a filter needs the content of another document in the web server's realm. Apache provides subrequests for this purpose.
The 2 fetch_url variants use a subrequest to fetch the content of another document. The document can even be fetched via mod_proxy from another server.
ModPerl2::Tools::fetch_url needs PerlOption +GlobalRequest.
Usage:
$content=ModPerl2::Tools::fetch_url '/some/where?else=42';
$content=$r->fetch_url('/some/where?else=42');
($content, $headers)=
$r->fetch_url('http://what.is/the/meaning/of?life=42');
If mod_proxy is available, fetch_url can use it to fetch a document from another web server. If mod_ssl is configured to allow proxying SSL (see SSLProxyEngine), even the https scheme works. Another subtle point: ProxyErrorOverride may affect the output in case of an error.
Further, if fetch_url is passed a subroutine as the last argument, the content is not accumulated in a single variable but passed brigade by brigade to the function:
($content, $headers)=
  $r->fetch_url('http://what.is/the/meaning/of?life=42', sub {
    my ($subr, @brigade)=@_;
    ...
  });
The subroutine is called with the subrequest as the first parameter, followed by a list of non-empty strings. The list may be empty if none of the buckets in the brigade contains data.
On success, the resulting $content is the empty string in this case.
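A callback of this shape can process the response incrementally. The sketch below counts bytes instead of storing them. The name make_counter is hypothetical, and the subrequest argument is ignored so that the snippet runs without Apache; the two direct calls only simulate what fetch_url would do.

```perl
use strict;
use warnings;

# Hypothetical factory: returns a callback suitable as the last argument to
# fetch_url, plus a reader for the running total.
sub make_counter {
    my $total = 0;
    my $cb = sub {
        my ($subr, @brigade) = @_;     # @brigade may be empty
        $total += length $_ for @brigade;
        return;
    };
    return ($cb, sub { $total });
}

my ($cb, $total) = make_counter();

# Simulated invocations; under mod_perl, fetch_url would make these calls.
$cb->(undef, 'Hello, ', 'world');
$cb->(undef);
print $total->(), "\n";                # prints 12
```

Under mod_perl the callback would be passed as $r->fetch_url($url, $cb), and the total read afterwards with $total->().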
fetch_url() normally strips almost all input HTTP header fields from the subrequest before running it. However, if the $r request object has a Host header field, it is passed on. Also, a User-Agent header containing ModPerl2::Tools/$VERSION is set for the subrequest, where $VERSION is the module's version.
If you need to pass more fields, pass an array reference as the 2nd parameter to fetch_url():
($content, $headers)=
$r->fetch_url('http://what.is/the/meaning/of?life=42', [qw/
X-MyHeader my-value
X-MyNextHeader my-next-value
/]);
or even:
($content, $headers)=
  $r->fetch_url('http://what.is/the/meaning/of?life=42', [qw/
    X-MyHeader     my-value
    X-MyNextHeader my-next-value
  /], sub {
    my ($subr, @brigade)=@_;
    ...
  });
If Host or User-Agent headers are passed this way, they overwrite the default ones.
Note, though, the header fields are assigned to the subrequest just before the response handler is run. Earlier phases will see a copy of the main request's headers.
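The precedence rule for the header list can be illustrated with a small stand-alone sketch. It is not the module's code: merge_headers is a hypothetical helper, the case-insensitive matching is an assumption, and the host name and the x.y version are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the documented precedence: fields passed
# by the caller replace defaults of the same (case-insensitive) name.
sub merge_headers {
    my ($defaults, $extra) = @_;
    my %merged = map { (lc($_) => $defaults->{$_}) } keys %$defaults;
    my @pairs = @$extra;               # copy, so the caller's list is untouched
    while (my ($name, $value) = splice @pairs, 0, 2) {
        $merged{lc $name} = $value;
    }
    return \%merged;
}

my $merged = merge_headers(
    { 'Host' => 'www.example.org', 'User-Agent' => 'ModPerl2::Tools/x.y' },
    [ 'User-Agent' => 'MyAgent/1.0', 'X-MyHeader' => 'a value with spaces' ],
);
print "$_: $merged->{$_}\n" for sort keys %$merged;
```

Because qw// splits on whitespace, a header value that contains spaces has to be written as an ordinary list of name/value pairs, as in the call above.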
How does it work?
If the passed URL starts with https:// or http://, a subrequest for the URI / is initiated via $r->lookup_uri('/'). Before the subrequest is run, it is changed into a proxy request for the passed URL. One precondition for this to work is that there are no access restrictions for the URL /.
Otherwise it is simply a subrequest for the passed URL.
Then ModPerl2::Tools::Filter::fetch_content_filter is installed as an output filter for the subrequest. After that the subrequest is run.
The output filter collects all output.
When the request is done, its $r->headers_out is copied into a normal hash. In list context the output string and this hash are returned; in scalar context only the string is returned.
HTTP header names are case insensitive. Their names are all converted to lower case in the $headers hash. There are 2 hash members in upper case: STATUS contains the HTTP status code of the subrequest and STATUSLINE the status line.
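Working with the returned hash might look like the following sketch. The sample data is invented for illustration and only mirrors the structure described above; header_field is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical sample of the structure described above; real values come
# from fetch_url called in list context.
my $headers = {
    'content-type' => 'text/html; charset=UTF-8',
    'etag'         => '"abc123"',
    STATUS         => 200,
    STATUSLINE     => '200 OK',
};

# Hypothetical helper: look up a field regardless of how the caller spells it.
sub header_field {
    my ($h, $name) = @_;
    return $h->{lc $name};
}

if ($headers->{STATUS} == 200) {
    print header_field($headers, 'Content-Type'), "\n";
}
```

Since the two upper-case keys cannot collide with the lower-cased header names, the status can be checked before any header field is read.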
Useful functions for similar cases
Note, it is always better to process data one chunk at a time. Try hard to do that! Collecting data in memory should only be a last resort.
- ModPerl2::Tools::Filter::read_bb $bucket_brigade, \@buffer
  read_bb collects the data of a bucket brigade in the @buffer array. If an EOS bucket has been seen, it returns true, otherwise false.
  A simple output filter that collects all data could look like:
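  sub filter {
      my ($f, $bb)=@_;
      # keep the buffer in the filter context so it persists across invocations
      my $buffer=$f->ctx;
      $f->ctx($buffer=[]) unless $buffer;
      do_something(join '', @$buffer)
          if ModPerl2::Tools::Filter::read_bb $bb, $buffer;
      return Apache2::Const::OK;
  }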
- ModPerl2::Tools::Filter::fetch_content_filter
  This function is a FilterRequestHandler. It is controlled by 2 elements of $r->pnotes, out and force_fetch_content. out must be an array reference. It is passed to read_bb to collect the output. force_fetch_content is a flag. If it is false and $r->status is not HTTP_OK on the first invocation of the filter, the filter does nothing and removes itself.
  Usage:
  my $subr=$r->lookup_uri(...);
  my $output=[];
  @{$subr->pnotes}{qw/out force_fetch_content/}=($output,1);
  $subr->add_output_filter
    (\&ModPerl2::Tools::Filter::fetch_content_filter);
  $subr->run;
  do_something(join '', @$output);
EXPORTS
None.
TODO
SEE ALSO
AUTHOR
Torsten Förtsch, <torsten.foertsch@gmx.net>
COPYRIGHT AND LICENSE
Copyright (C) 2010 by Torsten Förtsch
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.