NAME

Apache::Filter - Alter the output of previous handlers

SYNOPSIS

#### In httpd.conf:
PerlModule Apache::Filter
# That's it - this isn't a handler.

<Files ~ "\.blah$">
 SetHandler perl-script
 PerlHandler Filter1 Filter2 Filter3
</Files>

#### In Filter1, Filter2, and Filter3:
my $fh = $r->filter_input();
while (<$fh>) {
  s/ something / something else /;
  print;
}

#### or, alternatively:
my ($fh, $status) = $r->filter_input();
return $status unless $status == OK;  # The Apache::Constants OK
while (<$fh>) {
  s/ something / something else /;
  print;
}

DESCRIPTION

Each of the handlers Filter1, Filter2, and Filter3 will make a call to $r->filter_input(), which will return a filehandle. For Filter1, the filehandle points to the requested file. For Filter2, the filehandle contains whatever Filter1 wrote to STDOUT. For Filter3, it contains whatever Filter2 wrote to STDOUT. The output of Filter3 goes directly to the browser.

Note that the modules Filter1, Filter2, and Filter3 are listed in forward order, in contrast to the reverse-order listing of Apache::OutputChain.
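
To picture the data flow, here is a minimal plain-Perl sketch of a forward-order chain. It is not how Apache::Filter is implemented. The names run_chain, upcase, and reverse_lines are hypothetical, invented for illustration, and in-memory filehandles stand in for the request file and for STDOUT.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for Filter1 and Filter2: each reads an input
# filehandle and prints its result to the currently selected handle.
sub upcase        { my $in = shift; while (my $line = <$in>) { print uc $line } }
sub reverse_lines { my $in = shift; print reverse <$in> }

# Run the filters in the order given. Each filter's output is buffered
# in memory and handed to the next filter as its input filehandle.
sub run_chain {
    my ($input, @filters) = @_;
    for my $filter (@filters) {
        open my $in, '<', \$input or die "can't read buffer: $!";
        my $output = '';
        open my $out, '>', \$output or die "can't write buffer: $!";
        my $old = select $out;      # capture the filter's prints
        $filter->($in);
        select $old;
        close $out;
        close $in;
        $input = $output;           # becomes the next filter's input
    }
    return $input;
}

print run_chain("one\ntwo\n", \&upcase, \&reverse_lines);  # prints "TWO\nONE\n"
```

The first filter listed runs first, and only the last filter's output leaves the chain.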

When you've got this module, you can use the same handler both as a stand-alone handler, and as an element in a chain. Just make sure that whenever you're chaining, all the handlers in the chain are "Filter-aware," i.e. they each call $r->filter_input() exactly once, before they start printing to STDOUT. There should be almost no overhead for doing this when there's only one element in the chain.
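
A Filter-aware handler might look roughly like the sketch below. The package name My::UpperCase is hypothetical, and the sketch assumes mod_perl 1 with Apache::Filter already loaded via PerlModule, so it only runs inside the server.

```perl
package My::UpperCase;   # hypothetical module name, for illustration only

use strict;
use Apache::Constants qw(OK);

sub handler {
    my $r = shift;

    # Set headers first; Apache::Filter sends them, so this handler
    # never calls $r->send_http_header() itself.
    $r->content_type('text/html');

    # Call filter_input() exactly once, before printing anything.
    my ($fh, $status) = $r->filter_input();
    return $status unless $status == OK;

    # Read from the file or the previous filter, write to STDOUT.
    while (<$fh>) {
        print uc $_;
    }
    return OK;
}

1;
```

Because the handler reads only from $fh and writes only to STDOUT, the same code should work alone (PerlHandler My::UpperCase) or anywhere in a chain.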

METHODS

This module doesn't create a class of its own - rather, it adds some methods to the Apache class. Thus, it's really a mix-in package that just adds functionality to the $r request object.

  • $r->filter_input()

    This method will give you a filehandle that contains either the file requested by the user ($r->filename), or the output of a previous filter. If called in a scalar context, that filehandle is all you'll get back. If called in a list context, you'll also get an Apache status code (OK, NOT_FOUND, or FORBIDDEN) that tells you whether $r->filename was successfully found and opened.

  • $r->changed_since($time)

    Returns true or false based on whether the current input seems like it has changed since $time. Currently what this means is that if we're the first handler in the chain, and the file pointed to by $r->filename hasn't changed since the time given, then we return false. Otherwise we return true.

    In the future, there might be a way for filters to specify whether they're "deterministic" or not (given identical input at different times, a deterministic filter will always return the same output). So if you had a filter chain in which the first filter just converted all its input to upper-case, and then the second filter applied some more complicated procedure, the second filter could implement a scheme that cached the output of the upper-caser by checking to see whether only deterministic filters had filtered its input. Such a feature seems frivolous to me though, so don't expect it unless you can convince me that it's a good idea.
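
As a rough sketch of how changed_since might be used, the handler below keeps its last output per file and reuses it when the input looks unchanged. The package name My::CachedUpperCase and the %cache layout are hypothetical, the cache is local to each server child, and calling changed_since after filter_input is an assumption rather than something these docs guarantee.

```perl
package My::CachedUpperCase;   # hypothetical module name

use strict;
use Apache::Constants qw(OK);

# Hypothetical per-child cache: filename => { time => ..., output => ... }
my %cache;

sub handler {
    my $r = shift;
    $r->content_type('text/html');

    # Still call filter_input() exactly once, even on a cache hit.
    my ($fh, $status) = $r->filter_input();
    return $status unless $status == OK;

    my $file  = $r->filename;
    my $entry = $cache{$file};

    # changed_since returns true unless we are first in the chain and
    # the file is unmodified, so the cache is only used in that case.
    if ($entry and not $r->changed_since($entry->{time})) {
        print $entry->{output};
        return OK;
    }

    my $now    = time;   # record the time before reading the input
    my $output = '';
    while (<$fh>) {
        $output .= uc $_;
    }
    $cache{$file} = { time => $now, output => $output };

    print $output;
    return OK;
}

1;
```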

HEADERS

In order to make a decent web page, none of the filters should call $r->send_http_header() itself, or you'll get lots of headers all over your page. This is so obvious that the previous sentence should be a lot shorter.

So the current solution is to have _none_ of the filters send the headers, and this module will send them for you when the last filter calls $r->filter_input(). You should still set up the content-type (using $r->content_type), and any other headers you want to send, before calling $r->filter_input(). filter_input will simply call $r->send_http_header() with no arguments to send whatever headers you have set.

One downside of this is that all the filters in the stack will probably call $r->content_type, most of them for no reason, but c'est la vie. If anyone's got better ideas, don't hold them back.

NOTES

VERY IMPORTANT: if one handler in a stacked handler chain uses Apache::Filter, then THEY ALL MUST USE IT. This means they all must call $r->filter_input exactly once. Otherwise Apache::Filter can't capture the output of the handlers properly, and it won't know when to release the output to the browser.

The output of each filter is accumulated in memory before it's passed to the next filter, so memory requirements might be large for large pages. I'm not sure whether Apache::OutputChain is subject to this same behavior. In future versions I might find a way around this, or cache large pages to disk so memory requirements don't get out of hand. We'll see whether it's a problem.

My usual alpha disclaimer: the interface here isn't stable. So far this should be treated as a proof-of-concept.

A couple of examples of filters are provided with this distribution in the t/ subdirectory: UC.pm converts all its input to upper-case, and Reverse.pm prints the lines of its input reversed.

I tried using $r->finfo for file-test operators, but they didn't seem to work. If they start working or I figure out what's going on, I'll replace $r->filename with $r->finfo. This is pretty bizarre, because it worked fine in Apache::SSI (shrug).

BUGS

This uses some funny stuff to figure out when the currently executing handler is the last handler in the chain. As a result, code that manipulates the handler list at runtime (using push_handlers and the like) might produce mayhem. Poke around a bit in the code before you try anything.

I haven't considered what will happen if you use this and you haven't turned on PERL_STACKED_HANDLERS.

AUTHOR

Ken Williams (ken@forum.swarthmore.edu)

COPYRIGHT

Copyright 1998 Ken Williams. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

perl(1).