NAME
HTTPD::Log::Filter - a module to filter entries out of an httpd log.
SYNOPSIS
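    use HTTPD::Log::Filter;

    my $hlf = HTTPD::Log::Filter->new(
        exclusions_file => $exclusions_file,
        agent_re        => '.*Mozilla.*',
        format          => 'ELF',
    );

    # Filter line by line, stopping at the first invalid entry
    while ( <> )
    {
        my $ret = $hlf->filter( $_ );
        die "Error at line $.: invalid log format\n" unless defined $ret;
        print $_ if $ret;
    }

    # Or, more tersely, if invalid entries need no special handling
    print grep { $hlf->filter( $_ ) } <>;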
DESCRIPTION
This module provides a simple interface to filter entries out of an httpd logfile. The constructor can be passed regular expressions to match against particular fields of the logfile. Filtering is done line by line, using a filter method that takes a line of a logfile as input and returns true if it matches and false if it doesn't.
There are two possible non-matching (false) conditions. One is where the line is a valid httpd logfile entry but doesn't match the filter, in which case "" is returned. The other is where the line is an invalid entry according to the format specified in the constructor, in which case undef is returned.
CONSTRUCTOR
The constructor is passed a number of options as a hash. These are:
- exclusions_file
-
This option specifies the name of a file to which entries that don't match the filter are written.
- invert
-
This option, if set to true, inverts the logic of the filter, so that only non-matching lines are returned.
- format
-
This should be one of:
- CLF
-
Common Log Format (CLF):
"%h %l %u %t \"%r\" %>s %b"
- ELF
-
NCSA Extended/combined Log format:
"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""
- XLF
-
A bespoke format based on the extended log format, with some junk at the end:
"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %j"
where %j is .* in regex-speak, i.e. it matches any trailing text.
See http://httpd.apache.org/docs/mod/mod_log_config.html for more information on log file formats.
- (host|ident|authexclude|date|request|status|bytes|referer|agent)_re
-
This class of options specifies the regular expression or expressions which are used to filter the corresponding field of each logfile entry.
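To illustrate how a format string and per-field regular expressions could combine, here is a minimal sketch. It is not the module's actual implementation: the names filter_clf, @fields and $clf_re are hypothetical, the field name "user" is chosen for illustration only, and the pattern is a simplified assumption that covers only CLF.

```perl
use strict;
use warnings;

# Hypothetical field names, one per CLF directive in
# "%h %l %u %t \"%r\" %>s %b" (not the module's own option names).
my @fields = qw( host ident user date request status bytes );

# Simplified assumption of a CLF pattern: one capture group per field.
my $clf_re = qr/^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)$/;

# Returns the line on a match, "" for a valid entry that fails one of the
# field filters, and undef (in scalar context) for an invalid entry.
sub filter_clf {
    my ( $line, %filter ) = @_;
    chomp( my $copy = $line );

    my @values = $copy =~ $clf_re;
    return unless @values;              # not a valid CLF entry

    my %entry;
    @entry{ @fields } = @values;

    for my $name ( keys %filter ) {
        return "" unless $entry{ $name } =~ $filter{ $name };
    }
    return $line;
}

my $line = qq{127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326\n};
my $ret  = filter_clf( $line, status => qr/^200$/, request => qr/^GET / );
print defined $ret ? ( $ret ? "match: $ret" : "no match\n" ) : "invalid\n";
```

Every supplied field filter has to match for the line to be returned, which is one plausible reading of how several _re options would combine. The module itself may behave differently.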
METHODS
filter
Filters a line of an httpd logfile. Returns true (the line) if it matches, and false ("" or undef) if it doesn't.
There are two possible non-matching (false) conditions. One is where the line is a valid httpd logfile entry but doesn't match the filter, in which case "" is returned. The other is where the line is an invalid entry according to the format specified in the constructor, in which case undef is returned.
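Because a caller has to tell the three outcomes apart, a small tallying helper can be useful. The sketch below is not part of the module. The name tally is hypothetical, and it accepts any code reference that follows the return convention described above, so it can be tried without a logfile to hand.

```perl
use strict;
use warnings;

# Count lines by outcome. $filter is any code reference that returns the
# line on a match, "" on a valid non-match, and undef on an invalid entry.
sub tally {
    my ( $filter, @lines ) = @_;
    my %count = ( matched => 0, unmatched => 0, invalid => 0 );

    for my $line ( @lines ) {
        my $ret = $filter->( $line );
        if    ( !defined $ret ) { $count{ invalid }++ }
        elsif ( $ret eq "" )    { $count{ unmatched }++ }
        else                    { $count{ matched }++ }
    }
    return \%count;
}
```

With a filter object $hlf constructed as in the SYNOPSIS, the call would presumably be tally( sub { $hlf->filter( $_[0] ) }, <> ).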
re
Returns the current filter regular expression.
AUTHOR
Ave Wrigley <Ave.Wrigley@itn.co.uk>
COPYRIGHT
Copyright (c) 2001 Ave Wrigley. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.