NAME
Logfile::EPrints - Parse Apache logs from GNU EPrints
SYNOPSIS
    use Logfile::EPrints;

    # MyHandler is defined below; any object implementing the callbacks will do.
    my $MyHandler = MyHandler->new;

    my $parser = Logfile::EPrints->new(
        handler => Logfile::Repeated->new(
            handler => Logfile::Institution->new(
                handler => $MyHandler,
        )),
        identifier => 'oai:myir:', # Prepended to the eprint id
    );
    open my $fh, '<', 'access_log' or die $!;
    $parser->parse_file($fh);

    package MyHandler;

    sub new { ... }
    sub AUTOLOAD { ... }   # catches callbacks not defined explicitly

    # Called once for every full-text request
    sub fulltext {
        my ($self,$hit) = @_;
        printf("%s from %s requested %s (%s)\n",
            $hit->hostname||$hit->address,
            $hit->institution||'Unknown',
            $hit->page,
            $hit->identifier,
        );
    }
DESCRIPTION
The Logfile::* modules provide a means to analyze log files from Web servers (typically Institutional Repositories) by translating HTTP requests into more informative data, e.g. a full-text download by a user at Caltech.
The design is a chain of pluggable filters. The first filter in the chain reads a log file or stream and converts each entry from the log file format into a Perl object representing a single "hit", which it hands to the next filter through a callback. Subsequent filters can then ignore hits (e.g. from robots) and/or augment them with additional data (e.g. country of origin by GeoIP).
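The filter chain described above can be sketched in a few lines of plain Perl. This is an illustration only: the Sketch::* packages, the hard-coded address table standing in for GeoIP, and the user-agent pattern are all assumptions made for the example and are not part of the Logfile::* distribution.

```perl
use strict;
use warnings;

# Hypothetical hit object (a stand-in, not Logfile::Hit).
package Sketch::Hit;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub address { $_[0]{address} }
sub agent   { $_[0]{agent} }
sub page    { $_[0]{page} }
sub country { $_[0]{country} }

# Filter that ignores hits whose user-agent looks like a robot.
package Sketch::RobotFilter;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub fulltext {
    my ($self, $hit) = @_;
    return if ($hit->agent // '') =~ /bot|crawler|spider/i;  # drop the hit
    return $self->{handler}->fulltext($hit);                 # pass it on
}

# Filter that augments hits with a country (lookup table assumed, no GeoIP).
package Sketch::CountryFilter;
my %COUNTRY = ('192.0.2.10' => 'GB');
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub fulltext {
    my ($self, $hit) = @_;
    $hit->{country} = $COUNTRY{ $hit->address } || 'Unknown';
    return $self->{handler}->fulltext($hit);
}

# Final handler: records a summary line for every hit that reaches it.
package Sketch::Collector;
sub new { my ($class) = @_; return bless { seen => [] }, $class }
sub fulltext {
    my ($self, $hit) = @_;
    push @{ $self->{seen} },
        sprintf('%s (%s) %s', $hit->address, $hit->country, $hit->page);
    return;
}
sub seen { @{ $_[0]{seen} } }

package main;

# Each filter wraps the next handler, as in the SYNOPSIS.
my $collector = Sketch::Collector->new;
my $chain = Sketch::RobotFilter->new(
    handler => Sketch::CountryFilter->new(handler => $collector),
);

$chain->fulltext(Sketch::Hit->new(
    address => '192.0.2.10', agent => 'Mozilla/5.0', page => '/12/1/paper.pdf'));
$chain->fulltext(Sketch::Hit->new(
    address => '192.0.2.99', agent => 'Googlebot/2.1', page => '/12/1/paper.pdf'));

print "$_\n" for $collector->seen;
```

Only the first hit reaches the collector; the second is dropped by the robot filter before the country filter ever sees it. Because every filter exposes the same callback and holds a reference to the next handler, filters can be reordered or removed without changing the others.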
CALLBACKS
See Logfile::Hit for the fields available from the 'hit' object.
Filter Callbacks
- abstract($handler,$hit)
- browse($handler,$hit)
- fulltext($handler,$hit)
- repeated($handler,$hit)
  Repeated is implemented by Logfile::Repeated
- search($handler,$hit)
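A handler does not have to define every callback listed above. As a minimal sketch, assuming the callbacks are invoked as ordinary method calls on the handler (as the SYNOPSIS suggests), the class below handles fulltext explicitly and lets AUTOLOAD absorb the rest. Sketch::Counter and its count method are hypothetical names used only for this example.

```perl
use strict;
use warnings;

package Sketch::Counter;

our $AUTOLOAD;

sub new { my ($class) = @_; return bless { count => {} }, $class }

# Callback handled explicitly.
sub fulltext {
    my ($self, $hit) = @_;
    $self->{count}{fulltext}++;
    return;
}

# Catch-all for callbacks this class does not define
# (abstract, browse, repeated, search).
sub AUTOLOAD {
    my ($self, $hit) = @_;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;             # strip the package qualifier
    return if $name eq 'DESTROY';  # object destruction is not a callback
    $self->{count}{other}++;
    return;
}

# Hypothetical accessor: number of times a category was counted.
sub count { $_[0]{count}{ $_[1] } || 0 }

package main;

my $handler = Sketch::Counter->new;
$handler->fulltext({});   # empty hashes stand in for hit objects
$handler->abstract({});
$handler->search({});

printf "fulltext=%d other=%d\n",
    $handler->count('fulltext'), $handler->count('other');
```

The DESTROY check matters because Perl also routes the implicit destructor call through AUTOLOAD when no DESTROY method exists; without it, every destroyed handler would be counted as one more callback.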
SEE ALSO
AUTHOR
Timothy D Brody, <tdb01r@ecs.soton.ac.uk>
TODO
Robots filter:
- Exclude users that request robots.txt (probably requires persistent storage)
- Exclude users by user-agent string
COPYRIGHT AND LICENSE
Copyright (C) 2005 by Timothy D Brody
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.