NAME

WWW::RobotRules::AnyDBM_File - Persistent RobotRules

SYNOPSIS

require WWW::RobotRules::AnyDBM_File;
require LWP::RobotUA;

# Create a robot useragent that uses a diskcaching RobotRules
my $rules = WWW::RobotRules::AnyDBM_File->new( 'my-robot/1.0', 'cachefile' );
my $ua = WWW::RobotUA->new( 'my-robot/1.0', 'me@foo.com', $rules );

# Then just use $ua as usual
$res = $ua->request($req);

DESCRIPTION

This is a subclass of WWW::RobotRules that uses the AnyDBM_File package to implement persistent diskcaching of robots.txt and host visit information.

The constructor (the new() method) takes an extra argument specifying the name of the DBM file to use. If the DBM file already exists, then you can specify undef as agent name as the name can be obtained from the DBM database.

SECURITY CONSIDERATIONS

The caller-supplied DBM filename must reside in a directory writable only by the same user that runs this code. The underlying AnyDBM_File backends open the file via the C open(2) syscall without O_NOFOLLOW, so a symlink at the cache path (or at its .dir/.pag/.db siblings) will be followed and the linked target may be overwritten with DBM page data. The cache file is created with mode 0600; callers that need different permissions can chmod after construction.

SEE ALSO

WWW::RobotRules, LWP::RobotUA

AUTHORS

Hakan Ardo <hakan@munin.ub2.lu.se>, Gisle Aas <aas@sn.no>