NAME
Gungho - Yet Another High Performance Web Crawler Framework
SYNOPSIS
use Gungho;
my $g = Gungho->new($config);
$g->run;
DESCRIPTION
Gungho is Yet Another Web Crawler Framework, aimed to be extensible and fast. It is meant to be a culmination of lessons learned while building Xango, which was *fast* but horribly hard to debug. Gungho tries to build from clean structures, based upon principles from the likes of Catalyst and Plagger.
WARNING: *ALL* APIs are still subject to change.
STRUCTURE
Gungho consists of three parts: a Provider, which provides Gungho with requests to process; a Handler, which handles the fetched page; and an Engine, which controls the entire process.
METHODS
new($config)
Creates a new Gungho instance. It requires either the name of a config file or a hashref.
run
Starts the Gungho process.
setup
Sets up the Gungho environment, including calling the various setup_* methods to configure the provider, engine, handler, etc.
setup_engine
setup_handler
setup_log
setup_provider
Sets up the various components.
has_requests
Delegates to the provider's has_requests method.
get_requests
Delegates to the provider's get_requests method.
handle_response
Delegates to the handler's handle_response method.
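The delegation described for has_requests, get_requests and handle_response can be pictured with a small self-contained sketch. This is not Gungho's code: the packages My::ToyProvider, My::ToyHandler and My::ToyCrawler are hypothetical, plain strings stand in for real request and response objects, and the loop at the end is only a stand-in for an engine (nothing is fetched).

```perl
use strict;
use warnings;

# Toy provider (hypothetical): hands out queued requests, here plain URL strings.
package My::ToyProvider;
sub new          { my ($class, @urls) = @_; return bless { queue => [@urls] }, $class }
sub has_requests { return scalar @{ $_[0]{queue} } }
sub get_requests { my $self = shift; return splice @{ $self->{queue} } }

# Toy handler (hypothetical): records every response it is given.
package My::ToyHandler;
sub new { return bless { seen => [] }, shift }
sub handle_response {
    my ($self, $response) = @_;
    push @{ $self->{seen} }, $response;
}

# Toy front object (hypothetical): forwards each call to the component that owns it.
package My::ToyCrawler;
sub new {
    my ($class, %args) = @_;
    return bless { provider => $args{provider}, handler => $args{handler} }, $class;
}
sub has_requests    { my $self = shift; return $self->{provider}->has_requests(@_) }
sub get_requests    { my $self = shift; return $self->{provider}->get_requests(@_) }
sub handle_response { my $self = shift; return $self->{handler}->handle_response(@_) }

package main;

my $handler = My::ToyHandler->new;
my $crawler = My::ToyCrawler->new(
    provider => My::ToyProvider->new('http://example.com/a', 'http://example.com/b'),
    handler  => $handler,
);

# Stand-in for an engine loop: each "response" is a fake string, nothing is fetched.
while ($crawler->has_requests) {
    for my $request ($crawler->get_requests) {
        $crawler->handle_response("fetched $request");
    }
}

print "$_\n" for @{ $handler->{seen} };
```

Keeping the front object this thin means a provider or handler can be swapped without touching the code that drives the loop, which appears to be the point of the three-part split.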
load_config($config)
Loads the config from $config via Config::Any.
load_gungho_module($name, $prefix)
Loads a Gungho component. Prefixes the module name with 'Gungho::$prefix::', unless the name begins with a '+'. In that case, no transformation is performed, and the module name is used as-is.
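The naming rule can be illustrated in a few lines of plain Perl. This is a sketch of the documented rule only, not Gungho's implementation: the function name resolve_module_name is hypothetical, it does not load anything, and it assumes the leading '+' is a marker that is dropped before the rest of the name is used.

```perl
use strict;
use warnings;

# Hypothetical helper: maps a component name to a full package name,
# following the rule described for load_gungho_module.
sub resolve_module_name {
    my ($name, $prefix) = @_;

    # Assumption: a leading '+' marks a full package name; the marker
    # itself is stripped and the rest is returned unchanged.
    return $name if $name =~ s/^\+//;

    # Otherwise, prepend the Gungho namespace and the component prefix.
    return "Gungho::${prefix}::${name}";
}

print resolve_module_name('Simple', 'Provider'), "\n";        # prints Gungho::Provider::Simple
print resolve_module_name('+My::Provider', 'Provider'), "\n"; # prints My::Provider
```

The '+' escape lets a component live outside the Gungho:: namespace while short names stay short for bundled components.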
CODE
You can obtain the current code base from
http://gungho-crawler.googlecode.com/svn/trunk
AUTHOR
Copyright (c) 2007 Daisuke Maki <daisuke@endeworks.jp>
All rights reserved.
LICENSE
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
See http://www.perl.com/perl/misc/Artistic.html