NAME

Scrappy::Scraper::Control - Scrappy HTTP Request Constraints System

VERSION

version 0.94112090

SYNOPSIS

#!/usr/bin/perl
use Scrappy::Scraper::Control;

my  $control = Scrappy::Scraper::Control->new;

    $control->allow('http://search.cpan.org');
    $control->allow('http://search.cpan.org', if => {
            content_type => ['text/html', 'application/x-tar']
        }
    );
    
    $control->restrict('http://www.cpan.org');
    
    if ($control->is_allowed('http://search.cpan.org/')) {
        ...
    }
    
    # constraints will only be checked if the is_allowed method is
    # passed an HTTP::Response object.

DESCRIPTION

Scrappy::Scraper::Control provides HTTP request access control for the Scrappy framework.

ATTRIBUTES

Every Scrappy::Scraper::Control instance provides the following attributes.

allowed

The allowed attribute holds a hashref of allowed domains and their constraints.

my  $control = Scrappy::Scraper::Control->new;
    $control->allowed;
    
    e.g.
    
    {
        'www.foobar.com' => {
            methods => [qw/GET POST PUSH PUT DELETE/],
            content_type => ['text/html']
        }
    }
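
To illustrate how a structure of this shape might be consulted, here is a minimal sketch in plain Perl. It does not use or reproduce Scrappy's internals: the helper names url_host and url_permits are hypothetical, and the matching rules (hosts compared exactly, a missing constraint key meaning "unconstrained") are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical data shaped like the example above; not read from a real object.
my $allowed = {
    'www.foobar.com' => {
        methods      => [qw/GET POST/],
        content_type => ['text/html'],
    },
};

# Hypothetical helper: pull the host out of an absolute URL with a simple
# regex (ignores userinfo; a real program would likely use the URI module).
sub url_host {
    my ($url) = @_;
    return $url =~ m{^[a-z][a-z0-9+.-]*://([^/:?#]+)}i ? lc $1 : undef;
}

# Hypothetical helper: true when the URL's host is listed and the given
# method and content type satisfy whatever constraints exist for that host.
sub url_permits {
    my ($rules, $url, $method, $content_type) = @_;
    my $host  = url_host($url)  or return 0;
    my $entry = $rules->{$host} or return 0;

    for my $check ([methods => $method], [content_type => $content_type]) {
        my ($key, $value) = @$check;
        my $list = $entry->{$key} or next;   # assumption: absent key = unconstrained
        return 0 unless grep { $_ eq $value } @$list;
    }
    return 1;
}

for my $type ('text/html', 'image/png') {
    my $ok = url_permits($allowed, 'http://www.foobar.com/page', 'GET', $type);
    print "$type: ", ($ok ? 'allowed' : 'denied'), "\n";
}
```

Keying the rules by host keeps the lookup to a single hash access per request; whether the module itself matches hosts exactly or more loosely is not stated here.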

restricted

The restricted attribute holds a hashref of restricted domains and their constraints.

my  $control = Scrappy::Scraper::Control->new;
    $control->restricted;
    
    e.g.
    
    {
        'www.foobar.com' => {
            methods => [qw/GET POST PUSH PUT DELETE/]
        }
    }

METHODS

allow

The allow method marks a URL as allowed, optionally qualified by constraints passed through the if option.

my  $control = Scrappy::Scraper::Control->new;
    $control->allow('http://www.perl.org');
    $control->allow('http://search.cpan.org', if => {
            content_type => ['text/html', 'application/x-tar']
        }
    );

restrict

The restrict method marks a URL as restricted, optionally qualified by constraints passed through the if option.

my  $control = Scrappy::Scraper::Control->new;
    $control->restrict('http://www.perl.org');
    $control->restrict('http://search.cpan.org', if => {
            content_type => ['text/html', 'application/x-tar']
        }
    );

is_allowed

The is_allowed method returns true if a request to the given URL is permitted under the current allow and restrict rules.

my  $control = Scrappy::Scraper::Control->new;
    $control->allow('http://search.cpan.org');
    $control->restrict('http://www.perl.org');
    
    if (! $control->is_allowed('http://perl.org')) {
        die "Can't get to perl.org";
    }

AUTHOR

Al Newkirk <awncorp@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2010 by awncorp.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.