NAME

Scrappy - All Powerful Web Spidering, Scraping, Crawling Framework

VERSION

version 0.9111110

SYNOPSIS

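#!/usr/bin/perl
use Scrappy;

my $scraper = Scrappy->new;

# crawl the starting URL; for pages matching '/recent', run the callback
# on every element matched by the CSS selector
$scraper->crawl('search.cpan.org',
    '/recent' => {
        '#cpansearch li a' => sub {
            # $_[1] is the matched element; print its link target
            print $_[1]->{href}, "\n";
        }
    }
);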

DESCRIPTION

Scrappy is an easy (and hopefully fun) way of scraping, spidering, and/or harvesting information from web pages, web services, and more. Scrappy is a feature-rich, flexible, intelligent web automation tool.

Scrappy (pronounced Scrap+Pee) stands for 'Scraper Happy' or 'Happy Scraper'. If you like, you may call it Scrapy (pronounced Scrape+Pee), although Python has a web scraping framework by that name, and this module is not a port of it.

METHODS

crawl

The crawl method is useful for crawling an entire website, or at least part of one. It automates creating a queue, fetching and parsing HTML pages, and establishing simple flow control. See the SYNOPSIS for a simplified example; the following is a more complex one.

my $scrappy = Scrappy->new;

$scrappy->crawl('http://search.cpan.org/recent',
    '/recent' => {
        
        '#cpansearch li a' => sub {
            my ($self, $item) = @_;
            # follow all recent modules from search.cpan.org
            $self->queue->add($item->{href});
        }
        
    },
    '/~:author/:name-:version/' => {
        
        'body' => sub {
            my ($self, $item, $args) = @_;
            
            my $reviews = $self
            ->select('.box table tr')->focus(3)->select('td.cell small a')
            ->data->[0]->{text};
            
            $reviews = $reviews =~ /\d+ Reviews/ ?
                $reviews : '0 reviews';
            
            print "found $args->{name} version $args->{version} ".
                "[$reviews] by $args->{author}\n";
            
        }
        
    }
);
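
The second route above, '/~:author/:name-:version/', hands the callback an $args hash keyed by the placeholder names. The following is a minimal sketch of how such a pattern could be turned into named arguments, written in plain Perl. The helpers compile_route and match_route are hypothetical names invented for this illustration; they are not part of Scrappy, and Scrappy's own matching may work differently (for example in how it splits a hyphenated name from its version).

```perl
use strict;
use warnings;

# Hypothetical helper (not Scrappy's API): turn a pattern such as
# '/~:author/:name-:version/' into a regex with one named capture
# per ':placeholder'.
sub compile_route {
    my ($pattern) = @_;

    my $regex = '';
    # splitting on a capturing group keeps the placeholders in the list
    for my $part (split /(:[A-Za-z_]\w*)/, $pattern) {
        if ($part =~ /^:(\w+)$/) {
            # assumption: a placeholder matches anything up to the next '/'
            $regex .= sprintf '(?<%s>[^/]+)', $1;
        }
        else {
            # literal text between placeholders is matched verbatim
            $regex .= quotemeta $part;
        }
    }

    return qr{\A$regex\z};
}

# Hypothetical helper: return a hashref of placeholder values,
# or undef when the path does not match the pattern.
sub match_route {
    my ($pattern, $path) = @_;
    my $re = compile_route($pattern);
    return $path =~ $re ? +{ %+ } : undef;
}

my $args = match_route(
    '/~:author/:name-:version/',
    '/~awncorp/Scrappy-0.9111110/',
);

print "found $args->{name} version $args->{version} by $args->{author}\n"
    if $args;
```

Because the captures in this sketch are greedy, a path segment such as Foo-Bar-1.0 yields the name Foo-Bar and the version 1.0. That suits typical distribution names, but it is a design choice of the sketch rather than documented Scrappy behavior.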

AUTHOR

Al Newkirk <awncorp@cpan.org>

COPYRIGHT AND LICENSE

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
