NAME

Scrappy::Manual - How Do I Command The All Powerful Web Scraper Scrappy?

VERSION

version 0.6

DISCLAIMER

This documentation is incomplete, obviously. For help and support find alnewkirk or alnewkirk|com on IRC, or find this project on GitHub. If all else fails, write your local congressman.

WHAT IS SCRAPPY

Scrappy is an easy (and hopefully fun) way of scraping, spidering, and/or harvesting information from web pages. Internally, Scrappy uses the awesome Web::Scraper and WWW::Mechanize modules, so it inherits their awesomeness. Scrappy is inspired by the fun and easy-to-use Dancer API. Beyond being a pretty API for WWW::Mechanize::Plugin::Web::Scraper, Scrappy also has its own feature set, which makes web scraping easier and more enjoyable.

Scrappy (pronounced Scrap+Pee) == 'Scraper Happy' or 'Happy Scraper'. If you like, you may call it Scrapy (pronounced Scrape+Pee), although Python has a web scraping framework by that name, and this module is not a port of it.

BASIC USAGE

#!/usr/bin/perl
use Scrappy qw/:syntax/;
    
# get page from URL
get 'http://search.cpan.org/recent';

if (loaded) {
    var modules => grab '#cpansearch li a', { name => 'TEXT', link => '@href' };
}

# the list function dereferences, list == @{...}
print $_->{name}, "\n" for list var->{modules};
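
To make the dereferencing comment concrete, here is a plain-Perl sketch that needs neither Scrappy nor network access. It assumes (this is not confirmed anywhere above) that grab with a { name => 'TEXT', link => '@href' } mapping yields an array reference of hashes with name and link keys. The sample data and the names_of helper are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for what var->{modules} might hold after the grab
# above: an array reference of hashes, one per matched link (an assumption).
my $modules = [
    { name => 'Foo-Bar-0.01', link => '/~someone/Foo-Bar-0.01/' },
    { name => 'Baz-1.2',      link => '/~someone/Baz-1.2/' },
];

# Hypothetical helper: @{ ... } turns the array reference into a plain list,
# which is what the comment above says the list function does.
sub names_of {
    my ($aref) = @_;
    return map { $_->{name} } @{ $aref };
}

print "$_\n" for names_of($modules);
```

If the assumption about the data shape holds, `list var->{modules}` in the example above plays the same role as `@{ $modules }` here.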

ADVANCED USAGE

Scrape From A Website

get $website_url;
var foo => grab 'a.more_info', 'ALL';

Scrape From A File

use URI;
get URI->new($filename);
var foo => grab 'div', 'ALL';

Scrape An Entire Website

crawl $starting_url, {
    'a' => sub { queue shift->href },
    '/*' => sub {
        # /* matches the root node, you can also use body, div.container, etc
        # do something
    }
};
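
The crawl example above relies on a work queue: each handler may queue more URLs, and the crawl continues until nothing is left. The plain-Perl sketch below illustrates that general idea only; it is not Scrappy's implementation. The %links_on site map and the crawl_order function are hypothetical, and the duplicate check is an assumption added so the sketch terminates on cyclic links (whether Scrappy itself deduplicates queued URLs is not stated here).

```perl
use strict;
use warnings;

# Hypothetical in-memory "site": each page maps to the links found on it.
my %links_on = (
    '/'       => [ '/a', '/b' ],
    '/a'      => [ '/b', '/a/deep' ],
    '/b'      => [ '/' ],
    '/a/deep' => [],
);

# Visit every page reachable from $start exactly once, in queue order.
sub crawl_order {
    my ($start, $site) = @_;
    my @queue = ($start);
    my %seen  = ($start => 1);
    my @visited;

    while (defined(my $page = shift @queue)) {
        push @visited, $page;              # "do something" with the page
        for my $href (@{ $site->{$page} || [] }) {
            next if $seen{$href}++;        # skip pages that were already queued
            push @queue, $href;            # loosely analogous to: queue shift->href
        }
    }
    return @visited;
}

print join(' ', crawl_order('/', \%links_on)), "\n";
```

The sketch visits pages in first-in, first-out order, so pages closer to the starting URL are handled before deeper ones.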

AUTHOR

Al Newkirk <awncorp@cpan.org>

COPYRIGHT AND LICENSE

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

