NAME

Scrappy::Examples - How Do I Command The All Powerful Web Scraper Scrappy?

VERSION

version 0.592

WHAT IS SCRAPPY

Scrappy is an easy (and hopefully fun) way of scraping, spidering, and/or harvesting information from web pages. Internally, Scrappy uses the awesome Web::Scraper and WWW::Mechanize modules, so it inherits their awesomeness. Its API is inspired by the fun and easy-to-use Dancer API. Beyond being a pretty API for WWW::Mechanize::Plugin::Web::Scraper, Scrappy also has its own feature set, which makes web scraping easier and more enjoyable.

Scrappy (pronounced Scrap+Pee) == 'Scraper Happy' or 'Happy Scraper'. If you like, you may call it Scrapy (pronounced Scrape+Pee), although Python has a web scraping framework by that name, and this module is not a port of it.

BASIC USAGE

#!/usr/bin/perl
use Scrappy qw/:syntax/;
    
# get page from URL
get 'http://search.cpan.org/recent';

if (loaded) {
    var modules => grab '#cpansearch li a', { name => 'TEXT', link => '@href' };
}

# the list function dereferences, list == @{...}
print $_->{name}, "\n" for list var->{modules};
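
The final line above relies on dereferencing an array reference. For readers less familiar with Perl references, the following sketch shows the same loop in plain Perl, with no Scrappy involved. The sample data and the names_of helper are made up for illustration, and the exact structure that grab returns is an assumption based on the example above.

```perl
use strict;
use warnings;

# Hypothetical data, shaped like what the example above stores in
# var->{modules}: an array reference holding one hash reference per link.
my $modules = [
    { name => 'Foo-Bar-0.01', link => '/dist/Foo-Bar' },
    { name => 'Baz-1.2',      link => '/dist/Baz' },
];

# names_of is an illustrative helper, not part of Scrappy.
# @{ $rows } turns the array reference into a plain list.
sub names_of {
    my ($rows) = @_;
    return map { $_->{name} } @{ $rows };
}

print "$_\n" for names_of($modules);
```

Writing @{ $modules } here plays the role that list plays in the Scrappy version.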

ADVANCED USAGE

Scrape From A Website

get $website_url;
var foo => grab 'a.more_info', 'ALL';

Scrape From A File

use URI;
get URI->new($filename);
var foo => grab 'div', 'ALL';
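
If $filename is a bare local path rather than a complete URI, URI->new returns a URI object with no scheme. One way to build a file:// URI from such a path is the URI::file module from the same URI distribution. This is a minimal sketch: file_uri_for is a hypothetical helper, 'page.html' is a placeholder, and whether get needs a full file:// URI is an assumption rather than something this document states.

```perl
use strict;
use warnings;
use URI::file;

# file_uri_for is an illustrative helper, not part of Scrappy.
# new_abs resolves the path against the current directory and
# returns a URI object that uses the file scheme.
sub file_uri_for {
    my ($path) = @_;
    return URI::file->new_abs($path);
}

my $uri = file_uri_for('page.html');   # placeholder file name
print $uri->scheme, "\n";              # prints "file"
print "$uri\n";                        # the absolute file:// URI
```

The resulting object could then be passed to get in place of URI->new($filename).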

Scrape An Entire Website

crawl $starting_url, {
    'a' => sub { queue shift->href },
    '/*' => sub {
        # /* matches the root node, you can also use body, div.container, etc
        # do something
    }
};
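
The crawl example queues every link it finds. The general idea behind such a queue can be sketched in plain Perl as a breadth-first walk that never queues the same page twice. This is not Scrappy's implementation. The in-memory %links_on "site" and the crawl_order helper are hypothetical, and whether Scrappy's queue skips duplicates by itself is not stated in this document.

```perl
use strict;
use warnings;

# Hypothetical in-memory "site": each page maps to the links found on it.
my %links_on = (
    '/'  => [ '/a', '/b' ],
    '/a' => [ '/b', '/c' ],
    '/b' => [ '/' ],
    '/c' => [],
);

# crawl_order is an illustrative helper, not Scrappy's crawl.
# It visits pages breadth-first and queues each page at most once.
sub crawl_order {
    my ($start, $graph) = @_;
    my @queue = ($start);
    my %seen  = ($start => 1);
    my @visited;

    while (defined(my $page = shift @queue)) {
        push @visited, $page;                 # stand-in for "do something"
        for my $link (@{ $graph->{$page} || [] }) {
            next if $seen{$link}++;           # already queued, skip it
            push @queue, $link;               # analogous to: queue $link
        }
    }
    return @visited;
}

print join(' ', crawl_order('/', \%links_on)), "\n";   # prints "/ /a /b /c"
```

Tracking seen pages matters on real sites, where pages commonly link back to each other.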

DISCLAIMER

This documentation is incomplete, obviously. For help and support, find alnewkirk or alnewkirk|com on IRC, or find this project on GitHub. If all else fails, write your local congressman.

AUTHOR

Al Newkirk <awncorp@cpan.org>

COPYRIGHT AND LICENSE

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

