NAME
Sport::Analytics::NHL::Scraper - Scrape and crawl the NHL website for data
SYNOPSIS
Scrape and crawl the NHL website for data
use Sport::Analytics::NHL::Scraper;

my $schedules = crawl_schedule({
    start_season => 2016,
    stop_season  => 2017,
});
...
my $contents = crawl_game(
    { season => 2011, stage => 2, season_id => 0001 }, # game 2011020001 in NHL accounting
    { game_files => [qw(BS PL)], retries => 2 },
);
IMPORTANT VARIABLE
The @GAME_FILES variable contains the definitions specific to each report type. Currently only the boxscore JavaScript has any meaningful non-default definitions; the PB feed seems to have become unavailable.
FUNCTIONS
scrape
-
A wrapper around the LWP::Simple::get() call for retrying and control.
Arguments: hash reference containing
* url => URL to access
* retries => number of retries
* validate => sub reference to validate the download
Returns: the content if both download and validation are successful, undef otherwise.
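The retry-and-validate behaviour described above can be sketched roughly as follows. This is not the module's code: fetch_with_retries and its fetch argument are hypothetical, introduced so the sketch runs without network access (the documented wrapper calls LWP::Simple::get() instead), and treating retries as "extra attempts after the first" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's implementation.
# 'fetch' is an assumed extra argument standing in for LWP::Simple::get().
sub fetch_with_retries {
    my ($opts) = @_;
    my $fetch    = $opts->{fetch};
    my $validate = $opts->{validate} || sub { 1 };   # no validator: accept anything
    my $attempts = 1 + ($opts->{retries} || 0);      # first try plus retries (assumption)

    for my $attempt (1 .. $attempts) {
        my $content = $fetch->($opts->{url});
        # Accept the content only if it was downloaded and passes validation.
        return $content if defined $content && $validate->($content);
    }
    return;    # every attempt failed: undef in scalar context
}
```

Passing the download step in as a code reference keeps the retry logic testable on its own; a caller wanting the documented behaviour could pass sub { LWP::Simple::get($_[0]) } as fetch.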
crawl_schedule
-
Crawls the NHL schedule. The schedule is accessed through a minimalistic live api first (only works for post-2010 seasons), then through the general /api/
Arguments: hash reference containing
* start_season => the first season to crawl
* stop_season => the last season to crawl
Returns: hash reference of seasonal schedules where seasons are the keys, and decoded JSONs are the values.
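Because the result is keyed by season, a caller will typically want to walk it in season order. The sketch below uses a hand-made stand-in for the return value; the inner layout of each decoded JSON is invented for illustration, and seasons_in_order is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the return value of crawl_schedule():
# seasons are the keys, decoded JSON structures are the values.
# The inner 'games' layout is invented purely for illustration.
my $schedules = {
    2016 => { games => [ 1 .. 3 ] },
    2017 => { games => [ 1 .. 2 ] },
};

# Hash keys are unordered, so sort the seasons numerically before walking them.
sub seasons_in_order {
    my ($schedules) = @_;
    return sort { $a <=> $b } keys %{$schedules};
}

for my $season (seasons_in_order($schedules)) {
    printf "season %d: schedule decoded as %s\n",
        $season, ref $schedules->{$season};
}
```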
get_game_url_args
-
Sets the arguments to populate the game URL for a given report type and game.
Arguments:
document name, currently one of qw(BS PB RO ES GS PL)
game hashref containing
* season => YYYY
* stage => 2|3
* season_id => NNNN
Returns: a configured list of arguments for the URL.
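The three game fields combine into the game number used in the SYNOPSIS comment, where season 2011, stage 2 and season_id 0001 correspond to game 2011020001. A minimal sketch of that composition follows; nhl_game_id is a hypothetical helper, and the field widths are inferred from that one example rather than taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper: compose the 10-digit game number from the game hashref.
# Widths (4 + 2 + 4 digits) are inferred from the SYNOPSIS example 2011020001.
sub nhl_game_id {
    my ($game) = @_;
    return sprintf '%04d%02d%04d',
        $game->{season}, $game->{stage}, $game->{season_id};
}

my $id = nhl_game_id({ season => 2011, stage => 2, season_id => 1 });
print "$id\n";    # prints 2011020001
```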
crawl_game
-
Crawls the data for the given game.
Arguments:
game data as hashref:
* season => YYYY
* stage => 2|3
* season_id => NNNN
options hashref:
* game_files => types of reports that are requested (the SYNOPSIS passes an array reference such as [qw(BS PL)])
* force => 0|1 force overwrite of files already present in the system
* retries => N number of retries for every get call
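The force option can be read as a simple gate in front of each download: fetch when forced, or when the file is not there yet. The sketch below illustrates that reading only; needs_download and its path argument are hypothetical, and how the module actually locates stored files is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the 'force' option:
# download when forced, or when no file exists yet at the given path.
sub needs_download {
    my ($path, $opts) = @_;
    return 1 if $opts->{force};     # force => 1 overwrites existing files
    return -e $path ? 0 : 1;        # otherwise skip files already present
}
```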
AUTHOR
More Hockey Stats, <contact at morehockeystats.com>
BUGS
Please report any bugs or feature requests to contact at morehockeystats.com, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=Sport::Analytics::NHL::Scraper. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Sport::Analytics::NHL::Scraper
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
https://rt.cpan.org/NoAuth/Bugs.html?Dist=Sport::Analytics::NHL::Scraper
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
https://cpanratings.perl.org/d/Sport::Analytics::NHL::Scraper
Search CPAN