NAME

WebService::CIA::Parser - Parse pages from the CIA World Factbook

SYNOPSIS

use WebService::CIA::Parser;
my $parser = WebService::CIA::Parser->new;
my $data = $parser->parse($string);

DESCRIPTION

WebService::CIA::Parser takes a string of HTML and parses it. The output is only meaningful if the string is the HTML of a page whose URL matches https://www.cia.gov/library/publications/the-world-factbook/print/[a-z]{2}\.html, that is, the printable Factbook page for a two-letter country code.
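Since the parser only produces sensible output for pages matching that pattern, a caller may want to check a URL before fetching it. A minimal sketch follows, assuming a hypothetical helper named is_print_page_url. It is not part of WebService::CIA::Parser, and the second URL is an invented non-matching example.

```perl
use strict;
use warnings;

# Hypothetical helper, not provided by WebService::CIA::Parser.
# The pattern is the one quoted in the DESCRIPTION, anchored at both ends
# and with the literal dots in the host name escaped.
my $PRINT_PAGE = qr{\Ahttps://www\.cia\.gov/library/publications/the-world-factbook/print/[a-z]{2}\.html\z};

sub is_print_page_url {
    my ($url) = @_;
    return $url =~ $PRINT_PAGE ? 1 : 0;
}

# A two-letter country code matches.
print is_print_page_url(
    "https://www.cia.gov/library/publications/the-world-factbook/print/uk.html"
), "\n";

# Invented non-matching example: the file name is not a two-letter code.
print is_print_page_url(
    "https://www.cia.gov/library/publications/the-world-factbook/print/united-kingdom.html"
), "\n";
```

The anchors \A and \z make the check reject URLs that merely contain a matching substring.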

This parsing is somewhat fragile, since it assumes a certain page structure. It will keep working only as long as the CIA don't change the structure of their pages.

METHODS

new

Creates a new WebService::CIA::Parser object. It takes no arguments.

parse($html)

Parses a string of HTML taken from the CIA World Factbook. It takes a single string as its argument and returns a hashref of fields and values.

The values are stripped of all HTML. <br> tags are replaced by newlines.
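The clean-up described above can be pictured roughly as follows. This is only an illustrative sketch, not the module's actual implementation: strip_html is a hypothetical helper, the input string is made up, and regex-based tag removal is a simplification that will not cope with every kind of HTML.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the described behaviour;
# WebService::CIA::Parser may do this differently internally.
sub strip_html {
    my ($value) = @_;
    $value =~ s{<br\s*/?>}{\n}gi;   # <br>, <br/> and <br /> become newlines
    $value =~ s{<[^>]+>}{}g;        # remove any remaining tags
    return $value;
}

# Made-up input, not real Factbook markup.
my $raw = '<b>first line</b><br>second <i>line</i>';
print strip_html($raw), "\n";
```

The <br> substitution has to run first, otherwise the general tag removal would delete the line breaks along with everything else.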

It also creates four extra fields: "URL" (the URL of the country's Factbook page), "URL - Print" (the printable version of that page), "URL - Flag" (a GIF of the country's flag), and "URL - Map" (a GIF map of the country).

EXAMPLE

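use strict;
use warnings;
use WebService::CIA::Parser;
use LWP::Simple qw(get);

# Fetch the printable Factbook page for the United Kingdom.
my $url  = "https://www.cia.gov/library/publications/the-world-factbook/print/uk.html";
my $html = get($url);
die "Could not fetch $url\n" unless defined $html;   # get() returns undef on failure

# Parse the page and print a single field.
my $parser = WebService::CIA::Parser->new;
my $data   = $parser->parse($html);
print $data->{"Population"}, "\n";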

AUTHOR

Ian Malpass (ian-cpan@indecorous.com)

COPYRIGHT

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The CIA World Factbook's copyright information page (https://www.cia.gov/library/publications/the-world-factbook/docs/contributor_copyright.html) states:

The Factbook is in the public domain. Accordingly, it may be copied
freely without permission of the Central Intelligence Agency (CIA).
