NAME
WWW::Gazetteer::HeavensAbove - Find location of world towns and cities
SYNOPSIS
use WWW::Gazetteer::HeavensAbove;
my $atlas = WWW::Gazetteer::HeavensAbove->new;
# simple query using ISO 3166 codes
my @towns = $atlas->find( 'Bacton', 'GB' );
print $_->{name}, ", ", $_->{elevation}, $/ for @towns;
# simple query using heavens-above.com codes
@towns = $atlas->query( 'Bacton', 'UK' );
print $_->{name}, ", ", $_->{elevation}, $/ for @towns;
# big queries can use a callback (and return nothing)
$atlas->find(
'Bacton', 'GB',
sub { print $_->{name}, ", ", $_->{elevation}, $/ for @_ }
);
# find() returns an arrayref in scalar context
my $cities = $atlas->find( 'Paris', 'FR' );
print $cities->[1]{name};
# the heavens-above.com site supports complicated queries
my @az = $atlas->find( 'a*z', 'FR' );
# and you can naturally use callbacks for those!
my ($c, $n);
$atlas->find( 'N*', 'US', sub { $c++; $n += @_ } );
print "$c web requests needed for finding $n cities";
# or use your own UserAgent
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
$atlas = WWW::Gazetteer::HeavensAbove->new( ua => $ua );
# another way to create a new object
use WWW::Gazetteer;
my $g = WWW::Gazetteer->new('HeavensAbove');
DESCRIPTION
A gazetteer is a geographical dictionary (as at the back of an atlas). The WWW::Gazetteer::HeavensAbove module uses the information at http://www.heavens-above.com/countries.asp to return the geographical location (longitude, latitude, elevation) of towns and cities around the world.
Once a WWW::Gazetteer::HeavensAbove object is created, use the find() method to return lists of hashrefs holding all the information for the matching cities.
A city structure looks like this:
$lesparis = {
iso => 'FR',
latitude => '45.633',
regionname => 'Region',
region => 'Rhône-Alpes',
alias => 'Les Paris',
elevation => '508', # meters
longitude => '5.733',
name => 'Paris',
};
Note: the 'regionname' attribute is the local term for a region (for example 'Region' in France or 'State' in the U.S.A.); it changes from country to country.
Due to the way heavens-above.com's database was created, cities from the U.S.A. are handled as a special case. The region field is the state, and a special field named county holds the county name.
Here is an example of an American city:
$newyork = {
iso => 'US',
latitude => '39.685',
regionname => 'State',
region => 'Missouri',
county => 'Caldwell', # this is only for US cities
alias => '',
elevation => '244',
longitude => '-93.927',
name => 'New York'
};
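As a minimal sketch of how these structures can be consumed, the following builds a one-line label from a city hashref and adds the county only when it is present. The helper name city_label() is hypothetical and not part of the module; the sample data is copied from the two structures above.

```perl
use strict;
use warnings;
use utf8;                               # the source contains 'Rhône-Alpes'
binmode STDOUT, ':encoding(UTF-8)';

# city_label() is a hypothetical helper, not part of the module.
# It builds a one-line description from a city structure.
sub city_label {
    my ($city) = @_;

    # 'county' only exists for US cities, so skip it when it is missing
    my @where = grep { defined($_) && length($_) }
        ( $city->{county}, $city->{region}, $city->{iso} );

    return sprintf '%s (%s) at %s, %s, %s m',
        $city->{name}, join( ', ', @where ),
        @$city{qw( latitude longitude elevation )};
}

my $lesparis = {
    iso        => 'FR',
    latitude   => '45.633',
    regionname => 'Region',
    region     => 'Rhône-Alpes',
    alias      => 'Les Paris',
    elevation  => '508',
    longitude  => '5.733',
    name       => 'Paris',
};

my $newyork = {
    iso        => 'US',
    latitude   => '39.685',
    regionname => 'State',
    region     => 'Missouri',
    county     => 'Caldwell',
    alias      => '',
    elevation  => '244',
    longitude  => '-93.927',
    name       => 'New York',
};

print city_label($_), "\n" for $lesparis, $newyork;
```

Testing for the county field, rather than for iso being 'US', keeps the helper independent of how the special case is decided.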
Methods
- new()
-
Return a new WWW::Gazetteer::HeavensAbove object, ready to find() cities for you.
The constructor can be given a list of parameters. Currently supported parameters are:
ua
- the LWP::UserAgent used for the web requests
retry
- the number of times a failed connection will be retried
You can also use the generic WWW::Gazetteer module to create a new WWW::Gazetteer::HeavensAbove object:
use WWW::Gazetteer; my $g = WWW::Gazetteer->new('HeavensAbove');
You can also pass it initialisation parameters:
use WWW::Gazetteer; my $g = WWW::Gazetteer->new('HeavensAbove', retry => 3);
- find( $city, $country [, $callback ] )
-
Return a list of cities matching $city, within the country with ISO 3166 code $country (not all codes are supported by heavens-above.com).
Without a callback, this method returns a list of city structures (or a reference to that list in scalar context). If the request returns a lot of cities, you can pass a callback routine to find(). This routine receives the list of city structures as @_. If a callback is given to find(), find() returns an empty list.
A single call to find() can lead to several web requests. If the query returns more than 200 answers, heavens-above.com cuts the list at 200. WWW::Gazetteer::HeavensAbove picks as much data as possible from this first answer and then refines the query again and again.
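To give an idea of what "refining the query" can mean, here is a rough offline sketch. It is not the algorithm used by the module (which also exploits the content of the truncated answer); it uses a simpler technique, extending the prefix one letter at a time whenever an answer may have been cut. Everything in it is an assumption made for illustration: the fake_search() and refine() subs, the tiny limit of 3 instead of 200, and the made-up lowercase names.

```perl
use strict;
use warnings;

my $LIMIT = 3;    # stand-in for the site's 200-answer cut-off
my @DB    = sort qw( na naab nab nac nad nb nba nbb oa );    # made-up names

# Fake search engine: understands only 'prefix*' and exact names,
# and silently truncates its answer at $LIMIT entries.
sub fake_search {
    my ($pattern) = @_;
    my @hits;
    if ( $pattern =~ /^(.*)\*$/ ) {
        my $prefix = $1;
        @hits = grep { index( $_, $prefix ) == 0 } @DB;
    }
    else {
        @hits = grep { $_ eq $pattern } @DB;
    }
    splice @hits, $LIMIT if @hits > $LIMIT;
    return @hits;
}

my $requests = 0;

# Hypothetical refinement: when an answer may have been cut, ask for
# the exact name, then for every prefix that is one letter longer.
sub refine {
    my ( $prefix, $cb ) = @_;
    $requests++;
    my @hits = fake_search("$prefix*");
    if ( @hits < $LIMIT ) {    # complete answer: hand it over
        $cb->(@hits);
        return;
    }
    $requests++;
    $cb->( fake_search($prefix) );    # may well be an empty batch
    refine( "$prefix$_", $cb ) for 'a' .. 'z';
    return;
}

my @found;
refine( 'n', sub { push @found, @_ } );
print scalar(@found), " names found in $requests requests\n";
```

Note how many of the subqueries return nothing: a refinement of this kind hands many empty batches to the callback.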
Here's an excerpt from heavens-above.com documentation:
You can use "wildcard" characters to match several towns if you're not sure of the exact name. These characters are '*' which means "match any sequence of characters", and '?' which means "match any single character". The search is not case-sensitive.
Diacritic characters, such as ü and Ä can either be entered directly from the keyboard (assuming you have the appropriate keyboard), or simply enter the letter without diacritic (e.g. you can enter 'a' for 'ä', 'à', 'á', 'â', 'ã' and 'å'). If you need a special character which is not on your keyboard, and is not a diacritic (e.g. the German 'ß', and Scandinavian 'æ'), simply enter a "?" instead, and all characters will be matched.
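The wildcard rules quoted above map naturally onto a Perl regular expression, which can be handy for filtering results locally. The sketch below is only an approximation of the site's behaviour: wildcard_to_regex() is a hypothetical helper, the diacritic folding described above is not reproduced, and the sample strings are arbitrary.

```perl
use strict;
use warnings;

# wildcard_to_regex() is a hypothetical helper for illustration only.
# '*' matches any sequence of characters, '?' any single character,
# everything else is taken literally; matching is case-insensitive.
sub wildcard_to_regex {
    my ($pattern) = @_;
    my $re = join '', map {
          $_ eq '*' ? '.*'
        : $_ eq '?' ? '.'
        :             quotemeta($_)
    } split //, $pattern;
    return qr/\A$re\z/i;
}

my $re = wildcard_to_regex('a*z');

# arbitrary sample strings, not actual query results
print "$_\n" for grep { $_ =~ $re } qw( Agnez Abzac Metz );
```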
Note: heavens-above.com doesn't use ISO 3166 codes, but its own country codes. If you want to use those directly, please see the query() method. (And read the source for the full list of HA codes.)
- fetch( $searchstring, $code [, $callback ] );
-
fetch() is a synonym for find(). It is kept for backward compatibility.
- query( $searchstring, $code [, $callback ] );
-
This method is the actual method called by find().
The only difference is that $code is the heavens-above.com specific country code, instead of the ISO 3166 code.
Callbacks
The find() and query() methods both accept an optional coderef as their third argument. This coderef is used as a callback each time a batch of cities is returned by a web query to heavens-above.com.
This can be very useful if a query with a wildcard returns more than 200 answers. WWW::Gazetteer::HeavensAbove breaks it into new requests that return a smaller number of answers. The callback is called with the results of the subquery after each web request.
The callback is called in void context, and is passed a list of hashrefs (the cities found by the last query).
An example callback is (from eg/city.pl):
# print a tab separated list of cities
my $cb = sub {
local $, = "\t";
local $\ = $/;
print @$_{qw(name alias region latitude longitude elevation)} for @_;
};
Please note that, due to the nature of the queries, your callback can (and will most probably) be called with an empty @_.
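Since empty batches are to be expected, a callback should tolerate them. Here is a small sketch of a collecting callback that counts them; make_collector() is a hypothetical helper, and the batches are simulated by hand (with placeholder city structures) so that the example runs without any network access.

```perl
use strict;
use warnings;

# make_collector() is a hypothetical helper: it returns a callback
# suitable as the third argument of find() or query(), plus a
# reference to the statistics the callback maintains.
sub make_collector {
    my %stats = ( calls => 0, empty => 0, cities => [] );
    my $cb = sub {
        $stats{calls}++;
        if ( !@_ ) { $stats{empty}++; return }    # nothing in this batch
        push @{ $stats{cities} }, @_;
        return;
    };
    return ( $cb, \%stats );
}

my ( $cb, $stats ) = make_collector();

# With the module installed, the callback would be used like this:
#   $atlas->find( 'N*', 'US', $cb );
# Here three batches are simulated instead (placeholder data).
$cb->( { name => 'New York' } );
$cb->();
$cb->( { name => 'Paris' }, { name => 'Bacton' } );

printf "%d calls (%d empty), %d cities\n",
    $stats->{calls}, $stats->{empty}, scalar @{ $stats->{cities} };
```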
TODO
Handle the case where a query with more than one wildcard (* or ?) returns more than 200 answers. For now, it stops at 200.
BUGS
There is at least one query that cannot be fulfilled: there are more than 200 cities named Buenavista in Mexico. The web site limitation of 200 cities per query prevents us from getting the other Buenavistas in Mexico. WWW::Gazetteer::HeavensAbove version 0.11 includes a workaround that continues with the global query, and fetches only the first 200 Buenavistas. (This will work with other similarly broken answers.)
Network errors croak after the maximum retry count has been reached. This can be a problem when making big queries (that return more than 200 answers) whose results are passed to a callback, because part of the data has already been processed by the callback when the script dies. And even if you can catch the exception, you cannot easily guess where to start again.
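One way to limit the damage is to collect the batches in the callback and wrap the call in an eval block, so that whatever was fetched before the failure is kept. This is only a sketch: it does not solve the problem of knowing where to resume, and flaky_find() is a hypothetical stand-in for $atlas->find() that fails on purpose after two batches of placeholder data.

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical stand-in for $atlas->find( $pattern, $iso, $cb ):
# it delivers two batches, then dies the way a network error would.
sub flaky_find {
    my ( $pattern, $iso, $cb ) = @_;
    $cb->( { name => 'Town A' } );
    $cb->( { name => 'Town B' } );
    croak "simulated network error";
}

my @partial;

# eval returns undef if the block dies, 1 otherwise
my $ok = eval {
    flaky_find( 'N*', 'US', sub { push @partial, @_ } );
    1;
};
my $error = $ok ? '' : ( $@ || 'unknown error' );

if ( !$ok ) {
    warn sprintf "query aborted after %d cities: %s",
        scalar @partial, $error;
}
```

The cities already pushed to @partial remain usable after the exception has been caught.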
Errors in the data do not originate with heavens-above.com, which "put together and enhanced" data from the following two sources: the US Geological Survey (http://geonames.usgs.gov/index.html) for the USA and dependencies, and the National Imagery and Mapping Agency (http://www.nima.mil/gns/html/index.html) for all other countries.
See also: http://www.heavens-above.com/ShowFAQ.asp?FAQID=100
AUTHOR
Philippe "BooK" Bruhat <book@cpan.org>.
This module was a script, before I found out about Leon Brocard's WWW::Gazetteer module. Thanks! And, erm, bits of the documentation were stolen from WWW::Gazetteer.
Thanks to Alain Zalmanski (of http://www.fatrazie.com/ fame) for asking me for all that geographical data in the first place.
SEE ALSO
"How I captured thousands of Afghan cities in a few hours", one of my lightning talks at YAPC::Europe 2002 (Munich). You had to be there.
WWW::Gazetteer and WWW::Gazetteer::Calle, by Leon Brocard.
The use Perl discussion that had me write this module from the original script: http://use.perl.org/~acme/journal/8079
COPYRIGHT
This module is free software; you can redistribute it or modify it under the same terms as Perl itself.