NAME
WWW::Search::Google - class for searching Google
SYNOPSIS
  use WWW::Search;
  my $Search = new WWW::Search('Google'); # cAsE matters
  my $Query = WWW::Search::escape_query("Where is Jimbo");
  $Search->native_query($Query);
  while (my $Result = $Search->next_result()) {
    print $Result->url, "\n";
  }
DESCRIPTION
This class is a Google specialization of WWW::Search. It handles making and interpreting Google searches at http://www.google.com.
Google returns 100 hits per page. A custom Linux-only search mode is also available.
This class exports no public interface; all interaction should be done through WWW::Search objects.
LINUX SEARCH
For Linux lovers like me, you can put Google in a Linux-only search mode by changing the search URL from:
'search_url' => 'http://www.google.com/search',
to:
'search_url' => 'http://www.google.com/linux',
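As a rough illustration of what this one-line change affects, the sketch below builds a request URL from an options hash holding 'search_url'. The helper build_search_url and the parameter names 'q' and 'num' are assumptions made up for this example; they are not taken from the module's source, which may assemble its URL differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Search::Google): joins a base
# search URL and a hash of CGI parameters into one request URL.
sub build_search_url {
    my ($options, $params) = @_;
    my @pairs;
    for my $key (sort keys %$params) {
        my $value = $params->{$key};
        # Percent-encode everything except unreserved characters.
        $value =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
        push @pairs, "$key=$value";
    }
    return $options->{'search_url'} . '?' . join('&', @pairs);
}

my %default = ('search_url' => 'http://www.google.com/search');
my %linux   = ('search_url' => 'http://www.google.com/linux');

# 'q' and 'num' are assumed parameter names, used only for illustration.
my %params = (q => 'kernel modules', num => 100);

print build_search_url(\%default, \%params), "\n";
print build_search_url(\%linux,   \%params), "\n";
```

Only the base URL differs between the two printed lines; the query parameters stay the same.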
SEE ALSO
To make new back-ends, see WWW::Search.
HOW DOES IT WORK?
native_setup_search is called (from WWW::Search::setup_search) before we do anything. It initializes our private variables (which all begin with underscore) and sets up a URL to the first results page in {_next_url}.
native_retrieve_some is called (from WWW::Search::retrieve_some) whenever more hits are needed. It calls WWW::Search::http_request to fetch the page specified by {_next_url}. It then parses this page, appending any search hits it finds to {cache}. If it finds a "next" button in the text, it sets {_next_url} to point to the page for the next set of results; otherwise it sets it to undef to indicate we're done.
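The paging scheme described above can be sketched without any network access. This is a simplified stand-in, not the module's code: the subs setup_search and retrieve_some, the %pages table, and the HIT/NEXT line format are all invented for the example. Only the {_next_url} and {cache} fields mirror the names used in the description.

```perl
use strict;
use warnings;

# Canned pages standing in for HTTP responses. The page names and the
# HIT/NEXT line format are invented; real result pages are HTML.
my %pages = (
    'page1' => "HIT http://a.example/\nHIT http://b.example/\nNEXT page2\n",
    'page2' => "HIT http://c.example/\n",
);

# Counterpart of the setup step: point _next_url at the first page.
sub setup_search {
    my ($first_url) = @_;
    return { _next_url => $first_url, cache => [] };
}

# Counterpart of the retrieve step: fetch one page, append its hits to
# the cache, then advance _next_url or set it to undef when done.
sub retrieve_some {
    my ($self) = @_;
    return 0 unless defined $self->{_next_url};

    my $body = $pages{ $self->{_next_url} };   # stands in for an HTTP fetch
    unless (defined $body) {
        $self->{_next_url} = undef;            # unknown page: stop paging
        return 0;
    }

    my $hits = 0;
    for my $line (split /\n/, $body) {
        if ($line =~ /^HIT\s+(\S+)/) {
            push @{ $self->{cache} }, $1;
            $hits++;
        }
    }

    # A NEXT line plays the role of the "next" button.
    $self->{_next_url} = $body =~ /^NEXT\s+(\S+)/m ? $1 : undef;
    return $hits;
}

my $search = setup_search('page1');
retrieve_some($search) while defined $search->{_next_url};
print "$_\n" for @{ $search->{cache} };
```

The loop condition tests {_next_url} rather than the hit count, so a page with no hits but a "next" link would not end the search early.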
TESTING
This module adheres to the WWW::Search test suite mechanism.
AUTHOR
This backend is written, maintained, and supported by Jim Smyser <jsmyser@bigfoot.com>.
BUGS
Google is not an easy search engine to parse, because it can alter its output ever so slightly for different search terms. New variations in the results output that the author has not yet seen may pop up at any time for certain searches. So, if you think you see a bug, keep the above in mind and send me the search words you used so I can code for any new variations.
VERSION HISTORY
2.13 New regexp to parse newly found results format with certain search terms.
2.10 removed warning on absence of description; new test case
2.09 Google NOW returning url and title on one line.
2.07 Added a new parsing routine for yet another found result line. Added a substitute for whacky url links some queries can produce. Added Kingpin's new hash_to_cgi_string() 10/12/99
2.06 Fixed missing links / regexp crap.
2.05 Matching overhaul to get the code parsing right due to multiple tags being used by google on the hit lines. 9/25/99
2.02 Last Minute description changes 7/13/99
2.01 New test mechanism 7/13/99
1.00 First release 7/11/99
LEGALESE
THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.