NAME

WWW::Search::Google - class for searching Google

SYNOPSIS

use WWW::Search;
my $Search = WWW::Search->new('Google'); # cAsE matters
my $Query = WWW::Search::escape_query("Where is Jimbo");
$Search->native_query($Query);
while (my $Result = $Search->next_result()) {
    print $Result->url, "\n";
}

DESCRIPTION

This class is a Google specialization of WWW::Search. It handles making and interpreting Google searches at http://www.google.com.

Google returns 100 hits per page. A custom Linux-only search mode is also available (see LINUX SEARCH).

This class exports no public interface; all interaction should be done through WWW::Search objects.

LINUX SEARCH

For Linux lovers like me, you can put Google into a Linux-only search mode by changing the search URL from:

'search_url' => 'http://www.google.com/search',

to:

'search_url' => 'http://www.google.com/linux',
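
The change above amounts to overriding one key in a hash of default options. As a minimal sketch only, assuming a hypothetical helper build_first_url and a hypothetical query parameter named q (neither is taken from this module's source), the idea might look like this:

```perl
use strict;
use warnings;

# Hypothetical defaults; only the 'search_url' key and its two values
# come from the text above.
my %DEFAULTS = ( 'search_url' => 'http://www.google.com/search' );

# Hypothetical helper: merge caller overrides into the defaults and
# build the URL of the first results page. The parameter name 'q' is
# an assumption made for illustration.
sub build_first_url {
    my ($escaped_query, $overrides) = @_;
    my %options = ( %DEFAULTS, %{ $overrides || {} } );
    return $options{'search_url'} . '?q=' . $escaped_query;
}

# The query string is assumed to be escaped already.
my $web   = build_first_url('Where+is+Jimbo');
my $linux = build_first_url('Where+is+Jimbo',
                            { 'search_url' => 'http://www.google.com/linux' });
print "$web\n$linux\n";
```

Keeping the defaults in one hash and letting later keys win in the merge means only the overridden entry changes; any other options stay as they were.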

SEE ALSO

To make new back-ends, see WWW::Search.

HOW DOES IT WORK?

native_setup_search is called (from WWW::Search::setup_search) before we do anything. It initializes our private variables (which all begin with underscore) and sets up a URL to the first results page in {_next_url}.

native_retrieve_some is called (from WWW::Search::retrieve_some) whenever more hits are needed. It calls WWW::Search::http_request to fetch the page specified by {_next_url}. It then parses this page, appending any search hits it finds to {cache}. If it finds a "next" button in the text, it sets {_next_url} to point to the page for the next set of results; otherwise it sets it to undef to indicate we're done.
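
The paging logic described above can be sketched without touching the network. This is an illustration under stated assumptions, not the module's actual code: the %FAKE_PAGES table stands in for http_request plus parsing, the page names and example URLs are made up, and the two subs only mirror the roles of native_setup_search and native_retrieve_some.

```perl
use strict;
use warnings;

# Stand-in for the network: maps a page key to the hits on that page and
# the key behind its "next" button (undef on the last page). All values
# here are invented for illustration.
my %FAKE_PAGES = (
    'page1' => { hits => [ 'http://a.example/', 'http://b.example/' ], next => 'page2' },
    'page2' => { hits => [ 'http://c.example/' ],                      next => undef   },
);

# Mirrors the role of native_setup_search: initialize private state and
# point {_next_url} at the first results page.
sub setup_search {
    my ($self) = @_;
    $self->{cache}     = [];
    $self->{_next_url} = 'page1';
    return;
}

# Mirrors the role of native_retrieve_some: "fetch" one page, append its
# hits to {cache}, then advance {_next_url} or clear it when done.
sub retrieve_some {
    my ($self) = @_;
    return 0 unless defined $self->{_next_url};   # nothing left to fetch
    my $page = $FAKE_PAGES{ $self->{_next_url} };
    push @{ $self->{cache} }, @{ $page->{hits} };
    $self->{_next_url} = $page->{next};
    return scalar @{ $page->{hits} };
}

my $search = {};
setup_search($search);
retrieve_some($search) while defined $search->{_next_url};
print "$_\n" for @{ $search->{cache} };
```

Using undef in {_next_url} as the end-of-results signal keeps the loop condition in the caller trivial: keep asking for more while a next page is known.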

TESTING

This module adheres to the WWW::Search test suite mechanism.

BUGS

2.07 now parses most of what Google produces, but not all. Because Google does not use uniform formatting for all of its results, there are undoubtedly a few line formats the author has not yet covered. Different search terms produce differently formatted result lines. For example, searching for "visual basic" will create wacky URL links, whereas searching for "Visual C++" does not. It is a parsing nightmare, really! If you think you have uncovered a bug, just remember the above comments!

With the above said, this back-end will produce properly formatted results for 96+% of what it is asked to produce. Your mileage will vary.

AUTHOR

This backend is maintained and supported by Jim Smyser <jsmyser@bigfoot.com>.

BUGS

2.09 now seems to parse all hits following the format change, so there really shouldn't be any parsing bugs like there were with 2.08.

VERSION HISTORY

2.10 removed warning on absence of description; new test case

2.09 Google now returns URL and title on one line.

2.07 Added a new parsing routine for yet another found result line. Added a substitute for whacky url links some queries can produce. Added Kingpin's new hash_to_cgi_string() 10/12/99

2.06 Fixed missing links / regexp crap.

2.05 Matching overhaul to get the code parsing right, because Google uses multiple tags on the hit lines. 9/25/99

2.02 Last Minute description changes 7/13/99

2.01 New test mechanism 7/13/99

1.00 First release 7/11/99

LEGALESE

THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.