WWW::Search::Scraper
WWW::Search::Sherlock
WWW::Search::Scraper::*
WWW::SearchResult::*
These modules scrape data from search engines on the WWW (much like Apple's
Sherlock, but more capable, complete, and accurate). Complete
documentation can be found in WWW::Search::Scraper and WWW::Search::Sherlock.
Special options for each type of search engine are documented in their respective modules.
Examples include:
1. SearchApartments - illustrates how to easily set up and use one search engine.
2. Sherlock - illustrates the ease of adapting any Sherlock plugin.
3. Scraper - illustrates how SearchResult sub-classing can be used to build
a more generalized search engine scraper. You can see how this
can be extended to build a multi-engine scraper.
If you want to write new Scraper modules to access new search engines, see
FlipDog.pm for the current "best practices".
Happy Hunting!
AUTHOR: Glenn Wood, glenwood@alumni.caltech.edu
#---------------------------------------------------------------------#
( $VERSION ) = sprintf("%d.%02d", q$Revision: 1.36 $ =~ /(\d+)\.(\d+)/);
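The version line above derives the version number from the revision-control
"Revision" keyword. The following is only a minimal sketch of how that
expression evaluates. The helper name version_from_revision is hypothetical
and is not part of this distribution.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not part of the
# distribution): turn a "Revision" keyword string into "major.minor".
sub version_from_revision {
    my ($keyword) = @_;
    # In list context the match returns the two captured numbers;
    # on failure the list is empty and we return nothing.
    my ($major, $minor) = $keyword =~ /(\d+)\.(\d+)/
        or return;
    # %02d zero-pads the minor number to two digits.
    return sprintf("%d.%02d", $major, $minor);
}

print version_from_revision('$Revision: 1.36 $'), "\n";   # prints 1.36
print version_from_revision('$Revision: 1.5 $'), "\n";    # prints 1.05
```

Because the minor number is zero-padded, revision 1.5 becomes 1.05 and so
compares numerically lower than 1.36, as intended.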
v1.36 - Added test.pl; fixed bugs discovered thereby in Dice, apartments, eBay, etc.
v1.35 - Added FlipDog.com, and a few features to Scraper.pm (e.g. 'BOGUS', etc).
v1.34 - Introduced SearchResult sub-classing, with illustration in eg/Scraper.pl
        Improved reliability of BAJobs.pm
        Dice.com changed the result page format, again. I hope this version
        of Dice.pm will be more adaptive to Dice.com's future changes.
        Added www.computerjobs.com and www.techies.com (which still has a problem)
        Added examples: eg/SearchApartments, improved eg/Scraper.
v1.33 - Added www.apartments.com, and eg/Sherlock.pl
KNOWN PROBLEMS
theWorksUSA.pm goes into an infinite loop more often than not.
techies.pm does not work at all: the site keeps responding "Please enable your cookies".