NAME
WWW::Google::Groups - Google Groups Agent
SYNOPSIS
use WWW::Google::Groups;
$agent = WWW::Google::Groups->new(
    server => 'groups.google.com',
    proxy  => 'my.proxy.server:port',
);
$group = $agent->select_group('comp.lang.perl.misc');
$group->starting_thread(0); # Set the first thread to fetch
# Default starting thread is 0
while( $thread = $group->next_thread() ){
    while( $article = $thread->next_article() ){
        # The returned $article is an Email::Simple object.
        # See Email::Simple for its methods.
        print join q/ /, $thread->title, '<'.$article->header('From').'>', $/;
    }
}
If you pass 'raw' as an argument to $thread->next_article(), it returns the raw message instead of an Email::Simple object.
while( $thread = $group->next_thread() ){
    while( $article = $thread->next_article('raw') ){
        print $article;
    }
}
You can also use the more powerful save2mbox method, which tries to mirror the whole newsgroup and save the messages to a Unix mbox file.
$agent->save2mbox(
    group             => 'comp.lang.perl.misc',
    starting_thread   => 0,
    max_article_count => 10000,
    max_thread_count  => 1000,
    target_mbox       => 'perl.misc.mbox',
);
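Once a mirror has been saved, it may be handy to check how many messages ended up in the file. The following is only a minimal sketch in core Perl, not part of this module: split_mbox is a hypothetical helper, and it assumes the saved file follows the traditional Unix mbox layout in which every message begins with a "From " separator line. The exact layout written by save2mbox is not documented here, so treat that as an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by WWW::Google::Groups).
# Split mbox text into a list of raw messages.
# Assumption: each message starts with a line beginning with "From ".
sub split_mbox {
    my ($text) = @_;
    my @messages;
    my $current;
    for my $line (split /^/m, $text) {
        if ($line =~ /^From /) {
            # A separator line closes the previous message, if any.
            push @messages, $current if defined $current;
            $current = '';
            next;    # the separator line itself is not kept
        }
        # Lines before the first separator are ignored.
        $current .= $line if defined $current;
    }
    push @messages, $current if defined $current;
    return @messages;
}

# 'perl.misc.mbox' is the target_mbox name used in the example above.
my $path = 'perl.misc.mbox';
if (-e $path) {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $text = do { local $/; <$fh> } // '';    # slurp the whole file
    close $fh;
    my @messages = split_mbox($text);
    printf "%d messages in %s\n", scalar @messages, $path;
}
```

The sketch reads the whole file into memory, which is fine for a quick check but may not suit very large mirrors. Body lines that begin with an unescaped "From " would be miscounted as separators.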
DESCRIPTION
I have heard that this module is, or may be, violating Google's terms of service, so use it at your own risk. I wrote it to crawl back the whole histories of several newsgroups for my personal interest. Since many NNTP servers do not keep large, complete archives, Google became my last resort. However, the web interface of Google Groups does not suit me well: I am something of a keyboard and console addict and would like some sort of Perl API. That is why I designed this module. I hope Google will not raise any concern with me over this evil.
COPYRIGHT
xern <xern@cpan.org>
This module is free software; you can redistribute it or modify it under the same terms as Perl itself.