NAME
WWW::Orkut::Spider - Perl extension for spidering the orkut community
SYNOPSIS
use WWW::Orkut::Spider;
my $orkut = WWW::Orkut::Spider->new;
$orkut->login($user,$pass);
$orkut->get_hisfriends($uid);
print $orkut->get_xml_profile($uid);
DESCRIPTION
WWW::Orkut::Spider uses WWW::Mechanize to scrape orkut.com.
Output is a simple XML format containing the friends, communities and profile for a given Orkut UID.
- Access to orkut.com via WWW::Mechanize
- Collects UIDs
- Fetches Profiles/Communities/Friends for a given UID
- Output via simple xml format
new (proxy)
You can specify a proxy server here
e.g.: http://www.proxy.de:8080/
or: undef
login (user,pass)
log in to orkut as user with pass
returns undef if unsuccessful
logout
log out of orkut
name (uid)
return name of given known uid
users
return array with all known uids
xml (tag,value)
return a simple
<tag>value</tag>
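As a rough illustration of the one-element format described above, here is a minimal standalone sketch. The name simple_xml is hypothetical and this is not the module's own code; the escaping of &, < and > is an addition of the sketch, and the module's xml method may not do it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Orkut::Spider):
# wrap a single value in one element, <tag>value</tag>.
sub simple_xml {
    my ($tag, $value) = @_;
    $value = '' unless defined $value;

    # Assumption of this sketch: escape the characters that would
    # otherwise break the markup. Ampersand must be handled first.
    $value =~ s/&/&amp;/g;
    $value =~ s/</&lt;/g;
    $value =~ s/>/&gt;/g;

    return "<$tag>$value</$tag>";
}

print simple_xml('name', 'Tom & Jerry'), "\n";   # <name>Tom &amp; Jerry</name>
```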
get_myfriends
only after login
follow the link to friendslist
and get friends uids
returns 1 on success
get_hisfriends (uid)
parse uid friends page for more uids
follow_friends
follow through all friends pages
called after GET of first friend page
parse_friends
parse html page for friends uids
helper for follow_friends
used after GET FriendList
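The kind of parsing described here can be sketched as a scan for UIDs in link targets. This is only an illustration: the function name extract_uids is hypothetical, and the link shape (a uid=<digits> query parameter, as in the sample string) is an assumption, since the actual page markup is not documented here.

```perl
use strict;
use warnings;

# Hypothetical sketch: collect distinct UIDs from an HTML string,
# assuming they appear as a "uid=<digits>" query parameter in links.
sub extract_uids {
    my ($html) = @_;
    my (%seen, @uids);

    # [?&;] also covers "&amp;uid=" as it appears inside HTML attributes.
    while ($html =~ /[?&;]uid=(\d+)/g) {
        push @uids, $1 unless $seen{$1}++;   # keep first occurrence only
    }
    return @uids;
}

# Made-up sample page; the file name in the links is an assumption.
my $page = '<a href="Profile.aspx?uid=111">A</a> '
         . '<a href="Profile.aspx?uid=222">B</a> '
         . '<a href="Profile.aspx?uid=111">A</a>';

print join(',', extract_uids($page)), "\n";   # 111,222
```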
get_friendsfriends (n)
iterate n times over found uids to find more friends
more than n=1 is probably unrealistic and unlikely to work
do not let your script crash in this function: WWW::Mechanize may die if orkut.com has one of its server failures
FIXME: logging out and back in every 50 requests may help
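One way to follow the advice above is to wrap each fetch in eval and to log in again at a fixed interval. The sketch below is a suggestion, not the module's behaviour: crawl_guarded and StubSpider are hypothetical names, and the stub only stands in for a WWW::Orkut::Spider object so that the example runs without network access. The only methods it relies on are login, logout and get_hisfriends as documented here.

```perl
use strict;
use warnings;

# Hypothetical wrapper: $spider is any object offering login, logout
# and get_hisfriends. Returns the UIDs whose fetch died.
sub crawl_guarded {
    my ($spider, $user, $pass, @uids) = @_;
    my ($requests, @failed) = (0);

    for my $uid (@uids) {
        # Re-login every 50 requests, as the FIXME suggests might help.
        if ($requests && $requests % 50 == 0) {
            $spider->logout;
            $spider->login($user, $pass) or last;   # give up if login fails
        }
        $requests++;

        # eval traps a die raised below get_hisfriends, so one server
        # failure does not end the whole run.
        my $ok = eval { $spider->get_hisfriends($uid); 1 };
        push @failed, $uid unless $ok;
    }
    return @failed;
}

{
    # Stand-in for the real spider; dies for one made-up UID.
    package StubSpider;
    sub new            { return bless { logins => 0 }, shift }
    sub login          { $_[0]{logins}++; return 1 }
    sub logout         { return 1 }
    sub get_hisfriends { die "server failure\n" if $_[1] == 13; return 1 }
}

my $stub   = StubSpider->new;
my @failed = crawl_guarded($stub, 'user', 'pass', 1 .. 120);
print "failed: @failed\n";   # failed: 13
```

Failed UIDs are returned rather than retried, so the caller decides whether to try them again later.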
get_xml_profile (uid)
return profile of uid as simple xml
get_xml_communities (uid)
return communities of uid as simple xml
get_xml_friendslist (uid)
return friendslist of uid as simple xml
SEE ALSO
Net::Orkut ( using LWP directly )
AUTHOR
mm-pause@manno.name
COPYRIGHT AND LICENSE
Copyright (C) 2004 by mm-pause@manno.name
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.