Take me over?
NAME
MediaWiki::Bot - a high-level bot framework for interacting with MediaWiki wikis
VERSION
version 5.007000
SYNOPSIS
use MediaWiki::Bot qw(:constants);
my $bot = MediaWiki::Bot->new({
assert => 'bot',
host => 'de.wikimedia.org',
login_data => { username => "Mike's bot account", password => "password" },
});
my $revid = $bot->get_last("User:Mike.lifeguard/sandbox", "Mike.lifeguard");
print "Reverting to $revid\n" if defined($revid);
$bot->revert('User:Mike.lifeguard', $revid, 'rvv');
DESCRIPTION
MediaWiki::Bot is a framework that can be used to write bots which interface with the MediaWiki API (http://en.wikipedia.org/w/api.php).
METHODS
new
my $bot = MediaWiki::Bot({
host => 'en.wikipedia.org',
operator => 'Mike.lifeguard',
});
Calling MediaWiki::Bot->new()
will create a new MediaWiki::Bot object. The only parameter is a hashref with keys:
agent sets a custom useragent. It is recommended to use
operator
instead, which is all we need to do the right thing for you. If you really want to do it yourself, see https://meta.wikimedia.org/wiki/User-agent_policy for guidance on what information must be included.assert sets a parameter for the AssertEdit extension (commonly 'bot')
operator allows the bot to send you a message when it fails an assert. This is also the recommended way to customize the user agent string, which is required by the Wikimedia Foundation. A warning will be emitted if you omit this.
maxlag allows you to set the maxlag parameter (default is the recommended 5s).
Please refer to the MediaWiki documentation prior to changing this from the default.
protocol allows you to specify 'http' or 'https' (default is 'http')
host sets the domain name of the wiki to connect to
path sets the path to api.php (with no leading or trailing slash)
login_data is a hashref of credentials to pass to "login".
debug - whether to provide debug output.
1 provides only error messages; 2 provides further detail on internal operations.
For example:
my $bot = MediaWiki::Bot->new({
assert => 'bot',
protocol => 'https',
host => 'en.wikimedia.org',
agent => sprintf(
'PerlWikiBot/%s (https://metacpan.org/MediaWiki::Bot; User:Mike.lifeguard)',
MediaWiki::Bot->VERSION
),
login_data => { username => "Mike's bot account", password => "password" },
});
For backward compatibility, you can specify up to three parameters:
my $bot = MediaWiki::Bot->new('My custom useragent string', $assert, $operator);
This form is deprecated will never do auto-login or autoconfiguration, and emits deprecation warnings.
For further reading:
set_wiki
Set what wiki to use. The parameter is a hashref with keys:
host - the domain name
path - the part of the path before api.php (usually 'w')
protocol is either 'http' or 'https'.
If you don't set any parameter, it's previous value is used. If it has never been set, the default settings are 'http', 'en.wikipedia.org' and 'w'.
For example:
$bot->set_wiki({
protocol => 'https',
host => 'secure.wikimedia.org',
path => 'wikipedia/meta/w',
});
For backward compatibility, you can specify up to two parameters:
$bot->set_wiki($host, $path);
This form is deprecated, and will emit deprecation warnings.
login
This method takes a hashref with keys username and password at a minimum. See "Single User Login" and "Basic authentication" for additional options.
Logs the use $username in, optionally using $password. First, an attempt will be made to use cookies to log in. If this fails, an attempt will be made to use the password provided to log in, if any. If the login was successful, returns true; false otherwise.
$bot->login({
username => $username,
password => $password,
}) or die "Login failed";
Once logged in, attempt to do some simple auto-configuration. At present, this consists of:
Warning if the account doesn't have the bot flag, and isn't a sysop account.
Setting an appropriate default assert.
You can skip this autoconfiguration by passing autoconfig => 0
For backward compatibility, you can call this as
$bot->login($username, $password);
This form is deprecated, and will emit deprecation warnings. It will never do autoconfiguration or SUL login.
Single User Login
On WMF wikis, do_sul
specifies whether to log in on all projects. The default is false. But even when false, you still get a CentralAuth cookie for, and are thus logged in on, all languages of a given domain (*.wikipedia.org
, for example). When set, a login is done on each WMF domain so you are logged in on all ~800 content wikis. Since *.wikimedia.org
is not possible, we explicitly include meta, commons, incubator, and wikispecies.
Basic authentication
If you need to supply basic auth credentials, pass a hashref of data as described by LWP::UserAgent:
$bot->login({
username => $username,
password => $password,
basic_auth => { netloc => "private.wiki.com:80",
realm => "Authentication Realm",
uname => "Basic auth username",
pass => "password",
}
}) or die "Couldn't log in";
Bot passwords
MediaWiki::Bot
doesn't yet support the more complicated (but more secure) oAuth login flow for bots. Instead, we support a simpler "bot password", which is a generated password connected to a (possibly-reduced) set of on-wiki privileges, and IP ranges from which it can be used.
To create one, visit Special:BotPasswords
on the wiki. Enter a label for the password, then select the privileges you want to use with that password. This set should be as restricted as possible; most bots only edit existing pages. Keeping the set of privileges as restricted as possible limits the possible damage if the password were ever compromised.
Submit the form, and you'll be given a new "username" that looks like "AccountUsername@bot_password_label", and a generated bot password. To log in, provide those to MediaWiki::Bot
verbatim.
References: API:Login, Logging in
logout
$bot->logout();
The logout method logs the bot out of the wiki. This invalidates all login cookies.
References: API:Logging out
edit
my $text = $bot->get_text('My page');
$text .= "\n\n* More text\n";
$bot->edit({
page => 'My page',
text => $text,
summary => 'Adding new content',
section => 'new',
});
This method edits a wiki page, and takes a hashref of data with keys:
page - the page title to edit
text - the page text to write
summary - an edit summary
minor - whether to mark the edit as minor or not (boolean)
bot - whether to mark the edit as a bot edit (boolean)
assertion - usually 'bot', but see http://mediawiki.org/wiki/Extension:AssertEdit.
section - edit a single section (identified by number) instead of the whole page
An MD5 hash is sent to guard against data corruption while in transit.
You can also call this as:
$bot->edit($page, $text, $summary, $is_minor, $assert, $markasbot);
This form is deprecated, and will emit deprecation warnings.
CAPTCHAs
If a CAPTCHA is encountered, the call to edit
will return false, with the error code set to ERR_CAPTCHA
and the details informing you that solving a CAPTCHA is required for this action. The information you need to actually solve the captcha (for example the URL for the image) is given in $bot->{error}->{captcha}
as a hash reference. You will want to grab the keys 'url' (a relative URL to the image) and 'id' (the ID of the CAPTCHA). Once you have solved the CAPTCHA (presumably by interacting with a human), retry the edit, adding captcha_id
and captcha_solution
parameters:
my $edit = {page => 'Main Page', text => 'got your nose'};
my $edit_status = $bot->edit($edit);
if (not $edit_status) {
if ($bot->{error}->{code} == ERR_CAPTCHA) {
my @captcha_uri = split /\Q?/, $bot->{error}{captcha}{url}, 2;
my $image = URI->new(sprintf '%s://%s%s?%s' =>
$bot->{protocol}, $bot->{host}, $captcha_uri[0], $captcha_uri[1],
);
require Term::ReadLine;
my $term = Term::ReadLine->new('Solve the captcha');
$term->ornaments(0);
my $answer = $term->readline("Please solve $image and type the answer: ");
# Add new CAPTCHA params to the edit we're attempting
$edit->{captcha_id} = $bot->{error}->{captcha}->{id};
$edit->{captcha_solution} = $answer;
$status = $bot->edit($edit);
}
}
References: Editing pages, API:Edit, API:Tokens
move
$bot->move($from_title, $to_title, $reason, $options_hashref);
This moves a wiki page.
If you wish to specify more options (like whether to suppress creation of a redirect), use $options_hashref, which has keys:
movetalk specifies whether to attempt to the talk page.
noredirect specifies whether to suppress creation of a redirect.
movesubpages specifies whether to move subpages, if applicable.
watch and unwatch add or remove the page and the redirect from your watchlist.
ignorewarnings ignores warnings.
my @pages = ("Humor", "Rumor");
foreach my $page (@pages) {
my $to = $page;
$to =~ s/or$/our/;
$bot->move($page, $to, "silly 'merricans");
}
References: API:Move
get_history
my @hist = $bot->get_history($title);
my @hist = $bot->get_history($title, $additional_params);
Returns an array containing the history of the specified page $title.
The optional hash ref $additional_params can be used to tune the query by API parameters, such as 'rvlimit' to return only 'rvlimit' number of revisions (default is as many as possible, but may be limited per query) or 'rvdir' to set the chronological direction.
Example:
my @hist = $bot->get_history('Main Page', {'rvlimit' => 10, 'rvdir' => 'older'})
The array returned contains hashrefs with keys: revid, user, comment, minor, timestamp_date, and timestamp_time.
For backward compatibility, you can specify up to four parameters:
my @hist = $bot->get_history($title, $limit, $revid, $direction);
References: Getting page history, API:Properties#revisions
get_history_step_by_step
my @hist = $bot->get_history_step_by_step($title);
my @hist = $bot->get_history_step_by_step($title, $additional_params);
Same as get_history(), but does not return the full history at once, but let's you loop through it.
The optional call-by-reference hash ref $additional_params can be used to loop through a page's full history by using the 'continue' param returned by the API.
Example:
my $ready;
my $filter_params = {};
while(!$ready){
my @hist = $bot->get_history_step_by_step($page, $filter_params);
if(@hist == 0 || !defined($filter_params->{'continue'})){
$ready = 1;
}
# do something with @hist
}
References: Getting page history, API:Properties#revisions
get_text
Returns the wikitext of the specified $page_title. The first parameter $page_title is the only required one.
The second parameter is a hashref with the following independent optional keys:
rvstartid
- if defined, this function returns the text of that revision, otherwise the newest revision will be used.rvsection
- if defined, returns the text of that section. Otherwise the whole page text will be returned.pageid
- this is an output parameter and can be used to fetch the id of a page without the need of calling "get_id" additionally. Note that the value of this param is ignored and it will be overwritten by this function.rv...
- any param starting with 'rv' will be forwarded to the api call.
A blank page will return wikitext of "" (which evaluates to false in Perl, but is defined); a nonexistent page will return undef (which also evaluates to false in Perl, but is obviously undefined). You can distinguish between blank and nonexistent pages by using defined:
# simple example
my $wikitext = $bot->get_text('Page title');
print "Wikitext: $wikitext\n" if defined $wikitext;
# advanced example
my $options = {'revid'=>123456, 'section_number'=>2};
$wikitext = $bot->get_text('Page title', $options);
die "error, see API error message\n" unless defined $options->{'pageid'};
warn "page doesn't exist\n" if $options->{'pageid'} == MediaWiki::Bot::PAGE_NONEXISTENT;
print "Wikitext: $wikitext\n" if defined $wikitext;
References: Fetching page text, API:Properties#revisions
For backward-compatibility the params revid
and section_number
may also be given as scalar parameters:
my $wikitext = $bot->get_text('Page title', 123456, 2);
print "Wikitext: $wikitext\n" if defined $wikitext;
get_id
Returns the id of the specified $page_title. Returns undef if page does not exist.
my $pageid = $bot->get_id("Main Page");
die "Page doesn't exist\n" if !defined($pageid);
Revisions: API:Properties#info
get_pages
Returns the text of the specified pages in a hashref. Content of undef means page does not exist. Also handles redirects or article names that use namespace aliases.
my @pages = ('Page 1', 'Page 2', 'Page 3');
my $thing = $bot->get_pages(\@pages);
foreach my $page (keys %$thing) {
my $text = $thing->{$page};
print "$text\n" if defined($text);
}
References: Fetching page text, API:Properties#revisions
get_image
$buffer = $bot->get_image('File:Foo.jpg', { width=>256, height=>256 });
Download an image from a wiki. This is derived from a similar function in MediaWiki::API. This one allows the image to be scaled down by passing a hashref with height & width parameters.
It returns raw data in the original format. You may simply spew it to a file, or process it directly with a library such as Imager.
use File::Slurp qw(write_file);
my $img_data = $bot->get_image('File:Foo.jpg');
write_file( 'Foo.jpg', {binmode => ':raw'}, \$img_data );
Images are scaled proportionally. (height/width) will remain constant, except for rounding errors.
Height and width parameters describe the maximum dimensions. A 400x200 image will never be scaled to greater dimensions. You can scale it yourself; having the wiki do it is just lazy & selfish.
References: API:Properties#imageinfo
revert
Reverts the specified $page_title to $revid, with an edit summary of $summary. A default edit summary will be used if $summary is omitted.
my $revid = $bot->get_last("User:Mike.lifeguard/sandbox", "Mike.lifeguard");
print "Reverting to $revid\n" if defined($revid);
$bot->revert('User:Mike.lifeguard', $revid, 'rvv');
References: API:Edit
undo
$bot->undo($title, $revid, $summary, $after);
Reverts the specified $revid, with an edit summary of $summary, using the undo function. To undo all revisions from $revid up to but not including this one, set $after to another revid. If not set, just undo the one revision ($revid).
References: API:Edit
get_last
Returns the revid of the last revision to $page not made by $user. undef is returned if no result was found, as would be the case if the page is deleted.
my $revid = $bot->get_last('User:Mike.lifeguard/sandbox', 'Mike.lifeguard');
if defined($revid) {
print "Reverting to $revid\n";
$bot->revert('User:Mike.lifeguard', $revid, 'rvv');
}
References: API:Properties#revisions
update_rc
This method is deprecated, and will emit deprecation warnings. Replace calls to update_rc()
with calls to the newer recentchanges()
, which returns all available data, including rcid.
Returns an array containing the $limit most recent changes to the wiki's main namespace. The array contains hashrefs with keys title, revid, old_revid, and timestamp.
my @rc = $bot->update_rc(5);
foreach my $hashref (@rc) {
my $title = $hash->{'title'};
print "$title\n";
}
The "Options hashref" is also available:
# Use a callback for incremental processing:
my $options = { hook => \&mysub, };
$bot->update_rc($options);
sub mysub {
my ($res) = @_;
foreach my $hashref (@$res) {
my $page = $hashref->{'title'};
print "$page\n";
}
}
recentchanges($wiki_hashref, $options_hashref)
Returns an array of hashrefs containing recentchanges data.
The first parameter is a hashref with the following keys:
ns - the namespace number, or an arrayref of numbers to specify several; default is the main namespace
limit - the number of rows to fetch; default is 50
user - only list changes by this user
show - itself a hashref where the key is a category and the value is a boolean. If true, the category will be included; if false, excluded. The categories are kinds of edits: minor, bot, anon, redirect, patrolled. See "rcshow" at http://www.mediawiki.org/wiki/API:Recentchanges#Parameters.
An "Options hashref" can be used as the second parameter:
my @rc = $bot->recentchanges({ ns => 4, limit => 100 });
foreach my $hashref (@rc) {
print $hashref->{title} . "\n";
}
# Or, use a callback for incremental processing:
$bot->recentchanges({ ns => [0,1], limit => 500 }, { hook => \&mysub });
sub mysub {
my ($res) = @_;
foreach my $hashref (@$res) {
my $page = $hashref->{title};
print "$page\n";
}
}
The hashref returned might contain the following keys:
ns - the namespace number
revid
old_revid
timestamp
rcid - can be used with "patrol"
pageid
type - one of edit, new, log (there may be others)
title
For backwards compatibility, the previous method signature is still supported:
$bot->recentchanges($ns, $limit, $options_hashref);
References: API:Recentchanges
what_links_here
Returns an array containing a list of all pages linking to $page.
Additional optional parameters are:
One of: all (default), redirects, or nonredirects.
A namespace number to search (pass an arrayref to search in multiple namespaces)
A typical query:
my @links = $bot->what_links_here("Meta:Sandbox",
undef, 1,
{ hook=>\&mysub }
);
sub mysub{
my ($res) = @_;
foreach my $hash (@$res) {
my $title = $hash->{'title'};
my $is_redir = $hash->{'redirect'};
print "Redirect: $title\n" if $is_redir;
print "Page: $title\n" unless $is_redir;
}
}
Transclusions are no longer handled by what_links_here() - use "list_transclusions" instead.
References: Listing incoming links, API:Backlinks
list_transclusions
Returns an array containing a list of all pages transcluding $page.
Other parameters are:
One of: all (default), redirects, or nonredirects
A namespace number to search (pass an arrayref to search in multiple namespaces).
$options_hashref as described by MediaWiki::API:
Set max to limit the number of queries performed.
Set hook to a subroutine reference to use a callback hook for incremental processing.
Refer to the section on "linksearch" for examples.
A typical query:
$bot->list_transclusions("Template:Tlx", undef, 4, {hook => \&mysub});
sub mysub{
my ($res) = @_;
foreach my $hash (@$res) {
my $title = $hash->{'title'};
my $is_redir = $hash->{'redirect'};
print "Redirect: $title\n" if $is_redir;
print "Page: $title\n" unless $is_redir;
}
}
References: Listing transclusions API:Embeddedin
get_pages_in_category
Returns an array containing the names of all pages in the specified category (include the Category: prefix). Does not recurse into sub-categories.
my @pages = $bot->get_pages_in_category('Category:People on stamps of Gabon');
print "The pages in Category:People on stamps of Gabon are:\n@pages\n";
The options hashref is as described in "Options hashref". Use { max => 0 }
to get all results.
References: Listing category contents, API:Categorymembers
get_all_pages_in_category
my @pages = $bot->get_all_pages_in_category($category, $options_hashref);
Returns an array containing the names of all pages in the specified category (include the Category: prefix), including sub-categories. The $options_hashref is described fully in "Options hashref".
References: Listing category contents, API:Categorymembers
get_all_categories
Returns an array containing the names of all categories.
my @categories = $bot->get_all_categories();
print "The categories are:\n@categories\n";
Use { max => 0 }
to get all results. The default number of categories returned is 10, the maximum allowed is 500.
References: API:Allcategories
linksearch
Runs a linksearch on the specified $link and returns an array containing anonymous hashes with keys 'url' for the outbound URL, and 'title' for the page the link is on.
Additional parameters are:
A namespace number to search (pass an arrayref to search in multiple namespaces).
You can search by $protocol (http is default).
$options_hashref is fully documented in "Options hashref":
Set max in $options to get more than one query's worth of results:
my $options = { max => 10, }; # I only want some results my @links = $bot->linksearch("slashdot.org", 1, undef, $options); foreach my $hash (@links) { my $url = $hash->{'url'}; my $page = $hash->{'title'}; print "$page: $url\n"; }
Set hook to a subroutine reference to use a callback hook for incremental processing:
my $options = { hook => \&mysub, }; # I want to do incremental processing $bot->linksearch("slashdot.org", 1, undef, $options); sub mysub { my ($res) = @_; foreach my $hashref (@$res) { my $url = $hashref->{'url'}; my $page = $hashref->{'title'}; print "$page: $url\n"; } }
References: Finding external links, API:Exturlusage
purge_page
Purges the server cache of the specified $page. Returns true on success; false on failure. Pass an array reference to purge multiple pages.
If you really care, a true return value is the number of pages successfully purged. You could check that it is the same as the number you wanted to purge - maybe some pages don't exist, or you passed invalid titles, or you aren't allowed to purge the cache:
my @to_purge = ('Main Page', 'A', 'B', 'C', 'Very unlikely to exist');
my $size = scalar @to_purge;
print "all-at-once:\n";
my $success = $bot->purge_page(\@to_purge);
if ($success == $size) {
print "@to_purge: OK ($success/$size)\n";
}
else {
my $missed = @to_purge - $success;
print "We couldn't purge $missed pages (list was: "
. join(', ', @to_purge)
. ")\n";
}
# OR
print "\n\none-at-a-time:\n";
foreach my $page (@to_purge) {
my $ok = $bot->purge_page($page);
print "$page: $ok\n";
}
References: Purging the server cache, API:Purge
get_namespace_names
my %namespace_names = $bot->get_namespace_names();
Returns a hash linking the namespace id, such as 1, to its named equivalent, such as "Talk".
References: API:Meta#siteinfo
image_usage
Gets a list of pages which include a certain $image. Include the File:
namespace prefix to avoid incurring an extra round-trip (which will also emit a deprecation warnings).
Additional parameters are:
A namespace number to fetch results from (or an arrayref of multiple namespace numbers)
One of all, redirect, or nonredirects.
$options is a hashref as described in the section for "linksearch".
my @pages = $bot->image_usage("File:Albert Einstein Head.jpg");
Or, make use of the "Options hashref" to do incremental processing:
$bot->image_usage("File:Albert Einstein Head.jpg",
undef, undef,
{ hook=>\&mysub, max=>5 }
);
sub mysub {
my $res = shift;
foreach my $page (@$res) {
my $title = $page->{'title'};
print "$title\n";
}
}
References: API:Imageusage
global_image_usage($image, $results, $filterlocal)
Returns an array of hashrefs of data about pages which use the given image.
my @data = $bot->global_image_usage('File:Albert Einstein Head.jpg');
The keys in each hashref are title, url, and wiki. $results
is the maximum number of results that will be returned (not the maximum number of requests that will be sent, like max
in the "Options hashref"); the default is to attempt to fetch 500 (set to 0 to get all results). $filterlocal
will filter out local uses of the image.
References: Extension:GlobalUsage#API
links_to_image
A backward-compatible call to "image_usage". You can provide only the image title.
This method is deprecated, and will emit deprecation warnings.
is_blocked
my $blocked = $bot->is_blocked('User:Mike.lifeguard');
Checks if a user is currently blocked.
References: API:Blocks
test_blocked
Retained for backwards compatibility. Use "is_blocked" for clarity.
This method is deprecated, and will emit deprecation warnings.
test_image_exists
Checks if an image exists at $page.
FILE_NONEXISTENT
(0) means "Nothing there"FILE_LOCAL
(1) means "Yes, an image exists locally"FILE_SHARED
(2) means "Yes, an image exists on Commons"FILE_PAGE_TEXT_ONLY
(3) means "No image exists, but there is text on the page"
If you pass in an arrayref of images, you'll get out an arrayref of results.
use MediaWiki::Bot::Constants;
my $exists = $bot->test_image_exists('File:Albert Einstein Head.jpg');
if ($exists == FILE_NONEXISTENT) {
print "Doesn't exist\n";
}
elsif ($exists == FILE_LOCAL) {
print "Exists locally\n";
}
elsif ($exists == FILE_SHARED) {
print "Exists on Commons\n";
}
elsif ($exists == FILE_PAGE_TEXT_ONLY) {
print "Page exists, but no image\n";
}
References: API:Properties#imageinfo
get_pages_in_namespace
$bot->get_pages_in_namespace($namespace, $limit, $options_hashref);
Returns an array containing the names of all pages in the specified namespace. The $namespace_id must be a number, not a namespace name.
Setting $page_limit is optional, and specifies how many items to retrieve at once. Setting this to 'max' is recommended, and this is the default if omitted. If $page_limit is over 500, it will be rounded up to the next multiple of 500. If $page_limit is set higher than you are allowed to use, it will silently be reduced. Consider setting key 'max' in the "Options hashref" to retrieve multiple sets of results:
# Gotta get 'em all!
my @pages = $bot->get_pages_in_namespace(6, 'max', { max => 0 });
References: API:Allpages
count_contributions
my $count = $bot->count_contributions($user);
Uses the API to count $user's contributions.
References: API:Users
timed_count_contributions
($timed_edits_count, $total_count) = $bot->timed_count_contributions($user, $days);
Uses the API to count $user's contributions in last number of $days and total number of user's contributions (if needed).
Example: If you want to get user contribs for last 30 and 365 days, and total number of edits you would write something like this:
my ($last30days, $total) = $bot->timed_count_contributions($user, 30);
my $last365days = $bot->timed_count_contributions($user, 365);
You could get total number of edits also by separately calling count_contributions like this:
my $total = $bot->count_contributions($user);
and use timed_count_contributions only in scalar context, but that would mean one more call to server (meaning more server load) of which you are excused as timed_count_contributions returns array with two parameters.
References: Extension:UserDailyContribs
last_active
my $latest_timestamp = $bot->last_active($user);
Returns the last active time of $user in YYYY-MM-DDTHH:MM:SSZ
.
References: API:Usercontribs
recent_edit_to_page
my ($timestamp, $user) = $bot->recent_edit_to_page($title);
Returns timestamp and username for most recent (top) edit to $page.
References: API:Properties#revisions
get_users
my @recent_editors = $bot->get_users($title, $limit, $revid, $direction);
Gets the most recent editors to $page, up to $limit, starting from $revision and going in $direction.
References: API:Properties#revisions
was_blocked
for ("Mike.lifeguard", "Jimbo Wales") {
print "$_ was blocked\n" if $bot->was_blocked($_);
}
Returns whether $user has ever been blocked.
References: API:Logevents
test_block_hist
Retained for backwards compatibility. Use "was_blocked" for clarity.
This method is deprecated, and will emit deprecation warnings.
expandtemplates
my $expanded = $bot->expandtemplates($title, $wikitext);
Expands templates on $page, using $text if provided, otherwise loading the page text automatically.
References: API:Parsing wikitext
get_allusers
my @users = $bot->get_allusers($limit, $user_group, $options_hashref);
Returns an array of all users. Default $limit is 500. Optionally specify a $group (like 'sysop') to list that group only. The last optional parameter is an "Options hashref".
References: API:Allusers
db_to_domain
Converts a wiki/database name (enwiki) to the domain name (en.wikipedia.org).
my @wikis = ("enwiki", "kowiki", "bat-smgwiki", "nonexistent");
foreach my $wiki (@wikis) {
my $domain = $bot->db_to_domain($wiki);
next if !defined($domain);
print "$wiki: $domain\n";
}
You can pass an arrayref to do bulk lookup:
my @wikis = ("enwiki", "kowiki", "bat-smgwiki", "nonexistent");
my $domains = $bot->db_to_domain(\@wikis);
foreach my $domain (@$domains) {
next if !defined($domain);
print "$domain\n";
}
References: Extension:SiteMatrix
domain_to_db
my $db = $bot->domain_to_db($domain_name);
As you might expect, does the opposite of "domain_to_db": Converts a domain name (meta.wikimedia.org) into a database/wiki name (metawiki).
References: Extension:SiteMatrix
diff
This allows retrieval of a diff from the API. The return is a scalar containing the HTML table of the diff. Options are passed as a hashref with keys:
title is the title to use. Provide either this or revid.
revid is any revid to diff from. If you also specified title, only title will be honoured.
oldid is an identifier to diff to. This can be a revid, or the special values 'cur', 'prev' or 'next'
References: API:Properties#revisions
prefixindex
This returns an array of hashrefs containing page titles that start with the given $prefix. The hashref has keys 'title' and 'redirect' (present if the page is a redirect, not present otherwise).
Additional parameters are:
One of all, redirects, or nonredirects
A single namespace number (unlike linksearch etc, which can accept an arrayref of numbers).
$options_hashref as described in "Options hashref".
my @prefix_pages = $bot->prefixindex("User:Mike.lifeguard");
# Or, the more efficient equivalent
my @prefix_pages = $bot->prefixindex("Mike.lifeguard", 2);
foreach my $hashref (@pages) {
my $title = $hashref->{'title'};
if $hashref->{'redirect'} {
print "$title is a redirect\n";
}
else {
print "$title\n is not a redirect\n";
}
}
References: API:Allpages
search
This is a simple search for your $search_term in page text. It returns an array of page titles matching.
Additional optional parameters are:
A namespace number to search in, or an arrayref of numbers (default is the main namespace)
$options_hashref is a hashref as described in "Options hashref":
my @pages = $bot->search("Mike.lifeguard", 2);
print "@pages\n";
Or, use a callback for incremental processing:
my @pages = $bot->search("Mike.lifeguard", 2, { hook => \&mysub });
sub mysub {
my ($res) = @_;
foreach my $hashref (@$res) {
my $page = $hashref->{'title'};
print "$page\n";
}
}
References: API:Search
get_log
This fetches log entries, and returns results as an array of hashes. The first parameter is a hashref with keys:
type is the log type (block, delete...)
user is the user who performed the action. Do not include the User: prefix
target is the target of the action. Where an action was performed to a page, it is the page title. Where an action was performed to a user, it is User:$username.
The second is the familiar "Options hashref".
my $log = $bot->get_log({
type => 'block',
user => 'User:Mike.lifeguard',
});
foreach my $entry (@$log) {
my $user = $entry->{'title'};
print "$user\n";
}
$bot->get_log({
type => 'block',
user => 'User:Mike.lifeguard',
},
{ hook => \&mysub, max => 10 }
);
sub mysub {
my ($res) = @_;
foreach my $hashref (@$res) {
my $title = $hashref->{'title'};
print "$title\n";
}
}
References: API:Logevents
is_g_blocked
my $is_globally_blocked = $bot->is_g_blocked('127.0.0.1');
Returns what IP/range block currently in place affects the IP/range. The return is a scalar of an IP/range if found (evaluates to true in boolean context); undef otherwise (evaluates false in boolean context). Pass in a single IP or CIDR range.
References: Extension:GlobalBlocking
was_g_blocked
print "127.0.0.1 was globally blocked\n" if $bot->was_g_blocked('127.0.0.1');
Returns whether an IP/range was ever globally blocked. You should probably call this method only when your bot is operating on Meta - this method will warn if not.
References: API:Logevents
was_locked
my $was_locked = $bot->was_locked('Mike.lifeguard');
Returns whether a user was ever locked. You should probably call this method only when your bot is operating on Meta - this method will warn if not.
References: API:Logevents
get_protection
Returns data on page protection as a array of up to two hashrefs. Each hashref has a type, level, and expiry. Levels are 'sysop' and 'autoconfirmed'; types are 'move' and 'edit'; expiry is a timestamp. Additionally, the key 'cascade' will exist if cascading protection is used.
my $page = 'Main Page';
$bot->edit({
page => $page,
text => rand(),
summary => 'test',
}) unless $bot->get_protection($page);
You can also pass an arrayref of page titles to do bulk queries:
my @pages = ('Main Page', 'User:Mike.lifeguard', 'Project:Sandbox');
my $answer = $bot->get_protection(\@pages);
foreach my $title (keys %$answer) {
my $protected = $answer->{$title};
print "$title is protected\n" if $protected;
print "$title is unprotected\n" unless $protected;
}
References: API:Properties#info
is_protected
This is a synonym for "get_protection", which should be used in preference.
This method is deprecated, and will emit deprecation warnings.
patrol
$bot->patrol($rcid);
Marks a page or revision identified by the $rcid as patrolled. To mark several RCIDs as patrolled, you may pass an arrayref of them. Returns false and sets $bot->{error}
if the account cannot patrol.
References: API:Patrol
$bot->email($user, $subject, $body);
This allows you to send emails through the wiki. All 3 of $user (without the User: prefix), $subject and $body are required. If $user is an arrayref, this will send the same email (subject and body) to all users.
References: API:Email
top_edits
Returns an array of the page titles where the $user is the latest editor. The second parameter is the familiar $options_hashref.
my @pages = $bot->top_edits("Mike.lifeguard", {max => 5});
foreach my $page (@pages) {
$bot->rollback($page, "Mike.lifeguard");
}
Note that accessing the data with a callback happens before filtering the top edits is done. For that reason, you should use "contributions" if you need to use a callback. If you use a callback with top_edits(), you will not necessarily get top edits returned. It is only safe to use a callback if you check that it is a top edit:
$bot->top_edits("Mike.lifeguard", { hook => \&rv });
sub rv {
my $data = shift;
foreach my $page (@$data) {
if (exists($page->{'top'})) {
$bot->rollback($page->{'title'}, "Mike.lifeguard");
}
}
}
References: API:Usercontribs
contributions
my @contribs = $bot->contributions($user, $namespace, $options, $from, $to);
Returns an array of hashrefs of data for the user's contributions. $namespace can be an arrayref of namespace numbers. $options can be specified as in "linksearch". $from and $to are optional timestamps. ISO 8601 date and time is recommended: 2001-01-15T14:56:00Z, see https://www.mediawiki.org/wiki/Timestamp for all possible formats. Note that $from (=ucend) has to be before $to (=ucstart), unlike direct API access.
Specify an arrayref of users to get results for multiple users.
References: API:Usercontribs
upload
$bot->upload({ data => $file_contents, summary => 'uploading file' });
$bot->upload({ file => $file_name, title => 'Target filename.png' });
Upload a file to the wiki. Specify the file by either giving the filename, which will be read in, or by giving the data directly.
References: API:Upload
upload_from_url
Upload file directly from URL to the wiki. Specify URL, the new filename and summary. Summary and new filename are optional.
$bot->upload_from_url({
url => 'http://some.domain.ext/pic.png',
title => 'Target_filename.png',
summary => 'uploading new pic',
});
If on your target wiki is enabled uploading from URL, meaning $wgAllowCopyUploads
is set to true in LocalSettings.php and you have appropriate user rights, you can use this function to upload files to your wiki directly from remote server.
References: API:Upload#Uploading_from_URL
usergroups
Returns a list of the usergroups a user is in:
my @usergroups = $bot->usergroups('Mike.lifeguard');
References: API:Users
get_mw_version
Returns a hash ref with the MediaWiki version. The hash ref contains the keys major, minor, patch, and string. Returns undef on errors.
my $mw_version = $bot->get_mw_version;
# get version as string
my $mw_ver_as_string = $mw_version->{'major'} . '.' . $mw_version->{'minor'};
if(defined $mw_version->{'patch'}){
$mw_ver_as_string .= '.' . $mw_version->{'patch'};
}
# or simply
my $mw_ver_as_string = $mw_version->{'string'};
References: API:Siteinfo
Options hashref
This is passed through to the lower-level interface MediaWiki::API, and is fully documented there.
The hashref can have 3 keys:
- max
-
Specifies the maximum number of queries to retrieve data from the wiki. This is independent of the size of each query (how many items each query returns). Set to 0 to retrieve all the results.
- hook
-
Specifies a coderef to a hook function that can be used to process large lists as they come in. When this is used, your subroutine will get the raw data. This is noted in cases where it is known to be significant. For example, when using a hook with
top_edits()
, you need to check whether the edit is the top edit yourself - your subroutine gets results as they come in, and before they're filtered. - skip_encoding
-
MediaWiki's API uses UTF-8 and any 8 bit character string parameters are encoded automatically by the API call. If your parameters are already in UTF-8 this will be detected and the encoding will be skipped. If your parameters for some reason contain UTF-8 data but no UTF-8 flag is set (i.e. you did not use the
use utf8;
pragma) you should prevent re-encoding by passing an optionskip_encoding => 1
. For example:$category ="Cat\x{e9}gorie:moyen_fran\x{e7}ais"; # latin1 string $bot->get_all_pages_in_category($category); # OK $category = "Cat". pack("U", 0xe9)."gorie:moyen_fran".pack("U",0xe7)."ais"; # unicode string $bot->get_all_pages_in_category($category); # OK $category ="Cat\x{c3}\x{a9}gorie:moyen_fran\x{c3}\x{a7}ais"; # unicode data without utf-8 flag # $bot->get_all_pages_in_category($category); # NOT OK $bot->get_all_pages_in_category($category, { skip_encoding => 1 }); # OK
If you need this, it probably means you're doing something wrong. Feel free to ask for help.
ERROR HANDLING
All functions will return undef in any handled error situation. Further error data is stored in $bot->{error}->{code}
and $bot->{error}->{details}
.
Error codes are provided as constants in MediaWiki::Bot::Constants, and can also be imported through this module:
use MediaWiki::Bot qw(:constants);
AVAILABILITY
The project homepage is https://metacpan.org/module/MediaWiki::Bot.
The latest version of this module is available from the Comprehensive Perl Archive Network (CPAN). Visit http://www.perl.com/CPAN/ to find a CPAN site near you, or see https://metacpan.org/module/MediaWiki::Bot/.
SOURCE
The development version is on github at https://github.com/MediaWiki-Bot/MediaWiki-Bot and may be cloned from git://github.com/MediaWiki-Bot/MediaWiki-Bot.git
BUGS AND LIMITATIONS
You can make new bug reports, and view existing ones, through the web interface at https://github.com/MediaWiki-Bot/MediaWiki-Bot/issues.
AUTHORS
Dan Collins <dcollins@cpan.org>
Mike.lifeguard <lifeguard@cpan.org>
Alex Rowe <alex.d.rowe@gmail.com>
Oleg Alexandrov <oleg.alexandrov@gmail.com>
jmax.code <jmax.code@gmail.com>
Stefan Petrea <stefan.petrea@gmail.com>
kc2aei <kc2aei@gmail.com>
bosborne@alum.mit.edu
Brian Obio <brianobio@gmail.com>
patch and bug report contributors
COPYRIGHT AND LICENSE
This software is Copyright (c) 2021 by the MediaWiki::Bot team <perlwikibot@googlegroups.com>.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007