NAME
WWW::Mechanize - automate interaction with websites
SYNOPSIS
use WWW::Mechanize;
my $agent = WWW::Mechanize->new();
$agent->get($url);
$agent->follow($link);
$agent->form($number);
$agent->field($name, $value);
$agent->click($button);
$agent->back();
$agent->add_header($name => $value);
use Test::More;
like( $agent->{content}, qr/$expected/, "Got expected content" );
DESCRIPTION
This module is intended to help you automate interaction with a website. It bears a not-very-remarkable outward resemblance to WWW::Chat, on which it is based. The main difference between this module and WWW::Chat is that WWW::Chat requires a pre-processing stage before you can run your script, whereas WWW::Mechanize does not.
WWW::Mechanize is a subclass of LWP::UserAgent, so anything you can do with an LWP::UserAgent, you can also do with this. See LWP::UserAgent for more information on the possibilities.
VERSION
Version 0.31
$Header: /home/cvs/www-mechanize/lib/WWW/Mechanize.pm,v 1.17 2002/09/13 20:16:39 alester Exp $
METHODS
new()
Creates and returns a new WWW::Mechanize object, hereafter referred to as the 'agent'.
my $agent = WWW::Mechanize->new();
$agent->get($url)
Given a URL/URI, fetches it. Returns an HTTP status code.
The results are stored internally in the agent object, as follows:
uri The current URI
req The current request object [HTTP::Request]
res The response received [HTTP::Response]
status The status code of the response
ct The content type of the response
base The base URI for current response
content The content of the response
forms Array of forms found in content [HTML::Form]
form Current form [HTML::Form]
links Array of links found in content
You can get at them with, for example: $agent->{content}
$agent->follow($string|$num)
Follow a link. If you provide a string, the first link whose text matches that string will be followed. If you provide a number, it will be the nth link on the page.
$agent->form($number)
Selects the Nth form on the page as the target for subsequent calls to field() and click(). Emits a warning and returns false if there is no such form. Forms are indexed from 1, that is to say, the first form is number 1 (not zero).
$agent->field($name, $value, $number)
Given the name of a field, set its value to the value specified. This applies to the current form (as set by the form() method or defaulting to the first form on the page).
The optional $number parameter is used to distinguish between two fields with the same name. The fields are numbered from 1.
$agent->click($button, $x, $y);
Has the effect of clicking a button on a form. The first argument is the name of the button to be clicked. The second and third arguments (optional) allow you to specify the (x,y) coordinates of the click.
If there is only one button on the form, $agent->click() with no arguments simply clicks that one button.
Returns an HTTP status code.
$agent->submit()
Shortcut for $agent->click("submit")
$agent->back();
The equivalent of hitting the "back" button in a browser. Returns to the previous page. Won't go back past the first page.
$agent->add_header($name => $value)
Sets a header for the WWW::Mechanize agent to use every time it gets a webpage. This is NOT stored in the agent object (because if it were, it would disappear if you went back() past where you'd set it) but in the hash variable %WWW::Mechanize::headers, which is a hash of all headers to be set. You can manipulate this directly if you want to; the add_header() method is just provided as a convenience function for the most common case of adding a header.
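As a rough illustration of the direct manipulation mentioned above, the sketch below touches %WWW::Mechanize::headers without going through add_header(). It assumes the header store behaves as this version of the documentation describes; the X-Example header is made up for the example, and the snippet does not load the module or send any request.

```perl
use strict;
use warnings;

# Assumption: headers live in the package hash named in the text above.
# Setting a key here is presumably what add_header() does for you.
$WWW::Mechanize::headers{'Accept-Language'} = 'en';
$WWW::Mechanize::headers{'X-Example'}       = 'demo';   # hypothetical header

# Removing a header is a plain hash delete.
delete $WWW::Mechanize::headers{'X-Example'};

# Show which headers are currently set.
print join(', ', sort keys %WWW::Mechanize::headers), "\n";
```

Because the hash is shared at package level, a change made this way would apply to every subsequent request, not to one agent object.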
extract_links()
Extracts HREF links from the content of a webpage.
The return value is a reference to an array containing an array reference for every <A> and <FRAME> tag in $self->{content}.
The array elements for the <A> tag are:
- [0]: the contents of the href attribute
- [1]: the text enclosed by the <A> tag
- [2]: the contents of the name attribute
The array elements for the <FRAME> tag are:
- [0]: the contents of the src attribute
- [1]: the contents of the name attribute
- [2]: the contents of the name attribute
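The layout described above can be walked with ordinary array dereferencing. The following is a minimal sketch: the $links data is hand-written sample data shaped like the documented return value, not the output of a real extract_links() call, and the URLs and names in it are invented.

```perl
use strict;
use warnings;

# Hypothetical sample data in the documented shape: one array reference
# per tag, each holding three elements.
my $links = [
    [ 'http://example.com/about', 'About us', 'about' ],  # <A>: href, text, name
    [ 'menu.html',                'menu',     'menu'  ],  # <FRAME>: src, name, name
];

# Build a "label -> target" line for every entry.
my @summary;
for my $link (@$links) {
    my ($target, $label) = @{$link}[0, 1];
    push @summary, "$label -> $target";
}

print "$_\n" for @summary;
```

Element [0] is always the target (href or src), so code that only needs URLs can treat both tag types the same way.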
INTERNAL METHODS
These methods are only used internally. You probably don't need to know about them.
_push_page_stack() / _pop_page_stack()
The agent keeps a stack of visited pages, which it can pop when it needs to go BACK and so on.
The current page needs to be pushed onto the stack before we get a new page, and the stack needs to be popped when BACK occurs.
Neither of these takes any arguments; they just operate on the $agent object.
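The push-before-fetch, pop-on-back idea can be sketched with a plain hash standing in for the agent. This is not the module's actual implementation: the page_stack key, the two helper subs, and the choice to save only the URI are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the agent: current URI plus a stack of earlier pages.
my $agent = { uri => 'http://example.com/one', page_stack => [] };

# Save the current page before a new one is fetched.
sub push_page_stack {
    my ($self) = @_;
    push @{ $self->{page_stack} }, { uri => $self->{uri} };
    return;
}

# Restore the most recently saved page; do nothing on the first page.
sub pop_page_stack {
    my ($self) = @_;
    my $previous = pop @{ $self->{page_stack} };
    $self->{uri} = $previous->{uri} if $previous;
    return;
}

push_page_stack($agent);                    # remember page one
$agent->{uri} = 'http://example.com/two';   # pretend we fetched page two
pop_page_stack($agent);                     # "back" restores page one
pop_page_stack($agent);                     # stack is empty, so nothing changes

print "$agent->{uri}\n";
```

The second pop mirrors the documented behaviour of back(), which won't go back past the first page.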
_do_request()
Performs a request on the $self->{req} request object, and sets a bunch of attributes on $self.
Returns an HTTP::Response object.
BUGS
Please report any bugs via the system at http://rt.cpan.org/
AUTHOR
Copyright 2002 Andy Lester <andy@petdance.com>
Released under the Artistic License. Based on Kirrily Robert's excellent WWW::Automate package.