NAME
AnyEvent::WebDriver - control browsers using the W3C WebDriver protocol
SYNOPSIS
# start geckodriver or any other w3c-compatible webdriver via the shell
$ geckodriver -b myfirefox/firefox --log trace --port 4444
# then use it
use AnyEvent::WebDriver;
# create a new webdriver object
my $wd = new AnyEvent::WebDriver;
# create a new session with default capabilities.
$wd->new_session ({});
$wd->navigate_to ("https://duckduckgo.com/html");
my $searchbox = $wd->find_element ("css selector" => 'input[type="text"]');
$wd->element_send_keys ($searchbox => "free software");
$wd->element_click ($wd->find_element ("css selector" => 'input[type="submit"]'));
sleep 10;
DESCRIPTION
This module aims to implement the W3C WebDriver specification, which is the standardised equivalent of the Selenium WebDriver API, which in turn aims at remotely controlling web browsers such as Firefox or Chromium.
At the time of this writing, the specification was only available as a draft document, so changes are to be expected. Also, only geckodriver implemented it, or at least most of it.
To make the most of this module, or, in fact, to make any reasonable use of it, you need to refer to the W3C WebDriver document, which can be found here:
https://w3c.github.io/webdriver/
CREATING WEBDRIVER OBJECTS
- new AnyEvent::WebDriver key => value...
-
Create a new WebDriver object. Example for a remote WebDriver connection (the only type supported at the moment):
my $wd = new AnyEvent::WebDriver endpoint => "http://localhost:4444";
Supported keys are:
- endpoint => $string
-
For remote connections, the endpoint to connect to (defaults to http://localhost:4444).
- proxy => $proxyspec
-
The proxy to use (same as the proxy argument used by AnyEvent::HTTP). The default is undef, which disables proxies. To use the system-provided proxy (e.g. the http_proxy environment variable), specify a value of default.
- autodelete => $boolean
-
If true (the default), then automatically execute delete_session when the WebDriver object is destroyed with an active session. If set to a false value, then the session will continue to exist.
- timeout => $seconds
-
The HTTP timeout, in (fractional) seconds (default: 300, but this default will likely be reduced drastically in the future). This timeout is reset on any activity, so it is not an overall request timeout. Also, individual requests might extend this timeout if they are known to take longer.
SIMPLIFIED API
This section documents the simplified API, which is really just a very thin wrapper around the WebDriver protocol commands. They all block (using AnyEvent condvars) the caller until the result is available, so must not be called from an event loop callback - see "EVENT BASED API" for an alternative.
The method names are pretty much taken directly from the W3C WebDriver specification, e.g. the request documented in the "Get All Cookies" section is implemented via the get_all_cookies method.
The order is the same as in the WebDriver draft at the time of this writing, and only minimal massaging is done to request parameters and results.
SESSIONS
- $wd->new_session ({ key => value... })
-
Try to connect to the WebDriver and initialize a new session with a "new session" command, passing the given key-value pairs as value (e.g. capabilities).
Session-dependent methods must not be called before this function returns successfully, and only one session can be created per WebDriver object.
On success, $wd->{sid} is set to the session ID, and $wd->{capabilities} is set to the returned capabilities.
my $wd = new AnyEvent::WebDriver endpoint => "http://localhost:4545";
$wd->new_session ({
   capabilities => {
      pageLoadStrategy => "normal",
   },
});
- $wd->delete_session
-
Deletes the session - the WebDriver object must not be used after this call.
- $timeouts = $wd->get_timeouts
-
Get the current timeouts, e.g.:
my $timeouts = $wd->get_timeouts;
=> { implicit => 0, pageLoad => 300000, script => 30000 }
- $wd->set_timeouts ($timeouts)
-
Sets one or more timeouts, e.g.:
$wd->set_timeouts ({ script => 60000 });
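Since get_timeouts and set_timeouts use the same hash layout, a temporary override can be sketched as follows. The helper name with_timeouts is hypothetical and not part of this module; it only assumes that $wd offers the two methods documented above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::WebDriver): run $code with
# temporarily changed timeouts, then restore the previous values.
sub with_timeouts {
   my ($wd, $new, $code) = @_;

   my $old = $wd->get_timeouts;   # e.g. { implicit => 0, pageLoad => 300000, script => 30000 }
   $wd->set_timeouts ($new);

   # run the code, remembering any exception so we can restore first
   my @res = eval { $code->() };
   my $err = $@;

   # restore only the keys that were overridden
   $wd->set_timeouts ({ map { ($_ => $old->{$_}) } keys %$new });

   die $err if $err;
   wantarray ? @res : $res[0]
}
```

It could be used as with_timeouts ($wd, { script => 60000 }, sub { ... }), with the slow script call placed inside the code reference.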
NAVIGATION
- $wd->navigate_to ($url)
-
Navigates to the specified URL.
- $url = $wd->get_current_url
-
Queries the current page URL as set by navigate_to.
- $wd->back
-
The equivalent of pressing "back" in the browser.
- $wd->forward
-
The equivalent of pressing "forward" in the browser.
- $wd->refresh
-
The equivalent of pressing "refresh" in the browser.
- $title = $wd->get_title
-
Returns the current document title.
COMMAND CONTEXTS
- $handle = $wd->get_window_handle
-
Returns the current window handle.
- $wd->close_window
-
Closes the current browsing context.
- $wd->switch_to_window ($handle)
-
Changes the current browsing context to the given window.
- $handles = $wd->get_window_handles
-
Return the current window handles as an array-ref of handle IDs.
- $handles = $wd->switch_to_frame ($frame)
-
Switch to the given frame identified by $frame, which must be either undef to go back to the top-level browsing context, an integer to select the nth subframe, or an element object (as e.g. returned by the element_object method).
- $handles = $wd->switch_to_parent_frame
-
Switch to the parent frame.
- $rect = $wd->get_window_rect
-
Return the current window rect, e.g.:
$rect = $wd->get_window_rect
=> { height => 1040, width => 540, x => 0, y => 0 }
- $wd->set_window_rect ($rect)
-
Sets the window rect.
- $wd->maximize_window
- $wd->minimize_window
- $wd->fullscreen_window
-
Changes the window size by either maximising, minimising or making it fullscreen. In my experience, this will time out if no window manager is running.
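The window handle methods above combine naturally. As a rough sketch, the following collects the title of every open window and then switches back to the original one. The function name window_titles is made up for this example; it uses only methods documented in this manual.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::WebDriver): return a hashref
# mapping each window handle to the title of the document shown in it.
sub window_titles {
   my ($wd) = @_;

   my $current = $wd->get_window_handle;   # remember where we started
   my %title;

   for my $handle (@{ $wd->get_window_handles }) {
      $wd->switch_to_window ($handle);
      $title{$handle} = $wd->get_title;
   }

   $wd->switch_to_window ($current);       # go back to the original window

   \%title
}
```

Note that this uses the blocking API, so it must not be called from an event loop callback.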
ELEMENT RETRIEVAL
- $element_id = $wd->find_element ($location_strategy, $selector)
-
Finds the first element specified by the given selector and returns its web element ID (the string, not the object from the protocol). Raises an error when no element was found.
$element = $wd->find_element ("css selector" => "body a");
$element = $wd->find_element ("link text" => "Click Here For Porn");
$element = $wd->find_element ("partial link text" => "orn");
$element = $wd->find_element ("tag name" => "input");
$element = $wd->find_element ("xpath" => '//input[@type="text"]');
=> e.g. "decddca8-5986-4e1d-8c93-efe952505a5f"
- $element_ids = $wd->find_elements ($location_strategy, $selector)
-
As above, but returns an arrayref of all found element IDs.
- $element_id = $wd->find_element_from_element ($element_id, $location_strategy, $selector)
-
Like find_element, but looks only inside the specified $element_id.
- $element_ids = $wd->find_elements_from_element ($element_id, $location_strategy, $selector)
-
Like find_elements, but looks only inside the specified $element_id.
my $head = $wd->find_element ("tag name" => "head");
my $links = $wd->find_elements_from_element ($head, "tag name", "link");
- $element_id = $wd->get_active_element
-
Returns the active element.
ELEMENT STATE
- $bool = $wd->is_element_selected ($element_id)
-
Returns whether the given input or option element is selected or not.
- $string = $wd->get_element_attribute ($element_id, $name)
-
Returns the value of the given attribute.
- $string = $wd->get_element_property ($element_id, $name)
-
Returns the value of the given property.
- $string = $wd->get_element_css_value ($element_id, $name)
-
Returns the computed value of the given CSS property.
- $string = $wd->get_element_text ($element_id)
-
Returns the (rendered) text content of the given element.
- $string = $wd->get_element_tag_name ($element_id)
-
Returns the tag name of the given element.
- $rect = $wd->get_element_rect ($element_id)
-
Returns the element rect(angle) of the given element.
- $bool = $wd->is_element_enabled ($element_id)
-
Returns whether the element is enabled or not.
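To illustrate how element retrieval and element state methods work together, here is a small sketch that lists all links inside a container element. The function name links_in and the shape of its return value are choices made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::WebDriver): return an arrayref
# of { text, href } hashes for every <a> element below the first element
# matching the CSS selector $css.
sub links_in {
   my ($wd, $css) = @_;

   # element IDs are plain strings, so they can be passed around freely
   my $container = $wd->find_element ("css selector" => $css);
   my $links     = $wd->find_elements_from_element ($container, "tag name", "a");

   [
      map +{
         text => $wd->get_element_text      ($_),
         href => $wd->get_element_attribute ($_, "href"),
      }, @$links
   ]
}
```

A call such as links_in ($wd, "body") issues two requests per link, so on large pages a single execute_script call might be the faster choice.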
ELEMENT INTERACTION
- $wd->element_click ($element_id)
-
Clicks the given element.
- $wd->element_clear ($element_id)
-
Clear the contents of the given element.
- $wd->element_send_keys ($element_id, $text)
-
Sends the given text as key events to the given element.
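The three interaction methods are typically used in sequence. A minimal sketch, assuming both the input field and the submit control can be located with CSS selectors (fill_and_submit is a name invented for this example):

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::WebDriver): replace the contents
# of a text field and click a submit control, both located via CSS selectors.
sub fill_and_submit {
   my ($wd, $input_css, $text, $submit_css) = @_;

   my $input = $wd->find_element ("css selector" => $input_css);

   $wd->element_clear     ($input);            # drop any pre-filled value
   $wd->element_send_keys ($input => $text);   # type the new value

   $wd->element_click ($wd->find_element ("css selector" => $submit_css));
}
```

With the page from the SYNOPSIS, this might be called as fill_and_submit ($wd, 'input[type="text"]', "free software", 'input[type="submit"]').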
DOCUMENT HANDLING
- $source = $wd->get_page_source
-
Returns the (HTML/XML) page source of the current document.
- $results = $wd->execute_script ($javascript, $args)
-
Synchronously execute the given script with given arguments and return its results ($args can be undef if no arguments are wanted/needed).
$ten = $wd->execute_script ("return arguments[0]+arguments[1]", [3, 7]);
- $results = $wd->execute_async_script ($javascript, $args)
-
Similar to execute_script, but instead of waiting for the script to return, it waits for the script to call its last argument, which is added to $args automatically.
$twenty = $wd->execute_async_script ("arguments[0](20)", undef);
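One common use of execute_script is polling for a page condition. The following is only a sketch: wait_for_js, its default timeout and its polling interval are assumptions made for this example, and it relies on the script result being usable as a Perl boolean.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper (not part of AnyEvent::WebDriver): repeatedly run $js
# until it returns a true value or $timeout seconds have passed. Returns the
# true value, or undef on timeout.
sub wait_for_js {
   my ($wd, $js, $timeout, $interval) = @_;

   $timeout  = 10  unless defined $timeout;    # seconds, arbitrary default
   $interval = 0.2 unless defined $interval;   # seconds between polls

   my $deadline = time + $timeout;

   while (1) {
      my $res = $wd->execute_script ($js, undef);
      return $res if $res;
      return undef if time >= $deadline;
      sleep $interval;
   }
}
```

For example, wait_for_js ($wd, 'return document.readyState == "complete"') would wait until the document has finished loading. As this blocks, it must not be called from an event loop callback.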
COOKIES
- $cookies = $wd->get_all_cookies
-
Returns all cookies, as an arrayref of hashrefs.
# google surely sets a lot of cookies without my consent
$wd->navigate_to ("http://google.com");
use Data::Dump;
ddx $wd->get_all_cookies;
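- $cookie = $wd->get_named_cookie ($name)
-
Returns a single cookie as a hashref.
- $wd->add_cookie ($cookie)
-
Adds the given cookie hashref.
- $wd->delete_cookie ($name)
-
Delete the named cookie.
- $wd->delete_all_cookies
-
Delete all cookies.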
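Since get_all_cookies returns an arrayref of hashrefs, it is often convenient to index the result by cookie name. A minimal sketch, assuming each cookie hash has the name and value keys defined by the WebDriver specification (cookie_values is a made-up name):

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::WebDriver): return a hashref
# mapping cookie names to cookie values for the current document.
sub cookie_values {
   my ($wd) = @_;

   # if two cookies share a name, the later one wins
   +{ map { ($_->{name} => $_->{value}) } @{ $wd->get_all_cookies } }
}
```

The other cookie attributes (path, domain, expiry and so on) are dropped here, so this is only suitable for quick lookups.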
ACTIONS
- $wd->perform_actions ($actions)
-
Perform the given actions (an arrayref of action specifications simulating user activity). For further details, read the spec.
An example to get you started:
$wd->navigate_to ("https://duckduckgo.com/html");
$wd->set_timeouts ({ implicit => 10000 });
my $input = $wd->find_element ("css selector", 'input[type="text"]');
$wd->perform_actions ([
   {
      id => "myfatfinger",
      type => "pointer",
      pointerType => "touch",
      actions => [
         { type => "pointerMove", duration => 100, origin => $wd->element_object ($input), x => 40, y => 5 },
         { type => "pointerDown", button => 1 },
         { type => "pause", duration => 40 },
         { type => "pointerUp", button => 1 },
      ],
   },
   {
      id => "mykeyboard",
      type => "key",
      actions => [
         { type => "pause" },
         { type => "pause" },
         { type => "pause" },
         { type => "pause" },
         { type => "keyDown", value => "a" },
         { type => "pause", duration => 100 },
         { type => "keyUp", value => "a" },
         { type => "pause", duration => 100 },
         { type => "keyDown", value => "b" },
         { type => "pause", duration => 100 },
         { type => "keyUp", value => "b" },
         { type => "pause", duration => 2000 },
         { type => "keyDown", value => "\x{E007}" }, # enter
         { type => "pause", duration => 100 },
         { type => "keyUp", value => "\x{E007}" }, # enter
         { type => "pause", duration => 5000 },
      ],
   },
]);
- $wd->release_actions
-
Release all keys and pointer buttons currently depressed.
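Writing key action lists by hand gets tedious quickly. As a sketch, a keyboard input source like the "mykeyboard" one in the example above could be generated from a string. The function name type_text_actions and the default pause are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::WebDriver): build one "key"
# input source for perform_actions that presses and releases every
# character of $text in turn.
sub type_text_actions {
   my ($id, $text, $pause_ms) = @_;

   $pause_ms = 50 unless defined $pause_ms;   # arbitrary default

   my @actions;

   for my $char (split //, $text) {
      push @actions,
         { type => "keyDown", value => $char },
         { type => "pause", duration => $pause_ms },
         { type => "keyUp", value => $char };
   }

   return +{ id => $id, type => "key", actions => \@actions };
}
```

The result is a single action specification, so it would be passed as $wd->perform_actions ([ type_text_actions ("mykeyboard", "ab") ]), possibly together with other input sources in the same arrayref.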
USER PROMPTS
- $wd->dismiss_alert
-
Dismiss a simple dialog, if present.
- $wd->accept_alert
-
Accept a simple dialog, if present.
- $text = $wd->get_alert_text
-
Returns the text of any simple dialog.
- $wd->send_alert_text ($text)
-
Fills in the user prompt with the given text.
SCREEN CAPTURE
- $wd->take_screenshot
-
Create a screenshot, returning it as a PNG image in a data: URL.
- $wd->take_element_screenshot ($element_id)
-
Similar to take_screenshot, but only takes a screenshot of the given element.
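To store a screenshot as a file, the data: URL has to be decoded first. The following sketch assumes the URL uses base64 encoding (that is, it looks like data:image/png;base64,...), which the text above does not spell out. The function name data_url_bytes is invented for this example.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical helper (not part of AnyEvent::WebDriver): extract the raw
# bytes from a base64-encoded data: URL. Dies if the URL has another form.
sub data_url_bytes {
   my ($url) = @_;

   my ($type, $b64) = $url =~ m{\Adata:([^;,]*);base64,(.*)\z}s
      or die "not a base64 data: URL\n";

   decode_base64 ($b64)
}
```

The decoded bytes from data_url_bytes ($wd->take_screenshot) can then be written to a file opened in binary mode (:raw).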
HELPER METHODS
- $object = AnyEvent::WebDriver->element_object ($element_id)
- $object = $wd->element_object ($element_id)
-
Encoding element IDs in data structures is done by representing them as an object with a special key and the element ID as value. This helper method does this for you.
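For illustration, the encoding described above can be written out by hand. This sketch uses the web element key that also appears in the "LOW LEVEL API" examples; the function names element_ref and element_id are made up, and in real code the element_object method should be preferred.

```perl
use strict;
use warnings;

# the key the W3C WebDriver protocol uses to mark web element references
my $ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf";

# Hypothetical stand-in for what element_object is described as doing:
# wrap an element ID string into a protocol-level element object.
sub element_ref {
   my ($element_id) = @_;

   +{ $ELEMENT_KEY => $element_id }
}

# The reverse direction: extract the ID string from an element object.
sub element_id {
   my ($object) = @_;

   $object->{$ELEMENT_KEY}
}
```

Such objects are needed wherever an element is embedded in a larger data structure, as with the origin key in the perform_actions example.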
EVENT BASED API
This module wouldn't be a good AnyEvent citizen if it didn't have a true event-based API.
In fact, the simplified API, as documented above, is emulated via the event-based API and an AUTOLOAD function that automatically provides blocking wrappers around the callback-based API.
Every method documented in the "SIMPLIFIED API" section has an equivalent event-based method that is formed by appending an underscore (_) to the method name, and appending a callback to the argument list (mnemonic: the underscore indicates that the action is not yet finished after the call returns).
For example, instead of blocking calls to new_session, navigate_to and back, you can make callback-based ones:
my $cv = AE::cv;
$wd->new_session_ ({}, sub {
   my ($status, $value) = @_;
   die "error $value->{error}" if $status ne "200";
   $wd->navigate_to_ ("http://www.nethype.de", sub {
      $wd->back_ (sub {
         print "all done\n";
         $cv->send;
      });
   });
});
$cv->recv;
While the blocking methods croak on errors, the callback-based ones all pass two values to the callback, $status and $res, where $status is the HTTP status code (200 for successful requests, typically 4xx or 5xx for errors), and $res is the value of the value key in the JSON response object.
Other than that, the underscore variants and the blocking variants are identical.
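Because every callback receives the same ($status, $value) pair, the status check can be factored out. The following is a sketch only: on_result is a hypothetical name, and it assumes that error responses carry an error key as in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AnyEvent::WebDriver): build a callback
# for the underscore methods that calls $ok->($value) on HTTP status 200
# and $fail->($status, $message, $value) otherwise.
sub on_result {
   my ($ok, $fail) = @_;

   sub {
      my ($status, $value) = @_;

      if ($status eq "200") {
         $ok->($value);
      } else {
         # error values are assumed to be hashes with an "error" key
         my $message = ref ($value) eq "HASH" && defined $value->{error}
            ? $value->{error}
            : "unknown error";

         $fail->($status, $message, $value);
      }
   }
}
```

It could be used as $wd->get_title_ (on_result (sub { print "title: $_[0]\n" }, sub { warn "failed: $_[1]\n" })), which avoids calling die inside an event loop callback.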
LOW LEVEL API
All the simplified API methods are very thin wrappers around WebDriver commands of the same name. They are all implemented in terms of the low-level methods (req, get, post and delete), which exist in blocking and callback-based variants (req_, get_, post_ and delete_).
Examples are after the function descriptions.
- $wd->req_ ($method, $uri, $body, $cb->($status, $value))
- $value = $wd->req ($method, $uri, $body)
-
Appends the $uri to the endpoint/session/{sessionid}/ URL and makes an HTTP $method request (GET, POST etc.). POST requests can provide a UTF-8-encoded JSON text as HTTP request body, or the empty string to indicate no body is used.
For the callback version, the callback gets passed the HTTP status code (200 for every successful request), and the value of the value key in the JSON response object as second argument.
- $wd->get_ ($uri, $cb->($status, $value))
- $value = $wd->get ($uri)
-
Simply a call to req_ with $method set to GET and an empty body.
- $wd->post_ ($uri, $data, $cb->($status, $value))
- $value = $wd->post ($uri, $data)
-
Simply a call to req_ with $method set to POST - if $data is undef, then an empty object is sent, otherwise, $data must be a valid request object, which gets encoded into JSON for you.
- $wd->delete_ ($uri, $cb->($status, $value))
- $value = $wd->delete ($uri)
-
Simply a call to req_ with $method set to DELETE and an empty body.
Example: implement get_all_cookies, which is a simple GET request without any parameters:
$cookies = $wd->get ("cookie");
Example: implement execute_script, which needs some parameters:
$results = $wd->post ("execute/sync" => { script => "$javascript", args => [] });
Example: call find_elements to find all IMG elements, stripping the returned element objects to only return the element ID strings:
my $elems = $wd->post (elements => { using => "css selector", value => "img" });
# yes, the W3C found an interesting way around the typelessness of JSON
$_ = $_->{"element-6066-11e4-a52e-4f735466cecf"}
for @$elems;
HISTORY
This module was unintentionally created (it started inside some quickly hacked-together script) simply because I couldn't get the existing Selenium::Remote::Driver module to work, ever, despite multiple attempts over the years and trying to report multiple bugs, which have been completely ignored. It's also not event-based, so, yeah...
AUTHOR
Marc Lehmann <schmorp@schmorp.de>
http://anyevent.schmorp.de