NAME

Playwright - Perl client for Playwright

VERSION

version 0.015

SYNOPSIS

use Playwright;

my $handle = Playwright->new();
my $browser = $handle->launch( headless => 0, type => 'chrome' );
my $page = $browser->newPage();
my $res = $page->goto('http://somewebsite.test', { waitUntil => 'networkidle' });
my $frameset = $page->mainFrame();
my $kidframes = $frameset->childFrames();

# Grab us some elements
my $body = $page->select('body');

# You can also get the innerText
my $text = $body->textContent();
$body->click();
$body->screenshot();

my $kids = $body->selectMulti('*');

DESCRIPTION

Perl interface to a lightweight node.js webserver that proxies commands runnable by Playwright. Checks and automatically installs a copy of the node dependencies in the local folder if needed.

Currently understands commands you can send to all the playwright classes defined in api.json (installed wherever your OS puts shared files for CPAN distributions).

See https://playwright.dev/versions and drill down into your relevant version (run `npm list playwright`) for what the classes do, and their usage.

All the classes mentioned there will correspond to a subclass of the Playwright namespace. For example:

# ISA Playwright
my $playwright = Playwright->new();
# ISA Playwright::Browser
my $browser = $playwright->launch(...);
# ISA Playwright::BrowserContext
my $ctx = $browser->newContext(...);
# ISA Playwright::Page
my $page = $ctx->newPage(...);
# ISA Playwright::ElementHandle
my $element = $page->select('body');

See example.pl for a more thoroughly fleshed-out display on how to use this module.

Getting Started

When using the playwright module for the first time, you may be told to install node.js libraries. It should provide you with instructions which will get you working right away.

However, depending on your node installation this may not work due to dependencies for node.js not being in the expected location. To fix this, you will need to update your NODE_PATH environment variable to point to the correct location.
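As a rough sketch of the NODE_PATH fix described above: the helper below prepends a directory to NODE_PATH without discarding entries that are already set. The name prepend_node_path is hypothetical (it is not part of this module), and using `npm root -g` to find the global node_modules directory is an assumption about where your dependencies live.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper (not part of Playwright): put $dir at the front of
# NODE_PATH, keeping any other entries and dropping duplicates of $dir.
sub prepend_node_path {
    my ($dir) = @_;
    my $sep = $Config{path_sep};    # ':' on Unix-likes, ';' on Windows
    my @existing = grep { length $_ } split /\Q$sep\E/, ( $ENV{NODE_PATH} // '' );
    $ENV{NODE_PATH} = join $sep, $dir, grep { $_ ne $dir } @existing;
    return $ENV{NODE_PATH};
}

# Assumption: the node dependencies live in npm's global node_modules directory.
# The stderr redirect assumes a POSIX-style shell.
my $npm_root = qx{npm root -g 2>/dev/null} // '';
chomp $npm_root;
prepend_node_path($npm_root) if length $npm_root;

# Load and construct Playwright only after this point, so that the server
# process it spawns inherits the updated environment.
```

Run this before calling Playwright->new(); setting the variable afterwards will not affect a server that is already running.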

Questions?

Feel free to join the Playwright Slack server: there is a dedicated #playwright-perl channel where I, the module author, await your requests. https://aka.ms/playwright-slack

Documentation for Playwright Subclasses

The documentation and names for the subclasses of Playwright follow the spec strictly:

Playwright::BrowserContext => https://playwright.dev/docs/api/class-browsercontext
Playwright::Page => https://playwright.dev/docs/api/class-page
Playwright::ElementHandle => https://playwright.dev/docs/api/class-elementhandle

...And so on. These classes are automatically generated during module build based on the spec hash built by playwright. See generate_api_json.sh and generate_perl_modules.pl if you are interested in how this sausage is made.

You can check what methods are installed for each subclass by doing the following:

use Data::Dumper;
print Dumper($instance->{spec});

There are two major exceptions in how things work versus the upstream Playwright documentation, detailed below in the Selectors and Scripts sections.

Selectors

The selector functions have to be renamed, because their upstream names start with $, which Perl reserves as the scalar sigil. The renamed functions are as follows:

$ => select
$$ => selectMulti
$eval => eval
$$eval => evalMulti

These functions are present as part of the Page, Frame and ElementHandle classes.
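A minimal sketch of how the renamed select() and selectMulti() calls might be wrapped for everyday use. The helper names texts_for and first_text_for are hypothetical. The sketch assumes that selectMulti() returns an array reference (as the SYNOPSIS suggests) and that select() returns a false value when nothing matches; check both against your installed version.

```perl
use strict;
use warnings;

# Hypothetical helper: text content of every element under $root matching $selector.
# $root is assumed to be a Playwright::Page, Playwright::Frame or
# Playwright::ElementHandle, since those classes carry the selector functions.
sub texts_for {
    my ( $root, $selector ) = @_;
    my $found = $root->selectMulti($selector);    # Perl name for the upstream $$ function
    return map { $_->textContent() } @{ $found // [] };
}

# Hypothetical helper: text content of the first match, or undef if there is none.
sub first_text_for {
    my ( $root, $selector ) = @_;
    my $element = $root->select($selector);       # Perl name for the upstream $ function
    return $element ? $element->textContent() : undef;
}
```

With a $page from newPage(), `my @links = texts_for( $page, 'a' );` would then gather the text of every anchor on the page.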

Scripts

The evaluate() and evaluateHandle() functions can only be run in string mode. To maximize the usefulness of these, I have wrapped the string passed with the following function:

const fun = new Function (toEval);
args = [
    fun,
    ...args
];

As such you can effectively treat the script string as a function body. The same restriction on only being able to pass one arg remains from the upstream: https://playwright.dev/docs/api/class-page#pageevalselector-pagefunction-arg

You will have to refer to the arguments array as described here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/arguments

You can also pass Playwright::ElementHandle objects as returned by the select() and selectMulti() routines. They will be correctly translated into DOM nodes, as you would get from the querySelector() JavaScript functions.

Calling evaluate() and evaluateHandle() on Playwright::ElementHandle objects will automatically pass the DOM node as the first argument to your script. See below for an example of doing this.

example of evaluate()

# Read the console
$page->on('console',"return [...arguments]");

my $promise = $page->waitForEvent('console');
#TODO This request can race, the server framework I use to host the playwright spec is *not* FIFO (YET)
sleep 1;
$page->evaluate("console.log('hug')");
my $console_log = $handle->await( $promise );

print "Logged to console: '".$console_log->text()."'\n";

# Convenient usage of evaluate on ElementHandles
# We pass the element itself as the first argument to the JS arguments array for you
$element->evaluate('arguments[0].style.backgroundColor = "#FF0000"; return 1;');

Asynchronous operations

The waitFor* methods defined on various classes fork and exec, waiting on the promise to complete. You will need to wait on the result of the backgrounded action with the await() method documented below.

# Assuming $handle is a Playwright object
my $async = $page->waitForEvent('console');
$page->evaluate('console.log("whee")');
my $result = $handle->await( $async );
my $logged = $result->text();
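The wait, trigger, await sequence above repeats for every event, so it can be folded into one routine. The following is only a sketch: await_event is a hypothetical helper, not something this module provides, and it uses nothing beyond the waitForEvent() and await() calls already shown.

```perl
use strict;
use warnings;

# Hypothetical helper: begin waiting for $event on $page, run $trigger,
# then block until the backgrounded wait finishes and return its result.
# $handle is assumed to be the Playwright object, $page a Playwright::Page.
sub await_event {
    my ( $handle, $page, $event, $trigger ) = @_;
    my $promise = $page->waitForEvent($event);    # starts the backgrounded wait
    $trigger->();                                 # action expected to fire the event
    return $handle->await($promise);              # blocks until the wait completes
}
```

Usage would look like `my $msg = await_event( $handle, $page, 'console', sub { $page->evaluate('console.log("whee")') } );`. Starting the wait before the trigger matters, because an event fired first would never be seen by the waiter.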

Getting Object parents

Some things, like elements, are naturally children of the pages in which they are found. This can get confusing when you are using multiple pages, especially if you let the reference to the page go out of scope. Don't worry though; you can access the parent attribute on most Playwright::* objects:

# Assuming $element is a Playwright::ElementHandle
my $page = $element->{parent};
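If the immediate parent is not the object you want, the parent links can be followed upward. A minimal sketch, assuming every object in the chain is a blessed hash with a parent key; find_ancestor is a hypothetical name, and whether intermediate objects such as frames appear in the chain is also an assumption.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: follow {parent} links upward from $obj and return the
# first ancestor that isa $class. Returns nothing (undef in scalar context)
# if the chain ends before a match is found.
sub find_ancestor {
    my ( $obj, $class ) = @_;
    my $current = $obj->{parent};
    while ( blessed($current) ) {
        return $current if $current->isa($class);
        $current = $current->{parent};
    }
    return;
}
```

For example, `my $page = find_ancestor( $element, 'Playwright::Page' );` would locate the owning page even when $element was selected from another element.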

Firefox Specific concerns

By default, Firefox will open PDFs in a pdf.js window. To suppress this behavior (for example, when you are await()ing a download event), you will have to pass this option to launch():

use JSON;    # provides JSON::true

# Assuming $handle is a Playwright object
my $browser = $handle->launch( type => 'firefox', firefoxUserPrefs => { 'pdfjs.disabled' => JSON::true } );

Leaving browsers alive for manual debugging

Passing the cleanup => 0 parameter to new() will prevent DESTROY() from cleaning up the playwright server when a playwright object goes out of scope.

Be aware that this will prevent debug => 1 from printing extra messages from playwright_server itself, as we redirect the output streams in this case so as not to fill your current session with prints later.

A convenience script, `reap_playwright_servers`, has been provided to clean up these orphaned instances. It will kill all extant `playwright_server` processes.

Taking videos, Making Downloads

We spawn browsers via BrowserType.launchServer() and then connect to them over websocket. This means you can't just set paths up front and have videos recorded; the Video.path() method will throw. Instead, you will need to call the Video.saveAs() method after closing a page to record video:

# Do stuff
...
# Save video
my $video = $page->video;
$page->close();
$video->saveAs('video/example.webm');

It's a similar story with Download classes:

# Do stuff
...
# Wait on Download
my $promise = $page->waitForEvent('download');
# Do some thing triggering a download
...

my $download = $handle->await( $promise );
$download->saveAs('somefile.extension');

Remember when doing an await() with playwright-perl you are waiting on a remote process on a server to complete, which can time out. You may wish to spawn a subprocess using a different tool to download very large files. If this is not an option, consider increasing the timeout on the LWP object used by the Playwright object (it's the 'ua' member of the class).
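A sketch of the timeout adjustment mentioned above. It assumes the 'ua' member is an LWP::UserAgent, whose timeout() accessor reads the current value when called without arguments and sets it when given a number of seconds. The helper name extend_ua_timeout is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: raise the HTTP timeout on the handle's user agent
# and return the previous value so that it can be restored afterwards.
# Assumes $handle->{ua} is an LWP::UserAgent or offers a compatible timeout().
sub extend_ua_timeout {
    my ( $handle, $seconds ) = @_;
    my $ua = $handle->{ua} or die "No 'ua' member found on handle\n";
    my $previous = $ua->timeout();    # read the current timeout
    $ua->timeout($seconds);           # set the new one
    return $previous;
}
```

Call it before the await() that covers the large download, for example `my $old = extend_ua_timeout( $handle, 600 );`, and pass $old back in afterwards to restore the original value.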

INSTALLATION NOTE

If you install this module from CPAN, you will likely encounter a croak() telling you to install node module dependencies. Follow the instructions and things should be just fine.

If they aren't, please file a bug!

CONSTRUCTOR

new(HASH) = (Playwright)

Creates a new browser and returns a handle to interact with it.

INPUT

debug (BOOL) : Print extra messages from the Playwright server process. Default: false
timeout (INTEGER) : Seconds to wait for the playwright server to spin up and down.  Default: 30s
cleanup (BOOL) : Whether or not to clean up the playwright server when this object goes out of scope.  Default: true

METHODS

launch(HASH) = Playwright::Browser

The argument hash here is essentially the set of options you would pass to browserType.launch(). See: https://playwright.dev/docs/api/class-browsertype#browsertypelaunchoptions

There is an additional "special" argument, that of 'type', which is used to specify what type of browser to use, e.g. 'firefox'.

server (HASH) = MIXED

Call Playwright::BrowserServer methods on the server which launched your browser object.

Parameters:

browser : The Browser object you wish to call a server method upon.
command : The BrowserServer method you wish to call

The most common use for this is to get the PID of the underlying browser process:

my $browser = $playwright->launch( type => 'chrome' );
my $process = $playwright->server( browser => $browser, command => 'process' );
print "Browser process PID: $process->{pid}\n";

BrowserServer methods (at the time of writing) take no arguments, so they are not processed.

await (HASH) = Object

Waits for an asynchronous operation returned by the waitFor* methods to complete and returns the value.

quit, DESTROY

Terminate the browser session and wait for the Playwright server to terminate.

Automatically called when the Playwright object goes out of scope.

BUGS

Please report any bugs or feature requests on the bugtracker website https://github.com/teodesian/playwright-perl/issues

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

AUTHORS

Current Maintainers:

  • George S. Baugh <teodesian@gmail.com>

COPYRIGHT AND LICENSE

Copyright (c) 2020 Troglodyne LLC

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.