NAME
WWW::Mechanize::Plugin::DOM - HTML Document Object Model plugin for Mech
VERSION
0.003 (alpha)
SYNOPSIS
  use WWW::Mechanize;

  my $m = WWW::Mechanize->new;

  $m->use_plugin('DOM',
      script_handlers => {
          default => \&script_handler,
          qr/(?:^|\/)(?:x-)?javascript/ => \&script_handler,
      },
      event_attr_handlers => {
          default => \&event_attr_handler,
          qr/(?:^|\/)(?:x-)?javascript/ => \&event_attr_handler,
      },
  );

  sub script_handler {
      my($mech, $dom_tree, $code, $url, $line, $is_inline) = @_;
      # ... code to run the script ...
  }

  sub event_attr_handler {
      my($mech, $elem, $event_name, $code, $url, $line) = @_;
      # ... code that returns a coderef ...
  }

  $m->plugin('DOM')->tree; # DOM tree for the current page
DESCRIPTION
This is a plugin for WWW::Mechanize that provides support for the HTML Document Object Model. This is a part of the WWW::Mechanize::Plugin::JavaScript distribution, but it can be used on its own.
USAGE
To enable this plugin, use Mech's use_plugin method, as shown in the synopsis.
To access the DOM tree, use $mech->plugin('DOM')->tree, which returns an HTML::DOM object.
You may provide a subroutine that runs an inline script like this:
  $mech->use_plugin('DOM',
      script_handlers => {
          qr/.../ => sub { ... },
          qr/.../ => sub { ... },
          # etc
      }
  );
And a subroutine for turning HTML event attributes into subroutines, like this:
  $mech->use_plugin('DOM',
      event_attr_handlers => {
          qr/.../ => sub { ... },
          qr/.../ => sub { ... },
          # etc
      }
  );
In both cases, each qr/.../ key should be a regular expression that matches the scripting language to which the handler applies, or the string 'default'. The scripting language will be either a MIME type or, if a script element has no type attribute, the contents of its language attribute. The 'default' handler is used if no handler matches the scripting language in question, or if there is no Content-Script-Type header and, for script_handlers, the script element has neither a type nor a language attribute.
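The lookup rules above can be illustrated with a small stand-alone sketch. It is not the plugin's actual code: pick_handler is a hypothetical helper written for this example, and the way it walks the hash is an assumption. It only shows how a hash keyed by qr/.../ patterns plus 'default' can be resolved for a given scripting language.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the plugin): choose a handler for a
# scripting language from a hash shaped like script_handlers.
sub pick_handler {
    my ($handlers, $language) = @_;

    if (defined $language) {
        for my $pattern (keys %$handlers) {
            next if $pattern eq 'default';
            # A qr// used as a hash key is stored in its string form,
            # which can still be used as a pattern.
            return $handlers->{$pattern} if $language =~ /$pattern/;
        }
    }

    # No language was given, or no pattern matched: use 'default'.
    return $handlers->{default};
}

my %handlers = (
    default                       => sub { 'default' },
    qr/(?:^|\/)(?:x-)?javascript/ => sub { 'javascript' },
    qr/(?:^|\/)vbscript/          => sub { 'vbscript' },
);

print pick_handler(\%handlers, 'text/javascript')->(), "\n";          # javascript
print pick_handler(\%handlers, 'application/x-javascript')->(), "\n"; # javascript
print pick_handler(\%handlers, 'text/tcl')->(), "\n";                 # default
print pick_handler(\%handlers, undef)->(), "\n";                      # default
```

Hash keys are unordered, so in this sketch overlapping patterns would be resolved arbitrarily; the example patterns are deliberately non-overlapping.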
Each time you move to another page with WWW::Mechanize, a different copy of the DOM plugin object is created. So, if you must refer to it in a callback routine, don't use a closure, but get it from the $mech object that is passed as the first argument.
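The difference between the two approaches can be seen in a minimal sketch. Fake::Mech below is a stand-in invented for this example, not WWW::Mechanize itself; the only behaviour it imitates is that a new plugin object is created on every page load. The page counter stored in the plugin object is likewise an assumption made for illustration.

```perl
use strict;
use warnings;

# Stand-in for WWW::Mechanize, for illustration only: every call to
# get() replaces the plugin object with a fresh one.
package Fake::Mech;
sub new    { bless { plugin => undef, pages => 0 }, shift }
sub get    { my $self = shift; $self->{plugin} = { page => ++$self->{pages} }; $self }
sub plugin { $_[0]{plugin} }

package main;

my $mech = Fake::Mech->new;
$mech->get;                               # first page

my $captured = $mech->plugin('DOM');      # plugin object of the first page

# Closes over the plugin object, so it keeps seeing the first page.
my $closure_handler = sub { $captured->{page} };

# Looks the plugin up through the $mech argument on every call.
my $safe_handler = sub { my ($mech) = @_; $mech->plugin('DOM')->{page} };

$mech->get;                               # second page

print $closure_handler->($mech), "\n";    # 1 (stale)
print $safe_handler->($mech), "\n";       # 2 (current)
```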
The line number passed to an event attribute handler requires HTML::DOM 0.012 or higher. It will be undef with lower versions.
PREREQUISITES
HTML::DOM 0.010 or later (0.012 or higher recommended)
The current stable release of WWW::Mechanize does not support plugins. See WWW::Mechanize::Plugin::JavaScript for more info.
BUGS
Event handlers like onload and onunload are not yet supported. Some events do not yet do everything they are supposed to; e.g., a link's click method does not go to the next page.
This plugin does not yet provide WWW::Mechanize with all the necessary callback routines (for extract_images, etc.).
Currently, external scripts referenced within a page are always read as Latin-1. This will be fixed.