NAME

WWW::Mechanize::Chrome::URLBlacklist - blacklist URLs from fetching

SYNOPSIS

use WWW::Mechanize::Chrome;
use WWW::Mechanize::Chrome::URLBlacklist;

my $mech = WWW::Mechanize::Chrome->new();
my $bl = WWW::Mechanize::Chrome::URLBlacklist->new(
    blacklist => [
        qr!\bgoogleadservices\b!,
    ],
    whitelist => [
        qr!\bcorion\.net\b!,
    ],

    # fail all unknown URLs
    default => 'failRequest',
    # allow all unknown URLs
    # default => 'continueRequest',

    on_default => sub {
        warn "Ignored URL $_[0] (action was '$_[1]')";
    },
);
$bl->enable($mech);

DESCRIPTION

This module provides a simple way to whitelist and blacklist URLs so that Chrome does not make requests to blacklisted URLs.

ATTRIBUTES

whitelist

Arrayref containing regular expressions of URLs to always allow fetching.

blacklist

Arrayref containing regular expressions of URLs to always deny fetching unless they are matched by something in the whitelist.

default

default => 'continueRequest'

The action to take if a URL matches neither the whitelist nor the blacklist. The default is continueRequest. If you want to block all unknown URLs, use failRequest.
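The precedence between the three attributes can be illustrated with a small standalone sketch. The helper classify_url below is hypothetical and is not the module's own code; it only mirrors the order described above (whitelist first, then blacklist, then the default action), and the sample URLs are made up.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the documented precedence.
# This is NOT part of WWW::Mechanize::Chrome::URLBlacklist.
sub classify_url {
    my( $url, %rules ) = @_;
    my $default = $rules{default} || 'continueRequest';

    # A whitelist match always allows the request
    return 'continueRequest'
        if grep { $url =~ $_ } @{ $rules{whitelist} || [] };

    # A blacklist match denies the request (the whitelist did not match)
    return 'failRequest'
        if grep { $url =~ $_ } @{ $rules{blacklist} || [] };

    # Anything else gets the default action
    return $default;
}

my %rules = (
    whitelist => [ qr!\bcorion\.net\b! ],
    blacklist => [ qr!\bgoogleadservices\b!, qr!\.png$! ],
    default   => 'failRequest',
);

for my $url (
    'https://corion.net/logo.png',
    'https://www.googleadservices.com/pagead/conversion.js',
    'https://example.com/',
) {
    printf "%-55s %s\n", $url, classify_url( $url, %rules );
}
```

Note that the first URL ends in .png and so matches the blacklist, but it is still allowed because the whitelist is consulted first.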

on_default

on_default => sub {
    my( $url, $action ) = @_;
    warn "Unknown URL <$url>";
};

This callback is invoked for every URL that matches neither the whitelist nor the blacklist. This is useful for finding out which URLs still need to be categorized.
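One possible use of the callback is to tally uncategorized URLs by host instead of warning about each one. The sketch below is standalone: it calls the callback directly with made-up URLs to simulate what the blacklist would do, and the %unknown_hosts hash and the host-extraction regex are illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;

my %unknown_hosts;

# Callback with the documented signature: ( $url, $action )
my $on_default = sub {
    my( $url, $action ) = @_;
    # Crude host extraction, good enough for a quick report
    my( $host ) = $url =~ m!^\w+://([^/:]+)!;
    $unknown_hosts{ defined $host ? $host : $url }++;
};

# Simulated invocations. In real use, pass the callback to the constructor:
#     WWW::Mechanize::Chrome::URLBlacklist->new( on_default => $on_default, ... )
$on_default->( 'https://cdn.example.com/app.js',   'continueRequest' );
$on_default->( 'https://cdn.example.com/app.css',  'continueRequest' );
$on_default->( 'https://fonts.example.org/a.woff', 'continueRequest' );

# Report the hosts that still need a whitelist or blacklist entry
for my $host ( sort keys %unknown_hosts ) {
    print "$host: $unknown_hosts{$host}\n";
}
```

Grouping by host keeps the report short on pages that load many resources from the same few domains.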

_mech

(internal) The WWW::Mechanize::Chrome instance we are connected to.

_request_listener

(internal) The request listener created by WWW::Mechanize::Chrome while listening for URL messages.

METHODS

->new

my $bl = WWW::Mechanize::Chrome::URLBlacklist->new(
    blacklist => [
        qr!\bgoogleadservices\b!,
        qr!\bioam\.de\b!,
        qr!\burchin\.js$!,
        qr!.*\.(?:woff|ttf)$!,
        qr!.*\.css(\?\w+)?$!,
        qr!.*\.png$!,
        qr!.*\bfavicon\.ico$!,
    ],
);
$bl->enable( $mech );

Creates a new instance of a blacklist, but does not activate it yet. See ->enable for that.

->enable

$bl->enable( $mech );

Attaches the blacklist to a WWW::Mechanize::Chrome object.

->disable

$bl->disable( $mech );

Removes the blacklist from a WWW::Mechanize::Chrome object.

REPOSITORY

The public repository of this module is https://github.com/Corion/www-mechanize-chrome.

SUPPORT

The public support forum of this module is https://perlmonks.org/.

TALKS

I've given a German talk at GPW 2017, see http://act.yapc.eu/gpw2017/talk/7027 and https://corion.net/talks for the slides.

At The Perl Conference 2017 in Amsterdam, I also presented a talk, see http://act.perlconference.org/tpc-2017-amsterdam/talk/7022. The slides for the English presentation at TPCiA 2017 are at https://corion.net/talks/WWW-Mechanize-Chrome/www-mechanize-chrome.en.html.

BUG TRACKER

Please report bugs in this module via the GitHub bug queue at https://github.com/Corion/WWW-Mechanize-Chrome/issues

AUTHOR

Max Maischein corion@cpan.org

COPYRIGHT (c)

Copyright 2010-2024 by Max Maischein corion@cpan.org.

LICENSE

This module is released under the same terms as Perl itself.