NAME
Net::PublicSuffixList - The Mozilla Public Suffix List
SYNOPSIS
use Net::PublicSuffixList;
my $psl = Net::PublicSuffixList->new;
my $host = 'amazon.co.uk';
# get all the suffixes in host (like, uk and co.uk)
my $suffixes = $psl->suffixes_in( $host );
# get the longest suffix
my $suffix = $psl->longest_suffix_in( $host );
# split the host into its host, suffix, and short parts
my $hash = $psl->split_host( $host );
DESCRIPTION
I mostly wrote this because I was working on App::url and needed a way to figure out which part of a URL was the registered part and which was the top-level domain.
The Public Suffix List is essentially a self-reported collection of the top-level, generic, country code, or whatever domains.
There are other modules that try to do this, but they come with packaged (old) versions of the Public Suffix List or have limited functionality.
This module can fetch the most current one for you, use one that you provide locally, or even let you completely make it up. You can add entries that you want but that don't show up in the list, and remove ones you don't think should be there.
- new
-
Create the new object and specify how you'd like to get the data. The network file is about 220 KB, so you might want to fetch it once, store it, and then use local_path to use it.
The constructor first tries to use a local file. If you've disabled that with no_local or the file doesn't exist, it moves on to trying the network. If you've disabled the network with no_net, then it complains but still returns the object. You can still construct your own list with add_suffix.
Possible keys:
    list_url    # the URL for the suffix list
    local_path  # the path to a local file that has the suffix list
    no_net      # do not use the network
    no_local    # do not use a local file
    cache_dir   # location to save the fetched file
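As a minimal sketch of the local-file route described for the constructor, the following writes a tiny hand-made list to a temporary file and points local_path at it. The two-entry file and the temporary path are assumptions for illustration; a real program would normally use a saved copy of the full list.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Net::PublicSuffixList;

# Hypothetical stand-in for a saved copy of the suffix list.
my( $fh, $path ) = tempfile( UNLINK => 1 );
print {$fh} "uk\nco.uk\n";
close $fh or die "Could not close $path: $!";

# Read only from the local file and never fall back to the network.
my $psl = Net::PublicSuffixList->new(
    local_path => $path,
    no_net     => 1,
);

print "co.uk is a known suffix\n" if $psl->suffix_exists( 'co.uk' );
```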
- defaults
-
A hash of the default values for everything.
- parse_list( STRING_REF )
-
Take a scalar reference to the contents of the public suffix list, find all the suffixes, and add them to the object.
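A short sketch of calling parse_list directly, assuming you already have the list text in memory. The two-entry text is made up for illustration, and the object is built with both sources disabled so that nothing else is loaded first.

```perl
use strict;
use warnings;
use Net::PublicSuffixList;

# With both sources disabled the constructor complains (expect a warning)
# but still returns the object.
my $psl = Net::PublicSuffixList->new( no_net => 1, no_local => 1 );

# Hypothetical list contents; parse_list wants a reference to the text.
my $text = "com\nco.uk\n";
$psl->parse_list( \$text );

print "Parsed co.uk\n" if $psl->suffix_exists( 'co.uk' );
```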
- add_suffix( STRING )
-
Add STRING to the known public suffixes. This returns the object itself.
Before this adds the suffix, it strips off leading * and .* characters. Some sources specify *.foo.bar, but this adds foo.bar.
- remove_suffix( STRING )
-
Remove STRING from the known public suffixes. This returns the object itself.
- suffix_exists( STRING )
-
Return the invocant if the suffix exists, and the empty list otherwise.
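Because add_suffix and remove_suffix return the object, calls can be chained. The sketch below builds a list by hand; the suffixes are arbitrary examples, and the object is created with both sources disabled (which makes the constructor complain) so that only the hand-added entries matter.

```perl
use strict;
use warnings;
use Net::PublicSuffixList;

# Start without a local file or the network; expect a warning here.
my $psl = Net::PublicSuffixList->new( no_net => 1, no_local => 1 );

# Chain the calls. The wildcard form is stored as foo.bar.
$psl->add_suffix( 'uk' )->add_suffix( '*.foo.bar' );
print "foo.bar exists\n" if $psl->suffix_exists( 'foo.bar' );

# suffix_exists returns the empty list for unknown suffixes,
# which is false in boolean context.
$psl->remove_suffix( 'uk' );
print "uk is gone\n" unless $psl->suffix_exists( 'uk' );
```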
- suffixes_in( HOST )
-
Return an array reference of the public suffixes in HOST, sorted from shortest to longest.
- longest_suffix_in( HOST )
-
Return the longest public suffix in HOST.
- split_host( HOST )
-
Returns a hash reference with these keys:
    host    the input value
    suffix  the longest public suffix
    short   the input value with the public suffix (and leading dot) removed
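The three lookup methods can be tried together with a hand-built list. This is only a sketch: the list contains just uk and co.uk, added by hand, so the results reflect that small list rather than the real Public Suffix List.

```perl
use strict;
use warnings;
use Net::PublicSuffixList;

# Build a tiny list by hand; the constructor warns because both
# sources are disabled.
my $psl = Net::PublicSuffixList->new( no_net => 1, no_local => 1 );
$psl->add_suffix( $_ ) for qw(uk co.uk);

my $host = 'amazon.co.uk';

# Array reference, shortest suffix first.
my $suffixes = $psl->suffixes_in( $host );
print "suffixes: @$suffixes\n";

# Just the longest match.
my $longest = $psl->longest_suffix_in( $host );
print "longest: $longest\n";

# Hash reference with the host, suffix, and short keys.
my $parts = $psl->split_host( $host );
printf "%-6s %s\n", $_, $parts->{$_} for qw(host suffix short);
```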
- fetch_list_from_local
-
Fetch the public suffix list plaintext file from the path returned by local_path. Returns a scalar reference to the text of the raw UTF-8 octets.
- fetch_list_from_net
-
Fetch the public suffix list plaintext file from the URL returned by url. Returns a scalar reference to the text of the raw UTF-8 octets.
If you've set cache_dir in the object, this method attempts to cache the response in that directory using default_local_file as the filename. This cache is different than local_file although you can use it as local_file.
- url
-
Return the configured URL for the public suffix list.
- default_url
-
Return the default URL for the public suffix list.
- local_path
-
Return the configured local path for the public suffix list.
- default_local_path
-
Return the default local path for the public suffix list.
- local_file
-
Return the configured filename for the public suffix list.
- default_local_file
-
Return the default filename for the public suffix list.
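The fetch-once-then-reuse workflow suggested for the constructor might be wrapped up as follows. This is a sketch under assumptions: fetch_and_cache and load_cached are hypothetical helper names, the cache directory is created up front in case the module does not create it, and the cached file is assumed to land at cache_dir joined with default_local_file, as the fetch_list_from_net description suggests.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use Net::PublicSuffixList;

# Hypothetical helper: fetch the list over the network and let the module
# cache it in $cache_dir. Returns the object and the expected cache path.
sub fetch_and_cache {
    my( $cache_dir ) = @_;
    make_path( $cache_dir );   # assumption: create the directory ourselves

    my $psl = Net::PublicSuffixList->new(
        cache_dir => $cache_dir,
        no_local  => 1,        # skip any local file and go to the network
    );

    my $cached = File::Spec->catfile( $cache_dir, $psl->default_local_file );
    return ( $psl, $cached );
}

# Hypothetical helper: reuse a previously cached file without the network.
sub load_cached {
    my( $path ) = @_;
    return Net::PublicSuffixList->new( local_path => $path, no_net => 1 );
}
```

A program might call fetch_and_cache once and hand the returned path to load_cached on later runs. Nothing in this block touches the network until fetch_and_cache is actually called.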
TO DO
SEE ALSO
Domain::PublicSuffix, Mozilla::PublicSuffix, IO::Socket::SSL::PublicSuffix
SOURCE AVAILABILITY
This source is on GitHub:
http://github.com/briandfoy/net-publicsuffixlist
AUTHOR
brian d foy, <bdfoy@cpan.org>
COPYRIGHT AND LICENSE
Copyright © 2020-2021, brian d foy, All Rights Reserved.
You may redistribute this under the terms of the Artistic License 2.0.