NAME
URIDetail - test URIs using detailed URI information
SYNOPSIS
This plugin creates a new rule test type, known as "uri_detail". These rules apply to all URIs found in the message.
loadplugin Mail::SpamAssassin::Plugin::URIDetail
RULE DEFINITIONS AND PRIVILEGED SETTINGS
The format for defining a rule is as follows:
uri_detail SYMBOLIC_TEST_NAME key1 =~ /value1/i key2 !~ /value2/ ...
Supported keys are:
raw
is the raw URI prior to any cleaning (e.g. "http://spamassassin.apache%2Eorg/").
type
is the tag(s) that referenced the raw URI. "parsed" is a fake type indicating that the raw URI was parsed from the rendered text.
cleaned
is a list including the raw URI and various cleaned versions of the raw URI (e.g. "http://spamassassin.apache%2Eorg/", "http://spamassassin.apache.org/").
text
is the anchor text(s) (text between <a> and </a>) that linked to the raw URI.
domain
is the domain(s) found in the cleaned URIs, as trimmed to registrar boundary by Mail::SpamAssassin::Util::RegistrarBoundaries(3).
host
is the full host(s) in the cleaned URIs. (Supported since SA 3.4.5)
Example rule for matching a URI where the raw URI matches "%2Ebar", the domain "bar.com" is found, and the type is "a" (an anchor tag):
uri_detail TEST1 raw =~ /%2Ebar/ domain =~ /^bar\.com$/ type =~ /^a$/
Example rule to look for suspicious "https" links:
uri_detail FAKE_HTTPS text =~ /\bhttps:/ cleaned !~ /\bhttps:/
Regular expressions should be delimited by slashes.
Negating matches is supported in SpamAssassin 4.0.2 or higher by prefixing the key with an exclamation mark.
Example rule to look for links where the text contains "id.me" but the host is not "id.me":
uri_detail FAKE_ID_ME text =~ /\bid\.me\b/ !host =~ /^id\.me$/
'!key =~ ...' and 'key !~ ...' behave differently because a key can hold multiple values. '!key =~ ...' is true only if none of the values match the regex, while 'key !~ ...' is true if at least one of the values does not match the regex.
For example, consider the following data structure:
{
host => {
'id.me' => 1,
'example.com' => 1,
},
text => [
'Login to ID.me'
]
}
The rule 'host !~ /^id\.me$/' would be true, because 'example.com' does not match the regex. The rule '!host =~ /^id\.me$/' would be false, because it is the logical negation of 'host =~ /^id\.me$/' which is true because 'id.me' matches the regex.
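The two negation forms can be illustrated with a minimal sketch in Python. This is not the plugin's implementation (the plugin is written in Perl). The dictionary layout and the helper names any_match, any_non_match and none_match are assumptions made for illustration only, modelled on the data structure shown above.

```python
import re

# Hypothetical stand-in for the per-URI detail data shown above.
# The real plugin stores this in Perl structures; this layout is assumed.
uri_detail = {
    "host": ["id.me", "example.com"],
    "text": ["Login to ID.me"],
}


def any_match(values, pattern):
    """Models 'key =~ /pattern/': true if at least one value matches."""
    return any(re.search(pattern, v) is not None for v in values)


def any_non_match(values, pattern):
    """Models 'key !~ /pattern/': true if at least one value does not match."""
    return any(re.search(pattern, v) is None for v in values)


def none_match(values, pattern):
    """Models '!key =~ /pattern/': logical negation of 'key =~ /pattern/'."""
    return not any_match(values, pattern)


hosts = uri_detail["host"]
pattern = r"^id\.me$"

host_not_tilde = any_non_match(hosts, pattern)  # host !~ /^id\.me$/
negated_host = none_match(hosts, pattern)       # !host =~ /^id\.me$/

print("host !~ /^id\\.me$/  ->", host_not_tilde)
print("!host =~ /^id\\.me$/ ->", negated_host)
```

In this sketch 'host !~' succeeds as soon as one host (here 'example.com') fails the regex, whereas '!host =~' succeeds only when no host matches it. The second form is therefore the one that expresses "the host is not id.me".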
There must not be any whitespace between the key and the negation operator.