NAME
Regex::Common::URI::gopher -- Returns a pattern for gopher URIs.
SYNOPSIS
use Regex::Common qw /URI/;
while (<>) {
/$RE{URI}{gopher}/ and print "Contains a gopher URI.\n";
}
DESCRIPTION
$RE{URI}{gopher}{-notab}
Gopher URIs are poorly defined. RFC 1738 originally defined them, but they were later redefined in an Internet draft, which expired in June 1997.
The internet draft for gopher URIs defines them as follows:
"gopher:" "//" host [ ":" port ] "/" gopher-type selector
[ "%09" search [ "%09" gopherplus_string ]]
Unfortunately, a selector is defined in such a way that characters may be escaped using the URI escape mechanism. This includes tabs, which are %09 when escaped. Hence, the syntax cannot distinguish between a URI that has both a selector and a search part and a URI whose selector includes an escaped tab. (The text of the draft forbids tabs in the selector, though.)
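The ambiguity can be made concrete with a small sketch. It does not use the module: `split_on_escaped_tab` is a hypothetical helper and the sample string is invented for illustration. The same input reads equally well as a selector plus a search part or as a single selector containing an escaped tab:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regex::Common): split the part of a
# gopher-path that follows the gopher-type on the literal escape "%09".
sub split_on_escaped_tab {
    my ($rest) = @_;
    return split /%09/, $rest, -1;
}

# Invented sample: what follows the gopher-type in a gopher URI.
my $rest = '/find%09perl';

# Reading 1: "%09" is a separator, giving a selector and a search part.
my ($selector, $search) = split_on_escaped_tab($rest);

# Reading 2: "%09" is an escaped tab inside a single selector.
(my $single_selector = $rest) =~ s/%09/\t/g;

printf "selector=%s search=%s\n", $selector, $search;
printf "single selector has a tab: %s\n",
    $single_selector =~ /\t/ ? 'yes' : 'no';
```

Both readings are derived from the same bytes, so nothing in the syntax alone can choose between them.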
$RE{URI}{gopher} follows the defined syntax. To disallow escaped tabs in the selector and search parts, use $RE{URI}{gopher}{-notab}.
There are other differences between the text and the given syntax. According to the text, selector strings cannot have tabs, linefeeds or carriage returns in them. The text also allows the entire gopher-path (the part after the slash following the hostport) to be empty; if it is empty, the slash may be omitted as well. However, this isn't reflected in the syntax.
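The text's restriction on selectors can be checked separately from the syntax. The following is a minimal sketch, assuming the restriction is meant to apply to the selector after its %XX escapes have been decoded. Both helper names are hypothetical and not part of the module:

```perl
use strict;
use warnings;

# Hypothetical helper: decode %XX URI escapes into the characters they name.
sub decode_escapes {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $s;
}

# Hypothetical helper: returns 1 if the decoded selector contains no tab,
# linefeed or carriage return, and 0 otherwise.
sub selector_allowed_by_text {
    my ($selector) = @_;
    return decode_escapes($selector) =~ /[\t\n\r]/ ? 0 : 1;
}

# Invented sample selectors.
for my $selector ('/docs/readme', '/a%09b', '/a%0Ab') {
    printf "%-14s %s\n", $selector,
        selector_allowed_by_text($selector) ? 'allowed' : 'forbidden';
}
```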
Under {-keep}, the following are returned:
- $1: The entire URI.
- $2: The scheme.
- $3: The host (name or address).
- $4: The port (if any).
- $5: The "gopher-path", the part after the / following the host and port.
- $6: The gopher-type.
- $7: The selector. (When no {-notab} is used, this includes the search and gopherplus_string, including the separating escaped tabs.)
- $8: The search, if given. (Only when {-notab} is given.)
- $9: The gopherplus_string, if given. (Only when {-notab} is given.)
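To make the capture layout concrete, here is a hand-written, much simplified pattern whose groups are numbered the same way as the list above, in the layout described for {-notab}. It is an illustration only and not the pattern the module generates: `$part`, `$gopher_like`, the host and gopher-type character classes, and the sample URI are all assumptions.

```perl
use strict;
use warnings;

# Assumed stand-in for a selector or search part: any run of non-whitespace
# characters that does not contain the literal escape "%09".
my $part = qr/(?:(?!%09)\S)*/;

# Simplified, hypothetical pattern; groups follow the documented order.
my $gopher_like = qr{
    (                         # group 1: the entire URI
      (gopher) ://            # group 2: the scheme
      ([A-Za-z0-9.-]+)        # group 3: the host
      (?: : ([0-9]+) )?       # group 4: the port, if any
      /
      (                       # group 5: the gopher-path
        ([0-9A-Za-z+])        # group 6: the gopher-type
        ($part)               # group 7: the selector
        (?: %09 ($part)       # group 8: the search, if given
          (?: %09 ($part) )?  # group 9: the gopherplus_string, if given
        )?
      )
    )
}x;

# Invented sample URI.
my $uri    = 'gopher://gopher.example.org:70/7/search%09query%09plus';
my @fields = $uri =~ $gopher_like;

my @names = ('URI', 'scheme', 'host', 'port', 'gopher-path',
             'gopher-type', 'selector', 'search', 'gopherplus_string');
for my $i (0 .. $#names) {
    printf "\$%d %-18s %s\n", $i + 1, $names[$i], $fields[$i] // '(none)';
}
```

According to the list above, without {-notab} the module would instead report everything after the gopher-type, escaped tabs included, as the selector in $7.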
REFERENCES
- [RFC 1738] Berners-Lee, Tim, Masinter, L., McCahill, M.: Uniform Resource Locators (URL). December 1994.
- [RFC 1808] Fielding, R.: Relative Uniform Resource Locators (URL). June 1995.
- [GOPHER URL] Krishnan, Murali R., Casey, James: "A Gopher URL Format". Expired Internet draft draft-murali-url-gopher. December 1996.
SEE ALSO
Regex::Common::URI for other supported URIs.
AUTHOR
Alceu Rodrigues de Freitas Junior <glasswalk3r@yahoo.com.br>
LICENSE and COPYRIGHT
This software is copyright (c) 2024 of Alceu Rodrigues de Freitas Junior, glasswalk3r at yahoo.com.br
This file is part of the regex-common project.
regex-common is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
regex-common is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with regex-common. If not, see (http://www.gnu.org/licenses/).
The original project [Regex::Common](https://metacpan.org/pod/Regex::Common) is licensed through the MIT License, copyright (c) Damian Conway (damian@cs.monash.edu.au) and Abigail (regexp-common@abigail.be).