SYNOPSIS
use Regexp::Common qw( Apache2 );
use Regexp::Common::Apache2 qw( $ap_true $ap_false );
while( <> )
{
my $pos = pos( $_ );
/\G$RE{Apache2}{Word}/gmc and print "Found a word expression at pos $pos\n";
/\G$RE{Apache2}{Variable}/gmc and print "Found a variable $+{varname} at pos $pos\n";
}
# Override Apache2 expressions by the legacy ones
$RE{Apache2}{-legacy => 1}
# or use it with the Legacy prefix:
if( $str =~ /^$RE{Apache2}{LegacyVariable}$/ )
{
print( "Found variable $+{variable} with name $+{varname}\n" );
}
VERSION
v0.1.0
DESCRIPTION
This is the Perl port of Apache2 expressions.
The regular expressions have been designed based on the Apache2 Backus-Naur Form (BNF) definition, as described below in "APACHE2 EXPRESSION".
You can also use the extended pattern by calling Regexp::Common::Apache2 like:
$RE{Apache2}{-legacy => 1}
All of the regular expressions use named capture. See "%+" in perlvar for more information on named capture.
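As a minimal sketch of how named captures are read back through `%+`, the snippet below uses only core Perl. The pattern `$toy_variable` and the helper `named_captures` are hypothetical stand-ins written for this illustration; they mimic only the simple `%{varname}` form and are not the patterns shipped by the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in pattern, not the one shipped by the module.
# It only mimics the simple percent-brace variable form.
my $toy_variable = qr/(?<variable>%\{(?<varname>[A-Za-z_]\w*)\})/;

# Return a copy of the named captures, or undef when there is no match.
sub named_captures
{
    my $str = shift;
    return unless( $str =~ /^$toy_variable$/ );
    # %+ only reflects the last successful match, so copy it
    # before another regular expression gets a chance to run.
    return( +{ %+ } );
}

my $found = named_captures( '%{REQUEST_URI}' );
print( "variable=$found->{variable} varname=$found->{varname}\n" ) if( $found );
```

Copying `%+` into a plain hash reference right after the match is a defensive choice: any later successful match in the same scope would otherwise replace its contents.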
APACHE2 EXPRESSION
comp
BNF:
stringcomp
| integercomp
| unaryop word
| word binaryop word
| word "in" listfunc
| word "=~" regex
| word "!~" regex
| word "in" "{" list "}"
$RE{Apache2}{Comp}
For example:
"Jack" != "John"
123 -ne 456
# etc
This uses other expressions, namely "stringcomp", "integercomp", "word", "listfunc", "regex" and "list".
The capture names are:
comp
: Contains the entire capture block
comp_binary
: Matches the expression that uses a binary operator, such as:
==, =, !=, <, <=, >, >=, -ipmatch, -strmatch, -strcmatch, -fnmatch
comp_binaryop
: The binary op used if the expression is a binary comparison. Binary operator is:
==, =, !=, <, <=, >, >=, -ipmatch, -strmatch, -strcmatch, -fnmatch
comp_integercomp
: When the comparison is for an integer comparison as opposed to a string comparison.
comp_list
: Contains the list used to check a word against, such as:
"Jack" in {"John", "Peter", "Jack"}
comp_listfunc
: This contains the listfunc when the expression contains a word checked against a list function, such as:
"Jack" in listMe("some arguments")
comp_regexp
: The regular expression used when a word is compared to a regular expression, such as:
"Jack" =~ /\w+/
Here, *comp\_regexp* would contain `/\w+/`
comp_regexp_op
: The regular expression operator used when a word is compared to a regular expression, such as:
"Jack" =~ /\w+/
Here, *comp\_regexp\_op* would contain `=~`
comp_stringcomp
: When the comparison is for a string comparison as opposed to an integer comparison.
comp_unary
: Matches the expression that uses unary operator, such as:
-d, -e, -f, -s, -L, -h, -F, -U, -A, -n, -z, -T, -R
comp_word
: Contains the word that is the object of the comparison.
comp_word_in_list
: Contains the expression of a word checked against a list, such as:
"Jack" in {"John", "Peter", "Jack"}
comp_word_in_listfunc
: Contains the word when it is being compared to a listfunc, such as:
"Jack" in listMe("some arguments")
comp_word_in_regexp
: Contains the expression of a word checked against a regular expression, such as:
"Jack" =~ /\w+/
Here the word `Jack` (without the parenthesis) would be captured in
*comp\_word*
comp_worda
: Contains the first word in the comparison expression
comp_wordb
: Contains the second word in the comparison expression
cond
BNF:
"true"
| "false"
| "!" cond
| cond "&&" cond
| cond "||" cond
| comp
| "(" cond ")"
$RE{Apache2}{Cond}
For example:
use Regexp::Common::Apache2 qw( $ap_true $ap_false );
($ap_false && $ap_true)
The capture names are:
cond
: Contains the entire capture block
cond_and
: Contains the expression like:
($ap_true && $ap_true)
cond_false
: Contains the false expression like:
($ap_false)
cond_neg
: Contains the expression if it is preceded by an exclamation mark, such as:
!$ap_true
cond_or
: Contains the expression like:
($ap_true || $ap_true)
cond_true
: Contains the true expression like:
($ap_true)
expr
BNF: cond | string
$RE{Apache2}{Expr}
The capture names are:
expr
: Contains the entire capture block
expr_cond
: Contains the expression of the condition
expr_string
: Contains the expression of a string
function
BNF: funcname "(" words ")"
$RE{Apache2}{Function}
For example:
base64("Some string")
The capture names are:
function
: Contains the entire capture block
function_args
: Contains the list of arguments. In the example above, this would be
Some string
function_name
: The name of the function. In the example above, this would be
base64
integercomp
BNF:
word "-eq" word | word "eq" word
| word "-ne" word | word "ne" word
| word "-lt" word | word "lt" word
| word "-le" word | word "le" word
| word "-gt" word | word "gt" word
| word "-ge" word | word "ge" word
$RE{Apache2}{IntegerComp}
For example:
123 -ne 456
789 gt 234
# etc
The hyphen before the operator is optional, so you can say eq instead
of -eq
The capture names are:
integercomp
: Contains the entire capture block
integercomp_op
: Contains the comparison operator
integercomp_worda
: Contains the first word in the integer comparison
integercomp_wordb
: Contains the second word in the integer comparison
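To illustrate how the operator and the two words could be consumed once captured, here is a rough sketch in core Perl. The pattern `$toy_integercomp` is a simplified stand-in that only accepts plain digits as words, and `eval_integercomp` and `%numeric_test` are hypothetical helpers; none of them come from the module.

```perl
use strict;
use warnings;

# Simplified stand-in for illustration only: words are restricted to plain
# digits here, whereas the real "word" rule is far richer.
my $toy_integercomp = qr/
    (?<integercomp_worda>\d+)
    \s+
    (?<integercomp_op>-?(?:eq|ne|lt|le|gt|ge))
    \s+
    (?<integercomp_wordb>\d+)
/x;

# Map each operator, with its optional hyphen removed, to a numeric test.
my %numeric_test = (
    'eq' => sub { $_[0] == $_[1] },
    'ne' => sub { $_[0] != $_[1] },
    'lt' => sub { $_[0] <  $_[1] },
    'le' => sub { $_[0] <= $_[1] },
    'gt' => sub { $_[0] >  $_[1] },
    'ge' => sub { $_[0] >= $_[1] },
);

# Return 1 or 0 for a recognised comparison, undef otherwise.
sub eval_integercomp
{
    my $str = shift;
    return unless( $str =~ /^$toy_integercomp$/ );
    my( $left, $op, $right ) = ( $+{integercomp_worda}, $+{integercomp_op}, $+{integercomp_wordb} );
    $op =~ s/^-//;
    return( $numeric_test{ $op }->( $left, $right ) ? 1 : 0 );
}

print( eval_integercomp( '123 -ne 456' ), "\n" );
print( eval_integercomp( '789 gt 234' ), "\n" );
```

Stripping the leading hyphen before the lookup means `eq` and `-eq` share one table entry, which mirrors the statement that the hyphen is optional.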
join
BNF:
"join" ["("] list [")"]
| "join" ["("] list "," word [")"]
$RE{Apache2}{Join}
For example:
join({"word1", "word2"})
# or
join({"word1", "word2"}, ', ')
This uses "list" and "word".
The capture names are:
join
: Contains the entire capture block
join_list
: Contains the value of the list
join_word
: Contains the value for word used to join the list
list
BNF:
split
| listfunc
| "{" words "}"
| "(" list ")"
$RE{Apache2}{List}
For example:
split( /\w+/, "Some string" )
# or
{"some", "words"}
# or
(split( /\w+/, "Some string" ))
# or
( {"some", "words"} )
This uses "split", "listfunc", "words" and "list".
The capture names are:
list
: Contains the entire capture block
list_func
: Contains the value if a "listfunc" is used
list_list
: Contains the value if this is a list embedded within parentheses
list_split
: Contains the value if the list is based on a "split"
list_words
: Contains the value for a list of words.
listfunc
BNF: listfuncname "(" words ")"
$RE{Apache2}{ListFunc}
For example:
base64("Some string")
This is quite similar to the "function" regular expression.
The capture names are:
listfunc
: Contains the entire capture block
listfunc_args
: Contains the list of arguments. In the example above, this would be
Some string
listfunc_name
: The name of the function. In the example above, this would be
base64
regany
BNF: regex | regsub
$RE{Apache2}{Regany}
For example:
/\w+/i
# or
m,\w+,i
This regular expression includes "regex" and "regsub".
The capture names are:
regany
: Contains the entire capture block
regany_regex
: Contains the regular expression. See "regex"
regany_regsub
: Contains the substitution regular expression. See "regsub"
regex
BNF:
"/" regpattern "/" [regflags]
| "m" regsep regpattern regsep [regflags]
$RE{Apache2}{Regex}
For example:
/\w+/i
# or
m,\w+,i
The capture names are:
regex
: Contains the entire capture block
regflags
: The regular expression modifiers. See perlre
This can be any combination of:
i, s, m, g
regpattern
: Contains the regular expression. See perlre for examples and an explanation of how to use regular expressions. Apache2 uses PCRE, i.e. Perl Compatible Regular Expressions.
regsep
: Contains the regular expression separator, which can be any of:
/, #, $, %, ^, |, ?, !, ', ", ",", ;, :, ".", _, -
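The two BNF alternatives and the separator list above can be sketched in core Perl as follows. `$toy_regex` and `parse_regex` are hypothetical names for this illustration, and the pattern is a simplified approximation rather than the one the module ships. It relies on a `\k<regsep>` backreference so that the closing separator must equal the opening one.

```perl
use strict;
use warnings;

# Simplified stand-in following the BNF above, not the module's own pattern.
# A bare separator must be a slash; any other separator requires a leading m.
my $toy_regex = qr{
    m?
    (?<regsep> / | (?<=m) [\#\$%^|?!'",;:._-] )
    (?<regpattern> (?: \\. | (?!\k<regsep>) . )+ )
    \k<regsep>
    (?<regflags> [ismg]* )
}x;

# Return the three parts as a hash reference, or undef when nothing matches.
sub parse_regex
{
    my $str = shift;
    return unless( $str =~ /^$toy_regex$/ );
    return( +{
        regsep     => $+{regsep},
        regpattern => $+{regpattern},
        regflags   => $+{regflags},
    } );
}

for my $expr ( '/\w+/i', 'm,\w+,i' )
{
    my $parts = parse_regex( $expr ) or next;
    printf( "%s has separator %s, pattern %s and flags %s\n",
        $expr, $parts->{regsep}, $parts->{regpattern}, $parts->{regflags} );
}
```

The pattern body accepts either an escaped character or any character that is not the separator, so an escaped separator such as `\/` stays inside the captured pattern.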
regsub
BNF: "s" regsep regpattern regsep string regsep [regflags]
$RE{Apache2}{Regsub}
For example:
s/\w+/John/gi
The capture names are:
regflags
: The modifiers used which can be any combination of:
i, s, m, g
See [perlre](https://metacpan.org/pod/perlre) for an
explanation of their usage and meaning
regstring
: The string replacing the text found by the regular expression
regsub
: Contains the entire capture block
regpattern
: Contains the regular expression, which is Perl-compatible since Apache2 uses PCRE.
regsep
: Contains the regular expression separator, which can be any of:
/, #, $, %, ^, |, ?, !, ', ", ",", ;, :, ".", _, -
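As a hedged sketch of what can be done with these captures, the snippet below parses a substitution expression and applies it to a string using core Perl only. `$toy_regsub` and `apply_regsub` are hypothetical names, the pattern is a simplified approximation of the BNF above, and the replacement is treated as literal text (no backreference expansion), which is an assumption of this example and not a statement about Apache2 behaviour.

```perl
use strict;
use warnings;

# Simplified stand-in following the regsub BNF above, not the module's pattern.
my $toy_regsub = qr{
    s
    (?<regsep> [/\#\$%^|?!'",;:._-] )
    (?<regpattern> (?: \\. | (?!\k<regsep>) . )+ )
    \k<regsep>
    (?<regstring> (?: \\. | (?!\k<regsep>) . )* )
    \k<regsep>
    (?<regflags> [ismg]* )
}x;

# Apply the substitution to $target and return the result, or undef when
# $expr is not recognised. The replacement is used as literal text.
sub apply_regsub
{
    my( $expr, $target ) = @_;
    return unless( $expr =~ /^$toy_regsub$/ );
    my( $pattern, $replacement, $flags ) = ( $+{regpattern}, $+{regstring}, $+{regflags} );
    # "g" is not an inline modifier, so remove it and remember it was there.
    my $global = ( $flags =~ tr/g//d );
    # Collapse repeated flags, then turn them into an inline modifier group.
    my %seen;
    my $inline = join( '', grep { !$seen{ $_ }++ } split( //, $flags ) );
    my $re = length( $inline ) ? qr/(?$inline)$pattern/ : qr/$pattern/;
    if( $global )
    {
        $target =~ s/$re/$replacement/g;
    }
    else
    {
        $target =~ s/$re/$replacement/;
    }
    return( $target );
}

print( apply_regsub( 's/\w+/John/gi', 'Peter Paul' ), "\n" );
```

The `i`, `s` and `m` flags map directly onto Perl inline modifiers, whereas `g` has to be handled by choosing between a single and a global substitution.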
split
BNF:
"split" ["("] regany "," list [")"]
| "split" ["("] regany "," word [")"]
$RE{Apache2}{Split}
For example:
split( /\w+/, "Some string" )
This uses "regany", "list" and "word".
The capture names are:
split
: Contains the entire capture block
split_regex
: Contains the regular expression used for the split
split_list
: The list being split. It can also be a word. See below
split_word
: The word being split. It can also be a list. See above
string
BNF: substring | string substring
$RE{Apache2}{String}
For example:
URI accessed is: %{REQUEST_URI}
The capture names are:
string
: Contains the entire capture block
stringcomp
BNF:
word "==" word
| word "!=" word
| word "<" word
| word "<=" word
| word ">" word
| word ">=" word
$RE{Apache2}{StringComp}
For example:
"John" == "Jack"
sub(s/\w+/Jack/i, "John") != "Jack"
# etc
The capture names are:
stringcomp
: Contains the entire capture block
stringcomp_op
: Contains the comparison operator
stringcomp_worda
: Contains the first word in the string comparison
stringcomp_wordb
: Contains the second word in the string comparison
sub
BNF: "sub" ["("] regsub "," word [")"]
$RE{Apache2}{Sub}
For example:
sub(s/\w/John/gi,"Peter")
The capture names are:
sub
: Contains the entire capture block
sub_regsub
: Contains the substitution expression, i.e. in the example above, this would be:
s/\w/John/gi
sub_word
: The target for the substitution. In the example above, this would be "Peter"
substring
BNF: cstring | variable
$RE{Apache2}{Substring}
For example:
Jack
# or
%{REQUEST_URI}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}
See the "variable" and "word" regular expressions for more on those.
The capture names are:
substring
: Contains the entire capture block
variable
BNF:
"%{" varname "}"
| "%{" funcname ":" funcargs "}"
| "%{:" word ":}"
| "%{:" cond ":}"
| rebackref
$RE{Apache2}{Variable}
# or
$RE{Apache2}{LegacyVariable}
For example:
%{REQUEST_URI}
# or
%{md5:"some string"}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}
# or a reference to previous regular expression capture groups
$1, $2, etc.
See the "word" and "cond" regular expressions for more on those.
The capture names are:
variable
: Contains the entire capture block
var_cond
: If this is a condition inside a variable, such as:
%{:$ap_true == $ap_false:}
var_func_args
: Contains the function arguments.
var_func_name
: Contains the function name.
var_word
: A variable containing a word. See "word" for more information about word expressions.
varname
: Contains the variable name without the percent sign or dollar sign (if the legacy regular expression is enabled) and without any surrounding curly braces
word
BNF:
digits
| "'" string "'"
| '"' string '"'
| word "." word
| variable
| sub
| join
| function
| "(" word ")"
$RE{Apache2}{Word}
This is the most complex regular expression used, since it uses all the others and can recurse deeply.
For example:
12
# or
"John"
# or
'Jack'
# or
%{REQUEST_URI}
# or
%{HTTP_HOST}.%{HTTP_PORT}
# or
%{:sub(s/\b\w+\b/Peter/, "John"):}
# or
sub(s,\w+,Paul,gi, "John")
# or
join({"Paul", "Peter"}, ', ')
# or
md5("some string")
# or any word surrounded by parentheses, such as:
("John")
See the "string", "word", "variable", "sub", "join" and "function" regular expressions for more on those.
The capture names are:
word
: Contains the entire capture block
word_digits
: If the word is actually digits, this contains those digits.
word_dot_word
: This contains the text when two words are separated by a dot.
word_enclosed
: Contains the value of the word enclosed by single or double quotes or by surrounding parentheses.
word_function
: Contains the word containing a "function"
word_join
: Contains the word containing a "join"
word_quote
: If the word is enclosed by single or double quotes, this contains the single or double quote character
word_sub
: If the word is a substitution, this contains the substitution
word_variable
: Contains the word containing a "variable"
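To give a feel for the kind of recursion the word rule involves, here is a small sketch using Perl's `(?(DEFINE)...)` and `(?&name)` constructs. It covers only digits, quoted strings, dotted concatenation and parenthesised words. `$toy_word`, its internal `chain` and `atom` groups and `is_toy_word` are hypothetical names for this illustration and do not reflect how the module builds its own pattern.

```perl
use strict;
use warnings;

# Toy grammar for illustration only: a tiny subset of the word BNF above.
my $toy_word = qr{
    (?<word> (?&chain) )
    (?(DEFINE)
        # word "." word, written as a loop to avoid left recursion
        (?<chain> (?&atom) (?: \. (?&atom) )* )
        (?<atom>
            \d+                         # digits
          | "[^"]*"                     # double-quoted string
          | '[^']*'                     # single-quoted string
          | \( \s* (?&chain) \s* \)     # "(" word ")", the recursive case
        )
    )
}x;

# Return 1 when the whole string is a word in this toy grammar, 0 otherwise.
sub is_toy_word
{
    my $str = shift;
    return( $str =~ /^$toy_word$/ ? 1 : 0 );
}

for my $candidate ( '12', '"John"', '("John")', '(("a"."b").12)', '("John"' )
{
    printf( "%-16s %s\n", $candidate, is_toy_word( $candidate ) ? 'word' : 'not a word' );
}
```

The BNF alternative `word "." word` is left-recursive, so the sketch rewrites it as one atom followed by any number of dotted atoms; only the parenthesised form actually recurses.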
words
BNF:
word
| word "," list
$RE{Apache2}{Words}
For example:
"Jack"
# or
"John", {"Peter", "Paul"}
# or
sub(s/\b\w+\b/Peter/, "John"), {"Peter", "Paul"}
See the "word" and "list" regular expressions for more on those.
The capture names are:
words
: Contains the entire capture block
words_word
: Contains the word
words_list
: Contains the list
LEGACY
There are 2 expressions that can be used as legacy:
comp
: See "comp"
variable
: See "variable"
CHANGES & CONTRIBUTIONS
Feel free to reach out to the author for possible corrections, improvements, or suggestions.
AUTHOR
Jacques Deguest <jack@deguest.jp>
SEE ALSO
https://httpd.apache.org/docs/trunk/en/expr.html
COPYRIGHT & LICENSE
Copyright (c) 2020 DEGUEST Pte. Ltd.
You can use, copy, modify and redistribute this package and associated files under the same terms as Perl itself.