NAME
SPVM::Regex - Regular expressions in SPVM
SYNOPSIS
use Regex;
# Pattern match
{
my $re = Regex->new("ab*c");
my $target = "zabcz";
my $match = $re->match($target, 0);
}
# Pattern match - UTF-8
{
my $re = Regex->new("あ+");
my $target = "いあああい";
my $match = $re->match($target, 0);
}
# Pattern match - Character class and its negation
{
my $re = Regex->new("[A-Z]+[^A-Z]+");
my $target = "ABCzab";
my $match = $re->match($target, 0);
}
# Pattern match with captures
{
my $re = Regex->new("^(\w+) (\w+) (\w+)$");
my $target = "abc1 abc2 abc3";
my $match = $re->match($target, 0);
if ($match) {
my $cap1 = $re->captures->[0];
my $cap2 = $re->captures->[1];
my $cap3 = $re->captures->[2];
}
}
# Replace
{
my $re = Regex->new("abc");
my $target = "ppzabcz";
# "ppzABCz"
my $result = $re->replace($target, 0, "ABC");
my $replace_count = $re->replace_count;
}
# Replace with a callback and capture
{
my $re = Regex->new("a(bc)");
my $target = "ppzabcz";
# "ppzABbcCz"
my $result = $re->replace_cb($target, 0, sub : string ($self : self, $re : Regex) {
return "AB" . $re->captures->[0] . "C";
});
}
# Replace all
{
my $re = Regex->new("abc");
my $target = "ppzabczabcz";
# "ppzABCzABCz"
my $result = $re->replace_all($target, 0, "ABC");
}
# Replace all with a callback and capture
{
my $re = Regex->new("a(bc)");
my $target = "ppzabczabcz";
# "ppzABCbcPQRSzABCbcPQRSz"
my $result = $re->replace_all_cb($target, 0, sub : string ($self : self, $re : Regex) {
return "ABC" . $re->captures->[0] . "PQRS";
});
}
DESCRIPTION
Regex provides regular expression functions.
REGULAR EXPRESSION SYNTAX
Regex provides a subset of Perl regular expression syntax. The target string and the regex string are interpreted as UTF-8 strings.
# Quantifier
+     1 or more repetitions
*     0 or more repetitions
?     0 or 1 repetition
{m,n} between m and n repetitions
# Regular expression character
^ start of string
$ end of string
. any character except "\n"
#  Default mode   ASCII mode
\d Not supported  [0-9]
\D Not supported  not \d
\s Not supported  " ", "\t", "\f", "\r", "\n"
\S Not supported  not \s
\w Not supported  [a-zA-Z0-9_]
\W Not supported  not \w
# Character class and its negation
[a-z0-9]
[^a-z0-9]
# Capture
(foo)
Regex Options:
s single line mode
a ascii mode
Regex options are passed to the "new_with_options" method.
my $re = Regex->new_with_options("^ab+c", "sa");
Limitations:
Regex does not support a quantified element that is immediately followed by the same set of characters.
# An exception occurs
Regex->new("a*a");
Regex->new("a?a");
Regex->new("a+a");
Regex->new("a{1,3}a");
A pattern is also invalid if an element that can match zero times (quantified with * or ?) sits between a quantified element and the same set of characters.
# An exception occurs
Regex->new("\d+\D*\d+");
Regex->new("\d+\D?\d+");
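Some of the rejected patterns can be rewritten so that no quantifier is followed by the same set of characters. The sketch below shows rewrites that describe the same set of strings as the four "a" patterns listed above. It assumes that each rewritten pattern is accepted by Regex, which this document does not confirm.

```spvm
use Regex;

{
  # "a*a" matches one or more "a", the same strings as "a+"
  my $re1 = Regex->new("a+");

  # "a?a" matches one or two "a"
  my $re2 = Regex->new("a{1,2}");

  # "a+a" matches two or more "a"; the quantifier is moved to the end
  my $re3 = Regex->new("aa+");

  # "a{1,3}a" matches two to four "a"
  my $re4 = Regex->new("a{2,4}");
}
```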
CLASS METHODS
new
my $re = Regex->new("^ab+c");
Create a new Regex object and compile the regex.
new_with_options
my $re = Regex->new_with_options("^ab+c", "s");
Create a new Regex object and compile the regex with the options.
INSTANCE METHODS
captures
sub captures : string[] ()
Gets the strings captured by the "match" method.
match_start
sub match_start : int ()
Gets the start byte offset of the string matched by the "match" method.
match_length
sub match_length : int ()
Gets the byte length of the string matched by the "match" method.
replace_count
sub replace_count : int ()
Gets the number of replacements made by the "replace" or "replace_all" method.
match
sub match : int ($self : self, $target : string, $target_offset : int)
Executes pattern matching against the target string, starting at the given byte offset.
If the pattern match succeeds, 1 is returned. Otherwise 0 is returned.
After a successful match, the "captures" method returns the captured strings, the "match_start" method returns the byte offset of the whole match, and the "match_length" method returns the byte length of the whole match.
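As a hedged illustration of how byte-based values differ from character counts, the sketch below reuses the UTF-8 pattern from the SYNOPSIS. Both い and あ occupy three bytes in UTF-8, so if the values are counted in bytes as described above, "match_start" should be 3 and "match_length" should be 9, rather than 1 and 3 as a character count would give. These values are inferred from the description and are not output taken from the module.

```spvm
use Regex;

{
  my $re = Regex->new("あ+");
  my $target = "いあああい";

  if ($re->match($target, 0)) {
    # Byte offset of the start of the whole match
    my $start = $re->match_start;

    # Byte length of the whole match
    my $length = $re->match_length;
  }
}
```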
replace
sub replace : string ($self : self, $target : string, $target_offset : int, $replace : string)
Replaces the matched part of the target string, searching from the start byte offset, with the replacement string.
replace_cb
sub replace_cb : string ($self : self, $target : string, $target_offset : int, $replace_cb : Regex::Replacer)
Replaces the matched part of the target string, searching from the start byte offset, with the string returned by the replace callback. The callback must have the "replace_to" method defined in Regex::Replacer.
replace_all
sub replace_all : string ($self : self, $target : string, $target_offset : int, $replace : string)
Replaces every matched part of the target string, searching from the start byte offset, with the replacement string.
replace_all_cb
sub replace_all_cb : string ($self : self, $target : string, $target_offset : int, $replace_cb : Regex::Replacer)
Replaces every matched part of the target string, searching from the start byte offset, with the string returned by the replace callback. The callback must have the "replace_to" method defined in Regex::Replacer.