Name

SPVM::Regex - Regular Expressions

Description

The Regex class in SPVM provides methods for pattern matching and replacement using regular expressions.

Google RE2 is used as the regular expression engine.

Usage

Re:

use Re;

my $string = "Hello World";
my $match = Re->m($string, "^Hello");

# ABC de ABC
my $string_ref = ["abc de abc"];
Re->s($string_ref, ["abc", "g"], "ABC");

Regex:

use Regex;

# Pattern match
{
  my $re = Regex->new("ab*c");
  my $string = "zabcz";
  my $match = $re->match($string);
}

# Pattern match - UTF-8
{
  my $re = Regex->new("あ+");
  my $string = "いあああい";
  my $match = $re->match($string);
}

# Pattern match - Character class and negation
{
  my $re = Regex->new("[A-Z]+[^A-Z]+");
  my $string = "ABCzab";
  my $match = $re->match($string);
}

# Pattern match with captures
{
  my $re = Regex->new("^(\w+) (\w+) (\w+)$");
  my $string = "abc1 abc2 abc3";
  my $match = $re->match($string);
  
  if ($match) {
    my $cap1 = $match->cap1;
    my $cap2 = $match->cap2;
    my $cap3 = $match->cap3;
  }
}

# Replace
{
  my $re = Regex->new("abc");
  my $string = "ppzabcz";
  my $string_ref = [$string];
  
  # "ppzABCz"
  $re->replace($string_ref, "ABC");
}

# Replace with a callback and capture
{
  my $re = Regex->new("a(bc)");
  my $string = "ppzabcz";
  my $string_ref = [$string];
  
  # "ppzABbcCz"
  $re->replace($string_ref, method : string ($re : Regex, $match : Regex::Match) {
    return "AB" . $match->cap1 . "C";
  });
}

# Replace global
{
  my $re = Regex->new("abc");
  my $string = "ppzabczabcz";
  my $string_ref = [$string];
  
  # "ppzABCzABCz"
  $re->replace_g($string_ref, "ABC");
}

# Replace global with a callback and capture
{
  my $re = Regex->new("a(bc)");
  my $string = "ppzabczabcz";
  my $string_ref = [$string];
  
  # "ppzABCbcPQRSzABCbcPQRSz"
  $re->replace_g($string_ref, method : string ($re : Regex, $match : Regex::Match) {
    return "ABC" . $match->cap1 . "PQRS";
  });
}

# . - single line mode
{
  my $re = Regex->new("(.+)", "s");
  my $string = "abc\ndef";
  
  my $match = $re->match($string);
  
  unless ($match) {
    return 0;
  }
  
  unless ($match->cap1 eq "abc\ndef") {
    return 0;
  }
}

Details

Regular Expression Syntax

See Google RE2 Syntax for the syntax of regular expressions.

More Perlish Pattern Match and Replacement

Use the Re class if you want more Perlish pattern matching and replacement.

Class Methods

new

static method new : Regex ($pattern : string, $flags : string = undef);

Creates a new Regex object, compiles the regex pattern $pattern with the flags $flags, and returns the new object.

Exceptions:

The regex pattern $pattern must be defined. Otherwise an exception is thrown.

If the regex pattern $pattern can't be compiled, an exception is thrown.

Examples:

my $re = Regex->new("^ab+c");
my $re = Regex->new("^ab+c", "s");

Instance Methods

match

method match : Regex::Match ($string_or_buffer : object of string|StringBuffer, $offset_ref : int* = undef, $length : int = -1);

Performs a pattern match on the string or the StringBuffer object $string_or_buffer from the offset $$offset_ref to the length $length.

If the pattern match succeeds, returns a new Regex::Match object. Otherwise, returns undef.

$$offset_ref is updated to the next position if it is specified.

If $length is less than 0, it is set to the length of $string_or_buffer.

Exceptions:

$string_or_buffer must be defined. Otherwise an exception is thrown.

The type of $string_or_buffer must be string or StringBuffer. Otherwise an exception is thrown.

$$offset_ref + $length must be less than or equal to the length of $string_or_buffer. Otherwise an exception is thrown.
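
Examples:

The following is a minimal sketch of scanning a string for successive matches through $offset_ref. It assumes that $$offset_ref advances past each match and that the pattern never matches an empty string. The variable names are illustrative only.

use Regex;

my $re = Regex->new("(\w+)");
my $string = "abc def ghi";

# $offset is passed by reference, so each call is expected to resume
# from the position left by the previous match (assumption).
my $offset = 0;
my $match = $re->match($string, \$offset);
while ($match) {
  # Print the first capture of the current match
  say $match->cap1;
  
  # Try the next match starting at the updated offset
  $match = $re->match($string, \$offset);
}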

replace

method replace : Regex::ReplaceInfo ($string_ref_or_buffer : object of string[]|StringBuffer, $replace : object of string|Regex::Replacer, $offset_ref : int* = undef, $length : int = -1, $options : object[] = undef);

The string to be replaced is $string_ref_or_buffer->[0] when the type of $string_ref_or_buffer is string[], or $string_ref_or_buffer itself when the type is StringBuffer.

Replaces the string from the offset $$offset_ref to the length $length with the replacement string or the callback $replace with the options $options using a regular expression.

$$offset_ref is updated to the next position if it is specified.

If $length is less than 0, it is set to the length of the string to be replaced.

If the replacement succeeds, returns a new Regex::ReplaceInfo object. Otherwise, returns undef.

Options:

  • global

    This option must be an Int object. Otherwise an exception is thrown.

    If the value of the Int object is a true value, the global replacement is performed.

Exceptions:

$string_ref_or_buffer must be defined. Otherwise an exception is thrown.

The type of $string_ref_or_buffer must be string[] or StringBuffer. Otherwise an exception is thrown.

$replace must be a string or a Regex::Replacer object. Otherwise an exception is thrown.

$$offset_ref must be greater than or equal to 0. Otherwise an exception is thrown.

$$offset_ref + $length must be less than or equal to the length of $string_ref_or_buffer. Otherwise an exception is thrown.

Exceptions of the match_forward method can be thrown.
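
Examples:

The following is a minimal sketch of passing the global option explicitly. It assumes that options are given as a key-value object[] and that the option value can be created with Int->new. The variable names are illustrative only.

use Regex;

my $re = Regex->new("abc");
my $string = "ppzabczabcz";
my $string_ref = [$string];

# Start at the beginning of the string; -1 means "up to the end".
my $offset = 0;

# Assumption: {global => Int->new(1)} enables the global replacement,
# so the result should correspond to the "Replace global" example in Usage.
$re->replace($string_ref, "ABC", \$offset, -1, {global => Int->new(1)});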

replace_g

method replace_g : Regex::ReplaceInfo ($string_ref_or_buffer : object of string[]|StringBuffer, $replace : object of string|Regex::Replacer, $offset_ref : int* = undef, $length : int = -1, $options : object[] = undef);

Calls the "replace" method with the same arguments, but with the global option set to 1, and returns its return value.

split

method split : string[] ($string : string, $limit : int = 0);

The same as the Fn#split method, except that the regular expression is used as the separator.
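
Examples:

The following is a minimal sketch of splitting on a comma surrounded by optional spaces. It assumes that the fields are returned in order, as with Fn#split. The variable names are illustrative only.

use Regex;

# The separator is a comma with any number of spaces on either side
my $re = Regex->new(" *, *");
my $parts = $re->split("a, b ,c");

# Print each field on its own line
for my $part (@$parts) {
  say $part;
}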

Repository

SPVM::Regex - GitHub

See Also

Author

Yuki Kimoto

Contributors

Copyright & License

Copyright (c) 2023 Yuki Kimoto

MIT License