NAME

Regexp::Result - store information about a regexp match for later retrieval

SYNOPSIS

$foo =~ /(a|an|the) (\w+)/;
my $result = Regexp::Result->new();

# ...
# some other code which potentially executes a regular expression

my $determiner = $result->c(1);
# i.e. $1 at the time when the object was created

Have you ever wanted to retain information about a regular expression match, without having to go through the palaver of pulling things out of $1, pos, etc. and assigning them each to temporary variables until you've decided what to use them as?

Regexp::Result objects, when created, contain as much information about a match as perl can tell you. This means that you just need to create one variable and keep it.
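To make the problem concrete, here is a minimal sketch in core Perl only (it does not load Regexp::Result). It shows how a later match overwrites $1 unless the value is copied first. The names $text, @saved and $clobbered are purely illustrative.

```perl
use strict;
use warnings;

my $text = 'the cat sat';
$text =~ /(a|an|the) (\w+)/ or die 'no match';

# Copy the captures immediately, while they still describe this match.
my @saved = ($1, $2);

# A later successful match overwrites $1 and $2 ...
'2013-01-31' =~ /(\d+)-(\d+)/ or die 'no match';

# ... so $1 now belongs to the second match, while @saved is unchanged.
my $clobbered = $1;
print "saved: $saved[0], current: $clobbered\n";
```

Creating a Regexp::Result object straight after the first match plays the role of @saved here, but for every match variable at once.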

Hopefully, your code will be more comprehensible when it looks like $result->last_numbered_match_start->[-1], instead of $-[-1]. The documentation for the punctuation variables, by the way, is hidden away in perldoc perlvar along with scary things like $^H. I've copied most of it and/or rewritten it below.

METHODS

new

Creates a new Regexp::Result object. The object will gather data from the last match (if successful) and store it for later retrieval.

Note that almost all of the contents are read-only.

numbered_captures

This accesses $1, $2, etc. as $rr->numbered_captures->[0], $rr->numbered_captures->[1], etc. Note the numbering difference: $1 is element 0!

c

This accesses the contents of numbered_captures, but counts from 1, so that $rr->c(1) corresponds to $1, $rr->c(2) to $2, and so on.
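The off-by-one relationship between the two accessors can be sketched in core Perl without the module. Here @captures stands in for the zero-based list, and capture_at is a hypothetical helper that only imitates the one-based numbering described for c. It is not the module's implementation.

```perl
use strict;
use warnings;

# In list context a match returns its captures as a zero-based list,
# the same numbering that numbered_captures is described as using.
my @captures = ('an apple' =~ /(a|an|the) (\w+)/);

# Hypothetical helper: takes a one-based index, like $1, $2, ...
sub capture_at {
    my ($n) = @_;
    return $captures[ $n - 1 ];
}

print capture_at(1), "\n";    # same value as $captures[0]
print capture_at(2), "\n";    # same value as $captures[1]
```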

match, prematch, postmatch

'The quick brown fox' =~ /q[\w]+/p;
my $rr = Regexp::Result->new();
print $rr->match;     # prints 'quick'
print $rr->prematch;  # prints 'The '
print $rr->postmatch; # prints ' brown fox'

When a regexp is executed with the /p flag, the variables ${^MATCH}, ${^PREMATCH}, and ${^POSTMATCH} are set. These correspond to the entire text matched by the regular expression, the text in the string which preceded the matched text, and the text in the string which followed it.

The match method provides access to the data in ${^MATCH}.

The prematch method provides access to the data in ${^PREMATCH}.

The postmatch method provides access to the data in ${^POSTMATCH}.

Note: no accessor is provided for $&, $`, and $', because:

a) The author feels they are unnecessary since perl 5.10 introduced ${^MATCH} etc.

b) Implementing accessors for them would force a performance penalty on everyone who uses this module, even if they don't have any need of $&.

last_paren_match

Equivalent to $+.

The text matched by the last capturing parentheses of the match. This is useful if you don't know which one of a set of alternative patterns matched. For example, in:

/Version: (.*)|Revision: (.*)/

last_paren_match stores either the version or revision (whichever exists); perl would number these $1 and $2.
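As a rough illustration of the underlying variable, the core-Perl sketch below runs the pattern above against two made-up input lines and collects $+ each time. The variable names and sample strings are assumptions chosen for the example.

```perl
use strict;
use warnings;

my @found;
for my $line ('Version: 1.2', 'Revision: 42') {
    if ($line =~ /Version: (.*)|Revision: (.*)/) {
        # $+ holds whichever group took part in the match, so there is
        # no need to test $1 and $2 separately.
        push @found, $+;
    }
}
print join(', ', @found), "\n";
```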

last_submatch_result

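Equivalent to $^N.

The text matched by the most recently closed capture group of the last successful match, i.e. the group whose closing parenthesis is rightmost in the pattern.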
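The difference between $^N and $+ is easiest to see with nested groups. The following is a small core-Perl sketch, with illustrative variable names, in which the outer group is the last to close and the inner group 3 is the highest-numbered one.

```perl
use strict;
use warnings;

'ab' =~ /((a)(b))/ or die 'no match';

my $highest_group = $+;     # group 3, the highest-numbered group used
my $last_closed   = $^N;    # group 1, whose closing parenthesis comes last
print "highest: $highest_group, last closed: $last_closed\n";
```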

last_numbered_match_end

Equivalent to @+.

This array holds the offsets of the ends of the last successful submatches in the currently active dynamic scope. $+[0] is the offset into the string of the end of the entire match. This is the same value as what the pos function returns when called on the variable that was matched against. The nth element of this array holds the offset of the nth submatch, so $+[1] is the offset past where $1 ends, $+[2] the offset past where $2 ends, and so on.

last_numbered_match_start

Equivalent to @-.

This array holds the offsets of the starts of the last successful submatches in the currently active dynamic scope. $-[0] is the offset into the string of the start of the entire match. The nth element of this array holds the offset of the nth submatch, so $-[1] is the offset where $1 starts, $-[2] the offset where $2 starts, and so on.
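The start and end offsets are most useful together. This core-Perl sketch copies @- and @+ after a match and uses substr to recover the matched text from them. The sample string and the names @start, @end, $whole and $first are assumptions made for the example.

```perl
use strict;
use warnings;

my $text = 'the quick fox';
$text =~ /(quick) (fox)/ or die 'no match';

# Copy the offset arrays straight away; they change on the next match.
my @start = @-;
my @end   = @+;

# Element 0 describes the whole match, element n describes $n.
my $whole = substr $text, $start[0], $end[0] - $start[0];
my $first = substr $text, $start[1], $end[1] - $start[1];
print "whole: '$whole' at $start[0]..$end[0]\n";
print "first: '$first' at $start[1]..$end[1]\n";
```

The last element, $start[-1], is the start of the final group, which is what $result->last_numbered_match_start->[-1] is described as returning.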

named_paren_matches

'wxyz' =~ /(?<ODD>w)(?<EVEN>x)(?<ODD>y)(?<EVEN>z)/;

# named_paren_matches is now:
#
# {
#     EVEN => [ 'x', 'z' ],
#     ODD  => [ 'w', 'y' ]
# }

Equivalent to %-.

This variable allows access to the named capture groups in the last successful match in the currently active dynamic scope. To each capture group name found in the regular expression, it associates a reference to an array containing the list of values captured by all buffers with that name (should there be several of them), in the order in which they appear.

last_named_paren_matches

'wxyz' =~ /(?<ODD>w)(?<EVEN>x)(?<ODD>y)(?<EVEN>z)/;

# last_named_paren_matches is now:
#
# {
#     EVEN => 'x',
#     ODD  => 'w',
# }

Equivalent to %+.

The %+ hash allows access to the named capture buffers, should they exist, in the last successful match in the currently active dynamic scope.

The keys of the %+ hash list only the names of buffers that have captured (and that are thus associated with defined values).

Note: %- and %+ are tied views into a common internal hash associated with the last successful regular expression. Therefore mixing iterative access to them via each may have unpredictable results. Likewise, if the last successful match changes, then the results may be surprising.
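Because both hashes are tied views that change with the next match, one way to keep their contents is to copy them into ordinary data structures straight away. The sketch below does this in core Perl for the 'wxyz' example above. The names %last and %all are illustrative, and this is not necessarily how the module stores the data.

```perl
use strict;
use warnings;

'wxyz' =~ /(?<ODD>w)(?<EVEN>x)(?<ODD>y)(?<EVEN>z)/ or die 'no match';

# Plain copies: %last mirrors %+, %all is a deep copy of %-.
my %last = %+;
my %all  = map { $_ => [ @{ $-{$_} } ] } keys %-;

# A later match no longer affects the copies.
'unrelated' =~ /(?<OTHER>un)/ or die 'no match';

print "last ODD: $last{ODD}\n";
print "all ODD:  @{ $all{ODD} }\n";
```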

Author's note: I have no idea why this is a useful thing to use. But perl provides it, and it is occasionally used according to http://grep.cpan.me/ (461 distros, in some of which the string \%\+|\$\+\{ only appears in a binary stream).

last_regexp_code_result

Equivalent to $^R.

The result of evaluation of the last successful (?{ code }) regular expression assertion (see perlre).
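For readers who have not met (?{ code }) before, this is a minimal core-Perl sketch of the value in question. The code block and the variable name $code_result are made up for the example.

```perl
use strict;
use warnings;

# The embedded code block runs once 'b' has matched; its value is
# then available in $^R after the match succeeds.
'abc' =~ /b(?{ 6 * 7 })/ or die 'no match';

my $code_result = $^R;
print "code block returned: $code_result\n";
```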

re_debug_flags

Equivalent to ${^RE_DEBUG_FLAGS}.

The current value of the regex debugging flags. Set to 0 for no debug output even when the re 'debug' module is loaded. See re for details.

pos

Returns the end of the match. Equivalent to $+[0].
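The relationship with perl's built-in pos function can be checked with a short core-Perl sketch. Note that the built-in only records a position for /g matches in scalar context, so the example uses /g; the sample string and variable names are assumptions.

```perl
use strict;
use warnings;

my $text = 'one two three';
$text =~ /two/g or die 'no match';

my $end_offset = $+[0];         # offset just past the whole match
my $builtin    = pos($text);    # where the next /g match would start
print "end offset: $end_offset, pos: $builtin\n";
```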

BUGS

Please report any bugs or feature requests to the GitHub issue tracker at https://github.com/pdl/Regexp-Result/issues. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

AUTHORS

Daniel Perrett

LICENSE AND COPYRIGHT

Copyright 2012-2013 Daniel Perrett.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.