NAME

re::engine::Plugin - Pure-Perl regular expression engine plugin interface

SYNOPSIS

use feature ':5.10';
use re::engine::Plugin (
    comp => sub {
        my ($re) = @_; # A re::engine::Plugin object

        # return value ignored
    },
    exec => sub {
        my ($re, $str) = @_;

        # We always like ponies!
        return 1 if $str eq 'pony';
        return;
    }
);

"pony" =~ /yummie/;

DESCRIPTION

As of perl 5.9.5 it's possible to lexically replace perl's built-in regular expression engine (see perlreguts). This module provides glue for writing such a wrapper in Perl instead of the provided C/XS interface.

NOTE: This module is a development release that does not work with any version of perl other than the current (as of February 2007) blead. The provided interface is not a complete wrapper around the native interface (yet!) but the parts that are left can be implemented with additional methods so the completed API shouldn't have any major changes.

METHODS

import

Takes a list of key-value pairs with the only mandatory pair being "exec" and its callback routine. Both subroutine references and the string name of a subroutine (e.g. "main::exec") can be specified. The real CODE ref is currently looked up in the symbol table in the latter case.
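The following is a minimal sketch of how a string such as "main::exec" can be resolved to a CODE ref through the symbol table in plain Perl. It illustrates the general technique only and is not taken from this module's source; resolve_callback and my_exec are hypothetical names introduced for the example.

```perl
use strict;
use warnings;

# Hypothetical callback that lives in package main.
sub my_exec {
    my ($re, $str) = @_;
    return $str eq 'pony';
}

# Hypothetical helper: accept either a CODE ref or a fully qualified
# subroutine name and return a CODE ref, or undef if nothing is found.
sub resolve_callback {
    my ($cb) = @_;
    return $cb if ref $cb eq 'CODE';

    no strict 'refs';    # the symbolic lookup is intentional here
    return defined &{$cb} ? \&{$cb} : undef;
}

my $by_ref  = resolve_callback(\&my_exec);
my $by_name = resolve_callback('main::my_exec');
```

Both calls should yield a reference to the same subroutine, while a name with no matching sub resolves to undef.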

comp

An optional sub to be called when a pattern is being compiled. Note that perl may compile a single pattern more than once.

The subroutine will be called with a regexp object (see "Regexp object"). The regexp object will be stored internally along with the pattern and provided as the first argument for the other callback routines (think of it as $self).

If your regex implementation needs to validate its pattern this is the right place to croak on an invalid one (but see "BUGS").

The return value of this subroutine is discarded.

exec

Called when a given pattern is being executed. The first argument is the regexp object and the second is the string being matched. The routine should return true if the pattern matched and false if it didn't.
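As an illustration, here is a sketch of an exec callback that treats the pattern as a literal string. So that it can be run outside the engine, it uses My::FakeRe, a hypothetical stand-in for the regexp object that mimics only the documented pattern method; literal_exec is likewise a name made up for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the regexp object (not part of the module).
{
    package My::FakeRe;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub pattern { return $_[0]{pattern} }
}

# exec callback: report whether the pattern, taken literally, occurs
# anywhere in the string being matched.
sub literal_exec {
    my ($re, $str) = @_;
    return index($str, $re->pattern) >= 0 ? 1 : 0;
}

# With the module loaded, the callback would be registered like this:
#   use re::engine::Plugin ( exec => \&literal_exec );
my $re   = My::FakeRe->new(pattern => 'pony');
my $hit  = literal_exec($re, 'my little pony');   # 1
my $miss = literal_exec($re, 'horse');            # 0
```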

intuit

TODO: implement

checkstr

TODO: implement

free

TODO: implement

dupe

TODO: implement

numbered_buff_get

TODO: implement

named_buff_get

TODO: implement

flags

See "/PATTERN/cgimosx" in perlop.

TODO

  • Provide an API for named ($+{name}) and unnamed ($1, $2, ...) match variables; allow specifying both offsets into the pattern and any given scalar.

  • Find some neat example for the "SYNOPSIS", suggestions welcome.

BUGS

Please report any bugs that aren't already listed at http://rt.cpan.org/Dist/Display.html?Queue=re-engine-Plugin to http://rt.cpan.org/Public/Bug/Report.html?Queue=re-engine-Plugin

  • Calling die or anything that uses it (such as croak) in the "comp" callback routine will not be trapped by an eval block that the pattern is in, i.e.

    use Carp qw(croak);
    use re::engine::Plugin(
        comp => sub {
            my $re = shift;
            croak "Your pattern is invalid"
                unless $re->pattern =~ /pony/;
        }
    );
    
    # Ignores the eval block
    eval { /you die in C<eval>, you die for real/ };

    Simply put, this happens because the real subroutine call happens indirectly and not in the scope of the eval block.

Regexp object

The regexp object is passed as the first argument to all the callback routines. It supports the following method calls (with more to come!).

pattern

Returns the pattern this regexp was compiled with.

flags

Returns a string of the flags the pattern was compiled with (e.g. "xs"). The flags are not guaranteed to be in any particular order, so don't depend on the current one.
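Because the order of the flags is unspecified, one way to test for a flag is to split the string into a lookup hash first. A minimal sketch in plain Perl; flag_set is a hypothetical helper, not something the module provides:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a flags string such as "xs" into a lookup
# hash so that membership tests do not depend on character order.
sub flag_set {
    my ($flags) = @_;
    return map { $_ => 1 } split //, $flags;
}

my %a = flag_set('xs');
my %b = flag_set('sx');    # same flags, different order

# True: both strings describe the same set of flags.
my $same = join('', sort keys %a) eq join('', sort keys %b);
```

Inside a callback this would be used as my %f = flag_set($re->flags), followed by a test such as $f{x}.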

stash

Returns or sets a user-defined stash that's passed around with the pattern. This is useful for passing an arbitrary scalar between callback routines, for example:

use re::engine::Plugin (
    comp => sub { $_[0]->stash( [ 1 .. 5 ] ) },
    exec => sub { $_[0]->stash }, # Get [ 1 .. 5 ]
);

minlen

The minimum length a given string must be to match the pattern. Set this to an integer in comp and perl will not call your exec routine unless the string being matched is at least that long. Returns the currently set length if called without arguments, or undef if no length has been set.
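A sketch of a comp callback that sets minlen for a literal-string engine, where a match needs at least as many characters as the pattern itself. My::FakeRe is a hypothetical stand-in for the regexp object that only imitates the getter/setter behaviour described above, and literal_comp is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the regexp object (not part of the module).
{
    package My::FakeRe;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub pattern { return $_[0]{pattern} }
    # Sets the length when given an argument; always returns the
    # current value, which is undef until something has been set.
    sub minlen {
        my $self = shift;
        $self->{minlen} = shift if @_;
        return $self->{minlen};
    }
}

# comp callback: a literal pattern cannot match a shorter string.
sub literal_comp {
    my ($re) = @_;
    $re->minlen(length($re->pattern));
    return;
}

my $re     = My::FakeRe->new(pattern => 'pony');
my $before = $re->minlen;    # undef, nothing set yet
literal_comp($re);
my $after  = $re->minlen;    # 4, the length of 'pony'
```

With the module loaded, literal_comp would be passed as the comp callback.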

SEE ALSO

"Pluggable Interface" in perlreguts

THANKS

Yves for explaining why I made the regexp engine a sad panda.

AUTHOR

Ævar Arnfjörð Bjarmason <avar@cpan.org>

LICENSE

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Copyright 2007 Ævar Arnfjörð Bjarmason.