NAME
Regex::PreSuf - create regular expressions from word lists
DESCRIPTION
use Regex::PreSuf;
my $re = presuf(qw(foobar fooxar foozap));
# $re should be now 'foo(?:zap|[bx]ar)'
This module creates regular expressions out of 'word lists', that is, lists of strings; the resulting expression matches the same words. The easiest approach would of course be to just concatenate the words with '|', but this module tries to be cleverer. It finds the common prefixes and suffixes of the words and then recursively looks at the remaining differences. It knows about character classes. These optimized regular expressions normally run a few dozen percent faster than the simple-minded '|'-concatenation.
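The difference between plain '|'-concatenation and prefix factoring can be sketched in a few lines. This is only an illustration, not the module's actual algorithm: common_prefix() is a hypothetical helper, and the sketch handles a single level of prefix only (no suffixes, no recursion, no character classes).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regex::PreSuf): longest common prefix.
sub common_prefix {
    my @words = @_;
    return '' unless @words;
    my $prefix = $words[0];
    for my $w (@words[1 .. $#words]) {
        # Shorten the candidate until it is a prefix of $w.
        chop $prefix while length($prefix) && index($w, $prefix) != 0;
    }
    return $prefix;
}

my @words = qw(foobar fooxar foozap);

# The simple-minded approach: join the words with '|'.
my $naive = join '|', @words;

# One level of factoring: pull the shared prefix out of the alternation.
my $prefix   = common_prefix(@words);
my @rest     = map { substr $_, length $prefix } @words;
my $factored = $prefix . '(?:' . join('|', @rest) . ')';

print "naive:    $naive\n";
print "factored: $factored\n";

# Both patterns should accept and reject the same strings.
for my $w (@words, 'foo', 'fooqux') {
    my $n = $w =~ /^(?:$naive)$/    ? 1 : 0;
    my $f = $w =~ /^(?:$factored)$/ ? 1 : 0;
    print "$w: naive=$n factored=$f\n";
}
```

The factored form lets the regex engine test the shared 'foo' once instead of once per alternative, which is the kind of saving the module aims for.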
The downsides:
- the original order of the words is not necessarily respected, for example because the character class matches are collected together, separate from the '|' alternations.
- because the module blithely ignores any specialness of regular expression metacharacters such as *?+{}[], please do not use them in the words; the resulting regular expression will most likely be invalid.
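One way to guard against the second downside is to screen the word list before handing it to presuf(). The following is a minimal sketch; has_metachar() is a hypothetical helper, not something the module provides, and the set of characters it checks is an assumption about what counts as special. Note that it also flags '.', so it would need adjusting if '.' is meant to be a wildcard.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the word contains a character that is
# special in a Perl regular expression.
sub has_metachar {
    my ($word) = @_;
    return $word =~ /[\\\^\$.|?*+()\[\]{}]/ ? 1 : 0;
}

my @candidates = qw(foobar foo+bar fooxar foo[1]);

# Split the candidates into words that are safe to pass on and words
# that contain metacharacters.
my @safe   = grep { !has_metachar($_) } @candidates;
my @unsafe = grep {  has_metachar($_) } @candidates;

print "safe:   @safe\n";
print "unsafe: @unsafe\n";
```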
For the second downside there is an exception. The module has some rudimentary grasp of what to do with the 'any character' metacharacter ('.'). To enable it, call presuf() with the anychar option like this:
my $re = presuf({ anychar=>1 }, qw(foobar foo.ar fooxar));
# $re should be now 'foo.ar'
Beware, though, there are limits to the grasp:
my $re = presuf({ anychar=>1 }, qw(foobar foo.ar fooxa.));
# $re _could_ be now 'foo.a.'
# but it is 'foo(?:xa.|.ar)'
Finesses like this may or may not be implemented in future releases.
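The two patterns quoted above can be compared without the module by copying them as literal strings. This is a small sanity check under that assumption; it does not call presuf() and says nothing about what other versions of the module would return.

```perl
use strict;
use warnings;

my @words = qw(foobar foo.ar fooxa.);

# The two patterns from the example above, copied as literal strings.
my %pattern = (
    actual   => 'foo(?:xa.|.ar)',
    possible => 'foo.a.',
);

for my $name (sort keys %pattern) {
    my $re = qr/^(?:$pattern{$name})$/;
    # Collect any input word the pattern fails to match.
    my @missed = grep { $_ !~ $re } @words;
    printf "%-8s matches all words: %s\n", $name, @missed ? 'no' : 'yes';
}
```

Both patterns accept every word in the list. They are not interchangeable, though: the shorter 'foo.a.' also accepts strings such as 'fooqaq' that the longer pattern rejects.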
COPYRIGHT
Jarkko Hietaniemi <jhi@iki.fi>
This code is distributed under the same copyright terms as Perl itself.