NAME
Mock::Data::Regex - Generator that uses a Regex as a template to generate strings
SYNOPSIS
# Automatically used when you give a Regexp ref to Mock::Data
my $mock= Mock::Data->new(generators => { word => qr/\w+/ });
# or use stand-alone
my $email= Mock::Data::Regex->new( qr/ [-a-z]+\d{0,2} @ [a-z]{2,20} \. (com|net|org) /xa );
say $email->generate; # o25@nskwprtpqlqbeg.org
# define attributes, or override them on demand
say Mock::Data::Regex->new($regex)->generate($mock, { max_repetition => 50 });
say Mock::Data::Regex->new(regex => $regex, max_repetition => 50)->generate($mock);
# constrain the characters selected
my $any= Mock::Data::Regex->new(qr/.+/);
say $any->generate($mock, { min_codepoint => 0x20, max_codepoint => 0xFFFF });
# surround generated regex-match with un-matched prefix/suffix
say $email->generate($mock, { prefix => q{<a href="mailto:}, suffix => q{">Contact</a>} });
DESCRIPTION
This generator creates strings that match a user-supplied regular expression.
CONSTRUCTOR
new
my $gen= Mock::Data::Regex->new( $regex_ref );
...->new( \%options );
...->new( %options );
The constructor accepts a key/value list of attributes, a hashref of attributes, or a single argument which is assumed to be a regular expression.
Any attribute may be supplied in %options. The regular expression is required, and it is parsed immediately to check whether this module supports it. This module lacks support for several regex features, such as lookaround assertions and backreferences.
ATTRIBUTES
regex
The regular expression this generator is matching. This will always be a regex-ref, even if you gave a string to the constructor.
regex_parse_tree
A data structure describing the regular expression. WARNING: The API of this data structure may change in future versions.
min_codepoint
The minimum codepoint to be considered when processing the regular expression or generating strings from it. You might choose to set this to e.g. 0x20 to avoid generating control characters. This only affects selection from character sets; literal control characters in the pattern will still be returned.
max_codepoint
The maximum codepoint to be considered when processing the regular expression or generating strings from it. Setting this to a low value (like 127 for ASCII) can speed up the algorithm in many cases. This is set to 127 automatically if the "regex" has the /a flag.
max_repetition
max_repetition => '+8',
max_repetition => 10,
Whenever a regex has an unbounded repetition, this determines the upper bound on the random number of repetitions. Set this to a plain number to specify an absolute maximum, or to a string with a leading plus sign ("+$n") to specify a maximum relative to the minimum. The default is "+8".
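The difference between the absolute and relative forms can be sketched in plain Perl. The helper below, resolve_max_repetition, is hypothetical and is not part of this module's API; it only illustrates the arithmetic described above. Clamping an absolute maximum that is smaller than the minimum is an assumption of this sketch, not documented behavior.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mock::Data::Regex): given the minimum
# repetition count of a quantifier and a max_repetition setting, return
# the upper bound on the number of repetitions.
sub resolve_max_repetition {
    my ($min, $setting) = @_;
    # "+N" is relative: at most N repetitions beyond the minimum
    return $min + $1 if $setting =~ /\A\+(\d+)\z/;
    # A plain number is an absolute maximum.
    # Assumption of this sketch: never return less than the minimum.
    return $setting < $min ? $min : $setting if $setting =~ /\A\d+\z/;
    die "Unsupported max_repetition value: $setting\n";
}

# \w+ has a minimum of 1, \w* a minimum of 0, \d{3,} a minimum of 3
for my $case ([1, '+8'], [0, '+8'], [3, 10], [12, 10]) {
    my ($min, $setting) = @$case;
    printf "min=%d, max_repetition=%s => up to %d repetitions\n",
        $min, $setting, resolve_max_repetition($min, $setting);
}
```

With the default of "+8", a pattern such as \w+ would produce between 1 and 9 word characters, while \w* would produce between 0 and 8.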
prefix
->new(regex => qr/foo/, prefix => '_')->generate # returns "_foo"
->new(regex => qr/^foo/, prefix => '_')->generate # returns "foo"
->new(regex => qr/^foo/m, prefix => '_')->generate # returns "_\nfoo"
A generator or template to add to the beginning of the output whenever the regex is not anchored at the start or is multi-line. It will be joined to the output with a "\n" if the regex is multi-line and anchored with '^'.
suffix
->new(regex => qr/foo/, suffix => '_')->generate # returns "foo_"
->new(regex => qr/foo$/, suffix => '_')->generate # returns "foo"
->new(regex => qr/foo$/m, suffix => '_')->generate # returns "foo\n_"
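A generator or template to add to the end of the output whenever the regex is not anchored at the end or is multi-line. It will be joined to the output with a "\n" if the regex is multi-line and anchored with '$'.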
METHODS
generate
my $str= $generator->generate($mockdata, \%options);
Return a string matching the regular expression. The %options may override the following attributes: "min_codepoint", "max_codepoint", "max_repetition", "prefix", "suffix".
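As a rough illustration of per-call overrides, the sketch below builds a generator with one set of attributes, overrides some of them for a single call, and checks the result against the original pattern. It uses only the calls shown in the SYNOPSIS; the pattern, the '<' and '>' delimiters, and the variable names are made up for this example, and the output is random, so none is shown.

```perl
use strict;
use warnings;
use Mock::Data;
use Mock::Data::Regex;

my $mock = Mock::Data->new(generators => { word => qr/\w+/ });

# Attribute set at construction time
my $gen = Mock::Data::Regex->new(
    regex          => qr/[a-z]{3}\d+/a,
    max_repetition => 50,
);

# Override attributes for this call only; the object itself is unchanged
my $str = $gen->generate($mock, {
    max_repetition => 3,
    prefix         => '<',
    suffix         => '>',
});

print "generated: $str\n";

# The pattern is not anchored, so the prefix and suffix are expected to
# surround a string that matches the pattern
print "output has the expected shape\n"
    if $str =~ /\A<[a-z]{3}\d+>\z/;
```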
compile
Return a generator coderef that calls "generate" on this object.
parse
Parse a regular expression, returning a parse tree describing it. This can be called as a class method.
get_charset
If the regular expression is nothing more than a charset (or repetition of one charset), this returns that charset. If the regular expression is more complicated than a simple charset, this returns undef.
SEE ALSO
- String::Random::Regexp::regxstring
  Probably a better implementation, but depends on a C++ compiler.
- String::Random
- Regexp::Genex
AUTHOR
Michael Conrad <mike@nrdvana.net>
VERSION
version 0.04
COPYRIGHT AND LICENSE
This software is copyright (c) 2024 by Michael Conrad.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.