NAME
Tstregex - A hybrid regex diagnostic tool (single-file library module and command-line tool)
Shows the longest regular expression match and highlights the rejected part.
Example:
$ perl lib/Tstregex.pm '/^[a-z]*\d{3}$/' 'abc123' 'abc12a'
abc123
abc12a (^[a-z]*\d{3}$)
# Above, normal text marks the longest matching substring and bold text marks the rejected substring (the same convention applies to the regexp's lexical groups, shown in parentheses)
SYNOPSIS
$ tstregex 'regex' string1 string2 ... stringN
OPTIONS (CLI)
-h --help
Shows this help.
-v --verbose
Shows key information about what matched and what did not.
-d --diag
Triggers the Enriched Diagnostic View. It displays:
- The string with the failing part highlighted.
- The exact token in the regex that caused the break.
- A visual pointer (^--- HERE) aligned with the regex syntax.
- Execution time (useful for spotting ReDoS/exponential backtracking).
-a --assert
Misc: runs a large suite of regexp tests against Tstregex.
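The pointer line and the timing figure shown by --diag can be approximated in a few lines. The sketch below is only an illustration, not the tool's code: `pointer_line` and `timed_match` are hypothetical helper names, and the offset 7 is hard-coded here as the position where `\d{3}` starts in the sample pattern.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: build a "^--- HERE" marker under a given column.
sub pointer_line {
    my ($offset) = @_;
    return (' ' x $offset) . '^--- HERE';
}

# Hypothetical helper: run one match and report how long it took.
# A very slow match on a short string hints at heavy backtracking.
sub timed_match {
    my ($re, $string) = @_;
    my $t0 = time();
    my $ok = ($string =~ $re) ? 1 : 0;
    return ($ok, time() - $t0);
}

my $pattern = '^[a-z]*\d{3}$';
my ($ok, $elapsed) = timed_match(qr/$pattern/, 'abc12a');

print "$pattern\n";
print pointer_line(7), "\n";    # column 7 is where \d{3} begins
printf "matched=%d elapsed=%.6fs\n", $ok, $elapsed;
```

Wall-clock timing of a single match is noisy, so a real measurement would likely repeat the match several times.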
Perl Module SYNOPSIS
use Tstregex;
my $ctx = tstregex_init_desc('/^\d{3}/');
tstregex($ctx, '12a');    # runs the diagnostic; results are stored in $ctx
if (!tstregex_is_full_match($ctx))
{
    my $token = tstregex_get_fail_token($ctx);
    my $pos = tstregex_get_match_len($ctx);
    print "Failure on token '$token' at column $pos\n";
}
API
tstregex_init_desc($raw_re)
Pre-parses the regex, handles delimiters (m!!, //, etc.), extracts modifiers (i, s, m, x), and prepares the nibbling steps. Returns a context hash.
tstregex($ctx, $string)
Executes the diagnostic. Updates the context.
tstregex_is_full_match
Returns whether the input string fully matched (boolean: 1 or 0).
tstregex_get_match_portion
Returns the matching portion in case of a full match (it may be shorter than the input string, depending on anchors).
tstregex_get_match_len
Returns the matching substring length
tstregex_get_fail_token
Returns the failing token in the regexp
tstregex_get_re_clean
Returns the matching regexp subpart
tstregex_get_re_raw
Returns the internal representation of the regexp
tstregex_get_prefix_offset
Returns the offset of the original regexp in the raw regexp
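To make the delimiter and modifier handling described for tstregex_init_desc more concrete, here is a minimal sketch of how a raw regex string could be split. It is not the module's implementation: `split_raw_regex` and the keys `pattern`, `modifiers` and `offset` are hypothetical names, and the sketch assumes that only a bare `/.../` or an `m` followed by a punctuation delimiter counts as delimited input.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Tstregex): split a raw regex literal
# such as '/^\d{3}/i' or 'm!foo!x' into pattern, modifiers, and offset.
sub split_raw_regex {
    my ($raw) = @_;
    my %closing = ('(' => ')', '[' => ']', '{' => '}', '<' => '>');

    # Accept a bare /.../ or an m followed by any punctuation delimiter.
    if ($raw =~ m{\A (?: m ([^\w\s]) | (/) )}x) {
        my $open  = defined $1 ? $1 : $2;
        my $close = exists $closing{$open} ? $closing{$open} : $open;
        my $start = $+[0];                  # offset just past the opening delimiter
        my $end   = rindex($raw, $close);   # position of the last closing delimiter
        if ($end >= $start) {
            my $mods = substr($raw, $end + 1);
            if ($mods =~ /\A[imsx]*\z/) {
                return {
                    pattern   => substr($raw, $start, $end - $start),
                    modifiers => $mods,
                    offset    => $start,
                };
            }
        }
    }
    # No recognizable delimiters: treat the whole input as the pattern.
    return { pattern => $raw, modifiers => '', offset => 0 };
}

my $d = split_raw_regex('m!^\d{3}!i');
printf "pattern=%s modifiers=%s offset=%d\n", @{$d}{qw(pattern modifiers offset)};
# prints: pattern=^\d{3} modifiers=i offset=2
```

The `offset` value plays a role similar to what tstregex_get_prefix_offset is documented to return, although the module may compute it differently.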
DESCRIPTION
tstregex is designed to solve the "Black Box" problem of Regular Expressions. When a complex regex fails, Perl usually just says "No Match". This tool identifies exactly where and why it failed by finding the longest possible partial match.
EXAMPLE
$ perl lib/Tstregex.pm '/^[a-z]*\d{3}$/' 'abc123' 'abc12a'
abc123
abcB<12a> (B<^[a-z]*>\d{3}$)
The tool highlights the part of the string where the match failed. In the transcript above, B<...> marks the text that is printed in bold.
The "Nibbling" Engine
The diagnostic logic uses a "Nibbling" (grignotage) strategy:
1. Decomposition
The engine breaks down your regex into a hierarchy of valid sub-patterns (lexical groups, atoms, and quantifiers) from longest to shortest.
2. Longest Match Search
It iteratively tests these sub-patterns against the input string. The question is not just whether the start matches, but what is the longest sequence of instructions the engine could follow before hitting a wall.
3. Failure Point Identification
Once the longest matching sub-pattern is found, the tool identifies the very next token in your regex syntax. This is your "Point of Failure".
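The three steps above can be sketched in plain Perl. This is a simplified illustration under stated assumptions, not the engine shipped in Tstregex: `tokenize` and `nibble` are hypothetical names, the tokenizer only knows escapes, character classes, balanced groups and basic quantifiers, and top-level alternation (`|`) is not handled.

```perl
use strict;
use warnings;

# One token = one atom plus its optional quantifier.
my $TOKEN = qr/
    (?: \\ .                                     # escaped char such as \d
      | \[ \^? \]? (?: \\ . | [^\]\\] )* \]      # character class
      | ( \( (?: \\ . | [^()\\] | (?-1) )* \) )  # balanced group, recursive
      | .                                        # any other single char
    )
    (?: [*+?] | \{ \d+ (?: , \d* )? \} )?        # optional quantifier
    [?+]?                                        # optional lazy or possessive mark
/xs;

# Step 1: decompose the pattern into tokens.
sub tokenize {
    my ($pattern) = @_;
    my @tokens;
    while ($pattern =~ /$TOKEN/g) {
        push @tokens, substr($pattern, $-[0], $+[0] - $-[0]);
    }
    return @tokens;
}

# Steps 2 and 3: try token prefixes from longest to shortest; the token
# right after the longest matching prefix is the point of failure.
sub nibble {
    my ($pattern, $string) = @_;
    my @tokens = tokenize($pattern);
    for (my $n = @tokens; $n > 0; $n--) {
        my $prefix = join '', @tokens[0 .. $n - 1];
        my $re = eval { qr/$prefix/ };    # skip prefixes that do not compile
        next unless defined $re;
        if ($string =~ $re) {
            return {
                full       => ($n == @tokens ? 1 : 0),
                matched    => substr($string, $-[0], $+[0] - $-[0]),
                match_end  => $+[0],
                fail_token => $tokens[$n],    # undef on a full match
            };
        }
    }
    return { full => 0, matched => '', match_end => 0, fail_token => $tokens[0] };
}

for my $s ('abc123', 'abc12a') {
    my $r = nibble('^[a-z]*\d{3}$', $s);
    if ($r->{full}) { print "$s: full match\n" }
    else { print "$s: matched '$r->{matched}', failed on token '$r->{fail_token}'\n" }
}
# prints:
# abc123: full match
# abc12a: matched 'abc', failed on token '\d{3}'
```

Dropping whole tokens rather than single characters keeps an atom and its quantifier together, so the reported failure is `\d{3}` rather than a dangling `{3}`.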
AUTHOR
Olivier Delouya - 2026
LICENSE
Artistic Version 2