NAME

Tstregex - A hybrid regex diagnostic tool (single-file library module and command-line tool) that shows the longest regular expression match and highlights the rejected part.

Example:

$ perl lib/Tstregex.pm '/^[a-z]*\d{3}$/' 'abc123' 'abc12a'
abc123
abc12a (^[a-z]*\d{3}$)

# Above, the normal parts are the longest matching substring, while the bold parts highlight the rejected substring (likewise for the regexp lexical groups between parentheses)

SYNOPSIS

$ tstregex 'regex' string1 string2 ... stringN

OPTIONS (CLI)

-h --help

Shows this help.

-v --verbose

Shows key information about the matching and non-matching parts.

-d --diag

Triggers the Enriched Diagnostic View. It displays:

- The string with the failing part highlighted.
- The exact token in the regex that caused the break.
- A visual pointer (^--- HERE) aligned with the regex syntax.
- Execution time (useful for spotting ReDoS/exponential backtracking).
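As an illustration of the last two items, here is a minimal sketch of how a pointer line and a timing measurement could be produced. It is not the tool's own code: the helper names pointer_line and timed_match are hypothetical, the failing offset is assumed to be known already, and the printed layout is only a guess at what such a view might look like.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: build a "^--- HERE" line pointing at column $col.
sub pointer_line {
    my ($col) = @_;
    return (' ' x $col) . '^--- HERE';
}

# Hypothetical helper: time a single match attempt, in seconds.
sub timed_match {
    my ($re, $string) = @_;
    my $t0 = time;
    my $ok = $string =~ $re ? 1 : 0;
    return ($ok, time - $t0);
}

my $regex       = '^[a-z]*\d{3}$';
my $fail_offset = index($regex, '\d{3}');    # assumed to be known already
my ($ok, $elapsed) = timed_match(qr/$regex/, 'abc12a');

# Print the regex, the pointer under the failing token, then the timing.
print "$regex\n", pointer_line($fail_offset), "\n";
printf "match: %s, elapsed: %.6f s\n", ($ok ? 'yes' : 'no'), $elapsed;
```

A match that takes unexpectedly long on a short input is the usual symptom of exponential backtracking, which is why timing each attempt is informative.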

-a --assert

Misc: runs Tstregex against a large built-in collection of regexp tests.

Perl Module SYNOPSIS

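use strict;
use warnings;
use Tstregex;

# Build the context once, then run the diagnostic on an input string.
my $ctx = tstregex_init_desc('/^\d{3}/');
tstregex($ctx, '12a');    # updates $ctx with the result

if (!tstregex_is_full_match($ctx))
    {
    my $token = tstregex_get_fail_token($ctx);    # regex token that failed
    my $pos   = tstregex_get_match_len($ctx);     # length of the partial match
    print "Failure on token '$token' at column $pos\n";
    }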

API

tstregex_init_desc($raw_re)

Pre-parses the regex, handles delimiters (m!!, //, etc.), extracts modifiers (i, s, m, x), and prepares the nibbling steps. Returns a context hash.
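To make the delimiter and modifier handling concrete, here is a minimal sketch under simplifying assumptions: the input is either /pattern/mods or m<delim>pattern<delim>mods, and anything else is treated as a bare pattern. The helper name split_raw_regex and the hash keys it returns are hypothetical; they are not the module's API, and the real context hash may look different.

```perl
use strict;
use warnings;

# Closing delimiter for each bracket-style opening delimiter.
my %CLOSER = ('(' => ')', '[' => ']', '{' => '}', '<' => '>');

# Hypothetical helper (not part of Tstregex): split a raw regex such as
# '/^\d{3}/i' or 'm!^a+!x' into pattern, modifiers and prefix offset.
sub split_raw_regex {
    my ($raw) = @_;
    my %bare = (pattern => $raw, modifiers => '', prefix_offset => 0);

    # Either "m" followed by any punctuation delimiter, or a plain "/".
    $raw =~ m{\A (?: (m) ([^\w\s]) | (/) ) }x or return \%bare;
    my $open  = defined $1 ? $2 : $3;
    my $start = defined $1 ? 2  : 1;      # where the pattern begins
    my $close = $CLOSER{$open} // $open;

    # The pattern runs up to the last closing delimiter; the trailing
    # lowercase letters are the modifiers.
    my $rest = substr($raw, $start);
    $rest =~ /\A (.*) \Q$close\E ([a-z]*) \z/xs or return \%bare;
    return { pattern => $1, modifiers => $2, prefix_offset => $start };
}

my $desc = split_raw_regex('m!^a+!x');
print "pattern=$desc->{pattern} modifiers=$desc->{modifiers}\n";
```

The prefix offset recorded here is the kind of value that lets a diagnostic map a position in the cleaned pattern back to a column in what the user typed.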

tstregex($ctx, $string)

Executes the diagnostic. Updates the context.

tstregex_is_full_match

Returns the match status of the input string as a boolean (0 or 1).

tstregex_get_match_portion

Returns the matching portion in case of a full match (it may be shorter than the input string, depending on the anchors).

tstregex_get_match_len

Returns the length of the matching substring.

tstregex_get_fail_token

Returns the failing token in the regexp

tstregex_get_re_clean

Returns the matching regexp subpart

tstregex_get_re_raw

Returns the internal representation of the regexp

tstregex_get_prefix_offset

Returns the offset of the original regexp in the raw regexp

DESCRIPTION

tstregex is designed to solve the "Black Box" problem of regular expressions. When a complex regex fails, Perl only reports that there was no match, without saying where or why. This tool identifies exactly where and why the match failed by finding the longest possible partial match.

EXAMPLE

$ perl lib/Tstregex.pm '/^[a-z]*\d{3}$/' 'abc123' 'abc12a'
abc123
abcB<12a> (B<^[a-z]*>\d{3}$)

The tool highlights the part of the string where the match failed: here 12a, written as B<12a> above, where B<...> stands for bold text.
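A minimal sketch of this kind of highlighting, assuming the length of the partial match is already known. The helper name highlight_rejected is hypothetical, and the core module Term::ANSIColor is used here only for illustration; the tool may produce its bold output differently.

```perl
use strict;
use warnings;
use Term::ANSIColor qw(colored);

# Hypothetical helper: keep the first $match_len characters plain and
# render the rejected remainder in bold.
sub highlight_rejected {
    my ($string, $match_len) = @_;
    my $ok   = substr($string, 0, $match_len);
    my $rest = substr($string, $match_len);
    return $rest eq '' ? $ok : $ok . colored($rest, 'bold');
}

# "abc" is accepted by ^[a-z]*, so only "12a" is emphasized.
print highlight_rejected('abc12a', 3), "\n";
```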

The "Nibbling" Engine

The diagnostic logic uses a "Nibbling" (grignotage) strategy:

1. Decomposition

The engine breaks down your regex into a hierarchy of valid sub-patterns (lexical groups, atoms, and quantifiers) from longest to shortest.

2. Iterative Testing

The engine iteratively tests these sub-patterns against the input string. It does not merely check whether the start of the string matches; it looks for the longest sequence of regex instructions that the engine could follow before hitting a wall.

3. Failure Point Identification

Once the longest matching sub-pattern is found, the tool identifies the very next token in your regex syntax. This is your "Point of Failure".
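The three steps above can be sketched as follows. This is a simplified illustration and not the module's actual implementation: the regex is assumed to be already split into a flat list of tokens (a real decomposition must also cope with groups, alternation and escapes), and the helper name longest_prefix_match and its result keys are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: given the regex tokens in order, find the longest
# prefix of tokens that still matches the start of $string.
sub longest_prefix_match {
    my ($tokens, $string) = @_;

    # Try candidate sub-patterns from longest to shortest.
    for (my $n = @$tokens; $n > 0; $n--) {
        my $sub_re = join '', @{$tokens}[0 .. $n - 1];

        # Skip candidates that are not valid regexes on their own.
        my $re = eval { qr/\A(?:$sub_re)/ } or next;

        if ($string =~ $re) {
            return {
                matched_re => $sub_re,
                match_len  => $+[0],           # end offset of the match
                fail_token => $tokens->[$n],   # undef on a full match
            };
        }
    }
    # Nothing matched: the very first token is the point of failure.
    return { matched_re => '', match_len => 0, fail_token => $tokens->[0] };
}

my $res = longest_prefix_match(['^', '[a-z]*', '\d{3}', '$'], 'abc12a');
printf "matched %d chars, failed on token '%s'\n",
    $res->{match_len}, $res->{fail_token};
# prints: matched 3 chars, failed on token '\d{3}'
```

Testing from the longest candidate down means the first success is also the longest matching sub-pattern, so the token that follows it is the point of failure.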

AUTHOR

Olivier Delouya - 2026

LICENSE

Artistic License 2.0
