Name
Data::NFA - Non deterministic finite state machine from regular expression.
Synopsis
Create a non deterministic finite state machine from a regular expression which can then be converted into a deterministic finite state machine by Data::DFA and used to parse sequences of symbols.
For example, the regular expression:
((a|b)*)**4
produces the following machine:
use Data::NFA qw(:all);
use Data::Table::Text qw(:all);
use Test::More qw(no_plan);
my $N = 4;
my $s = q(zeroOrMore(choice(element("a"), element("b"))));
my $nfa = eval qq(fromExpr(($s)x$N));
ok $nfa->printNws("((a|b)*)**$N: ") eq nws <<END;
((a|b)*)**4:
Location F Transitions Jumps
0 1 { a => 1 } [2, 4, 6, 8, 10, 12, 14, 16]
1 1 undef [0, 2, 3, 4, 6, 8, 10, 12, 14, 16]
2 0 { b => 3 } undef
3 1 undef [0, 2, 4, 6, 8, 10, 12, 14, 16]
4 1 { a => 5 } [6, 8, 10, 12, 14, 16]
5 1 undef [4, 6, 7, 8, 10, 12, 14, 16]
6 0 { b => 7 } undef
7 1 undef [4, 6, 8, 10, 12, 14, 16]
8 1 { a => 9 } [10, 12, 14, 16]
9 1 undef [8, 10, 11, 12, 14, 16]
10 0 { b => 11 } undef
11 1 undef [8, 10, 12, 14, 16]
12 1 { a => 13 } [14, 16]
13 1 undef [12, 14, 15, 16]
14 0 { b => 15 } undef
15 1 undef [12, 14, 16]
16 1 undef undef
END
Description
Non deterministic finite state machine from regular expression.
Version "20181027".
The following sections describe the methods in each functional area of this module. For an alphabetic listing of all methods by name see Index.
Construct regular expression
Construct a regular expression that defines the language to be parsed using the following combining operations which can all be imported:
element($)
One element. An element can also be represented by a string or number
Parameter Description
1 $label Transition symbol
Example:
if (1) {
my $nfa = fromExpr(𝗲𝗹𝗲𝗺𝗲𝗻𝘁("a"));
ok $nfa->print("Element: a") eq <<END;
Element: a
Location F Transitions
0 0 { a => 1 }
1 1 undef
END
ok $nfa->isFinal(1);
ok !$nfa->isFinal(0);
ok $nfa->parse(qw(a));
ok !$nfa->parse(qw(a b));
ok !$nfa->parse(qw(b));
ok !$nfa->parse(qw(b a));
}
This is a static method and so should be invoked as:
Data::NFA::element
sequence(@)
Sequence of elements and/or symbols.
Parameter Description
1 @elements Elements
Example:
if (1) {
my $nfa = fromExpr(𝘀𝗲𝗾𝘂𝗲𝗻𝗰𝗲(element("a"), element("b")));
ok $nfa->print("Sequence: ab") eq <<END;
Sequence: ab
Location F Transitions
0 0 { a => 1 }
1 0 { b => 2 }
2 1 undef
END
ok $nfa->parse(qw(a b));
ok !$nfa->parse(qw(b a));
ok !$nfa->parse(qw(a));
ok !$nfa->parse(qw(b));
}
This is a static method and so should be invoked as:
Data::NFA::sequence
optional(@)
An optional sequence of elements and/or symbols.
Parameter Description
1 @element Elements
Example:
if (1) {
my $nfa = fromExpr(element("a"), 𝗼𝗽𝘁𝗶𝗼𝗻𝗮𝗹(element("b")), element("c"));
ok $nfa->print("Optional: ab?c") eq <<END;
Optional: ab?c
Location F Transitions Jumps
0 0 { a => 1 } undef
1 0 { b => 2 } [2]
2 0 { c => 3 } undef
3 1 undef undef
END
ok $nfa->parse(qw(a b c));
ok $nfa->parse(qw(a c));
ok !$nfa->parse(qw(a c b));
}
This is a static method and so should be invoked as:
Data::NFA::optional
zeroOrMore(@)
Zero or more repetitions of a sequence of elements and/or symbols.
Parameter Description
1 @element Elements
Example:
if (1) {
my $nfa = fromExpr(element("a"), 𝘇𝗲𝗿𝗼𝗢𝗿𝗠𝗼𝗿𝗲(element("b")), element("c"));
ok $nfa->print("Zero Or More: ab*c") eq <<END;
Zero Or More: ab*c
Location F Transitions Jumps
0 0 { a => 1 } undef
1 0 { b => 2 } [3]
2 0 undef [1, 3]
3 0 { c => 4 } undef
4 1 undef undef
END
ok $nfa->parse(qw(a c));
ok $nfa->parse(qw(a b c));
ok $nfa->parse(qw(a b b c));
ok !$nfa->parse(qw(a b b d));
}
if (1) {
my $nfa = fromExpr(element("a"),
𝘇𝗲𝗿𝗼𝗢𝗿𝗠𝗼𝗿𝗲(choice(element("a"),
element("a"))),
element("a"));
ok $nfa->print("aChoice: (a(a|a)*a") eq <<END;
aChoice: (a(a|a)*a
Location F Transitions Jumps
0 0 { a => 1 } undef
1 0 { a => 2 } [3, 5]
2 0 undef [1, 3, 4, 5]
3 0 { a => 4 } undef
4 0 undef [1, 3, 5]
5 0 { a => 6 } undef
6 1 undef undef
END
is_deeply [1 .. 6], $nfa->statesReachableViaSymbol(1, "a");
is_deeply [1 .. 6], $nfa->statesReachableViaSymbol(2, "a");
is_deeply [1, 3, 4, 5], $nfa->statesReachableViaSymbol(3, "a");
ok !$nfa->parse(qw(a));
ok $nfa->parse(qw(a a));
ok $nfa->parse(qw(a a a));
ok !$nfa->parse(qw(a b a));
}
This is a static method and so should be invoked as:
Data::NFA::zeroOrMore
oneOrMore(@)
One or more repetitions of a sequence of elements and/or symbols.
Parameter Description
1 @element Elements
Example:
if (1) {
my $nfa = fromExpr(element("a"), 𝗼𝗻𝗲𝗢𝗿𝗠𝗼𝗿𝗲(element("b")), element("c"));
ok $nfa->print("One or More: ab+c") eq <<END;
One or More: ab+c
Location F Transitions Jumps
0 0 { a => 1 } undef
1 0 { b => 2 } undef
2 0 undef [1, 3]
3 0 { c => 4 } undef
4 1 undef undef
END
is_deeply [], $nfa->statesReachableViaSymbol(2,"a");
is_deeply [1..3], $nfa->statesReachableViaSymbol(2,"b");
is_deeply [4], $nfa->statesReachableViaSymbol(2,"c");
ok !$nfa->parse(qw(a c));
ok $nfa->parse(qw(a b c));
ok $nfa->parse(qw(a b b c));
ok !$nfa->parse(qw(a b b d));
}
This is a static method and so should be invoked as:
Data::NFA::oneOrMore
choice(@)
Choice from amongst one or more elements and/or symbols.
Parameter Description
1 @elements Elements to be chosen from
Example:
if (1) {
my $nfa = fromExpr(element("a"),
𝗰𝗵𝗼𝗶𝗰𝗲(element("b"), element("c")),
element("d"));
ok $nfa->print("Choice: (a(b|c)d") eq <<END;
Choice: (a(b|c)d
Location F Transitions Jumps
0 0 { a => 1 } undef
1 0 { b => 2 } [3]
2 0 undef [4]
3 0 { c => 4 } undef
4 0 { d => 5 } undef
5 1 undef undef
END
is_deeply [], $nfa->statesReachableViaSymbol(1, "a");
is_deeply [2, 4], $nfa->statesReachableViaSymbol(1, "b");
is_deeply [4], $nfa->statesReachableViaSymbol(1, "c");
is_deeply ['a'..'d'], [$nfa->symbols];
ok $nfa->parse(qw(a b d));
ok $nfa->parse(qw(a c d));
ok !$nfa->parse(qw(a b c d));
}
This is a static method and so should be invoked as:
Data::NFA::choice
except(@)
Choice from amongst all symbols except the ones mentioned
Parameter Description
1 @elements Elements not to be chosen from
Example:
if (1) {
my $nfa = fromExpr(choice(qw(a b c)), 𝗲𝘅𝗰𝗲𝗽𝘁(qw(c x)), choice(qw(a b c)));
ok $nfa->print("(a|b|c)(c!x)(a|b|c):") eq <<END;
(a|b|c)(c!x)(a|b|c):
Location F Transitions Jumps
0 0 { a => 1 } [2, 4]
1 0 undef [5, 7]
2 0 { b => 3 } undef
3 0 undef [5, 7]
4 0 { c => 5 } undef
5 0 { a => 6 } [7]
6 0 undef [8, 10, 12]
7 0 { b => 8 } undef
8 0 { a => 9 } [10, 12]
9 1 undef [13]
10 0 { b => 11 } undef
11 1 undef [13]
12 0 { c => 13 } undef
13 1 undef undef
END
ok !$nfa->parse(qw(a a));
ok $nfa->parse(qw(a a a));
ok !$nfa->parse(qw(a c a));
}
This is a static method and so should be invoked as:
Data::NFA::except
Non Deterministic finite state machine
fromExpr(@)
Create an NFA from a regular expression.
Parameter Description
1 @expr Expressions
Example:
if (1) {
my $nfa = 𝗳𝗿𝗼𝗺𝗘𝘅𝗽𝗿
(element("a"),
oneOrMore(choice(qw(b c))),
optional(element("d")),
element("e")
);
ok $nfa->print("a(b|c)+d?e :") eq <<END;
a(b|c)+d?e :
Location F Transitions Jumps
0 0 { a => 1 } undef
1 0 { b => 2 } [3]
2 0 undef [1, 3 .. 6]
3 0 { c => 4 } undef
4 0 undef [1, 3, 5, 6]
5 0 { d => 6 } [6]
6 0 { e => 7 } undef
7 1 undef undef
END
is_deeply ['a'..'e'], [$nfa->symbols];
ok !$nfa->parse(qw(a e));
ok !$nfa->parse(qw(a d e));
ok $nfa->parse(qw(a b c e));
ok $nfa->parse(qw(a b c d e));
}
This is a static method and so should be invoked as:
Data::NFA::fromExpr
print($$)
Print the current state of the finite automaton. If it is non deterministic, the non deterministic jumps will be shown as well as the transitions table. If deterministic, only the transitions table will be shown.
Parameter Description
1 $states States
2 $title Title
Example:
if (1) {
my $nfa = fromExpr
(element("a"),
oneOrMore(choice(qw(b c))),
optional(element("d")),
element("e")
);
ok $nfa->𝗽𝗿𝗶𝗻𝘁("a(b|c)+d?e :") eq <<END;
a(b|c)+d?e :
Location F Transitions Jumps
0 0 { a => 1 } undef
1 0 { b => 2 } [3]
2 0 undef [1, 3 .. 6]
3 0 { c => 4 } undef
4 0 undef [1, 3, 5, 6]
5 0 { d => 6 } [6]
6 0 { e => 7 } undef
7 1 undef undef
END
is_deeply ['a'..'e'], [$nfa->symbols];
ok !$nfa->parse(qw(a e));
ok !$nfa->parse(qw(a d e));
ok $nfa->parse(qw(a b c e));
ok $nfa->parse(qw(a b c d e));
}
symbols($)
Return an array of all the transition symbols.
Parameter Description
1 $states States
Example:
if (1) {
my $nfa = fromExpr
(element("a"),
oneOrMore(choice(qw(b c))),
optional(element("d")),
element("e")
);
ok $nfa->print("a(b|c)+d?e :") eq <<END;
a(b|c)+d?e :
Location F Transitions Jumps
0 0 { a => 1 } undef
1 0 { b => 2 } [3]
2 0 undef [1, 3 .. 6]
3 0 { c => 4 } undef
4 0 undef [1, 3, 5, 6]
5 0 { d => 6 } [6]
6 0 { e => 7 } undef
7 1 undef undef
END
is_deeply ['a'..'e'], [$nfa->𝘀𝘆𝗺𝗯𝗼𝗹𝘀];
ok !$nfa->parse(qw(a e));
ok !$nfa->parse(qw(a d e));
ok $nfa->parse(qw(a b c e));
ok $nfa->parse(qw(a b c d e));
}
isFinal($$)
Whether this is a final state or not
Parameter Description
1 $states States
2 $stateName Name of state
Example:
if (1) {
my $nfa = fromExpr(element("a"));
ok $nfa->print("Element: a") eq <<END;
Element: a
Location F Transitions
0 0 { a => 1 }
1 1 undef
END
ok $nfa->𝗶𝘀𝗙𝗶𝗻𝗮𝗹(1);
ok !$nfa->𝗶𝘀𝗙𝗶𝗻𝗮𝗹(0);
ok $nfa->parse(qw(a));
ok !$nfa->parse(qw(a b));
ok !$nfa->parse(qw(b));
ok !$nfa->parse(qw(b a));
}
allTransitions($)
Return all transitions in the NFA as {stateName}{symbol} = [reachable states]
Parameter Description
1 $states States
Example:
if (1) {
my $s = q(zeroOrMore(choice(element("a"))));
my $nfa = eval qq(fromExpr(sequence($s,$s)));
ok $nfa->print("a*a* 1:") eq <<END;
a*a* 1:
Location F Transitions Jumps
0 1 { a => 1 } [2, 4]
1 1 undef [0, 2, 4]
2 1 { a => 3 } [4]
3 1 undef [2, 4]
4 1 undef undef
END
is_deeply [0 .. 4], $nfa->statesReachableViaSymbol(0, q(a));
is_deeply [0 .. 4], $nfa->statesReachableViaSymbol(1, q(a));
is_deeply [2, 3, 4], $nfa->statesReachableViaSymbol(2, q(a));
is_deeply [2, 3, 4], $nfa->statesReachableViaSymbol(3, q(a));
is_deeply $nfa->𝗮𝗹𝗹𝗧𝗿𝗮𝗻𝘀𝗶𝘁𝗶𝗼𝗻𝘀, {
"0" => { a => [0 .. 4] },
"1" => { a => [0 .. 4] },
"2" => { a => [2, 3, 4] },
"3" => { a => [2, 3, 4] },
"4" => { a => [] },
};
ok $nfa->print("a*a* 2:") eq <<END;
a*a* 2:
Location F Transitions Jumps
0 1 { a => 1 } [2, 4]
1 1 undef [0, 2, 4]
2 1 { a => 3 } [4]
3 1 undef [2, 4]
4 1 undef undef
END
}
parse($@)
Parse an array of symbols
Parameter Description
1 $states States
2 @symbols Array of symbols
Example:
if (1) {
my $nfa = fromExpr(element("a"));
ok $nfa->print("Element: a") eq <<END;
Element: a
Location F Transitions
0 0 { a => 1 }
1 1 undef
END
ok $nfa->isFinal(1);
ok !$nfa->isFinal(0);
ok $nfa->𝗽𝗮𝗿𝘀𝗲(qw(a));
ok !$nfa->𝗽𝗮𝗿𝘀𝗲(qw(a b));
ok !$nfa->𝗽𝗮𝗿𝘀𝗲(qw(b));
ok !$nfa->𝗽𝗮𝗿𝘀𝗲(qw(b a));
}
Data::NFA::State Definition
NFA State
Output fields
final - Whether this state is final
jumps - {jump => 1} : jumps from this state not consuming any input symbols
return - Return point
transitions - {symbol => state} : transitions from this state consuming one input symbol
Private Methods
newState(%)
Create a new NFA state
Parameter Description
1 %options NFA state as hash
fromExpr2($$$)
Create an NFA from a regular expression.
Parameter Description
1 $states States
2 $expr Regular expression constructed from L<element|/element> L<sequence|/sequence> L<optional|/optional> L<zeroOrMore|/zeroOrMore> L<oneOrMore|/oneOrMore> L<choice|/choice>
3 $symbols Set of symbols used by the NFA.
propagateFinalState($)
Mark the states that can reach the final state with a jump as final
Parameter Description
1 $states States
statesReachableViaJumps($$)
Find the names of all the states that can be reached from a specified state via jumps alone
Parameter Description
1 $states States
2 $StateName Name of start state
printWithJumps($$$)
Print the current state of an NFA with jumps.
Parameter Description
1 $states States
2 $title Title
3 $print Print to STDERR if 2 or to STDOUT if 1
printWithOutJumps($$$)
Print the current state of an NFA without jumps
Parameter Description
1 $states States
2 $title Title
3 $print Print to STDERR if 2 or to STDOUT if 1
statesReachableViaSymbol($$$$)
Find the names of all the states that can be reached from a specified state via a specified symbol and all the jumps available.
Parameter Description
1 $states States
2 $StateName Name of start state
3 $symbol Symbol to reach on
4 $cache A hash to be used as a cache
parse2($$@)
Parse an array of symbols
Parameter Description
1 $states States
2 $stateName Current state
3 @symbols Remaining symbols
Index
1 allTransitions - Return all transitions in the NFA as {stateName}{symbol} = [reachable states]
2 choice - Choice from amongst one or more elements and/or symbols.
3 element - One element.
4 except - Choice from amongst all symbols except the ones mentioned
5 fromExpr - Create an NFA from a regular expression.
6 fromExpr2 - Create an NFA from a regular expression.
7 isFinal - Whether this is a final state or not
8 newState - Create a new NFA state
9 oneOrMore - One or more repetitions of a sequence of elements and/or symbols.
10 optional - An optional sequence of elements and/or symbols.
11 parse - Parse an array of symbols
12 parse2 - Parse an array of symbols
13 print - Print the current state of the finite automaton.
14 printWithJumps - Print the current state of an NFA with jumps.
15 printWithOutJumps - Print the current state of an NFA without jumps
16 propagateFinalState - Mark the states that can reach the final state with a jump as final
17 sequence - Sequence of elements and/or symbols.
18 statesReachableViaJumps - Find the names of all the states that can be reached from a specified state via jumps alone
19 statesReachableViaSymbol - Find the names of all the states that can be reached from a specified state via a specified symbol and all the jumps available.
20 symbols - Return an array of all the transition symbols.
21 zeroOrMore - Zero or more repetitions of a sequence of elements and/or symbols.
Installation
This module is written in 100% Pure Perl and, thus, it is easy to read, comprehend, use, modify and install via cpan:
sudo cpan install Data::NFA
Author
Copyright
Copyright (c) 2016-2019 Philip R Brenan.
This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.