NAME

Wraith - Parser Combinator in Perl

SYNOPSIS

use Wraith qw ( $succeed $many $token );

my ($E, $Etail, $T, $Ttail, $F, $num);
Wraith_rule->makerules(\$E, \$Etail, \$T, \$Ttail, \$F, \$num);

$E = ((\$T) >> (\$Etail)) ** sub { my ($tval, $etval) = @{$_[0]}; if ($etval) { return [ $etval->($tval) ]; } else { return [ $tval ] } };
$Etail = (
            ( $token->('\+') >> (\$T) >> (\$Etail) ) **
            sub { 
                my ($discard, $tval, $etval) = @{$_[0]};
                if ($etval) {
                    # Feed the running total into the tail so + and - stay left-associative.
                    return [ sub { $etval->($_[0] + $tval) } ];
                } else {
                    return [ sub { $_[0] + $tval } ];
                }
            }
         ) |
         (
            ( $token->('-') >> (\$T) >> (\$Etail) ) **
            sub {
                my ($discard, $tval, $etval) = @{$_[0]};
                if ($etval) {
                    return [ sub { $etval->($_[0] - $tval) } ];
                } else {
                    return [ sub { $_[0] - $tval } ];
                }
            }
         ) |
         $succeed->( [] );
$T = ((\$F) >> (\$Ttail)) ** sub { my ($fval, $ttval) = @{$_[0]}; if ($ttval) { return [ $ttval->($fval) ]; } else { return [ $fval ] } };
$Ttail = (
            ( $token->('\*') >> (\$F) >> (\$Ttail) ) **
            sub { 
                my ($discard, $fval, $ttval) = @{$_[0]};
                if ($ttval) {
                    return [ sub { $ttval->($_[0] * $fval) } ];
                } else {
                    return [ sub { $_[0] * $fval } ];
                }
            }
         ) |
         (
            ( $token->('\/') >> (\$F) >> (\$Ttail) ) **
            sub { 
                my ($discard, $fval, $ttval) = @{$_[0]};
                if ($ttval) {
                    return [ sub { 
                                 if ($fval) { 
                                     return $ttval->($_[0] / $fval);
                                 } else { 
                                     return $_[0]; 
                                 } 
                             } 
                           ];
                } else {
                    return [ sub { if ($fval) { return $_[0] / $fval; } else { return $_[0] } } ];
                }
            }
         ) |
         $succeed->( [] );
$F = ( ( $token->('\(') >> (\$E) >> $token->('\)') ) ** sub { [ $_[0]->[1] ] } ) |
     (\$num);
$num = $token->('[1-9][0-9]*');

print $E->('1 + 13 / 2 * 3 + 2 * (2 + 3)')->[0]->[0]->[0], "\n";
print $E->('2 * 3')->[0]->[0]->[0], "\n";

DESCRIPTION

Wraith is a simple parser combinator library (neither monadic nor memoized) inspired by Boost.Spirit. It is not as complete as Spirit, but the fundamental operators are implemented.

When applied to arguments, every operator and combinator returns a function. That function takes the input string and returns a reference to a list of pairs: [ $pair_1, $pair_2, ..., $pair_n ]. Each pair is a reference to a two-element list, [ ref_to_list_of_results, input_unprocessed ], where ref_to_list_of_results is a reference to a list of analysis results and input_unprocessed is the part of the input string that has not been consumed yet.
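
To make the shape of this return value concrete, the sketch below builds one by hand in plain Perl and walks it. It does not call Wraith; the sample data and the helper complete_results are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

# Hand-built value in the documented shape: a reference to a list of pairs,
# each pair being [ ref_to_list_of_results, input_unprocessed ].
my $parses = [
    [ [ 'a', 'b' ], 'cd'  ],   # consumed "ab", "cd" is left over
    [ [ 'a' ],      'bcd' ],   # consumed only "a"
];

# Hypothetical helper: return the result list of the first parse that
# consumed the whole input, or nothing if no parse did.
sub complete_results {
    my ($pairs) = @_;
    for my $pair (@$pairs) {
        my ($results, $rest) = @$pair;
        return $results if $rest eq '';
    }
    return;
}

# First pair, its result list, first result in that list.
my $first_result = $parses->[0]->[0]->[0];
```

The SYNOPSIS reads its answer the same way: ->[0]->[0]->[0] selects the first pair, then its result list, then the first result in that list.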

Basic Operators:

reference $succeed

A curried version of the succeed operator. The first argument is the analysis result, given as a reference to a list of results (as in $succeed->( [] ) in the SYNOPSIS). The function it returns takes the second argument, the unprocessed input string.

reference $fail

It takes an argument, discards it, and returns an empty list.

These two operators are rarely used directly. Use them when you need to build new combinators or to match the empty string.
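
The behaviour described for these two operators can be modelled in a few lines of plain Perl. The stand-ins below ($succeed_sketch and $fail_sketch, both hypothetical names) are not Wraith's source; they only assume the return shape described at the top of DESCRIPTION.

```perl
use strict;
use warnings;

# Stand-ins for illustration only; not Wraith's actual implementation.

# succeed: given a result list, build a parser that consumes no input
# and reports exactly one parse.
my $succeed_sketch = sub {
    my ($results) = @_;
    return sub {
        my ($input) = @_;
        return [ [ $results, $input ] ];
    };
};

# fail: ignore the argument and report no parses at all.
my $fail_sketch = sub {
    return [];
};

my $empty_match = $succeed_sketch->( [] )->('abc');   # one parse, input untouched
my $no_match    = $fail_sketch->('abc');              # no parses
```

This is why $succeed->( [] ) works as the last alternative of $Etail and $Ttail in the SYNOPSIS: it always yields a parse with an empty result list and leaves the input as it was.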

reference $literal

It takes one character as its only argument. The returned function matches the first character of its input against that character. On a match it returns (argument, input_left), where input_left is the input without its first character; otherwise it returns an empty list.

reference $literals

Almost the same as $literal, but it takes a string as its only argument and compares the first character of the input with each character of that string in turn until one matches.
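
As a rough model of the two character matchers, here is a plain-Perl sketch. $literal_sketch and $literals_sketch are hypothetical stand-ins, not Wraith's code, and they assume the matched character is returned as a one-element result list inside the usual pair.

```perl
use strict;
use warnings;

# Stand-ins for illustration only; not Wraith's actual implementation.

# literal: match one specific character at the start of the input.
my $literal_sketch = sub {
    my ($char) = @_;
    return sub {
        my ($input) = @_;
        return [] unless length($input) && substr($input, 0, 1) eq $char;
        return [ [ [ $char ], substr($input, 1) ] ];
    };
};

# literals: match the first input character against any character of a set.
my $literals_sketch = sub {
    my ($chars) = @_;
    return sub {
        my ($input) = @_;
        return [] unless length($input);
        my $first = substr($input, 0, 1);
        return [] if index($chars, $first) < 0;   # not in the set
        return [ [ [ $first ], substr($input, 1) ] ];
    };
};

my $hit  = $literal_sketch->('a')->('abc');    # matches, "bc" is left
my $miss = $literal_sketch->('a')->('xbc');    # no parse
my $sign = $literals_sketch->('+-')->('-1');   # matches "-", "1" is left
```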

reference $token

Takes a regex string as its first argument. The optional second argument is a regex string describing text to be skipped. The returned function matches the regex at the beginning of the input string and returns (token, input_left) on a match, or an empty list on failure.
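
The following plain-Perl sketch models that behaviour. $token_sketch is a hypothetical stand-in, not Wraith's code. Two points are assumptions: that skipped text is consumed before the token, and that the default skip pattern is optional whitespace (suggested by the SYNOPSIS, which parses input containing spaces).

```perl
use strict;
use warnings;

# Stand-in for illustration only; not Wraith's actual implementation.
my $token_sketch = sub {
    my ($pattern, $skip) = @_;
    $skip = '\s*' unless defined $skip;   # assumed default: skip whitespace
    return sub {
        my ($input) = @_;
        # Skip first, then capture the token at the start of what remains.
        if ($input =~ /^(?:$skip)($pattern)/) {
            # $+[0] is the offset just past the whole match.
            return [ [ [ $1 ], substr($input, $+[0]) ] ];
        }
        return [];
    };
};

my $number = $token_sketch->('[1-9][0-9]*');
my $plus   = $token_sketch->('\+');

my $num_parse  = $number->('  42 + 1');   # token "42", " + 1" is left
my $plus_parse = $plus->(' + 1');         # token "+", " 1" is left
my $no_parse   = $plus->('42');           # no parse
```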

Combinators:

There are four combinators: then for sequence, alt for alternative, many for Kleene star, and using for semantic actions. Except for many, the combinators are overloaded Perl operators. Each takes an operator, a combinator, a composite of combinators, a production (grammar rule), or a reference to an instance of one of those classes as its left-hand operand.

The result list returned by a function that a combinator generates contains the tokens in the order in which they appear in the production.

operator >>

Sequence combinator. For example, the production S -> T S would be written as $S = \$T >> \$S; where $S and $T are rules, i.e., productions.

operator |

Alternative combinator. For example, the production S -> P | Q would be written as $S = \$P | \$Q; where $S, $P, and $Q are rules.
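
To show what sequence and alternative do to the list of pairs, here is a plain-Perl sketch using ordinary functions instead of overloaded operators. lit, then_sketch, and alt_sketch are hypothetical names, not Wraith's code. The sketch assumes that sequencing concatenates the two result lists and that the alternative keeps the parses of both branches.

```perl
use strict;
use warnings;

# Stand-ins for illustration only; not Wraith's actual implementation.

# A one-character parser, so there is something to combine.
sub lit {
    my ($char) = @_;
    return sub {
        my ($input) = @_;
        return [] unless length($input) && substr($input, 0, 1) eq $char;
        return [ [ [ $char ], substr($input, 1) ] ];
    };
}

# then: run $p, then run $q on what each parse of $p left over.
sub then_sketch {
    my ($p, $q) = @_;
    return sub {
        my ($input) = @_;
        my @out;
        for my $first (@{ $p->($input) }) {
            my ($r1, $rest1) = @$first;
            for my $second (@{ $q->($rest1) }) {
                my ($r2, $rest2) = @$second;
                push @out, [ [ @$r1, @$r2 ], $rest2 ];   # results in order
            }
        }
        return \@out;
    };
}

# alt: collect the parses of both alternatives.
sub alt_sketch {
    my ($p, $q) = @_;
    return sub {
        my ($input) = @_;
        return [ @{ $p->($input) }, @{ $q->($input) } ];
    };
}

my $seq_parse = then_sketch(lit('a'), lit('b'))->('abc');   # results a, b; "c" left
my $alt_parse = alt_sketch(lit('a'), lit('b'))->('bz');     # result b; "z" left
```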

operator **

Using combinator. It takes an operator, a combinator, a composite of combinators, a production, or a reference to an instance of one of those classes as its left-hand operand, and a subroutine as its right-hand operand. The results produced by the left-hand operand are passed to the subroutine, and whatever the subroutine returns becomes the result. This combinator is used for semantic actions. The subroutine must return a reference to a list containing all the results it produces.
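
The contract for the semantic subroutine (it receives a reference to the result list and must return a reference to a list) can be sketched in plain Perl as follows. using_sketch and $two_numbers are hypothetical, written only for this illustration; they are not Wraith's code.

```perl
use strict;
use warnings;

# Stand-in for illustration only; not Wraith's actual implementation.
# Replace the result list of every parse with what the action returns.
sub using_sketch {
    my ($parser, $action) = @_;
    return sub {
        my ($input) = @_;
        my @out;
        for my $pair (@{ $parser->($input) }) {
            my ($results, $rest) = @$pair;
            push @out, [ $action->($results), $rest ];
        }
        return \@out;
    };
}

# A fixed parser that always reports the results 2 and 3.
my $two_numbers = sub { return [ [ [ 2, 3 ], 'rest' ] ]; };

# The action unpacks the result list and returns a new list reference.
my $sum = using_sketch(
    $two_numbers,
    sub {
        my ($x, $y) = @{ $_[0] };
        return [ $x + $y ];
    },
);

my $sum_parse = $sum->('ignored');   # one parse whose result list holds the sum
```

The actions in the SYNOPSIS follow the same pattern: each one unpacks @{$_[0]} and returns an anonymous array.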

reference $many

Kleene star combinator. The argument combinator is matched zero or more times. The result is a list of all possible matches.
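
"All possible matches" can be illustrated with a plain-Perl sketch. lit and many_sketch are hypothetical stand-ins, not Wraith's code, and the order in which the parses are listed (longest first) is an assumption of this sketch.

```perl
use strict;
use warnings;

# Stand-ins for illustration only; not Wraith's actual implementation.

# A one-character parser to repeat.
sub lit {
    my ($char) = @_;
    return sub {
        my ($input) = @_;
        return [] unless length($input) && substr($input, 0, 1) eq $char;
        return [ [ [ $char ], substr($input, 1) ] ];
    };
}

# many: every way of matching $p zero or more times.
sub many_sketch {
    my ($p) = @_;
    return sub {
        my ($input) = @_;
        my @out;
        for my $first (@{ $p->($input) }) {
            my ($r1, $rest1) = @$first;
            next if $rest1 eq $input;   # guard: $p consumed nothing
            for my $more (@{ many_sketch($p)->($rest1) }) {
                push @out, [ [ @$r1, @{ $more->[0] } ], $more->[1] ];
            }
        }
        push @out, [ [], $input ];      # zero repetitions always succeeds
        return \@out;
    };
}

# For the input "aab" this yields three parses: "aa", "a", and the empty match.
my $as = many_sketch(lit('a'))->('aab');
```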

Rules:

Rules are productions. A production is a composite of operators and/or combinators. To create one, first declare a scalar variable, my $P; and then call Wraith_rule->makerules(\$P), as in the SYNOPSIS, to make it a rule.

Wraith_rule->makerules( @list_of_references_to_productions )

It takes a list of references to would-be rules and returns the blessed references. The return value can be ignored, because the referenced variables are blessed in place; from then on, references to them can be used with the overloaded operators.
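
Why blessing the variables in place is enough can be shown with a small plain-Perl model. The package RuleSketch and its methods seq and resolve are hypothetical and are not Wraith's code; only the sequence operator is modelled. The sketch assumes that a reference such as \$A is dereferenced at parse time, which is what lets a rule be used before it is assigned.

```perl
use strict;
use warnings;

package RuleSketch;   # hypothetical class, not part of Wraith

use Scalar::Util qw(reftype);
use overload '>>' => 'seq';   # method name, looked up in this package

# Bless each referenced variable in place, so that any later \$var
# is an object of this class and can use the overloaded operator.
sub makerules {
    my ($class, @refs) = @_;
    bless $_, $class for @refs;
    return @refs;
}

# A code ref is used as is; a reference to a rule variable is
# dereferenced only now, at parse time.
sub resolve {
    my ($thing) = @_;
    return reftype($thing) eq 'CODE' ? $thing : $$thing;
}

sub seq {
    my ($left, $right, $swapped) = @_;
    ($left, $right) = ($right, $left) if $swapped;
    my $parser = sub {
        my ($input) = @_;
        my @out;
        for my $first (@{ resolve($left)->($input) }) {
            for my $second (@{ resolve($right)->($first->[1]) }) {
                push @out, [ [ @{ $first->[0] }, @{ $second->[0] } ], $second->[1] ];
            }
        }
        return \@out;
    };
    return bless $parser, __PACKAGE__;   # so the result can be combined again
}

package main;

my ($AB, $A, $B);
RuleSketch->makerules(\$AB, \$A, \$B);

$AB = \$A >> \$B;   # $A and $B are still undefined at this point

$A = sub { $_[0] =~ /^a(.*)/s ? [ [ ['a'], $1 ] ] : [] };
$B = sub { $_[0] =~ /^b(.*)/s ? [ [ ['b'], $1 ] ] : [] };

my $ab_parse = $AB->('abc');   # results a, b; "c" is left
```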

AUTHOR

Bo Wang <sceneviper@hotmail.com>

COPYRIGHT

Copyright 2013 - Bo Wang

SEE ALSO

Parser::Combinators, which implements parsec-like parser combinators.

LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.