NAME

Parse::FSM::Driver - Run-time engine for Parse::FSM parser

SYNOPSIS

use MyParser; # isa Parse::FSM::Driver

$parser = MyParser->new;
$parser->input( \&lexer );
$parser->user( $user_pointer );

$result = $parser->parse( $start_rule );
$result = $parser->parse_start_rule;

$token = $parser->peek_token;
$token = $parser->get_token;
$parser->unget_token(@tokens);

DESCRIPTION

This module implements a deterministic top-down parser based on a pre-computed Finite State Machine (FSM).

The FSM is generated by Parse::FSM, which reads a BNF-type grammar file and generates a run-time module that includes the state tables. The generated module also includes the run-time parsing routine that follows the state tables to obtain a parse of the input.

This module is not intended to be used standalone. It is used as a base class by the modules generated by Parse::FSM.

METHODS - SETUP

new

Creates a new object.

user

Gets or sets the parser's user pointer. The parser itself does not use the user pointer; it is available for communication between the parser actions and the calling module.

It can for example point to a data structure that describes the objects already identified in the parse.

METHODS - INPUT STREAM

input

Gets or sets the parser's input lexer iterator. The iterator is a code reference to a function that returns the next token to be parsed as an array reference holding the token type and token value, [$type, $value]. It returns undef at the end of input. For example, a simple expression lexer:

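sub make_lexer {
  my($line) = @_;
  return sub {
    for ($line) {                                 # alias $_ to $line; pos() persists between calls
      /\G\s+/gc;                                  # skip white space
      return [NUM  => $1] if /\G(\d+)/gc;         # integer
      return [NAME => $1] if /\G([a-z]\w*)/gci;   # identifier
      return [$1   => $1] if /\G(.)/gc;           # any other single character
      return;                                     # end of input
    }
  };
}
$parser->input(make_lexer("2+3*4"));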
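
To check a lexer independently of the parser, the iterator can be called directly until it returns undef. The following is a small sketch that repeats make_lexer so that it runs on its own; the helper name all_tokens is hypothetical and is not part of Parse::FSM::Driver.

```perl
use strict;
use warnings;

# Same lexer factory as above, repeated so this snippet is self-contained.
sub make_lexer {
  my($line) = @_;
  return sub {
    for ($line) {
      /\G\s+/gc;                                  # skip white space
      return [NUM  => $1] if /\G(\d+)/gc;
      return [NAME => $1] if /\G([a-z]\w*)/gci;
      return [$1   => $1] if /\G(.)/gc;
      return;                                     # end of input
    }
  };
}

# Hypothetical helper: call the iterator until it signals end of input.
sub all_tokens {
  my($lexer) = @_;
  my @tokens;
  while (defined(my $token = $lexer->())) {
    push @tokens, $token;
  }
  return @tokens;
}

my @tokens = all_tokens(make_lexer("2+3*4"));
print join(" ", map { "[$_->[0] $_->[1]]" } @tokens), "\n";
# prints: [NUM 2] [+ +] [NUM 3] [* *] [NUM 4]
```

Because every character that is neither a number nor a name is returned as a token whose type equals its value, the operators appear as [+ +] and [* *].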

peek_token

Returns the next token that the lexer will deliver, but keeps it in the input queue. A rule action can use it to make a decision based on the input that follows.

get_token

Extracts the next token from the lexer stream. Can be used by a rule action to discard the following tokens.

unget_token

Pushes the given list of tokens back onto the lexer input stream, to be retrieved by subsequent calls to get_token.
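
The relationship between peek_token, get_token and unget_token can be pictured as a pushback buffer placed in front of the lexer iterator. The following is a minimal sketch of that idea, not the actual Parse::FSM::Driver implementation. The package name TokenQueue is hypothetical, and the sketch assumes that pushed-back tokens are read again in the order in which they were given.

```perl
use strict;
use warnings;

package TokenQueue;   # hypothetical name, for illustration only

sub new {
  my($class, $lexer) = @_;
  return bless { lexer => $lexer, buffer => [] }, $class;
}

# Return the next token without consuming it.
sub peek_token {
  my($self) = @_;
  if (!@{ $self->{buffer} }) {
    my $token = $self->{lexer}->();
    return undef unless defined $token;   # end of input
    push @{ $self->{buffer} }, $token;
  }
  return $self->{buffer}[0];
}

# Consume and return the next token, buffered tokens first.
sub get_token {
  my($self) = @_;
  return shift @{ $self->{buffer} } if @{ $self->{buffer} };
  return $self->{lexer}->();
}

# Push tokens back; assumed here to be re-read in the order given.
sub unget_token {
  my($self, @tokens) = @_;
  unshift @{ $self->{buffer} }, @tokens;
  return;
}

package main;

my @input = ([NUM => 1], ['+' => '+'], [NUM => 2]);
my $queue = TokenQueue->new(sub { shift @input });

my $peeked = $queue->peek_token;        # looks at [NUM 1] without consuming it
my $first  = $queue->get_token;         # consumes that same token
my $second = $queue->get_token;         # consumes ['+' '+']
$queue->unget_token($first, $second);   # both go back in front of [NUM 2]
my $again  = $queue->get_token;         # [NUM 1] once more
```

Keeping the buffer separate from the lexer means the lexer itself never has to support rewinding.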

METHODS - PARSING

parse

This function receives an optional start rule name, and uses the default rule of the grammar if not supplied.

It parses the input stream, leaving the stream at the first unparsed token, and returns the parse value - the result of the action function for the start rule.

On a parse error, the function dies with an error message indicating the input that could not be parsed.
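
Since parse reports errors by dying, a caller that wants to recover can trap the exception with eval. The following sketch shows the pattern with a stand-in class so that it runs without a generated parser. StubParser and its error message are made up for illustration; a real parser generated by Parse::FSM produces its own message text.

```perl
use strict;
use warnings;

# Stand-in for a generated parser: its parse() always fails, so the
# error path can be shown without a real grammar.
package StubParser;   # hypothetical, not produced by Parse::FSM
sub new   { return bless {}, shift }
sub parse { die "stub parse error: unexpected input\n" }   # made-up message

package main;

my $parser = StubParser->new;
my $result;

# eval returns 1 on success and undef if parse() died.
my $ok    = eval { $result = $parser->parse; 1 };
my $error = $ok ? undef : ($@ || "unknown error");

print "parse failed: $error" if defined $error;
```

Returning 1 as the last statement of the eval block distinguishes a successful parse that yields a false value from a failed one.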

parse_XXX

For each rule XXX in the grammar, Parse::FSM creates a corresponding parse_XXX method that starts the parse at that rule. It is a shortcut for parse('XXX').

AUTHOR, BUGS, FEEDBACK, LICENSE, COPYRIGHT

See Parse::FSM