NAME

Parse::Marpa::CONCEPTS - Concepts helpful for Using Marpa

BEWARE: THIS DOCUMENT IS UNDER CONSTRUCTION AND VERY INCOMPLETE

OVERVIEW

This document is about practical concepts, concepts for actually putting Marpa to work in your applications. It's not about the mathematics or the parsing theory behind Marpa. That's documented elsewhere.

It's also not a tutorial on grammars or BNF. For that, consult a modern textbook, such as Grune and Jacobs's Parsing Techniques, Second Edition, or Wikipedia. In Wikipedia, the article on Backus-Naur form is a good place to start.

Only concepts common to all of Marpa's grammar interfaces are covered. Speaking of which ...

GRAMMAR INTERFACES

A grammar is specified to Marpa through a grammar interface, which may itself be described by a Marpa grammar. Right now there are only two grammar interfaces: the Marpa Demonstration Language and the raw grammar interface.

The Raw Grammar Interface

The raw grammar interface is a set of options to the constructor for Marpa grammar objects, Parse::Marpa::new(). All the other grammar interfaces will need to use the raw grammar interface indirectly. It is efficient, but for most situations users will want something higher level. The documentation for the raw grammar interface (as yet unwritten) is at Parse::Marpa::RAW.

The Marpa Demonstration Language

In Marpa's eyes all higher level grammar interfaces will be equal. I call the one that I am delivering with Marpa the Marpa Demonstration Language instead of the "Marpa Language" to emphasize that it is not intended to have special status. Its documentation is at Parse::Marpa::LANGUAGE.

Your Grammar Interface Here

Users are not only allowed to design their own Marpa interfaces; I hope they feel enticed to do so.

STEPS IN PARSING A TEXT

In parsing a text, Marpa follows a strict sequence, although much of it is hidden from the user. For example, when a parse object is created from a grammar which has not been precomputed, the parse object constructor will silently perform not just the precomputation of the grammar, but also a deep copy of it. In fact, if the Parse::Marpa::marpa() routine is used, the entire sequence below will be performed automatically and none of the methods listed below need be called directly.

In each step listed, the lowest level method performing it is mentioned. Calling these methods in this sequence will rarely be the best approach. For example, calling Parse::Marpa::precompute() directly is rarely necessary and clutters the code. See the main Parse::Marpa documentation page for pointers to the easiest interfaces, and ways to exercise greater control when that is desired. Also see the detailed documentation for each method for hints as to when it is best used.

  • Creation of a grammar object

    A grammar object is created with Parse::Marpa::new(), although it may be called indirectly.

  • Adding rules to the grammar object

    Rules must be added to the grammar object. This is done through a grammar interface, and the raw interface is always involved. The raw interface may be called directly, or it may be hidden behind a higher level interface. At the lowest level, rules are added with the Parse::Marpa::new() and Parse::Marpa::set() methods.

  • Precomputing the grammar object

    Before a parse object can be created, Marpa must do a series of precomputations on the grammar. This step rarely needs to be performed explicitly, but when that is necessary, the method call is Parse::Marpa::precompute().

  • Deep copying the grammar

    Marpa parse objects work with a copy of the grammar, so that multiple parses may go on at once. The deep copying is done by writing the grammar out with Data::Dumper, then eval'ing the result and tweaking it.

    The two subphases, writing the grammar out and reading it back in, are available to the user as Parse::Marpa::compile() and Parse::Marpa::decompile(). The result of compile() is a string, which may be written into a file. A subsequent Marpa process may read this file and continue the parse. See the descriptions of Parse::Marpa::compile() and Parse::Marpa::decompile() for more details.

  • Creating the parse object

    To parse a text, a parse object must be created. The constructor Parse::Marpa::Parse::new() is always called to do this, whether directly or indirectly. Strings and options specifying semantics may have been set in earlier phases, but it is only now that the semantics are determined. After this point they cannot change.

  • Token Input

    A series of tokens is input to the parse object, and recognition is performed as the input is received. Marpa, therefore, will eventually be capable of on-line or stream processing.

    If Marpa's input is structured in the same way as a conventional parser's, the input is recognized and ready to be evaluated at the same point at which a conventional on-line parser would be ready. Marpa can also deal with ambiguous, variable-length, and even overlapping tokens, and with such input the question of when the user can expect tokens to be recognized and ready to be evaluated as part of a parse is more complicated than with other parsers. See "Earlemes and Tokens" elsewhere in this document.

    Currently, input may be specified as text with the Parse::Marpa::Parse::text() method, or directly as tokens with the Parse::Marpa::Parse::earleme() method.

  • Initial Parse Evaluation

    Once the input is recognized, it can be evaluated. The first value is computed with the Parse::Marpa::Parse::initial() method. The return value of initial() indicates success or failure. The value of the parse is accessible with the Parse::Marpa::Parse::value() method.

  • Parse Iteration

    In Marpa, a token sequence can have more than one parse. These are iterated through with the Parse::Marpa::Parse::next() method, and values are retrieved with the Parse::Marpa::Parse::value() method.
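
To make the idea of multiple parses concrete, here is a small, self-contained sketch. It does not use Marpa at all. The toy grammar (E ::= E '-' E | number), the routine all_values(), and the token list are assumptions made up for illustration; the sketch only shows why one token sequence can yield several parses, each with its own value.

```perl
use strict;
use warnings;

# Toy ambiguous grammar, assumed for illustration only:
#   E ::= E '-' E | number
# all_values() returns one value for each distinct parse of the
# tokens between positions $lo and $hi (inclusive).
sub all_values {
    my ( $tokens, $lo, $hi ) = @_;

    # A single token is a number, and has exactly one parse.
    return ( $tokens->[$lo] ) if $lo == $hi;

    my @values;

    # Operators sit at every second position.  Each operator can be
    # the top-level one, and each choice gives different parses.
    for ( my $op = $lo + 1; $op < $hi; $op += 2 ) {
        for my $left ( all_values( $tokens, $lo, $op - 1 ) ) {
            for my $right ( all_values( $tokens, $op + 1, $hi ) ) {
                push @values, $left - $right;
            }
        }
    }
    return @values;
}

my @tokens = ( 6, '-', 3, '-', 1 );
my @values = all_values( \@tokens, 0, $#tokens );

# Walk through the parses one at a time, as an iteration would.
for my $n ( 0 .. $#values ) {
    print "parse $n has value $values[$n]\n";
}
```

The input "6 - 3 - 1" has two parses under this grammar: one groups it as 6 - (3 - 1) and the other as (6 - 3) - 1, so the values differ. An application that cares which parse it gets must either iterate or use an unambiguous grammar.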
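
The deep-copying step described above (write the structure out with Data::Dumper, then eval the result) can be sketched in plain Perl. This is a minimal illustration of the technique, not Marpa's actual code: the hash in $grammar is a hypothetical stand-in for a grammar object, and dump_structure() and restore_structure() are names invented here. They are not Parse::Marpa::compile() and Parse::Marpa::decompile(), and the "tweaking" Marpa does after the eval is not shown.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for a grammar; a real Marpa grammar object
# is far more elaborate.
my $grammar = {
    start => 'E',
    rules => [ [ 'E', [ 'E', 'Minus', 'E' ] ], [ 'E', ['Number'] ] ],
};

# First subphase: write the structure out as Perl source text.
sub dump_structure {
    my ($data) = @_;
    local $Data::Dumper::Indent = 1;    # compact, still readable
    local $Data::Dumper::Purity = 1;    # also handle self-references
    return Data::Dumper->Dump( [$data], ['copy'] );
}

# Second subphase: eval the text to rebuild an independent structure.
# Only eval text from a trusted source, since eval runs it as code.
sub restore_structure {
    my ($text) = @_;
    my $copy;    # the dumped text assigns to this variable
    my $ok = eval "$text; 1";
    die "Could not restore structure: $@" unless $ok;
    return $copy;
}

my $text = dump_structure($grammar);
my $copy = restore_structure($text);

# Changing the copy leaves the original untouched.
$copy->{start} = 'S';
push @{ $copy->{rules} }, [ 'S', ['E'] ];

print "original start symbol: $grammar->{start}\n";
print "copy start symbol:     $copy->{start}\n";
```

Because the intermediate form is an ordinary string, it could just as well be written to a file and evaluated later by another process, which is the use the text describes for the string returned by compile().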