This is api.info, produced by makeinfo version 6.5 from api.texi.
This manual (23 June 2022) is for Libmarpa 9.0.3.
Copyright (C) 2022 Jeffrey Kegler.
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use, copy,
modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
INFO-DIR-SECTION Development
START-INFO-DIR-ENTRY
* Marpa R2: (marpa-r2). Marpa R2 parser library
END-INFO-DIR-ENTRY
File: api.info, Node: Top, Next: No warranty, Prev: (dir), Up: (dir)
Libmarpa: The Marpa low-level library
*************************************
* Menu:
* No warranty::
* About this document::
* About Libmarpa::
* Architecture::
* Input::
* Exhaustion::
* Semantics::
* Threads::
* Failure::
* Introduction to the method descriptions::
* Static methods::
* Configuration methods::
* Grammar methods::
* Recognizer methods::
* Progress reports::
* Bocage methods::
* Ordering methods::
* Tree methods::
* Value methods::
* Events::
* Error methods macros and codes::
* Technical notes::
* Advanced input models::
* Futures::
* Deprecated techniques and methods::
* Index of terms::
-- The Detailed Node Listing --
About this document
* How to read this document::
* Prerequisites::
* Parsing theory::
* Terminology and notation::
Terminology and notation
* Application and diagnostic behavior::
Architecture
* Major objects::
* Time objects::
* Reference counting::
* Numbered objects::
Input
* Earlemes::
* The basic models of input::
* Terminals::
Earlemes
* The traditional input model::
* The latest earleme::
* The current earleme::
* The furthest earleme::
The basic models of input
* The standard model of input::
* Ambiguous input::
Failure
* Libmarpa's approach to failure::
* User non-conformity to specified behavior::
* Classifying failure::
* Memory allocation failure::
* Undetected failure::
* Irrecoverable hard failure::
* Partially recoverable hard failure::
* Library-recoverable hard failure::
* Fully recoverable hard failure::
* Soft failure::
* Error codes::
Introduction to the method descriptions
* About the overviews::
* Naming conventions::
* Return values::
* How to read the method descriptions::
Grammar methods
* Grammar overview::
* Grammar constructor::
* Grammar reference counting::
* Symbol methods::
* Rule methods::
* Sequence methods::
* Rank methods::
* Grammar precomputation::
Recognizer methods
* Recognizer overview::
* Creating a new recognizer::
* Recognizer reference counting::
* Recognizer life cycle mutators::
* Location accessors::
* Other parse status methods::
Bocage methods
* Bocage overview::
* Bocage constructor::
* Bocage reference counting::
* Bocage accessor::
Ordering methods
* Ordering overview::
* Ordering constructor::
* Ordering reference counting::
* Order accessor::
* Non-default ordering::
Tree methods
* Tree overview::
* Tree constructor::
* Tree reference counting::
* Tree iteration::
Value methods
* Value overview::
* How to use the valuator::
* Advantages of step-driven valuation::
* Maintaining the stack::
* Valuator constructor::
* Valuator reference counting::
* Stepping through the valuator::
* Valuator steps by type::
* Basic step accessors::
* Other step accessors::
Maintaining the stack
* Sizing the stack::
* Initializing locations in the stack::
Events
* Events overview::
* Basic event accessors::
* Completion events::
* Symbol nulled events::
* Prediction events::
* Symbol expected events::
* Event codes::
Error methods, macros and codes
* Error methods::
* Error Macros::
* External error codes::
* Internal error codes::
Technical notes
* Data types used by Libmarpa::
* Why so many time objects::
* Design of numbered objects::
* LHS Terminals::
Advanced input models
* The dense variable-length token model::
* The fully general input model::
Futures
* Orthogonal treatment of exhaustion::
* Furthest earleme values::
* Additional recoverable failures in marpa_r_alternative()::
* Untested methods::
Untested methods
* Ranking methods::
* Zero-width assertion methods::
* Methods for revising parses::
Deprecated techniques and methods
* Valued and unvalued symbols::
Valued and unvalued symbols
* What unvalued symbols were::
* Grammar methods dealing with unvalued symbols::
* Registering semantics in the valuator::
File: api.info, Node: No warranty, Next: About this document, Prev: Top, Up: Top
1 No warranty
*************
The Libmarpa license takes precedence over the statements in this
document. In particular, the license states that Libmarpa is free
software and has no warranty. No statement in this document should be
construed as providing any kind of warranty.
File: api.info, Node: About this document, Next: About Libmarpa, Prev: No warranty, Up: Top
2 About this document
*********************
* Menu:
* How to read this document::
* Prerequisites::
* Parsing theory::
* Terminology and notation::
File: api.info, Node: How to read this document, Next: Prerequisites, Prev: About this document, Up: About this document
2.1 How to read this document
=============================
This is essentially a reference document, but its early chapters lay out
concepts essential to the others. Readers will usually want to read the
chapters up to and including *note Introduction to the method
descriptions:: in order. Otherwise, they should follow their interests.
File: api.info, Node: Prerequisites, Next: Parsing theory, Prev: How to read this document, Up: About this document
2.2 Prerequisites
=================
This document is very far from self-contained. It assumes the
following:
* The reader knows the C programming language at least well enough to
understand function prototypes and return values.
* The reader has read the documents for one of Libmarpa's upper
layers. As of this writing, the only such layers are 'Marpa::R2' and
'Marpa::R3', both in Perl.
* The reader knows some parsing theory (*note Parsing theory::).
File: api.info, Node: Parsing theory, Next: Terminology and notation, Prev: Prerequisites, Up: About this document
2.3 Parsing theory
==================
This document assumes some acquaintance with parsing theory. The
reader's level of knowledge is probably adequate if he can answer the
following questions, either immediately or after a little reflection.
* What is a BNF rule?
* What is a Marpa sequence rule?
* As a reminder, Marpa's sequence rules are implemented as left
recursions. What does that mean?
* Take a Marpa sequence rule at random. What does it look like when
rewritten in BNF?
* What does the sequence look like when rewritten in BNF as a
right-recursion?
File: api.info, Node: Terminology and notation, Prev: Parsing theory, Up: About this document
2.4 Terminology and notation
============================
In this document,
* A "boolean" value, or "boolean", is an integer which is 0 or 1.
* "iff" abbreviates "if and only if".
* "application" means an "application" of Libmarpa. In this
document, a Libmarpa application is not necessarily an application
program. For our purposes, an "application" might be another
library which uses Libmarpa.
* 'max(x,y)' is the maximum of 'x' and 'y', where 'x' and 'y' are two
numbers.
* "Libmarpa method", or just "method" means a C function or a
function-like macro of the Libmarpa library.
* "user" means a "user" of the Libmarpa library. A user of the
library is also a programmer, so that in this document, "user" and
"programmer" are essentially synonyms.
* "We" (and "us" and "our") refer to the authors. As of this
writing, there is a primary author, but the plural is traditional,
and our "we" is intended to include the reader and everyone we are
joining on the millennia-old voyage of discovery into mathematics
and language.
* Menu:
* Application and diagnostic behavior::
File: api.info, Node: Application and diagnostic behavior, Prev: Terminology and notation, Up: Terminology and notation
2.4.1 Application and diagnostic behavior
-----------------------------------------
An "application behavior" is a behavior on which it is intended that the
design of applications will be based. Most of the behaviors specified
in this document are application behaviors. We sometimes say that
"applications may expect" a certain behavior to emphasize that that
behavior is an application behavior.
After an irrecoverable failure, the behavior of a Libmarpa
application is undefined, so that there are no behaviors which can be
relied on for normal application processing, and therefore, there are no
application behaviors. In this circumstance, some of the application
behaviors become diagnostic behaviors. A "diagnostic behavior" is a
behavior which it is suggested that the programmer may attempt in the
face of an irrecoverable failure, for testing, diagnostics and
debugging. They are hoped for, rather than expected, and intended to
allow the programmer to deal with irrecoverable failures as smoothly as
possible. (*Note Failure::.)
In this document, a behavior is a diagnostic behavior only if that is
specifically indicated. Applications should not be designed to rely on
diagnostic behaviors. We sometimes say that "diagnostics may attempt"
a certain behavior to emphasize that that behavior is a diagnostic
behavior.
File: api.info, Node: About Libmarpa, Next: Architecture, Prev: About this document, Up: Top
3 About Libmarpa
****************
Libmarpa implements the Marpa parsing algorithm. Marpa is named after
the legendary 11th century Tibetan translator, Marpa Lotsawa. In
creating Marpa, I depended heavily on previous work by Jay Earley, Joop
Leo, John Aycock and Nigel Horspool.
Libmarpa implements the entire Marpa algorithm. This library does
the necessary grammar preprocessing, recognizes the input, and produces
parse trees. It also supports the ordering, iteration and evaluation of
the parse trees.
Libmarpa is very low-level. For example, it has no strings. Rules,
symbols, and token values are all represented by integers. This, of
course, will not suffice for many applications. Users will very often
want names for the symbols, non-integer values for tokens, or both.
Typically, applications will use arrays to translate Libmarpa's integer
ID's to strings or other values as required.
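As an illustration of the array technique just described, here is a
minimal sketch of an application-side name table. The symbol IDs, the
names, and the helpers 'symbol_name()' and 'symbol_id()' are all
hypothetical; none of them is part of Libmarpa. In a real application
the IDs would be the integers that Libmarpa returned when the symbols
were created.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical symbol IDs, chosen by hand for this sketch. */
enum { SYM_EXPR = 0, SYM_NUMBER = 1, SYM_PLUS = 2, SYM_COUNT = 3 };

/* Application-side table: the index is the integer ID,
   the value is the human-readable name. */
static const char *const symbol_names[SYM_COUNT] = {
  "Expr", "Number", "Plus"
};

/* Translate an ID to its name; unknown IDs get a placeholder. */
static const char *symbol_name(int id)
{
  if (id < 0 || id >= SYM_COUNT) return "<unknown>";
  return symbol_names[id];
}

/* Reverse lookup: name to ID, or -1 if the name is not in the table. */
static int symbol_id(const char *name)
{
  int id;
  for (id = 0; id < SYM_COUNT; id++)
    if (strcmp(symbol_names[id], name) == 0) return id;
  return -1;
}
```

The same pattern extends to token values: the application can keep an
array of its own value structures and pass the array index to Libmarpa
as the integer token value.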
Libmarpa also does *not* implement most of the semantics. Libmarpa
does have an evaluator (called a "valuator"), but it does *not*
manipulate the stack directly. Instead, Libmarpa, based on its
traversal of the parse tree, passes optimized step by step stack
manipulation instructions to the upper layer. These instructions
indicate the token or rule involved, and the proper location for the
true token value or the result of the rule evaluation. For rule
evaluations, the instructions include the stack location of the
arguments.
Marpa requires most semantics to be implemented in the application.
This allows the application total flexibility. It also puts the
application in a much better position to prevent errors, to catch
errors at runtime or, failing all else, to successfully debug the logic.
File: api.info, Node: Architecture, Next: Input, Prev: About Libmarpa, Up: Top
4 Architecture
**************
* Menu:
* Major objects::
* Time objects::
* Reference counting::
* Numbered objects::
File: api.info, Node: Major objects, Next: Time objects, Prev: Architecture, Up: Architecture
4.1 Major objects
=================
The classes of Libmarpa's object system fall into two types: major and
numbered. These are Libmarpa's major classes, in sequence.
* Configuration: A configuration object is a thread-safe way to hold
configuration variables, as well as the return code from failed
attempts to create grammar objects.
* Grammar: A grammar object contains rules and symbols, with their
properties.
* Recognizer: A recognizer object reads input.
* Bocage: A bocage object is a collection of parse trees, as found by
a recognizer. Bocages are similar to parse forests.
* Ordering: An ordering object is an ordering of the trees in a
bocage.
* Tree: A tree object is a bocage iterator.
* Value: A value object is a tree iterator. Iteration of a tree using
a value object produces "steps". These "steps" are instructions to
the application on how to evaluate the semantics, and how to
manipulate the stack.
The major objects have one letter abbreviations, which are used
frequently. These are, in the standard sequence,
* Configuration: C
* Grammar: G
* Recognizer: R
* Bocage: B
* Ordering: O
* Tree: T
* Value: V
File: api.info, Node: Time objects, Next: Reference counting, Prev: Major objects, Up: Architecture
4.2 Time objects
================
All of Libmarpa's major classes, except the configuration class, are
"time" classes. Except for objects in the grammar class, all time
objects are created from another time object. Each time object is
created from a time object of the class before it in the sequence. A
recognizer cannot be created without a precomputed grammar; a bocage
cannot be created without a recognizer; and so on.
When one time object is used to create a second time object, the
first time object is the "parent object" and the second time object is
the "child object". For example, when a bocage is created from a
recognizer, the recognizer is the parent object, and the bocage is the
child object.
Grammars have no parent object. Every other time object has exactly
one parent object. Value objects have no child objects. All other time
objects can have any number of children, from zero up to a number
determined by memory or some other machine-determined limit.
Every time object has a "base grammar". A grammar object is its own
base grammar. The base grammar of a recognizer is the grammar that it
was created with. The base grammar of any other time object is the base
grammar of its parent object. For example, the base grammar of a bocage
is the base grammar of the recognizer that it was created with.
File: api.info, Node: Reference counting, Next: Numbered objects, Prev: Time objects, Up: Architecture
4.3 Reference counting
======================
Every object in a "time" class has its own, distinct, lifetime, which is
controlled by the object's reference count. Reference counting follows
the usual practice. Contexts which take a share of the "ownership" of
an object increase the reference count by 1. When a context
relinquishes its share of the ownership of an object, it decreases the
reference count by 1.
Each class of time object has a "ref" and an "unref" method, to be
used by those contexts which need to explicitly increment and decrement
the reference count. For example, the "ref" method for the grammar
class is 'marpa_g_ref()' and the "unref" method for the grammar class is
'marpa_g_unref()'.
Time objects do not have explicit destructors. When the reference
count of a time object reaches 0, that time object is destroyed.
Much of the necessary reference counting is performed automatically.
The context calling the constructor of a time object does not need to
explicitly increase the reference count, because Libmarpa time objects
are always created with a reference count of 1.
Child objects "own" their parents, and when a child object is
successfully created, the reference count of its parent object is
automatically incremented to reflect this. When a child object is
destroyed, it automatically decrements the reference count of its
parent.
In a typical application, a calling context needs only to remember to
"unref" each time object that it creates, once it is finished with that
time object. All other reference decrements and increments are taken
care of automatically. The typical application never needs to
explicitly call one of the "ref" methods.
More complex applications may find it convenient to have one or more
contexts share ownership of objects created in another context. These
more complex situations are the only cases in which the "ref" methods
will be needed.
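The ownership rules above can be modeled in a few lines of C. What
follows is a toy model written for this discussion, not Libmarpa's
implementation: 'struct toy_obj', 'toy_new()', 'toy_ref()',
'toy_unref()' and the 'destroyed_flag' hook are all hypothetical names.
The sketch shows the three rules that matter: objects start with a
reference count of 1, a child takes a share of its parent on creation,
and a destroyed child gives that share back.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy time object: a reference count plus an optional parent. */
struct toy_obj {
  int ref_count;
  struct toy_obj *parent;  /* NULL for a grammar-like root object */
  int *destroyed_flag;     /* hypothetical hook: set to 1 on destruction */
};

/* Take a share of the ownership. */
static struct toy_obj *toy_ref(struct toy_obj *o)
{
  o->ref_count++;
  return o;
}

/* Give up a share.  An object whose count reaches 0 is destroyed,
   and then releases its own share of its parent. */
static void toy_unref(struct toy_obj *o)
{
  while (o != NULL && --o->ref_count == 0) {
    struct toy_obj *parent = o->parent;
    if (o->destroyed_flag != NULL) *o->destroyed_flag = 1;
    free(o);
    o = parent;
  }
}

/* Constructor: the new object starts with a count of 1, and the
   parent's count is incremented on behalf of the child. */
static struct toy_obj *toy_new(struct toy_obj *parent, int *destroyed_flag)
{
  struct toy_obj *o = malloc(sizeof *o);
  if (o == NULL) return NULL;
  o->ref_count = 1;
  o->parent = parent;
  o->destroyed_flag = destroyed_flag;
  if (parent != NULL) toy_ref(parent);
  return o;
}
```

In this model, a context that creates a parent and then a child can
"unref" the parent at once. The parent survives until the child is
also released, which matches the typical usage described above.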
File: api.info, Node: Numbered objects, Prev: Reference counting, Up: Architecture
4.4 Numbered objects
====================
In addition to its major, "time" objects, Libmarpa also has numbered
objects. Numbered objects do not have lifetimes of their own. Every
numbered object belongs to a time object, and is destroyed with it.
Rules and symbols are numbered objects. Token values are another class
of numbered objects.
File: api.info, Node: Input, Next: Exhaustion, Prev: Architecture, Up: Top
5 Input
*******
* Menu:
* Earlemes::
* The basic models of input::
* Terminals::
File: api.info, Node: Earlemes, Next: The basic models of input, Prev: Input, Up: Input
5.1 Earlemes
============
* Menu:
* The traditional input model::
* The latest earleme::
* The current earleme::
* The furthest earleme::
File: api.info, Node: The traditional input model, Next: The latest earleme, Prev: Earlemes, Up: Earlemes
5.1.1 The traditional input model
---------------------------------
In traditional Earley parsers, the concept of location is very simple.
Locations are numbered from 0 to N, where N is the length of the input.
Every location has an Earley set, and vice versa. Location 0 is the
start location. Every location after the start location has exactly one
input token associated with it.
Some applications do not fit this traditional input model -- natural
language processing requires ambiguous tokens, for example. Libmarpa
allows a wide variety of alternative input models.
In Libmarpa a location is called an "earleme". The number of an
Earley set is the "ID of the Earley set", or its "ordinal". In the
traditional model, the ordinal of an Earley set and its earleme are
always exactly the same, but in Libmarpa's advanced input models the
ordinal of an Earley set can be different from its location (earleme).
The important earleme values are the latest earleme, the current
earleme, and the furthest earleme. Latest, current and furthest
earleme, when they have determinate values, obey a lexical order in this
sense: The latest earleme is always at or before the current earleme,
and the current earleme is always at or before the furthest earleme.
File: api.info, Node: The latest earleme, Next: The current earleme, Prev: The traditional input model, Up: Earlemes
5.1.2 The latest earleme
------------------------
The "latest Earley set" is the Earley set completed most recently. This
is initially the Earley set at location 0. The latest Earley set is
always the Earley set with the highest ordinal, and the Earley set with
the highest earleme location. The "latest earleme" is the earleme of
the latest Earley set. If there is an Earley set at the current
earleme, it is the latest Earley set and the latest earleme is equal to
the current earleme. There is never an Earley set after the current
earleme, and therefore the latest Earley set is never after the current
earleme. The 'marpa_r_start_input()' and 'marpa_r_earleme_complete()'
methods are the only ones that change the latest earleme. *Note
marpa_r_start_input(): marpa_r_start_input. and *note
marpa_r_earleme_complete(): marpa_r_earleme_complete.
The latest earleme is different from the current earleme if and only
if there is no Earley set at the current earleme. A different end of
parsing can be specified, but by default, parsing is of the input in the
range from earleme 0 to the latest earleme.
File: api.info, Node: The current earleme, Next: The furthest earleme, Prev: The latest earleme, Up: Earlemes
5.1.3 The current earleme
-------------------------
The "current earleme" is the earleme that Libmarpa is currently working
on. More specifically, it is the one at which new tokens will *start*.
Since tokens are never zero length, a new token will always end after
the current earleme. 'marpa_r_start_input()' initializes the current
earleme to 0, and every call to 'marpa_r_earleme_complete()' advances
the current earleme by 1. The 'marpa_r_start_input()' and
'marpa_r_earleme_complete()' methods are the only ones that change the
current earleme. *Note marpa_r_start_input(): marpa_r_start_input. and
*note marpa_r_earleme_complete(): marpa_r_earleme_complete.
File: api.info, Node: The furthest earleme, Prev: The current earleme, Up: Earlemes
5.1.4 The furthest earleme
--------------------------
Loosely speaking, the "furthest earleme" is the furthest earleme reached
by the parse. More precisely, it is the highest numbered earleme at
which a token ends and is 0 if there are no tokens. The furthest
earleme is 0 when a recognizer is created. With every call to
'marpa_r_alternative()', the end of the token it adds is calculated. A
token ends at the earleme location CURRENT+LENGTH, where CURRENT is the
current earleme, and LENGTH is the length of the newly added token. If
'old_f' is the furthest earleme before a call to
'marpa_r_alternative()', the furthest earleme after the call is
'max(old_f, current+length)'. The 'marpa_r_new()' and
'marpa_r_alternative()' methods are the only ones that change the furthest
earleme. *Note marpa_r_new(): marpa_r_new. and *note
marpa_r_alternative(): marpa_r_alternative.
In the basic input models, where every token has length 1, calling
'marpa_r_earleme_complete()' after each 'marpa_r_alternative()' call is
sufficient to process all inputs, and the furthest earleme's value can
typically be ignored. In alternative input models, where tokens have
lengths greater than 1, calling 'marpa_r_earleme_complete()' once after
the last token is read may not be enough to ensure that all tokens have
been processed. To ensure that all tokens have been processed, an
application must advance the current earleme by calling
'marpa_r_earleme_complete()', until the current earleme is equal to the
furthest earleme.
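The bookkeeping described in this section can be sketched as a toy
model. The code below does not call Libmarpa. 'struct toy_earlemes'
and the 'toy_' functions are hypothetical stand-ins that track only
the current and furthest earlemes, using the 'max(old_f,
current+length)' rule given above, and 'toy_drain()' is the loop an
application would run after its last token.

```c
#include <assert.h>

/* Toy model of the two counters; not Libmarpa's data structure. */
struct toy_earlemes {
  int current;   /* earleme at which new tokens start */
  int furthest;  /* highest earleme at which any token ends */
};

/* Stands in for recognizer creation plus the start of input:
   both counters begin at 0. */
static void toy_init(struct toy_earlemes *e)
{
  e->current = 0;
  e->furthest = 0;
}

/* Stands in for adding one token of the given length at the current
   earleme: furthest becomes max(old_f, current+length). */
static void toy_alternative(struct toy_earlemes *e, int length)
{
  int end = e->current + length;
  assert(length >= 1);  /* tokens are never zero length */
  if (end > e->furthest) e->furthest = end;
}

/* Stands in for completing an earleme: current advances by 1. */
static void toy_earleme_complete(struct toy_earlemes *e)
{
  e->current++;
}

/* Advance until the current earleme equals the furthest earleme.
   Returns the number of completions that were needed. */
static int toy_drain(struct toy_earlemes *e)
{
  int calls = 0;
  while (e->current < e->furthest) {
    toy_earleme_complete(e);
    calls++;
  }
  return calls;
}
```

With a single token of length 3 read at earleme 0, one completion
leaves the current earleme at 1 while the furthest earleme is 3. Two
more completions are needed before the token has been fully processed.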
File: api.info, Node: The basic models of input, Next: Terminals, Prev: Earlemes, Up: Input
5.2 The basic models of input
=============================
For the purposes of presentation, we (somewhat arbitrarily) divide
Libmarpa's input models into two groups: basic and advanced. In the
"basic models of input", every token is exactly one earleme long.
This implies that, in a basic model of input,
* every token is the same length,
* the ordinal of an Earley set will always be the same as its earleme
location, and
* the latest earleme and the current earleme are always equal.
In the "advanced models of input", tokens may have a length other
than 1. Most applications use the basic input models. The details of
the advanced models of input are presented in a later chapter. *Note
Advanced input models::.
* Menu:
* The standard model of input::
* Ambiguous input::
File: api.info, Node: The standard model of input, Next: Ambiguous input, Prev: The basic models of input, Up: The basic models of input
5.2.1 The standard model of input
---------------------------------
In the standard model of input, there is exactly one successful
'marpa_r_alternative()' call immediately previous to every
'marpa_r_earleme_complete()' call. A 'marpa_r_alternative()' call is
"immediately previous" to a 'marpa_r_earleme_complete()' call iff that
'marpa_r_earleme_complete()' call is the first
'marpa_r_earleme_complete()' call after the 'marpa_r_alternative()'
call.
Recall that, since the standard model is a basic model, the token
length in every successful call to 'marpa_r_alternative()' will be one.
For an input of length N, there will be exactly N
'marpa_r_earleme_complete()' calls, and all but the last call to
'marpa_r_earleme_complete()' must be successful.
In the standard model, after a successful call to
'marpa_r_alternative()', if C is the value of the current earleme before
the call,
* the current earleme will remain unchanged and therefore will be C;
and
* the furthest earleme will be C+1.
In the standard model, a call to 'marpa_r_earleme_complete()' follows
a successful call of 'marpa_r_alternative()', so that the value of the
furthest earleme before the call to 'marpa_r_earleme_complete()' will be
C+1, where C is the value of the current earleme. After a successful
call to 'marpa_r_earleme_complete()',
* the current earleme will be advanced to C+1; and
* the furthest earleme will be C+1, and therefore equal to the
current earleme.
Recall that, in the basic models of input, the latest earleme is
always equal to the current earleme.
File: api.info, Node: Ambiguous input, Prev: The standard model of input, Up: The basic models of input
5.2.2 Ambiguous input
---------------------
We can loosen the standard model to allow more than one successful call
to 'marpa_r_alternative()' immediately previous to each call to
'marpa_r_earleme_complete()'. This change will mean that multiple
tokens become possible at each earleme -- in other words, that the input
becomes ambiguous. We continue to require that there be at least one
successful call to 'marpa_r_alternative()' before each call to
'marpa_r_earleme_complete()'. And we recall that, since this is a basic
input model, all tokens must have a length of 1.
In the ambiguous input model, the behavior of the current, latest and
furthest earlemes is exactly as described for the standard model.
*Note The standard model of input::.
File: api.info, Node: Terminals, Prev: The basic models of input, Up: Input
5.3 Terminals
=============
A terminal symbol is a symbol which may appear in the input.
Traditionally, all LHS symbols, as well as the start symbol, must be
non-terminals. This is Marpa's behavior, by default.
Marpa allows the user to eliminate the distinction between terminals
and non-terminals. In this, it differs from traditional parsers.
Libmarpa can arrange for a terminal to appear on the LHS of one or more
rules, or for a terminal to be the start symbol. However, since
terminals can never be zero length, it is a logical contradiction for a
nulling symbol to also be a terminal and Marpa does not allow it.
Token values are 'int's. Libmarpa does nothing with token values
except accept them from the application and return them during parse
evaluation.
File: api.info, Node: Exhaustion, Next: Semantics, Prev: Input, Up: Top
6 Exhaustion
************
A parse is "exhausted" when it cannot accept any further input. A parse
is "active" iff it is not exhausted. For a parse to be exhausted, the
furthest earleme and the current earleme must be equal. However, the
converse is not always the case: if more tokens can be read at the
current earleme, then it is possible for the furthest earleme and the
current earleme to be equal in an active parse.
Parse exhaustion always has a location. That is, if a parse is
exhausted it is exhausted at some earleme location 'X'. If a parse is
exhausted at location 'X', then
* There may be valid parses at 'X'.
* The parse was active at all locations earlier than 'X'.
* There may be valid parses at locations before 'X'.
* There will be no valid parses at locations after 'X'.
* No tokens can start at location 'X'.
* No tokens can end at a location after 'X'.
* No tokens can start at any location after 'X'.
* No tokens will be accepted by an exhausted parser. It is an
irrecoverable hard failure to call 'marpa_r_alternative()' after a
parser has become exhausted.
* No Earley sets will be at any location after 'X'.
* No earlemes are completed by, and no Earley sets are created by, an
exhausted parser. It is an irrecoverable hard failure to call
'marpa_r_earleme_complete()' after a parser has become exhausted.
Users sometimes assume that parse exhaustion means parse failure.
But other users sometimes assume that parse exhaustion means parse
success. For many grammars, there are strong associations between parse
exhaustion and parse success, but the strong association can go either
way. Both exhaustion-loving and exhaustion-hating grammars are very
common in practical applications.
In an "exhaustion-hating" application, parse exhaustion typically
means parse failure. C programs, Perl scripts and most programming
languages are exhaustion-hating applications. If a C program is
well-formed, it is always possible to read more input. The same is true
of a Perl program that does not have a '__DATA__' section.
In an "exhaustion-loving" application, parse exhaustion means parse
success. A toy example of an exhaustion-loving application is the
language consisting of balanced parentheses. When the parentheses come
into perfect balance the parse is exhausted, because any further input
would unbalance the brackets. And the parse succeeds when the
parentheses come into perfect balance. Exhaustion means success. Any
language which balances start and end indicators will tend to be
exhaustion-loving. HTML and XML, with their start and end tags, can be
seen as exhaustion-loving languages.
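The balanced parentheses example can be made concrete with a toy
recognizer. This is a hand-written sketch for a single balanced group,
not a Libmarpa parse: 'struct toy_parens', 'toy_parens_init()' and
'toy_parens_read()' are hypothetical names. The toy simply rejects
input once it is exhausted, where Libmarpa would treat the
corresponding call as an irrecoverable hard failure.

```c
#include <assert.h>

/* Toy recognizer for one balanced group of parentheses. */
struct toy_parens {
  int depth;      /* number of currently open parentheses */
  int exhausted;  /* 1 once no further input can be accepted */
};

static void toy_parens_init(struct toy_parens *p)
{
  p->depth = 0;
  p->exhausted = 0;
}

/* Read one character.  Returns 1 if it was accepted, 0 if rejected.
   A rejected character leaves the state unchanged. */
static int toy_parens_read(struct toy_parens *p, char c)
{
  if (p->exhausted) return 0;  /* an exhausted parse accepts nothing */
  if (c == '(') {
    p->depth++;
  } else if (c == ')' && p->depth > 0) {
    p->depth--;
  } else {
    return 0;
  }
  /* Perfect balance: the parse succeeds and, because any further
     input would unbalance the group, it is also exhausted. */
  if (p->depth == 0) p->exhausted = 1;
  return 1;
}
```

Reading '(())' one character at a time leaves the toy active after
each of the first three characters and exhausted after the fourth. In
this language, exhaustion and success arrive together.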
One common form of exhaustion-loving parsing occurs in lexers which
look for longest matches. Exhaustion will indicate that the longest
match has been found.
It is possible for a language to be exhaustion-loving at some points
and exhaustion-hating at others. We mentioned Perl's '__DATA__' as a
complication in a basically exhaustion-hating language.
'marpa_r_earleme_complete()' and 'marpa_r_start_input()' are the only
methods that may encounter parse exhaustion. *Note
marpa_r_earleme_complete(): marpa_r_earleme_complete. and *note
marpa_r_start_input(): marpa_r_start_input. When the
'marpa_r_start_input()' or 'marpa_r_earleme_complete()' methods exhaust
the parse, they generate a 'MARPA_EVENT_EXHAUSTED' event. Applications
can also query parse exhaustion status directly with the
'marpa_r_is_exhausted()' method. *Note marpa_r_is_exhausted():
marpa_r_is_exhausted.
File: api.info, Node: Semantics, Next: Threads, Prev: Exhaustion, Up: Top
7 Semantics
***********
Libmarpa's handling of semantics is unusual. Most semantics are left up
to the application, but Libmarpa guides them. Specifically, the
application is expected to maintain the evaluation stack. Libmarpa's
valuator provides instructions on how to handle the stack. Libmarpa's
stack handling instructions are called "steps". For example, a Libmarpa
step might tell the application that the value of a token needs to go
into a certain stack position. Or a Libmarpa step might tell the
application that a rule is to be evaluated. For rule evaluation,
Libmarpa will tell the application where the operands are to be found,
and where the result must go.
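As an illustration of this division of labor, the following is a minimal sketch in C of an application-side evaluation stack that is driven by a list of steps. The 'Step' type, its fields, 'apply_step()' and 'run_steps()' are hypothetical names invented for this sketch. They are not Libmarpa's valuator interface, which is described with the value methods. The rule semantics used here (adding up the operands) is likewise only an assumption, chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step kinds, invented for this sketch. */
typedef enum { STEP_TOKEN, STEP_RULE } StepKind;

typedef struct {
  StepKind kind;
  long token_value; /* STEP_TOKEN: the value to store */
  int arg_first;    /* STEP_RULE: stack position of the first operand */
  int arg_last;     /* STEP_RULE: stack position of the last operand */
  int result;       /* stack position where the result must go */
} Step;

/* Apply one step to the stack that the application maintains. */
void apply_step(long *stack, const Step *step)
{
  assert(step->result >= 0);
  if (step->kind == STEP_TOKEN) {
    stack[step->result] = step->token_value;
  } else {
    /* Assumed rule semantics: add up the operands. */
    long sum = 0;
    int i;
    for (i = step->arg_first; i <= step->arg_last; i++)
      sum += stack[i];
    stack[step->result] = sum;
  }
}

/* Apply COUNT steps in the order given. */
void run_steps(long *stack, const Step *steps, size_t count)
{
  size_t i;
  for (i = 0; i < count; i++)
    apply_step(stack, &steps[i]);
}
```

Given steps that place the token values 2 and 3 at stack positions 0 and 1, followed by a rule step over positions 0 through 1 with its result at position 0, the stack ends with 5 at position 0.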
The detailed discussion of Libmarpa's handling of semantics is in the
reference chapters of this document, under the appropriate methods and
classes. The most extensive discussion of the semantics is in the
section that deals with the methods of the value time class (*note Value
methods::).
File: api.info, Node: Threads, Next: Failure, Prev: Semantics, Up: Top
8 Threads
*********
Libmarpa is thread-safe, given circumstances as described below. The
Libmarpa methods are not reentrant.
Libmarpa is C89-compliant. It uses no global data, and calls only
the routines that are defined in the C89 standard and that can be made
thread-safe. In most modern implementations, the default C89
implementation is thread-safe to the extent possible. But the C89
standard does not require thread-safety, and even most modern
environments allow the user to turn thread safety off. To be
thread-safe, Libmarpa must be compiled and linked in an environment that
provides thread-safety.
While Libmarpa can be used safely across multiple threads, a Libmarpa
grammar cannot be. Further, a Libmarpa time object can only be used
safely in the same thread as its base grammar. This is because all time
objects with the same base grammar share data from that base grammar.
To work around this limitation, the same grammar definition can be
used to create a new Libmarpa grammar time object in each thread. If
there is sufficient interest, future versions of Libmarpa could allow
thread-safe cloning of grammars and other time objects.
File: api.info, Node: Failure, Next: Introduction to the method descriptions, Prev: Threads, Up: Top
9 Failure
*********
As a reminder, no language in this chapter (or, for that matter, in this
document) should be read as providing, or suggesting the existence of, a
warranty. *Note license::. Also, *note No warranty::.
* Menu:
* Libmarpa's approach to failure::
* User non-conformity to specified behavior::
* Classifying failure::
* Memory allocation failure::
* Undetected failure::
* Irrecoverable hard failure::
* Partially recoverable hard failure::
* Library-recoverable hard failure::
* Fully recoverable hard failure::
* Soft failure::
* Error codes::
File: api.info, Node: Libmarpa's approach to failure, Next: User non-conformity to specified behavior, Prev: Failure, Up: Failure
9.1 Libmarpa's approach to failure
==================================
Libmarpa is a C language library, and inherits the traditional C
language approach to avoiding and handling user programming errors.
This approach will strike readers unfamiliar with this tradition as
putting an appallingly large portion of the burden of avoiding
application programmer error on the application programmer themself.
But in the early 1970's, when the C language first stabilized, the
alternative, and the consensus choice for its target applications, was
assembly language. In that context, C was radical in its willingness to
incur a price in efficiency in order to protect the programmer from
themself. C was considered to take an excessively "hand holding"
approach which very much flew in the face of consensus.
The decades have made a large difference in the trade-offs, and the
consensus about the degree to which even a low-level language should
protect the user has changed. It seems inevitable that C will be
replaced as the low-level language of choice, by a language which places
fewer burdens on the programmer, and more on the machine. The question
seems to be not whether C will be dethroned as the "go to" language for
low-level programming, but when, and by which alternative.
Modern hardware makes many simple checks essentially cost-free, and
Libmarpa's efforts to protect the application programmer go well beyond
what would have been considered best practice in the past. But it
remains a C language library. On the whole, the Libmarpa
application programmer must be prepared to exercise the high degree of
carefulness traditionally required by its C language environment.
Libmarpa places the burden of avoiding irrecoverable failures, and of
handling recoverable failures, largely on the application programmer.
File: api.info, Node: User non-conformity to specified behavior, Next: Classifying failure, Prev: Libmarpa's approach to failure, Up: Failure
9.2 User non-conformity to specified behavior
=============================================
This document specifies many behaviors for Libmarpa application programs
to follow, such as the nature of the arguments to each method. The C
language environment specifies many more behaviors, such as proper
memory management. When a non-conformity to specified behavior is
unintentional and problematic, it is frequently called a "bug". Even
the most carefully programmed Libmarpa application may sometimes contain
a "bug". In addition, some specified behaviors are explicitly stated as
characterizing a primary branch of the processing, rather than made
mandatory for all successful processing. Non-conformity to
non-mandatory behaviors can be efficiently recoverable, and is often
intentional.
This chapter describes how non-conformity to specified behavior by a
Libmarpa application is handled by Libmarpa. Non-conformity to
specified behavior by a Libmarpa application is also called, for the
purposes of this document, a "Libmarpa application programming failure".
In contexts where no ambiguity arises, "Libmarpa application programming
failure" will usually be abbreviated to "failure".
"Libmarpa application programming success" in a context is defined as
the absence of unrecovered failure in that context. When no ambiguity
arises, "Libmarpa application programming success" is almost always
abbreviated to "success". For example, the success of an application
means the application ran without any irrecoverable failures, and that
it recovered from all the recoverable failures that were detected.
File: api.info, Node: Classifying failure, Next: Memory allocation failure, Prev: User non-conformity to specified behavior, Up: Failure
9.3 Classifying failure
=======================
A Libmarpa application programming failure, unless specified otherwise,
is an irrecoverable failure. Once an irrecoverable failure has
occurred, the further behavior of the program is undefined.
Nonetheless, we specify, and Libmarpa attempts, diagnostic behaviors
(*note Application and diagnostic behavior::) in an effort to handle
irrecoverable failures as smoothly as possible.
A Libmarpa application programming failure is recoverable if and only
if it is specified as such.
A failure is called a "hard failure" if it has an error code
associated with it. A recoverable failure is called a "soft failure" if
it has no associated error code. (For more on error codes, see *note
Error codes::.)
All failures fall into one of six types. In order of severity,
these are
* *memory allocation failures*,
* *undetected failures*,
* *irrecoverable hard failures*,
* *partially recoverable hard failures*,
* *fully recoverable hard failures*, and
* *soft failures*.
File: api.info, Node: Memory allocation failure, Next: Undetected failure, Prev: Classifying failure, Up: Failure
9.4 Memory allocation failure
=============================
Failure to allocate memory is the most irrecoverable of irrecoverable
errors. Even effective error handling assumes the ability to allocate
memory, so that the practice has been, in the event of a memory
allocation failure, to take Draconian action. On "memory allocation
failure", as with all irrecoverable failures, Libmarpa's behavior is
undefined, but Libmarpa attempts to terminate the current program
abnormally by calling 'abort()'.
Memory allocation failure is the only case in which Libmarpa
terminates the program. In all other cases, Libmarpa leaves the
decision to terminate the program, whether normally or abnormally, up to
the application programmer.
Memory allocation failure does not have an error code. As a pedantic
matter, memory allocation failure is neither a hard nor a soft failure.
File: api.info, Node: Undetected failure, Next: Irrecoverable hard failure, Prev: Memory allocation failure, Up: Failure
9.5 Undetected failure
======================
An "undetected failure" is a failure that the Libmarpa library does not
detect. Many failures are impossible or impractical for a C library to
detect. Two examples of failure that the Libmarpa methods do not detect
are writes outside the bounds of allocated memory, and use of memory
after it has been freed. C is not strongly typed, and arguments of
Libmarpa routines undergo only a few simple tests, tests which are
inadequate to detect many of the potential problems.
By undetected failure we emphasize that we mean failures undetected
*by the Libmarpa methods*. In the examples just given, there exist
tools that can help the programmer detect memory errors and other tools
exist to check the sanity of method arguments.
This document points out some of the potentially undetected problems,
when doing so seems more helpful than tedious. But any attempt to list
all the undetected problems would be too large and unwieldy to be
useful.
Undetected failure is always irrecoverable. An undetected failure is
neither a hard nor a soft failure.
File: api.info, Node: Irrecoverable hard failure, Next: Partially recoverable hard failure, Prev: Undetected failure, Up: Failure
9.6 Irrecoverable hard failure
==============================
An "irrecoverable hard failure" is an irrecoverable Libmarpa application
programming failure that has an error code associated with it. Libmarpa
attempts to behave as predictably as possible in the face of a hard
failure, but once an irrecoverable failure occurs, the behavior of a
Libmarpa application is undefined.
In the event of an irrecoverable failure, there are no application
behaviors. The diagnostic behavior for a hard failure is as described
for the method which detects the hard failure. At a minimum, this
diagnostic behavior will be returning from the method which detects the
hard failure with the return value specified for hard failure, and
setting the error code as specified for hard failure.
File: api.info, Node: Partially recoverable hard failure, Next: Library-recoverable hard failure, Prev: Irrecoverable hard failure, Up: Failure
9.7 Partially recoverable hard failure
======================================
A "partially recoverable hard failure" is a recoverable Libmarpa
application programming failure
* that has an error code associated with it; and
* after which some, but not all, of the application behaviors remain
available to the programmer.
For every partially recoverable hard failure, this document specifies
the application behaviors that remain available after it occurs. The
most common kind of partially recoverable hard failure is a
library-recoverable hard failure. For an example of partially
recoverable hard failure, *note Library-recoverable hard failure::.
File: api.info, Node: Library-recoverable hard failure, Next: Fully recoverable hard failure, Prev: Partially recoverable hard failure, Up: Failure
9.8 Library-recoverable hard failure
====================================
A "library-recoverable hard failure" is a type of partially recoverable
hard failure. Loosely described, it is a hard failure which allows the
programmer to continue to use many of the Libmarpa methods in the
library, but which disallows certain methods on some objects.
To state the restrictions of application behaviors more precisely,
let the "failure grammar" be the base grammar of the method which
detected the library-recoverable hard failure. After a
library-recoverable hard failure, the following behaviors are no longer
application behaviors:
* Libmarpa mutator and constructor method calls where the base
grammar is the failure grammar.
Recall that any use of a behavior which is not an application
behavior is an irrecoverable failure.
The application behaviors remaining after a library-recoverable hard
failure are the following:
* All Libmarpa accessor method calls, even those whose base grammar
is the failure grammar.
* All Libmarpa destructor method calls, even those whose base grammar
is the failure grammar. An application will often want to destroy
all Libmarpa objects whose base grammar is the failure grammar, in
order to clear memory of unusable objects.
* All Libmarpa mutator and constructor method calls, except those
whose base grammar is the failure grammar.
* All Libmarpa static method calls.
* All use of non-Libmarpa interfaces, including other libraries and
the C language environment.
An example of a library-recoverable hard failure is the
'MARPA_ERR_COUNTED_NULLABLE' error in the 'marpa_g_precompute' method.
*Note marpa_g_precompute(): marpa_g_precompute.
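The lists above amount to a simple rule, which can be sketched as a predicate in C. The 'MethodKind' enumeration and 'still_allowed()' are hypothetical names used only for this illustration. Libmarpa provides no such function, and the application is assumed to know the kind of each method from its description.

```c
#include <assert.h>

/* Hypothetical classification of a call, following the method
   kinds named in this document. */
typedef enum {
  KIND_ACCESSOR,
  KIND_MUTATOR,
  KIND_CONSTRUCTOR,
  KIND_DESTRUCTOR,
  KIND_STATIC
} MethodKind;

/* Returns 1 if a call of kind KIND remains an application behavior
   after a library-recoverable hard failure, and 0 if it does not.
   ON_FAILURE_GRAMMAR is non-zero when the base grammar of the call
   is the failure grammar.  Static methods have no base grammar, so
   callers are expected to pass 0 for them. */
int still_allowed(MethodKind kind, int on_failure_grammar)
{
  if (!on_failure_grammar)
    return 1;
  return kind != KIND_MUTATOR && kind != KIND_CONSTRUCTOR;
}
```

Only mutators and constructors whose base grammar is the failure grammar are excluded. Every other combination remains available.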
File: api.info, Node: Fully recoverable hard failure, Next: Soft failure, Prev: Library-recoverable hard failure, Up: Failure
9.9 Fully recoverable hard failure
==================================
A "fully recoverable hard failure" is a recoverable Libmarpa application
programming failure
* that has an error code associated with it; and
* after which all of the application behaviors remain available to
the programmer.
One example of a fully recoverable hard failure is the error code
'MARPA_ERR_UNEXPECTED_TOKEN_ID'. The "Ruby Slippers" parsing technique
(*note Ruby Slippers::), which has seen extensive usage, is based on
Libmarpa's ability to recover from a 'MARPA_ERR_UNEXPECTED_TOKEN_ID'
error fully and efficiently.
File: api.info, Node: Soft failure, Next: Error codes, Prev: Fully recoverable hard failure, Up: Failure
9.10 Soft failure
=================
A "soft failure" is a recoverable Libmarpa application programming
failure that has no error code associated with it. Hard errors are
assigned error codes in order to tell them apart. Error codes are not
necessary or useful for soft errors, because there is at most one type
of soft failure per Libmarpa method.
"Soft failures" are so called, because they are the least severe kind
of failure. The most severe failures are "bugs" -- unintended, and a
symptom of a problem. Soft failures, on the other hand, are a frequent
occurrence in normal, successful, processing. In the phrase "soft
failure", the word "failure" is used in the same sense that its cognate
"fail" is used when we say that a loop terminates when it "fails" its
loop condition. That "failure" is of a condition necessary to continue
on a main branch of processing, and a signal to proceed on another
branch.
It is expected that Libmarpa applications will be designed such that
successful execution is based on the handling specified for soft
failures. In fact, a non-trivial Libmarpa application can hardly be
designed except on that basis.
File: api.info, Node: Error codes, Prev: Soft failure, Up: Failure
9.11 Error codes
================
As stated, every hard failure has an associated error code. Full
descriptions of the error codes that are returned by the external
methods are given in their own section (*note External error codes::).
How the error code is accessed depends on the method which detects
the hard failure associated with that error code. Methods for time
objects always set the error code in the base grammar, from which it may
be accessed using the error methods described below (*note Error
methods::). If a method has no base grammar, the way in which the error
code for the hard failures that it detects can be accessed will be
stated in the description of that method.
Since the error of a time object is set in the base grammar, it
follows that every object with the same base grammar has the same error
code. Objects with different base grammars may have different error
codes.
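The sharing just described can be pictured with a small sketch in C. The 'SketchGrammar' and 'SketchRecognizer' types and the two functions are hypothetical stand-ins invented for this illustration. They are not Libmarpa's types, and the real error methods are described in their own section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a grammar and for a time object
   that has that grammar as its base grammar. */
typedef struct { int error_code; } SketchGrammar;
typedef struct { SketchGrammar *base; } SketchRecognizer;

/* A method of a time object records its error in the base grammar. */
void sketch_fail(SketchRecognizer *r, int code)
{
  assert(r != NULL && r->base != NULL);
  r->base->error_code = code;
}

/* The error code of any object is read from its base grammar. */
int sketch_error(const SketchRecognizer *r)
{
  assert(r != NULL && r->base != NULL);
  return r->base->error_code;
}
```

If two such objects share a base grammar, an error recorded through one of them is the error seen through the other. An object with a different base grammar is unaffected.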
While error codes are properties of a base grammar, irrecoverability
is application-wide. That is, whenever any irrecoverable failure
occurs, the entire application is irrecoverable. Once an application
becomes irrecoverable, those Libmarpa objects with error codes for
recoverable errors are still subject to the general irrecoverability.
File: api.info, Node: Introduction to the method descriptions, Next: Static methods, Prev: Failure, Up: Top
10 Introduction to the method descriptions
******************************************
The following chapters describe Libmarpa's methods in detail.
* Menu:
* About the overviews::
* Naming conventions::
* Return values::
* How to read the method descriptions::
File: api.info, Node: About the overviews, Next: Naming conventions, Prev: Introduction to the method descriptions, Up: Introduction to the method descriptions
10.1 About the overviews
========================
The method descriptions are grouped into chapters and sections. Each
such group of method descriptions begins, optionally, with an overview.
These overviews, again optionally, end with a "cheat sheet". The "cheat
sheets" name the most important Libmarpa methods in that chapter or
section, in the order in which they are typically used, and very briefly
describe their purpose.
The overviews sometimes speak of an "archetypal" application. The
"archetypal Libmarpa application" implements a complete logic flow,
starting with the creation of a grammar, and proceeding all the way to
the return of the final result from a value object. In the archetypal
Libmarpa application, the grammar, input and semantics are all small but
non-trivial.
File: api.info, Node: Naming conventions, Next: Return values, Prev: About the overviews, Up: Introduction to the method descriptions
10.2 Naming conventions
=======================
Methods in Libmarpa follow a strict naming convention. All methods have
a name beginning with 'marpa_', if they are part of the external
interface. If an external method is not a static method, its name is
prefixed with one of 'marpa_c_', 'marpa_g_', 'marpa_r_', 'marpa_b_',
'marpa_o_', 'marpa_t_' or 'marpa_v_', where the single letter between
underscores is one of the Libmarpa major class abbreviations. The
letter indicates which class the method belongs to.
Methods that are exported, but that are part of the internal
interface, begin with '_marpa_'. Methods that are part of the internal
interface (often called "internal methods") are subject to change and
are intended for use only by Libmarpa's developers.
Libmarpa reserves the 'marpa_' and '_marpa_' prefixes for itself,
with all their capitalization variants. All Libmarpa names visible
outside the package will begin with a capitalization variant of one of
these two prefixes.
File: api.info, Node: Return values, Next: How to read the method descriptions, Prev: Naming conventions, Up: Introduction to the method descriptions
10.3 Return values
==================
Some general conventions for return values are worth mentioning:
* For methods that return an integer, a return value of -1 usually
indicates soft failure.
* For methods that return an integer, a return value of -2 usually
indicates hard failure.
* For methods that return an integer, a return value of zero or more
usually indicates success.
* If a method returns a pointer value, 'NULL' usually indicates
failure. Any other result usually indicates success.
The Libmarpa programmer should not overly rely on the general
conventions for return values. In particular, -2 may sometimes be
ambiguous -- both a valid return value for success, and a potential
indication of hard failure. In this case, the programmer must
distinguish the two return statuses based on the error code, and a
programmer who is relying too heavily on the general conventions will
fall into a trap. For an example, see the description of the return
values of 'marpa_g_rule_rank_set()', *note Rank methods::.
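The general conventions, and the trap just mentioned, can be sketched in C as follows. The 'Status' type, 'SKETCH_ERR_NONE', 'classify()' and 'classify_ambiguous()' are hypothetical names used only for this illustration. The second function rests on a further assumption: that the method in question is documented as setting the error code to "no error" when it succeeds. That cannot be assumed in general, because the error code after a success is usually indeterminate.

```c
#include <assert.h>

typedef enum {
  STATUS_SUCCESS,
  STATUS_SOFT_FAILURE,
  STATUS_HARD_FAILURE
} Status;

/* Stand-in for the "no error" code; assumed to be 0 in this sketch. */
#define SKETCH_ERR_NONE 0

/* The general conventions for methods that return an integer.
   Negative values other than -1 and -2 are not covered by the
   conventions; this sketch conservatively treats them as hard
   failures. */
Status classify(int rv)
{
  if (rv >= 0)
    return STATUS_SUCCESS;
  if (rv == -1)
    return STATUS_SOFT_FAILURE;
  return STATUS_HARD_FAILURE;
}

/* For a method where -2 may also be a valid result and which has
   no soft failure.  Assumes the method sets the error code to
   SKETCH_ERR_NONE on success. */
Status classify_ambiguous(int rv, int error_code)
{
  if (rv == -2 && error_code != SKETCH_ERR_NONE)
    return STATUS_HARD_FAILURE;
  return STATUS_SUCCESS;
}
```

An application that used only 'classify()' for every method would misreport a successful return of -2 as a hard failure, which is the trap described above.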
File: api.info, Node: How to read the method descriptions, Prev: Return values, Up: Introduction to the method descriptions
10.4 How to read the method descriptions
========================================
The method descriptions are written on the assumption that the reader
has the following in mind while reading them:
* Each method description begins with the signature of its "topic
method".
* In the method description, the phrase "this method" always refers
to the topic method.
* Whenever "this method" is the subject of a sentence in the method
description, it may be elided, so that, for example, "This method
returns 42" becomes "Returns 42".
* If the return type of a method is not 'void', the last paragraph of
its method description is a "return value summary". The return
value summary starts with the label "*Return Value*".
* Every method returns in exactly one of three statuses: success,
hard failure, or soft failure.
* A return status of hard failure indicates that the method detected
a hard failure.
* A method may have several kinds of hard failure, including several
kinds of irrecoverable hard failure and several kinds of
recoverable hard failure. On return, these can be distinguished by
their error codes.
* If a method call hard fails, its error code is that associated with
the hard failure. Unless stated otherwise in the return value
summary, the error code is set in the base grammar of the method
call, and may be accessed with the methods described below. *Note
Error methods::.
* If a method allows a recoverable hard failure, this is explicitly
stated in its return value summary, along with the associated error
code. The method description will state the circumstances under
which the recoverable hard failure occurs, and what the application
must do to recover.
* A return status of soft failure indicates that the method detected
a soft failure.
* Every method has at most one kind of soft failure.
* If a method allows a soft failure, this is explicitly stated in its
return value summary, and the method description will state the
circumstances under which the soft failure occurs, and what the
application must do to recover.
* If a method call soft fails, the value of the error code is
indeterminate.
* If a method call succeeds, the value of the error code is
indeterminate.
* A return status of success indicates that the method did not detect
any failures.
* If both a hard failure and a soft failure occur, the return status
will be hard failure.
* If both a recoverable hard failure and an irrecoverable hard
failure occur, the error code will be for an irrecoverable hard
failure.
* The behaviors specified for success and soft failure are
application behaviors.
* The behaviors specified for hard failures are diagnostic behaviors
if an irrecoverable failure occurred, and application behaviors
otherwise.
File: api.info, Node: Static methods, Next: Configuration methods, Prev: Introduction to the method descriptions, Up: Top
11 Static methods
*****************
-- Function: Marpa_Error_Code marpa_check_version ( int REQUIRED_MAJOR,
int REQUIRED_MINOR, int REQUIRED_MICRO )
[Accessor] Checks that the Marpa library in use is compatible with
the given version. Generally, the application programmer will pass
in the constants 'MARPA_MAJOR_VERSION', 'MARPA_MINOR_VERSION', and
'MARPA_MICRO_VERSION' as the three arguments, to check that their
application was compiled with headers that match the version of
Libmarpa that they are using.
If REQUIRED_MAJOR.REQUIRED_MINOR.REQUIRED_MICRO is an exact match
with 9.0.3, the method succeeds. Otherwise the return status is an
irrecoverable hard failure.
*Return value*: On success, 'MARPA_ERR_NONE'. On hard failure, the
error code.
-- Function: Marpa_Error_Code marpa_version ( int* VERSION)
[Accessor] Writes the version number in VERSION. It is an
undetected irrecoverable failure if VERSION does not have room
for three 'int''s.
*Return value*: Always succeeds. The return value is
indeterminate.
File: api.info, Node: Configuration methods, Next: Grammar methods, Prev: Static methods, Up: Top
12 Configuration methods
************************
The configuration object is intended for future extensions. These may
allow the application to override Libmarpa's memory allocation and fatal
error handling without resorting to global variables, and therefore in a
thread-safe way. Currently, the only function of the 'Marpa_Config'
class is to give 'marpa_g_new()' a place to put its error code.
'Marpa_Config' is Libmarpa's only "major" class which is not a time
class. There is no constructor or destructor, although 'Marpa_Config'
objects *do* need to be initialized before use. Aside from its own
accessor, 'Marpa_Config' objects are only used by 'marpa_g_new()' and no
reference to their location is kept in any of Libmarpa's time
objects. The intent is that it be convenient to have them in memory
that might be deallocated soon after 'marpa_g_new()' returns. For
example, they could be put on the stack.
-- Function: int marpa_c_init ( Marpa_Config* CONFIG)
[Mutator] Initialize the CONFIG information to "safe" default
values. An irrecoverable error will result if an uninitialized
configuration is used to create a grammar.
*Return value*: Always succeeds. The return value is
indeterminate.
-- Function: Marpa_Error_Code marpa_c_error ( Marpa_Config* CONFIG,
const char** P_ERROR_STRING )
[Accessor] Error codes are usually kept in the base grammar, which
leaves 'marpa_g_new()' no place to put its error code on failure.
Objects of the 'Marpa_Config' class provide such a place.
P_ERROR_STRING is reserved for use by the internals. Applications
should set it to 'NULL'.
*Return value*: The error code in CONFIG. Always succeeds, so that
'marpa_c_error()' never requires an error code for itself.
File: api.info, Node: Grammar methods, Next: Recognizer methods, Prev: Configuration methods, Up: Top
13 Grammar methods
******************
* Menu:
* Grammar overview::
* Grammar constructor::
* Grammar reference counting::
* Symbol methods::
* Rule methods::
* Sequence methods::
* Rank methods::
* Grammar precomputation::
File: api.info, Node: Grammar overview, Next: Grammar constructor, Prev: Grammar methods, Up: Grammar methods
13.1 Overview
=============
An archetypal application has a grammar. To create a grammar, use the
'marpa_g_new()' method. When a grammar is no longer in use, its memory
can be freed using the 'marpa_g_unref()' method.
To be precomputed, a grammar must have one or more symbols. To
create symbols, use the 'marpa_g_symbol_new()' method.
To be precomputed, a grammar must have one or more rules. To create
rules, use the 'marpa_g_rule_new()' and 'marpa_g_sequence_new()'
methods.
For non-trivial parsing, one or more of the symbols must be
terminals. To mark a symbol as a terminal, use the
'marpa_g_symbol_is_terminal_set()' method.
To be precomputed, a grammar must have exactly one start symbol. To
mark a symbol as the start symbol, use the 'marpa_g_start_symbol_set()'
method.
Before parsing with a grammar, it must be precomputed. To precompute
a grammar, use the 'marpa_g_precompute()' method.
File: api.info, Node: Grammar constructor, Next: Grammar reference counting, Prev: Grammar overview, Up: Grammar methods
13.2 Creating a new grammar
===========================
-- Function: Marpa_Grammar marpa_g_new ( Marpa_Config* CONFIGURATION )
[Constructor] Creates a new grammar time object. The returned
grammar object is not yet precomputed, and will have no symbols and
rules. Its reference count will be 1.
Unless the application calls 'marpa_c_error()', Libmarpa will not
reference the location pointed to by the CONFIGURATION argument
after 'marpa_g_new()' returns. (*Note marpa_c_error():
marpa_c_error.) The CONFIGURATION argument may be 'NULL', but if
it is, there will be no way to determine the error code on failure.
*Return value*: On success, the grammar object. On hard failure,
'NULL'. Also on hard failure, if the CONFIGURATION argument is not
'NULL', the error code is set in CONFIGURATION. The error code may
be accessed using 'marpa_c_error()'.
-- Function: int marpa_g_force_valued ( Marpa_Grammar G )
[Mutator] It is recommended that this call be made immediately
after the grammar constructor. It turns off a deprecated feature.
The 'marpa_g_force_valued()' method forces all the symbols in a grammar to
be "valued". The opposite of a valued symbol is one about whose
value you do not care. This distinction has been made in the past
in hope of gaining efficiencies at evaluation time. Current
thinking is that the gains do not repay the extra complexity.
*Return value*: On success, a non-negative integer, whose value is
otherwise indeterminate. On failure, -2.
File: api.info, Node: Grammar reference counting, Next: Symbol methods, Prev: Grammar constructor, Up: Grammar methods
13.3 Tracking the reference count of the grammar
================================================
-- Function: Marpa_Grammar marpa_g_ref (Marpa_Grammar G)
[Mutator] Increases the reference count of G by 1. Not needed by
most applications.
*Return value*: On success, G. On hard failure, 'NULL'.
-- Function: void marpa_g_unref (Marpa_Grammar G)
[Destructor] Decreases the reference count by 1, destroying G once
the reference count reaches zero.
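For readers unfamiliar with reference counting, the following sketch in C shows the general technique that the two methods above expose. The 'Counted' type and its functions are hypothetical and are not Libmarpa's implementation. The 'destroyed_flag' field exists only so that the sketch can show when destruction takes place.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not Libmarpa's grammar. */
typedef struct {
  int ref_count;
  int *destroyed_flag; /* set to 1 when the object is destroyed */
} Counted;

/* Create an object with a reference count of 1.  Returns NULL if
   memory cannot be allocated. */
Counted *counted_new(int *destroyed_flag)
{
  Counted *c = malloc(sizeof *c);
  if (c == NULL)
    return NULL;
  c->ref_count = 1;
  c->destroyed_flag = destroyed_flag;
  *destroyed_flag = 0;
  return c;
}

/* Take an additional reference and return the object. */
Counted *counted_ref(Counted *c)
{
  c->ref_count++;
  return c;
}

/* Drop a reference, destroying the object when none remain. */
void counted_unref(Counted *c)
{
  assert(c->ref_count > 0);
  if (--c->ref_count == 0) {
    *c->destroyed_flag = 1;
    free(c);
  }
}
```

An object created here and referenced once more survives the first call to 'counted_unref()' and is destroyed by the second. After that, the pointer must not be used again.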
File: api.info, Node: Symbol methods, Next: Rule methods, Prev: Grammar reference counting, Up: Grammar methods
13.4 Symbol methods
===================
-- Function: Marpa_Symbol_ID marpa_g_start_symbol (Marpa_Grammar G)
[Accessor] When successful, returns the ID of the start symbol.
Soft fails if there is no start symbol. The start symbol is set
by the 'marpa_g_start_symbol_set()' call.
*Return value*: On success, the ID of the start symbol, which is
always a non-negative number. On soft failure, -1. On hard
failure, -2.
-- Function: Marpa_Symbol_ID marpa_g_start_symbol_set ( Marpa_Grammar
G, Marpa_Symbol_ID SYM_ID)
[Mutator] When successful, sets the start symbol of grammar G to
symbol SYM_ID. Soft fails if SYM_ID is well-formed (a non-negative
integer), but a symbol with that ID does not exist.
*Return value*: On success, SYM_ID, which will always be a
non-negative number. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_highest_symbol_id (Marpa_Grammar G)
[Accessor] *Return value*: On success, the numerically largest
symbol ID of G. On hard failure, -2.
-- Function: int marpa_g_symbol_is_accessible (Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
[Accessor] A symbol is "accessible" if it can be reached from the
start symbol. Soft fails if SYM_ID is well-formed (a non-negative
integer), but a symbol with that ID does not exist. A common hard
failure is calling this method with a grammar that is not
precomputed.
*Return value*: On success, 1 if symbol SYM_ID is accessible, 0 if
not. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_symbol_is_nullable ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
[Accessor] A symbol is "nullable" if it sometimes produces the
empty string. A *nulling* symbol is always a *nullable* symbol,
but not all *nullable* symbols are *nulling* symbols. Soft fails
if SYM_ID is well-formed (a non-negative integer), but a symbol
with that ID does not exist. A common hard failure is calling this
method with a grammar that is not precomputed.
*Return value*: On success, 1 if symbol SYM_ID is nullable, 0 if
not. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_symbol_is_nulling (Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
[Accessor] A symbol is "nulling" if it always produces the empty
string. Soft fails if SYM_ID is well-formed (a non-negative
integer), but a symbol with that ID does not exist. A common hard
failure is calling this method with a grammar that is not
precomputed.
*Return value*: On success, 1 if symbol SYM_ID is nulling, 0 if
not. On soft failure, -1. On hard failure, -2.
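To make the definition of "nullable" concrete, the following sketch in C computes the nullable symbols of a small grammar by repeating a pass over the rules until nothing changes. The 'Rule' type and 'compute_nullable()' are hypothetical names invented for this illustration. This is a textbook fixed-point computation, and is not claimed to be how Libmarpa's precomputation works.

```c
#include <assert.h>

#define MAX_RHS 4

/* A rule LHS -> RHS[0] ... RHS[length-1].  Symbols are small
   non-negative integers.  A rule of length 0 is an empty rule. */
typedef struct {
  int lhs;
  int length;
  int rhs[MAX_RHS];
} Rule;

/* Sets nullable[s] to 1 for every symbol s that can produce the
   empty string, and to 0 for every other symbol. */
void compute_nullable(const Rule *rules, int rule_count,
                      int *nullable, int symbol_count)
{
  int changed = 1;
  int i, j;
  for (i = 0; i < symbol_count; i++)
    nullable[i] = 0;
  while (changed) {
    changed = 0;
    for (i = 0; i < rule_count; i++) {
      int all_nullable = 1;
      if (nullable[rules[i].lhs])
        continue; /* already known to be nullable */
      for (j = 0; j < rules[i].length; j++) {
        if (!nullable[rules[i].rhs[j]]) {
          all_nullable = 0;
          break;
        }
      }
      if (all_nullable) {
        nullable[rules[i].lhs] = 1;
        changed = 1;
      }
    }
  }
}
```

Consider symbols S, A, B and b, numbered 0 through 3, with the rules S -> A B, A -> (empty), B -> b and B -> (empty). The sketch reports S, A and B as nullable and b as not nullable. Of these, only A is nulling, because S and B can also produce the non-empty string consisting of b.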
-- Function: int marpa_g_symbol_is_productive (Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
[Accessor] A symbol is "productive" if it can produce a string of
terminals. All nullable symbols are considered productive. Soft
fails if SYM_ID is well-formed (a non-negative integer), but a
symbol with that ID does not exist. A common hard failure is
calling this method with a grammar that is not precomputed.
*Return value*: On success, 1 if symbol SYM_ID is productive, 0 if
not. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_symbol_is_start ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
[Accessor] On success, if SYM_ID is the start symbol, returns 1.
On success, if SYM_ID is not the start symbol, returns 0. On
success, if no start symbol has been set, returns 0.
Soft fails if SYM_ID is well-formed (a non-negative integer), but a
symbol with that ID does not exist.
*Return value*: On success, 1 or 0. On soft failure, -1. On hard
failure, -2.
-- Function: int marpa_g_symbol_is_terminal ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
[Accessor] On success, returns the "terminal status" of SYM_ID.
The terminal status is 1 if SYM_ID is a terminal, 0 otherwise. To
be used as an input symbol in the 'marpa_r_alternative()' method, a
symbol must be a terminal.
By default, a symbol is a terminal if and only if it does not
appear on the LHS of any rule. The terminal status can be set
explicitly with the 'marpa_g_symbol_is_terminal_set()' method.
*Note marpa_g_symbol_is_terminal_set():
marpa_g_symbol_is_terminal_set.
Soft fails if SYM_ID is well-formed (a non-negative integer), but a
symbol with that ID does not exist.
*Return value*: On success, 1 or 0. On soft failure, -1. On hard
failure, -2.
-- Function: int marpa_g_symbol_is_terminal_set ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID, int VALUE)
[Mutator] Sets the "terminal status" of a symbol. This function
flags symbol SYM_ID as a terminal if VALUE is 1, or flags it as a
non-terminal if VALUE is 0. To be used as an input symbol in the
'marpa_r_alternative()' method, a symbol must be a terminal. On
success, this method returns VALUE.
Once set to a value with this method, the terminal status of a
symbol is "locked" at that value. A subsequent call to this method
that attempts to change the terminal status of SYM_ID to a value
different from its current one will hard fail with error code
'MARPA_ERR_TERMINAL_IS_LOCKED'. Other hard failures include when
VALUE is not 0 or 1; and when the grammar G is precomputed.
By default, a symbol is a terminal if and only if it does not
appear on the LHS of any rule. An attempt to flag a nulling symbol
as a terminal will cause a failure, but this is not necessarily
detected before precomputation.
*Return value*: On success, VALUE, which will be 1 or 0. On soft
failure, -1. On hard failure, -2.
-- Function: Marpa_Symbol_ID marpa_g_symbol_new (Marpa_Grammar G)
[Mutator] When successful, creates a new symbol in grammar G.
*Return value*: On success, the ID of the new symbol; which will be
a non-negative integer. On hard failure, -2.
File: api.info, Node: Rule methods, Next: Sequence methods, Prev: Symbol methods, Up: Grammar methods
13.5 Rule methods
=================
-- Function: int marpa_g_highest_rule_id (Marpa_Grammar G)
[Accessor] *Return value*: On success, the numerically largest rule
ID of G. On hard failure, -2.
-- Function: int marpa_g_rule_is_accessible (Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] A rule is "accessible" if it can be reached from the
start symbol. A rule is accessible if and only if its LHS symbol
is accessible. The start rule is always an accessible rule.
Soft fails if RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist. A common hard failure is
calling this method with a grammar that is not precomputed.
*Return value*: On success 1 or 0: 1 if rule with ID RULE_ID is
accessible, 0 if not. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_rule_is_nullable ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] A rule is "nullable" if it sometimes produces the empty
string. A *nulling* rule is always a *nullable* rule, but not all
*nullable* rules are *nulling* rules.
Soft fails if RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist. A common hard failure is
calling this method with a grammar that is not precomputed.
*Return value*: On success 1 or 0: 1 if the rule with ID RULE_ID is
nullable, 0 if not. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_rule_is_nulling (Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] A rule is "nulling" if it always produces the empty
string.
Soft fails if RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist. A common hard failure is
calling this method with a grammar that is not precomputed.
*Return value*: On success 1 or 0: 1 if the rule with ID RULE_ID is
nulling, 0 if not. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_rule_is_loop (Marpa_Grammar G, Marpa_Rule_ID
RULE_ID)
[Accessor] A rule is a loop rule if it non-trivially produces the
string of length one which consists only of its LHS symbol. Such a
derivation takes the parse back to where it started, hence the term
"loop". "Non-trivially" means the zero-step derivation does not
count -- the derivation must have at least one step.
The presence of a loop rule makes a grammar infinitely ambiguous,
and applications will typically want to treat them as fatal errors.
But nothing forces an application to do this, and Marpa will
successfully parse and evaluate grammars with loop rules.
Soft fails if RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist. A common hard failure is
calling this method with a grammar that is not precomputed.
*Return value*: On success 1 or 0: 1 if the rule with ID RULE_ID is
a loop rule, 0 if not. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_rule_is_productive (Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] A rule is "productive" if it can produce a string of
terminals. A rule is productive if and only if all the symbols on
its RHS are productive. The empty string counts as a string of
terminals, so that a nullable rule is always a productive rule.
For that same reason, an empty rule is considered productive.
Soft fails if RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist. A common hard failure is
calling this method with a grammar that is not precomputed.
*Return value*: On success 1 or 0: 1 if the rule with ID RULE_ID is
productive, 0 if not. On soft failure, -1. On hard failure, -2.
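The rule "productive if and only if all the symbols on its RHS are
productive" can be sketched as below. The function
'rule_is_productive' and the array 'sym_productive' are hypothetical
names used for illustration; the array is assumed to hold the already
known productivity of each symbol, indexed by symbol ID.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of Libmarpa. */
static bool rule_is_productive(const int *rhs, int length,
                               const bool *sym_productive)
{
    assert(length >= 0);
    for (int i = 0; i < length; i++)
        if (!sym_productive[rhs[i]]) return false;
    return true;   /* also reached for an empty rule (length 0) */
}
```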
-- Function: int marpa_g_rule_length ( Marpa_Grammar G, Marpa_Rule_ID
RULE_ID)
[Accessor] The length of a rule is the number of symbols on its
RHS.
Soft fails if RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist.
*Return value*: On success, the length of the rule with ID RULE_ID.
On soft failure, -1. On hard failure, -2.
-- Function: Marpa_Symbol_ID marpa_g_rule_lhs ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] Soft fails if RULE_ID is well-formed (a non-negative
integer), but a rule with that ID does not exist.
*Return value*: On success, the ID of the LHS symbol of the rule
with ID RULE_ID. On soft failure, -1. On hard failure, -2.
-- Function: Marpa_Rule_ID marpa_g_rule_new (Marpa_Grammar G,
Marpa_Symbol_ID LHS_ID, Marpa_Symbol_ID *RHS_IDS, int LENGTH)
[Mutator] On success, creates a new external BNF rule in grammar G.
The ID of the new rule will be a non-negative integer, which will
be unique to that rule. In addition to BNF rules, Marpa also
allows sequence rules, which are created by the
'marpa_g_sequence_new()' method. *Note marpa_g_sequence_new():
marpa_g_sequence_new.
Sequence rules and BNF rules are both rules: They share the same
series of rule IDs, and are accessed and manipulated by the same
methods, with the only differences being as noted in the
descriptions of those methods.
The LHS symbol is LHS_ID, and there are LENGTH symbols on the RHS.
The RHS symbols are in an array pointed to by RHS_IDS.
Possible hard failures, with their error codes, include:
* 'MARPA_ERR_SEQUENCE_LHS_NOT_UNIQUE': The LHS symbol is the
same as that of a sequence rule.
* 'MARPA_ERR_DUPLICATE_RULE': The new rule would duplicate
another BNF rule. Another BNF rule is considered the
duplicate of the new one, if its LHS symbol is the same as
symbol LHS_ID, if its length is the same as LENGTH, and if its
RHS symbols match one for one those in the array of symbols
RHS_IDS.
*Return value*: On success, the ID of the new external rule. On
hard failure, -2.
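The duplicate test described under 'MARPA_ERR_DUPLICATE_RULE' can be
sketched as follows. This is an illustration of the stated criteria
only, not Libmarpa's implementation, and 'is_duplicate_rule' is a
hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Two BNF rules are duplicates if they have the same LHS, the same
   length, and RHS symbols that match one for one. */
static bool is_duplicate_rule(int lhs_a, const int *rhs_a, int length_a,
                              int lhs_b, const int *rhs_b, int length_b)
{
    assert(length_a >= 0 && length_b >= 0);
    if (lhs_a != lhs_b || length_a != length_b) return false;
    for (int i = 0; i < length_a; i++)
        if (rhs_a[i] != rhs_b[i]) return false;
    return true;
}
```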
-- Function: Marpa_Symbol_ID marpa_g_rule_rhs ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID, int IX)
[Accessor] When successful, returns the ID of the symbol at index
IX in the RHS of the rule with ID RULE_ID. The indexing of RHS
symbols is zero-based.
Soft fails if RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist.
A common hard failure is for IX not to be a valid index of the RHS.
This happens if IX is less than zero, or if IX is greater than
or equal to the length of the rule.
*Return value*: On success, a symbol ID, which is always
non-negative. On soft failure, -1. On hard failure, -2.
File: api.info, Node: Sequence methods, Next: Rank methods, Prev: Rule methods, Up: Grammar methods
13.6 Sequence methods
=====================
-- Function: int marpa_g_rule_is_proper_separation ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] When successful, returns
* 1 if RULE_ID is the ID of a sequence rule whose proper
separation flag is set,
* 0 if RULE_ID is the ID of a sequence rule whose proper
separation flag is not set,
* 0 if RULE_ID is the ID of a rule that is not a sequence rule.
Does not distinguish sequence rules without proper separation from
non-sequence rules. That is, does not distinguish an unset proper
separation flag from a proper separation flag whose value is
undefined because RULE_ID is the ID of a BNF rule. Applications
which want to determine whether or not a rule is a sequence rule
can use 'marpa_g_sequence_min()' to do this. *Note
marpa_g_sequence_min(): marpa_g_sequence_min.
Soft fails if RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist.
*Return value*: On success, 1 or 0. On soft failure, -1. On hard
failure, -2.
-- Function: int marpa_g_sequence_min ( Marpa_Grammar G, Marpa_Rule_ID
RULE_ID)
[Accessor] On success, returns the minimum length of a sequence
rule. Soft fails if a rule with ID RULE_ID exists, but is not a
sequence rule. This soft failure can be used to test whether or not
a rule is a sequence rule.
Hard fails irrecoverably if RULE_ID is not well-formed (a
non-negative number). Also, hard fails irrecoverably if no rule
with ID RULE_ID exists, even when RULE_ID is well formed. Note
that, in its handling of the non-existence of a rule for its rule
argument, this method differs from many of the other grammar
methods. Grammar methods which take a rule ID argument more often
treat the non-existence of a rule for a well-formed rule ID as a
soft, recoverable, failure.
*Return value*: On success, the minimum length of the sequence rule
with ID RULE_ID, which is always non-negative. On soft failure,
-1. On hard failure, -2.
-- Function: Marpa_Rule_ID marpa_g_sequence_new (Marpa_Grammar G,
Marpa_Symbol_ID LHS_ID, Marpa_Symbol_ID RHS_ID,
Marpa_Symbol_ID SEPARATOR_ID, int MIN, int FLAGS )
[Mutator] When successful, adds a new sequence rule to grammar G,
and returns its ID. The ID of the sequence rule will be a
non-negative integer, which is unique to that rule. All rules are
numbered in the same series, so that a BNF rule will never have the
same rule ID as a sequence rule, and vice versa.
Sequence rules are "sugar" -- their presence in the Libmarpa
interface does not extend its power. Every Libmarpa grammar which
can be written using sequence rules can be rewritten as a grammar
without sequence rules.
The LHS of the sequence is LHS_ID, and the item to be repeated on
the RHS of the sequence is RHS_ID. The sequence must be repeated
at least MIN times, where MIN is 0 or 1. If SEPARATOR_ID is
non-negative, it is a separator symbol.
The LHS symbol cannot be the LHS of any other rule, whether a BNF
rule or a sequence rule. On an attempt to create a sequence rule
with a duplicate LHS, this method hard fails, with an error code of
'MARPA_ERR_SEQUENCE_LHS_NOT_UNIQUE'.
The sequence RHS, or item, is restricted to a single symbol, and
that symbol cannot be nullable. If SEPARATOR_ID is a symbol, it
also cannot be a nullable symbol. Nullables on the RHS of sequence
rules are prohibited because it is not completely clear what an
application intends when it asks for a sequence of items, some of
which are nullable -- the most natural interpretation of this
usually results in a highly ambiguous grammar.
Libmarpa allows highly ambiguous grammars, and a programmer who
wants a grammar with sequences containing nullable items or
separators can write that grammar using BNF rules. The use of
BNF rules makes it clearer that ambiguity is what the programmer
intended, and allows the programmer more flexibility.
If 'flags & MARPA_PROPER_SEPARATION' is non-zero, separation is
"proper", that is, a trailing separator is not allowed. The term
"proper" is based on the idea that properly-speaking, separators
should actually separate items. Proper separation has no effect at
the Libmarpa level -- it is tracked as a convenience for the
higher-level interfaces to Libmarpa, which may want to offer the
ability to discard separators in the semantics. (Some higher-level
interfaces, in fact, may choose to discard separation by default.)
At the Libmarpa level, sequences always "keep separators".
*Return value*: On success, the ID of the newly added sequence
rule, which is always non-negative. On hard failure, -2.
-- Function: int marpa_g_sequence_separator ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] On success, returns the symbol ID of the separator of
the sequence rule with ID RULE_ID. Soft fails if there is no
separator. The causes of hard failure include RULE_ID not being
well-formed; RULE_ID not being the ID of a rule which exists; and
RULE_ID not being the ID of a sequence rule.
*Return value*: On success, a symbol ID, which is always
non-negative. On soft failure, -1. On hard failure, -2.
-- Function: int marpa_g_symbol_is_counted (Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
[Accessor] On success, returns a boolean whose value is 1 iff the
symbol with ID SYM_ID is counted. A symbol is "counted" iff
* it appears on the RHS of a sequence rule, or
* it is used as the separator symbol of a sequence rule.
Soft fails iff SYM_ID is well-formed (a non-negative integer), but
a symbol with that ID does not exist.
*Return value*: On success, a boolean. On soft failure, -1. On
hard failure, -2.
File: api.info, Node: Rank methods, Next: Grammar precomputation, Prev: Sequence methods, Up: Grammar methods
13.7 Rank methods
=================
-- Function: Marpa_Rank marpa_g_rule_rank ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] When successful, returns the rank of the rule with ID
RULE_ID. When a rule is created, its rank is initialized to the
default rank of the grammar. The default rank of the grammar is 0.
*Return value*: On success, returns a rule rank, and sets the error
code to 'MARPA_ERR_NONE'. The rule rank is an integer. On hard
failure, returns -2, and sets the error code to an appropriate
value, which will never be 'MARPA_ERR_NONE'. Note that -2 is a
valid rule rank, so that when -2 is returned, the error code is the
only way to distinguish success from failure. The error code can
be determined using 'marpa_g_error()'. *Note marpa_g_error():
marpa_g_error.
-- Function: Marpa_Rank marpa_g_rule_rank_set ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID, Marpa_Rank RANK)
[Mutator] When successful, sets the rank of the rule with ID
RULE_ID to RANK and returns RANK.
*Return value*: On success, returns RANK, which will be an integer,
and sets the error code to 'MARPA_ERR_NONE'. On hard failure,
returns -2, and sets the error code to an appropriate value, which
will never be 'MARPA_ERR_NONE'. Note that -2 is a valid rule rank,
so that when -2 is returned, the error code is the only way to
distinguish success from failure. The error code can be determined
using 'marpa_g_error()'. *Note marpa_g_error(): marpa_g_error.
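Because -2 is both a valid rank and the hard failure value, callers of
the rank methods need a two-step check. The sketch below shows only
the decision logic. It is a stand-in, not Libmarpa code: the function
'rank_call_succeeded' is a hypothetical name, the rank and error code
are passed in as plain integers, and the numeric value used for
'ERR_NONE' (standing in for 'MARPA_ERR_NONE') is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for MARPA_ERR_NONE; the value 0 is an assumption here. */
#define ERR_NONE 0

/* returned_rank: the value returned by the rank method.
   error_code:    the error code read afterwards, as marpa_g_error()
                  would report it. */
static bool rank_call_succeeded(int returned_rank, int error_code)
{
    if (returned_rank != -2) {
        assert(error_code == ERR_NONE);  /* success sets MARPA_ERR_NONE */
        return true;
    }
    /* -2 is ambiguous: only the error code tells success from failure. */
    return error_code == ERR_NONE;
}
```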
-- Function: int marpa_g_rule_null_high ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID)
[Accessor] On success, returns a boolean whose value is 1 iff "null
ranks high" is set in the rule with ID RULE_ID. When a rule is
created, it has "null ranks high" unset.
For more on the "null ranks high" setting, read the description of
'marpa_g_rule_null_high_set()'. *Note
marpa_g_rule_null_high_set(): marpa_g_rule_null_high_set.
Soft fails iff RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist.
*Return value*: On success, a boolean. On soft failure, -1. On
hard failure, -2.
-- Function: int marpa_g_rule_null_high_set ( Marpa_Grammar G,
Marpa_Rule_ID RULE_ID, int FLAG)
[Mutator] On success,
* sets "null ranks high" in the rule with ID RULE_ID if the
value of the boolean FLAG is 1;
* unsets "null ranks high" in the rule with ID RULE_ID if the
value of the boolean FLAG is 0; and
* returns FLAG.
The "null ranks high" setting affects the ranking of rules with
properly nullable symbols on their right hand side. If a rule has
properly nullable symbols on its RHS, each instance in which it
appears in a parse will have a pattern of nulled and non-nulled
symbols. Such a pattern is called a "null variant".
If "null ranks high" is set, nulled symbols rank high. If
"null ranks high" is unset (the default), nulled symbols rank
low. Ranking of null variants is done from left to right.
Soft fails iff RULE_ID is well-formed (a non-negative integer), but
a rule with that ID does not exist.
Hard fails if the grammar has been precomputed.
*Return value*: On success, a boolean. On soft failure, -1. On
hard failure, -2.
File: api.info, Node: Grammar precomputation, Prev: Rank methods, Up: Grammar methods
13.8 Precomputing the Grammar
=============================
-- Function: int marpa_g_has_cycle (Marpa_Grammar G)
[Accessor] On success, returns a boolean which is 1 iff G has a
cycle. Cycles make a grammar infinitely ambiguous, and are
considered useless in current practice. Cycles make processing the
grammar less efficient, sometimes considerably so. Applications
will almost always want to treat cycles as mistakes on the part of
the writer of the grammar. To determine which rules are in the
cycle, 'marpa_g_rule_is_loop()' can be used.
*Return value*: On success, a boolean. On hard failure, -2.
-- Function: int marpa_g_is_precomputed (Marpa_Grammar G)
[Accessor] *Return value*: On success, a boolean which is 1 iff
grammar G is precomputed. On hard failure, -2.
-- Function: int marpa_g_precompute (Marpa_Grammar G)
[Mutator] On success, and on fully recoverable hard failure,
precomputes the grammar G. Precomputation involves running a
series of grammar checks and "precomputing" some useful information
which is kept internally to save repeated calculations. After
precomputation, the grammar is "frozen" in many respects, and many
grammar mutators which succeed before precomputation will cause
hard failures after precomputation. Precomputation is necessary
for a recognizer to be generated from a grammar.
When called, clears any events already in the event queue. May
return one or more events. The types of event that this method may
return are 'MARPA_EVENT_LOOP_RULES',
'MARPA_EVENT_COUNTED_NULLABLE', and 'MARPA_EVENT_NULLING_TERMINAL'.
All of these events occur only on failure. Applications must be
prepared for this method to return additional events, including
events which occur on success. Events may be queried using the
'marpa_g_event()' method. *Note marpa_g_event(): marpa_g_event.
The fully recoverable hard failure is
'MARPA_ERR_GRAMMAR_HAS_CYCLE'. Recall that for fully recoverable
hard failures this method precomputes the grammar. Most
applications, however, will want to treat a grammar with cycles as
if it were a library-recoverable error. A
'MARPA_ERR_GRAMMAR_HAS_CYCLE' error occurs iff a
'MARPA_EVENT_LOOP_RULES' event occurs. For more details on cycles,
*note marpa_g_has_cycle(): marpa_g_has_cycle.
The error code 'MARPA_ERR_COUNTED_NULLABLE' is library-recoverable.
This failure occurs when a symbol on the RHS of a sequence rule is
nullable, which Libmarpa does not allow in a grammar. Error code
'MARPA_ERR_COUNTED_NULLABLE' occurs iff one or more
'MARPA_EVENT_COUNTED_NULLABLE' events occur. There is one
'MARPA_EVENT_COUNTED_NULLABLE' event for every symbol which is a
nullable on the right hand side of a sequence rule. An application
may use these events to inform the user of the problematic symbols,
and this detail may help the user fix the grammar.
The error code 'MARPA_ERR_NULLING_TERMINAL' is
library-recoverable. This failure occurs when a nulling symbol is
also flagged as a terminal. Since terminals cannot be of zero
length, this is a logical impossibility, and Libmarpa does not
allow nulling terminals in a grammar. Error code
'MARPA_ERR_NULLING_TERMINAL' occurs iff one or more
'MARPA_EVENT_NULLING_TERMINAL' events occur. There is one
'MARPA_EVENT_NULLING_TERMINAL' event for every nulling terminal in
the grammar. An application may use these events to inform the
user of the problematic symbols, and this detail may help the user
fix the grammar.
Among the other error codes which may cause this method to fail are
the following:
* 'MARPA_ERR_NO_RULES': The grammar has no rules.
* 'MARPA_ERR_NO_START_SYMBOL': No start symbol was specified.
* 'MARPA_ERR_INVALID_START_SYMBOL': A start symbol ID was
specified, but it is not the ID of a valid symbol.
* 'MARPA_ERR_START_NOT_LHS': The start symbol is not on the LHS
of any rule.
* 'MARPA_ERR_UNPRODUCTIVE_START': The start symbol is not
productive.
More details of these can be found under the description of the
appropriate code. *Note External error codes::.
*Return value*: On success, a non-negative number, whose value is
otherwise indeterminate. On hard failure, -2. For the error code
'MARPA_ERR_GRAMMAR_HAS_CYCLE', the hard failure is fully
recoverable. For the error codes 'MARPA_ERR_COUNTED_NULLABLE' and
'MARPA_ERR_NULLING_TERMINAL', the hard failure is
library-recoverable.
File: api.info, Node: Recognizer methods, Next: Progress reports, Prev: Grammar methods, Up: Top
14 Recognizer methods
*********************
* Menu:
* Recognizer overview::
* Creating a new recognizer::
* Recognizer reference counting::
* Recognizer life cycle mutators::
* Location accessors::
* Other parse status methods::
File: api.info, Node: Recognizer overview, Next: Creating a new recognizer, Prev: Recognizer methods, Up: Recognizer methods
14.1 Recognizer overview
========================
An archetypal application uses a recognizer to read input. To create a
recognizer, use the 'marpa_r_new()' method. When a recognizer is no
longer in use, its memory can be freed using the 'marpa_r_unref()'
method.
To make a recognizer ready for input, use the 'marpa_r_start_input()'
method.
The recognizer starts with its current earleme at location 0. To
read a token at the current earleme, use the 'marpa_r_alternative()'
call.
To complete the processing of the current earleme, and move forward
to a new one, use the 'marpa_r_earleme_complete()' call.
File: api.info, Node: Creating a new recognizer, Next: Recognizer reference counting, Prev: Recognizer overview, Up: Recognizer methods
14.2 Creating a new recognizer
==============================
-- Function: Marpa_Recognizer marpa_r_new ( Marpa_Grammar G )
[Constructor] On success, creates a new recognizer and increments
the reference count of G, the base grammar, by one. In the new
recognizer,
* the reference count will be 1;
* the furthest earleme will be 0; and
* the latest and current earlemes will be undefined.
*Return value*: On success, the newly created recognizer, which is
never 'NULL'. If G is not precomputed, or on other hard failure,
'NULL'.
File: api.info, Node: Recognizer reference counting, Next: Recognizer life cycle mutators, Prev: Creating a new recognizer, Up: Recognizer methods
14.3 Keeping the reference count of a recognizer
================================================
-- Function: Marpa_Recognizer marpa_r_ref (Marpa_Recognizer R)
[Mutator] Increases the reference count by 1. This method is not
needed by most applications.
*Return value*: On success, the recognizer object, R, which is
never 'NULL'. On hard failure, 'NULL'.
-- Function: void marpa_r_unref (Marpa_Recognizer R)
[Destructor] Decreases the reference count by 1, destroying R once
the reference count reaches zero. When R is destroyed, the
reference count of its base grammar is decreased by one. If this
takes the reference count of the base grammar to zero, the base
grammar is also destroyed.
File: api.info, Node: Recognizer life cycle mutators, Next: Location accessors, Prev: Recognizer reference counting, Up: Recognizer methods
14.4 Life cycle mutators
========================
-- Function: int marpa_r_start_input (Marpa_Recognizer R)
[Mutator] When successful, does the following:
* Readies R to accept input.
* Completes the first Earley set, which is the Earley set whose
ID is 0 and which is located at earleme 0.
* Leaves the latest, current and furthest earlemes all at 0.
* Clears any events that were in the event queue before this
method was called.
* If this method exhausts the parse, generates a
'MARPA_EVENT_EXHAUSTED' event. *Note Exhaustion::.
* May generate one or more 'MARPA_EVENT_SYMBOL_NULLED',
'MARPA_EVENT_SYMBOL_PREDICTED', or
'MARPA_EVENT_SYMBOL_EXPECTED' events. *Note Events::.
*Return value*: On success, a non-negative value, whose value is
otherwise indeterminate. On hard failure, -2.
-- Function: int marpa_r_alternative (Marpa_Recognizer R,
Marpa_Symbol_ID TOKEN_ID, int VALUE, int LENGTH)
The TOKEN_ID argument must be the symbol ID of a terminal. The
VALUE argument is an integer that represents the "value" of the
token, and which should not be zero. The LENGTH argument is the
length of the token, which must be greater than zero.
On success, does the following, where CURRENT is the value of the
current earleme before the call and FURTHEST is the value of the
furthest earleme before the call:
* Reads a new token into R. The symbol ID of the token will be
TOKEN_ID. The token will start at CURRENT and end at
'CURRENT+LENGTH'.
* Sets the value of the furthest earleme to
'max(CURRENT+LENGTH,FURTHEST)'.
* Leaves the values of the latest and current earlemes
unchanged.
After recoverable failure, the following are the case:
* The tokens read into R are unchanged. Specifically, no new
token has been read into R.
* The values of the latest, current and furthest earlemes are
unchanged.
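The effect of a successful 'marpa_r_alternative()' call on the three
earleme values can be modeled in a few lines. This is a toy model of
the bookkeeping described above, not Libmarpa code: 'struct earlemes'
and 'model_alternative_success' are hypothetical names.

```c
#include <assert.h>

/* Toy model of the three earleme values (not a Libmarpa type). */
struct earlemes { int latest, current, furthest; };

/* Applies the effect of a successful marpa_r_alternative() call with
   the given token LENGTH to the model. */
static void model_alternative_success(struct earlemes *e, int length)
{
    assert(length > 0);                 /* LENGTH must be greater than zero */
    int end = e->current + length;      /* token ends at CURRENT+LENGTH */
    if (end > e->furthest) e->furthest = end;
    /* latest and current are deliberately left unchanged */
}
```

Starting from all three values at 0, reading tokens of length 1, 3 and
2 at the same current earleme leaves the furthest earleme at 3 and the
other two values at 0.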
Libmarpa allows tokens to be ambiguous. Two tokens are ambiguous
if they end at the same earleme location. If two tokens are
ambiguous, Libmarpa will attempt to produce all the parses that
include either of them.
Libmarpa allows tokens to overlap. Let the notation T@S-E indicate
that token T starts at earleme S and ends at earleme E. Let
T1@S1-E1 and T2@S2-E2 be two tokens such that S1<=S2. We say that
T1 and T2 overlap iff E1>S2.
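The two definitions above translate directly into predicates. The
following is a sketch for illustration; 'struct token_span',
'tokens_ambiguous' and 'tokens_overlap' are hypothetical names and not
part of Libmarpa.

```c
#include <assert.h>
#include <stdbool.h>

struct token_span { int start, end; };   /* T@S-E in the notation above */

/* Two tokens are ambiguous if they end at the same earleme. */
static bool tokens_ambiguous(struct token_span a, struct token_span b)
{
    return a.end == b.end;
}

/* Overlap test; the arguments may be given in either order. */
static bool tokens_overlap(struct token_span a, struct token_span b)
{
    assert(a.start < a.end && b.start < b.end);  /* lengths are positive */
    if (a.start > b.start) { struct token_span t = a; a = b; b = t; }
    return a.end > b.start;   /* now S1 <= S2, so overlap iff E1 > S2 */
}
```

For example, T1@0-3 and T2@2-4 overlap, while T1@0-3 and T3@3-4 do
not, because T1 ends exactly where T3 starts.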
The VALUE argument is not used inside Libmarpa -- it is simply
stored to be returned by the valuator as a convenience for the
application. In applications where the token's actual value is not
an integer, it is expected that the application will use VALUE as a
"virtual" value, perhaps finding the actual value by using VALUE to
index an array. Some applications may prefer to track token values
on their own, perhaps based on the earleme location and TOKEN_ID,
instead of using Libmarpa's token values.
A VALUE of 0 does not cause a failure, but it is reserved for
unvalued symbols, a now-deprecated feature. *Note Valued and
unvalued symbols::.
Hard fails irrecoverably with 'MARPA_ERR_DUPLICATE_TOKEN' if the
token added would be a duplicate. Two tokens are duplicates iff
all of the following are true:
* They would have the same start earleme. In other words, if
'marpa_r_alternative()' attempts to read them while at the
same current earleme.
* They have the same TOKEN_ID.
* They have the same LENGTH.
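The three conditions for duplicate tokens can be written as a single
predicate. This sketch is illustrative only; 'struct token_read' and
'tokens_are_duplicates' are hypothetical names. VALUE does not appear
in the list above, so the sketch does not compare it.

```c
#include <assert.h>
#include <stdbool.h>

/* A token as described above: where it starts, its ID and its length. */
struct token_read { int start_earleme, token_id, length; };

static bool tokens_are_duplicates(struct token_read a, struct token_read b)
{
    assert(a.length > 0 && b.length > 0);
    return a.start_earleme == b.start_earleme
        && a.token_id == b.token_id
        && a.length == b.length;
}
```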
If a token was not accepted because of its token ID, hard fails
with the 'MARPA_ERR_UNEXPECTED_TOKEN_ID'. This hard failure is
fully recoverable so that, for example, the application may retry
this method with different token IDs until it succeeds. These
retries are efficient, and are quite usable as a parsing technique
-- so much so that we have given the technique a name: "the Ruby
Slippers". The Ruby Slippers are used in several applications.
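The retry loop behind the Ruby Slippers can be sketched as below. To
keep the example self-contained, the call to 'marpa_r_alternative()'
is replaced by a callback. The callback type 'try_token_fn', its
convention of returning 0 on acceptance, and the names 'ruby_slippers'
and 'accept_only_seven' are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a call to marpa_r_alternative(): returns 0 when the
   token ID is accepted and a non-zero code when it is rejected. */
typedef int (*try_token_fn)(void *recognizer, int token_id);

/* Tries each candidate in order; returns the first accepted token ID,
   or -1 if every candidate was rejected. */
static int ruby_slippers(void *recognizer, try_token_fn try_token,
                         const int *candidates, size_t count)
{
    assert(try_token != NULL);
    for (size_t i = 0; i < count; i++)
        if (try_token(recognizer, candidates[i]) == 0)
            return candidates[i];
    return -1;
}

/* Example stub: a "recognizer" that expects only token ID 7. */
static int accept_only_seven(void *recognizer, int token_id)
{
    (void)recognizer;
    return token_id == 7 ? 0 : 1;
}
```

An application using the real method would retry only when the error
code is 'MARPA_ERR_UNEXPECTED_TOKEN_ID', and would treat any other
failure as an error.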
*Return value*: On success, 'MARPA_ERR_NONE'. On failure, an error
code other than 'MARPA_ERR_NONE'. The hard failure for
'MARPA_ERR_UNEXPECTED_TOKEN_ID' is fully recoverable.
-- Function: int marpa_r_earleme_complete (Marpa_Recognizer R)
For the purposes of this method description, we define the
following:
* CURRENT is the value of the current earleme before the call of
'marpa_r_earleme_complete'.
* LATEST is the value of the latest earleme before the call of
'marpa_r_earleme_complete'.
* An "expected" terminal is one expected at a current earleme,
in the same sense that 'marpa_r_terminal_is_expected()'
determines if a terminal is "expected" at the current earleme.
*Note marpa_r_terminals_expected():
marpa_r_terminals_expected.
* An "anticipated" terminal is one that was accepted by the
'marpa_r_alternative()' to end at an earleme after the current
earleme. An anticipated terminal will have length greater
than one. "Anticipated" terminals only occur if the
application is using an advanced model of input. *Note
Advanced input models::.
On success, does the final processing for the current earleme,
including the following:
* Advances the current earleme, incrementing its value by 1.
That is, sets the current earleme to 'CURRENT+1'.
* If any token was accepted at CURRENT, creates a new Earley set
which will be the latest Earley set. After the call, the
latest earleme will be equal to the new current earleme,
'CURRENT+1'.
* If no token was accepted at CURRENT, no Earley set is created.
After the call, the value of the latest earleme will be
unchanged -- that is, it will remain at LATEST. Success when
no tokens were accepted at CURRENT can only occur if the
application is using an advanced model of input. *Note
Advanced input models::.
* The value of the furthest earleme is never changed by a call
to 'marpa_r_earleme_complete()'.
* Clears the event queue of any events which occurred before this
method was called.
* May generate one or more 'MARPA_EVENT_SYMBOL_COMPLETED',
'MARPA_EVENT_SYMBOL_NULLED', 'MARPA_EVENT_SYMBOL_PREDICTED',
or 'MARPA_EVENT_SYMBOL_EXPECTED' events. *Note Events::.
* If an application-settable threshold on the number of Earley
items has been reached or exceeded, generates a
'MARPA_EVENT_EARLEY_ITEM_THRESHOLD' event. Often, the
application will want to treat this event as if it were a
library-recoverable failure. *Note
marpa_r_earley_item_warning_threshold_set():
marpa_r_earley_item_warning_threshold_set.
* If the parse is exhausted, triggers a 'MARPA_EVENT_EXHAUSTED'
event. Exhaustion on success only occurs if no terminals are
expected at the current earleme after the call to this method
(that is, at 'CURRENT+1') and no terminals are anticipated
after 'CURRENT+1'.
On hard failure with the code 'MARPA_ERR_PARSE_EXHAUSTED', does the
following:
* Leaves the current earleme at CURRENT. The current earleme
will be the same as the furthest earleme.
* The value of the furthest earleme is never changed by a call
to 'marpa_r_earleme_complete()'.
* Leaves the value of the latest earleme at LATEST. No new
Earley set is created.
* Sets the parse exhausted, so that no more tokens will be
accepted. *Note Exhaustion::.
* Leaves the parse in a state where no terminals are expected or
anticipated.
* Clears the event queue of any events which occurred before the
call to this method.
* Triggers a 'MARPA_EVENT_EXHAUSTED' event and no others.
* Leaves valid any parses that were valid at the current or
earlier earlemes. Processing with these can continue, and it is
for this reason that we consider hard failures with the code
'MARPA_ERR_PARSE_EXHAUSTED' to be fully recoverable.
We note that exhaustion can occur when this method fails and when
it succeeds. The distinction is that, on success, the call creates
a new Earley set before becoming exhausted while, on failure, it
becomes exhausted without creating a new Earley set.
*Return value*: On success, the number of events generated. On
hard failure, -2. Hard failure with the code
'MARPA_ERR_PARSE_EXHAUSTED' is fully recoverable.
File: api.info, Node: Location accessors, Next: Other parse status methods, Prev: Recognizer life cycle mutators, Up: Recognizer methods
14.5 Location accessors
=======================
-- Function: 'Marpa_Earleme' marpa_r_current_earleme (Marpa_Recognizer
R)
Return value: If input has started, the current earleme. If input
has not started, -1. Always succeeds.
-- Function: Marpa_Earleme marpa_r_earleme ( Marpa_Recognizer R,
Marpa_Earley_Set_ID SET_ID)
In the default, token-stream model, Earley set ID and earleme are
always equal, but this is not the case in other input models. (The
ID of an Earley set is also called its ordinal.) If there is no
Earley set whose ID is SET_ID, 'marpa_r_earleme()' fails. If
SET_ID was negative, the error code is set to
'MARPA_ERR_INVALID_LOCATION'. If SET_ID is greater than the
ordinal of the latest Earley set, the error code is set to
'MARPA_ERR_NO_EARLEY_SET_AT_LOCATION'.
At this writing, there is no method for the inverse operation
(conversion of an earleme to an Earley set ID). One consideration
in writing such a method is that not all earlemes correspond to
Earley sets. Applications that want to map earlemes to Earley sets
will have no trouble if they are using the standard input model --
the Earley set ID is always exactly equal to the earleme in that
model. For other applications that want an earleme-to-ID mapping,
the most general method is to create an ID-to-earleme array using the
'marpa_r_earleme()' method and invert it.
Return value: On success, the earleme corresponding to Earley set
SET_ID. On failure, -2.
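The inversion described above can be sketched in a few lines of C.
This is a minimal illustration and not part of Libmarpa: the name
'invert_earleme_map' and the use of plain 'int' in place of
'Marpa_Earleme' and 'Marpa_Earley_Set_ID' are assumptions made to keep
the sketch self-contained. The input array is assumed to have been
filled by calling 'marpa_r_earleme()' once for each Earley set ID.

```c
#include <stdlib.h>

/* Hypothetical helper, not a Libmarpa function.
 * earleme_of_id[i] holds the earleme of Earley set i.
 * On success, returns a malloc'ed array of length *p_length in which
 * entry e is the ID of the Earley set at earleme e, or -1 if there is
 * no Earley set at earleme e.  The caller frees the array.
 * Returns NULL on bad input or if allocation fails. */
int *invert_earleme_map(const int *earleme_of_id, int set_count,
                        int *p_length)
{
    int id, e, max_earleme = -1;
    int *id_of_earleme;

    if (earleme_of_id == NULL || p_length == NULL || set_count <= 0)
        return NULL;

    /* Find the largest earleme, rejecting negative entries. */
    for (id = 0; id < set_count; id++) {
        if (earleme_of_id[id] < 0)
            return NULL;
        if (earleme_of_id[id] > max_earleme)
            max_earleme = earleme_of_id[id];
    }

    id_of_earleme = malloc(((size_t)max_earleme + 1) * sizeof *id_of_earleme);
    if (id_of_earleme == NULL)
        return NULL;

    /* Not every earleme has an Earley set, so start with "none". */
    for (e = 0; e <= max_earleme; e++)
        id_of_earleme[e] = -1;
    for (id = 0; id < set_count; id++)
        id_of_earleme[earleme_of_id[id]] = id;

    *p_length = max_earleme + 1;
    return id_of_earleme;
}
```

Entries left at -1 mark the earlemes that have no Earley set, which is
the case the text above warns about.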
-- Function: int marpa_r_earley_set_value ( Marpa_Recognizer R,
Marpa_Earley_Set_ID earley_set)
Returns the integer value of EARLEY_SET. For more details, see the
description of 'marpa_r_earley_set_values()'.
Return value: On success, the value of EARLEY_SET. On failure, -2.
-- Function: int marpa_r_earley_set_values ( Marpa_Recognizer R,
Marpa_Earley_Set_ID earley_set, int* p_value, void** p_pvalue
)
If P_VALUE is non-zero, sets the location pointed to by P_VALUE to
the integer value of the Earley set. Similarly, if P_PVALUE is
non-zero, sets the location pointed to by P_PVALUE to the pointer
value of the Earley set.
The "value" and "pointer" of an Earley set are an arbitrary integer
and an arbitrary pointer that the application can use for its own
purposes. In character-per-earleme input models, for example, the
integer can be the codepoint of the current character. In a
traditional token-per-earleme input model, they could be used to
indicate the string value of the token - the pointer could point to
the start of the string, and the integer could indicate its length.
The Earley set value and pointer can be set using the
'marpa_r_latest_earley_set_values_set()' method. The Earley set
integer value defaults to -1, and the pointer value defaults to
'NULL'.
Return value: On success, returns a non-negative integer. On
failure, returns -2.
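The token-string convention mentioned above (pointer to the start of
the string, integer for its length) might look as follows on the
application side. This is only a sketch: 'struct token_span' and both
functions are hypothetical names, and the code does not call Libmarpa.
In a real application the two fields would be the values passed to and
read back from the Earley set.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical application-side pair, shaped like the integer and
 * pointer values of an Earley set.  Not a Libmarpa type. */
struct token_span {
    int length;   /* would be stored as the integer value */
    void *start;  /* would be stored as the pointer value */
};

/* Describe the token that occupies input[offset .. offset+length-1]. */
struct token_span token_span_make(char *input, int offset, int length)
{
    struct token_span span;
    span.length = length;
    span.start = input + offset;
    return span;
}

/* Copy the token text into buf as a NUL-terminated string.
 * Returns the number of characters copied, or -1 if buf is too small
 * or the span is unset (length below 0 or a NULL pointer, matching
 * the defaults of -1 and NULL given in the manual). */
int token_span_copy(struct token_span span, char *buf, size_t buf_size)
{
    if (span.length < 0 || span.start == NULL)
        return -1;
    if ((size_t)span.length + 1 > buf_size)
        return -1;
    memcpy(buf, span.start, (size_t)span.length);
    buf[span.length] = '\0';
    return span.length;
}
```

Because the pointer refers into the application's own input buffer,
that buffer must outlive any use of the stored pointer.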
-- Function: 'unsigned int' marpa_r_furthest_earleme (Marpa_Recognizer
R)
Always returns the furthest earleme.
*Return value*: On success, the furthest earleme. Always succeeds.
-- Function: Marpa_Earley_Set_ID marpa_r_latest_earley_set
(Marpa_Recognizer R)
This method returns the Earley set ID (ordinal) of the latest
Earley set. Applications that want the value of the latest earleme
can convert this value using the 'marpa_r_earleme()' method.
Return value: On success, the ID of the latest Earley set. Always
succeeds.
-- Function: int marpa_r_latest_earley_set_value_set ( Marpa_Recognizer
R, int value)
Sets the integer value of the latest Earley set. For more details,
see the description of 'marpa_r_latest_earley_set_values_set()'.
Return value: On success, the new integer value of the latest Earley set. On failure,
-2.
-- Function: int marpa_r_latest_earley_set_values_set (
Marpa_Recognizer R, int value, void* pvalue)
Sets the integer and pointer value of the latest Earley set. For
more about the "integer value" and "pointer value" of an Earley
set, see the description of the 'marpa_r_earley_set_values()'
method.
Return value: On success, returns a non-negative integer. On
failure, returns -2.
File: api.info, Node: Other parse status methods, Prev: Location accessors, Up: Recognizer methods
14.6 Other parse status methods
===============================
-- Function: int marpa_r_earley_item_warning_threshold
(Marpa_Recognizer R)
Returns the Earley item warning threshold. *Note
marpa_r_earley_item_warning_threshold_set():
marpa_r_earley_item_warning_threshold_set.
*Return value*: The Earley item warning threshold. Always
succeeds.
-- Function: int marpa_r_earley_item_warning_threshold_set
(Marpa_Recognizer R, int THRESHOLD)
[Mutator] On success, sets the Earley item warning threshold. The
"Earley item warning threshold" is a number that is compared with
the count of Earley items in each Earley set. When it is matched
or exceeded, a 'MARPA_EVENT_EARLEY_ITEM_THRESHOLD' event is
created. *Note MARPA_EVENT_EARLEY_ITEM_THRESHOLD::.
If THRESHOLD is zero or less, an unlimited number of Earley items
will be allowed without warning. This will rarely be what the user
wants.
By default, Libmarpa calculates a value based on the grammar. The
formula Libmarpa uses is the result of some experience, and most
applications will be happy with it.
What should be done when the threshold is exceeded depends on the
application, but exceeding the threshold means that it is very
likely that the time and space resources consumed by the parse will
prove excessive. This is often a sign of a bug in the grammar.
Applications often will want to smoothly shut down the parse, in
effect treating the 'MARPA_EVENT_EARLEY_ITEM_THRESHOLD' event as
equivalent to library-recoverable hard failure.
Return value: The value that the Earley item warning threshold has
after the method call is finished. Always succeeds.
-- Function: int marpa_r_is_exhausted (Marpa_Recognizer R)
A parser is "exhausted" if it cannot accept any more input. Both
successful and failed parses can be exhausted. In many grammars,
the parse is always exhausted as soon as it succeeds. Good parses
may also exist at earlemes prior to the current one.
Return value: 1 if the parser is exhausted, 0 otherwise. Always
succeeds.
-- Function: int marpa_r_terminals_expected ( Marpa_Recognizer R,
Marpa_Symbol_ID* BUFFER)
Returns a list of the IDs of the symbols that are acceptable as
tokens at the current earleme. BUFFER is expected to be large
enough to hold the result. This is guaranteed to be the case if
the buffer is large enough to hold a number of 'Marpa_Symbol_ID''s
that is greater than or equal to the number of symbols in the
grammar.
Return value: On success, the number of 'Marpa_Symbol_ID''s in
BUFFER. On failure, -2.
-- Function: int marpa_r_terminal_is_expected ( Marpa_Recognizer R,
Marpa_Symbol_ID SYMBOL_ID)
Return values on success: If SYMBOL_ID is the ID of a valid
terminal symbol that is expected at the current earleme, a number
greater than zero. If SYMBOL_ID is the ID of a valid terminal
symbol that is *not* expected at the current earleme, or if
SYMBOL_ID is the ID of a valid symbol that is not a terminal, zero.
Failure cases: Returns -2 on failure. It is a failure if SYMBOL_ID
is not the ID of a valid symbol.
File: api.info, Node: Progress reports, Next: Bocage methods, Prev: Recognizer methods, Up: Top
15 Progress reports
*******************
An important advantage of the Marpa algorithm is the ability to easily
get full information about the state of the parse.
To start a progress report, use the 'marpa_r_progress_report_start()'
method. Only one progress report can be in use at any one time.
To get the information in a progress report, it is necessary to step
through the progress report items. To get the data for the current
progress report item, and advance to the next one, use the
'marpa_r_progress_item()' method.
To destroy a progress report, freeing the memory it uses, call the
'marpa_r_progress_report_finish()' method.
-- Function: int marpa_r_progress_report_reset ( Marpa_Recognizer R)
Resets the progress report. Assumes a report of the progress has
already been initialized at some Earley set for recognizer R, with
'marpa_r_progress_report_start()'. The reset progress report will
be positioned before its first item.
Return value: On success, a non-negative value. On failure, -2.
-- Function: int marpa_r_progress_report_start ( Marpa_Recognizer R,
Marpa_Earley_Set_ID SET_ID)
Initializes a report of the progress at Earley set SET_ID for
recognizer R. If a progress report already exists, it is destroyed
and its memory is freed. Initially, the progress report is
positioned before its first item.
If no Earley set with ID SET_ID exists,
'marpa_r_progress_report_start()' fails. The error code is
'MARPA_ERR_INVALID_LOCATION' if SET_ID is negative. The error code
is 'MARPA_ERR_NO_EARLEY_SET_AT_LOCATION' if SET_ID is greater than
the ID of the latest Earley set.
Return value: On success, the number of report items available. If
the recognizer has not been started; if SET_ID does not exist; or
on other failure, -2.
-- Function: int marpa_r_progress_report_finish ( Marpa_Recognizer R )
Destroys the current progress report for
recognizer R, freeing the memory and other resources. It is often
not necessary to call this method. Any previously existing
progress report is destroyed automatically whenever a new progress
report is started, and when the recognizer is destroyed.
Return value: -2 if no progress report has been started, or on
other failure. On success, a non-negative value.
-- Function: Marpa_Rule_ID marpa_r_progress_item ( Marpa_Recognizer R,
int* POSITION, Marpa_Earley_Set_ID* ORIGIN )
This method allows access to the data for the next item of a
progress report. If there are no more progress report items, it
returns -1 as a termination indicator and sets the error code to
'MARPA_ERR_PROGRESS_REPORT_EXHAUSTED'. Either the termination
indicator, or the item count returned by
'marpa_r_progress_report_start()', can be used to determine when
the last item has been seen.
On success, the dot position is returned in the location pointed to
by the POSITION argument, and the origin is returned in the
location pointed to by the ORIGIN argument. On failure, the
locations pointed to by the POSITION and ORIGIN arguments are
unchanged.
Return value: On success, the rule ID of the next progress report
item. If there are no more progress report items, -1. If either
the POSITION or the ORIGIN argument is 'NULL', or on other failure,
-2.
File: api.info, Node: Bocage methods, Next: Ordering methods, Prev: Progress reports, Up: Top
16 Bocage methods
*****************
* Menu:
* Bocage overview::
* Bocage constructor::
* Bocage reference counting::
* Bocage accessor::
File: api.info, Node: Bocage overview, Next: Bocage constructor, Prev: Bocage methods, Up: Bocage methods
16.1 Overview
=============
A bocage is a structure containing the full set of parses found by
processing the input according to the grammar. The bocage structure is
new with Libmarpa, but is very similar in purpose to the more familiar
parse forests.
To create a bocage, use the 'marpa_b_new()' method.
When a bocage is no longer in use, its memory can be freed using the
'marpa_b_unref()' method.
File: api.info, Node: Bocage constructor, Next: Bocage reference counting, Prev: Bocage overview, Up: Bocage methods
16.2 Creating a new bocage
==========================
-- Function: Marpa_Bocage marpa_b_new (Marpa_Recognizer R,
Marpa_Earley_Set_ID EARLEY_SET_ID)
Creates a new bocage object, with a reference count of 1. The
reference count of its parent recognizer object, R, is increased by
1. If EARLEY_SET_ID is -1, the Earley set at the current earleme
is used, if there is one.
If EARLEY_SET_ID is -1 and there is no Earley set at the current
earleme; or if EARLEY_SET_ID is non-negative and there is no parse
ending at Earley set EARLEY_SET_ID, 'marpa_b_new()' fails and the
error code is set to 'MARPA_ERR_NO_PARSE'.
Return value: On success, the new bocage object. On
failure, 'NULL'.
File: api.info, Node: Bocage reference counting, Next: Bocage accessor, Prev: Bocage constructor, Up: Bocage methods
16.3 Reference counting
=======================
-- Function: Marpa_Bocage marpa_b_ref (Marpa_Bocage B)
Increases the reference count by 1. Not needed by most
applications.
Return value: On success, B. On failure, 'NULL'.
-- Function: void marpa_b_unref (Marpa_Bocage B)
Decreases the reference count by 1, destroying B once the reference
count reaches zero. When B is destroyed, the reference count of
its parent recognizer is decreased by 1. If this takes the
reference count of the parent recognizer to zero, it too is
destroyed. If the parent recognizer is destroyed, the reference
count of its base grammar is decreased by 1. If this takes the
reference count of the base grammar to zero, it too is destroyed.
File: api.info, Node: Bocage accessor, Prev: Bocage reference counting, Up: Bocage methods
16.4 Accessors
==============
-- Function: int marpa_b_ambiguity_metric (Marpa_Bocage B)
Returns an ambiguity metric. The metric is 1 if the parse is
unambiguous. If the metric is 2 or greater, the parse is
ambiguous. It was originally intended to have values greater than
2 be a cheaply computed estimate of the degree of ambiguity, but a
satisfactory scheme for this has yet to be implemented.
Return value on success: 1 if the bocage is not for an ambiguous
parse; 2 or greater if the bocage is for an ambiguous parse.
Failures: On failure, -2.
-- Function: int marpa_b_is_null (Marpa_Bocage B)
Return value on success: A number greater than or equal to 1 if the
bocage is for a null parse; otherwise, 0.
Failures: On failure, -2.
File: api.info, Node: Ordering methods, Next: Tree methods, Prev: Bocage methods, Up: Top
17 Ordering methods
*******************
* Menu:
* Ordering overview::
* Ordering constructor::
* Ordering reference counting::
* Order accessor::
* Non-default ordering::
File: api.info, Node: Ordering overview, Next: Ordering constructor, Prev: Ordering methods, Up: Ordering methods
17.1 Overview
=============
Before iterating the parses in the bocage, they must be ordered. To
create an ordering, use the 'marpa_o_new()' method. When an ordering is
no longer in use, its memory can be freed using the 'marpa_o_unref()'
method.
An ordering is "frozen" once the first tree iterator is created using
it. A frozen ordering cannot be changed.
As of this writing, the only methods to order parses are internal and
undocumented. This is expected to change.
File: api.info, Node: Ordering constructor, Next: Ordering reference counting, Prev: Ordering overview, Up: Ordering methods
17.2 Creating an ordering
=========================
-- Function: Marpa_Order marpa_o_new ( Marpa_Bocage B)
Creates a new ordering object, with a reference count of 1. The
reference count of its parent bocage object, B, is increased by 1.
Return value: On success, the new ordering object. On failure,
'NULL'.
File: api.info, Node: Ordering reference counting, Next: Order accessor, Prev: Ordering constructor, Up: Ordering methods
17.3 Reference counting
=======================
-- Function: Marpa_Order marpa_o_ref ( Marpa_Order O)
Increases the reference count by 1. Not needed by most
applications.
Return value: On success, O. On failure, 'NULL'.
-- Function: void marpa_o_unref ( Marpa_Order O)
Decreases the reference count by 1, destroying O once the reference
count reaches zero. Beginning with O's parent bocage, Libmarpa
then proceeds up the chain of parent objects. Every time a child
is destroyed, the reference count of its parent is decreased by 1.
Every time the reference count of an object is decreased by 1, if
that reference count is now zero, that object is destroyed.
Libmarpa follows this chain of decrements and destructions as
required, all the way back to the base grammar, if necessary.
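The chain of decrements and destructions can be pictured with a toy
model. The sketch below is not Libmarpa's implementation: 'struct
node', 'node_new()', 'node_unref()' and the 'live_objects' counter are
hypothetical names, standing in for the grammar, recognizer, bocage and
ordering objects and their constructors and 'unref' methods.

```c
#include <stdlib.h>

/* Toy stand-in for a Libmarpa object with a parent object. */
struct node {
    int ref_count;
    struct node *parent;  /* NULL for the base object */
};

/* Number of objects currently alive, kept only for demonstration. */
int live_objects = 0;

/* Create a child of parent with a reference count of 1.
 * The child holds one reference on its parent. */
struct node *node_new(struct node *parent)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->ref_count = 1;
    n->parent = parent;
    if (parent != NULL)
        parent->ref_count++;
    live_objects++;
    return n;
}

/* Drop one reference.  Whenever a count reaches zero, destroy the
 * object and drop the reference it held on its parent, continuing
 * up the chain for as long as counts keep reaching zero. */
void node_unref(struct node *n)
{
    while (n != NULL) {
        struct node *parent;
        if (--n->ref_count > 0)
            break;
        parent = n->parent;
        free(n);
        live_objects--;
        n = parent;
    }
}
```

In this model, an application that has already released its own
references to the parents destroys the whole chain with a single
'node_unref()' call on the youngest child.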
File: api.info, Node: Order accessor, Next: Non-default ordering, Prev: Ordering reference counting, Up: Ordering methods
17.4 Accessors
==============
-- Function: int marpa_o_ambiguity_metric (Marpa_Order O)
Returns an ambiguity metric. The metric is 1 if the parse is
unambiguous. If the metric is 2 or greater, the parse is
ambiguous. It was originally intended to have values greater than
2 be a cheaply computed estimate of the degree of ambiguity, but a
satisfactory scheme for this has yet to be implemented.
If the ordering is not already frozen, it will be frozen on return
from 'marpa_o_ambiguity_metric()'. 'marpa_o_ambiguity_metric()' is
considered an "accessor", because it is assumed that the ordering
is frozen when 'marpa_o_ambiguity_metric()' is called.
Return value on success: 1 if the ordering is not for an ambiguous
parse; 2 or greater if the ordering is for an ambiguous parse.
Failures: On failure, -2.
-- Function: int marpa_o_is_null (Marpa_Order O)
Return value on success: A number greater than or equal to 1 if the
ordering is for a null parse; otherwise, 0.
Failures: On failure, -2.
File: api.info, Node: Non-default ordering, Prev: Order accessor, Up: Ordering methods
17.5 Non-default ordering
=========================
-- Function: int marpa_o_high_rank_only_set ( Marpa_Order O, int FLAG)
-- Function: int marpa_o_high_rank_only ( Marpa_Order O)
These methods, respectively, set and query the "high rank only"
flag of ordering O. A FLAG of 1 indicates that, when ranking, all
choices should be discarded except those of the highest rank. A
FLAG of 0 indicates that no choices should be discarded on the
basis of their rank.
A value of 1 is the default. The value of the "high rank only"
flag has no effect unless ranking has been turned on using the
'marpa_o_rank()' method.
Return value: On success, the value of the "high rank only" flag
*after* the call. On failure, -2.
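The effect of the "high rank only" flag on a set of competing choices
can be illustrated with a small filter. This is a sketch of the
semantics described above, not of how Libmarpa implements them; the
function name and the representation of choices as a plain array of
integer ranks are assumptions.

```c
#include <stddef.h>

/* Hypothetical illustration, not a Libmarpa function.
 * ranks[0..count-1] are the ranks of the competing choices.
 * With flag 1, keeps only the choices of the highest rank, compacting
 * them to the front of the array in their original order.
 * With flag 0, discards nothing.
 * Returns the number of choices that remain. */
int keep_high_rank_only(int *ranks, int count, int flag)
{
    int i, kept = 0, highest;

    if (ranks == NULL || count <= 0)
        return 0;
    if (!flag)
        return count;

    highest = ranks[0];
    for (i = 1; i < count; i++)
        if (ranks[i] > highest)
            highest = ranks[i];

    for (i = 0; i < count; i++)
        if (ranks[i] == highest)
            ranks[kept++] = ranks[i];
    return kept;
}
```

Note that several choices can share the highest rank, so a flag of 1
does not by itself guarantee a single remaining choice.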
-- Function: int marpa_o_rank ( Marpa_Order O )
By default, the ordering of parse trees is arbitrary. This method
causes the ordering to be ranked according to the ranks of symbols
and rules, the "null ranks high" flags of the rules, and the "high
rank only" flag of the ordering. Once this method returns, the
ordering is frozen.
Return value: On success, a non-negative value. On failure, -2.
File: api.info, Node: Tree methods, Next: Value methods, Prev: Ordering methods, Up: Top
18 Tree methods
***************
* Menu:
* Tree overview::
* Tree constructor::
* Tree reference counting::
* Tree iteration::
File: api.info, Node: Tree overview, Next: Tree constructor, Prev: Tree methods, Up: Tree methods
18.1 Overview
=============
Once the bocage has an ordering, the parse trees can be iterated.
Marpa's "parse tree iterators" iterate the parse trees contained in a
bocage object. In Libmarpa, "parse tree iterators" are usually just
called "trees".
To create a tree, use the 'marpa_t_new()' method. A newly created
tree iterator is positioned before the first parse tree. When a tree
iterator is no longer in use, its memory can be freed using the
'marpa_t_unref()' method.
To position a newly created tree iterator at the first parse tree,
use the 'marpa_t_next()' method. Once the tree iterator is positioned
at a parse tree, the same 'marpa_t_next()' method is used to position it
to the next parse tree.
File: api.info, Node: Tree constructor, Next: Tree reference counting, Prev: Tree overview, Up: Tree methods
18.2 Creating a new tree iterator
=================================
-- Function: Marpa_Tree marpa_t_new (Marpa_Order O)
Creates a new tree iterator, with a reference count of 1. The
reference count of its parent ordering object, O, is increased by
1.
When initialized, a tree iterator is positioned before the first
parse tree. To position the tree iterator to the first parse, the
application must call 'marpa_t_next()'.
Return value: On success, a newly created tree. On failure,
'NULL'.
File: api.info, Node: Tree reference counting, Next: Tree iteration, Prev: Tree constructor, Up: Tree methods
18.3 Reference counting
=======================
-- Function: Marpa_Tree marpa_t_ref (Marpa_Tree T)
Increases the reference count by 1. Not needed by most
applications.
Return value: On success, T. On failure, 'NULL'.
-- Function: void marpa_t_unref (Marpa_Tree T)
Decreases the reference count by 1, destroying T once the reference
count reaches zero. Beginning with T's parent ordering, Libmarpa
then proceeds up the chain of parent objects. Every time a child
is destroyed, the reference count of its parent is decreased by 1.
Every time the reference count of an object is decreased by 1, if
that reference count is now zero, that object is destroyed.
Libmarpa follows this chain of decrements and destructions as
required, all the way back to the base grammar, if necessary.
File: api.info, Node: Tree iteration, Prev: Tree reference counting, Up: Tree methods
18.4 Iterating through the trees
================================
-- Function: int marpa_t_next ( Marpa_Tree T)
Positions T at the next parse tree in the iteration. Tree
iterators are initialized to the position before the first parse
tree, so this method must be called before creating a valuator from
a tree.
If a tree iterator is positioned after the last parse, the tree is
said to be "exhausted". A tree iterator for a bocage with no parse
trees is considered to be "exhausted" when initialized. If the
tree iterator is exhausted, 'marpa_t_next()' returns -1 as a
termination indicator, and sets the error code to
'MARPA_ERR_TREE_EXHAUSTED'.
Return value: On success, a non-negative value. If the tree
iterator is exhausted, -1. On failure, -2.
-- Function: int marpa_t_parse_count ( Marpa_Tree T)
The parse counter counts the number of parse trees traversed so
far. The count includes the current iteration of the tree, so that
a value of 0 indicates that the tree iterator is at its initialized
position, before the first parse tree.
Return value: The number of parses traversed so far. Always
succeeds.
File: api.info, Node: Value methods, Next: Events, Prev: Tree methods, Up: Top
19 Value methods
****************
* Menu:
* Value overview::
* How to use the valuator::
* Advantages of step-driven valuation::
* Maintaining the stack::
* Valuator constructor::
* Valuator reference counting::
* Stepping through the valuator::
* Valuator steps by type::
* Basic step accessors::
* Other step accessors::
File: api.info, Node: Value overview, Next: How to use the valuator, Prev: Value methods, Up: Value methods
19.1 Overview
=============
The archetypal application needs a value object (or "valuator") to
produce the value of the parse. To create a valuator, use the
'marpa_v_new()' method. When a valuator is no longer in use, its memory
can be freed using the 'marpa_v_unref()' method.
The application is required to maintain the stack, and the
application is also required to implement most of the semantics,
including the evaluation of rules. Libmarpa's valuator provides
instructions to the application on how to manipulate the stack. To
iterate through this series of instructions, use the 'marpa_v_step()'
method.
When successful, 'marpa_v_step()' returns the type of step. Most
step types have values associated with them. To access these values use
the methods described in the section *note Basic step accessors::. How
to perform the steps is described in the sections *note How to use the
valuator:: and *note Stepping through the valuator::.
File: api.info, Node: How to use the valuator, Next: Advantages of step-driven valuation, Prev: Value overview, Up: Value methods
19.2 How to use the valuator
============================
Libmarpa's valuator provides the application with "steps", which are
instructions for stack manipulation. Libmarpa itself does not maintain
a stack. This leaves the upper layer in total control of the stack and
the values which are placed on it.
An example may make this clearer. Suppose the evaluation is at a
place in the parse tree where an addition is being performed. Libmarpa
does not know that the operation is an addition. It will tell the
application that rule number R is to be applied to the arguments at
stack locations N and N+1, and that the result is to be placed in stack
location N.
In this system the application keeps track of the semantics for all
rules, so it looks up rule R and determines that it is an addition. The
application can do this by using R as an index into an array of
callbacks, or by any other method it chooses. Let's assume a callback
implements the semantics for rule R. Libmarpa has told the application
that two arguments are available for this operation, and that they are
at locations N and N+1 in the stack. They might be the numbers 42 and
711. So the callback is called with its two arguments, and produces a
return value, let's say, 753. Libmarpa has told the application that
the result belongs at location N in the stack, so the application writes
753 to location N.
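The walk-through above can be sketched as a table of callbacks indexed
by rule ID. Everything here is hypothetical application code: the rule
IDs, the callback type, and 'apply_rule_step()' are illustrative names,
and the stack holds 'long' values only for simplicity. The arguments
'arg_0', 'arg_n' and 'result' stand for the stack locations that the
valuator reports for a rule step.

```c
#include <stddef.h>

/* Application-defined semantics for one rule. */
typedef long (*rule_callback)(const long *args, int arg_count);

static long do_add(const long *args, int arg_count)
{
    long sum = 0;
    int i;
    for (i = 0; i < arg_count; i++)
        sum += args[i];
    return sum;
}

static long do_multiply(const long *args, int arg_count)
{
    long product = 1;
    int i;
    for (i = 0; i < arg_count; i++)
        product *= args[i];
    return product;
}

/* Hypothetical rule IDs; a real application would use the IDs that
 * were returned when it added its rules to the grammar. */
enum { RULE_ADD = 0, RULE_MULTIPLY = 1, RULE_COUNT = 2 };

static const rule_callback rule_semantics[RULE_COUNT] = {
    do_add, do_multiply
};

/* Apply the semantics of rule_id to stack[arg_0 .. arg_n] and write
 * the result to stack[result].  Returns 0 on success, or -1 if the
 * rule ID or the argument range is not valid. */
int apply_rule_step(long *stack, int rule_id,
                    int arg_0, int arg_n, int result)
{
    if (rule_id < 0 || rule_id >= RULE_COUNT)
        return -1;
    if (arg_0 < 0 || arg_n < arg_0 || result < 0)
        return -1;
    stack[result] =
        rule_semantics[rule_id](stack + arg_0, arg_n - arg_0 + 1);
    return 0;
}
```

With 42 and 711 at locations N and N+1, a call with 'RULE_ADD' leaves
753 at location N, as in the example in the text.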
Since Libmarpa knows nothing about the semantics, the operation for
rule R could be string concatenation instead of addition. Or, if it is
addition, it could allow for its arguments to be floating point or
complex numbers. Since the application maintains the stack, it is up to
the application whether the stack contains integers, strings, complex
numbers, or polymorphic objects which are capable of being any of these
things and more.
File: api.info, Node: Advantages of step-driven valuation, Next: Maintaining the stack, Prev: How to use the valuator, Up: Value methods
19.3 Advantages of step-driven valuation
========================================
Step-driven valuation hides Libmarpa's grammar rewrites from the
application, and is quite efficient. Libmarpa knows which rules are
sequences. Libmarpa optimizes stack manipulations based on this
knowledge. Long sequences are very common in practical grammars. For
these, the stack manipulations suggested by Libmarpa's step-driven
valuator will be significantly faster than the traditional stack
evaluation algorithm.
Step-driven evaluation has another advantage. To illustrate this,
consider what is a very common case: The semantics are implemented in a
higher-level language, using callbacks. If Libmarpa did not use
step-driven valuation, it would need to provide for this case. But for
generality, Libmarpa would have to deal in C callbacks. Therefore, a
middle layer would have to create C language wrappers for the callbacks
in the higher level language.
The implementation that results is this: The higher level language
would need to wrap each callback in C. When calling Libmarpa, it would
pass the wrappered callback. Libmarpa would then need to call the C
language "wrappered" callback. Next, the wrapper would call the
higher-level language callback. The return value, which would be data
native to the higher-level language, would need to be passed to the C
language wrapper, which would need to make arrangements for it to be
passed back to the higher-level language when appropriate.
A setup like this is not terribly efficient. And exception handling
across language boundaries would be very tricky. But neither of these
is the worst problem.
Callbacks are hard to debug. Wrappered callbacks are even worse.
Calls made across language boundaries are harder yet to debug. In the
system described above, by the time a return value is finally consumed,
a language boundary will have been crossed four times.
How do Libmarpa users deal with difficulties like this? Usually, by
doing the absolute minimum possible in the callbacks. A horrific
debugging environment can become a manageable one if there is next to no
code to be debugged. And this can be accomplished by doing as much as
possible in pre- and post-processing.
In essence, callbacks force applications to do most of the
programming via side effects. One need not be a functional programming
purist to find this a very undesirable style of design to force on an
application. But the ability to debug can make the difference between
code that does work and code that does not. Unfairly or not, code is
rarely considered well-designed when it does not work.
So, while step-driven valuation seems a roundabout approach, it is
simpler and more direct than the likely alternatives. And there is
something to be said for pushing semantics up to the higher levels --
they can be expected to know more about it.
These advantages of step-driven valuation are strictly in the context
of a low-level interface. The author is under no illusion that direct
use of Libmarpa's valuator will be found satisfactory by most Libmarpa
users, even those using the C language. The author certainly avoids
using step-driven valuation directly. Libmarpa's valuator is intended
to be used via an upper layer, one which *does* know about semantics.
File: api.info, Node: Maintaining the stack, Next: Valuator constructor, Prev: Advantages of step-driven valuation, Up: Value methods
19.4 Maintaining the stack
==========================
This section discusses in detail the requirements for maintaining the
stack. In some cases, such as implementation using a Perl array,
fulfilling these requirements is trivial. Perl auto-extends its arrays,
and initializes the element values, on every read or write. For the C
programmer, things are not quite so easy.
In this section, we will assume a C90 or C99 standard-conformant C
application. This assumption is convenient on two grounds. First, this
will be the intended use for many readers. Second, standard-conformant
C is a "worst case". Any issue faced by a programmer of another
environment is likely to also be one that must be solved by the C
programmer.
Libmarpa often optimizes away unnecessary writes to stack
locations. When it does so, it will not necessarily optimize away all
reads to that stack location. This means that a location's first
access, as suggested by the Libmarpa step instructions, may be a read.
This possibility requires a special awareness from the C programmer, as
discussed in the sections *note Sizing the stack:: and *note
Initializing locations in the stack::.
In the discussions in this document, stack locations are non-negative
integers. The bottom of the stack is location 0. In moving from the
bottom of the stack to the top, the numbers increase. Stack location Y
is said to be "greater" than stack location X if stack location Y is
closer to the top of stack than location X, and therefore stack
locations are considered greater or lesser if the integers that
represent them are greater or lesser. Another way to state that a stack
location Y is greater (lesser) than stack location X is to say that a
stack location Y is later (earlier) than stack location X.
* Menu:
* Sizing the stack::
* Initializing locations in the stack::
File: api.info, Node: Sizing the stack, Next: Initializing locations in the stack, Prev: Maintaining the stack, Up: Maintaining the stack
19.4.1 Sizing the stack
-----------------------
If an implementation applies Libmarpa's step instructions literally,
using a physical stack, it must make sure the stack is large enough.
Specifically, the application must do the following:
* Ensure location 0 exists -- in other words that the stack is at
least length 1.
* For 'MARPA_STEP_TOKEN' steps, ensure that location
'marpa_v_result(v)' exists.
* For 'MARPA_STEP_NULLING_SYMBOL' steps, ensure that location
'marpa_v_result(v)' exists.
* For 'MARPA_STEP_RULE' steps, ensure that stack locations from
'marpa_v_arg_0(v)' to 'marpa_v_arg_n(v)' exist.
Three aspects of these requirements deserve special mention. First,
note that the requirement for a 'MARPA_STEP_RULE' is that the
application size the stack to include the arguments to be read. Because
stack writes may be optimized away, an application, when reading, cannot
assume that the stack was sized appropriately by a prior write. The
first access to a new stack location may be a read.
Second, note that there is no explicit requirement that the
application size the stack to include the location for the result of the
'MARPA_STEP_RULE' step. An application is allowed to assume that result
will go into one of the locations that were read.
Third, special note should be made of the requirement that location 0
exist. By convention, the parse result resides in location 0 of the
stack. Because of potential optimizations, an application cannot assume
that it will receive a Libmarpa step instruction that either reads from
or writes to location 0.
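The sizing rules above can be sketched as a small helper that reports the minimum stack length a step needs. This is only an illustration: the 'demo_kind' enumeration and 'demo_required_length()' are hypothetical names, not part of Libmarpa, and the integer arguments stand in for the values that 'marpa_v_result(v)', 'marpa_v_arg_0(v)' and 'marpa_v_arg_n(v)' would report.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the Libmarpa step types.  A real
   application would switch on the MARPA_STEP_* macros instead. */
enum demo_kind {
    DEMO_KIND_RULE,
    DEMO_KIND_TOKEN,
    DEMO_KIND_NULLING_SYMBOL,
    DEMO_KIND_OTHER
};

/* Return the minimum stack length required before acting on a step.
   result, arg_0 and arg_n stand in for marpa_v_result(v),
   marpa_v_arg_0(v) and marpa_v_arg_n(v). */
size_t demo_required_length(enum demo_kind kind,
                            int result, int arg_0, int arg_n)
{
    size_t needed = 1;      /* location 0 must always exist */
    int highest = 0;

    switch (kind) {
    case DEMO_KIND_RULE:
        /* The arguments to be read must exist.  Taking the larger
           of the two bounds is a conservative choice made here. */
        highest = arg_n > arg_0 ? arg_n : arg_0;
        break;
    case DEMO_KIND_TOKEN:
    case DEMO_KIND_NULLING_SYMBOL:
        highest = result;   /* the location to be written */
        break;
    default:
        break;              /* no stack access is implied */
    }
    if (highest >= 0 && (size_t)highest + 1 > needed)
        needed = (size_t)highest + 1;
    return needed;
}
```

An application would call such a helper before every step and grow its stack whenever the returned length exceeds the current one.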
File: api.info, Node: Initializing locations in the stack, Prev: Sizing the stack, Up: Maintaining the stack
19.4.2 Initializing locations in the stack
------------------------------------------
Write optimizations also create issues for implementations which
require data to be initialized before reading. Every fully
standard-conforming C application is such an implementation. Both C90
and C99 allow "trap values", and therefore conforming applications must
be prepared for an uninitialized location to contain one of those.
Reading a trap value may cause an abend. (It is safe, in
standard-conforming C, to write to a location containing a trap value.)
The requirement that locations be initialized before reading occurs
in other implementations. Any implementation that has a "universe" of
"safe" values may require special precautions. The required
precautions may amount to a need to initialize "uninitialized" values.
A practical example might be an implementation that expects all
locations to contain a pointer which it can safely indirect from. In
such implementations, just as in standard-conformant C, every stack
location needs to be initialized before being read.
Due to write optimizations, an application cannot rely on Libmarpa's
step instructions to initialize every stack location before its first
read. One way to safely deal with the initialization of stack
locations is to do all of the following:
* When starting evaluation, ensure that the stack contains at least
location 0.
* Also, when starting evaluation, initialize every location in the
stack.
* Whenever the stack is extended, initialize every stack location
added.
Applications which try to optimize out some of these initializations
need to be aware that an application can never assume that activity in
the stack is safely "beyond" an uninitialized location. Libmarpa steps
often revisit earlier sections of the stack, and these revisits may
include reads of previously unvisited stack locations.
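A minimal sketch of the three precautions listed above, assuming a stack of 'int' values in which 0 is a safe initial value. The names 'demo_stack' and 'demo_stack_ensure()' are hypothetical and are not part of Libmarpa.

```c
#include <stdlib.h>

/* A growable evaluation stack.  Every location receives a known
   value (0) as soon as it comes into existence. */
struct demo_stack {
    int *data;
    size_t length;
};

/* Ensure that locations 0 .. min_length-1 exist, initializing every
   location that is added.  Returns 0 on success, or -1 if memory
   cannot be allocated, in which case the stack is unchanged. */
int demo_stack_ensure(struct demo_stack *s, size_t min_length)
{
    size_t i;
    int *grown;

    if (min_length < 1)
        min_length = 1;             /* location 0 must exist */
    if (min_length <= s->length)
        return 0;                   /* already large enough */

    grown = realloc(s->data, min_length * sizeof *grown);
    if (!grown)
        return -1;

    /* Initialize only the newly added locations. */
    for (i = s->length; i < min_length; i++)
        grown[i] = 0;

    s->data = grown;
    s->length = min_length;
    return 0;
}
```

Starting from '{ NULL, 0 }' and calling 'demo_stack_ensure(&s, 1)' before the first step covers the first two precautions. Calling it again before each later step covers the third.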
File: api.info, Node: Valuator constructor, Next: Valuator reference counting, Prev: Maintaining the stack, Up: Value methods
19.5 Creating a new valuator
============================
-- Function: Marpa_Value marpa_v_new ( Marpa_Tree T )
Creates a new valuator. The parent object of the new valuator will
be the tree iterator T, and the reference count of the new valuator
will be 1. The reference count of T is increased by 1.
The parent tree iterator is "paused", so that the tree iterator
cannot move on to a new parse tree until the valuator is destroyed.
Many valuators of the same parse tree can exist at once. A tree
iterator is "unpaused" when all of the valuators of that tree
iterator are destroyed.
Return value: On success, the newly created valuator. On failure,
'NULL'.
File: api.info, Node: Valuator reference counting, Next: Stepping through the valuator, Prev: Valuator constructor, Up: Value methods
19.6 Reference counting
=======================
-- Function: Marpa_Value marpa_v_ref (Marpa_Value V)
Increases the reference count by 1. Not needed by most
applications.
Return value: On success, V. On failure, 'NULL'.
-- Function: void marpa_v_unref ( Marpa_Value V)
Decreases the reference count by 1, destroying V once the reference
count reaches zero. Beginning with V's parent tree, Libmarpa then
proceeds up the chain of parent objects. Every time a child is
destroyed, the reference count of its parent is decreased by 1.
Every time the reference count of an object is decreased by 1, if
that reference count is now zero, that object is destroyed.
Libmarpa follows this chain of decrements and destructions as
required, all the way back to the base grammar, if necessary.
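The chain of decrements and destructions can be illustrated with a toy model. The sketch below does not use Libmarpa and does not reflect its internal layout. 'demo_obj', 'demo_new()' and 'demo_unref()' are hypothetical names that model only the bookkeeping described above.

```c
#include <stdlib.h>

/* A toy object with a reference count and an optional parent. */
struct demo_obj {
    int refcount;
    struct demo_obj *parent;
    int *destroyed_flag;    /* set to 1 when the object is destroyed */
};

/* Create an object with reference count 1.  The new child holds a
   reference to its parent, so the parent's count goes up by 1. */
struct demo_obj *demo_new(struct demo_obj *parent, int *destroyed_flag)
{
    struct demo_obj *o = malloc(sizeof *o);
    if (!o)
        return NULL;
    o->refcount = 1;
    o->parent = parent;
    o->destroyed_flag = destroyed_flag;
    if (parent)
        parent->refcount++;
    return o;
}

/* Decrease the reference count.  When it reaches zero, destroy the
   object and repeat the process on its parent. */
void demo_unref(struct demo_obj *o)
{
    while (o && --o->refcount == 0) {
        struct demo_obj *parent = o->parent;
        if (o->destroyed_flag)
            *o->destroyed_flag = 1;
        free(o);
        o = parent;
    }
}
```

In this model, an application that has already released its own references to a grammar and a tree destroys all three objects with a single 'demo_unref()' of the last child.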
File: api.info, Node: Stepping through the valuator, Next: Valuator steps by type, Prev: Valuator reference counting, Up: Value methods
19.7 Stepping through the valuator
==================================
-- Function: Marpa_Step_Type marpa_v_step ( Marpa_Value V)
This method "steps through" the valuator. The return value is a
'Marpa_Step_Type', an integer which indicates the type of step.
How the application is expected to act on each step is described
below (*note Valuator steps by type::). When the iteration through
the steps is finished, 'marpa_v_step()' returns
'MARPA_STEP_INACTIVE'.
Return value: On success, a 'Marpa_Step_Type', which will always be a
non-negative integer. On failure, -2.
File: api.info, Node: Valuator steps by type, Next: Basic step accessors, Prev: Stepping through the valuator, Up: Value methods
19.8 Valuator steps by type
===========================
-- Macro: Marpa_Step_Type MARPA_STEP_RULE
The semantics of a rule should be performed. The application can
find the value of the rule's children in the stack locations from
'marpa_v_arg_0(v)' to 'marpa_v_arg_n(v)'. The semantics for the
rule whose ID is 'marpa_v_rule(v)' should be executed on these
child values, and the result placed in 'marpa_v_result(v)'. In the
case of a 'MARPA_STEP_RULE' step, the stack location of
'marpa_v_result(v)' is guaranteed to be equal to
'marpa_v_arg_0(v)'.
-- Macro: Marpa_Step_Type MARPA_STEP_TOKEN
The semantics of a non-null token should be performed. The
application's value for the token whose ID is 'marpa_v_token(v)'
should be placed in stack location 'marpa_v_result(v)'. Its value
according to Libmarpa will be in 'marpa_v_token_value(v)'.
-- Macro: Marpa_Step_Type MARPA_STEP_NULLING_SYMBOL
The semantics for a nulling symbol should be performed. The ID of
the symbol is 'marpa_v_symbol(v)' and its value should be placed in
stack location 'marpa_v_result(v)'.
-- Macro: Marpa_Step_Type MARPA_STEP_INACTIVE
The valuator has gone through all of its steps and is now inactive.
The value of the parse will be in stack location 0. Because of
optimizations, it is possible for a valuator to immediately become
inactive -- 'MARPA_STEP_INACTIVE' could be both the first and last
step.
-- Macro: Marpa_Step_Type MARPA_STEP_INITIAL
The valuator is new and has yet to go through any steps.
-- Macro: Marpa_Step_Type MARPA_STEP_INTERNAL1
-- Macro: Marpa_Step_Type MARPA_STEP_INTERNAL2
-- Macro: Marpa_Step_Type MARPA_STEP_TRACE
These step types are reserved for internal purposes.
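How an application might act on each step type can be sketched with a self-contained model. The code below does not call Libmarpa. It replays a recorded array of steps whose fields stand in for the step accessors. All 'demo_' names are hypothetical, and the semantics (a token's value is its token value, a nulling symbol's value is 0, a rule's value is the sum of its children) are chosen only for illustration. The caller is assumed to have sized and initialized the stack.

```c
#include <stddef.h>

enum demo_step_type {
    DEMO_STEP_RULE,
    DEMO_STEP_TOKEN,
    DEMO_STEP_NULLING_SYMBOL,
    DEMO_STEP_INACTIVE
};

/* One recorded step, holding what the accessors would report. */
struct demo_step {
    enum demo_step_type type;
    int result;         /* as from marpa_v_result(v) */
    int arg_0;          /* as from marpa_v_arg_0(v), rules only */
    int arg_n;          /* as from marpa_v_arg_n(v), rules only */
    int token_value;    /* as from marpa_v_token_value(v), tokens only */
};

/* Nonzero if loc is a valid location in a stack of length len. */
int demo_in_stack(int loc, size_t len)
{
    return loc >= 0 && (size_t)loc < len;
}

/* Replay the steps against the stack.  On success, store the value
   found in location 0 in *value_out and return 0.  Return -1 if a
   step refers to a location outside the stack. */
int demo_evaluate(const struct demo_step *steps, size_t n_steps,
                  int *stack, size_t stack_len, int *value_out)
{
    size_t i;

    if (stack_len < 1)
        return -1;                      /* location 0 must exist */
    for (i = 0; i < n_steps; i++) {
        const struct demo_step *s = &steps[i];
        int loc, sum;

        if (s->type == DEMO_STEP_INACTIVE)
            break;                      /* all steps are done */
        if (!demo_in_stack(s->result, stack_len))
            return -1;
        switch (s->type) {
        case DEMO_STEP_TOKEN:
            stack[s->result] = s->token_value;
            break;
        case DEMO_STEP_NULLING_SYMBOL:
            stack[s->result] = 0;
            break;
        case DEMO_STEP_RULE:
            if (s->arg_0 < 0 || s->arg_n < s->arg_0
                || !demo_in_stack(s->arg_n, stack_len))
                return -1;
            sum = 0;
            for (loc = s->arg_0; loc <= s->arg_n; loc++)
                sum += stack[loc];      /* read the child values */
            stack[s->result] = sum;     /* write the rule's value */
            break;
        default:
            break;
        }
    }
    *value_out = stack[0];              /* the parse result */
    return 0;
}
```

Note that the model reads location 0 even when the very first step is the inactive one, which is why location 0 must exist and be initialized before evaluation starts.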
File: api.info, Node: Basic step accessors, Next: Other step accessors, Prev: Valuator steps by type, Up: Value methods
19.9 Basic step accessors
=========================
The basic step accessors are so called because their information is
basic to the stack manipulation. The basic step accessors are
implemented as macros. They always succeed.
-- Macro: int marpa_v_arg_0 (Marpa_Value V)
For a 'MARPA_STEP_RULE' step, returns the stack location where the
value of the first child can be found.
-- Macro: int marpa_v_arg_n (Marpa_Value V)
For a 'MARPA_STEP_RULE' step, returns the stack location where the
value of the last child can be found.
-- Macro: int marpa_v_result (Marpa_Value V)
For 'MARPA_STEP_RULE', 'MARPA_STEP_TOKEN', and
'MARPA_STEP_NULLING_SYMBOL' steps, returns the stack location where
the result of the semantics should be placed.
-- Macro: Marpa_Rule_ID marpa_v_rule (Marpa_Value V)
For the 'MARPA_STEP_RULE' step, returns the ID of the rule.
-- Macro: Marpa_Step_Type marpa_v_step_type (Marpa_Value V)
Returns the current step type: 'MARPA_STEP_TOKEN',
'MARPA_STEP_RULE', etc. Usually not needed since this is also the
return value of 'marpa_v_step()'.
-- Macro: Marpa_Symbol_ID marpa_v_symbol (Marpa_Value V)
For the 'MARPA_STEP_NULLING_SYMBOL' step, returns the ID of the
symbol. The value returned is the same as that returned by the
'marpa_v_token()' macro.
-- Macro: Marpa_Symbol_ID marpa_v_token (Marpa_Value V)
For the 'MARPA_STEP_TOKEN' step, returns the ID of the token. The
value returned is the same as that returned by the
'marpa_v_symbol()' macro.
-- Macro: int marpa_v_token_value (Marpa_Value V)
For the 'MARPA_STEP_TOKEN' step, returns the integer which is (or
which represents) the value of the token.
File: api.info, Node: Other step accessors, Prev: Basic step accessors, Up: Value methods
19.10 Other step accessors
==========================
This section contains the step accessors that are not basic to stack
manipulation, but which provide other useful information about the
parse. These step accessors are implemented as macros.
All of these accessors always succeed, but if called when they are
irrelevant they return an unspecified value. In this context, an
"unspecified value" is a value that is either -1 or the ID of a valid
Earley set, but which is otherwise unpredictable.
-- Macro: Marpa_Earley_Set_ID marpa_v_es_id (Marpa_Value V)
Return value: If the current step type is 'MARPA_STEP_RULE', the
Earley Set ordinal where the rule ends. If the current step type
is 'MARPA_STEP_TOKEN' or 'MARPA_STEP_NULLING_SYMBOL', the Earley
Set ordinal where the symbol ends. If the current step type is
anything else, an unspecified value.
-- Macro: Marpa_Earley_Set_ID marpa_v_rule_start_es_id (Marpa_Value V)
Return value: If the current step type is 'MARPA_STEP_RULE', the
Earley Set ordinal where the rule begins. If the current step type
is anything else, an unspecified value.
-- Macro: Marpa_Earley_Set_ID marpa_v_token_start_es_id (Marpa_Value V)
Return value: If the current step type is 'MARPA_STEP_TOKEN' or
'MARPA_STEP_NULLING_SYMBOL', the Earley Set ordinal where the token
begins. If the current step type is anything else, an unspecified
value.
File: api.info, Node: Events, Next: Error methods macros and codes, Prev: Value methods, Up: Top
20 Events
*********
* Menu:
* Events overview::
* Basic event accessors::
* Completion events::
* Symbol nulled events::
* Prediction events::
* Symbol expected events::
* Event codes::
File: api.info, Node: Events overview, Next: Basic event accessors, Prev: Events, Up: Events
20.1 Overview
=============
Events are generated by the 'marpa_g_precompute()',
'marpa_r_earleme_complete()', and 'marpa_r_start_input()' methods. The
These methods are called event-active. Event-active methods always clear all
previous events, so that after an event-active method the only events
available will be those generated by that method.
Some Libmarpa methods clear the event queue. The user is expected to
query events immediately after the method that generated them. We note
especially that events are kept in the base grammar, so that multiple
recognizers using the same base grammar overwrite each other's events.
To find out how many events were generated by the last event-active
method, use the 'marpa_g_event_count()' method.
To query a specific event, use the 'marpa_g_event()' and
'marpa_g_event_value()' methods.
Readers of this chapter should be aware that it contains a
mixture of grammar and recognizer methods.
File: api.info, Node: Basic event accessors, Next: Completion events, Prev: Events overview, Up: Events
20.2 Basic event accessors
==========================
-- Function: Marpa_Event_Type marpa_g_event (Marpa_Grammar G,
Marpa_Event* EVENT, int IX)
On success, the type of the IX'th event is returned and the data
for the IX'th event is placed in the location pointed to by EVENT.
Event indexes are in sequence. Valid event indexes will be in the range
from 0 to N, where N is one less than the event count. The event
count can be queried using the 'marpa_g_event_count()' method.
Return value: On success, the type of event IX. If there is no
IX'th event, if IX is negative, or on other failure, -2. On
failure, the locations pointed to by EVENT are not changed.
-- Function: int marpa_g_event_count ( Marpa_Grammar g )
Return value: On success, the number of events. On failure, -2.
-- Macro: int marpa_g_event_value (Marpa_Event* EVENT)
This macro provides access to the "value" of the event. The
semantics of the value varies according to the type of the event,
and is described in the section on event codes (*note Event
codes::).
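The behavior of the event queue can be illustrated with a toy model. The sketch below does not call Libmarpa. 'demo_grammar', 'demo_event_active_call()' and 'demo_event()' are hypothetical names, and the fixed capacity is an arbitrary simplification. It models three points from the text: events are kept in the grammar, an event-active method clears all previous events, and a failed query leaves the caller's event structure unchanged.

```c
#include <stddef.h>

#define DEMO_MAX_EVENTS 8

struct demo_event {
    int type;
    int value;
};

/* Events are kept in the grammar, not in the recognizer. */
struct demo_grammar {
    struct demo_event events[DEMO_MAX_EVENTS];
    int event_count;
};

/* Model of an event-active method: clear all previous events, then
   record the events generated by this call.  Events beyond the toy
   capacity are dropped. */
void demo_event_active_call(struct demo_grammar *g,
                            const struct demo_event *generated, int n)
{
    int i;

    g->event_count = 0;
    for (i = 0; i < n && i < DEMO_MAX_EVENTS; i++)
        g->events[g->event_count++] = generated[i];
}

/* Model of an event query: on success, copy event ix into *event
   and return its type.  On failure, return -2 and leave *event
   unchanged. */
int demo_event(const struct demo_grammar *g,
               struct demo_event *event, int ix)
{
    if (ix < 0 || ix >= g->event_count)
        return -2;
    *event = g->events[ix];
    return event->type;
}
```

Because the queue belongs to the grammar in this model, a second event-active call made on behalf of another recognizer replaces the events of the first, which is why events should be queried immediately.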
File: api.info, Node: Completion events, Next: Symbol nulled events, Prev: Basic event accessors, Up: Events
20.3 Completion events
======================
-- Function: int marpa_g_completion_symbol_activate ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID, int REACTIVATE )
Allows the user to deactivate and reactivate symbol completion
events in the grammar. When a recognizer is created, the
activation status of each of its events is initialized to the
activation status of that event in the base grammar. If REACTIVATE
is zero, the event is deactivated in the grammar. If REACTIVATE is
one, the event is activated in the grammar.
Symbol completion events are active by default if the symbol was
set up for completion events in the grammar. If a symbol was not
set up for completion events in the grammar, symbol completion
events are inactive by default and any attempt to change that is a
fatal error.
The activation status of a completion event in the grammar can only
be changed if the symbol is marked as a completion event symbol in
the grammar, and before the grammar is precomputed. However, if a
symbol is marked as a completion event symbol in the recognizer,
the completion event can be deactivated and reactivated in the
recognizer.
Success cases: On success, the method returns the value of
REACTIVATE. The method succeeds trivially if the symbol is already
set as indicated by REACTIVATE.
Failure cases: If the active status of the completion event for
SYM_ID cannot be set as indicated by REACTIVATE, the method fails.
On failure, -2 is returned.
-- Function: int marpa_r_completion_symbol_activate ( Marpa_Recognizer
R, Marpa_Symbol_ID SYM_ID, int REACTIVATE )
Allows the user to deactivate and reactivate symbol completion
events in the recognizer. If REACTIVATE is zero, the event is
deactivated. If REACTIVATE is one, the event is activated.
Symbol completion events are active by default if the symbol was
set up for completion events in the grammar. If a symbol was not
set up for completion events in the grammar, symbol completion
events are inactive by default and any attempt to change that is a
fatal error.
Success cases: On success, the method returns the value of
REACTIVATE. The method succeeds trivially if the symbol is already
set as indicated by REACTIVATE.
Failure cases: If the active status of the completion event for
SYM_ID cannot be set as indicated by REACTIVATE, the method fails.
On failure, -2 is returned.
-- Function: int marpa_g_symbol_is_completion_event ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
-- Function: int marpa_g_symbol_is_completion_event_set ( Marpa_Grammar
G, Marpa_Symbol_ID SYM_ID, int VALUE)
Libmarpa can be set up to generate a
'MARPA_EVENT_SYMBOL_COMPLETED' event whenever the symbol is
completed. A symbol is said to be *completed* when a non-nulling
rule with that symbol on its LHS is completed.
For completion events to occur, the symbol must be marked as a
completion event symbol. The
'marpa_g_symbol_is_completion_event_set()' function marks symbol
SYM_ID as a completion event symbol if VALUE is 1, and unmarks
it as a completion event symbol if VALUE is 0. The
'marpa_g_symbol_is_completion_event()' method returns the current
value of the completion event marking for symbol SYM_ID.
Marking a completion event sets its activation status to on.
Unmarking a completion event sets its activation status to off.
The completion event marking cannot be changed once the grammar is
precomputed.
If a completion event is marked, its activation status can be
changed using the 'marpa_g_completion_symbol_activate()' method.
Note that, if a symbol is marked as a completion event symbol in
the recognizer, its completion event can be deactivated and
reactivated in the recognizer.
Nulled rules and symbols will never cause completion events.
Nullable symbols may be marked as completion event symbols, but
this will have an effect only when the symbol is not nulled.
Nulling symbols may be marked as completion event symbols, but no
completion events will ever be generated for a nulling symbol.
Note that this implies that no completion event will ever be
generated at earleme 0, the start of parsing.
Success: On success, 1 if symbol SYM_ID is a completion event
symbol after the call, 0 otherwise.
Failures: If SYM_ID is well-formed, but there is no such symbol,
-1. If the grammar G is precomputed; or on other failure, -2.
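The relation between marking and activation in the grammar can be sketched as a small state model. The code below does not call Libmarpa, and every 'demo_' name is hypothetical. It tracks a single symbol. Two details are assumptions of the sketch rather than statements of the text above: values other than 0 and 1 are rejected, and a request that would not change the activation status succeeds trivially even for an unmarked symbol.

```c
/* Completion event state of one symbol in a grammar. */
struct demo_symbol_event {
    int is_marked;      /* completion event marking */
    int is_active;      /* activation status */
};

struct demo_grammar_state {
    int is_precomputed;
    struct demo_symbol_event sym;
};

/* Model of setting the completion event marking.  Marking turns
   the activation status on, and unmarking turns it off.  The
   marking cannot be changed once the grammar is precomputed. */
int demo_event_mark_set(struct demo_grammar_state *g, int value)
{
    if (value != 0 && value != 1)
        return -2;                  /* assumption of this sketch */
    if (g->is_precomputed)
        return -2;
    g->sym.is_marked = value;
    g->sym.is_active = value;
    return value;
}

/* Model of changing the activation status in the grammar.  A
   change requires a marked symbol and a grammar that has not been
   precomputed. */
int demo_event_activate(struct demo_grammar_state *g, int reactivate)
{
    if (reactivate != 0 && reactivate != 1)
        return -2;                  /* assumption of this sketch */
    if (g->is_precomputed)
        return -2;
    if (g->sym.is_active == reactivate)
        return reactivate;          /* trivial success */
    if (!g->sym.is_marked)
        return -2;
    g->sym.is_active = reactivate;
    return reactivate;
}
```

The same model applies, with the names changed, to nulled events and prediction events.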
File: api.info, Node: Symbol nulled events, Next: Prediction events, Prev: Completion events, Up: Events
20.4 Symbol nulled events
=========================
-- Function: int marpa_g_nulled_symbol_activate ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID, int REACTIVATE )
Allows the user to deactivate and reactivate symbol nulled events
in the grammar. When a recognizer is created, the activation
status of each of its events is initialized to the activation
status of that event in the base grammar. If REACTIVATE is zero,
the event is deactivated in the grammar. If REACTIVATE is one, the
event is activated in the grammar.
Symbol nulled events are active by default if the symbol was set up
for nulled events in the grammar. If a symbol was not set up for
nulled events in the grammar, symbol nulled events are inactive by
default and any attempt to change that is a fatal error.
The activation status of a nulled event in the grammar can only be
changed if the symbol is marked as a nulled event symbol in the
grammar, and before the grammar is precomputed. However, if a
symbol is marked as a nulled event symbol in the recognizer, the
nulled event can be deactivated and reactivated in the recognizer.
Success cases: On success, the method returns the value of
REACTIVATE. The method succeeds trivially if the symbol is already
set as indicated by REACTIVATE.
Failure cases: If the active status of the nulled event for SYM_ID
cannot be set as indicated by REACTIVATE, the method fails. On
failure, -2 is returned.
-- Function: int marpa_r_nulled_symbol_activate ( Marpa_Recognizer R,
Marpa_Symbol_ID SYM_ID, int BOOLEAN )
Allows the user to deactivate and reactivate symbol nulled events
in the recognizer. If BOOLEAN is zero, the event is deactivated.
If BOOLEAN is one, the event is activated.
Symbol nulled events are active by default if the symbol was set up
for nulled events in the grammar. If a symbol was not set up for
nulled events in the grammar, symbol nulled events are inactive by
default and any attempt to change that is a fatal error.
Success cases: On success, the method returns the value of BOOLEAN.
The method succeeds trivially if the symbol is already set as
indicated by BOOLEAN.
Failure cases: If the active status of the nulled event for SYM_ID
cannot be set as indicated by BOOLEAN, the method fails. On
failure, -2 is returned.
-- Function: int marpa_g_symbol_is_nulled_event ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
-- Function: int marpa_g_symbol_is_nulled_event_set ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID, int VALUE)
Libmarpa can be set up to generate a 'MARPA_EVENT_SYMBOL_NULLED'
event whenever the symbol is nulled. A symbol is said to be
*nulled* when a zero length instance of that symbol is recognized.
For nulled events to occur, the symbol must be marked as a nulled
event symbol. The 'marpa_g_symbol_is_nulled_event_set()' function
marks symbol SYM_ID as a nulled event symbol if VALUE is 1, and
unmarks it as a nulled event symbol if VALUE is 0. The
'marpa_g_symbol_is_nulled_event()' method returns the current value
of the nulled event marking for symbol SYM_ID.
Marking a nulled event sets its activation status to on. Unmarking
a nulled event sets its activation status to off. The nulled event
marking cannot be changed once the grammar is precomputed.
If a nulled event is marked, its activation status can be changed
using the 'marpa_g_nulled_symbol_activate()' method. Note that, if
a symbol is marked as a nulled event symbol in the recognizer, its
nulled event can be deactivated and reactivated in the recognizer.
As a reminder, a symbol instance is a symbol at a specific location
in the input, and with a specific length. Also, whenever a nulled
symbol instance is recognized at a location, it is acceptable at
that location, and vice versa.
When a symbol instance is recognized at a location, it will
generate a nulled event or a completion event, but never both. A
symbol instance of zero length, when recognized at a location,
generates a nulled event at that location, and does not generate a
completion event. A symbol instance of non-zero length, when
recognized at a location, generates a completion event at that
location, and does not generate a nulled event.
When a symbol instance is acceptable at a location, it will
generate a nulled event or a prediction event, but never both. A
symbol instance of zero length, when acceptable at a location,
generates a nulled event at that location, and does not generate a
prediction event. A symbol instance of non-zero length, when
acceptable at a location, generates a prediction event at that
location, and does not generate a nulled event.
While it is not possible for a *symbol instance* to generate both a
nulled event and a completion event at a location, it is quite
possible that a *symbol* might generate both kinds of event at that
location. This is because multiple instances of the same symbol
may be recognized at a given location, and these instances will
have different lengths. If one instance is recognized at a given
location as zero length and a second, non-zero-length, instance is
recognized at the same location, the first will generate only
nulled events, while the second will generate only completion
events. For similar reasons, while a *symbol instance* will never
generate both a nulled event and a prediction event at a location,
multiple instances of the same symbol may do so.
Zero length derivations can be ambiguous. When a zero length
symbol is recognized, all of its zero-length derivations are also
considered to be recognized.
The 'marpa_g_symbol_is_nulled_event_set()' method will mark a
symbol as a nulled event symbol, even if the symbol is
non-nullable. This is convenient, for example, for automatically
generated grammars. Applications which wish to treat it as a
failure if there is an attempt to mark a non-nullable symbol as a
nulled event symbol, can check for this case using the
'marpa_g_symbol_is_nullable()' method.
Success: On success, 1 if symbol SYM_ID is a nulled event symbol
after the call, 0 otherwise.
Failures: If SYM_ID is well-formed, but there is no such symbol,
-1. If the grammar G is precomputed; or on other failure, -2.
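The distinction between a symbol instance and a symbol can be made concrete with a short sketch. The code below does not call Libmarpa. The names and the bit-flag encoding are hypothetical. Each recognized instance yields exactly one kind of event, determined by its length, while the symbol as a whole may yield both kinds when instances of different lengths are recognized at the same location.

```c
#include <stddef.h>

#define DEMO_NULLED    1u
#define DEMO_COMPLETED 2u

/* Event kind for one recognized symbol instance: a zero-length
   instance is nulled, any other instance is completed. */
unsigned demo_instance_event(int length)
{
    return length == 0 ? DEMO_NULLED : DEMO_COMPLETED;
}

/* Union of the event kinds for all instances of one symbol that
   are recognized at one location. */
unsigned demo_symbol_events(const int *lengths, size_t n)
{
    unsigned kinds = 0;
    size_t i;

    for (i = 0; i < n; i++)
        kinds |= demo_instance_event(lengths[i]);
    return kinds;
}
```

For instance lengths 0 and 3 at the same location, the combined result carries both flags, although neither instance alone produces more than one.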
File: api.info, Node: Prediction events, Next: Symbol expected events, Prev: Symbol nulled events, Up: Events
20.5 Prediction events
======================
-- Function: int marpa_g_prediction_symbol_activate ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID, int REACTIVATE )
Allows the user to deactivate and reactivate symbol prediction
events in the grammar. When a recognizer is created, the
activation status of each of its events is initialized to the
activation status of that event in the base grammar. If REACTIVATE
is zero, the event is deactivated in the grammar. If REACTIVATE is
one, the event is activated in the grammar.
Symbol prediction events are active by default if the symbol was
set up for prediction events in the grammar. If a symbol was not
set up for prediction events in the grammar, symbol prediction
events are inactive by default and any attempt to change that is a
fatal error.
The activation status of a prediction event in the grammar can only
be changed if the symbol is marked as a prediction event symbol in
the grammar, and before the grammar is precomputed. However, if a
symbol is marked as a prediction event symbol in the recognizer,
the prediction event can be deactivated and reactivated in the
recognizer.
Success cases: On success, the method returns the value of
REACTIVATE. The method succeeds trivially if the symbol is already
set as indicated by REACTIVATE.
Failure cases: If the active status of the prediction event for
SYM_ID cannot be set as indicated by REACTIVATE, the method fails.
On failure, -2 is returned.
-- Function: int marpa_r_prediction_symbol_activate ( Marpa_Recognizer
R, Marpa_Symbol_ID SYM_ID, int BOOLEAN )
Allows the user to deactivate and reactivate symbol prediction
events in the recognizer. If BOOLEAN is zero, the event is
deactivated. If BOOLEAN is one, the event is activated.
Symbol prediction events are active by default if the symbol was
set up for prediction events in the grammar. If a symbol was not
set up for prediction events in the grammar, symbol prediction
events are inactive by default and any attempt to change that is a
fatal error.
Success cases: On success, the method returns the value of BOOLEAN.
The method succeeds trivially if the symbol is already set as
indicated by BOOLEAN.
Failure cases: If the active status of the prediction event for
SYM_ID cannot be set as indicated by BOOLEAN, the method fails. On
failure, -2 is returned.
-- Function: int marpa_g_symbol_is_prediction_event ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
-- Function: int marpa_g_symbol_is_prediction_event_set ( Marpa_Grammar
G, Marpa_Symbol_ID SYM_ID, int VALUE)
Libmarpa can be set up to generate a 'MARPA_EVENT_SYMBOL_PREDICTED'
event when a non-nulled symbol is predicted. A non-nulled symbol
is said to be *predicted* when an instance of it is acceptable at
the current earleme according to the grammar. Nulled symbols do
not generate predictions.
For predicted events to occur, the symbol must be marked as a
predicted event symbol. The
'marpa_g_symbol_is_prediction_event_set()' function marks symbol
SYM_ID as a predicted event symbol if VALUE is 1, and unmarks it
as a predicted event symbol if VALUE is 0. The
'marpa_g_symbol_is_prediction_event()' method returns the current
value of the predicted event marking for symbol SYM_ID.
Marking a prediction event sets its activation status to on.
Unmarking a prediction event sets its activation status to off.
The prediction event marking cannot be changed once the grammar is
precomputed.
If a prediction event is marked, its activation status can be
changed using the 'marpa_g_prediction_symbol_activate()' method.
Note that, if a symbol is marked as a prediction event symbol in
the recognizer, its prediction event can be deactivated and
reactivated in the recognizer.
Success: On success, 1 if symbol SYM_ID is a predicted event symbol
after the call, 0 otherwise.
Failures: If SYM_ID is well-formed, but there is no such symbol,
-1. If the grammar G is precomputed; or on other failure, -2.
File: api.info, Node: Symbol expected events, Next: Event codes, Prev: Prediction events, Up: Events
20.6 Symbol expected events
===========================
-- Function: int marpa_r_expected_symbol_event_set ( Marpa_Recognizer
R, Marpa_Symbol_ID SYMBOL_ID, int VALUE)
Sets the "expected symbol event bit" for SYMBOL_ID to VALUE. A
recognizer event is created whenever symbol SYMBOL_ID is expected
at the current earleme, if and only if the expected symbol event
bit for SYMBOL_ID is 1. The "expected symbol event bit" must be 1
or 0.
In this context, "expected" means "expected as a terminal". Even
if a symbol is predicted at the current earleme, if it is not
acceptable as a terminal, it does not trigger an "expected symbol
event".
By default, the "expected symbol event bit" is 0. It is an error
to attempt to set the "expected symbol event bit" to 1 for a
nulling symbol, an inaccessible symbol, or an unproductive symbol.
Return value: The value of the event bit after the method call is
finished. -2 if SYMBOL_ID is not the ID of a valid symbol; if it
is the ID of a nulling, inaccessible or unproductive symbol; or
on other failure.
File: api.info, Node: Event codes, Prev: Symbol expected events, Up: Events
20.7 Event codes
================
-- Macro: int MARPA_EVENT_NONE
Applications should never see this event. Event value: Undefined.
Suggested message: "No event".
-- Macro: int MARPA_EVENT_COUNTED_NULLABLE
A nullable symbol is either the separator for, or the right hand
side of, a sequence. Event value: The ID of the symbol. Suggested
message: "This symbol is a counted nullable".
-- Macro: int MARPA_EVENT_EARLEY_ITEM_THRESHOLD
This event indicates that an application-settable threshold on the
number of Earley items has been reached or exceeded. *Note
marpa_r_earley_item_warning_threshold_set():
marpa_r_earley_item_warning_threshold_set.
Event value: The current Earley item count. Suggested message:
"Too many Earley items".
-- Macro: int MARPA_EVENT_EXHAUSTED
The parse is exhausted. Event value: Undefined. Suggested
message: "Recognizer is exhausted".
-- Macro: int MARPA_EVENT_LOOP_RULES
One or more rules are loop rules -- rules that are part of a cycle.
Cycles are pathological cases of recursion, in which the same
symbol string derives itself a potentially infinite number of
times. Nonetheless, Marpa parses in the presence of these, and it
is up to the application to treat these as fatal errors, something
they almost always will wish to do. Event value: The count of loop
rules. Suggested message: "Grammar contains a infinite loop".
-- Macro: int MARPA_EVENT_NULLING_TERMINAL
A nulling symbol is also a terminal. Event value: The ID of the
symbol. Suggested message: "This symbol is a nulling terminal".
-- Macro: int MARPA_EVENT_SYMBOL_COMPLETED
The recognizer can be set to generate an event when a symbol is
completed using its 'marpa_g_symbol_is_completion_event_set()'
method. (A symbol is "completed" if and only if any rule with that
symbol as its LHS is completed.) This event code indicates that
one of those events occurred. Event value: The ID of the completed
symbol. Suggested message: "Completed symbol".
-- Macro: int MARPA_EVENT_SYMBOL_EXPECTED
The recognizer can be set to generate an event when a symbol is
expected as a terminal, using its
'marpa_r_expected_symbol_event_set()' method. Note that this event
only triggers if the symbol is expected as a terminal. Predicted
symbols which are not expected as terminals do not trigger this
event. This event code indicates that one of those events
occurred. Event value: The ID of the expected symbol. Suggested
message: "Expecting symbol".
-- Macro: int MARPA_EVENT_SYMBOL_NULLED
The recognizer can be set to generate an event when a symbol is
nulled - that is, recognized as a zero-length symbol. To set a
nulled symbol event, use the
'marpa_g_symbol_is_nulled_event_set()' method. This event code
indicates that a nulled symbol event occurred. Event value: The ID
of the nulled symbol. Suggested message: "Symbol was nulled".
-- Macro: int MARPA_EVENT_SYMBOL_PREDICTED
The recognizer can be set to generate an event when a symbol is
predicted. To set a predicted symbol event, use the recognizer's
'marpa_g_symbol_is_prediction_event_set()' method. Unlike the
'MARPA_EVENT_SYMBOL_EXPECTED' event, the
'MARPA_EVENT_SYMBOL_PREDICTED' event triggers for predictions of
both non-terminals and terminals. This event code indicates that a
predicted symbol event occurred. Event value: The ID of the
predicted symbol. Suggested message: "Symbol was predicted".
File: api.info, Node: Error methods macros and codes, Next: Technical notes, Prev: Events, Up: Top
21 Error methods, macros and codes
**********************************
* Menu:
* Error methods::
* Error Macros::
* External error codes::
* Internal error codes::
File: api.info, Node: Error methods, Next: Error Macros, Prev: Error methods macros and codes, Up: Error methods macros and codes
21.1 Error methods
==================
-- Function: Marpa_Error_Code marpa_g_error ( Marpa_Grammar G, const
char** P_ERROR_STRING)
When a method fails, this method allows the application to read the
error code. P_ERROR_STRING is reserved for use by the internals.
Applications should set it to 'NULL'.
Return value: The last error code from a Libmarpa method. Always
succeeds.
-- Function: Marpa_Error_Code marpa_g_error_clear ( Marpa_Grammar G )
Sets the error code to 'MARPA_ERR_NONE'. Not often used, but now
and then it can be useful to force the error code to a known state.
Return value: 'MARPA_ERR_NONE'. Always succeeds.
File: api.info, Node: Error Macros, Next: External error codes, Prev: Error methods, Up: Error methods macros and codes
21.2 Error Macros
=================
-- Macro: int MARPA_ERRCODE_COUNT
The number of error codes. All error codes, whether internal or
external, will be integers, non-negative but strictly less than
'MARPA_ERRCODE_COUNT'.
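The range guarantee above can be sketched as a simple check, which an application might apply before using an error code as an array index. This is only an illustration: the function name 'error_code_in_range' is hypothetical and is not part of Libmarpa, and the count is passed in as a parameter where a real application would presumably pass 'MARPA_ERRCODE_COUNT'.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of Libmarpa.
   Returns true if CODE is non-negative and strictly less than
   ERRCODE_COUNT, which is the range described for error codes. */
bool error_code_in_range(int code, int errcode_count)
{
    return code >= 0 && code < errcode_count;
}
```

A caller would typically treat a code outside this range as a sign of a programming error rather than as a Libmarpa error condition.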
File: api.info, Node: External error codes, Next: Internal error codes, Prev: Error Macros, Up: Error methods macros and codes
21.3 External error codes
=========================
This section lists the external error codes. These are the only error
codes that users of the Libmarpa external interface should ever see.
Internal error codes are in their own section (*note Internal error
codes::).
-- Macro: int MARPA_ERR_NONE
No error condition. The error code is initialized to this value.
Methods which do not result in failure sometimes reset the error
code to 'MARPA_ERR_NONE'. Numeric value: 0. Suggested message:
"No error".
-- Macro: int MARPA_ERR_BAD_SEPARATOR
A separator was specified for a sequence rule, but its ID was not
that of a valid symbol. Numeric value: 6. Suggested message:
"Separator has invalid symbol ID".
-- Macro: int MARPA_ERR_BEFORE_FIRST_TREE
A tree iterator is positioned before the first tree, and it was
specified in a context where that is not allowed. A newly created
tree is positioned before the first tree. To position a newly
created tree iterator to the first tree use the 'marpa_t_next()'
method. Numeric value: 91. Suggested message: "Tree iterator is
before first tree".
-- Macro: int MARPA_ERR_COUNTED_NULLABLE
A "counted" symbol was found that is also a nullable symbol. A
"counted" symbol is one that appears on the RHS of a sequence rule.
If a symbol is nullable, counting its occurrences becomes
difficult. Questions of definition and problems of implementation
arise. At a minimum, a sequence with counted nullables would be
wildly ambiguous.
Sequence rules are simply an optimized shorthand for rules that can
also be written in ordinary BNF. If the equivalent of a sequence of
nullables is really what your application needs, nothing in
Libmarpa prevents you from specifying that sequence with ordinary
BNF rules.
Numeric value: 8. Suggested message: "Nullable symbol on RHS of a
sequence rule".
-- Macro: int MARPA_ERR_DUPLICATE_RULE
This error indicates an attempt to add a BNF rule which is a
duplicate of a BNF rule already in the grammar. Two BNF rules are
considered duplicates if
* Both rules have the same left hand symbol, and
* Both rules have the same right hand symbols in the same order.
Duplication of sequence rules, and duplication between BNF rules
and sequence rules, is dealt with by requiring that the LHS of a
sequence rule not be the LHS of any other rule.
Numeric value: 11. Suggested message: "Duplicate rule".
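The duplicate test for BNF rules described above can be sketched as follows. This is not Libmarpa's implementation: the function name and the representation of a rule as an LHS symbol ID plus an array of RHS symbol IDs are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Libmarpa.  Symbols are plain
   int IDs.  Two BNF rules count as duplicates when they have the
   same LHS and the same RHS symbols in the same order. */
bool bnf_rules_are_duplicates(int lhs_a, const int *rhs_a, size_t len_a,
                              int lhs_b, const int *rhs_b, size_t len_b)
{
    if (lhs_a != lhs_b)
        return false;
    if (len_a != len_b)
        return false;
    for (size_t i = 0; i < len_a; i++) {
        if (rhs_a[i] != rhs_b[i])
            return false;
    }
    return true;
}
```

Note that the order of the RHS symbols matters: a rule with RHS symbols 1, 2 is not a duplicate of a rule with RHS symbols 2, 1.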
-- Macro: int MARPA_ERR_DUPLICATE_TOKEN
This error indicates an attempt to add a duplicate token. A token
is a duplicate if one already read at the same earleme has the same
symbol ID and the same length. Numeric value: 12. Suggested
message: "Duplicate token".
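As a minimal sketch of the duplicate-token condition above, the comparison involves exactly three fields. The 'token_record' structure and the function name are hypothetical and do not correspond to any Libmarpa type or method.

```c
#include <stdbool.h>

/* Hypothetical record of a token; not a Libmarpa type. */
struct token_record {
    int start_earleme;  /* earleme at which the token was read */
    int symbol_id;      /* symbol ID of the token */
    int length;         /* token length in earlemes */
};

/* A candidate is a duplicate of a token already read if both were
   read at the same earleme with the same symbol ID and length. */
bool token_is_duplicate(const struct token_record *seen,
                        const struct token_record *candidate)
{
    return seen->start_earleme == candidate->start_earleme
        && seen->symbol_id == candidate->symbol_id
        && seen->length == candidate->length;
}
```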
-- Macro: int MARPA_ERR_YIM_COUNT
This error code indicates that an implementation-defined limit on
the number of Earley items per Earley set was exceeded. This
limit is different from the Earley item warning threshold, an
optional limit on the number of Earley items in an Earley set,
which can be set by the application.
The implementation-defined limit is very large, at least
500,000,000 earlemes. An application is unlikely ever to see this
error. Libmarpa's use of memory would almost certainly exceed the
implementation's limits before it occurred. Numeric value: 13.
Suggested message: "Maximum number of Earley items exceeded".
-- Macro: int MARPA_ERR_EVENT_IX_NEGATIVE
A negative event index was specified. That is not allowed.
Numeric value: 15. Suggested message: "Negative event index".
-- Macro: int MARPA_ERR_EVENT_IX_OOB
A non-negative event index was specified, but there is no event at
that index. Since the events are in sequence, this means it was
too large. Numeric value: 16. Suggested message: "No event at
that index".
-- Macro: int MARPA_ERR_GRAMMAR_HAS_CYCLE
The grammar has a cycle -- one or more loop rules. This is a
recoverable error, although most applications will want to treat it
as fatal. For more see the description of *note
marpa_g_precompute::. Numeric value: 17. Suggested message:
"Grammar has cycle".
-- Macro: int MARPA_ERR_HEADERS_DO_NOT_MATCH
This is an internal error, and indicates that Libmarpa was wrongly
built. Libmarpa was compiled with headers which do not match the
rest of the code. The solution is to find a correctly built
Libmarpa. Numeric value: 98. Suggested message: "Internal error:
Libmarpa was built incorrectly".
-- Macro: int MARPA_ERR_I_AM_NOT_OK
The Libmarpa base grammar is in a "not ok" state. Currently, the
only way this can happen is if Libmarpa memory is being
overwritten. Numeric value: 29. Suggested message: "Marpa is in a
not OK state".
-- Macro: int MARPA_ERR_INACCESSIBLE_TOKEN
This error code indicates that the token symbol is an inaccessible
symbol -- one which cannot be reached from the start symbol.
Since the inaccessibility of a symbol is a property of the grammar,
this error code typically indicates an application error.
Nevertheless, a retry at this location, using another token ID, may
succeed. At this writing, the author knows of no uses of this
technique.
Numeric value: 18. Suggested message: "Token symbol is
inaccessible".
-- Macro: int MARPA_ERR_INVALID_BOOLEAN
A function was called that takes a boolean argument, but the value
of that argument was not either 0 or 1. Numeric value: 22.
Suggested message: "Argument is not boolean".
-- Macro: int MARPA_ERR_INVALID_LOCATION
The location (Earley set ID) is not valid. It may be invalid for
one of two reasons:
* It is negative, and it is being used as the argument to a
method for which that negative value does not have a special
meaning.
* It is after the latest Earley set.
For users of input models other than the standard one, the term
"location", as used in association with this error code, means
Earley set ID or Earley set ordinal. In the standard input model,
this will always be identical with Libmarpa's other idea of
location, the earleme.
Numeric value: 25. Suggested message: "Location is not valid".
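The two conditions above can be sketched as a single range check on the Earley set ID. The function name is hypothetical, and the sketch deliberately ignores methods for which a negative value has a special meaning.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of Libmarpa.
   LOCATION is an Earley set ID; LATEST_EARLEY_SET_ID is the ID of
   the latest Earley set.  A location is treated as valid if it is
   non-negative and not after the latest Earley set.  Negative
   values with a special meaning for a method are not modeled. */
bool location_is_valid(int location, int latest_earley_set_id)
{
    return location >= 0 && location <= latest_earley_set_id;
}
```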
-- Macro: int MARPA_ERR_INVALID_START_SYMBOL
A start symbol was specified, but its symbol ID is not that of a
valid symbol. Numeric value: 27. Suggested message: "Specified
start symbol is not valid".
-- Macro: int MARPA_ERR_INVALID_ASSERTION_ID
A method was called with an invalid assertion ID. This is an
assertion ID which not only does not exist, but cannot exist.
Currently that means its value is less than zero. Numeric value:
96. Suggested message: "Assertion ID is malformed".
-- Macro: int MARPA_ERR_INVALID_RULE_ID
A method was called with an invalid rule ID. This is a rule ID
which not only does not exist, but cannot exist. Currently that
means its value is less than zero. Numeric value: 26. Suggested
message: "Rule ID is malformed".
-- Macro: int MARPA_ERR_INVALID_SYMBOL_ID
A method was called with an invalid symbol ID. This is a symbol ID
which not only does not exist, but cannot exist. Currently that
means its value is less than zero. Numeric value: 28. Suggested
message: "Symbol ID is malformed".
-- Macro: int MARPA_ERR_MAJOR_VERSION_MISMATCH
There was a mismatch in the major version number between the
requested version of libmarpa, and the actual one. Numeric value:
30. Suggested message: "Libmarpa major version number is a
mismatch".
-- Macro: int MARPA_ERR_MICRO_VERSION_MISMATCH
There was a mismatch in the micro version number between the
requested version of libmarpa, and the actual one. Numeric value:
31. Suggested message: "Libmarpa micro version number is a
mismatch".
-- Macro: int MARPA_ERR_MINOR_VERSION_MISMATCH
There was a mismatch in the minor version number between the
requested version of libmarpa, and the actual one. Numeric value:
32. Suggested message: "Libmarpa minor version number is a
mismatch".
-- Macro: int MARPA_ERR_NO_EARLEY_SET_AT_LOCATION
A non-negative Earley set ID (also called an Earley set ordinal)
was specified, but there is no corresponding Earley set. Since the
Earley set ordinals are in sequence, this means that the specified
ID is greater than that of the latest Earley set. Numeric value:
39. Suggested message: "Earley set ID is after latest Earley set".
-- Macro: int MARPA_ERR_NOT_PRECOMPUTED
The grammar is not precomputed, and an attempt was made to do
something with it that is not allowed for unprecomputed grammars.
For example, a recognizer cannot be created from a grammar until it
is precomputed. Numeric value: 34. Suggested message: "This
grammar is not precomputed".
-- Macro: int MARPA_ERR_NO_PARSE
The application attempted to create a bocage from a recognizer
without a parse. Applications will often want to treat this as a
soft error. Numeric value: 41. Suggested message: "No parse".
-- Macro: int MARPA_ERR_NO_RULES
A grammar which has no rules is being used in a way that is not
allowed. Usually the problem is that the user is trying to
precompute the grammar. Numeric value: 42. Suggested message:
"This grammar does not have any rules".
-- Macro: int MARPA_ERR_NO_START_SYMBOL
The grammar has no start symbol, and an attempt was made to perform
an operation which requires one. Usually the problem is that the
user is trying to precompute the grammar. Numeric value: 43.
Suggested message: "This grammar has no start symbol".
-- Macro: int MARPA_ERR_NO_SUCH_ASSERTION_ID
A method was called with an assertion ID which is well-formed, but
the assertion does not exist. Numeric value: 97. Suggested
message: "No assertion with this ID exists".
-- Macro: int MARPA_ERR_NO_SUCH_RULE_ID
A method was called with a rule ID which is well-formed, but the
rule does not exist. Numeric value: 89. Suggested message: "No
rule with this ID exists".
-- Macro: int MARPA_ERR_NO_SUCH_SYMBOL_ID
A method was called with a symbol ID which is well-formed, but the
symbol does not exist. Numeric value: 90. Suggested message: "No
symbol with this ID exists".
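The distinction between a malformed ID and a well-formed ID that does not exist can be sketched as follows for symbol IDs. The constant and function names are local stand-ins, not Libmarpa names, although the numeric values are the ones listed in this section. The sketch also assumes that existing symbol IDs run from 0 to one less than the number of symbols.

```c
/* Local stand-ins for the documented numeric values. */
enum {
    SYM_EX_OK = 0,                 /* MARPA_ERR_NONE */
    SYM_EX_INVALID_SYMBOL_ID = 28, /* MARPA_ERR_INVALID_SYMBOL_ID */
    SYM_EX_NO_SUCH_SYMBOL_ID = 90  /* MARPA_ERR_NO_SUCH_SYMBOL_ID */
};

/* Hypothetical helper, not part of Libmarpa.
   Assumes existing symbol IDs are 0 .. SYMBOL_COUNT-1. */
int classify_symbol_id(int symbol_id, int symbol_count)
{
    if (symbol_id < 0)
        return SYM_EX_INVALID_SYMBOL_ID;  /* malformed: cannot exist */
    if (symbol_id >= symbol_count)
        return SYM_EX_NO_SUCH_SYMBOL_ID;  /* well-formed, but no such symbol */
    return SYM_EX_OK;
}
```

The same pattern would apply to rule IDs and assertion IDs, with their own pairs of error codes.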
-- Macro: int MARPA_ERR_NO_TOKEN_EXPECTED_HERE
This error code indicates that no tokens at all were expected at
this earleme location. This can only happen in alternative input
models.
Typically, this indicates an application programming error.
Retrying input at this location will always fail. But if the
application is able to leave this earleme empty, a retry at a later
location, using this or another token, may succeed. At this
writing, the author knows of no uses of this technique.
Numeric value: 44. Suggested message: "No token is expected at
this earleme location".
-- Macro: int MARPA_ERR_NOT_A_SEQUENCE
This error occurs in situations where a rule is required to be a
sequence, and indicates that the rule of interest is, in fact, not
a sequence.
Numeric value: 99. Suggested message: "Rule is not a sequence".
-- Macro: int MARPA_ERR_NULLING_TERMINAL
Marpa does not allow a symbol to be both nulling and a terminal.
Numeric value: 49. Suggested message: "A symbol is both terminal
and nulling".
-- Macro: int MARPA_ERR_ORDER_FROZEN
The Marpa order object has been frozen. If a Marpa order object is
frozen, it cannot be changed.
Multiple tree iterators can share a Marpa order object, but that
order object is frozen after the first tree iterator is created
from it. Applications can order a bocage in many ways, but they
must do so by creating multiple order objects.
Numeric value: 50. Suggested message: "The ordering is frozen".
-- Macro: int MARPA_ERR_PARSE_EXHAUSTED
The parse is exhausted. Numeric value: 53. Suggested message:
"The parse is exhausted".
-- Macro: int MARPA_ERR_PARSE_TOO_LONG
The parse is too long. The limit on the length of a parse is
implementation dependent, but it is very large, at least
500,000,000 earlemes.
This error code is unlikely in the standard input model. Almost
certainly memory would be exceeded before it could occur. If an
application sees this error, it is almost certainly using one of the
non-standard input models.
Most often this message will occur because of a request to add a
single extremely long token, perhaps as a result of an application
error. But it is also possible this error condition will occur
after the input of a large number of long tokens.
Numeric value: 54. Suggested message: "This input would make the
parse too long".
-- Macro: int MARPA_ERR_POINTER_ARG_NULL
In a method which takes pointers as arguments, one of the pointer
arguments is 'NULL', in a case where that is not allowed. One such
method is 'marpa_r_progress_item()'. Numeric value: 56. Suggested
message: "An argument is null when it should not be".
-- Macro: int MARPA_ERR_PRECOMPUTED
An attempt was made to use a precomputed grammar in a way that is
not allowed. Often this is an attempt to change the grammar.
Nearly every change to a grammar after precomputation invalidates
the precomputation, and is therefore not allowed. Numeric value:
57. Suggested message: "This grammar is precomputed".
-- Macro: int MARPA_ERR_PROGRESS_REPORT_NOT_STARTED
No recognizer progress report is currently active, and an action
has been attempted which is inconsistent with that. One such
action would be a 'marpa_r_progress_item()' call. Numeric value:
59. Suggested message: "No progress report has been started".
-- Macro: int MARPA_ERR_PROGRESS_REPORT_EXHAUSTED
The progress report is "exhausted" -- all its items have been
iterated through. Numeric value: 58. Suggested message: "The
progress report is exhausted".
-- Macro: int MARPA_ERR_RANK_TOO_LOW
A symbol or rule rank was specified which was less than an
implementation-defined minimum. Implementations will always allow
at least those ranks in the range between -134,217,727 and
134,217,727. Numeric value: 85. Suggested message: "Rule or
symbol rank too low".
-- Macro: int MARPA_ERR_RANK_TOO_HIGH
A symbol or rule rank was specified which was greater than an
implementation-defined maximum. Implementations will always allow
at least those ranks in the range between -134,217,727 and
134,217,727. Numeric value: 86. Suggested message: "Rule or
symbol rank too high".
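An application that wants to stay portable across implementations might restrict itself to the range of ranks that is always allowed. The following is a sketch of such a check. The macro and function names are hypothetical; an actual implementation may accept ranks outside this range.

```c
#include <stdbool.h>

/* Bounds of the range of ranks that this manual says every
   implementation will allow.  The names are local to this sketch. */
#define EX_GUARANTEED_RANK_MIN (-134217727)
#define EX_GUARANTEED_RANK_MAX 134217727

/* Hypothetical helper, not part of Libmarpa. */
bool rank_in_guaranteed_range(int rank)
{
    return rank >= EX_GUARANTEED_RANK_MIN
        && rank <= EX_GUARANTEED_RANK_MAX;
}
```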
-- Macro: int MARPA_ERR_RECCE_IS_INCONSISTENT
The recognizer is "inconsistent", usually because the user has
rejected one or more rules or terminals, and has not yet called the
'marpa_r_consistent()' method. Numeric value: 95. Suggested
message: "The recognizer is inconsistent".
-- Macro: int MARPA_ERR_RECCE_NOT_ACCEPTING_INPUT
The recognizer is not accepting input, and the application has
attempted something that is inconsistent with that fact. Numeric
value: 60. Suggested message: "The recognizer is not accepting
input".
-- Macro: int MARPA_ERR_RECCE_NOT_STARTED
The recognizer has not been started, and the application has
attempted something that is inconsistent with that fact. Numeric
value: 61. Suggested message: "The recognizer has not been
started".
-- Macro: int MARPA_ERR_RECCE_STARTED
The recognizer has been started, and the application has attempted
something that is inconsistent with that fact. Numeric value: 62.
Suggested message: "The recognizer has been started".
-- Macro: int MARPA_ERR_RHS_IX_NEGATIVE
The index of a RHS symbol was specified, and it was negative. That
is not allowed. Numeric value: 63. Suggested message: "RHS index
cannot be negative".
-- Macro: int MARPA_ERR_RHS_IX_OOB
A non-negative index of a RHS symbol was specified, but there is no
symbol at that index. Since the indexes are in sequence, this
means the index was greater than or equal to the rule length.
Numeric value: 64. Suggested message: "RHS index must be less than
rule length".
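The two RHS index errors divide the invalid indexes between them, as the following sketch shows. The names are local stand-ins, not Libmarpa names; only the numeric values are taken from this section.

```c
/* Local stand-ins for the documented numeric values. */
enum {
    RHS_EX_OK = 0,            /* MARPA_ERR_NONE */
    RHS_EX_IX_NEGATIVE = 63,  /* MARPA_ERR_RHS_IX_NEGATIVE */
    RHS_EX_IX_OOB = 64        /* MARPA_ERR_RHS_IX_OOB */
};

/* Hypothetical helper, not part of Libmarpa.
   IX is an index into the RHS of a rule with RULE_LENGTH symbols. */
int classify_rhs_ix(int ix, int rule_length)
{
    if (ix < 0)
        return RHS_EX_IX_NEGATIVE;
    if (ix >= rule_length)
        return RHS_EX_IX_OOB;   /* no symbol at that index */
    return RHS_EX_OK;
}
```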
-- Macro: int MARPA_ERR_RHS_TOO_LONG
An attempt was made to add a rule with too many right hand side
symbols. The limit on the RHS symbol count is implementation
dependent, but it is very large, at least 500,000,000 symbols.
This is far beyond what is required in any current practical
grammar. An application with rules of this length is almost
certain to run into memory and other limits. Numeric value: 65.
Suggested message: "The RHS is too long".
-- Macro: int MARPA_ERR_SEQUENCE_LHS_NOT_UNIQUE
The LHS of a sequence rule cannot be the LHS of any other rule,
whether a sequence rule or a BNF rule. An attempt was made to
violate this restriction. Numeric value: 66. Suggested message:
"LHS of sequence rule would not be unique".
-- Macro: int MARPA_ERR_START_NOT_LHS
The start symbol is not on the LHS of any rule. That means it
could never match any possible input, not even the null string.
Presumably, an error in writing the grammar. Numeric value: 73.
Suggested message: "Start symbol not on LHS of any rule".
-- Macro: int MARPA_ERR_SYMBOL_IS_NOT_COMPLETION_EVENT
An attempt was made to use a symbol in a way that requires it to be
set up for completion events, but the symbol was not set up for
completion events. The archetypal case is an attempt to activate a
completion event in the recognizer for a symbol that is not set up
as a completion event. Numeric value: 92. Suggested message:
"Symbol is not set up for completion events".
-- Macro: int MARPA_ERR_SYMBOL_IS_NOT_NULLED_EVENT
An attempt was made to use a symbol in a way that requires it to be
set up for nulled events, but the symbol was not set up for
nulled events. The archetypal case is an attempt to activate a
nulled event in the recognizer for a symbol that is not set up as
a nulled event. Numeric value: 93. Suggested message: "Symbol is
not set up for nulled events".
-- Macro: int MARPA_ERR_SYMBOL_IS_NOT_PREDICTION_EVENT
An attempt was made to use a symbol in a way that requires it to be
set up for prediction events, but the symbol was not set up for
prediction events. The archetypal case is an attempt to activate a
prediction event in the recognizer for a symbol that is not set up
as a prediction event. Numeric value: 94. Suggested message:
"Symbol is not set up for prediction events".
-- Macro: int MARPA_ERR_SYMBOL_VALUED_CONFLICT
Unvalued symbols are a deprecated Marpa feature, which may be
avoided with the 'marpa_g_force_valued()' method. An unvalued
symbol may take on any value, and therefore a symbol which is
unvalued at some points cannot safely be used to contain a value
at others. This error indicates that such an unsafe use is being
attempted. Numeric value: 74. Suggested message: "Symbol is
treated both as valued and unvalued".
-- Macro: int MARPA_ERR_TERMINAL_IS_LOCKED
An attempt was made to change the terminal status of a symbol to a
different value after it was locked. Numeric value: 75. Suggested
message: "The terminal status of the symbol is locked".
-- Macro: int MARPA_ERR_TOKEN_IS_NOT_TERMINAL
A token was specified whose symbol ID is not a terminal. Numeric
value: 76. Suggested message: "Token symbol must be a terminal".
-- Macro: int MARPA_ERR_TOKEN_LENGTH_LE_ZERO
A token length was specified which is less than or equal to zero.
Zero-length tokens are not allowed in Libmarpa. Numeric value: 77.
Suggested message: "Token length must greater than zero".
-- Macro: int MARPA_ERR_TOKEN_TOO_LONG
The token length is too long. The limit on the length of a token
is implementation dependent, but it is at least 500,000,000
earlemes. An application using a token that long is almost certain
to run into some other limit. Numeric value: 78. Suggested
message: "Token is too long".
-- Macro: int MARPA_ERR_TREE_EXHAUSTED
A Libmarpa parse tree iterator is "exhausted", that is, it has no
more parses. Numeric value: 79. Suggested message: "Tree iterator
is exhausted".
-- Macro: int MARPA_ERR_TREE_PAUSED
A Libmarpa tree is "paused" and an operation was attempted which is
inconsistent with that fact. Typically, this operation will be a
call of the 'marpa_t_next()' method. Numeric value: 80. Suggested
message: "Tree iterator is paused".
-- Macro: int MARPA_ERR_UNEXPECTED_TOKEN_ID
An attempt was made to read a token where a token with that symbol
ID is not expected. This message can also occur when an attempt is
made to read a token at a location where no token is expected.
Numeric value: 81. Suggested message: "Unexpected token".
-- Macro: int MARPA_ERR_UNPRODUCTIVE_START
The start symbol is unproductive. That means it could never match
any possible input, not even the null string. Presumably, an error
in writing the grammar. Numeric value: 82. Suggested message:
"Unproductive start symbol".
-- Macro: int MARPA_ERR_VALUATOR_INACTIVE
The valuator is inactive in a context where that should not be the
case. Numeric value: 83. Suggested message: "Valuator inactive".
-- Macro: int MARPA_ERR_VALUED_IS_LOCKED
Unvalued symbols are a deprecated Marpa feature, which may be
avoided with the 'marpa_g_force_valued()' method. This error code
indicates that the valued status of a symbol is locked, and an
attempt was made to change it to a status different from the
current one. Numeric value: 84. Suggested message: "The valued
status of the symbol is locked".
-- Macro: int MARPA_ERR_SYMBOL_IS_NULLING
An attempt was made to do something with a nulling symbol that is
not allowed. For example, the ID of a nulling symbol cannot be an
argument to 'marpa_r_expected_symbol_event_set()' -- because it is
not possible to create an "expected symbol" event for a nulling
symbol. Numeric value: 87. Suggested message: "Symbol is
nulling".
-- Macro: int MARPA_ERR_SYMBOL_IS_UNUSED
An attempt was made to do something with an unused symbol that is
not allowed. An "unused" symbol is an inaccessible or unproductive
symbol. For example, the ID of an unused symbol cannot be an
argument to 'marpa_r_expected_symbol_event_set()' -- because it is
not possible to create an "expected symbol" event for an unused
symbol. Numeric value: 88. Suggested message: "Symbol is not
used".
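The numeric values and suggested messages listed in this section lend themselves to a lookup table. The following sketch covers only a handful of the codes above, copied from their descriptions. The structure, table and function names are hypothetical, and an application would presumably use the 'MARPA_ERR_*' macros in place of the literal numbers.

```c
#include <stddef.h>

/* Hypothetical table entry; not a Libmarpa type. */
struct ex_error_entry {
    int code;
    const char *message;
};

/* A few external error codes with the suggested messages
   given in this section. */
static const struct ex_error_entry ex_error_table[] = {
    {  0, "No error" },
    { 11, "Duplicate rule" },
    { 12, "Duplicate token" },
    { 41, "No parse" },
    { 53, "The parse is exhausted" },
    { 81, "Unexpected token" },
};

/* Returns the suggested message for CODE, or NULL if CODE is
   not in this (deliberately incomplete) table. */
const char *ex_suggested_message(int code)
{
    size_t n = sizeof ex_error_table / sizeof ex_error_table[0];
    for (size_t i = 0; i < n; i++) {
        if (ex_error_table[i].code == code)
            return ex_error_table[i].message;
    }
    return NULL;
}
```

A linear search is adequate here because the table is small and error reporting is rarely a hot path.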
File: api.info, Node: Internal error codes, Prev: External error codes, Up: Error methods macros and codes
21.4 Internal error codes
=========================
An internal error code may be one of two things: First, it can be an
error code which arises from an internal Libmarpa programming issue (in
other words, something happening in the code that was not supposed to be
able to happen.) Second, it can be an error code which only occurs when
a method from Libmarpa's internal interface is used. Both kinds of
internal error message share one common trait -- users of Libmarpa's
external interface should never see them.
Internal error messages require someone with knowledge of the
Libmarpa internals to follow up on them. They usually do not have
descriptions or suggested messages.
-- Macro: int MARPA_ERR_AHFA_IX_NEGATIVE
Numeric value: 1.
-- Macro: int MARPA_ERR_AHFA_IX_OOB
Numeric value: 2.
-- Macro: int MARPA_ERR_ANDID_NEGATIVE
Numeric value: 3.
-- Macro: int MARPA_ERR_ANDID_NOT_IN_OR
Numeric value: 4.
-- Macro: int MARPA_ERR_ANDIX_NEGATIVE
Numeric value: 5.
-- Macro: int MARPA_ERR_BOCAGE_ITERATION_EXHAUSTED
Numeric value: 7.
-- Macro: int MARPA_ERR_DEVELOPMENT
"Development" errors were used heavily during Libmarpa's
development, when it was not yet clear how precisely to classify
every error condition. Unless they are using a developer's
version, users of the external interface should never see
development errors.
Development errors have an error string associated with them. The
error string is a short 7-bit ASCII error string which describes
the error. Numeric value: 9. Suggested message: "Development
error, see string".
-- Macro: int MARPA_ERR_DUPLICATE_AND_NODE
Numeric value: 10.
-- Macro: int MARPA_ERR_YIM_ID_INVALID
Numeric value: 14.
-- Macro: int MARPA_ERR_INTERNAL
A "catchall" internal error. Numeric value: 19.
-- Macro: int MARPA_ERR_INVALID_AHFA_ID
The AHFA ID was invalid. There are no AHFAs any more, so this
message should not occur. Numeric value: 20.
-- Macro: int MARPA_ERR_INVALID_AIMID
The AHM ID was invalid. The term "AIMID" is a legacy of earlier
implementations and must be kept for backward compatibility.
Numeric value: 21.
-- Macro: int MARPA_ERR_INVALID_IRLID
Numeric value: 23.
-- Macro: int MARPA_ERR_INVALID_NSYID
Numeric value: 24.
-- Macro: int MARPA_ERR_NOOKID_NEGATIVE
Numeric value: 33.
-- Macro: int MARPA_ERR_NOT_TRACING_COMPLETION_LINKS
Numeric value: 35.
-- Macro: int MARPA_ERR_NOT_TRACING_LEO_LINKS
Numeric value: 36.
-- Macro: int MARPA_ERR_NOT_TRACING_TOKEN_LINKS
Numeric value: 37.
-- Macro: int MARPA_ERR_NO_AND_NODES
Numeric value: 38.
-- Macro: int MARPA_ERR_NO_OR_NODES
Numeric value: 40.
-- Macro: int MARPA_ERR_NO_TRACE_YS
Numeric value: 46.
-- Macro: int MARPA_ERR_NO_TRACE_PIM
Numeric value: 47.
-- Macro: int MARPA_ERR_NO_TRACE_YIM
Numeric value: 45.
-- Macro: int MARPA_ERR_NO_TRACE_SRCL
Numeric value: 48.
-- Macro: int MARPA_ERR_ORID_NEGATIVE
Numeric value: 51.
-- Macro: int MARPA_ERR_OR_ALREADY_ORDERED
Numeric value: 52.
-- Macro: int MARPA_ERR_PIM_IS_NOT_LIM
Numeric value: 55.
-- Macro: int MARPA_ERR_SOURCE_TYPE_IS_NONE
Numeric value: 70.
-- Macro: int MARPA_ERR_SOURCE_TYPE_IS_TOKEN
Numeric value: 71.
-- Macro: int MARPA_ERR_SOURCE_TYPE_IS_COMPLETION
Numeric value: 68.
-- Macro: int MARPA_ERR_SOURCE_TYPE_IS_LEO
Numeric value: 69.
-- Macro: int MARPA_ERR_SOURCE_TYPE_IS_AMBIGUOUS
Numeric value: 67.
-- Macro: int MARPA_ERR_SOURCE_TYPE_IS_UNKNOWN
Numeric value: 72.
File: api.info, Node: Technical notes, Next: Advanced input models, Prev: Error methods macros and codes, Up: Top
22 Technical notes
******************
This section contains technical notes that are not necessary for the
main presentation, but which may be useful or interesting.
* Menu:
* Data types used by Libmarpa::
* Why so many time objects::
* Design of numbered objects::
* LHS Terminals::
File: api.info, Node: Data types used by Libmarpa, Next: Why so many time objects, Prev: Technical notes, Up: Technical notes
22.1 Data types used by Libmarpa
================================
Libmarpa does not use any floating point data or strings. All data are
either integers or pointers.
File: api.info, Node: Why so many time objects, Next: Design of numbered objects, Prev: Data types used by Libmarpa, Up: Technical notes
22.2 Why so many time objects?
==============================
Marpa is an aggressively multi-pass algorithm. Marpa achieves its
efficiency, not in spite of making multiple passes over the data, but
because of it. Marpa regularly substitutes two fast O(N) passes for a
single O(N log N) pass. Marpa's proliferation of time objects is in
keeping with its multi-pass approach.
Bocage objects come at no cost, even for unambiguous parses, because
the same pass which creates the bocage also deals with other issues
which are of major significance for unambiguous parses. It is the
post-processing of the bocage pass that enables Marpa to do both left-
and right-recursion in linear time.
Of the various objects, the best case for elimination is of the
ordering object. In many cases, the ordering is trivial. Either the
parse is unambiguous, or the application does not care about the order
in which parses are returned. But while it would be easy to add an
option to bypass creation of an ordering object, there is little to be
gained from it. When the ordering is trivial, its overhead is very
small -- essentially a handful of subroutine calls. Many orderings
accomplish nothing, but these cost next to nothing.
Tree objects come at minimal cost to unambiguous grammars, because
the same pass that allows iteration through multiple parse trees does
the tree traversal. This eliminates much of the work that otherwise
would need to be done in the valuation time object. In the current
implementation, the valuation time object needs only to step through a
sequence already determined in the tree iterator.
File: api.info, Node: Design of numbered objects, Next: LHS Terminals, Prev: Why so many time objects, Up: Technical notes
22.3 Numbered objects
=====================
As the name suggests, the choice was made to implement numbered objects
as integers, and not as pointers. In standard-conformant C, integers
can be safely checked for validity, while pointers cannot.
There are efficiency tradeoffs between pointers and integers but they
are complicated, and they go both ways. Pointers can be faster, but
integers can be used as indexes into more than one data structure.
Which is actually faster depends on the design. Integers allow for a
more flexible design, so that once the choice is settled on, careful
programming can make them a win, possibly a very big one.
The approach taken in Libmarpa was to settle, from the outset, on
integers as the implementation for numbered objects, and to optimize on
that basis. The author concedes that it is possible that others redoing
Libmarpa from scratch might find that pointers are faster. But the
author is confident that they will also discover, on modern
architectures, that the lack of safe validity checking is far too high a
price to pay for the difference in speed.
File: api.info, Node: LHS Terminals, Prev: Design of numbered objects, Up: Technical notes
22.4 LHS terminals
==================
Marpa's idea in losing the sharp division between terminals and
non-terminals is that the distinction, while helpful for proving
theorems, is not essential in practice. LHS symbols in the input might
be useful for "short circuiting" the rules in which they occur. This
may prove helpful in debugging, or have other applications.
However, it also can be useful, for checking input validity as well
as for efficiency, to follow tradition and distinguish non-terminals
from terminals. For this reason, the traditional behavior is the
default in Libmarpa.
File: api.info, Node: Advanced input models, Next: Futures, Prev: Technical notes, Up: Top
23 Advanced input models
************************
In an earlier chapter, we introduced Libmarpa's concept of input, and
described its basic input models. *Note Input::. In this chapter we
describe Libmarpa's advanced models of input. These advanced input
models have attracted considerable interest. However, they have seen
little actual use so far, and for that reason we delayed their
consideration until now.
A Libmarpa input model is "advanced" if it allows tokens of length
other than 1. The advanced input models are also called
"variable-length token models" because they allow the token length to
vary from the "normal" length of 1.
* Menu:
* The dense variable-length token model::
* The fully general input model::
File: api.info, Node: The dense variable-length token model, Next: The fully general input model, Prev: Advanced input models, Up: Advanced input models
23.1 The dense variable-length token model
==========================================
In the "dense variable-length model of input", one or more successful
calls of 'marpa_r_alternative()' must be immediately previous to every
call to 'marpa_r_earleme_complete()'. Recall that, in this
document, we say that a 'marpa_r_alternative()' call is "immediately
previous" to a 'marpa_r_earleme_complete()' call iff that
'marpa_r_earleme_complete()' call is the first
'marpa_r_earleme_complete()' call after the 'marpa_r_alternative()'
call.
In the dense model of input, after a successful call of
'marpa_r_alternative()', the earleme variables are as follows:
* The furthest earleme will be 'max(OLD_F, OLD_C+LENGTH)', where
  - OLD_F is the furthest earleme before the call to
    'marpa_r_alternative()',
  - OLD_C is the value of the current earleme before the call to
    'marpa_r_alternative()', and
  - LENGTH is the length of the token read.
* 'marpa_r_alternative()' never changes the latest or current
earleme.
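The rule above can be sketched in a few lines of C. This is only an
illustrative model of the earleme variables, not Libmarpa code: the
'struct earleme_vars' type and the 'model_alternative()' function are
hypothetical names introduced here, and the sketch assumes the token
length is at least 1.

```c
#include <assert.h>

/* Hypothetical model of the three earleme variables.
   This is not a Libmarpa type. */
struct earleme_vars {
    int latest;
    int current;
    int furthest;
};

/* Model of the effect of one successful marpa_r_alternative() call
   for a token of the given length.  Only the furthest earleme can
   change; the latest and current earlemes are left untouched. */
static void model_alternative(struct earleme_vars *v, int length)
{
    int token_end;

    assert(length >= 1);
    token_end = v->current + length;      /* OLD_C + LENGTH */
    if (token_end > v->furthest)          /* max(OLD_F, OLD_C+LENGTH) */
        v->furthest = token_end;
}
```

For instance, with all three variables at 3, modeling a token of length
4 moves the furthest earleme to 7. A later token of length 2 at the same
current earleme leaves the furthest earleme at 7.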
In the dense variable-length model of input, the effect of the
'marpa_r_earleme_complete()' mutator on the earleme variables is the
same as for the basic models of input. *Note The standard model of
input::.
In the dense model of input, the latest earleme is always the same as
the current earleme. In fact, the latest earleme and the current
earleme are always the same, except in the fully general model of input.
File: api.info, Node: The fully general input model, Prev: The dense variable-length token model, Up: Advanced input models
23.2 The fully general input model
==================================
In the "sparse variable-length model of input", zero or more successful
calls of 'marpa_r_alternative()' must be immediately previous to every
call to 'marpa_r_earleme_complete()'. The sparse model is the dense
variable-length model, with its only restriction lifted -- the sparse
variable-length input model allows calls to 'marpa_r_earleme_complete()'
that are not immediately preceded by calls to 'marpa_r_alternative()'.
Since it is unrestricted, the sparse input model is Libmarpa's fully
general input model. Because of this, it may be useful for us to specify
the effect of mutators on the earleme variables in detail, even at the
expense of some repetition.
In the sparse input model, "empty earlemes" are now possible. An
empty earleme is an earleme with no tokens and no Earley set. An empty
earleme occurs iff 'marpa_r_earleme_complete()' is called when there is
no immediately previous call to 'marpa_r_alternative()'. The sparse
model takes its name from the fact that there may be earlemes with no
Earley set. In the sparse model, Earley sets are "sparsely" distributed
among the earlemes.
In the sparse model of input, the effect on the earleme variables of a
successful call of the 'marpa_r_alternative()' mutator is the same as
for the dense model of input:
* The furthest earleme will be 'max(OLD_F, OLD_C+LENGTH)', where
  - OLD_F is the furthest earleme before the call to
    'marpa_r_alternative()',
  - OLD_C is the value of the current earleme before the call to
    'marpa_r_alternative()', and
  - LENGTH is the length of the token read.
* 'marpa_r_alternative()' never changes the latest or current
earleme.
In the sparse model, when the earleme is not empty, the effect of a
call to 'marpa_r_earleme_complete()' on the earleme variables is the
same as in the dense and the basic models of input. Specifically, the
following will be true:
* The current earleme will be advanced to 'OLD_C+1', where OLD_C is
the current earleme before the call.
* The latest earleme will be 'OLD_C+1', and therefore will be equal
to the current earleme.
* The value of the furthest earleme is never changed by a call to
'marpa_r_earleme_complete()'.
Recall that, in the dense and basic input models, as a matter of
definition, there are no empty earlemes. For the sparse input model, in
the case of an empty earleme, the effect of the
'marpa_r_earleme_complete()' mutator on the earleme variables is the
following:
* The current earleme will be advanced to 'OLD_C+1', where OLD_C is
the current earleme before the call.
* The latest earleme will remain at OLD_L, where the latest earleme
before the call is OLD_L. This implies that the latest earleme
will be less than the current earleme.
* The furthest earleme is never changed by a call to
'marpa_r_earleme_complete()'.
After a call to 'marpa_r_earleme_complete()' for an empty earleme,
the latest and current earlemes will have different values. In a parse
that never calls 'marpa_r_earleme_complete()' for an empty earleme, the
latest and current earlemes will always be the same.
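The rules of this section can be collected into a small C model. It is
a sketch of the rules as stated in this section only, not of Libmarpa's
internals. The names 'struct sparse_vars', 'sparse_alternative()' and
'sparse_earleme_complete()' are hypothetical, as is the 'pending'
counter, which is assumed here as a way to record whether a successful
'marpa_r_alternative()' call is immediately previous.

```c
#include <assert.h>

/* Hypothetical model of the earleme variables in the sparse model.
   `pending` counts the modeled marpa_r_alternative() calls made since
   the last modeled marpa_r_earleme_complete() call. */
struct sparse_vars {
    int latest;
    int current;
    int furthest;
    int pending;
};

/* Model of a successful marpa_r_alternative() call. */
static void sparse_alternative(struct sparse_vars *v, int length)
{
    int token_end;

    assert(length >= 1);
    token_end = v->current + length;
    if (token_end > v->furthest)
        v->furthest = token_end;
    v->pending += 1;
}

/* Model of marpa_r_earleme_complete().  Returns 1 if the call was
   made for an empty earleme, and 0 otherwise. */
static int sparse_earleme_complete(struct sparse_vars *v)
{
    int was_empty = (v->pending == 0);

    v->current += 1;                /* always advances to OLD_C+1 */
    if (!was_empty)
        v->latest = v->current;     /* latest follows current */
    /* For an empty earleme, the latest earleme stays at OLD_L. */
    v->pending = 0;                 /* the furthest earleme is untouched */
    return was_empty;
}
```

Starting with every field at 0, modeling one token of length 3 followed
by two 'sparse_earleme_complete()' calls leaves the current earleme at
2, the latest earleme at 1, and the furthest earleme at 3. The second
call is the one made for an empty earleme.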
File: api.info, Node: Futures, Next: Deprecated techniques and methods, Prev: Advanced input models, Up: Top
24 Futures
**********
This chapter discusses features that are *not* in the external
interface, but that might be added to the external interface in the
future.
* Menu:
* Orthogonal treatment of exhaustion::
* Furthest earleme values::
* Additional recoverable failures in marpa_r_alternative()::
* Untested methods::
File: api.info, Node: Orthogonal treatment of exhaustion, Next: Furthest earleme values, Prev: Futures, Up: Futures
24.1 Orthogonal treatment of exhaustion
=======================================
The treatment of parse exhaustion is very awkward.
'marpa_r_start_input()' returns success on exhaustion, while
'marpa_r_earleme_complete()' returns either success or a hard failure,
depending on circumstances. *Note marpa_r_earleme_complete():
marpa_r_earleme_complete. and *note marpa_r_start_input():
marpa_r_start_input.
Ideally the treatment should be simpler, more intuitive and more
orthogonal. Better, perhaps, would be to always treat parse exhaustion
as a soft failure.
File: api.info, Node: Furthest earleme values, Next: Additional recoverable failures in marpa_r_alternative(), Prev: Orthogonal treatment of exhaustion, Up: Futures
24.2 Furthest earleme values
============================
'marpa_r_furthest_earleme' returns an 'unsigned int', which is
non-orthogonal with 'marpa_r_current_earleme'. This leaves no room for
a failure return value, which we deal with by not checking for
failures, of which the only important one is calling
'marpa_r_furthest_earleme' before the start of input. To accommodate
'marpa_r_furthest_earleme', we consider the furthest earleme to have been
initialized when the recognizer was created, which is another
non-orthogonality with 'marpa_r_current_earleme'.
All this might be fine, if something were gained, but in fact the
furthest earleme, unless there is a problem, always becomes the current
earleme, and no use cases for extremely long variable-length tokens are
envisioned, so that the two should never be far apart. Additionally,
the extra values available for the furthest earleme only come into play
if the parse is too large for the computer memories as of this writing.
Summarizing, 'marpa_r_furthest_earleme' should return an 'int', like
'marpa_r_current_earleme', and the non-orthogonalities should be
eliminated.
File: api.info, Node: Additional recoverable failures in marpa_r_alternative(), Next: Untested methods, Prev: Furthest earleme values, Up: Futures
24.3 Additional recoverable failures in marpa_r_alternative()
=============================================================
Among the hard failures that 'marpa_r_alternative()' returns are the error
codes 'MARPA_ERR_DUPLICATE_TOKEN', 'MARPA_ERR_NO_TOKEN_EXPECTED_HERE'
and 'MARPA_ERR_INACCESSIBLE_TOKEN'. These are currently irrecoverable.
They may in fact be fully recoverable, but are not documented as such
because this has not been tested.
At this writing, we know of no applications which attempt to recover
from these errors. It is possible that these error codes may also be
usable for techniques similar to the Ruby Slippers but, as of this
writing, we know of no proposals to use them in this way.
File: api.info, Node: Untested methods, Prev: Additional recoverable failures in marpa_r_alternative(), Up: Futures
24.4 Untested methods
=====================
The methods of this section are not in the external interface, because
they have not been adequately tested. Their fate is uncertain. Users
should regard these methods as unsupported.
* Menu:
* Ranking methods::
* Zero-width assertion methods::
* Methods for revising parses::
File: api.info, Node: Ranking methods, Next: Zero-width assertion methods, Prev: Untested methods, Up: Untested methods
24.4.1 Ranking methods
----------------------
-- Function: Marpa_Rank marpa_g_default_rank_set ( Marpa_Grammar G,
Marpa_Rank RANK)
-- Function: Marpa_Rank marpa_g_default_rank ( Marpa_Grammar G)
These methods, respectively, set and query the default rank of the
grammar. When a grammar is created, the default rank is 0. When
rules and symbols are created, their rank is the default rank of
the grammar.
Changing the grammar's default rank does not affect those rules and
symbols already created, only those that will be created. This
means that the grammar's default rank can be used to, in effect,
assign ranks to groups of rules and symbols. Applications may find
this behavior useful.
Return value: On success, returns the rank *after* the call, and
sets the error code to 'MARPA_ERR_NONE'. On failure, returns -2,
and sets the error code to an appropriate value, which will never
be 'MARPA_ERR_NONE'. Note that when the rank is -2, the error code
is the only way to distinguish success from failure. The error
code can be determined by using the 'marpa_g_error()' call.
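The ambiguity of a -2 return value can be illustrated with a
self-contained C sketch. It does not call Libmarpa. The
'struct mock_grammar' type, the 'MOCK_ERR_*' codes, and the
'mock_default_rank_set()' and 'rank_set_succeeded()' functions are
hypothetical stand-ins that only mimic the return convention described
above, and the 'fail_next' flag is an artificial way to force a failure.

```c
#include <assert.h>

/* Hypothetical error codes standing in for MARPA_ERR_NONE and for
   any other, non-MARPA_ERR_NONE, error code. */
enum { MOCK_ERR_NONE = 0, MOCK_ERR_FAILED = 1 };

/* Hypothetical stand-in for a grammar.  `fail_next` forces the next
   setter call to fail, purely for the sake of the illustration. */
struct mock_grammar {
    int default_rank;
    int error_code;
    int fail_next;
};

/* Mimics the documented convention: on success, return the rank
   after the call and record "no error"; on failure, return -2 and
   record an error code other than "no error". */
static int mock_default_rank_set(struct mock_grammar *g, int rank)
{
    assert(g != 0);
    if (g->fail_next) {
        g->fail_next = 0;
        g->error_code = MOCK_ERR_FAILED;
        return -2;
    }
    g->default_rank = rank;
    g->error_code = MOCK_ERR_NONE;
    return rank;
}

/* A result other than -2 is always a success.  A result of -2 is a
   success only if the recorded error code says so. */
static int rank_set_succeeded(const struct mock_grammar *g, int result)
{
    if (result != -2)
        return 1;
    return g->error_code == MOCK_ERR_NONE;
}
```

In an application, the role of the recorded error code would be played
by the value obtained through 'marpa_g_error()', consulted immediately
after the call that returned -2.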
-- Function: Marpa_Rank marpa_g_symbol_rank_set ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID, Marpa_Rank RANK)
-- Function: Marpa_Rank marpa_g_symbol_rank ( Marpa_Grammar G,
Marpa_Symbol_ID SYM_ID)
These methods, respectively, set and query the rank of a symbol
SYM_ID. When SYM_ID is created, its rank is initialized to the
default rank of the grammar.
Return value: On success, returns the rank *after* the call, and
sets the error code to 'MARPA_ERR_NONE'. On failure, returns -2,
and sets the error code to an appropriate value, which will never
be 'MARPA_ERR_NONE'. Note that when the rank is -2, the error code
is the only way to distinguish success from failure. The error
code can be determined by using the 'marpa_g_error()' call.
File: api.info, Node: Zero-width assertion methods, Next: Methods for revising parses, Prev: Ranking methods, Up: Untested methods
24.4.2 Zero-width assertion methods
-----------------------------------
-- Function: Marpa_Assertion_ID marpa_g_zwa_new ( Marpa_Grammar G, int
DEFAULT_VALUE)
-- Function: int marpa_g_zwa_place ( Marpa_Grammar G,
Marpa_Assertion_ID ZWAID, Marpa_Rule_ID XRL_ID, int RHS_IX)
-- Function: int marpa_r_zwa_default ( Marpa_Recognizer R,
Marpa_Assertion_ID ZWAID)
On success, returns previous default value of the assertion.
-- Function: int marpa_r_zwa_default_set ( Marpa_Recognizer R,
Marpa_Assertion_ID ZWAID, int DEFAULT_VALUE)
Changes default value to DEFAULT_VALUE. On success, returns
previous default value of the assertion.
-- Function: Marpa_Assertion_ID marpa_g_highest_zwa_id ( Marpa_Grammar
G )
File: api.info, Node: Methods for revising parses, Prev: Zero-width assertion methods, Up: Untested methods
24.4.3 Methods for revising parses
----------------------------------
Marpa allows an application to "change its mind" about a parse,
rejecting rules previously recognized or predicted, and terminals
previously scanned. The methods in this section provide that capability.
-- Function: Marpa_Earleme marpa_r_clean ( Marpa_Recognizer R)
File: api.info, Node: Deprecated techniques and methods, Next: Index of terms, Prev: Futures, Up: Top
25 Deprecated techniques and methods
************************************
* Menu:
* Valued and unvalued symbols::
File: api.info, Node: Valued and unvalued symbols, Prev: Deprecated techniques and methods, Up: Deprecated techniques and methods
25.1 Valued and unvalued symbols
================================
* Menu:
* What unvalued symbols were::
* Grammar methods dealing with unvalued symbols::
* Registering semantics in the valuator::
File: api.info, Node: What unvalued symbols were, Next: Grammar methods dealing with unvalued symbols, Prev: Valued and unvalued symbols, Up: Valued and unvalued symbols
25.1.1 What unvalued symbols were
---------------------------------
Libmarpa symbols can have values, which is the traditional way of doing
semantics. Libmarpa also allows symbols to be unvalued. An "unvalued"
symbol is one whose value is unpredictable from instance to instance.
If a symbol is unvalued, we sometimes say that it has "whatever"
semantics.
Situations where the semantics can tolerate unvalued symbols are
surprisingly frequent. For example, the top-level of many languages is
a series of major units, all of whose semantics are typically
accomplished via side effects. The compiler is typically indifferent to
the actual value produced by these major units, and tracking them is a
waste of time. Similarly, the value of the separators in a list is
typically ignored.
Rules are unvalued if and only if their LHS symbols are unvalued.
When rules and symbols are unvalued, Libmarpa optimizes their
evaluation.
It is in principle unsafe to check the value of a symbol if it can be
unvalued. For this reason, once a symbol has been treated as valued,
Libmarpa marks it as valued. Similarly, once a symbol has been treated
as unvalued, Libmarpa marks it as unvalued. Once marked, a symbol's
valued status is "locked" and cannot be changed later.
The valued status of terminals is marked the first time they are
read. The valued status of LHS symbols must be explicitly marked by the
application when initializing the valuator -- this is Libmarpa's
equivalent of registering a callback.
LHS terminals are disabled by default. If allowed, the user should
be aware that the valued status of a LHS terminal will be locked in the
recognizer if it is used as a terminal, and the symbol's use as a rule
LHS in the valuator must be consistent with the recognizer's marking.
Marpa reports an error when a symbol's use conflicts with its locked
valued status. Doing so usually saves the Libmarpa user some tricky
debugging further down the road.
File: api.info, Node: Grammar methods dealing with unvalued symbols, Next: Registering semantics in the valuator, Prev: What unvalued symbols were, Up: Valued and unvalued symbols
25.1.2 Grammar methods dealing with unvalued symbols
----------------------------------------------------
-- Function: int marpa_g_symbol_is_valued_set ( Marpa_Grammar G,
Marpa_Symbol_ID SYMBOL_ID, int VALUE)
-- Function: int marpa_g_symbol_is_valued ( Marpa_Grammar G,
Marpa_Symbol_ID SYMBOL_ID)
These methods, respectively, set and query the "valued status" of a
symbol. Once set to a value with the
'marpa_g_symbol_is_valued_set()' method, the valued status of a
symbol is "locked" at that value. It cannot thereafter be changed.
Subsequent calls to 'marpa_g_symbol_is_valued_set()' for the same
SYMBOL_ID will fail, leaving SYMBOL_ID's valued status unchanged, unless
VALUE is the same as the locked-in value.
Return value: On success, 1 if the symbol SYMBOL_ID is valued after
the call, 0 if not. If the valued status is locked and VALUE is
different from the current status, -2. If VALUE is not 0 or 1; or
on other failure, -2.
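The locking behavior described above can be modeled in a few lines of
C. This is a sketch of the stated rules only, not Libmarpa's
implementation. The 'struct valued_status' type and the
'model_is_valued_set()' function are hypothetical names, and the
initial, unlocked value of 'is_valued' is left to the caller because
this section does not specify it.

```c
#include <assert.h>

/* Hypothetical model of one symbol's valued status. */
struct valued_status {
    int is_valued;   /* 1 if valued, 0 if unvalued */
    int is_locked;   /* 1 once the status has been set */
};

/* Model of the setter's return convention: on success, the valued
   status after the call; -2 if VALUE is not 0 or 1, or if the status
   is locked at a different value. */
static int model_is_valued_set(struct valued_status *s, int value)
{
    assert(s != 0);
    if (value != 0 && value != 1)
        return -2;
    if (s->is_locked && s->is_valued != value)
        return -2;                 /* locked: status is unchanged */
    s->is_valued = value;
    s->is_locked = 1;              /* the first successful set locks */
    return s->is_valued;
}
```

In this model, setting the status to 1 twice succeeds both times, while
a later attempt to set it to 0 returns -2 and leaves the status at 1.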
File: api.info, Node: Registering semantics in the valuator, Prev: Grammar methods dealing with unvalued symbols, Up: Valued and unvalued symbols
25.1.3 Registering semantics in the valuator
--------------------------------------------
By default, Libmarpa's valuator objects assume that non-terminal symbols
have no semantics. The archetypal application will need to register
symbols that contain semantics. The primary method for doing this is
'marpa_v_symbol_is_valued_set()'. Applications will typically register
semantics by rule, and these applications will find the
'marpa_v_rule_is_valued_set()' method more convenient.
-- Function: int marpa_v_symbol_is_valued_set ( Marpa_Value V,
Marpa_Symbol_ID SYM_ID, int STATUS )
-- Function: int marpa_v_symbol_is_valued ( Marpa_Value V,
Marpa_Symbol_ID SYM_ID )
These methods, respectively, set and query the valued status of
symbol SYM_ID. 'marpa_v_symbol_is_valued_set()' will set the
valued status to the value of its STATUS argument. A valued status
of 1 indicates that the symbol is valued. A valued status of 0
indicates that the symbol is unvalued. If the valued status is
locked, an attempt to change to a status different from the current
one will fail (error code 'MARPA_ERR_VALUED_IS_LOCKED').
Return value: On success, the valued status *after* the call. If
STATUS is not either 0 or 1, or on other failure, -2.
-- Function: int marpa_v_rule_is_valued_set ( Marpa_Value V,
Marpa_Rule_ID RULE_ID, int STATUS )
-- Function: int marpa_v_rule_is_valued ( Marpa_Value V, Marpa_Rule_ID
RULE_ID )
These methods, respectively, set and query the valued status for
the LHS symbol of rule RULE_ID. 'marpa_v_rule_is_valued_set()'
sets the valued status to the value of its STATUS argument.
A valued status of 1 indicates that the symbol is valued. A valued
status of 0 indicates that the symbol is unvalued. If the valued
status is locked, an attempt to change to a status different from
the current one will fail (error code
'MARPA_ERR_VALUED_IS_LOCKED').
Rules have no valued status of their own. The valued status of a
rule is always that of its LHS symbol. These methods are
conveniences -- they save the application the trouble of looking up
the rule's LHS.
Return value: On success, the valued status of the rule RULE_ID's
LHS symbol *after* the call. If STATUS is not either 0 or 1, or on
other failure, -2.
-- Function: int marpa_v_valued_force ( Marpa_Value V)
This method locks the valued status of all symbols to 1, indicating
that the symbols are valued. If this is not possible, for example
because one of the grammar's symbols already is locked at a valued
status of 0, failure is returned.
Return value: On success, a non-negative number. On failure,
returns -2, and sets the error code to an appropriate value, which
will never be 'MARPA_ERR_NONE'.
File: api.info, Node: Index of terms, Prev: Deprecated techniques and methods, Up: Top
Index of terms
**************
This index is of terms that are used in a special sense in this
document. Not every use of these terms is indexed -- only those uses
which are in some way defining.
[index ]
* Menu:
* accessible rule: Rule methods. (line 14)
* accessible symbol: Symbol methods. (line 34)
* active parse: Exhaustion. (line 7)
* advanced input model: Advanced input models.
(line 13)
* advanced models of input: The basic models of input.
(line 15)
* application: Terminology and notation.
(line 9)
* application behavior: Application and diagnostic behavior.
(line 6)
* applications, exhaustion-hating: Exhaustion. (line 38)
* applications, exhaustion-loving: Exhaustion. (line 44)
* archetypal Libmarpa application: About the overviews. (line 13)
* base grammar (of a time object): Time objects. (line 24)
* basic models of input: The basic models of input.
(line 7)
* behavior, application: Application and diagnostic behavior.
(line 6)
* behavior, diagnostic: Application and diagnostic behavior.
(line 16)
* boolean: Terminology and notation.
(line 7)
* boolean value: Terminology and notation.
(line 7)
* child object (of a time object): Time objects. (line 15)
* counted symbol: Sequence methods. (line 118)
* dense variable-length input model: The dense variable-length token model.
(line 6)
* diagnostic behavior: Application and diagnostic behavior.
(line 16)
* earleme: The traditional input model.
(line 16)
* earleme, current: The current earleme. (line 6)
* earleme, empty: The fully general input model.
(line 18)
* earleme, furthest: The furthest earleme.
(line 6)
* earleme, latest: The latest earleme. (line 9)
* Earley item warning threshold: Other parse status methods.
(line 19)
* Earley set, latest: The latest earleme. (line 6)
* empty earleme: The fully general input model.
(line 18)
* exhausted parse: Exhaustion. (line 6)
* exhaustion-hating applications: Exhaustion. (line 38)
* exhaustion-loving applications: Exhaustion. (line 44)
* failure: User non-conformity to specified behavior.
(line 21)
* failure, fully recoverable hard: Fully recoverable hard failure.
(line 6)
* failure, hard: Classifying failure. (line 16)
* failure, irrecoverable hard: Irrecoverable hard failure.
(line 6)
* failure, Libmarpa application programming: User non-conformity to specified behavior.
(line 21)
* failure, library-recoverable hard: Library-recoverable hard failure.
(line 6)
* failure, memory allocation: Memory allocation failure.
(line 9)
* failure, partially recoverable hard: Partially recoverable hard failure.
(line 6)
* failure, soft: Classifying failure. (line 17)
* failure, soft <1>: Soft failure. (line 6)
* failure, undetected: Undetected failure. (line 6)
* frozen ordering: Ordering overview. (line 11)
* fully recoverable hard failure: Fully recoverable hard failure.
(line 6)
* hard failure: Classifying failure. (line 16)
* hard failure, fully recoverable: Fully recoverable hard failure.
(line 6)
* hard failure, irrecoverable: Irrecoverable hard failure.
(line 6)
* hard failure, library-recoverable: Library-recoverable hard failure.
(line 6)
* hard failure, partially recoverable: Partially recoverable hard failure.
(line 6)
* ID (of an Earley set): The traditional input model.
(line 17)
* iff: Terminology and notation.
(line 8)
* immediately previous (to a marpa_r_earleme_complete() call): The standard model of input.
(line 8)
* input model, advanced: Advanced input models.
(line 13)
* input model, dense variable-length: The dense variable-length token model.
(line 6)
* input model, sparse variable-length: The fully general input model.
(line 6)
* input model, variable-length token: Advanced input models.
(line 14)
* input, advanced models of: The basic models of input.
(line 15)
* input, basic models of: The basic models of input.
(line 7)
* irrecoverable hard failure: Irrecoverable hard failure.
(line 6)
* iterator, parse tree: Tree overview. (line 7)
* Libmarpa application programming failure: User non-conformity to specified behavior.
(line 21)
* Libmarpa application programming success: User non-conformity to specified behavior.
(line 25)
* Libmarpa application, archetypal: About the overviews. (line 13)
* library-recoverable hard failure: Library-recoverable hard failure.
(line 6)
* locked value status (of a symbol): What unvalued symbols were.
(line 28)
* max(x,y): Terminology and notation.
(line 13)
* memory allocation failure: Memory allocation failure.
(line 9)
* method: Terminology and notation.
(line 15)
* models of input, advanced: The basic models of input.
(line 15)
* models of input, basic: The basic models of input.
(line 7)
* nullable rule: Rule methods. (line 28)
* nullable symbol: Symbol methods. (line 46)
* nulling rule: Rule methods. (line 42)
* nulling symbol: Symbol methods. (line 59)
* ordering, frozen: Ordering overview. (line 11)
* ordinal (of an Earley set): The traditional input model.
(line 17)
* our: Terminology and notation.
(line 20)
* parent object (of a time object): Time objects. (line 14)
* parse tree: Tree overview. (line 7)
* parse tree iterator: Tree overview. (line 7)
* parse, active: Exhaustion. (line 7)
* parse, exhausted: Exhaustion. (line 6)
* partially recoverable hard failure: Partially recoverable hard failure.
(line 6)
* previous (to a marpa_r_earleme_complete() call), immediately: The standard model of input.
(line 8)
* productive rule: Rule methods. (line 76)
* productive symbol: Symbol methods. (line 71)
* proper separation: Sequence methods. (line 90)
* Ruby Slippers: Recognizer life cycle mutators.
(line 86)
* rule, accessible: Rule methods. (line 14)
* rule, nullable: Rule methods. (line 28)
* rule, nulling: Rule methods. (line 42)
* rule, productive: Rule methods. (line 76)
* separation, proper: Sequence methods. (line 90)
* soft failure: Classifying failure. (line 17)
* soft failure <1>: Soft failure. (line 6)
* sparse variable-length input model: The fully general input model.
(line 6)
* success: User non-conformity to specified behavior.
(line 25)
* success, Libmarpa application programming: User non-conformity to specified behavior.
(line 25)
* symbol, accessible: Symbol methods. (line 34)
* symbol, counted: Sequence methods. (line 118)
* symbol, nullable: Symbol methods. (line 46)
* symbol, nulling: Symbol methods. (line 59)
* symbol, productive: Symbol methods. (line 71)
* symbol, unvalued: What unvalued symbols were.
(line 7)
* tree: Tree overview. (line 7)
* undetected failure: Undetected failure. (line 6)
* unvalued symbol: What unvalued symbols were.
(line 7)
* us: Terminology and notation.
(line 20)
* user: Terminology and notation.
(line 17)
* valuator: Value overview. (line 6)
* value status, locked (of a symbol): What unvalued symbols were.
(line 28)
* value, boolean: Terminology and notation.
(line 7)
* variable-length input model, dense: The dense variable-length token model.
(line 6)
* variable-length input model, sparse: The fully general input model.
(line 6)
* variable-length token input model: Advanced input models.
(line 14)
* we: Terminology and notation.
(line 20)
Tag Table:
Node: Top1400
Ref: license1643
Node: No warranty6480
Node: About this document6847
Node: How to read this document7097
Node: Prerequisites7565
Node: Parsing theory8162
Node: Terminology and notation8879
Node: Application and diagnostic behavior10138
Node: About Libmarpa11611
Node: Architecture13443
Node: Major objects13649
Node: Time objects14974
Node: Reference counting16428
Node: Numbered objects18481
Node: Input18915
Node: Earlemes19081
Node: The traditional input model19317
Node: The latest earleme20702
Node: The current earleme21941
Node: The furthest earleme22725
Node: The basic models of input24340
Node: The standard model of input25248
Node: Ambiguous input26983
Node: Terminals27848
Node: Exhaustion28710
Node: Semantics32379
Node: Threads33435
Node: Failure34688
Node: Libmarpa's approach to failure35363
Node: User non-conformity to specified behavior37341
Node: Classifying failure39111
Node: Memory allocation failure40306
Node: Undetected failure41307
Node: Irrecoverable hard failure42544
Node: Partially recoverable hard failure43465
Node: Library-recoverable hard failure44285
Node: Fully recoverable hard failure46185
Node: Soft failure46934
Node: Error codes48209
Node: Introduction to the method descriptions49542
Node: About the overviews49922
Node: Naming conventions50889
Node: Return values52034
Node: How to read the method descriptions53248
Node: Static methods56339
Ref: marpa_check_version56505
Ref: marpa_version57291
Node: Configuration methods57595
Ref: marpa_c_init58632
Ref: marpa_c_error58953
Node: Grammar methods59510
Node: Grammar overview59845
Node: Grammar constructor60890
Ref: marpa_g_new61075
Ref: marpa_g_force_valued61939
Node: Grammar reference counting62609
Ref: marpa_g_ref62834
Ref: marpa_g_unref63052
Node: Symbol methods63216
Ref: marpa_g_start_symbol63376
Ref: marpa_g_start_symbol_set63789
Ref: marpa_g_highest_symbol_id64234
Ref: marpa_g_symbol_is_accessible64409
Ref: marpa_g_symbol_is_nullable64933
Ref: marpa_g_symbol_is_nulling65561
Ref: marpa_g_symbol_is_productive66072
Ref: marpa_g_symbol_is_start66642
Ref: marpa_g_symbol_is_terminal67167
Ref: marpa_g_symbol_is_terminal_set68008
Ref: marpa_g_symbol_new69220
Node: Rule methods69481
Ref: marpa_g_highest_rule_id69627
Ref: marpa_g_rule_is_accessible69798
Ref: marpa_g_rule_is_nullable70448
Ref: marpa_g_rule_is_nulling71079
Ref: marpa_g_rule_is_loop71601
Ref: marpa_g_rule_is_productive72663
Ref: marpa_g_rule_length73457
Ref: marpa_g_rule_lhs73864
Ref: marpa_g_rule_new74226
Ref: marpa_g_rule_rhs75726
Node: Sequence methods76429
Ref: marpa_g_rule_is_proper_separation76581
Ref: marpa_g_sequence_min77665
Ref: marpa_g_sequence_new78688
Ref: marpa_g_sequence_separator81528
Ref: marpa_g_symbol_is_counted82078
Node: Rank methods82643
Ref: marpa_g_rule_rank82797
Ref: marpa_g_rule_rank_set83620
Ref: marpa_g_rule_null_high84340
Ref: marpa_g_rule_null_high_set85010
Node: Grammar precomputation86205
Ref: marpa_g_has_cycle86358
Ref: marpa_g_is_precomputed86949
Ref: marpa_g_precompute87132
Node: Recognizer methods91046
Node: Recognizer overview91382
Node: Creating a new recognizer92138
Ref: marpa_r_new92344
Node: Recognizer reference counting92866
Ref: marpa_r_ref93119
Ref: marpa_r_unref93404
Node: Recognizer life cycle mutators93769
Ref: marpa_r_start_input93967
Ref: marpa_r_alternative94825
Ref: Ruby Slippers97608
Ref: marpa_r_earleme_complete98287
Node: Location accessors102834
Ref: marpa_r_current_earleme103027
Ref: marpa_r_earleme103229
Ref: marpa_r_earley_set_value104560
Ref: marpa_r_earley_set_values104866
Ref: marpa_r_furthest_earleme106064
Ref: marpa_r_latest_earley_set106267
Ref: marpa_r_latest_earley_set_value_set106653
Ref: marpa_r_latest_earley_set_values_set106977
Node: Other parse status methods107404
Ref: marpa_r_earley_item_warning_threshold107574
Ref: marpa_r_earley_item_warning_threshold_set107896
Ref: marpa_r_is_exhausted109289
Ref: marpa_r_terminals_expected109708
Ref: marpa_r_terminal_is_expected110271
Node: Progress reports110833
Ref: marpa_r_progress_report_reset111588
Ref: marpa_r_progress_report_start111987
Ref: marpa_r_progress_report_finish112806
Ref: marpa_r_progress_item113340
Node: Bocage methods114405
Node: Bocage overview114646
Node: Bocage constructor115166
Ref: marpa_b_new115345
Node: Bocage reference counting116036
Ref: marpa_b_ref116209
Ref: marpa_b_unref116402
Node: Bocage accessor116936
Ref: marpa_b_ambiguity_metric117064
Ref: marpa_b_is_null117630
Node: Ordering methods117834
Node: Ordering overview118105
Node: Ordering constructor118709
Ref: marpa_o_new118894
Node: Ordering reference counting119176
Ref: marpa_o_ref119354
Ref: marpa_o_unref119546
Node: Order accessor120152
Ref: marpa_o_ambiguity_metric120312
Ref: marpa_o_is_null121158
Node: Non-default ordering121363
Ref: marpa_o_high_rank_only_set121509
Ref: marpa_o_high_rank_only121581
Ref: marpa_o_rank122224
Node: Tree methods122655
Node: Tree overview122880
Node: Tree constructor123707
Ref: marpa_t_new123892
Node: Tree reference counting124361
Ref: marpa_t_ref124527
Ref: marpa_t_unref124716
Node: Tree iteration125322
Ref: marpa_t_next125481
Ref: marpa_t_parse_count126234
Node: Value methods126630
Node: Value overview127042
Node: How to use the valuator128118
Node: Advantages of step-driven valuation130090
Node: Maintaining the stack133570
Node: Sizing the stack135582
Node: Initializing locations in the stack137340
Node: Valuator constructor139377
Ref: marpa_v_new139569
Node: Valuator reference counting140227
Ref: marpa_v_ref140417
Ref: marpa_v_unref140608
Node: Stepping through the valuator141212
Ref: marpa_v_step141426
Node: Valuator steps by type141965
Node: Basic step accessors143902
Node: Other step accessors145764
Node: Events147310
Node: Events overview147603
Node: Basic event accessors148666
Ref: marpa_g_event148832
Ref: marpa_g_event_count149488
Node: Completion events149891
Ref: marpa_g_completion_symbol_activate150054
Ref: marpa_r_completion_symbol_activate151595
Ref: marpa_g_symbol_is_completion_event152574
Ref: marpa_g_symbol_is_completion_event_set152680
Node: Symbol nulled events154708
Ref: marpa_g_nulled_symbol_activate154873
Ref: marpa_r_nulled_symbol_activate156361
Ref: marpa_g_symbol_is_nulled_event157288
Ref: marpa_g_symbol_is_nulled_event_set157390
Node: Prediction events161484
Ref: marpa_g_prediction_symbol_activate161648
Ref: marpa_r_prediction_symbol_activate163189
Ref: marpa_g_symbol_is_prediction_event164150
Ref: marpa_g_symbol_is_prediction_event_set164256
Node: Symbol expected events165904
Ref: marpa_r_expected_symbol_event_set166069
Node: Event codes167156
Ref: MARPA_EVENT_EARLEY_ITEM_THRESHOLD167653
Node: Error methods macros and codes170889
Node: Error methods171161
Ref: marpa_g_error171337
Ref: marpa_g_error_clear171715
Node: Error Macros171988
Node: External error codes172355
Node: Internal error codes195983
Node: Technical notes199734
Node: Data types used by Libmarpa200143
Node: Why so many time objects200445
Node: Design of numbered objects202213
Node: LHS Terminals203455
Node: Advanced input models204151
Node: The dense variable-length token model204987
Node: The fully general input model206855
Node: Futures210229
Node: Orthogonal treatment of exhaustion210667
Node: Furthest earleme values211359
Node: Additional recoverable failures in marpa_r_alternative()212659
Node: Untested methods213527
Node: Ranking methods213976
Ref: marpa_g_default_rank_set214150
Ref: marpa_g_default_rank214246
Ref: marpa_g_symbol_rank_set215277
Ref: marpa_g_symbol_rank215396
Node: Zero-width assertion methods216085
Ref: marpa_g_zwa_new216296
Ref: marpa_g_zwa_place216394
Ref: marpa_r_zwa_default216520
Ref: marpa_r_zwa_default_set216684
Ref: marpa_g_highest_zwa_id216917
Node: Methods for revising parses217004
Ref: marpa_r_clean217392
Node: Deprecated techniques and methods217457
Node: Valued and unvalued symbols217683
Node: What unvalued symbols were218019
Node: Grammar methods dealing with unvalued symbols220179
Ref: marpa_g_symbol_is_valued_set220473
Ref: marpa_g_symbol_is_valued220587
Node: Registering semantics in the valuator221379
Ref: marpa_v_symbol_is_valued_set222010
Ref: marpa_v_symbol_is_valued222121
Ref: marpa_v_rule_is_valued_set222830
Ref: marpa_v_rule_is_valued222938
Ref: marpa_v_valued_force223928
Node: Index of terms224417
End Tag Table