NAME
Marpa::R2::Advanced::Thin - Direct access to Libmarpa
About this document
At this moment, this document is INCOMPLETE and, for that reason, NOT 100% RELIABLE.
Most Marpa users can ignore this document. It describes Marpa's "thin" interface. The thin interface provides efficient access to Marpa's core library, Libmarpa. It provides the ultimate in Marpa speed, power and flexibility.
The "thin" interface is very low-level and NOT convenient to use -- user-friendliness is expected to be provided by an upper layer. The "thin" interface is intended for those writing upper layers for Marpa. It is also for those writing applications, when they want to eliminate the overhead of an upper layer, or when they want the flexibility provided by direct access to Libmarpa.
This document assumes that the reader is familiar with the other Marpa::R2 documentation, as well as the Libmarpa API document. This means the reader will have to know some C language, enough to understand C function prototypes.
How this document is written
The Libmarpa interface is described in the Libmarpa API document, and this document avoids duplicating the material there. This document states general rules for the "thin" interface. Methods that do not depart from the general rules are not specifically mentioned.
While this style and level of documentation is efficient, and the standard for C library interfaces to Perl, it is, admittedly, very terse. As an aid to the reader, an example of the usage of the thin interface is presented below. While small, the example is non-trivial. It includes a full logic flow, starting with the definition of the grammar and continuing all the way to the iteration of the values of an ambiguous parse.
Methods in the thin interface
As of this writing, the thin interface has no methods of its own. Each of its methods is a wrapper for a method from the Libmarpa interface.
Not all Libmarpa methods have thin interface wrappers. None of Libmarpa's internal methods are included in the thin interface. Additionally, some of Libmarpa's external methods provide services that are handled internally by the thin interface, and wrappers to those methods are therefore not included in the actual interface. When an external Libmarpa method is omitted, this will be specifically stated, with the reason for the omission.
Whenever an external Libmarpa method is not mentioned in this document, the reader can assume that it has a wrapper that is implemented according to the general guidelines, as given below. Where the implementation of an external Libmarpa method is an exception to the guidelines, or has other peculiarities, that will be explicitly stated.
Libmarpa time classes
As a reminder, the classes of Libmarpa's time objects are, in sequence, grammar, recognizer, bocage, ordering, tree and value. The one-letter abbreviations for these are, respectively, g, r, b, o, t and v.
Methods omitted
No internal Libmarpa method is part of the thin interface. The marpa_check_version() static method is not part of the thin interface, because the thin interface handles its own version matching.
The thin interface deals with all reference counting issues, and interconnects Libmarpa's reference counting with Perl's. The application can rely on Libmarpa objects being cleaned up properly as part of Perl's ordinary garbage collection. For this reason, there are no thin wrappers for the marpa_g_ref(), marpa_g_unref(), marpa_r_ref(), marpa_r_unref(), marpa_b_ref(), marpa_b_unref(), marpa_o_ref(), marpa_o_unref(), marpa_t_ref(), marpa_t_unref(), marpa_v_ref(), and marpa_v_unref() methods.
There are no thin wrappers for the marpa_g_error() and marpa_r_error() methods. The thin interface provides its own interface to Libmarpa's error information, one which is more convenient in the Perl environment.
Libmarpa time objects and constructors
The thin interface implements a Perl class corresponding to each of the Libmarpa time classes. Objects in the thin Marpa classes should be treated as opaque scalars. No application should define new elements for a thin Marpa class, redefine, overload or remove existing elements, or subclass the class itself. The only operations an application should perform on objects blessed into the thin interface's classes are to assign them, to use them to call methods in their class, and to pass them as arguments where appropriate.
Marpa_Grammar      Marpa::R2::Thin::G
Marpa_Recognizer   Marpa::R2::Thin::R
Marpa_Bocage       Marpa::R2::Thin::B
Marpa_Ordering     Marpa::R2::Thin::O
Marpa_Tree         Marpa::R2::Thin::T
Marpa_Value        Marpa::R2::Thin::V
Constructors for the time objects may be called using the new method of the corresponding Perl class. For example,
my $recce = Marpa::R2::Thin::R->new($grammar);
The thin interface takes care of Libmarpa's reference counting for the user. Marpa thin interface's time objects should be destroyed implicitly by undefining them, or by letting them go out of scope.
The general pattern
The thin Marpa methods often follow a general pattern, based on their corresponding Libmarpa time class method. Class instance methods for Libmarpa's time classes have names of the form marpa_g_start_symbol_set. The name begins with a fixed six-character prefix, marpa_, followed by a single letter (in this case "g") and another underscore. The single letter is one of Libmarpa's time class abbreviations, and indicates which class the method belongs to.
In the general pattern of Marpa's thin interface, the corresponding Marpa thin Perl closure would be a method in the appropriate Marpa thin class, whose name is the same except that the 8-character prefix is removed. For example, the Marpa thin method corresponding to marpa_g_start_symbol_set would be named start_symbol_set and would be a method of the Marpa::R2::Thin::G Perl class.
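The naming rule above can be sketched in a few lines of plain Perl. The helper thin_name_for() is hypothetical and exists only for illustration; it is not part of Marpa::R2. It applies the naming rule mechanically and does not know which Libmarpa methods are omitted from the thin interface.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: not part of Marpa::R2.
# Given the name of a Libmarpa time class method, return the Perl class
# and method name that the general pattern would give its thin wrapper.
sub thin_name_for {
    my ($libmarpa_name) = @_;

    # "marpa_", one time class letter, "_", then the rest of the name.
    my ( $class_letter, $method ) =
        $libmarpa_name =~ m/\A marpa_ ([grbotv]) _ (\w+) \z/xms
        or return;
    return ( 'Marpa::R2::Thin::' . uc $class_letter, $method );
}

my ( $class, $method ) = thin_name_for('marpa_g_start_symbol_set');
printf "%s->%s\n", $class, $method;
```

A name without a time class letter, such as marpa_check_version, does not match and yields an empty list.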
When a Libmarpa method returns -1 to indicate failure, a Marpa thin interface following the general pattern returns a Perl undef. When a Libmarpa method returns -2 to indicate failure, a Marpa thin interface following the general pattern throws a Perl exception.
The prototypes of Libmarpa's class instance methods have an object of the appropriate class as their first ("self") argument. Zero or more other non-self arguments follow this first time class argument. In the corresponding thin Marpa method, if it follows the general pattern, the arguments to the Perl method closure are the arguments of the C function in the same order, converted to Perl variables as described next. (Here I am following the convention in perlobj of considering the "self" object to be a Perl method's first argument.)
In the general pattern, every return value or argument whose type is one of Libmarpa's time classes is converted to the corresponding Marpa thin interface class. Return values and arguments of Libmarpa's numbered classes (Marpa_Rule_ID and Marpa_Symbol_ID) are converted to Perl scalar integers. C language ints are also converted to Perl scalar integers.
Note that the thin interface will NOT convert a Perl true to a 1 or a Perl false to a 0. The thin interface expects even those arguments which Libmarpa interprets as booleans to be numbers, as specified in the Libmarpa API. This usually means that the arguments must be either NUMERIC one or NUMERIC zero. This allows for future extensions to the Libmarpa interface that accept and interpret other numeric values.
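One way to stay within this rule is to normalize Perl truth values before they reach a thin method. The sketch below is plain Perl; the helper numeric_flag() is hypothetical and is not provided by Marpa::R2.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Marpa::R2: collapse any Perl value
# to the numeric 1 or 0 that the thin interface expects for booleans.
sub numeric_flag {
    my ($perl_boolean) = @_;
    return $perl_boolean ? 1 : 0;
}

# The "false" produced by a Perl comparison is an empty string when
# used as a string, not a numeric zero.
my $perl_false = ( 1 == 2 );
printf "raw: '%s', normalized: %d\n", $perl_false,
    numeric_flag($perl_false);

# Strings such as 'yes' are Perl true values, but are not numeric one.
printf "normalized 'yes': %d\n", numeric_flag('yes');
```

The normalized value, rather than the raw Perl boolean, is what would be passed wherever the Libmarpa API document specifies a boolean int argument.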
Here is an example of a Libmarpa function whose corresponding Marpa thin method follows the general pattern.
marpa_g_start_symbol_set (grammar, symbol_S);
and here is the corresponding thin Marpa call:
$grammar->start_symbol_set($symbol_S);
Error methods
The thin interface to Libmarpa provides error methods that are more appropriate to the Perl environment than Libmarpa's own.
my @error_names = Marpa::R2::Thin::error_names();
my $error_code = $grammar->error_code();
my $error_name = $error_names[$error_code];
my $error_description = $grammar->error();
The error_code() method returns the most recent error code, which is an integer. The error_names() static method returns an array of error "names": these are the error code macros, as listed in the Libmarpa documentation. The error code can be used as an index into the array of error names. The programmer can expect error codes and error names to be kept stable.
The error() method returns the "error description", a string that provides a fuller description of the latest error than does the error name. Error descriptions are subject to change. Because error descriptions can be kept up to date, they may more accurately reflect the nature of the error than the error name.
throw_set()
Configuration methods and variables
Libmarpa's configuration objects and methods are omitted from the Marpa thin interface. Their functions are handled by the thin interface internally. The Marpa thin interface has a configuration variable:
$Marpa::R2::Thin::THROW
If a Perl true, the Marpa thin interface throws failures as exceptions. If a Perl false, the Marpa thin interface methods return failure, as described for each method. Defaults to true.
Each grammar has its own "throw" flag. This variable controls only the initial setting of that flag. A grammar's "throw" flag can be reset using the $g->throw_set() method, after which its setting is independent of the $Marpa::R2::Thin::THROW variable.
For the $r->alternative() method, the "Ruby Slippers" flag also affects which issues are thrown as exceptions. See its description for details.
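As a minimal sketch of how the "throw" flag and the error methods fit together, the code below turns off throwing for one grammar and then inspects the error left by a failed call. It assumes that Marpa::R2 is installed in a version matching this document, and that precomputing a grammar with no rules fails, as the Libmarpa API document describes. The value returned by the failed call is deliberately not relied on.

```perl
use strict;
use warnings;
use Marpa::R2;

my $grammar = Marpa::R2::Thin::G->new();

# Ask this grammar to return failures instead of throwing exceptions.
$grammar->throw_set(0);

# Assumption: a grammar without rules cannot be precomputed, so this
# call is expected to fail. Its return value is not examined here.
$grammar->precompute();

# The error state is read back through the thin interface's own methods.
my @error_names       = Marpa::R2::Thin::error_names();
my $error_code        = $grammar->error_code();
my $error_name        = $error_names[$error_code];
my $error_description = $grammar->error();

printf "code=%d name=%s description=%s\n",
    $error_code, $error_name, $error_description;
```

Because the error name is expected to be stable and the description is not, a program that branches on the kind of error would compare $error_name, and use $error_description only for messages.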
Grammar methods
Constructor
my $grammar = Marpa::R2::Thin::G->new();
There are no arguments to the Marpa thin interface's grammar constructor. A failure occurs if there is a version mismatch, which should not happen -- it indicates a problem with the way that the library was built. On success, its return value is a thin interface grammar object. On failure in scalar context, its return value is a Perl undef. On failure in array context, its return value is a 2-element array whose first element is a Perl undef and whose second element is the error code.
event()
my ( $event_type, $value ) = $grammar->event( $event_ix++ );
The event() method returns a two-element array on success. The first element is a string naming the event type, and the second is a scalar representing its value. The string for an event type is its macro name, as given in the Libmarpa API document.
The value of the event, when defined for an event type, is always, as of this writing, a Perl scalar number. The number is either a symbol ID or a count, as described in the Libmarpa API document.
The permissible range of event indexes can be found with the Marpa thin interface's event_count() grammar method, which corresponds to Libmarpa's marpa_g_event_count() method. The thin interface's event_count() method follows the general pattern.
Since event() returns the event value whenever it exists, the Libmarpa marpa_g_event_value() method is unneeded. The Libmarpa marpa_g_event_value() method has no corresponding Marpa thin interface method.
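The following sketch shows one way to walk all pending events using event_count() and event(). It assumes that Marpa::R2 is installed in a version matching this document. The two-symbol grammar is made up for the illustration and may well produce no events, in which case the loop body never runs.

```perl
use strict;
use warnings;
use Marpa::R2;

# A tiny illustrative grammar: S ::= a
my $grammar  = Marpa::R2::Thin::G->new();
my $symbol_S = $grammar->symbol_new();
my $symbol_a = $grammar->symbol_new();
$grammar->start_symbol_set($symbol_S);
$grammar->rule_new( $symbol_S, [$symbol_a] );
$grammar->precompute();

# Valid event indexes run from 0 to one less than the event count.
my $event_count = $grammar->event_count();
for my $event_ix ( 0 .. $event_count - 1 ) {
    my ( $event_type, $value ) = $grammar->event($event_ix);

    # Not every event type defines a value.
    printf "event %d: %s (%s)\n", $event_ix, $event_type,
        defined $value ? $value : 'no value';
}
```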
rule_new()
my $start_rule_id = $grammar->rule_new( $symbol_S, [$symbol_E] );
The rule_new() grammar method is the Marpa thin interface method corresponding to the marpa_g_rule_new() method. It takes two arguments, both required. The first argument is a symbol ID representing the rule's LHS, and the second argument is a reference to an array of symbol IDs. The symbol IDs in the array represent the RHS. On success, the return value is the ID of the new rule.
sequence_new()
my $sequence_rule_id = $grammar->sequence_new(
    $symbol_S,
    $symbol_a,
    {   separator => $symbol_sep,
        proper    => 0,
        min       => 1
    }
);
The sequence_new() grammar method is the Marpa thin interface method corresponding to the marpa_g_sequence_new() method. It takes three arguments, all required. The first argument is a symbol ID representing the sequence's LHS. The second argument is a symbol ID representing the sequence's RHS. The third argument is a reference to a hash of named arguments.
The hash of named arguments may be empty. If it is not empty, each of its keys, and the corresponding value, must be one of the following:
separator - The value of the separator named argument will be treated as an integer, and passed as the separator ID argument to the marpa_g_sequence_new() method. It defaults to -1.
proper - If the value of the proper named argument is a Perl true value, the MARPA_PROPER_SEPARATION flag will be set in the flags passed to the marpa_g_sequence_new() method. Otherwise, the MARPA_PROPER_SEPARATION flag will not be set.
min - The value of the min named argument will be treated as an integer, and passed as the min argument to the marpa_g_sequence_new() method. It defaults to 0.
On success, the return value is the ID of the new sequence.
precompute()
$grammar->precompute();
The precompute() method follows the general pattern. In addition to errors, precompute() also reports events. Events are queried using the grammar's event() method.
On success, precompute() returns an event count. But, even when there is an error, precompute() often reports one or more events. It is not safe to assume that no events occurred unless precompute() succeeds and reports an event count of zero.
Omitted methods
The marpa_g_error() method is omitted because it is replaced by the error methods offered in the Marpa thin interface. The marpa_g_ref() and marpa_g_unref() methods are omitted because the Marpa thin interface performs their function. The marpa_g_event_value() method is omitted because its function is performed by the thin interface's event() grammar method.
General pattern methods
All methods that are part of the Libmarpa external interface, but that are not mentioned explicitly in this section, are implemented following the general pattern, as described above.
Recognizer methods
Marpa::R2::Thin::R->new()
my $recce = Marpa::R2::Thin::R->new($grammar);
The new() method takes a Marpa thin grammar object as its one argument. On success, it returns a Marpa thin recognizer object. On an unthrown failure, it returns undef.
ruby_slippers_set()
Omitted methods
The marpa_r_ref() and marpa_r_unref() methods are omitted because the Marpa thin interface performs their function.
General pattern methods
All methods that are part of the Libmarpa external interface, but that are not mentioned explicitly in this section, are implemented following the general pattern, as described above.
Example
my $grammar = Marpa::R2::Thin::G->new();
my $symbol_S = $grammar->symbol_new();
my $symbol_E = $grammar->symbol_new();
$grammar->start_symbol_set($symbol_S);
my $symbol_op = $grammar->symbol_new();
my $symbol_number = $grammar->symbol_new();
my $start_rule_id = $grammar->rule_new( $symbol_S, [$symbol_E] );
my $op_rule_id =
    $grammar->rule_new( $symbol_E, [ $symbol_E, $symbol_op, $symbol_E ] );
my $number_rule_id = $grammar->rule_new( $symbol_E, [$symbol_number] );
$grammar->precompute();
my $recce = Marpa::R2::Thin::R->new($grammar);
$recce->start_input();
# The numbers from 1 to 3 are themselves --
# that is, they index their own token value.
# Important: zero cannot be itself!
my @token_values = ( 0 .. 3 );
my $zero = -1 + push @token_values, 0;
my $minus_token_value = -1 + push @token_values, q{-};
my $plus_token_value = -1 + push @token_values, q{+};
my $multiply_token_value = -1 + push @token_values, q{*};
$recce->alternative( $symbol_number, 2, 1 );
$recce->earleme_complete();
$recce->alternative( $symbol_op, $minus_token_value, 1 );
$recce->earleme_complete();
$recce->alternative( $symbol_number, $zero, 1 );
$recce->earleme_complete();
$recce->alternative( $symbol_op, $multiply_token_value, 1 );
$recce->earleme_complete();
$recce->alternative( $symbol_number, 3, 1 );
$recce->earleme_complete();
$recce->alternative( $symbol_op, $plus_token_value, 1 );
$recce->earleme_complete();
$recce->alternative( $symbol_number, 1, 1 );
$recce->earleme_complete();
my $latest_earley_set_ID = $recce->latest_earley_set();
my $bocage = Marpa::R2::Thin::B->new( $recce, $latest_earley_set_ID );
my $order = Marpa::R2::Thin::O->new($bocage);
my $tree = Marpa::R2::Thin::T->new($order);
my @actual_values = ();
while ( $tree->next() ) {
    my $valuator = Marpa::R2::Thin::V->new($tree);
    $valuator->rule_is_valued_set( $op_rule_id, 1 );
    $valuator->rule_is_valued_set( $start_rule_id, 1 );
    $valuator->rule_is_valued_set( $number_rule_id, 1 );
    my @stack = ();
    STEP: while ( my ( $type, @step_data ) = $valuator->step() ) {
        last STEP if not defined $type;
        if ( $type eq 'MARPA_STEP_TOKEN' ) {
            my ( undef, $token_value_ix, $arg_n ) = @step_data;
            $stack[$arg_n] = $token_values[$token_value_ix];
            next STEP;
        }
        if ( $type eq 'MARPA_STEP_RULE' ) {
            my ( $rule_id, $arg_0, $arg_n ) = @step_data;
            if ( $rule_id == $start_rule_id ) {
                my ( $string, $value ) = @{ $stack[$arg_n] };
                $stack[$arg_0] = "$string == $value";
                next STEP;
            }
            if ( $rule_id == $number_rule_id ) {
                my $number = $stack[$arg_0];
                $stack[$arg_0] = [ $number, $number ];
                next STEP;
            }
            if ( $rule_id == $op_rule_id ) {
                my $op = $stack[ $arg_0 + 1 ];
                my ( $right_string, $right_value ) = @{ $stack[$arg_n] };
                my ( $left_string, $left_value ) = @{ $stack[$arg_0] };
                my $value;
                my $text = '(' . $left_string . $op . $right_string . ')';
                if ( $op eq q{+} ) {
                    $stack[$arg_0] = [ $text, $left_value + $right_value ];
                    next STEP;
                }
                if ( $op eq q{-} ) {
                    $stack[$arg_0] = [ $text, $left_value - $right_value ];
                    next STEP;
                }
                if ( $op eq q{*} ) {
                    $stack[$arg_0] = [ $text, $left_value * $right_value ];
                    next STEP;
                }
                die "Unknown op: $op";
            } ## end if ( $rule_id == $op_rule_id )
            die "Unknown rule $rule_id";
        } ## end if ( $type eq 'MARPA_STEP_RULE' )
        die "Unexpected step type: $type";
    } ## end while ( my ( $type, @step_data ) = $valuator->step() )
    push @actual_values, $stack[0];
} ## end while ( $tree->next() )
Copyright and License
Copyright 2012 Jeffrey Kegler
This file is part of Marpa::R2. Marpa::R2 is free software: you can
redistribute it and/or modify it under the terms of the GNU Lesser
General Public License as published by the Free Software Foundation,
either version 3 of the License, or (at your option) any later version.
Marpa::R2 is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser
General Public License along with Marpa::R2. If not, see
http://www.gnu.org/licenses/.