NAME

Marpa::R2::Semantics::Null - How Marpa evaluates null rules and symbols

OVERVIEW

In Marpa parses, rules and symbols can be nulled -- in other words, they can derive the zero-length, or null, string. Which symbols can be nulled, and which actually are nulled, depends on the grammar and the input. When a symbol or rule is not nulled, it is said to be visible.

Even the start symbol can be nulled, in which case the entire parse derives the null string. A parse in which the start symbol is nulled is called a null parse.

When evaluating a parse, nulled rules and symbols are assigned values as described in the semantics document. This document provides additional detail on the assignment of values to nulled symbols.

DESCRIPTION

Null values come from rules

All null values for symbols come from rules with that symbol on their LHS. For a symbol to be nulled, it must be on the LHS of at least one nullable rule. The action of one of these nullable rules will be the action for the nulled symbol.

If the action is a constant, then that constant is the value of the nulled symbol. If the action is a rule evaluation closure, then that closure is called with no child arguments, and the closure's result is the value of the nulled symbol.
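The two cases can be sketched in plain Perl. This is only an illustration, not Marpa's internal code: the helper name null_value is hypothetical, and passing a per-parse variable as the closure's first argument is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Marpa's API: $action is either a plain
# constant or a code ref standing in for a rule evaluation closure.
sub null_value {
    my ( $action, $per_parse ) = @_;

    # A closure is called with no child arguments; its result is the value.
    return $action->($per_parse) if ref $action eq 'CODE';

    # A constant is the value itself.
    return $action;
}

my $constant_action = 'null A';
my $closure_action  = sub {
    my ( $per_parse, @children ) = @_;
    return 'closure saw ' . scalar(@children) . ' children';
};

print null_value( $constant_action, {} ), "\n";    # prints "null A"
print null_value( $closure_action,  {} ), "\n";    # prints "closure saw 0 children"
```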

It may be that more than one nullable rule has that symbol on its LHS, and that these rules have different action names. In that case, the action for the empty rule is the one which applies. It is a fatal error if the nullable rules for a LHS symbol have different action names, and none of them is an empty rule. A simple way to fix this problem is to create an empty rule that decides the semantics to be applied to the nulled symbol.
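The selection rule just described can be modeled in a few lines of plain Perl. This is a sketch under stated assumptions, not Marpa's implementation: the rule representation (hashes with rhs and action keys) and the helper name null_action_for are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical model: each nullable rule for one LHS symbol is a hash
# with an 'rhs' array ref and an 'action' name. Not Marpa's internals.
sub null_action_for {
    my ( $lhs, @nullable_rules ) = @_;
    die "$lhs has no nullable rule, so it cannot be nulled\n"
        if not @nullable_rules;

    # If every nullable rule names the same action, there is no conflict.
    my %seen         = map { $_->{action} => 1 } @nullable_rules;
    my @action_names = sort keys %seen;
    return $action_names[0] if @action_names == 1;

    # Otherwise the empty rule, if there is one, decides.
    my ($empty_rule) = grep { !@{ $_->{rhs} } } @nullable_rules;
    return $empty_rule->{action} if defined $empty_rule;

    die "Nullable rules for $lhs have different actions, and none is empty\n";
}

# Two nullable rules with different action names, one of them empty.
my $action = null_action_for(
    'R',
    { rhs => [qw/A B Y/], action => 'R' },
    { rhs => [],          action => 'null_R' },
);
print "$action\n";    # prints "null_R"
```

Calling null_action_for with two non-empty nullable rules that name different actions dies, which mirrors the fatal error described above.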

Null subtrees

A null subtree is a subtree all of whose symbols and rules are nulled. Marpa prunes all null subtrees back to their topmost nulled symbol.

The "lost" semantics of the non-topmost symbols and rules of null subtrees is usually not missed. Nulled subtrees cannot contain input, and therefore do not contain token symbols. So no token values are lost when nulled subtrees are pruned. As bushy as a null subtree might be, all of its symbols and rules are nulled.
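Pruning can be pictured with a small tree model in plain Perl. This is a sketch only: the node layout (a hash with name, nulled and children keys) and the helper names are hypothetical, and Marpa's internal representation is not assumed to look like this.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical tree model: each node is a hash with a 'name', a 'nulled'
# flag and an optional 'children' array ref.
sub prune_null_subtrees {
    my ($node) = @_;

    # A nulled node is the top of a null subtree: keep it, drop what is below.
    return +{ name => $node->{name}, nulled => 1 } if $node->{nulled};

    my @children =
        map { prune_null_subtrees($_) } @{ $node->{children} // [] };
    return +{ %{$node}, children => \@children };
}

# Render a tree in pre-order, indenting children below their parents.
sub show {
    my ( $node, $depth ) = @_;
    my $line = ( q{  } x $depth )
        . $node->{name}
        . ( $node->{nulled} ? ' (nulled)' : q{} );
    return join "\n", $line,
        map { show( $_, $depth + 1 ) } @{ $node->{children} // [] };
}

my $tree = {
    name     => 'S',
    nulled   => 0,
    children => [
        {   name     => 'L',
            nulled   => 0,
            children => [
                { name => 'A', nulled => 1 },
                { name => 'B', nulled => 1 },
                { name => 'X', nulled => 0 },
            ],
        },
        {   name     => 'R',
            nulled   => 1,
            children => [
                { name => 'A', nulled => 1 },
                { name => 'B', nulled => 1 },
                { name => 'Y', nulled => 1 },
            ],
        },
    ],
};

print show( prune_null_subtrees($tree), 0 ), "\n";
```

In the printed result the nulled R node has no children, while the nulled A and B under the visible L survive, since each of them is the topmost symbol of its own (one-node) null subtree.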

Since nulled symbols and rules correspond to zero-length strings, we are literally dealing here with the "semantics of nothing". In theory the semantics of nothing can be arbitrarily complex. In practice it should be possible to keep them simple. But if an application actually needs it, Marpa could implement an arbitrarily complex, and even a dynamic, "semantics of nothing", as described below.

EXAMPLE

As already stated, Marpa prunes every null subtree back to its topmost nulled symbol. Here is an example:

sub L {
    shift;
    return 'L(' . ( join q{;}, map { $_ // '[ERROR!]' } @_ ) . ')';
}

sub R {
    return 'R(): I will never be called';
}

sub S {
    shift;
    return 'S(' . ( join q{;}, map { $_ // '[ERROR!]' } @_ ) . ')';
}

sub X { return 'X(' . $_[1] . ')'; }
sub Y { return 'Y(' . $_[1] . ')'; }

our $null_A = 'null A';
our $null_B = 'null B';
our $null_L = 'null L';
our $null_R = 'null R';
our $null_X = 'null X';
our $null_Y = 'null Y';

my $grammar = Marpa::R2::Grammar->new(
    {   start   => 'S',
        actions => 'main',
        rules   => [
            [ 'S', [qw/L R/] ],
            [ 'L', [qw/A B X/] ],
            [ 'L', [], 'null_L' ],
            [ 'R', [qw/A B Y/] ],
            [ 'R', [], 'null_R' ],
            [ 'A', [], 'null_A' ],
            [ 'B', [], 'null_B' ],
            [ 'X', [], 'null_X' ],
            [ 'X', [qw/x/] ],
            [ 'Y', [], 'null_Y' ],
            [ 'Y', [qw/y/] ],
        ],
    }
);

$grammar->precompute();

my $recce = Marpa::R2::Recognizer->new( { grammar => $grammar } );

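$recce->read( 'x', 'x' );

# value() returns a reference to the parse value, or undef if there is no parse
my $value_ref = $recce->value();
my $value = $value_ref ? ${$value_ref} : 'No parse';
print "$value\n";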

If we write the unpruned parse tree in pre-order, depth-first, indenting children below their parents, we get something like this:

0: Visible Rule: S := L R
     1: Visible Rule L := A B X
         1.1: Nulled Symbol A
         1.2: Nulled Symbol B
         1.3: Token, Value is 'x'
     2: Nulled Rule, Rule R := A B Y
         2.1: Nulled Symbol A
         2.2: Nulled Symbol B
         2.3: Nulled Symbol Y

In this example, five symbols and a rule are nulled. The rule and three of the symbols are in a single subtree: 2, 2.1, 2.2 and 2.3. Marpa prunes every null subtree back to its topmost symbol, which in this case is the LHS of the rule numbered 2.

The pruned tree looks like this:

0: Visible Rule: S := L R
     1: Visible Rule L := A B X
         1.1: Nulled Symbol A
         1.2: Nulled Symbol B
         1.3: Token, Value is 'x'
     2: LHS of Nulled Rule, Symbol R

Here is the output:

S(L(null A;null B;X(x));null R)

In the output we see:

  • The null value for symbol 1.1: "null A". This comes from the empty rule for A.

  • The null value for symbol 1.2: "null B". This comes from the empty rule for B.

  • The token value for symbol 1.3: "x".

  • An application of the semantic Perl closure for the rule L := A B X.

  • The null value for rule 2: "null R". This comes from the empty rule for R.

  • An application of the semantic Perl closure for the rule S := L R.

We do not see any output for symbols 2.1 (A), 2.2 (B), or 2.3 (Y) because they were not topmost in the pruned subtree. We do not see an application of the rule evaluation closure for rule R := A B Y, because there is an empty rule for R, and that takes priority.

ADVANCED

In rare cases, your application may call for null values with a complex semantics.

Implementing a complex but constant null semantics

If an application's semantics of nothing, while complex, remains constant, you can handle it by setting every nullable symbol's null_value property to the value which your semantics produces when that nullable symbol is the root symbol of a null subtree.

Implementing a complex and dynamic null semantics

If an application's null values are not constants, Marpa can still calculate them. Here is the most general method:

  • Determine which of the application's nullable symbols have a dynamic semantics. Call these the dynamic nullables.

  • For each dynamic nullable, let the action of the empty rule with that symbol on its LHS be a constant that can be used as a hash key.

  • For every rule with a dynamic nullable on its right hand side, write the rule evaluation closure so that it looks up that hash key in a hash whose values are Perl closures. The closures found by hash lookup can then use an arbitrarily complex semantics for calculating the value of the dynamic nullable.
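The steps above can be sketched in plain Perl, without involving Marpa itself. Everything named here is hypothetical and chosen for illustration: the key 'dynamic:Comment', the %null_handler table, the resolve_child helper, and the Statement closure with its imagined rule Statement := Body Comment. The sketch also assumes that the per-parse variable arrives as the closure's first argument and carries whatever context the null value depends on.

```perl
use strict;
use warnings;

# Hypothetical dispatch table: keys are the constants produced by the
# empty rules of the dynamic nullables; values compute the real null value.
my %null_handler = (
    'dynamic:Comment' => sub {
        my ($context) = @_;
        return "no comment at line $context->{line}";
    },
);

# Resolve one child value: if it is a known hash key, call its closure.
sub resolve_child {
    my ( $child, $context ) = @_;
    return $child if !defined($child) || ref($child);
    my $handler = $null_handler{$child};
    return $handler ? $handler->($context) : $child;
}

# Stand-in for the rule evaluation closure of an imagined rule
# Statement := Body Comment, where Comment is a dynamic nullable.
sub Statement {
    my ( $per_parse, @children ) = @_;
    return join q{; }, map { resolve_child( $_, $per_parse ) } @children;
}

print Statement( { line => 7 }, 'x = 1', 'dynamic:Comment' ), "\n";
# prints "x = 1; no comment at line 7"
```

One design point to watch in a scheme like this: the hash keys must not collide with values that visible symbols can legitimately produce, so a distinctive prefix or a dedicated object is a reasonable choice for the key.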

COPYRIGHT AND LICENSE

Copyright 2012 Jeffrey Kegler
This file is part of Marpa::R2.  Marpa::R2 is free software: you can
redistribute it and/or modify it under the terms of the GNU Lesser
General Public License as published by the Free Software Foundation,
either version 3 of the License, or (at your option) any later version.

Marpa::R2 is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser
General Public License along with Marpa::R2.  If not, see
http://www.gnu.org/licenses/.