NAME

Parse::Marpa::Doc::Diagnostics - Diagnostics

DESCRIPTION

This document describes techniques for debugging Marpa parses and grammars, along with the Marpa methods and options whose main use is tracing and debugging.

Basic Debugging Techniques

First look at the place where the parse was exhausted. That, along with inspection of the input and the grammar, is often enough to spot the problem. Typically you'll have tried that already before consulting this document. You probably also already have Marpa's warnings option turned on (it's on by default), but if not, you probably should.

When a problem is not obvious, the first thing I do is turn on the trace_lex option. This tells you which tokens the lexer is looking for and which ones it thinks it found. If the problem is in lexing, trace_lex tells you the whole story. Even if the problem is in the grammar, which tokens the lexer is looking for is a clue to what the recognizer is doing. That is because Marpa uses predictive lexing and only looks for tokens that will result in a successful parse according to the grammar.
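As a minimal sketch of this step, the recognizer can be created with trace_lex set. The sketch assumes a grammar object built as in the examples later in this document. The subroutine name trace_lexing is hypothetical, and the arrow form of new is used in place of the indirect-object form that the other examples use.

```perl
use strict;
use warnings;

# Hypothetical helper: run the recognizer over a text with lexer tracing on.
# $grammar is assumed to be a Parse::Marpa::Grammar object; $text_ref is a
# reference to the string to parse.
sub trace_lexing {
    my ( $grammar, $text_ref ) = @_;
    require Parse::Marpa;    # loaded lazily, only when the helper is called

    # trace_lex is passed as a named argument alongside the grammar.
    my $recce = Parse::Marpa::Recognizer->new(
        { grammar => $grammar, trace_lex => 1 } );

    # Trace output goes to the trace file handle as the text is read.
    # As in the show_location example, a return value >= 0 is taken to be
    # the location where the parse failed.
    return $recce->text($text_ref);
}
```

Because the option is set when the recognizer is created, the trace covers the input from its first token.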

It sometimes helps to look carefully at the output of show_rules and show_symbols, to check if anything there is clearly not right or not what you expected.

If you're getting far enough to be able to return parse values from the Parse::Marpa::Evaluator::next method, but something is still wrong, it can help to run the Parse::Marpa::Evaluator::show method after the call to next. The show method returns the parse derivation.

Advanced Techniques

Next, depending on where in the process you're having problems, you might want to turn on some of the more helpful traces. trace_actions will show you the actions as they are being finalized. In an ambiguous parse, trace_evaluation_choices shows the choices Marpa is making. trace_iteration_changes and trace_values trace the initialization of, and changes in, node values.

The above should be enough to enable you to spot any problem in writing a grammar. But if you are interested in doing a complete investigation of a parse, do the following:

  • Run show_symbols on the precomputed grammar.

  • Run show_rules on the precomputed grammar.

  • Run show_SDFA on the precomputed grammar.

  • Turn on trace_lex before input.

  • Run show_status on the recognizer.

  • Run show on the evaluator.

Note that when the input text to the grammar is of any length, the output from show_status and trace_lex can be large. You'll want to work with short inputs if at all possible. The internals document has example outputs from the show_SDFA and show_status methods, and explains how to read them.
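The steps of a complete investigation can be strung together roughly as follows. This is a sketch, not a definitive recipe: the subroutine name investigate_parse is hypothetical, the arrow form of new stands in for the indirect-object form used elsewhere in this document, and the call to next is assumed to return a defined value when the evaluator has a parse.

```perl
use strict;
use warnings;

# Hypothetical helper: print each diagnostic from the list above for one
# grammar and one (preferably short) input. Both arguments are references
# to strings.
sub investigate_parse {
    my ( $grammar_source_ref, $text_ref ) = @_;
    require Parse::Marpa;    # loaded lazily, only when the helper is called

    my $grammar =
        Parse::Marpa::Grammar->new( { mdl_source => $grammar_source_ref } );
    $grammar->precompute();
    print $grammar->show_symbols();
    print $grammar->show_rules();
    print $grammar->show_SDFA();

    # trace_lex has to be on before any input is read.
    my $recce = Parse::Marpa::Recognizer->new(
        { grammar => $grammar, trace_lex => 1 } );
    my $fail_location = $recce->text($text_ref);
    print $recce->show_status();
    return if $fail_location >= 0;    # parse failed, nothing to evaluate

    my $evaler = Parse::Marpa::Evaluator->new($recce);
    return if not $evaler;

    # show needs a valued evaluator, so next is called first.
    # Assumption: next returns a defined value on success.
    print $evaler->show() if defined $evaler->next();
    return 1;
}
```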

OPTIONS

These are Marpa options. Unless otherwise stated, these Marpa options are valid for all methods which accept Marpa options as named arguments (Parse::Marpa::mdl, Parse::Marpa::Grammar::new, Parse::Marpa::Grammar::set, and Parse::Marpa::Recognizer::new), and are useful at any point in the parse. Trace output goes to the trace file handle.

academic

The academic option turns off all grammar rewriting. The resulting grammar is useless for recognition and parsing. The purpose of the academic argument is to allow testing that Marpa's precomputations can accurately duplicate examples from textbooks. This is handy for testing the internals. An exception is thrown if the user attempts to create a recognizer from a grammar marked academic. The academic option cannot be set in the recognizer or after the grammar is precomputed.

trace_actions

Traces actions as they are compiled. Little or no knowledge of Marpa internals required. This option is useless once the recognizer has been created. Setting it after that point will result in a warning.

trace_evaluation_choices

This option traces the non-trivial choices Marpa makes among rules, among links, or among tokens. A choice is non-trivial when there is more than one alternative. Non-trivial choices only occur if the grammar is ambiguous. Knowledge of Marpa internals probably needed. May usefully be set at any point in the parse.

trace_iteration_changes

Traces setting of, and changes in, node values during parse evaluation. Knowledge of Marpa internals very useful. May usefully be set at any point in the parse.

trace_iteration_searches

Traces Marpa's search through the Earley items during parse evaluation. Requires knowledge of Marpa internals. May usefully be set at any point in the parse, but probably only useful in combination with trace_iteration_changes. trace_iterations turns on both.

trace_iterations

A shorthand for setting both trace_iteration_changes and trace_iteration_searches. May usefully be set at any point in the parse.

trace_lex

A shorthand for setting both trace_lex_matches and trace_lex_tries. Very useful, and can be interpreted with limited knowledge of Marpa internals. Because Marpa uses predictive lexing, this can give you an idea not only of how lexing is working, but also of what the recognizer is looking for. May be set at any point in the parse, but will be useless if set after input is complete.

trace_lex_matches

Traces every successful match in lexing. Can be interpreted with little knowledge of Marpa internals. May be set at any point in the parse, but will be useless if set after input is complete.

trace_lex_tries

Traces every attempted match in lexing. Can be interpreted with little knowledge of Marpa internals. Usually not useful without trace_lex_matches. trace_lex turns on both. May be set at any point in the parse, but will be useless if set after input is complete.

trace_priorities

Traces the priority setting of each SDFA state. Requires knowledge of Marpa internals. These priorities are set when the recognizer is created. A trace message warns the user if trace_priorities is set after that point.

trace_rules

Traces rules as they are added to the grammar. Useful, but you may prefer the show_rules() method. Doesn't require knowledge of Marpa internals.

A trace message warns the user if this option is set when rules have already been added. If a user adds rules using the source named argument, and uses the trace_rules named argument in the same call, it will take effect after the processing of the source option, which is probably not what was intended. To be effective, trace_rules must be set in a method call prior to the one with the source option.
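The ordering requirement can be sketched as two separate calls. The subroutine name grammar_with_rule_trace is hypothetical. The sketch assumes that a grammar may be created without a source and given one later through set, and it uses the mdl_source named argument from the examples in this document as the source option.

```perl
use strict;
use warnings;

# Hypothetical helper: build a grammar so that every rule is traced as it
# is added. $grammar_source_ref is a reference to the grammar source string.
sub grammar_with_rule_trace {
    my ($grammar_source_ref) = @_;
    require Parse::Marpa;    # loaded lazily, only when the helper is called

    # First call: turn the trace on while the grammar has no rules.
    my $grammar = Parse::Marpa::Grammar->new( { trace_rules => 1 } );

    # Second call: supply the source. The trace is already in effect, so
    # the rules are reported as they are added.
    $grammar->set( { mdl_source => $grammar_source_ref } );

    return $grammar;
}
```

Passing both named arguments in a single call would add the rules first and enable the trace only afterwards.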

trace_values

As each node value is set, prints a trace of the rule and the value. Very helpful and does not require knowledge of Marpa internals. May usefully be set at any point.

METHODS

Static Method

show_location

my $recce = new Parse::Marpa::Recognizer( { grammar => $grammar } );
if ( ( my $fail_location = $recce->text( \$text_to_parse ) ) >= 0 ) {
    print STDERR Parse::Marpa::show_location( "Parse failed",
        \$text_to_parse, $fail_location );
    exit 1;
}

A utility routine helpful in formatting messages about problems parsing user supplied text. Takes three arguments, all required. The first argument, message, must be a string. The second, text, must be a reference to a string. The third, offset, must be an integer, and will be interpreted as a character offset within that string.

show_location returns a multi-line string with a header line containing the message, followed by the line from text which contains offset, followed by a line using the ASCII "caret" symbol to point to the exact offset.
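To make the layout of the returned string concrete, here is a stand-alone sketch that builds a string of the same shape. It is a hypothetical stand-in written for illustration, not Marpa's own code, and the name sketch_show_location is invented. It does not expand tabs, so the caret lines up only when the line contains none before the offset.

```perl
use strict;
use warnings;

# Hypothetical stand-in for show_location: a header line, the line of text
# containing the offset, and a caret line pointing at the offset.
sub sketch_show_location {
    my ( $message, $text_ref, $offset ) = @_;

    # The line starts one character past the last newline before $offset.
    # rindex returns -1 when there is no such newline, giving a start of 0.
    my $line_start =
        $offset > 0 ? rindex( ${$text_ref}, "\n", $offset - 1 ) + 1 : 0;

    # The line ends at the next newline, or at the end of the text.
    my $line_end = index( ${$text_ref}, "\n", $offset );
    $line_end = length ${$text_ref} if $line_end < 0;

    my $line = substr ${$text_ref}, $line_start, $line_end - $line_start;

    # Pad with spaces so the caret sits under the character at $offset.
    my $pointer = ( q{ } x ( $offset - $line_start ) ) . q{^};

    return "$message\n$line\n$pointer\n";
}

my $text = "first line\nsecond lime\n";
print sketch_show_location( 'Parse failed', \$text, 20 );
```

Offset 20 is the "m" in "lime", so the printed caret falls under that character on the second line of the text.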

Grammar Methods

inaccessible_symbols

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
for my $symbol (@{$grammar->inaccessible_symbols()}) {
   say "Inaccessible symbol: ", $symbol;
}

Returns the plumbing names of the inaccessible symbols. Not useful before the grammar is precomputed. Used for test scripts. For debugging and tracing, the warnings option is usually the most convenient way to obtain the same information.

show_NFA

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
print $grammar->show_NFA();

Returns a multi-line string listing the states of the NFA with the LR(0) items and transitions for each. Not useful before the grammar is precomputed. Not really helpful for debugging grammars and requires very deep knowledge of Marpa internals.

show_SDFA

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
print $grammar->show_SDFA();

Returns a multi-line string listing the states of the SDFA with the LR(0) items, NFA states, and transitions for each. Not useful before the grammar is precomputed. Very useful in debugging, but requires knowledge of Marpa internals.

show_accessible_symbols

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
say "Accessible symbols: ", $grammar->show_accessible_symbols();

Returns a one-line string with the plumbing names of the accessible symbols of the grammar, space-separated. Useful in test scripts. Not useful before the grammar is precomputed. Not very useful for debugging.

show_nullable_symbols

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
say "Nullable symbols: ", $grammar->show_nullable_symbols();

Returns a one-line string with the plumbing names of the nullable symbols of the grammar, space-separated. Useful in test scripts. Not useful before the grammar is precomputed. Not very useful for debugging.

show_nulling_symbols

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
say "Nulling symbols: ", $grammar->show_nulling_symbols();

Returns a one-line string with the plumbing names of the nulling symbols of the grammar, space-separated. Useful in test scripts. Not useful before the grammar is precomputed. Not very useful for debugging.

show_problems

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
print $grammar->show_problems();

Returns a string describing the problems a grammar had in the precomputation phase. For many precomputation problems, Marpa does not immediately throw an exception. This is because there are often several problems with a grammar. Throwing an exception on the first problem would force the user to fix them one at a time -- very tedious. If there were no problems, returns a string saying so.

This method is not useful before precomputation. An exception is thrown when the user attempts to compile, or to create a parse from, a grammar with problems. The string returned by show_problems will be part of the exception's error message.

show_productive_symbols

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
say "Productive symbols: ", $grammar->show_productive_symbols();

Returns a one-line string with the plumbing names of the productive symbols of the grammar, space-separated. Useful in test scripts. Not useful before the grammar is precomputed. Not very useful for debugging.

show_rules

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
print $grammar->show_rules();

Returns a string listing the rules, each commented as to whether it was nullable, nulling, unproductive, inaccessible, empty or not useful. If a rule had a non-zero priority, that is also shown. Often useful and much of the information requires no knowledge of the Marpa internals to interpret.

show_rules shows a rule as not useful ("!useful") if Marpa decides not to use it for any reason. Rules marked "!useful" include not just the ones called useless in standard parsing terminology (inaccessible and unproductive rules) but also any rule which is replaced by one of Marpa's grammar rewrites.

show_symbols

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
print $grammar->show_symbols();

Returns a string listing the symbols, along with whether they were nulling, nullable, unproductive or inaccessible. Also shown is a list of rules with that symbol on the left hand side, and a list of rules which have that symbol anywhere on the right hand side. Often useful and much of the information requires no knowledge of the Marpa internals to interpret.

unproductive_symbols

$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
for my $symbol (@{$grammar->unproductive_symbols()}) {
   say "Unproductive symbol: ", $symbol;
}

Given a precomputed grammar, returns the plumbing names of the unproductive symbols. Not useful before the grammar is precomputed. Used in test scripts. For debugging and tracing, the warnings option is usually the most convenient way to obtain the same information.

Recognizer Method

show_status

my $recce = new Parse::Marpa::Recognizer( { grammar => $grammar } );
my $fail_location = $recce->text(\$text_to_parse);
print $recce->show_status();
if ($fail_location >= 0) {
    print STDERR Parse::Marpa::show_location("Parse failed", \$text_to_parse, $fail_location);
    exit 1;
}

This is the central tool for debugging a parse using Marpa internals. Returns a multi-line string listing every Earley item in every Earley set. For each Earley item, any current successor, predecessor, effect, cause, pointer or value is shown. Also shown are lists of all the links and rules in each Earley item, indicating which link or rule is the current choice.

Evaluator Method

show

my $evaler = new Parse::Marpa::Evaluator($recce);
croak("Input not recognized by grammar") unless $evaler;
my $value = $evaler->next();    # show requires a valued evaluator
print $evaler->show();

If the evaluator has a value, the show method returns the parse derivation used to produce that value. Evaluators have values after successful calls of their next method. If the evaluator is unvalued, an exception is thrown.

Very useful. Basic use requires no Marpa internals. For those who are delving into the internals, the corresponding Earley item and SDFA state are reported with each line of the derivation.

SUPPORT

See the support section in the main module.

AUTHOR

Jeffrey Kegler

COPYRIGHT

Copyright 2007 - 2008 Jeffrey Kegler

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.