NAME
Parse::Marpa::Doc::Diagnostics - Diagnostics
DESCRIPTION
This document describes the techniques for debugging Marpa parses and grammars and describes the Marpa methods and options whose main use is tracing and debugging.
Basic Debugging Techniques
First look at the place where the parse was exhausted. That, along with inspection of the input and the grammar, is often enough to spot the problem. Typically you'll have tried that already before consulting this document. You probably also already have Marpa's warnings option turned on (it's on by default), but if not, you probably should turn it on.
When a problem is not obvious, the first thing I do is turn on the trace_lex option. This tells you which tokens the lexer is looking for and which ones it thinks it found. If the problem is in lexing, trace_lex tells you the whole story. Even if the problem is in the grammar, which tokens the lexer is looking for is a clue to what the recognizer is doing. That is because Marpa uses predictive lexing and only looks for tokens that will result in a successful parse according to the grammar.
It sometimes helps to look carefully at the output of show_rules and show_symbols, to check if anything there is clearly not right or not what you expected.
If you're getting far enough to be able to return parse values from the Parse::Marpa::Evaluator::next method, but something is still wrong, it can help to run the Parse::Marpa::Evaluator::show method after the call to next. The show method returns the parse derivation.
Advanced Techniques
Next, depending on where in the process you're having problems, you might want to turn on some of the more helpful traces. trace_actions will show you the actions as they are being finalized. In an ambiguous parse, trace_evaluation_choices shows the choices Marpa is making. trace_iteration_changes and trace_rules trace the initialization of, and changes in, node values.
The above should be enough to enable you to spot any problem in writing a grammar. But if you are interested in doing a complete investigation of a parse, do the following:
- Run show_symbols on the precomputed grammar.
- Run show_rules on the precomputed grammar.
- Run show_SDFA on the precomputed grammar.
- Turn on trace_lex before input.
- Run show_status on the recognizer.
- Run show on the evaluator.
Note that when the input text to the grammar is of any length, the output from show_status and trace_lex can be large. You'll want to work with short inputs if at all possible. The internals document has example outputs from the show_SDFA and show_status methods, and explains how to read them.
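The complete investigation described above can be strung together in one routine. What follows is a minimal sketch rather than a tested recipe. It uses only the constructors, methods, and options named in this document, and it assumes that trace_lex may be passed to the recognizer constructor in the same call as grammar, that loading Parse::Marpa makes the grammar, recognizer, and evaluator classes available, and that an unsuccessful call to next returns an undefined value. The subroutine name investigate_parse and its two arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: runs every diagnostic in the checklist above.
# Both arguments are references to strings.
sub investigate_parse {
    my ( $grammar_source_ref, $text_ref ) = @_;

    # Loaded at run time so the sketch compiles even without Marpa installed.
    # Assumption: this also loads the Grammar, Recognizer and Evaluator classes.
    require Parse::Marpa;

    my $grammar =
        Parse::Marpa::Grammar->new( { mdl_source => $grammar_source_ref } );
    $grammar->precompute();

    # Dump the precomputed grammar: symbols, rules, then SDFA states.
    print $grammar->show_symbols();
    print $grammar->show_rules();
    print $grammar->show_SDFA();

    # Turn on lexer tracing before any input is read.
    # Assumption: trace_lex can be given in the same call as grammar.
    my $recce = Parse::Marpa::Recognizer->new(
        { grammar => $grammar, trace_lex => 1 } );
    my $fail_location = $recce->text($text_ref);

    # Dump the Earley sets whether or not the input was accepted.
    print $recce->show_status();
    return if $fail_location >= 0;

    # Show the derivation, but only once the evaluator has a value.
    my $evaler = Parse::Marpa::Evaluator->new($recce);
    return if not $evaler;

    # Assumption: next() returns an undefined value when it fails.
    return if not defined $evaler->next();
    print $evaler->show();
    return 1;
}
```

Calling investigate_parse( \$grammar_source, \$text_to_parse ) with a short input keeps the combined output manageable.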
OPTIONS
These are Marpa options. Unless otherwise stated, these Marpa options are valid for all methods which accept Marpa options as named arguments (Parse::Marpa::mdl, Parse::Marpa::Grammar::new, Parse::Marpa::Grammar::set, and Parse::Marpa::Recognizer::new), and are useful at any point in the parse. Trace output goes to the trace file handle.
- academic
-
The academic option turns off all grammar rewriting. The resulting grammar is useless for recognition and parsing. The purpose of the academic option is to allow testing that Marpa's precomputations can accurately duplicate examples from textbooks. This is handy for testing the internals. An exception is thrown if the user attempts to create a recognizer from a grammar marked academic. The academic option cannot be set in the recognizer or after the grammar is precomputed.
- trace_actions
-
Traces actions as they are compiled. Little or no knowledge of Marpa internals required. This option is useless once the recognizer has been created. Setting it after that point will result in a warning.
- trace_evaluation_choices
-
This option traces the non-trivial choices Marpa makes among rules, among links, or among tokens. A choice is non-trivial when there is more than one alternative. Non-trivial choices only occur if the grammar is ambiguous. Knowledge of Marpa internals probably needed. May usefully be set at any point in the parse.
- trace_iteration_changes
-
Traces setting of, and changes in, node values during parse evaluation. Knowledge of Marpa internals very useful. May usefully be set at any point in the parse.
- trace_iteration_searches
-
Traces Marpa's search through the Earley items during parse evaluation. Requires knowledge of Marpa internals. May usefully be set at any point in the parse, but probably only useful in combination with trace_iteration_changes. trace_iterations turns on both.
- trace_iterations
-
A shorthand for setting both trace_iteration_changes and trace_iteration_searches. May usefully be set at any point in the parse.
- trace_lex
-
A shorthand for setting both trace_lex_matches and trace_lex_tries. Very useful, and can be interpreted with limited knowledge of Marpa internals. Because Marpa uses predictive lexing, this can give you an idea not only of how lexing is working, but also of what the recognizer is looking for. May be set at any point in the parse, but will be useless if set after input is complete.
- trace_lex_matches
-
Traces every successful match in lexing. Can be interpreted with little knowledge of Marpa internals. May be set at any point in the parse, but will be useless if set after input is complete.
- trace_lex_tries
-
Traces every attempted match in lexing. Can be interpreted with little knowledge of Marpa internals. Usually not useful without trace_lex_matches. trace_lex turns on both. May be set at any point in the parse, but will be useless if set after input is complete.
- trace_priorities
-
Traces the priority setting of each SDFA state. Requires knowledge of Marpa internals. These priorities are set when the recognizer is created. A trace message warns the user if trace_priorities is set after that point.
- trace_rules
-
Traces rules as they are added to the grammar. Useful, but you may prefer the show_rules() method. Doesn't require knowledge of Marpa internals. A trace message warns the user if this option is set when rules have already been added. If a user adds rules using the source named argument, and uses the trace_rules named argument in the same call, it will take effect after the processing of the source option, which is probably not what was intended. To be effective, trace_rules must be set in a method call prior to the one with the source option.
- trace_values
-
As each node value is set, prints a trace of the rule and the value. Very helpful and does not require knowledge of Marpa internals. May usefully be set at any point.
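The ordering requirement for trace_rules is easiest to see in code. The sketch below is illustrative only and rests on several assumptions: that Parse::Marpa::Grammar::new can be called without a grammar source, that Parse::Marpa::Grammar::set takes a hash reference of named arguments as new does, and that mdl_source is processed like the source argument discussed above. The subroutine name grammar_with_rule_trace is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper showing the call order that makes trace_rules effective.
sub grammar_with_rule_trace {
    my ($grammar_source_ref) = @_;

    # Loaded at run time so the sketch compiles even without Marpa installed.
    require Parse::Marpa;

    # First call: only turn the trace on. No rules have been added yet.
    # Assumption: new() may be called without a grammar source.
    my $grammar = Parse::Marpa::Grammar->new( { trace_rules => 1 } );

    # Second call: supply the source, so each rule it adds is traced.
    # Assumption: set() accepts mdl_source the same way new() does.
    $grammar->set( { mdl_source => $grammar_source_ref } );

    return $grammar;
}
```

Passing both named arguments in a single call would, according to the description above, start the trace only after the rules had already been added.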
METHODS
Static Method
show_location
my $recce = new Parse::Marpa::Recognizer( { grammar => $grammar } );
if ( ( my $fail_location = $recce->text( \$text_to_parse ) ) >= 0 ) {
    print STDERR Parse::Marpa::show_location( "Parse failed",
        \$text_to_parse, $fail_location );
    exit 1;
}
A utility routine helpful in formatting messages about problems parsing user supplied text. Takes three arguments, all required. The first, message argument, must be a string. The second, text argument, must be a reference to a string. The third, offset argument, must be an integer, and will be interpreted as a character offset within that string.
show_location returns a multi-line string with a header line containing the message, followed by the line from text which contains offset, followed by a line using the ASCII "caret" symbol to point to the exact offset.
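To make the shape of that return value concrete, here is a small stand-in written in plain Perl. It is not Marpa's implementation, and the real method's exact spacing and wording may differ. It only follows the description above: a header line, the line of text containing the offset, and a caret line. The name sketch_show_location is hypothetical.

```perl
use strict;
use warnings;

# Illustrative stand-in for Parse::Marpa::show_location; not Marpa's own code.
sub sketch_show_location {
    my ( $message, $text_ref, $offset ) = @_;

    # Start of the line holding $offset: one past the previous newline, or 0.
    my $line_start =
        $offset > 0 ? rindex( ${$text_ref}, "\n", $offset - 1 ) + 1 : 0;

    # End of that line: the next newline, or the end of the text.
    my $line_end = index( ${$text_ref}, "\n", $offset );
    $line_end = length ${$text_ref} if $line_end < 0;

    my $line = substr ${$text_ref}, $line_start, $line_end - $line_start;

    # Pad with spaces so the caret sits under the character at $offset.
    my $pointer = ( q{ } x ( $offset - $line_start ) ) . q{^};

    return join( "\n", $message, $line, $pointer ) . "\n";
}

my $text = "abc\ndef ghi\njkl";
print sketch_show_location( 'Parse failed', \$text, 8 );
```

For this input the sketch prints the header "Parse failed", then the line "def ghi", then a caret under the "g". Tabs in the offending line would throw the caret off in this simplified version.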
Grammar Methods
inaccessible_symbols
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
for my $symbol (@{$grammar->inaccessible_symbols()}) {
    say "Inaccessible symbol: ", $symbol;
}
Returns the plumbing names of the inaccessible symbols. Not useful before the grammar is precomputed. Used for test scripts. For debugging and tracing, the warnings option is usually the most convenient way to obtain the same information.
show_NFA
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
print $grammar->show_NFA();
Returns a multi-line string listing the states of the NFA with the LR(0) items and transitions for each. Not useful before the grammar is precomputed. Not really helpful for debugging grammars and requires very deep knowledge of Marpa internals.
show_SDFA
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
print $grammar->show_SDFA();
Returns a multi-line string listing the states of the SDFA with the LR(0) items, NFA states, and transitions for each. Not useful before the grammar is precomputed. Very useful in debugging, but requires knowledge of Marpa internals.
show_accessible_symbols
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
say "Accessible symbols: ", $grammar->show_accessible_symbols();
Returns a one-line string with the plumbing names of the accessible symbols of the grammar, space-separated. Useful in test scripts. Not useful before the grammar is precomputed. Not very useful for debugging.
show_nullable_symbols
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
say "Nullable symbols: ", $grammar->show_nullable_symbols();
Returns a one-line string with the plumbing names of the nullable symbols of the grammar, space-separated. Useful in test scripts. Not useful before the grammar is precomputed. Not very useful for debugging.
show_nulling_symbols
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
say "Nulling symbols: ", $grammar->show_nulling_symbols();
Returns a one-line string with the plumbing names of the nulling symbols of the grammar, space-separated. Useful in test scripts. Not useful before the grammar is precomputed. Not very useful for debugging.
show_problems
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
print $grammar->show_problems();
Returns a string describing the problems a grammar had in the precomputation phase. For many precomputation problems, Marpa does not immediately throw an exception. This is because there are often several problems with a grammar. Throwing an exception on the first problem would force the user to fix them one at a time -- very tedious. If there were no problems, returns a string saying so.
This method is not useful before precomputation. An exception is thrown when the user attempts to compile, or to create a parse from, a grammar with problems. The string returned by show_problems will be part of the exception's error message.
show_productive_symbols
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
say "Productive symbols: ", $grammar->show_productive_symbols();
Returns a one-line string with the plumbing names of the productive symbols of the grammar, space-separated. Useful in test scripts. Not useful before the grammar is precomputed. Not very useful for debugging.
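Because show_accessible_symbols, show_nullable_symbols, show_nulling_symbols, and show_productive_symbols all return a single space-separated string, a test script can compare that string with an expected list. The helper below is one way to do it. The name same_symbols is hypothetical, and the sketch assumes that symbol names contain no whitespace and that the order of names in the string is not significant.

```perl
use strict;
use warnings;

# Hypothetical test helper: true if the space-separated symbol string
# names exactly the expected symbols, in any order.
sub same_symbols {
    my ( $shown, @expected ) = @_;

    # split ' ' splits on runs of whitespace and skips leading whitespace.
    my @got  = sort split ' ', $shown;
    my @want = sort @expected;

    # Names hold no whitespace, so joining with spaces is unambiguous.
    return "@got" eq "@want";
}

# In a test script this might be used as follows (symbol names invented):
#   ok( same_symbols( $grammar->show_nullable_symbols(), qw(a b) ) );
print same_symbols( 'b a c', qw(a b c) ) ? "match\n" : "mismatch\n";
```

Sorting both sides keeps the comparison independent of the order in which Marpa happens to list the symbols.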
show_rules
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
print $grammar->show_rules();
Returns a string listing the rules, each commented as to whether it was nullable, nulling, unproductive, inaccessible, empty or not useful. If a rule had a non-zero priority, that is also shown. Often useful and much of the information requires no knowledge of the Marpa internals to interpret.
show_rules shows a rule as not useful ("!useful") if Marpa decides not to use it for any reason. Rules marked "!useful" include not just the ones called useless in standard parsing terminology (inaccessible and unproductive rules) but also any rule which is replaced by one of Marpa's grammar rewrites.
show_symbols
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
print $grammar->show_symbols();
Returns a string listing the symbols, along with whether they were nulling, nullable, unproductive or inaccessible. Also shown is a list of rules with that symbol on the left hand side, and a list of rules which have that symbol anywhere on the right hand side. Often useful and much of the information requires no knowledge of the Marpa internals to interpret.
unproductive_symbols
$grammar = new Parse::Marpa::Grammar( { mdl_source => \$grammar_source } );
$grammar->precompute();
my $unproductive_symbols = $grammar->unproductive_symbols();
if ($unproductive_symbols) {
    for my $symbol (@{$unproductive_symbols}) {
        say "Unproductive symbol: ", $symbol;
    }
}
Given a precomputed grammar, returns the plumbing names of the unproductive symbols. Not useful before the grammar is precomputed. Used in test scripts. For debugging and tracing, the warnings option is usually the most convenient way to obtain the same information.
Recognizer Method
show_status
my $recce = new Parse::Marpa::Recognizer( { grammar => $grammar } );
my $fail_location = $recce->text(\$text_to_parse);
print $recce->show_status();
if ($fail_location >= 0) {
    print STDERR Parse::Marpa::show_location("Parse failed", \$text_to_parse, $fail_location);
    exit 1;
}
This is the central tool for debugging a parse using Marpa internals. Returns a multi-line string listing every Earley item in every Earley set. For each Earley item, any current successor, predecessor, effect, cause, pointer or value is shown. Also shown are lists of all the links and rules in each Earley item, indicating which link or rule is the current choice.
Evaluator Method
show
my $evaler = new Parse::Marpa::Evaluator($recce);
croak("Input not recognized by grammar") unless $evaler;
my $value = $evaler->next();    # show() requires an evaluator that has a value
print $evaler->show();
If the evaluator has a value, the show method returns the parse derivation used to produce that value. Evaluators have values after successful calls of their next method. If the evaluator is unvalued, an exception is thrown.
Very useful. Basic use requires no Marpa internals. For those who are delving into the internals, the corresponding Earley item and SDFA state are reported with each line of the derivation.
SUPPORT
See the support section in the main module.
AUTHOR
Jeffrey Kegler
COPYRIGHT
Copyright 2007 - 2008 Jeffrey Kegler
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.