SYNOPSIS

Quoted from the Bison manual:

This grammar introduces a problem that arises in the declaration of enumerated and subrange types in Pascal:

type subrange = lo .. hi;
type enum = (a, b, c);

The original language standard allows only numeric literals and constant identifiers for the subrange bounds ('lo' and 'hi'), but Extended Pascal and many other Pascal implementations allow arbitrary expressions there. This gives rise to the following situation, containing a superfluous pair of parentheses:

type subrange = (a) .. b;

Compare this to the following declaration of an enumerated type with only one value:

type enum = (a);

These two declarations look identical until the .. token. With normal LALR(1) one-token look-ahead it is not possible to decide between the two forms when the identifier a is parsed. It is, however, desirable for a parser to decide this, since in the latter case a must become a new identifier to represent the enumeration value, while in the former case a must be evaluated with its current meaning, which may be a constant or even a function call.
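The look-ahead problem can be made concrete by comparing the token streams of the two declarations. The following is only a rough sketch: the tokenizer, the token kind names (TYPE, ID) and the helper functions are hypothetical and are not taken from the Bison manual or from any real Pascal front end.

```python
import re

# Hypothetical token pattern: '..', identifiers, and single-character punctuation.
TOKEN_RE = re.compile(r"\s*(\.\.|[A-Za-z_][A-Za-z0-9_]*|[=();,])")


def token_kinds(text):
    """Return the token kinds a parser would see, not the spellings."""
    kinds = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        lexeme = m.group(1)
        if lexeme == "type":
            kinds.append("TYPE")
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            kinds.append("ID")
        else:
            kinds.append(lexeme)  # punctuation stands for itself
        pos = m.end()
    return kinds


def first_difference(a, b):
    """Index of the first position where the two token lists differ."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return min(len(a), len(b))


subrange = token_kinds("type subrange = (a) .. b;")
enum = token_kinds("type enum = (a);")

split = first_difference(subrange, enum)
id_pos = 4  # index of the identifier 'a' in both streams

print(subrange[:split])              # shared prefix of the two declarations
print(subrange[split], enum[split])  # the tokens that finally differ
print(split - id_pos)                # distance from 'a' to the deciding token
```

When the parser has to decide what to do with the identifier, its single look-ahead token is the closing parenthesis in both streams. The token that tells the two forms apart sits one position further on, which is out of reach of one-token look-ahead.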

You could parse (a) as an 'unspecified identifier in parentheses', to be resolved later, but this typically requires substantial contortions in both semantic actions and large parts of the grammar, where the parentheses are nested in the recursive rules for expressions.

You might think of using the lexer to distinguish between the two forms by returning different tokens for currently defined and undefined identifiers. But if these declarations occur in a local scope, and ‘a’ is defined in an outer scope, then both forms are possible—either locally redefining ‘a’, or using the value of ‘a’ from the outer scope. So this approach cannot work.
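The scoping argument can be illustrated with a small symbol-table sketch. Everything here is an assumption made for illustration: the Scope class, the classify function and the token names DEFINED_ID and UNDEFINED_ID are hypothetical and do not come from the Bison manual.

```python
class Scope:
    """A minimal nested symbol table (hypothetical, for illustration only)."""

    def __init__(self, parent=None):
        self.names = {}
        self.parent = parent

    def declare(self, name, meaning):
        self.names[name] = meaning

    def lookup(self, name):
        # Search this scope first, then the enclosing scopes.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        return None


def classify(name, scope):
    """What a symbol-table-aware lexer could report for an identifier."""
    return "DEFINED_ID" if scope.lookup(name) is not None else "UNDEFINED_ID"


outer = Scope()
outer.declare("a", "constant 3")
local = Scope(parent=outer)

# The lexer sees 'a' before the parser knows which declaration this is,
# so it gives the same answer for both forms.
kind = classify("a", local)
print(kind)

# Reading 1: type subrange = (a) .. b;  uses the outer meaning of 'a'.
subrange_bound = local.lookup("a")
print(subrange_bound)

# Reading 2: type enum = (a);  declares a new local 'a' that shadows the outer one.
enum_scope = Scope(parent=outer)
enum_scope.declare("a", "enum value 0")
print(enum_scope.lookup("a"))
```

In both readings the lexer reports the same token kind for the identifier, because the outer definition is visible either way. The token kind therefore carries no information about which form is being parsed.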

SEE ALSO
