NAME

Parse::ABNF - Parse IETF Augmented BNF (ABNF) grammars.

SYNOPSIS

use Parse::ABNF;
my $parser = Parse::ABNF->new;
my $rules = $parser->parse($grammar);

DESCRIPTION

This module parses IETF ABNF (STD 68, RFC 5234, 4234, 2234) grammars into a list of rules. Grammar constructs are mapped to hash references as follows:

A  = B ~ { class => 'Rule',       value => B, name => A               }
A =/ B ~ { class => 'Rule',       value => B, ... combine => 'choice' }
A / B  ~ { class => 'Choice',     value => [A, B]                     }
A B    ~ { class => 'Group',      value => [A, B]                     }
A      ~ { class => 'Reference',  name  => A                          }
n*mA   ~ { class => 'Repetition', value => A, min  => n, max => m     }
[ A ]  ~ { class => 'Repetition', value => A, min  => 0, max => 1     }
*A     ~ { class => 'Repetition', value => A, min  => 0, max => undef }
"A"    ~ { class => 'Literal',    value => A                          }
<A>    ~ { class => 'ProseValue', value => A                          }
%xA.B  ~ { class => 'String',     value => [A, B], type => 'hex'      }
%bA.B  ~ { class => 'String',     value => [A, B], type => 'binary'   }
%dA.B  ~ { class => 'String',     value => [A, B], type => 'decimal'  }
%xA-B  ~ { class => 'Range',      type  => 'hex', min => A, max => B  }

Forms not listed here are mapped in an analogous manner.

As an example, the ABNF grammar

A = (B C) / *D

is parsed into

[ {
  'value' => {
    'value' => [
      {
        'value' => [
          {
            'name' => 'B',
            'class' => 'Reference'
          },
          {
            'name' => 'C',
            'class' => 'Reference'
          }
        ],
        'class' => 'Group'
      },
      {
        'min' => 0,
        'value' => {
          'name' => 'D',
          'class' => 'Reference'
        },
        'max' => undef,
        'class' => 'Repetition'
      }
    ],
    'class' => 'Choice'
  },
  'name' => 'A',
  'class' => 'Rule'
} ]

Until this module matures, this format is subject to change. Contact the author if you would like to depend on this module.
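
As an illustration of working with this structure, the following sketch collects the names of all rules referenced by the example above. The structure is written out by hand so the snippet runs without the module; referenced_names is a hypothetical helper and not part of Parse::ABNF.

```perl
use strict;
use warnings;

# The structure shown above for "A = (B C) / *D", written out by hand.
my $rules = [ {
  class => 'Rule',
  name  => 'A',
  value => {
    class => 'Choice',
    value => [
      { class => 'Group', value => [
        { class => 'Reference', name => 'B' },
        { class => 'Reference', name => 'C' },
      ] },
      { class => 'Repetition', min => 0, max => undef,
        value => { class => 'Reference', name => 'D' } },
    ],
  },
} ];

# Hypothetical helper: return the names of all Reference nodes
# at or below $node, in document order.
sub referenced_names {
  my ($node) = @_;
  return unless ref $node eq 'HASH';   # plain scalars inside String, Literal
  return $node->{name} if $node->{class} eq 'Reference';

  my $value = $node->{value};
  return map { referenced_names($_) } @$value if ref $value eq 'ARRAY';
  return referenced_names($value)             if ref $value eq 'HASH';
  return;                              # leaf nodes without children
}

my @names = map { referenced_names($_) } @$rules;
print join(', ', @names), "\n";        # prints "B, C, D"
```

The helper relies only on the 'class', 'name' and 'value' keys documented above, so it would need adjusting if the format changes.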

CORE RULES

The ABNF specification defines some Core Rules that many ABNF grammars use without defining them locally. You can access these rules as parsed by this module via $Parse::ABNF::CoreRules.
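
One use for the Core Rules is checking whether a grammar references rules that are defined nowhere. The sketch below is only an illustration: it lists the Core Rule names from RFC 5234, Appendix B.1 by hand instead of reading them from the module, so that it runs without Parse::ABNF installed, and it uses a hand-written rule list in the format described under DESCRIPTION.

```perl
use strict;
use warnings;

# Core rule names from RFC 5234, Appendix B.1, listed by hand here.
my @core_names = qw(
  ALPHA BIT CHAR CR CRLF CTL DIGIT DQUOTE
  HEXDIG HTAB LF LWSP OCTET SP VCHAR WSP
);

# Hand-written rules in the documented shape for:
#   token = ALPHA / DIGIT / extra
my $rules = [ {
  class => 'Rule', name => 'token',
  value => { class => 'Choice', value => [
    { class => 'Reference', name => 'ALPHA' },
    { class => 'Reference', name => 'DIGIT' },
    { class => 'Reference', name => 'extra' },
  ] },
} ];

# ABNF rule names are case-insensitive, so index them in lower case.
my %known = map { ( lc($_) => 1 ) }
            @core_names, map { $_->{name} } @$rules;

# Walk all nodes with an explicit work list and record references
# that are neither defined locally nor Core Rules.
my %missing;
my @todo = @$rules;
while (@todo) {
  my $node = shift @todo;
  next unless ref $node eq 'HASH';
  if ($node->{class} eq 'Reference') {
    $missing{ $node->{name} } = 1 unless $known{ lc $node->{name} };
    next;
  }
  my $value = $node->{value};
  push @todo, ref $value eq 'ARRAY' ? @$value : $value;
}

print join(', ', sort keys %missing), "\n";   # prints "extra"
```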

CAVEATS

This module expects "\n" as the line terminator rather than CRLF. If necessary, convert the line endings first, for example with

$grammar =~ s/\x0d\x0a/\n/g;

The ABNF specification disallows white space preceding the left-hand side of a rule, and so does this module. Remove it before passing the grammar to the parser, for example with

$grammar =~ s/^\s+(?=\w+\s*=)//mg;

This module does not do that for you in order to preserve line and column numbers. Patches adapting the grammar to allow leading white space are welcome.
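
The two substitutions above can be combined into a small preprocessing step. This is a minimal sketch; normalize_grammar is a hypothetical helper name and not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper applying both substitutions from this section.
sub normalize_grammar {
  my ($grammar) = @_;
  $grammar =~ s/\x0d\x0a/\n/g;         # CRLF line endings become "\n"
  $grammar =~ s/^\s+(?=\w+\s*=)//mg;   # drop white space before "name ="
  return $grammar;
}

# An indented grammar with CRLF line endings.
my $raw   = "  A = B C\x0d\x0a  B = \"x\"\x0d\x0a";
my $clean = normalize_grammar($raw);
print $clean;
```

Note that \s also matches newlines, so a blank line directly before an indented rule is removed as well, and \w+ does not match rule names that contain hyphens. A stricter pattern may be needed for such grammars.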

The ABNF specification allows non-terminals to be enclosed inside <...>. That is the same syntax as used for prose values, and this module makes no attempt to differentiate the two.
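
If a grammar uses <...> for rule references, a caller could resolve them after parsing. The sketch below assumes that the 'value' of a ProseValue node holds the text between the angle brackets, as the table under DESCRIPTION suggests, and treats a prose value as a reference when its text matches the name of a rule in the same list. resolve_prose is a hypothetical helper, and the input is written by hand in the documented format.

```perl
use strict;
use warnings;

# Hand-written input in the documented shape for:
#   A = <B> / <some prose>
#   B = "x"
my $rules = [
  { class => 'Rule', name => 'A', value => {
      class => 'Choice', value => [
        { class => 'ProseValue', value => 'B' },
        { class => 'ProseValue', value => 'some prose' },
      ] } },
  { class => 'Rule', name => 'B',
    value => { class => 'Literal', value => 'x' } },
];

# ABNF rule names are case-insensitive, so compare in lower case.
my %defined = map { ( lc($_->{name}) => 1 ) } @$rules;

# Hypothetical helper: rewrite, in place, every ProseValue whose text
# is the name of a defined rule into a Reference node.
sub resolve_prose {
  my ($node) = @_;
  return unless ref $node eq 'HASH';

  if ($node->{class} eq 'ProseValue' and $defined{ lc $node->{value} }) {
    my $name = $node->{value};
    %$node = (class => 'Reference', name => $name);
    return;
  }

  my $value = $node->{value};
  resolve_prose($_) for ref $value eq 'ARRAY' ? @$value : $value;
}

resolve_prose($_) for @$rules;
```

After the call, the first alternative of A is a Reference to B, while the second remains a ProseValue because no rule carries that name.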

Comments are not currently made available; this may change in future versions.

There is currently not much error handling. Patches welcome.

BUG REPORTS

Please report bugs in this module via http://rt.cpan.org/NoAuth/Bugs.html?Dist=Parse-ABNF

SEE ALSO

* http://www.ietf.org/rfc/rfc5234.txt
* Parse::RecDescent

AUTHOR / COPYRIGHT / LICENSE

Copyright (c) 2008-2009 Bjoern Hoehrmann <bjoern@hoehrmann.de>.
This module is licensed under the same terms as Perl itself.