NAME

Keyword::Declare - Declare new Perl keywords...via a keyword

VERSION

This document describes Keyword::Declare version 0.000004

STATUS

This module is an alpha release. Aspects of its behaviour will probably change in future releases.

In particular, do not rely on keyword parameters having been parsed into PPI objects. This is an interim solution that is likely to go away in future releases.

SYNOPSIS

use Keyword::Declare;

# Declare something matchable within a keyword's syntax...
keytype UntilOrWhile { /until|while/ }

# Declare a keyword and its syntax...
keyword repeat (UntilOrWhile $type, List $condition, Block $code) {
    # Return new source code as a string (which replaces any parsed syntax)
    return qq{
        while (1) {
            $code;
            redo $type $condition;
            last;
        }
    };
}

# Implement method declarator...
keyword method (Ident $name, List $params?, /:\w+/ @attrs?, Block $body) {
    return build_method_source_code($name, $params//'()', \@attrs, $body);
}

# Keywords can have two or more definitions (distinguished by syntax)...
keyword test (String $desc, Comma, Expr $test) {
    return "use Test::More; ok $test => $desc"
}

keyword test (Expr $test) {
    my $desc = "q{$test at line }.__LINE__";
    return "use Test::More; ok $test => $desc"
}

keyword test (String $desc, Block $subtests) {
    return "use Test::More; subtest $desc => sub $subtests;"
}

# Keywords declared in an import() are automatically exported...
sub import {

    keyword debug (Expr $expr) {
        return "" if !$ENV{DEBUG};
        return "use Data::Dump 'ddx'; ddx $expr";
    }

}

DESCRIPTION

This module implements a new Perl keyword: keyword, which you can use to specify other new keywords.

Normally, to define new keywords in Perl, you either have to write them in XS (shiver!) or use a module like Keyword::Simple or Keyword::API. Using any of these approaches requires you to grab all the source code after the keyword, manually parse out the components of the keyword's syntax, construct the replacement source code, and then substitute it for the original source code you just parsed.

Using Keyword::Declare, you define a new keyword by specifying its name and a parameter list corresponding to the syntactic components that must follow the keyword. You then use those parameters to construct and return the replacement source code. The module takes care of setting up the keyword, and of the associated syntax parsing, and of inserting the replacement source code in the correct place.

For example, to create a new keyword (say: loop) that takes an optional count and a block, you could write:

use Keyword::Declare;

keyword loop (Int $count?, Block $block) {
    if (defined $count) {
        return "for (1..$count) $block";
    }
    else {
        return "while (1) $block";
    }
}

At compile time, when the parser subsequently encounters source code such as:

loop 10 {
    $cmd = readline;
    last if valid_cmd($cmd);
}

then the keyword's $count parameter would be assigned the value "10" and its $block parameter would be assigned the value "{\n$cmd = readline;\nlast if valid_cmd($cmd);\n}". Then the "body" of the keyword definition would be executed and its return value used as the replacement source code:

for (1..10) {
    $cmd = readline;
    last if valid_cmd($cmd);
}
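
Because the body of a keyword is ordinary Perl that returns a string, its logic can be tried out without the module at all. The following is only a sketch: loop_replacement is a hypothetical stand-in for the body of the loop keyword, and the arguments are passed as plain strings rather than as whatever objects the module actually supplies.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the body of the 'loop' keyword: it receives the
# parsed components as strings and returns the replacement source code.
sub loop_replacement {
    my ($count, $block) = @_;
    return defined $count ? "for (1..$count) $block"
                          : "while (1) $block";
}

my $block = "{\n    \$cmd = readline;\n    last if valid_cmd(\$cmd);\n}";

my $counted  = loop_replacement(10,    $block);   # count was supplied
my $infinite = loop_replacement(undef, $block);   # count was omitted

print "$counted\n\n$infinite\n";
```

The first result is the for loop shown above; the second is what the keyword would generate when the optional count is left out.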

INTERFACE

Declaring a new lexical keyword

The general syntax for declaring new keywords is:

keyword NAME (PARAM, PARAM, PARAM...) [:desc] { REPLACEMENT }

The name of the new keyword can be any identifier, including the name of an existing Perl keyword. However, using the name of an existing keyword usually creates an infinite loop of keyword expansion, so it rarely does what you actually wanted.

Specifying keyword parameters

The parameters of the keyword tell it how to parse the source code that follows it. The general syntax for each parameter is:

                       [...] TYPE [$@] VARNAME [?] [= DEFAULT]

                        \_/  \__/  VV  \_____/ \_/ \_________/
Everything up to [opt]...:     :   ::     :     :       :
Component type.................:   ::     :     :       :
Appears once.......................::     :     :       :
Appears once or more................:     :     :       :
Capture variable..........................:     :       :
Component is optional [opt].....................:       :
Default source (if missing) [opt].......................:

Named keyword parameter types

The type of each keyword parameter specifies how to parse the corresponding item in the source code after the keyword. Most of the available types are drawn from the PPI class hierarchy, and are named with the final component of the PPI class name.

The standard named types that are available are:

Typename             Matches                    PPI equivalent
========             =======                    ==============
Statement            a full Perl statement      PPI::Statement

Block                a block of Perl code       PPI::Structure::Block

List                 a parenthesized list       PPI::Structure::List
                                                   or PPI::Structure::Condition

Expression or Expr   a Perl expression          PPI::Statement::Expression
                                                   or PPI::Statement

Number               any Perl number            PPI::Token::Number
Integer or Int       any Perl integer                  <none>
Binary               0b111                      PPI::Token::Number::Binary
Octal                07777                      PPI::Token::Number::Octal
Hex                  0xFFF                      PPI::Token::Number::Hex
Float                -1.234                     PPI::Token::Number::Float
Exp                  -1.234e-56                 PPI::Token::Number::Exp
Version              v1.2.3                     PPI::Token::Number::Version

Quote                a string literal           PPI::Token::Quote
Single               'single quoted'            PPI::Token::Quote::Single
Double               "double quoted"            PPI::Token::Quote::Double
Literal              q{uninterpolated}          PPI::Token::Quote::Literal
Interpolate          qq{interpolated}           PPI::Token::Quote::Interpolate
HereDoc              <<HERE_DOC                 PPI::Token::HereDoc
String or Str        a string literal           PPI::Token::HereDoc
                                                   or PPI::Token::Quote

Regex or Regexp      /.../                      PPI::Token::Regexp
Match                m/.../                     PPI::Token::Regexp::Match
Substitute           s/.../.../                 PPI::Token::Regexp::Substitute
Transliterate        tr/.../.../                PPI::Token::Regexp::Transliterate
Pattern or Pat       qr/.../ or m/.../          PPI::Token::QuoteLike::Regexp
                                                   or PPI::Token::Regexp::Match

QuoteLike            any Perl quotelike         PPI::Token::QuoteLike
Backtick             `cmd in backticks`         PPI::Token::QuoteLike::Backtick
Command              qx{cmd in quotelike}       PPI::Token::QuoteLike::Command
Words                qw{ words here }           PPI::Token::QuoteLike::Words
Readline             <FILE>                     PPI::Token::QuoteLike::Readline

Operator or Op       any Perl operator          PPI::Token::Operator
Comma                , or =>                           <none>

Label                LABEL:                     PPI::Token::Label
Whitespace           Empty space                PPI::Token::Whitespace
Comment              # A comment                PPI::Token::Comment
Pod                  =pod ... =cut              PPI::Token::Pod

Identifier or Ident  simple identifier                 <none>
QualIdent            identifier containing ::          <none>

Var                  a scalar, array, or hash          <none>
ScalarVar            a scalar                          <none>
ArrayVar             an array                          <none>
HashVar              a hash                            <none>
ArrayIndex           $#arrayname                PPI::Token::ArrayIndex
Constructor          [arrayref] or {hashref}    PPI::Structure::Constructor
AnonHash             {...}                      PPI::Structure::Constructor
                                                   or  PPI::Structure::Block
Subscript            ...[$index] or ...{$key}   PPI::Structure::Subscript

Regex and literal parameter types

In addition to the standard named types listed in the previous section, a keyword parameter can have its type specified as either a regex or a string, in which case the corresponding component in the trailing source code is expected to match the pattern or literal.

For example:

keyword fail ('all' $all?, /hard|soft/ $fail_mode, Block $code) {...}

would accept:

fail hard {...}
fail all soft {...}
# etc.

If a literal or pattern is only parsing a static part of the syntax, there may not be a need to give it an actual parameter variable. For example:

keyword list (/keys|values|pairs/ $what, 'in', HashVar $hash) {

    my $EXTRACTOR = $what eq 'values' ? 'values' : 'keys';
    my $REPORTER  = $what eq 'pairs' ? $hash.'{$data}' : '$data';

    return qq{for my \$data ($EXTRACTOR $hash) { say join ': ', $REPORTER }};
}

Here the 'in' parameter just parses a fixed syntactic component of the keyword, so there's no need to capture it in a parameter.
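
To see what source code this keyword would generate for each mode, the body can be run as a plain subroutine. This is a sketch only: list_replacement is a hypothetical name, and the parameters are passed as plain strings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the body of the 'list' keyword. $what is the
# text matched by /keys|values|pairs/ and $hash is the source text of the
# hash variable. The fixed 'in' component is not captured, so it does not
# appear here.
sub list_replacement {
    my ($what, $hash) = @_;

    my $EXTRACTOR = $what eq 'values' ? 'values' : 'keys';
    my $REPORTER  = $what eq 'pairs' ? $hash.'{$data}' : '$data';

    return qq{for my \$data ($EXTRACTOR $hash) { say join ': ', $REPORTER }};
}

for my $mode (qw(keys values pairs)) {
    my $generated = list_replacement($mode, '%foo');
    print "list $mode in %foo;\n    => $generated\n";
}
```

Note that the pairs variant generates %foo{$data}, a key/value hash slice, which requires Perl 5.20 or later in the code that uses the keyword.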

Naming literal and regex types

Literal and regex parameter types are useful for matching syntax that PPI cannot recognize. However, they tend to muddy a keyword definition with large amounts of line noise (especially the regexes).

So the module allows you to declare a named type that matches whatever a given literal or regex would have matched in the same place...via the keytype keyword.

For example, instead of explicit regexes and string literals:

keyword fail ('all' $all?, /hard|soft/ $fail_mode, Block $code) {...}

keyword list (/keys|values|pairs/ $what, 'in', HashVar $hash) {...}

...you could predeclare named types that work the same:

keytype All       { 'all' }
keytype FailMode  { /hard|soft/ }

keytype ListMode  { /keys|values|pairs/ }
keytype In        { 'in' }

and then declare the keywords like so:

keyword fail (All $all?, FailMode $fail_mode, Block $code) {...}

keyword list (ListMode $what, In, HashVar $hash) {...}

A keytype can also be used to rename an existing named type more meaningfully. For example:

keytype Name      { Ident  }
keytype ParamList { List   }
keytype Attr      { /:\w+/ }
keytype Body      { Block  }

keyword method (Name $name, ParamList $params?, Attr @attrs?, Body $body)
{...}

Finally, if the block of the keytype is not a simple regex, string literal, or standard type name, it is treated as the code of a subroutine that is passed the value of the parameter and should return true if the type matches. In that case, the keytype may be given a parameter, so you don't need to unpack @_ manually.

For example, if a keyword could take either a block or an expression after it:

keytype BlockOrExpr ($block_expr) {
    return $block_expr->isa('PPI::Structure::Block')
        || $block_expr->isa('PPI::Statement::Expression');
}

# and later...

keyword demo (BlockOrExpr $demo_what) {...}
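
A predicate of this kind can be checked in isolation. The sketch below assumes nothing about PPI itself: the objects are empty hashes blessed into the relevant class names purely as stand-ins, and is_block_or_expr is a hypothetical name for the predicate.

```perl
use strict;
use warnings;

# The predicate from a subroutine-style keytype, written as a plain sub.
# The high-precedence || is deliberate: "return A or B" parses as
# "(return A) or B", so the second test would never be evaluated.
sub is_block_or_expr {
    my ($component) = @_;
    return $component->isa('PPI::Structure::Block')
        || $component->isa('PPI::Statement::Expression');
}

# Stand-in objects (NOT real PPI nodes): blessing into a class name is
# enough for ->isa() to report that name.
my $block = bless {}, 'PPI::Structure::Block';
my $expr  = bless {}, 'PPI::Statement::Expression';
my $label = bless {}, 'PPI::Token::Label';

for my $node ($block, $expr, $label) {
    my $verdict = is_block_or_expr($node) ? 'accepted' : 'rejected';
    print ref($node), ": $verdict\n";
}
```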

Note: Do not rely on source code components having been parsed via PPI in the long-term. The module implementation is likely to change to a lighter-weight parsing solution once one can be created.

"Up-to" types

Normally, a parameter's type tells the module how to parse it out of the source code. But you can also use any type to specify when to stop parsing the source code...that is: what to parse up to in the source code when matching the parameter.

If you place an ellipsis (...) before the type specifier, the module matches everything in the source code until it has also matched the type. The parameter variable will contain all of the source up to and including whatever the type specifier matched.

For example:

keyword omniloop (Ident $type, ...Block $config_plus_block) {...}
# After the type, grab everything up to the block

keyword test (Expr $condition, ...';' $description) {...}
# After the condition expression, grab everything to the next semicolon
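
As a rough mental model only, an up-to parameter behaves much like a non-greedy match that stops at, and includes, the first thing its type matches. The module's own parsing is PPI-based, so the regex below is an assumption made purely for illustration, and take_up_to_semicolon is a hypothetical name.

```perl
use strict;
use warnings;

# Illustrative model of "...';' $description": capture everything up to and
# including the first semicolon, and hand back the rest of the source.
# Unlike a real parser, this naive regex would also stop at a semicolon
# inside a string literal.
sub take_up_to_semicolon {
    my ($source) = @_;
    return $source =~ m{\A (.*? ;) (.*) \z}xs ? ($1, $2) : ();
}

my ($description, $rest) = take_up_to_semicolon(q{'x is positive'; say 'next';});

print "captured:  $description\n";
print "remaining: $rest\n";
```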

Scalar vs array keyword parameters

Declaring a keyword's parameter as a scalar (the usual approach) causes the source code parser to match the corresponding type of component exactly once in the trailing source. For example:

# try takes exactly one trailing block
keyword try (Block $block) {...}

Declaring a keyword's parameter as an array causes the source code parser to match the corresponding type of component as many times as it appears (but at least once) in the trailing source.

# tryall takes one or more trailing blocks
keyword tryall (Block @blocks) {...}

Optional keyword parameters (with or without defaults)

Any parameter can be marked as optional, in which case failing to find a corresponding component in the trailing source is no longer a fatal error. For example:

# The forpair keyword takes an optional iterator variable
keyword forpair ( Var $itervar?, '(', HashVar $hash, ')', Block $block) {...}

# The checkpoint keyword can be followed by zero or more trailing strings
keyword checkpoint (Str @identifier?) {...}

Instead of a ?, you can specify an optional parameter with an = followed by a compile-time expression. The parameter is still optional, but if the corresponding syntactic component is missing, the parameter variable will be assigned the result of the compile-time expression, rather than undef.

For example:

# The forpair keyword takes an optional iterator variable (or defaults to $_)
keyword forpair ( Var $itervar = '$_', '(', HashVar $hash, ')', Block $block) {...}
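
The effect of a default is the same as applying Perl's defined-or operator to the captured component. A minimal sketch, in which with_default is a hypothetical helper and not part of the module:

```perl
use strict;
use warnings;

# Models "Var $itervar = '$_'": use the captured source text if the
# component was present, otherwise fall back to the default source text.
sub with_default {
    my ($captured, $default) = @_;
    return $captured // $default;
}

my $present = with_default('$pair', '$_');   # iterator variable was supplied
my $missing = with_default(undef,   '$_');   # iterator variable was omitted

print "supplied: $present\n";
print "omitted:  $missing\n";
```

With a plain ? instead of a default, the second case would leave the parameter undefined, and the keyword body would have to supply the fallback itself.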

Specifying a keyword description

Normally the error messages the module generates refer to the keyword by name. For example, an error detected in parsing a repeat keyword with:

keyword repeat (/while/ $while, List $condition, Block $code)
{...}

might produce the error message:

Syntax error in repeat...
Expected while after repeat but found: with

which is a good message, but would be slightly better if it was:

Syntax error in repeat-while loop...
Expected while after repeat but found: with

You can request that a particular keyword be referred to in error messages using a specific description, by adding the :desc modifier to the keyword definition. For example:

keyword repeat (/while/ $while, List $condition, Block $code)
:desc(repeat-while loop)
{...}
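
The difference the description makes can be sketched as follows. The message layout is copied from the examples above; syntax_error_message and its named arguments are hypothetical and say nothing about how the module builds its messages internally.

```perl
use strict;
use warnings;

# Hypothetical formatter that mimics the two-line message shown above.
# The description falls back to the keyword's name when no :desc is given.
sub syntax_error_message {
    my (%arg) = @_;
    my $desc = $arg{desc} // $arg{keyword};
    return "Syntax error in $desc...\n"
         . "Expected $arg{expected} after $arg{keyword} but found: $arg{found}\n";
}

my $plain = syntax_error_message(
    keyword => 'repeat', expected => 'while', found => 'with',
);
my $described = syntax_error_message(
    keyword => 'repeat', expected => 'while', found => 'with',
    desc    => 'repeat-while loop',
);

print $plain, "\n", $described;
```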

Simplifying keyword generation with an interpolator

Frequently, the code block that generates the replacement syntax for the keyword will consist of something like:

{
    my $code_interpolation = some_expr_involving_a($param);
    return qq{ REPLACEMENT $code_interpolation HERE };
}

in which the block does some manipulation of one or more parameters, then interpolates the results into a single string, which it returns.

So the module provides a shortcut for that structure: the "triple curly" block. If a keyword's block is delimited by three adjacent curly brackets, the entire block is taken to be a single uninterpolated string that specifies the replacement source code. Within that single string anything in <{...}> delimiters is a piece of code to be executed and its result is interpolated at that point in the replacement code.

In other words, a triple-curly block is a literal code template, with special <{...}> interpolators.

For example, instead of:

keyword forall (List $list, '->', Params @params, Block $code_block)
{
    $list =~ s{\)\Z}{,\\\$__acc__)};
    $code_block = substr $code_block, 1, -1;    # strip the outer curlies
    return qq[
        {
            state \$__acc__ = [];
            foreach my \$__nary__ $list {
                if (!ref(\$__nary__) || \$__nary__ != \\\$__acc__) {
                    push \@{\$__acc__}, \$__nary__;
                    next if \@{\$__acc__} <= $#params;
                }
                next if !\@{\$__acc__};
                my ( @params ) = \@{\$__acc__};
                \@{\$__acc__} = ();

                $code_block
            }
        }
    ]
}

...you could write:

keyword forall (List $list, '->', Params @params, Block $code_block)
{{{
    {
        state $__acc__ = [];
        foreach my $__nary__  <{ $list =~ s{\)\Z}{,\\\$__acc__)}r }>
        {
            if (!ref($__nary__) || $__nary__ != \$__acc__) {
                push @{$__acc__}, $__nary__;
                next if @{$__acc__} <= <{ $#params }>;
            }
            next if !@{$__acc__};
            my ( <{"@params"}> ) = @{$__acc__};
            @{$__acc__} = ();

            <{substr $code_block, 1, -1}>
        }
    }
}}}

...with a significant reduction in the number of sigils that have to be escaped (and hence a significant decrease in the likelihood of errors creeping in).
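
The idea of a literal template with interpolators can be modelled in a few lines. This is a deliberately simplified sketch: real interpolators may contain arbitrary code, whereas the hypothetical expand_template below handles only a bare scalar name, which it looks up in a hash of parameters.

```perl
use strict;
use warnings;

# Simplified model of a triple-curly template: the text is taken literally
# (no sigil needs escaping), except that each <{ $name }> is replaced by
# the value of the parameter with that name.
sub expand_template {
    my ($template, %param) = @_;
    $template =~ s{ <\{ \s* \$(\w+) \s* \}> }{ $param{$1} // '' }gex;
    return $template;
}

my $source = expand_template(
    'for (1..<{$count}>) <{$block}>',
    count => 3,
    block => '{ say "again" }',
);

print "$source\n";
```

Sigils outside the <{...}> delimiters pass through untouched, which is what removes the need for backslashes in the template.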

Declaring multiple variants of a single keyword

You can declare two (or more) keywords with the same name, provided they all have distinct parameter lists. In other words, keyword definitions are treated as multimethods, with each variant parsing the following source code and then the variant which matches best being selected to provide the replacement code.

For example, you might specify three syntaxes for a repeat loop:

keyword repeat ('while', List $condition, Block $block) {{{
    while (1) { do <{$block}>; last if !(<{$condition}>); }
}}}

keyword repeat ('until', List $condition, Block $block) {{{
    while (1) { do <{$block}>; last if <{$condition}>; }
}}}

keyword repeat (Number $count, Block $block) {{{
    for (1..<{$count}>) <{$block}>
}}}

When it encounters a keyword, the module now attempts to (re)parse the trailing code with each of the definitions of that keyword in the current lexical scope, collecting those definitions that successfully parse the source.

If more than one definition was successful, the module selects the definition(s) with the most parameters. If more than one definition had the maximal number of parameters, the module selects the one whose parameters matched most specifically. If two or more definitions matched equally specifically, the module looks for one that is marked with a :prefer attribute. If there is no :prefer indicated (or more than one), the module gives up and reports a syntax ambiguity.

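The :prefer attribute acts only as a final tie-breaker: it is consulted only when neither the number of parameters nor the specificity of the matches singles out one definition.

The order of specificity for a parameter match is determined by the relationships between the various components of a Perl program, as follows (where the further left a type is, the more specific it is):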

ArrayIndex
Comment
Label
Pod
Subscript
Whitespace
Operator
    Comma
Statement
    Block
    Expr
        Identifier
        QualIdent
        Var
            ScalarVar
            ArrayVar
            HashVar
        Number
            Integer
            Binary
            Octal
            Hex
            Float
            Exp
            Version
        Quote/String
            Single
            Double
            Literal
            Interpolate
            HereDoc
        QuoteLike
            Backtick
            Command
            Words
            Readline
        Regexp/Pattern
            Match
            Substitute
            Transliterate
        AnonHash
        Constructor
        Condition
        List
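
The selection rules described in this section can be modelled roughly as a three-stage filter. Everything in this sketch is an assumption made for illustration: select_variant is a hypothetical name, each candidate is a plain hash, and specificity is reduced to a single number (higher meaning more specific) rather than being derived from the type hierarchy.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Candidates are hashes with hypothetical fields: name, params (a count),
# specificity (a number) and prefer (true if marked :prefer). Only
# definitions that already parsed successfully should be passed in.
# Returns the winning candidate, or undef if the choice is ambiguous.
sub select_variant {
    my @candidates = @_;
    return undef if !@candidates;

    # 1. Keep the definitions with the most parameters.
    my $most = max map { $_->{params} } @candidates;
    @candidates = grep { $_->{params} == $most } @candidates;

    # 2. Of those, keep the most specific matches.
    my $best = max map { $_->{specificity} } @candidates;
    @candidates = grep { $_->{specificity} == $best } @candidates;
    return $candidates[0] if @candidates == 1;

    # 3. Still tied: exactly one :prefer wins, otherwise it is ambiguous.
    my @preferred = grep { $_->{prefer} } @candidates;
    return @preferred == 1 ? $preferred[0] : undef;
}

my $winner = select_variant(
    { name => 'A', params => 2, specificity => 1, prefer => 0 },
    { name => 'B', params => 3, specificity => 1, prefer => 0 },
    { name => 'C', params => 3, specificity => 1, prefer => 1 },
);

my $outcome = $winner ? $winner->{name} : 'ambiguous';
print "$outcome\n";
```

Here A is eliminated for having fewer parameters, B and C tie on specificity, and C is chosen because it alone is marked as preferred.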

Exporting keywords

Normally a keyword definition takes effect from the statement after the keyword declaration, to the end of the enclosing lexical block.

However, if you declare a keyword inside a subroutine named import (i.e. inside the import method of a class or module), then the keyword is also exported to the caller of that import method.

In other words, simply placing a keyword definition in a module's import exports that keyword to the lexical scope in which the module is used.

Debugging keywords

If you load the module with the 'debug' option:

use Keyword::Declare {debug=>1};

then keywords declared in that lexical scope will report how they transform the source following them. For example:

use Keyword::Declare {debug=>1};

keyword list (/keys|values|pairs/ $what, 'in', HashVar $hash) {
    my $EXTRACTOR = $what eq 'values' ? 'values' : 'keys';
    my $REPORTER  = $what eq 'pairs' ? $hash.'{$data}' : '$data';

    return qq{for my \$data ($EXTRACTOR $hash) { say join ': ', $REPORTER }};
}

# And later...

list pairs in %foo;

...would print to STDERR:

#####################################################
### Keyword macro defined at demo.pl line 3:
###
###    list <what> in <hash>
###
### Converted code at demo.pl line 12:
###
###    list pairs in %foo;
###
### Into:
###
###    for my $data (keys %foo) { say join ': ', %foo{$data} }
###
#####################################################

DIAGNOSTICS

Keyword %s not in scope

The module detected that you used a user-defined keyword, but not in a lexical scope in which that keyword was declared or imported.

You need to move the keyword declaration (or the import) into scope, or else move the use of the keyword to a scope where the keyword is valid.

Syntax error in %s... Expected %s

You used a keyword, but with the wrong syntax after it. The error message lists what the valid possibilities were.

Ambiguous use of %s

You used a keyword, but the syntax after it was ambiguous (i.e. it matched two or more variants of the keyword).

You either need to change the syntax you used (so that it matches only one variant of the keyword syntax) or else change the definition of one or more of the keywords (to ensure their syntaxes are no longer ambiguous).

Expected parameter type specification, but found %s instead
Unexpected %s in parameter list of %s

You put something in the parameter list of a keyword definition that the mechanism didn't recognize. Perhaps you misspelled something?

Unknown type (%s) in parameter list of keyword

You used a type for a keyword parameter that the module did not recognize. See earlier in this document for a list of the types that the module knows. Alternatively, did you declare a keytype but then use it in the wrong lexical scope?

Expected comma or closing paren, but found %s instead

There was something unexpected after the end of a keyword parameter. Possibly a misspelling, or a missing closing parenthesis.

Invalid option for: use Keyword::Declare

Currently the module takes only a single argument when loaded: a hash of configuration options. You passed something else to use Keyword::Declare;

A common mistake is to load the module with:

use Keyword::Declare  debug=>1;

instead of:

use Keyword::Declare {debug=>1};

Expected %s after %s but found: %s

You used a user-defined keyword, but with the wrong syntax. The error message indicates the point at which an unexpected component was encountered during compilation, and what should have been there instead.

Not a valid regex: %s in keytype specification

A keytype expects a valid regex to specify the new keyword-parameter type. The regex you supplied wasn't valid (for the reason listed).

Missing }}} on string-style block of keyword %s

You created a keyword definition with a {{{...}}} interpolator for its body, but the module couldn't find the closing }}}. Did you use }} or } instead?

Missing }> on interpolation <{%s...

You created a keyword definition with a {{{...}}} interpolator, within which there was an interpolation that extended to the end of the interpolator without supplying a closing }>. Did you accidentally use just a > or a } instead?

CONFIGURATION AND ENVIRONMENT

Keyword::Declare requires no configuration files or environment variables.

DEPENDENCIES

The module is an interface to Perl's pluggable keyword mechanism, which was introduced in Perl 5.12. Hence it will never work under earlier versions of Perl. The implementation also uses constructs introduced in Perl 5.14, so that is the minimal practical version.

Currently requires both the Keyword::Simple module and the PPI module.

INCOMPATIBILITIES

None reported.

But Keyword::Declare probably won't get along well with source filters or Devel::Declare.

BUGS AND LIMITATIONS

The module currently relies on Keyword::Simple, so it is subject to all the limitations of that module. Most significantly, it can only create keywords that appear at the beginning of a statement.

Even with the remarkable PPI module, parsing Perl code is tricky, and parsing Perl code to build Perl code that parses other Perl code is even more so. Hence, there are likely to be cases where this module gets it spectacularly wrong. In particular, attempting to mix PPI-based parsing with regex-based parsing--as this module does--is madness, and almost certain to lead to tears for someone (apart from the author, obviously).

Moreover, because of the extensive (and sometimes iterated) use of PPI, the module currently imposes a noticeable compile-time delay, both on the code that declares keywords, and also on any code that subsequently uses them.

Plans are in train to address most or all of these limitations....eventually.

Please report any bugs or feature requests to bug-keyword-declare@rt.cpan.org, or through the web interface at http://rt.cpan.org.

AUTHOR

Damian Conway <DCONWAY@CPAN.org>

LICENCE AND COPYRIGHT

Copyright (c) 2015, Damian Conway <DCONWAY@CPAN.org>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.