NAME

Perl6::Overview::Rule - Grammar Rules

DESCRIPTION

Match object

$/  -- can be accessed as an array of sub-matches, e.g. $/[0], $/[1] ...
    -- can be accessed as a hash of named subrules, e.g. $/<foo>

$0, $1, $2, ... -- aliases for $/[0], $/[1], $/[2], ...

Context     Behaviour
------------------------------------------------------------------------
String      Stringifies to entire match
Number      The numeric value of the matched string (i.e. given 
            "foo123bar"~~/\d+/, then $/ in numeric context will be 123)
Boolean     True if success, false otherwise

Backtracking

:       Prevents backtracking over previous atom
::      Fails entire group if previous atom is backtracked over
:::     Fails entire rule if previous atom is backtracked over

Also <commit> and <cut> (see below).

Modifiers

Used as adverbial modifiers:

    "foO bar" ~~ m:s:ignorecase:g/[o | r] ./

and can appear within rules

    sub rule { <foo> [:i <bar>] }   # only ignore <bar>'s case

:b :basechar	Match base char ignoring accents and such
:bytes              Match individual bytes
:c, :continue       Start scanning from string's current .pos
:codes              Match individual codepoints
:ex, :exhaustive    Match every possible way (including overlapping)
:g, :global         Find all non-overlapping matches
:graphs             Match individual graphemes
:i, :ignorecase     Ignore letter case
:keepall            Force rule and all subrules to remember everything
:chars              Match maximally abstract characters allowed by pragma
:nth(N)             Find Nth occurrence. N is an integer; you can
                    also use 1st, 2nd, 3rd, 4th, as well as 1th, 2th.
                    (junctions allowed, e.g. :nth(1|2|3|5).)
:once               Only match first time
:p, :pos            Only try to match at string's current .pos
:perl5              Use Perl 5 syntax for regex
:ov, :overlap       Match at all possible character positions, including
                    overlapping; return all matches in list context,
                    disjunction in item context
:rw                 Claim string for modification, instead of copy-on-write
:s, :sigspace       Replaces every sequence of literal whitespace in
                    pattern with \s+ or \s* according to <?ws> rule
:x(N)               Repetition -- find N times (N is an integer)

[:ex vs :ov could use clarification.  see
 http://www.nntp.perl.org/group/perl.perl6.language/20985 ]
[re :s, except where there is already an explicit space rule]

Built-in assertions

'...'               Matches ... as a literal string
"..."               Matches ... as a literal string (after interpolation)
<sp>                Matches a literal space
<ws>                Matches any sequence of whitespace, like the :s modifier
<wb>                Matches any word boundary (Perl 5's \b semantics)
<dot>               Matches a literal . (same as '.')
<lt>                Matches a literal < (same as '<')
<gt>                Matches a literal > (same as '>')
<prior>             Match whatever the most recent successful match did
<after pattern>     Matches only after pattern (zero-width)
<before pattern>    Matches only before pattern (zero-width)
<commit>            Fails the entire match if backtracked over
<cut>               Fails the entire match if backtracked over,
                    and removes the portion of the string matched until then
<fail>              Fails the entire match if reached
<null>              Matches null string
<ident>             Matches an "identifier", same as
                    ([ [<alpha> | _] \w* | " [<alpha>|_] \w* " ])
<self>              Matches the same pattern as the current rule
                    (useful for recursion)
<!XXX>              Zero width assertion that XXX *doesn't* match at 
                    this location
<alnum>             Alphanumeric character
<alpha>             Alphabetic character
<ascii>             ASCII character
<blank>             Horizontal whitespace ([ \t])
<cntrl>             Control character
<digit>             Numeric character
<graph>             An alphanumeric character or punctuation
<lower>             Lowercase character
<print>             Printable character -- alphanumeric, punctuation or 
                    whitespace
<space>             Whitespace character ([\s\ck])
<upper>             Uppercase character
<word>              Word character (alphanumeric + _)
<xdigit>            Hexadecimal digit

Named rules are stored in the match object unless the rule name is
prefixed with a ?

    /<?item> <quantity> <price>/

would store values in $/{'quantity'} and $/{'price'} but not $/{'item'}

Character classes

<[abcd]>            Matches one of the characters a,b,c, or d.  Ranges
                    may be used as <[a..z]>.  Can be combined with + 
                    and - like so: <[a..z]-[m..p]> (which is the same
                    as <[a..l]+[q..z]>)
<-XXX>              Matches XXX as a negated character class.  For
                    instance, <-alpha> would match one non-alpha
                    character. May also be combined as above. 
                    (<-alpha+[qrst]> will match any non-alpha character
                    and the characters q,r,s, and t)
<+XXX>              Matches XXX as a character class. <+alpha> matches
                    one alpha character.

[This does not seem to match S05.  Which one is wrong?]

Hypothetical variables

Assign value to variable only if entire pattern succeeds.  
Syntax: let $foo = value

    my $x;
    / (\S*) { let $x = .pos } \s* foo /

Aliasing

Syntax:

let $foo = $1

Shorthand:

$foo := $1

my $x;
/ $x := (\S*) \s* foo /
    

Can use arrays:

/ @x := [ (\S+) \s* ]* /

# returns anonymous list in item context for *, +, **{n,m}:
/ $x := [ (\S+) \s* ]* /

# ? does not pluralize -- result or undef

Hashes:

/ %x := [ (\S+)\: \s* (.*) ]* / # key/value pairs
# $0 = list of keys
# $1 = list of values

/ %x := [ (\S+) \s* ]* /        # capture only keys, values = undef

Can capture return values of closures:

/ $x := { "Item context" } /
/ @x := { "List", "context" } /
/ %x := { "Hash" => "context" } /
# note: no parens around closure

Reorder paren groups:

/ $1 := (.*?), \h* $0 := (.*) /
# renumbering occurs
/ $2 := (.*?), \h* (.*) /       # $3 = (.*)

Relative to current location: $-1, $-2, $-3...

Named subrules:

/ <key>\: <value> { let $hash{$<key>} = $<value> } /