NAME

Regexp::Parser::Objects - objects for Perl 5 regexes

DESCRIPTION

This module contains the object definitions for Regexp::Parser.

Inheritance

All built-in objects inherit from Regexp::Parser::__object__. Three abstract classes, anchor, assertion, and branch, can also be inherited from: anchor is inherited by bol, bound, gpos, and eol; assertion is inherited by ifmatch, unlessm, ifthen, suspend, eval, and logical; branch is inherited by or.

Here is the @ISA tree for the class Regexp::Parser::or:

Regexp::Parser::or
  Regexp::Parser::branch
    Regexp::Parser::__object__

Here is the @ISA tree for the class MyRx::or, where MyRx is a subclass of Regexp::Parser:

MyRx::or
  MyRx::branch
    MyRx::__object__
Regexp::Parser::or
  Regexp::Parser::branch
    Regexp::Parser::__object__
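
The following is a minimal sketch of how such a tree can be printed for any class. It uses stand-in packages (the hypothetical Demo:: and DemoRx:: names, wired up by hand to mirror the two trees above) rather than the real Regexp::Parser classes, and it indents every parent one level below its child.

```perl
use strict;
use warnings;

# Stand-in packages (hypothetical names), wired up to mirror the two
# trees shown above. These are not the real Regexp::Parser classes.
{
    no strict 'refs';
    @{"Demo::branch::ISA"}   = ('Demo::__object__');
    @{"Demo::or::ISA"}       = ('Demo::branch');
    @{"DemoRx::branch::ISA"} = ('DemoRx::__object__');
    @{"DemoRx::or::ISA"}     = ('DemoRx::branch', 'Demo::or');
}

# Recursively collect the @ISA tree of a class as indented lines.
sub isa_tree {
    my ($class, $depth) = @_;
    $depth ||= 0;
    no strict 'refs';
    my @lines = (('  ' x $depth) . $class);
    push @lines, isa_tree($_, $depth + 1) for @{"${class}::ISA"};
    return @lines;
}

print "$_\n" for isa_tree('DemoRx::or');
```

The order of the printed lines is also the order in which Perl's default method resolution searches the classes, which is why a method missing from the subclass's own classes is still found in the built-in ones.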

The __object__ Base Class

All nodes inherit the following methods from Regexp::Parser::__object__:

my $d = $obj->data()

The object's data. This might be an array reference (for a 'branch' node), another object (for a 'quant' node), or it might not exist at all (for an 'anchor' node).

my $e = $obj->ender()

The arguments to pass to object() to create the ending node for this object. The walk() method uses this. Typically, a capturing group's ender is a close node, the ender of any other group or assertion is a tail node, and a character class's ender is an anyof_close node.

my $c = $obj->family()

The general family of this object. This is one of: alnum, anchor, anyof, anyof_char, anyof_class, anyof_range, assertion, branch, close, clump, digit, exact, flags, group, groupp, minmod, open, prop, quant, ref, reg_any, space.

my $f = $obj->flags()

The flag value for this object. This value is a number created by OR'ing together the flags that are enabled at the time.

$obj->insert()

Inserts this object into the tree. The return value indicates whether the object ended up being merged with the previous object in the tree.

my $m = $obj->merge()

Merges this node with the previous one, if they are of the same type. If it is called after $obj has been added to the tree, $obj will be removed from the tree. Most node types don't merge. Returns true if the node was merged with the previous one.

my $o = $obj->omit()
my $o = $obj->omit(VALUE)

Whether this node is omitted from the parse tree. Certain objects do not need to appear in the tree, but are still needed when inspecting the parse or walking the tree.

You can also set this attribute by passing a value.

my $q = $obj->qr()

The regex representation of this object. It includes the regex representation of any children of the object.

my $r = $obj->raw()

The raw representation of this object. It does not look at the children of the object, just itself. This is used primarily when inspecting the parsing of the regex.

my $t = $obj->type()

The specific type of this object. See the object's documentation for possible values for its type.

my $v = $obj->visual()

The visual representation of this object. It includes the visual representation of any children of the object.

$obj->walk()

"Walks" the object. This is used to dive into the node's children when using a walker (see "Walking the Tree" in Regexp::Parser).

Objects may override these methods (as objects often do).
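
As a rough illustration of these accessors, the sketch below walks a parsed regex and reports each node's family(), type(), and raw(). It assumes Regexp::Parser is installed, that Regexp::Parser->new() accepts the regex, and that walker() returns an iterator yielding a node and its depth, as "Walking the Tree" in Regexp::Parser describes. The describe_nodes() helper is a hypothetical name used only here.

```perl
use strict;
use warnings;

# Describe every node of a regex as "family/type raw", indented by depth.
# Assumption: walker() returns an iterator yielding ($node, $depth).
sub describe_nodes {
    my ($regex) = @_;
    require Regexp::Parser;    # loaded lazily so a missing module is trappable
    my $parser = Regexp::Parser->new($regex);
    my $iter   = $parser->walker;
    my @lines;
    while (my ($node, $depth) = $iter->()) {
        push @lines, sprintf '%s%s/%s %s',
            '  ' x $depth, $node->family, $node->type, $node->raw;
    }
    return @lines;
}

my @lines = eval { describe_nodes('^(a+|b)c') };
if ($@) { print "Regexp::Parser unavailable or failed: $@" }
else    { print "$_\n" for @lines }
```

Because raw() ignores children, each line shows only the node's own text. Swapping in visual() would repeat the text of the children on every parent line.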

Using NEXT:: instead of SUPER::

You can't use $obj->SUPER::method() inside an __object__ class, because __object__ doesn't inherit from anything and SUPER:: only searches the @ISA of the package it is written in. What you want is to continue along the object's own inheritance tree, so use Damian Conway's NEXT module ($obj->NEXT::method()) instead. NEXT is a standard module as of Perl 5.8.
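
A minimal sketch of the difference, using hypothetical stand-in packages (Demo:: for the built-in classes, DemoRx:: for a subclass namespace) instead of the real module:

```perl
use strict;
use warnings;
use NEXT;

# Hypothetical stand-ins for the built-in classes (not the real module).
package Demo::__object__;
sub new    { my ($class, %args) = @_; return bless {%args}, $class }
sub visual { my $self = shift; return $self->{vis} }

package Demo::or;
our @ISA = ('Demo::__object__');

# Hypothetical subclass namespace; its __object__ has no parent of its own.
package DemoRx::__object__;
sub visual {
    my $self = shift;
    # SUPER:: would search @DemoRx::__object__::ISA, which is empty.
    # NEXT:: resumes the search along the object's own @ISA tree.
    return '<' . $self->NEXT::visual() . '>';
}

package DemoRx::or;
our @ISA = ('DemoRx::__object__', 'Demo::or');

package main;

my $node = DemoRx::or->new(vis => 'a|b');
print $node->visual, "\n";
```

Here the call reaches DemoRx::__object__::visual first, and NEXT::visual then finds Demo::__object__::visual through the second parent of DemoRx::or.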

Object Attributes

All objects share the following attributes (accessible via $obj->{...}):

rx

The parser object with which it was created.

flags

The flags for the object.

The following attributes may also be set:

branch

Whether this object has branches (like |).

chars

The characters contained in this class (for anyof, alnum, prop, etc.).

data

The data or children of this object.

dir

The direction of this object (for look-ahead/behind assertions). If less than 0, it is behind; otherwise, it is ahead.

down

Whether this object creates a deeper scope (like an OPEN).

family

The general family of this object.

ifthen

Whether this object has a true/false branch (like the (?(...)T|F) assertion).

max

The maximum repetition count of this object (for quantifiers).

min

The minimum repetition count of this object (for quantifiers).

neg

Whether this object is negated (like a look-ahead or a character class).

nparen

The capture group related to this object (like for OPEN and back references).

off

The flags specifically turned off for this object (for flag assertions and (?:...)).

omit

Whether this object is omitted from the actual tree (like a CLOSE).

on

The flags specifically turned on for this object (for flag assertions and (?:...)).

raw

The raw representation of this object.

type

The specific type of this object.

up

Whether this object goes into a shallower scope (like a CLOSE).

vis

The visual representation of this object.

zerolen

Whether this object is zero-width (like an anchor).

If a method exists with the name of one of these attributes, you must use the method to access the attribute from outside the class, and it is a good idea to do so inside the class as well.
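
One way to follow that rule is sketched below. The attr() helper and the Demo::open class are hypothetical; the toy type() method only imitates the open1, open2 ... openN naming to show why the method and the hash slot can disagree.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Read an attribute by calling the method when the object provides one,
# and fall back to the hash slot only when it does not.
sub attr {
    my ($obj, $name) = @_;
    die "attr() needs an object\n" unless blessed $obj;
    my $method = $obj->can($name);
    return $method ? $obj->$method() : $obj->{$name};
}

# Toy stand-in node (not a real Regexp::Parser class): its type() method
# computes a value that differs from the raw hash slot.
package Demo::open;
sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub type { my $self = shift; return $self->{type} . $self->{nparen} }

package main;

my $node = Demo::open->new(type => 'open', nparen => 2, down => 1);
print attr($node, 'type'), "\n";   # goes through the method
print attr($node, 'down'), "\n";   # no method, so reads the hash slot
```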

OBJECTS

All objects are prefixed with Regexp::Parser::, but that is omitted here for brevity. The headings are object classes. The field "family" represents the general category into which that object falls.

This reference is very sparse. Future versions will have more complete documentation. For now, read the source (!).

bol

Family: anchor

Types: bol (^), sbol (^ with /s on, \A), mbol (^ with /m on)

bound

Family: anchor

Types: bound (\b), nbound (\B)

Neg: 1 if negated

gpos

Family: anchor

Types: gpos (\G)

eol

Family: anchor

Types: eol ($), seol ($ with /s on, \Z), meol ($ with /m on), eos (\z)

reg_any

Family: reg_any

Types: reg_any (.), sany (. with /s on), cany (\C)

alnum

Family: alnum

Types: alnum (\w), nalnum (\W)

Neg: 1 if negated

space

Family: space

Types: space (\s), nspace (\S)

Neg: 1 if negated

digit

Family: digit

Types: digit (\d), ndigit (\D)

Neg: 1 if negated

anyof

Family: anyof

Types: anyof ([)

Data: array reference of anyof_char, anyof_range, anyof_class

Neg: 1 if negated

Ender: anyof_close

anyof_char

Family: anyof_char

Types: anyof_char (X)

Data: actual character

anyof_range

Family: anyof_range

Types: anyof_range (X-Y)

Data: array reference of lower and upper bounds, both anyof_char

anyof_class

Family: anyof_class

Types: via [:NAME:], [:^NAME:], \p{NAME}, \P{NAME}: alnum (\w, \W), alpha, ascii, cntrl, digit (\d, \D), graph, lower, print, punct, space (\s, \S), upper, word, xdigit; others are possible (Unicode properties)

Data: 'POSIX' if [:NAME:], [:^NAME:] (or other POSIX notations, like [=NAME=] and [.NAME.]); otherwise, reference to alnum, digit, space, or prop object

Neg: 1 if negated

anyof_close

Family: close

Types: anyof_close (] when in [...)

Omitted

prop

Family: prop

Types: name of property (\p{NAME}, \P{NAME}); any Unicode property defined by Perl or elsewhere

Neg: 1 if negated

clump

Family: clump

Types: clump (\X)

or

Family: branch

Types: or (|)

Data: array reference of array references, each representing one alternation, holding any number of objects

Branched

exact

Family: exact

Types: exact (abc), exactf (abc with /i on)

Data: array reference of actual characters

quant

Family: quant

Types: star (*), plus (+), curly (?, {n}, {n,}, {n,m})

Data: one object
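
The type names above, together with the min and max attributes, can be illustrated with a small sketch. The quant_info() function is hypothetical, and returning undef for an unbounded maximum is an assumption of this sketch, not necessarily what the module stores.

```perl
use strict;
use warnings;

# Map a quantifier's raw text to (type, min, max), following the type
# names listed above. An unbounded maximum is returned as undef here.
sub quant_info {
    my ($raw) = @_;
    return ('star',  0, undef) if $raw eq '*';
    return ('plus',  1, undef) if $raw eq '+';
    return ('curly', 0, 1)     if $raw eq '?';
    if ($raw =~ /\A\{(\d+)(,(\d*))?\}\z/) {
        my ($min, $has_comma, $max) = ($1, defined($2), $3);
        return ('curly', $min, $min)  unless $has_comma;     # {n}
        return ('curly', $min, undef) unless length $max;    # {n,}
        return ('curly', $min, $max);                        # {n,m}
    }
    return;    # not a quantifier
}

for my $raw ('*', '+', '?', '{3}', '{2,}', '{2,5}') {
    my ($type, $min, $max) = quant_info($raw);
    printf "%-6s %-5s min=%s max=%s\n", $raw, $type, $min,
        defined $max ? $max : 'unbounded';
}
```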

group

Family: group

Types: group ((?:, (?i-s:)

Data: array reference of any number of objects

Ender: tail

open

Family: open

Types: open1, open2 ... openN (()

Data: array reference of any number of objects

Ender: close

close

Family: close

Types: close1, close2 ... closeN () when in (...)

Omitted

tail

Family: close

Types: tail () when not in (...)

Omitted

ref

Family: ref

Types: ref1, ref2 ... refN (\1, \2, etc.); reff1, reff2 ... reffN (\1, \2, etc. with /i on)

ifmatch

Family: assertion

Types: ifmatch ((?=), (?<=)

Data: array reference of any number of objects

Dir: -1 if look-behind, 1 if look-ahead

Ender: tail

unlessm

Family: assertion

Types: unlessm ((?!, (?<!)

Data: array reference of any number of objects

Dir: -1 if look-behind, 1 if look-ahead

Ender: tail
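
The ifmatch and unlessm entries can be summarized as a lookup table. The sketch below works on plain hashes rather than real node objects, and the describe_lookaround() helper is a hypothetical name.

```perl
use strict;
use warnings;

# Lookup table built from the ifmatch and unlessm entries above:
# opening token => node type and dir attribute.
my %lookaround = (
    '(?='  => { type => 'ifmatch', dir =>  1 },
    '(?<=' => { type => 'ifmatch', dir => -1 },
    '(?!'  => { type => 'unlessm', dir =>  1 },
    '(?<!' => { type => 'unlessm', dir => -1 },
);

# Describe a node-like hash using the documented dir convention:
# less than 0 means look-behind, otherwise look-ahead.
sub describe_lookaround {
    my ($node) = @_;
    my $sense = $node->{type} eq 'unlessm' ? 'negative' : 'positive';
    my $where = $node->{dir} < 0 ? 'look-behind' : 'look-ahead';
    return "$sense $where";
}

for my $token (sort keys %lookaround) {
    printf "%-4s %s\n", $token, describe_lookaround($lookaround{$token});
}
```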

suspend

Family: assertion

Types: suspend ((?>)

Data: array reference of any number of objects

Ender: tail

ifthen

Family: assertion

Types: ifthen ((?()

Data: array reference of two objects; first: ifmatch, unlessm, eval, groupp; second: branch

Ender: tail

groupp

Family: groupp

Types: groupp1, groupp2 ... grouppN (1, 2, etc. when in (?()

eval

Family: assertion

Types: eval ((?{)

Data: string with contents of assertion

logical

Family: assertion

Types: logical ((??{)

Data: string with contents of assertion

flags

Family: flags

Types: flags ((?i-s))

minmod

Family: minmod

Types: minmod (? after quant)

Data: an object in the quant family

SEE ALSO

Regexp::Parser, Regexp::Parser::Handlers, Regexp::Parser::Hierarchy.

AUTHOR

Jeff "japhy" Pinyan, japhy@perlmonk.org

COPYRIGHT

Copyright (c) 2004 Jeff Pinyan japhy@perlmonk.org. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.