NAME
Regexp::Parser::Objects - objects for Perl 5 regexes
DESCRIPTION
This module contains the object definitions for Regexp::Parser.
Inheritance
All built-in objects inherit from Regexp::Parser::__object__. Three abstract classes, anchor, assertion, and branch, can also be inherited from. anchor is inherited by bol, bound, gpos, and eol. assertion is inherited by ifmatch, unlessm, ifthen, suspend, eval, and logical. branch is inherited by or.
Here is the @ISA tree for the class Regexp::Parser::or:
Regexp::Parser::or
  Regexp::Parser::branch
    Regexp::Parser::__object__
Here is the @ISA tree for the class MyRx::or:
MyRx::or
  MyRx::branch
    MyRx::__object__
  Regexp::Parser::or
    Regexp::Parser::branch
      Regexp::Parser::__object__
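The order of the MyRx::or listing can be reproduced with a small sketch. It does not load Regexp::Parser: the Stub::Parser:: packages are stand-ins for the built-in classes, and the @ISA contents given to the MyRx:: packages are an assumption chosen to match the listing, not something taken from the module's source.

```perl
use strict;
use warnings;
use mro;

# Stand-in packages for illustration; the real Regexp::Parser classes are not loaded.
{ package Stub::Parser::__object__; }
{ package Stub::Parser::branch; our @ISA = ('Stub::Parser::__object__'); }
{ package Stub::Parser::or;     our @ISA = ('Stub::Parser::branch'); }

# Hypothetical user hierarchy. These @ISA contents are an assumption
# that matches the order of the listing above.
{ package MyRx::__object__; }
{ package MyRx::branch; our @ISA = ('MyRx::__object__'); }
{ package MyRx::or;     our @ISA = ('MyRx::branch', 'Stub::Parser::or'); }

# Perl's default (depth-first) method resolution order for MyRx::or.
my $isa = mro::get_linear_isa('MyRx::or');
print "$_\n" for @$isa;
```

Under this assumption a method call on a MyRx::or object searches the MyRx:: classes first and falls back to the stand-in built-in classes afterwards.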
The __object__ Base Class
All nodes inherit the following methods from Regexp::Parser::__object__:
- my $d = $obj->data()
-
The object's data. This might be an array reference (for a 'branch' node), another object (for a 'quant' node), or it might not exist at all (for an 'anchor' node).
- my $e = $obj->ender()
-
The arguments to object() to create the ending node for this object. This is used by the walk() method. Typically, a capturing group's ender is a close node, any other assertion's ender is a tail node, and a character class's ender is an anyof_close node.
- my $c = $obj->family()
-
The general family of this object. It is one of: alnum, anchor, anyof, anyof_char, anyof_class, anyof_range, assertion, branch, close, clump, digit, exact, flags, group, groupp, minmod, prop, open, quant, ref, reg_any, space.
- my $f = $obj->flags()
-
The flag value for this object. This value is a number created by OR'ing together the flags that are enabled at the time.
- $obj->insert()
-
Inserts this object into the tree. Returns a value indicating whether it was merged with the previous object in the tree.
- my $m = $obj->merge()
-
Merges this node with the previous one, if they are of the same type. If it is called after $obj has been added to the tree, $obj will be removed from the tree. Most node types don't merge. Returns true if the node was merged with the previous one.
- my $o = $obj->omit()
- my $o = $obj->omit(VALUE)
-
Whether this node is omitted from the parse tree. Certain objects do not need to appear in the tree, but are needed when inspecting the parsing, or walking the tree.
You can also set this attribute by passing a value.
- my $q = $obj->qr()
-
The regex representation of this object. It includes the regex representation of any children of the object.
- my $r = $obj->raw()
-
The raw representation of this object. It does not look at the children of the object, just itself. This is used primarily when inspecting the parsing of the regex.
- my $t = $obj->type()
-
The specific type of this object. See the object's documentation for possible values for its type.
- my $v = $obj->visual()
-
The visual representation of this object. It includes the visual representation of any children of the object.
- $obj->walk()
-
"Walks" the object. This is used to dive into the node's children when using a walker (see "Walking the Tree" in Regexp::Parser).
Objects may override these methods (as objects often do).
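As a rough illustration of what "a number created by OR'ing together the flags" means for flags(), here is a minimal sketch. The constant names and bit values are hypothetical and chosen only for this example; they are not the values Regexp::Parser uses.

```perl
use strict;
use warnings;

# Hypothetical bit values, one bit per regex flag (assumption for this sketch).
use constant {
    DEMO_FLAG_I => 0x01,
    DEMO_FLAG_M => 0x02,
    DEMO_FLAG_S => 0x04,
    DEMO_FLAG_X => 0x08,
};

# The flags enabled at some point in a pattern, e.g. under /ix.
my $flags = DEMO_FLAG_I | DEMO_FLAG_X;

# Test individual bits with a bitwise AND.
my %bit_for = (
    'i' => DEMO_FLAG_I,
    'm' => DEMO_FLAG_M,
    's' => DEMO_FLAG_S,
    'x' => DEMO_FLAG_X,
);
my @on = grep { $flags & $bit_for{$_} } sort keys %bit_for;

print "flags value: $flags, enabled: @on\n";
```

A single integer like this can be copied onto each node cheaply, and any one flag can be checked without knowing which others are set.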
Using NEXT:: instead of SUPER::
You can't use $obj->SUPER::method() inside the __object__ class, because __object__ doesn't inherit from anything, and the call needs to continue along the object's own inheritance tree. Use Damian Conway's NEXT module instead. This module is standard with Perl 5.8.
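A minimal sketch of the difference, using the core NEXT module. The Stub::Parser:: packages stand in for the built-in classes, the MyRx:: hierarchy and its @ISA contents are hypothetical, and the bracket-wrapping visual() method is invented for the example.

```perl
use strict;
use warnings;
use NEXT;

# Stand-in for the built-in base class (not the real Regexp::Parser code).
{
    package Stub::Parser::__object__;
    sub new    { my ($class) = @_; return bless {}, $class }
    sub visual { return 'base visual' }
}
{ package Stub::Parser::branch; our @ISA = ('Stub::Parser::__object__'); }
{ package Stub::Parser::or;     our @ISA = ('Stub::Parser::branch'); }

# Hypothetical user base class. It has no @ISA, so SUPER::visual()
# would have nowhere to go from here.
{
    package MyRx::__object__;
    sub visual {
        my ($self) = @_;
        # NEXT continues the search along the inheritance tree of the
        # object's own class (MyRx::or), not of this package.
        return '[' . $self->NEXT::visual() . ']';
    }
}
{ package MyRx::branch; our @ISA = ('MyRx::__object__'); }
{ package MyRx::or;     our @ISA = ('MyRx::branch', 'Stub::Parser::or'); }

my $node = 'MyRx::or'->new;
print $node->visual, "\n";
```

The call starts in MyRx::__object__::visual and is redispatched to the stand-in base class through the second entry in MyRx::or's @ISA.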
Object Attributes
All objects share the following attributes (accessible via $obj->{...}):
- rx
-
The parser object with which it was created.
- flags
-
The flags for the object.
The following attributes may also be set:
- branch
-
Whether this object has branches (like |).
- chars
-
The characters contained in this class (for anyof, alnum, prop, etc.).
- data
-
The data or children of this object.
- dir
-
The direction of this object (for look-ahead/behind assertions). If less than 0, it is behind; otherwise, it is ahead.
- down
-
Whether this object creates a deeper scope (like an OPEN).
- family
-
The general family of this object.
- ifthen
-
Whether this object has a true/false branch (like the (?(...)T|F) assertion).
- max
-
The maximum repetition count of this object (for quantifiers).
- min
-
The minimum repetition count of this object (for quantifiers).
- neg
-
Whether this object is negated (like a look-ahead or a character class).
- nparen
-
The capture group related to this object (like for OPEN and back references).
- off
-
The flags specifically turned off for this object (for flag assertions and (?:...)).
- omit
-
Whether this object is omitted from the actual tree (like a CLOSE).
- on
-
The flags specifically turned on for this object (for flag assertions and (?:...)).
- raw
-
The raw representation of this object.
- type
-
The specific type of this object.
- up
-
Whether this object goes into a shallower scope (like a CLOSE).
- vis
-
The visual representation of this object.
- zerolen
-
Whether this object is zero-width (like an anchor).
If there is a method with the name of one of these attributes, it is imperative you use the method to access the attribute when outside the class, and it's a good idea to do so inside the class as well.
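A short sketch of why the method matters: a class may override an accessor so that it no longer returns the stored hash value unchanged. The Stub:: classes and the angle-bracket wrapping are hypothetical and exist only to show the effect.

```perl
use strict;
use warnings;

# Hypothetical base node with a plain accessor for the 'raw' attribute.
{
    package Stub::node;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub raw { my ($self) = @_; return $self->{raw} }
}

# Hypothetical subclass that computes raw() instead of returning the stored value.
{
    package Stub::quoted;
    our @ISA = ('Stub::node');
    sub raw { my ($self) = @_; return '<' . $self->{raw} . '>' }
}

my $node = 'Stub::quoted'->new(raw => 'abc');

print "hash:   $node->{raw}\n";       # bypasses the override
print "method: ", $node->raw, "\n";   # honours the override
```

Reading $node->{raw} directly skips whatever the class does in its accessor, which is the situation the advice above is meant to prevent.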
OBJECTS
All objects are prefixed with Regexp::Parser::, but that is omitted here for brevity. The headings are object classes. The field "family" represents the general category into which that object falls.
This is very sparse. Future versions will have more complete documentation. For now, read the source (!).
bol
Family: anchor
Types: bol (^), sbol (^ with /s on, \A), mbol (^ with /m on)
bound
Family: anchor
Types: bound (\b), nbound (\B)
Neg: 1 if negated
gpos
Family: anchor
Types: gpos (\G)
eol
Family: anchor
Types: eol ($), seol ($ with /s on, \Z), meol ($ with /m on), eos (\z)
reg_any
Family: reg_any
Types: reg_any (.), sany (. with /s on), cany (\C)
alnum
Family: alnum
Types: alnum (\w), nalnum (\W)
Neg: 1 if negated
space
Family: space
Types: space (\s), nspace (\S)
Neg: 1 if negated
digit
Family: digit
Types: digit (\d), ndigit (\D)
Neg: 1 if negated
anyof
Family: anyof
Types: anyof ([)
Data: array reference of anyof_char, anyof_range, anyof_class
Neg: 1 if negated
Ender: anyof_close
anyof_char
Family: anyof_char
Types: anyof_char (X)
Data: actual character
anyof_range
Family: anyof_range
Types: anyof_range (X-Y)
Data: array reference of lower and upper bounds, both anyof_char
anyof_class
Family: anyof_class
Types: via [:NAME:], [:^NAME:], \p{NAME}, \P{NAME}: alnum (\w, \W), alpha, ascii, cntrl, digit (\d, \D), graph, lower, print, punct, space (\s, \S), upper, word, xdigit; others are possible (Unicode properties)
Data: 'POSIX' if [:NAME:], [:^NAME:] (or other POSIX notations, like [=NAME=] and [.NAME.]); otherwise, reference to alnum, digit, space, or prop object
Neg: 1 if negated
anyof_close
Family: close
Types: anyof_close (] when in [...)
Omitted
prop
Family: prop
Types: name of property (\p{NAME}, \P{NAME}); any Unicode property defined by Perl or elsewhere
Neg: 1 if negated
clump
Family: clump
Types: clump (\X)
or
Family: branch
Types: or (|)
Data: array reference of array references, each representing one alternation, holding any number of objects
Branched
exact
Family: exact
Types: exact (abc), exactf (abc with /i on)
Data: array reference of actual characters
quant
Family: quant
Types: star (*), plus (+), curly (?, {n}, {n,}, {n,m})
Data: one object
group
Family: group
Types: group ((?:, (?i-s:)
Data: array reference of any number of objects
Ender: tail
open
Family: open
Types: open1, open2 ... openN (()
Data: array reference of any number of objects
Ender: close
close
Family: close
Types: close1, close2 ... closeN () when in (...)
Omitted
tail
Family: close
Types: tail () when not in (...)
Omitted
ref
Family: ref
Types: ref1, ref2 .. refN (\1, \2, etc.); reff1, reff2 .. reffN (\1, \2, etc. with /i on)
ifmatch
Family: assertion
Types: ifmatch ((?=, (?<=)
Data: array reference of any number of objects
Dir: -1 if look-behind, 1 if look-ahead
Ender: tail
unlessm
Family: assertion
Types: unlessm ((?!, (?<!)
Data: array reference of any number of objects
Dir: -1 if look-behind, 1 if look-ahead
Ender: tail
suspend
Family: assertion
Types: suspend ((?>)
Data: array reference of any number of objects
Ender: tail
ifthen
Family: assertion
Types: ifthen ((?()
Data: array reference of two objects; first: ifmatch, unlessm, eval, groupp; second: branch
Ender: tail
groupp
Family: groupp
Types: groupp1, groupp2 .. grouppN (1, 2, etc. when in (?()
eval
Family: assertion
Types: eval ((?{)
Data: string with contents of assertion
logical
Family: assertion
Types: logical ((??{)
Data: string with contents of assertion
flags
Family: flags
Types: flags ((?i-s))
minmod
Family: minmod
Types: minmod (? after quant)
Data: an object in the quant family
SEE ALSO
Regexp::Parser, Regexp::Parser::Handlers, Regexp::Parser::Hierarchy.
AUTHOR
Jeff "japhy" Pinyan, japhy@perlmonk.org
COPYRIGHT
Copyright (c) 2004 Jeff Pinyan japhy@perlmonk.org. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.