NAME

PPIx::Regexp::Element - Base of the PPIx::Regexp hierarchy.

SYNOPSIS

No user-serviceable parts inside.

INHERITANCE

PPIx::Regexp::Element is not descended from any other class.

PPIx::Regexp::Element is the parent of PPIx::Regexp::Node and PPIx::Regexp::Token.

DESCRIPTION

This class is the base of the PPIx::Regexp object hierarchy. It provides the same kind of navigational functionality that is provided by PPI::Element.

METHODS

This class provides the following public methods. Methods not documented here are private, and unsupported in the sense that the author reserves the right to change or remove them without notice.

accepts_perl

$token->accepts_perl( '5.020' )
    and say 'This works under Perl 5.20';

This method returns a true value if the token is acceptable under the specified version of Perl, and a false value otherwise. Unless the token (or its contents) has been equivocated on, the result is simply what you would expect from comparing the given Perl version number against the results of perl_version_introduced() and perl_version_removed().

This method was added in version 0.051_01.
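
As a minimal sketch, assuming the object comes from parsing a string with PPIx::Regexp->new(), the check can be run against more than one Perl version. The sample regular expression and the variable names are illustrative only. Named captures arrived with Perl 5.10, so the older version is expected to be rejected.

```perl
use strict;
use warnings;
use PPIx::Regexp;

# Illustrative input: a named capture, which needs Perl 5.10 or later.
my $re = PPIx::Regexp->new( 'qr{(?<word>\w+)}' )
    or die PPIx::Regexp->errstr();

# Ask about one version before and one after named captures appeared.
foreach my $version ( '5.008', '5.010' ) {
    my $verdict = $re->accepts_perl( $version ) ? 'accepted' : 'rejected';
    print "Perl $version: $verdict\n";
}
```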

ancestor_of

This method returns true if the object is an ancestor of the argument, and false otherwise. By the definition of this method, $self is its own ancestor.
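
A short sketch of the relationship, assuming a parse built with PPIx::Regexp->new() and a token located with find_first() (both belong to other classes in this distribution; the sample input is made up). It also exercises descendant_of(), which is the same test seen from the other side.

```perl
use strict;
use warnings;
use PPIx::Regexp;

my $re = PPIx::Regexp->new( 'qr{(foo)}' )
    or die PPIx::Regexp->errstr();

# The first literal token ('f') sits inside the capture group.
my $literal = $re->find_first( 'PPIx::Regexp::Token::Literal' )
    or die 'No literal found';

print 'Parse is an ancestor of the literal: ',
    ( $re->ancestor_of( $literal ) ? 'yes' : 'no' ), "\n";
print 'Literal is a descendant of the parse: ',
    ( $literal->descendant_of( $re ) ? 'yes' : 'no' ), "\n";
print 'Literal is an ancestor of itself: ',
    ( $literal->ancestor_of( $literal ) ? 'yes' : 'no' ), "\n";
print 'Literal is an ancestor of the parse: ',
    ( $literal->ancestor_of( $re ) ? 'yes' : 'no' ), "\n";
```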

can_be_quantified

$token->can_be_quantified()
    and print "This element can be quantified.\n";

This method returns true if the element can be quantified.

class

This method returns the class name of the element. It is the same as ref $self.

column_number

This method returns the column number of the first character in the element, or undef if that can not be determined.

comment

This method returns true if the element is a comment and false otherwise.

content

This method returns the content of the element.

descendant_of

This method returns true if the object is a descendant of the argument, and false otherwise. By the definition of this method, $self is its own descendant.

explain

This method returns a brief explanation of what the element does. The return will be either a string or undef in scalar context, but may be multiple values or an empty array in list context.

This method should be considered experimental. What it returns may change without notice as my understanding of all the pieces/parts of a Perl regular expression evolves. The worst case is that it will prove entirely infeasible to implement satisfactorily, in which case it will be put through a deprecation cycle and retracted.

error

say $token->error();

If an element is one of the classes that represents a parse error, this method may return a brief message saying why. Otherwise it will return undef.

is_matcher

This method reports on whether the element potentially matches something. Possible returns are a true value if it does, a false (but defined) value if it does not, or undef if this can not be determined.

The idea is to classify elements based on whether they potentially match something in the target string.

This method is overridden to return undef in PPIx::Regexp::Token::Code, PPIx::Regexp::Token::Interpolation, and PPIx::Regexp::Token::Unknown.

This method is overridden to return a true value in PPIx::Regexp::Token::Assertion, PPIx::Regexp::Token::CharClass, PPIx::Regexp::Token::Literal, and PPIx::Regexp::Token::Reference.

For PPIx::Regexp::Node, this method is overridden to return a value computed from the node's children.

For anything else this method returns a false (but defined) value.

in_regex_set

This method returns a true value if the invocant is contained in an extended bracketed character class (also known as a regex set), and a false value otherwise. It also returns true if the invocant is itself a PPIx::Regexp::Structure::RegexSet.

is_quantifier

$token->is_quantifier()
    and print "This element is a quantifier.\n";

This method returns true if the element is a quantifier. You can not tell this from the element's class, because a right curly bracket may represent a quantifier for the purposes of figuring out whether a greediness token is possible.

line_number

This method returns the line number of the first character in the element, or undef if that can not be determined.

location

This method returns a reference to an array describing the position of the element in the regular expression, or undef if locations were not indexed.

The array is compatible with the corresponding PPI::Element method.
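
The following is a sketch only. It assumes the index_locations option of PPIx::Regexp->new(), which is described in the PPIx::Regexp documentation rather than here, and which is needed because a parse made from a plain string is assumed not to index locations by default. The sample input is illustrative.

```perl
use strict;
use warnings;
use PPIx::Regexp;

# Request location indexing explicitly (assumed to be off by default
# when the argument is a plain string).
my $re = PPIx::Regexp->new( 'qr{foo}', index_locations => 1 )
    or die PPIx::Regexp->errstr();

my $literal = $re->find_first( 'PPIx::Regexp::Token::Literal' )
    or die 'No literal found';

# location() returns an array reference, or undef if nothing was indexed.
my $location = $literal->location()
    or die 'Locations were not indexed';

# The convenience methods report individual pieces of the location.
printf "Line %d, column %d\n",
    $literal->line_number(), $literal->column_number();
```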

logical_filename

This method returns the logical file name (taking #line directives into account) of the file containing the first character in the element, or undef if that can not be determined.

logical_line_number

This method returns the logical line number (taking #line directives into account) of the first character in the element, or undef if that can not be determined.

main_structure

This method returns the PPIx::Regexp::Structure::Main that contains the element. In practice this will be a PPIx::Regexp::Structure::Regexp or a PPIx::Regexp::Structure::Replacement.

If the element is not contained in any such structure, undef is returned. This will happen if the element is a PPIx::Regexp or one of its immediate children.
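
A minimal sketch of both cases, assuming a parse built from an illustrative string. The literal lives inside the regular expression proper, while the top-level PPIx::Regexp object is not contained in any main structure.

```perl
use strict;
use warnings;
use PPIx::Regexp;

my $re = PPIx::Regexp->new( 'qr{foo}' )
    or die PPIx::Regexp->errstr();

my $literal = $re->find_first( 'PPIx::Regexp::Token::Literal' )
    or die 'No literal found';

# A token inside the regular expression proper has a main structure.
my $main = $literal->main_structure();
print 'Literal is contained in a ', ref( $main ), "\n";

# The top-level PPIx::Regexp object is not contained in one.
my $top_main = $re->main_structure();
print 'Top-level object: ',
    ( defined $top_main ? ref( $top_main ) : 'undef' ), "\n";
```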

modifier_asserted

$token->modifier_asserted( 'i' )
    and print "Matched without regard to case.\n";

This method returns true if the given modifier is in effect for the element, and false otherwise.

What it does is to walk backwards from the element until it finds a modifier object that specifies the modifier, whether asserted or negated, and returns the specified value. If nothing specifies the modifier, it returns undef.

This method will not work reliably if called on tokenizer output.
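
As a sketch of the backward walk, assume a full parse (not tokenizer output) of an illustrative expression in which only part of the pattern is case-insensitive. The literal inside the (?i:...) group should see the modifier, and the one outside it should not.

```perl
use strict;
use warnings;
use PPIx::Regexp;

my $re = PPIx::Regexp->new( 'qr{foo(?i:bar)}' )
    or die PPIx::Regexp->errstr();

# Literal tokens come back in document order, one per character.
my $literals = $re->find( 'PPIx::Regexp::Token::Literal' )
    or die 'No literals found';

my ( $outside ) = grep { $_->content() eq 'f' } @{ $literals };
my ( $inside )  = grep { $_->content() eq 'b' } @{ $literals };

foreach my $token ( $outside, $inside ) {
    printf "'%s' is matched %s regard to case\n", $token->content(),
        $token->modifier_asserted( 'i' ) ? 'without' : 'with';
}
```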

next_element

This method returns the next element, or nothing if there is none.

Unlike next_sibling(), this will cross from the content of a structure into the elements that define the structure, or vice versa.

next_sibling

This method returns the element's next sibling, or nothing if there is none.

next_token

This method returns the next token, or nothing if there is none.

Unlike next_element(), this will walk the parse tree.

parent

This method returns the parent of the element, or undef if there is none.

perl_version_introduced

This method returns the version of Perl in which the element was introduced. This will be at least 5.000. Before 5.006 I am relying on the perldelta, perlre, and perlop documentation, since I have been unable to build earlier Perls. Since I have found no documentation before 5.003, I assume that anything found in 5.003 is also in 5.000.

Since this all depends on my ability to read and understand masses of documentation, the results of this method should be viewed with caution, if not downright skepticism.

There are also cases which are ambiguous in various ways. For those see the PPIx::Regexp documentation, particularly Changes in Syntax.

Very occasionally, a construct will be removed and then added back. If this happens, this method will return the lowest version in which the construct appeared. For the known instances of this, see the PPIx::Regexp documentation, particularly Equivocation.

perl_version_removed

This method returns the version of Perl in which the element was removed. If the element is still valid the return is undef.

All the caveats to perl_version_introduced() apply here also, though perhaps less severely since although many features have been introduced since 5.0, few have been removed.

Very occasionally, a construct will be removed and then added back. If this happens, this method will return undef if the construct is present in the highest-numbered version of Perl (whether production or development), or the version after the highest-numbered version in which it appeared otherwise. For the known instances of this, see the PPIx::Regexp documentation, particularly Equivocation.
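
A brief sketch of the two version queries used together, assuming an illustrative expression containing \K (which arrived with Perl 5.10) and nothing that has since been removed.

```perl
use strict;
use warnings;
use PPIx::Regexp;

my $re = PPIx::Regexp->new( 'qr{foo\Kbar}' )
    or die PPIx::Regexp->errstr();

my $introduced = $re->perl_version_introduced();
my $removed    = $re->perl_version_removed();

print "Introduced in: $introduced\n";
# undef means the construct is still valid.
print 'Removed in: ',
    ( defined $removed ? $removed : 'not removed' ), "\n";
```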

previous_element

This method returns the previous element, or nothing if there is none.

Unlike previous_sibling(), this will cross from the content of a structure into the elements that define the structure, or vice versa.

previous_sibling

This method returns the element's previous sibling, or nothing if there is none.

This method is analogous to the same-named PPI::Element method, in that it will not cross from the content of a structure into the elements that define the structure.

previous_token

This method returns the previous token, or nothing if there is none.

Unlike previous_element(), this will walk the parse tree.

raw_width

my ( $raw_min, $raw_max ) = $self->raw_width();

This public method returns the minimum and maximum width matched by the element before taking into account such details as what the element actually is and how it is quantified. Either or both elements can be undef if the width can not be determined, and the maximum can be Inf.

This method was added in version 0.085_01.

remove_insignificant

This method returns a new object manufactured from the invocant, but containing only elements for which $elem->significant() returns a true value.

If you call this method on a PPIx::Regexp::Node you will get back a deep clone, but without the insignificant elements.

If you call this method on any other PPIx::Regexp class you will get back either the invocant or nothing. This may change to a clone of the invocant or nothing if unforeseen problems arise with returning the invocant, or if objects become mutable (unlikely, but not impossible.)

requirements_for_perl

say $token->requirements_for_perl();

This method returns a string representing the Perl requirements for the element. This should only be used for informational purposes, as the format of the string may be subject to change.

At the moment, the returns may be:

version <= $]
version <= $] < version
two or more of the above joined by '||'
! $]

The last means that, although all the components of the regular expression can be compiled by some version of Perl, there is no version that will compile all of them.

I reiterate: the returned string may be subject to change, maybe without warning.

This method was added in version 0.051_01.
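
A minimal sketch, assuming an illustrative expression with a named capture. The string is printed for information only; in keeping with the warning above, the code does not try to parse it.

```perl
use strict;
use warnings;
use PPIx::Regexp;

my $re = PPIx::Regexp->new( 'qr{(?<word>\w+)}' )
    or die PPIx::Regexp->errstr();

# Informational only: the format of this string may change.
my $requirements = $re->requirements_for_perl();
print "Requirements: $requirements\n";
```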

scontent

This method returns the significant content of the element. That is, if called on the parse of '/ f u b a r /x', it returns '/fubar/x'. If the invocant contains no insignificant elements, it is the same as content(). If called on an insignificant element, it returns nothing -- that is, undef in scalar context, and an empty list in list context.

This method was inspired by jb's question on Perl Monks about stripping comments and white space from a regular expression: https://www.perlmonks.org/?node_id=1207556

This method was added in version 0.053_01.
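
The example from the description can be sketched as follows, assuming the string is parsed with PPIx::Regexp->new().

```perl
use strict;
use warnings;
use PPIx::Regexp;

my $re = PPIx::Regexp->new( '/ f u b a r /x' )
    or die PPIx::Regexp->errstr();

# content() keeps the insignificant white space; scontent() drops it.
my $content  = $re->content();
my $scontent = $re->scontent();

print "content:  $content\n";
print "scontent: $scontent\n";
```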

significant

This method returns true if the element is significant and false otherwise.

snext_element

This method returns the next significant element, or nothing if there is none.

Unlike snext_sibling(), this will cross from the content of a structure into the elements that define the structure, or vice versa.

snext_sibling

This method returns the element's next significant sibling, or nothing if there is none.

This method is analogous to the same-named PPI::Element method, in that it will not cross from the content of a structure into the elements that define the structure.

sprevious_element

This method returns the previous significant element, or nothing if there is none.

Unlike sprevious_sibling(), this will cross from the content of a structure into the elements that define the structure, or vice versa.

sprevious_sibling

This method returns the element's previous significant sibling, or nothing if there is none.

This method is analogous to the same-named PPI::Element method, in that it will not cross from the content of a structure into the elements that define the structure.

statement

This method returns the PPI::Statement that contains this element, or nothing if the statement can not be determined.

In general this method will return something only under the following conditions:

tokens

This method returns all tokens contained in the element.
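
As a sketch, assuming an illustrative parse, the tokens can be listed with their classes. Because every character of the input belongs to some token, joining the token contents should reproduce the content of the whole element.

```perl
use strict;
use warnings;
use PPIx::Regexp;

my $re = PPIx::Regexp->new( 'qr{a+}i' )
    or die PPIx::Regexp->errstr();

# List each token's class alongside the text it represents.
my @tokens = $re->tokens();
foreach my $token ( @tokens ) {
    printf "%-45s %s\n", $token->class(), $token->content();
}

# Reassemble the original text from the tokens.
my $rebuilt = join '', map { $_->content() } @tokens;
print "Rebuilt: $rebuilt\n";
```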

top

This method returns the top of the hierarchy.

unescaped_content

This method returns the content of the element, unescaped.

visual_column_number

This method returns the visual column number (taking tabs into account) of the first character in the element, or undef if that can not be determined.

whitespace

This method returns true if the element is whitespace and false otherwise.

width

my ( $min, $max ) = $self->width();

This method returns the minimum and maximum number of characters this element can match.

Either element can be undef if it cannot be determined. For example, for /$foo/ both elements will be undef. Recursions will return undef because they can not be analyzed statically -- or at least I am not smart enough to do so. Back references may return undef if the referred-to group can not be uniquely determined.

It is possible for $max to be Inf. For example, for /x*/ $max will be Inf.

Elements that do not actually match anything will return zeroes.

Note: This method was added because I wanted better detection of variable-length look-behinds. Both it and raw_width() (above) should be considered somewhat experimental.

This method was added in version 0.085_01.
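
The cases described above can be sketched like this, assuming width() is called on the object returned by PPIx::Regexp->new(). The %width hash and the sample patterns are illustrative. An undef is printed as the word 'undef'.

```perl
use strict;
use warnings;
use PPIx::Regexp;

my %width;
foreach my $source ( '/foo/', '/x*/', '/$foo/' ) {
    my $re = PPIx::Regexp->new( $source )
        or die PPIx::Regexp->errstr();

    # Either value may be undef, and the maximum may be Inf.
    my ( $min, $max ) = $re->width();
    $width{$source} = [ $min, $max ];

    printf "%-8s minimum %s, maximum %s\n", $source,
        map { defined $_ ? $_ : 'undef' } $min, $max;
}
```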

nav

This method returns navigation information from the top of the hierarchy to this node. The return is a list of names of methods and references to their argument lists. The idea is that given $elem which is somewhere under $top,

my @nav = $elem->nav();
my $obj = $top;
while ( @nav ) {
    my $method = shift @nav;
    my $args = shift @nav;
    $obj = $obj->$method( @{ $args } ) or die;
}
# At this point, $obj should contain the same object
# as $elem.

SUPPORT

Support is by the author. Please file bug reports at https://rt.cpan.org/Public/Dist/Display.html?Name=PPIx-Regexp, https://github.com/trwyant/perl-PPIx-Regexp/issues, or in electronic mail to the author.

AUTHOR

Thomas R. Wyant, III wyant at cpan dot org

COPYRIGHT AND LICENSE

Copyright (C) 2009-2023 by Thomas R. Wyant, III

This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.0. For more details, see the full text of the licenses in the directory LICENSES.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.