NAME

Persist::Filter - Functions for parsing filters

SYNOPSIS

use Persist::Filter;
$tree = parse_filter($filter);

$tree->remap(sub { ... });
$tree->remap_on('Persist::Filter::Comparison', sub { ... });

$stringified = $tree->unparse;

DESCRIPTION

This package provides help for dealing with Persist filters. Filters are query strings similar in format to the expressions used in a SQL WHERE clause. This type of query language was chosen for three reasons: it is familiar to those used to SQL programming (such as myself); it makes translation into SQL extremely straightforward, SQL being the query language used by the first important Persist drivers; and it is a simple and powerful language.

This package provides a few functions and methods for parsing filter strings, walking the parse trees for modification or examination, and turning parse trees back into filter strings. First, let's examine the format of Persist filters.

FILTERS

Filters are simple boolean expressions similar to the SQL WHERE clause. Each filter is made up of one or more comparison expressions that are separated by either of the boolean operators AND or OR. Each comparison expression may be preceded by the boolean operator NOT to invert the result as well. Parentheses may be used to group tests together.

The comparison operators available are:

* Equivalence (=)
* Non-Equivalence (<>)
* Less-Than (<)
* Less-Than-Or-Equal (<=)
* Greater-Than (>)
* Greater-Than-Or-Equal (>=)
* Case-Sensitive Similarity (LIKE)
* Case-Insensitive Similarity (ILIKE)

In the case of LIKE and ILIKE, the literal value in the comparison may contain the metacharacters "%" or "_". The "%" matches zero or more of any character and the "_" matches exactly one of any character. All operators but LIKE and ILIKE are commutative. LIKE and ILIKE treat the right side of the expression as a match expression and the left as a literal string. Some drivers may not support using a column name as the match expression with LIKE and ILIKE.

As an alternative to literal values, a question mark (?) may be used as a placeholder for a literal value. This may only be done when the method the filter is passed to also accepts a reference to an array of bindings. Each element of the bindings array replaces one question mark placeholder, in order of appearance.

For example, given the filter:

foo = ? AND bar = ?

and the array:

('hello', 'world')

we effectively have:

foo = 'hello' AND bar = 'world'

This will be familiar to anyone who has used a SQL database API such as DBI.
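
To make the substitution concrete, here is a minimal sketch in plain Perl. The helpers bind_filter and quote_literal are hypothetical and are not part of Persist::Filter. The sketch also assumes that string literals are single-quoted and that a backslash escapes an embedded quote, which this document does not specify. Real drivers may well pass the bindings through to the database API instead of splicing them into the string.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a value in single quotes, escaping quotes and
# backslashes with a backslash (an assumed escaping convention).
sub quote_literal {
    my ($value) = @_;
    $value =~ s/(['\\])/\\$1/g;
    return "'$value'";
}

# Hypothetical helper: replace each ? outside a string literal with the
# next binding, in order of appearance.
sub bind_filter {
    my ( $filter, @bindings ) = @_;

    # The capturing group makes split() return the quoted literals too,
    # so they can be skipped rather than scanned for question marks.
    my @parts = split /('(?:[^'\\]|\\.)*')/, $filter;

    for my $part (@parts) {
        next if $part =~ /^'/;
        $part =~ s{\?}{
            @bindings ? quote_literal( shift @bindings )
                      : die "Not enough bindings for filter: $filter\n"
        }ge;
    }
    return join '', @parts;
}

# prints: foo = 'hello' AND bar = 'world'
print bind_filter( 'foo = ? AND bar = ?', 'hello', 'world' ), "\n";
```

Every binding is quoted as a string here, including numbers; a fuller version would need to decide how numeric bindings are rendered.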

HELPER METHODS

Here is the list of functions provided by this package for working with filter strings.

$ast = parse_filter($filter)

This function accepts a single argument: a string containing a filter. It returns undef when the filter is invalid; otherwise it returns an abstract syntax tree (AST) representing the parsed filter.

The AST basically has this form:

[ operand, operator, operand ]

Each operand is either a scalar reference (for identifiers or values) or another AST. The operator is always a string representation of an operator.

As an example, this filter:

o.age > 40 and (a.color = 'blue' and not a.color = 'green')

would result in almost this AST:

[ [ 'o.age', '>', '40' ], 'and', [ [ 'a.color', '=', "'blue'" ], 'and',
                                 [ 'not', [ 'a.color', '=', "'green'" ] ] ] ]

"Almost" because each operand shown as a scalar here would actually be a reference to a variable containing that string. Furthermore, every reference is then blessed into a class to identify the object's type. Therefore, a more correct representation of the tree would be this (as output from Data::Dumper):

bless( [
         bless( [
                  bless( do{\(my $o = 'o.age')}, 'Persist::Filter::Identifier' ),
                  '>',
                  bless( do{\(my $o = '40')}, 'Persist::Filter::Number' )
                ], 'Persist::Filter::Comparison' ),
         'and',
         bless( [
                  bless( [
                           bless( do{\(my $o = 'a.color')}, 'Persist::Filter::Identifier' ),
                           '=',
                           bless( do{\(my $o = '\'blue\'')}, 'Persist::Filter::String' )
                         ], 'Persist::Filter::Comparison' ),
                  'and',
                  bless( [
                           'not',
                           bless( [
                                    bless( do{\(my $o = 'a.color')}, 'Persist::Filter::Identifier' ),
                                    '=',
                                    bless( do{\(my $o = '\'green\'')}, 'Persist::Filter::String' )
                                  ], 'Persist::Filter::Comparison' )
                         ], 'Persist::Filter::Not' )
                ], 'Persist::Filter::Junction' )
       ], 'Persist::Filter::Junction' )

Obviously, this isn't quite as simple to digest as the previous example, but it really is easy to process with Perl.
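
As a small illustration of how such a node can be taken apart, the following sketch builds the node for "o.age > 40" by hand and inspects it. It does not call parse_filter, so it runs without the module installed; the variable names are made up for the example, and the node layout is simply copied from the dump above.

```perl
use strict;
use warnings;

# Build the node for "o.age > 40" by hand, mirroring the dump above.
# bless() accepts any class name, so Persist::Filter need not be loaded.
my $comparison = bless [
    bless( \( my $identifier = 'o.age' ), 'Persist::Filter::Identifier' ),
    '>',
    bless( \( my $number = '40' ), 'Persist::Filter::Number' ),
], 'Persist::Filter::Comparison';

# A comparison is an array reference: operand, operator, operand.
my ( $left, $operator, $right ) = @{$comparison};

# Operands are scalar references, so they are dereferenced to read them.
print ref($left), ' holds ', ${$left}, "\n";
print "operator is a plain string: $operator\n";
print ref($right), ' holds ', ${$right}, "\n";
```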

It should be noted that operator names and numeric literals are always converted to lowercase. The classes that each of these are blessed into are part of a rich inheritance tree to allow for easy tree walking--as we shall see later on.

$filter = unparse_filter($ast)

This function performs the exact opposite of parse_filter: it constructs a filter string from an AST. The given AST and resulting filter do not need to strictly adhere to the normal filter syntax--and can actually deviate quite far from it, if necessary. The tree is basically pretty printed according to some simple rules that do not depend on the tree being in filter format. This is so that complicated transformations of the tree into driver dependent query languages have an easier time of stringifying the AST.

The rules are as follows:

  1. References are dereferenced and ignored. The classes they are blessed into, if any, are also ignored. References to references have no defined semantics during pretty printing.

  2. Arrays (or references to them) have their contents printed by adding space between internal elements and surrounding contents with parentheses.

  3. Scalars (or references to them) are printed as is.

  4. Data of any other type have no defined semantics during pretty printing--that is, we haven't really defined what will happen, but you can bet it probably won't be good.

Here are some samples.

This AST:

[ [ 'o.age', '>', '40' ], 'and', [ [ 'a.color', '=', "'blue'" ], 'and',
                                 [ 'not', [ 'a.color', '=', "'green'" ] ] ] ]

would become:

((o.age > 40) and ((a.color = 'blue') and (not (a.color = 'green'))))

and this AST:

[ 'o->name', '=~', '/Bob.*/' ]

would become:

(o->name =~ /Bob.*/)

and this AST:

[ '&',[ 'name','=',"'Bob'" ],[ 'age','>','40' ],[ 'color','=',"'red'" ] ]

would become:

(& (name = 'Bob') (age > 40) (color = 'red'))

Obviously, this is very flexible. It allows us to build strings that are parseable in most query languages.
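
The pretty-printing rules fit in a few lines of Perl. What follows is only a sketch written from the rules as stated above: sketch_unparse is a hypothetical name, this is not the code of unparse_filter itself, and recursion is used purely for brevity.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Hypothetical re-implementation of the pretty-printing rules.
sub sketch_unparse {
    my ($node) = @_;

    # reftype() looks through any blessing, so classes are ignored (rule 1).
    my $type = reftype($node);

    return $node    unless defined $type;     # plain scalar (rule 3)
    return ${$node} if $type eq 'SCALAR';     # reference to a scalar (rule 3)

    if ( $type eq 'ARRAY' ) {                 # array reference (rule 2)
        return '(' . join( ' ', map { sketch_unparse($_) } @{$node} ) . ')';
    }

    # Rule 4 leaves everything else undefined; the sketch simply refuses.
    die "Cannot pretty print a $type reference\n";
}

# prints: (& (name = 'Bob') (age > 40) (color = 'red'))
print sketch_unparse(
    [ '&', [ 'name', '=', "'Bob'" ], [ 'age', '>', '40' ], [ 'color', '=', "'red'" ] ]
), "\n";
```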

AST CLASS HIERARCHY

Each AST object returned from parse_filter is blessed as one of these classes:

Persist::Filter::AST

Every AST object is a Persist::Filter::AST object as all inherit from it. This class isn't used directly. It provides three methods for every AST object.

$ast->remap(\&subroutine)

This is the most fundamental of the tree-walking methods. It calls the given subroutine on every node in the tree, including the current node. On each call, the subroutine is passed a reference to the node as its argument.

This method performs the tree-walking operation using a non-recursive algorithm so it should be relatively efficient.
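
For readers curious what a non-recursive walk can look like, here is a sketch that uses an explicit stack. It is not the module's implementation: walk_nodes and leaf are hypothetical names, the tree is built by hand, and two details are assumptions because this document does not specify them, namely that nodes are visited parent first and left to right, and that plain operator strings are not passed to the callback.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# Hypothetical walker: visits every reference in the tree without recursion.
sub walk_nodes {
    my ( $root, $callback ) = @_;
    my @stack = ($root);
    while (@stack) {
        my $node = pop @stack;
        next unless ref $node;    # operator strings are skipped (assumption)
        $callback->($node);

        # Push children in reverse so they are popped left to right.
        push @stack, reverse @{$node} if reftype($node) eq 'ARRAY';
    }
    return;
}

# Small constructor for blessed scalar-reference leaves.
sub leaf {
    my ( $value, $class ) = @_;
    return bless \$value, "Persist::Filter::$class";
}

# Hand-built tree for: o.age > 40 and not a.color = 'green'
my $tree = bless [
    bless( [ leaf( 'o.age', 'Identifier' ), '>', leaf( '40', 'Number' ) ],
        'Persist::Filter::Comparison' ),
    'and',
    bless( [
        'not',
        bless( [ leaf( 'a.color', 'Identifier' ), '=', leaf( "'green'", 'String' ) ],
            'Persist::Filter::Comparison' ),
    ], 'Persist::Filter::Not' ),
], 'Persist::Filter::Junction';

# Record the visit order and rewrite identifiers in place; the rewrite
# works because each leaf is a reference to its string.
my @visited;
walk_nodes( $tree, sub {
    my ($node) = @_;
    push @visited, ref $node;
    ${$node} =~ s/^o\./owner./ if ref($node) eq 'Persist::Filter::Identifier';
} );

print join( ', ', @visited ), "\n";
print ${ $tree->[0][0] }, "\n";    # prints: owner.age
```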

$ast->remap_on($type, \&subroutine)

This method does essentially the same thing as remap but only calls the given subroutine on nodes whose class is equal to or a descendant of the given $type. That is, UNIVERSAL::isa is called on each AST object and $type to determine when &subroutine should be called.

This is performed with a non-recursive algorithm for efficiency.
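
The type test can be sketched as a small filter in front of the callback. Again this is an illustration rather than the module's code: walk_nodes_of_type and leaf are hypothetical names, the two @ISA lines are a stand-in for the inheritance the real classes provide, and the visit order is an assumption.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Stand-in inheritance for the sketch; the real classes come with the module.
@Persist::Filter::String::ISA = ('Persist::Filter::Literal');
@Persist::Filter::Number::ISA = ('Persist::Filter::Literal');

# Hypothetical walker: calls back only for nodes that are a $type.
sub walk_nodes_of_type {
    my ( $root, $type, $callback ) = @_;
    my @stack = ($root);
    while (@stack) {
        my $node = pop @stack;
        next unless ref $node;
        $callback->($node) if blessed($node) && $node->isa($type);
        push @stack, reverse @{$node} if reftype($node) eq 'ARRAY';
    }
    return;
}

sub leaf {
    my ( $value, $class ) = @_;
    return bless \$value, "Persist::Filter::$class";
}

# Hand-built tree for: name = 'Bob' and age > 40
my $tree = bless [
    bless( [ leaf( 'name', 'Identifier' ), '=', leaf( "'Bob'", 'String' ) ],
        'Persist::Filter::Comparison' ),
    'and',
    bless( [ leaf( 'age', 'Identifier' ), '>', leaf( '40', 'Number' ) ],
        'Persist::Filter::Comparison' ),
], 'Persist::Filter::Junction';

# Asking for the parent class matches both String and Number leaves.
my @literals;
walk_nodes_of_type( $tree, 'Persist::Filter::Literal',
    sub { push @literals, ${ $_[0] } } );

print "@literals\n";    # prints: 'Bob' 40
```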

$filter = $ast->unparse

This is just a shorthand for:

$filter = unparse_filter($ast);

Persist::Filter::Logical

All logic operations are subclassed from Persist::Filter::Logical. This includes Persist::Filter::Junction and Persist::Filter::Not. All subclasses of this class are blessed array references.

Persist::Filter::Junction

Binary logical operations are blessed into this class. This includes conjunction (AND) and disjunction (OR). These are blessed array references.

Persist::Filter::Not

Unary logical negation (NOT) operations are blessed into this class. These are blessed array references.

Persist::Filter::Comparison

All of the binary comparison operations (i.e., =, <>, <, >, <=, >=, LIKE, ILIKE, NOT LIKE, NOT ILIKE) are blessed into this class. These are blessed array references.

Persist::Filter::Operand

All operands are blessed into subclasses of this class. All subclasses of this type are blessed scalar references.

Persist::Filter::Identifier

Identifiers are blessed into this class.

Persist::Filter::Literal

All literals are subclassed from this class.

Persist::Filter::String

Literal strings are blessed into this class.

Persist::Filter::Number

Literal numbers are blessed into this class.

Persist::Filter::Placeholder

Literal placeholders (?) are blessed into this class.
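
The listing above can be read as an inheritance tree. The sketch below writes that tree down as a table and asks Perl for the ancestry of a class. It recreates the @ISA relationships by hand, so it should not be mixed with the loaded module. The table %parent_of and the helper ancestry are hypothetical, and where the text does not name a direct parent (for Comparison and Operand in particular) the link to Persist::Filter::AST is an assumption.

```perl
use strict;
use warnings;
use mro;

# Child => parent, read off the class listing; "assumed" marks links the
# listing does not state explicitly.
my %parent_of = (
    'Persist::Filter::Logical'     => 'Persist::Filter::AST',
    'Persist::Filter::Junction'    => 'Persist::Filter::Logical',
    'Persist::Filter::Not'         => 'Persist::Filter::Logical',
    'Persist::Filter::Comparison'  => 'Persist::Filter::AST',       # assumed
    'Persist::Filter::Operand'     => 'Persist::Filter::AST',       # assumed
    'Persist::Filter::Identifier'  => 'Persist::Filter::Operand',
    'Persist::Filter::Literal'     => 'Persist::Filter::Operand',   # assumed
    'Persist::Filter::String'      => 'Persist::Filter::Literal',
    'Persist::Filter::Number'      => 'Persist::Filter::Literal',
    'Persist::Filter::Placeholder' => 'Persist::Filter::Literal',
);

# Install the relationships as ordinary @ISA arrays.
for my $class ( keys %parent_of ) {
    no strict 'refs';
    @{"${class}::ISA"} = ( $parent_of{$class} );
}

# Hypothetical helper: the class followed by each of its ancestors.
sub ancestry {
    my ($class) = @_;
    return join ' -> ', @{ mro::get_linear_isa($class) };
}

print ancestry('Persist::Filter::String'), "\n";
print ancestry('Persist::Filter::Not'),    "\n";
```

This is what makes a call such as remap_on with 'Persist::Filter::Literal' reach strings, numbers, and placeholders alike.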

GRAMMAR

Persist::Filter uses Parse::RecDescent to parse the filters. For details on the grammar itself, please examine the source for this package. You can use:

perldoc -m Persist::Filter

to examine the source code after installation.

EXPORT

The functions parse_filter and unparse_filter are always exported.

SEE ALSO

Parse::RecDescent

AUTHOR

Andrew Sterling Hanenkamp, <hanenkamp@users.sourceforge.net>

COPYRIGHT AND LICENSE

Copyright (c) 2003, Andrew Sterling Hanenkamp
All rights reserved.

Redistribution and use in source and binary forms, with or without 
modification, are permitted provided that the following conditions 
are met:

  * Redistributions of source code must retain the above copyright 
    notice, this list of conditions and the following disclaimer.
  * Redistributions in binary form must reproduce the above copyright 
    notice, this list of conditions and the following disclaimer in 
    the documentation and/or other materials provided with the 
    distribution.
  * Neither the name of the Contentment nor the names of its 
    contributors may be used to endorse or promote products derived 
    from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE 
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN 
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 
POSSIBILITY OF SUCH DAMAGE.