NAME
Persist::Filter - Function for parsing filters
SYNOPSIS
use Persist::Filter;
$tree = parse_filter($filter);
$tree->remap(sub { ... });
$tree->remap_on('Persist::Filter::Comparison', sub { ... });
$stringified = $tree->unparse;
DESCRIPTION
This package provides help for dealing with Persist filters. Filters are query strings similar in format to the expressions used in a SQL WHERE clause. This type of query language was chosen for three reasons: it is familiar to those used to SQL programming (such as myself); it makes translation into SQL extremely straightforward, SQL being the query language used for the first important Persist drivers; and it is a simple and powerful language.
This package provides a few functions for parsing filter strings, walking the parse trees for modification or examination, and turning parse trees back into filter strings. First, let's examine the format of Persist filters.
FILTERS
Filters are simple boolean expressions similar to the SQL WHERE clause. Each filter is made up of one or more comparison expressions separated by either of the boolean operators AND or OR. Each comparison expression may also be preceded by the boolean operator NOT to invert the result. Parentheses may be used to group tests together.
COMPARISON OPERATORS
The comparison operators available are:
* Equivalence (=)
* Non-Equivalence (<>)
* Less-Than (<)
* Less-Than-Or-Equal (<=)
* Greater-Than (>)
* Greater-Than-Or-Equal (>=)
* Case-Sensitive Similarity (LIKE)
* Case-Insensitive Similarity (ILIKE)
In the case of LIKE and ILIKE, the literal value in the comparison may contain the metacharacters ``%'' or ``_''. The ``%'' matches zero or more of any character and the ``_'' matches exactly one of any character. All operators but LIKE and ILIKE are symmetric in the kinds of operand they accept on each side. LIKE and ILIKE treat the right side of the expression as a match expression and the left as a literal string. Not all drivers may implement LIKE and ILIKE with the capability of using a column name as a matching expression.
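The matching rules for ``%'' and ``_'' can be sketched in plain Perl. The helper like_to_regex below is hypothetical and is not part of Persist::Filter; it only illustrates the semantics described above by translating a pattern into an anchored regular expression. It assumes there is no escape character for the two metacharacters, since none is documented here.

```perl
use strict;
use warnings;

# like_to_regex is a hypothetical helper, not part of Persist::Filter.
# It turns a LIKE/ILIKE pattern into an anchored Perl regex.
sub like_to_regex {
    my ($pattern, $ignore_case) = @_;
    my $re = '';
    for my $char (split //, $pattern) {
        if    ($char eq '%') { $re .= '.*' }             # zero or more of any character
        elsif ($char eq '_') { $re .= '.'  }             # exactly one of any character
        else                 { $re .= quotemeta $char }  # everything else is literal
    }
    # A true second argument gives ILIKE (case-insensitive) behaviour.
    return $ignore_case ? qr/\A$re\z/si : qr/\A$re\z/s;
}

my $like  = like_to_regex('Bob%');
my $ilike = like_to_regex('b_b%', 1);
print "LIKE matched\n"  if 'Bobby' =~ $like;
print "ILIKE matched\n" if 'BOBBY' =~ $ilike;
```

A real driver would normally pass LIKE through to the database rather than emulate it; the sketch is only meant to make the two metacharacters concrete.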
PLACEHOLDERS
As an alternative to literal values, a question mark (?) may be used as a placeholder for a literal value. This may only be done when the method the filter is passed to also accepts a reference to an array of bindings. Each element of the bindings array replaces one question mark placeholder, in order of appearance.
For example, given the filter:
foo = ? AND bar = ?
and the array:
('hello', 'world')
we effectively have:
foo = 'hello' AND bar = 'world'
This should be familiar to anyone who has used a SQL database API such as DBI.
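To make the substitution order concrete, here is a minimal sketch in plain Perl. The function bind_placeholders is hypothetical and is not provided by this package. It assumes that string literals are single-quoted and contain no embedded quotes, and it quotes every binding as a string; a real driver would more likely hand the bindings to the database than interpolate them.

```perl
use strict;
use warnings;

# bind_placeholders is a hypothetical helper, not part of Persist::Filter.
# Assumption: string literals are single-quoted with no embedded quotes.
sub bind_placeholders {
    my ($filter, @bindings) = @_;
    # Match either a quoted literal (kept as is) or a bare question mark.
    $filter =~ s{('[^']*')|\?}{
        defined $1 ? $1
      : @bindings  ? "'" . shift(@bindings) . "'"
      :              die "more placeholders than bindings\n"
    }ge;
    return $filter;
}

print bind_placeholders('foo = ? AND bar = ?', 'hello', 'world'), "\n";
# prints: foo = 'hello' AND bar = 'world'
```

Question marks inside quoted literals are left alone, so only real placeholders consume a binding.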
IDENTIFIERS
Identifiers may be in either name or name.name format. Usually, the former is preferred, but the latter may be required where name is ambiguous. (This happens when filtering on a table join where two of the joined tables have columns with the same name.) When the name.name format is used, the first name identifies the table and the second identifies the column.
Table identifiers take one of three forms: the name of the table (as given to the appropriate method), a number giving the position of the table in the list (as given), or the name of the table followed by the number of that table's occurrence. That is, if [ 'A', 'B', 'A' ] were passed as the tables to a method using a filter, then a table name of "1" identifies the first occurrence of "A", "2" identifies the "B" table, and "3" identifies the second "A". Or, "A" is an ambiguous table name, so it cannot be used, and "B" identifies the "B" table. Or, "A1" identifies the first "A", "B1" identifies the "B" table, and "A2" identifies the second "A" table. These nomenclatures can be mixed as needed or desired.
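The three naming schemes can be sketched as a small resolver. The function resolve_table is hypothetical and is not part of this package. It returns the zero-based position of the table in the list, or nothing when the identifier is ambiguous or unknown. The order in which the schemes are tried is an assumption of this sketch, not documented behaviour.

```perl
use strict;
use warnings;

# resolve_table is a hypothetical helper, not part of Persist::Filter.
sub resolve_table {
    my ($id, @tables) = @_;

    # Scheme 1: a bare number is a one-based position in the table list.
    if ($id =~ /\A(\d+)\z/) {
        return ($1 >= 1 && $1 <= @tables) ? $1 - 1 : ();
    }

    # Scheme 2: a bare name, usable only when it occurs exactly once.
    my @hits = grep { $tables[$_] eq $id } 0 .. $#tables;
    return $hits[0] if @hits == 1;
    return          if @hits > 1;    # ambiguous, e.g. "A" in (A, B, A)

    # Scheme 3: a name followed by the number of its occurrence.
    if ($id =~ /\A(.*?)(\d+)\z/) {
        my ($name, $n) = ($1, $2);
        my @occurrences = grep { $tables[$_] eq $name } 0 .. $#tables;
        return ($n >= 1 && $n <= @occurrences) ? $occurrences[$n - 1] : ();
    }
    return;
}

my @tables = ('A', 'B', 'A');
for my $id (qw(1 3 B A A2)) {
    my $index = resolve_table($id, @tables);
    printf "%-2s => %s\n", $id, defined $index ? "position $index" : 'unresolved';
}
```

With the table list above, "3" and "A2" both resolve to the last table, while a bare "A" stays unresolved because it is ambiguous.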
HELPER METHODS
Here is the list of functions provided by this package for working with filter strings.
- $ast = parse_filter($filter)
-
This function accepts a single argument: a string containing a filter. It returns either undef when the filter is invalid or an abstract syntax tree (AST) representing the parsed filter.
The AST basically has this form:
[ operand, operator, operand ]
Each operand is either a scalar reference (for identifiers or values) or another AST. The operator is always a string representation of an operator.
As an example, this filter:
o.age > 40 and (a.color = 'blue' and not a.color = 'green')
would result in almost this AST:
[ [ 'o.age', '>', '40' ], 'and', [ [ 'a.color', '=', "'blue'" ], 'and', [ 'not', [ 'a.color', '=', "'green'" ] ] ] ]
"Almost" because each operand shown as a scalar here would actually be a reference to a variable containing that string. Furthermore, every reference is then blessed into a class to identify the object's type. Therefore, a more correct representation of the tree would be this (as output from Data::Dumper):
bless( [
  bless( [
    bless( do{\(my $o = 'o.age')}, 'Persist::Filter::Identifier' ),
    '>',
    bless( do{\(my $o = '40')}, 'Persist::Filter::Number' )
  ], 'Persist::Filter::Comparison' ),
  'and',
  bless( [
    bless( [
      bless( do{\(my $o = 'a.color')}, 'Persist::Filter::Identifier' ),
      '=',
      bless( do{\(my $o = '\'blue\'')}, 'Persist::Filter::String' )
    ], 'Persist::Filter::Comparison' ),
    'and',
    bless( [
      'not',
      bless( [
        bless( do{\(my $o = 'a.color')}, 'Persist::Filter::Identifier' ),
        '=',
        bless( do{\(my $o = '\'green\'')}, 'Persist::Filter::String' )
      ], 'Persist::Filter::Comparison' )
    ], 'Persist::Filter::Not' )
  ], 'Persist::Filter::Junction' )
], 'Persist::Filter::Junction' )
Obviously, this isn't quite as simple to digest as the previous example, but it really is easy to process with Perl.
It should be noted that operator names and numeric literals are always converted to lowercase. The classes that each of these are blessed into are part of a rich inheritance tree to allow for easy tree walking--as we shall see later on.
- $filter = unparse_filter($ast)
-
This function performs the exact opposite of parse_filter: it constructs a filter string from an AST. The given AST and resulting filter do not need to strictly adhere to the normal filter syntax and can actually deviate quite far from it if necessary. The tree is basically pretty printed according to some simple rules that do not depend on the tree being in filter format. This is so that complicated transformations of the tree into driver dependent query languages have an easier time of stringifying the AST.
The rules are as follows:
* References are dereferenced and ignored. The classes they are blessed into, if any, are also ignored. References to references have no defined semantics during pretty printing.
* Arrays (or references to them) have their contents printed by adding space between internal elements and surrounding the contents with parentheses.
* Scalars (or references to them) are printed as is.
* Data of any other type have no defined semantics during pretty printing; that is, we haven't really defined what will happen, but you can bet it probably won't be good.
Here are some samples.
This AST:
[ [ 'o.age', '>', '40' ], 'and', [ [ 'a.color', '=', "'blue'" ], 'and', [ 'not', [ 'a.color', '=', "'green'" ] ] ] ]
would become:
((o.age > 40) and ((a.color = 'blue') and (not (a.color = 'green'))))
and this AST:
[ 'o->name', '=~', '/Bob.*/' ]
would become:
(o->name =~ /Bob.*/)
and this AST:
[ '&',[ 'name','=',"'Bob'" ],[ 'age','>','40' ],[ 'color','=',"'red'" ] ]
would become:
(& (name = 'Bob') (age > 40) (color = 'red'))
Obviously, this is very flexible. It allows us to build strings that are parseable in most query languages.
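The pretty-printing rules above are small enough to sketch directly. The function pretty_print below is a hypothetical stand-in written for illustration; it is not the implementation of unparse_filter, and the Sketch:: class names are made up to show that blessing is ignored. It relies only on reftype from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# pretty_print is a hypothetical stand-in, not the real unparse_filter.
sub pretty_print {
    my ($node) = @_;
    my $type = reftype($node);    # underlying type; ignores any blessed class
    return $node unless defined $type;                    # plain scalar: as is
    return pretty_print($$node) if $type eq 'SCALAR';     # scalar ref: dereference
    return '(' . join(' ', map { pretty_print($_) } @$node) . ')'
        if $type eq 'ARRAY';                              # array: spaces and parens
    die "no defined semantics for $type references\n";
}

# Blessed and unblessed nodes may be mixed freely (class names are made up).
my ($id, $num) = ('age', '40');
my $ast = bless [
    bless([ bless(\$id, 'Sketch::Identifier'), '>', bless(\$num, 'Sketch::Number') ],
          'Sketch::Comparison'),
    'and',
    [ 'not', [ 'color', '=', "'green'" ] ],
], 'Sketch::Junction';

print pretty_print($ast), "\n";
# prints: ((age > 40) and (not (color = 'green')))
```

Because the printer looks only at the underlying reference type, a driver can rewrite the tree into any shape it likes before stringifying it.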
AST CLASS HIERARCHY
Each AST object returned from parse_filter is blessed as one of these classes:
- Persist::Filter::AST
-
Every AST object is a Persist::Filter::AST object as all inherit from it. This class isn't used directly. It provides three methods for every AST object.
- $ast->remap(\&code)
-
This is the most fundamental of the tree-walking functions. It calls the given subroutine on every node in the tree, including the current node. Each time it is called, the subroutine &code is passed a reference to the node as its argument.
This method performs the tree-walking operation using a non-recursive algorithm, so it should be relatively efficient.
- $ast->remap_on($type, \&code)
-
This method does essentially the same thing as remap but only calls the given subroutine on nodes whose class is equal to or a descendant of the given $type. That is, UNIVERSAL::isa is called on each AST object and $type to determine when &code should be called.
This is performed with a non-recursive algorithm for efficiency.
- $filter = $ast->unparse(@args)
-
This is just a shorthand for:
$filter = unparse_filter($ast, @args);
See unparse_filter for details on the arguments passed.
- Persist::Filter::Logical
-
All logic operations are subclassed from Persist::Filter::Logical. This includes Persist::Filter::Junction and Persist::Filter::Not. All subclasses of this class are blessed array references.
- Persist::Filter::Junction
-
Binary logical operations are blessed into this class. This includes conjunction (AND) and disjunction (OR). These are blessed array references.
- Persist::Filter::Not
-
Unary logical negation (NOT) operations are blessed into this class. These are blessed array references.
- Persist::Filter::Comparison
-
All of the binary comparison operations (i.e., =, <>, <, >, <=, >=, LIKE, ILIKE, NOT LIKE, NOT ILIKE) are blessed into this class. These are blessed array references.
- Persist::Filter::Operand
-
All operands are blessed into subclasses of this class. All subclasses of this type are blessed scalar references.
- Persist::Filter::Identifier
-
Identifiers are blessed into this class.
- Persist::Filter::Literal
-
All literals are subclassed from this class.
- Persist::Filter::String
-
Literal strings are blessed into this class.
- Persist::Filter::Number
-
Literal numbers are blessed into this class.
- Persist::Filter::Placeholder
-
Literal placeholders (?) are blessed into this class.
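The class-based, non-recursive walk described for remap_on can be sketched with an explicit stack. Everything below is a stand-in written for illustration: the Sketch:: packages mirror the hierarchy listed above but are not the real Persist::Filter classes, and walk_on is a hypothetical function, not the package's own implementation. The real methods may also differ in visiting order and in what they do with the subroutine's return value.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Stand-in hierarchy mirroring the classes listed above (names are made up).
@Sketch::Logical::ISA     = ('Sketch::AST');
@Sketch::Junction::ISA    = ('Sketch::Logical');
@Sketch::Not::ISA         = ('Sketch::Logical');
@Sketch::Comparison::ISA  = ('Sketch::AST');
@Sketch::Operand::ISA     = ('Sketch::AST');
@Sketch::Identifier::ISA  = ('Sketch::Operand');
@Sketch::Literal::ISA     = ('Sketch::Operand');
@Sketch::String::ISA      = ('Sketch::Literal');
@Sketch::Number::ISA      = ('Sketch::Literal');

# walk_on is a hypothetical helper: call $code on every node that isa $type.
sub walk_on {
    my ($root, $type, $code) = @_;
    my @stack = ($root);                  # explicit stack instead of recursion
    while (@stack) {
        my $node = pop @stack;
        next unless ref $node;            # operators are plain strings
        $code->($node) if blessed($node) && $node->isa($type);
        # Push child references in reverse so they are visited left to right.
        push @stack, reverse grep { ref $_ } @$node
            if reftype($node) eq 'ARRAY';
    }
    return;
}

# Hand-built tree for: age > 40 and not color = 'green'
my @leaf = ('age', '40', 'color', "'green'");
my $ast = bless [
    bless([ bless(\$leaf[0], 'Sketch::Identifier'), '>',
            bless(\$leaf[1], 'Sketch::Number') ], 'Sketch::Comparison'),
    'and',
    bless([ 'not',
            bless([ bless(\$leaf[2], 'Sketch::Identifier'), '=',
                    bless(\$leaf[3], 'Sketch::String') ], 'Sketch::Comparison'),
          ], 'Sketch::Not'),
], 'Sketch::Junction';

my @identifiers;
walk_on($ast, 'Sketch::Identifier', sub { push @identifiers, ${ $_[0] } });
print "@identifiers\n";
# prints: age color
```

Asking for a class higher in the tree, such as the stand-in Sketch::Logical, selects both the junction and the negation, which is the point of having a shared base class.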
GRAMMAR
Persist::Filter uses Parse::RecDescent to parse the filters. For details on the grammar itself, please examine the source for this package. You can use:
perldoc -m Persist::Filter
to examine the source code after installation.
EXPORT
The functions parse_filter and unparse_filter are always exported.
SEE ALSO
AUTHOR
Andrew Sterling Hanenkamp, <hanenkamp@users.sourceforge.net>
COPYRIGHT AND LICENSE
Copyright (c) 2003, Andrew Sterling Hanenkamp
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the
distribution.
* Neither the name of the Contentment nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.