NAME
Criteria::Compile - Describe wanted objects/data using grammar
SYNOPSIS
#users can use common grammar to specify their criteria
search_things(title_is => 'EXAMPLE TITLE', author_like => qr/Anthony.*/);
#write the subroutine by using this module
sub search_things {
#build the criteria object
my $criteria = Criteria::Compile->new(@_);
#once we're ready, export it as an anonymous subroutine
$criteria = $criteria->export_sub;
#filter objects using the exported sub (calls ->title and ->author)
return grep $criteria->($_), @things;
}
DESCRIPTION
This module provides an easy framework to compile "wanted" subroutines by describing simple data structures and objects using custom grammar. Users can supply criteria using a set of basic grammar rules. Functionality can also be extended by defining custom grammar-handlers to construct the necessary logic. A number of useful grammars are provided out-of-the-box.
Reading the notes section of this document is advised!
- Objects and data-types
-
This package by default works with object methods. You can handle different object and data access methods by switching access modes, or by defining new ones. See "access_mode" and "define_access_mode" for more information.
KEY CONCEPTS
What is grammar?
A grammar is a type of operation that can be used to describe an object. The simplest example of this is the 'is' grammar, which equates to an eq
comparison.
Criteria::Compile->new(title_is => 'TITLE-HERE');
The criteria above matches when an object's title
method eq
's TITLE-HERE
. In a similar fashion, the 'like' grammar compares an object attribute against a Regexp
.
- Dynamic Grammar (3)*
-
A dynamic grammar is, as with
is
above, grammar which dynamically operates on object attributes by acting as a pre or post-fix. All of the default grammar built-in to this module is defined as dynamic grammar. - Static Grammar (1)*
-
A static grammar is an operation with pre-defined operands. These are useful for simplifying complex operations, or operating on information that is derived from different or multiple sources. These will mainly become useful when subclassing this module for use on specific object types (e.g. an EmployeeCriteria class with custom logic).
- Chained Grammar (2)*
-
Chained grammar is the same as dynamic grammar with the special-case of being processed before dynamic grammar. Typically, this behaviour is used to do something clever before redispatching to the real dynamic handler using "resolve_dispatch".
* = precedence order in the case of multiple-matching definitions
PUBLIC CONSTANTS
The following constants are provided by this package:
Grammar Types
Tokens used to indicate grammar types to the various subroutines in this module, such as "define_grammar".
TYPE_STATIC
TYPE_DYNAMIC
TYPE_CHAINED
Access Types
Tokens used to indicate an access mode, for use with "access_mode" etc.
ACC_OBJECT
ACC_HASH
METHODS
new
$inst = Criteria::Compile->new();
$inst = Criteria::Compile->new(%custom_criteria);
This is the constructor for new Criteria::Compile
objects. Optionally takes a key-value list of criteria, which will be passed to "add_criteria".
add_criteria
our $exe_file_check = Criteria::Compile->new();
#check the value returned by the filename method of objects ends in '.exe'
$exe_file_check->add_criteria(filename_like => qr/\.exe$/);
Adds criteria to be applied. This is inherently an AND operation in the context of adding multiple criteria. This method takes a criteria pattern and the value accepted by the grammar. In this case, the 'like' grammar expects a Regexp
.
exec
#match exe files
$exe_file_check->exec($fh)
? print 'Found EXE file!'
: print 'Does not meet the criteria for an EXE file!';
Executes the criteria on the supplied object ad-hoc. Takes a single argument which is the object to be evaluated. Returns a boolean indicating whether the criteria were met.
export_sub
my $sub = $inst->export_sub;
This method returns an anonymous subroutine which will internally execute the exec
method on a value supplied to it. This method takes no arguments.
compile
#alter already in-use criteria
$inst->add_criteria(name_like => qr/^Ben.*/);
#force a re-compile
$inst->compile();
This methods forces a re-compile of the instance's criteria. This can be used to alter or recycle instances. This method takes an optional single argument, which is a HASHREF
containing additional criteria to be added with the special behaviour that they will not be stored, and will be forgotten on subsequent calls to compile
access_mode
$inst->access_mode(Criteria::Compile::ACC_HASH)
$inst->access_mode(Criteria::Compile::ACC_OBJECT)
Switches the current access mode used to examine values. Defaults to ACC_OBJECT
. Access modes accepted by default are:
ACC_HASH
-
This mode accesses fields / attributes as keys using perl hash access. This token is available as a constant.
ACC_OBJECT
-
This mode accesses fields / attributes as object methods. This token is available as a constant.
define_access_mode
$inst->define_access_mode('MODE_TOKEN', $getter)
Defines a new object / data access mode. Takes 2 arguments, the mode token, which is the string which will be passed to "access_mode" to activate this mode, and a SUBREF
to the "getter" implementation. The getter implementation is a subroutine which has the signature:
$sub->($object, $attribute)
The subroutine must return a value corresponding to the object's attribute when called, or undef
.
define_grammar
package MySubclass;
use parent qw( Criteria::Compile );
sub _init {
my $self = shift;
return 0 unless $self::SUPER->_init(@_);
#add a new grammar
$self->define_grammar($match, $handler);
#add a new chained grammar
$self->define_grammar($match, $handler, $self->TYPE_CHAINED);
return 1;
}
Defines a new grammar for use in criteria. Takes the match pattern, the name of the handler sub as a string or CODEREF
, and a final optional type flag which indicates the grammar type (Defaults to TYPE_DYNAMIC
).
See "List of Default Grammar" for a list of already-available grammars. See "Defining Custom Grammar" for more information on grammar types.
resolve_dispatch
#this statement says "tell me how to handle title_is"
my ($handler, @handler_arguments) = $self->resolve_dispatch('title_is');
#let's compile it with a value!
my $sub = $handler->($value, @handler_arguments);
#use it as a single standalone criteria!
$sub->($object);
Takes a criteria key, and returns the handler CODEREF
and the arguments required to compile the criteria using it, minus the criteria's value which needs to be prepended to the list before use. This method is intended for internal usage within this module and subclasses.
GRAMMAR
List of Default Grammar
(.*)_is
-
name_is => 'Anthony'
Evaluates the value of the field corresponding to match group 1 against the value supplied, using
eq
(.*)_like
-
address_like => qr/.*London.*/
Evaluates the value of the field corresponding to match group 1 against a
Regexp
value supplied, using=~
(.*)_greater_than
-
score_greater_than => 20
Evaluates the value of the field corresponding to match group 1 is greater than the supplied
number
, using>
(.*)_less_than
-
score_less_than => 20
Evaluates the value of the field corresponding to match group 1 is less than the supplied
number
, using<
(.*)_in
-
age_in => [16..25]
Evaluates whether the value of the field corresponding to match group 1 exists in the supplied
ARRAYREF
, usingeq
(.*)_matches
-
user_matches => \%allowed_users
Evaluates the value of the field corresponding to match group 1 against the value supplied, using
~~
(smart match). See perlsyn for more detail on smart matching.
Defining Custom Grammar
A good practise when starting to use this module (or pattern, more precicely) will be to automatically create a subclass of this module for every new object type you develop. I do this because it allows anyone working with my new object / data type to easily deal with them in a simple and consistent way. Another side-effect you may notice is that this provides a great deal of encapsulation, allowing you to keep the details of exactly how to extract complex data from your objects within a package you control, should that be necessary.
As every object or data type has very diferrent uses, there will be a need to define custom grammar specific to your objects or data structures. This may mean a 'near' grammar which calculates distance, or a 'related_to' grammar that checks a person's family records.
There are 3 parts to custom grammar definitions, the match pattern, the grammar handler, and the grammar type. The match pattern is a Regexp
which determines how users can access your new grammar, in which case the handler will be called and must return an anonymous subroutine which can execute the check. The grammar handler is only called once, during "compilation". The grammar type is one of the constants provided.
A simple example of this would be the definition of the 'is' grammar.
#DISCLAIMER: Although functional, this is not the actual implementation of 'is'.
#GRAMMAR MAPPING
$inst->define_grammar(qw/^(.*)_is$/, 'is_handler',
Criteria::Compile::TYPE_DYNAMIC);
#Note: Grammar in this module is defined by decorating an instance.
#It is recommended to override the init method should you wish to subclass or extend this package.
#HANDLER
sub is_handler {
my ($context, $value, $attribute) = @_;
return unless ($attribute);
#for example, in: filename_is => 'something.txt'
#attribute will be match group 1 from the match pattern (filename)
#value is the value supplied to add_criteria ('something.txt')
#return a subroutine which will 'eq' against the $attribute method of objects passed in
return sub {
my $object = $_[0];
if (ref($object) {
my $obj_value = $object->$attribute;
return ($obj_value eq $value);
}
return 0;
};
}
Defining Chained Grammar
There is no special magic to chained grammar, only the expectation that the grammar's handler will internally re-dispatch to another grammar handler after making whatever changes are needed to the operands. See "resolve_dispatch" for more information on re-dispatching.
A simple example of a chained grammar would be a 'data' grammar which accesses a sub-object like below. It would allow a user to prefix criteria like 'content_is' as 'data_content_is' to include sub-objects as part of your criteria.
#GRAMMAR MAPPING
$inst->define_grammar(qw/^data_(.*)$/, 'data_chandler',
Criteria::Compile::TYPE_CHAINED);
#Note: Grammar in this module is defined by decorating an instance.
#It is recommended to override the init method should you wish to subclass or extend this package.
#HANDLER
sub data_chandler {
my ($self, $value, $real_crit) = @_;
#lookup the real handler, with our prefix removed
my ($real_sub, @args) = $self->resolve_dispatch($real_crit);
#get the criteria subroutine from the real handler
return unless ($real_sub);
return unless ($real_sub = $self->$real_sub($value, @args));
#return a subroutine which will extract the data from the object to operate on
#and then call the real criteria subroutine
return sub {
my $object = $_[0];
my $data = $object->data();
#pass the object's data on to the real subroutine
return $real_sub->($data);
};
}
Note: If speed matters, chained grammar is not for you. In such cases, use static or dynamic grammar, as you can flatten (hard-code) logic down to a single subroutine call for improved speed.
MORE EXAMPLES
See this distribution's examples
directory.
EXPORT
None by default.
NOTES
This looks slow!
This module makes extensive use of anonymous subroutines and closures, and some developers have expressed that this looks slow to them. These developers have typically heard that "Perl subroutines are slow", possibly even neglecting the readability and maintainability of their applications based on inaccurate assumptions.
The built-in grammar in this module add around 3 calls per criterion. That's a 15 call overhead for running 5 criteria on 3 objects. To put this into perspective, calling ->now
from the DateTime package causes 25 calls. There's almost zero chance that the cost of using this module has any real impact on your application's performance.
There are very few instances where the cost of performing any such optimisation to flatten the call-stack during the compile method does not almost completely outweigh any slight performance gains, if not turning out slower. Even if your handlers were tiny, using simple Perl operators, you would need to run the same criteria without compiling a new one 100,000 times in a tight loop to see a 100 millisecond gain.
In the end, don't believe me, I could be wrong about your application and/or hardware environment. Benchmark!
This is "slow"!
This module is "slow"-er than it could be, as each criterion causes at least 2 subroutine calls per-check. In the future there may be a cross-compatible B or XS-based variation of this module if I find the time (or anyone reading this, feel free to write it!). As yet, the speed has not been an issue for me, but there may well be a time in the future I will to want to use this for heavier or faster processing (or both!).
SEE ALSO
Nothing to see here!
AUTHOR
A. J. Lucas, <kaoyoriketsu@ansoni.com>
COPYRIGHT AND LICENSE
Copyright (C) 2011 - Anthony J. Lucas
This is free software; you can redistribute it and/or modify it under the same terms as perl itself.