NAME
Data::RuledValidator - data validator with rule
DESCRIPTION
Data::RuledValidator is a data validator. It is driven by rules which are readable by non-programmers, so a rule file reads like a specification.
WHAT FOR ?
One programmer said:
the specification is in the code, so documentation is not needed.
Another programmer said:
the code is the specification, so if I write a specification, it is against DRY.
These are excuses. Such programmers may dislike writing documents, may not be good at writing them, and/or may think that validation is a trivial task. But if the specification is used directly by the program, so that no validation code needs to be written, they will start writing specifications. And, in the end, we need a specification.
SYNOPSIS
You can use this module without a rule file.
BEGIN{
$ENV{REQUEST_METHOD} = "GET";
$ENV{QUERY_STRING} = "page=index&age=9&name=aaaaa&nickname=bbbb";
}
use Data::RuledValidator;
use CGI;
my $v = Data::RuledValidator->new(obj => CGI->new, method => "param");
print $v->by_sentence("age is num", "name is word", "nickname is word", "required = age,name,nickname"); # return 1 if valid
This means that, among the parameters of the CGI object, age must be a number, name and nickname must be words, and age, name and nickname are all required.
The next example uses the following rules, stored in the file "validator.rule":
;;GLOBAL
ID_KEY page
# $cgi->param('age') is num
age is num
# $cgi->param('name') is word
name is word
# $cgi->param('nickname') is word
nickname is word
# the following rules are applied when $cgi->param('page') is 'index'
;;index
# requires $cgi->param('age'), $cgi->param('name') and $cgi->param('nickname')
required = age, name, nickname
And the code is (the environment values are the same as in the first example):
my $v = Data::RuledValidator->new(obj => CGI->new, method => "param", rule => "validator.rule");
print $v->by_rule; # return 1 if valid
This is nearly the same as the first example. The value of ID_KEY, "page", is the name of the parameter whose value selects the rule group to use.
my $q = CGI->new;
my $id = $q->param("page");
Here $id is "index" (see the environment values in the BEGIN block above), so the rules in the "index" group are used. The value is fetched with the object and method given to new. The "index" group is as follows:
;;index
required = age, name, nickname
The global rules are applied as well:
age is num
name is word
nickname is word
So this is the same as the first example: among the parameters of the CGI object, age must be a number, name and nickname must be words, and age, name and nickname are all required.
RuledValidator GENERAL IDEA
Object
The Object holds the data which you want to check, and has a Method which returns Value(s) from that data.
Key
Basically, the Key is the key which is passed to the Object's Method.
Value(s)
Value(s) are what the Object's Method returns when it is given the Key.
Operator
The Operator is what checks the Value(s).
Condition
The Condition is what the Operator uses to judge whether the Value(s) are valid or not.
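The relationship between these five terms can be sketched in plain Perl. This is only an illustration of the idea, not the module's implementation: the MyObj package, the %condition table and check_sentence are hypothetical names made up for this sketch, and only the 'is' Operator is handled.

```perl
use strict;
use warnings;

{
    # Hypothetical Object: holds data and has a Method ("param") returning a Value.
    package MyObj;
    sub new   { my ($class, %data) = @_; return bless {%data}, $class }
    sub param { my ($self, $key) = @_; return $self->{$key} }
}

# Hypothetical Conditions that the Operator "is" can use.
my %condition = (
    num  => sub { defined $_[0] && $_[0] =~ /^\d+$/ },
    word => sub { defined $_[0] && $_[0] =~ /^\w+$/ },
);

# Split "Key Operator Condition" and apply the Condition to the Value.
sub check_sentence {
    my ($obj, $method, $sentence) = @_;
    my ($key, $operator, $cond) = split ' ', $sentence, 3;
    die "only 'is' is handled in this sketch" unless $operator eq 'is';
    my $value = $obj->$method($key);    # Object->Method(Key) gives the Value
    return $condition{$cond}->($value) ? 1 : 0;
}

my $obj = MyObj->new(age => 9, name => 'aaaaa');
print check_sentence($obj, 'param', 'age is num'),  "\n";
print check_sentence($obj, 'param', 'name is num'), "\n";
```

Here 'age is num' passes and 'name is num' fails, because the Value for name is not made of digits.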
USING OPTION
You can pass options when loading Data::RuledValidator with use.
- import_error
This defines the behavior when a plugin cannot be imported correctly.
use Data::RuledValidator import_error => 0;
If the value is 0, nothing is done. This is the default.
use Data::RuledValidator import_error => 1;
If the value is 1, a warning is issued.
use Data::RuledValidator import_error => 2;
If the value is 2, the module dies.
- plugin
You can specify which plugins you want to load.
use Data::RuledValidator plugin => [qw/Email/];
If you don't specify any plugins, all plugins are loaded.
- filter
You can specify which filter plugins you want to load.
use Data::RuledValidator filter => [qw/XXX/];
If you don't specify any filter plugins, all filter plugins are loaded.
CONSTRUCTOR
- new
my $v = Data::RuledValidator->new(
  obj => $obj,
  method => $method,
  rule => $rule_file_location,
);
$obj is the Object which holds the values you want to check. $method is the Method of $obj which returns the Value(s) you want to check. $rule_file_location is the location of the rule file.
my $v = Data::RuledValidator->new(obj => $obj, method => $method);
If you only use "by_sentence", and/or you call "by_rule" with an argument, there is no need to specify rule here.
You can pass an array ref as method. For example, if $c is the object and $c->res->param is the way to get values, pass [qw/res param/] as method.
If you need a different object and/or method to identify the group name, pass id_obj and id_method:
my $v = Data::RuledValidator->new(obj => $obj, method => $method, id_obj => $id_obj, id_method => $id_method);
For validation, $obj->$method is used. For identifying the group name, $id_obj->$id_method is used (when you omit id_method, method is used).
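One way to read the array ref form of method is that every name except the last is called without arguments, and the last one is called with the key. The sketch below only illustrates that reading; call_chain and the Ctx and Res packages are hypothetical stand-ins, and the module's own resolution logic may differ.

```perl
use strict;
use warnings;

{
    package Res;    # hypothetical response object holding parameters
    sub new   { my $class = shift; return bless { p => { i => 9 } }, $class }
    sub param { my ($self, $key) = @_; return $self->{p}{$key} }
}
{
    package Ctx;    # hypothetical context object, like $c in the text
    sub new { my $class = shift; return bless { res => Res->new }, $class }
    sub res { return $_[0]{res} }
}

# Walk the chain: method => [qw/res param/] means $obj->res->param($key).
sub call_chain {
    my ($obj, $method, $key) = @_;
    my @chain = ref $method eq 'ARRAY' ? @$method : ($method);
    my $last  = pop @chain;
    $obj = $obj->$_ for @chain;    # intermediate calls take no arguments
    return $obj->$last($key);      # the final call receives the key
}

my $c = Ctx->new;
print call_chain($c, [qw/res param/], 'i'), "\n";
```

A plain string such as 'param' is treated as a chain of length one, so both forms go through the same code path.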
CONSTRUCTOR OPTION
- rule
rule => rule_file_location
Explained above.
- filter_replace
Data::RuledValidator has a filter feature. With this option you decide whether or not the value held by the object is replaced with the filtered value.
This option can take 3 kinds of value.
filter_replace => 0
The object's value is not replaced with the filtered value.
filter_replace => 1
filter_replace => []
The filtered value is used. Whether to use 1 or [] depends on how values are set through the object's method.
1 ... $q->param(key, @value);
[] ... $q->param(key, [ @value ]);
- rule_path
rule_path => '/path/to/rule_dir/'
You can specify the path of the directory which contains the rule files.
- auto_reset
By default, the reset method is called automatically when by_rule or by_sentence is called.
If you want to change this behavior, set this option to 0.
auto_reset => 0
You can also change the value with the auto_reset method.
- key_method
key_method => 'param'
key_method is the method of obj which returns keys, like param of the CGI module. If you don't specify this value, the value you specified as method is used. If you want to disable this, set 0 or an empty value as follows.
key_method => 0
key_method => ''
This is for the "filter * with ..." sentence in "FILTERS". When filter_replace is true, this filter sentence applies the filter to the values of all keys which are returned by key_method. When you disable this (you set key_method => 0), the filter is applied only to the values of keys which appear in the rule.
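The difference between filter_replace => 1 and filter_replace => [] is only in how the filtered values are handed back to the object's method. A minimal sketch of that choice follows, assuming a hypothetical Recorder object whose param method simply records the arguments it receives; write_back is also a made-up helper, not part of the module.

```perl
use strict;
use warnings;

{
    package Recorder;    # hypothetical object: remembers what param() was called with
    sub new   { my $class = shift; return bless { args => [] }, $class }
    sub param { my ($self, @args) = @_; $self->{args} = [@args]; return }
}

# Hand filtered values back to the object according to filter_replace.
sub write_back {
    my ($obj, $method, $key, $values, $filter_replace) = @_;
    return 0 unless $filter_replace;            # 0: leave the object untouched
    if (ref $filter_replace eq 'ARRAY') {
        $obj->$method($key, [@$values]);        # []: pass one array ref
    } else {
        $obj->$method($key, @$values);          # 1: pass a flat list
    }
    return 1;
}

my $q = Recorder->new;
write_back($q, 'param', 'tel', ['0312345678', '0398765432'], 1);
print scalar @{ $q->{args} }, "\n";    # key plus two values
write_back($q, 'param', 'tel', ['0312345678', '0398765432'], []);
print scalar @{ $q->{args} }, "\n";    # key plus one array ref
```

Which form is right depends on the setter of the object in use: the flat-list form matches $q->param(key, @value), the array ref form matches $q->param(key, [ @value ]).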
METHOD for VALIDATION
- by_sentence
$v->by_sentence("i is number", "k is word", ...);
The arguments are rules. You can pass multiple sentences. It returns the $v object.
- by_rule
$v->by_rule();
$v->by_rule($rule_file);
$v->by_rule($rule_file, $group_name);
If $rule_file is omitted, the file which was specified in new is used. It returns the $v object.
- result
$v->result;
The result of the validation check. It returns a structure like the following.
{
  'i_is' => 0,
  'v_is' => 0,
  'z_match' => 1,
}
This means:
key 'i' is invalid.
key 'v' is invalid.
key 'z' is valid.
You can also get this result as follows:
%result = @$v;
- valid
$v->valid;
The overall result of the validation check. The returned value is 1 or 0.
You can also get this result as follows:
$result = $v;
- failure
$v->failure;
The values which were given to the validation check, some or all of which are wrong. It returns, for example, the following structure.
{
  'i_is' => ['x', 'y', 'z'],
  'v_is' => ['x@x.jp'],
  'z_match' => ['0123', '1234'],
}
If you want the wrong values only, use the wrong method.
- missing
Keys which appear in the rule but whose values are not given by the object. You can get such keys/aliases as follows:
my $missing_arrayref = $v->missing;
$missing_arrayref looks like the following:
['key', 'alias']
- wrong
This is not implemented.
$v->wrong;
It is meant to return only the wrong values.
{
  'i_is' => ['x', 'y', 'z'],
  'v_is' => ['x@x.jp'],
  'z_match' => ['0123', '1234'],
}
All of them are wrong values.
- reset
$v->reset();
The result of the validation check is reset. This is called internally when by_sentence or by_rule is called.
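Given a result structure like the one shown for the result method above, the failed checks can be listed with ordinary hash operations. The sketch below works on a literal copy of that structure rather than on a live validator object, so it assumes nothing about the module beyond the documented shape of the hash; failed_checks is a hypothetical helper.

```perl
use strict;
use warnings;

# Literal copy of the documented result structure: "<key>_<operator>" => 0 or 1.
my %result = (
    i_is    => 0,
    v_is    => 0,
    z_match => 1,
);

# Names of the checks whose value is false, in sorted order.
sub failed_checks {
    my (%r) = @_;
    my @failed = grep { !$r{$_} } sort keys %r;
    return @failed;
}

print join(', ', failed_checks(%result)), "\n";
```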
OTHER METHOD
- required_alias_name
$v->required_alias_name
This gets/sets the special alias name which is used to specify required keys.
- list_plugins
$v->list_plugins;
Lists all plugins.
- filter_replace
$v->filter_replace;
This gets/sets the filter_replace option of new. The value is 0, 1 or [].
See "CONSTRUCTOR OPTION".
- rule_path
$v->rule_path
This gets/sets the rule_path option of new.
See "CONSTRUCTOR OPTION".
- auto_reset
$v->auto_reset;
This gets/sets the auto_reset option of new. The value is 0 or 1.
See "CONSTRUCTOR OPTION".
RULE SYNTAX
The rule syntax is very simple.
- ID_KEY Key
The right value is the key which is passed to Object->Method. The value returned by Object->Method(Key) is used to identify GROUP_NAME.
ID_KEY page
- ID_METHOD method, method ...
Note that this is used only when you need a different method to identify GROUP_NAME.
The right value is the method which is called on the Object. The value returned by Object->Method(Key), or by Object->Method when Key is omitted, is used to identify GROUP_NAME.
ID_METHOD request action
This can also be defined in the constructor, new.
- ;GROUP_NAME
A line starting with ';' starts a group, and the group ends at the line before the next line starting with ';'. If the value of Object->Method(ID_KEY) is equal to GROUP_NAME, the validation rules of this group are used.
;index
You can also write it as follows.
;;;;index
You can repeat ';' any number of times.
- ;r;^GROUP_NAME$
This also starts a group. If the value of Object->Method(ID_KEY) matches the regexp ^GROUP_NAME$, the validation rules of this group are used.
;r;^.*_confirm$
You can also write it as follows.
;;r;;^.*_confirm$
You can repeat ';' any number of times.
- ;path;/path/to/where
This is the same as ;r;^/path/to/where/?$.
Note that this requires ID_KEY to be 'ENV_PATH_INFO'.
You can also write it as follows.
;;path;;/path/to/where
You can repeat ';' any number of times.
- ;GLOBAL
This also starts a group, but 'GLOBAL' is a special name. The rules in this group are inherited by all groups.
;GLOBAL
i is number
w is word
If you write global rules at the top of the rule file, there is no need to write ;GLOBAL; they are parsed as GLOBAL.
# The top of file
i is number
w is word
They are regarded as global rules.
- #
A line starting with # is a comment.
# This is comment
- sentence
i is number
A sentence has at least 3 parts.
Key Operator Condition
In the example, 'i' is the Key, 'is' is the Operator and 'number' is the Condition.
This means:
return $obj->$method('i') =~ /^\d+$/ + 0;
In some cases, an Operator can take multiple Conditions. This depends on the Operator's implementation.
For example, the Operator 'match' can take multiple Conditions.
i match ^[a-z]+$,^[0-9]+$
When i matches either the former or the latter, it is valid.
Note that:
You CANNOT use the same key with the same operator more than once, as in the following.
i is number
i is word
- alias = sentence
The sentence is the same as above. 'alias =' affects the result data structure.
The first example is the normal version.
Rule:
i is number
p is word
z match ^\d{3}$
Result Data Structure:
{
  'i_is' => 0,
  'p_is' => 0,
  'z_match' => 1,
}
The next example is the version using aliases.
id = i is number
password = p is word
zip = z match ^\d{3}$
Result Data Structure:
{
  'id_is' => 0,
  'password_is' => 0,
  'zip_match' => 1,
}
- Special alias name for required values
required = name, id, password
The alias name "required" is special, and the syntax after the name is a little special, too.
This sentence means that the keys/aliases name, id and password are required.
You can change the name "required" with the required_alias_name method.
Note that you cannot write a key name here if the key is only used under an alias and its own name is not used elsewhere.
For example:
foo is alpha
alias = var is 'value'
# It doesn't work correctly because alias is used instead of key name 'var'
required = foo, var
You should write it as follows:
foo is alpha
alias = var is 'value'
# It works correctly because alias is used
required = foo, alias
But the following works correctly:
foo is alpha
alias = foo eq 'value'
# It works correctly because key name 'foo' is used elsewhere.
required = foo
- Override Global Rule
You can override global rules.
;GLOBAL
ID_KEY page
i is number
w is word
;index
i is word
w is number
If you want to delete some of the GLOBAL rules in the 'index' group:
;index
w is n/a
w match ^[e-z]+$
If you want to delete all GLOBAL rules in the 'index' group:
;index
GLOBAL is n/a
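The inheritance and override behaviour described above can be modelled with two hashes keyed by "key operator". This is a hedged sketch of the semantics as this document describes them, not the module's parser: effective_rules and the rule tables are hypothetical, and the sketch assumes that "key is n/a" drops every inherited rule for that key.

```perl
use strict;
use warnings;

# Rules are stored as { "key operator" => condition }.
sub effective_rules {
    my ($global, $group) = @_;
    my %rules = %$global;    # start from everything in GLOBAL
    my %own   = %$group;

    # First handle the "... is n/a" sentences, which remove inherited rules.
    for my $name (grep { / is$/ && $own{$_} eq 'n/a' } keys %own) {
        my ($key) = split ' ', $name;
        if ($key eq 'GLOBAL') {
            %rules = ();     # "GLOBAL is n/a": drop all inherited rules
        } else {
            # Assumption: "key is n/a" drops all inherited rules for that key.
            delete @rules{ grep { /^\Q$key\E / } keys %rules };
        }
        delete $own{$name};
    }

    # Then add the group's own rules; same key and operator means the group wins.
    @rules{ keys %own } = values %own;
    return \%rules;
}

my %global = ('i is' => 'number', 'w is' => 'word');
my %index  = ('w is' => 'n/a', 'w match' => '^[e-z]+$');

my $rules = effective_rules(\%global, \%index);
print "$_ $rules->{$_}\n" for sort keys %$rules;
```

With these tables the 'index' group keeps the inherited rule for i, loses the inherited rule for w, and gains its own match rule for w.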
FILTERS
Data::RuledValidator has a filtering feature. There are two ways to filter values.
- filter Key, ... with FilterName, ...
filter tel_number with no_dash
tel_number is num
tel_number length 9
The position of this declaration does not matter, so the following means the same as the above.
tel_number is num
tel_number length 9
filter tel_number with no_dash
Filters are also inherited from GLOBAL. If you want to ignore a GLOBAL filter, do as follows:
filter tel_number with n/a
If you want to ignore GLOBAL filters on all keys, do as follows (not yet implemented):
filter * with n/a
- Keys Operator Condition with FilterName, ...
This is a temporary filter.
tel1 = tel_number is num with no_dash
tel2 = tel_number is num
For tel1, tel_number is filtered, but for tel2, tel_number is not filtered.
But in the following case, tel2 is filtered, too.
filter tel_number with no_dash
tel1 = tel_number is num with no_dash
tel2 = tel_number is num
If you want to ignore "filter tel_number with no_dash", use no_filter as the temporary filter.
filter tel_number with no_dash
tel1 = tel_number is num with no_filter
tel2 = tel_number is num
If a temporary filter is defined, it takes priority over "filter ... with ...".
See also Data::RuledValidator::Filter
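The precedence between a declared filter, a temporary filter and no_filter can be sketched as follows. The filter name no_dash comes from the examples above, but its body here (removing every '-') is an assumption, and filters_for and apply_filters are hypothetical helpers rather than the module's API.

```perl
use strict;
use warnings;

# Assumed behaviour of the no_dash filter: remove every '-'.
my %filter = (
    no_dash => sub { my $v = shift; $v =~ tr/-//d; return $v },
);

# Pick the filters for one sentence: a temporary filter ("... with X")
# takes priority over a declared one ("filter key with X").
sub filters_for {
    my ($declared, $temporary) = @_;
    my $names = ($temporary && @$temporary) ? $temporary : ($declared || []);
    return grep { $_ ne 'no_filter' } @$names;
}

# Apply the named filters to a value, in order.
sub apply_filters {
    my ($value, @names) = @_;
    $value = $filter{$_}->($value) for @names;
    return $value;
}

my $tel = '03-1234-5678';
# filter tel_number with no_dash / tel1 = tel_number is num with no_filter
print apply_filters($tel, filters_for(['no_dash'], ['no_filter'])), "\n";
# filter tel_number with no_dash / tel2 = tel_number is num
print apply_filters($tel, filters_for(['no_dash'], undef)), "\n";
```

The first call leaves the dashes in place because no_filter overrides the declared filter; the second call strips them.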
OPERATORS
- is
key is mail
key is word
key is num
'is' is a somewhat special operator. It can be used to disable GLOBAL rules entirely, or only for some keys.
;;GLOBAL
i is num
k is value
;;index
v is word
With these rules, 'index' inherits GLOBAL. If you do not want to use GLOBAL:
;;index
GLOBAL is n/a
v is word
If you do not want to use the key 'k' in index:
;;index
k is n/a
v is word
This inherits 'i', but doesn't inherit 'k'.
- isnt
It is the opposite of 'is'. However, 'n/a' is of no use as its condition.
- of
This is somewhat different from the other operators. The left word is not a key but a number or 'all', and the sentence needs an alias.
all = all of x,y,z
This requires all of the keys x, y and z. The values of these keys do not need to be valid. If the keys exist, it is OK.
If you need only 2 of these keys, you can write:
2ofxyz = 2 of x,y,z
This requires 2 of the keys x, y or z.
If you want valid values, use 'of-valid' instead of 'of'.
- of-valid
This is like 'of'.
all = all of-valid x,y,z
This requires all of the keys x, y and z, and the values of these keys must be valid.
If you need only 2 of these keys, you can write:
2ofvalidxyz = 2 of-valid x,y,z
This requires 2 of the keys x, y or z to have valid values.
If the keys only need to exist, use 'of' instead of 'of-valid'.
- in
If the value is one of the given words, it is OK.
key in Perl, Python, Ruby, PHP ...
This is an "or" condition. If the value is equal to one of them, it is OK.
- match
This is a regular expression match.
key match ^[a-z]{2}\d{5}$
If you want multiple regular expressions:
key match ^[a-z]{2}\d{5}$, ^\d{5}[a-z]{1}\d{5}$, ...
This is an "or" condition. If the value matches one of them, it is OK.
- re
It is the same as 'match'.
- has
key has 3
This means the key has 3 values.
If you want fewer than or greater than the number, you can write:
key has < 4
key has > 4
- eq (= equal)
key eq STRING
If the key's value is the same as STRING, it is valid.
You can use special strings like the following.
key eq [key_name]
key eq {data_key_name}
[key_name] is the result of $obj->$method(key_name). For the case where the user has to input a password twice, you can write the following rule.
password eq [password2]
This rule means, for example:
$cgi->param('password') eq $cgi->param('password2');
{data_key_name} is the result of $data->{data_key_name}. This is for the case where you have to check against data from a database.
my $db_data = ....;
if($cgi->param('key') ne $db_data){
  # wrong!
}
In such a case, you can write as follows.
Rule:
key eq {db_data}
Code:
my $db_data = ...;
$v->by_rule({db_data => $db_data});
- ne (= not_equal)
key ne STRING
If the key's value is NOT the same as STRING, it is valid. You can use the same special strings as explained for "eq" above.
- length #,#
words length 0, 10
If the length of words is from 0 to 10, it is valid. The first number is the minimum length, and the second number is the maximum length.
You can write only one value.
words length 5
This means the length of words is less than 6.
Note that: use this instead of '>= ~ #', '<= ~ #' and 'between ~ #, #'.
- >, >=
key > 4
If the key's value is greater than the number 4, it is valid. You can use '>=', too.
If you want to check the length of the value, put '~' before the number as follows.
key > ~ 4
Note that: use length instead of '>= ~ #'.
- <, <=
key < 5
If the key's value is less than the number 5, it is valid. You can use '<=', too.
If you want to check the length of the value, put '~' before the number as follows.
key < ~ 4
Note that: use length instead of '<= ~ #'.
- between #,#
key between 3,5
If the key's value is in the range, it is valid.
If you want to check the length of the value, put '~' before the number as follows.
key between ~ 4,10
Note that: use length instead of 'between ~ #, #'.
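Of the operators in this section, of and of-valid are the easiest to confuse. The difference can be sketched as a count over keys. This is an illustration only: count_ok is a hypothetical helper, %present stands for which keys the object returned values for, %valid stands for the per-key results of other sentences, and the sketch assumes "2 of" means "at least 2", which this document does not state explicitly.

```perl
use strict;
use warnings;

# $need is a number or 'all'; $keys is an array ref of key names;
# $ok is a hash ref saying, per key, whether it counts.
sub count_ok {
    my ($need, $keys, $ok) = @_;
    my $got = grep { $ok->{$_} } @$keys;    # how many keys count
    $need = @$keys if $need eq 'all';       # 'all' means every listed key
    return $got >= $need ? 1 : 0;           # assumption: "at least $need"
}

my @keys    = qw(x y z);
my %present = (x => 1, y => 1, z => 0);    # x and y were given, z was not
my %valid   = (x => 1, y => 0, z => 0);    # only x passed its own checks

print count_ok('all', \@keys, \%present), "\n";    # all = all of x,y,z
print count_ok(2,     \@keys, \%present), "\n";    # 2ofxyz = 2 of x,y,z
print count_ok(2,     \@keys, \%valid),   "\n";    # 2ofvalidxyz = 2 of-valid x,y,z
```

The same counting routine serves both operators; only the table it is given differs (presence for of, validity for of-valid).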
HOW TO ADD OPERATOR
This module has 2 kinds of operator.
- normal operator
This is used in a sentence.
Key Operator Condition
    ~~~~~~~~
For example: is, are, match ...
"v is word" returns a structure like the following:
{
  v_is => 1,
  v_valid => 1,
}
- condition operator
This is used in a sentence only when the Operator is 'is/are/isnt/arent'.
Key Operator     Condition
    (is/are)     ~~~~~~~~~
    (isnt/arent)
This is the operator which is used for checking the Value(s). The Operator should be 'is' or 'are' (these are the same), or 'isnt' or 'arent' (these are the same).
For example: num, alpha, alphanum, word ...
You can add these operators with 2 class methods.
- add_operator
Data::RuledValidator->add_operator(name => $code);
$code should be a code ref which returns a closure. For example:
Data::RuledValidator->add_operator(
  'is' =>
  sub {
    my($key, $c) = @_;
    my $sub = Data::RuledValidator->_cond_op($c) || '';
    unless($sub){
      if($c eq 'n/a'){
        return $c;
      }else{
        Carp::croak("$c is not defined. you can use; " . join ", ", Data::RuledValidator->_cond_op);
      }
    }
    return sub {my($self, $v) = @_; $v = shift @$v; return($sub->($self, $v) + 0)};
  },
);
$key and $c are the Key and the Condition. They are passed to $code, which may use them as it likes. In the example, $c is used to look up a code ref (Data::RuledValidator->_cond_op($c)).
return sub {my($self, $v) = @_; $v = shift @$v; return($sub->($self, $v) + 0)};
This is the code which returns the closure. The closure is given 5 values.
$self, $values, $alias, $obj, $method
$self = Data::RuledValidator object
$values = Value(s). array ref
$alias = alias of Key
$obj = object given in new
$method = method given in new
In the example, only the first 2 values are used.
- add_condition
Data::RuledValidator->add_condition(name => $code);
$code should be a code ref. For example:
__PACKAGE__->add_condition
(
  'mail' => sub{my($self, $v) = @_; return Email::Valid->address($v) ? 1 : 0},
);
PLUGIN
Data::RuledValidator is made with plugins (since version 0.02).
How to create plugins
It's very easy. The name of a plugin module must start with 'Data::RuledValidator::Plugin::'.
For example:
package Data::RuledValidator::Plugin::Email;
use Email::Valid;
use Email::Valid::Loose;
Data::RuledValidator->add_condition
(
'mail' =>
sub{
my($self, $v) = @_;
return Email::Valid->address($v) ? 1 : ()
},
'mail_loose' =>
sub{
my($self, $v) = @_;
return Email::Valid::Loose->address($v) ? 1 : ()
},
);
1;
That's all. If you want to add a normal operator, use the add_operator class method.
OVERLOADING
$valid = $validator_object; # same as $validator_object->valid;
%valid = @$validator_object; # same as %{$validator_object->result};
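This kind of behaviour is what Perl's core overload pragma provides. The following is a minimal sketch of the technique with a hypothetical MiniValidator class; it is not the module's source, and the module may overload different operators to get the same effect.

```perl
use strict;
use warnings;

{
    package MiniValidator;    # hypothetical stand-in class
    use overload
        'bool'   => sub { $_[0]->valid },              # object in boolean context
        '@{}'    => sub { [ %{ $_[0]->result } ] },    # @$v as a flattened hash
        fallback => 1;

    sub new    { my ($class, %r) = @_; return bless { result => {%r} }, $class }
    sub result { return $_[0]{result} }
    sub valid  {
        my $r   = $_[0]{result};
        my $bad = grep { !$_ } values %$r;    # number of failed checks
        return $bad ? 0 : 1;
    }
}

my $v      = MiniValidator->new(i_is => 1, z_match => 1);
my %result = @$v;                             # goes through the '@{}' overload
my $status = $v ? 'valid' : 'invalid';        # goes through the 'bool' overload
print "$status\n";
print join(', ', sort keys %result), "\n";
```

The object itself is a hash ref, so overloading array dereference does not interfere with the methods' own access to $_[0]{result}.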
INTERNAL CLASS DATA
This is just a memo.
- %RULE
All rules for all objects (each of which may have a different rule file).
Structure:
rule_name =>
{
  _regex_group => [],
  # A regexp can be used for a group name. This exists so that there is
  # no need to find out whether each rule key is a regexp or not.
  id_key => [],
  # The rule has a key which identifies the group name.
  # this hash is {RULE_NAME => key_name}
  # Why an array ref?
  # For uniqueness, we can set several keys for id_key (like SQL unique).
  coded_rule => [], # a set of closures
  time => $time # (stat 'rule_file')[9]
}
- %COND_OP
The keys are condition operator names. The values are code refs (condition operators).
- %MK_CLOSURE
{ operator => sub{coderef which create closure} }
- %REQUIRED
{ required_key => undef, required_key2 => undef }
NOTE
Currently, once a rule is parsed, it is converted to code (a set of closures) and stored as class data.
If you use this for CGI, performance is not good. If you use this on mod_perl, it is a good idea.
I have some possible solutions:
store the code in a Storable file.
store the code in shared memory.
TODO
- can take 2 keys for id_key
- More tests
I have to write more tests.
- More documents
I have to write more documents.
- multiple rule files
AUTHOR
Ktat, <ktat@cpan.org>
COPYRIGHT
Copyright 2006-2007 by Ktat
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
See http://www.perl.com/perl/misc/Artistic.html