NAME
Data::FormValidator - Validates user input (usually from an HTML form) based on an input profile.
SYNOPSIS
In an HTML::Embperl page:
use Data::FormValidator;
my $validator = new Data::FormValidator( "/home/user/input_profiles.pl" );
my ( $valid, $missing, $invalid, $unknown ) = $validator->validate( \%fdat, "customer_infos" );
DESCRIPTION
Data::FormValidator's main aim is to make the tedious coding of input validation expressible in a simple format and to let the programmer focus on more interesting tasks.
When you are coding a web application, one of the most tedious though crucial tasks is validating the user's input (usually submitted by way of an HTML form). You have to check that each required field is present and that some fields contain valid data. (Does the phone input look like a phone number? Is that a plausible email address? Is the YY state valid? etc.) For a simple form this is not really a problem, but as forms get more complex and you code more of them, this task becomes really boring and tedious.
Data::FormValidator lets you define profiles which declare the required fields and their format. When you are ready to validate the user's input, you tell Data::FormValidator which profile to apply to the user data, and it returns an array listing which fields are valid, missing, invalid and unknown in this profile.
You are then free to use this information to build a nice display telling the user which fields he or she forgot to fill in.
INPUT PROFILE SPECIFICATION
To create a Data::FormValidator object, use the following:
my $validator = new Data::FormValidator( $input_profile );
Here $input_profile may be either a hash reference to an input profiles specification or the name of a file that will be evaluated at runtime to get a hash reference to an input profiles specification.
The input profiles specification is a hash reference where each key is the name of an input profile and each value is another hash reference which contains the actual profile elements. If the input profiles are specified as a file name, the profiles will be reread each time the disk copy is modified.
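As an illustration of the file form, the sketch below writes a small profiles file and evaluates it with Perl's built-in do, checking that the result is a hash reference. How the module itself loads the file is not shown here, so the use of do, the temporary file and the profile contents are all assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny profiles file. Its last (and only) expression is a
# hash reference keyed by profile name.
my ($fh, $filename) = tempfile( SUFFIX => '.pl', UNLINK => 1 );
print {$fh} <<'END_PROFILES';
{
    customer_infos => {
        required => [ qw( fullname email ) ],
        optional => [ qw( company ) ],
    },
}
END_PROFILES
close $fh or die "Cannot close $filename: $!";

# 'do FILE' evaluates the file and returns the value of its last expression.
my $profiles = do $filename;
die "Cannot load $filename: " . ( $@ || $! )
    unless ref($profiles) eq 'HASH';

print "Profiles found: ", join( ', ', sort keys %$profiles ), "\n";
```

A file laid out this way could then be passed by name to the constructor, as in the SYNOPSIS.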
Here is an example of a valid input profiles specification:
{
    customer_infos => {
        optional =>
            [ qw( company fax country password password_confirmation ) ],
        required =>
            [ qw( fullname phone email address ) ],
        required_regexp => '/city|state|zipcode/',
        optional_regexp => '/_province$/',
        constraints =>
        {
            email   => "email",
            fax     => "american_phone",
            phone   => "american_phone",
            zipcode => '/^\s*\d{5}(?:[-]\d{4})?\s*$/',
            state   => "state",
        },
        constraint_regexp_map => {
            '/_postcode$/' => 'postcode',
            '/_province$/' => 'province',
        },
        dependency_groups => {
            password_group => [ qw/password password_confirmation/ ],
        },
        defaults => {
            country => "USA",
        },
    },
    customer_billing_infos => {
        optional => [ "cc_no" ],
        dependencies => {
            "cc_no" => [ qw( cc_type cc_exp ) ],
        },
        constraints => {
            cc_no => { constraint => "cc_number",
                       params     => [ qw( cc_no cc_type ) ],
                     },
            cc_type => "cc_type",
            cc_exp  => "cc_exp",
        },
        filters       => [ "trim" ],
        field_filters => { cc_no => "digit" },
    },
}
Notice that a number of components take anonymous arrays as their values. In any of these places, you can simply use a string if you only need to specify one value. For example, instead of
filters => [ 'trim' ]
you can simply say
filters => 'trim'
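One way to picture this convention is a small helper that accepts either form and always yields a list. The helper name as_list is hypothetical and is not part of the module's documented interface; it only sketches the idea.

```perl
use strict;
use warnings;

# Hypothetical helper: accept a single value or an array reference
# and return a plain list in both cases.
sub as_list {
    my ($value) = @_;
    return ()      unless defined $value;
    return @$value if ref($value) eq 'ARRAY';
    return ($value);
}

my @from_string = as_list('trim');
my @from_array  = as_list( [ 'trim', 'digit' ] );

print scalar(@from_string), " filter(s): @from_string\n";
print scalar(@from_array),  " filter(s): @from_array\n";
```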
The following are the valid fields for an input profile:
- required
-
This is an array reference which contains the names of the fields which are required. Any field in this list which is not present in the user input will be reported as missing.
- optional
-
This is an array reference which contains the names of the optional fields. These are fields which MAY be present, and if they are, they will be checked for valid input. Any field not in the optional or required list will be reported as unknown.
- required_regexp
-
This is a regular expression used to specify additional fields which are required. For example, if you wanted all field names that begin with user_ to be required, you could use the regular expression /^user_/.
- optional_regexp
-
This is a regular expression used to specify additional fields which are optional. For example, if you wanted all field names that begin with user_ to be optional, you could use the regular expression /^user_/.
- dependencies
-
This is a hash reference which contains dependency information. This is for the case where an optional field has other requirements. For example, if you enter your credit card number, the fields cc_exp and cc_type should also be present. Any field in the dependencies list that is missing when the target is present will be reported as missing.
- dependency_groups
-
This is a hash reference which contains information about groups of interdependent fields. The keys are arbitrary names that you create and the values are references to arrays of the field names in each group. For example, perhaps you want both the password and password_confirmation fields to be required if either one of them is filled in.
- defaults
-
This is a hash reference which contains defaults which should be substituted if the user hasn't filled in the fields. Each key is a field name and each value is the default value which will be returned in the list of valid fields.
- filters
-
This is a reference to an array of filters that will be applied to ALL optional or required fields. Each filter can be the name of a built-in filter (trim, digit, etc.) or an anonymous subroutine which takes one parameter, the field value, and returns the (possibly) modified value.
- field_filters
-
This is a reference to a hash which contains references to arrays of filters which will be applied to specific input fields. Each key of the hash is the name of an input field and each value is a reference to an array of filters, specified the same way as for the filters parameter.
- constraints
-
This is a reference to a hash which contains the constraints that will be used to check whether or not the field contains valid data. A constraint can be either the name of a built-in constraint function (see below), a Perl regexp or an anonymous subroutine which will check the input and return true or false depending on the input's validity.
By default, the constraint function takes one parameter, the input to be validated, and returns true or false. It is possible to specify the parameters that will be passed to the subroutine. To do so, use a hash reference whose constraint element holds the anonymous subroutine or the name of the built-in, and whose params element holds the names of the fields to pass as parameters to the function. (Don't forget to include the name of the field to check in that list!) For an example, look at the cc_no constraint in the customer_billing_infos profile above.
- constraint_regexp_map
-
This is a hash reference where the keys are the regular expressions to use and the values are the constraints to apply. It is used to apply constraints to all fields whose names match a regular expression. For example, you could check that all fields that end in "_postcode" are valid Canadian postal codes by using the key '/_postcode$/' and the value "postcode".
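To tie several of these elements together, here is a sketch of a single profile that uses an anonymous subroutine as a constraint together with the params form. The field names, the $passwords_match subroutine and the username pattern are invented for the example, and the sketch assumes the field values are handed to the subroutine in the order listed in params.

```perl
use strict;
use warnings;

# Hypothetical constraint: receives the values of the fields named in
# 'params' (assumed to arrive in that order) and returns true or false.
my $passwords_match = sub {
    my ($password, $confirmation) = @_;
    return defined $password
        && defined $confirmation
        && $password eq $confirmation;
};

my $profile = {
    required => [ qw( username ) ],
    optional => [ qw( password password_confirmation ) ],
    dependency_groups => {
        password_group => [ qw( password password_confirmation ) ],
    },
    constraints => {
        username => '/^\w{3,16}$/',
        password => {
            constraint => $passwords_match,
            params     => [ qw( password password_confirmation ) ],
        },
    },
    filters => 'trim',
};

# The constraint can be exercised on its own, outside the validator.
print $passwords_match->( 's3cret', 's3cret' ) ? "match\n" : "mismatch\n";
```

Keeping the subroutine in a named variable makes it easy to test separately before wiring it into a profile.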
VALIDATING INPUT
my( $valids, $missings, $invalids, $unknowns ) =
$validator->validate( \%fdat, "customer_infos" );
To validate input, use the validate() method. This method takes two parameters:
- data
-
A reference to a hash which should correspond to the form input as submitted by the user. This hash is not modified by the call to validate.
- profile
-
Can be either a name which will be used to look up the corresponding profile in the input profiles specification, or a hash reference to the input profile which should be used.
This method returns a four-element array.
- valids
-
This is a hash reference to the valid fields which were submitted in the data. The data may have been modified by the various filters specified.
- missings
-
This is a reference to an array which contains the names of the missing fields. Those are the fields that the user forgot to fill in or filled with spaces only. These fields may come from the required list or the dependencies list.
- invalids
-
This is a reference to an array which contains the names of the fields which failed their constraint check.
- unknowns
-
This is a list of fields which are unknown to the profile. Whether or not this indicates an error in the user input is application dependent.
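The results can then be turned into feedback for the user. The sketch below does not call validate() itself. It uses hand-written stand-ins shaped like the return values described above, and problem_messages is a hypothetical helper name, not something the module provides.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the 'missings' and 'invalids' array
# references into human-readable messages.
sub problem_messages {
    my ($missings, $invalids) = @_;
    my @messages;
    push @messages, "Please fill in: $_"            for @$missings;
    push @messages, "Please check the value of: $_" for @$invalids;
    return @messages;
}

# Stand-ins shaped like the values validate() is described as returning.
my $valids   = { fullname => 'Jane Doe' };
my $missings = [ 'phone' ];
my $invalids = [ 'email' ];

if ( @$missings || @$invalids ) {
    print "$_\n" for problem_messages( $missings, $invalids );
}
else {
    print "All fields OK for $valids->{fullname}\n";
}
```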
INPUT FILTERS
These are the builtin filters which may be specified as a name in the filters and field_filters parameters of the input profile. You may also call these functions directly through the procedural interface by either importing them directly or importing the whole :filters group. For example, if you want to access the trim function directly, you could either do:
use Data::FormValidator (qw/filter_trim/);
or
use Data::FormValidator (qw/:filters/);
$string = filter_trim($string);
Notice that when you call filters directly, you'll need to prefix the filter name with "filter_".
- trim
-
Remove white space at the front and end of the fields.
- strip
-
Runs of white space are replaced by a single space.
- digit
-
Remove non-digit characters from the input.
- alphanum
-
Remove non-alphanumeric characters from the input.
- integer
-
Extract from its input a valid integer number.
- pos_integer
-
Extract from its input a valid positive integer number.
- neg_integer
-
Extract from its input a valid negative integer number.
- decimal
-
Extract from its input a valid decimal number.
- pos_decimal
-
Extract from its input a valid positive decimal number.
- neg_decimal
-
Extract from its input a valid negative decimal number.
- dollars
-
Extract from its input a valid number expressing an amount in a dollar-like currency.
- phone
-
Filters out characters which aren't valid for a phone number. (Only digits [0-9], space, comma, minus, parentheses, period and pound [#] are accepted.)
- sql_wildcard
-
Transforms the shell glob wildcard (*) into the SQL-like wildcard (%).
- quotemeta
-
Calls the quotemeta (quote non-alphanumeric characters) builtin on its input.
- lc
-
Calls the lc (convert to lowercase) builtin on its input.
- uc
-
Calls the uc (convert to uppercase) builtin on its input.
- ucfirst
-
Calls the ucfirst (Uppercase first letter) builtin on its input.
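To make the effect of a few of these filters concrete, the following sketch re-creates trim, strip and digit from their descriptions above and chains them with an anonymous subroutine. These are approximations written for illustration, not the module's own implementations, and %sketch_filter and apply_filters are hypothetical names.

```perl
use strict;
use warnings;

# Approximations written from the descriptions above; these are NOT
# the module's own implementations.
my %sketch_filter = (
    trim  => sub { my $v = shift; $v =~ s/^\s+//; $v =~ s/\s+$//; return $v },
    strip => sub { my $v = shift; $v =~ s/\s+/ /g; return $v },
    digit => sub { my $v = shift; $v =~ s/\D//g;   return $v },
);

# Hypothetical helper: run a value through a list of filters, each of
# which is either a name from %sketch_filter or an anonymous subroutine.
sub apply_filters {
    my ($value, @filters) = @_;
    for my $filter (@filters) {
        my $code = ref($filter) eq 'CODE' ? $filter : $sketch_filter{$filter};
        die "Unknown filter '$filter'" unless $code;
        $value = $code->($value);
    }
    return $value;
}

my $name  = apply_filters( "  Jane   Doe \n", 'trim', 'strip' );
my $phone = apply_filters( '(555) 123-4567', 'digit' );
my $upper = apply_filters( ' abc ', 'trim', sub { return uc( $_[0] ) } );

print "[$name] [$phone] [$upper]\n";
```

The order of the filters matters: each one receives the output of the previous one.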
BUILTIN VALIDATORS
These are the built-in constraints that can be specified by name in the input profiles. You may also call these functions directly through the procedural interface by either importing them directly or importing the whole :validators group. For example, if you want to access the email validator directly, you could either do:
use Data::FormValidator (qw/valid_email/);
or
use Data::FormValidator (qw/:validators/);
if (valid_email($email)) {
# do something with the email address
}
Notice that when you call validators directly, you'll need to prefix the validator name with "valid_".
- email
-
Checks if the input LOOKS LIKE an email address. This checks that the input contains one @ and a two-level domain name. The address portion is checked quite liberally. For example, all of these probably invalid addresses would pass the test:
nobody@top.domain
%?&/$()@nowhere.net
guessme@guess.m
- state_or_province
-
This one checks if the input corresponds to an American state or a Canadian province.
- state
-
This one checks if the input is a valid two-letter abbreviation of an American state.
- province
-
This checks if the input is a two-letter Canadian province abbreviation.
- zip_or_postcode
-
This constraint checks if the input is an American zipcode or a Canadian postal code.
- postcode
-
This constraint checks if the input is a valid Canadian postal code.
- zip
-
This input validator checks if the input is a valid American zipcode: 5 digits followed by an optional mailbox number.
- phone
-
This one checks if the input looks like a phone number (that is, if it contains at least 6 digits).
- american_phone
-
This constraint checks if the number is a possible North American style phone number: (XXX) XXX-XXXX. It has to contain more than 7 digits.
- cc_number
-
This takes two parameters, the credit card number and the credit card type. You should use the hash reference form of the constraint to supply them.
The number is checked only for plausibility: the check determines whether the number could be valid for a type of card by verifying the checksum and the number of digits.
This function is only good at weeding out typos and such. IT DOESN'T CHECK IF THERE IS AN ACCOUNT ASSOCIATED WITH THE NUMBER.
- cc_exp
-
This one checks if the input is in the format MM/YY or MM/YYYY and if the MM part is a valid month (1-12) and if that date is not in the past.
- cc_type
-
This one checks if the input field starts with M(asterCard), V(isa), A(merican express) or D(iscovery).
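The checksum mentioned under cc_number is not spelled out in this document. Card numbers are conventionally checked with the Luhn (mod 10) algorithm, so the sketch below assumes that is what is meant. The function name luhn_ok is hypothetical, and the sample number is a widely used test number, not a real account.

```perl
use strict;
use warnings;

# Luhn (mod 10) checksum: starting from the rightmost digit, double
# every second digit, subtract 9 from any result above 9, and require
# the total to be a multiple of 10.
sub luhn_ok {
    my ($number) = @_;
    return 0 unless defined $number;
    ( my $digits = $number ) =~ s/\D//g;    # ignore spaces and dashes
    return 0 unless length $digits;

    my ( $sum, $double ) = ( 0, 0 );
    for my $digit ( reverse split //, $digits ) {
        my $value = $digit;
        if ($double) {
            $value *= 2;
            $value -= 9 if $value > 9;
        }
        $sum += $value;
        $double = !$double;
    }
    return $sum % 10 == 0 ? 1 : 0;
}

for my $candidate ( '4111 1111 1111 1111', '4111 1111 1111 1112' ) {
    printf "%s: %s\n", $candidate,
        luhn_ok($candidate) ? 'plausible' : 'typo suspected';
}
```

As with the built-in constraint, passing such a check says nothing about whether an account exists behind the number.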
CREDITS
Some of these input validation functions have been taken from MiniVend by Michael J. Heins <mike@heins.net>.
The credit card checksum validation was taken from a contribution by Bruce Albrecht <bruce.albrecht@seag.fingerhut.com> to the MiniVend program.
Mark Stosberg contributed a number of enhancements including required_regexp, optional_regexp and constraint_regexp_map.
AUTHOR
Copyright (c) 1999 Francis J. Lacoste and iNsu Innovations Inc. All rights reserved.
Parts Copyright 1996-1999 by Michael J. Heins <mike@heins.net>
Parts Copyright 1996-1999 by Bruce Albrecht <bruce.albrecht@seag.fingerhut.com>
Parts Copyright 2001 by Mark Stosberg <mark@summersault.com>
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.