NAME

Data::FormValidator - Validates user input (usually from an HTML form) based on an input profile.

SYNOPSIS

In an HTML::Embperl page:

    use Data::FormValidator;

    my $validator = Data::FormValidator->new( "/home/user/input_profiles.pl" );
    my ( $valid, $missing, $invalid, $unknown ) =
        $validator->validate( \%fdat, "customer_infos" );

DESCRIPTION

Data::FormValidator's main aim is to make the tedious coding of input validation expressible in a simple format and to let the programmer focus on more interesting tasks.

When you are coding a web application, one of the most tedious though crucial tasks is validating the user's input (usually submitted by way of an HTML form). You have to check that each required field is present and that some fields have valid data. (Does the phone input look like a phone number? Is that a plausible email address? Is the two-letter state abbreviation valid? etc.) For a simple form, this is not really a problem, but as forms get more complex and you code more of them, this task becomes really boring and tedious.

Data::FormValidator lets you define profiles which declare the required fields and their format. When you are ready to validate the user's input, you tell Data::FormValidator which profile to apply to the user data. An array is returned listing which fields are valid, missing, invalid and unknown in this profile.

You are then free to use this information to build a nice display telling the user which fields still need to be filled in.

INPUT PROFILE SPECIFICATION

To create a Data::FormValidator object, use the following:

    my $validator = Data::FormValidator->new( $input_profile );

Here $input_profile may be either a hash reference to an input profiles specification or the name of a file that will be evaluated at runtime to get a hash reference to an input profiles specification.

The input profiles specification is a hash reference where each key is the name of an input profile and each value is another hash reference which contains the actual profile elements. If the input profiles specification is given as a file name, the profiles will be reread each time the disk copy is modified.

Here is an example of a valid input profiles specification:

    {
        customer_infos => {
            optional =>
                [ qw( company fax country password password_confirmation ) ],
            required =>
                [ qw( fullname phone email address ) ],
            required_regexp => '/city|state|zipcode/',
            optional_regexp => '/_province$/',
            constraints => {
                email   => "email",
                fax     => "american_phone",
                phone   => "american_phone",
                zipcode => '/^\s*\d{5}(?:[-]\d{4})?\s*$/',
                state   => "state",
            },
            constraint_regexp_map => {
                '/_postcode$/' => 'postcode',
                '/_province$/' => 'province',
            },
            dependency_groups => {
                password_group => [ qw/password password_confirmation/ ],
            },
            defaults => {
                country => "USA",
            },
        },
        customer_billing_infos => {
            optional     => [ "cc_no" ],
            dependencies => {
                "cc_no"    => [ qw( cc_type cc_exp ) ],
                "pay_type" => {
                    check => [ qw( check_no ) ],
                },
            },
            constraints => {
                cc_no => {
                    constraint => "cc_number",
                    params     => [ qw( cc_no cc_type ) ],
                },
                cc_type => "cc_type",
                cc_exp  => "cc_exp",
            },
            filters       => [ "trim" ],
            field_filters => { cc_no => "digit" },
        },
    }

Notice that a number of components take anonymous arrays as their values. In any of these places, you can simply use a string if you only need to specify one value. For example, instead of

    filters => [ 'trim' ]

you can simply say

    filters => 'trim'

The following are the valid fields for an input specification:

required

This is an array reference which contains the names of the fields which are required. Any fields in this list which are not present in the user input will be reported as missing.

optional

This is an array reference which contains the names of optional fields. These are fields which MAY be present, and if they are, they will be checked for valid input. Any fields not in the optional or required list will be reported as unknown.

required_regexp

This is a regular expression used to specify additional fields which are required. For example, if you wanted all field names that begin with user_ to be required, you could use the regular expression /^user_/

optional_regexp

This is a regular expression used to specify additional fields which are optional. For example, if you wanted all field names that begin with user_ to be optional, you could use the regular expression /^user_/

dependencies

This is a hash reference which contains dependency information. This is for the case where one optional field has other requirements. The dependencies can be specified with an array reference. For example, if you enter your credit card number, the fields cc_exp and cc_type should also be present. If the dependencies are specified with a hash reference, then the additional constraint is added that the optional field must equal a key for the dependencies to be added. For example, if the pay_type field is equal to "check", then the check_no field is required. Any field in the dependencies list that is missing when the target is present will be reported as missing.
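The two forms of dependency can be illustrated with a small stand-alone sketch. It is not the module's implementation: the helper name missing_dependencies and the sample data are hypothetical, and the code only mirrors the rule as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names of dependent fields that are
# missing from %$data according to a "dependencies" specification.
sub missing_dependencies {
    my ( $data, $deps ) = @_;
    my @missing;
    for my $field ( sort keys %$deps ) {
        # A dependency only applies when the target field was filled in.
        next unless defined( $data->{$field} ) && length( $data->{$field} );
        my $spec = $deps->{$field};
        # Array reference: always required when the target is present.
        # Hash reference: required only when the target equals a key.
        my $required =
            ref($spec) eq 'HASH' ? $spec->{ $data->{$field} } : $spec;
        next unless $required;
        push @missing,
            grep { !defined( $data->{$_} ) || $data->{$_} !~ /\S/ } @$required;
    }
    return @missing;
}

my %dependencies = (
    cc_no    => [ qw( cc_type cc_exp ) ],
    pay_type => { check => [ qw( check_no ) ] },
);

my %data = ( cc_no => '4111111111111111', cc_type => 'V', pay_type => 'check' );
my @missing = missing_dependencies( \%data, \%dependencies );
print "missing: @missing\n";    # prints "missing: cc_exp check_no"
```

With pay_type set to anything other than "check", the sketch would not ask for check_no.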

dependency_groups

This is a hash reference which contains information about groups of interdependent fields. The keys are arbitrary names that you create and the values are references to arrays of the field names in each group. For example, perhaps you want both the password and password_confirmation field to be required if either one of them is filled in.

defaults

This is a hash reference which contains defaults which should be substituted if the user hasn't filled in the fields. Each key is a field name and each value is the default value, which will be returned in the list of valid fields.

filters

This is a reference to an array of filters that will be applied to ALL optional or required fields. Each filter can be the name of a built-in filter (trim, digit, etc.) or an anonymous subroutine which takes one parameter, the field value, and returns the (possibly) modified value.
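As a minimal sketch of the anonymous subroutine form, the following defines a custom filter and lists it next to a built-in one. The name $squeeze_dashes and the profile fragment are hypothetical and only serve as illustration.

```perl
use strict;
use warnings;

# Hypothetical custom filter: takes the field value and returns the
# (possibly) modified value, as a filters entry expects.
my $squeeze_dashes = sub {
    my $value = shift;
    $value =~ s/-{2,}/-/g;    # collapse runs of dashes into one
    return $value;
};

# Profile fragment mixing a built-in filter name with the custom one.
my %profile = (
    required => [ qw( fullname phone ) ],
    filters  => [ 'trim', $squeeze_dashes ],
);

# The subroutine can also be exercised on its own.
my $filtered = $squeeze_dashes->('555--1234');
print "$filtered\n";    # prints "555-1234"
```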

field_filters

This is a reference to a hash which contains references to arrays of filters which will be applied to specific input fields. The key of the hash is the name of the input field and the value is a reference to an array of filters, the same way the filters parameter works.

constraints

This is a reference to a hash which contains the constraints that will be used to check whether or not a field contains valid data. A constraint can be the name of a builtin constraint function (see below), a Perl regexp, or an anonymous subroutine which will check the input and return true or false depending on the input's validity.

The constraint function takes one parameter, the input to be validated, and returns true or false. It is possible to specify the parameters that will be passed to the subroutine. For that, use a hash reference which contains, in the constraint element, the anonymous subroutine or the name of the builtin and, in the params element, the names of the fields to pass as parameters to the function. (Don't forget to include the name of the field to check in that list!) For an example, look at the cc_no constraint in the profile above.
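The hash reference form can be sketched with an anonymous subroutine as well. The example below is an illustration only: the name $same_value is hypothetical, and it assumes that the values of the fields named in params are passed to the subroutine in the order they are listed.

```perl
use strict;
use warnings;

# Hypothetical constraint: true when both values are identical.
# Assumption: the values of the fields named in params arrive in @_
# in the order they are listed.
my $same_value = sub {
    my ( $password, $confirmation ) = @_;
    return 0 unless defined($password) && defined($confirmation);
    return $password eq $confirmation ? 1 : 0;
};

# Profile fragment using the hash reference form of a constraint.
my %profile = (
    required    => [ qw( password password_confirmation ) ],
    constraints => {
        password => {
            constraint => $same_value,
            params     => [ qw( password password_confirmation ) ],
        },
    },
);

# Calling the subroutine directly shows what it would report.
my $ok = $same_value->( 'secret', 'secret' );
print $ok ? "valid\n" : "invalid\n";    # prints "valid"
```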

constraint_regexp_map

This is a hash reference where the keys are the regular expressions to use and the values are the constraints to apply. It is used to apply constraints to all fields whose names match a regular expression. For example, you could check that all fields that end in "_postcode" are valid Canadian postal codes by using the key '/_postcode$/' and the value "postcode".
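To make the matching concrete, here is a stand-alone sketch of how such a map could select constraints for a list of field names. It is not the module's code: the helper constraints_for and the field names are hypothetical, and the handling of the '/.../' notation is an assumption.

```perl
use strict;
use warnings;

my %constraint_regexp_map = (
    '/_postcode$/' => 'postcode',
    '/_province$/' => 'province',
);

# Hypothetical helper: returns a hash reference mapping each field
# whose name matches a key of the map to the corresponding constraint.
sub constraints_for {
    my ( $map, @fields ) = @_;
    my %selected;
    for my $key ( sort keys %$map ) {
        # Assumption: strip the slashes of the '/.../' notation.
        ( my $pattern = $key ) =~ s{^/|/$}{}g;
        my $re = qr/$pattern/;
        for my $field ( grep { $_ =~ $re } @fields ) {
            $selected{$field} = $map->{$key};
        }
    }
    return \%selected;
}

my $selected = constraints_for( \%constraint_regexp_map,
    qw( billing_postcode home_province fullname ) );
print "$_ => $selected->{$_}\n" for sort keys %$selected;
```

In this sketch fullname matches no key, so no constraint is selected for it.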

VALIDATING INPUT

    my( $valids, $missings, $invalids, $unknowns ) =
	$validator->validate( \%fdat, "customer_infos" );

To validate input you use the validate() method. This method takes two parameters:

data

A reference to a hash which should correspond to the form input as submitted by the user. This hash is not modified by the call to validate.

profile

Can be either a name which will be used to look up the corresponding profile in the input profiles specification, or a hash reference to the input profile which should be used.

This method returns a four-element array.

valids

This is a hash reference to the valid fields which were submitted in the data. The data may have been modified by the various filters specified.

missings

This is a reference to an array which contains the names of the missing fields. Those are the fields that the user forgot to fill in or filled with spaces only. These fields may come from the required list or the dependencies list.

invalids

This is a reference to an array which contains the names of the fields which failed their constraint check.

unknowns

This is a list of fields which are unknown to the profile. Whether or not this indicates an error in the user input is application dependent.
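One way to turn these results into feedback for the user is sketched below. The helper error_messages is hypothetical, and the sample values merely stand in for what validate() might return; they are not produced by the module here.

```perl
use strict;
use warnings;

# Hypothetical helper: turns the missings and invalids array references
# into messages for the user.
sub error_messages {
    my ( $missings, $invalids ) = @_;
    my @messages;
    push @messages, "Please fill in the '$_' field."      for @$missings;
    push @messages, "The '$_' field does not look valid." for @$invalids;
    return @messages;
}

# Sample values standing in for the first three elements returned
# by validate().
my $valids   = { fullname => 'Jane Doe', country => 'USA' };
my $missings = [ 'phone' ];
my $invalids = [ 'email' ];

my @messages = error_messages( $missings, $invalids );
if (@messages) {
    print "$_\n" for @messages;
}
else {
    print "Thank you, $valids->{fullname}.\n";
}
```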

INPUT FILTERS

These are the builtin filters which may be specified by name in the filters and field_filters parameters of the input profile. You may also call these functions directly through the procedural interface by either importing them individually or importing the whole :filters group. For example, if you want to access the trim function directly, you could do either:

    use Data::FormValidator (qw/filter_trim/);

or

    use Data::FormValidator (qw/:filters/);

    $string = filter_trim($string);

Notice that when you call filters directly, you'll need to prefix the filter name with "filter_".

trim

Removes white space at the start and end of the field.

strip

Replaces runs of white space by a single space.

digit

Removes non-digit characters from the input.

alphanum

Removes non-alphanumeric characters from the input.

integer

Extract from its input a valid integer number.

pos_integer

Extract from its input a valid positive integer number.

neg_integer

Extract from its input a valid negative integer number.

decimal

Extract from its input a valid decimal number.

pos_decimal

Extract from its input a valid positive decimal number.

neg_decimal

Extract from its input a valid negative decimal number.

dollars

Extract from its input a valid number expressing a dollar-like currency amount.

phone

Filters out characters which aren't valid for a phone number. (Only accepts digits [0-9], space, comma, minus, parentheses, period and pound [#].)

sql_wildcard

Transforms the shell glob wildcard (*) to the SQL-like wildcard (%).

quotemeta

Calls the quotemeta (quote non-alphanumeric characters) builtin on its input.

lc

Calls the lc (convert to lowercase) builtin on its input.

uc

Calls the uc (convert to uppercase) builtin on its input.

ucfirst

Calls the ucfirst (Uppercase first letter) builtin on its input.

BUILTIN VALIDATORS

These are the builtin constraints that can be specified by name in the input profiles. You may also call these functions directly through the procedural interface by either importing them individually or importing the whole :validators group. For example, if you want to access the email validator directly, you could do either:

    use Data::FormValidator (qw/valid_email/);

or

    use Data::FormValidator (qw/:validators/);

    if (valid_email($email)) {
        # do something with the email address
    }

Notice that when you call validators directly, you'll need to prefix the validator name with "valid_".

email

Checks if the email LOOKS LIKE an email address. This checks if the input contains one @ and a two-level domain name. The address portion is checked quite liberally. For example, all of these probably invalid addresses would pass the test:

    nobody@top.domain
    %?&/$()@nowhere.net
    guessme@guess.m

state_or_province

This one checks if the input corresponds to an American state or a Canadian province.

state

This one checks if the input is a valid two-letter abbreviation of an American state.

province

This checks if the input is a two-letter Canadian province abbreviation.

zip_or_postcode

This constraint checks if the input is an American zipcode or a Canadian postal code.

postcode

This constraint checks if the input is a valid Canadian postal code.

zip

This input validator checks if the input is a valid American zipcode: 5 digits followed by an optional mailbox number.

phone

This one checks if the input looks like a phone number (if it contains at least 6 digits).

american_phone

This constraint checks if the number is a possible North American style phone number: (XXX) XXX-XXXX. It has to contain more than 7 digits.

cc_number

This takes two parameters, the credit card number and the credit card type. You should use the hash reference form of the constraint with this validator.

The number is checked only for plausibility: it checks if the number could be valid for a type of card by verifying the checksum and looking at the number of digits.

This function is only good at weeding out typos and such. IT DOESN'T CHECK IF THERE IS AN ACCOUNT ASSOCIATED WITH THE NUMBER.
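The text does not name the checksum. Assuming it is the standard Luhn (mod 10) check used for payment card numbers, the idea can be sketched as follows. The function luhn_ok is hypothetical and is not taken from the module; it also does not look at the card type or the number of digits.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the digits of $number pass the Luhn
# (mod 10) check. Assumption: this is the checksum meant above.
sub luhn_ok {
    my ($number) = @_;
    ( my $digits = $number ) =~ s/\D//g;    # ignore spaces and dashes
    return 0 unless length $digits;

    my $sum    = 0;
    my $double = 0;    # every second digit from the right is doubled
    for my $digit ( reverse split //, $digits ) {
        my $d = $digit;
        if ($double) {
            $d *= 2;
            $d -= 9 if $d > 9;
        }
        $sum += $d;
        $double = !$double;
    }
    return $sum % 10 == 0 ? 1 : 0;
}

my $good = luhn_ok('4111 1111 1111 1111');
my $bad  = luhn_ok('4111 1111 1111 1112');
print $good ? "plausible\n" : "typo?\n";    # prints "plausible"
print $bad  ? "plausible\n" : "typo?\n";    # prints "typo?"
```

Changing a single digit makes the check fail, which is why such a test is good at catching typos but says nothing about whether an account exists.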

cc_exp

This one checks if the input is in the format MM/YY or MM/YYYY and if the MM part is a valid month (1-12) and if that date is not in the past.

cc_type

This one checks if the input field starts with M(asterCard), V(isa), A(merican express) or D(iscovery).

CREDITS

Some of these input validation functions have been taken from MiniVend by Michael J. Heins <mike@heins.net>.

The credit card checksum validation was taken from a contribution by Bruce Albrecht <bruce.albrecht@seag.fingerhut.com> to the MiniVend program.

Mark Stosberg contributed a number of enhancements including required_regexp, optional_regexp and constraint_regexp_map.

AUTHOR

Copyright (c) 1999 Francis J. Lacoste and iNsu Innovations Inc. All rights reserved.

Parts Copyright 1996-1999 by Michael J. Heins <mike@heins.net>

Parts Copyright 1996-1999 by Bruce Albrecht <bruce.albrecht@seag.fingerhut.com>

Parts Copyright 2001 by Mark Stosberg <mark@summersault.com>

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.