NAME
Data::Validate::XSD - Validate complex structures by definition
SYNOPSIS
  use Data::Dumper;
  use Data::Validate::XSD;

  my $validator = Data::Validate::XSD->new( \%definition );
  my $errors    = $validator->validate( \%data );
  warn Dumper($errors) if $errors;
DESCRIPTION
Based on XSD and XML validation, this module is an attempt to provide those functions
without requiring either XML or the hideous errors given out by modules like XPath.
The idea behind the error reporting is that the errors mirror the structure
of the original data, replacing each value with an error code and message.
A one-dimensional error reporting scheme is also possible, and I may
work on that next.
INVITATION
If you find an example where the W3C definitions and this module differ, then
please email the author so that a new version with fixes can be released.
If you find there is a certain type that you're always using, then let me know;
I can consider adding it to the default set to make the module more useful.
EXAMPLES
Definitions
A definition is a hash that describes the expected data in the way an XML node describes its children.
An example definition for registering a user on a website:
  $def = {
    root => [
      { name => 'input', type => 'newuser' },
      { name => 'foo',   type => 'string'  },
    ],
    simpleTypes => {
      confirm  => { base => 'id',   match     => '/input/password' },
      rname    => { base => 'name', minLength => 1 },
      password => { base => 'id',   minLength => 6 },
    },
    complexTypes => {
      newuser => [
        { name => 'username',     type => 'token'    },
        { name => 'password',     type => 'password' },
        { name => 'confirm',      type => 'confirm'  },
        { name => 'firstName',    type => 'rname'    },
        { name => 'familyName',   type => 'name',  minOccurs => 0 },
        { name => 'nickName',     type => 'name',  minOccurs => 0 },
        { name => 'emailAddress', type => 'email', minOccurs => 1, maxOccurs => 3 },
        [
          { name => 'aim',    type => 'index'  },
          { name => 'msn',    type => 'email'  },
          { name => 'jabber', type => 'email'  },
          { name => 'irc',    type => 'string' },
        ]
      ],
    },
  };
Data
And this is an example of the data that would validate against it:
  $data = {
    input => {
      username     => 'abcdef',
      password     => '1234567',
      confirm      => '1234567',
      firstName    => 'test',
      familyName   => 'user',
      nickName     => 'foobar',
      emailAddress => [ 'foo@bar.com', 'some@other.or', 'great@nice.con' ],
      msn          => 'foo@msn.com',
    },
    foo => 'extra content',
  };
We are asking for a username, a password typed twice, some real names, a nickname, between 1 and 3 email addresses and at least one instant messaging account. foo is an extra string of information to show that the level is arbitrary. The definition and all of its options are explained below.
Results
The first result you get is a structure and the second is a boolean; the boolean gives the pass or fail status of the whole structure.
The structure that is returned is almost a mirror structure of the input:
  $errors = {
    input => {
      username     => 0,
      password     => 0,
      confirm      => 0,
      firstName    => 0,
      familyName   => 0,
      nickName     => 0,
      emailAddress => 0,
    }
  };
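Because the returned structure mirrors the input, it can be walked recursively. The sketch below uses a hypothetical helper, flatten_errors, which is not part of this module; it assumes that 0 (or undef) marks a passing value and that anything else is an error code. The code value 2 in the sample is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Validate::XSD): walk a nested
# error structure and return one [ path, code ] pair per failing leaf.
sub flatten_errors {
    my ($node, $path) = @_;
    $path = '' unless defined $path;
    my @found;
    if (ref($node) eq 'HASH') {
        for my $key (sort keys %$node) {
            push @found, flatten_errors($node->{$key}, "$path/$key");
        }
    }
    elsif (ref($node) eq 'ARRAY') {
        for my $i (0 .. $#$node) {
            push @found, flatten_errors($node->[$i], $path . "[$i]");
        }
    }
    elsif ($node) {
        # Assumption: a false value means "no error".
        push @found, [ $path, $node ];
    }
    return @found;
}

# Sample structure; the error code 2 is invented for this example.
my $errors = { input => { username => 0, password => 2, confirm => 0 } };
printf "%s failed with code %s\n", @$_ for flatten_errors($errors);
```

This is one possible shape for a flat report; a real implementation would need to follow whatever the module actually stores at each leaf.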
DETAILED DEFINITION
Definition Root
root - The very first level of all structures; it should contain the first
level complex type (see below). The data is a hash by default, since
every XML document has at least one level of tag names.
import - A list of file names, local to Perl, that should be loaded to include
further, shared simple and complex types. Supported formats are
"perl code", XML and YML.
simpleTypes - A hash reference containing each simple definition, which tests a
scalar type (see below for the format of each definition).
complexTypes - A hash reference containing each complex definition, which tests a
structure (see below for the definition).
Simple Types
A simple type is a definition which validates data directly; it will never validate
arrays, hashes or any future wacky structural types. In Perl parlance it will only validate
SCALAR types. These options should match the W3C simple types definition:
base - The name of another simple type to first test the value against.
fixed - The value should match this exactly.
pattern - Should be a regular expression reference which matches the value, e.g. qr/\w/
minLength - The minimum length of a string value.
maxLength - The maximum length of a string value.
match - An XPath link to another data node it should match.
notMatch - An XPath link to another data node it should NOT match.
enumeration - An array reference of possible values of which value should be one.
custom - Should contain a CODE reference which will be called upon to validate the value.
minInclusive - The minimum value of a number, inclusive, i.e. greater than or equal to (>=).
maxInclusive - The maximum value of a number, inclusive, i.e. less than or equal to (<=).
minExclusive - The minimum value of a number, exclusive, i.e. greater than (>).
maxExclusive - The maximum value of a number, exclusive, i.e. less than (<).
fractionDigits - The maximum number of digits in the fractional part of a number.
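As an illustration of how these options combine, here is a small set of simple type definitions. The type names (username, colour, age, postcode, even) are invented for this example, and the calling convention of the custom callback is an assumption: the sketch supposes it receives the value as its first argument and returns true when the value is acceptable.

```perl
use strict;
use warnings;

# Illustrative type names; none of these ship with the module.
my %simple_types = (
    username => { base => 'token',   minLength => 3, maxLength => 16 },
    colour   => { base => 'string',  enumeration => [ 'red', 'green', 'blue' ] },
    age      => { base => 'integer', minInclusive => 0, maxExclusive => 150 },
    postcode => { base => 'string',  pattern => qr/^[A-Z0-9 ]+$/ },
    # Assumption: the callback gets the value as its first argument.
    even     => { base => 'integer', custom => sub { $_[0] % 2 == 0 } },
);

# The hash is passed by reference under the simpleTypes key.
my $definition = {
    root        => [ { name => 'age', type => 'age' } ],
    simpleTypes => \%simple_types,
};
```

Each entry names a base type first, so the value is checked against the base before the extra restrictions apply.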
Complex Types
A complex type is a definition which validates a hash reference. The very first structure,
'root', is a complex definition and follows the same syntax as all complex types. Each complex
type is a list of entries which should all occur in the hash. When a list entry is a hash, it
equates to one named entry in the hash data and has the following options:
name - Required name of the entry in the hash data.
minOccurs - The minimum number of times the named entry should occur in the data.
maxOccurs - The maximum number of times the named entry should occur in the data.
type - The type definition which validates the contents of the data.
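The occurrence limits can be pictured with a small counting routine. This is a sketch only: occurs_ok is a hypothetical helper, not a method of this module, and it assumes that both limits default to 1 (as they do in XML Schema) and that repeated entries are supplied as an array reference.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often a named entry occurs in the data
# and compare the count with minOccurs and maxOccurs.
# Assumption: both limits default to 1 when they are not given.
sub occurs_ok {
    my ($entry, $data) = @_;
    my $min   = defined $entry->{minOccurs} ? $entry->{minOccurs} : 1;
    my $max   = defined $entry->{maxOccurs} ? $entry->{maxOccurs} : 1;
    my $value = $data->{ $entry->{name} };
    my $count = !defined($value)        ? 0
              : ref($value) eq 'ARRAY'  ? scalar @$value
              :                           1;
    return ($count >= $min && $count <= $max) ? 1 : 0;
}

my $email = { name => 'emailAddress', type => 'email', minOccurs => 1, maxOccurs => 3 };
my $data  = { emailAddress => [ 'foo@bar.com', 'some@other.or' ] };
print occurs_ok($email, $data) ? "within limits\n" : "out of range\n";
```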
Where the list entry is an array, it toggles the combine mode and allows further list entries
within it; this allows parts of the structure to be optional only if other parts of the
structure exist.
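The toggle can be sketched as a recursive presence check. This is not the module's code: list_ok is a hypothetical helper that looks only at whether each named key exists, and it assumes two modes, 'and' (every entry required) and 'or' (at least one entry required), with each nested array flipping from one to the other.

```perl
use strict;
use warnings;

# Hypothetical sketch of the combine-mode toggle, checking presence only:
# an 'and' list needs every entry, an 'or' list needs at least one.
sub list_ok {
    my ($list, $data, $mode) = @_;
    $mode ||= 'and';
    my $passed = 0;
    for my $entry (@$list) {
        my $ok = ref($entry) eq 'ARRAY'
            ? list_ok($entry, $data, $mode eq 'and' ? 'or' : 'and')  # nested list flips the mode
            : exists $data->{ $entry->{name} };
        $passed++ if $ok;
    }
    return $mode eq 'and' ? ($passed == @$list) : ($passed > 0);
}

my $contact = [
    { name => 'emailAddress' },
    [ { name => 'aim' }, { name => 'msn' }, { name => 'jabber' }, { name => 'irc' } ],
];

# An email address plus one messaging account satisfies the list.
printf "%s\n", list_ok($contact, { emailAddress => 'foo@bar.com', msn => 'foo@msn.com' }) ? 'ok' : 'failed';  # prints "ok"
# Without any messaging account the nested 'or' list fails.
printf "%s\n", list_ok($contact, { emailAddress => 'foo@bar.com' }) ? 'ok' : 'failed';                          # prints "failed"
```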
INBUILT TYPES
By default these types are available to all definitions as base types.
string - /^.*$/
integer - /^[\-]{0,1}\d+$/
index - /^\d+$/
double - /^[0-9\-\.]*$/
token - /^\w+$/
boolean - /^1|0|true|false$/
email - /^.+@.+\..+$/
date - /^\d\d\d\d-\d\d-\d\d$/ + datetime
'time' - /^\d\d:\d\d$/ + datetime
datetime - /^(\d\d\d\d-\d\d-\d\d)?[T ]?(\d\d:\d\d)?$/ + valid_date method
percentage - minInclusive == 0 + maxInclusive == 100 + double
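To see what a few of these patterns accept, they can be tried directly. The patterns below are copied from the list above (with the @ in the email pattern escaped so Perl does not try to interpolate it); the sample values are made up. Note that, as written, the boolean pattern anchors only its first and last alternatives, so a value such as '10' also matches; a grouped form such as /^(?:1|0|true|false)$/ would be stricter.

```perl
use strict;
use warnings;

# Patterns copied from the list of inbuilt types.
my %pattern = (
    integer => qr/^[\-]{0,1}\d+$/,
    index   => qr/^\d+$/,
    token   => qr/^\w+$/,
    boolean => qr/^1|0|true|false$/,
    email   => qr/^.+\@.+\..+$/,
    date    => qr/^\d\d\d\d-\d\d-\d\d$/,
);

# Made-up sample values, one [ type, value ] pair each.
my @cases = (
    [ integer => '-42' ],
    [ index   => '-42' ],
    [ boolean => 'true' ],
    [ boolean => '10' ],
    [ email   => 'foo@bar.com' ],
    [ date    => '2008-01-31' ],
);

for my $case (@cases) {
    my ($type, $value) = @$case;
    printf "%-8s %-12s %s\n", $type, $value,
        $value =~ $pattern{$type} ? 'matches' : 'no match';
}
```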
METHODS
$class->new( $definition )
Create a new validation object. debug will cause
all error codes to be replaced by error strings.
$class->newFromFile( $path, $filename, $debug )
Create a new definition from a dumped perl file.
$validator->validate( $data )
Validate a set of data against this validator.
Returns an $errors structure or 0 if there were no errors.
$validator->validateFile( $filename )
Validate a file against this validator.
$validator->setStrict( $bool )
Set whether missing data should be considered an error.
$validator->setDefinition( $definition )
Set the validator's definition and load it (also used internally).
$validator->getErrorString( $error_code )
Return a human readable string for each error code.
INTERNAL METHODS
Only read on if you are interested in knowing some extra details about
the internals of this module.
$validator->_load_definition( $definition )
Internal method for loading a definition into the validator
$validator->_load_definition_from_file( $filename )
Internal method for loading a definition from a file
$validator->_validate_elements( %p )
Internal method for validating a list of elements;
p: definition, data, mode
$validator->_validate_element( %p )
Internal method for validating a single element
p: data, definition, mode
$validator->_validate_type( %p )
Internal method for validating a single data type
$validator->_find_value( %p )
Internal method for finding a value match (basic xpath)
$validator->_push_hash( $dest, $source )
Internal method for copying a hash to another
$validator->_load_file( $file )
Internal method for loading a file, must be valid perl syntax.
Yep that's right, be bloody careful when loading from files.
$validator->_test_datetime( $typedef )
Test that a date time range is a valid date.
KNOWN BUGS
* XML and YML support not added yet.
* Fraction digits test doesn't work yet.
AUTHOR
Copyright, Martin Owens 2007-2008, Affero General Public License (AGPL)
http://www.fsf.org/licensing/licenses/agpl-3.0.html