NAME

Dios::Types - Type checking for the Dios framework (and everyone else too)

VERSION

This document describes Dios::Types version 0.000001

SYNOPSIS

use Dios::Types 'validate';

# Throw an exception if the VALUE doesn't conform to the specified TYPE
validate($TYPE, $VALUE);

# Same, but describe the VALUE in error messages using the specified DESC
validate($TYPE, $VALUE, $DESC);

# Same, but VALUE must satisfy every one of the CONSTRAINTS as well
validate($TYPE, $VALUE, $DESC, @CONSTRAINTS);

# If you don't want exceptions in response to type mismatches, use an eval
if (!eval{ validate($TYPE, $VALUE); 1 }) {
    warn "$VALUE not of type $TYPE. Proceeding anyway.";
}

DESCRIPTION

Standard types

This module implements type-checking for all of the following types...

Any

Accepts any Perl value.

Bool

Accepts any Perl value that can be used as a boolean. So effectively: any Perl value (just like Any).

This type exists mainly to allow you to be more specific about using a value as a boolean.

Undef

Accepts any value that is undefined. In other words, only the value undef.

Def

Accepts any value that is defined. That is, any value except undef.

Value

Accepts any value that is defined...but not a reference. For example: 7 or 0x093FA3D7 or 'word'.

Num

Accepts any value that is defined and also something for which Scalar::Util::looks_like_number() returns true.

However, unlike looks_like_number(), this type does not accept the special value 'NaN'. (I mean, what part of "not a number" does that function not understand???)

Note that this type does accept other special values like "Inf"/"Infinity", as well as objects with numeric overloadings.
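As a rough illustration of the rule above for plain (non-reference) scalars, here is a minimal sketch in core Perl. The helper name is_num_like() is hypothetical and is not part of Dios::Types; the sketch deliberately ignores objects with numeric overloadings, and it is not the module's actual implementation.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper (an assumption, not the Dios::Types code):
# approximates the documented Num rule for plain scalars only.
sub is_num_like {
    my ($value) = @_;
    return 0 if !defined $value;               # Num requires a defined value
    return 0 if ref $value;                    # overloaded objects not handled here
    return 0 if !looks_like_number($value);    # must look like a number...
    return 0 if $value =~ /\A\s*[+-]?nan/i;    # ...but "NaN" spellings are rejected
    return 1;
}

for my $candidate ('42', '1e3', 'Inf', 'NaN', 'seven') {
    my $verdict = is_num_like($candidate) ? 'accept' : 'reject';
    print "$verdict: $candidate\n";
}
```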

Int

Accepts any value for which Scalar::Util::looks_like_number() returns true and which also matches the regex:

/
   \A
   \s*                     # optional leading space
   [+-]?                   # optional sign
   (?:                     # either...
       \d++                #     digits
       (\.0*)?             #     plus optional decimal zeroes
   |                       # or...
       (?i) inf(?:inity)?  #     some "infinity" variant
   )                       #
   \s*                     # optional trailing space
   \Z
/x

Note that this type also accepts objects with numeric overloadings that produce integers.
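The following sketch applies the two documented conditions (looks_like_number() plus the regex above) to plain scalars. The helper name is_int_like() is hypothetical, objects with overloadings are not handled, and the module itself may well perform the check differently.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# The regex is copied from the description of the Int type.
my $INT_RX = qr/
    \A
    \s*                     # optional leading space
    [+-]?                   # optional sign
    (?:                     # either...
        \d++                #     digits
        (\.0*)?             #     plus optional decimal zeroes
    |                       # or...
        (?i) inf(?:inity)?  #     some "infinity" variant
    )                       #
    \s*                     # optional trailing space
    \Z
/x;

# Hypothetical helper (an assumption, not the Dios::Types code).
sub is_int_like {
    my ($value) = @_;
    return 0 if !defined $value || ref $value;
    return 0 if !looks_like_number($value);
    return $value =~ $INT_RX ? 1 : 0;
}

for my $candidate ('42', '-3', '7.000', '7.5', 'Inf', 'seven') {
    my $verdict = is_int_like($candidate) ? 'accept' : 'reject';
    print "$verdict: $candidate\n";
}
```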

Str

Accepts any value that is a string, or a non-reference that can be converted to a string (e.g. a number), or any object with a stringification overloading.

Class

Accepts any value that's a string that is the name of a symbol-table entry containing at least one of: $VERSION, @ISA, or some CODE entry.

In other words, the value must be the name of a package that is plausibly also a class...either because it has a version number, or because it inherits from some other class, or because it has at least one method defined.
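One way to picture that test is a direct inspection of the symbol table. The sketch below is an assumption about how such a check could be written in core Perl, not the module's own code; the helper looks_like_class() and the My::* packages are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: does $name plausibly name a class?
sub looks_like_class {
    my ($name) = @_;
    return 0 if !defined $name || ref $name;
    return 0 if $name !~ /\A \w+ (?: :: \w+ )* \z/x;   # must look like a package name

    no strict 'refs';                                  # symbolic lookups below
    return 1 if defined ${"${name}::VERSION"};         # has a version number
    return 1 if @{"${name}::ISA"};                     # inherits from something
    for my $symbol (keys %{"${name}::"}) {
        next if $symbol =~ /::\z/;                     # skip nested packages
        return 1 if defined &{"${name}::${symbol}"};   # has at least one sub
    }
    return 0;
}

# Hypothetical packages, one per documented criterion...
{ package My::Versioned; our $VERSION = '1.0'; }
{ package My::Child;     our @ISA = ('My::Versioned'); }
{ package My::Methods;   sub hello { return 'hi' } }

for my $name (qw(My::Versioned My::Child My::Methods No::Such::Package)) {
    my $verdict = looks_like_class($name) ? 'accept' : 'reject';
    print "$verdict: $name\n";
}
```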

Ref and Ref[T]

Accepts any value that is a reference of some kind (including objects).

The parameterized form specifies what kind(s) of reference the value must be:

Ref[Str]       # accepts only a reference to a string
Ref[Int]       # accepts only a reference to an integer
Ref[Array]     # accepts only a reference to an array
Ref[Hash]      # accepts only a reference to a hash
Ref[Code]      # accepts only a reference to a subroutine
Ref[Str|Num]   # accepts only a reference to a string or number

This implies that an unparameterized Ref is just a shorthand for Ref[Any].

Scalar

Accepts any value that is a reference to a scalar. For example: \1, \2.34e56, \"foo", etc.

Regex

Accepts any value that is a reference to a Regexp object (i.e. the value created by a qr/.../).

Code

Accepts any value that is a reference to a subroutine. Either: \&named_sub or sub {...}.

Glob

Accepts any value that is a reference to a typeglob.

IO

Accepts any value that is a reference to an open filehandle of some kind (as tested by Scalar::Util::openhandle()).
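The behaviour of Scalar::Util::openhandle() itself can be seen in a short core-Perl sketch. It uses an in-memory filehandle purely for convenience and does not call Dios::Types.

```perl
use strict;
use warnings;
use Scalar::Util qw(openhandle);

# Open an in-memory filehandle so that the sketch needs no real file.
open my $fh, '<', \"some in-memory data" or die "Cannot open: $!";

# openhandle() returns the handle while it is open, undef otherwise.
my $open_before = defined(openhandle($fh)) ? 1 : 0;
close $fh;
my $open_after  = defined(openhandle($fh)) ? 1 : 0;

print "open before close: $open_before, open after close: $open_after\n";
```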

Obj

Accepts any value that is a reference to an object (i.e. anything blessed).
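To see how Perl itself distinguishes the kinds of reference described above, the following core-Perl sketch prints what ref() and Scalar::Util::blessed() report for one sample value per type. The class name My::Thing is hypothetical, and the sketch only inspects the values; it makes no claim about how the module performs its own checks.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# One sample value for each of the reference types described above.
my %example = (
    Scalar => \"foo",
    Regex  => qr/\d+/,
    Code   => sub { 1 },
    Glob   => \*STDOUT,
    Obj    => bless({}, 'My::Thing'),   # My::Thing is a hypothetical class
);

for my $type (sort keys %example) {
    my $value = $example{$type};
    printf "%-6s ref=%-9s blessed=%s\n",
        $type, ref($value), blessed($value) // '(not blessed)';
}
```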

Array and Array[T]

Accepts any value that is a reference to an array.

The parameterized form specifies what kind of values the array must contain:

Array[Str]         # reference to array containing only strings

Array[Hash]        # reference to array containing only hash refs

Array[Code|Array]  # reference to array containing
                   # subroutine refs and/or array refs

Hence an unparameterized Array is just a shorthand for Array[Any].

Tuple[T1, T2, T3, ...]

Accepts any value that is a reference to an array whose elements are of the specified types (in order). For example:

Tuple[Str, Int, Int, Hash]    # accepts: ["Foo", 1, 2,   {bar=>1}]
                              # but not: ["Foo", 1, 2.1, {bar=>1}]
                              # and not: [1, 2,  "Foo",  {bar=>1}]

If the final specified type is followed by ..., the remainder of the elements may be any number of values (including none) of that type. For example:

Tuple[Str, Hash, Str...]  # accepts:  ["Foo", {bar=>1}]
                          # and also: ["Foo", {bar=>1}, 'cat']
                          # and also: ["Foo", {bar=>1}, 'cat', 'dog']
                          # et cetera...

If the last component of a tuple's type list is just ... by itself, the remainder of the elements may be anything (or nothing)...

Tuple[Str, Hash, ...]     # accepts:  ["Foo", {bar=>1}]
                          # and also: ["Foo", {bar=>1}, 'etc']
                          # and also: ["Foo", {bar=>1}, 3, 4.5]
                          # et cetera...

That is, a trailing ... is just shorthand for a trailing Any...

Hash and Hash[T]

Accepts any value that is a reference to a hash.

The parameterized form specifies what kind of values the hash may contain:

Hash[Str]         # hash's values are only strings
Hash[Hash]        # hash's values are only hash refs
Hash[Code|Array]  # hash's values are subroutine or array refs

Hence an unparameterized Hash is just a shorthand for Hash[Any].

Dict[ k, k => T, k? => T, ...]

Accepts any value that is a reference to a hash containing specific keys (and optionally with those keys having values of specific types).

Keys may be required or optional, and the corresponding values may be typed or untyped (i.e. Any). The set of keys listed may specify the only permitted keys...or allow other keys as well. The following examples cover the various possibilities.

To specify a reference to a hash with only four permitted keys ('name', 'rank', 'ID', and 'notes'), all of which must be present in the hash:

Dict[ name, rank, ID, notes ]

To specify a reference to a hash with four permitted keys, only two of which are required to be present in the hash:

Dict[ name, rank?, ID, notes? ]   # may have 'rank' and 'notes' entries
                                  # but not required to

To specify a reference to a hash with two to four permitted keys, with values of specific types:

Dict[ name => Str, rank? => Rank, ID => Int, notes? => Array ]

To specify a reference to a hash with two to four permitted keys, only some of which have values of specific types:

Dict[ name, rank? => Rank, ID => Int, notes? ]  # 'name' and 'notes' entries
                                                # can be of any type

To specify a reference to a hash with two to four specific keys, some with specific types, and with any number of other keys also allowed:

Dict[ name, rank? => Rank, ID => Int, notes?, ... ]

More complex relationships between keys and types can be specified using disjunctive types. For example, a reference to a hash with required 'ID' and 'name' entries and an optional 'rank' entry...but if the 'rank' entry is present, there must also be a 'notes' array:

Dict[name,ID]|Dict[name,ID,rank,notes=>Array]
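The key-presence rules described above (required keys, optional keys, and whether other keys are allowed) can be sketched in plain Perl as follows. The helper keys_match() and its required/optional/open arguments are hypothetical names invented for this illustration; the sketch checks keys only, ignores value types, and is not how Dios::Types parses or evaluates Dict[...].

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not the Dios::Types code).
#   required => [...]   keys that must be present
#   optional => [...]   keys that may be present      (like "k?")
#   open     => 1       other keys are allowed too    (like a trailing "...")
sub keys_match {
    my ($hash, %spec) = @_;
    return 0 if ref($hash) ne 'HASH';

    my %required = map { $_ => 1 } @{ $spec{required} // [] };
    my %optional = map { $_ => 1 } @{ $spec{optional} // [] };

    # Every required key must be present...
    for my $key (keys %required) {
        return 0 if !exists $hash->{$key};
    }

    # ...and, unless the spec is open, no unlisted key may be present.
    return 1 if $spec{open};
    for my $key (keys %$hash) {
        return 0 if !$required{$key} && !$optional{$key};
    }
    return 1;
}

# Roughly analogous to: Dict[ name, rank?, ID, notes? ]
my @spec = (required => [qw(name ID)], optional => [qw(rank notes)]);

print keys_match({ name => 'Ann', ID => 7 },             @spec) ? "ok\n" : "not ok\n";
print keys_match({ name => 'Ann', ID => 7, age => 30 },  @spec) ? "ok\n" : "not ok\n";
```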

Match[PATTERN]

Accepts any value whose stringification matches the regex: m[PATTERN]x

The pattern is always assumed to have the /x modifier in effect. If you don't want that, you need to turn it off within the pattern:

Match[      a b c ]     # accepts "abc"
Match[(?-x) a b c ]     # accepts " a b c"

Note that this type does not accept objects unless those objects overload stringification...even if the pattern specified would match the default 'MyClass=HASH(0x1d15ed17)' stringification of objects.
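The effect of the /x modifier, and of switching it off with (?-x), can be checked with ordinary Perl regexes. This sketch uses qr// directly rather than Match[...], so it only illustrates the underlying regex behaviour.

```perl
use strict;
use warnings;

my $with_x    = qr/ a b c /x;        # whitespace in the pattern is ignored
my $without_x = qr/(?-x) a b c/x;    # (?-x) makes the following spaces significant

for my $text ('abc', ' a b c') {
    printf "%-9s with /x: %-3s  with (?-x): %s\n",
        "'$text'",
        ($text =~ $with_x    ? 'yes' : 'no'),
        ($text =~ $without_x ? 'yes' : 'no');
}
```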

Can[METHODNAME]

Accepts any value that is either an object or a classname (i.e. Obj|Class) and for which $VALUE->can('METHODNAME') returns true.

If you need to be more specific as to whether the value itself is an object or a class, use a conjunction:

  Obj&Can[dump]    # i.e. $object->can('dump') returns true

Class&Can[dump]    # i.e. MyClass->can('dump') returns true
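The can() test that this type relies on works for both objects and classnames, as this short core-Perl sketch shows. The class My::Stash and its restore() method are hypothetical and exist only for the illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{
    package My::Stash;                        # hypothetical class
    sub new     { return bless {}, shift }
    sub restore { return 'restored' }
}

my $object = My::Stash->new;
my $class  = 'My::Stash';

# can() may be called on an object or on a classname...
for my $invocant ($object, $class) {
    my $kind = blessed($invocant) ? 'object' : 'classname';
    for my $method (qw(restore no_such_method)) {
        my $verb = $invocant->can($method) ? 'can' : 'cannot';
        print "$kind $verb $method()\n";
    }
}
```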

T1&T2

Accepts any value that both type T1 and type T2 individually accept. For example:

Obj&Device                # blessed($VALUE) && $VALUE->isa('Device')

Class&Match[^Internal::]  # an actual class whose name begins: Internal::

Note that there cannot be any whitespace between the & and either typename.

The & is associative, so you can add as many types as needed. For example, to accept only a hash-based object from a class in the Storable hierarchy, which must also have a valid restore() method:

Obj&Hash&Storable&Can[restore]

The component type tests are performed left-to-right and short-circuit on any failure (like the normal Perl && operator), so it will often be an optimization to put the most expensive type tests at the end.

T1|T2

Accepts any value that either type T1 or type T2 individually accepts. For example:

Str|Obj       # accepts either a string or an object

Num|Undef     # accepts either a number or undef

Array|Hash    # accepts either an array or hash reference

Note that there cannot be any whitespace between the | and either typename.

The | is associative, so you can add as many type checks as needed. For example, to accept a number or a specific string or a hash of integers:

Num|Match[quit]|Hash[Int]

The component type tests are performed left-to-right and short-circuit on any success (like the normal Perl || operator), so it will often be an optimization to put the most expensive type tests at the end.

The | and & type compositors have the usual precedences, so you can combine them as expected. For example, to accept an object (of any kind), or else the name of a class in the Storable hierarchy:

Obj|Class&Storable

If you need to circumvent the usual precedence, then use an Is[...].
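Because the compositors are described as behaving like Perl's own && and || operators, the difference between the default grouping and an explicit grouping can be sketched by analogy with plain booleans. The two subroutines below are hypothetical; each takes three flags saying whether a value is an object, is a classname, and belongs to the Storable hierarchy.

```perl
use strict;
use warnings;

# Analogue of  Obj|Class&Storable  : & binds more tightly than |
sub default_grouping {
    my ($is_obj, $is_class, $is_storable) = @_;
    return ($is_obj || $is_class && $is_storable) ? 1 : 0;
}

# Analogue of  Is[Obj|Class]&Storable  : explicit bracketing
sub explicit_grouping {
    my ($is_obj, $is_class, $is_storable) = @_;
    return (($is_obj || $is_class) && $is_storable) ? 1 : 0;
}

# An object that is not in the Storable hierarchy...
my @flags = (1, 0, 0);
print 'default:  ', default_grouping(@flags),  "\n";
print 'explicit: ', explicit_grouping(@flags), "\n";
```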

Is[T]

Accepts any value that type T itself would accept.

This construct may be used anywhere within a typename, but is mainly useful for "bracketing" types when composing them with | and &.

For example, to match an object of any class in the Storable or Disposable hierarchies, or any object that has a reset() method, using normal &/| precedence, you'd have to write:

Obj&Storable|Obj&Disposable|Obj&Can[reset]

With Is[...], that's just:

Obj&Is[Storable|Disposable|Can[reset]]

Not[T]

Accepts any value that type T itself would not accept.

For example:

Not[Num]             # Anything except a number

Not[Ref]             # Anything except a reference (i.e. Value|Undef)

Not[Obj]             # Anything unblessed

Not[Match[error]]    # Anything that doesn't match /error/x

Not[Obj|Class]       # Anything you can't call methods on

Not[Obj&Storable]    # Anything that isn't an object of class Storable
                     # (could still be an object of some other hierarchy
                     #  or else a classname in the Storable hierarchy)

User-defined types

Any other type specification that is a valid Perl identifier or qualified identifier is treated as a classname.

If the corresponding class exists, such a "classname type" accepts an object or classname in the corresponding class hierarchy. For example:

Storable               # object or classname in the Storable hierarchy

Disk::DVD::Rewritable  # object or classname in D::D::R hierarchy

Such user-defined types can be composed with each other and with all the other type specifiers listed above:

Storable|Disk::DVD::Rewritable  # object or classname from either hierarchy

Storable&Can[restore]           # a Storable with a restore() method

Obj&Disk::DVD::Rewritable       # an object of the hierarchy
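Membership of a class hierarchy, for objects and classnames alike, is what Perl's built-in isa() method reports. The sketch below shows that behaviour in core Perl with two hypothetical packages; it is meant only as an analogy for what a classname type accepts.

```perl
use strict;
use warnings;

# Hypothetical two-level hierarchy...
{ package My::Disk;      sub new { return bless {}, shift } }
{ package My::Disk::DVD; our @ISA = ('My::Disk'); }

my $dvd = My::Disk::DVD->new;

# isa() answers for an object and for a classname...
for my $candidate ($dvd, 'My::Disk::DVD', 'My::Disk') {
    my $label   = ref($candidate) ? 'object of ' . ref($candidate) : "classname $candidate";
    my $verdict = $candidate->isa('My::Disk') ? 'is in' : 'is not in';
    print "$label $verdict the My::Disk hierarchy\n";
}
```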

Type relationships

Most of the standard types and type compositors listed in the previous section form a single hierarchy, like so:

Any
  \__Bool
       |___Undef
       |
        \__Def
            |__Value
            |     |___Num
            |     |    \__Int
            |     |
            |      \__Str
            |           \_Class
            |
             \__Ref
                 |___Ref[<T>]
                 |___Scalar
                 |___Regex
                 |___Code
                 |___Glob
                 |___IO
                 |___Obj
                 |___Array
                 |      |___Array[<T>]
                 |       \__Tuple[<T>, <T>, <T>, ...]
                 |
                  \__Hash
                       |___Hash[<T>]
                        \__Dict[<k> => <T>, <k>? => <T>, ...]

That is, a value that is accepted by any specific type in this diagram will also be accepted by all of its ancestral types. So, for example, the type Tuple[Str,Int] accepts the value ['A',1], so that same value will also be accepted by all of the following types (amongst many others): Tuple[Value,Int], Tuple[Def,Num], Tuple[Any,Bool], Array, Ref, Def, Bool, or Any.

However, the converse is not generally true: a value that is accepted by a "parent" type may not be accepted by all (or any) of its descendants. So while the type Array accepts the value ['A',{}], that same value will not be accepted by either of the "child" types: Array[Int] or Tuple[Int,Str].

INTERFACE

use Dios::Types 'validate';

The validate() subroutine is not exported by default, but must be explicitly requested.

use Dios::Types 'validate' => 'OTHER_NAME';

When importing validate(), you can request the module rename it, by passing the desired alternative name as a second argument. For example:

use Dios::Types 'validate' => 'typecheck';

# and later...

typecheck('Array', $data);

validate($type, $value, $value_desc, @constraint_subs)

This subroutine requires its first two arguments: a type specification and a scalar value. If the type accepts the value, the subroutine returns true. If the type doesn't accept the value, an exception is thrown.

For example:

# Die if number of matches isn't an integer...
validate('Int', $matches);

# Die if any element isn't an open filehandle...
validate('Array[IO]', \@filehandles);

# Validate subroutine args...
sub fill_text {
    validate('Str',                my $text  = shift);
    validate('Int',                my $width = shift);
    validate('Dict[fill?, just?]', my $opts  = shift);
    ...
}

If you don't want the exception on failure, use an eval to defuse it:

if (!eval{ validate('Int', $matches); 1}) {
    say "Warning: $@";
    redo;
}
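The trailing 1 inside the eval block ensures that the block yields a true value whenever no exception was thrown, so a false result always means failure. That idiom can be wrapped in a small helper, sketched below. Both try_check() and validate_int_stub() are hypothetical: the stub merely stands in for validate('Int', ...) so that the sketch runs without Dios::Types installed, and it is not the module's implementation.

```perl
use strict;
use warnings;

# Stand-in for validate('Int', $value): dies on failure, returns true otherwise.
# (An assumption made only so that this sketch is self-contained.)
sub validate_int_stub {
    my ($value) = @_;
    die "Value ($value) is not of type Int\n" if $value !~ /\A[+-]?\d+\z/;
    return 1;
}

# Hypothetical wrapper: converts "throws on failure" into (ok, error).
sub try_check {
    my ($check, @args) = @_;
    my $ok = eval { $check->(@args); 1 };    # the 1 makes success explicit
    return $ok ? (1, undef) : (0, $@);
}

my ($ok, $error) = try_check(\&validate_int_stub, 'seven');
print $ok ? "accepted\n" : "Warning: $error";
```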

Describing the value passed to validate()

You can also pass one or more extra strings to validate(), which are used to improve the error messages produced for unacceptable values. Any extra arguments passed to the subroutine (that are not references) are concatenated together and used as the description of the value in the exception message. For example:

my $input = 'seven';

validate('Int', $input);
# dies with: "Value ("seven") is not of type Int"

validate('Int', $input, 'Error count reported by ', get_user_name());
# dies with: "Error count reported by root is not of type Int"

If the description string contains a %s, it is used as a sprintf format, and the value itself interpolated for the %s. For example:

validate('Int', $input, 'Error count (%s) reported by ', get_user_name());
# dies with: "Error count ("seven") reported by root is not of type Int"

Constraining the value passed to validate()

Any other extra arguments must be subroutine references, and these are used as additional constraints on the type-checking.

That is, if the specified type accepts the value, that value is then passed to each constraint subroutine in turn. If any of those subroutines returns false or throws an exception, then the type is considered not to have matched the value.

For example:

# Is $data a non-empty array of ints?
validate('Array[Int]', $data, sub{ @{$_[0]} > 0 });

# Is $filename a string in 8.3 format?
validate('Str', $filename, sub{ shift =~ qr/^\w{1,8}\.\w{3}$/ });

# Is $config a valid and normalized hash?
validate('Hash', $config, \&is_valid, \&is_normalized);

When the constraint subroutines are called, the value being validated is also temporarily aliased to $_, which sometimes simplifies the constraint:

# Is $data a non-empty array of ints?
validate('Array[Int]', $data, sub{ @$_ > 0 });

# Is $filename a string in 8.3 format?
validate('Str', $filename, sub{ /^\w{1,8}\.\w{3}$/ });

# Is $ID an unused integer?
validate('Int', $ID, sub{ !$used_ID[$_] });
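The calling convention described above (the value passed as the first argument, also visible as $_, with a false result or an exception both counting as failure) might be sketched like this. The helper satisfies_all() is hypothetical and is not taken from the module's source.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not the Dios::Types code):
# runs each constraint with the value in $_[0] and aliased to $_.
sub satisfies_all {
    my ($value, @constraints) = @_;
    for my $constraint (@constraints) {
        my $passed;
        for ($value) {                                   # aliases $_ to $value
            $passed = eval { $constraint->($value) };    # an exception leaves undef
        }
        return 0 if !$passed;                            # false or died: constraint failed
    }
    return 1;
}

my $non_empty = sub { @$_ > 0 };                         # uses the $_ alias
my $all_small = sub { !grep { $_ > 99 } @{ $_[0] } };    # uses the argument

print satisfies_all([1, 2, 3], $non_empty, $all_small) ? "ok\n" : "not ok\n";
print satisfies_all([],        $non_empty, $all_small) ? "ok\n" : "not ok\n";
```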

When a constraint test fails, validate() does its best to produce a meaningful error message. For example, when $data isn't long enough:

my $data = [];

validate('Array[Int]', $data, sub{ @$_ > 0 });

...then the exception thrown is:

Value ([]) did not satisfy the constraint: { @$_ > 0; }

which is accurate, but maybe not sufficiently enlightening for all users.

There are two ways of improving the message produced. If a constraint is specified as a named subroutine, as in the earlier example:

validate('Hash', $config, \&is_valid, \&is_normalized);

then validate() attempts to convert the subroutine name into a description of the constraint:

Value ({ a=>1, b=>2, c=>1 }) did not satisfy the constraint: is normalized

Alternatively, if a constraint subroutine throws an exception on failure, the text of the exception is used as the description of the constraint:

validate('Array[Int]', $data, sub{ @$_ > 0 or die 'must not be empty' });

Now the exception thrown is:

Value ([]) did not satisfy the constraint: must not be empty

Note that the two kinds of extra arguments to validate() (i.e. value description strings and constraint subroutines) can be passed in any order, or even intermixed, as there is no ambiguity in the meaning of sub references vs non-references.

Declaring subtypes

subtype NEW_TYPE of OLD_TYPE;

subtype NEW_TYPE of OLD_TYPE where CONSTRAINT;

The many parametric types and type compositors that the module supports can easily cause type specifications to become unwieldy:

validate('Array[Int|Hash[Int]]&Is[Configurator|Not[Obj]]', $data);

Of course, you could always factor this specification string out into a variable or constant:

use constant
    ConfigArray => 'Array[Int|Hash[Int]]&Is[Configurator|Not[Obj]]';

validate(ConfigArray, $data);

but that doesn't help if you're using the full Dios framework:

use Dios;

use constant
    ConfigArray => 'Array[Int|Hash[Int]]&Is[Configurator|Not[Obj]]';

func configurate(ConfigArray $config) {...}

because the compiler won't substitute the constant ConfigArray there, but will instead treat that typename as specifying that the $config value must belong to class ConfigArray.

So Dios::Types provides a compile-time mechanism for naming a complex type specification within a given lexical scope...without injecting a constant into that scope: the subtype keyword:

subtype ConfigArray of Array[Int|Hash[Int]]&Is[Configurator|Not[Obj]];

# and later in the same scope...

validate('ConfigArray', $data);

# or...

func configurate(ConfigArray $config) {...}

A subtype is equally useful for giving some builtin type a more meaningful name:

subtype HitCount   of Int;
subtype ClientName of Str;
subtype Search     of Dict[ for => Regex, filters? => Tuple[Code...] ];

The subtype mechanism also allows you to create a named type that binds a constraint to an existing type (built-in or user-defined). For example:

# Hitcounts must be in the range 0 to 99...
subtype HitCount of Int where { 0 <= $_ && $_ <= 99 };

# Known names must have been previously seen...
subtype KnownName of ClientName where { $client_name_seen{$_} };

So now:

validate('HitCount', $count);

is a shorter and more meaningful way of specifying:

validate('Int', $count, sub { 0 <= $_ && $_ <= 99 });

The optional where constraint must be specified as a block of code, which is converted to a subroutine that is automatically passed to validate() whenever the subtype is used.

Note: if you want to use the other features of this module, but don't want the compile-time overheads of this keyword, use the Dios::Types::Pure module instead.

DIAGNOSTICS

Can't export %s

The module exports only a single subroutine: validate(). You asked it to export something else, which confused it.

If you were trying to export validate() under a different name, then you need:

use Dios::Types validate => '<name>';

Two type specifications for key %s in Dict[%s]

The Dict[...] type allows you to specify that a value must be of type Hash, and must only contain specific keys.

You're supposed to list each such key just once inside the square brackets but you listed a key twice (or more). Delete all the repetitions.

If you repeated a key because you were trying to allow its value to have two or more alternative types, like so:

Dict[name => Str, name => Undef]

then you need to write that using a single junctive type instead:

Dict[name => Str|Undef]

Incomprehensible type name: %s

The type you specified wasn't one that the module understands. Review the syntax for standard types and user-defined types.

Invalid regex syntax in Match[%s]: %s

The contents of the square brackets must be a valid regex specification (i.e. something you could validly put in an m/.../ or a qr/.../).

The full error message should point to the bad regex syntax. If that message doesn't help, see perlre for details of the standard Perl regex syntax.

Missing specification for constraint: %s

You passed a constraint to validate, but it was not a subroutine reference. Every constraint must be specified as a reference to a subroutine that expects one argument (the value) and returns a boolean value indicating whether the value satisfied the constraint.

%s is not of type %s

This is the default message of the exception thrown by validate() if the value passed as its second argument doesn't match the type passed as its first argument.

%s did not satisfy the constraint: %s

This is the default message of the exception thrown by validate() if the value passed as its second argument failed to satisfy one of the constraint subroutines that were also passed to it.

CONFIGURATION AND ENVIRONMENT

Dios::Types requires no configuration files or environment variables.

DEPENDENCIES

Requires Perl 5.14 or later.

Requires the Data::Dump and Variable::Magic modules.

Also requires the Keyword::Declare module (for the subtype keyword).

INCOMPATIBILITIES

None reported.

BUGS AND LIMITATIONS

No bugs have been reported.

Please report any bugs or feature requests to bug-dios-types@rt.cpan.org, or through the web interface at http://rt.cpan.org.

AUTHOR

Damian Conway <DCONWAY@cpan.org>

LICENCE AND COPYRIGHT

Copyright (c) 2015, Damian Conway <DCONWAY@cpan.org>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.