NAME

Whelk::Schema - Whelk validation language

SYNOPSIS

        # build from scratch
        Whelk::Schema->build(
                name => {
                        type => 'string',
                }
        );

        # build by extending
        Whelk::Schema->build(
                new_name => [
                        \'name_to_extend',
                        %more_args
                ],
        );

DESCRIPTION

Whelk schema is a simple validation language, similar to JSON Schema. It is designed to be a bit more concise and is crafted specifically for Whelk's needs.

Whelk schema is used everywhere in Whelk: not only in Whelk::Schema->build calls but also in request, response and parameters keys in endpoints. Only "build" allows defining named schemas.

A named schema is global and should have a unique name. The module will not allow overriding a named schema. All named schemas will be put into the OpenAPI document, in the components/schemas object, using their defined names.

Defining a schema

There are three ways to define a schema, listed below. All of them can be used at every nesting level, so for example you can use a reference to a named schema inside the properties of an object schema created with a hash reference.

New schema using hash reference

        { # new schema, level 0
                type => 'array',
                items => { # new schema, level 1
                        type => 'object',
                        properties => { # reused schema, level 2
                                some_field => \'named_schema'
                        },
                },
        }

By passing a HASH reference you are creating a completely new schema. The type field is required and must be one of the available types, in lowercase.

Schema declared this way will be put into the OpenAPI document as-is, without referencing any other schema.

Reusing schemas with scalar reference

        # reusing a named schema
        \'name'

By passing a SCALAR reference you are reusing a named schema. The name must exist beforehand or else an exception will be raised.

Schema declared this way will be put into the OpenAPI document as a reference to a schema inside components/schemas object.

Extending schemas with array reference

        # extending a named schema
        [
                \'name',
                required => !!0,
        ]

By passing an ARRAY reference you are extending a named schema. The first argument must be a SCALAR reference with the name of the schema to extend. The rest of the arguments are configuration values which will be replaced in the extended schema. type cannot be replaced.

Schema declared this way will be put into the OpenAPI document as-is, without referencing any other schema.
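
As a rough sketch of how the three forms can be mixed at different nesting levels, the structure below combines them in one definition. The schema name user and the property names are hypothetical, and the named schema would have to be built beforehand for Whelk to accept this.

```perl
use strict;
use warnings;

# Hypothetical definition mixing all three forms (plain Perl data only)
my $review_schema = {
    type => 'object',
    properties => {
        # SCALAR reference: reuse the named schema "user" as-is
        author => \'user',

        # ARRAY reference: extend "user", making the value optional
        reviewer => [
            \'user',
            required => !!0,
        ],

        # HASH reference: a completely new nested schema
        comment => {
            type => 'string',
        },
    },
};
```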

Available types

Each new schema must have a type defined. All types share these common configuration values:

  • required

    Boolean - whether the value is required to be present. true by default.

  • description

    String - an optional description used for the schema in the OpenAPI document.

  • rules

    An array reference of hashes. See "Extra rules".

null

A forced undef value.

No special configuration.

empty

This is a special type used to implement 204 No Content responses. It is only valid at the root of response and should not be used in any other context.

No special configuration.

string

A string type. The value must not be a reference and the output will be coerced to a string value. Unlike JSON Schema, this also accepts numbers.

Extra configuration fields:

  • default

    A default value to be used when there is no value. Also assumes required => !!0.

  • example

    An optional example used for the schema in the OpenAPI document.
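
A minimal sketch of a string schema using these fields might look like the following. The variable name and all values are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical string field that falls back to 'en' when no value is given.
# As described above, specifying default also implies required => !!0.
my $language_schema = {
    type => 'string',
    description => 'Preferred language code',
    default => 'en',
    example => 'pl',
};
```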

boolean

A boolean type. Will coerce the output value to JSON::PP::true and JSON::PP::false objects.

Same extra configuration as in "string".

number

A numeric type. Will coerce the output value to a number. Unlike JSON Schema, this also accepts strings as long as they contain something which looks like a number.

Same extra configuration as in "string".

integer

Same as "number", but will not accept numbers with fractions.

array

This is an array type, which will only accept array references.

Extra configuration fields:

  • items

    An optional type to use for each of the array elements. This is a nested schema, and all ways to define a schema discussed in "Defining a schema" will work.

  • lax

    This is a special boolean flag used to accept array parameters of type query and header. If present and true, the type will also accept a non-array input and turn it into an array with one element. It should probably only be used within the parameters structure of an endpoint.
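
To illustrate both fields together, here is a small sketch of an array schema. The variable name is hypothetical, and the schema is intended for the parameters structure of an endpoint.

```perl
use strict;
use warnings;

# Hypothetical array of integers. With lax set, a single non-array input
# (such as one query parameter value) is turned into a one-element array.
my $ids_schema = {
    type => 'array',
    lax => !!1,
    items => {
        type => 'integer',
    },
};
```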

object

This is a hash type, which will only accept hash references. Unlike JSON Schema, its required is not an array of required elements. Instead, the required elements are determined by the required flag of each of its properties.

Extra configuration fields:

  • properties

    An optional dictionary to use for the keys in the object. If it's not specified, the object can contain anything. This is a nested schema, and all ways to define a schema discussed in "Defining a schema" will work.

  • strict

    This is a special boolean flag which makes any input containing keys other than those specified in properties incorrect. By default, the hash can contain any number of extra keys and will still be considered correct. Note that the schema will still only copy the keys which were defined, so this is usually not required.
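
The sketch below shows how required elements are expressed per property rather than as a list on the object itself. The property names are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical object schema: "name" is required (the default),
# "age" is optional, and strict rejects input with any other key.
my $person_schema = {
    type => 'object',
    strict => !!1,
    properties => {
        name => {
            type => 'string',
        },
        age => {
            type => 'integer',
            required => !!0,
        },
    },
};
```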

Extra rules

Whelk does not implement the full JSON Schema spec with all its rules. To allow further configuration, you can specify extra rules when needed. They will be used during validation and may optionally add some keys to the OpenAPI spec of that field. While all field types allow defining extra rules, it makes little sense to use them for the types boolean, null and empty: rules will do nothing for them.

An example of adding some rules is showcased below:

        {
                type => 'integer',
                rules => [
                        {
                                openapi => {
                                        minimum => 5,
                                },
                                hint => '(>=5)',
                                code => sub {
                                        my $value = shift;

                                        return $value >= 5;
                                },
                        },
                ],
        }

As shown, a rules array reference may be defined, containing hash references. Each rule (represented by a hash reference) must contain hint (a very short error message notifying the end user what's wrong), code (a sub reference, which will be passed the value and must return true if the value is valid) and optionally openapi (a hash reference, containing keys which will be added to OpenAPI document).

There may be multiple rules in each field, and each rule can contain multiple openapi keys (but only a single code and hint). This system is very bare-bones and a bit verbose, but it makes it very easy to write your own library of validations, implementing the parts of JSON Schema you need (or even the full schema - please publish to CPAN if you do!). Just write a function which returns such a hash reference and it becomes quite powerful:

        sub greater_or_equal
        {
                my ($arg) = @_;

                return {
                        openapi => {
                                minimum => $arg,
                        },
                        hint => "(>=$arg)",
                        code => sub { shift() >= $arg },
                };
        }

        # ... then
        {
                type => 'integer',
                rules => [
                        greater_or_equal(5),
                ],
        }
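
The same pattern can be applied to other types. The sketch below is a hypothetical max_length helper for strings (not part of Whelk), using the maxLength keyword from JSON Schema as its openapi key. Since code is a plain sub reference, it can be exercised outside of Whelk.

```perl
use strict;
use warnings;

# Hypothetical rule factory: limits a string to $max characters
sub max_length
{
    my ($max) = @_;

    return {
        openapi => {
            maxLength => $max,
        },
        hint => "(length<=$max)",
        code => sub { length(shift()) <= $max },
    };
}

my $schema = {
    type => 'string',
    rules => [
        max_length(10),
    ],
};

# Call the rule by hand to see what validation would decide
my $rule = $schema->{rules}[0];
for my $value ('short', 'much too long value') {
    my $ok = $rule->{code}->($value);
    print $ok ? "valid\n" : "invalid $rule->{hint}\n";
}
```

Keeping each helper to a single check means several of them can be listed in one rules array, with each failing rule reporting its own hint.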

METHODS

This is a list of factory methods implemented by Whelk::Schema.

build

Builds a schema and returns Whelk::Schema::Definition.

build_if_defined

Same as "build", but will not throw an exception if an undef is passed. Instead, returns undef.

get_by_name

Gets a named schema by name and returns Whelk::Schema::Definition.

get_or_build

A mix of "build" and "get_by_name". Tries to get a schema by name, and builds it if it was not defined yet.

all_schemas

Returns all named schemas defined thus far.

SEE ALSO

Whelk::Manual