=head1 NAME

DateTime::Format::Builder - create DateTime parser objects.

=head1 SYNOPSIS

    use DateTime::Format::Builder;

    my $parser = DateTime::Format::Builder->parser(
        params => [ qw( year month day hour minute second ) ],
        regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
    );

    my $dt = $parser->parse_datetime( "19790716153300" );

=head1 DESCRIPTION

C<DateTime::Format::Builder> creates L<DateTime> parser objects.
Many string formats of dates and times are simple and just require
a basic regular expression to extract the relevant information.

As such, they don't require a full-blown module of their own.
This module fills that gap: it allows you to create parser objects
and classes with a minimum of fuss.

=head1 FORMATTING vs PARSING

The name of this module is C<DateTime::Format::Builder>. This is, perhaps,
somewhat misleading. It should be noted that the word C<Format> is being
used as a noun, not a verb.

=head1 CONSTRUCTORS

=head2 new

Creates a new C<DateTime::Format::Builder> object. If called as
an object method, then it clones the object.

No arguments.

=head2 parser

If called as a class method, it creates a new
C<DateTime::Format::Builder> object with a specified parser. Parameters
are as for L</create_parser>.

If called as an object method, it creates a new parser for that object.
(Essentially a shortcut for C<create_parser> and C<set_parser>.)

   # Class
   my $new_parser = DateTime::Format::Builder->parser( ... );

   # Object
   $new_parser->parser( ... );

As a sidenote, when called as an object method (e.g.
C<< $new_parser->parser(...) >>) the object itself
is returned (e.g. C<< $new_parser >>).

=head2 clone

For those who prefer to explicitly clone via a method called C<clone()>.
If called as a class method it will die.

   my $clone = $original->clone();

=head2 create_class

C<create_class> is different from the other constructors. It creates a
full class for the parser, not just an instance of
C<DateTime::Format::Builder>.

It takes two optional parameters and one required one.

=head3 OPTIONAL PARAMETERS

=over 4

=item *

C<class> is the name of the class to create. If not specified then
it is inferred to be the current package. Generally best left
unspecified.

=item *

C<version> is the version of the class. Generally best left unspecified
unless C<class> is also specified (that is, you're not just preparing
the current context). Why? Because CPAN only picks up a module's version
from a C<$VERSION> declared in the form it expects; without one, it
won't behave properly. Ditto L<ExtUtils::MakeMaker>.

=back

=head3 REQUIRED PARAMETER

C<parsers> is the important parameter. It takes a hashref as an
argument. This hashref maps method names to arrayrefs
of parser specifications.

For example (since the code is often clearer than my writing):

    package DateTime::Format::Brief;
    use DateTime::Format::Builder;
    DateTime::Format::Builder->create_class(
        parsers => {
            parse_datetime => [
            {
                regex => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                regex => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
            ],
        }
    );

If you have only one specification, you can give it directly, without
the enclosing list:

    parse_datetime => {
        regex => qr/^(\d{4})(\d\d)(\d\d)$/,
        params => [qw( year month day )],
    },


=head1 CLASS METHODS

These methods work either on our objects or as class methods.

=head2 create_parser

Creates a function to parse datetime strings and return L<DateTime> objects.

    # Parse a 15 character ICal string
    my $parser_fn = DateTime::Format::Builder->create_parser({
        regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
        params => [qw( year month day hour minute second )],
        extra  => {},
    });

    # Parse an 8 character ICal string
    my $short_ical_parser = DateTime::Format::Builder->create_parser(
        {
            params => [ qw( year month day ) ],
            regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)$/,
        }
    );

I call the arguments seen above 'specifications', or C<spec>. A
C<spec> is passed by reference, as a hashref, and I call this a
C<specref>. Pardon the introduction of terminology, but it does make
things simpler later on.

I specify the layout of a C<spec> L<below|/SPECIFICATIONS>.

C<create_parser> (and, because of this, most of the other routines) can
create a few different sorts of parser. For each type I'll have a bit in
parens that indicates the call style.

=head1 SPECIFICATIONS

A specification is typically a hashref (except for simple, single parser
creations, where it can be just a hash).

For example, here we have two specifications:

    my $inefficient_ical_parser = DateTime::Format::Builder->create_parser(
        {
            regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
            params => [qw( year month day hour minute second )]
        },
        {
            params => [ qw( year month day ) ],
            regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)$/,
        },
    );

Right. And for further fun and games, any of these C<specrefs> can also
be a coderef. The routine will be given the C<$self> object (or it may just
be a class string) and a date string on input, and is expected to return
C<undef> on failure, or a C<DateTime> object on success.
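
As a minimal sketch of such a coderef, assuming only the calling
convention just described: the routine below accepts dates such as
C<2003-07-16>. The name C<$dashed_date> is made up for this example, and
the C<eval> is there because C<< DateTime->new() >> dies when given
out-of-range values.

    use strict;
    use warnings;
    use DateTime;

    # Hypothetical coderef spec: takes ($self, $date), returns a
    # DateTime object on success or undef on failure.
    my $dashed_date = sub {
        my ( $self, $date ) = @_;
        return undef unless $date =~ /^(\d{4})-(\d\d)-(\d\d)$/;
        my ( $year, $month, $day ) = ( $1, $2, $3 );

        # Return undef rather than die if, say, the month is 13.
        my $dt = eval {
            DateTime->new( year => $year, month => $month, day => $day );
        };
        return $dt;
    };

    # Called by hand here; the class name stands in for $self.
    my $dt = $dashed_date->( 'DateTime::Format::Builder', '2003-07-16' );
    print $dt->ymd, "\n" if $dt;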


A hash specification accepts the following keys:

=over 4

=item *

C<regex> will be applied to the input of the created function.
This argument is required.

=item *

C<params> is an arrayref that maps the results of C<regex> to parameters
of C<< DateTime->new() >>. The first element is C<$1>, the second C<$2>, etc.
This argument is required.

=item *

C<extra> is a hashref that lists what any extra arguments should be set to.
You can use it to specify parameters to C<< DateTime->new() >> that the
regex does not capture, such as C<time_zone>.

=item *

C<on_fail> is a reference to a subroutine (anonymous or otherwise) that will
be called if a parse fails. It will be passed a hash with the following keys:

=over 4

=item *

C<input>, being the input on which the parser failed

=item *

C<label>, being the label of the parser, if there is one

=back

=item *

C<on_match> is just like C<on_fail>, only it's called in the event of success.

=item *

C<label> provides a name for the parser and is passed to C<on_fail> and
C<on_match>.  If you specified a set of parsers with some form of
C<< X => Y >> hash style, then by default, the label is the C<X>. That
will be overridden if you use this C<label> tag.

=item *

C<preprocess> is another callback. Its arguments are a hash consisting
of the keys C<input> (the datetime string given to the parser) and
C<parsed> (a hashref that is initially empty [unless your group of
parser specifications had a preprocessor that put something in it]).

You may put what you like in the hashref, and it will be kept.

This callback is called I<after> length determination.

=item *

C<postprocess> is yet another callback. Its arguments are the same as for
C<preprocess>, except the C<parsed> hashref has been filled out with how
the parse went. If parsing failed, it is not called. It is free to
modify the hashref. Any changes will be reflected back. If the callback
returns false, then the parse is regarded as a failure. B<Note>: ensure
you return some true value if you don't want things to fail
mysteriously.

=back
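
To make the roles of C<regex>, C<params>, C<extra>, C<on_match> and
C<on_fail> concrete, here is a rough sketch in plain Perl of what a parser
built from a single specification does. It is an illustration only, not
this module's actual implementation: the helper C<sketch_parser> is
hypothetical, and it returns the argument hash instead of calling
C<< DateTime->new() >> so that it runs without L<DateTime> installed.

    use strict;
    use warnings;

    # Hypothetical helper: turn one specification into a closure.
    sub sketch_parser {
        my %spec = @_;
        return sub {
            my ($input) = @_;
            my %callback_args = ( input => $input );
            $callback_args{label} = $spec{label} if defined $spec{label};

            # Apply the regex; the captures are $1, $2, ... in order.
            my @captures = $input =~ $spec{regex};
            unless (@captures) {
                $spec{on_fail}->(%callback_args) if $spec{on_fail};
                return;
            }

            # Pair the first params entry with $1, the second with $2, etc.
            my %args;
            @args{ @{ $spec{params} } } = @captures;

            # Add the fixed values from extra.
            %args = ( %args, %{ $spec{extra} || {} } );

            $spec{on_match}->(%callback_args) if $spec{on_match};

            # A real parser would pass these on to DateTime->new().
            return \%args;
        };
    }

    my $parse = sketch_parser(
        label   => 'short_date',
        regex   => qr/^(\d{4})(\d\d)(\d\d)$/,
        params  => [qw( year month day )],
        extra   => { time_zone => 'UTC' },
        on_fail => sub { my %info = @_; warn "no match for '$info{input}'\n" },
    );

    my $args = $parse->('20030716');
    print join( ', ', map { "$_=$args->{$_}" } sort keys %$args ), "\n";
    # prints: day=16, month=07, time_zone=UTC, year=2003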

If you have a series of specifications and want a common preprocessor, it
can be specified like this:

    my $brief_parser = DateTime::Format::Builder->create_parser(
        [
            preprocess => sub { ... },    # placeholder for your own code
        ],
        {
            regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
            params => [qw( year month day hour minute second )],
        },
        {
            regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
            params => [qw( year month day )],
        },
    );

B<Note> that this works with the arrays of specs in C<create_class> too.

B<Note also> that the arrayref B<must> be the first argument.

The C<preprocess> sub receives a hash with two keys: C<input>, the date
string to be parsed, and C<parsed>, a hashref in which to place any
pre-calculated values. The return value should be the date string that
the parsers will then go on to process.

A sample preprocessor (taken from L<DateTime::Format::ICal>) looks
like this:

    my $add_tz = sub {
        my %args = @_;
        my ($date, $p) = @args{qw( input parsed )};
        if ( $date =~ s/^TZID=([^:]+):// )
        {
            $p->{time_zone} = $1;
        }
        # Z at end means UTC
        elsif ( $date =~ s/Z$// )
        {
            $p->{time_zone} = 'UTC';
        }
        else
        {
            $p->{time_zone} = 'floating';
        }
        return $date;
    };

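Any length calculations (for length parsers) are done after this common
preprocessing. (A C<preprocess> callback inside an individual
specification, by contrast, is called after length determination.)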
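
To see the calling convention in isolation, a preprocessor can be
exercised by hand. The sketch below uses a made-up preprocessor,
C<$strip_utc> (a cut-down cousin of the sample above), and calls it with
the C<input> and C<parsed> keys described earlier. It also shows why the
ordering matters: the string whose length would be measured is the one
the preprocessor returns.

    use strict;
    use warnings;

    # Hypothetical preprocessor: strip a trailing Z and note the zone.
    my $strip_utc = sub {
        my %args = @_;
        my ( $date, $p ) = @args{qw( input parsed )};
        $p->{time_zone} = ( $date =~ s/Z$// ) ? 'UTC' : 'floating';
        return $date;
    };

    my %parsed;
    my $date = $strip_utc->(
        input  => '20030716T153300Z',
        parsed => \%parsed,
    );
    print "$date ($parsed{time_zone}), length ", length($date), "\n";
    # prints: 20030716T153300 (UTC), length 15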

=head1 OBJECT METHODS

If you actually create a C<DateTime::Format::Builder> object, then you
get the following methods on that object.

=head2 set_parser / get_parser

Set and get the object's parser function. Fairly straightforward
and of minimal use, except for subclasses.

=head2 parse_datetime

Given a date string, return a C<DateTime> object representing that
date and time.

    # Having created our parser, somehow, we can:

    my $dt = $parser->parse_datetime( "1998-04-01 15:16:24" );

If you receive errors about things being undefined, then there was a
parse failure.

=head2 format_datetime

Ok. We don't actually implement this. It's just here to
make sure you know we don't. It's implemented like an
abstract method: it will die if invoked.

It will be available at some point.

=head1 THANKS

Dave Rolsky (DROLSKY) for kickstarting the DateTime project
and some much needed review.

Joshua Hoblitt (JHOBLITT) for the concept, some of the API,
and more much needed review.

Kellan Elliott-McCrea (KELLAN) for even more review!

Simon Cozens (SIMON) for saying it was cool.

=head1 SUPPORT

Support for this module is provided via the datetime@perl.org email
list. See http://lists.perl.org/ for more details.

Alternatively, log bugs via the CPAN RT system, on the web or by email:

    http://perl.dellah.org/rt/dtbuilder
    bug-datetime-format-builder@rt.cpan.org

This makes it much easier for me to track things and thus means
your problem is less likely to be neglected.

=head1 LICENSE AND COPYRIGHT

Copyright E<copy> Iain Truskett, 2003. All rights reserved.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.

The full text of the licenses can be found in the F<Artistic> and
F<COPYING> files included with this module.

=head1 AUTHOR

Iain Truskett <spoon@cpan.org>

=head1 TODO

=over 4

=item *

More tests.

=item *

strptime compatible parsing

=item *

strftime compatible formatting

=back

=head1 SEE ALSO

C<datetime@perl.org> mailing list.

L<http://datetime.perl.org/>

L<perl>, L<DateTime>

=cut