NAME

Date::Reformat - Rearrange date strings

SYNOPSIS

use Date::Reformat;

my $reformat = Date::Reformat->new(
    parser => {
        regex  => qr/^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)$/,
        params => [qw(year month day hour minute second)],
    },
    defaults => {
        time_zone => 'America/New_York',
    },
    transformations => [
        {
            from           => 'year',
            to             => 'century',
            transformation => sub { int($_[0]{year} / 100) },
        },
    ],
    formatter => {
        sprintf => '%s-%02d-%02dT%02d:%02d:%02d %s',
        params  => [qw(year month day hour minute second time_zone)],
    },
);

# Or, using strptime() and strftime() formats:
$reformat = Date::Reformat->new(
    parser => {
        strptime => '%Y-%m-%dT%H:%M:%S',
        # or heuristic => 'ymd', # http://www.postgresql.org/docs/9.2/static/datetime-input-rules.html
    },
    defaults => {
        time_zone => 'America/New_York',
    },
    formatter => {
        strftime => '%Y-%m-%dT%H:%M:%S %Z',
        # or data_structure => 'hashref',  # or 'hash' or 'arrayref' or 'array'
        # or coderef => sub { my ($y, $m, $d) = @_; DateTime->new(year => $y, month => $m, day => $d) },
        # params => [qw(year month day)],
    },
);

my $reformatted_string = $reformat->reformat_date($date_string);

DESCRIPTION

This module aims to be a lightweight and flexible tool for rearranging components of a date string, then returning the components in the order and structure specified.

My motivation was a month spent comparing spreadsheet data from several sources, each of which used a different date format.

Many modules already handle date math or parse one specific date format. I needed something that could take in pretty much any format and turn it into a single format that I could then use for comparison.

METHODS

new()

Returns a new reformatter instance.

my $reformat = Date::Reformat->new(
    'parser'          => $parsing_instructions,
    'transformations' => $transformation_instructions,
    'defaults'        => $default_values,
    'formatter'       => $formatting_instructions,
    'debug'           => 0,
);

Parameters:

parser

A hashref of instructions used to initialize a parser.

See "prepare_parser()" for details.

transformations

An arrayref of hashrefs containing instructions on how to convert values of one token into values for another token (such as month_abbr to month).

See "prepare_transformations()" for details.

defaults

A hashref specifying values to use if the date string does not contain a specific token (such as a time_zone value).

See "prepare_defaults()" for details.

formatter

A hashref of instructions used to initialize a formatter.

See "prepare_formatter()" for details.

debug

Either a 1 or a 0, to turn debugging on or off, respectively.

prepare_parser()

Builds a parser based on the given instructions. To add it to the currently active parsers, see "add_parser()".

If several parsers are active, the first one to successfully parse the current date string returns the results of the parse, and subsequent parsers are not utilized. See "parse_date()" for more information.
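
The first-success rule described above can be sketched in plain Perl. This is only an illustration under stated assumptions: the two parser coderefs and the first_successful_parse() helper are hypothetical stand-ins, not the module's internals.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for active parsers: each returns a hashref of
# tokens on success, or undef when the string does not match.
my @parsers = (
    sub {
        my ($string) = @_;
        return unless $string =~ /^(\d{4})-(\d\d)-(\d\d)$/;
        return { year => $1, month => $2, day => $3 };
    },
    sub {
        my ($string) = @_;
        return unless $string =~ m{^(\d\d)/(\d\d)/(\d{4})$};
        return { month => $1, day => $2, year => $3 };
    },
);

# Try each parser in order and stop at the first one that succeeds.
sub first_successful_parse {
    my ($string) = @_;
    for my $parser (@parsers) {
        my $tokens = $parser->($string);
        return $tokens if defined $tokens;
    }
    return;
}

my $tokens = first_successful_parse('07/04/2015');
printf "%s-%s-%s\n", @{$tokens}{qw(year month day)};
```

Here the first parser rejects the string, so the second one supplies the tokens.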

The types of parsers that can be initialized via this method are:

regex

The regex must specify what parts should be captured, and a list of token names must be supplied to identify which token each captured value will be assigned to.
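
As a rough sketch of how positional captures could be paired with token names (the module's own code may do this differently), a hash slice is enough:

```perl
use strict;
use warnings;

my $regex  = qr/^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)$/;
my @params = qw(year month day hour minute second);

# In list context a successful match returns the captured values in order.
my @captures = '2015-07-04T09:30:00' =~ $regex;

# Pair each capture with its token name using a hash slice.
my %date;
@date{@params} = @captures;

print "$date{year}-$date{month}-$date{day} at $date{hour}:$date{minute}\n";
```

The order of the names in params therefore has to match the order of the capture groups in the regex.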

$reformat->prepare_parser(
    {
        regex  => qr/^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)$/,
        params => [qw(year month day hour minute second)],
    },
);

regex with named capture

The regex must specify what parts should be captured, using named capture syntax.
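
With named captures no params list is needed, because Perl exposes the names through the %+ hash. A minimal sketch, using a shorter regex than the module example:

```perl
use strict;
use warnings;

my $regex = qr/^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)$/;

my %date;
if ('2015-07-04' =~ $regex) {
    # %+ holds the named captures of the last successful match; copy it
    # right away, because the next successful match replaces its contents.
    %date = %+;
}

print join(', ', map { "$_=$date{$_}" } sort keys %date), "\n";
```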

$reformat->prepare_parser(
    {
        regex  => qr/^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d) (?<hour>\d\d?):(?<minute>\d\d):(?<second>\d\d)$/,
    },
);

strptime

The format string must be in strptime() format.
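
One plausible way to support a strptime() format is to translate it into a named-capture regex. The sketch below assumes that approach; the %directive_to_regex table and the strptime_to_regex() function are hypothetical and cover only six directives.

```perl
use strict;
use warnings;

# Hypothetical lookup table: a few strptime directives, each mapped to a
# named capture group. A real parser would need many more entries.
my %directive_to_regex = (
    Y => '(?<year>\d{4})',
    m => '(?<month>\d\d)',
    d => '(?<day>\d\d)',
    H => '(?<hour>\d\d)',
    M => '(?<minute>\d\d)',
    S => '(?<second>\d\d)',
);

sub strptime_to_regex {
    my ($format) = @_;
    my $pattern = '';
    # Walk the format: translate known directives, escape everything else.
    for my $piece (split /(%[A-Za-z])/, $format) {
        next if $piece eq '';
        if ($piece =~ /^%([A-Za-z])$/) {
            my $fragment = $directive_to_regex{$1};
            die "Unsupported directive: $piece\n" unless defined $fragment;
            $pattern .= $fragment;
        }
        else {
            $pattern .= quotemeta $piece;
        }
    }
    return qr/^$pattern$/;
}

my $regex = strptime_to_regex('%Y-%m-%dT%H:%M:%S');
my %date;
%date = %+ if '2015-07-04T09:30:00' =~ $regex;
print "$date{hour}:$date{minute}\n";
```

Note that %H is the hour and %M the minute, so the usual time layout is %H:%M:%S.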

$reformat->prepare_parser(
    {
        strptime => '%Y-%m-%dT%H:%M:%S',
    },
);

heuristic

A hint must be provided that will help the parser determine the meaning of numbers if the ordering is ambiguous.

Currently the heuristic parsing mimics the PostgreSQL date parser (though I have not copied over all the test cases from the PostgreSQL regression tests, so there are likely to be differences/flaws).

$reformat->prepare_parser(
    {
        heuristic => 'ymd',  # or 'mdy' or 'dmy'
    },
);
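
To show why the hint matters, here is a deliberately simplified sketch in which the hint alone decides how three ambiguous numeric fields are labelled. The label_fields() helper is hypothetical, and the real heuristic parser applies many more rules than this.

```perl
use strict;
use warnings;

# Field order implied by each hint.
my %order_for_hint = (
    ymd => [qw(year month day)],
    mdy => [qw(month day year)],
    dmy => [qw(day month year)],
);

# Label the fields of an ambiguous date according to the hint.
sub label_fields {
    my ($hint, @fields) = @_;
    my $order = $order_for_hint{$hint}
        or die "Unknown hint: $hint\n";
    my %date;
    @date{@$order} = @fields;
    return \%date;
}

my @fields = split m{[/.-]}, '01/02/03';
for my $hint (qw(ymd mdy dmy)) {
    my $date = label_fields($hint, @fields);
    print "$hint: year=$date->{year} month=$date->{month} day=$date->{day}\n";
}
```

The same string yields three different dates, one per hint.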

Currently when the heuristic parser parses a date string, it creates a named regex parser which it injects into the active parsers directly in front of itself, so that subsequent date strings that are in the same format will be parsed via the regex.
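
The injection behaviour can be pictured with a small stand-alone sketch. All names here (heuristic_parser, parse, @active_parsers) are hypothetical, and the sketch assumes the heuristic parser is the last active parser.

```perl
use strict;
use warnings;

my @active_parsers;
my $heuristic_calls = 0;

# Hypothetical stand-in for the heuristic parser. It recognizes a single
# layout and, on success, registers a regex parser for that layout.
sub heuristic_parser {
    my ($string) = @_;
    $heuristic_calls++;
    return unless $string =~ m{^\d{4}/\d\d/\d\d$};

    my $regex = qr{^(?<year>\d{4})/(?<month>\d\d)/(?<day>\d\d)$};
    my $regex_parser = sub { return $_[0] =~ $regex ? +{ %+ } : undef };

    # The heuristic parser is assumed to be last, so inserting at the
    # final index places the new parser directly in front of it.
    splice @active_parsers, $#active_parsers, 0, $regex_parser;
    return $regex_parser->($string);
}
@active_parsers = (\&heuristic_parser);

sub parse {
    my ($string) = @_;
    my @snapshot = @active_parsers;    # the list may grow while we iterate
    for my $parser (@snapshot) {
        my $tokens = $parser->($string);
        return $tokens if $tokens;
    }
    return;
}

parse('2015/07/04');    # handled by the heuristic parser
parse('2016/01/31');    # handled by the generated regex parser
print "heuristic parser ran $heuristic_calls time(s)\n";
```

Only the first date string reaches the heuristic code; the second is caught by the generated regex parser.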

I plan to add a parameter that will control whether parsers are generated by the heuristic parser (I also plan to refactor that method quite a bit, because it kind of makes me cringe to look at it).

prepare_formatter()

Builds a formatter based on the given instructions. To add it to the currently active formatters, see "add_formatter()".

If several formatters are active, they are called in turn, each receiving the output from the previous formatter.
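
Chaining of this kind can be sketched as a simple pipeline. The two formatter coderefs and run_formatters() below are hypothetical examples, not the module's internals.

```perl
use strict;
use warnings;

# Hypothetical formatters: the first turns a token hashref into a string,
# the second post-processes that string.
my @formatters = (
    sub {
        my ($date) = @_;
        return sprintf '%04d-%02d-%02d', @{$date}{qw(year month day)};
    },
    sub {
        my ($string) = @_;
        return "[$string]";
    },
);

# Feed the output of each formatter into the next one.
sub run_formatters {
    my ($value) = @_;
    $value = $_->($value) for @formatters;
    return $value;
}

print run_formatters({ year => 2015, month => 7, day => 4 }), "\n";
```

The final formatter's return value is what the caller receives, here the string "[2015-07-04]".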

The types of formatters that can be initialized via this method are:

sprintf

The format string must be in sprintf() format, and a list of token names must be supplied to identify which token values to send to the formatter.

$reformat->prepare_formatter(
    {
        sprintf => '%s-%02d-%02dT%02d:%02d:%02d %s',
        params  => [qw(year month day hour minute second time_zone)],
    },
);

strftime

The format string must be in strftime() format.

$reformat->prepare_formatter(
    {
        strftime => '%Y-%m-%dT%H:%M:%S %Z',
    },
);

data_structure

The type of the desired data structure must be specified, along with a list of token names identifying which token values to include in the data structure.

Valid data structure types are:

hash
hashref
array
arrayref

$reformat->prepare_formatter(
    {
        data_structure => 'hashref',
        params         => [qw(year month day hour minute second time_zone)],
    },
);

coderef

The supplied coderef will be passed the token values specified. Whatever the coderef returns will be passed to the next active formatter, or will be returned, if this is the final formatter.

$reformat->prepare_formatter(
    {
        coderef => sub { my ($y, $m, $d) = @_; DateTime->new(year => $y, month => $m, day => $d) },
        params  => [qw(year month day)],
    },
);

prepare_transformations()

Accepts an arrayref of hashrefs that specify how to transform token values from one token type to another.

Returns the same arrayref. To add it to the currently active transformers, see "add_transformations()".

add_transformations()

Accepts an arrayref of hashrefs that specify how to transform token values from one token type to another. Adds each transformation instruction to the list of active transformers. A transformation instruction with the same to and from values as a previous instruction will overwrite the previous version.

$reformat->add_transformations(
    [
        {
            'to'             => 'hour',
            'from'           => 'hour_12',
            'transformation' => sub {
                my ($date) = @_;
                # Use the value of $date->{'hour_12'} (and $date->{'am_or_pm'})
                # to calculate what the value of $date->{'hour'} should be.
                # ...
                return $hour;
            },
        },
    ],
);

The values in each hashref are:

to

The name of the token type that is desired (for instance 'hour', meaning the 24-hour format).

from

The name of the token type that is available in the date string (for instance 'hour_12', meaning the 12-hour format).

transformation

A coderef which accepts a hashref containing the information which has been parsed out of the date string. The coderef is expected to examine the date information, transform the token type specified via from into the correct value for the token type specified via to, and return that value.

Several transformations have been built into this module. Search for $DEFAULT_TRANSFORMATIONS in the source code.

Transformations added via this method will take precedence over built-in transformations.
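
As an illustration, a complete hour_12 to hour transformation might look like the sketch below. The am_or_pm values ('am' or 'pm', in any case) and the two lookup hashes are assumptions made for this example; they are not taken from the module's source.

```perl
use strict;
use warnings;

# A transformation coderef receives the parsed tokens as a hashref and
# returns the value for the "to" token. The am_or_pm values ('am'/'pm',
# any case) are an assumption made for this sketch.
my $hour_12_to_hour = sub {
    my ($date) = @_;
    my $hour = $date->{'hour_12'} % 12;    # 12 wraps to 0 for both am and pm
    $hour += 12 if lc($date->{'am_or_pm'} // '') eq 'pm';
    return $hour;
};

# Hypothetical registries keyed by "to", then "from".
my %built_in = (
    hour => { hour_12 => sub { return $_[0]{'hour_12'} } },    # naive placeholder
);
my %custom = (
    hour => { hour_12 => $hour_12_to_hour },
);

# Custom transformations are consulted before built-in ones.
sub find_transformation {
    my ($to, $from) = @_;
    return $custom{$to}{$from} // $built_in{$to}{$from};
}

for my $case ([12, 'am'], [9, 'AM'], [12, 'pm'], [11, 'pm']) {
    my %date = (hour_12 => $case->[0], am_or_pm => $case->[1]);
    my $hour = find_transformation('hour', 'hour_12')->(\%date);
    printf "%2d %s -> %02d\n", @$case, $hour;
}
```

With this coderef, 12 am maps to hour 0 and 11 pm maps to hour 23.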

prepare_defaults()

Accepts a hashref of default values to use when transforming or formatting a date which is missing tokens that are needed.

This method clears out any defaults which had been set previously.

Returns the same hashref it was given, but does not set the defaults. To add defaults, see "add_defaults()".

add_defaults()

Accepts a hashref of default values to use when transforming or formatting a date which is missing tokens that are needed.

Each key should be the name of a token, and the corresponding value is the default value that will be used when a date is missing that token.
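
Conceptually, defaults only fill in tokens that parsing did not supply. A minimal sketch of that idea, assuming parsed values always win over defaults:

```perl
use strict;
use warnings;

my %defaults = ( time_zone => 'America/New_York' );
my %parsed   = ( year => 2015, month => 7, day => 4 );

# Later pairs override earlier ones, so parsed tokens win and defaults
# only fill in what is missing.
my %date = ( %defaults, %parsed );

# A date string that carries its own time zone keeps it.
my %with_zone = ( %defaults, %parsed, time_zone => 'UTC' );

print "$date{year}-$date{month}-$date{day} $date{time_zone}\n";
print "$with_zone{year}-$with_zone{month}-$with_zone{day} $with_zone{time_zone}\n";
```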

$reformat->add_defaults(
    {
        'time_zone' => 'America/New_York',
    },
);

debug()

Turns debugging statements on or off, or returns the current debug setting.

Expects a true value to turn debugging on, and a false value to turn debugging off.

$reformat->debug(1);  # 1 or 0

prepare_parser_for_regex_with_params()

Internal method called by "prepare_parser()".

prepare_parser_for_regex_named_capture()

Internal method called by "prepare_parser()".

prepare_parser_for_strptime()

Internal method called by "prepare_parser()".

prepare_parser_heuristic()

Internal method called by "prepare_parser()".

prepare_formatter_for_arrayref()

Internal method called by "prepare_formatter()".

prepare_formatter_for_hashref()

Internal method called by "prepare_formatter()".

prepare_formatter_for_coderef()

Internal method called by "prepare_formatter()".

prepare_formatter_for_sprintf()

Internal method called by "prepare_formatter()".

prepare_formatter_for_strftime()

Internal method called by "prepare_formatter()".

strptime_token_to_regex()

Internal method called by "prepare_parser()".

strftime_token_to_internal()

Internal method called by "prepare_formatter()".

transform_token_value()

Internal method called by "prepare_formatter()".

most_likely_token()

Internal method called by "prepare_parser()".

add_parser()

Adds a parser to the active parsers. When parsing a date string, the parser will be called only if each preceding parser has failed to parse the date.

See "prepare_parser()" for generating a parser in the correct format.

$reformat->add_parser(
    $reformat->prepare_parser( ... ),
);

add_formatter()

Adds a formatter to the active formatters. When formatting a date, the formatter will be called after each preceding formatter, receiving as input the output from the previous formatter.

See "prepare_formatter()" for generating a formatter in the correct format.

$reformat->add_formatter(
    $reformat->prepare_formatter( ... ),
);

parse_date()

Given a date string, attempts to parse it via the active parsers. Returns a hashref containing the tokens that were extracted from the date string.

my $date_hashref = $reformat->parse_date($date_string);

format_date()

Given a hashref containing the tokens that were extracted from a date string, formats the date using each of the active formatters, passing the output from the previous formatter to the next formatter.

my $date_string = $reformat->format_date($date_hashref);

reformat_date()

Given a date string, attempts to parse it and format it using the active parsers and formatters.

my $date_string = $reformat->reformat_date($date_string);

SEE ALSO

Date::Transform
Date::Parse
Date::Format
DateTime::Format::Flexible
DateTime::Format::Builder

AUTHOR

Nathan Gray <kolibrie@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2015 by Nathan Gray

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.