NAME
Data::TableReader::Field - Field specification for Data::TableReader
VERSION
version 0.010
DESCRIPTION
This class describes aspects of one of the fields you want to find in your spreadsheet.
ATTRIBUTES
name
Required. Used for the hashref key if you pull records as hashes, and used in diagnostic messages.
header
A string or regex describing the column header you want to find in the spreadsheet. If you specify a regex, it is used directly. If you specify a string, it becomes a regex matching any string that has the same words (\w+) and other non-whitespace (\S+) characters in the same order, case-insensitively, with any amount of non-alphanumeric garbage ([\W_]*) between and around them. When no header is specified, the "name" is used as the string, after first breaking it into words on underscore, camel-case, or numeric boundaries.
This deserves some examples:
    Name          Implied Default Header
    "zipcode"     "zipcode"
    "ZipCode"     "Zip Code"
    "Zip_Code"    "Zip Code"
    "zip5"        "zip 5"

    Header          Regex                                       Could Match...
    "ZipCode"       /^[\W_]*ZipCode[\W_]*$/i                    "zipcode:"
    "zip_code"      /^[\W_]*zip_code[\W_]*$/i                   "--ZIP_CODE--"
    "zip code"      /^[\W_]*zip[\W_]*code[\W_]*$/i              "ZIP\nCODE "
    "zip-code"      /^[\W_]*zip[\W_]*-[\W_]*code[\W_]*$/i       "ZIP-CODE:"
    qr/Zip.*Code/   /Zip.*Code/                                 "Post(Zip)Code"
If this default matching doesn't meet your needs or paranoia level, then you should always specify your own header regexes.
(If your data doesn't actually have any header at all and you want to brazenly assume the columns match the fields, see the reader attribute "header_row_at" in Data::TableReader.)
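The default matching described above can be approximated in a few lines of plain Perl. This is only a sketch consistent with the tables in this section, not the module's own code: the helper names default_header_for and header_to_regex are hypothetical, and the tokenizing rule (runs of word characters, or runs of punctuation) is an assumption inferred from the examples.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the implied default header from a field name.
sub default_header_for {
    my ($name) = @_;
    my $h = $name;
    $h =~ s/([[:lower:]])([[:upper:]])/$1 $2/g;   # camel-case boundary
    $h =~ s/([[:alpha:]])([[:digit:]])/$1 $2/g;   # letter-to-digit boundary
    $h =~ s/([[:digit:]])([[:alpha:]])/$1 $2/g;   # digit-to-letter boundary
    $h =~ s/_+/ /g;                               # underscores become spaces
    return $h;
}

# Hypothetical helper: coerce a header (string or regex) to a regex.
sub header_to_regex {
    my ($header) = @_;
    return $header if ref $header eq 'Regexp';    # a regex is used directly
    # Assumption: tokens are runs of word characters or runs of punctuation.
    my @tokens  = $header =~ /(\w+|[^\w\s]+)/g;
    my $pattern = join '[\W_]*', map { quotemeta($_) } @tokens;
    return qr/^[\W_]*${pattern}[\W_]*$/i;
}

for my $name (qw( zipcode ZipCode Zip_Code zip5 )) {
    printf "%-10s => %s\n", $name, header_to_regex(default_header_for($name));
}
```

If this approximation is looser or stricter than you want, passing your own qr// as the header avoids the guesswork entirely.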
required
Whether or not this field must be found in order to detect a table. Default is true. Note that this does not require the field of a row to contain data in order to read a record from the table; it just requires a column to exist.
trim
Whether or not to remove prefix/suffix whitespace from each value of the field. Default is true.
blank
The value to extract when the spreadsheet cell is empty (where "empty" depends on the value of trim). Default is undef. Another common value would be "".
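How trim and blank interact can be illustrated with a small stand-alone function. This is a sketch under assumptions, not the module's implementation: extract_value is a hypothetical helper, and it assumes a cell counts as "empty" when it is undef or zero-length after the optional trimming.

```perl
use strict;
use warnings;

# Hypothetical helper: apply "trim" and "blank" to one raw cell value.
sub extract_value {
    my ($raw, %opt) = @_;
    my $trim  = exists $opt{trim} ? $opt{trim} : 1;   # documented default: true
    my $blank = $opt{blank};                          # documented default: undef
    my $value = $raw;
    # Remove leading and trailing whitespace only when trim is enabled.
    $value =~ s/^\s+|\s+$//g if defined $value && $trim;
    # Assumption: "empty" means undef or zero-length at this point.
    return $blank unless defined $value && length $value;
    return $value;
}

my $name  = extract_value("  Alice ");               # trimmed value
my $empty = extract_value("   ", blank => "");       # whitespace-only cell
my $kept  = extract_value("   ", trim => 0);         # not empty when trim is off
```

Under these assumptions, a whitespace-only cell yields the blank value when trim is on, but is kept verbatim when trim is off.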
type
A Type::Tiny type (or any object or class with a validate method) or a coderef which returns a validation error message (undef if it is valid).
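    use Types::Standard qw( Maybe Int );   # import the type names used below
    ...
    type => Maybe[Int],
    # or without Type::Tiny: return undef when valid, else an error message
    type => sub { $_[0] =~ /^\w+\z/ ? undef : "word-characters only" },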
This is an optional feature and there is no default. The behavior of a validation failure depends on the options to TableReader.
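The two accepted forms of type can be handled uniformly by a caller. The following is a minimal sketch, assuming only what is stated above (a coderef returns an error message or undef, and anything else offers a validate method with the same contract); validation_error and My::IntType are hypothetical names used for illustration, and My::IntType merely stands in for a Type::Tiny type.

```perl
use strict;
use warnings;

# Hypothetical helper: return an error message, or undef when the value is valid.
sub validation_error {
    my ($type, $value) = @_;
    return undef unless defined $type;               # no type given: nothing to check
    return $type->($value) if ref $type eq 'CODE';   # coderef form
    return $type->validate($value);                  # object or class with validate
}

# Stand-in class with a validate method, used in place of a Type::Tiny type.
{
    package My::IntType;
    sub validate {
        my ($class, $value) = @_;
        return $value =~ /^-?\d+\z/ ? undef : "not an integer";
    }
}

my $words_only = sub { $_[0] =~ /^\w+\z/ ? undef : "word-characters only" };

for my $check ([ $words_only, "abc" ], [ $words_only, "a b" ], [ 'My::IntType', "12x" ]) {
    my $err = validation_error(@$check);
    print defined $err ? "invalid: $err\n" : "valid\n";
}
```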
array
Boolean of whether this field can be found multiple times in one table. Default is false. If true, the value of the field will always be an arrayref (even if only one column matched).
follows
Name (or arrayref of names) of a field which this field must follow, in a first-to-last ordering of the columns. This field must occur immediately after the named field(s), or after another field which also has a follows restriction and follows the named field(s).
The purpose of this attribute is to resolve ambiguous columns. Suppose you expect columns with the following headers:
    Father    |          |      |       | Mother    |          |      |
    FirstName | LastName | Tel. | Email | FirstName | LastName | Tel. | Email

You can use qr/Father\nFirstName/ to identify the first column, but after FirstName the rest are ambiguous. But, TableReader can figure it out if you say:

    { name => 'father_first', header => qr/Father\nFirstName/ },
    { name => 'father_last',  header => 'LastName', follows => 'father_first' },
    { name => 'father_tel',   header => 'Tel.',     follows => 'father_first' },
    { name => 'father_email', header => 'Email',    follows => 'father_first' },
    ...

and so on. Note how 'father_first' is used for each as the follows name; this way if any non-required fields (like maybe Tel.) are completely removed from the file, TableReader will still be able to find LastName and Email.
You can also use this to accumulate an array of columns that lack headers:
    Scores |     |     |     |     |     |     | OtherData
    12%    | 35% | 42% | 18% | 65% | 99% | 55% | xyz

    { name => 'scores', array => 1, trim => 1 },
    { name => 'scores', array => 1, trim => 1, header => '', follows => 'scores' },

The second field definition has an empty header, which would normally make it rather ambiguous and potentially capture blank-header columns that might not be part of the array. But, because it must follow a column named 'scores', there's no ambiguity; you get exactly the columns starting from the header 'Scores' and continuing until a column with any other header.
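To make the idea behind follows concrete, here is a deliberately simplified resolver. It is a sketch, not the algorithm Data::TableReader actually uses: resolve_columns is a hypothetical function, it assumes each header has already been coerced to a regex and combined into one string per column, and it supports only a single follows name (not an arrayref).

```perl
use strict;
use warnings;

# Hypothetical, simplified resolver (not the module's actual algorithm).
# $fields:  arrayref of { name, header => qr/.../, follows => $name_or_undef }
# $headers: arrayref of header strings, one per column, left to right
# Returns an arrayref holding the matched field name (or undef) per column.
sub resolve_columns {
    my ($fields, $headers) = @_;
    my (@assigned, %chain);   # %chain: names seen in the current unbroken run
    for my $header (@$headers) {
        # A field qualifies if its regex matches and its "follows" (if any)
        # names a field already seen in the current run.
        my @candidates = grep {
            $header =~ $_->{header}
                and (!defined $_->{follows} or $chain{ $_->{follows} })
        } @$fields;
        if (@candidates == 1) {
            my $field = $candidates[0];
            # A field without "follows" starts a new run; one with it extends the run.
            %chain = () unless defined $field->{follows};
            $chain{ $field->{name} } = 1;
            push @assigned, $field->{name};
        }
        else {
            %chain = ();      # an unmatched or ambiguous column breaks the run
            push @assigned, undef;
        }
    }
    return \@assigned;
}

my @fields = (
    { name => 'father_first', header => qr/Father\nFirstName/ },
    { name => 'father_last',  header => qr/^LastName$/, follows => 'father_first' },
    { name => 'father_tel',   header => qr/^Tel\.$/,    follows => 'father_first' },
    { name => 'father_email', header => qr/^Email$/,    follows => 'father_first' },
    { name => 'mother_first', header => qr/Mother\nFirstName/ },
    { name => 'mother_last',  header => qr/^LastName$/, follows => 'mother_first' },
    { name => 'mother_tel',   header => qr/^Tel\.$/,    follows => 'mother_first' },
    { name => 'mother_email', header => qr/^Email$/,    follows => 'mother_first' },
);

# The father's "Tel." column has been removed from this file.
my @headers  = ("Father\nFirstName", "LastName", "Email",
                "Mother\nFirstName", "LastName", "Tel.", "Email");
my $resolved = resolve_columns(\@fields, \@headers);
print join(', ', map { defined $_ ? $_ : '-' } @$resolved), "\n";
```

Because every father field names 'father_first' rather than its immediate neighbor, the missing Tel. column does not break the run in this sketch, and the second LastName is still assigned to the mother.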
follows_list
Convenience accessor for @{ ->follows }.
header_regex
The "header" attribute, coerced to a regex if it wasn't one already.
AUTHOR
Michael Conrad <mike@nrdvana.net>
COPYRIGHT AND LICENSE
This software is copyright (c) 2019 by Michael Conrad.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.