NAME
Hash::Type - pseudo-hashes as arrays tied to a "type" (list of fields)
SYNOPSIS
use Hash::Type;
# create a Hash::Type
my $personType = new Hash::Type(qw(firstname lastname city));
# create and populate some hashes tied to $personType
tie %wolfgang, $personType, "wolfgang amadeus", "mozart", "salzburg";
$ludwig = new $personType ("ludwig", "van beethoven", "vienna");
$jsb = new $personType;
$jsb->{city} = "leipzig";
@{$jsb}{qw(firstname lastname)} = ("johann sebastian", "bach");
# add fields dynamically
$personType->add("birth", "death") or die "fields not added";
$wolfgang{birth} = 1756;
# More complete example : read a flat file with headers on first line
my ($headerline, @datalines) = map {chomp; $_} <F>;
my $ht = new Hash::Type(split /\t/, $headerline);
foreach my $line (@datalines) {
  my $data = new $ht(split /\t/, $line);
  work_with($data->{someField}, $data->{someOtherField});
}
# an alternative to Time::gmtime and Time::localtime
my $timeType = new Hash::Type qw(sec min hour mday mon year wday yday);
my $localtime = new $timeType (localtime);
my $gmtime = new $timeType (gmtime);
print $localtime->{hour} - $gmtime->{hour}, " hours difference to GMT";
# comparison functions
my $byAge = $personType->cmp("birth : -num, lastname, firstname");
my $byNameLength = $personType->cmp(lastname => sub {length($b) <=> length($a)},
                                    lastname => 'alpha',
                                    firstname => 'alpha');
showPerson($_) foreach (sort $byAge @people);
showPerson($_) foreach (sort $byNameLength @people);
# special comparisons : dates
my $US_DateCmp = $myHashType->cmp("someDateField : m/d/y");
my $FR_InverseDateCmp = $myHashType->cmp("someDateField : -d.m.y");
DESCRIPTION
A Hash::Type is a collection of field names. Internally, an index is associated with each name. Such collections are created dynamically and can be extended. They are used to build tied hashes, either through tie or through object-oriented method calls. Such tied hashes:
are 'restricted' (they will only accept operations on names previously declared in their Hash::Type);
are implemented internally as arrays (so they use less memory);
can be sorted efficiently through comparison functions generated and compiled by the class.
The 'pseudo-hashes' in core Perl were very similar, but they are deprecated starting from Perl 5.8.0. More on comparison with other packages in section "SEE ALSO".
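The idea behind these properties can be sketched in plain Perl, without the module. This is only an illustration under stated assumptions, not the implementation of Hash::Type: the map %index_of and the helper field() are hypothetical names introduced here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a "type": one shared map from field name to array index.
my %index_of = (firstname => 0, lastname => 1, city => 2);

# Each record is a plain array; the key names are stored once, in %index_of.
my @mozart    = ("wolfgang amadeus", "mozart", "salzburg");
my @beethoven = ("ludwig", "van beethoven", "vienna");

# Hypothetical accessor: translate a name to an index, refusing undeclared names.
sub field {
    my ($record, $name) = @_;
    die "undeclared field '$name'\n" unless exists $index_of{$name};
    return $record->[ $index_of{$name} ];
}

print field(\@mozart, 'city'), "\n";
print field(\@beethoven, 'lastname'), "\n";

# Asking for an undeclared name dies, which mimics a 'restricted' hash.
my $ok = eval { field(\@mozart, 'country'); 1 };
print "error: $@" unless $ok;
```

Because all records share the same map, declaring a new field costs one map entry rather than one key per record.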
METHODS
$myType = new Hash::Type(@names)
-
Creates a new object which holds a collection of names and associated indices (technically, this is a hash reference blessed in package Hash::Type). This object can then be used to generate tied hashes. The list of @names is optional; names can be added later through method add.
$h = new $myType(@vals)
-
Creates a new tied hash associated to package Hash::Type and containing a reference to $myType (technically, this is an array reference, tied to package Hash::Type).
The other way to create a tied hash is through the tie syntax:
tie %h, $myType, @vals;
Access to $h{name} is equivalent to writing
tied(%h)->[$myType->{name}]
so this will generate an error if name was not declared in $myType.
$h{'Hash::Type'} is a special, predefined name that gives back the object to which this hash is tied (you may need it for example to generate a comparison function, see below).
The operation delete $h{name} is forbidden. To delete a value, you have to go to the underlying array:
delete tied(%h)->[$myType->{name}];
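To make the equivalence above more concrete, here is a minimal sketch of a tied hash whose storage is an array. The class My::RowSketch and its layout (index map in slot 0, values from slot 1) are assumptions made for this illustration; they are not taken from the Hash::Type source.

```perl
use strict;
use warnings;

package My::RowSketch;    # hypothetical class, not part of Hash::Type

# Usage: tie %h, 'My::RowSketch', \%index_of, @values
sub TIEHASH {
    my ($class, $index_of, @vals) = @_;
    return bless [ $index_of, @vals ], $class;    # slot 0 holds the shared map
}

# Translate a field name to an array index, or die for undeclared names.
sub _idx {
    my ($self, $name) = @_;
    my $i = $self->[0]{$name};
    die "undeclared field '$name'\n" unless defined $i;
    return $i;
}

sub FETCH  { my ($self, $name) = @_;       return $self->[ $self->_idx($name) ] }
sub STORE  { my ($self, $name, $val) = @_; $self->[ $self->_idx($name) ] = $val }
sub EXISTS { my ($self, $name) = @_;       return exists $self->[0]{$name} }
sub DELETE { die "delete is forbidden on this tied hash\n" }

package main;

my %index_of = (firstname => 1, lastname => 2, city => 3);
tie my %h, 'My::RowSketch', \%index_of, "johann sebastian", "bach", "leipzig";

print "$h{city}\n";
$h{city} = "eisenach";
print tied(%h)->[ $index_of{city} ], "\n";    # same slot, reached through the tied object

eval { my $x = $h{country} };  print "error: $@" if $@;
eval { delete $h{city} };      print "error: $@" if $@;
```

In this sketch every hash access goes through one name-to-index lookup in the shared map, followed by a plain array access.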
$myType->add(@newNames)
-
Adds @newNames in $myType and gives them new indices. Does nothing for names that were already present. Returns the number of names actually added.
You can also dynamically remove names by writing delete $myType->{name}; however, this merely masks access to {name} for all hashes tied to $myType, so the values are still present in the underlying arrays and you will not gain any memory by doing this.
After deleting {name}, you can again call $myType->add('name'), but this will allocate a new index, and not recover the previous one allocated to that key.
$myType->names
-
Returns the list of defined names, in index order (which might be different from (keys %$myType)).
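The index allocation rules described for add, delete and names can be modelled in a few lines of plain Perl. This is a sketch under assumptions: %index_of, $next_index, add_names() and names_in_order() are hypothetical names, and starting the indices at 1 is an arbitrary choice of this example.

```perl
use strict;
use warnings;

my %index_of;          # hypothetical name => index map
my $next_index = 1;    # indices are never reused in this sketch

# Give a fresh index to each name not already present; return how many were added.
sub add_names {
    my $added = 0;
    for my $name (@_) {
        next if exists $index_of{$name};
        $index_of{$name} = $next_index++;
        $added++;
    }
    return $added;
}

# Names in index order, which may differ from the order of keys %index_of.
sub names_in_order {
    return sort { $index_of{$a} <=> $index_of{$b} } keys %index_of;
}

print add_names(qw(firstname lastname city)), "\n";    # 3 names added
print add_names(qw(city birth)), "\n";                 # only 'birth' is new
delete $index_of{city};                                # masks 'city'; old slot stays unused
print add_names('city'), "\n";                         # re-added with a new index
print join(",", names_in_order()), "\n";
```

After the last call, 'city' sorts after 'birth' because it received a new index rather than its original one.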
$cmp = $myType->cmp("f1 : cmp1, f2 : cmp2 , ...")
-
Returns a reference to an anonymous sub which successively compares the given field names, applying the given operators, and returns a positive, negative or zero value. This sub can then be fed to sort. 'f1', 'f2', etc. are field names; 'cmp1', 'cmp2' are comparison operators written as:
[+|-] [alpha|num|cmp|<=>|d.m.y|d/m/y|y-m-d|...]
The sign is '+' for ascending order, '-' for descending; default is '+'. Operator 'alpha' is a synonym for 'cmp' and 'num' is a synonym for '<=>'; operators 'd.m.y', 'd/m/y', etc. are for dates in various formats; default is 'alpha'.
If all you want is alphabetic ascending order, just write the field names :
$cmp = $personType->cmp('lastname', 'firstname');
Note: sort will not accept something like
sort $personType->cmp('lastname', 'firstname') @people;
so you have to store it in a variable first:
my $cmp = $personType->cmp('lastname', 'firstname');
sort $cmp @people;
For date comparisons, values are parsed into day/month/year according to the shape specified (for example, 'd.m.y' will take '.' as a separator). Day, month or year need not be several digits, so '1.1.1' will be interpreted as '01.01.2001'. Years of 2 or 1 digits are mapped to 2000 or 1900, with pivot at 33 (so 32 becomes 2032 and 33 becomes 1933).
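The year pivot rule can be checked with a small worked example. The helper dmy_key() below is hypothetical and handles only the 'd.m.y' shape; it is one possible way to apply the rule, not the code generated by Hash::Type.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a 'd.m.y' string into a sortable yyyymmdd number.
sub dmy_key {
    my ($date) = @_;
    my ($d, $m, $y) = $date =~ /^(\d+)\.(\d+)\.(\d+)$/
        or die "cannot parse date '$date'\n";
    # Years of 1 or 2 digits: below the pivot (33) map to 20xx, others to 19xx.
    $y += ($y < 33 ? 2000 : 1900) if length($y) <= 2;
    return sprintf "%04d%02d%02d", $y, $m, $d;
}

my @dates  = ('1.1.33', '31.12.32', '1.1.1');
my @sorted = sort { dmy_key($a) <=> dmy_key($b) } @dates;

print "$_ => ", dmy_key($_), "\n" for @dates;
print join(", ", @sorted), "\n";    # 1.1.33, 1.1.1, 31.12.32
```

Here '1.1.33' sorts first because 33 is mapped to 1933, while '31.12.32' sorts last because 32 is mapped to 2032.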
$cmp = $myType->cmp(f1 => cmp1, f2 => cmp2, ...)
-
This second syntax, with pairs of field names and operators, is a bit more verbose but gives you more flexibility, as you can write your own comparison functions using $a and $b:
my $byNameLength = $personType->cmp(lastname => sub {length($b) <=> length($a)},
                                    lastname => 'alpha',
                                    firstname => 'alpha');
Note: the resulting closure is bound to the special variables $a and $b. Since those are different in each package, you cannot pass the comparison function to another package: the call to sort has to be done here.
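The package binding of $a and $b is a general property of Perl's sort, and can be reproduced without the module. In this sketch the package Sketch::Cmp, the variable $by_length and the sub sorted_here() are hypothetical names. The last statement shows one possible workaround (copying the caller's $a and $b into the comparator's package), which is an assumption of this example rather than something the module documents.

```perl
use strict;
use warnings;

package Sketch::Cmp;    # hypothetical package in which the comparator is compiled

# This closure reads $Sketch::Cmp::a and $Sketch::Cmp::b, not $main::a and $main::b.
my $by_length = sub { length($a) <=> length($b) };

# sort is called in the same package as the comparator, so it sets the right $a and $b.
sub sorted_here { return sort $by_length @_ }

package main;

my @words = qw(ccc a bb);
print join(",", Sketch::Cmp::sorted_here(@words)), "\n";

# Called from package main, sort sets $main::a and $main::b, which the closure never
# reads; a wrapper block can copy them into the comparator's package first.
my @fixed = sort {
    local ($Sketch::Cmp::a, $Sketch::Cmp::b) = ($a, $b);
    $by_length->();
} @words;
print join(",", @fixed), "\n";
```

Both calls order the words by increasing length; the first is the simpler option whenever the sort can be done in the package that built the comparator.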
CAVEATS
The implementation of 'each', 'keys', 'values' on tied hashes calls the corresponding operations on the Hash::Type object; therefore, nested 'each' on several tied hashes won't work.
SEE ALSO
The 'pseudo-hashes' documented in perlref are very similar, but are deprecated starting from Perl 5.8.0. Each pseudo-hash holds its own copy of key names in position 0 of the underlying array, whereas hashes tied to Hash::Type hold a reference to a shared collection of keys.
Typed references together with the use fields pragma provide support for compile-time translation of key names to array indices; see fields. This will be faster, but will not help if field names are only known at runtime (like in the flat file parsing example of the synopsis).
For other ways to restrict the keys of a hash to a fixed set, see "lock_keys" in Hash::Util, Tie::Hash::FixedKeys, Tie::StrictHash.
The Sort::Fields module in CPAN uses similar techniques for dynamically building sorting criteria according to field positions; but it is intended for numbered fields, not for named fields, and has no support for caller-supplied comparison operators. The design is also a bit different: fieldsort does everything at once (splitting, comparing and sorting), whereas Hash::Type::cmp only compares, and leaves it to the caller to do the rest.
Hash::Type was primarily designed as a core element for implementing rows of data in File::Tabular.
AUTHOR
Laurent Dami, <laurent.dami AT etat geneve ch>
COPYRIGHT AND LICENSE
Copyright 2005 by Laurent Dami.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.