NAME
Genealogy::Occupation - Normalise and translate genealogical occupation strings
VERSION
Version 0.02
SYNOPSIS
use Genealogy::Occupation;
my $normaliser = Genealogy::Occupation->new();
my @occupations = $normaliser->normalise(
occupation => 'Ag Lab',
sex => 'M',
);
# Returns ('Agricultural Labourer')
# Or pass an arrayref
my @more = $normaliser->normalise(
occupation => ['Ag Lab', 'Ag Lab', 'Retired'],
sex => 'M',
);
# Returns ('Agricultural Labourer') - deduplicated and filtered
DESCRIPTION
Normalises occupation strings found in genealogical records, handling common abbreviations, malformed entries, locale-specific spellings and translations into French and German.
Designed to handle poor-quality data from genealogy software imports where occupation strings may be abbreviated, inconsistent or use archaic terminology.
Processing steps applied in order:
- 1. Filter out non-occupations (Scholar, Retired, Domestic Duties etc)
- 2. Normalise abbreviations and malformed entries to canonical forms
- 3. Deduplicate consecutive identical or equivalent entries (compared on pre-translation normalised forms)
- 4. Apply locale-specific spellings via
Lingua::EN::ABC - 5. Translate to French or German if system locale requires it
METHODS
new
Purpose
Constructs a new normaliser object.
API Specification
Input
{
warn_on_error => {
type => 'boolean',
optional => 1,
default => 0,
},
}
Output
{ type => 'object', isa => 'Genealogy::Occupation' }
Arguments
warn_on_error- If true, unknown occupations that cannot be translated will emit a warning viacarprather than silently falling back to English. Optional, defaults to 0.
Returns
A blessed Genealogy::Occupation object.
Side Effects
None.
Notes
The system locale is detected once at construction time and cached for the lifetime of the object.
Example
my $normaliser = Genealogy::Occupation->new({
warn_on_error => 1,
});
normalise
Purpose
Normalises one or more occupation strings, applying filtering, deduplication, abbreviation expansion, locale spelling and translation in order.
API Specification
Input
{
occupation => {
type => ['string', 'arrayref'],
},
sex => {
type => 'string',
optional => 1,
memberof => ['M', 'F'],
},
}
Output
{
type => 'arrayref',
element_type => 'string',
}
Arguments
occupation- A single occupation string or an arrayref of occupation strings. Required.sex- The sex of the person,'M'or'F'. Optional but required for correct gendered translations in French and German. Defaults to'M'if not provided when a gendered translation is needed.
Returns
An arrayref of normalised occupation strings. May be empty if all occupations were filtered out.
Side Effects
If warn_on_error was set at construction and an occupation cannot be translated, emits a warning via carp.
Notes
Deduplication operates across the full list of occupations passed in. Processing a single occupation at a time will not deduplicate across multiple calls.
Deduplication compares the pre-translation normalised English forms, not the translated output. This means two consecutive identical English occupations correctly collapse to one entry even in French or German locales, where the translated results stored in the output array would otherwise never match the incoming English string.
Example
my $result = $normaliser->normalise(
occupation => ['Ag Lab', 'Ag Lab', 'Retired'],
sex => 'M',
);
# Returns ['Agricultural Labourer']
my $result = $normaliser->normalise(
occupation => 'Platelayer Railway',
);
# Returns ['Railway Platelayer']
AUTHOR
Nigel Horne <njh@bandsman.co.uk>
BUGS
Please report bugs via the GitHub issue tracker: https://github.com/nigelhorne/Genealogy-Occupation/issues
TODO
Expand French and German translation tables
Add support for additional languages
Add
normalise_place()equivalent for occupation place strings
SEE ALSO
LICENSE AND COPYRIGHT
Copyright 2026 Nigel Horne.
This program is released under the following licence: GPL2 If you use it, please let me know.