NAME

Genealogy::Occupation - Normalise and translate genealogical occupation strings

VERSION

Version 0.02

SYNOPSIS

use Genealogy::Occupation;

my $normaliser = Genealogy::Occupation->new();

my $occupations = $normaliser->normalise(
    occupation => 'Ag Lab',
    sex        => 'M',
);
# Returns ['Agricultural Labourer']

# Or pass an arrayref
my $more = $normaliser->normalise(
    occupation => ['Ag Lab', 'Ag Lab', 'Retired'],
    sex        => 'M',
);
# Returns ['Agricultural Labourer'] - deduplicated and filtered

DESCRIPTION

Normalises occupation strings found in genealogical records, handling common abbreviations, malformed entries, locale-specific spellings and translations into French and German.

Designed to handle poor-quality data from genealogy software imports where occupation strings may be abbreviated, inconsistent or use archaic terminology.

Processing steps are applied in order: filtering, deduplication, abbreviation expansion, locale spelling and translation.

METHODS

new

Purpose

Constructs a new normaliser object.

API Specification

Input

{
    warn_on_error => {
        type     => 'boolean',
        optional => 1,
        default  => 0,
    },
}

Output

{ type => 'object', isa => 'Genealogy::Occupation' }

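Arguments

warn_on_error (optional boolean, default 0): when true, normalise emits a warning via carp for any occupation that cannot be translated.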

Returns

A blessed Genealogy::Occupation object.

Side Effects

None.

Notes

The system locale is detected once at construction time and cached for the lifetime of the object.
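The practical consequence is that changing the locale after construction does not affect an existing object. The sketch below illustrates that pattern with a hypothetical stand-in class, Local::LocaleSketch, which assumes the locale is read from the LC_ALL or LANG environment variables. How Genealogy::Occupation actually detects the locale is not stated here, so treat this as an illustration of the caching behaviour only.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in class, not part of Genealogy::Occupation
    package Local::LocaleSketch;

    sub new {
        my ($class) = @_;

        # Assumption: the locale comes from the environment.
        # It is read once here and never looked up again.
        my $locale = $ENV{LC_ALL} || $ENV{LANG} || 'C';

        return bless { locale => $locale }, $class;
    }

    # Returns the value cached by new()
    sub locale { return $_[0]->{locale} }
}

local $ENV{LC_ALL} = 'fr_FR.UTF-8';
my $sketch = Local::LocaleSketch->new();

# Changing the environment afterwards does not affect the existing object
$ENV{LC_ALL} = 'de_DE.UTF-8';

print $sketch->locale(), "\n";    # prints fr_FR.UTF-8
```

Under that assumption, a long-running program that switches locale would need to construct a new object to pick up the change.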

Example

my $normaliser = Genealogy::Occupation->new({
    warn_on_error => 1,
});

normalise

Purpose

Normalises one or more occupation strings, applying filtering, deduplication, abbreviation expansion, locale spelling and translation in order.

API Specification

Input

{
    occupation => {
        type => ['string', 'arrayref'],
    },
    sex => {
        type     => 'string',
        optional => 1,
        memberof => ['M', 'F'],
    },
}

Output

{
    type         => 'arrayref',
    element_type => 'string',
}

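Arguments

occupation (required): a single occupation string, or an arrayref of occupation strings.

sex (optional): either 'M' or 'F'.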

Returns

An arrayref of normalised occupation strings. May be empty if all occupations were filtered out.

Side Effects

If warn_on_error was set at construction and an occupation cannot be translated, emits a warning via carp.
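A minimal sketch of this opt-in warning pattern, using the core Carp module. The translation table, the function name translate_or_warn and the fallback to the untranslated string are all assumptions made for illustration; they are not taken from the module's source.

```perl
use strict;
use warnings;
use Carp qw(carp);

# Hypothetical translation table, for illustration only
my %french = ('Agricultural Labourer' => 'Ouvrier agricole');

# Hypothetical helper: translate, optionally warning when no entry exists
sub translate_or_warn {
    my ($occupation, $warn_on_error) = @_;

    return $french{$occupation} if exists $french{$occupation};

    carp "No translation for '$occupation'" if $warn_on_error;

    # Assumption: fall back to the untranslated form
    return $occupation;
}

# Capture warnings so the effect of the flag is visible
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $silent = translate_or_warn('Platelayer', 0);    # no warning
my $noisy  = translate_or_warn('Platelayer', 1);    # one warning via carp

printf "%d warning(s) captured\n", scalar @warnings;    # prints 1 warning(s) captured
```

Because carp ultimately calls warn, callers can intercept these messages with a $SIG{__WARN__} handler, as the sketch does.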

Notes

Deduplication operates across the full list of occupations passed in. Processing a single occupation at a time will not deduplicate across multiple calls.

Deduplication compares the pre-translation normalised English forms, not the translated output. This means two consecutive identical English occupations correctly collapse to one entry even in French or German locales, where the translated results stored in the output array would otherwise never match the incoming English string.
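The difference can be sketched with two hypothetical helpers: one keys its duplicate check on the English form, the other searches the already translated output. The translation table and both function names are assumptions for illustration and do not reflect the module's internals.

```perl
use strict;
use warnings;

# Hypothetical English-to-French table, for illustration only
my %french = (
    'Agricultural Labourer' => 'Ouvrier agricole',
    'Blacksmith'            => 'Forgeron',
);

# Deduplicate on the English form, then translate
sub dedupe_on_english {
    my @english = @_;
    my (%seen, @out);
    for my $occupation (@english) {
        next if $seen{$occupation}++;    # key is the pre-translation form
        push @out, $french{$occupation} // $occupation;
    }
    return \@out;
}

# Naive alternative: search the translated output for the English string
sub dedupe_on_output {
    my @english = @_;
    my @out;
    for my $occupation (@english) {
        next if grep { $_ eq $occupation } @out;    # misses translated entries
        push @out, $french{$occupation} // $occupation;
    }
    return \@out;
}

my @input = ('Agricultural Labourer', 'Agricultural Labourer', 'Blacksmith');

print scalar(@{ dedupe_on_english(@input) }), "\n";    # prints 2
print scalar(@{ dedupe_on_output(@input) }),  "\n";    # prints 3
```

In the sketch the record of seen values lives inside a single call, which mirrors why the full list has to be passed at once for duplicates to be collapsed.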

Example

my $result = $normaliser->normalise(
    occupation => ['Ag Lab', 'Ag Lab', 'Retired'],
    sex        => 'M',
);
# Returns ['Agricultural Labourer']

$result = $normaliser->normalise(
    occupation => 'Platelayer Railway',
);
# Returns ['Railway Platelayer']

AUTHOR

Nigel Horne <njh@bandsman.co.uk>

BUGS

Please report bugs via the GitHub issue tracker: https://github.com/nigelhorne/Genealogy-Occupation/issues

TODO

SEE ALSO

LICENSE AND COPYRIGHT

Copyright 2026 Nigel Horne.

This program is released under the following licence: GPL2. If you use it, please let me know.