NAME
Lingua::PT::Gender - Decides if a Portuguese proper name is male or female
SYNOPSIS
use Lingua::PT::Gender qw/ptbr_gender/;
$result = ptbr_gender("Marco Carnut");
# $result now holds 1 meaning 'male'
$result = ptbr_gender("Ana Paula");
# $result now holds 0 meaning 'female'
DESCRIPTION
This module provides a routine to decide whether a Portuguese name is male or female. The algorithm examines a table of suffixes to determine this.
The table was computed using a recursive space subdivision algorithm operating on a database of about 60,000 proper names.
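As a rough illustration of how a suffix table lookup can work, here is a minimal sketch. The table below is a hypothetical toy with four entries, not the module's actual data, and guess_gender is an invented name; the real table is far larger and was derived from the name database mentioned above. The sketch assumes that the longest matching suffix wins, so a specific entry such as "quel" can override a general one such as "el".

```perl
use strict;
use warnings;

# Hypothetical toy table, NOT the module's real data.
# suffix => 1 (male) or 0 (female)
my %SUFFIX_GENDER = (
    'a'    => 0,    # Ana, Paula
    'o'    => 1,    # Marco, Paulo
    'el'   => 1,    # Miguel, Daniel
    'quel' => 0,    # Raquel (longer suffix overrides 'el')
);

# Hypothetical helper: returns 1, 0, or undef when no suffix matches.
sub guess_gender {
    my ($name) = @_;

    # Consider only the first word, compared case-insensitively.
    my ($first) = lc($name) =~ /(\S+)/;
    return unless defined $first;

    # Try the longest suffix first, then progressively shorter ones.
    for my $len (reverse 1 .. length $first) {
        my $suffix = substr($first, -$len);
        return $SUFFIX_GENDER{$suffix} if exists $SUFFIX_GENDER{$suffix};
    }
    return;
}
```

With this toy table, guess_gender("Raquel Souza") yields 0 and guess_gender("Miguel") yields 1. A name whose suffixes are all missing from the table yields undef in scalar context.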
Typical accuracy is greater than 99%. This makes it useful for finding enrollment (data-entry) errors in databases.
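One way to apply this to a database check is sketched below. The helper name find_gender_mismatches and the record layout (hash references with name and gender fields, gender stored as 1 or 0) are assumptions made for illustration. The classifier is passed in as a code reference, so \&ptbr_gender can be supplied once the module is loaded.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the records whose stored gender disagrees
# with the classifier. $classify is a code ref that takes a name and
# returns 1 (male) or 0 (female), like ptbr_gender.
sub find_gender_mismatches {
    my ($classify, @records) = @_;
    return grep { $classify->($_->{name}) != $_->{gender} } @records;
}

# Possible usage, assuming @rows holds { name => ..., gender => ... } hashes:
#   use Lingua::PT::Gender qw/ptbr_gender/;
#   my @suspects = find_gender_mismatches(\&ptbr_gender, @rows);
```

Because the classifier is not perfect, the returned records are best treated as candidates for manual review rather than confirmed errors.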
ptbr_gender
This is the only function in this module. It returns 0 for female or 1 for male. Comparisons are case insensitive. It expects non-accented letters; it is your responsibility to strip any accents beforehand. The routine considers only the first name (word) in the string; all others are ignored.
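Since the routine expects unaccented input, one possible way to strip accents first is sketched below. The helper name strip_accents is hypothetical and not part of this module. It uses the core Unicode::Normalize module and assumes the input is already a decoded character string (for example, read through a UTF-8 input layer), not raw bytes.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);    # core module

# Hypothetical helper, not part of Lingua::PT::Gender.
sub strip_accents {
    my ($text) = @_;
    my $decomposed = NFD($text);   # split each accented letter into base letter + combining mark
    $decomposed =~ s/\p{Mn}//g;    # remove the combining (nonspacing) marks
    return $decomposed;
}

# "Jo\x{e3}o" is "Joao" written with a tilde on the "a" (U+00E3).
print strip_accents("Jo\x{e3}o"), "\n";    # prints "Joao"
```

The result can then be passed on, as in ptbr_gender(strip_accents($name)).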
The following simple filter reads names from standard input and prefixes each one with M or F accordingly:
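#!/usr/bin/perl
use strict;
use warnings;
use Lingua::PT::Gender qw/ptbr_gender/;
# Read one name per line; only the first word of each line is examined.
while (<>)
{
    print ptbr_gender($_) ? "M" : "F";
    print " $_";    # $_ still ends with its newline
}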
LICENSE
GPL2 - http://www.gnu.org/licenses/gpl.txt
AUTHOR
Marco "Kiko" Carnut <kiko at tempest.com.br>
http://www.postcogito.org/