Name
Lingua::EN::SimilarNames::Levenshtein - Compare people's first and last names.
Synopsis
use Lingua::EN::SimilarNames::Levenshtein;
use Data::Dumper;

# Each entry is [ first name, last name ]
my $people = [
    [ 'John',     'Wayne' ],
    [ 'Sundance', 'Kid' ],
    [ 'Jose',     'Wales' ],
    [ 'John',     'Wall' ],
];
my @people_objects = map {
    Person->new(
        first_name => $_->[0],
        last_name  => $_->[1],
    )
} @{$people};

# Build list of name pairs within 5 character edits of each other
my $similar_people = SimilarNames->new(
    list_of_people   => \@people_objects,
    maximum_distance => 5
);

# Get the people name pairs as an ArrayRef[ArrayRef[ArrayRef[Str]]]
print Dumper $similar_people->list_of_similar_name_pairs;

# which results in:
[
    [ [ "Jose", "Wales" ], [ "John", "Wall" ] ],
    [ [ "Jose", "Wales" ], [ "John", "Wayne" ] ],
    [ [ "John", "Wall" ],  [ "John", "Wayne" ] ]
]
Description
Given a list of Person objects, find the pairs of people whose names are within a specified edit distance of each other.
Classes
Person
This class defines people objects with first and last name attributes.
CompareTwoNames
This class defines comparator objects. Given two Person objects, it computes the edit distance between their names.
SimilarNames
This class takes a list of Person objects and uses CompareTwoNames to generate a list of pairs of people whose names are within a maximum edit distance (maximum_distance) of each other.
The pairs of Person objects with similar names are available via the list_of_people_with_similar_names attribute. Alternatively, the pairs of names themselves (without the Person objects) are available via the list_of_similar_name_pairs attribute.
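To make the comparison concrete, here is a minimal pure-Perl sketch of the idea behind CompareTwoNames and SimilarNames. It is not the module's own code. The sub names levenshtein, name_distance and similar_name_pairs are hypothetical, and it assumes that the distance between two people is the sum of the first-name and last-name distances, which is consistent with the Synopsis output but is not confirmed by this document.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Dynamic-programming Levenshtein distance between two strings,
# keeping only the previous row of the edit matrix.
sub levenshtein {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));    # cost of building each prefix of $t from ''
    for my $i (1 .. length($s)) {
        my @curr = ($i);             # cost of deleting the first $i chars of $s
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @curr, min(
                $prev[$j] + 1,            # deletion
                $curr[$j - 1] + 1,        # insertion
                $prev[$j - 1] + $cost,    # substitution (or match)
            );
        }
        @prev = @curr;
    }
    return $prev[-1];
}

# ASSUMPTION: distance between two people = first-name distance
# plus last-name distance. Each person is a [ first, last ] array ref.
sub name_distance {
    my ($p, $q) = @_;
    return levenshtein($p->[0], $q->[0]) + levenshtein($p->[1], $q->[1]);
}

# Compare every unordered pair once and keep those within $max edits.
sub similar_name_pairs {
    my ($names, $max) = @_;
    my @pairs;
    for my $i (0 .. $#{$names} - 1) {
        for my $j ($i + 1 .. $#{$names}) {
            push @pairs, [ $names->[$i], $names->[$j] ]
                if name_distance($names->[$i], $names->[$j]) <= $max;
        }
    }
    return \@pairs;
}

my $names = [
    [ 'John',     'Wayne' ],
    [ 'Sundance', 'Kid' ],
    [ 'Jose',     'Wales' ],
    [ 'John',     'Wall' ],
];

# Print each similar pair as "First Last <-> First Last"
for my $pair (@{ similar_name_pairs($names, 5) }) {
    print join(' <-> ', map { "$_->[0] $_->[1]" } @{$pair}), "\n";
}
```

Under that assumption, Jose Wales and John Wall are 2 + 2 = 4 edits apart, Jose Wales and John Wayne are 2 + 3 = 5, and John Wall and John Wayne are 0 + 3 = 3, so the sketch selects the same three pairs as the Synopsis. The order of the pairs, and of the names within each pair, may differ from the module's output. Comparing every pair costs a number of distance computations that grows quadratically with the list length.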
Accessors
list_of_similar_name_pairs
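Called on a SimilarNames object, this returns the pairs of similar names found among the Person objects passed in, as an ArrayRef[ArrayRef[ArrayRef[Str]]]: a list of pairs, where each name in a pair is a [ first name, last name ] array. Similarity is measured with the Levenshtein edit distance, so the paired names are close to one another in spelling.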
list_of_people_with_similar_names
This accessor is similar to list_of_similar_name_pairs, but it returns a list of pairs of Person objects instead of pairs of names.
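The relationship between the two accessors can be sketched as a simple mapping from object pairs to name pairs. The example below is an illustration only: Sketch::Person is a hypothetical stand-in for the module's Person class, the first_name and last_name reader methods are assumed from the constructor arguments shown in the Synopsis, and people_pairs_to_name_pairs is not part of the module.

```perl
use strict;
use warnings;

# Hypothetical minimal stand-in for the module's Person class.
package Sketch::Person;
sub new        { my ($class, %args) = @_; return bless {%args}, $class }
sub first_name { return $_[0]{first_name} }
sub last_name  { return $_[0]{last_name} }

package main;

# Turn a list of [ person, person ] pairs into a list of
# [ [ first, last ], [ first, last ] ] name pairs.
sub people_pairs_to_name_pairs {
    my ($people_pairs) = @_;
    return [
        map {
            [ map { [ $_->first_name, $_->last_name ] } @{$_} ]
        } @{$people_pairs}
    ];
}

my $wales = Sketch::Person->new(first_name => 'Jose', last_name => 'Wales');
my $wall  = Sketch::Person->new(first_name => 'John', last_name => 'Wall');

my $name_pairs = people_pairs_to_name_pairs([ [ $wales, $wall ] ]);

# Prints "Jose Wales / John Wall"
print join(' / ', map { "$_->[0] $_->[1]" } @{ $name_pairs->[0] }), "\n";
```

Working with the Person pairs is the more flexible choice when the objects carry data beyond the names. The name pairs are convenient for printing or serializing.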
Authors
Mateu X. Hunter <hunter@missoula.org>
Copyright
Copyright 2010, Mateu X. Hunter
License
You may distribute this code under the same terms as Perl itself.
Code Repository
http://github.com/mateu/Lingua-EN-SimilarNames-Levenshtein