NAME

Text::JaroWinkler - An implementation of the Jaro-Winkler distance

SYNOPSIS

use Text::JaroWinkler qw( strcmp95 );

print strcmp95("it is a dog", "i am a dog.", 11);
# prints 0.865619834710744

DESCRIPTION

This module implements the Jaro-Winkler distance, a measure of similarity between two strings. It is a variant of the Jaro distance metric and is mainly used in record linkage (duplicate detection). Despite the name "distance", a higher value means the two strings are more similar: the score is normalized so that 0 means no similarity and 1 means an exact match. The metric is designed for, and best suited to, short strings such as personal names. More information can be found at <http://en.wikipedia.org/wiki/Jaro-Winkler>.

It is an XS wrapper around the original C implementation by the author of the algorithm, <http://www.census.gov/geo/msb/stand/strcmp.c>, with minor modifications to accept variable-length input.
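
To illustrate the idea behind the metric, here is a minimal pure-Perl sketch of the textbook Jaro-Winkler similarity. It is not this module's code: the function names jaro_similarity and jaro_winkler_similarity, the 0.1 prefix scaling factor and the 4-character prefix limit are assumptions taken from the common textbook formulation. The C implementation wrapped by this module may apply further adjustments, so strcmp95 need not return the same scores.

```perl
use strict;
use warnings;
use List::Util qw( min max );

# Hypothetical helper (not part of Text::JaroWinkler): textbook Jaro similarity.
sub jaro_similarity {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;
    my ($len_s, $len_t) = (scalar @s, scalar @t);
    return 1 if $len_s == 0 && $len_t == 0;
    return 0 if $len_s == 0 || $len_t == 0;

    # Two characters match only if they are equal and at most
    # $window positions apart.
    my $window = max(0, int(max($len_s, $len_t) / 2) - 1);
    my @s_matched = (0) x $len_s;
    my @t_matched = (0) x $len_t;
    my $matches = 0;

    for my $i (0 .. $#s) {
        my $lo = max(0, $i - $window);
        my $hi = min($#t, $i + $window);
        for my $j ($lo .. $hi) {
            next if $t_matched[$j] || $s[$i] ne $t[$j];
            $s_matched[$i] = $t_matched[$j] = 1;
            $matches++;
            last;
        }
    }
    return 0 if $matches == 0;

    # Matched characters that line up in a different order count as
    # half a transposition each.
    my @s_seq = map { $s[$_] } grep { $s_matched[$_] } 0 .. $#s;
    my @t_seq = map { $t[$_] } grep { $t_matched[$_] } 0 .. $#t;
    my $out_of_order = grep { $s_seq[$_] ne $t_seq[$_] } 0 .. $#s_seq;
    my $transpositions = $out_of_order / 2;

    return ( $matches / $len_s
           + $matches / $len_t
           + ($matches - $transpositions) / $matches ) / 3;
}

# Hypothetical helper: Winkler's variant boosts the Jaro score for
# strings that share a common prefix (up to 4 characters here).
sub jaro_winkler_similarity {
    my ($s, $t, $p) = @_;
    $p = 0.1 unless defined $p;    # assumed default scaling factor

    my $jaro   = jaro_similarity($s, $t);
    my $limit  = min(4, length($s), length($t));
    my $prefix = 0;
    $prefix++
        while $prefix < $limit
        && substr($s, $prefix, 1) eq substr($t, $prefix, 1);

    return $jaro + $prefix * $p * (1 - $jaro);
}

printf "%.4f\n", jaro_winkler_similarity("MARTHA", "MARHTA");    # prints 0.9611
```

In this sketch "MARTHA" and "MARHTA" share all six characters with one transposition, giving a Jaro score of about 0.9444. The three-character common prefix "MAR" then raises it to about 0.9611.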

EXPORT

None by default. strcmp95 can be imported on request.

AUTHOR

Shu-Chun Weng <scw@csie.org>

SEE ALSO

perl, Text::Levenshtein, Text::LevenshteinXS, Text::WagnerFischer