NAME
Text::Levenshtein::Damerau - Damerau Levenshtein edit distance.
SYNOPSIS
use
warnings;
use
strict;
my
@targets
= (
'fuor'
,
'xr'
,
'fourrrr'
,
'fo'
);
# Initialize Text::Levenshtein::Damerau object with text to compare against
my
$tld
= Text::Levenshtein::Damerau->new(
'four'
);
$tld
->dld(
$targets
[0]);
# prints 1
my
$tld
=
$tld
->dld({
list
=> \
@targets
});
$tld
->{
'fuor'
};
# prints 1
$tld
->dld_best_match({
list
=> \
@targets
});
# prints fuor
$tld
->dld_best_distance({
list
=> \
@targets
});
# prints 1
# or even more simply
use
warnings;
use
strict;
edistance(
'Neil'
,
'Niel'
);
# prints 1
DESCRIPTION
Returns the true Damerau Levenshtein edit distance of strings with adjacent transpositions. Useful for fuzzy matching, DNA variation metrics, and fraud detection.
Defaults to using Pure Perl Text::Levenshtein::Damerau::PP, but has an XS addon Text::Levenshtein::Damerau::XS for massive speed imrovements. Works correctly with utf8 if backend supports it; known to work with Text::Levenshtein::Damerau::PP
and Text::Levenshtein::Damerau::XS
.
CONSTRUCTOR
new
Creates and returns a Text::Levenshtein::Damerau
object. Takes a scalar with the text (source) you want to compare against.
my
$tld
= Text::Levenshtein::Damerau->new(
'Neil'
);
# Creates a new Text::Levenshtein::Damerau object $tld
METHODS
$tld->dld
Scalar Argument: Takes a string to compare with.
Returns: an integer representing the edit distance between the source and the passed argument.
Hashref Argument: Takes a hashref containing:
list => \@array (array ref of strings to compare with)
OPTIONAL max_distance => $int (only return results with $int distance or less).
OPTIONAL backend => 'Some::Module::its_function' Any module that will take 2 arguments and returns an int. If the module fails to load, the function doesn't exist, or the function doesn't return a number when passed 2 strings, then
backend
remains unchanged.# Override defaults and use Text::Levenshtein::Damerau::PP's pp_edistance()
$tld
->dld({
list
=> \
@list
,
backend
=>
'Text::Levenshtein::Damerau::PP::pp_edistance'
);
# Override defaults and use Text::Levenshtein::Damerau::XS's xs_edistance()
requires Text::Levenshtein::Damerau::XS;
...
$tld
->dld({
list
=> \
@list
,
backend
=>
'Text::Levenshtein::Damerau::XS::xs_edistance'
);
Returns: hashref with each word from the passed list as keys, and their edit distance (if less than max_distance, which is unlimited by default).
my
$tld
= Text::Levenshtein::Damerau->new(
'Neil'
);
$tld
->dld(
'Niel'
);
# prints 1
#or if you want to check the distance of various items in a list
my
@names_list
= (
'Niel'
,
'Jack'
);
my
$tld
= Text::Levenshtein::Damerau->new(
'Neil'
);
my
$d_ref
=
$tld
->dld({
list
=> \
@names_list
});
# pass a list, returns a hash ref
$d_ref
->{
'Niel'
};
#prints 1
$d_ref
->{
'Jack'
};
#prints 4
$tld->dld_best_match
Argument: an array reference of strings.
Returns: the string with the smallest edit distance between the source and the array of strings passed.
Takes distance of $tld source against every item in @targets, then returns the string of the best match.
my
$tld
= Text::Levenshtein::Damerau->new(
'Neil'
);
my
@name_spellings
= (
'Niel'
,
'Neell'
,
'KNiel'
);
$tld
->dld_best_match({
list
=> \
@name_spellings
});
# prints Niel
$tld->dld_best_distance
Arguments: an array reference of strings.
Returns: the smallest edit distance between the source and the array reference of strings passed.
Takes distance of $tld source against every item in the passed array, then returns the smallest edit distance.
my
$tld
= Text::Levenshtein::Damerau->new(
'Neil'
);
my
@name_spellings
= (
'Niel'
,
'Neell'
,
'KNiel'
);
$tld
->dld_best_distance({
list
=> \
@name_spellings
});
# prints 1
EXPORTABLE METHODS
edistance
Arguments: source string and target string.
OPTIONAL 3rd argument int (max distance; only return results with $int distance or less). 0 = unlimited. Default = 0.
Returns: int that represents the edit distance between the two argument. -1 if max distance is set and reached.
Wrapper function to take the edit distance between a source and target string. It will attempt to use, in order:
Text::Levenshtein::Damerau::XS xs_edistance
Text::Levenshtein::Damerau::PP pp_edistance
SEE ALSO
https://github.com/ugexe/Text--Levenshtein--Damerau Repository
http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance Damerau levenshtein explanation
Text::Fuzzy Regular levenshtein distance
BUGS
Please report bugs to:
https://rt.cpan.org/Public/Dist/Display.html?Name=Text-Levenshtein-Damerau
AUTHOR
Nick Logan <ug@skunkds.com>
LICENSE AND COPYRIGHT
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.