NAME
String::Compare - Compare two strings and return how much they are alike
SYNOPSIS
use String::Compare;
my $str1 = "J R Company";
my $str2 = "J. R. Company";
my $str3 = "J R Associates";
my $points12 = compare($str1,$str2);
my $points13 = compare($str1,$str3);
if ($points12 > $points13) {
    print $str1." refers to ".$str2;
} else {
    print $str1." refers to ".$str3;
}
DESCRIPTION
This module was created when I needed to merge information from two databases and had to work out who was who in each one. The names weren't always equal; sometimes there were small differences.
The problem was that I needed to choose the right person, so I had to measure how much the different names are alike. I tried comparing char by char, but situations like the one in the synopsis showed me that wasn't enough. So I created a set of tests to give a more accurate score of how much the names are alike.
The result is a fraction between 0 and 1. If the strings are exactly equal, it returns 1; if they have nothing in common, it returns 0.
METHODS
- compare($str1,$str2,%tests)
This method receives the two strings and, optionally, the names and weights of the tests to run. The default behavior is to use all the tests with weight 1. This method lowercases both strings before running the tests, since case doesn't change the meaning of the content. The individual tests, however, are case sensitive, so if you call one of them directly and want a case-insensitive comparison, you must lc the strings yourself.
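As a rough illustration of how named tests and weights could combine into one result, here is a minimal sketch. It does not use the module: the tests same_length and same_first, the function weighted_compare, and the weighted-average rule itself are all assumptions made for the example, not the module's actual code.

```perl
use strict;
use warnings;

# Hypothetical stand-in tests; each returns a score between 0 and 1.
sub same_length { my ($s1, $s2) = @_; return length($s1) == length($s2) ? 1 : 0 }
sub same_first  { my ($s1, $s2) = @_; return substr($s1, 0, 1) eq substr($s2, 0, 1) ? 1 : 0 }

my %available = (same_length => \&same_length, same_first => \&same_first);

# Assumed combination rule: weighted average of the selected tests.
sub weighted_compare {
    my ($str1, $str2, %tests) = @_;

    # Lowercase once here, as compare() is described as doing.
    ($str1, $str2) = (lc $str1, lc $str2);

    # Default: every available test with weight 1.
    %tests = map { $_ => 1 } keys %available unless %tests;

    my ($points, $total_weight) = (0, 0);
    for my $name (keys %tests) {
        my $test = $available{$name} or die "Unknown test: $name";
        $points       += $tests{$name} * $test->($str1, $str2);
        $total_weight += $tests{$name};
    }
    return $total_weight ? $points / $total_weight : 0;
}

my $default  = weighted_compare("J R Company", "j. r. company");
my $weighted = weighted_compare("J R Company", "j. r. company",
                                same_length => 1, same_first => 3);
printf "default: %.2f, weighted: %.2f\n", $default, $weighted;
# prints "default: 0.50, weighted: 0.75"
```

With the module itself, the corresponding call would presumably look like compare($str1, $str2, word_by_word => 2), passing test names as keys and weights as values.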
The current tests are listed below (you can use the tests individually if you like).
P.S.: You can use custom tests. Because the tests are executed using eval, a custom test only needs to be referred to by the full name of its method.
P.S.2: If you create a test, please share it by sending it to me by email, and I will be glad to include it in the default set.
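The note about custom tests can be sketched as follows. This is only an illustration of calling a test by its full name through eval; the package My::Tests, the test same_word_count, and the helper run_named_test are hypothetical names, and the module's own dispatch code may differ.

```perl
use strict;
use warnings;

package My::Tests;

# Hypothetical custom test: 1 if both strings have the same number of words, else 0.
sub same_word_count {
    my ($str1, $str2) = @_;
    my @w1 = split ' ', $str1;
    my @w2 = split ' ', $str2;
    return scalar(@w1) == scalar(@w2) ? 1 : 0;
}

package main;

# Run a test given its fully qualified name, using string eval.
sub run_named_test {
    my ($name, $str1, $str2) = @_;

    # Accept only plain package-qualified identifiers before passing them to eval.
    die "Bad test name: $name\n" unless $name =~ /\A\w+(?:::\w+)*\z/;

    my $result = eval "$name(\$str1, \$str2)";
    die "Test $name failed: $@" if $@;
    return $result;
}

print run_named_test("My::Tests::same_word_count", "J R Company", "J. R. Company"), "\n";
print run_named_test("My::Tests::same_word_count", "J R Company", "JR Company"), "\n";
```

Assuming the module accepts such names as keys of %tests, a custom test would be selected with something like compare($str1, $str2, 'My::Tests::same_word_count' => 1).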
- word_by_word($str1, $str2)
Runs the char_by_char test on each word, giving points according to the size of the word.
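To make the idea concrete, here is a minimal sketch of a word-by-word comparison. Both functions are hypothetical reimplementations written for this example: how the real char_by_char scores a pair of words, and how words are paired and weighted, are assumptions.

```perl
use strict;
use warnings;

# Assumed char_by_char: fraction of positions holding the same character,
# relative to the longer string.
sub char_by_char_sketch {
    my ($s1, $s2) = @_;
    my $longest  = length($s1) > length($s2) ? length($s1) : length($s2);
    my $shortest = length($s1) < length($s2) ? length($s1) : length($s2);
    return 1 if $longest == 0;

    my $same = 0;
    for my $i (0 .. $shortest - 1) {
        $same++ if substr($s1, $i, 1) eq substr($s2, $i, 1);
    }
    return $same / $longest;
}

# Compare words pairwise by position; each pair is weighted by the
# length of its longer word, so long words count for more.
sub word_by_word_sketch {
    my ($str1, $str2) = @_;
    my @w1 = split ' ', $str1;
    my @w2 = split ' ', $str2;
    my $pairs = scalar(@w1) > scalar(@w2) ? scalar(@w1) : scalar(@w2);

    my ($points, $total) = (0, 0);
    for my $i (0 .. $pairs - 1) {
        my $x = $i < @w1 ? $w1[$i] : '';
        my $y = $i < @w2 ? $w2[$i] : '';
        my $size = length($x) > length($y) ? length($x) : length($y);
        $points += $size * char_by_char_sketch($x, $y);
        $total  += $size;
    }
    return $total ? $points / $total : 1;
}

printf "%.2f\n", word_by_word_sketch("j r company", "j. r. company");   # prints 0.82
printf "%.2f\n", word_by_word_sketch("j r company", "j r associates");  # prints 0.17
```

Under these assumptions the synopsis example behaves as intended: "J. R. Company" scores higher than "J R Associates" because the long word "company" matches exactly, while the extra dots only cost a little.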
COPYRIGHT
This module was created by "Daniel Ruoso" <daniel@ruoso.com>. It is licensed under both the GNU GPL and the Artistic License.