NAME
Text::SenseClusters::LabelEvaluation::SimilarityScore - Module for computing the similarity score between the contents of two files.
SYNOPSIS
# The following code snippet shows how to use SimilarityScore.
package Text::SenseClusters::LabelEvaluation::Test_SimilarityScore;
# Load the SimilarityScore module.
use Text::SenseClusters::LabelEvaluation::SimilarityScore;
my $firstString = "IBM::: vice president, million dollars, Wall Street, Deep Blue, ".
"International Business, Business Machines, International Machines, ".
"United States, Justice Department, personal computers";
my $secondString = "vice president, million dollars, Deep Blue, International Business, ".
"Business Machines, International Machines, United States, Justice Department";
my $similarityObject = Text::SenseClusters::LabelEvaluation::SimilarityScore->
new($firstString,$secondString, "../stoplist.txt");
#my $score = $similarityObject->computeOverlappingScores();
my ($score, %allScores) = $similarityObject->computeOverlappingScores();
print "Score:: $score \n";
print "Lesk Score :: $allScores{'lesk'} \n";
print "Raw Lesk Score :: $allScores{'raw_lesk'} \n";
print "precision Score :: $allScores{'precision'} \n";
print "recall Score :: $allScores{'recall'} \n";
print "F Score :: $allScores{'F'} \n";
print "dice Score :: $allScores{'dice'} \n";
print "E Score :: $allScores{'E'} \n";
print "cosine Score :: $allScores{'cosine'} \n";
print "\n\n";
DESCRIPTION
This module provides a function that compares two strings and returns their overlap scores. For details on how the similarity scores are calculated, refer to the Text::Similarity documentation: http://search.cpan.org/~tpederse/Text-Similarity-0.09/
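The score names printed in the SYNOPSIS (precision, recall, F, dice, E, cosine) can be illustrated with a minimal sketch. It assumes the conventional definitions of these measures, derived from an overlap count and the word counts of the two inputs. The helper name overlap_scores is hypothetical, and whether Text::Similarity uses exactly these denominators is an assumption; the lesk and raw_lesk scores are omitted because they depend on phrase-level matching.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SimilarityScore or Text::Similarity.
# $overlap : number of words shared by the two inputs
# $wc1     : word count of the first input
# $wc2     : word count of the second input
sub overlap_scores {
    my ($overlap, $wc1, $wc2) = @_;

    # With no overlap or an empty input, every ratio is 0 (and E is 1).
    return (precision => 0, recall => 0, F => 0, dice => 0, E => 1, cosine => 0)
        if $overlap == 0 || $wc1 == 0 || $wc2 == 0;

    # Assumption: precision is relative to the second input,
    # recall to the first.
    my $precision = $overlap / $wc2;
    my $recall    = $overlap / $wc1;

    # F is the harmonic mean of precision and recall.
    my $f = 2 * $precision * $recall / ($precision + $recall);

    return (
        precision => $precision,
        recall    => $recall,
        F         => $f,
        dice      => 2 * $overlap / ($wc1 + $wc2),
        E         => 1 - $f,
        cosine    => $overlap / sqrt($wc1 * $wc2),
    );
}

# Example: 8 shared words, 10 words in the first input, 8 in the second.
my %scores = overlap_scores(8, 10, 8);
printf "%-9s %.3f\n", $_, $scores{$_} for sort keys %scores;
```

Under these definitions F and dice are algebraically equal, which makes a convenient sanity check when comparing against the values the module reports.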
Constructor: new()
This is the constructor, which creates an object of this class. Reference: http://perldoc.perl.org/perlobj.html
The constructor takes the following arguments and initializes the object with them:
1. $clusterData : Datatype: String
This variable contains the labels generated by SenseClusters.
2. $scoreObject : Datatype: String
This variable contains the gold standard key data.
3. $stopListFileLoc : Datatype: String
This variable contains the user-defined location of the stop list file.
4. $verbose : Datatype: Integer
This variable indicates whether to display all types of similarity scores.
Function: computeOverlappingScores
Compares the labels file with the wiki files and returns the overlap scores.
@argument1 : Name of the cluster file.
@argument2 : Name of the file containing the data from Wikipedia.
@argument3 : Name of the file containing the stop word list.
@return : The overlap scores between these files.
@description :
1). Reads the file names from the command-line arguments.
2). Invokes the Text::Similarity::Overlaps module, passing it the file names for similarity comparison.
3). Returns the overlap scores obtained from this module as the similarity value.
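The delegation to Text::Similarity::Overlaps described above might look roughly like the following sketch. It is not the module's actual implementation: the wrapper name compare_labels is hypothetical, it compares strings rather than files, and the option keys (normalize, stoplist) and the list-context return of getSimilarityStrings are assumptions that should be checked against the installed Text::Similarity version.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of SimilarityScore): compares two strings
# with Text::Similarity::Overlaps and returns whatever that module returns,
# assumed here to be ($score, %allScores) in list context.
sub compare_labels {
    my ($labels, $gold, $stoplist) = @_;

    # Loaded at run time so the sketch still compiles without the module.
    require Text::Similarity::Overlaps;

    # Assumed option names; verify them against the installed version.
    my %options = (normalize => 1);
    $options{stoplist} = $stoplist if defined $stoplist;

    my $overlaps = Text::Similarity::Overlaps->new(\%options)
        or die "Could not construct Text::Similarity::Overlaps\n";

    return $overlaps->getSimilarityStrings($labels, $gold);
}

# Run the example only when the CPAN module is available.
if (eval { require Text::Similarity::Overlaps; 1 }) {
    my ($score, %all) = compare_labels(
        "vice president, million dollars, Deep Blue",
        "vice president, Deep Blue, United States",
    );
    print "Score: $score\n";
    print "$_: $all{$_}\n" for sort keys %all;
}
else {
    print "Text::Similarity::Overlaps is not installed; skipping.\n";
}
```

Passing a stop list path as the third argument removes common words before the overlap is counted, which corresponds to the role of the stop word file described for this function.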
SEE ALSO
http://senseclusters.cvs.sourceforge.net/viewvc/senseclusters/LabelEvaluation/
Last modified by : $Id: SimilarityScore.pm,v 1.5 2013/03/07 23:14:13 jhaxx030 Exp $
AUTHORS
Anand Jha, University of Minnesota, Duluth
jhaxx030 at d.umn.edu
Ted Pedersen, University of Minnesota, Duluth
tpederse at d.umn.edu
COPYRIGHT AND LICENSE
Copyright (C) 2012 Ted Pedersen, Anand Jha
See http://dev.perl.org/licenses/ for more information.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to:
The Free Software Foundation, Inc., 59 Temple Place, Suite 330,
Boston, MA 02111-1307 USA