NAME
Word2vec::Lesk - Word2vec-Interface Utility Module.
SYNOPSIS
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my $string_a = "This is a test string";
    my $string_b = "This is another test string";

    my $lesk_score   = $lesk->CalculateLeskScore( $string_a, $string_b );
    my $cosine_score = $lesk->CalculateCosineScore( $string_a, $string_b );
    my $f_score      = $lesk->CalculateFScore( $string_a, $string_b );

    print( "Lesk Score: $lesk_score\n" );
    print( "Cosine Score: $cosine_score\n" );
    print( "F Score: $f_score\n" );

    undef( $lesk );
or
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my $string_a = "This is a test string";
    my $string_b = "This is another test string";

    my %results = %{ $lesk->CalculateAllScores( $string_a, $string_b ) };

    for my $key ( sort keys %results )
    {
        print "$key: $results{ $key }\n";
    }

    undef( %results );
    undef( $lesk );
DESCRIPTION
Word2vec::Lesk is a module of Lesk functions for the Word2vec::Interface package. It calculates Lesk, Raw Lesk, Cosine, F, Recall and Precision scores from the phrase/feature overlap between two strings and returns them to the user.
Main Functions
new
Description:
Returns a new "Word2vec::Lesk" module object.
Note: Specifying no parameters implies default options.
Default Parameters:
debugLog = 0
writeLog = 0
Input:
$debugLog -> Instructs module to print debug statements to the console. (1 = True / 0 = False)
$writeLog -> Instructs module to print debug statements to a log file. (1 = True / 0 = False)
Output:
Word2vec::Lesk object.
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    undef( $lesk );
DESTROY
Description:
Removes Word2vec::Lesk object from memory.
Input:
None
Output:
None
Example:
See above example for "new" function.
Note: The DESTROY function is also called automatically during global destruction when exiting the program.
GetMatchingFeatures
Description:
Given two strings, this returns a hash of all overlapping (matching) features between both strings and their frequency counts.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$hash_ref -> Hash reference whose keys are the unique features that match between the two input strings and whose values are the frequency counts of each unique feature.
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my %matching_features = %{ $lesk->GetMatchingFeatures( "I like to eat cookies", "Sometimes I like to eat cookies" ) };

    for my $feature ( sort keys %matching_features )
    {
        print "$feature : $matching_features{ $feature }\n";
    }

    undef( %matching_features );
    undef( $lesk );
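To make the idea of "matching features and their frequency counts" concrete, here is a minimal sketch. It is not the module's implementation: the helper name matching_features_sketch is hypothetical, and it assumes that a feature is a whitespace-separated word and that the count of a shared word is the smaller of its counts in the two strings.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk. Assumes a "feature" is a
# whitespace-separated word and that a shared word's count is the smaller of
# its counts in the two strings.
sub matching_features_sketch {
    my ( $string_a, $string_b ) = @_;
    my ( %count_a, %count_b, %matching );

    # Count how often each word occurs in each string.
    $count_a{ $_ }++ for split ' ', $string_a;
    $count_b{ $_ }++ for split ' ', $string_b;

    # Keep only the words present in both strings.
    for my $word ( keys %count_a ) {
        next unless exists $count_b{ $word };
        $matching{ $word } = $count_a{ $word } < $count_b{ $word }
                           ? $count_a{ $word }
                           : $count_b{ $word };
    }
    return \%matching;
}

my $features = matching_features_sketch( "I like to eat cookies",
                                         "Sometimes I like to eat cookies" );
print "$_ : $features->{ $_ }\n" for sort keys %{ $features };
```

The sketch is case-sensitive and does no stemming or stop-word removal; the real module may treat features differently.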
GetPhraseOverlap
Description:
Given two strings, this returns a hash of all overlapping (matching) phrases between both strings and their frequency counts. Longer phrases take priority when matching.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$hash_ref -> Hash reference whose keys are the unique phrases that match between the two input strings and whose values are the frequency counts of each unique phrase.
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my %phrase_overlaps = %{ $lesk->GetPhraseOverlap( "I like to eat cookies", "Sometimes I like to eat cookies" ) };

    for my $phrase ( sort keys %phrase_overlaps )
    {
        print "$phrase : $phrase_overlaps{ $phrase }\n";
    }

    undef( %phrase_overlaps );
    undef( $lesk );
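One way to picture "longer phrases take priority" is a greedy search that repeatedly picks the longest run of identical consecutive words, records it, and removes it from both strings. The sketch below illustrates that idea only. The helper phrase_overlap_sketch is hypothetical, and the whitespace tokenization and case-sensitive comparison are assumptions rather than documented behavior of the module.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk: greedy longest-first
# phrase matching over whitespace-separated words.
sub phrase_overlap_sketch {
    my ( $string_a, $string_b ) = @_;
    my @words_a = split ' ', $string_a;
    my @words_b = split ' ', $string_b;
    my %overlaps;

    while ( 1 ) {
        my ( $best_len, $best_i, $best_j ) = ( 0, 0, 0 );

        # Find the longest run of identical consecutive words still unmatched.
        for my $i ( 0 .. $#words_a ) {
            for my $j ( 0 .. $#words_b ) {
                my $len = 0;
                $len++ while $i + $len <= $#words_a
                          && $j + $len <= $#words_b
                          && defined $words_a[ $i + $len ]
                          && defined $words_b[ $j + $len ]
                          && $words_a[ $i + $len ] eq $words_b[ $j + $len ];
                ( $best_len, $best_i, $best_j ) = ( $len, $i, $j )
                    if $len > $best_len;
            }
        }
        last if $best_len == 0;

        my $phrase = join ' ', @words_a[ $best_i .. $best_i + $best_len - 1 ];
        $overlaps{ $phrase }++;

        # Blank out the matched words so they cannot be matched again.
        $words_a[ $_ ] = undef for $best_i .. $best_i + $best_len - 1;
        $words_b[ $_ ] = undef for $best_j .. $best_j + $best_len - 1;
    }
    return \%overlaps;
}

my $overlaps = phrase_overlap_sketch( "I like to eat cookies",
                                      "Sometimes I like to eat cookies" );
print "$_ : $overlaps->{ $_ }\n" for sort keys %{ $overlaps };
```

Because the longest run is consumed first, the five-word phrase is reported once instead of five separate single-word matches.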
CalculateLeskScore
Description:
Given two strings, this returns a Lesk score based on overlapping (matching) features between both strings.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$score -> Lesk Score (Float)
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my $lesk_score = $lesk->CalculateLeskScore( "I like to eat cookies", "Sometimes I like to eat cookies" );

    print "Lesk Score: $lesk_score\n";

    undef( $lesk );
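For intuition about how a Lesk-style score can reward longer overlaps, the sketch below squares the word length of each overlapping phrase, sums the results into a raw score, and normalizes by the product of the two string lengths. This is one common formulation and an assumption here. The helper lesk_score_sketch is hypothetical, and the module's exact formula may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk.
# $overlaps       : hash ref of overlapping phrase => frequency count
# $len_a, $len_b  : number of words in each input string
sub lesk_score_sketch {
    my ( $overlaps, $len_a, $len_b ) = @_;
    return 0 unless $len_a && $len_b;

    # Raw score: squared phrase length, weighted by how often it matched.
    my $raw_lesk = 0;
    for my $phrase ( keys %{ $overlaps } ) {
        my @words = split ' ', $phrase;
        $raw_lesk += $overlaps->{ $phrase } * scalar( @words ) ** 2;
    }

    # Assumed normalization: divide by the product of the string lengths.
    return $raw_lesk / ( $len_a * $len_b );
}

my $score = lesk_score_sketch( { "I like to eat cookies" => 1 }, 5, 6 );
print "Lesk Score (sketch): $score\n";
```

Squaring the phrase length means one five-word overlap contributes 25 to the raw score, while five unrelated single-word overlaps would contribute only 5.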
CalculateCosineScore
Description:
Given two strings, this returns a cosine score based on overlapping (matching) features between both strings.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$score -> Cosine Score (Float)
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my $cosine_score = $lesk->CalculateCosineScore( "I like to eat cookies", "Sometimes I like to eat cookies" );

    print "Cosine Score: $cosine_score\n";

    undef( $lesk );
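As a point of reference, a cosine score over word overlap can be computed as the standard cosine similarity between the word-count vectors of the two strings. The sketch below shows that general technique. The helper cosine_score_sketch is hypothetical, and whether the module builds its vectors this way is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk: cosine similarity between
# the word-count vectors of two whitespace-tokenized strings.
sub cosine_score_sketch {
    my ( $string_a, $string_b ) = @_;
    my ( %count_a, %count_b );
    $count_a{ $_ }++ for split ' ', $string_a;
    $count_b{ $_ }++ for split ' ', $string_b;

    # Squared vector norms.
    my ( $dot, $norm_a, $norm_b ) = ( 0, 0, 0 );
    $norm_a += $_ ** 2 for values %count_a;
    $norm_b += $_ ** 2 for values %count_b;

    # Dot product over the words shared by both strings.
    for my $word ( keys %count_a ) {
        $dot += $count_a{ $word } * $count_b{ $word }
            if exists $count_b{ $word };
    }

    return 0 unless $norm_a && $norm_b;
    return $dot / sqrt( $norm_a * $norm_b );
}

my $cosine = cosine_score_sketch( "I like to eat cookies",
                                  "Sometimes I like to eat cookies" );
print "Cosine Score (sketch): $cosine\n";
```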
CalculateFScore
Description:
Given two strings, this returns an F score based on overlapping (matching) features between both strings.
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$score -> F Score (Float)
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my $f_score = $lesk->CalculateFScore( "I like to eat cookies", "Sometimes I like to eat cookies" );

    print "F Score: $f_score\n";

    undef( $lesk );
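The F score is conventionally the harmonic mean of precision and recall. The sketch below applies that standard formula to overlap counts. The helper f_score_sketch is hypothetical, and the choice of precision as matches divided by the length of the second string and recall as matches divided by the length of the first string is an assumption, not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk.
# $matches        : number of matching words between the two strings
# $len_a, $len_b  : number of words in each input string
sub f_score_sketch {
    my ( $matches, $len_a, $len_b ) = @_;
    return 0 unless $matches && $len_a && $len_b;

    # Assumed definitions of precision and recall.
    my $precision = $matches / $len_b;
    my $recall    = $matches / $len_a;

    # Harmonic mean of precision and recall.
    return 2 * $precision * $recall / ( $precision + $recall );
}

my $f = f_score_sketch( 5, 5, 6 );
print "F Score (sketch): $f\n";
```

Because the harmonic mean is symmetric, swapping which string defines precision and which defines recall does not change the resulting F score.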
CalculateAllScores
Description:
Given two strings, this returns a hash reference of scores (F, Cosine, Lesk, Raw Lesk, Precision, Recall) and frequency counts (features, phrases, string lengths).
Input:
$string_a -> First comparison string
$string_b -> Second comparison string
Output:
$result_hash -> Hash reference containing: Lesk, Raw Lesk, F, Precision, Recall, Cosine, Matching Feature Frequency, Matching Phrase Frequency, String A Length and String B Length.
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my %scores = %{ $lesk->CalculateAllScores( "I like to eat cookies", "Sometimes I like to eat cookies" ) };

    for my $score_name ( sort keys %scores )
    {
        print "$score_name : $scores{ $score_name }\n";
    }

    undef( $lesk );
Accessor Functions
GetDebugLog
Description:
Returns the value of the _debugLog member variable, which is set when the Word2vec::Lesk object is created by the new function.
Input:
None
Output:
$value -> '0' = False, '1' = True
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my $debugLog = $lesk->GetDebugLog();

    print( "Debug Logging Enabled\n" ) if $debugLog == 1;
    print( "Debug Logging Disabled\n" ) if $debugLog == 0;

    undef( $lesk );
GetWriteLog
Description:
Returns the value of the _writeLog member variable, which is set when the Word2vec::Lesk object is created by the new function.
Input:
None
Output:
$value -> '0' = False, '1' = True
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    my $writeLog = $lesk->GetWriteLog();

    print( "Write Logging Enabled\n" ) if $writeLog == 1;
    print( "Write Logging Disabled\n" ) if $writeLog == 0;

    undef( $lesk );
Debug Functions
WriteLog
Description:
Prints the passed string to the console, the log file, or both, depending on user options.
Note: A newline character is printed after the string if the printNewLine parameter is undefined, and is not printed if the parameter is 0.
Input:
$string -> String to print to the console/log file.
$value -> 0 = Do not print a newline character after the string; any other value, including 'undef', prints a newline character.
Output:
None
Example:
    use Word2vec::Lesk;

    my $lesk = Word2vec::Lesk->new();

    $lesk->WriteLog( "Hello World" );

    undef( $lesk );
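The newline rule described in the note above can be sketched as follows. The helper format_log_line_sketch is hypothetical and only mirrors the documented rule (0 suppresses the newline, anything else, including undef, appends it). It returns the formatted string instead of printing it and does not reproduce the module's console or log file handling.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Word2vec::Lesk: applies the documented
# newline rule and returns the resulting string.
sub format_log_line_sketch {
    my ( $string, $print_new_line ) = @_;

    # Only an explicit 0 suppresses the trailing newline.
    my $suppress = defined $print_new_line && $print_new_line eq '0';
    return $suppress ? $string : "$string\n";
}

print format_log_line_sketch( "Hello World" );       # newline appended
print format_log_line_sketch( "Hello ", 0 );         # no newline
print format_log_line_sketch( "World", 1 );          # newline appended
```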
Author
Clint Cuffy, Virginia Commonwealth University
COPYRIGHT
Copyright (c) 2016
Bridget T McInnes, Virginia Commonwealth University
btmcinnes at vcu dot edu
Clint Cuffy, Virginia Commonwealth University
cuffyca at vcu dot edu
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to:
The Free Software Foundation, Inc.,
59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.