NAME
WordNet::Similarity::LCSFinder - methods for finding Least Common Subsumers
SYNOPSIS
use WordNet::QueryData;
use WordNet::Similarity::LCSFinder;
my $wn = WordNet::QueryData->new;
my $obj = WordNet::Similarity::LCSFinder->new ($wn);
my ($lcs, $depth) = $obj->getLCSbyDepth ("scientist#n#1", "poet#n#1", "n", "wps");
my ($lcs, $pathlen) = $obj->getLCSbyPath ("dog#n#1", "cat#n#1", "n", "wps");
my ($lcs, $ic) = $obj->getLCSbyIC ("dog#n#1", "cat#n#1", "n", "wps");
DESCRIPTION
The following methods are declared in this module:
- getLCSbyDepth($synset1, $synset2, $pos, $mode)
-
Given two input synsets, finds the least common subsumer (LCS) of them. If there are multiple candidates for the LCS (due to multiple inheritance in WordNet), the LCS with the greatest depth is chosen (i.e., the candidate whose shortest path to the root is the longest).
Parameters: a blessed reference, two synsets, a part of speech, and a mode. The mode must be either the string 'wps' or 'offset'. If the mode is 'wps', then the two input synsets must be in word#pos#sense format. If the mode is 'offset', then the input synsets must be WordNet offsets.
Returns: a list of the form ($lcs, $depth), where $lcs is the LCS (in wps format if the mode is 'wps', or an offset if the mode is 'offset') and $depth is the depth of the LCS in its taxonomy. Returns undef on error.
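The two identifier forms can be told apart before a call is made. The following is only a sketch: guess_mode is a hypothetical helper, not part of this module, and it assumes that offsets are written as plain digit strings and that the part of speech is one of n, v, a, or r.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by WordNet::Similarity::LCSFinder).
# Guesses which mode string matches a synset identifier.
sub guess_mode {
    my ($synset) = @_;

    # word#pos#sense, e.g. "scientist#n#1"
    return 'wps' if $synset =~ /^[^#\s]+#[nvar]#\d+$/;

    # Assumption: an offset is a string of digits only
    return 'offset' if $synset =~ /^\d+$/;

    # Neither form was recognised
    return;
}
```

The result could then be passed as the fourth argument to any of the methods in this module, after checking that it is defined.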
- getLCSbyPath($synset1, $synset2, $pos, $mode)
-
Given two input synsets, finds the least common subsumer (LCS) of them. If there are multiple candidates for the LCS (due to multiple inheritance), the LCS that results in the shortest path between the input concepts is chosen.
Parameters: two synsets, a part of speech, and a mode.
Returns: a list of references to arrays, where each array has the form ($lcs, $pathlength). $pathlength is the length of the path between the two input concepts. Multiple LCSs can be returned if there are ties for the shortest path between the two synsets. Returns undef on error.
- getLCSbyIC($synset1, $synset2, $pos, $mode)
-
Given two input synsets, finds the least common subsumer (LCS) of them. If there are multiple candidates for the LCS, the candidate with the greatest information content is chosen.
Parameters: two synsets, a part of speech, and a mode.
Returns: a list of the form ($lcs, $ic) where $lcs is the LCS and $ic is the information content of the LCS.
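Of the three methods, getLCSbyPath is documented above as returning a list of array references rather than a flat pair, so its result needs one extra step to unpack. The following is a minimal sketch of that step. describe_path_lcs is a hypothetical helper, not part of this module, and it assumes that an error shows up as either an empty list or a single undef.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by WordNet::Similarity::LCSFinder).
# Takes the list returned by getLCSbyPath, where each element is expected
# to be an array reference of the form [$lcs, $pathlength], and returns
# one formatted string per tied LCS. Returns an empty list on error.
sub describe_path_lcs {
    my @results = @_;

    # Assumption: an error yields an empty list or a lone undef
    return () unless @results && defined $results[0];

    return map { sprintf "%s (path length %d)", $_->[0], $_->[1] }
           grep { ref $_ eq 'ARRAY' } @results;
}

# Usage, assuming $obj was created as in the SYNOPSIS:
#   my @lines = describe_path_lcs(
#       $obj->getLCSbyPath("dog#n#1", "cat#n#1", "n", "wps"));
#   print "$_\n" for @lines;
```

Keeping the error check inside the helper means callers only need to test whether the returned list is empty.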
AUTHORS
Jason Michelizzi, University of Minnesota Duluth
mich0212 at d.umn.edu
Siddharth Patwardhan, University of Utah, Salt Lake City
sidd at cs.utah.edu
Ted Pedersen, University of Minnesota Duluth
tpederse at d.umn.edu
BUGS
None.
Report bugs to tpederse at d.umn.edu or go to http://groups.yahoo.com/group/wn-similarity (preferred).
SEE ALSO
WordNet::Similarity(3) WordNet::Similarity::PathFinder(3) WordNet::Similarity::ICFinder(3) WordNet::Similarity::res(3) WordNet::Similarity::lin(3) WordNet::Similarity::jcn(3)
COPYRIGHT
Copyright (C) 2004, Jason Michelizzi, Siddharth Patwardhan, and Ted Pedersen
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to
The Free Software Foundation, Inc.,
59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.
Note: a copy of the GNU General Public License is available on the web at http://www.gnu.org/licenses/gpl.txt and is included in this distribution as GPL.txt.