NAME

install.pod - how to install WordNet::Similarity

SYNOPSIS

perl Makefile.PL

make

make test

make install

DESCRIPTION

Prerequisites

You need to have WordNet (version 1.7.1 or later, 2.0 preferred), WordNet::QueryData (version 1.30 or later), and Text::OverlapFinder installed.

WordNet is available at http://www.cogsci.princeton.edu/~wn/ and WordNet::QueryData is available from http://search.cpan.org/dist/WordNet-QueryData/. Text::OverlapFinder is distributed with the Text-Similarity package and is available from http://search.cpan.org/dist/Text-Similarity/.

You should set the WNHOME environment variable to the location where you have WordNet installed; see the WordNet::QueryData documentation for more information.
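A quick check before running perl Makefile.PL can catch a wrong WNHOME early. The following is only a sketch: the helper name check_wnhome is hypothetical, the path is an example, and it assumes the WordNet database files sit in a "dict" subdirectory of WNHOME.

```shell
# Hypothetical helper: report whether a directory looks like a WordNet install.
# Assumption: the database files live in a "dict" subdirectory of WNHOME.
check_wnhome() {
    if [ -d "$1/dict" ]; then
        echo "ok: $1"
    else
        echo "missing: $1/dict"
        return 1
    fi
}

# Example location only; use the path where WordNet is installed on your system.
WNHOME=/usr/local/WordNet-2.0
export WNHOME
check_wnhome "$WNHOME" || echo "adjust WNHOME before running perl Makefile.PL"
```

The export line uses Bourne-style shell syntax; csh-style shells use setenv instead.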

Installing

The usual way to install the package is to run the following commands:

perl Makefile.PL

make

make test

make install

If you can't set the WNHOME environment variable, you can use the WNHOME option when running perl Makefile.PL. For example,

perl Makefile.PL WNHOME=/usr/local/WordNet-2.0

You will often need root (superuser) privileges to run make install. The module can also be installed locally. To do a local install, you need to specify a PREFIX option when you run 'perl Makefile.PL'. For example,

perl Makefile.PL PREFIX=/home/sid

or

perl Makefile.PL LIB=/home/sid/lib PREFIX=/home/sid

will install Similarity into /home/sid. The first method above will install the modules in /home/sid/lib/perl5/site_perl/5.8.3 (assuming you are using version 5.8.3 of Perl; otherwise, the directory will be slightly different). The second method will install the modules in /home/sid/lib. In either case the executable scripts will be installed in /home/sid/bin and the man pages will be installed in /home/sid/share.

Warning: do not put a dash or hyphen in front of PREFIX or LIB.

In Perl programs that you write using the modules, you may need to add a line like

use lib '/home/sid/lib/perl5/site_perl/5.8.3';

if you used the first method or

use lib '/home/sid/lib';

if you used the second method. By doing this, the installed modules are found by your program. To run the similarity.pl program, you would need to do

perl -I/home/sid/lib/perl5/site_perl/5.8.3 similarity.pl

or

perl -I/home/sid/lib similarity.pl

Of course, you could also add the 'use lib' line to the top of the program yourself, but you might not want to do that. You will need to replace 5.8.3 with whatever version of Perl you are using.

The preceding instructions should be sufficient for standard and slightly non-standard installations. However, if you need to modify other makefile options, you should look at the ExtUtils::MakeMaker documentation. Modifying other makefile options is not recommended unless you really, absolutely, and completely know what you're doing!
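Because the directory used with the first (PREFIX-only) method depends on the Perl version, it can be derived rather than typed by hand. This is a minimal sketch: the helper name site_perl_dir is hypothetical, and it assumes the PREFIX/lib/perl5/site_perl/VERSION layout described above, which may differ on your system.

```shell
# Hypothetical helper: build the module directory for a PREFIX-only install.
# Assumption: modules land in PREFIX/lib/perl5/site_perl/VERSION.
site_perl_dir() {
    echo "$1/lib/perl5/site_perl/$2"
}

# Ask the installed perl for its version string (for example 5.8.3),
# then print the command line that would run similarity.pl.
if command -v perl >/dev/null 2>&1; then
    version=$(perl -MConfig -e 'print $Config{version}')
    echo "perl -I$(site_perl_dir /home/sid "$version") similarity.pl"
fi
```

The script only prints the command; check that the directory exists before relying on it.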

NOTE: The information-content based measures (res, lin, jcn) use the default information content file generated when the modules are installed. If the version of WordNet on your system has changed since then, or the modules cannot locate the default information content files, alternate information content files can be specified only through a configuration file for each module. The format and creation of configuration files are discussed in the documentation. Utilities to generate information content files are provided in the package.

NOTE: If one (or more) of the tests run by 'make test' fails, you will see a summary of the tests that failed, followed by a message of the form "make: *** [test_dynamic] Error Y" where Y is a number between 1 and 255 (inclusive). If the number is less than 255, then it indicates how many tests failed (if more than 254 tests failed, then 254 will still be shown). If one or more tests died, then 255 will be shown. For more details, see:

http://search.cpan.org/dist/Test-Simple/lib/Test/Builder.pm#EXIT_CODES
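The meaning of Y in the note above can be summarized as a small decision rule. The function below is an illustrative sketch only; its name is hypothetical, and it takes the number Y copied from the "Error Y" message (not the exit status of make itself).

```shell
# Hypothetical helper: explain the number Y from "make: *** [test_dynamic] Error Y".
describe_test_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "all tests passed"
    elif [ "$status" -eq 255 ]; then
        echo "one or more tests died"
    elif [ "$status" -eq 254 ]; then
        # 254 is a ceiling: the real number of failures may be higher.
        echo "254 or more tests failed"
    else
        echo "$status test(s) failed"
    fi
}

describe_test_status 3
```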

System Requirements

  1. Perl version 5.6 or later. This package has been written in Perl, which is freely available from www.perl.org. This package assumes that perl is installed in the directory /usr/local/bin. If this is where perl is on your computer, then the support programs can be run directly at the command line as 'similarity.pl ...' or 'semCorFreq.pl...', etc. However, if Perl is not installed at this location, you would need to explicitly invoke them as 'perl similarity.pl ... ' or 'perl freqCount.pl...', etc.

  2. WordNet: All the measures are based on WordNet. WordNet must be installed on your system. WordNet is freely downloadable from http://www.cogsci.princeton.edu/~wn/. WordNet version 2.0 was used during the development and testing of the package; however, it should work with other versions of WordNet as well. The WordNet::QueryData Perl module is used to access WordNet. This module requires that an environment variable 'WNHOME', containing the path to the WordNet files, be set up. For further details, please see the WordNet::QueryData documentation.

  3. WordNet::QueryData: This is the Perl interface to WordNet written by Jason Rennie. QueryData should be accessible on the @INC path of Perl. (It can be freely downloaded from http://search.cpan.org/dist/WordNet-QueryData/). QueryData 1.31 was used during development. We also observed that, due to some major changes in QueryData from its previous versions, this software does not work with earlier versions of QueryData. If you have an earlier version of QueryData (1.29 or earlier), you may need to upgrade QueryData.

  4. Text::OverlapFinder: This module is used by the WordNet::Similarity::lesk measure for finding runs of word overlaps in glosses. The Text::OverlapFinder module can be downloaded from http://search.cpan.org/dist/Text-Similarity/.

Optional Tests

Running 'make test' after make will run a short series of tests. These tests should not take more than a few minutes to run. There is another series of more rigorous tests that may also be run; however, these tests can take quite some time to run (over an hour on some machines). To run these tests, run 'make test_all'.

Example

The following procedure will work on most Linux systems.

cd /tmp
wget http://www.cogsci.princeton.edu/2.0/WordNet-2.0.tar.gz
wget http://search.cpan.org/CPAN/authors/id/J/JR/JRENNIE/WordNet-QueryData-1.35.tar.gz
wget http://search.cpan.org/CPAN/authors/id/J/JA/JASONM/Text-Similarity-0.02.tar.gz
wget http://search.cpan.org/CPAN/authors/id/T/TP/TPEDERSE/WordNet-Similarity-0.12.tar.gz

Then unpack each one:

tar -zxvf WordNet-2.0.tar.gz
tar -zxvf WordNet-QueryData-1.35.tar.gz
tar -zxvf Text-Similarity-0.02.tar.gz
tar -zxvf WordNet-Similarity-0.12.tar.gz

Install WordNet:

cd WordNet-2.0

Unfortunately, you have to manually edit the Makefile yourself, but you probably only need to make one change. Open the Makefile with your favorite text editor (emacs, for example):

emacs Makefile

Then find the section that looks like this:

PLATFORM = solaris
#PLATFORM = irix
#PLATFORM = linux

If you are using Linux, comment out the first line and uncomment the last line so that it looks like this:

#PLATFORM = solaris
#PLATFORM = irix
PLATFORM = linux

Exit the editor.
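If you would rather not edit the file by hand, the same change can be sketched with sed. This is an assumption-laden alternative, not part of the WordNet instructions: the function name is hypothetical, and it only works if the three PLATFORM lines appear in the Makefile exactly as shown above.

```shell
# Hypothetical helper: print a copy of the given Makefile with the
# solaris line commented out and the linux line uncommented.
# Assumption: the lines read exactly "PLATFORM = solaris" and "#PLATFORM = linux".
select_linux_platform() {
    sed -e 's/^PLATFORM = solaris/#PLATFORM = solaris/' \
        -e 's/^#PLATFORM = linux/PLATFORM = linux/' "$1"
}

# Possible usage (writes to a new file first so the original is not lost mid-edit):
#   select_linux_platform Makefile > Makefile.linux && mv Makefile.linux Makefile
```

Inspect the result before building, since a Makefile with different spacing would pass through unchanged.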

Then just type:

make BinWorld

You will need root privileges to install.

su
make install
exit

Installing QueryData and Similarity is much easier:

cd /tmp/WordNet-QueryData-1.35
perl Makefile.PL
make
make test
su
make install
exit

cd /tmp/Text-Similarity-0.02
perl Makefile.PL
make
make test
su
make install
exit

cd /tmp/WordNet-Similarity-0.12
perl Makefile.PL
make
make test
su
make install
exit
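The three Perl distributions above go through the same unprivileged steps, so those steps could be wrapped in one function. This is a rough sketch with a hypothetical function name; it deliberately leaves 'make install' as a separate step to be run with root privileges.

```shell
# Hypothetical helper: configure, build, and test one unpacked distribution.
# Runs in a subshell so the caller's working directory is unchanged,
# and stops at the first step that fails.
build_and_test() {
    ( cd "$1" && perl Makefile.PL && make && make test )
}

# Possible usage, in dependency order:
#   build_and_test /tmp/WordNet-QueryData-1.35
#   build_and_test /tmp/Text-Similarity-0.02
#   build_and_test /tmp/WordNet-Similarity-0.12
```

Each distribution still has to be installed before the next one is built, because the later packages depend on the earlier ones.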

AUTHORS

Siddharth Patwardhan, University of Utah, Salt Lake City
sidd at cs.utah.edu

Jason Michelizzi, University of Minnesota Duluth
mich0212 at d.umn.edu

Ted Pedersen, University of Minnesota Duluth
tpederse at d.umn.edu

SEE ALSO

intro.pod modules.pod

COPYRIGHT AND LICENSE

Copyright (C) 2003-2004 Siddharth Patwardhan, Ted Pedersen, and Jason Michelizzi

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.

Note: a copy of the GNU Free Documentation License is available on the web at http://www.gnu.org/copyleft/fdl.html and is included in this distribution as FDL.txt.