Todo List for WordNet-SenseRelate-AllWords


Plans for future versions of AllWords, including, driver, and web interface.


For version 0.08

  1. Investigate tagged text further with WordNet::Similarity::jcn and, more generally, tagged text with all options. When run via web interface of, POS-tagged text and jcn together produced output of the form word#pos, with no sense indicated. Problem lies in

  2. Add an option in web interface to upload context as a file. File should be one line per sentence, one sentence per line.

  3. Add an option to web interface for uploading config file for WordNet-Similarity relatedness measure. This will need to be passed through to WordNet-Similarity.

  4. The --compound option is not working with, should repair as a user might want to use a subset of the compounds in WordNet.

  5. Investigate stoplist handling for tagged text. (What is the problem here?)

  6. Change from raw and parsed to a single plain format. The plain format will assume that the text has already been tokenized, and that each white space separated string is a token (word or punctuation mark). The format will further assume that sentence boundary detection has already been performed.

  7. Move sentence splitter from to util program. All input formats will assume that input is one sentence per line, one line per sentence.

  8. Expand the set of POS tags that can be used, either by modifying or allowing the user to submit a config file of some kind defining the tag set. At present this is limited to the Penn TreeBank set of 47 tags.

  9. Return codes from to indicate whether no relatedness was found, the word is a stopword, or the word is not in WordNet.

  10. Return codes that identify the trace level (to enable color coding).

  11. Graceful shutdown and restart of web server. Allow for a stop or restart command from the command line, rather than having to kill the process.

  12. Add proper logging for the server. (What does this mean?)

  13. Make a design decision about whether web interface should communicate directly with via disambiguation method, or should use command.

  14. Develop methods for testing web interface, or at least directly testing disambiguate method as used by web interface. Should have test cases that can be run to demonstrate problems as listed here, and also make sure that once they are fixed they stay fixed.

  15. Expand the testing in /t for and Right now it's quite minimal, and has very limited coverage. We should have multiple .t files, organized in some way to indicate what kind of testing we are doing, maybe based on format and then options being used, as in tagged.t for testing pos tagged data, wntagged.t for wordnet tagged data, raw.t or plain.t for that format, and so on. There is no reason for the testing to be confined to one file per module or program, as it is now.
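
As an illustration of the plain format proposed in item 6, together with the one-sentence-per-line assumption in item 7, here is a minimal sketch of a reader for that format. It is written in Python purely for illustration (the package itself is Perl), and the function name read_plain is hypothetical, not part of AllWords. It assumes tokenization and sentence boundary detection have already been done, and it skips blank lines, which is a choice made for this sketch rather than something the plan specifies.

```python
def read_plain(text):
    """Parse plain-format input (hypothetical helper, not an AllWords API).

    Assumes one sentence per line, with every whitespace-separated
    string on a line being one token (a word or a punctuation mark).
    """
    sentences = []
    for line in text.splitlines():
        tokens = line.split()  # split on any run of white space
        if tokens:             # assumption: blank lines carry no sentence
            sentences.append(tokens)
    return sentences


sample = "The bank raised interest rates .\nShe sat on the river bank .\n"
for tokens in read_plain(sample):
    print(tokens)
```

Under this sketch, any sentence splitting happens in a separate step before the reader is called, which matches the plan of moving the splitter into a utility program.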


Varada Kolhatkar, University of Minnesota, Duluth
kolha002 at

Ted Pedersen, University of Minnesota, Duluth
tpederse at

This document last modified by : $Id: TODO.pod,v 1.1 2008/03/14 01:27:31 tpederse Exp $



Copyright (c) 2008, Varada Kolhatkar, Ted Pedersen, Jason Michelizzi

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.

Note: a copy of the GNU Free Documentation License is available on the web at and is included in this distribution as FDL.txt.