README for DTA::CAB

ABSTRACT

DTA::CAB - "Cascaded Analysis Broker" for error-tolerant linguistic analysis

REQUIREMENTS

Perl Modules

See Makefile.PL, META.json, and/or META.yml in the distribution directory. Perl dependencies should be available on CPAN.

Additional Perl modules may be required by particular DTA::CAB::Analyzer subclasses. If you see errors like

Can't locate foo.pm in @INC (you may need to install the foo module)

... then you should probably first try looking for the foo module on on CPAN.

External Web-Service

If you just want to use the client libraries to query an external DTA::CAB web-service, you'll need only the URL for that service and an active internet connection. See the DTA::CAB Web-Service HOWTO for an introduction.

Language Resources

If you want to do anything other than querying an external DTA::CAB web-service, you'll need a small menagerie of gfsm transducers and various assorted other language(-variant)-specific resources which are not included in this distribution, and for which (presumably) there exists no "one-size-fits-all" solution. Look at the documentation and code of the individual DTA::CAB::Analyzer subclasses you're interested in for more details.

DESCRIPTION

The DTA::CAB package provides an object-oriented compiler/interpreter for error-tolerant heuristic morphological analysis of tokenized text.

INSTALLATION

Issue the following commands to the shell:

bash$ cd DTA-CAB-0.01   # (or wherever you unpacked this distribution)
bash$ perl Makefile.PL  # check requirements, etc.
bash$ make              # build the module
bash$ make test         # (optional): test module before installing
bash$ make install      # install the module on your system

REFERENCES

If you use this service in an academic context, please include the following citation in any related publications:

  • Jurish, Bryan. Finite-state Canonicalization Techniques for Historical German. PhD thesis, Universität Potsdam, 2012 (defended 2011). URN urn:nbn:de:kobv:517-opus-55789, [online, PDF, BibTeX]

See here for a list of other CAB-related publications.

SEE ALSO

AUTHOR

Bryan Jurish <moocow@cpan.org>