README for DTA::CAB
ABSTRACT
DTA::CAB - "Cascaded Analysis Broker" for error-tolerant linguistic analysis
REQUIREMENTS
- Perl Modules
    See Makefile.PL, META.json, and/or META.yml in the distribution directory. Perl dependencies should be available on CPAN.

    Additional Perl modules may be required by particular DTA::CAB::Analyzer subclasses. If you see errors like

        Can't locate foo.pm in @INC (you may need to install the foo module)

    ... then you should probably first try looking for the foo module on CPAN.

- External Web-Service
    If you just want to use the client libraries to query an external DTA::CAB web-service, you'll need only the URL for that service and an active internet connection. See the DTA::CAB Web-Service HOWTO for an introduction.

- Language Resources
    If you want to do anything other than querying an external DTA::CAB web-service, you'll need a small menagerie of gfsm transducers and various other language(-variant)-specific resources which are not included in this distribution, and for which (presumably) there exists no "one-size-fits-all" solution. Look at the documentation and code of the individual DTA::CAB::Analyzer subclasses you're interested in for more details.
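For the Perl-module requirement above, it can help to check up front which modules your perl can actually load, rather than waiting for a "Can't locate ..." error at run time. The following is a minimal sketch, not part of this distribution: the helper name have_perl_module and the default module list are illustrative assumptions, and the authoritative dependency list remains the one in Makefile.PL and META.json.

```shell
#!/bin/sh
# Sketch: check whether perl can load each module named on the command line.
# The default module list is illustrative only, not the distribution's
# actual dependency list (see Makefile.PL / META.json for that).

# have_perl_module NAME: succeed if the current perl can load module NAME.
# (Hypothetical helper, defined here for illustration.)
have_perl_module() {
    # `perl -MNAME -e 1` fails with "Can't locate ..." when NAME is missing.
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Fall back to an example list when no arguments are given.
[ "$#" -gt 0 ] || set -- strict No::Such::Module::Example

for mod in "$@"; do
    if have_perl_module "$mod"; then
        echo "ok:      $mod"
    else
        echo "missing: $mod (try: cpan $mod)"
    fi
done
```

Saved under a name of your choice, the script takes the module names to check as arguments. Anything reported as missing is a candidate for installation from CPAN, as described above.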
DESCRIPTION
The DTA::CAB package provides an object-oriented compiler/interpreter for error-tolerant heuristic morphological analysis of tokenized text.
INSTALLATION
Issue the following commands to the shell:
    bash$ cd DTA-CAB-0.01     # (or wherever you unpacked this distribution)
    bash$ perl Makefile.PL    # check requirements, etc.
    bash$ make                # build the module
    bash$ make test           # (optional): test module before installing
    bash$ make install        # install the module on your system
REFERENCES
If you use this service in an academic context, please include the following citation in any related publications:
Jurish, Bryan. Finite-state Canonicalization Techniques for Historical German. PhD thesis, Universität Potsdam, 2012 (defended 2011). URN urn:nbn:de:kobv:517-opus-55789.
See here for a list of other CAB-related publications.
SEE ALSO
The CAB software page is the top-level repository for CAB documentation, news, etc.
The DTA::CAB manual page contains a basic introduction to the CAB architecture.
The DTA::CAB::Format manual page describes the abstract CAB I/O Format API, and includes a list of supported format classes.
The DTA::CAB::HttpProtocol manual page describes the conventions used by the CAB web-service API.
The DTA 'Base Format' Guidelines (DTABf) describes the subset of the TEI encoding guidelines which can reasonably be expected to be handled gracefully by the CAB TEI and/or TEIws formatters.
AUTHOR
Bryan Jurish <moocow@cpan.org>