NAME
DTA::CAB::Analyzer::TokPP - type-level heuristic token preprocessor (for punctuation etc)
SYNOPSIS
##========================================================================
## PRELIMINARIES
use DTA::CAB::Analyzer::TokPP::Perl;
##========================================================================
## Methods
$obj = CLASS_OR_OBJ->new(%args);
$bool = $anl->ensureLoaded();
$doc = $tpp->analyzeTypes($doc,\%types,\%opts);
DESCRIPTION
DTA::CAB::Analyzer::TokPP::Perl provides a pure-Perl DTA::CAB::Analyzer interface to some simple text-based, type-wise word analysis heuristics, e.g. for detection of punctuation, numeric strings, etc.
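As a rough illustration of what such text-based heuristics can look like, the sketch below classifies a single surface string with two regular expressions. The function name classify_type, the class labels PUNCT and NUM, and the patterns themselves are assumptions made for this example; they are not taken from the module and the module's actual rules may differ.

```perl
use strict;
use warnings;

## classify_type($text): hypothetical helper, NOT part of DTA::CAB.
## Returns a (possibly empty) list of coarse class labels for one string.
sub classify_type {
  my ($text) = @_;
  my @classes;
  ## one or more Unicode punctuation characters, nothing else
  push @classes, 'PUNCT' if $text =~ /^\p{Punct}+$/;
  ## optionally signed digit groups separated by '.' or ','
  push @classes, 'NUM' if $text =~ /^[+-]?\d+(?:[.,]\d+)*$/;
  return @classes;
}

my @punct = classify_type('...');      ## punctuation-only string
my @num   = classify_type('1,234.5');  ## numeric string
my @word  = classify_type('Haus');     ## ordinary word: no class assigned
```

Because the rules look only at the string itself, the result for a given type is the same wherever it occurs in a document.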
Methods
- new
$obj = CLASS_OR_OBJ->new(%args);
%$obj, %args:
label => $label, ##-- analyzer label; default='tokpp'
- ensureLoaded
$bool = $anl->ensureLoaded();
Ensures analysis data is loaded. Always returns 1.
Methods: Analysis
- analyzeTypes
$doc = $tpp->analyzeTypes($doc,\%types,\%opts);
Performs type-wise analysis of all (text) types in values(%types). This override sets:
$tok->{$anl->{label}} = \@morphHiStrings
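The type-wise pattern described above can be sketched without the DTA::CAB classes. In the example below, the plain hash standing in for the analyzer, the 'text' key on each type, the function name analyze_types_sketch, and the PUNCT and NUM labels are all assumptions for illustration; only the 'label' option and the assignment to $tok->{$anl->{label}} come from this documentation.

```perl
use strict;
use warnings;

## Hypothetical stand-in for an analyzer object; only 'label' is documented.
my $anl = { label => 'tokpp' };

## Assumed layout: one token-like hash per distinct surface string.
my %types = map { ($_ => { text => $_ }) } ('Haus', ',', '42', '...');

## Sketch only: unlike analyzeTypes(), this takes no $doc or %opts
## and returns the types hash rather than a document.
sub analyze_types_sketch {
  my ($anl, $types) = @_;
  my $label = $anl->{label};
  foreach my $tok (values %$types) {
    my @analyses;
    push @analyses, 'PUNCT' if $tok->{text} =~ /^\p{Punct}+$/;
    push @analyses, 'NUM'   if $tok->{text} =~ /^\d+$/;
    ## store the results under the analyzer's label, as described above
    $tok->{$label} = \@analyses;
  }
  return $types;
}

analyze_types_sketch($anl, \%types);
```

Each distinct type is visited once, however often it occurs in the document, which is the main point of working on values(%types) rather than on individual tokens.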
AUTHOR
Bryan Jurish <moocow@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2010-2019 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.
SEE ALSO
dta-cab-analyze.perl(1), DTA::CAB::Analyzer(3pm), DTA::CAB::Chain(3pm), DTA::CAB(3pm), perl(1), ...