NAME

DTA::CAB::Chain::EN - DTA-like analysis chain class for contemporary English

SYNOPSIS

use DTA::CAB::Chain::EN;

##========================================================================
## Methods

$obj = CLASS_OR_OBJ->new(%args);
$ach = $ach->setupChains();
$bool = $ach->ensureLoaded();
$bool = $anl->doAnalyze(\%opts, $name);
$doc = $ach->analyzeClean($doc,\%opts);

DESCRIPTION

DTA::CAB::Chain::EN is a DTA::CAB::Analyzer subclass with a DTA::CAB::Chain::DTA-like naming scheme, suitable for analyzing contemporary English input. This class inherits from DTA::CAB::Chain::Multi. See the "setupChains" method for a list of supported sub-chains and the corresponding analyzers.

Methods

new
$obj = CLASS_OR_OBJ->new(%args);

%$obj, %args:

##-- paranoia
autoClean => 0,  ##-- always run 'clean' analyzer regardless of options; checked in both doAnalyze(), analyzeClean()
defaultChain => 'default',
##
##-- overrides
chains => undef, ##-- see setupChains() method
chain => undef, ##-- see setupChains() method
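
As a minimal sketch of how these options might be passed to the constructor: the option names come from the listing above, while the chosen values are illustrative assumptions. The class is loaded inside an eval so that the sketch also runs where DTA::CAB is not installed.

```perl
use strict;
use warnings;

## Load the class only if it is available; otherwise just report and skip.
my $have_cab = eval { require DTA::CAB::Chain::EN; 1 };

my $ach;
if ($have_cab) {
  $ach = DTA::CAB::Chain::EN->new(
    autoClean    => 1,       ##-- always run the 'clean' analyzer
    defaultChain => 'norm',  ##-- assumption: a chain name listed under setupChains()
  );
  print "autoClean=$ach->{autoClean} defaultChain=$ach->{defaultChain}\n";
} else {
  print "DTA::CAB::Chain::EN not installed; skipping\n";
}
```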

Additionally, the following sub-analyzers are defined as fields of %$obj:

tokpp

Token preprocessor, a DTA::CAB::Analyzer::TokPP object.

xlit

Transliterator, a DTA::CAB::Analyzer::Unicruft object.

morph

Morphological analyzer (Helsinki-style with TAGH emulation hacks), a DTA::CAB::Analyzer::Morph::Helsinki::EN object.

mlatin

Latin pseudo-morphology, a DTA::CAB::Analyzer::Morph::Latin object.

msafe

Morphological security heuristics, a DTA::CAB::Analyzer::MorphSafe object.

moot

HMM part-of-speech tagger, a DTA::CAB::Analyzer::Moot object.

mootsub

Post-processing for "moot" tagger, a DTA::CAB::Analyzer::MootSub object.

clean

Janitor (paranoid removal of internal temporary data), a DTA::CAB::Analyzer::DTAClean object.

setupChains
$ach = $ach->setupChains();

Sets up the default named sub-chains in $ach->{chains}. Currently defines a singleton chain sub.NAME for each analyzer key in keys(%$ach), as well as the following non-trivial chains:

'sub.sent'       =>[@$ach{qw(moot  mootsub)}],
'sub.sent1'      =>[@$ach{qw(moot1 mootsub)}],
##
'default.tokpp'  =>[@$ach{qw(tokpp)}],
'default.xlit'   =>[@$ach{qw(xlit)}],
'default.morph'  =>[@$ach{qw(tokpp xlit morph)}],
'default.mlatin' =>[@$ach{qw(tokpp xlit       mlatin)}],
'default.msafe'  =>[@$ach{qw(tokpp xlit morph mlatin msafe)}],
'default.langid' =>[@$ach{qw(tokpp xlit morph mlatin msafe langid)}],
'default.moot'   =>[@$ach{qw(tokpp xlit morph mlatin msafe langid moot)}],
'default.moot1'  =>[@$ach{qw(tokpp xlit morph mlatin msafe langid moot1)}],
'default.lemma'  =>[@$ach{qw(tokpp xlit morph mlatin msafe langid moot  mootsub)}],
'default.lemma1' =>[@$ach{qw(tokpp xlit morph mlatin msafe langid moot1 mootsub)}],
'default.base'   =>[@$ach{qw(tokpp xlit morph mlatin msafe langid)}],
'default.type'   =>[@$ach{qw(tokpp xlit morph mlatin msafe langid)}],
##
'norm'           =>[@$ach{qw(tokpp xlit morph mlatin msafe langid moot  mootsub)}],
'norm1'          =>[@$ach{qw(tokpp xlit morph mlatin msafe langid moot1 mootsub)}],
'all'            =>[@$ach{qw(tokpp xlit morph mlatin msafe langid moot  mootsub)}], ##-- old dta clients use 'all'!
'clean'          =>[@$ach{qw(clean)}],
'null'           =>[$ach->{null}],
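
The chain definitions above rely on Perl hash slices over the object reference. The following stand-alone sketch illustrates that idiom only: the hash $ach here is a hypothetical stand-in in which plain strings replace the real analyzer objects, and it covers just a few of the keys listed above.

```perl
use strict;
use warnings;

## Hypothetical stand-in for the chain object: strings instead of analyzer objects.
my $ach = { map { ($_ => "<$_>") } qw(tokpp xlit morph mlatin msafe) };

## A hash slice on the reference yields the values in the order the keys are given.
my %chains = (
  'default.morph'  => [ @$ach{qw(tokpp xlit morph)} ],
  'default.mlatin' => [ @$ach{qw(tokpp xlit mlatin)} ],
);

## Singleton chain 'sub.NAME' for each analyzer key, as described above.
$chains{"sub.$_"} = [ $ach->{$_} ] foreach (keys %$ach);

print join(' ', @{ $chains{'default.morph'} }), "\n";
```

Since each chain is an array of references to the same analyzer objects, the named chains share their sub-analyzers rather than copying them.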

ensureLoaded
$bool = $ach->ensureLoaded();

Ensures that analysis data is loaded from the default files. The override inherited from DTA::CAB::Chain::Multi calls ensureChain() before the inherited method. As a hack, the chain sub-analyzers (rwsub, dmootsub) are copied only AFTER their own sub-analyzers have been loaded, and 'enabled' is set only then, if appropriate.

doAnalyze
$bool = $anl->doAnalyze(\%opts, $name);

Alias for $anl->can("analyze${name}") && (!exists($opts{"doAnalyze${name}"}) || $opts{"doAnalyze${name}"}). The override in this class additionally checks the $anl->{autoClean} flag.
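
The check described above can be sketched with a small mock class. My::MockChain is a hypothetical stand-in, not part of DTA::CAB, and the way autoClean is applied here (forcing the 'Clean' step regardless of the options) is an assumption based on the description of the autoClean field, not the actual implementation.

```perl
use strict;
use warnings;

package My::MockChain;  ##-- hypothetical stand-in, not part of DTA::CAB

sub new { my ($class, %args) = @_; return bless({ autoClean => 0, %args }, $class); }
sub analyzeClean { return $_[1]; }  ##-- no-op, so that can('analyzeClean') succeeds

sub doAnalyze {
  my ($anl, $opts, $name) = @_;
  ## assumption: autoClean forces the 'Clean' step regardless of %$opts
  return 1 if ($anl->{autoClean} && $name eq 'Clean');
  ## documented alias: the method must exist, and the option must be unset or true
  return ($anl->can("analyze${name}")
          && (!exists($opts->{"doAnalyze${name}"}) || $opts->{"doAnalyze${name}"})) ? 1 : 0;
}

package main;

my $plain = My::MockChain->new();
my $auto  = My::MockChain->new(autoClean => 1);
my %off   = (doAnalyzeClean => 0);

print $plain->doAnalyze({},    'Clean'), "\n";  ##-- method exists, option unset
print $plain->doAnalyze(\%off, 'Clean'), "\n";  ##-- disabled via option
print $auto->doAnalyze(\%off,  'Clean'), "\n";  ##-- autoClean overrides the option
print $plain->doAnalyze({},    'Bogus'), "\n";  ##-- no analyzeBogus() method
```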

analyzeClean
$doc = $ach->analyzeClean($doc,\%opts);

Cleans up any temporary data associated with $doc. The Chain default implementation calls $a->analyzeClean for each analyzer $a in the chain, then the superclass Analyzer->analyzeClean. The local override checks $ach->{autoClean}.

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2016-2019 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

dta-cab-analyze.perl(1), DTA::CAB::Chain::Multi(3pm), DTA::CAB::Chain(3pm), DTA::CAB::Analyzer(3pm), DTA::CAB(3pm), perl(1), ...
