NAME
DTA::CAB::Analyzer::Morph::SMOR - morphological analysis via Gfsm automata, for SMOR-style transducers (e.g. Zmorge)
SYNOPSIS
$morph = DTA::CAB::Analyzer::Morph::SMOR->new(%args);
$morph->analyze($tok);
DESCRIPTION
DTA::CAB::Analyzer::Morph::SMOR is a subclass of DTA::CAB::Analyzer::Morph::Helsinki::DE suitable for use with SMOR-style transducers, including Zmorge transducers as produced by the SMORLemma grammar.
To produce a GFSM transducer (zmorge.gfst) and vocabulary (zmorge.lab) suitable for use with this module from one of the binary SFST-format transducers available from https://pub.cl.uzh.ch/users/sennrich/zmorge/, do something like the following (on Debian, at least):
sudo apt-get install sfst unzip wget sed gawk
unzip zmorge-20150315-smor_newlemma.a.zip
fst-print zmorge-20150315-smor_newlemma.a | sed 's/ /_/g;' > zmorge.tfst
cat zmorge.tfst \
 | awk -F$'\t' '{ if (NF >= 4) { print $3 "\n" $4 } }' \
 | sed 's/^<>$//;' \
 | sort -u \
 | sed 's/^$/<>/;' \
 | awk '{print $1 "\t" NR-1}' \
 > zmorge.lab
gfsmcompile -z0 -l zmorge.lab zmorge.tfst | gfsminvert -z0 | gfsmarcsort -l -F zmorge.gfst
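To see what the label-file pipeline above does, it can help to run the same steps on a toy input. The following is only an illustrative sketch: the file names toy.tfst and toy.lab and the four-column arc listing are made up for the example, and it assumes (as the NF >= 4 test in the awk filter suggests) that arc lines carry their two symbols in columns 3 and 4. LC_ALL=C is set here only to make the sort order reproducible; it is not part of the recipe above.

```shell
#!/bin/sh
# Assumption: force byte-wise collation so the sort order is reproducible.
LC_ALL=C; export LC_ALL

# Hypothetical stand-in for zmorge.tfst: tab-separated arcs
# (source, target, symbol, symbol) plus one short line that the filter skips.
printf '0\t1\ta\ta\n1\t2\t<>\t<+NN>\n2\t3\tb\t<>\n3\n' > toy.tfst

# Same label-extraction steps as above, applied to the toy file.
# (-F'\t' is used instead of -F$'\t' so the sketch also runs under plain sh.)
awk -F'\t' '{ if (NF >= 4) { print $3 "\n" $4 } }' toy.tfst \
 | sed 's/^<>$//;' \
 | sort -u \
 | sed 's/^$/<>/;' \
 | awk '{print $1 "\t" NR-1}' \
 > toy.lab

# Show the resulting symbol-to-index table.
cat toy.lab
```

In the resulting table, `<>` is listed first with index 0, followed by the remaining symbols in sorted order. The two sed calls achieve this by temporarily mapping `<>` to the empty string, which always sorts first; presumably this is done because gfsm reserves label 0 for the epsilon symbol.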
You can then test the compiled transducer with this module by calling e.g.:
dta-cab-analyze.perl -ac=Morph::SMOR -ao=fstFile=zmorge.gfst -ao=labFile=zmorge.lab -fc=text -w Vermittlungsgespräche
which should produce something like the following output:
Vermittlungsgespräche
+[morph] Vermittlungsgespräch[_NN]=Vermittl[<~>]ungs[<#>]gespräch[<+NN>][<Neut>][<Acc>][<Pl>] <0>
+[morph] Vermittlungsgespräch[_NN]=Vermittl[<~>]ungs[<#>]gespräch[<+NN>][<Neut>][<Dat>][<Sg>][<Old>] <0>
+[morph] Vermittlungsgespräch[_NN]=Vermittl[<~>]ungs[<#>]gespräch[<+NN>][<Neut>][<Gen>][<Pl>] <0>
+[morph] Vermittlungsgespräch[_NN]=Vermittl[<~>]ungs[<#>]gespräch[<+NN>][<Neut>][<Nom>][<Pl>] <0>
+[morph] Vermittlungsgespräch[_NN]=Vermittlung[<->]s[<#>]gespräch[<+NN>][<Neut>][<Acc>][<Pl>] <0>
+[morph] Vermittlungsgespräch[_NN]=Vermittlung[<->]s[<#>]gespräch[<+NN>][<Neut>][<Dat>][<Sg>][<Old>] <0>
+[morph] Vermittlungsgespräch[_NN]=Vermittlung[<->]s[<#>]gespräch[<+NN>][<Neut>][<Gen>][<Pl>] <0>
+[morph] Vermittlungsgespräch[_NN]=Vermittlung[<->]s[<#>]gespräch[<+NN>][<Neut>][<Nom>][<Pl>] <0>
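Output of this shape can be post-processed with standard text tools. The following is a minimal sketch, not part of the DTA::CAB distribution: it assumes that each analysis sits on a single line, that the string before `[_NN]` is the lemma and `NN` its tag (as the sample suggests), and it uses the made-up file names sample.txt and lemmas.txt. It reduces two of the analyses shown above to their distinct lemma/tag pairs.

```shell
#!/bin/sh
# Two of the analyses from the sample output, each joined onto one line.
cat > sample.txt <<'EOF'
Vermittlungsgespräche
+[morph] Vermittlungsgespräch[_NN]=Vermittl[<~>]ungs[<#>]gespräch[<+NN>][<Neut>][<Acc>][<Pl>] <0>
+[morph] Vermittlungsgespräch[_NN]=Vermittlung[<->]s[<#>]gespräch[<+NN>][<Neut>][<Dat>][<Sg>][<Old>] <0>
EOF

# Keep only the +[morph] lines (optional leading whitespace is tolerated),
# cut out "lemma tag" from "lemma[_tag]=...", and drop duplicates.
grep -E '^[[:space:]]*\+\[morph\] ' sample.txt \
 | sed -E 's/^[[:space:]]*\+\[morph\] ([^[]+)\[_([^]]+)\]=.*$/\1 \2/' \
 | sort -u \
 > lemmas.txt

cat lemmas.txt
```

Both analyses share the same lemma and tag, so they collapse to a single line here; the differing segmentations and inflectional features are deliberately discarded by this filter.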
AUTHOR
Bryan Jurish <moocow@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2021 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.