Changes for version 0.08055 - 2012-02-07

  • use new version of MSTperl parser and Treex::Core

Modules

abstract ancestor for parallel-corpora document readers
abstract ancestor for parallel-corpora document readers
segment text on newlines
language-independent rule-based tokenizer
Base tokenizer, splits on whitespace, fills no_space_after
Rule-based, pseudo-language-independent sentence segmenter
collection of blocks, both parametrized by language and language-independent

Provides

in lib/Treex/Block/W2A/BaseChunkParser.pm