NAME
Treex::Block::Read::CoNLLX
VERSION
version 2.20151102
DESCRIPTION
Document reader for CoNLL format. Each token is on a separate line in the following format:
ord<tab>form<tab>lemma<tab>cpos<tab>pos<tab>features<tab>head<tab>deprel
Sentences are separated by a blank line. The sentences are stored as bundles in the document.
See http://ilk.uvt.nl/conll/#dataformat.
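The format described above can be illustrated with a minimal Python sketch. It is not the code of this module; the names FIELDS and read_sentences and the sample data are hypothetical, and it assumes exactly the eight tab-separated columns listed in the description.

```python
# Column names as listed in the description (assumed order).
FIELDS = ["ord", "form", "lemma", "cpos", "pos", "features", "head", "deprel"]


def read_sentences(text):
    """Split CoNLL-style text into sentences; each sentence is a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        values = line.split("\t")
        # zip stops at the shorter sequence, so any extra columns are ignored.
        current.append(dict(zip(FIELDS, values)))
    if current:
        # The last sentence may not be followed by a blank line.
        sentences.append(current)
    return sentences


# Hypothetical sample input with two sentences.
sample = (
    "1\tThe\tthe\tDT\tDT\t_\t2\tNMOD\n"
    "2\tcat\tcat\tNN\tNN\t_\t3\tSBJ\n"
    "3\tsleeps\tsleep\tVB\tVBZ\t_\t0\tROOT\n"
    "\n"
    "1\tHello\thello\tUH\tUH\t_\t0\tROOT\n"
)
sentences = read_sentences(sample)
```

Each element of sentences corresponds to what the reader would store as one bundle; all field values stay strings in this sketch.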
ATTRIBUTES
- from
 - 
space- or comma-separated list of filenames
 - lines_per_doc
 - 
number of sentences (not lines, despite the attribute name) per document
 - feat_is_iset
 - 
1 if the features field is a serialization of Interset (e.g. pos=adj|prontype=dem|number=plu|case=dat|person=3), so that it is read directly into the Interset representation of each node. 0 by default.
 - deprel_is_afun
 - 
1 if the deprel field is an afun (e.g. Sb, Obj_M, Pnom), so that it is read directly into the afun field of each node (this also strips _M and sets is_member to 1). 0 by default.
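What the attributes above imply for the input can be sketched in Python as follows. This is only an illustration under assumptions, not the module's implementation: all function names are hypothetical, `_M` is assumed to appear as a suffix of the deprel value, and `_` is assumed to mark an empty features field.

```python
import re


def split_filenames(from_value):
    """Split a space- or comma-separated list of filenames (the 'from' attribute)."""
    return [name for name in re.split(r"[ ,]+", from_value.strip()) if name]


def chunk_sentences(sentences, per_doc):
    """Group sentences into documents of at most per_doc sentences (per_doc > 0)."""
    return [sentences[i:i + per_doc] for i in range(0, len(sentences), per_doc)]


def parse_features(features):
    """Parse 'name=value|name=value' into a dict; '_' is assumed to mean no features."""
    if features in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in features.split("|"))


def parse_deprel(deprel):
    """Return (afun, is_member); a trailing '_M' is stripped and sets is_member to 1."""
    if deprel.endswith("_M"):
        return deprel[:-2], 1
    return deprel, 0


files = split_filenames("a.conll, b.conll c.conll")
docs = chunk_sentences(["s1", "s2", "s3"], 2)
feats = parse_features("pos=adj|prontype=dem|number=plu|case=dat|person=3")
afun, is_member = parse_deprel("Obj_M")
```

The file names and the sentence placeholders s1 to s3 are made up for the example; the feature string and the deprel values are the ones quoted in the attribute descriptions.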
METHODS
- next_document
 - 
Loads a document.
 
SEE
Treex::Block::Read::BaseTextReader, Treex::Core::Document, Treex::Core::Bundle
AUTHOR
David Mareček
COPYRIGHT AND LICENSE
Copyright © 2011-2013 by Institute of Formal and Applied Linguistics, Charles University in Prague
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.