NAME

Treex::Block::Write::BaseWriter

VERSION

version 2.20160629

DESCRIPTION

This is the base class for document writer blocks in Treex.

It handles selecting and opening the output files, writing one file per document. The output file name(s) may be set in several ways (standard output may also be used, as a file with the name '-'); GZip file compression is supported.

Other features, such as writing all documents to one file or setting character encoding, are enabled in Treex::Block::Write::BaseTextWriter.

PARAMETERS

to

Space-or-comma-separated list of output file names.

file_stem, path

These override the respective attributes in documents (filled in by a DocumentReader), which are used for generating output file names.

stem_suffix

A string to append after file_stem.

compress

If set to 1, the output files are compressed using GZip (if to is used to set file names, the names must also contain the ".gz" suffix).

clobber

If set to 1, existing destination files will be overwritten.
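How to, compress and clobber interact when a single output is opened can be pictured with a short standalone sketch. It is not the Treex implementation: the helper open_output is hypothetical. Dying when a file exists without clobber is an assumption, as is ignoring compress for '-'. Only core Perl modules are used.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw($GzipError);

# Hypothetical helper (not part of Treex): opens one output target following
# the parameter semantics described above.
sub open_output {
    my (%arg) = @_;
    my $name = $arg{to};

    # The name '-' stands for standard output (compress is ignored here,
    # which is an assumption of this sketch).
    return \*STDOUT if $name eq '-';

    # Assumption: refuse to overwrite an existing file unless clobber is set.
    die "File '$name' already exists and clobber is not set\n"
        if -e $name && !$arg{clobber};

    if ( $arg{compress} ) {
        # The caller is expected to pass a name that already ends in ".gz".
        my $gz = IO::Compress::Gzip->new($name)
            or die "Cannot open gzip output '$name': $GzipError\n";
        return $gz;
    }

    open my $fh, '>', $name or die "Cannot open '$name': $!\n";
    return $fh;
}
```

A call such as open_output( to => 'out.txt.gz', compress => 1 ) returns a handle that can be used with print { $fh } ... and closed with close.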

DERIVED CLASSES

The derived classes should just use print { $self->_file_handle } "output text"; the base class will take care of opening the proper file.

All derived classes that override the process_document method directly must call the _prepare_file_handle method to gain access to the correct file handle.

The extension parameter should be overridden with the default file extension for the given file type.
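The contract for derived classes can be illustrated with a self-contained mock. Everything in the Sketch:: namespace is hypothetical and only imitates the behavior described above; it is not Treex code. The mock writes to in-memory buffers instead of files, and it models extension as a plain method, whereas in Treex it is a parameter. Passing the document to _prepare_file_handle is likewise an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the base class; NOT Treex code.
package Sketch::BaseWriter;

sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

# Default extension; derived classes are expected to override it.
sub extension { return '' }

sub _file_handle { return $_[0]{_file_handle} }

# Assumed behavior: choose the output for the given document and open it.
# Here the "file" is an in-memory buffer keyed by the generated name.
sub _prepare_file_handle {
    my ( $self, $doc ) = @_;
    my $name = $doc->{file_stem} . $self->extension;
    $self->{outputs}{$name} = '';
    open my $fh, '>', \$self->{outputs}{$name}
        or die "Cannot open buffer: $!";
    $self->{_file_handle} = $fh;
    return;
}

# Hypothetical derived writer that prints one sentence per line.
package Sketch::LineWriter;
our @ISA = ('Sketch::BaseWriter');

sub extension { return '.txt' }

# process_document is overridden directly, so the handle must be
# prepared here before anything is printed.
sub process_document {
    my ( $self, $doc ) = @_;
    $self->_prepare_file_handle($doc);
    print { $self->_file_handle } "$_\n" for @{ $doc->{sentences} };
    close $self->_file_handle;
    return;
}

package main;

my $writer = Sketch::LineWriter->new;
$writer->process_document(
    { file_stem => 'doc1', sentences => [ 'Hello.', 'Bye.' ] } );
print $writer->{outputs}{'doc1.txt'};
```

If the call to _prepare_file_handle were left out, _file_handle would return no handle and the print would fail, which is why the call is required in any direct override of process_document.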

TODO

  • Set compress if the file name contains .gz or .bz2? Add .gz to the extension even for file names set with the to parameter if compress is set to true?

  • Possibly rearrange things so that the _prepare_file_handle method is not needed. The problem is that if this were a Moose role, it would have to be applied only after an override of process_document. The Moose inner and augment operators are a possibility, but they would not remove the need for somewhat non-standard behavior in derived classes (one could not just override process_document, but would have to augment it).

AUTHORS

Ondřej Dušek <odusek@ufal.mff.cuni.cz>

Martin Popel <popel@ufal.mff.cuni.cz>

Ondřej Bojar <bojar@ufal.mff.cuni.cz>

COPYRIGHT AND LICENSE

Copyright © 2011-2012 by Institute of Formal and Applied Linguistics, Charles University in Prague

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.