NAME
nat-makeCWB - Dumps a NATools corpus in a format suitable for import into CWB
SYNOPSIS
nat-makeCWB [-encode=<CWBName> -d=<CWBCrpDir> [-r=<CWBRegistry>]] <NatCrpDir>
DESCRIPTION
This small script exports a NATools corpus directory to a pair of files that can be easily imported into Corpus WorkBench (CWB).
By default, nat-makeCWB processes a NATools corpus directory and creates a pair of files, source.cqp and target.cqp, that can later be imported into CWB using cwb-align-import.
Flags:
- -encode
  If this option is used, nat-makeCWB will try to use the CWB tools to create the aligned corpus. This option should be followed by the corpus name. The resulting corpora will be named name_source and name_target, respectively.
  This option should be used in conjunction with option -d.
  The CWB registry directory will be guessed using cwb-config or the CORPUS_REGISTRY environment variable. To use another path, specify it with -r.
- -d
  This option is required when using -encode. It specifies the CWB corpus directory (without the corpus name).
- -r
  Use this option to force a registry path other than the system default.
- -debug
  Use this option if you need to debug the temporary files. If this option is supplied, they will not be deleted.
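As an illustration of how the flags above combine, the following sketch assembles a nat-makeCWB command line and prints it rather than running it. The helper name build_makecwb_cmd, the corpus name, and all paths are hypothetical placeholders; only the flag syntax is taken from the SYNOPSIS.

```shell
#!/bin/sh
# Sketch: assemble a nat-makeCWB command line (dry run, nothing is executed).
# build_makecwb_cmd and every path/name used below are hypothetical.

build_makecwb_cmd() {
    nat_dir=$1      # NATools corpus directory (required)
    cwb_name=$2     # corpus name for -encode (optional)
    cwb_dir=$3      # CWB corpus directory for -d (required with -encode)
    registry=$4     # registry path for -r (optional)

    # -encode must be paired with -d, as stated in the flag descriptions.
    if [ -n "$cwb_name" ] && [ -z "$cwb_dir" ]; then
        echo "error: -encode requires -d" >&2
        return 1
    fi

    cmd="nat-makeCWB"
    if [ -n "$cwb_name" ]; then
        cmd="$cmd -encode=$cwb_name -d=$cwb_dir"
        # Only add -r when a non-default registry is requested.
        if [ -n "$registry" ]; then
            cmd="$cmd -r=$registry"
        fi
    fi
    echo "$cmd $nat_dir"
}

# Default mode: only dump source.cqp and target.cqp.
build_makecwb_cmd ./my-nat-corpus
# Encode mode, registry left to be guessed by the tool.
build_makecwb_cmd ./my-nat-corpus mycorpus /var/cwb/data
# Encode mode with an explicit registry.
build_makecwb_cmd ./my-nat-corpus mycorpus /var/cwb/data /var/cwb/registry
```

Each printed line can be copied and run once the placeholder paths are replaced with real ones. With -encode=mycorpus, the resulting corpora would be named mycorpus_source and mycorpus_target, following the naming rule described for -encode.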
SEE ALSO
NATools, perl(1)
AUTHOR
Alberto Manuel Brandão Simões, <ambs@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2010 by Alberto Manuel Brandão Simões