NAME

nat-makeCWB - Dumps a NATools corpus in a format suitable for import into CWB

SYNOPSIS

nat-makeCWB [-encode=<CWBName> -d=<CWBCrpDir> [-r=<CWBRegistry>]] <NatCrpDir>

DESCRIPTION

This small script exports a NATools corpus directory to a pair of files that can be easily imported into the Corpus WorkBench (CWB).

By default, nat-makeCWB processes a NATools corpus directory and creates a pair of files, source.cqp and target.cqp, that can later be imported into CWB using cwb-align-import.
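As a minimal sketch of the default mode, the invocation takes only the corpus directory, as shown in the SYNOPSIS. The path ./my-nat-corpus is a hypothetical placeholder. The guard runs the command only when it is installed and the directory exists; otherwise it just prints what would be run.

```shell
#!/bin/sh
# Hypothetical corpus directory; replace with a real NATools corpus path.
NAT_CRP_DIR="./my-nat-corpus"

# Default mode: no flags, only the NATools corpus directory.
CMD="nat-makeCWB $NAT_CRP_DIR"

if command -v nat-makeCWB >/dev/null 2>&1 && [ -d "$NAT_CRP_DIR" ]; then
    # Should produce source.cqp and target.cqp, per the description above.
    $CMD
else
    echo "would run: $CMD"
fi
```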

Flags:

-encode

If this option is used, nat-makeCWB will try to use the CWB tools to create the aligned corpus. This option should be followed by the corpus name. The corpora created will be named name_source and name_target, respectively.

This option should be used in conjunction with option -d.

The CWB registry directory will be guessed using cwb-config or the CORPUS_REGISTRY environment variable. To use another path, specify it with -r.

-d

This option is required when using -encode. It specifies the CWB corpus directory (without the corpus name).

-r

Use this option to force a registry path other than the system default.

-debug

Use this option if you need to inspect the temporary files. When it is supplied, they are not deleted.
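The flags above can be combined as in the following sketch, which assembles an -encode command line and prints it. The corpus name, directories, and variable names are hypothetical placeholders. The registry lookup is only an illustration: the order in which nat-makeCWB itself consults CORPUS_REGISTRY and cwb-config is not stated here, so the precedence below is an assumption.

```shell
#!/bin/sh
# All names and paths below are hypothetical placeholders.
CWB_NAME="mycorpus"
CWB_CRP_DIR="/tmp/cwb/data"
NAT_CRP_DIR="./my-nat-corpus"

# Pick a registry: CORPUS_REGISTRY first, then cwb-config if available.
# This precedence is an assumption of the sketch, not documented behaviour.
REGISTRY="${CORPUS_REGISTRY:-}"
if [ -z "$REGISTRY" ] && command -v cwb-config >/dev/null 2>&1; then
    REGISTRY="$(cwb-config --registry)"
fi

# -encode must be paired with -d; -r is added only if a registry was found.
set -- "-encode=$CWB_NAME" "-d=$CWB_CRP_DIR"
if [ -n "$REGISTRY" ]; then
    set -- "$@" "-r=$REGISTRY"
fi
set -- "$@" "$NAT_CRP_DIR"

# Print the command instead of running it, so the sketch is safe to try.
CMD_LINE="nat-makeCWB $*"
echo "$CMD_LINE"
```

With these placeholder values, the corpora would be expected to be named mycorpus_source and mycorpus_target, following the naming rule described under -encode. Passing -r explicitly is optional; it is shown here only to make the chosen registry visible.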

SEE ALSO

NATools, perl(1)

AUTHOR

Alberto Manuel Brandão Simões, <ambs@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2010 by Alberto Manuel Brandão Simões
