NAME

nat-dumpDicts - Command-line tool to dump NATools PTDs

SYNOPSIS

nat-dumpDicts <natools-dir>

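nat-dumpDicts -self <natools-dir>

nat-dumpDicts -full <src.lex> <src-tgt.bin> <tgt.lex> <tgt-src.bin>

nat-dumpDicts -sqlite=<databasename> <natools-dir>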

DESCRIPTION

This command dumps NATools Probabilistic Translation Dictionaries (PTDs) in different formats. By default the Perl Data::Dumper format is used, but other formats are also available, such as an SQLite database.

Data::Dumper

A PTD can be dumped in Perl Data::Dumper format in three different ways:

  • Use it directly with a NATools corpus directory path, and it will create two files in the current directory with the dictionaries. They will be named source-target.dmp and target-source.dmp.

    Note: this process will overwrite any files with those names.

  • Use it with the -self flag and a NATools corpus directory path. The dictionaries will be created inside the NATools corpus directory and will be named source-target.dmp and target-source.dmp.

    Note: this process will overwrite any files with those names.

  • Mainly for debugging purposes, you can also supply four arguments to nat-dumpDicts, together with the -full flag. These arguments are, in order, the source lexicon file, the source-target binary dictionary file, the target lexicon file and the target-source binary dictionary file. If this all seems strange to you, just do not use it.

    nat-dumpDicts -full <src.lex> <src-tgt.bin> <tgt.lex> <tgt-src.bin>
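
Because the first two modes silently overwrite source-target.dmp and target-source.dmp, it can help to check for those files before running the command. The following is only a minimal sketch: existing_dumps and safe_dump are hypothetical helper names, not part of NATools, and the only facts taken from this page are the two output file names and the plain nat-dumpDicts <natools-dir> invocation.

```shell
#!/bin/sh
# Hypothetical helper (not part of NATools): print the default dump files
# that already exist in the given directory (current directory if omitted).
existing_dumps() {
    dir=${1:-.}
    for f in source-target.dmp target-source.dmp; do
        if [ -e "$dir/$f" ]; then
            printf '%s\n' "$dir/$f"
        fi
    done
}

# Hypothetical wrapper: run nat-dumpDicts on a corpus directory only when
# no dump file in the current directory would be overwritten.
safe_dump() {
    corpus=$1
    found=$(existing_dumps .)
    if [ -n "$found" ]; then
        echo "refusing to overwrite:" >&2
        printf '%s\n' "$found" >&2
        return 1
    fi
    nat-dumpDicts "$corpus"
}
```

A call would look like safe_dump /path/to/corpus. For the -self mode the same check would presumably be made against the corpus directory instead, that is, existing_dumps "$corpus".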

SQLite database

When running this command you can supply a -sqlite=databasename option. In this case, instead of dumping in Perl Data::Dumper format, an SQLite database is created. You can use this option with or without the -full flag, but the -self flag does not apply, because the output filename is given on the command line.
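
As a rough sketch of the SQLite mode, the wrapper below dumps a corpus to a database file and then asks SQLite to describe what was created. dump_to_sqlite is a hypothetical name, the sqlite3 command-line shell is assumed to be installed, and placing the option before the directory argument is an assumption. This page does not document the database schema or what happens when the database file already exists, so the sketch inspects the former and refuses to proceed in the latter case.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of NATools): dump a corpus into an SQLite
# database and list the resulting tables and schema.
dump_to_sqlite() {
    db=$1
    corpus=$2
    # Behaviour on an existing database file is not documented here,
    # so stop instead of guessing.
    if [ -e "$db" ]; then
        echo "$db already exists; choose another name" >&2
        return 1
    fi
    # Assumed ordering: option first, then the corpus directory.
    nat-dumpDicts -sqlite="$db" "$corpus" || return 1
    # The schema is not described on this page, so let sqlite3 report it.
    sqlite3 "$db" '.tables'
    sqlite3 "$db" '.schema'
}
```

A call would look like dump_to_sqlite dict.db /path/to/corpus, where both arguments are placeholders.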

SEE ALSO

NATools documentation, perl(1)

AUTHOR

Alberto Manuel Brandão Simões, <ambs@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2006-2012 by Alberto Manuel Brandão Simões