NAME
jspell-dist - RFC for jspell dictionary packages
RATIONALE
The Lingua::Jspell binary format (also known as the hash file) is architecture dependent (32-bit vs 64-bit architectures, little-endian vs big-endian architectures). This makes releasing a binary build of each dictionary for every architecture unmanageable.
Also, given that some language dictionaries (namely, the Portuguese dictionary) require development tools (bison, flex and gcc), distributing the bootstrap files would also be very complicated.
Therefore, this RFC defines an intermediate structure, where just a full Lingua::Jspell installation is needed (together with Perl and some default Lingua::Jspell dependencies).
DESCRIPTION
The files that are usually installed for each dictionary are the affix file, the hash file, the irregulars file (if one exists) and the meta (YAML) file. Apart from the hash file, all of these are text documents, meaning they are architecture independent.
Regarding the hash file, it is architecture dependent but can be built with jbuild, which is delivered with Lingua::Jspell. jbuild requires the affix file (already mentioned above) and the dictionary file.
The dictionary file is also a text document, and therefore architecture independent. While jbuild works with just one dictionary file, some dictionaries are split into several smaller files, which makes the dictionary easier to manage. Instead of requiring a single dictionary file, jspell dictionary packages can contain more than one dictionary file; these files are concatenated together during the build phase.
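The concatenation described above can be sketched in a few lines. This is only an illustration in Python, not the code used by Lingua::Jspell: the function name concat_sorted, the file names and the file contents are hypothetical, and appending a newline to a file that lacks a final one is an assumption made here so that entries from two files never end up on the same line.

```python
from pathlib import Path
import tempfile


def concat_sorted(package_dir: Path, extension: str, target: Path) -> list[str]:
    """Concatenate all files with `extension` found in `package_dir` into `target`.

    Files are written in alphabetical order of their names. Returns the
    names in the order in which they were written.
    """
    sources = sorted(
        (p for p in package_dir.iterdir()
         if p.is_file() and p.suffix == extension and p != target),
        key=lambda p: p.name,
    )
    # Work on bytes so that no text encoding has to be assumed.
    with target.open("wb") as out:
        for src in sources:
            data = src.read_bytes()
            out.write(data)
            # Assumption: keep files separated by a newline.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return [p.name for p in sources]


def demo() -> tuple[list[str], bytes]:
    """Build a throwaway package with placeholder content and concatenate it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp)
        (pkg / "10-nouns.dic").write_bytes(b"second\n")
        (pkg / "00-macros.dic").write_bytes(b"first")
        (pkg / "notes.txt").write_bytes(b"ignored\n")
        target = pkg / "build" / "port.dic"
        target.parent.mkdir()
        order = concat_sorted(pkg, ".dic", target)
        return order, target.read_bytes()
```

Numeric prefixes such as 00- and 10- are one simple way to control the order in which the files are joined.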
PACKAGE CONTENTS
The suggested structure for a jspell dictionary package is:
MANIFEST - A MANIFEST file, just like the manifest files of Perl modules. It will be used to check for package completeness.
META-DATA FILE - The meta-data file is a YAML file. The name can be anything, provided that the file extension is .yml or .yaml. Note that there should be only one YAML file in the distribution package. This file should include the META section, with the IDS list. The first element of the list will be the official dictionary name, used when renaming the package files. Nevertheless, the system will try to link the other language names during installation.
AFFIX FILE - The affix file should have the .aff extension. Also, there should be only one affix file in the distribution package.
IRREGULARS FILES - Some languages might include irregular verbs (or other irregular forms). They normally result in one or more files with the .irr extension. They will be concatenated together into a single .irr file, in alphabetical order of filenames.
DICTIONARY FILES - All the files with the .dic extension are assumed to be dictionary files. They will be concatenated together, in alphabetical order of filenames. This means that if any macros need to be declared early, make sure the file that contains them sorts before all the other files.
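As an illustration of the layout above, the following Python sketch groups the files of a package by role and reports layout problems. It is a hypothetical helper, not part of Lingua::Jspell: the names ROLES, classify and layout_warnings, as well as the wording of the warnings, are assumptions made for this example.

```python
from collections import defaultdict
from pathlib import PurePath

# File extension -> role in the package, following the list above.
ROLES = {
    ".yml": "meta",
    ".yaml": "meta",
    ".aff": "affix",
    ".irr": "irregular",
    ".dic": "dictionary",
}


def classify(filenames):
    """Group package file names by role, each group sorted alphabetically."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        if name == "MANIFEST":
            groups["manifest"].append(name)
        else:
            role = ROLES.get(PurePath(name).suffix, "other")
            groups[role].append(name)
    return dict(groups)


def layout_warnings(groups):
    """Return a list of human-readable problems with the package layout."""
    warnings = []
    if not groups.get("manifest"):
        warnings.append("no MANIFEST file")
    if len(groups.get("meta", [])) != 1:
        warnings.append("expected exactly one .yml/.yaml file")
    if len(groups.get("affix", [])) != 1:
        warnings.append("expected exactly one .aff file")
    if not groups.get("dictionary"):
        warnings.append("no .dic files")
    return warnings
```

For example, classify(["MANIFEST", "port.yaml", "port.aff", "a.dic", "b.dic"]) yields one manifest, one meta file, one affix file and two dictionary files, and layout_warnings returns an empty list for it.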
INSTALLATION PROCESS
The package installation process will follow these steps:
1. The MANIFEST file is read and the package content files are checked. If any file is missing, the installation process will fail.
2. The meta-data file (YAML file) is searched for. If there is more than one, the system will issue a warning, but carry on with one of them (the first one in the file glob). That file is read, and the language names are extracted. Also, the YAML file is renamed to the first language name with the .yaml extension.
   If no YAML file is present, the system will issue a warning, but will continue, trying to use the name of the affix file as the language name.
3. The affix file is searched for, and its name is used if no YAML file is present. If the YAML file is present, the affix file will be renamed to the first language name (followed by the .aff extension).
4. All the dictionary files (with the .dic extension) are sorted and concatenated together. The result will be placed in a file with the language name, followed by the .dic extension.
5. The same process will be performed for all .irr files: sorting the files, concatenating them, and putting the result in a file with the language name and the .irr extension.
6. The hash file will be created using the .aff file and the .dic file created in step 4.
7. The dictionary, irregulars, hash, affix and YAML files are copied to the Lingua::Jspell library directory (usually ${prefix}/lib/jspell).
8. The optional language names will be used to create symlinks for all the files.
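The bookkeeping parts of this process (the manifest check, the choice of language name, and the list of installed files and symlinks) can be sketched as follows. This is a hedged illustration in Python, not the actual installer: all function names are hypothetical, the manifest parser is simplified (first whitespace-separated field of each line, with # comment lines skipped), the IDS list is assumed to be already parsed from the YAML file, and the .hash extension for the hash file is an assumption. Building the hash file with jbuild is deliberately left out.

```python
from pathlib import PurePath


def read_manifest(text: str) -> list[str]:
    """Return the file names listed in a Perl-style MANIFEST (simplified)."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Anything after the file name is treated as a comment.
        names.append(line.split()[0])
    return names


def missing_files(manifest: list[str], present: set[str]) -> list[str]:
    """Files listed in the manifest but absent from the package."""
    return [name for name in manifest if name not in present]


def language_names(ids: list[str], affix_file: str) -> tuple[str, list[str]]:
    """Pick the official name and the optional names.

    The first entry of the IDS list is the official name; without a
    YAML file (empty list) the affix file name is used instead.
    """
    if ids:
        return ids[0], list(ids[1:])
    return PurePath(affix_file).stem, []


# Assumption: the hash file is installed with a ".hash" extension.
EXTENSIONS = (".dic", ".irr", ".hash", ".aff", ".yaml")


def install_plan(primary, aliases, has_irregulars=True, has_meta=True):
    """Return (files to install, (link name, link target) pairs)."""
    exts = [
        e for e in EXTENSIONS
        if (has_irregulars or e != ".irr") and (has_meta or e != ".yaml")
    ]
    files = [primary + e for e in exts]
    links = [(alias + e, primary + e) for alias in aliases for e in exts]
    return files, links
```

Keeping these steps as pure functions makes them easy to check without touching the library directory; an installer would then perform the copies and create the symlinks from the returned plan.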
SEE ALSO
Lingua::Jspell(3)
AUTHOR
Alberto Manuel Brandão Simões, <ambs@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2010 by Alberto Manuel Brandão Simões