NAME

Lingua::Zompist::Barakhinei - Inflect Barakhinei nouns, verbs, and adjectives

VERSION

This document refers to version 0.02 of Lingua::Zompist::Barakhinei, released on 2002-06-26.

SYNOPSIS

use Lingua::Zompist::Barakhinei;
$i_am = Lingua::Zompist::Barakhinei::demeric('eza')->[0];

or

use Lingua::Zompist::Barakhinei ':all';
$i_am = demeric('eza')->[0];

or

use Lingua::Zompist::Barakhinei qw( demeric scrifel );
$you_know = demeric("shkriv\xea", 1)->[1];
$they_had = scrifel("ten\xea", 1)->[5];
# note "\xea" = e with circumflex


# nouns and pronouns
$word = noun('belu', 'masc', 'beluri');  # nouns
$word = noun("s\xfb");    # pronouns ("\xfb" is u with circumflex: su^)
$word = noun('mukh');
# in general
$word = noun( NOUN [, GENDER [, PLURAL ] ] );

# adjectives
$word = adj("kh\xf4t\xea");  # adjectives (kho^te^)

# verbs
# note: "ibr\xea" is ibre^
$word = demeric("ibr\xea", 1);   # present
$word = scrifel("ibr\xea", 1);   # past
$word = izhcrifel("ibr\xea", 1); # past anterior
$word = budemeric("ibr\xea", 1); # present subjunctive
$word = buscrifel("ibr\xea", 1); # past subjunctive
$word = befel("ibr\xea", 1);     # imperative
$word = part("ibr\xea", 1);      # participles
# in general
$word = FUNC( VERB [, CLASS ] );

# Setting inflection tables
# nouns
$Lingua::Zompist::Barakhinei::gendertab = \%mygendertab;
$Lingua::Zompist::Barakhinei::pluraltab = \%mypluraltab;
# verbs
$Lingua::Zompist::Barakhinei::classtab = \%myclasstab;

# ones that you will probably not need as often
$Lingua::Zompist::Barakhinei::rootconstab = \%myrootconstab;
$Lingua::Zompist::Barakhinei::subjtab = \%mysubjtab;
$Lingua::Zompist::Barakhinei::cadhctab = \%mycadhctab;
$Lingua::Zompist::Barakhinei::cadhgtab = \%mycadhgtab;
$Lingua::Zompist::Barakhinei::cadhutab = \%mycadhutab;

DESCRIPTION

Overview

Lingua::Zompist::Barakhinei is a module which allows you to inflect Barakhinei words. You can conjugate verbs and decline nouns, pronouns, and adjectives.

There is one function to inflect nouns and pronouns, and another to inflect adjectives. Verbs are covered by several functions: one for each tense or mood and another for the participles.

Exports

Lingua::Zompist::Barakhinei exports no functions by default, in order to avoid namespace pollution. This enables, for example, Lingua::Zompist::Barakhinei and Lingua::Zompist::Cadhinor to be used in the same program, since otherwise many of the function names would clash. However, all functions listed here can be imported explicitly by naming them, or they can be imported all together by using the tag ':all'.

A note on the character set used

This module expects input to be in iso-8859-1 (Latin-1) and will return output in that character set as well. For example, lelcê (meaning to see) should have a byte with the value 234 as the last character, and its accusative, lelcâ, will have a byte with the value 226 as its last character.
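
If your data arrives as UTF-8, it needs converting to Latin-1 bytes before it is passed in, and back again afterwards. The following is a minimal sketch using the core Encode module; the helper names utf8_to_latin1 and latin1_to_utf8 are made up for this example and are not part of Lingua::Zompist::Barakhinei.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert UTF-8 encoded bytes (e.g. read from a file or terminal) into the
# Latin-1 bytes that the module expects.
sub utf8_to_latin1 {
    my ($utf8_bytes) = @_;
    my $chars = decode('UTF-8', $utf8_bytes);   # bytes -> Perl characters
    return encode('iso-8859-1', $chars);        # characters -> Latin-1 bytes
}

# And back again, e.g. for printing results to a UTF-8 terminal or file.
sub latin1_to_utf8 {
    my ($latin1_bytes) = @_;
    return encode('UTF-8', decode('iso-8859-1', $latin1_bytes));
}

# "\xc3\xaa" is the UTF-8 encoding of e with circumflex.
my $verb = utf8_to_latin1("lelc\xc3\xaa");
printf "last byte: %d\n", ord(substr($verb, -1));   # prints "last byte: 234"
```

The converted string can then be handed to any of the inflection functions, and each element of the result converted back with latin1_to_utf8.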

In the future, this module may expect and produce the charset used by the Maraille font. At that point, the module Lingua::Zompist::Convert is expected to be available, which should be able to convert between that charset and standard charsets such as iso-8859-1 and utf-8.

noun

This function allows you to inflect nouns and pronouns.

It takes three arguments. All but the first are optional (and the function will guess or use entries from "$gendertab" and/or "$pluraltab" if they are not provided).

  • The noun or pronoun to inflect.

  • (optional) The gender of the noun (one of 'masc', 'neut', or 'fem'), or undef for the function to guess. (This can remain undef for pronouns.)

  • (optional) The (nominative) plural of the noun, or undef for the function to guess.

    In Barakhinei, it is necessary to know the singular, the gender, and the plural of a noun in order to inflect it correctly. However, if you do not know the plural form, you can pass undef to this function and the function will attempt to guess based on a built-in list of nouns.

noun returns an arrayref on success, or undef or the empty list on failure (for example, because it could not determine which declension or gender a noun belonged to).

In general, the arrayref will have seven elements, in the following order: nominative singular, accusative singular, dative singular, genitive singular, nominative plural, accusative/dative plural, genitive plural. In some cases, some of those elements may be undef; the most common case is when you ask for the declension of a plural personal pronoun such as ta or kêt.

Notes:

  • If you use a singular personal pronoun as input to this function, you will get back an arrayref with seven elements, corresponding to both singular and plural forms of the pronoun. Note that this will cause the accusative/dative distinction to be thrown away in the plural forms, since nouns make no such distinction! So it is better to input the plural form separately to get the full set of forms.

    (This behaviour may change in the future. I'm not sure whether dropping one form is the right thing to do... singular pronouns may end up returning only the first four elements filled.)

    If you use a plural personal pronoun as input to this function, only the first four elements will be filled (with the plural forms) and the last three elements will be undef. This appears to be more DWIMmish (at least, it is for me -- I've used ta, for example, as input and wondered why it was being treated as a noun rather than as a personal pronoun).

  • The genitive form of , , ta, mukh, and will be returned in parentheses to show that it is a regular adjective and not an undeclined genitive form.

  • The reflexive pronouns are listed under the pseudo-nominative forms and za; in the return list, the nominative forms will be the empty string.
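
One way to work with the returned arrayref is to pair each slot with a label and skip the undef slots. This is only a sketch: label_noun_forms and the label strings are invented for this example, and the data is a placeholder rather than real output from noun.

```perl
use strict;
use warnings;

# Labels follow the order documented for the arrayref returned by noun().
my @NOUN_SLOTS = (
    'nom. sg.', 'acc. sg.', 'dat. sg.', 'gen. sg.',
    'nom. pl.', 'acc./dat. pl.', 'gen. pl.',
);

# Pair each label with its form, skipping slots that are undef
# (as happens for the last three slots of plural personal pronouns).
sub label_noun_forms {
    my ($forms) = @_;
    return unless $forms;            # noun() signals failure with undef
    my @pairs;
    for my $i (0 .. $#NOUN_SLOTS) {
        next unless defined $forms->[$i];
        push @pairs, [ $NOUN_SLOTS[$i], $forms->[$i] ];
    }
    return @pairs;
}

# Placeholder data standing in for a real result such as noun('mukh');
# the strings are not Barakhinei forms.
my $forms = [ 'F1', 'F2', 'F3', 'F4', undef, undef, undef ];
printf "%-14s %s\n", @$_ for label_noun_forms($forms);
```

Bear in mind that for a plural personal pronoun the first four slots hold plural forms, so the "sg." labels used here would be misleading in that case.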

adj

This function inflects adjectives. It expects two arguments:

  • The adjective to be inflected

  • (optional) The root consonant in the oblique forms (for example, for na "north", which has the root nan- in the oblique forms, pass in 'na' and 'n'). If you pass in undef for this argument or simply leave it out, the function will attempt to guess whether the adjective has a different oblique stem (using "$rootconstab").

adj returns an arrayref on success and undef or the empty list on failure.

The arrayref will itself contain three arrayrefs, each with seven elements. The first arrayref will contain the masculine forms, the second arrayref the neuter forms, and the third arrayref the feminine forms. The forms are in the same order as in the arrayref returned by the noun function. Briefly, this order is nominative - accusative - dative - genitive in the singular and nominative - accusative/dative - genitive in the plural.

This function should determine the declension of an adjective automatically.

There is currently no function which returns the declension of an adjective (partly because the matter is so simple -- declension I adjectives end in -C or have an extra oblique stem consonant, declension II adjectives end in -ê, and declension III adjectives end in -i); however, if there is popular demand for such a function it could be quickly added.
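
The nested arrayref returned by adj can be awkward to index by number. The sketch below re-indexes it by gender and case; index_adj_forms and the label strings are hypothetical names chosen for this example, and the input is placeholder data rather than real adjective forms.

```perl
use strict;
use warnings;

# Order of the three inner arrayrefs, and of the seven forms in each,
# as documented for adj().
my @GENDERS = qw(masc neut fem);
my @SLOTS   = ('nom. sg.', 'acc. sg.', 'dat. sg.', 'gen. sg.',
               'nom. pl.', 'acc./dat. pl.', 'gen. pl.');

# Turn the nested arrayref into a hash of hashes: $h->{gender}{slot} = form.
sub index_adj_forms {
    my ($forms) = @_;
    return unless $forms;            # adj() signals failure with undef
    my %by_gender;
    for my $g (0 .. $#GENDERS) {
        for my $s (0 .. $#SLOTS) {
            $by_gender{ $GENDERS[$g] }{ $SLOTS[$s] } = $forms->[$g][$s];
        }
    }
    return \%by_gender;
}

# Placeholder 3 x 7 structure (m1..m7, n1..n7, f1..f7); real data would
# come from a call such as adj("kh\xf4t\xea").
my $forms = [ map { my $g = $_; [ map { "$g$_" } 1 .. 7 ] } qw(m n f) ];
my $table = index_adj_forms($forms);
print $table->{fem}{'gen. sg.'}, "\n";   # prints the placeholder "f4"
```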

demeric

This function conjugates a verb in the present tense. It takes two arguments:

  • The verb to be conjugated

  • (optional) The declension of the verb as an integer (only strictly necessary for verbs in -ê, which can be first, third, or fifth declension, corresponding to Cadhinor verbs in -EC, -EN, and -ER)

demeric returns an arrayref on success and undef or the empty list on failure.

The arrayref will contain six elements, in the following order: first person singular ("I"), second person singular ("thou"), third person singular ("he/she/it"), first person plural ("we"), second person plural ("[all of] you"), third person plural ("they").
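
As a rough illustration of that ordering, the six elements can be mapped to person and number labels. The helper label_verb_forms and the labels are made up for this sketch, and the input is placeholder data, not real conjugated forms.

```perl
use strict;
use warnings;

# Person/number labels in the order documented for demeric().
my @PERSONS = qw(1sg 2sg 3sg 1pl 2pl 3pl);

# Map each label to its form; returns undef if the conjugation failed.
sub label_verb_forms {
    my ($forms) = @_;
    return unless $forms;
    my %by_person;
    @by_person{@PERSONS} = @$forms;
    return \%by_person;
}

# Placeholder data; a real call would be demeric("ibr\xea", 1).
my $forms   = [ map { "form$_" } 1 .. 6 ];
my $labeled = label_verb_forms($forms);
print "$_: $labeled->{$_}\n" for @PERSONS;
```

The same helper should work for the other tense and mood functions, since they are described as returning results in the same shape.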

scrifel

This function conjugates a verb in the past tense. It is otherwise similar to the function demeric.

izhcrifel

This function conjugates a verb in the past anterior tense. It is otherwise similar to the function demeric.

budemeric

This function conjugates a verb in the present subjunctive. It is otherwise similar to the function demeric.

The name derives from Cadhinor grammar terms buprilise "remote" and demeric "present", since the Barakhinei subjunctive mood derived from the Cadhinor remote forms of a verb.

buscrifel

This function conjugates a verb in the past subjunctive. It is otherwise similar to the function demeric.

The name derives from Cadhinor grammar terms buprilise "remote" and scrifel "past", since the Barakhinei subjunctive mood derived from the Cadhinor remote forms of a verb.

befel

This function conjugates a verb in the imperative. It is otherwise similar to the function demeric.

Note

The first and fourth elements of the arrayref will be empty, since Barakhinei has no first person imperative, neither singular nor plural.
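
When processing imperatives you will usually want to skip those two slots. The sketch below treats both undef and the empty string as "no form", since the description above does not say which is used; imperative_forms is a hypothetical helper and the data is a placeholder.

```perl
use strict;
use warnings;

my @PERSONS = qw(1sg 2sg 3sg 1pl 2pl 3pl);

# Return [label, form] pairs for the slots that actually hold a form.
# Both undef and the empty string count as "no form".
sub imperative_forms {
    my ($forms) = @_;
    return unless $forms;            # befel() signals failure with undef
    return map  { [ $PERSONS[$_], $forms->[$_] ] }
           grep { defined $forms->[$_] && length $forms->[$_] }
           0 .. $#PERSONS;
}

# Placeholder data shaped like a befel() result.
my @usable = imperative_forms([ '', 'form2', 'form3', '', 'form5', 'form6' ]);
print join(', ', map { "$_->[0]=$_->[1]" } @usable), "\n";
# prints "2sg=form2, 3sg=form3, 2pl=form5, 3pl=form6"
```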

part

This function returns the two participles of a verb. It takes the verb and declension number (compare "demeric") as arguments and returns an arrayref (in scalar context) or a list (in list context) of two elements: the present participle and the past participle. On failure, this function returns undef or the empty list.

Specifically, the form returned for each participle is the masculine nominative singular form of the participle (which is the citation form). Since participles decline like regular adjectives (with an oblique stem consonant of 'l' in the case of participles in -u), the other forms of the participles may be obtained by calling the adj function, if desired.
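
A small helper can prepare the arguments for that follow-up call to adj. This is a sketch under the assumption that supplying 'l' explicitly for participles in -u is what you want; adj_args_for_participle is a made-up name, and the sample strings are placeholders, not real participles.

```perl
use strict;
use warnings;

# Build the argument list for adj() from a participle: participles in -u
# take the oblique stem consonant 'l'; others need no second argument.
sub adj_args_for_participle {
    my ($participle) = @_;
    return $participle =~ /u\z/ ? ($participle, 'l') : ($participle);
}

# Placeholder strings; real input would come from part("ibr\xea", 1).
for my $participle ('stemu', 'stem') {
    my @args = adj_args_for_participle($participle);
    print "adj(", join(', ', map { "'$_'" } @args), ")\n";
}
```

With the module loaded, the call would then look like adj(adj_args_for_participle($participle)).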

Tables

Since inflection in Barakhinei usually cannot be determined by the ending alone, this module makes use of lookup tables to provide additional information. For example, nouns ending in a consonant can be masculine, feminine, or neuter; if the gender is not passed explicitly to the "noun" function, that function first attempts to look up the gender in a table, and if that fails, it attempts to guess the gender from the ending. The same applies to verb inflections and to the plural of nouns.

This section describes the various lookup tables which the module uses to perform its inflection tasks. All the tables described here can be overridden from the outside; this is most useful for $gendertab, $pluraltab, and $classtab, which do not come pre-filled since they would be fairly large.

It is up to you how you fill those tables -- you can leave them empty, the way they come, and explicitly pass the necessary information to each function; you can fill the tables from a hash which you initialise statically in your code; you can read in the data from a file each time; or you could use a tied hash (say, a DBM file). The last can be useful if you only want to make a couple of requests and don't want to load the entire database into memory; simply tie the data to a hash in your program and assign a reference to that hash to the appropriate variable.
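
The tied-hash approach might look like the following sketch, which uses the core SDBM_File module and a temporary directory so that it can run anywhere. The file name and the single entry are illustrative; in practice you would point the tie at a DBM file you have built beforehand.

```perl
use strict;
use warnings;
use Fcntl;                          # provides O_RDWR and O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/gender";           # SDBM adds the .pag/.dir suffixes itself

# Tie a hash to the DBM file; lookups then go to disk instead of memory.
tie my %gender, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $file: $!";
$gender{san} = 'neut';              # normally filled once, ahead of time

{
    # The pragma is only needed when this snippet runs without the module
    # loaded, where the variable would otherwise be mentioned just once.
    no warnings 'once';
    $Lingua::Zompist::Barakhinei::gendertab = \%gender;
}

print "san is $gender{san}\n";      # prints "san is neut"
```

The same pattern should apply to $pluraltab and $classtab, each tied to its own file.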

Sample tables, generated programmatically from baralex.htm as of 2002-05-29 and hand-massaged slightly afterwards, are included as tab-separated value files: class.tsv, gender.tsv, and plural.tsv. It will be trivial to convert those to any representation you desire. There may also be other tab-separated value files in the distribution; have a look. Their purpose should be obvious from the filename.
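
Reading such a file into a hash could be done along these lines. The sketch assumes two tab-separated columns per line (word, then value) and no header row, which you should check against the actual files; load_tsv is a hypothetical helper, and the in-memory sample stands in for a real file such as gender.tsv.

```perl
use strict;
use warnings;

# Read "word<TAB>value" lines from a filehandle into a hashref.
# Blank lines and lines without a tab are skipped.
sub load_tsv {
    my ($fh) = @_;
    my %table;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;
        my ($word, $value) = split /\t/, $line, 2;
        next unless defined $value;
        $table{$word} = $value;
    }
    return \%table;
}

# In-memory sample; for a real file use: open my $fh, '<', 'gender.tsv'
my $sample = "san\tneut\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $gendertab = load_tsv($fh);
close $fh;
print $gendertab->{san}, "\n";      # prints "neut"
```

The resulting hashref can be assigned directly to the corresponding table variable.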

These are the lookup tables which are used by the program and which can be influenced from outside:

$gendertab

This is a hashref whose keys are nouns and whose values are one of 'masc', 'fem', or 'neut'. This is used to determine the gender of nouns. For example:

san => 'neut',

indicating that the noun san is neuter.

$pluraltab

This is a hashref whose keys are nouns and whose values are the plural form of the noun. For example:

ibor => 'ibro',

indicating that the (nominative) plural of the noun ibor is ibro.

$classtab

This is a hashref whose keys are verbs and whose values are the declension number. First declension verbs end in -ê and derive from Cadhinor verbs in -EC; second declension verbs end in -a and derive from Cadhinor verbs in -AN; third declension verbs end in -ê and derive from Cadhinor verbs in -EN; fourth declension verbs end in -i and derive from Cadhinor verbs in -IR; fifth declension verbs end in -ê and derive from Cadhinor verbs in -ER.

Strictly speaking, entries in this hashref are necessary only for first and fifth declension verbs; second and fourth declension verbs can be identified by their endings alone, and verbs ending in -ê are taken to be third declension if no other declension is specified.

An example entry is

"hab\xea" => 5,

indicating that the verb habê is a fifth declension verb. (In your source code, you'd probably write hab\xea as habê.)
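
The lookup order described in this section can be illustrated with a short sketch. This is not the module's actual code, only a restatement of the rules above under the assumption that an explicit argument wins over the table and the table wins over the ending; verb_class is a made-up name.

```perl
use strict;
use warnings;

# Resolve a verb's declension number: explicit argument first, then the
# table, then the ending. Returns undef if nothing applies.
sub verb_class {
    my ($verb, $class, $classtab) = @_;
    return $class if defined $class;
    return $classtab->{$verb} if $classtab && exists $classtab->{$verb};
    return 2 if $verb =~ /a\z/;      # second declension ends in -a
    return 4 if $verb =~ /i\z/;      # fourth declension ends in -i
    return 3 if $verb =~ /\xea\z/;   # circumflexed -e defaults to third
    return;
}

my %classtab = ( "hab\xea" => 5 );   # the example entry from above
print verb_class("hab\xea", undef, \%classtab), "\n";   # 5, from the table
print verb_class("hab\xea", undef, {}), "\n";           # 3, default guess
print verb_class('laoda', undef, {}), "\n";             # 2, ends in -a
```

The second call shows why the table entry matters: without it, a fifth declension verb would be treated as third declension.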

$rootconstab

This is a hashref whose keys are adjectives and whose values are the extra consonant which is added to the end in the oblique forms, for first declension adjectives such as na, nan-. This would be listed as

na => 'n',

You may not need to add to this table, as there aren't that many of these adjectives, and the ones listed in baralex.htm as of 2002-05-29 should already be in the module.

$subjtab

This is a hashref whose keys are verbs and whose values are the subjunctive forms of those verbs. This is used for verbs which use a different subjunctive stem (derived from Cadhinor verbs with a separate remote stem), for example

laoda => 'loda',

which indicates that the subjunctive stem of laoda is lod-. As indicated in the example, the final letter of the subjunctive stem should be the same as that of the normal infinitive; effectively, it is as if the subjunctive of those verbs is the indicative of another verb.

You may not need to add to this table, as there aren't that many of these verbs, and the ones listed in baralex.htm as of 2002-05-29 should already be in the module.
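
If you do add entries, a quick consistency check on the final letter can catch typing mistakes. The following sketch compares the last character of each key with that of its value; subjtab_problems is a hypothetical helper, not something the module provides.

```perl
use strict;
use warnings;

# Return the keys whose subjunctive form does not end in the same letter
# as the infinitive, which the description above says it should.
sub subjtab_problems {
    my ($subjtab) = @_;
    return grep { substr($_, -1) ne substr($subjtab->{$_}, -1) }
           sort keys %$subjtab;
}

my %mysubjtab = ( laoda => 'loda' );   # the example entry from above
my @bad = subjtab_problems(\%mysubjtab);
print @bad ? "check: @bad\n" : "all entries look consistent\n";
# prints "all entries look consistent"
```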

$cadhctab

This is a hashref whose keys are verbs which derive from a Cadhinor verb with a -C- stem consonant. The value is not used (but it is a good idea to have the value be true; for example, you could use the Cadhinor infinitive). This is used because verbs deriving from Cadhinor verbs in -C- suffer consonant changes in some forms. Compare "$cadhgtab".

You will probably not need to add to or replace this table.

$cadhgtab

This is a hashref whose keys are verbs which derive from a Cadhinor verb with a -G- stem consonant. The value is not used (but it is a good idea to have the value be true; for example, you could use the Cadhinor infinitive). This is used because verbs deriving from Cadhinor verbs in -G- suffer consonant changes in some forms. Compare "$cadhctab".

You will probably not need to add to or replace this table.

$cadhutab

This is a hashref whose keys are verbs which derive from a Cadhinor verb with a -U- in the last syllable of the verb stem. The value is not used (but it is a good idea to have the value be true; for example, you could use the Cadhinor infinitive). This is used because verbs deriving from Cadhinor verbs with -U- suffer vowel changes in some forms. Compare "$cadhctab" and "$cadhgtab".

BUGS

This module should handle irregular words correctly. However, if there is a word that is inflected incorrectly, please send me email and notify me. (Since Barakhinei has all sorts of funky sound changes, I wouldn't be surprised if this module makes mistakes! However, I think it handles correctly all the examples on the web page as of 2002-05-29.)

However, please make sure that you have checked against a current version of http://www.zompist.com/bara.htm or that you have asked Mark Rosenfelder himself; the grammar occasionally changes as small errors are found or words change.

TODO

  • Flesh out the dictionary from baralex.htm.

  • document masculines & feminines in -u (decline like adjectives)

  • test masculines & feminines in -u (e.g. rizundu = m/f, klâtandu = m, redêlu = f)

  • test adjectives in -â: mudrâ, shkrâ

  • test pû/pe-

  • test verbs with different subjunctive stems

SEE ALSO

Lingua::Zompist::Verdurian, Lingua::Zompist::Kebreni, Lingua::Zompist::Cadhinor, http://www.zompist.com/bara.htm

FEEDBACK

If you use this module, I'd appreciate it if you drop me a line at the email address in "AUTHOR", just so that I have an idea of how many people use this module at all. Also, if you have any comments, feel free to email me.

AUTHOR

Philip Newton, <pne@cpan.org>

COPYRIGHT AND LICENSE

(This is basically the BSD licence.)

Copyright (C) 2002 by Philip Newton. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.