NAME

Lingua::Phoneme - MySQL-based accent-lookups.

SYNOPSIS

The first time, to install the dictionary, manually create a MySQL database whose name matches $Lingua::Phoneme::DATABASE - by default this is accents:

mysqladmin create accents

Then run the following lines of Perl:

use Lingua::Phoneme;
my $o = Lingua::Phoneme->new(
	USERNAME => 'myusername',
	PASSWORD => 'mypassword',
);
$o->build;

You can optionally pass build the directory in which this module is located.

Thereafter:

use Lingua::Phoneme;
my $o = Lingua::Phoneme->new(
	USERNAME => 'myusername',
	PASSWORD => 'mypassword',
);
$_ = $o->phoneme("house");                      # scalar context
@_ = $o->phoneme("house");                      # list context
my ($ps,$p,$s) = $o->phoneme_accent("house");   # phonemes (array ref), primary and secondary accent indices

__END__

PREREQUISITES

DBI, DBD::mysql.

DESCRIPTION

This module is intended to provide information on the phonemes and stress of English-language words.

Currently it uses the Moby Pronunciation Dictionary in a MySQL DB, but you can change the DB settings at construction time, and there is no reason why it can't be extended to other languages should dictionaries be made available.

NOTES ON THE DATABASE

From the Moby README file:

Each pronunciation vocabulary entry consists of a word or phrase
field followed by a field delimiter of space and the IPA-equivalent
field that is coded using the following ASCII symbols (case is
significant). Spaces between words in the word or phrase or
pronunciation field is denoted with underbar "_".

/&/     sounds like the "a" in "dab"
/(@)/   sounds like the "a" in "air"
/A/     sounds like the "a" in "far"
/eI/    sounds like the "a" in "day"
/@/     sounds like the "a" in "ado"
        or the glide "e" in "system" (dipthong schwa)
/-/     sounds like the "ir" glide in "tire"
        or the  "dl" glide in "handle"
        or the "den" glide in "sodden" (dipthong little schwa)
/b/     sounds like the "b" in "nab"
/tS/    sounds like the "ch" in "ouch"
/d/     sounds like the "d" in "pod"
/E/     sounds like the "e" in "red"
/i/     sounds like the "e" in "see"
/f/     sounds like the "f" in "elf"
/g/     sounds like the "g" in "fig"
/h/     sounds like the "h" in "had"
/hw/    sounds like the "w" in "white"
/I/     sounds like the "i" in "hid"
/aI/    sounds like the "i" in "ice"
/dZ/    sounds like the "g" in "vegetably"
/k/     sounds like the "c" in "act"
/l/     sounds like the "l" in "ail"
/m/     sounds like the "m" in "aim"
/N/     sounds like the "ng" in "bang"
/n/     sounds like the "n" in "and"
/Oi/    sounds like the "oi" in "oil"
/A/     sounds like the "o" in "bob"
/AU/    sounds like the "ow" in "how"
/O/     sounds like the "o" in "dog"
/oU/    sounds like the "o" in "boat"
/u/     sounds like the "oo" in "too"
/U/     sounds like the "oo" in "book"
/p/     sounds like the "p" in "imp"
/r/     sounds like the "r" in "ire"
/S/     sounds like the "sh" in "she"
/s/     sounds like the "s" in "sip"
/T/     sounds like the "th" in "bath"
/D/     sounds like the "th" in "the"
/t/     sounds like the "t" in "tap"
/@/     sounds like the "u" in "cup"
/@r/    sounds like the "u" in "burn"
/v/     sounds like the "v" in "average"
/w/     sounds like the "w" in "win"
/j/     sounds like the "y" in "you"
/Z/     sounds like the "s" in "vision"
/z/     sounds like the "z" in "zoo"

Moby Pronunciator contains many common names and phrases borrowed from
other languages; special sounds include (case is significant):

"A"  sounds like the "a" in "ami"
"N"  sounds like the "n" in "Francoise"
"R"  sounds like the "r" in "Der"
/x/  sounds like the "ch" in "Bach"
/y/  sounds like the "eu" in "cordon bleu"
"Y"  sounds like the "u" in "Dubois"

Words and Phrases adopted from languages other than English
have the unaccented  form of the roman spelling. For example,
"etude" has an initial accented "e" but is spelled without the
accent in the Moby Pronunciator II database.
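
As an illustration of the entry layout described in the README excerpt above (a word or phrase field, a single space, then the pronunciation field, with underscores standing in for spaces), here is a minimal sketch of splitting one dictionary line. The helper name split_moby_entry is hypothetical and is not part of Lingua::Phoneme; the sample line is the house entry quoted later in this document.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lingua::Phoneme: split one dictionary
# line into its word field and its pronunciation field.
sub split_moby_entry {
    my ($line) = @_;
    chomp $line;
    # The field delimiter is the first space; spaces inside either field
    # are stored as underscores, so a two-way split is enough.
    my ($word, $pron) = split / /, $line, 2;
    return unless defined $pron;          # not a well-formed entry
    (my $display = $word) =~ tr/_/ /;     # human-readable form of the key
    return ($word, $display, $pron);
}

my ($key, $display, $pron) = split_moby_entry("house ,h/&//U/s");
print "$display => $pron\n";   # house => ,h/&//U/s
```

The word field is returned both as stored (with underscores) and in a readable form, since the database keys keep the underscores.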

INSTALLATION OF THE DATABASE

See METHOD &build.

CONSTRUCTOR new

Accepts name/value pairs as a hash or hash-like structure:

CHAT

Real-time info about progress on STDERR.

DATABASE

The name of the pronunciation dictionary database that will be created. Defaults to accents.

DRIVER

The DBD::* driver that DBI should use: defaults to mysql.

USER, PASSWORD

Used to access the DB - no default values.

HOSTNAME, PORT

The host name and port of the database server. Defaults are localhost and 3306.
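
To show how the options above fit together, the following sketch assembles a DBD::mysql-style data source name from the documented defaults. The helper dsn_for is hypothetical; how the module itself builds its connection string is not stated in this document.

```perl
use strict;
use warnings;

# Defaults as documented above; USER and PASSWORD have no defaults.
my %opt = (
    DRIVER   => 'mysql',
    DATABASE => 'accents',
    HOSTNAME => 'localhost',
    PORT     => 3306,
);

# Hypothetical helper: assemble a DBD::mysql-style data source name.
sub dsn_for {
    my (%o) = @_;
    return "DBI:$o{DRIVER}:database=$o{DATABASE};host=$o{HOSTNAME};port=$o{PORT}";
}

print dsn_for(%opt), "\n";   # DBI:mysql:database=accents;host=localhost;port=3306
```

A string of this form, together with the user name and password, is what DBI->connect expects as its first three arguments.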

METHOD &build ($optional_path_to_db)

Calling this method will fill the database, dropping and re-making all tables if they already exist.

Optionally, supply an argument which is the full path to the Moby Pronunciation dictionary file - the default is to use MobyPron in the $perl/site/lib/Lingua/Phoneme/dict/EN directory.

METHOD raw

Accepts a database handle and a scalar containing the word to look up.

Returns raw Moby phoneme scalar from DB, or undef on failure to find the word (not necessarily an error).

You are advised to use other methods to look up data in the db: if you do use this, note that the DB keys have _underscores_ instead of spaces. You can use the &prepare function to convert these.
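
For illustration only, the space-to-underscore conversion mentioned above could look like the sketch below. The function to_db_key is a hypothetical stand-in; the module's own &prepare may do more or behave differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the key conversion; not the module's &prepare.
sub to_db_key {
    my ($phrase) = @_;
    (my $key = $phrase) =~ tr/ /_/;   # DB keys use underscores for spaces
    return $key;
}

print to_db_key("cordon bleu"), "\n";   # cordon_bleu
```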

METHOD phoneme ($word_to_lookup)

Accepts a word to look up.

Returns the phonemes of the word, as a scalar or array, depending on the calling context, or undef if the word isn't in the dictionary.

The phoneme pattern is defined in the Moby documentation: see NOTES ON THE DATABASE.
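
The context-dependent return described above is the usual Perl wantarray idiom. The sketch below uses a hypothetical function, phonemes_in_context, and assumes the scalar form is a space-joined string; the module's actual scalar format is not documented here.

```perl
use strict;
use warnings;

# Hypothetical function illustrating a context-sensitive return value.
sub phonemes_in_context {
    my @phonemes = @_;
    # Bare return: undef in scalar context, empty list in list context.
    return unless @phonemes;
    # Assumption: the scalar form is the phonemes joined by spaces.
    return wantarray ? @phonemes : join ' ', @phonemes;
}

my $as_scalar = phonemes_in_context(qw(h & U s));
my @as_list   = phonemes_in_context(qw(h & U s));
print "$as_scalar\n";           # h & U s
print scalar(@as_list), "\n";   # 4
```

When calling in list context, test for an empty list rather than for undef to detect a missing word under this idiom.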

METHOD phoneme_accent ($word_to_lookup)

Accepts a word to look up.

Returns a reference to an array of the phonemes of the word, plus the index in that array of the primary accent, and if there is a secondary accent, its index too. Returns undef if the word isn't in the dictionary.

The phoneme pattern is defined in the Moby documentation: see NOTES ON THE DATABASE.

Note that the Moby documentation describes the stress marks thus:

"'" (uncurled apostrophe) marks primary stress
"," (comma) marks secondary stress.

This is plainly the reverse of what the dictionary contains: the entry for house is house ,h/&//U/s, where the only stress mark is a comma.
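
To make the shape of the return value concrete, here is a rough sketch of how a raw entry might be tokenized into phonemes and accent indices. It is not the module's implementation: tokenize_moby is a hypothetical name, and the sketch assumes that slash-delimited groups and bare single characters are each one phoneme, and that, following the note above, the comma is the primary mark and the apostrophe the secondary.

```perl
use strict;
use warnings;

# Hypothetical tokenizer, not the module's code. Assumes phonemes are
# either wrapped in slashes or written as a single bare character.
sub tokenize_moby {
    my ($raw) = @_;
    my (@phonemes, %stress_at);
    while ($raw =~ m{\G(?:([',])|/([^/]+)/|([^/]))}gc) {
        if (defined $1) {
            # A stress mark applies to the phoneme that follows it;
            # only the first occurrence of each mark is recorded.
            $stress_at{$1} = scalar @phonemes unless exists $stress_at{$1};
        }
        elsif (defined $2) { push @phonemes, $2 }
        else               { push @phonemes, $3 }
    }
    # Following the note above: comma is primary, apostrophe is secondary.
    return (\@phonemes, $stress_at{','}, $stress_at{"'"});
}

my ($ps, $p, $s) = tokenize_moby(",h/&//U/s");
print join(' ', @$ps), "\n";            # h & U s
print "primary accent at index $p\n";   # primary accent at index 0
print "no secondary accent\n" unless defined $s;
```

The three return values mirror the ($ps,$p,$s) triple shown in the SYNOPSIS: an array reference, then the primary and (possibly undefined) secondary accent indices.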

SEE ALSO

DBI, DBD::mysql, Lingua::Rhyme.

KEYWORDS

Phoneme, phoneme, syllable.

ACKNOWLEDGMENTS

The Moby dictionary is described as Moby (tm) Pronunciator II...(22 June 93), with the contact address: 3449 Martha Ct., Arcata, CA 95521-4884, USA, +1 (707) 826-7715.

AUTHOR

Lee Goddard <lgoddard@cpan.org>

COPYRIGHT

This module is Copyright (C) Lee Goddard, 10 June 2002.

This is free software, and can be used/modified under the same terms as Perl itself.

The Moby dictionary is Copyright (c) 1988-93, Grady Ward. All Rights Reserved.