NAME
Lingua::Phoneme - MySQL-based accent-lookups.
SYNOPSIS
The first time, to install the dictionary, manually create a MySQL database whose name is as described in $Lingua::Phoneme::DATABASE - by default this is accents:
mysqladmin create accents
Then run the following lines of Perl:
use Lingua::Phoneme;
my $o = Lingua::Phoneme->new(
    USERNAME => 'myusername',
    PASSWORD => 'mypassword',
);
# Fill the database from the Moby dictionary file
$o->build;
You can supply a parameter to build, which should be the directory in which this module is located.
Thereafter:
use Lingua::Phoneme;
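my $o = Lingua::Phoneme->new(
    USERNAME => 'myusername',
    PASSWORD => 'mypassword',
);
# Scalar context
$_ = $o->phoneme("house");
# List context
@_ = $o->phoneme("house");
# Array reference of phonemes, index of primary accent, index of secondary accent
my ($ps,$p,$s) = $o->phoneme_accent("house");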
__END__
PREREQUISITES
DESCRIPTION
This module is intended to provide information on the phonemes and stress of English-language words.
Currently it uses the Moby Pronunciation Dictionary in a MySQL DB, but you can change the DB settings at construction time, and there is no reason why it can't be extended to other languages should dictionaries be made available.
NOTES ON THE DATABASE
From the Moby README file:
Each pronunciation vocabulary entry consists of a word or phrase
field followed by a field delimiter of space and the IPA-equivalent
field that is coded using the following ASCII symbols (case is
significant). Spaces between words in the word or phrase or
pronunciation field is denoted with underbar "_".
/&/ sounds like the "a" in "dab"
/(@)/ sounds like the "a" in "air"
/A/ sounds like the "a" in "far"
/eI/ sounds like the "a" in "day"
/@/ sounds like the "a" in "ado"
or the glide "e" in "system" (dipthong schwa)
/-/ sounds like the "ir" glide in "tire"
or the "dl" glide in "handle"
or the "den" glide in "sodden" (dipthong little schwa)
/b/ sounds like the "b" in "nab"
/tS/ sounds like the "ch" in "ouch"
/d/ sounds like the "d" in "pod"
/E/ sounds like the "e" in "red"
/i/ sounds like the "e" in "see"
/f/ sounds like the "f" in "elf"
/g/ sounds like the "g" in "fig"
/h/ sounds like the "h" in "had"
/hw/ sounds like the "w" in "white"
/I/ sounds like the "i" in "hid"
/aI/ sounds like the "i" in "ice"
/dZ/ sounds like the "g" in "vegetably"
/k/ sounds like the "c" in "act"
/l/ sounds like the "l" in "ail"
/m/ sounds like the "m" in "aim"
/N/ sounds like the "ng" in "bang"
/n/ sounds like the "n" in "and"
/Oi/ sounds like the "oi" in "oil"
/A/ sounds like the "o" in "bob"
/AU/ sounds like the "ow" in "how"
/O/ sounds like the "o" in "dog"
/oU/ sounds like the "o" in "boat"
/u/ sounds like the "oo" in "too"
/U/ sounds like the "oo" in "book"
/p/ sounds like the "p" in "imp"
/r/ sounds like the "r" in "ire"
/S/ sounds like the "sh" in "she"
/s/ sounds like the "s" in "sip"
/T/ sounds like the "th" in "bath"
/D/ sounds like the "th" in "the"
/t/ sounds like the "t" in "tap"
/@/ sounds like the "u" in "cup"
/@r/ sounds like the "u" in "burn"
/v/ sounds like the "v" in "average"
/w/ sounds like the "w" in "win"
/j/ sounds like the "y" in "you"
/Z/ sounds like the "s" in "vision"
/z/ sounds like the "z" in "zoo"
Moby Pronunciator contains many common names and phrases borrowed from
other languages; special sounds include (case is significant):
"A" sounds like the "a" in "ami"
"N" sounds like the "n" in "Francoise"
"R" sounds like the "r" in "Der"
/x/ sounds like the "ch" in "Bach"
/y/ sounds like the "eu" in "cordon bleu"
"Y" sounds like the "u" in "Dubois"
Words and Phrases adopted from languages other than English
have the unaccented form of the roman spelling. For example,
"etude" has an initial accented "e" but is spelled without the
accent in the Moby Pronunciator II database.
INSTALLATION OF THE DATABASE
See build.
CONSTRUCTOR new
Accepts name/value pairs as a hash or hash-like structure:
- CHAT
  Real-time info about progress on STDERR.
- DATABASE
  The name of the dictionary database that will be created. Defaults to accents.
- DRIVER
  The DBI driver (DBD::*): defaults to mysql.
- USER, PASSWORD
  Used to access the DB - no default values.
- HOSTNAME, PORT
  The host and port used to access the database. Defaults are localhost and 3306.
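As an illustration of how these options fit together, here is a minimal sketch that combines them into a DBI data source name using the defaults listed above. The helper name dsn_from_options is hypothetical, and the sketch is an assumption about how such options are typically used with DBD::mysql, not a copy of this module's internals.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lingua::Phoneme): build a DBI data
# source name from constructor-style options, applying the documented
# defaults when an option is not supplied.
sub dsn_from_options {
    my %opt = (
        DRIVER   => 'mysql',
        DATABASE => 'accents',
        HOSTNAME => 'localhost',
        PORT     => 3306,
        @_,    # caller-supplied pairs override the defaults above
    );
    return "DBI:$opt{DRIVER}:database=$opt{DATABASE};"
         . "host=$opt{HOSTNAME};port=$opt{PORT}";
}

print dsn_from_options(), "\n";
print dsn_from_options( DATABASE => 'test_accents', PORT => 3307 ), "\n";
```

A string of this form could then be passed to DBI->connect together with the user name and password.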
METHOD &build ($optional_path_to_db)
Calling this method will fill the database, dropping and re-making all tables if they already exist.
Optionally, supply an argument which is the full path to the Moby Pronunciation dictionary file - the default is to use MobyPron in the $perl/site/lib/Lingua/Phoneme/dict/EN directory.
METHOD raw
Accepts a database handle and a scalar containing the word to look up.
Returns the raw Moby phoneme scalar from the DB, or undef on failure to find the word (not necessarily an error).
You are advised to use other methods to look up data in the DB: if you do use this, note that the DB keys have _underscores_ instead of spaces. You can use the &prepare function to convert these.
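The underscore convention for keys can be sketched as follows. The helper phrase_to_key is hypothetical and only illustrates the conversion; it is not the module's &prepare function, whose exact behaviour is not described here.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a phrase to the underscore-separated
# form used for dictionary keys.
sub phrase_to_key {
    my ($phrase) = @_;
    ( my $key = $phrase ) =~ s/^\s+|\s+$//g;    # trim leading/trailing space
    $key =~ s/\s+/_/g;                          # runs of whitespace become "_"
    return $key;
}

print phrase_to_key('cordon bleu'), "\n";
```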
METHOD phoneme ($word_to_lookup)
Accepts a word to look up.
Returns the phonemes of the word, as a scalar or array, depending on the calling context, or undef if the word isn't in the dictionary.
The phoneme pattern is defined in the Moby documentation: see NOTES ON THE DATABASE.
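For readers unfamiliar with context-sensitive return values in Perl, the following sketch shows the general technique using wantarray. The function name, the hard-coded phoneme list and the space-joined scalar form are all assumptions made for illustration; the module's actual scalar format may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a lookup: returns a list in list context
# and a single space-joined string in scalar context.
sub phonemes_for_demo {
    my @phonemes = qw( d O g );    # illustrative data only
    return wantarray ? @phonemes : join ' ', @phonemes;
}

my $as_scalar = phonemes_for_demo();    # scalar context
my @as_list   = phonemes_for_demo();    # list context

print "scalar: $as_scalar\n";
print "list has ", scalar(@as_list), " items\n";
```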
METHOD phoneme_accent ($word_to_lookup)
Accepts a word to look up.
Returns a reference to an array of the phonemes of the word, plus the index in that array of the primary accent, and if there is a secondary accent, its index too. Returns undef if the word isn't in the dictionary.
The phoneme pattern is defined in the Moby documentation: see NOTES ON THE DATABASE.
Note that the Moby documentation describes the stress marks thus:
"'" (uncurled apostrophe) marks primary stress
"," (comma) marks secondary stress.
This is plainly in reverse, as the entry for house is house ,h/&//U/s.
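To make the return shape concrete, here is a minimal sketch that splits a raw Moby string into phonemes and stress indices. It is not the module's implementation. It assumes that slash-delimited groups are single phonemes, that any other character is a phoneme by itself, that a stress mark applies to the phoneme that follows it, and (following the note above) that the comma marks primary stress and the apostrophe secondary stress. The name split_moby is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical parser for a raw Moby pronunciation string.
# Returns (array ref of phonemes, primary index, secondary index);
# an index is undef when the corresponding mark is absent.
sub split_moby {
    my ($raw) = @_;
    my ( @phonemes, $primary, $secondary );
    while ( $raw =~ m{\G(?: (,) | (') | /([^/]+)/ | ([^/,'_]) | _ )}gcx ) {
        if    ( defined $1 ) { $primary   //= scalar @phonemes }  # "," treated as primary
        elsif ( defined $2 ) { $secondary //= scalar @phonemes }  # "'" treated as secondary
        elsif ( defined $3 ) { push @phonemes, $3 }               # /.../ group
        elsif ( defined $4 ) { push @phonemes, $4 }               # bare single character
        # an underscore (word break) is skipped
    }
    return ( \@phonemes, $primary, $secondary );
}

my ( $ps, $p, $s ) = split_moby(',h/&//U/s');
print "phonemes:  @$ps\n";
print "primary:   ", $p // 'none', "\n";
print "secondary: ", $s // 'none', "\n";
```

Swapping the two stress branches gives the reading stated in the Moby README instead.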
SEE ALSO
DBI, DBD::mysql, Lingua::Rhyme.
KEYWORDS
Phoneme, phoneme, syllable.
ACKNOWLEDGMENTS
The Moby dictionary was found at described as Moby (tm) Pronunciator II...(22 June 93) with the contact address: 3449 Martha Ct., Arcata, CA 95521-4884, USA, +1 (707) 826-7715.
AUTHOR
Lee Goddard <lgoddard@cpan.org>
COPYRIGHT
This module is Copyright (C) Lee Goddard, 10 June 2002.
This is free software, and can be used/modified under the same terms as Perl itself.
The Moby dictionary is Copyright (c) 1988-93, Grady Ward. All Rights Reserved.