NAME

eGuideDog::Dict::Mandarin - an informal Pinyin dictionary.

SYNOPSIS

use utf8;
use eGuideDog::Dict::Mandarin;

binmode(STDOUT, ':utf8');
my $dict = eGuideDog::Dict::Mandarin->new();
my $symbol = $dict->get_pinyin("长");
print "长: $symbol\n";
$symbol = $dict->get_pinyin("长江");
print "长江的长: $symbol\n";
my @symbols = $dict->get_pinyin("拼音");
print "拼音: @symbols\n";
my @words = $dict->get_words("长");
print "Some words beginning with 长: @words\n";

DESCRIPTION

This module looks up the Pinyin of Mandarin characters or words. The dictionary comes from the Mandarin dictionary of espeak (http://espeak.sf.net), which is derived mainly from Unihan and CC-CEDICT. It is part of the eGuideDog project (http://e-guidedog.sf.net).

EXPORT

None by default.

METHODS

new()

Initialize the dictionary.

get_pinyin($str)

In scalar context, return the Pinyin phonetic symbol of the first character of $str.

In list context, return an array of the Pinyin phonetic symbols of all characters in $str.
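The difference between the two calling contexts is easy to trip over, so here is a minimal, self-contained sketch of how call-site context selects the result. The function lookup_pinyin and its two-entry table are hypothetical stand-ins written for illustration; they are not the module's implementation, and the symbol format shown is only a placeholder.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical stand-in table; the symbol format is illustrative only.
my %table = ('拼' => 'pin1', '音' => 'yin1');

# Stand-in for a context-sensitive lookup (not the module's real code).
# wantarray is true when the caller wants a list.
sub lookup_pinyin {
    my ($str) = @_;
    my @symbols = map { exists $table{$_} ? $table{$_} : '' } split(//, $str);
    return wantarray ? @symbols : $symbols[0];
}

my $first  = lookup_pinyin('拼音');        # scalar context: first character only
my @all    = lookup_pinyin('拼音');        # list context: one symbol per character
my ($head) = lookup_pinyin('拼音');        # parentheses give list context; keeps element 0
my $count  = () = lookup_pinyin('拼音');   # list context; counts the returned symbols

print "first: $first\n";
print "all: @all\n";
print "head: $head\n";
print "count: $count\n";
```

With the real module the same call-site rules apply: assigning get_pinyin($str) to a scalar yields only the first character's symbol, while assigning to an array or a parenthesised list yields one symbol per character.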

get_words($char)

Return an array of words that begin with $char. The listed words contain multi-phonetic-symbol characters, and the symbol used in each word is a less frequent one for that character.

is_multi_phon($char)

Return a non-zero value if $char is a multi-phonetic-symbol character. The returned value plus 1 is the number of phonetic symbols the character has.

Return 0 if $char is a single-phonetic-symbol character.

get_multi_phon($char)

Return an array of phonetic symbols of $char.
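As a rough sketch of how is_multi_phon and get_multi_phon might be combined, the helper below summarises the readings of one character. The name describe_readings is hypothetical, and the count relies on the "returned value plus 1" rule stated above. The helper accepts any object that provides the two documented methods, and the real dictionary is used only if the module is installed.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper: summarise the readings of one character.
# $dict is any object providing is_multi_phon() and get_multi_phon()
# with the behaviour documented above.
sub describe_readings {
    my ($dict, $char) = @_;
    my $extra = $dict->is_multi_phon($char);    # 0 means a single reading
    return "single reading" unless $extra;
    my @symbols = $dict->get_multi_phon($char);
    # Per the documentation, the return value plus 1 is the number of symbols.
    return sprintf "%d readings: %s", $extra + 1, join(', ', @symbols);
}

# Run against the real dictionary only when the module is available.
if (eval { require eGuideDog::Dict::Mandarin; 1 }) {
    binmode(STDOUT, ':encoding(UTF-8)');
    my $dict = eGuideDog::Dict::Mandarin->new();
    print "长: ", describe_readings($dict, '长'), "\n";
}
```

Checking is_multi_phon first avoids relying on what get_multi_phon returns for a single-reading character, which the documentation does not specify.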

SEE ALSO

eGuideDog::Dict::Cantonese, http://e-guidedog.sf.net

AUTHOR

Cameron Wong, <hgn823-perl at yahoo.com.cn>

ACKNOWLEDGMENT

Thanks to Silas S. Brown (http://people.pwf.cam.ac.uk/ssb22/) for maintaining the Mandarin dictionary file.

COPYRIGHT AND LICENSE

of the module

Copyright 2008 by Cameron Wong

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

of the dictionary data

Unihan and CC-CEDICT are used in the dictionary data.

About Unihan: Copyright (c) 1996-2006 Unicode, Inc. All Rights reserved.

Name: Unihan database
Unicode version: 5.0.0
Table version: 1.1
Date: 7 July 2006

CC-CEDICT is licensed under a Creative Commons Attribution-Share Alike 3.0 License. http://www.mdbg.net/chindict/chindict.php?page=cedict

CC-CEDICT is a continuation of the CEDICT project started by Paul Denisowski in 1997 with the aim to provide a complete downloadable Chinese to English dictionary with pronunciation in pinyin for the Chinese characters.
