NAME

Lingua::ZH::HanConvert - convert between Traditional and Simplified Chinese characters

SYNOPSIS

    #!perl -lw
    use Lingua::ZH::HanConvert qw(simple trad);
    use utf8;

    my $t = "國"; # Traditional character for "country", Unicode 22283 (U+570B)
    # or: my $t = v22283;

    print simple($t); # Simplified "country", 国 (Unicode 22269, U+56FD)

    my $s = "鱼"; # Simplified character for "fish", Unicode 40060 (U+9C7C)
    # or: my $s = v40060;

    print trad($s); # Traditional "fish", 魚 (Unicode 39770, U+9B5A)

REQUIRES

Perl 5.6

DESCRIPTION

In the 1950s, the Chinese government simplified over 2000 Chinese characters to help promote literacy. Taiwan and Hong Kong still use the traditional characters. The simplified characters are hard to read if you only know the traditional ones, and vice versa.

This module attempts to convert Chinese text between the two forms, using character-by-character transliteration.
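As a rough illustration of what character-by-character transliteration means, here is a minimal sketch. The table `%to_simplified` and the helper `convert_chars` are hypothetical names invented for this example; they are not part of the module's interface, and the module's real table is far larger.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical two-entry table, for illustration only.
my %to_simplified = (
    "國" => "国",   # U+570B -> U+56FD, "country"
    "魚" => "鱼",   # U+9B5A -> U+9C7C, "fish"
);

# Replace each character that has an entry in the table;
# leave every other character unchanged.
sub convert_chars {
    my ($text, $table) = @_;
    return join '',
        map { exists $table->{$_} ? $table->{$_} : $_ }
        split //, $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print convert_chars("中國 fish 魚", \%to_simplified), "\n";
```

Each character is looked up on its own, so the surrounding word or sentence never influences the result.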

Note that this module only handles Unicode text encoded as UTF-8. If you need to convert between the Big5 and GB character sets, then please look at Text::Iconv.
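On Perl 5.8 and later, the core Encode module is another way to get legacy-encoded text into and out of Unicode. The sketch below is an assumption about how such a pipeline might look, not something this module provides. The encoding names `big5-eten` and `euc-cn` are Encode's, and a one-character `tr///` stands in for the call to simple so that the example runs without the module installed.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(decode encode);

# Stand-in for Big5 input that would normally come from a file or socket.
my $big5_bytes = encode('big5-eten', "國");

# 1. Decode the legacy bytes into a Perl character string.
my $text = decode('big5-eten', $big5_bytes);

# 2. Convert the characters. With the module loaded this step would be
#    simple($text); the tr/// below is only a placeholder for it.
(my $converted = $text) =~ tr/國/国/;

# 3. Encode the result for a consumer that expects GB2312 (EUC-CN).
my $gb_bytes = encode('euc-cn', $converted);
```

The conversion step always operates on decoded characters; only the first and last steps know about Big5 or GB.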

simple takes a string, converts any traditional Chinese characters (such as 國, Unicode U+570B, meaning "country") to the corresponding simplified characters (such as 国, Unicode U+56FD, also meaning "country"), and returns the result. Characters which are not traditional Chinese do not change.

trad does the reverse; it converts any simplified Chinese characters to the corresponding traditional characters. Characters which are not simplified Chinese do not change.

BUGS, LIMITATIONS

Transliteration is not perfect. At the moment, this module only performs character-by-character transliteration, using the (one-to-one) mappings from the Unicode Consortium's Unihan database. Converted text will contain errors, though it is generally good enough to be readable.
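One reason a one-to-one table cannot be exact is that simplification merged some characters: both 發 ("emit") and 髮 ("hair") became 发. The sketch below uses small hypothetical tables (`%to_simplified`, `%to_traditional`) and a hypothetical helper (`convert_chars`), none of which are the module's own data or code, to show how a round trip can change a word.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical tables, for illustration only.
my %to_simplified = (
    "頭" => "头",
    "發" => "发",   # "emit"
    "髮" => "发",   # "hair": merged with the entry above
);
my %to_traditional = (
    "头" => "頭",
    "发" => "發",   # a one-to-one table has to pick a single target
);

# Per-character lookup; unmapped characters pass through unchanged.
sub convert_chars {
    my ($text, $table) = @_;
    return join '',
        map { exists $table->{$_} ? $table->{$_} : $_ }
        split //, $text;
}

my $original   = "頭髮";   # "hair"
my $simplified = convert_chars($original,   \%to_simplified);
my $round_trip = convert_chars($simplified, \%to_traditional);

binmode STDOUT, ':encoding(UTF-8)';
print "$original -> $simplified -> $round_trip\n";
print "round trip ", ($round_trip eq $original ? "preserved" : "changed"),
      " the text\n";
```

Picking the right traditional character here requires knowing the whole word, which is why word-by-word conversion would do better.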

The transliteration mappings could be improved; if anyone knows of another source of mappings then please let me know. Ideally, I'd like to see the module performing word-by-word transliteration, if suitable data sources were available. See http://www.basistech.com/articles/C2C.html for a discussion of transliteration issues.

The module may take several seconds to initialise. Each subroutine is slow the first time it is called, but faster on subsequent calls.

The characters in this documentation may not display correctly unless the program you are reading it with is Unicode-aware.

ACKNOWLEDGEMENTS

The data used by this module is taken from the Unicode Consortium's Unihan database, available from ftp://ftp.unicode.org. Thanks to them for compiling the data.

AUTHOR

David Chan <david@sheetmusic.org.uk>

COPYRIGHT

