NAME

Lingua::KO::Hangul::Util - utility functions for Hangul Syllables

SYNOPSIS

use Lingua::KO::Hangul::Util;

decomposeHangul(0xAC00);
  # (0x1100,0x1161) or "\x{1100}\x{1161}"

composeHangul("\x{1100}\x{1161}");
  # "\x{AC00}"

getHangulName(0xAC00);
  # "HANGUL SYLLABLE GA"

parseHangulName("HANGUL SYLLABLE GA");
  # 0xAC00

DESCRIPTION

A Hangul Syllable consists of Hangul Jamo.

Hangul Jamo are classified into three classes:

CHOSEONG  (the initial sound) as a leading consonant (L),
JUNGSEONG (the medial sound)  as a vowel (V),
JONGSEONG (the final sound)   as a trailing consonant (T).

Any Hangul Syllable is a composition of

 i) CHOSEONG + JUNGSEONG (L + V)

  or

ii) CHOSEONG + JUNGSEONG + JONGSEONG (L + V + T).
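
The two forms above correspond to simple codepoint arithmetic in the Unicode Standard. The following is a minimal sketch in core Perl, not this module's own code; the helper name syllable_from_indices is hypothetical, and the constants are the ones the Unicode Standard gives for Hangul syllables.

```perl
use strict;
use warnings;

# Constants from the Unicode Standard's Hangul syllable arithmetic.
my $SBase  = 0xAC00;  # first precomposed Hangul Syllable
my $VCount = 21;      # number of medial vowels (V)
my $TCount = 28;      # number of finals (T), counting "no final" as index 0

# Hypothetical helper: map zero-based Jamo indices to a syllable codepoint.
sub syllable_from_indices {
    my ($l_index, $v_index, $t_index) = @_;
    $t_index = 0 unless defined $t_index;    # form i): L + V, no final
    return $SBase + ($l_index * $VCount + $v_index) * $TCount + $t_index;
}

printf "U+%04X\n", syllable_from_indices(0, 0);      # prints U+AC00 (GA)
printf "U+%04X\n", syllable_from_indices(0, 18, 8);  # prints U+AE00 (GEUL)
```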

Names of Hangul Syllables have the format "HANGUL SYLLABLE %s", where %s is the concatenation of the short names of the syllable's Jamo.

Composition and Decomposition

$string_decomposed = decomposeHangul($codepoint)
@codepoints = decomposeHangul($codepoint)

Accepts a Unicode codepoint as an integer.

If the specified codepoint is a Hangul Syllable, returns its decomposition as a list of codepoints (in list context) or as a UTF-8 string (in scalar context).

decomposeHangul(0xAC00) # U+AC00 is HANGUL SYLLABLE GA.
   returns "\x{1100}\x{1161}" or (0x1100, 0x1161);

decomposeHangul(0xAE00) # U+AE00 is HANGUL SYLLABLE GEUL.
   returns "\x{1100}\x{1173}\x{11AF}" or (0x1100, 0x1173, 0x11AF);

Otherwise, returns false (empty string or empty list).

decomposeHangul(0x0041) # outside Hangul Syllables
   returns empty string or empty list.
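
The context-dependent return value described above can be sketched in core Perl with wantarray. This is an illustration under stated assumptions, not the module's implementation: sketch_decompose is a hypothetical name, and the arithmetic follows the Unicode Standard's Hangul decomposition.

```perl
use strict;
use warnings;

my ($SBase, $LBase, $VBase, $TBase) = (0xAC00, 0x1100, 0x1161, 0x11A7);
my ($VCount, $TCount, $SCount)      = (21, 28, 11172);

# Hypothetical stand-in for decomposeHangul.
sub sketch_decompose {
    my ($cp) = @_;
    my $s_index = $cp - $SBase;

    # Not a Hangul Syllable: empty list or empty string, both false.
    return (wantarray ? () : '')
        unless $s_index >= 0 && $s_index < $SCount;

    my @jamo = (
        $LBase + int($s_index / ($VCount * $TCount)),
        $VBase + int(($s_index % ($VCount * $TCount)) / $TCount),
    );
    my $t_index = $s_index % $TCount;
    push @jamo, $TBase + $t_index if $t_index;   # a final is present

    return wantarray ? @jamo : join('', map { chr($_) } @jamo);
}

my @list = sketch_decompose(0xAE00);
print join(' ', map { sprintf 'U+%04X', $_ } @list), "\n";
# prints U+1100 U+1173 U+11AF

my $string = sketch_decompose(0xAC00);
print length($string), "\n";   # prints 2
```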

$string_composed = composeHangul($src_string)
@codepoints_composed = composeHangul($src_string)

Any sequence of an initial Jamo L and a medial Jamo V is composed into a syllable LV; then any sequence of a syllable LV and a final Jamo T is composed into a syllable LVT.

Any characters other than Hangul Jamo and Hangul Syllables are unaffected.

composeHangul("Hangul \x{1100}\x{1161}\x{1100}\x{1173}\x{11AF}.")
 returns "Hangul \x{AC00}\x{AE00}." or
  (0x48,0x61,0x6E,0x67,0x75,0x6C,0x20,0xAC00,0xAE00,0x2E);
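
The two-step rule above (L + V first, then LV + T) can be sketched as two regex substitutions in core Perl. This is only an illustration, not how the module is implemented; sketch_compose is a hypothetical name, and the character ranges are the modern Jamo ranges from the Unicode Standard.

```perl
use strict;
use warnings;

my ($SBase, $LBase, $VBase, $TBase) = (0xAC00, 0x1100, 0x1161, 0x11A7);
my ($VCount, $TCount)               = (21, 28);

# Hypothetical stand-in for composeHangul.
sub sketch_compose {
    my ($str) = @_;

    # Step 1: initial Jamo L followed by medial Jamo V becomes syllable LV.
    $str =~ s{([\x{1100}-\x{1112}])([\x{1161}-\x{1175}])}
             {chr($SBase + ((ord($1) - $LBase) * $VCount + ord($2) - $VBase) * $TCount)}ge;

    # Step 2: syllable LV followed by final Jamo T becomes syllable LVT.
    # A syllable that already has a final is left as it is.
    $str =~ s{([\x{AC00}-\x{D7A3}])([\x{11A8}-\x{11C2}])}
             {(ord($1) - $SBase) % $TCount ? "$1$2" : chr(ord($1) + ord($2) - $TBase)}ge;

    return wantarray ? (map { ord($_) } split //, $str) : $str;
}

my @codepoints = sketch_compose("Hangul \x{1100}\x{1161}\x{1100}\x{1173}\x{11AF}.");
print join(',', map { sprintf '0x%X', $_ } @codepoints), "\n";
```

Characters outside the Jamo and syllable ranges never match either pattern, so they pass through unchanged.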

$uv_composite = getHangulComposite($uv_here, $uv_next)

Returns the codepoint (an unsigned integer) of the composite if both codepoints, $uv_here and $uv_next, are Hangul and composable.

Otherwise, returns undef.
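
A pairwise check of this kind might look like the following core-Perl sketch. It assumes that "composable" means either L + V or LV + T, as in the Unicode Standard; sketch_composite is a hypothetical name and not the module's code.

```perl
use strict;
use warnings;

my ($SBase, $LBase, $VBase, $TBase)     = (0xAC00, 0x1100, 0x1161, 0x11A7);
my ($LCount, $VCount, $TCount, $SCount) = (19, 21, 28, 11172);

# Hypothetical stand-in for getHangulComposite.
sub sketch_composite {
    my ($here, $next) = @_;

    # L + V -> LV
    my ($l, $v) = ($here - $LBase, $next - $VBase);
    return $SBase + ($l * $VCount + $v) * $TCount
        if $l >= 0 && $l < $LCount && $v >= 0 && $v < $VCount;

    # LV + T -> LVT (only when the syllable has no final yet)
    my ($s, $t) = ($here - $SBase, $next - $TBase);
    return $here + $t
        if $s >= 0 && $s < $SCount && $s % $TCount == 0
        && $t > 0 && $t < $TCount;

    return undef;   # not a composable pair
}

printf "U+%04X\n", sketch_composite(0x1100, 0x1161);   # prints U+AC00
printf "U+%04X\n", sketch_composite(0xAC00, 0x11A8);   # prints U+AC01
print defined sketch_composite(0x0041, 0x0042) ? "composite\n" : "undef\n";
# prints undef
```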

Hangul Syllable Name

$name = getHangulName($codepoint)

If the specified codepoint is a Hangul Syllable, returns its name; otherwise returns undef.

getHangulName(0xAC00) returns "HANGUL SYLLABLE GA";
getHangulName(0x0041) returns undef.
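
The name can be derived from the codepoint by looking up the short name of each Jamo. The sketch below uses core Perl only and is not the module's code: sketch_name is a hypothetical name, and the three tables hold the Jamo short names given in the Unicode Standard.

```perl
use strict;
use warnings;

# Jamo short names from the Unicode Standard, in index order.
my @L = (qw(G GG N D DD R M B BB S SS), '', qw(J JJ C K T P H));
my @V = qw(A AE YA YAE EO E YEO YE O WA WAE OE YO U WEO WE WI YU EU YI I);
my @T = ('', qw(G GG GS N NJ NH D L LG LM LB LS LT LP LH
                M B BS S SS NG J C K T P H));

# Hypothetical stand-in for getHangulName.
sub sketch_name {
    my ($cp) = @_;
    my $s_index = $cp - 0xAC00;
    return undef unless $s_index >= 0 && $s_index < 11172;

    return 'HANGUL SYLLABLE '
         . $L[ int($s_index / 588) ]          # 588 = 21 * 28
         . $V[ int(($s_index % 588) / 28) ]
         . $T[ $s_index % 28 ];
}

print sketch_name(0xAC00), "\n";   # prints HANGUL SYLLABLE GA
print sketch_name(0xAE00), "\n";   # prints HANGUL SYLLABLE GEUL
```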

$codepoint = parseHangulName($name)

If the specified name is the name of a Hangul Syllable, returns its codepoint; otherwise returns undef.

parseHangulName("HANGUL SYLLABLE GEUL") returns 0xAE00;

parseHangulName("LATIN SMALL LETTER A") returns undef;

parseHangulName("HANGUL SYLLABLE PERL") returns undef;
 # Regrettably, HANGUL SYLLABLE PERL does not exist :-)
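
One simple way to get this behaviour is a reverse lookup table built from every valid name. The sketch below is core Perl and makes no claim about how the module does it; sketch_parse and %codepoint_for are hypothetical names, and the tables are the Jamo short names from the Unicode Standard.

```perl
use strict;
use warnings;

# Jamo short names from the Unicode Standard, in index order.
my @L = (qw(G GG N D DD R M B BB S SS), '', qw(J JJ C K T P H));
my @V = qw(A AE YA YAE EO E YEO YE O WA WAE OE YO U WEO WE WI YU EU YI I);
my @T = ('', qw(G GG GS N NJ NH D L LG LM LB LS LT LP LH
                M B BS S SS NG J C K T P H));

# Build name => codepoint for all 11172 syllables once, up front.
my %codepoint_for;
for my $l (0 .. $#L) {
    for my $v (0 .. $#V) {
        for my $t (0 .. $#T) {
            $codepoint_for{"HANGUL SYLLABLE $L[$l]$V[$v]$T[$t]"}
                = 0xAC00 + ($l * 21 + $v) * 28 + $t;
        }
    }
}

# Hypothetical stand-in for parseHangulName: undef for unknown names.
sub sketch_parse {
    my ($name) = @_;
    return $codepoint_for{$name};
}

printf "U+%04X\n", sketch_parse('HANGUL SYLLABLE GEUL');   # prints U+AE00
print defined sketch_parse('HANGUL SYLLABLE PERL') ? "found\n" : "undef\n";
# prints undef
```

A table trades a little memory for an exact match: any name that is not one of the generated strings simply misses the hash.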

EXPORT

The following functions are exported by default:

decomposeHangul
composeHangul
getHangulName
parseHangulName
getHangulComposite

AUTHOR

SADAHIRO Tomoyuki

bqw10602@nifty.com
http://homepage1.nifty.com/nomenclator/perl/

Copyright(C) 2001, SADAHIRO Tomoyuki. Japan. All rights reserved.

This program is free software; you can redistribute it and/or 
modify it under the same terms as Perl itself.

SEE ALSO

http://www.unicode.org/unicode/reports/tr15

Annex 10: Hangul, in Unicode Normalization Forms (UAX #15).