NAME
Unicode::Transform - conversion among Unicode Transformation Formats (UTFs)
SYNOPSIS
use Unicode::Transform;
$unicode_string = utf16be_to_unicode($utf16be_string);
$utf16le_string = unicode_to_utf16le($unicode_string);
DESCRIPTION
This module provides functions that convert a string between Perl's internal Unicode format and several Unicode Transformation Formats (UTFs).
conversion from UTF to Perl's internal Unicode format
STRING is the source string.
If CODEREF is omitted, any partial octets are deleted.
If CODEREF is specified, each partial octet triggers a call to CODEREF with the octet's value (an integer) as its argument, and the return value is inserted into the result.
(You can call die or croak in CODEREF if you want to trap an ill-formed source.)
utf16le_to_unicode([CODEREF,] STRING)
Converts UTF-16LE to Unicode (Perl's internal Unicode format).
utf16be_to_unicode([CODEREF,] STRING)
Converts UTF-16BE to Unicode.
utf32le_to_unicode([CODEREF,] STRING)
Converts UTF-32LE to Unicode.
utf32be_to_unicode([CODEREF,] STRING)
Converts UTF-32BE to Unicode.
utf8_to_unicode([CODEREF,] STRING)
Converts UTF-8 to Unicode.
utf8mod_to_unicode([CODEREF,] STRING)
Converts UTF-8-Mod to Unicode.
utfcp1047_to_unicode([CODEREF,] STRING)
Converts UTF-EBCDIC (for CP1047) to Unicode.
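The CODEREF handling described in this section can be sketched roughly as follows. This is a minimal illustration, not taken from the module's own tests: the sample octets and the variable names ($octets, $lenient, @partial, $strict) are assumptions, and the example relies on utf16be_to_unicode being exported by default, as the SYNOPSIS suggests.

```perl
use strict;
use warnings;
use Unicode::Transform;

# "A" (00 41), U+3042 (30 42), then one stray trailing octet (30).
my $octets = "\x00\x41\x30\x42\x30";

# No CODEREF: per the description, the partial octet is simply deleted.
my $lenient = utf16be_to_unicode($octets);
printf "lenient: %s\n", join ' ', map { sprintf 'U+%04X', ord } split //, $lenient;

# Recording handler: remember each partial octet and insert nothing.
my @partial;
my $recorded = utf16be_to_unicode(sub { push @partial, $_[0]; '' }, $octets);
printf "partial octets: %s\n", join ' ', map { sprintf '0x%02X', $_ } @partial;

# Strict handler: die on the first partial octet and trap it with eval.
my $strict = eval {
    utf16be_to_unicode(sub { die sprintf("ill-formed octet 0x%02X\n", $_[0]) }, $octets);
};
print "rejected: $@" unless defined $strict;
```

A handler that returns the empty string behaves like the default but lets the caller log what was dropped, while a handler that dies turns the conversion into a validity check.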
conversion from Perl's internal Unicode format to UTF
STRING is the source string.
If CODEREF is omitted, any UTF-illegal characters (high and low surrogate characters, and code points over 0x10FFFF) are deleted.
If CODEREF is specified, each UTF-illegal character triggers a call to CODEREF with the character's Unicode code point (an integer) as its argument, and the return value is inserted into the result.
unicode_to_utf16le([CODEREF,] STRING)
Converts Unicode to UTF-16LE.
unicode_to_utf16be([CODEREF,] STRING)
Converts Unicode to UTF-16BE.
unicode_to_utf32le([CODEREF,] STRING)
Converts Unicode to UTF-32LE.
unicode_to_utf32be([CODEREF,] STRING)
Converts Unicode to UTF-32BE.
unicode_to_utf8([CODEREF,] STRING)
Converts Unicode to UTF-8.
unicode_to_utf8mod([CODEREF,] STRING)
Converts Unicode to UTF-8-Mod.
unicode_to_utfcp1047([CODEREF,] STRING)
Converts Unicode to UTF-EBCDIC (for CP1047).
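For the encoding direction, a comparable sketch might look like the following. The input string and the names $text, $dropped, @illegal and $strict are illustrative assumptions. The handlers here return the empty string or die, because this document does not say how a non-empty return value is encoded in the output.

```perl
use strict;
use warnings;
no warnings 'utf8';    # older perls may warn when a surrogate is created
use Unicode::Transform;

# "A", a lone high surrogate (UTF-illegal), then "B".
my $text = "A" . chr(0xD800) . "B";

# No CODEREF: per the description, the surrogate is deleted.
my $dropped = unicode_to_utf16le($text);
printf "octets: %s\n", join ' ', map { sprintf '%02X', ord } split //, $dropped;

# Recording handler: collect illegal code points and insert nothing.
my @illegal;
my $recorded = unicode_to_utf16le(sub { push @illegal, $_[0]; '' }, $text);
printf "illegal: %s\n", join ' ', map { sprintf 'U+%04X', $_ } @illegal;

# Strict handler: refuse to encode a string containing illegal characters.
my $strict = eval {
    unicode_to_utf16le(sub { die sprintf("cannot encode U+%04X\n", $_[0]) }, $text);
};
print "rejected: $@" unless defined $strict;
```

The same three patterns (default, recording, strict) should carry over to the other unicode_to_* functions, since they share the ([CODEREF,] STRING) signature.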
AUTHOR
SADAHIRO Tomoyuki, <SADAHIRO@cpan.org>
http://homepage1.nifty.com/nomenclator/perl/
Copyright(C) 2002-2003, SADAHIRO Tomoyuki. Japan. All rights reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SEE ALSO
- perlunicode
- http://www.unicode.org/reports/tr16 (UTF-EBCDIC and UTF-8-Mod)