NAME
Jcode - Japanese Charset Handler
SYNOPSIS
use Jcode;
# traditional
Jcode::convert(\$str, $ocode, $icode, "z");
# or OOP!
print Jcode->new($str)->h2z->tr($from, $to)->utf8;
DESCRIPTION
Jcode.pm supports both an object-oriented and a traditional approach. With the object approach, you can write:
$iso_2022_jp = Jcode->new($str)->h2z->jis;
which is more elegant than:
$iso_2022_jp = Jcode::convert(\$str, 'jis', scalar(Jcode::getcode(\$str)), "z");
For those unfamiliar with objects, Jcode.pm still supports getcode() and convert().
Methods
All methods described here return a Jcode object unless otherwise noted.
- $j = Jcode->new($str [, $icode]);
Creates a Jcode object $j from $str. The input code is detected automatically unless you set $icode explicitly (this is necessary if you want to convert from UTF8).
The object keeps the string in EUC format internally. When the object itself is evaluated as a string, it returns the EUC-converted string, so you can "print $j;" without calling an accessor method if you are using EUC (thanks to operator overloading).
Like most Perl objects, a Jcode object is just a reference to a hash, so you can retrieve its guts via $j->{whatever}.
Instead of a scalar value, you can pass a reference:
Jcode->new(\$str);
This saves a little time, at the cost of $str itself being converted. (In a way, $str is now "tied" to the Jcode object.)
- $j->set($str [, $icode]);
Sets $j's internal string to $str. Handy when you use a Jcode object repeatedly (it saves the time and memory of creating a new object each time).
# converts mailbox to SJIS format
my $jconv = Jcode->new;
while (<>) {
    print $jconv->set(\$_)->mime_decode->sjis;
}
- $j->append($str [, $icode]);
Appends $str to $j's internal string.
- $j = jcode($str [, $icode]);
Shortcut for Jcode->new(), so you can write:
$sjis = jcode($str)->sjis;
- $euc = $j->euc;
- $jis = $j->jis;
- $sjis = $j->sjis;
Returns the string in the named encoding. What you code is what you get :)
- $iso_2022_jp = $j->iso_2022_jp;
Same as $j->h2z->jis. Hankaku kana are forcibly converted to Zenkaku.
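The constructor and accessor methods above might be combined as in the following minimal sketch. It assumes Jcode is installed; the byte strings are illustrative EUC-JP samples (not taken from the original documentation), and $icode is passed explicitly so the example does not depend on automatic detection.

```perl
use strict;
use warnings;
use Jcode;

# Illustrative EUC-JP input (assumed sample data): HIRAGANA LETTER A.
my $j = Jcode->new("\xA4\xA2", 'euc');

# append() adds to the internal string; here HIRAGANA LETTER I.
$j->append("\xA4\xA4", 'euc');

# Accessors return the string in the named encoding.
my $euc  = $j->euc;
my $sjis = $j->sjis;
my $jis  = $j->jis;

# set() reuses the same object for a new string.
my $reused = $j->set("\xA4\xA4", 'euc')->sjis;

# jcode() is the shortcut for Jcode->new().
my $short = jcode("\xA4\xA2", 'euc')->sjis;

# Hex dumps keep the output readable on any terminal.
print join(' ', map { unpack('H*', $_) } $euc, $sjis, $jis, $reused, $short), "\n";
```

Printing hex rather than the raw bytes avoids depending on the terminal's own encoding.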
Methods that use MIME::Base64
To use the methods below, you need MIME::Base64. To install it, simply run
perl -MCPAN -e 'CPAN::Shell->install("MIME::Base64")'
- $mime_header = $j->mime_encode;
Converts the object's string to a MIME header as documented in RFC 1522.
- $j->mime_decode;
Decodes the MIME header held in the Jcode object.
You can retrieve the number of matches via $j->{nmatch}.
Methods implemented by Jcode::H2Z
Methods here are actually implemented in Jcode::H2Z.
- $j->h2z([$keep_dakuten]);
Converts X201 kana (Hankaku) to X208 kana (Zenkaku). When $keep_dakuten is set, dakuten are left as is: "ka + dakuten" stays as two characters instead of being converted to "ga".
You can retrieve the number of matches via $j->{nmatch}.
- $j->z2h;
Converts X208 kana (Zenkaku) to X201 kana (Hankaku).
You can retrieve the number of matches via $j->{nmatch}.
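As a rough illustration of h2z() and z2h(), the sketch below converts a half-width "ka + dakuten" pair to full-width "ga", and a full-width "ka" back to half-width. It assumes Jcode is installed; the EUC-JP byte strings are sample data chosen for this example.

```perl
use strict;
use warnings;
use Jcode;

# Illustrative EUC-JP input (assumed sample data): half-width KA followed
# by the half-width dakuten mark. In EUC-JP each is prefixed by 0x8E.
my $hankaku = "\x8E\xB6\x8E\xDE";

# h2z() merges "ka + dakuten" into the single full-width character GA.
my $j       = Jcode->new($hankaku, 'euc')->h2z;
my $zenkaku = $j->euc;

# The match count is kept in the object after the call.
my $nmatch = $j->{nmatch};

# z2h() goes the other way: full-width KA becomes half-width KA.
my $back = Jcode->new("\xA5\xAB", 'euc')->z2h->euc;

print unpack('H*', $zenkaku), ' ', unpack('H*', $back), "\n";
```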
Methods implemented in Jcode::Tr
Methods here are actually implemented in Jcode::Tr.
- $j->tr($from, $to);
Applies tr to the Jcode object. $from and $to can contain EUC Japanese.
You can retrieve the number of matches via $j->{nmatch}.
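A minimal sketch of tr() follows, mapping hiragana to katakana. It assumes Jcode is installed and that character ranges in $from and $to are honored as they are in Perl's own tr; the EUC-JP byte strings are sample data chosen for this example.

```perl
use strict;
use warnings;
use Jcode;

# Illustrative EUC-JP input (assumed sample data): hiragana A, I.
my $hiragana = "\xA4\xA2\xA4\xA4";

# Character ranges written in EUC-JP: the hiragana and katakana blocks.
my $from = "\xA4\xA1-\xA4\xF3";
my $to   = "\xA5\xA1-\xA5\xF3";

# tr() returns the object, so an accessor can be chained onto it.
my $katakana = Jcode->new($hiragana, 'euc')->tr($from, $to)->euc;

print unpack('H*', $katakana), "\n";
```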
Methods implemented in Jcode::Unicode
See Jcode::Unicode for details
Traditional Way
- ($code, [$nmatch]) = getcode($str);
Returns the character code of $str. In list context it also returns how many matching characters were found ($nmatch). As mentioned above, you can pass \$str instead of $str.
Warning: UTF8 is not automatically detected!
jcode.pl Users: This function is 100% upward-compatible with jcode::getcode(), with two exceptions:
* In list context the order of the return values is reversed; jcode::getcode() returns $nmatch first.
* jcode::getcode() returns 'undef' when the number of EUC characters equals that of SJIS characters. Jcode::getcode() returns EUC in that case; Jcode.pm allows no in-betweens.
- Jcode::convert($str, [$ocode, $icode, $opt]);
Converts $str to the character code specified by $ocode. When $icode is also specified, it is taken as the code of the input string instead of the one detected by getcode(). As mentioned above, you can pass \$str instead of $str.
jcode.pl Users: This function is 100% upward-compatible with jcode::convert()!
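The traditional interface described above can be sketched as follows. This is an illustration only, assuming Jcode is installed; the input is a sample ISO-2022-JP string chosen because its escape sequences make detection unambiguous.

```perl
use strict;
use warnings;
use Jcode;

# Illustrative ISO-2022-JP input (assumed sample data): HIRAGANA LETTER A
# wrapped in the JIS X 0208 escape sequences.
my $str = "\e\$B\x24\x22\e(B";

# List context: the code comes first, then the match count.
my ($code, $nmatch) = Jcode::getcode($str);

# Scalar context: only the code.
my $code_only = Jcode::getcode($str);

# Pass the detected code as $icode and use the returned string.
my $euc = Jcode::convert($str, 'euc', $code);

print "$code ", unpack('H*', $euc), "\n";
```

Calling getcode() in scalar context matters when its result is passed directly as an argument, because an argument list imposes list context and would otherwise pass $nmatch along as well.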
BUGS
ACKNOWLEDGEMENTS
This package owes a lot in motivation, design, and code, to the jcode.pl for Perl4 by Kazumasa Utashiro <utashiro@iij.ad.jp>.
Hiroki Ohzaki <ohzaki@iod.ricoh.co.jp> has helped me polish regexp from the very first stage of development.
SEE ALSO
COPYRIGHT
Copyright 1999 Dan Kogai <dankogai@dan.co.jp>
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Unicode conversion table in Jcode::Unicode::Constants is based on files at ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/JIS/, Copyright (c) 1991-1994 Unicode, Inc.