Changes for version 0.01_09 - 2011-11-03

  • Implemented support for an error-handling mechanism in encode and decode

Changes for version 0.01_08 - 2011-10-18

  • Refactored internal function utf8_check() to increase performance

Changes for version 0.01_07 - 2011-10-11

  • Fixed detection of non-shortest form UTF-X on perl versions <= 5.8.6
  • Fixed utf8_length() invocation: don't pass the interpreter context
  • Shortened the Encode.pm comparison

Changes for version 0.01_06 - 2011-09-24

  • Report character position in encode_utf8() warning messages
  • Added a comparison with Encode.pm

Changes for version 0.01_05 - 2011-09-20

  • Corrected the maximal subpart implementation. An initial subsequence of an ill-formed sequence is not a maximal subpart.
    • <C0 80> -> <FFFD FFFD>
    • <ED A0 80> -> <FFFD FFFD FFFD>
    • <EF BF BF> -> <FFFD>
    • <F4 80 80> -> <FFFD>
    • <F4 90 80 80> -> <FFFD FFFD FFFD FFFD>
    • Unicode v6.0, D93b, Maximal subpart of an ill-formed subsequence: the longest code unit subsequence starting at an unconvertible offset that is either:
      a. the initial subsequence of a well-formed code unit sequence, or
      b. a subsequence of length one.
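    The D93b rule and the byte sequences listed for this release can be illustrated with a small decoder. The following is a minimal Python sketch, not the module's actual implementation; the names lead_spec and decode_replace are hypothetical. It covers only well-formedness, so it does not model the <EF BF BF> case, which depends on a separate policy for noncharacters.

```python
REPLACEMENT = "\uFFFD"

def lead_spec(lead):
    """Return (length, lo, hi) for a lead byte, where lo..hi is the range
    allowed for the second byte, or None if the byte cannot start a
    well-formed sequence."""
    if 0xC2 <= lead <= 0xDF:
        return 2, 0x80, 0xBF
    if lead == 0xE0:
        return 3, 0xA0, 0xBF   # excludes non-shortest forms
    if 0xE1 <= lead <= 0xEC or 0xEE <= lead <= 0xEF:
        return 3, 0x80, 0xBF
    if lead == 0xED:
        return 3, 0x80, 0x9F   # excludes surrogates
    if lead == 0xF0:
        return 4, 0x90, 0xBF   # excludes non-shortest forms
    if 0xF1 <= lead <= 0xF3:
        return 4, 0x80, 0xBF
    if lead == 0xF4:
        return 4, 0x80, 0x8F   # excludes code points above U+10FFFF
    return None

def decode_replace(data):
    """Decode UTF-8 bytes, replacing each maximal subpart of an
    ill-formed subsequence with a single U+FFFD."""
    out = []
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:                       # ASCII
            out.append(chr(b))
            i += 1
            continue
        spec = lead_spec(b)
        if spec is None:                   # case b: subsequence of length one
            out.append(REPLACEMENT)
            i += 1
            continue
        length, lo, hi = spec
        cp = b & (0xFF >> (length + 1))    # payload bits of the lead byte
        j = i + 1
        ok = True
        while j < i + length:
            if j >= n:                     # truncated at end of input
                ok = False
                break
            c = data[j]
            lo_j, hi_j = (lo, hi) if j == i + 1 else (0x80, 0xBF)
            if not lo_j <= c <= hi_j:      # byte cannot continue the sequence
                ok = False
                break
            cp = (cp << 6) | (c & 0x3F)
            j += 1
        # On failure, data[i:j] is the maximal subpart (case a) and the
        # offending byte at j is re-examined as a new start.
        out.append(chr(cp) if ok else REPLACEMENT)
        i = j
    return "".join(out)
```

    Under these assumptions, decode_replace(b"\xf4\x80\x80") yields one U+FFFD because the three bytes are an initial subsequence of a well-formed sequence, while decode_replace(b"\xc0\x80") yields two because neither byte can begin one.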

Changes for version 0.01_04 - 2011-09-17

  • Croak if Perl's internal representation of wide characters is ill-formed.
  • Fixed a bug in replacement handling.
  • Added a test for replacement handling.

Changes for version 0.01_03 - 2011-09-16

  • Removed the "Can't represent restricted code point" error; code points above U+10FFFF are now reported as "Can't represent super code point".
  • Instead of just croaking, use the 'utf8' warnings category and leave the choice of error reporting to the user.
  • The maximal subpart of an ill-formed subsequence is now replaced with U+FFFD, as recommended by Unicode.

Changes for version 0.01_02 - 2011-09-13

  • Changed wording in encoding exception messages from "Can't map \w+ code point" to "Can't represent \w+ code point".
  • Added a taint test.
  • Added a leaks test.

Changes for version 0.01_01 - 2011-09-12

  • Initial CPAN release.

Modules

Encoding and decoding of UTF-8 encoding form.