NAME
Pcore::Util::Data
SYNOPSIS
DESCRIPTION
JSON SERIALIZE
ascii(1):
- qq[\xA3] -> \u00A3, upgraded and encoded to UTF-8 character;
- qq[£] -> \u00A3, UTF-8 character;
- qq[ᾥ] -> \u1FA5, UTF-8 character;
latin1(1):
- qq[\xA3] -> qq[\xA3], encoded as bytes;
- qq[£] -> qq[\xA3], downgraded and encoded as bytes;
- qq[ᾥ] -> \u1FA5, downgrade impossible, encoded as UTF-8 character;
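The ascii(1) and latin1(1) cases above can be reproduced with a small script. This is only a sketch: it assumes the options behave like the same-named methods of the core JSON::PP module, which is used here as a stand-in encoder, and hex_dump is a hypothetical helper added for readable output. Note that the encoder may print the escape with lowercase hex digits.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, stand-in for the real encoder (assumption)

# hex_dump is a hypothetical helper: the code points of a string, in hex.
sub hex_dump { return join ' ', map { sprintf '%02X', ord($_) } split //, $_[0] }

my $pound = "\x{A3}";      # U+00A3, fits in latin1
my $greek = "\x{1FA5}";    # U+1FA5, does not fit in latin1

my $ascii  = JSON::PP->new->allow_nonref->ascii(1);
my $latin1 = JSON::PP->new->allow_nonref->latin1(1);

# ascii(1): every character above 0x7F becomes a \uXXXX escape.
my $ascii_pound = $ascii->encode($pound);
my $ascii_greek = $ascii->encode($greek);

# latin1(1): code points up to 0xFF are emitted as single raw bytes,
# anything higher falls back to a \uXXXX escape.
my $latin1_pound = $latin1->encode($pound);
my $latin1_greek = $latin1->encode($greek);

print "ascii  pound: $ascii_pound\n";
print "ascii  greek: $ascii_greek\n";
print "latin1 pound: ", hex_dump($latin1_pound), "\n";
print "latin1 greek: $latin1_greek\n";
```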
utf8 - used only when ascii(0) and latin1(0);
utf8(0) - upgrade scalar, UTF8 on. DO NOT USE: serialized data should always be without the UTF8 flag;
- qq[\xA3] -> "£" (UTF8, multi-byte, len = 1, bytes::len = 2);
- qq[£] -> "£" (UTF8, multi-byte, len = 1, bytes::len = 2);
- qq[ᾥ] -> "ᾥ" (UTF8, multi-byte, len = 1, bytes::len = 3);
utf8(1) - upgrade, encode scalar, UTF8 off;
- qq[\xA3] -> "\xC2\xA3" (latin1, bytes::len = 2);
- qq[£] -> "\xC2\xA3" (latin1, bytes::len = 2);
- qq[ᾥ] -> "\xE1\xBE\xA5" (latin1, bytes::len = 3);
So,
- don't use latin1(1);
- don't use utf8(0);
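The difference between utf8(0) and utf8(1) can be observed directly. The following is a minimal sketch, again assuming the core JSON::PP module as a stand-in for the real encoder; hex_dump is a hypothetical helper. The encoded value includes the surrounding JSON quotes, so the lengths are two larger than those listed above.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, stand-in for the real encoder (assumption)

# hex_dump is a hypothetical helper: the code points of a string, in hex.
sub hex_dump { return join ' ', map { sprintf '%02X', ord($_) } split //, $_[0] }

my $pound = "\x{A3}";      # U+00A3
my $greek = "\x{1FA5}";    # U+1FA5

# utf8(1): the result is a string of UTF-8 encoded octets.
my $octets_pound = JSON::PP->new->allow_nonref->utf8(1)->encode($pound);
my $octets_greek = JSON::PP->new->allow_nonref->utf8(1)->encode($greek);

# utf8(0): the result is a character string, one element per character.
my $chars_pound = JSON::PP->new->allow_nonref->utf8(0)->encode($pound);

for my $item ( [ 'utf8(1) pound', $octets_pound ],
               [ 'utf8(1) greek', $octets_greek ],
               [ 'utf8(0) pound', $chars_pound ] ) {
    my ( $label, $json ) = @$item;
    printf "%s: length=%d, UTF8 flag=%s, data=%s\n",
        $label, length($json), ( utf8::is_utf8($json) ? 'on' : 'off' ),
        hex_dump($json);
}
```

Octets produced with utf8(1) can be written to a file or socket as they are, which is why the notes above recommend that mode.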
JSON DESERIALIZE
utf8(0):
- qq[\xA3] -> "£", upgrade;
- qq[£] -> "£", as is;
- qq[\xC2\xA3] -> "Â£" (two characters), each byte is upgraded separately, result is invalid;
- qq[ᾥ] -> error;
utf8(1):
- qq[\xA3] -> error, can't decode utf8;
- qq[£] -> error, can't decode utf8;
- qq[\xC2\xA3] -> "£", decode utf8;
- qq[ᾥ] -> error, can't decode utf8;
So,
- if data was encoded with utf8(0) - use utf8(0) to decode;
- if data was encoded with utf8(1) - use utf8(1) to decode;
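The matching rule above can be illustrated with a round trip. This is a sketch under the same assumption as before: the core JSON::PP module stands in for the real codec, and the variable names are illustrative only. It shows both matching pairs and both mismatches.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, stand-in for the real codec (assumption)

my $text = "\x{A3}";    # U+00A3

my $octet_codec = JSON::PP->new->allow_nonref->utf8(1);
my $char_codec  = JSON::PP->new->allow_nonref->utf8(0);

my $octets = $octet_codec->encode($text);    # UTF-8 encoded octets
my $chars  = $char_codec->encode($text);     # character string

# Matching pairs: the original value comes back unchanged.
my $from_octets = $octet_codec->decode($octets);
my $from_chars  = $char_codec->decode($chars);

# Mismatch 1: octets decoded with utf8(0). Each byte is taken as a
# separate character, so the result is two characters instead of one.
my $mojibake = $char_codec->decode($octets);

# Mismatch 2: characters decoded with utf8(1). The lone 0xA3 is not
# valid UTF-8, so the decoder dies; the message is captured here.
my $error = eval { $octet_codec->decode($chars); 1 } ? '' : $@;

printf "utf8(1) -> utf8(1): %d character(s)\n", length $from_octets;
printf "utf8(0) -> utf8(0): %d character(s)\n", length $from_chars;
printf "utf8(1) -> utf8(0): %d character(s)\n", length $mojibake;
printf "utf8(0) -> utf8(1): %s\n", $error ne '' ? 'decode failed' : 'decoded';
```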