NAME
Unicode.pod - Working with unicode
VERSION
version 2.07
DESCRIPTION
Working with unicode.
For a practical example, see the Catalyst application in the examples/unicode
directory in this distribution.
ASSUMPTIONS
In this tutorial, we're assuming that all encodings are UTF-8. It's relatively simple to combine different encodings from different sources, but that's beyond the scope of this tutorial.
For simplicity, we're also going to assume that you're using Catalyst for your web framework, DBIx::Class for your database ORM, TT for your templating system, and YAML-format HTML::FormFu configuration files, with YAML::XS installed. However, the principles we'll cover should translate to whatever technologies you choose to work with.
BASICS
To make it short and sweet: you must decode all data going into your program, and encode all data coming from your program.
Skip to "CHANGES REQUIRED" if you want to see what you need to do without any other explanation.
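As a minimal sketch of the rule above, the following uses the core Encode module to decode bytes on the way in and encode characters on the way out. The sample string and variable names are illustrative only, and a real application would normally let its framework do this at the boundaries, as described in the sections below.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Bytes as they might arrive from a browser, file or socket:
# the word "café" encoded as UTF-8 (illustrative sample data).
my $input_bytes = "caf\xC3\xA9";

# Decode at the input boundary: bytes become a string of characters.
my $text = decode( 'UTF-8', $input_bytes );

# Inside the program, work on characters rather than bytes.
my $char_count = length $text;    # 4 characters
my $upper      = uc $text;        # case mapping applies to the decoded "é" too

# Encode at the output boundary: characters become UTF-8 bytes again.
my $output_bytes = encode( 'UTF-8', $upper );
my $byte_count   = length $output_bytes;    # 5 bytes
```

Note that the character count and the byte count differ: that difference is what goes wrong when data is not decoded before use.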
INPUT
Input parameters from the browser
If you're using Catalyst, Catalyst::Plugin::Unicode will decode all input parameters sent from the browser to your application - see "Catalyst Configuration".
If you're using another framework, or otherwise need to decode the input parameters yourself, take a look at HTML::FormFu::Filter::Encode.
Data from the database
If you're using DBIx::Class, DBIx::Class::UTF8Columns is likely the best option, as it will decode all input retrieved from the database - see "DBIx::Class Configuration".
In other cases (e.g. plain DBI), you still need to decode the string data coming from the database. How to do this varies with the database server. For MySQL, for instance, you can use the mysql_enable_utf8 attribute: see the DBD::mysql documentation for details.
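For the plain DBI case, a sketch of passing mysql_enable_utf8 at connect time might look like the following. The helper name mysql_utf8_connect_args, the database name and the credentials are all hypothetical. The helper only builds the argument list, so it can be inspected without a running server.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the argument list for DBI->connect.
# The database name, user and password are placeholders.
sub mysql_utf8_connect_args {
    my ( $database, $user, $password ) = @_;
    return (
        "dbi:mysql:database=$database",
        $user,
        $password,
        {
            RaiseError        => 1,
            mysql_enable_utf8 => 1,    # DBD::mysql attribute named in the text above
        },
    );
}

my @args = mysql_utf8_connect_args( 'myapp', 'user', 'secret' );

# With DBI and DBD::mysql installed and a reachable server, usage would be:
#   use DBI;
#   my $dbh = DBI->connect(@args);
```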
Your template files
Set TT to decode all template files - see "TT Configuration".
HTML::FormFu's own template files
Set HTML::FormFu to decode all template files - see "HTML::FormFu Template Configuration".
HTML::FormFu form configuration files
If you're using YAML config files, your files will automatically be decoded by HTML::FormFu's load_config_file and load_config_filestem methods.
If you have Config::General config files, your files will automatically be decoded by HTML::FormFu's load_config_file and load_config_filestem methods, which automatically set Config::General's -UTF8 setting.
Your perl source code
Any Perl source file that contains Unicode characters must use the utf8 pragma.
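To illustrate what the utf8 pragma changes, here is a small sketch that measures the same literal with and without it. It assumes the source file itself is saved as UTF-8. The variable names are illustrative.

```perl
use strict;
use warnings;

my ( $with_pragma, $without_pragma );

{
    use utf8;    # string literals in this block are decoded from UTF-8
    $with_pragma = length "€";       # 1 character
}
{
    no utf8;     # the default: string literals are treated as raw bytes
    $without_pragma = length "€";    # 3 bytes (the UTF-8 encoding of U+20AC)
}
```

Without the pragma, Perl sees three separate bytes rather than one character, so operations such as length, regex matching and case mapping give misleading results.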
OUTPUT
Data saved to the database
With DBIx::Class, DBIx::Class::UTF8Columns will encode all data sent to the database - see "DBIx::Class Configuration".
HTML sent to the browser
With Catalyst, Catalyst::Plugin::Unicode will encode all output sent from your application to the browser - see "Catalyst Configuration".
In other circumstances you need to be sure to output your Unicode (decoded) strings in UTF-8. To do this you can encode your output before it's sent to the browser with something like:
use utf8;
if ( $output && utf8::is_utf8($output) ) {
    utf8::encode( $output ); # Encodes in-place
}
Another option is to set the binmode for STDOUT:
binmode STDOUT, ':utf8';
However, be sure to do this only when sending UTF-8 data: if you're serving images, PDF files, etc., binmode should remain set to :raw.
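The difference between the two layers can be sketched with in-memory filehandles standing in for STDOUT. This example uses the :encoding(UTF-8) layer, a stricter relative of the :utf8 layer shown above. The sample data and variable names are illustrative.

```perl
use strict;
use warnings;

my $text      = "caf\x{E9}";             # four characters, already decoded
my $png_magic = "\x89PNG\r\n\x1A\n";     # eight bytes of binary data

# Text output: the encoding layer turns characters into UTF-8 bytes.
my $text_buffer = '';
open my $text_fh, '>', \$text_buffer or die "Cannot open buffer: $!";
binmode $text_fh, ':encoding(UTF-8)';
print {$text_fh} $text;
close $text_fh;

# Binary output: the :raw layer passes the bytes through untouched.
my $binary_buffer = '';
open my $bin_fh, '>', \$binary_buffer or die "Cannot open buffer: $!";
binmode $bin_fh, ':raw';
print {$bin_fh} $png_magic;
close $bin_fh;
```

The text buffer ends up holding five bytes, because "é" is encoded as two, while the binary buffer is byte-for-byte identical to its input. Pushing binary data through a UTF-8 layer would instead re-encode its high bytes and corrupt the file.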
CHANGES REQUIRED
Catalyst Configuration
Add Catalyst::Plugin::Unicode to the list of Catalyst plugins:
use Catalyst qw( ConfigLoader Static::Simple Unicode );
DBIx::Class Configuration
Add DBIx::Class::UTF8Columns to the list of components loaded, for each table that has columns storing Unicode:
__PACKAGE__->load_components( qw( UTF8Columns HTML::FormFu PK::Auto Core ) );
Pass each column name that will store Unicode to utf8_columns():
__PACKAGE__->utf8_columns( qw( lastname firstname ) );
TT Configuration
Tell TT to decode all template files by adding the following to your application config in MyApp.pm:
package MyApp;
use parent 'Catalyst';
use Catalyst qw( ConfigLoader );
MyApp->config({
    'View::TT' => {
        ENCODING => 'UTF-8',
    },
});
1;
HTML::FormFu Template Configuration
Make HTML::FormFu tell TT to decode all template files by adding the following to your application config in MyApp.pm:
package MyApp;
use parent 'Catalyst';
use Catalyst qw( ConfigLoader );
MyApp->config({
    'Controller::HTML::FormFu' => {
        constructor => {
            tt_args => {
                ENCODING => 'UTF-8',
            },
        },
    },
});
1;
The two examples above should be combined, like so:
package MyApp;
use parent 'Catalyst';
use Catalyst qw( ConfigLoader );
MyApp->config({
    'Controller::HTML::FormFu' => {
        constructor => {
            tt_args => {
                ENCODING => 'UTF-8',
            },
        },
    },
    'View::TT' => {
        ENCODING => 'UTF-8',
    },
});
1;
AUTHORS
Carl Franks cfranks@cpan.org
Michele Beltrame arthas@cpan.org (contributions)
COPYRIGHT
This document is free, you can redistribute it and/or modify it under the same terms as Perl itself.
AUTHOR
Carl Franks <cpan@fireartist.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2018 by Carl Franks.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.