NAME
Search::Tools::Transliterate - transliterations of UTF-8 chars
SYNOPSIS
my $tr = Search::Tools::Transliterate->new();
print $tr->convert( 'some string of utf8 chars' );
DESCRIPTION
Search::Tools::Transliterate transliterates UTF-8 characters to single-byte equivalents. It is based on the transmap project by Markus Kuhn (http://www.cl.cam.ac.uk/~mgk25/).
METHODS
new
Creates a new instance.
convert( text )
Returns text with every character converted to a single-byte equivalent, transliterated according to %Map.
VARIABLES
%Map
Package variable that holds all the character mappings. You can alter it to taste with:
use Search::Tools::Transliterate;
my $tr = Search::Tools::Transliterate->new;
$Search::Tools::Transliterate::Map{mychar} = 'my transliteration';
BUGS
You might consider the whole attempt a bug. It's really an attempt to accommodate applications that don't support Unicode. Perhaps we shouldn't even try. But for things like curly quotes and other 'smart' punctuation, it's often helpful to render the UTF-8 character as something rather than just letting a character without a direct translation slip into the ether.
That said, if a character has no mapping (and there are plenty that do not), a single space will be used.
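The mapped-or-space behavior described above can be illustrated with a minimal, self-contained sketch. It is not the module's actual implementation: the %map table and the transliterate_sketch() helper are hypothetical stand-ins for %Map and convert(), and the sketch assumes that every character outside ASCII needs a mapping, which may differ from what the module treats as single-byte.

```perl
use strict;
use warnings;

# Hypothetical mapping table standing in for the module's %Map.
my %map = (
    "\x{2018}" => "'",      # left single curly quote
    "\x{2019}" => "'",      # right single curly quote
    "\x{201C}" => '"',      # left double curly quote
    "\x{201D}" => '"',      # right double curly quote
    "\x{2026}" => '...',    # horizontal ellipsis
);

# Hypothetical helper: replace each non-ASCII character with its
# mapping, or with a single space when no mapping exists.
sub transliterate_sketch {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/exists $map{$1} ? $map{$1} : ' '/ge;
    return $text;
}

# U+2603 (snowman) has no entry in %map, so it becomes a space.
print transliterate_sketch("\x{201C}hi\x{201D}\x{2026}\x{2603}!"), "\n";
```

Adding an entry to %map before calling the helper changes the result for that character, which mirrors how altering %Map is described as affecting convert().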
AUTHOR
Peter Karman perl@peknet.com
Thanks to Atomic Learning www.atomiclearning.com
for sponsoring the development of this module.
COPYRIGHT
Copyright 2006 by Peter Karman. This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SEE ALSO
Search::Tools, Unicode::Map, Encode