NAME

Unicode::Towctrans - Generate small casefolding tables

SYNOPSIS

gen_wctrans
gen_wctrans --safec
gen_wctrans --musl
gen_wctrans -v 15
gen_wctrans -v 15 --cf CaseFolding.txt.15 --out towctrans-15.h

DESCRIPTION

gen_wctrans generates a towctrans.h header file. musl and safeclib use this header to build small and efficient case-folding tables for the libc towupper() and towlower() functions and their secure variants towupper_s() and towlower_s().
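
To illustrate why such tables can stay small, here is a minimal sketch that stores case mappings as ranges with a constant offset instead of one entry per character. The struct case_range, the table upper_ranges and the function sketch_towlower() are hypothetical names chosen for this example. They are not the layout of the generated towctrans.h, and the table covers only a few well-known ranges.

```c
#include <stddef.h>
#include <wchar.h>   /* wint_t */

/* Hypothetical range entry: code points first..last are lower-cased by
   adding delta. This is NOT the layout of the generated towctrans.h. */
struct case_range {
    unsigned first;
    unsigned last;
    int delta;
};

/* A few well-known upper-case ranges, sorted by first. */
static const struct case_range upper_ranges[] = {
    { 0x0041, 0x005A, 32 },  /* LATIN CAPITAL LETTER A..Z */
    { 0x00C0, 0x00D6, 32 },  /* A WITH GRAVE..O WITH DIAERESIS */
    { 0x00D8, 0x00DE, 32 },  /* O WITH STROKE..THORN */
    { 0x0391, 0x03A1, 32 },  /* GREEK CAPITAL LETTER ALPHA..RHO */
    { 0x0410, 0x042F, 32 },  /* CYRILLIC CAPITAL LETTER A..YA */
};

/* Binary search over the sorted ranges; unmapped characters are
   returned unchanged. */
static wint_t sketch_towlower(wint_t wc)
{
    size_t lo = 0;
    size_t hi = sizeof upper_ranges / sizeof upper_ranges[0];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (wc < upper_ranges[mid].first)
            hi = mid;
        else if (wc > upper_ranges[mid].last)
            lo = mid + 1;
        else
            return wc + (wint_t)upper_ranges[mid].delta;
    }
    return wc;
}
```

With this sketch, sketch_towlower(0x00C4) yields 0x00E4, while 0x00D7 (the multiplication sign, which sits between two ranges) is returned unchanged.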

If the code may run on a system with a Turkish or Azeri locale, define -DHAVE_LOCALE_TR to check for the special Turkish i locale and its mappings at run-time.
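
The Turkish i problem can be sketched as follows. In Turkish and Azeri, the upper case of i is U+0130 (I with dot above) and the lower case of I is U+0131 (dotless i). The helpers below are hypothetical and ASCII-only apart from these two mappings. They only illustrate what a run-time locale check could look like and are not the code that HAVE_LOCALE_TR enables.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: does a locale name such as "tr_TR.UTF-8" or
   "az_AZ" select Turkish or Azeri? */
static int name_is_tr_or_az(const char *name)
{
    if (!name)
        return 0;
    if (strncmp(name, "tr", 2) != 0 && strncmp(name, "az", 2) != 0)
        return 0;
    /* The two-letter language code must end here. */
    return name[2] == '\0' || name[2] == '_' ||
           name[2] == '.' || name[2] == '@';
}

/* Run-time check of the current LC_CTYPE locale. */
static int current_locale_is_turkic(void)
{
    return name_is_tr_or_az(setlocale(LC_CTYPE, NULL));
}

/* ASCII-only upper-casing plus the Turkic special case. */
static wint_t sketch_towupper_tr(wint_t wc, int turkic)
{
    if (turkic && wc == L'i')
        return 0x0130;           /* LATIN CAPITAL LETTER I WITH DOT ABOVE */
    if (wc >= L'a' && wc <= L'z')
        return wc - 32;
    return wc;
}

/* ASCII-only lower-casing plus the Turkic special case. */
static wint_t sketch_towlower_tr(wint_t wc, int turkic)
{
    if (turkic && wc == L'I')
        return 0x0131;           /* LATIN SMALL LETTER DOTLESS I */
    if (wc >= L'A' && wc <= L'Z')
        return wc + 32;
    return wc;
}
```

A caller would pass current_locale_is_turkic() as the turkic argument, so that the same input character maps differently depending on the active locale.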

If you know that your iswalpha() works correctly (only with musl), use --with_iswalpha to get a slightly faster function, e.g. for benchmarking.

Generating the multi-byte folding tables for safeclib's wcsfc_s() is also planned. The single-byte towupper() and towlower() conversions are meaningless for the many Unicode mappings that fold to more than one character, i.e. those with status F (full folding). Use a proper string case-folding function for those instead.
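
A minimal sketch of why status F mappings need a string function: U+00DF (sharp s) folds to the two characters "ss", which no function returning a single wint_t can express. The function sketch_wcsfc() and its two-entry table are hypothetical. It is not the wcsfc_s() interface of safeclib, and apart from the table it handles ASCII only.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical full-folding entry: one character folds to a short
   NUL-terminated string (at most 2 characters in this sketch). */
struct full_fold {
    wchar_t from;
    wchar_t to[3];
};

/* Two status F entries as listed in CaseFolding.txt. */
static const struct full_fold full_folds[] = {
    { 0x00DF, { 0x0073, 0x0073, 0 } },  /* SHARP S -> "ss" */
    { 0xFB00, { 0x0066, 0x0066, 0 } },  /* LIGATURE FF -> "ff" */
};

/* Folds src into dst, which holds dstlen wide characters including the
   terminating NUL. Returns 0 on success, -1 if dst is too small. */
static int sketch_wcsfc(wchar_t *dst, size_t dstlen, const wchar_t *src)
{
    size_t n = 0;

    for (; *src; src++) {
        wchar_t one[2] = { *src, 0 };
        const wchar_t *rep = NULL;
        size_t i;

        for (i = 0; i < sizeof full_folds / sizeof full_folds[0]; i++)
            if (full_folds[i].from == *src)
                rep = full_folds[i].to;
        if (!rep) {
            /* Simple one-to-one folding, ASCII only in this sketch. */
            if (*src >= L'A' && *src <= L'Z')
                one[0] = *src + 32;
            rep = one;
        }
        for (; *rep; rep++) {
            if (n + 1 >= dstlen)
                return -1;      /* keep room for the NUL */
            dst[n++] = *rep;
        }
    }
    if (n >= dstlen)
        return -1;
    dst[n] = 0;
    return 0;
}
```

The output can be longer than the input, which is why the caller has to pass the destination size.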

PERFORMANCE

The generated code is not yet fully optimized, but it is small and fast enough compared to the other implementations. More importantly, it is correct compared to glibc, which ignores characters from other locales.

make -C examples
./bench
      my:        160 [us]
musl-new:        352 [us]
musl-old:        286 [us]
   glibc:        197 [us]

 wc -c towctrans-*.o
   5072 towctrans-my.o
   7096 towctrans-musl-new.o
   3408 towctrans-musl-old.o
  97432 towctrans-glibc.o

INSTALLATION

Perl 5.12 or later is required.

This module does not need to be installed; running gen_wctrans is enough. For full testing and a global installation, however, run:

perl Makefile.PL
make
make test
make test-all
sudo make install

DEPENDENCIES

This module requires a CaseFolding.txt file from the Unicode Character Database, which is downloaded automatically via wget if it is missing.
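
For orientation, each data line in CaseFolding.txt has the form "code; status; mapping; # name", where status is C (common), F (full), S (simple) or T (Turkic), and an F mapping may list several code points. The following sketch parses one such line in C. The function parse_casefolding_line() is a hypothetical helper for illustration only and says nothing about how gen_wctrans itself reads the file.

```c
#include <stdio.h>

/* Parses one CaseFolding.txt line such as
 *   "00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S"
 * Stores the source code point, the status letter and up to three
 * mapped code points. Returns the number of mapped code points (1..3),
 * or 0 for comment, blank or malformed lines. */
static int parse_casefolding_line(const char *line, unsigned *code,
                                  char *status, unsigned to[3])
{
    int n = sscanf(line, "%x; %c; %x %x %x",
                   code, status, &to[0], &to[1], &to[2]);

    /* sscanf returns EOF on empty input and a small count on comment
       lines, so anything below 3 conversions is not a data line. */
    return n >= 3 ? n - 2 : 0;
}
```

Lines with status C or S always yield exactly one mapped code point, which is the case a single-character towlower() or towupper() can represent.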

AUTHOR

Reini Urban <rurban@cpan.org>

Copyright(C) 2026 Reini Urban. All rights reserved

COPYRIGHT AND LICENSE

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The generated files are MIT licensed. See the headers of the generated files.

SEE ALSO

https://www.unicode.org/reports/tr44/#Casemapping
https://git.musl-libc.org/cgit/musl/tree/src/ctype/towctrans.c
https://git.musl-libc.org/cgit/musl/tree/src/ctype/towctrans.c?id=e8aba58ab19a18f83d7f78e80d5e4f51e7e4e8a9
https://github.com/rurban/safeclib/blob/master/src/extwchar/towctrans.c
https://sourceware.org/git/?p=glibc.git;a=tree;f=wctype;;hb=HEAD