NAME
Unicode::Properties - find out what properties a character has
SYNOPSIS
use utf8; # the source code below contains a literal UTF-8 character
use Unicode::Properties 'uniprops';
my @prop_list = uniprops ('☺'); # Unicode smiley face, U+263A
print "@prop_list\n";
prints
InMiscellaneousSymbols Any Assigned Common
You can then use, for example, \p{InMiscellaneousSymbols}
to match this character in a regular expression.
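As a minimal sketch of that last step, the following uses only core Perl and does not need this module. The sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample text containing U+263A (smiley) and U+2602 (umbrella),
# written as escapes so the source file stays plain ASCII.
my $text = "Hello \x{263A} world \x{2602}";

# \p{InMiscellaneousSymbols} matches any code point in the
# Miscellaneous Symbols block, U+2600 to U+26FF.
my @symbols = $text =~ /(\p{InMiscellaneousSymbols})/g;

printf "U+%04X\n", ord $_ for @symbols;
```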
FUNCTIONS
uniprops
Given a character, returns a list of properties which the character has.
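The idea can be sketched in core Perl as testing a character against a list of candidate property names. This is an illustration under assumptions, not the module's own code: props_from_list is a hypothetical helper, and the candidate list is a small hand-picked sample rather than the full list the module uses.

```perl
use strict;
use warnings;

# A tiny, hand-picked candidate list, for illustration only.
my @candidates = qw/InMiscellaneousSymbols InBasicLatin Any Assigned Common Latin/;

# Hypothetical helper: return those names in @names which $char matches.
sub props_from_list
{
    my ($char, @names) = @_;
    return grep { $char =~ /\p{$_}/ } @names;
}

my @props = props_from_list ("\x{263A}", @candidates);
print "@props\n";
```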
matchchars
my @matching = matchchars ($property);
Returns a list of all the characters which match a particular property. If $property
is not found in the list of possible Unicode properties, matchchars treats it as a regular expression instead.
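A rough core-Perl sketch of collecting the characters which match a property is shown below. It is not how matchchars is implemented; chars_in_range_matching is a hypothetical helper which scans a caller-supplied range of code points and returns code points rather than characters.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Unicode::Properties: return every
# code point between $from and $to whose character matches $property.
sub chars_in_range_matching
{
    my ($property, $from, $to) = @_;
    my $re = qr/\p{$property}/;
    return grep { chr ($_) =~ $re } $from .. $to;
}

my @code_points = chars_in_range_matching ('InMiscellaneousSymbols', 0x2500, 0x27FF);
printf "%d code points, from U+%04X to U+%04X\n",
    scalar @code_points, $code_points[0], $code_points[-1];
```

Scanning a bounded range keeps the example fast; covering every Unicode code point would mean looping from 0 to 0x10FFFF.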
BUGS
- Data source
This module uses a list taken from the "perlunicode" documentation. It would be better to use Perl's internals to get the list, but I don't know how to do that.
- Perl & Unicode version
Results depend on your Perl and Unicode versions. For example, "Balinese" was added in Unicode version 5.0.0. An unpatched Perl 5.8.8 uses Unicode version 4.1.0, so "Balinese" does not appear in its results list.
Also, I don't know the behaviour of Unicode versions other than 4.1.0 and 5.0.0, so this module only covers those two. I couldn't get Perl 5.8.5 to install on my computer, so I've set the minimum version to 5.8.8 for this module.
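To see which Unicode version a given perl uses, and whether it knows a particular property, something like the following may help. It relies on the core module Unicode::UCD; has_property is a hypothetical helper written for this example and is not part of Unicode::Properties.

```perl
use strict;
use warnings;
use Unicode::UCD;

# The Unicode version of the running perl, for example "5.0.0".
my $unicode_version = Unicode::UCD::UnicodeVersion ();

# Hypothetical helper: true if this perl accepts \p{$name}.
# An unknown property makes the match die, which the eval catches.
sub has_property
{
    my ($name) = @_;
    return eval { "a" =~ /\p{$name}/; 1 } ? 1 : 0;
}

print "Perl $], Unicode $unicode_version\n";
print "Balinese is ",
    (has_property ('Balinese') ? "available" : "not available"), "\n";
```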
SEE ALSO
- The "uniprops" script in Unicode::Tussle
This script was written because its author, Tom Christiansen, was dissatisfied with Unicode::Properties. Unfortunately, it uses the same method as this module: parsing the Perl documentation to get the information. It works only with Perl versions 5.12 or 5.14.
COPYRIGHT & LICENSE
Copyright © 2011 Ben Bullock, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
AUTHOR
Ben Bullock, <bkb@cpan.org>