NAME
Unicode::ICU::Collator - wrapper around ICU collation services
SYNOPSIS
    use Unicode::ICU::Collator;
    my $coll = Unicode::ICU::Collator->new($locale);
    # name of the locale actually selected
    print $coll->getLocale;
    # sort according to locale
    my @sorted = $coll->sort(@unsorted);
    # comparisons
    my @sorted = sort {
        $coll->cmp($a->name, $b->name)
    } @unsorted;
    # build sort keys
    my @sorted = map $_->[1],
        sort { $a->[0] cmp $b->[0] }
        map [ $coll->getSortKey($_->name), $_ ], @unsorted;
    # get the display name of a collation locale
    print Unicode::ICU::Collator->getDisplayName("de__phonebook", "en");
    # German (PHONEBOOK)
    print Unicode::ICU::Collator->getDisplayName("de__phonebook", "de");
    # Deutsch (PHONEBOOK)
DESCRIPTION
Unicode::ICU::Collator is a thin (and currently incomplete) wrapper around ICU's collation functions.
CLASS METHODS
- new($locale)
Create a new collation object for the specified locale.
    my $coll = Unicode::ICU::Collator->new("en");
    my $coll_de = Unicode::ICU::Collator->new("de_phonebook");
- available()
Return a list of the available collation locale names.
    my @locales = Unicode::ICU::Collator->available;
- getDisplayName($locale, $display_locale)
Return a descriptive name of the locale $locale for display in the locale $display_locale.
    # probably "English"
    my $en_en = Unicode::ICU::Collator->getDisplayName("en", "en");
    # "German"
    my $de_en = Unicode::ICU::Collator->getDisplayName("de", "en");
    # "Deutsch"
    my $de_de = Unicode::ICU::Collator->getDisplayName("de", "de");
    # "Deutsch (PHONEBOOK)"
    my $deph_de = Unicode::ICU::Collator->getDisplayName("de__phonebook", "de");
INSTANCE METHODS
- cmp($str1, $str2)
Compare two strings per the selected collation, returning -1, 0, or 1, as perl's cmp does.
    my $cmp = $coll->cmp($str1, $str2);
    my @sorted = sort { $coll->cmp($a, $b) } @unsorted;
- eq($str1, $str2)
- ne($str1, $str2)
- lt($str1, $str2)
- gt($str1, $str2)
- le($str1, $str2)
- ge($str1, $str2)
Compare the strings lexically within the collation, returning true or false.
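As an illustration, the following sketch calls a few of the predicate methods with the "en" locale. The helper name compare_demo and the sample strings are made up for this example, and the module is loaded with require so the script simply prints nothing if it isn't installed.

```perl
use strict;
use warnings;

# compare_demo() is a hypothetical helper for this example only.
# It returns a hash ref of results, or nothing if the module is missing.
sub compare_demo {
    return unless eval { require Unicode::ICU::Collator; 1 };
    my $coll = Unicode::ICU::Collator->new("en");
    my %result = (
        apple_lt_banana => ($coll->lt("apple", "banana") ? 1 : 0),
        apple_ge_banana => ($coll->ge("apple", "banana") ? 1 : 0),
        apple_eq_apple  => ($coll->eq("apple", "apple")  ? 1 : 0),
        apple_ne_apple  => ($coll->ne("apple", "apple")  ? 1 : 0),
    );
    return \%result;
}

if (my $r = compare_demo()) {
    print "$_: $r->{$_}\n" for sort keys %$r;
}
```

Each predicate should agree with the sign of cmp() for the same pair of strings, so lt() and ge() are always opposites.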
- getSortKey($str)
Return a binary string suitable for use with perl's built-in string comparison operators such as cmp. Comparing two sort keys gives the same ordering as comparing their source strings with the collator.
    my @sorted = map $_->[1],
        sort { $a->[0] cmp $b->[0] }
        map [ $coll->getSortKey($_->name), $_ ], @unsorted;
- sort(@list)
Return the contents of @list (which can be any list, not just an array) sorted per the collation.
Currently this is simply a perl wrapper around getSortKey(), but that may change.
    my @sorted = $coll->sort(@unsorted);
- getLocale()
- getLocale($type)
Return the locale used as the source of the collation, or the most specific collation name known, depending on $type.
$type is one of the following constants, as exported by the :locale export tag:
ULOC_ACTUAL_LOCALE - the actual locale being used, e.g. if you supply "en_US" to new, this will probably return "en". If $type is not provided, this is the default.
ULOC_VALID_LOCALE - the most specific locale supported by ICU.
    my $name = $coll->getLocale();
    use Unicode::ICU::Collator ':locale';
    my $name = $coll->getLocale(ULOC_VALID_LOCALE());
Previously you could supply ULOC_REQUESTED_LOCALE to get the locale name supplied to new(), but this was deprecated in ICU and current versions of ICU return an error, so I've removed it.
- setAttribute($attr, $value)
Set an attribute for the collation.
Constants for $attr and $value are exported by the :attributes tag.
Please see the documentation of the UColAttribute type in the ICU documentation for details.
    $coll->setAttribute(UCOL_NUMERIC_COLLATION(), UCOL_ON());
- getAttribute($attr)
Return the value of a collation attribute.
    my $value = $coll->getAttribute(UCOL_NUMERIC_COLLATION());
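To show what an attribute changes in practice, here is a minimal sketch that turns on numeric collation and sorts some file names. The helper name numeric_sort_demo and the sample data are invented for this example. The constants are imported at run time on the assumption that the module uses the usual Exporter-style import for its :attributes tag.

```perl
use strict;
use warnings;

# numeric_sort_demo() is a hypothetical helper for this example only.
# It returns the sorted list, or an empty list if the module is missing.
sub numeric_sort_demo {
    return unless eval { require Unicode::ICU::Collator; 1 };
    # Run-time equivalent of: use Unicode::ICU::Collator ':attributes';
    Unicode::ICU::Collator->import(':attributes');
    my $coll = Unicode::ICU::Collator->new("en");
    # With numeric collation on, runs of digits compare by numeric value.
    $coll->setAttribute(UCOL_NUMERIC_COLLATION(), UCOL_ON());
    return $coll->sort(qw(file10 file2 file1));
}

my @sorted = numeric_sort_demo();
print "@sorted\n" if @sorted;   # expected order: file1 file2 file10
```

Without the setAttribute() call the same input would normally sort as file1, file10, file2, since the digits are then compared character by character.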
- getRules()
- getRules($type)
Retrieve the collation rules used by this collator.
Note: this is typically a long string for UCOL_FULL_RULES, and probably isn't very useful.
Values for $type are:
UCOL_FULL_RULES - the full set of rules for the collation. This is the default.
UCOL_TAILORING_ONLY - only the rule tailoring.
- getVersion()
Return version information for the collator as a dotted decimal string.
- getUCAVersion()
Return the UCA version information for a collator.
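A short sketch of reporting both version values follows, for example to log which collation data a sort was produced with. The helper name version_info is made up for this example, and the exact strings returned depend on the installed ICU.

```perl
use strict;
use warnings;

# version_info() is a hypothetical helper for this example only.
# It returns (collator version, UCA version), or an empty list if the
# module is missing.
sub version_info {
    return unless eval { require Unicode::ICU::Collator; 1 };
    my $coll = Unicode::ICU::Collator->new("en");
    return ($coll->getVersion, $coll->getUCAVersion);
}

if (my ($version, $uca_version) = version_info()) {
    print "collator version: $version\n";
    print "UCA version: $uca_version\n";
}
```

Sort keys from getSortKey() are only comparable when produced by the same collation data, so recording these values alongside stored keys can help detect when the keys need rebuilding.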
LICENSE
Unicode::ICU::Collator is licensed under the same terms as Perl itself.
SEE ALSO
http://site.icu-project.org/
http://userguide.icu-project.org/collation
http://icu-project.org/apiref/icu4c/ucol_8h.html
AUTHOR
Tony Cook <tonyc@cpan.org>