NAME

Unicode::Collate::Locale - Linguistic tailoring for DUCET via Unicode::Collate

SYNOPSIS

use Unicode::Collate::Locale;

$Collator = Unicode::Collate::Locale->
    new(locale => $locale_name, %tailoring);

@sorted = $Collator->sort(@not_sorted);

DESCRIPTION

This module provides linguistic tailoring for the DUCET, building on Unicode::Collate.

Constructor

The new method returns a collator object.

The constructor takes its parameter list as a hash. The list may include the special key 'locale', whose value (case-insensitive) is a two-letter language code (ISO 639) such as 'en' for English. For example, Unicode::Collate::Locale->new(locale => 'FR') returns a collator tailored for French.

$locale_name may be suffixed with a territory (country) code or a variant code, each separated by '_'. For example, en_US stands for English in the USA and es_ES_traditional for Spanish in Spain (traditional).

If no tailoring is defined for $locale_name, a fallback is selected in the following order:

1. language_territory_variant
2. language_territory
3. language__variant
4. language
5. default
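The lookup order above can be sketched in plain Perl. This is only an illustration of the documented order, not the module's own code: fallback_candidates is a hypothetical helper, and the case normalization it applies (lowercase language and variant, uppercase territory) is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Unicode::Collate::Locale: list the
# names that would be tried for a locale string, in the documented order.
sub fallback_candidates {
    my ($name) = @_;
    my ($lang, $terr, $var) = split /_/, $name, 3;

    # Assumed normalization, for illustration only.
    $lang = lc $lang;
    $terr = defined $terr ? uc $terr : '';
    $var  = defined $var  ? lc $var  : '';

    my @try;
    push @try, "${lang}_${terr}_${var}" if $terr ne '' && $var ne '';
    push @try, "${lang}_${terr}"        if $terr ne '';
    push @try, "${lang}__${var}"        if $var  ne '';
    push @try, $lang, 'default';
    return @try;
}

print join(', ', fallback_candidates('es_ES_traditional')), "\n";
```

For es_ES_traditional the sketch lists es_ES_traditional, es_ES, es__traditional, es and default, in that order. The first name for which a tailoring exists would be the one used.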

Tailoring tags provided by Unicode::Collate are allowed as long as they are not used for 'locale' support. In particular, the table tag can never be tailored, since it is reserved for the DUCET.

For example, the following creates a collator for French that ignores diacritics and case differences (i.e. level 1), with reversed case ordering and no normalization:

$Collator = Unicode::Collate::Locale->new(
    level => 1,
    locale => 'fr',
    upper_before_lower => 1,
    normalization => undef
);

Methods

Unicode::Collate::Locale is a subclass of Unicode::Collate, and all methods other than new are inherited from it.
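As a small illustration of the inherited interface, the sketch below calls the eq and sort methods of Unicode::Collate on a French collator. The word list is made up for the example, and the resulting order may vary with the installed module version, so no output is shown.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Collate::Locale;

binmode STDOUT, ':encoding(UTF-8)';

# Level 1 compares base letters only, ignoring diacritics and case.
my $collator = Unicode::Collate::Locale->new(locale => 'fr', level => 1);

# eq and sort are inherited from Unicode::Collate.
print "equal at level 1\n" if $collator->eq('cote', 'Côté');

# Sample words chosen for illustration only.
my @sorted = $collator->sort(qw(pêche peach Zèbre abricot));
print "$_\n" for @sorted;
```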

Here is a list of additional methods:

$Collator->getlocale

Returns the language code that was accepted and is actually used for collation. If linguistic tailoring is not provided for the language code you passed (intentionally for some languages, or because the implementation is incomplete), this method returns the string 'default', meaning no special tailoring.
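A minimal sketch of how getlocale might be used to check which tailoring was selected. The locale names are taken from the examples in this document. The values printed depend on which tailorings the installed version provides, so they are not listed here.

```perl
use strict;
use warnings;
use Unicode::Collate::Locale;

# Ask for several locale names and report what was actually selected.
for my $name (qw(fr FR es_ES_traditional en_US)) {
    my $collator = Unicode::Collate::Locale->new(locale => $name);
    printf "%-20s => %s\n", $name, $collator->getlocale;
}
```

A result of 'default' indicates that no tailoring was applied and the collator behaves like a plain Unicode::Collate object using the DUCET.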

A list of tailorable locales

  locale name       description
----------------------------------------------------------
  ca                Catalan
  cs                Czech
  eo                Esperanto
  es                Spanish
  es__traditional   Spanish ('ch' and 'll' as a grapheme)
  et                Estonian
  fi                Finnish
  fr                French
  lv                Latvian
  nb                Norwegian Bokmal
  nn                Norwegian Nynorsk
  pl                Polish
  ro                Romanian
  sk                Slovak
  sl                Slovenian
  sv                Swedish

AUTHOR

The Unicode::Collate::Locale module for perl was written by SADAHIRO Tomoyuki, <SADAHIRO@cpan.org>. This module is Copyright(C) 2004-2010, SADAHIRO Tomoyuki. Japan. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Unicode Collation Algorithm - UTS #10

http://www.unicode.org/reports/tr10/

The Default Unicode Collation Element Table (DUCET)

http://www.unicode.org/Public/UCA/latest/allkeys.txt

CLDR - Unicode Common Locale Data Repository

http://cldr.unicode.org/

Unicode::Collate
Unicode::Normalize