NAME

I18N::LangTags - functions for dealing with RFC1766-style language tags

SYNOPSIS

use I18N::LangTags qw(is_language_tag same_language_tag
                      extract_language_tags super_languages
                      similarity_language_tag is_dialect_of);

...or whichever of those functions you want to import. Those are all the exportable functions -- you're free to import only some, or none at all. By default, none are imported.

If you don't import any of these functions, assume a &I18N::LangTags:: in front of all the function names in the following examples.
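
As a minimal sketch of the no-import style, the following loads the module with an empty import list and calls two of the functions by their fully qualified names. The variable names are illustrative only.

```perl
use strict;
use warnings;
use I18N::LangTags ();   # empty import list: nothing is imported

# Call through the package name instead of an imported name.
my $valid  = I18N::LangTags::is_language_tag("fr");
my @supers = I18N::LangTags::super_languages("fr-CA-joual");

print "fr is ", ($valid ? "valid" : "invalid"), "\n";
print "supers: @supers\n";
```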

DESCRIPTION

Language tags are a formalism, described in RFC 1766, for declaring what language form (language and possibly dialect) a given chunk of information is in.

This library provides functions for common tasks involving language tags as they are needed in a variety of protocols and applications.

Please see the "See Also" references for a thorough explanation of how to correctly use language tags.

the function is_language_tag($lang1)

Returns true iff $lang1 is a formally valid language tag.

is_language_tag("fr")            is TRUE
is_language_tag("x-jicarilla")   is FALSE
    (Subtags can be 8 chars long at most -- 'jicarilla' is 9)

is_language_tag("i-Klikitat")    is TRUE
    (True without regard to the fact no one has actually
     registered Klikitat -- it's a formally valid tag)

is_language_tag("fr-patois")     is TRUE
    (Formally valid -- although descriptively weak!)

is_language_tag("Spanish")       is FALSE
is_language_tag("french-patois") is FALSE
    (No good -- first subtag has to match
     /^([xXiI]|[a-zA-Z]{2})$/ -- see RFC1766)
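
One way this might be used in practice is to sort a batch of candidate strings into formally valid and invalid tags. The sketch below reuses the tags from the examples above; the array names are made up for illustration.

```perl
use strict;
use warnings;
use I18N::LangTags qw(is_language_tag);

my @candidates = ("fr", "x-jicarilla", "i-Klikitat", "Spanish", "french-patois");

# Partition on formal validity only; nothing here checks registration.
my @valid   = grep {  is_language_tag($_) } @candidates;
my @invalid = grep { !is_language_tag($_) } @candidates;

print "valid:   @valid\n";
print "invalid: @invalid\n";
```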

the function extract_language_tags($whatever)

Returns a list of whatever looks like formally valid language tags in $whatever. Not very smart, so don't get too creative with what you want to feed it.

extract_language_tags("fr, fr-ca, i-mingo")
  returns:   ('fr', 'fr-ca', 'i-mingo')

extract_language_tags("It's like this: I'm in fr -- French!")
  returns:   ('It', 'in', 'fr')
(So don't just feed it any old thing.)
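
If the input is supposed to be a plain comma-separated list, a stricter approach is to split it yourself and check each item with is_language_tag. The sketch below assumes that input shape; tags_from_list is a hypothetical helper, not part of this module.

```perl
use strict;
use warnings;
use I18N::LangTags qw(extract_language_tags is_language_tag);

# Hypothetical helper: return the items of a comma-separated list,
# or an empty list if any item is not a formally valid tag.
sub tags_from_list {
    my ($text) = @_;
    my @items = grep { length } split /\s*,\s*/, $text;
    return () if grep { !is_language_tag($_) } @items;
    return @items;
}

my @clean = tags_from_list("fr, fr-ca, i-mingo");
my @prose = tags_from_list("It's like this: I'm in fr -- French!");

# For comparison, the lenient extraction of the same prose:
my @loose = extract_language_tags("It's like this: I'm in fr -- French!");

print "clean: @clean\n";
print "prose: ", scalar(@prose), " tags\n";
print "loose: @loose\n";
```

The helper rejects the prose outright, whereas extract_language_tags picks out anything tag-shaped.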

the function same_language_tag($lang1, $lang2)

Returns true iff $lang1 and $lang2 are acceptable variant tags representing the same language-form.

same_language_tag('x-kadara', 'i-kadara')  is TRUE
   (The x/i- alternation doesn't matter)
same_language_tag('X-KADARA', 'i-kadara')  is TRUE
   (...and neither does case)
same_language_tag('en',       'en-US')     is FALSE
   (all-English is not the SAME as US English)
same_language_tag('x-kadara', 'x-kadar')   is FALSE
   (these are totally unrelated tags)
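
As an illustration, same_language_tag can be used to drop variant spellings from a list while keeping the first form seen. This is only a sketch; the quadratic loop is fine for short lists but is an assumption about the use case.

```perl
use strict;
use warnings;
use I18N::LangTags qw(same_language_tag);

my @tags = ('x-kadara', 'I-KADARA', 'en', 'en-US', 'EN');

# Keep a tag only if no equivalent tag has been kept already.
my @unique;
for my $tag (@tags) {
    push @unique, $tag
        unless grep { same_language_tag($tag, $_) } @unique;
}

print "unique: @unique\n";
```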

the function similarity_language_tag($lang1, $lang2)

Returns an integer representing the degree of similarity between tags $lang1 and $lang2 (the order of which does not matter), where similarity is the number of common elements on the left, without regard to case and to x/i- alternation.

similarity_language_tag('fr', 'fr-ca')           is 1
   (one element in common)
similarity_language_tag('fr-ca', 'fr-FR')        is 1
   (one element in common)

similarity_language_tag('fr-CA-joual',
                        'fr-CA-PEI')             is 2
similarity_language_tag('fr-CA-joual', 'fr-CA')  is 2
   (two elements in common)

similarity_language_tag('x-kadara', 'i-kadara')  is 1
   (x/i- doesn't matter)

similarity_language_tag('en',       'x-kadar')   is 0
similarity_language_tag('x-kadara', 'x-kadar')   is 0
   (unrelated tags -- no similarity)

similarity_language_tag('i-cree-syllabic',
                        'i-cherokee-syllabic')   is 0
   (no LEFTMOST elements in common!)
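
A possible use of the similarity score is picking, from a list of available tags, the one closest to what was asked for. The sketch below assumes that "closest" means "highest score, first one wins on ties"; best_match is a hypothetical helper, not part of this module.

```perl
use strict;
use warnings;
use I18N::LangTags qw(similarity_language_tag);

# Hypothetical helper: return the available tag most similar to
# $wanted, or undef if nothing shares even the leftmost element.
sub best_match {
    my ($wanted, @available) = @_;
    my ($best, $best_score) = (undef, 0);
    for my $tag (@available) {
        my $score = similarity_language_tag($wanted, $tag);
        next unless defined $score;   # defensive guard (an assumption)
        ($best, $best_score) = ($tag, $score) if $score > $best_score;
    }
    return $best;
}

my $pick = best_match('fr-CA-joual', 'en', 'fr', 'fr-CA', 'fr-FR');
my $none = best_match('i-cree-syllabic', 'i-cherokee-syllabic');

print "pick: $pick\n";
print "none: ", (defined $none ? $none : "(no match)"), "\n";
```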

the function is_dialect_of($lang1, $lang2)

Returns true iff language tag $lang1 represents a subdialect of language tag $lang2.

Get the order right! It doesn't work the other way around!

is_dialect_of('en-US', 'en')            is TRUE
  (American English IS a dialect of all-English)

is_dialect_of('fr-CA-joual', 'fr-CA')   is TRUE
is_dialect_of('fr-CA-joual', 'fr')      is TRUE
  (Joual is a dialect of (a dialect of) French)

is_dialect_of('en', 'en-US')            is FALSE
  (all-English is NOT a dialect of American English)

is_dialect_of('fr', 'en-CA')            is FALSE

is_dialect_of('en', 'en'   )            is TRUE
  (Note: a degenerate case)

is_dialect_of('i-mingo-tom', 'x-Mingo') is TRUE
  (the x/i thing doesn't matter, nor does case)
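
For example, is_dialect_of can filter a list of available tags down to those that fall under a given language form. The list contents below are invented for illustration; note that the candidate tag goes first and the broader tag second.

```perl
use strict;
use warnings;
use I18N::LangTags qw(is_dialect_of);

my @available = ('en', 'en-US', 'en-GB', 'fr-CA', 'fr-CA-joual');

# Candidate first, broader tag second -- the order matters.
my @english         = grep { is_dialect_of($_, 'en')    } @available;
my @canadian_french = grep { is_dialect_of($_, 'fr-CA') } @available;

print "under en:    @english\n";
print "under fr-CA: @canadian_french\n";
```

Because of the degenerate case, 'en' itself appears in the first result and 'fr-CA' in the second.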

the function super_languages($lang1)

Returns a list of language tags that are superordinate tags to $lang1 -- it gets this by removing subtags from the end of $lang1 until nothing (or just "i" or "x") is left.
```
super_languages("fr-CA-joual")  is  ("fr-CA", "fr")

super_languages("en-AU")  is  ("en")

super_languages("en")  is  empty-list, ()

super_languages("i-cherokee")  is  empty-list, ()
 ...not ("i"), which would be illegal as well as pointless.
```
Returns empty-list if $lang1 is not a valid language tag.

A notable and rather unavoidable problem with this method: "x-mingo-tom" has an "x" because the whole tag isn't an IANA-registered tag -- but super_languages('x-mingo-tom') is ('x-mingo') -- which isn't really right, since 'i-mingo' is registered. But this module has no way of knowing that. (But note that same_language_tag('x-mingo', 'i-mingo') is TRUE.)

More importantly, you assume at your peril that superordinates of $lang1 are mutually intelligible with $lang1. Think REAL hard about how you use this. YOU HAVE BEEN WARNED.
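
With that warning in mind, one common pattern is a lookup order that tries the tag itself and then each superordinate tag in turn. The sketch below is only that; %greeting and lookup_order are hypothetical, and whether falling back is acceptable at all is for your application to decide.

```perl
use strict;
use warnings;
use I18N::LangTags qw(super_languages);

# Hypothetical data keyed by language tag.
my %greeting = (fr => 'Bonjour', en => 'Hello');

# Hypothetical helper: the tag itself, then its superordinates,
# most specific first.
sub lookup_order {
    my ($tag) = @_;
    return ($tag, super_languages($tag));
}

my @order = lookup_order('fr-CA-joual');
my ($hit) = grep { exists $greeting{$_} } @order;

print "order: @order\n";
print "hit:   ", (defined $hit ? "$hit => $greeting{$hit}" : "(none)"), "\n";
```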

NOTE

This library may (probably will) need amending if/when RFC1766 is superseded.

COPYRIGHT

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR

Sean M. Burke <sburke@netadventure.net>
