NAME

Locale::Maketext::Utils::Phrase - Consolidated Phrase Introspection

VERSION

This document describes Locale::Maketext::Utils::Phrase version 0.1

SYNOPSIS

use Locale::Maketext::Utils::Phrase ();

my $struct = Locale::Maketext::Utils::Phrase::phrase2struct(
    "So long, and thanks for [output,strong,all] the fish."
);

for my $piece (@{$struct}) {
    if (!ref($piece)) {
        # this $piece is a non-bracket notation chunk
    }
    else {
        # this $piece is a hashref describing the bracket notation chunk
    }
}

DESCRIPTION

This module simplifies an already complex task by doing all of the parsing and basic categorization of bracket notation (BN), or the lack of it, for you.

That way you do not have to worry about parsing or matching the syntax/escaping/delimiters/etc correctly and then maintaining it in each place it is used.

INTERFACE

Object

Eventually the base functions below will be wrapped in an object that allows even more complete and fine-tuned introspection.

For now these functions allow us to do most of what we need with little trouble.

Functions

Terms:

Phrase

A string intended to be passed to maketext() that may or may not contain bracket notation.

Struct

An array ref that represents a parsed phrase.

Each item in that array is either a string (a chunk that is not bracket notation) or a hashref (a chunk that is bracket notation).

The hashref has the following keys:

orig

The value is the original bracket notation string in its entirety. e.g. '[output,strong,NOT]'

cont

The value is the content of the inside of original bracket notation string. e.g. 'output,strong,NOT'

list

The value is the original bracket notation in list form. e.g. an array reference containing 'output', 'strong', 'NOT'.

type

This is a string defining what general type of bracket notation we’re dealing with:

'var'

The content is a variable reference (i.e. not translatable).

e.g. [_1]

'meth'

The content is a method that shouldn’t have any translatable part.

e.g. [numf,_1]

'basic'

The content is a method that can have translatable parts and follows a basic pattern: the first argument or two after the method name can be a string, and the rest can be an arbitrary name/value attribute list.

e.g. [output,strong,foo]

'basic_var'

The content is 'basic' except every possible translatable part is a variable reference (i.e. not translatable).

e.g. [output,strong,_1]

'complex'

The content is more complicated than 'basic'.

'_unknown'

The content type could not be determined. This is not necessarily an error: it could be a method specific to your object, or it could be something this module misses (if so, please report it via RT).

'_invalid'

The content type is invalid.

This could be something Locale::Maketext would see as a syntax error (e.g. [" ,foo"]) or something it might allow through, on purpose or by happenstance (e.g. []), but that is ambiguous for no gain.
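As a rough illustration of the struct layout described above, the following sketch walks a hand-written struct for the SYNOPSIS phrase. The struct literal is an assumption assembled from this description (including the 'basic' type), not verified output from phrase2struct(), and @summary is a name used only for this example.

```perl
use strict;
use warnings;

# Hand-written struct that follows the documented layout (assumed, not parser output).
my $struct = [
    'So long, and thanks for ',
    {
        orig => '[output,strong,all]',
        cont => 'output,strong,all',
        list => [ 'output', 'strong', 'all' ],
        type => 'basic',
    },
    ' the fish.',
];

# Describe each piece: plain strings are text, hashrefs are bracket notation.
my @summary;
for my $piece ( @{$struct} ) {
    if ( !ref $piece ) {
        push @summary, "text: $piece";
    }
    else {
        push @summary, "$piece->{'type'}: $piece->{'orig'}";
    }
}

print "$_\n" for @summary;
```

Dispatching on ref() first and on the 'type' key second keeps the plain-text case cheap and leaves room to handle each bracket notation type separately.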

These all take a phrase as their only argument.

phrase2struct()

Returns the struct for the given phrase.

If there is a problem it will croak with either "Unbalanced bracket: “…”" or "Consecutive tildes are ambiguous: “…”".
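Since phrase2struct() croaks on problems, callers that handle untrusted phrases may want to trap the exception. A minimal sketch follows; try_phrase2struct() is a hypothetical helper (not part of this module), and the sample phrase is only assumed to trigger the "Unbalanced bracket" error.

```perl
use strict;
use warnings;

# Hypothetical helper: returns ($struct, undef) on success or (undef, $error) on failure.
sub try_phrase2struct {
    my ($phrase) = @_;

    my $struct = eval {
        # Loaded inside the eval so a missing module is reported as an error too.
        require Locale::Maketext::Utils::Phrase;
        Locale::Maketext::Utils::Phrase::phrase2struct($phrase);
    };

    return ( undef, $@ || 'unknown error' ) if !defined $struct;
    return ( $struct, undef );
}

# The missing "]" is assumed to be reported as an unbalanced bracket.
my ( $struct, $error ) = try_phrase2struct('Hello [output,strong,World');
if ( defined $error ) {
    warn "Could not parse phrase: $error";
}
```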

phrase_has_bracket_notation()

Returns a boolean.

True: the given phrase has bracket notation.

False: the given phrase does not have any bracket notation.

phrase_is_entirely_bracket_notation()

Returns a boolean.

True: the given phrase is entirely bracket notation.

False: the given phrase is not entirely bracket notation.

Consecutive tildes are ambiguous

In order to keep the parsing as simple/fast as possible we avoid trying to properly interpret multiple consecutive tildes.

In the rare case that you really need a literal ~ before a comma, ~, [, or ] (or anywhere else in the string), use the explicit placeholder string “_TILDE_”.

$lh->maketext('A tilde is this: _TILDE_, you like?');

$lh->maketext('A tilde [output,strong,is this: _TILDE_, you like]?');
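If you want to catch such phrases before handing them to the parser, a simple pre-check is possible. This is only a sketch: has_consecutive_tildes() is a hypothetical helper, and it assumes that "consecutive" means two or more adjacent ~ characters.

```perl
use strict;
use warnings;

# Hypothetical pre-check, not part of the module's interface.
# Assumes "consecutive tildes" means at least two adjacent "~" characters.
sub has_consecutive_tildes {
    my ($phrase) = @_;
    return $phrase =~ m/~~/ ? 1 : 0;
}

for my $phrase ( 'A tilde is this: _TILDE_, you like?', 'Ambiguous: ~~[' ) {
    if ( has_consecutive_tildes($phrase) ) {
        print "Needs the _TILDE_ placeholder: $phrase\n";
    }
}
```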

These all take a struct as their only argument.

struct2phrase()

Returns the given struct as a stringified phrase.
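Conceptually, turning a struct back into a phrase amounts to concatenating the plain strings with each hashref's 'orig' value. The sketch below is written only from the documented struct layout; my_struct2phrase() is a hypothetical stand-in and is not claimed to be how struct2phrase() is implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for struct2phrase(), based only on the documented struct layout.
sub my_struct2phrase {
    my ($struct) = @_;

    # Strings are kept as-is; bracket notation pieces contribute their original text.
    return join '', map { ref($_) ? $_->{'orig'} : $_ } @{$struct};
}

my $phrase = my_struct2phrase(
    [
        'Hello ',
        { orig => '[_1]', cont => '_1', list => ['_1'], type => 'var' },
        '!',
    ]
);

print "$phrase\n";
```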

struct_has_bracket_notation()

Returns a boolean.

True: the given struct has bracket notation.

False: the given struct does not have any bracket notation.

struct_is_entirely_bracket_notation()

Returns a boolean.

True: the given struct is entirely bracket notation.

False: the given struct is not entirely bracket notation.

Misc

get_bn_var_regexp()

Takes no arguments, returns a regular expression that matches bracket notation variable syntax.

my $bn_var_regexp = Locale::Maketext::Utils::Phrase::get_bn_var_regexp();
if ($string =~ m/\A$bn_var_regexp\z/) {
    # string is a BN variable
}
elsif ($string =~ m/$bn_var_regexp/) {
    # string contains a BN variable
}

my @bn_variables = $string =~ m/($bn_var_regexp)/g;

get_non_translatable_type_regexp()

Takes no arguments, returns a regular expression that matches types that should not have any translatable parts.

my $non_translatable_type_regexp = Locale::Maketext::Utils::Phrase::get_non_translatable_type_regexp();
if ($piece->{'type'} =~ m/\A$non_translatable_type_regexp\z/) {
    # nothing to translate here, move along, move along
}

if ($xliff->{'ctype'} =~ m/\Ax-bn-$non_translatable_type_regexp\z/) {
    # handle the XLIFF syntax for non-translatable <ph> tags back into bracket notation
}

string_has_opening_or_closing_bracket()

Takes one argument, a string. Returns true if it contains an opening or closing bracket.

if ( !Locale::Maketext::Utils::Phrase::string_has_opening_or_closing_bracket($string) ) {
    # $string does not have any bracket notation.
}

Private functions

These are essentially meant to be used internally but if you find a use for them be sure to verify the values you pass to them or you will get odd results.

_split_bn_cont()

Takes the 'cont' of the bracket notation piece hashref and, optionally, the maximum number of items to split it into, and returns the resulting array.

Used internally to build the hash’s 'list' value.

_get_attr_hash_from_list()

Takes the 'list' of the bracket notation piece hashref and the index of where the arbitrary attributes begin and returns a hash. Accounts for non-key/value variable array refs.

_get_bn_type_from_list()

Takes the 'list' of the bracket notation piece hashref and returns the type.

Used internally to build the hash’s 'type' value.

DIAGNOSTICS

Nothing besides what is documented in phrase2struct().

CONFIGURATION AND ENVIRONMENT

Locale::Maketext::Utils::Phrase requires no configuration files or environment variables.

DEPENDENCIES

Locale::Maketext::Utils

INCOMPATIBILITIES

None reported.

BUGS AND LIMITATIONS

No bugs have been reported.

Please report any bugs or feature requests to bug-locale-maketext-utils-mock@rt.cpan.org, or through the web interface at http://rt.cpan.org.

TODO

Add in the object layer to really make the introspection complete.

AUTHOR

Daniel Muey <http://drmuey.com/cpan_contact.pl>

LICENCE AND COPYRIGHT

Copyright (c) 2012, Daniel Muey <http://drmuey.com/cpan_contact.pl>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.