NAME

Data::Tubes::Util

DESCRIPTION

Helper functions for automatic management of argument lists and other common tasks.

FUNCTIONS

args_array_with_options
my ($aref, $args) = args_array_with_options(@list, \%defaults); # OR
my ($aref, $args) = args_array_with_options(@list, \%args, \%defaults);

helper function to ease parsing of input parameters. This is mostly useful when your function usually takes a list as input, but you want to be able to provide an optional hash of arguments.

The function returns an array reference with the list of parameters, and a hash reference of arguments for less common things.

When calling this function, you are always supposed to pass a hash reference of options as the last argument; it provides the defaults. If the element immediately before it is itself a hash reference, it is taken as the overriding arguments. Their combination (a simple override at the top level of the hash) is then returned as $args.

The typical way to invoke this function is like this:

sub foo {
   my ($list, $args) = args_array_with_options(@_, {bar => 'baz'});
   ...
}

so that the function foo can be called with an optional trailing hash reference containing the arguments, like this:

foo(qw< this and that >, {bar => 'galook!'});

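In case your list might actually contain hash references, you will have to take this into consideration: a hash reference in the last position of the list would be taken as the overriding arguments, unless an explicit (possibly empty) hash reference of arguments follows it.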
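The behaviour described above can be sketched in plain Perl. The function args_array_with_options_sketch below is a hypothetical stand-in written for this illustration, not the module's own code; it only assumes what the text states (defaults last, optional overrides just before them, shallow merge).

```perl
use strict;
use warnings;

# Hypothetical stand-in, written only to illustrate the behaviour described
# above; it is NOT the module's own implementation.
sub args_array_with_options_sketch {
   my @list     = @_;
   my $defaults = pop @list;    # the last item is always the defaults

   # an optional hash reference right before the defaults overrides them
   my $overrides = (@list && ref($list[-1]) eq 'HASH') ? pop @list : {};

   # shallow merge: only top-level keys are overridden
   return (\@list, {%$defaults, %$overrides});
}

sub foo {
   my ($list, $args) = args_array_with_options_sketch(@_, {bar => 'baz'});
   return join(' ', @$list) . " [bar=$args->{bar}]";
}

print foo(qw< this and that >), "\n";                        # this and that [bar=baz]
print foo(qw< this and that >, {bar => 'galook!'}), "\n";    # this and that [bar=galook!]
```

Because the merge is shallow, a nested structure in the overrides replaces the corresponding default as a whole.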

assert_all_different
my $bool = assert_all_different(@strings);

checks that all strings in @strings are different. Returns 1 if the check is successful, throws an exception otherwise. The exception is a hash reference with a key message set to the first string that is found repeated.
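A minimal sketch of this contract follows, mainly to show how the exception can be caught and inspected. The name assert_all_different_sketch is hypothetical and the body is an illustration of the description above, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in illustrating the documented contract.
sub assert_all_different_sketch {
   my %seen;
   for my $string (@_) {

      # throw a hash reference whose "message" is the repeated string
      die +{message => $string} if $seen{$string}++;
   }
   return 1;
}

my $ok = assert_all_different_sketch(qw< foo bar baz >);    # 1

my $error;
eval { assert_all_different_sketch(qw< foo bar foo >); 1 } or $error = $@;
print "repeated: $error->{message}\n";                      # repeated: foo
```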

load_module
my $module = load_module($locator); # OR
my $module = load_module($locator, $prefix);

loads a module automatically. There are a lot of modules on CPAN that do this, probably much better, but this should do for this module's needs.

The $locator is resolved into a full module name through "resolve_module"; the resulting module is then loaded with require and its full name is returned.

Example:

my $module = load_module('Reader');

loads module Data::Tubes::Plugin::Reader and returns the string Data::Tubes::Plugin::Reader, while:

my $other_module = load_module('Foo::Bar');

loads module Foo::Bar and returns string Foo::Bar.

You can optionally pass a $prefix that will be passed to "resolve_module", see there for further information.

load_sub
my $sub = load_sub($locator); # OR
my $sub = load_sub($locator, $prefix);

loads a sub automatically. There are a lot of modules on CPAN that do this, probably much better, but this should do for this module's needs.

The $locator is split into a pair of module and subroutine name. The module is loaded through "load_module"; the subroutine reference is then returned from that module.

Example:

my $sub = load_sub('Reader::by_line');

loads subroutine Data::Tubes::Plugin::Reader::by_line and returns a reference to it, while:

my $other_sub = load_sub('Foo::Bar::baz');

returns a reference to subroutine Foo::Bar::baz after loading module Foo::Bar.

You can optionally pass a $prefix that will be passed to "resolve_module", see there for further information.
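The split-then-load sequence can be sketched as follows. This is a simplified, hypothetical stand-in (load_sub_sketch): it skips the prefix resolution done by "resolve_module" and therefore only accepts fully qualified locators; the core module List::Util is used purely as a convenient target.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in: no prefix handling, the locator
# must be a fully qualified subroutine name.
sub load_sub_sketch {
   my ($locator) = @_;

   # everything before the last "::" is the module, the rest is the sub
   my ($module, $name) = $locator =~ m{\A (.*) :: (\w+) \z}mxs
     or die "cannot split '$locator'\n";

   # require needs a path when the module name is held in a variable
   (my $path = "$module.pm") =~ s{::}{/}g;
   require $path;

   return $module->can($name) || die "no sub '$name' in $module\n";
}

my $sum = load_sub_sketch('List::Util::sum');
print $sum->(1, 2, 3), "\n";    # 6
```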

metadata
my $href = metadata($input, %args); # OR
my $href = metadata($input, \%args);

parse input string $input according to the rules described below, which can be controlled through %args.

The string is split using two separators, a chunks separator and a key/value separator. The first one isolates what should be key/value pairs; the second separates the key from the value in each of these chunks. Whenever a chunk is not actually a key/value pair, it is considered a value and associated with a default key.

The following items can be set in %args:

chunks_separator

the separator between chunks; it MUST be a single character;

default_key

a string used as the key when a chunk cannot be split into a pair;

key_value_separator

the separator between the key and the value in a chunk; it MUST be a single character.
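The splitting rule can be sketched in plain Perl as follows. The function metadata_sketch is hypothetical; its defaults (a space, an equals sign and the empty string) are inferred from the documented examples, and it ignores any refinement the real function may implement, such as escaping.

```perl
use strict;
use warnings;

# Hypothetical stand-in; defaults are inferred from the documented examples.
sub metadata_sketch {
   my ($input, %args) = @_;
   my $chunks_separator    = $args{chunks_separator}    // ' ';
   my $key_value_separator = $args{key_value_separator} // '=';
   my $default_key         = $args{default_key}         // '';

   my %retval;
   for my $chunk (split /\Q$chunks_separator\E/, $input) {
      next unless length $chunk;

      # split on the first key/value separator only
      my ($key, $value) = split /\Q$key_value_separator\E/, $chunk, 2;

      # no separator found: the chunk is a value for the default key
      ($key, $value) = ($default_key, $key) unless defined $value;
      $retval{$key} = $value;
   }
   return \%retval;
}

my $sketch = metadata_sketch('foo=bar baz=galook booom!', default_key => 'name');
print join(', ', map { "$_ => $sketch->{$_}" } sort keys %$sketch), "\n";
# baz => galook, foo => bar, name => booom!
```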

Examples:

# use defaults
my $input = 'foo=bar baz=galook booom!';
my $href = metadata($input);
# $href = {
#    foo => 'bar',
#    baz => 'galook',
#    ''  => 'booom!'
# }

# set a default key
my $input = 'foo=bar baz=galook booom!';
my $href = metadata($input, default_key => 'name');
# $href = {
#    foo  => 'bar',
#    baz  => 'galook',
#    name => 'booom!'
# }

# use alternative separators
my $input = 'foo:bar & bar|baz:galook booom!|whatever';
my $href = metadata($input,
   default_key => 'name',
   chunks_separator => '|',
   key_value_separator => ':'
);
# $href = {
#    foo  => 'bar & bar',
#    baz  => 'galook booom!',
#    name => 'whatever'
# }

normalize_args
my $args = normalize_args( %args, \%defaults); # OR
my $args = normalize_args(\%args, \%defaults);

helper function to handle input parameters, with some defaults. Allows accepting either a series of key/value pairs or a hash reference with these pairs, while at the same time providing default values.
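The two accepted calling styles can be illustrated with a small sketch. The function normalize_args_sketch is a hypothetical stand-in that assumes a shallow merge of the input over the defaults; it is not the module's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in: the last argument holds the defaults, whatever
# comes before is either one hash reference or a flat key/value list.
sub normalize_args_sketch {
   my @input    = @_;
   my $defaults = pop @input;
   my %args =
     (@input == 1 && ref($input[0]) eq 'HASH') ? %{$input[0]} : @input;
   return {%$defaults, %args};
}

my $from_pairs = normalize_args_sketch(bar => 'galook!', {bar => 'baz', n => 1});
my $from_href  = normalize_args_sketch({bar => 'galook!'}, {bar => 'baz', n => 1});
print "$from_pairs->{bar} $from_href->{bar} $from_href->{n}\n";    # galook! galook! 1
```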

A typical usage is as follows:

sub foo {
   my $args = normalize_args(@_, {bar => 'baz'});
   ...
}

normalize_filename
my $name_or_handle = normalize_filename($filename, $default_handle);

helper function to normalize a file name according to some rules. In particular, depending on $filename:

  • if it is a filehandle, it is returned directly;

  • if it is the string -, the $default_handle is returned. This allows you to use STDIN or STDOUT as input/output handles in case the filename is - (as many applications do);

  • if it starts with the string file:, this prefix is stripped away and the rest is used as a filename. This allows you to actually use - as a real file name, avoiding the automatic handle management described in the bullet above. If your filename may start with the string file:, then you should always add this prefix, e.g.:

    file:whatever   -- should be passed as -->  file:file:whatever

  • if it starts with the string handle:, this prefix is stripped and the rest is used to get one of the standard filehandles. The allowed remaining parts are (case-insensitive):

    in
    stdin
    out
    stdout
    err
    stderr

    Any other remaining part causes an exception to be thrown.

    Again, if you actually need to create a file whose name is e.g. handle:whatever, you have to add the file: prefix:

    handle:whatever   -- should be passed as -->  file:handle:whatever

  • otherwise, the provided $filename is returned as-is.
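The rules above can be sketched as a chain of checks. The function normalize_filename_sketch is a hypothetical stand-in; in particular, recognizing a filehandle by testing for a GLOB reference is an assumption made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in following the bullets above, in the same order.
sub normalize_filename_sketch {
   my ($filename, $default_handle) = @_;

   # assumption: a filehandle is passed as a glob reference
   return $filename       if ref($filename) eq 'GLOB';
   return $default_handle if $filename eq '-';
   return $filename       if $filename =~ s{\Afile:}{}mxs;

   if (my ($name) = $filename =~ m{\Ahandle:(.*)\z}mxs) {
      my %handle_for = (
         in  => \*STDIN,  stdin  => \*STDIN,
         out => \*STDOUT, stdout => \*STDOUT,
         err => \*STDERR, stderr => \*STDERR,
      );
      return $handle_for{lc $name} // die "invalid handle name '$name'\n";
   }

   return $filename;
}

print normalize_filename_sketch('file:-'), "\n";                # -
print normalize_filename_sketch('file:handle:whatever'), "\n";  # handle:whatever
print normalize_filename_sketch('plain.txt'), "\n";             # plain.txt
```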

resolve_module
my $full_module_name = resolve_module($module_name); # OR
my $full_module_name = resolve_module($module_name, $prefix);

possibly expand a module's name according to a prefix. These are the rules:

  • if $module_name starts with an exclamation point !, this initial character will be stripped away and the rest will be used as the package name. $prefix will be ignored in this case;

  • otherwise, if $module_name starts with a plus sign +, this first character will be stripped away and the $prefix will be used (defaulting to Data::Tubes::Plugin);

  • otherwise, if $module_name does not contain sub-packages (i.e. the sequence ::), then the $prefix will be used as in the previous bullet;

  • otherwise, the provided name is used.
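These rules can be sketched in a few lines of Perl. The function resolve_module_sketch is a hypothetical stand-in that follows the bullets literally; it is an illustration, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in implementing the four rules above, in order.
sub resolve_module_sketch {
   my ($module, $prefix) = @_;

   # leading "!": strip it and ignore any prefix
   return $module if $module =~ s{\A!}{};

   $prefix //= 'Data::Tubes::Plugin';

   # leading "+" (stripped) or no "::" at all: prepend the prefix
   return $prefix . '::' . $module
     if $module =~ s{\A\+}{} || $module !~ m{::};

   # otherwise the name is used unchanged
   return $module;
}

print resolve_module_sketch('+Some::Pack'), "\n";          # Data::Tubes::Plugin::Some::Pack
print resolve_module_sketch('Pack', 'Some::Thing'), "\n";  # Some::Thing::Pack
```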

Examples (in the same order as the bullets above, followed by two calls with an explicit $prefix):

resolve_module('!SimplePack'); # SimplePack
resolve_module('+Some::Pack'); # Data::Tubes::Plugin::Some::Pack
resolve_module('SimplePack');  # Data::Tubes::Plugin::SimplePack
resolve_module('Some::Pack');  # Some::Pack
resolve_module('Pack', 'Some::Thing'); # Some::Thing::Pack
resolve_module('Some::Pack', 'Some::Thing'); # Some::Pack

shorter_sub_names
shorter_sub_names($package_name);

this helper is used in plugins to generate alternative versions of the implemented functions, with shorter names.

The basic rationale is that functions are usually named after the area they cover, e.g. the function in Data::Tubes::Plugin::Reader that reads a filehandle line-by-line is called read_by_line. In this way, when you use e.g. summon from Data::Tubes, you end up with a function read_by_line that is much clearer than simply by_line.

On the other hand, when you rely upon automatic running of factory functions like in tube or pipeline (again, in Data::Tubes), some parts are redundant. In the example, you would end up using Reader::read_by_line, where read_ is actually redundant as you already have the last part of the plugin package name to tell you what this by_line thing is about.

shorter_sub_names comes to the rescue to generate alternative names by analysing the current namespace for a package and generating new functions by removing a prefix. In the Data::Tubes::Plugin::Reader case, for example, it is called like this at the end of the module:

shorter_sub_names(__PACKAGE__);

and it generates, among others, by_line and by_paragraph.

Consider using this if you generate new plugins.
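The aliasing technique can be sketched with plain symbol table manipulation. Everything in this example is hypothetical: the package My::Plugin::Reader, the function shorter_sub_names_sketch and, notably, its explicit second argument for the prefix, which is an assumption of this sketch and not part of the documented signature.

```perl
use strict;
use warnings;

package My::Plugin::Reader;    # hypothetical plugin package
sub read_by_line      { return 'line' }
sub read_by_paragraph { return 'paragraph' }

package main;

# Hypothetical stand-in: alias every sub whose name starts with $prefix
# to a shorter name without that prefix.
sub shorter_sub_names_sketch {
   my ($package, $prefix) = @_;
   no strict 'refs';
   for my $name (keys %{$package . '::'}) {
      next unless defined &{$package . '::' . $name};
      (my $short = $name) =~ s{\A\Q$prefix\E}{} or next;
      next unless length $short;
      next if defined &{$package . '::' . $short};    # never clobber a sub
      *{$package . '::' . $short} = \&{$package . '::' . $name};
   }
   return;
}

shorter_sub_names_sketch('My::Plugin::Reader', 'read_');
print My::Plugin::Reader::by_line(), "\n";    # line
```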

sprintffy
my $string = sprintffy($template, \@substitutions);

expand a $template string in the style of sprintf, based on the substitutions in the array reference passed as second argument.

The template targets are sprintf-like, i.e. sequences that start with a percent sign followed by... something.

Each substitution is supposed to be an array reference with two items inside: a regular expression and a value specifier. The regular expression is used to match what comes after the percent sign, while the value part can be either a straight value, or a subroutine reference that will be run to get the real value for the substitution.

There is always an implicit, high priority substitution that matches a single percent sign and expands to a percent sign, so that the string %% will be unescaped to % as you would expect in something that is sprintf-like.
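One possible reading of this description is sketched below. The function sprintffy_sketch is a hypothetical re-implementation for illustration; details such as anchoring each regular expression right after the percent sign, trying substitutions in order and dying on an unmatched target are assumptions of the sketch.

```perl
use strict;
use warnings;

# Hypothetical re-implementation, for illustration only.
sub sprintffy_sketch {
   my ($template, $substitutions) = @_;

   # the implicit, high-priority rule turns "%%" into "%"
   my @rules  = ([qr{%} => '%'], @$substitutions);
   my $rest   = $template;
   my $output = '';

 CHUNK:
   while (length $rest) {

      # copy plain text up to the next percent sign
      if ($rest =~ s{\A([^%]+)}{}) {
         $output .= $1;
         next CHUNK;
      }

      $rest =~ s{\A%}{};    # drop the percent sign that starts a target
      for my $rule (@rules) {
         my ($regex, $value) = @$rule;
         next unless $rest =~ s{\A$regex}{};

         # a code reference is run to get the actual value
         $output .= ref($value) eq 'CODE' ? $value->() : $value;
         next CHUNK;
      }
      die "no substitution matches '%$rest'\n";
   }
   return $output;
}

my $string = sprintffy_sketch(
   '100%% of %n files in %d',
   [[qr{n} => 3], [qr{d} => sub { '/tmp' }]],
);
print "$string\n";    # 100% of 3 files in /tmp
```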

test_all_equal
my $bool = test_all_equal(@list);

test whether all elements in @list are equal to one another or not, and return test output as a boolean value (i.e. something that Perl considers true or false).
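A minimal sketch of such a test follows. The function test_all_equal_sketch is hypothetical, and comparing elements as strings (with ne) is an assumption, since the text above does not say which kind of equality is used.

```perl
use strict;
use warnings;

# Hypothetical stand-in; string comparison is an assumption.
sub test_all_equal_sketch {
   my ($first, @rest) = @_;
   for my $item (@rest) {
      return 0 if $item ne $first;
   }
   return 1;
}

print test_all_equal_sketch(qw< a a a >) ? "equal\n" : "different\n";    # equal
print test_all_equal_sketch(qw< a b a >) ? "equal\n" : "different\n";    # different
```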

traverse
my $item = traverse($data, @keys);

Assuming that $data is an array or hash reference, traverse it using items in @keys at each step in the descent.
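The descent can be sketched as a simple loop. The function traverse_sketch is a hypothetical stand-in; returning nothing when a step cannot be taken is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-in: follow @keys one step at a time.
sub traverse_sketch {
   my ($data, @keys) = @_;
   for my $key (@keys) {
      if    (ref($data) eq 'HASH')  { $data = $data->{$key} }
      elsif (ref($data) eq 'ARRAY') { $data = $data->[$key] }
      else                          { return }    # cannot descend further
   }
   return $data;
}

my $data = {servers => [{name => 'alpha'}, {name => 'beta'}]};
print traverse_sketch($data, qw< servers 1 name >), "\n";    # beta
```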

unzip
my ($even, $odds) = unzip(@list); # OR
my ($even, $odds) = unzip(\@list);

separates even and odd items in the input @list and returns them as two references to arrays.
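A typical use is separating keys from values in a flat list of pairs. The sketch below uses a hypothetical stand-in, unzip_sketch, and assumes that "even" and "odd" refer to zero-based positions in the list.

```perl
use strict;
use warnings;

# Hypothetical stand-in; positions are counted from zero (an assumption).
sub unzip_sketch {
   my @list = (@_ == 1 && ref($_[0]) eq 'ARRAY') ? @{$_[0]} : @_;
   my (@even, @odd);
   for my $i (0 .. $#list) {
      if   ($i % 2) { push @odd,  $list[$i] }
      else          { push @even, $list[$i] }
   }
   return (\@even, \@odd);
}

my ($keys, $values) = unzip_sketch(foo => 'bar', baz => 'galook');
print "@$keys | @$values\n";    # foo baz | bar galook
```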

SEE ALSO

Data::Tubes is a good entry point to all of this.

AUTHOR

Flavio Poletti <polettix@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2016 by Flavio Poletti <polettix@cpan.org>

This module is free software. You can redistribute it and/or modify it under the terms of the Artistic License 2.0.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.