NAME
Util - Frequently Hacked Functions
DESCRIPTION
A cookbook of tasty functions for the terminally lazy, impatient and hubristic
AUTHOR
chocolateboy: chocolate.boy@email.com
SEE ALSO
Scalar::Util, Clone
BUGS
clone() currently segfaults if it encounters a Regexp (qr//) object
PUBLIC METHODS
any
usage:
any ($arrayref)
# or
any ($hashref)
description:
returns a randomly chosen member of the referenced array, or a random key
from the referenced hashtable
isnum
usage:
isnum ($val)
description:
returns a nonzero value (indicating the numeric type) if $val is a number
The numeric type is a bitwise OR of the following flags:
0x01 IS_NUMBER_IN_UV (number within UV range - maybe not int)
0x02 IS_NUMBER_GREATER_THAN_UV_MAX (the pointed-to UV is undefined)
0x04 IS_NUMBER_NOT_INT (saw . or E notation)
0x08 IS_NUMBER_NEG (leading minus sign)
0x10 IS_NUMBER_INFINITY (this is big)
0x20 IS_NUMBER_NAN (this is not)
Rather than obliging the user to twiddle with bits, the following flavours of isnum
(corresponding to the flags above) are also available:
isuv
isbig
isfloat
isneg
isinf
isnan
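For anyone who does want to twiddle the bits, here is a minimal sketch of decoding a numeric type into flag names. The flag values are copied from the table above; describe_numtype() is a hypothetical helper, not part of Util.

```perl
use strict;
use warnings;

# Flag values and names copied from the table above.
my @FLAGS = (
    [ 0x01, 'IS_NUMBER_IN_UV' ],
    [ 0x02, 'IS_NUMBER_GREATER_THAN_UV_MAX' ],
    [ 0x04, 'IS_NUMBER_NOT_INT' ],
    [ 0x08, 'IS_NUMBER_NEG' ],
    [ 0x10, 'IS_NUMBER_INFINITY' ],
    [ 0x20, 'IS_NUMBER_NAN' ],
);

# describe_numtype() is hypothetical: it returns the names of all flags
# that are set in the given bitmask.
sub describe_numtype {
    my ($type) = @_;
    return map { $_->[1] } grep { $type & $_->[0] } @FLAGS;
}

# 0x0D = 0x01 | 0x04 | 0x08
print join(', ', describe_numtype(0x0D)), "\n";
# prints: IS_NUMBER_IN_UV, IS_NUMBER_NOT_INT, IS_NUMBER_NEG
```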
div
usage:
my ($quotient, $remainder) = div ($numerator, $denominator);
# e.g.
my ($q, $r) = div (13, 3);
# $q = 4, $r = 1: 13 ($numerator) = 4 ($quotient) * 3 ($denominator) + 1 ($remainder)
description:
integer division operator
in list context, returns the quotient and remainder when the first operand ($numerator) is divided by the second ($denominator)
i.e.
$numerator = $quotient * $denominator + $remainder
in scalar context, returns just the quotient. To return the remainder, use %
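The context-sensitive behaviour described above can be sketched with wantarray. div_sketch() is a hypothetical stand-in, not the module's code; it assumes a nonzero denominator and truncates the quotient toward zero.

```perl
use strict;
use warnings;

# div_sketch() is hypothetical. Assumptions: $denominator is nonzero and
# the quotient is truncated toward zero.
sub div_sketch {
    my ($numerator, $denominator) = @_;
    my $quotient  = int($numerator / $denominator);
    # Chosen so that $numerator == $quotient * $denominator + $remainder
    my $remainder = $numerator - $quotient * $denominator;
    return wantarray ? ($quotient, $remainder) : $quotient;
}

my ($q, $r) = div_sketch(13, 3);   # list context: quotient and remainder
my $only_q  = div_sketch(13, 3);   # scalar context: quotient only
print "$q r $r ($only_q)\n";       # prints: 4 r 1 (4)
```

With a negative operand, the remainder computed this way can differ from the result of Perl's % operator, so the two should not be mixed blindly.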
mtime
usage:
mtime ($file)
description:
returns the modification time of the specified file
respond
usage:
respond ($scalar)
respond (@list)
description:
respond() performs a context-sensitive return.
In void context the return value is printed
In scalar context it is returned as a scalar
In list context it is returned as a list
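A minimal sketch of the three-way dispatch, using wantarray. respond_sketch() is a hypothetical name, and joining the values in scalar context is an assumption; the text above does not say how a list is reduced to a scalar.

```perl
use strict;
use warnings;

# respond_sketch() is hypothetical, not the module's implementation.
sub respond_sketch {
    my @values  = @_;
    my $context = wantarray;   # undef: void, false: scalar, true: list

    if (not defined $context) {
        print @values;         # void context: print and return nothing
        return;
    }
    # Assumption: in scalar context the values are concatenated.
    return $context ? @values : join('', @values);
}

my @list   = respond_sketch('a', 'b');   # ('a', 'b')
my $scalar = respond_sketch('a', 'b');   # 'ab' under the assumption above
respond_sketch("printed\n");             # void context: prints "printed"
```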
quote
usage:
quote($str)
description:
escapes all single-quote ('), double-quote (") and vertical-space
(\n\r\f) characters with a backslash, visually flattening the resulting text
useful for string export e.g. assigning an HTML page to a JavaScript var
prints the escaped text out directly if called in void context
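One reading of the escaping rules above, as a sketch. quote_sketch() is a hypothetical stand-in; it assumes that a newline becomes the two characters backslash and n (likewise for \r and \f), and it leaves existing backslashes alone, which the text does not address. The void-context printing is omitted.

```perl
use strict;
use warnings;

# Assumption: each vertical-space character is replaced by its
# two-character escape sequence.
my %VSPACE = ("\n" => '\n', "\r" => '\r', "\f" => '\f');

# quote_sketch() is hypothetical, not the module's implementation.
sub quote_sketch {
    my ($str) = @_;
    $str =~ s/(['"])/\\$1/g;             # backslash-escape both quote types
    $str =~ s/([\n\r\f])/$VSPACE{$1}/g;  # flatten vertical space
    return $str;
}

print quote_sketch(qq{<p class="x">it's\nhere</p>}), "\n";
# prints: <p class=\"x\">it\'s\nhere</p>
```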
capitalize
usage:
capitalize($str)
# or
capitalize($str, @do_not_capitalize_these_words)
description:
Initial-capitalizes $str i.e. any words (defined as consecutive characters between word-boundaries (\b))
in $str that aren't already all caps are lowercased and their initials are uppercased.
Note: apostrophes are treated as word characters, so "There's more than one way to do it" becomes
"There's More Than One Way To Do It" rather than "There'S More Than One Way To Do It"
Any arguments supplied after $str are treated as exceptions and left as is
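A rough sketch of these rules. capitalize_sketch() is a hypothetical stand-in; it treats a word as a run of word characters and apostrophes, and it assumes exceptions are matched case-sensitively against whole words.

```perl
use strict;
use warnings;

# capitalize_sketch() is hypothetical, not the module's implementation.
sub capitalize_sketch {
    my ($str, @exceptions) = @_;
    my %keep = map { ($_ => 1) } @exceptions;

    $str =~ s{([\w']+)}{
        my $word = $1;
        # Leave exceptions and all-caps words alone; otherwise lowercase
        # the word and uppercase its initial.
        ($keep{$word} || $word eq uc($word)) ? $word : ucfirst(lc($word))
    }ge;
    return $str;
}

print capitalize_sketch("There's more than one way to do it"), "\n";
# prints: There's More Than One Way To Do It
print capitalize_sketch('the PERL way', 'the'), "\n";
# prints: the PERL Way
```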
trim
usage:
trim ($str)
description:
removes whitespace from the beginning and end of $str and squashes
multiple spaces down to single spaces
ltrim
usage:
ltrim ($str)
description:
removes whitespace from beginning of $str
rtrim
usage:
rtrim ($str)
description:
removes whitespace from end of $str
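The three trimming functions above amount to a few substitutions. The following is a sketch with hypothetical *_sketch names; it reads "multiple spaces" literally as runs of the space character, which is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for ltrim(), rtrim() and trim().
sub ltrim_sketch { my ($s) = @_; $s =~ s/^\s+//; return $s }
sub rtrim_sketch { my ($s) = @_; $s =~ s/\s+$//; return $s }

sub trim_sketch {
    my ($s) = @_;
    $s =~ s/^\s+//;      # leading whitespace
    $s =~ s/\s+$//;      # trailing whitespace
    $s =~ s/ {2,}/ /g;   # assumption: only runs of spaces are squashed
    return $s;
}

print '[', trim_sketch("  foo   bar  "), "]\n";   # prints: [foo bar]
```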
uid
usage:
uid()
description:
returns a quick 'n' dirty Unique Identifier. Uses $$ (the process ID), so
not necessarily reliable under persistent environments such as SpeedyCGI, mod_perl &c.
html
usage:
html ($text);
description:
returns text with HTML Content-type header prefixed
prints the prefixed page out directly if called in void context
jscript
usage:
jscript ($text);
description:
returns the JavaScript in $text with the 'application/x-javascript' Content-type header prefixed
prints the prefixed page out directly if called in void context
jshtml
usage:
jshtml ($text);
description:
returns the JavaScript in $text wrapped inside <script>, <head> and <html> tags.
The resulting HTML is returned with a 'text/html' header.
prints the result out directly if called in void context
plural
usage:
my $plural = plural($stem, $count);
# or
my $plural = plural($stem, $count, $plural);
description:
plural() takes a singular word or word-stem as an argument;
it evaluates $count to see if it is equal to 1;
if it is, $stem is returned unchanged; otherwise $stem is
pluralised by adding $plural, or 's' if $plural
hasn't been supplied.
thus:
my $plural = plural('error', $error);
will return:
'errors' if $error == 0
'error' if $error == 1
'errors' if $error > 1
This simple implementation does not support irregular plurals that modify the stem.
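The rule above fits in a couple of lines. plural_sketch() is a hypothetical stand-in that follows the description, not necessarily the module's code.

```perl
use strict;
use warnings;

# plural_sketch() is hypothetical: the stem is returned unchanged when
# $count is 1, otherwise the suffix (default 's') is appended.
sub plural_sketch {
    my ($stem, $count, $suffix) = @_;
    $suffix = 's' unless defined $suffix;
    return $count == 1 ? $stem : $stem . $suffix;
}

for my $n (0, 1, 2) {
    print "$n ", plural_sketch('error', $n), "\n";
}
# prints: 0 errors / 1 error / 2 errors (one per line)
print plural_sketch('box', 2, 'es'), "\n";   # prints: boxes
```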
urlize
usage:
my $url = urlize('Foo: BAR baz'); # returns 'foo_bar_baz'
# or
my $url = urlize('Foo - BAR - baz', 'html'); # returns 'foo_bar_baz.html'
description:
makes its text argument URL-friendly
Returns the first argument lowercased, with each run of consecutive non-alphanumeric characters replaced by a single underscore.
If the optional second argument is provided, this is appended as an extension prefixed by '.'
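A sketch that reproduces the two usage examples above. urlize_sketch() is a hypothetical stand-in; it assumes "alphanumeric" means ASCII letters and digits and does nothing special with leading or trailing punctuation.

```perl
use strict;
use warnings;

# urlize_sketch() is hypothetical, not the module's implementation.
sub urlize_sketch {
    my ($text, $ext) = @_;
    my $url = lc $text;
    $url =~ s/[^a-z0-9]+/_/g;          # each non-alphanumeric run becomes one '_'
    $url .= ".$ext" if defined $ext;   # optional extension
    return $url;
}

print urlize_sketch('Foo: BAR baz'), "\n";              # prints: foo_bar_baz
print urlize_sketch('Foo - BAR - baz', 'html'), "\n";   # prints: foo_bar_baz.html
```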
xmlparse
usage:
xmlparse ($parser, $xml_path_or_data);
description:
convenience wrapper for XML::Parser (or XML::Parser::Expat - or indeed any parser that supports parse() and parsefile())
that is agnostic with regard to whether $xml_path_or_data is a file/filehandle or raw XML text.
The $parser should be prefabricated according to taste.
readfile
description:
Swiss-Army Slurp
usage:
# vanilla
my $file = readfile($path);
# logging
my $file = readfile($path, LOG => $log);
# lines
my @file = readfile($path);
# set Input Record Separator
my @file = readfile($path, IRS => '...');
# strip Input Record Separator from result
my @file = readfile($path, IRS => '...', CHOMP => 1);
# all together now...
my @file = readfile($path, LOG => $log, IRS => '...', CHOMP => 1);
# generator/continuation
my $source = '';
my $generator = readfile($path, IRS => '...', CHOMP => 1, GENERATOR => 1);
while ($generator->(\$source)) {
process($source);
}
# or
while ($generator->()) { # implied target: $_
process($_);
}
# synonyms for IRS: DELIM, SPLIT, DELIMITER
reader
This method implements a generator/continuation interface
to the fine art of file slurpage. Essentially, it provides a private
lexically scoped filehandle and an associated file reader
(in the form of a closure).
reader() is also accessible via readfile().
If readfile() is called with a GENERATOR => 1 argument pair,
then a closure is returned.
This closure should be called with a reference to the variable
one wishes to be assigned the next line from the file.
If no argument is supplied then $_ is assumed
to be the target.
The generator yields true while the file is slurping, and undef thereafter.
For more information on the relative performance of FileHandle vs. local(*FH)
(local is used in 'vanilla' readfile()), vide:
http://groups.google.com/groups?hl=en&frame=right&th=f6035f6588fa7bfe&seekm=tbaf8dq702.fsf%40blue.sea.net#s
Unfortunately, local() won't work in reader()
(as a way of creating anonymous, temporary filehandles)
because close() is called automatically on such filehandles
at the end of the scope. The end of the scope
means the end of the call to reader() or the end of a (single)
call to $generator->() - neither of which
corresponds to the end of days for the filehandle as far as the user is concerned.
What's needed is a solution similar to Andrei Alexandrescu's
ScopeGuard, and that's what is implemented here:
http://www.cuj.com/experts/1812/alexandr.htm?topic=experts
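The generator interface described above can be sketched as a closure over a private filehandle. Note the swap: this sketch uses a lexical filehandle (open my $fh), which is closed when the closure is destroyed, rather than the ScopeGuard-managed glob that reader() is described as using. make_reader() is a hypothetical name, and the IRS and CHOMP options are left out.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# make_reader() is hypothetical; reader()'s real signature is not shown above.
sub make_reader {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";

    return sub {
        my $target = @_ ? $_[0] : \$_;   # no argument: assign to $_
        return undef unless $fh;         # file already exhausted
        my $line = <$fh>;
        if (defined $line) {
            $$target = $line;
            return 1;                    # true while there is more to read
        }
        close $fh;
        undef $fh;
        return undef;                    # undef thereafter
    };
}

# Demo on a throwaway file
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "one\ntwo\n";
close $tmp;

my $next = make_reader($path);
my $line = '';
while ($next->(\$line)) {
    print "read: $line";
}
```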
writefile
usage:
writefile($file, $data, %args);
description:
Write $data to filename $file. In theory, additional herbs
and spices are specified as a list of pairs,
in the same manner as readfile (and reader).
In practice, only APPEND => 1 for append (as opposed to truncate)
and/or 'MODE' => $mode (to roll your own file access mode) are currently defined.
appendfile
usage:
appendfile($file, $data, %args);
description:
This is a simple wrapper for writefile()'s APPEND option.
Appends $data to the file whose path is specified by $file.
See writefile() for %args options.
clone
usage:
use Util qw(clone);
$a = Foo->new();
$b = { foo => 'bar', move => 'zig' };
$c = clone($a);
$d = clone($b);
# or
my $node4 = {
name => 'node4',
attr => 'foo',
children => [ $node5, $node6 ],
parent => $node3
};
# clone $node4 but preserve the original $node3 (rather than cloning
# through it all the way to the root)
my $clone = clone($node4, [ $node3 ]);
# or, equivalently
my $clone = clone($node4, [ $node4->{parent} ]);
description:
clone() returns a recursive copy of its argument, which can be an
arbitrary (scalar) type including nested hash, array and reference types
(including weak references), tied variables and objects.
To duplicate non-scalar types (e.g. lists, arrays and hashes), pass them
in by reference. e.g.
my $copy = clone (\@array);
# or
my %copy = %{ clone (\%hash) };
clone() takes an optional second argument: a reference to an array
containing a list of exceptions i.e. values that should be 'passed-thru'
verbatim. This is useful for, amongst other things, cloning nodes in a
hierarchy without duplicating the structure all the way to the root.
For a slower, but more flexible solution see Storable's dclone().
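To illustrate how the exceptions list stops the copy from climbing to the root, here is a much-simplified sketch. clone_sketch() is hypothetical: it copies only hashes and arrays (blessed or not), shares everything else, and makes no attempt at weak references or tied variables.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype blessed);

# clone_sketch() is hypothetical, not the module's implementation.
sub clone_sketch {
    my ($value, $exceptions, $seen) = @_;
    return $value unless ref $value;

    if (!$seen) {
        # Seed the "already handled" map with the exceptions, so that they
        # are returned verbatim instead of being copied.
        $seen = {};
        $seen->{ refaddr($_) } = $_ for grep { ref($_) } @{ $exceptions || [] };
    }

    my $addr = refaddr($value);
    return $seen->{$addr} if exists $seen->{$addr};   # exception, or a cycle

    my $type = reftype($value);
    my $copy;
    if ($type eq 'HASH') {
        $copy = $seen->{$addr} = {};
        $copy->{$_} = clone_sketch($value->{$_}, undef, $seen) for keys %$value;
    }
    elsif ($type eq 'ARRAY') {
        $copy = $seen->{$addr} = [];
        push @$copy, clone_sketch($_, undef, $seen) for @$value;
    }
    else {
        return $value;   # scalar refs, code refs etc. are shared, not copied
    }

    my $class = blessed($value);
    bless $copy, $class if defined $class;
    return $copy;
}

my $root  = { name => 'root' };
my $child = { name => 'child', parent => $root, tags => [ 'a', 'b' ] };
my $copy  = clone_sketch($child, [ $root ]);

print "parent preserved\n" if $copy->{parent} == $root;
print "tags copied\n"      if $copy->{tags}   != $child->{tags};
```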
PRIVATE METHODS
myclose
Used as an auxiliary filehandle destructor by reader()
Scope::Guard
DESCRIPTION
Confer lexical semantics on an arbitrary resource
METHODS
new
usage:
my $sg = Scope::Guard->new();
description:
Creates a new ScopeGuard object. ScopeGuard provides resource
management for a non-lexically-scoped variable
by wrapping that variable in a lexical whose destructor then
manages the bound resource.
Thus the lifetime of a non-lexical resource can be made
commensurate with that of a blessed lexical.
In other words, a resource that's messy, painful or
inconvenient to close/free/cleanup can be 'automagically' managed
as painlessly as any temporary. Forget about it, let it go out of
scope, or set it to undef and resource
management kicks in via the ScopeGuard destructor (DESTROY, of course)
which feeds its second member (handler) its first member (resource).
In addition to this resource management functionality,
the ScopeGuard pointer value (as an integer) is used to
create a unique filehandle *name* within Util::reader()
(called by Util::readfile). In practice, any lexical reference
could have been used to provide a safe filehandle name.
The ScopeGuard object just happened to be the most convenient
lexical at our disposal.
For more information on ScopeGuard, vide:
http://www.cuj.com/experts/1812/alexandr.htm?topic=experts
guard
usage:
$sg->guard($resource, $handler);
description:
Initialize a ScopeGuard object with the resource it should
manage and the handler that should be called to implement
that management when the ScopeGuard object's destructor is called.
DESTROY
usage:
$sg->DESTROY();
description:
Not called directly. The destructor is a thin wrapper around
the invocation of the handler on the resource
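The new/guard/DESTROY trio described above can be sketched in a few lines. Guard::Sketch is a hypothetical package name, chosen to avoid any suggestion that this is the real class; the hash-based layout is likewise an assumption.

```perl
use strict;
use warnings;

package Guard::Sketch;   # hypothetical name

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Remember the resource and the handler that will clean it up.
sub guard {
    my ($self, $resource, $handler) = @_;
    @$self{qw(resource handler)} = ($resource, $handler);
    return;
}

# Called by Perl when the last reference to the object goes away.
sub DESTROY {
    my ($self) = @_;
    my $handler = $self->{handler} or return;
    $handler->($self->{resource});
}

package main;

my @log;
{
    my $sg = Guard::Sketch->new();
    $sg->guard('resource-1', sub { push @log, "released $_[0]" });
    push @log, 'working';
}   # $sg goes out of scope here, so the handler runs

print "$_\n" for @log;
# prints "working" then "released resource-1"
```

Because Perl frees a lexical as soon as its reference count drops to zero, the handler runs at the closing brace rather than at some later collection point.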