NAME

Util - Frequently Hacked Functions

DESCRIPTION

A cookbook of tasty functions for the terminally lazy, impatient and hubristic

AUTHOR

chocolateboy: chocolate.boy@email.com

SEE ALSO

Scalar::Util, Clone

BUGS

clone() currently segfaults if it encounters a Regexp object

PUBLIC METHODS

any

usage:

    any ($arrayref)

	# or

    any ($hashref)

description:

returns a randomly chosen member of the referenced array, or a random key
from the referenced hashtable
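
A minimal sketch of the behaviour described above, not the module's own code. The name any_sketch is hypothetical, and returning nothing for an empty container is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for any(); not the module's actual implementation.
sub any_sketch {
    my ($ref) = @_;

    # For a hash reference choose among its keys, otherwise among the elements.
    my @pool = ref($ref) eq 'HASH' ? keys %$ref : @$ref;

    return unless @pool;               # assumption: nothing to choose from
    return $pool[ int rand @pool ];    # rand @pool yields 0 <= x < scalar(@pool)
}

my $colour = any_sketch([qw(red green blue)]);
my $key    = any_sketch({ one => 1, two => 2 });
```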

isnum

usage:

isnum ($val)

description:

returns a nonzero value (indicating the numeric type) if $val is a number

The numeric type is a bitwise OR of the following flags:

0x01    IS_NUMBER_IN_UV                 (number within UV range - maybe not int)
0x02    IS_NUMBER_GREATER_THAN_UV_MAX   (the pointed-to UV is undefined)
0x04    IS_NUMBER_NOT_INT               (saw . or E notation)
0x08    IS_NUMBER_NEG                   (leading minus sign)
0x10    IS_NUMBER_INFINITY              (this is big)
0x20    IS_NUMBER_NAN                   (this is not)

Rather than obliging the user to twiddle with bits, the following flavours of isnum
(corresponding to the flags above) are also available:

isuv
isbig
isfloat
isneg
isinf
isnan
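
For those who do want to twiddle bits, here is a small sketch of how a return value could be decoded. The flag values are copied from the table above; flag_names is a hypothetical helper and is not part of Util.

```perl
use strict;
use warnings;

# Flag values copied from the table above.
my @FLAGS = (
    [ 0x01, 'IS_NUMBER_IN_UV' ],
    [ 0x02, 'IS_NUMBER_GREATER_THAN_UV_MAX' ],
    [ 0x04, 'IS_NUMBER_NOT_INT' ],
    [ 0x08, 'IS_NUMBER_NEG' ],
    [ 0x10, 'IS_NUMBER_INFINITY' ],
    [ 0x20, 'IS_NUMBER_NAN' ],
);

# Hypothetical helper: list the names of the flags set in $type.
sub flag_names {
    my ($type) = @_;
    return map { $_->[1] } grep { $type & $_->[0] } @FLAGS;
}

# 0x0D combines the in-UV, not-int and negative flags.
my @names = flag_names(0x01 | 0x04 | 0x08);
```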

div

usage:

my ($quotient, $remainder) = div ($numerator, $denominator);

# e.g.

my ($q, $r) = div (13, 3);

# $q = 4, $r = 1:  13 ($numerator) = 4 ($quotient) * 3 ($denominator) + 1 ($remainder)

description:

    integer division operator
    
    in list context, returns the quotient and remainder when the first operand ($numerator) is divided by the second ($denominator)
    
    i.e. 
    
	$numerator = $quotient * $denominator + $remainder

    in scalar context, returns just the quotient. To return the remainder, use %
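
The context-dependent return can be sketched with wantarray. This is an illustration only: div_sketch is a hypothetical name, and the treatment of negative operands simply follows Perl's % operator, which may differ from what div() does.

```perl
use strict;
use warnings;

# Hypothetical stand-in for div(); not the module's actual implementation.
sub div_sketch {
    my ($numerator, $denominator) = @_;

    my $remainder = $numerator % $denominator;
    my $quotient  = ($numerator - $remainder) / $denominator;

    # List context: both values. Scalar context: just the quotient.
    return wantarray ? ($quotient, $remainder) : $quotient;
}

my ($q, $r) = div_sketch(13, 3);    # list context
my $only_q  = div_sketch(13, 3);    # scalar context
```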

mtime

usage:

mtime ($file)

description:

returns the modification time of the specified file

respond

usage:

respond ($scalar)
respond (@list)

description:

respond() performs a context-sensitive return.
In void context the return value is printed.
In scalar context it is returned as a scalar.
In list context it is returned as a list.
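
A sketch of the three-way dispatch, assuming it hinges on wantarray. The name respond_sketch is hypothetical, and joining a list into one string in scalar context is an assumption; the text above does not say how a list is reduced to a scalar.

```perl
use strict;
use warnings;

# Hypothetical stand-in for respond(); not the module's actual implementation.
sub respond_sketch {
    my @values  = @_;
    my $context = wantarray;      # undef: void, false: scalar, true: list

    unless (defined $context) {   # void context: print and return nothing
        print @values;
        return;
    }

    # Assumption: scalar context concatenates the values, as print would.
    return $context ? @values : join('', @values);
}

respond_sketch("printed\n");              # void context
my $scalar = respond_sketch('a', 'b');    # scalar context
my @list   = respond_sketch('a', 'b');    # list context
```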

quote

usage:

quote($str)

description:

escapes all single-quote ('), double-quote (") and vertical-space
(\n, \r, \f) characters with a backslash, visually flattening the resulting text

useful for string export, e.g. assigning an HTML page to a JavaScript var

prints the escaped text out directly if called in void context
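
One way the escaping could look, as a sketch. It assumes that "escaping" a vertical-space character means replacing it with a backslash followed by the letter n, r or f, which is what flattens the text onto one line. The name quote_sketch is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for quote(); not the module's actual implementation.
sub quote_sketch {
    my ($str) = @_;
    my %letter = ("\n" => 'n', "\r" => 'r', "\f" => 'f');

    $str =~ s/(['"])/\\$1/g;                 # prefix each quote with a backslash
    $str =~ s/([\n\r\f])/\\$letter{$1}/g;    # assumption: newline becomes \ + n, etc.

    return $str if defined wantarray;
    print $str;                              # void context: print directly
    return;
}

my $js_safe = quote_sketch(qq{<p class="x">it's</p>\n});
```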

capitalize

usage:

capitalize($str)

# or 

capitalize($str, @do_not_capitalize_these_words)

description:

Initial-capitalizes $str i.e. any words (defined as consecutive characters between word-boundaries (\b))
in $str that aren't already all caps are lowercased and their initials are uppercased.
Note: apostrophes are treated as word characters, so "There's more than one way to do it" becomes
"There's More Than One Way To Do It" rather than "There'S More Than One Way To Do It"

Any arguments supplied after $str are treated as exceptions and left as is.
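
The rules above can be sketched with a single substitution. This is an approximation rather than the module's code: capitalize_sketch is a hypothetical name, and a "word" is taken to be any run of word characters and apostrophes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for capitalize(); not the module's actual implementation.
sub capitalize_sketch {
    my ($str, @exceptions) = @_;
    my %skip = map { ($_ => 1) } @exceptions;

    $str =~ s{([\w']+)}{
        my $word = $1;
        # Leave exceptions and all-caps words alone; initial-cap the rest.
        ($skip{$word} or $word eq uc $word) ? $word : ucfirst lc $word
    }ge;

    return $str;
}

my $title = capitalize_sketch("There's more than one way to do it");
my $book  = capitalize_sketch('a tale of two cities', 'of');
```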

trim

usage:

trim ($str)

description:

removes whitespace from the beginning and end of $str and squashes
multiple spaces down to single spaces

ltrim

usage:

ltrim ($str)

description:

removes whitespace from beginning of $str

rtrim

usage:

rtrim ($str)

description:

removes whitespace from end of $str
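
The three trimmers above amount to a few substitutions. A sketch, with hypothetical names; it assumes that "multiple spaces" in trim() means runs of literal space characters rather than any whitespace.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for ltrim(), rtrim() and trim().
sub ltrim_sketch { my ($s) = @_; $s =~ s/^\s+//; return $s }
sub rtrim_sketch { my ($s) = @_; $s =~ s/\s+$//; return $s }

sub trim_sketch {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;    # strip both ends
    $s =~ s/ {2,}/ /g;       # assumption: only literal spaces are squashed
    return $s;
}

my $tidy = trim_sketch("  foo   bar  ");
```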

uid

usage:

uid()

description:

returns a quick 'n' dirty Unique Identifier. Uses $$, so
not necessarily reliable under SpeedyCGI, mod_perl &c.

html

usage:

html ($text);

description:

returns text with HTML Content-type header prefixed

prints the prefixed page out directly if called in void context

jscript

usage:

jscript ($text);

description:

returns the JavaScript in $text with the 'application/x-javascript' Content-type header prefixed

prints the prefixed page out directly if called in void context

jshtml

usage:

jshtml ($text);

description:

returns the JavaScript in $text wrapped inside <script>, <head> and <html> tags.
The resulting HTML is returned with a 'text/html' header.

prints the result out directly if called in void context

plural

usage:

    my $plural = plural($stem, $count);

	# or

    my $plural = plural($stem, $count, $plural);

description:

    plural() takes a singular word or word-stem as an argument;
    it evaluates $count to see if it is equal to 1;
    if it is, $stem is returned unchanged; otherwise $stem is
    pluralised by adding $plural, or 's' if $plural
    hasn't been supplied.

    thus:

	my $plural = plural('error', $error);

	will return:
	
	    'errors' if $error == 0
	    'error' if $error == 1
	    'errors' if $error > 1

    This simple implementation does not support irregular plurals that modify the stem.
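
The rule is small enough to sketch in full. The name plural_sketch is hypothetical; the logic follows the description above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for plural(); not the module's actual implementation.
sub plural_sketch {
    my ($stem, $count, $plural) = @_;
    $plural = 's' unless defined $plural;    # default suffix
    return $count == 1 ? $stem : $stem . $plural;
}

my $errors = 3;
my $report = "$errors " . plural_sketch('error', $errors);
my $boxes  = plural_sketch('box', 2, 'es');
```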

urlize

usage:

my $url = urlize('Foo: BAR baz'); # returns 'foo_bar_baz'

# or

my $url = urlize('Foo - BAR - baz', 'html'); # returns 'foo_bar_baz.html'

description:

makes its text argument URL-friendly

Returns the first argument lowercased, with each run of consecutive non-alphanumeric characters replaced by a single underscore.
If the optional second argument is provided, it is appended as an extension prefixed by '.'
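
A sketch that reproduces the two usage examples above. The name urlize_sketch is hypothetical, and the handling of leading or trailing punctuation (which would leave a stray underscore here) is not specified by the text.

```perl
use strict;
use warnings;

# Hypothetical stand-in for urlize(); not the module's actual implementation.
sub urlize_sketch {
    my ($text, $ext) = @_;

    my $url = lc $text;
    $url =~ s/[^a-z0-9]+/_/g;           # each non-alphanumeric run becomes one '_'
    $url .= ".$ext" if defined $ext;    # optional extension

    return $url;
}

my $name = urlize_sketch('Foo: BAR baz');
my $page = urlize_sketch('Foo - BAR - baz', 'html');
```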

xmlparse

usage:

xmlparse ($parser, $xml_path_or_data);

description:

convenience wrapper for XML::Parser (or XML::Parser::Expat, or indeed any parser that supports parse() and parsefile())
that is agnostic with regard to whether $xml_path_or_data is a file/filehandle or raw XML text.

The $parser should be prefabricated according to taste.

readfile

description:

Swiss-Army Slurp

usage:

    # vanilla
    my $file = readfile($path);
	
    # logging
    my $file = readfile($path, LOG => $log);
	
    # lines
    my @file = readfile($path);
	
    # set Input Record Separator
    my @file = readfile($path, IRS => '...');
	
    # strip Input Record Separator from result
    my @file = readfile($path, IRS => '...', CHOMP => 1);
	
    # all together now...
    my @file = readfile($path, LOG => $log, IRS => '...', CHOMP => 1);
	
    # generator/continuation
    my $source = '';
    my $generator = readfile($path, IRS => '...', CHOMP => 1, GENERATOR => 1);

    while ($generator->(\$source)) {
	process($source);
    }

	# or

    while ($generator->()) { # implied target: $_
	process($_);
    }
	
    # DELIM, SPLIT and DELIMITER are synonyms for IRS
	

reader

    This method implements a generator/continuation interface
    to the fine art of file slurpage. Essentially, it provides a private
    lexically scoped filehandle and an associated file reader
    (in the form of a closure).

    reader() is also accessible via readfile().

    If readfile() is called with a GENERATOR => 1 argument pair,
    then a closure is returned.
    This closure should be called with a reference to the variable
    one wishes to be assigned the next line from the file.
    If no argument is supplied then $_ is assumed
    to be the target.

    The generator yields true while the file is slurping, and undef thereafter.
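
The generator protocol described above can be sketched as a closure over a filehandle. This is not the module's implementation: reader_sketch is a hypothetical name, and the sketch uses an ordinary lexical filehandle (open my $fh) instead of the guarded, named filehandle that reader() is said to use. The example reads from an in-memory string so that it runs without a file on disk.

```perl
use strict;
use warnings;

# Hypothetical stand-in for readfile(..., GENERATOR => 1).
sub reader_sketch {
    my ($source, %args) = @_;
    my $irs = defined $args{IRS} ? $args{IRS} : "\n";

    open my $fh, '<', $source or die "can't open $source: $!";
    my $done = 0;

    return sub {
        return if $done;
        my $target = @_ ? $_[0] : \$_;    # default target is $_

        local $/ = $irs;                  # record separator for this read
        my $record = <$fh>;
        unless (defined $record) {        # end of file: release the handle
            close $fh;
            $done = 1;
            return;
        }

        chomp $record if $args{CHOMP};    # chomp removes the current $/
        $$target = $record;
        return 1;
    };
}

my $data      = "one...two...three";
my $generator = reader_sketch(\$data, IRS => '...', CHOMP => 1);

my @records;
my $record = '';
push @records, $record while $generator->(\$record);
```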

    For more information on the relative performance of FileHandle vs. local(*FH)
    (local is used in 'vanilla' readfile()) vide:

	http://groups.google.com/groups?hl=en&frame=right&th=f6035f6588fa7bfe&seekm=tbaf8dq702.fsf%40blue.sea.net#s

    Unfortunately, local() won't work in reader()
    (as a way of creating anonymous, temporary filehandles)
    because close() is called automatically on such filehandles
    at the end of the scope. The end of the scope
    means the end of the call to reader() or the end of a (single)
    call to $generator->() - neither of which
    correspond to the end of days for the filehandle as far as the user is concerned.

    What's needed is a solution similar to Andrei Alexandrescu's
    ScopeGuard, and that's what is implemented here:

	http://www.cuj.com/experts/1812/alexandr.htm?topic=experts

writefile

usage:

writefile($file, $data, %args);

description:

Write $data to filename $file. In theory, additional herbs
and spices are specified as a list of pairs,
in the same manner as readfile (and reader).
In practice, only APPEND => 1 for append (as opposed to truncate)
and/or 'MODE' => $mode (to roll your own file access mode) are currently defined.

appendfile

usage:

appendfile($file, $data, %args);

description:

This is a simple wrapper for writefile()'s APPEND option.
Appends $data to the file whose path is specified by $file.

See writefile() for %args options.
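
A sketch of how the two functions could relate. The names are hypothetical, and treating MODE as a three-argument open() mode string (such as '>' or '>>') is an assumption; the text does not say what form $mode takes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for writefile(); not the module's actual implementation.
sub writefile_sketch {
    my ($file, $data, %args) = @_;

    # Assumption: an explicit MODE wins, then APPEND, then truncate.
    my $mode = defined $args{MODE} ? $args{MODE}
             : $args{APPEND}       ? '>>'
             :                       '>';

    open my $fh, $mode, $file or die "can't open $file: $!";
    print {$fh} $data;
    close $fh or die "can't close $file: $!";
    return 1;
}

# appendfile() is described as a wrapper for the APPEND option.
sub appendfile_sketch {
    my ($file, $data, %args) = @_;
    return writefile_sketch($file, $data, %args, APPEND => 1);
}

my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
close $tmp_fh;
writefile_sketch($tmp_path, "first\n");
appendfile_sketch($tmp_path, "second\n");
```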

clone

usage:

    use Util qw(clone);

    $a = Foo->new();
    $b = { foo => 'bar', move => 'zig' };

    $c = clone($a);
    $d = clone($b);

    # or

    my $node4 = {
	name	    => 'node4',
	attr	    => 'foo',
	children    => [ $node5, $node6 ],
	parent	    => $node3
    };

    # clone $node4 but preserve the original $node3 (rather than cloning
    # through it all the way to the root)

    my $clone = clone($node4, [ $node3 ]);

	# or, equivalently

    my $clone = clone($node4, [ $node4->{parent} ]);

description:

    clone() returns a recursive copy of its argument, which can be an
    arbitrary (scalar) type including nested hash, array and reference types
    (including weak references), tied variables and objects.

    To duplicate non-scalar types (e.g. lists, arrays and hashes), pass them
    in by reference. e.g.
	
	my $copy = clone (\@array);

	# or

	my %copy = %{ clone (\%hash) };

    clone() takes an optional second argument: a reference to an array
    containing a list of exceptions i.e. values that should be 'passed-thru'
    verbatim. This is useful for, amongst other things, cloning nodes in a
    hierarchy without duplicating the structure all the way to the root.

    For a slower, but more flexible solution see Storable's dclone().
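
To illustrate the pass-through argument, here is a pure-Perl sketch of a recursive copy. It is not how clone() is implemented and covers far less: only unblessed hashes and arrays are copied, and tied variables, weak references and objects are ignored. The name clone_sketch is hypothetical, and exceptions are assumed to be references.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical, much-reduced stand-in for clone().
sub clone_sketch {
    my ($value, $exceptions, $seen) = @_;

    unless ($seen) {
        # Seed the "already copied" map so that exceptions map to themselves.
        $seen = {};
        $seen->{ refaddr($_) } = $_ for grep { ref } @{ $exceptions || [] };
    }

    return $value unless ref $value;    # plain scalars are copied by value

    my $addr = refaddr($value);
    return $seen->{$addr} if exists $seen->{$addr};    # exception or cycle

    if (ref $value eq 'HASH') {
        my $copy = $seen->{$addr} = {};
        $copy->{$_} = clone_sketch($value->{$_}, undef, $seen) for keys %$value;
        return $copy;
    }
    if (ref $value eq 'ARRAY') {
        my $copy = $seen->{$addr} = [];
        @$copy = map { clone_sketch($_, undef, $seen) } @$value;
        return $copy;
    }
    return $value;    # other reference types are passed through in this sketch
}

my $node3 = { name => 'node3' };
my $node4 = { name => 'node4', children => [ 'a', 'b' ], parent => $node3 };
my $clone = clone_sketch($node4, [ $node3 ]);    # $clone->{parent} is still $node3
```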

PRIVATE METHODS

myclose

Used as an auxiliary filehandle destructor by reader()

Scope::Guard

DESCRIPTION

Confer lexical semantics on an arbitrary resource

METHODS

new

usage:

my $sg = Scope::Guard->new();

description:

    Creates a new ScopeGuard object. ScopeGuard provides resource
    management for a non-lexically-scoped variable
    by wrapping that variable in a lexical whose destructor then
    manages the bound resource.

    Thus the lifetime of a non-lexical resource can be made
    commensurate with that of a blessed lexical.

    In other words, a resource that's messy, painful or
    inconvenient to close/free/cleanup can be 'automagically' managed
    as painlessly as any temporary. Forget about it, let it go out of
    scope, or set it to undef, and resource
    management kicks in via the ScopeGuard destructor (DESTROY, of course),
    which passes its first member (the resource) to its second member (the handler).

    In addition to this resource management functionality,
    the ScopeGuard pointer value (as an integer) is used to
    create a unique filehandle *name* within Util::reader()
    (called by Util::readfile). In practice, any lexical reference
    could have been used to provide a safe filehandle name.
    The ScopeGuard object just happened to be the most convenient
    lexical at our disposal.

    For more information on ScopeGuard, vide:

	http://www.cuj.com/experts/1812/alexandr.htm?topic=experts

guard

usage:

$sg->guard($resource, $handler);

description:

Initialize a ScopeGuard object with the resource it should
manage and the handler that should be called to implement
that management when the ScopeGuard object's destructor is called.

DESTROY

usage:

$sg->DESTROY();

description:

Not called directly. The destructor is a thin wrapper around
the invocation of the handler on the resource
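
Taken together, new(), guard() and DESTROY can be sketched in a few lines. This is an illustration of the idea, not the class's actual source. The package name Guard::Sketch is hypothetical (chosen to avoid clashing with the real Scope::Guard), and storing the pair as a two-element array is an assumption based on the "first member" and "second member" wording above.

```perl
use strict;
use warnings;

package Guard::Sketch;    # hypothetical name

sub new {
    my ($class) = @_;
    return bless [], $class;
}

sub guard {
    my ($self, $resource, $handler) = @_;
    @$self = ($resource, $handler);    # first member: resource, second: handler
    return $self;
}

sub DESTROY {
    my ($self) = @_;
    my ($resource, $handler) = @$self;
    $handler->($resource) if $handler;    # nothing to do if guard() was never called
}

package main;

my @log;
{
    my $sg = Guard::Sketch->new();
    $sg->guard('resource', sub { push @log, "released $_[0]" });
}   # $sg goes out of scope here, so DESTROY runs the handler
```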