NAME
Parse::PerlConfig - parse a configuration file written in Perl
SYNOPSIS
use Parse::PerlConfig;
my $parsed = Parse::PerlConfig::parse(
File => "/etc/perlapp/conf",
Handlers => [\%config, \&config],
);
DESCRIPTION
This module is useful for parsing a configuration file written in Perl and obtaining the values defined therein. This is achieved through the parse() function, which creates a namespace, reads in Perl code, evals it, and then examines the namespace's symbol table. Symbols are then processed into a hash and returned.
Export
The parse() function is exportable upon request.
Parsing
Parsing is not a simple do("filename"). Instead, the specified files are opened, read, eval'd, and closed. There are two reasons for this:
- no @INC search
-
I did not want surprises in what file was found; do("file") searches @INC.
- lexicals
-
I wanted to be able to insert lexicals for the code in the file to see. Being able to define variables without having them parsed back out (remember, the namespace is searched) is a nice feature.
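The effect of the second point can be seen without the module. The following is a minimal sketch in core Perl, not the module's implementation; the namespace My::Demo::Config and the variable names are made up for illustration. A lexical in scope is readable by the evaluated code, yet it never shows up in the namespace's symbol table.

```perl
use strict;
use warnings;

# Hypothetical namespace name, chosen only for this illustration.
my $namespace = 'My::Demo::Config';

# A lexical that the evaluated code can read. Because it is a "my"
# variable, it never appears in any package symbol table.
my $app_name = 'perlapp';

# Stand-in for the text that would be read from a configuration file.
my $code = q{
    our $greeting = "hello from $app_name";
    our @paths    = ('/usr/bin', '/bin');
};

# Evaluate the text inside the chosen package. A string eval sees the
# lexicals that are in scope where the eval is written.
my $ok = eval "package $namespace; $code; 1";
die "eval failed: $@" unless $ok;

# List the names that ended up in the namespace's symbol table.
my @names = do {
    no strict 'refs';
    sort keys %{"${namespace}::"};
};
print "symbols: @names\n";
```

The list contains greeting and paths but not app_name, which is why a lexical is a convenient way to hand values to a configuration file without having them parsed back out.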
Parsing (in this manner) requires a namespace. By default, the namespace is constructed by appending a unique identifier (currently, an encoded version of the filename, but don't rely on this) to Namespace_Base. You can override this behaviour by specifying an explicit Namespace argument.
Prior to eval'ing the contents of a configuration file, the lexical hash %parse_perl_config is initialized with several keys (documented below); if a Lexicals argument was given, each of the specified lexicals is initialized as well.
There are a few caveats. Lexicals specified in the Lexicals argument cannot override %parse_perl_config. Values specified in Lexicals cannot be code references, because code references cannot currently be reliably reconstructed. Modifications to %parse_perl_config keys (other than Error, documented below) are discouraged, as the results are not defined.
The %parse_perl_config hash contains the following keys:
- Error
-
Setting this key to a true value will cause the error handler Error_eval to be called with the value.
- Namespace
-
The namespace the file is being evaluated in.
- Filename
-
The name of the file being parsed.
- Parse_Args
-
A hash of the arguments passed to parse().
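As an illustration of these keys, a configuration file might report a problem through Error. This is only a sketch of a file's contents: the $database setting and the PERLAPP_DB environment variable are hypothetical, and the fragment is meant to be evaluated by parse(), which supplies %parse_perl_config.

```perl
# Contents of a hypothetical configuration file.
our $database = $ENV{PERLAPP_DB};

# Per the key list above, a true Error value is handed to the
# Error_eval handler once the file has been evaluated.
$parse_perl_config{Error} =
    "PERLAPP_DB is not set (while parsing $parse_perl_config{Filename})"
    unless defined $database;
```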
Once the namespace has been set up and the code eval'd, it is parse()'s job to go through the namespace's symbol table and look for "things". What it looks for depends on the Thing_Order and Symbols arguments.
After that, handlers are updated, and a hash reference of what was parsed out is returned.
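The symbol table walk can be sketched in core Perl. This is an assumption-laden illustration rather than the module's code: the namespace My::Walk::Config, the helper first_thing(), and the choice to store references as values are all invented here. For each symbol, the slots are tried in a fixed order and the first usable one is kept.

```perl
use strict;
use warnings;

# Hypothetical namespace populated directly; it stands in for an
# evaluated configuration file.
package My::Walk::Config;
our $name  = 'perlapp';
our @paths = ('/usr/bin', '/bin');
our %limit = (cpu => 2);

package main;

# Return a reference to the first filled slot of a symbol, trying the
# slots in the order given.
sub first_thing {
    my ($namespace, $symbol, @order) = @_;
    my $glob = do { no strict 'refs'; \*{"${namespace}::${symbol}"} };
    for my $slot (@order) {
        my $ref = *{$glob}{$slot};
        next unless defined $ref;
        # The SCALAR slot always yields a reference, so test the value.
        next if $slot eq 'SCALAR' && !defined $$ref;
        return $ref;
    }
    return;
}

my %parsed;
for my $symbol (sort keys %My::Walk::Config::) {
    my $thing = first_thing('My::Walk::Config', $symbol, qw(SCALAR ARRAY HASH));
    $parsed{$symbol} = $thing if defined $thing;
}

for my $symbol (sort keys %parsed) {
    print "$symbol => ", ref($parsed{$symbol}), "\n";
}
```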
The parse subroutine
Arguments
parse() takes a list of key-value pairs as its arguments and adds them to an argument hash. If the first argument to parse() is a hash or array reference, it is dereferenced and used as if it were specified as a list. All elements following this argument are added to the arguments hash, and they override any settings specified by the reference.
This means the call:
parse(
{ Files => "/home/me/config.conf", Error_default => 'fwarn' },
Files => "/home/you/config.conf"
);
causes parse()'s argument hash to consist of the following (ignoring default settings):
Files => "/home/you/config.conf",
Error_default => "fwarn",
Simply replace the braces, {}, with brackets, [], and you get the same result. This makes it convenient to store commonly-used arguments to parse() in a hash or array, and efficiently pass these arguments to parse(), while still allowing a separate Files argument for each call.
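The merging rule described above can be sketched as follows. The helper merge_args() is hypothetical and only mirrors the documented behaviour; it is not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper: a leading hash or array reference supplies
# base settings, and the pairs that follow override them.
sub merge_args {
    my @args = @_;
    my %merged;
    if (@args && ref($args[0]) eq 'HASH') {
        %merged = %{ shift @args };
    }
    elsif (@args && ref($args[0]) eq 'ARRAY') {
        %merged = @{ shift @args };
    }
    %merged = (%merged, @args);   # later pairs win
    return \%merged;
}

my $from_hash = merge_args(
    { Files => "/home/me/config.conf", Error_default => 'fwarn' },
    Files => "/home/you/config.conf",
);
my $from_array = merge_args(
    [ Files => "/home/me/config.conf", Error_default => 'fwarn' ],
    Files => "/home/you/config.conf",
);

for my $args ($from_hash, $from_array) {
    print "$_ => $args->{$_}\n" for sort keys %$args;
}
```

Both calls produce the same two settings, with Files taken from the trailing pair.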
The below itemization of parse()'s arguments describes key-value pairs. Each item consists of a key name and a description of the expected value for that key.
The value description requires some explanation.
A single pipe, "|", indicates alternative values; exactly one of the values must be specified.
Values bracketed with "<" and ">" indicate that value is not literal, but figurative. So, in the case of <coderef>, you must specify a code reference (a closure or reference to a named subroutine), not the literal string "<coderef>". Values without such bracketing are literal.
Braces, {}, indicate a hash reference is required; brackets, [], indicate an array reference is required.
Below each key-value description is a description of the default setting, followed by a description of what the argument means.
- Files <filename> | [<filename> ...]
-
default: none, this argument is required
This is the file or files you wish to parse. If a file cannot be parsed for any reason, the entire parse is not abandoned; the file is simply skipped (after calling an appropriate error handling function).
- File <filename>
-
Equivalent to Files <filename>.
- Handlers <hashref>|<coderef> | [<hashref>|<coderef> ...]
-
default: none
By default, parse() simply returns a hash reference of symbol names and their values. Given a Handlers argument, parse() will also add those key-value pairs to each hash reference specified, and call each code reference specified with a single argument: the hash reference that parse() returns.
- Handler <hashref>|<coderef>
-
Equivalent to Handlers <hashref>|<coderef>.
- Lexicals <hashref>
-
default: none
The key-value pairs in the specified hashref are made into lexical variables in the configuration eval. See the section on Parsing for further information.
- Thing_Order <string>|<arrayref>
-
default: '$@%&i*'
Specifies the default thing order for symbols parsed from each configuration file. See the section Things for further information.
- Taint_Clean <boolean>
-
default: false
If set to any true value, the filehandle opened on the configuration file is untainted before evaling the code contained therein. Because this involves loading IO::Handle, which involves quite a bit of code, the option is turned off by default. You will get taint exceptions if you don't specify this option while running in -T mode.
Also, as the namespace is currently constructed, having a tainted filename will cause the namespace name to be tainted, so it is also untainted. In the case of an explicitly specified Namespace value, it will also be untainted.
No other values are untainted. This includes any key-value pairs specified by the Lexicals argument; you must untaint those yourself, since there is no reasonable way for parse() to determine how best to untaint them.
- Symbols <hashref>
-
default: empty hashref
This is an override for the Thing_Order argument above. The keys in the specified hashref are the symbols you want parsed specially; the values are the thing order to use for each (either a string or an array reference). See the section Things for further information regarding thing order.
- Namespace <string>
-
default: generated from Namespace_Base and a unique identifier
This option explicitly specifies the namespace the files are parsed in. See the section Parsing for further information.
- Namespace_Base <string>
-
default: Parse::PerlConfig::ConfigFile
Unless the Namespace argument is specified, the namespace a file is parsed in is generated by appending a cleaned up version of the filename to this setting. See the section Parsing for further information.
- Error_*
-
See the section Exception Handling for further information.
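To make the Handlers argument concrete, the following core-Perl sketch mimics the documented behaviour on a hand-built result. The helper update_handlers() and the sample data are hypothetical; the real work is done inside parse().

```perl
use strict;
use warnings;

# Hypothetical helper: copy the parsed pairs into each hash reference
# and call each code reference with the parsed hash reference.
sub update_handlers {
    my ($parsed, @handlers) = @_;
    for my $handler (@handlers) {
        if (ref($handler) eq 'HASH') {
            @{$handler}{ keys %$parsed } = values %$parsed;
        }
        elsif (ref($handler) eq 'CODE') {
            $handler->($parsed);
        }
        else {
            warn "unknown handler type\n";
        }
    }
    return $parsed;
}

my %config = (existing => 1);
my @seen;
my $parsed = { name => 'perlapp', debug => 0 };   # stand-in for parse()'s result

update_handlers($parsed, \%config, sub { push @seen, sort keys %{ $_[0] } });

print "$_ => $config{$_}\n" for sort keys %config;
print "callback saw: @seen\n";
```

Existing keys in a handler hash are kept unless a parsed symbol has the same name.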
Things
"Things" (as taken from the Perl documentation, regarding the *foo{THING} syntax) are the Perl datatypes. These include scalars, arrays, hashes, subroutines, IO handles, and globs.
Anywhere a "things" argument is required you can specify one of two things: a string containing the special "thing" characters, or an array reference of each thing's actual name. The thing characters are as follows: $ for a scalar, % for a hash, @ for an array, & for a subroutine, i for an IO handle, and * for a glob. The full name for each coincides with the name of its glob slot: SCALAR for a scalar, HASH for a hash, ARRAY for an array, CODE for a subroutine, IO for an IO handle, and GLOB for a glob.
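The two accepted forms can be normalized to slot names with a small lookup table. This is a sketch only; the helper thing_names() is hypothetical and is not part of the module's interface.

```perl
use strict;
use warnings;

# Thing characters and the glob slot each one stands for.
my %THING_NAME = (
    '$' => 'SCALAR', '@' => 'ARRAY', '%' => 'HASH',
    '&' => 'CODE',   'i' => 'IO',    '*' => 'GLOB',
);

# Hypothetical helper: accept a string of thing characters or an
# array reference of full names, and return a list of full names.
sub thing_names {
    my ($order) = @_;
    return @$order if ref($order) eq 'ARRAY';
    my @names;
    for my $char (split //, $order) {
        die "unknown thing character '$char'\n"
            unless exists $THING_NAME{$char};
        push @names, $THING_NAME{$char};
    }
    return @names;
}

print join(' ', thing_names('$@%&i*')), "\n";
print join(' ', thing_names([qw(HASH ARRAY)])), "\n";
```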
Exception Handling
parse() takes various Error_* and Warn_* arguments that determine how it handles any problems it encounters. Each argument can take one of several values.
- default
-
The error handling specified by Error_default in the case of an Error_* argument, or Warn_default in the case of a Warn_* argument, is used.
- noop
-
The error is ignored.
- warn
-
Results in a call to CORE::warn() with a trailing newline, but only if $^W is set to a true value.
- fwarn
-
Like warn, but the warning is raised regardless of $^W's value.
- die
-
-
Results in a call to CORE::die() with a trailing newline.
- <code reference>
-
The code reference will be called with a single argument: the error message. The error message is guaranteed to contain no trailing newlines (in case the code reference decides to die() or warn()).
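The handler values listed above can be pictured as a small dispatcher. The sketch below is written for illustration and assumes nothing about the module's internals; handle_problem() and its argument order are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical dispatcher mirroring the documented handler values.
# $default is what "default" resolves to (Error_default or Warn_default).
sub handle_problem {
    my ($handler, $default, $message) = @_;
    $handler = $default if !ref($handler) && $handler eq 'default';
    $message =~ s/\n+\z//;    # code references get no trailing newline
    return $handler->($message) if ref($handler) eq 'CODE';
    return if $handler eq 'noop';
    if ($handler eq 'warn')  { warn "$message\n" if $^W; return }
    if ($handler eq 'fwarn') { warn "$message\n"; return }
    die "$message\n" if $handler eq 'die';
    die "unknown handler value '$handler'\n";
}

my @log;
handle_problem(sub { push @log, $_[0] }, 'warn', "could not open file\n");
handle_problem('noop', 'warn', "this is ignored");
handle_problem('default', sub { push @log, "default: $_[0]" }, "bad argument");

my $survived = eval { handle_problem('die', 'warn', "fatal problem"); 1 };
my $error    = $@;

print "logged: $_\n" for @log;
print "die handler raised: $error" unless $survived;
```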
There are various handler arguments. Unless otherwise specified, the default handler is used (Error_default's or Warn_default's value).
- Warn_default
-
default: noop
The default warning handler.
- Warn_preparse
-
Called just before a file is parsed to indicate parsing is about to begin.
- Warn_eval
-
Called with any warnings issued by the eval'd file.
- Error_default
-
default: warn
The default error handler.
- Error_argument
-
Called if there is a problem with one of the arguments specified.
- Error_file_is_dir
-
Called if a configuration file specified was discovered to be a directory.
- Error_failed_open
-
Called if the open attempt on a configuration file fails.
- Error_eval
-
Called if the variable $parse_perl_config{Error} is set in the configuration file, or if there was an eval error.
- Error_unknown_thing
-
Called if there is a problem with a thing character or thing name in a thing argument (thing thing thing).
- Error_unknown_handler
-
Called if an unknown reference is encountered in the Handlers argument.
- Error_invalid_lexical
-
Called if an invalid lexical name or a CODE reference value is encountered in the Lexicals argument.
- Error_invalid_namespace
-
Called if either the constructed namespace (using Namespace_Base) or a specified Namespace value is invalid. This may indicate an error in the construction of a namespace name (the generation of a unique identifier), but it's most likely you specified Namespace_Base or Namespace with invalid characters.
BUGS
Because the scalar slot in a glob is always filled, it is not possible to distinguish a scalar that was never defined (e.g. @foo was, but $foo was never mentioned) from one that is simply undef. Because of this, for example, if you have a thing order of $@ and code along the lines of $foo = undef; @foo = (); the 'foo' key of the hash will be an array reference, despite there being a scalar and $ coming first in the thing order.
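The underlying Perl behaviour can be observed directly. In this short core-Perl sketch (the variable names are made up), a symbol that only ever had an array and a symbol whose scalar was explicitly set to undef look identical when their glob slots are inspected.

```perl
use strict;
use warnings;

our @only_array   = ();       # $only_array is never mentioned
our $undef_scalar = undef;    # explicitly undefined scalar
our @undef_scalar = ();

for my $glob (\*only_array, \*undef_scalar) {
    my $scalar = *{$glob}{SCALAR};   # always a reference
    my $hash   = *{$glob}{HASH};     # undef unless the hash was used
    printf "%s: SCALAR slot %s, value %s, HASH slot %s\n",
        *{$glob}{NAME},
        (defined($scalar)  ? 'filled'  : 'empty'),
        (defined($$scalar) ? 'defined' : 'undef'),
        (defined($hash)    ? 'filled'  : 'empty');
}
```

Both lines report a filled SCALAR slot holding undef, so the slot alone cannot tell the two cases apart.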
TODO
t/parse/symbols.t, t/parse/multi-file.t, t/parse/namespace.t
AUTHOR
Michael Fowler <michael@shoebox.net>