NAME

CGI::Ex::Fill - Fast but compliant regex based form filler

SYNOPSIS

use CGI::Ex::Fill qw(form_fill fill);

my $text = my_own_template_from_somewhere();

my $form = CGI->new;
# OR
# my $form = {key => 'value'}
# OR
# my $form = [CGI->new, CGI->new, {key1 => 'val1'}, CGI->new];


form_fill(\$text, $form); # modifies $text

# OR
# my $copy = form_fill($text, $form); # copies $text

# OR
fill({
    text => \$text,
    form => $form,
});


# ALSO

my $formname = 'formname';     # form to parse (undef = any form)
my $fp = 0;                    # fill_password? default is true
my $ignore = ['key1', 'key2']; # OR {key1 => 1, key2 => 1};

form_fill(\$text, $form, $formname, $fp, $ignore);

# OR
fill({
    text          => \$text,
    form          => $form,
    target        => 'my_formname',
    fill_password => $fp,
    ignore_fields => $ignore,
});

# ALSO

### delay getting the value until we find an element that needs it
my $form = {
    key => sub {
        my $key = shift;
        # get and return value
    },
};
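
The delayed lookup can be sketched as follows. This is a minimal illustration, assuming only the behavior described above (a code ref is called with the key once an element by that name is found); the field names and the %calls counter are made up for the example.

```perl
use strict;
use warnings;
use CGI::Ex::Fill qw(form_fill);

my %calls;  # illustrative counter: how often each callback ran

my $form = {
    # called only if an element named "user" is found in the html
    user   => sub { my $key = shift; $calls{$key}++; return "value for $key" },
    # no element named "unused" exists below, so this should never run
    unused => sub { my $key = shift; $calls{$key}++; return 'never needed' },
};

my $html = '<input type=text name=user>';
form_fill(\$html, $form);

print "$html\n";
print "user callback ran\n"       if $calls{user};
print "unused callback skipped\n" if ! $calls{unused};
```

This is useful when a value is expensive to compute and most templates never ask for it.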

DESCRIPTION

form_fill is directly comparable to HTML::FillInForm and will pass the same suite of tests. It is actually a little kinder on the parse: it won't change case, reorder your attributes, or alter miscellaneous spaces, and it doesn't require the HTML to be well formed.

HTML::FillInForm is based upon HTML::Parser while CGI::Ex::Fill is purely regex driven. The performance of CGI::Ex::Fill will be better on HTML with many markup tags because HTML::Parser will parse each tag while CGI::Ex::Fill will search only for those tags it knows how to handle. And CGI::Ex::Fill generally won't break on malformed html.

On tiny forms (< 1 k) form_fill was ~ 13% slower than FillInForm. As the amount of markup in the html document grows, the performance of FillInForm goes down (adding 360 <br> tags pushed form_fill to ~ 350% faster). If you are only filling in one form every so often, it shouldn't matter which you use, but form_fill will be nicer on the tags, won't balk at ugly html, and will lose performance only slowly as the size of the html increases. See the benchmarks in the t/samples/bench_cgix_hfif.pl file for more information (ALL BENCHMARKS SHOULD BE TAKEN WITH A GRAIN OF SALT).

There are two functions, fill and form_fill. The function fill takes a hashref of named arguments. The function form_fill takes a list of positional parameters.
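
As a rough illustration of the two calling styles, the sketch below fills the same template three ways. The template and form data are made up for the example; the calls themselves are the ones shown in the SYNOPSIS.

```perl
use strict;
use warnings;
use CGI::Ex::Fill qw(form_fill fill);

my $form     = {user => 'alice'};               # illustrative data
my $template = '<input type=text name=user>';   # illustrative markup

# Positional style with a reference: the string is modified in place.
my $in_place = $template;
form_fill(\$in_place, $form);

# Positional style with a plain string: a filled copy is returned.
my $copy = form_fill($template, $form);

# Named style: a hashref of arguments, with text passed by reference.
my $named = $template;
fill({text => \$named, form => $form});

print "$_\n" for $in_place, $copy, $named;
```

All three results should be the same; only the calling convention differs.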

ARGUMENTS TO fill

The following are the arguments to the main function fill.

text

A reference to an html string that includes one or more forms.

form

A form hash, CGI object, or an array of hashrefs and objects.

target

The name of the form to swap. Default is undef which means to swap all form entities in all forms.

fill_password

Default true. If set to false, fields of type password will not be refilled.

ignore_fields

Hashref or arrayref of field names to be skipped during the fill.

remove_script

Defaults to the package global $REMOVE_SCRIPT which defaults to true. Removes anything in <script></script> tags which often cause problems for parsers.

remove_comment

Defaults to the package global $REMOVE_COMMENT which defaults to true. Removes anything in <!-- --> tags which can sometimes cause problems for parsers.

object_method

The method to call on objects passed to the form argument. Default value is the package global $OBJECT_METHOD which defaults to 'param'. If a CGI object is passed, it would call param on that object passing the desired keyname as an argument.
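
The named arguments can be combined. The following is a minimal sketch, assuming a hypothetical My::Request class whose accessor is called get rather than param; the class, form names, and field names are invented for the example.

```perl
use strict;
use warnings;
use CGI::Ex::Fill qw(fill);

# Hypothetical stand-in for a request object; only its "get" method matters.
package My::Request;
sub new { my ($class, %data) = @_; return bless {store => {%data}}, $class }
sub get { my ($self, $key) = @_; return $self->{store}{$key} }

package main;

my $request = My::Request->new(user => 'alice', pass => 'secret', token => 'abc');

my $html = '
<form name="login">
  <input type=text name=user>
  <input type=password name=pass>
  <input type=hidden name=token>
</form>
<form name="other">
  <input type=text name=user>
</form>
';

fill({
    text          => \$html,
    form          => $request,
    target        => 'login',     # only fill the form named "login"
    fill_password => 0,           # leave password fields alone
    ignore_fields => ['token'],   # leave the token field alone
    object_method => 'get',       # call $request->get($key) instead of ->param
});

print $html;
```

With these settings only the user field inside the login form should receive a value.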

ARGUMENTS TO form_fill

The following are the arguments to the legacy function form_fill.

\$html

A reference to an html string that includes one or more forms or form entities.

\%FORM

A form hash, or CGI query object, or an arrayref of multiple hash refs and/or CGI query objects that will supply values for the form.

$form_name

The name of the form to fill in values for. The default is undef which indicates that all forms are to be filled in.

$swap_pass

Default true. Indicates that <input type="password"> fields are to be swapped as well. Set to false to disable this behavior.

\%IGNORE_FIELDS OR \@IGNORE_FIELDS

A hash ref of key names or an array ref of key names that will be ignored during the fill in of the form.

BEHAVIOR

fill and form_fill will attempt to DWYM (do what you mean) when filling in values. The following behaviors are used on the following types of form elements.

<input type="text">

The following rules are used when matching this type:

1) Get the value from the form that matches the input's "name".
2) If the value is defined - it adds or replaces the existing value.
3) If the value is not defined and the existing value is not defined,
   a value of "" is added.

For example:

my $form = {foo => "FOO", bar => "BAR", baz => "BAZ"};

my $html = '
    <input type=text name=foo>
    <input type=text name=foo>
    <input type=text name=bar value="">
    <input type=text name=baz value="Something else">
    <input type=text name=hem value="Another thing">
    <input type=text name=haw>
';

form_fill(\$html, $form);

$html eq   '
    <input type=text name=foo value="FOO">
    <input type=text name=foo value="FOO">
    <input type=text name=bar value="BAR">
    <input type=text name=baz value="BAZ">
    <input type=text name=hem value="Another thing">
    <input type=text name=haw value="">
';

If the value returned from the form is an array ref, the values of the array ref will be sequentially used for each input found by that name until the values run out. If the value is not an array ref - it will be used to fill in any values by that name. For example:

$form = {foo => ['aaaa', 'bbbb', 'cccc']};

$html = '
    <input type=text name=foo>
    <input type=text name=foo>
    <input type=text name=foo>
    <input type=text name=foo>
    <input type=text name=foo>
';

form_fill(\$html, $form);

$html eq  '
    <input type=text name=foo value="aaaa">
    <input type=text name=foo value="bbbb">
    <input type=text name=foo value="cccc">
    <input type=text name=foo value="">
    <input type=text name=foo value="">
';

<input type="hidden">

Same as <input type="text">.

<input type="password">

Same as <input type="text">.

<input type="file">

Same as <input type="text">. (Note - this is subject to browser support for pre-population)

<input type="checkbox">

As each checkbox is found the following rules are applied:

1) Get the values from the form (do nothing if no values found)
2) Remove any existing "checked=checked" or "checked" markup from the tag.
3) Compare the "value" field to the values and mark with checked="checked"
   if there is a match.

If no "value" field is found in the html, a default value of "on" will be used (which is what most browsers will send as the default value for checked boxes without "value" fields).

$form = {foo => 'FOO', bar => ['aaaa', 'bbbb', 'cccc'], baz => 'on'};

$html = '
    <input type=checkbox name=foo value="123">
    <input type=checkbox name=foo value="FOO">
    <input type=checkbox name=bar value="aaaa">
    <input type=checkbox name=bar value="cccc">
    <input type=checkbox name=bar value="dddd" checked="checked">
    <input type=checkbox name=baz>
';

form_fill(\$html, $form);

$html eq  '
    <input type=checkbox name=foo value="123">
    <input type=checkbox name=foo value="FOO" checked="checked">
    <input type=checkbox name=bar value="aaaa" checked="checked">
    <input type=checkbox name=bar value="cccc" checked="checked">
    <input type=checkbox name=bar value="dddd">
    <input type=checkbox name=baz checked="checked">
';

<input type="radio">

Same as <input type="checkbox">.
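
Since radio buttons follow the checkbox rules, a radio group might be filled as in the sketch below. The field name and values are illustrative only.

```perl
use strict;
use warnings;
use CGI::Ex::Fill qw(form_fill);

my $form = {size => 'm'};   # illustrative field name and value

my $html = '
    <input type=radio name=size value="s" checked="checked">
    <input type=radio name=size value="m">
    <input type=radio name=size value="l">
';

form_fill(\$html, $form);

# Following the checkbox rules: "s" loses its checked markup, "m" gains it.
print $html;
```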

<select>

As each select box is found the following rules are applied (these rules are applied regardless of whether the box is a select-one or a select-multi - if multiple values are selected on a select-one it is up to the browser to choose which one to highlight):

 1) Get the values from the form (do nothing if no values found)
 2) Remove any existing "selected=selected" or "selected" markup from the tag.
 3) Compare the "value" field to the values and mark with selected="selected"
    if there is a match.
 4) If there is no "value" field - use the text in between the "option" tags.

 (Note: There does not need to be a closing "select" tag or closing "option" tag)


$form = {foo => 'FOO', bar => ['aaaa', 'bbbb', 'cccc']};

$html = '
    <select name=foo><option>FOO<option>123<br>

    <select name=bar>
      <option>aaaa</option>
      <option value="cccc">cccc</option>
      <option value="dddd" selected="selected">dddd</option>
    </select>
';

form_fill(\$html, $form);

$html eq  '
    <select name=foo><option selected="selected">FOO<option>123<br>

    <select name=bar>
      <option selected="selected">aaaa</option>
      <option value="cccc" selected="selected">cccc</option>
      <option value="dddd">dddd</option>
    </select>
';

<textarea>

The rules for swapping textarea are as follows:

1) Get the value from the form that matches the textarea's "name".
2) If the value is defined - it adds or replaces the existing value.
3) If the value is not defined, the text area is left alone.

(Note - there does not need to be a closing textarea tag.  In the case of
 a missing close textarea tag, the contents of the text area will be
 assumed to extend to the start of the next textarea or the end of the
 document - whichever comes sooner)

If the form returned an array ref of values, then these values will be used sequentially each time a textarea by that name is found. If a single value (not array ref) is found, that value will be used for each textarea by that name.

For example:

$form = {foo => 'FOO', bar => ['aaaa', 'bbbb']};

$html = '
    <textarea name=foo></textarea>
    <textarea name=foo></textarea>

    <textarea name=bar>
    <textarea name=bar></textarea><br>
    <textarea name=bar>dddd</textarea><br>
    <textarea name=bar><br><br>
';

form_fill(\$html, $form);

$html eq  '
    <textarea name=foo>FOO</textarea>
    <textarea name=foo>FOO</textarea>

    <textarea name=bar>aaaa<textarea name=bar>bbbb</textarea><br>
    <textarea name=bar></textarea><br>
    <textarea name=bar>';

<input type="submit">

Does nothing. The value for submit should typically be set by the templating system or application system.

<input type="button">

Same as submit.

HTML COMMENT / JAVASCRIPT

Because there are too many problems that could occur with html comments and javascript, form_fill temporarily removes them during the fill. You may disable this behavior by setting $REMOVE_COMMENT and $REMOVE_SCRIPT to 0 before calling form_fill. The main reason for doing this would be if you wanted form elements inside the javascript and comments to be filled. Disabling the removal only results in a speed increase of 5%. The function uses \0COMMENT\0 and \0SCRIPT\0 as placeholders, so it would be good to avoid these strings in your text (they may be reset to whatever you'd like via $MARKER_COMMENT and $MARKER_SCRIPT).
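
A minimal sketch of the difference, assuming the package globals live in the CGI::Ex::Fill namespace (as the argument descriptions above suggest); the template and field name are made up for the example.

```perl
use strict;
use warnings;
use CGI::Ex::Fill qw(form_fill);

my $form     = {note => 'hello'};   # illustrative data
my $template = '<!-- <input type=text name=note> --><input type=text name=note>';

# Default: the comment is set aside during the fill and restored untouched.
my $default = form_fill($template, $form);

# With removal disabled, the input inside the comment is filled as well.
my $filled_all;
{
    local $CGI::Ex::Fill::REMOVE_COMMENT = 0;   # restored when the block exits
    $filled_all = form_fill($template, $form);
}

print "$default\n$filled_all\n";
```

Using local keeps the change scoped to one call rather than altering the global for the rest of the program.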

UTILITY FUNCTIONS

html_escape

Very minimal entity escaper for filled in values.

my $escaped = html_escape($unescaped);

html_escape(\$text_to_escape);

get_tagval_by_key

Get a named value from an html tag (key="value" pairs).

my $val     = get_tagval_by_key(\$tag, $key);
my $valsref = get_tagval_by_key(\$tag, $key, 'all'); # get all values

swap_tagval_by_key

Swap out values in an html tag (key="value" pairs).

my $count  = swap_tagval_by_key(\$tag, $key, $val); # modify ref
my $newtag = swap_tagval_by_key($tag, $key, $val);  # copies tag
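
The three helpers might be used together as in the sketch below. The functions are called by their full package names so that nothing is assumed about what the module exports; the tag and values are illustrative.

```perl
use strict;
use warnings;
use CGI::Ex::Fill ();   # no imports; helpers are called by full name

my $tag = '<input type=text name=foo value="old">';

# Read one attribute out of the tag.
my $name = CGI::Ex::Fill::get_tagval_by_key(\$tag, 'name');

# Passing a plain string returns a modified copy and leaves $tag alone.
my $newtag = CGI::Ex::Fill::swap_tagval_by_key($tag, 'value', 'new');

# Escape a value before placing it into markup yourself.
my $safe = CGI::Ex::Fill::html_escape('<b>A & B</b>');

print "$name\n$newtag\n$safe\n";
```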

LICENSE

This module may be distributed under the same terms as Perl itself.

AUTHOR

Paul Seamons <perl at seamons dot com>