Maypole Request Hacking Cookbook

Hacks; design patterns; recipes: call it what you like, this chapter is a developing collection of techniques which can be slotted in to Maypole applications to solve common problems or make the development process easier.

As Maypole developers, we don't necessarily know the "best practice" for developing Maypole applications ourselves, in the same way that Larry Wall didn't know all about the best Perl programming style as soon as he wrote Perl. These techniques are what we're using at the moment, but they may be refined, modularized, or rendered irrelevant over time. Still, they've certainly saved us a good many hours of work.

Frontend hacks

These hacks deal with changing the way Maypole relates to the outside world; alternate front-ends to the Apache and CGI interfaces, or subclassing chunks of the front-end modules to alter Maypole's behaviour in particular ways.

Separate model class modules

You want to put all the BeerDB::Beer routines in a separate module, so you say:

package BeerDB::Beer;
BeerDB::Beer->has_a(brewery => "BeerDB::Brewery");
sub foo :Exported {}

And in BeerDB.pm, you put:

use BeerDB::Beer;

It doesn't work.

Solution: It doesn't work because of the timing of the module loading. use BeerDB::Beer will try to set up the has_a relationships at compile time, when the database tables haven't even been set up, since they're set up by

BeerDB->setup("...")

which does its stuff at runtime. There are two ways around this; you can either move the setup call to compile time, like so:

BEGIN { BeerDB->setup("...") }

or move the module loading to run-time (my preferred solution; the require method called on the class name is the one provided by UNIVERSAL::require):

BeerDB->setup("...");
BeerDB::Beer->require;
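
The ordering problem can be seen without Maypole at all. The following is a minimal sketch in core Perl, using a hypothetical @events array as a log: the BEGIN block (which is when a use statement does its loading) runs before an ordinary statement that appears earlier in the file.

```perl
use strict;
use warnings;

our @events;

# An ordinary statement: first in the file, but it only runs at run time.
# This is the phase in which BeerDB->setup("...") does its work.
push @events, "run time: setup";

# A BEGIN block runs as soon as it has been compiled, which is also
# when a "use" statement loads its module.
BEGIN { push @events, "compile time: use" }

print "$_\n" for @events;
```

The compile-time entry is printed first, which is why the has_a call in the model module runs before the tables exist.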

Debugging with the command line

You're seeing bizarre problems with Maypole output, and you want to test it in some place outside of the whole Apache/mod_perl/HTTP/Internet/browser circus.

Solution: Use the Maypole::CLI module to go directly from a URL to standard output, bypassing Apache and the network altogether.

Maypole::CLI is not a standalone front-end. Instead, so that you can debug your applications without changing the front-end they use, it temporarily "borgs" an application. From the command line, you're expected to use it like so:

perl -MMaypole::CLI=Application -e1 'http://your.server/path/table/action'

For example:

perl -MMaypole::CLI=BeerDB -e1 'http://localhost/beerdb/beer/view/1?o2=desc'

You can also use the Maypole::CLI module programmatically to create test suites for your application. See the Maypole tests themselves or the documentation for Maypole::CLI for examples of this.

Changing how URLs are parsed

You don't like the way Maypole URLs look, and want something that either fits in with the rest of your site or hides the internal workings of the system.

Solution: So far we've been using the /table/action/id/args form of a URL as though it was "the Maypole way"; well, there is no Maypole way. Maypole is just a framework and absolutely everything about it is overridable.

If we want to provide our own URL handling, the method to override in the driver class is parse_path. This is responsible for taking $r->{path} and filling the table, action and args slots of the request object. Normally it does this just by splitting the path on /s, but you can do it any way you want, including getting the information from POST form parameters or session variables.

For instance, if we want our URLs to be of the form ProductDisplay.html?id=123, we could provide a parse_path method like so:

sub parse_path {
    my $r = shift;
    $r->{path} ||= "ProductList.html";
    # Split e.g. "ProductDisplay.html" into "Product" and "Display";
    # this assumes the table name is a single capitalised word.
    ($r->{table}, $r->{action}) = 
        ($r->{path} =~ /^([A-Z][a-z]+)([A-Z]\w+)\.html/);
    $r->{table}  = lc $r->{table};
    $r->{action} = lc $r->{action};
    my %query = $r->{ar}->args;
    $r->{args} = [ $query{id} ];
}

This takes the path, which already has the query parameters stripped off and parsed, and finds the table and action portions of the filename, lower-cases them, and then grabs the id from the query. Later methods will confirm whether or not these tables and actions exist.
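
The filename-splitting step can be tried on its own. This is only a sketch: split_page_name is a hypothetical helper, and its pattern assumes that the table name is a single capitalised word, so that ProductDisplay splits into Product and Display.

```perl
use strict;
use warnings;

# Split a page name such as "ProductDisplay.html" into a lower-cased
# (table, action) pair. Returns an empty list if the name does not match.
sub split_page_name {
    my ($path) = @_;
    my ($table, $action) = $path =~ /^([A-Z][a-z]+)([A-Z]\w+)\.html$/
        or return;
    return (lc $table, lc $action);
}

for my $path ("ProductDisplay.html", "ProductList.html", "index.html") {
    my @parts = split_page_name($path);
    print "$path => ", (@parts ? "@parts" : "(no match)"), "\n";
}
```

Note that a pattern which starts with a lazy (.*?) would leave the table part empty for these names, because the second group is able to match the whole word.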

See BuySpy.pod for another example of custom URL processing.

Maypole for mobile devices

You want Maypole to use different templates to display on particular browsers.

Solution: There are several ways to do this, but here's the neatest we've found. Maypole chooses where to get its templates either by looking at the template_root config parameter or, if this is not given, calling the get_template_root method to ask the front-end to try to work it out. We can give the front-end a little bit of help, by putting this method in our driver class:

sub get_template_root {
    my $r = shift;
    my $browser = $r->{ar}->headers_in->get('User-Agent') || "";
    if ($browser =~ /mobile|palm|nokia/i) {
        "/home/myapp/templates/mobile";
    } else {
        "/home/myapp/templates/desktop";
    }
}

(Maybe there's a better way to detect a mobile browser, but you get the idea.)
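
If you want to test the browser check away from Apache, the decision can be pulled out into a plain function. The sketch below assumes a hypothetical helper called template_root_for and the same example directories as above; a missing User-Agent header is treated as a desktop browser.

```perl
use strict;
use warnings;

# Example template directories; adjust to your own layout.
my %TEMPLATE_ROOT = (
    mobile  => "/home/myapp/templates/mobile",
    desktop => "/home/myapp/templates/desktop",
);

# Choose a template root from a User-Agent string.
sub template_root_for {
    my ($user_agent) = @_;
    $user_agent = "" unless defined $user_agent;
    return $user_agent =~ /mobile|palm|nokia/i
        ? $TEMPLATE_ROOT{mobile}
        : $TEMPLATE_ROOT{desktop};
}

print template_root_for("Nokia6600/1.0"), "\n";
print template_root_for(undef), "\n";
```

get_template_root then only has to fetch the header and pass it along.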

Content display hacks

These hacks deal primarily with the presentation of data to the user, modifying the view template or changing the way that the results of particular actions are displayed.

Null Action

You need an "action" which doesn't really do anything, but just formats up a template.

Solution: There are two ways to do this, depending on what precisely you need. If you just need to display a template, Apache::Template style, with no Maypole objects in it, then you don't need to write any code; just create your template, and it will be available in the usual way.

If, on the other hand, you want to display some data, and what you're essentially doing is a variant of the view action, then you need to ensure that you have an exported action, as described in StandardTemplates.pod:

sub my_view :Exported { }

Template Switcheroo

An action doesn't have any data of its own to display, but needs to display something.

Solution: This is an extremely common hack. You've just issued an action like beer/do_edit, which updates the database. You don't want to display a page that says "Record updated" or similar. Lesser application servers would issue a redirect to have the browser request /beer/view/id instead, but we can actually modify the Maypole request on the fly and, after doing the update, pretend that we were going to /beer/view/id all along. We do this by setting the objects in the objects slot and changing the template to the one we wanted to go to.

In this example from Flox.pod, we've just performed an accept method on a Flox::Invitation object and we want to go back to viewing a user's page.

sub accept :Exported {
    my ($self, $r) = @_;
    my $invitation = $r->objects->[0];
    # [... do stuff to $invitation ...]
    $r->{objects} = [$r->{user}];
    $r->{model_class} = "Flox::User";
    $r->{template} = "view";
}

This hack is so common that it's expected that there'll be a neater way of doing this in the future.

XSLT

Here's a hack I've used a number of times. You want to store structured data in a database and to abstract out its display.

Solution: You have your data as XML, because handling big chunks of XML is a solved problem. Build your database schema as usual around the important elements that you want to be able to search and browse on. For instance, I have an XML format for songs which has a header section of the key, title and so on, plus another section for the lyrics and chords:

   <song>
       <header>
           <title>Layla</title>
           <artist>Derek and the Dominos</artist>
           <key>Dm</key>
       </header>
       <lyrics>
         <verse>...</verse>
         <chorus>
           <line> <sup>A</sup>Lay<sup>Dm</sup>la <sup>Bb</sup> </line> 
           <line> <sup>C</sup>Got me on my <sup>Dm</sup>knees </line> 
           ...

I store the title, artist and key in the database, as well as an "xml" field which contains the whole song as XML.

To load the songs into the database, I can use the driver class for my application, since that's a handy way of setting up the database classes we're going to need to use. Then the handy XML::TreeBuilder will handle the XML parsing for us:

use Songbook;
use XML::TreeBuilder;
my $t = XML::TreeBuilder->new;
$t->parse_file("songs.xml");

for my $song ($t->find("song")) {
    my ($key) = $song->find("key"); $key &&= $key->as_text;
    my ($title) = $song->find("title"); $title = $title->as_text;
    my ($artist) = $song->find("artist"); $artist = $artist->as_text;
    my ($first_line) = $song->find("line");
    $first_line = join "", grep { !ref } $first_line->content_list;
    $first_line =~ s/[,\.\?!]\s*$//;
    Songbook::Song->find_or_create({
        title => $title,
        first_line => $first_line,
        song_key => Songbook::SongKey->find_or_create({name => $key}),
        artist => Songbook::Artist->find_or_create({name => $artist}),
        xml => $song->as_XML
    });
}

Now we need to set up the custom display for each song; thankfully, with the Template::Plugin::XSLT module, this is as simple as putting the following into templates/song/view:

[%
    USE transform = XSLT("song.xsl");
    song.xml | $transform
%]

We essentially pipe the XML for the selected song through to an XSL transformation, and this will fill out all the HTML we need. Job done.

Displaying pictures

You want to serve a picture, a Word document, or something else which doesn't have a content type of text/html, out of your database.

Solution: Fill the content and content-type yourself.

Here's a subroutine which displays the photo for either a specified user or the currently logged in user. We set the output slot of the Maypole request object: if this is done then the view class is not called upon to process a template, since we already have some output to display. We also set the content_type using one from the database.

sub view_picture :Exported {
    my ($self, $r) = @_;
    my $user = $r->{objects}->[0];
    $r->{content_type} = $user->photo_type;
    $r->{output} = $user->photo;
}

Of course, the file doesn't necessarily need to be in the database itself; if your file is stored in the filesystem, but you have a file name or some other pointer in the database, you can still arrange for the data to be fetched and inserted into $r->{output}.
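
As a rough sketch of the filesystem variant: the helpers slurp_binary and output_from_file below are hypothetical, the request is represented by a plain hash reference, and in a real action the path and content type would come from columns on the row being displayed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file as raw bytes. Dies if the file cannot be opened.
sub slurp_binary {
    my ($path) = @_;
    open my $fh, "<:raw", $path or die "Cannot open $path: $!";
    local $/;    # no record separator: read everything in one go
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Fill the content_type and output slots of a request from a file.
sub output_from_file {
    my ($r, $path, $content_type) = @_;
    $r->{content_type} = $content_type;
    $r->{output}       = slurp_binary($path);
    return $r;
}

# Demonstration with a temporary file standing in for a stored photo.
my ($tmp, $name) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "GIF89a\x00\x01";
close $tmp;

my $request = output_from_file({}, $name, "image/gif");
print length($request->{output}), " bytes of $request->{content_type}\n";
```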

REST

You want to provide a programmatic interface to your Maypole site.

Solution: The best way to do this is with REST, which uses a descriptive URL to encode the request. For instance, in Flox.pod we describe a social networking system. One neat thing you can do with social networks is to use them for reputation tracking, and we can use that information for spam detection. So if a message arrives from person@someco.com, we want to know if they're in our network of friends or not and mark the message appropriately. We'll do this by having a web agent (say, WWW::Mechanize or LWP::UserAgent) request a URL of the form http://flox.simon-cozens.org/user/relationship_by_email/person%40someco.com. Naturally, they'll need to present the appropriate cookie just like a normal browser, but that's a solved problem. We're just interested in the REST request.

The request will return a single integer status code: 0 if they're not in the system at all, 1 if they're in the system, and 2 if they're our friend.

All we need to do to implement this is provide the relationship_by_email action, and use it to fill in the output in the same way as we did when displaying a picture. Since person%40someco.com is not the ID of a row in the user table, it will appear in the args array:

use URI::Escape;
sub relationship_by_email :Exported {
    my ($self, $r) = @_;
    my $email = uri_unescape($r->{args}[0]);
    $r->{content_type} = "text/plain";
    my $user;
    unless (($user) = Flox::User->search(email => $email)) {
        $r->{output} = "0\n"; return;
    }

    if ($r->{user}->is_friend($user)) { $r->{output} = "2\n"; return; };
    $r->{output} = "1\n"; return;
}

Component-based Pages

You're designing something like a portal site which has a number of components, all displaying different bits of information about different objects. You want to include the output of one Maypole request call while building up another.

Solution: Use Maypole::Component. By inheriting from this, you can call the component method on the Maypole request object to make a "sub-request". For instance, if you have a template

<DIV class="latestnews">
[% request.component("/news/latest_comp") %]
</DIV>

<DIV class="links">
[% request.component("/links/list_comp") %]
</DIV>

then the results of calling the /news/latest_comp action and template will be inserted in the latestnews DIV, and the results of calling /links/list_comp will be placed in the links DIV. Naturally, you're responsible for exporting actions and creating templates which return fragments of HTML suitable for inserting into the appropriate locations.

Alternatively, if you've already got all the objects you need, you can probably just [% PROCESS %] the templates directly.

Bailing out with an error

Maypole's error handling sucks. Something really bad has happened to the current request, and you want to stop processing now and tell the user about it.

Solution: Maypole's error handling sucks because you haven't written it yet. Maypole doesn't know what you want to do with an error, so it doesn't guess. One common thing to do is to display a template with an error message in it somewhere.

Put this in your driver class:

sub error { 
    my ($r, $message) = @_;
    $r->{template} = "error";
    $r->{template_args}{error} = $message;
    return OK;
}

And then have a custom/error template like so:

[% PROCESS header %]
<H2> There was some kind of error... </H2>
<P>
I'm sorry, something went so badly wrong, we couldn't recover. This
may help:
</P>
<DIV CLASS="messages"> [% error %] </DIV>

Now in your actions you can say things like this:

if (1 == 0) { return $r->error("Sky fell!") }

This essentially uses the template switcheroo hack to always display the error template, while populating the template with an error parameter. Since you return $r->error, this will terminate the processing of the current action.

The really, really neat thing about this hack is that since error returns OK, you can even use it in your authenticate routine:

sub authenticate {
    my ($self, $r) = @_;
    $r->get_user;
    return $r->error("You do not exist. Go away.")
        if $r->{user} and $r->{user}->status ne "real";
    ...
}

This will bail out processing the authentication, the model class, and everything, and just skip to displaying the error message.

Non-showstopper errors or other notifications are best handled by tacking a messages template variable onto the request:

if ((localtime)[6] == 1) {
    push @{$r->{template_args}{messages}}, "Warning: Today is Monday";
}

Now custom/messages can contain:

[% IF messages %]
<DIV class="messages">
<UL>
    [% FOR message = messages %]
       <LI> [% message %] </LI>
    [% END %]
</UL>
</DIV>
[% END %]

And you can display messages to your user by adding PROCESS messages at an appropriate point in your template; you may also want to use a template switcheroo to ensure that you're displaying a page that has the messages box in it.
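
If several actions add notifications, it may be tidier to wrap the push in a small helper. This is a sketch only: add_message is a hypothetical name, and a plain hash reference stands in for the request object.

```perl
use strict;
use warnings;

# Append a notification to the request's template arguments. The
# messages array is created on first use by autovivification.
sub add_message {
    my ($r, $text) = @_;
    push @{ $r->{template_args}{messages} }, $text;
    return scalar @{ $r->{template_args}{messages} };
}

my $request = {};    # stand-in for the Maypole request object
add_message($request, "Warning: Today is Monday");
add_message($request, "Your password expires soon");
print "$_\n" for @{ $request->{template_args}{messages} };
```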

Authentication hacks

The next series of hacks deals with providing the concept of a "user" for a site, and what you do with one when you've got one.

Logging In

You need the concept of a "current user".

Solution: Use something like Maypole::Authentication::UserSessionCookie to authenticate a user against a user class and store a current user object in the request object.

UserSessionCookie provides the get_user method which tries to get a user object, either based on the cookie for an already authenticated session, or by comparing username and password form parameters against a user table in the database. Its behaviour is highly customizable, so see the documentation, or the authentication paper at http://maypole.simon-cozens.org/docs/authentication.html for examples.

Pass-through login

You want to intercept a request from a non-logged-in user and have them log in before sending them on their way to wherever they were originally going.

Solution:

sub authenticate {
    my ($self, $r) = @_;
    $r->get_user;
    return OK if $r->{user};
    # Force them to the login page.
    $r->{template} = "login";
    return OK;
}

This will display the login template, which should look something like this:

[% INCLUDE header %]

  <h2> You need to log in </h2>

<DIV class="login">
[% IF login_error %]
   <FONT COLOR="#FF0000"> [% login_error %] </FONT>
[% END %]
  <FORM ACTION="/[% request.path%]" METHOD="post">
Username: 
    <INPUT TYPE="text" NAME="[% config.auth.user_field || "user" %]"> <BR>
Password: <INPUT TYPE="password" NAME="password"> <BR>
<INPUT TYPE="submit">
</FORM>
</DIV>

Notice that this request gets POSTed back to wherever it came from, using request.path. This is because if the user submits correct credentials, get_user will now return a valid user object, and the request will pass through unhindered to the original URL.

Logging Out

Now that your users are logged in, you want a way of having them log out again and taking the authentication cookie away from them, sending them back to the front page as an unprivileged user.

Solution: This action, on the user class, is probably overkill, but it does the job:

sub logout :Exported {
    my ($class, $r) = @_;
    # Remove the user from the request object
    my $user = delete $r->{user};
    # Destroy the session
    tied(%{$r->{session}})->delete;
    # Send a new cookie which expires the previous one
    my $cookie = Apache::Cookie->new($r->{ar},
        -name => $r->config->{auth}{cookie_name},
        -value => undef,
        -path => "/",
        -expires => "-10m"
    );
    $cookie->bake();
    # Template switcheroo
    $r->template("frontpage");
}

Multi-level Authentication

You have both a global site access policy (for instance, requiring a user to be logged in except for certain pages) and a policy for particular tables (only allowing an admin to delete records in some tables, say, or not wanting people to get at the default set of methods provided by the model class).

You don't know whether to override the global authenticate method or provide one for each class.

Solution: Do both. Have a global authenticate method which calls a sub_authenticate method based on the class:

sub authenticate {
    ...
    if ($r->{user}) {
        return $r->model_class->sub_authenticate($r)
            if $r->model_class->can("sub_authenticate");
        return OK;
    }
    ...
}

And now your sub_authenticate methods can specify the policy for each table:

sub sub_authenticate { # Only allow issue, accept, reject and do_edit
    my ($self, $r) = @_;
    return OK if $r->{action} =~ /^(issue|accept|reject|do_edit)$/;
    return;
}
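
The dispatch can be exercised outside Maypole with stand-in classes. Everything in this sketch is hypothetical: the My::Invitation and My::News packages, the plain hash used as a request, and the OK and DECLINED constants, whose values here are placeholders for the ones your front-end supplies. Unlike the method above, which returns an empty value, this per-table policy returns DECLINED for anything it does not allow.

```perl
use strict;
use warnings;

# Stand-ins for the constants normally supplied by the front-end.
use constant { OK => 0, DECLINED => -1 };

package My::Invitation;    # a model class with its own policy

sub sub_authenticate {
    my ($class, $r) = @_;
    return main::OK() if $r->{action} =~ /^(issue|accept|reject|do_edit)$/;
    return main::DECLINED();
}

package My::News;          # a model class with no policy of its own

package main;

# Global policy: require a user, then defer to the class if it cares.
sub authenticate {
    my ($r) = @_;
    return DECLINED unless $r->{user};
    my $class = $r->{model_class};
    return $class->sub_authenticate($r) if $class->can("sub_authenticate");
    return OK;
}

for my $case ([ "My::Invitation", "accept" ],
              [ "My::Invitation", "delete" ],
              [ "My::News",       "delete" ]) {
    my ($class, $action) = @$case;
    my $r = { user => "simon", model_class => $class, action => $action };
    printf "%-15s %-7s %s\n", $class, $action,
        (authenticate($r) == OK ? "allowed" : "refused");
}
```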

Creating and editing hacks

These hacks particularly deal with issues related to the do_edit built-in action.

Limiting data for display

You want the user to be able to type in some text that you're later going to display on the site, but you don't want them to stick images in it, launch cross-site scripting attacks or otherwise insert messy HTML.

Solution: Use the CGI::Untaint::html module to sanitize the HTML on input. CGI::Untaint::html uses HTML::Sanitizer to ensure that tags are properly closed and can restrict the use of certain tags and attributes to a pre-defined list.

Simply replace:

App::Table->untaint_columns(
    text      => [qw/name description/]
);

with:

App::Table->untaint_columns(
    html      => [qw/name description/]
);

And incoming HTML will be checked and cleaned before it is written to the database.

Getting data from external sources

You want to supplement the data received from a form with additional data from another source.

Solution: Munge the contents of $r->params before jumping to the original do_edit routine. For instance, in this method, we use a Net::Amazon object to fill in some fields of a database row based on an ISBN:

sub create_from_isbn :Exported {
   my ($self, $r) = @_;
   # $ua is a Net::Amazon object created elsewhere
   my $response = $ua->search(asin => $r->{params}{isbn});
   my ($prop) = $response->properties;
   # Rewrite the CGI parameters with the ones from Amazon
   @{$r->{params}}{qw(title publisher author year)} =
       ($prop->title,
       $prop->publisher,
       (join "/", $prop->authors()),
       $prop->year());
   # And jump to the usual edit/create routine
   $self->do_edit($r);
}

The request will carry on as though it were a normal do_edit POST, but with the additional fields we have provided.
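
The parameter rewriting relies on a hash slice over a nested hash reference, which is easy to get wrong. Here is a minimal sketch of just that step, with a plain hash standing in for the request and made-up values standing in for whatever the external source returned:

```perl
use strict;
use warnings;

my $request = { params => { isbn => "0000000000" } };   # stand-in request

# Made-up values standing in for the result of an external lookup.
my %found = (
    title     => "Example Title",
    publisher => "Example Press",
    authors   => [ "A. Author", "B. Writer" ],
    year      => 2004,
);

# Hash slice on a hash reference: @{ EXPR }{ LIST } assigns several
# keys at once, pairing keys and values in order.
@{ $request->{params} }{qw(title publisher author year)} = (
    $found{title},
    $found{publisher},
    join("/", @{ $found{authors} }),
    $found{year},
);

print "$_ = $request->{params}{$_}\n" for sort keys %{ $request->{params} };
```

The existing isbn parameter is left in place; only the four named keys are written.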

Catching errors in a form

A user has submitted erroneous input to an edit/create form. You want to send them back to the form with errors displayed against the erroneous fields, but have the other fields maintain the values that the user submitted.

Solution: This is basically what the default edit template and do_edit method conspire to do, but it's worth highlighting again how they work.

If there are any errors, these are placed in a hash, with each error keyed to the erroneous field. The hash is put into the template as errors, and we process the same edit template again:

$r->{template_args}{errors} = \%errors;
$r->{template} = "edit";

This throws us back to the form, and so the form's template should take note of the errors, like so:

 FOR col = classmetadata.columns;
    NEXT IF col == "id";
    "<P>";
    "<B>"; classmetadata.colnames.$col; "</B>";
    ": ";
        item.to_field(col).as_HTML;
    "</P>";
    IF errors.$col;
        "<FONT COLOR=\"#ff0000\">"; errors.$col; "</FONT>";
    END;
END;
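
How the errors hash gets filled depends on your validation. As a rough sketch, assuming two hypothetical required fields, a hypothetical validate helper, and a plain hash standing in for the request object, it might be built like this before switching back to the edit template:

```perl
use strict;
use warnings;

# Hypothetical required fields and the message to show when one is blank.
my %REQUIRED = (
    forename => "Please give a first name",
    surname  => "Please give a last name",
);

# Return a hash of field name => error message, for bad fields only.
sub validate {
    my ($params) = @_;
    my %errors;
    for my $field (keys %REQUIRED) {
        my $value = $params->{$field};
        $errors{$field} = $REQUIRED{$field}
            unless defined $value && $value =~ /\S/;
    }
    return %errors;
}

my $request = { params => { forename => "Simon", surname => "  " } };
my %errors  = validate($request->{params});
if (%errors) {
    # Same two steps as above: hand over the errors, redisplay the form.
    $request->{template_args}{errors} = \%errors;
    $request->{template} = "edit";
}
print "$_: $errors{$_}\n" for sort keys %errors;
```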

If we're designing our own templates, instead of using generic ones, we can make this process a lot simpler. For instance:

<TR><TD>
First name: <INPUT TYPE="text" NAME="forename">
</TD>
<TD>
Last name: <INPUT TYPE="text" NAME="surname">
</TD></TR>

[% IF errors.forename OR errors.surname %]
    <TR>
    <TD><SPAN class="error">[% errors.forename %]</SPAN> </TD>
    <TD><SPAN class="error">[% errors.surname %]</SPAN> </TD>
    </TR>
[% END %]

The next thing we want to do is to put the originally-submitted values back into the form. We can do this relatively easily because Maypole passes the Maypole request object to the form, and the POST parameters are going to be stored in a hash as request.params. Hence:

<TR><TD>
First name: <INPUT TYPE="text" NAME="forename"
VALUE="[%request.params.forename%]">
</TD>
<TD>
Last name: <INPUT TYPE="text" NAME="surname"
VALUE="[%request.params.surname%]"> 
</TD></TR>

Finally, we might want to only re-fill a field if it is not erroneous, so that we don't get the same bad input resubmitted. This is easy enough:

<TR><TD>
First name: <INPUT TYPE="text" NAME="forename"
VALUE="[%request.params.forename UNLESS errors.forename%]">
</TD>
<TD>
Last name: <INPUT TYPE="text" NAME="surname"
VALUE="[%request.params.surname UNLESS errors.surname%]"> 
</TD></TR>

Uploading files and other data

You want the user to be able to upload files to store in the database.

Solution: It's messy.

First, we set up an upload form, in an ordinary dummy action. Here's the action:

sub upload_picture : Exported {}

And here's the template:

<FORM action="/user/do_upload" enctype="multipart/form-data" method="POST">

<P> Please provide a picture in JPEG, PNG or GIF format:
</P>
<INPUT TYPE="file" NAME="picture">
<BR>
<INPUT TYPE="submit">
</FORM>

(Although you'll probably want a bit more HTML around it than that.)

Now we need to write the do_upload action. At this point we have to get a little friendly with the front-end system. If we're using Apache::Request, then the upload method of the Apache::Request object (which Apache::MVC helpfully stores in $r->{ar}) will work for us:

sub do_upload :Exported {
    my ($class, $r) = @_;
    my $user = $r->{user};
    my $upload = $r->{ar}->upload("picture");

This returns an Apache::Upload object, which we can query for its content type and a file handle from which we can read the data. It's also worth checking the image isn't going to be too massive before we try reading it and running out of memory, and that the content type is something we're prepared to deal with.

if ($upload) {
    my $ct = $upload->info("Content-type");
    return $r->error("Unknown image file type $ct")
        if $ct !~ m{image/(jpeg|gif|png)};
    return $r->error("File too big! Maximum size is ".MAX_IMAGE_SIZE)
        if $upload->size > MAX_IMAGE_SIZE;

    my $fh = $upload->fh;
    my $image = do { local $/; <$fh> };

Now we can store the content type and data into our database, store it into a file, or whatever:

    $r->{user}->photo_type($ct);
    $r->{user}->photo($image);
}

And finally, we use our familiar template switcheroo hack to get back to a useful page:

    $r->objects([ $user ]);
    $r->{template} = "view";
}
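
The size check above relies on the upload object reporting its size. Where all you have is a filehandle, a bounded read gives the same protection. The following is a sketch under that assumption; read_limited is a hypothetical helper and the MAX_IMAGE_SIZE value is only an example.

```perl
use strict;
use warnings;

use constant MAX_IMAGE_SIZE => 512 * 1024;   # example limit, in bytes

# Read at most $limit bytes from a filehandle. Returns the data, or
# nothing (undef in scalar context) if the stream is longer than $limit.
sub read_limited {
    my ($fh, $limit) = @_;
    binmode $fh;
    my $data = "";
    while (1) {
        # Ask for one byte more than the limit so that an oversized
        # stream can be detected without reading all of it.
        my $want = $limit + 1 - length($data);
        my $got  = read($fh, $data, $want, length($data));
        die "Read failed: $!" unless defined $got;
        last if $got == 0;                     # end of file
        return if length($data) > $limit;      # too big: stop early
    }
    return $data;
}

# Demonstration with an in-memory file standing in for an upload.
my $bytes = "x" x 100;
open my $in, "<", \$bytes or die "Cannot open in-memory file: $!";
my $image = read_limited($in, MAX_IMAGE_SIZE);
print(defined($image) ? "read " . length($image) . " bytes\n" : "too big\n");
```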

Now, as we've mentioned, this only works because we're getting familiar with Apache::Request and its Apache::Upload objects. If we're planning to use CGI::Maypole instead, or want to write our application in a generic way so that it'll work regardless of front-end, then we need to replace the upload call with an equivalent which uses the CGI module to get the upload data. This is convoluted and horrific and we're not going to show it here, but it's possible.

Combine with the "Displaying pictures" hack above for a happy time.