NAME

HTML::Mason::Devel - Mason Developer's Manual

DESCRIPTION

This manual is written for content developers who know HTML and at least a little Perl. Its goal is to help you write, run, and debug Mason components.

If you are the webmaster (or otherwise responsible for the Mason installation), you should also read HTML::Mason::Admin. There you will find information about virtual site configuration, performance tuning, component caching, and so on.

If you are a developer just interested in knowing more about Mason's capabilities and implementation, then HTML::Mason::Admin is for you too.

We strongly suggest that you have a working Mason to play with as you work through these examples. Other component examples can be found in the samples/ directory.

While Mason can be used for tasks besides implementing a dynamic web site, that is what most people want to do with Mason, and is thus the focus of this manual.

If you are planning to use Mason outside of the web, this manual will still be useful, of course. Also make sure to read the NON-WEB MASON section of the Administrator's Guide (HTML::Mason::Admin).

HOW TO USE THIS MANUAL

If you are just learning Mason and want to get started quickly, we recommend the following sections:

o What Are Components?

o In-Line Perl Sections

o Calling Components

o Top-Level Components

o Passing Parameters

o Initialization and Cleanup (mainly <%init>)

o Web-Specific Features

o Common Traps

WHAT ARE COMPONENTS?

The component - a mix of Perl and HTML - is Mason's basic building block and computational unit. Under Mason, web pages are formed by combining the output from multiple components. An article page for a news publication, for example, might call separate components for the company masthead, ad banner, left table of contents, and article body. Consider this layout sketch:

+---------+------------------+
|Masthead | Banner Ad        |
+---------+------------------+
|         |                  |
|+-------+|Text of Article ..|
||       ||                  |
||Related||Text of Article ..|
||Stories||                  |
||       ||Text of Article ..|
|+-------+|                  |
|         +------------------+
|         | Footer           |
+---------+------------------+

The top level component decides the overall page layout, perhaps with HTML tables. Individual cells are then filled by the output of subordinate components, one for the Masthead, one for the Footer, etc. In practice pages are built up from as few as one, to as many as twenty or more components.

This component approach reaps many benefits in a web environment. The first benefit is consistency: by embedding standard design elements in components, you ensure a consistent look and make it possible to update the entire site with just a few edits. The second benefit is concurrency: in a multi-person environment, one person can edit the masthead while another edits the table of contents. A last benefit is reusability: a component produced for one site might be useful on another. You can develop a library of generally useful components to employ on your sites and to share with others.

Most components emit chunks of HTML. "Top level" components, invoked from a URL, represent an entire web page. Other, subordinate components emit smaller bits of HTML destined for inclusion in top level components.

Components receive form and query data from HTTP requests. When called from another component, they can accept arbitrary parameter lists just like a subroutine, and optionally return values. This enables a type of component that does not print any HTML, but simply serves as a function, computing and returning a result.

Mason actually compiles components down to Perl subroutines, so you can debug and profile component-based web pages with standard Perl tools that understand the subroutine concept. For example, you can use the Perl debugger to step through components, and Devel::DProf to profile their performance.

IN-LINE PERL SECTIONS

Here is a simple component example:

<%perl>
my $noun = 'World';
my @time = split /[\s:]+/, localtime;
</%perl>
Hello <% $noun %>,
% if ( $time[3] < 12 ) {
good morning.
% } else {
good afternoon.
% }

After 12 pm, the output of this component is:

Hello World, good afternoon.

This short example demonstrates the three primary "in-line" Perl sections. In-line sections are generally embedded within HTML and execute in the order they appear. Other sections (<%init>, <%args>, etc.) are tied to component events like initialization, cleanup, and argument definition.

The parsing rules for these Perl sections are as follows:

  1. Blocks of the form <% xxx %> are replaced with the result of evaluating xxx as a single Perl expression. These are often used for variable replacement, such as 'Hello, <% $name %>!'.

  2. Lines beginning with a '%' character are treated as Perl.

  3. Multiline blocks of Perl code can be inserted with the <%perl> .. </%perl> tag. The enclosed text is executed as Perl and the return value, if any, is discarded.

    The <%perl> tag, like all block tags in Mason, is case-insensitive. It may appear anywhere in the text, and may span any number of lines. <%perl> blocks cannot be nested inside one another.

% lines

Most useful for conditional and loop structures - if, while, foreach, etc. - as well as side-effect commands like assignments. Examples:

o Conditional code

% my $ua = $r->header_in('User-Agent');
% if ($ua =~ /msie/i) {
Welcome, Internet Explorer users
...
% } elsif ($ua =~ /mozilla/i) {
Welcome, Netscape users
...
% }

o HTML list formed from array

<ul>
% foreach my $item (@list) {
<li><% $item %>
% }
</ul>

o HTML list formed from hash

<ul>
% while (my ($key,$value) = each(%ENV)) {
<li>
<b><% $key %></b>: <% $value %>
% }
</ul>

o HTML table formed from list of hashes

<table>
% foreach my $h (@loh) {
<tr>
<td><% $h->{foo} %></td>
<td bgcolor="#ee0000"><% $h->{bar} %></td>
<td><% $h->{baz} %></td>
</tr>
% }
</table>

<% xxx %>

Most useful for printing out variables, as well as more complex expressions. Examples:

Dear <% $name %>: We will come to your house at <% $address %> in the
fair city of <% $city %> to deliver your $<% $amount %> dollar prize!

The answer is <% ($y+8) % 2 %>.

You are <% $age < 18 ? 'not' : '' %> permitted to enter this site.

<%perl> xxx </%perl>

Useful for Perl blocks of more than a few lines.

MASON OBJECTS

This section describes the various objects in the Mason universe. If you're just starting out, all you need to worry about initially are the request objects.

Request Objects

Two global per-request objects are available to all components: $r and $m.

$r, the mod_perl request object, provides a Perl API to the current Apache request. It is fully described in Apache.pod. Here is a sampling of methods useful to component developers:

$r->uri             # the HTTP request URI
$r->header_in(..)   # get the named HTTP header line
$r->content_type    # set or retrieve content-type
$r->header_out(..)  # set or retrieve an outgoing header

$r->content         # don't use this one! (see Tips and Traps)

$m, the Mason request object, provides an analogous API for Mason. Almost all Mason features not activated by syntactic tags are accessed via $m methods. You'll be introduced to these methods throughout this document as they are needed. For a description of all methods see HTML::Mason::Request.

System Objects

Five system objects share the work of serving requests in Mason: Lexer, Compiler, Interp, Resolver, and ApacheHandler. The administrator creates these objects and provides parameters that shape Mason's behavior. As a pure component developer you shouldn't need to worry about or access these objects, but occasionally we'll mention a relevant parameter.

Component Objects

Mason provides an object API for components, allowing you to query a component's various associated files, arguments, etc. For a description of all methods see HTML::Mason::Component. Typically you get a handle on a component object from request methods like $m->current_comp and $m->fetch_comp.

Note that for many basic applications all you'll want to do with components is call them, for which no object method is needed. See the next section.

CALLING COMPONENTS

Mason pages often are built not from a single component, but from multiple components that call each other in a hierarchical fashion.

Components that output HTML

To call one component from another, use the <& &> tag:

<& comp_path, [name=>value, ...] &>

comp_path:

The component path. With a leading '/', the path is relative to the component root (comp_root). Otherwise, it is relative to the location of the calling component.

name => value pairs:

Parameters are passed as one or more name => value pairs, e.g. player => 'M. Jordan'.

comp_path may be a literal string (quotes optional) or a Perl expression that evaluates to a string. To eliminate the need for quotes in most cases, Mason employs some magic parsing: If the first character is one of [A-Za-z0-9/_.], comp_path is assumed to be a literal string running up to the first comma or &>. Otherwise, comp_path is evaluated as an expression.

Here are some examples:

# relative component paths
<& topimage &>
<& tools/searchbox &>

# absolute component path
<& /shared/masthead, color=>'salmon' &>

# this component path MUST have quotes because it contains a comma
<& "sugar,eggs", mix=>1 &>

# variable component path
<& $comp &>

# variable component and arguments
<& $comp, %args &>

# you can use arbitrary expression for component path, but it cannot
# begin with a letter or number; delimit with () to remedy this
<& (int(rand(2)) ? 'thiscomp' : 'thatcomp'), id=>123 &>
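
The first-character rule for comp_path can be sketched in plain Perl. This is only an illustration of the rule as stated above, not Mason's actual parser; the helper name split_comp_call is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mason): classify the text found between
# <& and &> using the first-character rule described above.
sub split_comp_call {
    my ($call) = @_;
    $call =~ s/^\s+|\s+$//g;                 # trim surrounding whitespace
    if ($call =~ m{^[A-Za-z0-9/_.]}) {
        # Literal path: everything up to the first comma (or the end)
        my ($path, $rest) = split /\s*,\s*/, $call, 2;
        return ('literal', $path, defined $rest ? $rest : '');
    }
    # Anything else is left for Perl to evaluate as an expression
    return ('expression', $call, '');
}

for my $call ('topimage', '/shared/masthead, color=>1',
              '$comp, %args', '"sugar,eggs", mix=>1') {
    my ($kind, $path) = split_comp_call($call);
    printf "%-10s %s\n", $kind, $path;
}
```

The quoted "sugar,eggs" path is classified as an expression because it starts with a quote character, which is why the quotes protect the embedded comma.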

Several request methods also exist for calling components. $m->comp performs the equivalent action to <& &>:

$m->comp('/shared/masthead', color=>'salmon');

$m->scomp is like the sprintf version of $m->comp: it returns the component output, allowing the caller to examine and modify it before printing:

my $masthead = $m->scomp('/shared/masthead', color=>'salmon');
$masthead =~ ...;
$m->print($masthead);

Component Calls with Content

Components can be used to filter part of the page's content using an extended component syntax.

<&| /path/to/comp &> this is the content </&>
<&| comp, arg1 => 'hi' &> filters can take arguments </&>
<&| comp &> content can include <% "tags" %> of all kinds </&>
<&| comp1 &> nesting is also <&| comp2 &> OK </&> </&>
<&| SELF:method1 &> subcomponents can be filters </&>

The filtering component can be called in all the same ways a normal component is called, with arguments and so forth. The only difference between a filtering component and a normal component is that a filtering component is expected to fetch the content by calling $m->content and do something with it.

Here is an example of a component used for localization. Its content is a series of strings in different languages, and it selects the correct one based on a global $lang variable, which could be set up in a site-level autohandler.

<&| /i18n/itext &>
   <en>Hello, <% $name %>. This is a string in English</en>
   <de>Schoene Gruesse, <% $name %>, diese Worte sind auf Deutsch</de>
   <pig>ellohay <% substr($name,1).substr($name,0,1).'ay' %>,
   isthay isay igpay atinlay</pig>
</&>

Here is the /i18n/itext component:

<% $text %>

<%init>
# this assumes $lang is a global variable which has been set up earlier.
local $_ = $m->content;
my ($text) = m{<$lang>(.*?)</$lang>}s;   # /s lets an entry span several lines
</%init>
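
The selection step inside /i18n/itext can be tried outside Mason with plain Perl. The sketch below assumes the content has already been fetched into a string; the helper name select_text is hypothetical, and the /s modifier is included so that an entry spanning several lines still matches.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the block for $lang out of the filter content.
sub select_text {
    my ($content, $lang) = @_;
    # \Q...\E quotes the language code; /s lets "." match newlines
    my ($text) = $content =~ m{<\Q$lang\E>(.*?)</\Q$lang\E>}s;
    return $text;                        # undef when the language is missing
}

my $content = "<en>Hello</en>\n<de>Guten\nTag</de>\n";
my $text    = select_text($content, 'de');
print defined $text ? "$text\n" : "(no translation)\n";
```

A real filter would probably want a fallback language for the undef case rather than printing nothing.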

If a filter component is called like a normal component (e.g. <& itext &>), $m->content will return undef. If a normal component which does not call $m->content is called with content, the content will not be output.

If you wrap a filtering component call around the entire component, the result will be functionally similar to a <%filter> section. See also Filtering.

Advanced Component Calls with Content

Internally $m->content is implemented with a closure containing the part of the component which is the content. In English, that means that any Mason tags and Perl code in the content are evaluated when $m->content is called, and $m->content returns the text which would have been output by Mason. Because the contents are evaluated at the time that $m->content is called, one can write components which act as control structures or which output their contents multiple times with different values for the variables (can you say taglibs?).

The tricky part of using filter components as control structures is setting up variables which can be accessed from both the filter component and the content, which is in the component which calls the filter component. The content has access to all variables in the surrounding component, but the filtering component does not. There are two ways to do this: use global variables, or pass a reference to a lexical variable to the filter component.

Here is a simple example using the second method:

% my $var;
<ol>
<&| list_items , list => \@items, var => \$var &>
<li> <% $var %>
</&>
</ol>

list_items component:

<%args>
@list
$var
</%args>
% foreach (@list) {
% $$var = $_;  # $var is a reference
<% $m->content %>
% }

Using global variables can be somewhat simpler. Below is the same example, with $var defined as a global variable. The site administrator must make sure that $var is included in Mason's allow_globals parameter. Local-izing $var within the filter component will allow the list_items component to be nested.

<ol>
<&| list_items, list => \@items &>
<li> <% $var %>
</&>
</ol>

list_items component:

<%args>
@list
</%args>
% foreach (@list) {
% local $var = $_;
<% $m->content %>
% }

Besides remembering to include $var in allow_globals, the developers should take care not to use that variable in other places where it might conflict with usage by the filter component. Local-izing $var will also provide some protection against using it in other places.

An even simpler method is to use the $_ variable. It is already global, and is automatically local-ized by the foreach statement:

<ol>
<&| list_items, list => \@items &>
<li> <% $_ %>
</&>
</ol>

list_items component:

<%args>
@list
</%args>
% foreach (@list) {
<% $m->content %>
% }

Components that compute values

So far you have seen components used solely to output HTML. However, components may also be used to compute a value. For example, you might have a component is_netscape that analyzes the user agent to determine whether it is a Netscape browser:

<%perl>
my $ua = $r->header_in('User-Agent');
return ($ua =~ /Mozilla/i && $ua !~ /MSIE/i) ? 1 : 0;
</%perl>

Because components are implemented underneath with Perl subroutines, they can return values and even understand scalar/list context.

The <& &> notation only calls a component for its side-effect, and discards its return value, if any. To get at the return value of a component, use the $m->comp command:

% if ($m->comp('is_netscape')) {
Welcome, Netscape user!
% }

Mason adds a return undef to the bottom of each component to provide an empty default return value. To return your own value from a component, you must use an explicit return statement. You cannot rely on the usual Perl trick of letting return values "fall through".
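
The effect of the appended return undef can be seen with two ordinary Perl subroutines. They are stand-ins written for illustration (the names are hypothetical); the trailing statement in each mimics what the text above says Mason adds to a compiled component.

```perl
use strict;
use warnings;

sub compute { return 6 * 7 }

# Relies on "fall through": the last evaluated expression would be the
# return value in plain Perl, but the appended statement wins.
sub falls_through {
    compute();
    return undef;       # stands in for the statement Mason appends
}

# An explicit return leaves the sub before the appended statement runs.
sub returns_explicitly {
    return compute();
    return undef;       # never reached
}

print defined(falls_through()) ? "defined\n" : "undef\n";
print returns_explicitly(), "\n";
```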

Generally components are divided into two types: those that output HTML, and those that return a value. There is very little reason for a component to do both. For example, it would not be very friendly for is_netscape to output "hi Mom" while it was computing its value, thereby surprising the if statement! Conversely, any value returned by an HTML component would typically be discarded by the <& &> tag that invoked it.

Subrequests

You may sometimes want to have a component call go through all the steps that the initial component call goes through, such as checking for autohandlers and dhandlers. To do this, you need to execute a subrequest.

A subrequest is simply a Mason Request object and has all of the methods normally associated with one.

To create a subrequest you simply use the $m->make_subrequest method. This method can take any parameters normally given to the Request object's constructor, such as autoflush or out_method. Once you have a new request object you simply call its exec method to execute it, which takes exactly the same parameters as the comp method.

Since subrequests inherit their parent request's parameters, output from a component called via a subrequest goes to the same destination as output from components called during the parent request. Of course, you can change this.

Here are some examples:

<%perl>
 my $req = $m->make_subrequest( comp => '/some/comp', args => [ id => 172 ] );
 $req->exec;
</%perl>

If you want to capture the subrequest's output in a scalar, you can simply pass an out_method parameter to $m->make_subrequest:

<%perl>
 my $buffer;
 my $req =
     $m->make_subrequest
         ( comp => '/some/comp', args => [ id => 172 ], out_method => \$buffer );
 $req->exec;
</%perl>

Now $buffer contains all the output from that call to /some/comp.

For convenience, Mason also provides an $m->subexec method. This method takes the same arguments as $m->comp and internally calls $m->make_subrequest and then exec on the created request, all in one fell swoop. This is useful in cases where you have no need to override any of the parent request object's attributes.

By default, output from a subrequest appears inline in the calling component, at the point where it is executed. If you wish to do something else, you will need to explicitly override the subrequest's out_method parameter.

Mason Request objects are only designed to handle a single call to exec. If you wish to make multiple subrequests, you must create a new subrequest object for each one.

TOP-LEVEL COMPONENTS

The first component invoked for a page (the "top-level component") resides within the DocumentRoot and is chosen based on the URL. For example:

http://www.foo.com/mktg/prods.html?id=372

Mason converts this URL to a filename, e.g. /usr/local/www/htdocs/mktg/prods.html. Mason loads and executes that file as a component. In effect, Mason calls

$m->comp('/mktg/prods.html', id=>372)

This component might in turn call other components and execute some Perl code, or it might contain nothing more than static HTML.

dhandlers

What happens when a user requests a component that doesn't exist? In this case Mason scans backward through the URI, checking each directory for a component named dhandler ("default handler"). If found, the dhandler is invoked and is expected to use $m->dhandler_arg as the parameter to some access function, perhaps a database lookup or location in another filesystem. In a sense, dhandlers are similar in spirit to Perl's AUTOLOAD feature; they are the "component of last resort" when a URL points to a non-existent component.

Consider the following URL, in which newsfeeds/ exists but not the subdirectory LocalNews nor the component Story1:

http://myserver/newsfeeds/LocalNews/Story1

In this case Mason constructs the following search path:

/newsfeeds/LocalNews/Story1         => no such thing
/newsfeeds/LocalNews/dhandler       => no such thing
/newsfeeds/dhandler                 => found! (search ends)
/dhandler

The found dhandler would read "LocalNews/Story1" from $m->dhandler_arg and use it as a retrieval key into a database of stories.
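
The search order above can be reproduced with a few lines of plain Perl. This is a sketch of the lookup order only, not Mason's resolver; the helper name dhandler_search_path is hypothetical, and the dhandler_arg values for candidates other than /newsfeeds/dhandler are extrapolated from the single case given in the text.

```perl
use strict;
use warnings;

# Hypothetical helper: list [component path, dhandler_arg] pairs in the
# order they would be tried, most specific first.
sub dhandler_search_path {
    my ($path) = @_;
    my @dirs = grep { length } split m{/}, $path;
    my @rest;                                # path elements left for dhandler_arg
    my @candidates = ([ $path, undef ]);     # the requested component comes first
    while (@dirs) {
        unshift @rest, pop @dirs;            # move one element from path to arg
        push @candidates,
            [ join('/', '', @dirs, 'dhandler'), join('/', @rest) ];
    }
    return @candidates;
}

for my $c (dhandler_search_path('/newsfeeds/LocalNews/Story1')) {
    printf "%-32s %s\n", $c->[0], defined($c->[1]) ? $c->[1] : '(none)';
}
```

In a real request the search stops at the first candidate that exists, so the later entries are never examined.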

Here's how a simple /newsfeeds/dhandler might look:

    <& header &>
    <b><% $headline %></b><p>
    <% $body %>
    <& footer &>

    <%init>
    my $arg = $m->dhandler_arg;                # get rest of path
    my ($section, $story) = split("/", $arg);  # split out pieces
    my $sth = $DBH->prepare
        (qq{SELECT headline,body FROM news
            WHERE section = ? AND story = ?});
    $sth->execute($section, $story);
    my ($headline, $body) = $sth->fetchrow_array;
    return 404 if !$headline;                  # return "not found" if no such story
    </%init>

By default dhandlers do not get a chance to handle requests to a directory itself (e.g. /newsfeeds). These are automatically deferred to Apache, which generates an index page or a FORBIDDEN error. Often this is desirable, but if necessary the administrator can let in directory requests as well; see the section on allowing directory requests in HTML::Mason::Admin.

A component or dhandler that does not want to handle a particular request may defer control to the next dhandler by calling $m->decline.

The administrator can customize the file name used for dhandlers with the Interp's dhandler_name parameter.

autohandlers

Autohandlers allow you to grab control and perform some action just before Mason calls the top-level component. This might mean adding a standard header and footer, applying an output filter, or setting up global variables.

Autohandlers are directory based. When Mason determines the top-level component, it checks that directory and all parent directories for a component called autohandler. If found, the autohandler is called first. After performing its actions, the autohandler typically calls $m->call_next to transfer control to the original intended component.

$m->call_next works just like $m->comp except that the component path and arguments are implicit. You can pass additional arguments to $m->call_next; these are merged with the original arguments, taking precedence in case of conflict. This allows you, for example, to override arguments passed in the URL.
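
The precedence rule can be pictured as an ordinary Perl hash merge, in which later pairs override earlier ones. The sketch below is an analogy only and makes no claim about how Mason implements $m->call_next; the helper name merge_args is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the merge described above: when a list is
# assigned to a hash, later pairs override earlier ones.
sub merge_args {
    my ($original, $extra) = @_;
    return (%$original, %$extra);
}

my %url_args = (id => 372, debug => 1);   # e.g. parsed from ?id=372&debug=1
my %merged   = merge_args(\%url_args, { debug => 0 });

printf "%s => %s\n", $_, $merged{$_} for sort keys %merged;
```

Here the extra debug => 0 replaces the value that arrived in the URL, while id passes through untouched.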

Here is an autohandler that adds a common header and footer to each page underneath its directory:

<HTML>
<HEAD><TITLE>McHuffy Incorporated</TITLE></HEAD>
<BODY BGCOLOR="salmon">

% $m->call_next;

<HR>
Copyright 1999 McHuffy Inc.
</BODY>
</HTML>

Same idea, using components for the header/footer:

<& /shared/header &>
% $m->call_next;
<& /shared/footer &>

The next autohandler applies a filter to its pages, adding an absolute hostname to relative image URLs:

% $m->call_next;

<%filter>
s{(<img[^>]+src=\")/} {$1http://images.mysite.com/}ig;
</%filter>

Most of the time an autohandler can simply call $m->call_next without needing to know what the next component is. However, should you need it, the component object is available from $m->fetch_next. This is useful for calling the component manually, e.g. if you want to suppress some original arguments or if you want to use $m->scomp to store and process the output.

What happens if more than one autohandler applies to a page? Prior to version 0.85, only the most specific autohandler would execute. In 0.85 and beyond each autohandler gets a chance to run. The top-most autohandler runs first; each $m->call_next transfers control to the next autohandler and finally to the originally called component. This allows you, for example, to combine general site-wide templates and more specific section-based templates.
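
To make the ordering concrete, the following plain Perl sketch lists the autohandler locations that could apply to a page, top-most first, followed by the page itself. It is an illustration of the order described above rather than Mason's own code; the helper name autohandler_chain is hypothetical, and only the candidates that actually exist would run.

```perl
use strict;
use warnings;

# Hypothetical helper: candidate autohandlers for a component path,
# from the component root down to the component's own directory.
sub autohandler_chain {
    my ($path) = @_;
    my @dirs = grep { length } split m{/}, $path;
    pop @dirs;                               # drop the component's file name
    my @chain = ('/autohandler');            # top-most candidate runs first
    my @seen;
    for my $dir (@dirs) {
        push @seen, $dir;
        push @chain, join('/', '', @seen, 'autohandler');
    }
    return (@chain, $path);                  # the requested component runs last
}

print "$_\n" for autohandler_chain('/mktg/prods.html');
```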

Autohandlers can be made even more powerful in conjunction with Mason's object-oriented style features: methods, attributes, and inheritance. In the interest of space these are discussed in a separate section, Object-Oriented Techniques.

The administrator can customize the file name used for autohandlers with the Interp's autohandler_name parameter.

dhandlers vs. autohandlers

dhandlers and autohandlers both provide a way to exert control over a large set of URLs. However, each specializes in a very different application. The key difference is that dhandlers are invoked only when no appropriate component exists, while autohandlers are invoked only in conjunction with a matching component.

As a rule of thumb: use an autohandler when you have a set of components to handle your pages and you want to augment them with a template/filter. Use a dhandler when you want to create a set of "virtual URLs" that don't correspond to any actual components, or to provide default behavior for a directory.

dhandlers and autohandlers can even be used in the same directory. For example, you might have a mix of real URLs and virtual URLs to which you would like to apply a common template/filter.

PASSING PARAMETERS

This section describes Mason's facilities for passing parameters to components (either from HTTP requests or component calls) and for accessing parameter values inside components.

In Component Calls

Any Perl data type can be passed in a component call:

<& /sales/header, s => 'dog', l => [2, 3, 4], h => {a => 7, b => 8} &>

This command passes a scalar ($s), a list (@l), and a hash (%h). The list and hash must be passed as references, but they will be automatically dereferenced in the called component.

In HTTP requests

Consider a CGI-style URL with a query string:

http://www.foo.com/mktg/prods.html?str=dog&lst=2&lst=3&lst=4

or an HTTP request with some POST content. Mason automatically parses the GET/POST values and makes them available to the component as parameters.

Accessing Parameters

Component parameters, whether they come from GET/POST or another component, can be accessed in two ways.

1. Declared named arguments: Components can define an <%args> section listing argument names, types, and default values. For example:

<%args>
$a
@b       # a comment
%c

# another comment
$d => 5
$e => $d*2
@f => ('foo', 'baz')
%g => (joe => 1, bob => 2)
</%args>

Here, $a, @b, and %c are required arguments; the component generates an error if the caller leaves them unspecified. $d, $e, @f and %g are optional arguments; they are assigned the specified default values if unspecified. All the arguments are available as lexically scoped ("my") variables in the rest of the component.

Arguments are separated by one or more newlines. Comments may be used at the end of a line or on their own line.

Default expressions are evaluated in top-to-bottom order, and one expression may reference an earlier one (as $e references $d above).

Only valid Perl variable names may be used in <%args> sections. Parameters with non-valid variable names cannot be pre-declared and must be fetched manually out of the %ARGS hash (see below). One common example of undeclarable parameters is the "button.x/button.y" pair sent for a form submit.

2. %ARGS hash: This variable, always available, contains all of the parameters passed to the component (whether or not they were declared). It is especially handy for dealing with large numbers of parameters, dynamically named parameters, or parameters with non-valid variable names. %ARGS can be used with or without an <%args> section, and its contents are unrelated to what you have declared in <%args>.

Here's how to pass all of a component's parameters to another component:

<& template, %ARGS &>

Parameter Passing Examples

The following examples illustrate the different ways to pass and receive parameters.

1. Passing a scalar id with value 5.

In a URL: /my/URL?id=5
In a component call: <& /my/comp, id => 5 &>
In the called component, if there is a declared argument named...
  $id, then $id will equal 5
  @id, then @id will equal (5)
  %id, then an error occurs
In addition, $ARGS{id} will equal 5.

2. Passing a list colors with values red, blue, and green.

In a URL: /my/URL?colors=red&colors=blue&colors=green
In a component call: <& /my/comp, colors => ['red', 'blue', 'green'] &>
In the called component, if there is a declared argument named...
  $colors, then $colors will equal ['red', 'blue', 'green']
  @colors, then @colors will equal ('red', 'blue', 'green')
  %colors, then an error occurs
In addition, $ARGS{colors} will equal ['red', 'blue', 'green'].

3. Passing a hash grades with pairs Alice => 92 and Bob => 87.

In a URL: /my/URL?grades=Alice&grades=92&grades=Bob&grades=87
In a component call: <& /my/comp, grades => {Alice => 92, Bob => 87} &>
In the called component, if there is a declared argument named...
  @grades, then @grades will equal ('Alice', 92, 'Bob', 87)
  %grades, then %grades will equal (Alice => 92, Bob => 87)
In addition, $grades and $ARGS{grades} will equal
  ['Alice',92,'Bob',87] in the URL case, or {Alice => 92, Bob => 87}
  in the component call case.  (The discrepancy exists because, in a
  query string, there is no detectable difference between a list or
  hash.)
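
The three examples follow one pattern, which the plain Perl sketch below tries to capture. It mirrors the outcomes listed above and is not Mason's implementation of <%args>; the helper name coerce_arg is hypothetical, and the key order produced when a hash is flattened into a list is not guaranteed.

```perl
use strict;
use warnings;

# Hypothetical helper: given the sigil of a declared argument and the value
# that was passed, return what the declared variable would hold.
sub coerce_arg {
    my ($sigil, $value) = @_;
    my $type = ref($value);

    return $value if $sigil eq '$';            # scalars receive the value as-is

    if ($sigil eq '@') {
        return @$value if $type eq 'ARRAY';    # list reference is dereferenced
        return %$value if $type eq 'HASH';     # hash flattens to key/value pairs
        return ($value);                       # plain scalar becomes a one-element list
    }

    if ($sigil eq '%') {
        return %$value if $type eq 'HASH';
        return @$value if $type eq 'ARRAY' && scalar(@$value) % 2 == 0;
        die "cannot assign this value to a hash argument\n";
    }

    die "unknown sigil '$sigil'\n";
}

my @colors = coerce_arg('@', ['red', 'blue', 'green']);
my %grades = coerce_arg('%', ['Alice', 92, 'Bob', 87]);
print "colors: @colors\n";
print "Bob: $grades{Bob}\n";
print "error: $@" unless eval { my %id = coerce_arg('%', 5); 1 };
```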

Using @_ instead

If you don't like named parameters, you can pass a traditional list of ordered parameters:

<& /mktg/prods.html, 'dog', [2, 3, 4], {a => 7, b => 8} &>

and access them as usual through Perl's @_ array:

my ($scalar, $listref, $hashref) = @_;

In this case no <%args> section is necessary.

We generally recommend named parameters for the benefits of readability, syntax checking, and default value automation. However using @_ may be convenient for very small components, especially subcomponents created with <%def>.

INITIALIZATION AND CLEANUP

The following sections contain blocks of Perl to execute at specific times.

<%init>

This section contains initialization code that executes as soon as the component is called. For example: checking that a user is logged in; selecting rows from a database into a list; parsing the contents of a file into a data structure.

Technically an <%init> block is equivalent to a <%perl> block at the beginning of the component. However, there is an aesthetic advantage to placing this block at the end of the component rather than the beginning.

We've found that the most readable components (especially for non-programmers) contain HTML in one continuous block at the top, with simple substitutions for dynamic elements but no distracting blocks of Perl code. At the bottom an <%init> block sets up the substitution variables. This organization allows non-programmers to work with the HTML without getting distracted or discouraged by Perl code. For example:

<html>
<head><title><% $headline %></title></head>
<body>
<h2><% $headline %></h2>
By <% $author %>, <% $date %><p>

<% $body %>

</body></html>

<%init>
# Fetch article from database
my $dbh = DBI->connect(...);
my $sth = $dbh->prepare("select * from articles where id = ?");
$sth->execute($article_id);
my ($headline, $date, $author, $body) = $sth->fetchrow_array;
# Massage the fields
$headline = uc($headline);
my ($year, $month, $day) = split('-', $date);
$date = "$month/$day";
</%init>

<%args>
$article_id
</%args>

<%cleanup>

This section contains cleanup code that executes just before the component exits. For example: closing a database connection or closing a file handle.

Technically a <%cleanup> block is equivalent to a <%perl> block at the end of the component. Since a component corresponds to a subroutine block, and since Perl is so darned good at cleaning up stuff at the end of blocks, <%cleanup> sections are rarely needed.

<%once>

This code executes once when the component is loaded. Variables declared in this section can be seen in all of a component's code and persist for the lifetime of the component.

This section is useful for declaring persistent component-scoped lexical variables (especially objects that are expensive to create), declaring subroutines (both named and anonymous), and initializing state.

This code does not run inside a request context. You cannot call components or access $m or $r from this section. Also, do not attempt to return() from a <%once> section; the current compiler cannot properly handle it.

Normally this code will execute individually from every HTTP child that uses the component. However, if the component is preloaded, this code will only execute once in the parent. Unless you have total control over what components will be preloaded, it is safest to avoid initializing variables that can't survive a fork(), e.g. DBI handles. Use code like this to initialize such variables in the <%init> section:

<%once>
my $dbh;      # declare but don't assign
...
</%once>

<%init>
$dbh ||= DBI->connect(...);
...
</%init>

In addition, using $m or $r in this section will not work in a preloaded component, because neither of those variables exists when a component is preloaded.
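
The declare-once, assign-on-first-use pattern shown above behaves like a closure in plain Perl. The sketch below is an analogy under that assumption: the lexical $handle plays the role of a <%once> variable, the body of handle_request plays the role of <%init>, and expensive_connect is a hypothetical stand-in for a costly constructor such as a database connect.

```perl
use strict;
use warnings;

my $connections = 0;

# Hypothetical stand-in for an expensive constructor.
sub expensive_connect {
    $connections++;
    return +{ pid => $$ };
}

{
    my $handle;                            # declared once, never assigned here
    sub handle_request {
        $handle ||= expensive_connect();   # created on first use in this process
        return $handle;
    }
}

handle_request() for 1 .. 3;
print "connections made: $connections\n";  # prints "connections made: 1"
```

Because the assignment happens on first use, each forked child would create its own handle instead of inheriting one built in the parent.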

<%shared>

As with <%once>, variables declared in this section can be seen in all of a component's code: the main component, subcomponents, and methods. However, the code runs once per request (whenever the component is used) and its variables last only until the end of the request.

Useful for initializing variables needed in, say, the main body and one more subcomponents or methods. See Object-Oriented Techniques for an example of usage.

It's important to realize that you do not have access to the %ARGS hash or variables created via an <%args> block inside a shared section. However, you can access arguments via the $m->request_args method, documented in the Request docs.

Avoid using <%shared> for side-effect code that needs to run at a predictable time during page generation. You may assume only that <%shared> runs just before the first code that needs it and runs at most once per request. <%init> offers a more predictable execution time.

Any component with a <%shared> section incurs an extra performance penalty, because (as currently implemented) Mason must recreate its anonymous subroutines the first time each new request uses the component. The exact penalty varies between systems and for most applications will be unnoticeable. However, one should avoid using <%shared> when patently unnecessary, e.g. when an <%init> would work as well.

Do not attempt to return() from a <%shared> section; the current compiler cannot properly handle it.

EMBEDDED COMPONENTS

<%def name>

Each instance of this section creates a subcomponent embedded inside the current component. Inside you may place anything that a regular component contains, with the exception of <%def>, <%method>, <%once>, and <%shared> tags.

The name consists of characters in the set [A-Za-z0-9._-]. To call a subcomponent simply use its name in <& &> or $m->comp. A subcomponent can only be seen from the surrounding component.

If you define a subcomponent with the same name as a file-based component in the current directory, the subcomponent takes precedence. You would need to use an absolute path to call the file-based component. To avoid this situation and for general clarity, we recommend that you pick a unique way to name all of your subcomponents that is unlikely to interfere with file-based components. For example, you could start subcomponent names with ".".

While inside a subcomponent, you may use absolute or relative paths to call file-based components and also call any of your "sibling" subcomponents.

The lexical scope of a subcomponent is separate from the main component. However a subcomponent can declare its own <%args> section and have relevant values passed in. You can also use a <%shared> section to declare variables visible from both scopes.

In the following example, we create a ".link" subcomponent to produce a standardized hyperlink:

<%def .link>
<font size="4" face="Verdana,Arial,Helvetica">
<a href="http://www.<% $site %>.com"><% $label %></a>
</font><br>
<%args>
$site
$label=>ucfirst($site)
</%args>
</%def>

Visit these sites:
<ul>
<li><& .link, site=>'yahoo' &><br>
<li><& .link, site=>'cmp', label=>'CMP Media' &><br>
<li><& .link, site=>'excite' &>
</ul>
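In plain Perl terms, the <%args> block above behaves roughly like a subroutine with one required and one defaulted named parameter. The link_html subroutine below is a hypothetical equivalent written for illustration; it is not how Mason compiles the subcomponent, and the error message is invented.

```perl
use strict;
use warnings;

# Hypothetical plain-Perl counterpart of the ".link" subcomponent.
sub link_html {
    my (%args) = @_;
    die "no value sent for required parameter 'site'\n"
        unless exists $args{site};
    my $site = $args{site};
    # The default is computed only when the caller omits 'label'.
    my $label = exists $args{label} ? $args{label} : ucfirst($site);
    return qq{<a href="http://www.$site.com">$label</a>};
}

print link_html(site => 'yahoo'), "\n";
print link_html(site => 'cmp', label => 'CMP Media'), "\n";
```

Note that the default for $label can refer to $site because $site is declared first.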

<%method name>

Each instance of this section creates a method embedded inside the current component. Methods resemble subcomponents in terms of naming, contents, and scope. However, while subcomponents can only be seen from the parent component, methods are meant to be called from other components.

There are two ways to call a method. First, via a path of the form "comp:method":

<& /foo/bar:method1 &>

$m->comp('/foo/bar:method1');

Second, via the call_method component method:

my $comp = $m->fetch_comp('/foo/bar');
...
$comp->call_method('method1');

Methods are commonly used in conjunction with autohandlers to make templates more flexible. See Object-Oriented Techniques for more information.

FLAGS AND ATTRIBUTES

The <%flags> and <%attr> sections consist of key/value pairs, one per line, joined by '=>'. The key and value in each pair must be valid Perl hash keys and values respectively. An optional comment may follow each line.

<%flags>

Use this section to set official Mason flags that affect the current component's behavior.

Currently there is only one flag, inherit, which specifies the component's parent in the form of a relative or absolute component path. A component inherits methods and attributes from its parent; see Object-Oriented Techniques for examples.

<%flags>
inherit=>'/site_handler'
</%flags>

<%attr>

Use this section to assign static key/value attributes that can be queried from other components.

<%attr>
color => 'blue'
fonts => [qw(arial geneva helvetica)]
</%attr>

To query an attribute of a component, use the attr method:

my $color = $comp->attr('color')

where $comp is a component object.

Mason evaluates attribute values once when loading the component. This makes them faster but less flexible than methods.

FILTERING

This section describes several ways to apply filtering functions over the results of the current component. By separating out and hiding a filter that, say, changes HTML in a complex way, we allow non-programmers to work in a cleaner HTML environment.

<%filter> section

The <%filter> section allows you to arbitrarily filter the output of the current component. Upon entry to this code, $_ contains the component output, and you are expected to modify it in place. The code has access to component arguments and can invoke subroutines, call other components, etc.

This simple filter converts the component output to UPPERCASE:

<%filter>
tr/a-z/A-Z/
</%filter>

The following navigation bar uses a filter to "unlink" and highlight the item corresponding to the current page:

<a href="/">Home</a> | <a href="/products/">Products</a> |
<a href="/bg.html">Background</a> | <a href="/finance/">Financials</a> |
<a href="/support/">Tech Support</a> | <a href="/contact.html">Contact Us</a>

<%filter>
my $uri = $r->uri;
s{<a href="$uri/?">(.*?)</a>} {<b>$1</b>}i;
</%filter>

This allows a designer to code such a navigation bar intuitively without if statements surrounding each link! Note that the regular expression need not be very robust as long as you have control over what will appear in the body.
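The substitution can be tried outside Mason. In this sketch the unlink_current subroutine is a hypothetical wrapper, and \Q...\E is an addition (it is not in the filter above) so that characters such as "." in the URI are matched literally.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the nav-bar substitution to a block of HTML.
sub unlink_current {
    my ($html, $uri) = @_;
    # \Q...\E quotes regex metacharacters in the URI; /? allows a trailing slash.
    $html =~ s{<a href="\Q$uri\E/?">(.*?)</a>}{<b>$1</b>}i;
    return $html;
}

my $nav = '<a href="/">Home</a> | <a href="/products/">Products</a> | '
        . '<a href="/bg.html">Background</a>';

print unlink_current($nav, '/products'), "\n";
print unlink_current($nav, '/bg.html'), "\n";
```

Links that do not match the current URI pass through unchanged, so the same body works on every page.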

You can use Component Calls with Content if you want to filter specific parts of a component rather than the entire component.

OTHER SYNTAX

<%doc>

Text in this section is treated as a comment and ignored. Most useful for a component's main documentation. One can easily write a program to sift through a set of components and pull out their <%doc> blocks to form a reference page.

Can also be used for in-line comments, though it is an admittedly cumbersome comment marker. Another option is '%#':

%# this is a comment

These comments differ from HTML comments in that they do not appear in the HTML.

<%text>

Text in this section is passed through unmodified by Mason. Any Mason syntax inside it is ignored. This is useful, for example, when documenting Mason itself from a component:

<%text>
% This is an example of a Perl line.
<% This is an example of an expression block. %>
</%text>

This works for almost everything, but doesn't let you output </%text> itself! When all else fails, use $m->print:

% $m->print('The tags are <%text> and </%text>.');

Escaping expressions

Mason has facilities for escaping the output from <% %> tags, on either a site-wide or a per-expression basis.

Any <% %> expression may be terminated by a '|' and one or more single-letter escape flags (plus arbitrary whitespace):

<% $file_data |h %>

The current valid flags are

h - escape for HTML ('<' => '&lt;', etc.)
u - escape for URL query string (':' => '%3A', etc.) - all but [a-zA-Z0-9_.-]
n - turn off default escape flags

The administrator may specify a set of default escape flags via the Compiler's default_escape_flags parameter. For example, if the administrator specifies

default_escape_flags => 'h'

then all <% %> expressions will automatically be HTML-escaped. In this case you would use the n flag to turn off HTML-escaping for a specific expression:

<% $html_block |n %>

Future Mason versions will allow user-defined and multi-letter escape flags.
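To get a feel for what the h and u flags do, here is a rough approximation in plain Perl. The escape_html and escape_url subroutines are illustrative names, not Mason functions, and Mason's own HTML escaping may cover more characters than the four handled here.

```perl
use strict;
use warnings;

# Approximation of the "h" flag: escape the characters special to HTML.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Approximation of the "u" flag: percent-encode all but [a-zA-Z0-9_.-].
# Assumes a byte string (no characters above 255).
sub escape_url {
    my ($text) = @_;
    $text =~ s/([^a-zA-Z0-9_.-])/sprintf('%%%02X', ord($1))/eg;
    return $text;
}

print escape_html('<b>Tom & Jerry</b>'), "\n";
print escape_url('a:b c'), "\n";
```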

Backslash at end of line

A backslash (\) at the end of a line suppresses the newline. In HTML components, this is mostly useful for fixed width areas like <pre> tags, since browsers ignore white space for the most part. An example:

<pre>
foo
% if (1) {
bar
% }
baz
</pre>

outputs

foo
bar
baz

because of the newlines on lines 2 and 4. (Lines 3 and 5 do not generate a newline because the entire line is taken by Perl.) To suppress the newlines:

<pre>
foo\
% if (1) {
bar\
% }
baz
</pre>

which prints

foobarbaz

DATA CACHING

Mason's data caching interface allows components to cache the results of computation for improved performance. Anything may be cached, from a block of HTML to a complex data structure.

Each component gets its own private, persistent data cache. Except under special circumstances, one component does not access another component's cache. Each cached value may be set to expire at a certain time.

Data caching is implemented with DeWitt Clinton's Cache::Cache module. To get the full benefit out of caching you should read the documentation for Cache::Cache as well as for relevant subclasses (e.g. Cache::FileCache). Our documentation here covers common usage but skips many options and features.

Basic Usage

The $m->cache method returns an object representing the cache for this component. Here's the typical usage of $m->cache:

my $result = $m->cache->get('key');
if (!defined($result)) {
    ... compute $result ...
    $m->cache->set('key', $result);
}

$m->cache->get attempts to retrieve this component's cache value. If the value is available it is placed in $result. If the value is not available, $result is computed and stored in the cache by $m->cache->set.

Multiple Keys/Values

A cache can store multiple key/value pairs. A value can be anything serializable by Storable, from a simple scalar to an arbitrarily complex list or hash reference:

$m->cache->set(name => $string);
$m->cache->set(friends => \@list);
$m->cache->set(map => \%hash);
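Because values pass through Storable, what comes back from a file-based cache is a copy of the data rather than the original reference. The round trip can be sketched with Storable directly; freeze and thaw are used here only to illustrate the serialization, and component code does not need to call them.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

my %hash = (red => '#ff0000', green => '#00ff00');

# Serialize a reference to the hash, as a cache would before storing it.
my $frozen = freeze(\%hash);

# Deserializing yields a new, independent structure with the same contents.
my $copy = thaw($frozen);
$copy->{red} = 'changed';

print "original: $hash{red}\n";
print "copy:     $copy->{red}\n";
```

Changing the copy leaves the original untouched, so a modified value must be stored again with set before other requests can see the change.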

You can fetch all the keys in a cache with

my @idents = $m->cache->get_keys;

It should be noted that Mason reserves all keys beginning with __mason for its own use.

Expiration

You may pass an optional third argument to $m->cache->set indicating when the data should expire:

$m->cache->set('name1', $string1, '5min');   # Expire in 5 minutes
$m->cache->set('name2', $string2, '3h');     # Expire in 3 hours

To change the expiration time for a piece of data, call set again with the new expiration. To expire an item immediately, use $m->cache->remove.

You can also expire a cache item from an external script; see Accessing a Cache Externally below.

Caching All Output

Occasionally you will need to cache the complete output of a component. For this purpose, Mason offers the $m->cache_self method. This method causes Mason to check whether this component has already been run and its output cached. If so, the cached output is simply sent as output. Otherwise, the component runs normally and its output and return value are cached.

It is typically used right at the top of an <%init> section:

<%init>
return if $m->cache_self(expire_in => '3 hours' [, key => 'fookey' ]);
 ... <rest of init> ...
</%init>

$m->cache_self is built on top of $m->cache, so it accepts all of the cache options described earlier. $m->cache_self can also cache a component's return value; see the HTML::Mason::Request documentation for details.

Cache Object Meta-data

$m->cache->get_object returns the Cache::Object associated with a particular key. You can use this to retrieve useful meta-data:

my $co = $m->cache->get_object('name1');
$co->get_created_at();    # when was object stored in cache
$co->get_accessed_at();   # when was object last accessed
$co->get_expires_at();    # when does object expire

Choosing a Cache Subclass

Cache::Cache is a purely virtual API implemented by a variety of subclasses. For example, Cache::FileCache implements the interface with a set of directories and files, while Cache::MemoryCache implements the interface in memory.

By default $m->cache uses Cache::FileCache, but you can override this with the cache_class keyword. The value must be the name of a Cache::Cache subclass; if it does not contain a "::", the prefix "Cache::" is automatically prepended. For example:

my $result = $m->cache(cache_class => 'MemoryCache')->get('key');
$m->cache(cache_class => 'MemoryCache')->set(key => $result);

You can even specify different subclasses for different keys in the same component. Just make sure the correct value is passed to all calls to $m->cache; Mason does not remember which subclass you have used for a given component or key.

Accessing a Cache Externally

To access a component's cache from outside the component (e.g. in an external Perl script), you'll need to have the following information:

  • the namespace associated with the component. The function HTML::Mason::Utils::data_cache_namespace, given a component path, will return the namespace, although this does not work with multiple component roots.

  • the username associated with the cache; this is "mason" unless it has been changed by the administrator.

  • the cache_root, for file-based caches only. Defaults to the "cache" subdirectory under the Mason data directory.

Given this information you can get a handle on the component's cache. For example, the following code removes a cache item for component /foo/bar, assuming the data directory is /usr/local/www/mason and the cache subclass is Cache::FileCache:

    use Cache::FileCache;
    use HTML::Mason::Utils qw(data_cache_namespace);

    my $cache = Cache::FileCache->new({
        namespace  => data_cache_namespace("/foo/bar"),
        cache_root => "/usr/local/www/mason/cache",
        username   => "mason",
    });
    $cache->remove('key1');

WEB-SPECIFIC FEATURES

Sending HTTP Headers

Mason automatically sends HTTP headers via $r->send_http_header but it will not send headers if they've already been sent manually.

To determine the exact header behavior on your system, you need to know whether your server's default is to have autoflush on or off. Your administrator should have this information. If your administrator doesn't know then it is probably off, the default.

With autoflush off the header situation is extremely simple: Mason waits until the very end of the request to send headers. Any component can modify or augment the headers.

With autoflush on the header situation is more complex. Mason will send headers just before sending the first output. This means that if you want to affect the headers with autoflush on, you must do so before any component sends any output. Generally this takes place in an <%init> section.

For example, the following top-level component calls another component to see whether the user has a cookie; if not, it inserts a new cookie into the header.

    <%init>
    my $cookie = $m->comp('/shared/get_user_cookie');
    if (!$cookie) {
        $cookie = new CGI::Cookie (...);
        $r->header_out('Set-cookie' => $cookie);
    }
    ...
    </%init>

With autoflush off this code will always work. Turn autoflush on and this code will only work as long as /shared/get_user_cookie doesn't output anything (given its functional nature, it shouldn't).

The administrator can turn off automatic header sending via the Request's auto_send_headers parameter available when running Mason with the ApacheHandler module.

Returning HTTP Status

The value returned from the top-most component becomes the status code of the request. If no value is explicitly returned, it defaults to OK (0).

Simply returning an error status (such as 404) from the top-most component has two problems in practice. First, the decision to return an error status often resides further down in the component stack. Second, you may have generated some content by the time this decision is made. (Both of these are more likely to be true when using autohandlers.)

Thus the safer way to generate an error status is

$m->clear_buffer;
$m->abort($status);

$m->abort bypasses the component stack and ensures that $status is returned from the top-most component. It works by throwing an exception. If you wrapped this code (directly or indirectly) in an eval, you must take care to rethrow the exception, or the status will not make it out:

eval { $m->comp('...') };
if ($@) {
   if ($m->aborted) {
       die $@;
   } else {
       # deal with non-abort exceptions
   }
}
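The reason for the rethrow can be demonstrated without Mason. In the sketch below, My::Abort is a hypothetical stand-in for the exception that $m->abort throws, and is_abort plays the role of $m->aborted; none of these names come from Mason itself.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in for the exception object that $m->abort throws.
package My::Abort;
sub new    { my ($class, $status) = @_; return bless { status => $status }, $class }
sub status { return $_[0]{status} }

package main;

sub is_abort { my ($err) = @_; return blessed($err) && $err->isa('My::Abort') }

# Stand-in for a component deep in the stack that aborts with a status.
sub inner_component { die My::Abort->new(404) }

# Catches errors from the inner call but lets aborts keep propagating.
sub outer_component {
    eval { inner_component() };
    if (my $err = $@) {
        die $err if is_abort($err);      # rethrow, as with $m->aborted
        warn "non-abort error: $err";    # deal with anything else
    }
    return 0;
}

my $status = eval { outer_component() };
$status = $@->status if is_abort($@);
print "status: $status\n";
```

Without the rethrow, outer_component would return 0 and the 404 would be lost.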

External Redirects

Because it is so commonly needed, Mason provides an external redirect method:

$m->redirect($url);    # Redirects with 302 status

This method uses the clear_buffer/abort technique mentioned above, so the same warnings apply regarding evals.

Internal Redirects

There are two ways to perform redirects that are invisible to the client.

First, you can use a Mason subrequest (see "Subrequests"). This only works if you are redirecting to another Mason page.

Second, you can use Apache's internal_redirect method, which works whether or not the new URL will be handled by Mason. Use it this way:

$r->internal_redirect($url);
$m->auto_send_headers(0);
$m->clear_buffer;
$m->abort;

The last three lines prevent the original request from accidentally generating extra headers or content.

USING THE PERL DEBUGGER

You can use the perl debugger in conjunction with a live mod_perl/Mason server with the help of Apache::DB, available from CPAN. Refer to the Apache::DB documentation for details.

The only tricky thing about debugging Mason pages is that components are implemented by anonymous subroutines, which are not easily breakpoint'able. To remedy this, Mason calls the dummy subroutine debug_hook at the beginning of each component. You can breakpoint this subroutine like so:

b HTML::Mason::Request::debug_hook

debug_hook is called with two parameters: the current Request object and the full component path. Thus you can breakpoint specific components using a conditional on $_[1]:

b HTML::Mason::Request::debug_hook $_[1] =~ /component name/

You can avoid all that typing by adding the following to your ~/.perldb file:

# Perl debugger aliases for Mason
$DB::alias{mb} = 's/^mb\b/b HTML::Mason::Request::debug_hook/';

which reduces the previous examples to just:

mb
mb $_[1] =~ /component name/

OBJECT-ORIENTED TECHNIQUES

Earlier you learned how to assign a common template to an entire hierarchy of pages using autohandlers. The basic template looks like:

header HTML
% $m->call_next;
footer HTML

However, sometimes you'll want a more flexible template that adjusts to the requested page. You might want to allow each page or subsection to specify a title, background color, or logo image while leaving the rest of the template intact. You might want some pages or subsections to use a different template, or to ignore templates entirely.

These issues can be addressed with the object-oriented style primitives introduced in Mason 0.85.

Note: we use the term object-oriented loosely. Mason borrows concepts like inheritance, methods, and attributes from object methodology but implements them in a shallow way to solve a particular set of problems. Future redesigns may incorporate a deeper object architecture if the current prototype proves successful.

Determining inheritance

Every component may have a single parent. The default parent is a component named autohandler in the closest parent directory. This rule applies to autohandlers too: an autohandler may not have itself as a parent but may have an autohandler further up the tree as its parent.

You can use the inherit flag to override a component's parent:

<%flags>
inherit => '/foo/bar'
</%flags>

If you specify undef as the parent, then the component inherits from no one. This is how to suppress templates.

Currently there is no way to specify a parent dynamically at run-time, or to specify multiple parents.

Content wrapping

At page execution time, Mason builds a chain of components from the called component, its parent, its parent's parent, and so on. Execution begins with the top-most component; calling $m->call_next passes control to the next component in the chain. This is the familiar autohandler "wrapping" behavior, generalized for any number of arbitrarily named templates.
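The wrapping behavior can be modelled in plain Perl with a list of subroutines, each receiving a code reference that plays the role of $m->call_next. This is a toy model for illustration only; the run_chain subroutine and the three anonymous subs are invented here and say nothing about how Mason implements the chain.

```perl
use strict;
use warnings;

# Each entry stands in for one component body, top-most template first.
my @chain = (
    sub { my ($next) = @_; return "<html>" . $next->() . "</html>" },  # outer template
    sub { my ($next) = @_; return "<div>"  . $next->() . "</div>"  },  # section template
    sub { return "Product list" },                                     # requested page
);

# Run the first component, handing it a closure that runs the rest.
sub run_chain {
    my @rest = @_;
    my $comp = shift @rest;
    return $comp->(sub { run_chain(@rest) });
}

print run_chain(@chain), "\n";
```

A template that never invokes its code reference stops the chain, just as a component that omits $m->call_next prevents the rest of the chain from running.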

Accessing methods and attributes

A template can access methods and/or attributes of the requested page. First, use $m->request_comp to get a handle on the appropriate component:

my $self = $m->request_comp;

$self now refers to the component corresponding to the requested page (the component at the end of the chain).

To access a method for the page, use call_method:

$self->call_method('header');

This looks for a method named 'header' in the page component. If no such method exists, the chain of parents is searched upwards, until ultimately a "method not found" error occurs. Use 'method_exists' to avoid this error for questionable method calls:

if ($self->method_exists('header')) { ...

The component returned by the $m->request_comp method never changes during request execution. In contrast, the component returned by $m->base_comp may change several times during request execution.

When execution starts, the base component is the same as the requested component. Whenever a component call is executed, the base component may become the component that was called. The base component will change for all component calls except in the following cases:

  • A component is called via its component object rather than its path, for example:

    <& $m->fetch_comp('/some/comp'), foo => 1 &>

  • A method is called via the use of SELF: or PARENT:. These are covered in more detail below.

In all other cases, the base component is the called component or the called component's owner component if that called component is a method.

As hinted at above, Mason provides a shortcut syntax for method calls.

If a component call path starts with SELF:, then Mason will start looking for the method (the portion of the call after SELF:) in the base component.

<& SELF:header &>
$m->comp('SELF:header')

If the call path starts with PARENT:, then Mason will start looking in the current component's parent for the named method.

<& PARENT:header &>
$m->comp('PARENT:header')

In the context of a component path, PARENT is shorthand for $m->current_comp->parent.

The rules for attributes are similar. To access an attribute for the page, use attr:

my $color = $self->attr('color')

This looks for an attribute named 'color' in the $self component. If no such attribute exists, the chain of parents is searched upwards, until ultimately an "attribute not found" error occurs. Use attr_exists or attr_if_exists to avoid this error for questionable attributes:

if ($self->attr_exists('color')) { ...

my $color = $self->attr_if_exists('color'); # if it doesn't exist $color is undef

Sharing data

A component's main body and its methods occupy separate lexical scopes. Variables declared, say, in the <%init> section of the main component cannot be seen from methods.

To share variables, declare them either in the <%once> or <%shared> section. Both sections have an all-inclusive scope. The <%once> section runs once when the component loads; its variables are persistent for the lifetime of the component. The <%shared> section runs once per request (when needed), just before any code in the component runs; its variables last only until the end of the request.

In the following example, various sections of code require information about the logged-in user. We use a <%shared> section to fetch this information once per request.

<%attr>
title=>sub { "Account for $full_name" }
</%attr>

<%method lefttoc>
<i><% $full_name %></i>
(<a href="logout.html">Log out</a>)<br>
...
</%method>

Welcome, <% $fname %>. Here are your options:

<%shared>
my $dbh = DBI->connect(...);
my $user = $r->connection->user;
my $sth = $dbh->prepare("select lname, fname from users where user_id = ?");
$sth->execute($user);
my ($lname, $fname) = $sth->fetchrow_array;
my $full_name = "$fname $lname";
</%shared>

<%shared> presents a good alternative to <%init> when data is needed across multiple scopes. Outside these situations, <%init> is preferred for its slightly greater speed and predictable execution model.

Example

Let's say we have three components:

/autohandler
/products/autohandler
/products/index.html

and that a request comes in for /products/index.html.

/autohandler contains a general template for the site, referring to a number of standard methods and attributes for each page:

<head>
<title><& SELF:title &></title>
</head>
<body bgcolor="<% $self->attr('bgcolor') %>">
<& SELF:header &>
<table><tr><td>

% $m->call_next;

</td></tr></table>
<& SELF:footer &>
</body>

<%init>
my $self = $m->base_comp;
...
</%init>

<%attr>
bgcolor => 'white'
</%attr>

<%method title>
McGuffey Inc.
</%method>

<%method header>
<h2><& SELF:title &></h2><p>
</%method>

<%method footer>
</%method>

Notice how we provide defaults for each method and attribute, even if blank.

/products/autohandler overrides some attributes and methods for the /products section of the site.

<%attr>
bgcolor => 'beige'
</%attr>
<%method title>
McGuffey Inc.: Products
</%method>

% $m->call_next;

Note that this component, though it only defines attributes and methods, must call $m->call_next if it wants the rest of the chain to run.

/products/index.html might override a few attributes, but mainly provides a primary section for the body.
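The way lookups resolve in this example can be modelled with a few hashes. This is a toy model, not Mason's implementation: the lookup subroutine and the data layout are invented for illustration, and methods are represented simply by their output strings.

```perl
use strict;
use warnings;

# Toy model of the three components: own attributes, own methods, a parent.
my $root = {
    attr   => { bgcolor => 'white' },
    method => { title => 'McGuffey Inc.', footer => '' },
};
my $products = {
    parent => $root,
    attr   => { bgcolor => 'beige' },
    method => { title => 'McGuffey Inc.: Products' },
};
my $page = { parent => $products, attr => {}, method => {} };

# Search the component, then its parent, and so on up the chain.
sub lookup {
    my ($comp, $kind, $name) = @_;
    for (my $c = $comp; $c; $c = $c->{parent}) {
        return $c->{$kind}{$name} if exists $c->{$kind}{$name};
    }
    die "$kind '$name' not found\n";
}

print lookup($page, 'attr',   'bgcolor'), "\n";   # from /products/autohandler
print lookup($page, 'method', 'title'),   "\n";   # from /products/autohandler
print lookup($root, 'attr',   'bgcolor'), "\n";   # from /autohandler
```

A request for /products/index.html therefore gets the beige background and the products title, while the empty footer still comes from /autohandler.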

COMMON TRAPS

Do not call $r->content or "new CGI"

Mason calls $r->content itself to read request input, emptying the input buffer and leaving a trap for the unwary: subsequent calls to $r->content hang the server. This is a mod_perl "feature" that may be fixed in an upcoming release.

For the same reason you should not create a CGI object like

my $query = new CGI;

when handling a POST; the CGI module will try to reread request input and hang. Instead, create an empty object:

my $query = new CGI ("");

Such an object can still be used for all of CGI's useful HTML output functions. Or, if you really want to use CGI's input functions, initialize the object from %ARGS:

my $query = new CGI (\%ARGS);

AUTHORS

Jonathan Swartz <swartz@pobox.com>, Dave Rolsky <autarch@urth.org>, Ken Williams <ken@mathforum.org>

SEE ALSO

HTML::Mason, HTML::Mason::Admin, HTML::Mason::Request