NAME

XML::Sablotron - a Perl interface to the Sablotron XSLT processor

SYNOPSIS

use XML::Sablotron qw (:all);
Process(.....);

If you prefer an object approach, you can use the object wrapper:

$sab = new XML::Sablotron();
$sab->RunProcessor($template_url, $data_url, $output_url, 
                \@params, \@arguments);
$result = $sab->GetResultArg($output_url);

DESCRIPTION

This package is a very simple interface to the Sablotron API. OK, but what does Sablotron mean?

Sablotron is an XSLT processor implemented in C++ based on the Expat XML parser.

To run this package, you need to download and install Sablotron from the http://www.gingerall.cz/charlie-bin/get/webGA/act/download.act page.

You do _not_ need to download Expat, or any other Perl packages to run the XML::Sablotron package.

USAGE

ProcessStrings

ProcessStrings($template, $data, $result);

where...

$template

contains an XSL stylesheet

$data

contains the XML data to be processed

$result

is filled with the desired output

This function returns the Sablotron error code.

Process

This function provides a more general interface to Sablotron. Its usage may seem a little tricky, but it offers a variety of ways to modify Sablotron's behavior.

Process($template_uri, $data_uri, $result_uri,
        $params, $buffers, $result);

where...

$template_uri

is the URI of the XSL stylesheet

$data_uri

is the URI of the data to be processed

$result_uri

is the URI of the destination buffer. Currently, only the arg: scheme is supported. Use the value arg:/result (the name of the $result variable without the "$" sign).

$params

is a reference to an array of global stylesheet parameters

$buffers

is a reference to an array of named buffers

$result

receives the result. It requires $result_uri to be set to arg:/result.

The following example should make it clear.

Process("arg:/template", "arg:/data", "arg:/result", 
        undef, 
        ["template", $template, "data", $data], 
        $result);

does exactly the same as

ProcessStrings($template, $data, $result);

Why is it so complicated? Please see the Sablotron documentation for details.

This function returns the Sablotron error code.
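The link between buffer names and arg: URIs can be sketched in plain Perl. This is only an illustration: named_buffers is a hypothetical helper, not part of the package, and the document strings are placeholders rather than real stylesheets.

```perl
use strict;
use warnings;

# Hypothetical helper: turn name => text pairs into the flat
# (name, text, name, text, ...) list passed as $buffers, together with
# the arg: URI that addresses each buffer.
sub named_buffers {
    my (%docs) = @_;
    my @names   = sort keys %docs;
    my @buffers = map { ($_, $docs{$_}) } @names;
    my %uri     = map { ($_ => "arg:/$_") } @names;
    return (\@buffers, \%uri);
}

# Placeholder contents, for illustration only.
my ($buffers, $uri) = named_buffers(template => "<xsl/>", data => "<doc/>");
print "$uri->{template} $uri->{data}\n";    # prints "arg:/template arg:/data"
```

With these values, the call shown above would read Process($uri->{template}, $uri->{data}, "arg:/result", undef, $buffers, $result).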

RegMessageHandler

This function is deprecated and no longer supported. See the description of the object interface later in this document.

UnregMessageHandler

This function is deprecated and no longer supported. See the description of the object interface later in this document.

OBJECT INTERFACE

This is a short intro for people who like it hot. Skip this preface if you just want to use this package the "ordinary" way.

There are two classes defined to deal with the Sablotron processor object.

XML::Sablotron::Processor is a class implementing an interface to the Sablotron processor object. Currently, there is no way to create more than one instance of the processor object, but the use of multiple objects should be supported soon. Usually, you don't need to use this class directly (except when using handlers, but that is a painless case).

The implementation of this class contains a circular reference inside Perl structures, which has to be broken by calling the _release method. Unless you are going to do some strange hacks, you can forget this explanation.

XML::Sablotron is often the only thing you need. It's a wrapper around the XML::Sablotron::Processor object. The only purpose of this class is to keep track of the life cycle of the processor, so you don't have to deal with reference counting inside the processor class. All calls to this class are redirected to an inner instance of the XML::Sablotron::Processor object.

XML::Sablotron

Constructor

The constructor of the XML::Sablotron object takes no arguments, so you can create a new instance simply like this:

$sab = new XML::Sablotron();

RunProcessor

The RunProcessor method is analogous to the Process function.

$code = $sab->RunProcessor($template_uri, $data_uri, $result_uri,
                           $params, $buffers);

where...

$template_uri

is the URI of the XSL stylesheet

$data_uri

is the URI of the data to be processed

$result_uri

is the URI of the destination buffer

$params

is a reference to an array of global stylesheet parameters

$buffers

is a reference to an array of named buffers

URIs passed to this function may use schemes supported internally (file:, arg:) or any scheme handled by a registered handler (see the "HANDLERS" section).

Note the difference between the RunProcessor method and the Process function. RunProcessor doesn't return the output buffer ($result parameter is missing).

To obtain the result buffer(s) you have to call the "GetResultArg" method.

Example of use:

$code = $sab->RunProcessor("arg:/template", "arg:/data", "arg:/result", 
        undef, 
        ["template", $template, "data", $data] );

GetResultArg

Call this method to obtain the result buffer after processing. The goal of this approach is to enable multiple output buffers. Hopefully, this minor inconvenience is not too painful.

$result = $sab->GetResultArg($output_url);

This method returns the output buffer specified by its URL.

The RunProcessor example above would continue like this:

$return = $sab->GetResultArg("result");
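The two calls can be wrapped in one helper. The following is a minimal sketch, not part of the package: transform_strings is a hypothetical name, and the error check assumes that RunProcessor returns zero on success, which this document does not state explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Sablotron): run the stylesheet held
# in $template over the XML held in $data and return the output as a string.
# $sab is expected to be an XML::Sablotron object.
sub transform_strings {
    my ($sab, $template, $data) = @_;

    # Both inputs are passed as named buffers and addressed with arg: URIs.
    my $code = $sab->RunProcessor(
        "arg:/template", "arg:/data", "arg:/result",
        undef,
        [ "template", $template, "data", $data ],
    );

    # Assumption: a non-zero return value signals an error.
    die "Sablotron returned error code $code\n" if $code;

    return $sab->GetResultArg("result");
}

# Usage, assuming the package is installed:
#   use XML::Sablotron;
#   my $sab = new XML::Sablotron();
#   print transform_strings($sab, $template, $data);
```

Because the helper only calls methods on the object it receives, it can be exercised in tests with any object that provides RunProcessor and GetResultArg.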

FreeResultArgs

$sab->FreeResultArgs();

This call frees up all output buffers allocated by Sablotron. You do not have to call this function as these buffers are managed by the processor internally.

Use this function to release huge chunks of memory while a processor instance stays idle for a longer time, for example.

RegHandler

Registers an external handler of a certain type. The processor can use the handler for miscellaneous tasks such as log and error messaging.

For more details on handlers see the "HANDLERS" section of this document.

There are two ways to call the RegHandler method:

$sab->RegHandler($type, $handler);

where...

$type

is the handler type (see "HANDLERS")

$handler

is an object implementing the handler interface

The second way allows you to create anonymous handlers defined as a set of function calls:

$sab->RegHandler($type, { handler_stub1 => \&my_proc1,
                        handler_stub2 => \&my_proc2, ... });

However, this form is very simple: it does not allow you to unregister the handler later.

For a detailed description of the handler interface see the "HANDLERS" section.

UnregHandler

$sab->UnregHandler($type, $handler);

This method unregisters a registered handler.

Remember that anonymously registered handlers can't be unregistered. (Of course, they can be canceled, but it's a little bit tricky.)

Set/GetEncoding

$sab->SetEncoding($encoding);

Calling these methods has no effect. They are useful for a miscellaneous handler, which may store received values together with the processor instance.

Set/GetContentType

$sab->SetContentType($contentType);

Calling these methods has no effect. They are useful for a miscellaneous handler, which may store received values together with the processor instance.

SetOutputEncoding

$sab->SetOutputEncoding($encoding);

This method allows you to override the encoding specified in the <xsl:output> instruction. It makes it possible to produce differently encoded outputs from one template.

SetBase

$sab->SetBase($base_url);

Call this method to make the processor use $base_url as the base URI when resolving any relative URI within the data or the template.

SetBaseForScheme

$sab->SetBaseForScheme($scheme, $base);

Like SetBase, but the given base URL is used only for the specified scheme.

SetLog

$sab->SetLog($filename);

This method sets the log file name.

ClearError

$sab->ClearError();

This method clears the last internal error of the processor.

HANDLERS

Currently, Sablotron supports four flavors of handlers.

  • messages handler (0)

  • scheme handler (1)

  • SAX-like output handler (2)

  • miscellaneous handler (3)

I have to say that at the moment the XML::Sablotron extension supports only the first two of them.

General interface format

Callback functions implementing handlers have different prototypes (not prototypes in the Perl sense), but the first two parameters are always the same:

$self

is a reference to the registered object, so you can implement handlers the common object way. If you register a handler with a hash reference (see "RegHandler"), this parameter refers to a hidden object, which is useless for you.

$processor

is a reference to the processor that is actually calling your handler. It allows you to use one handler for more than one processor.

Messages handler - overview

The goal of this handler is to deal with all messages produced by a processor.

Each state reported by the processor is composed of the following data:

  • severity

    zero means: not so bad thing; 1 means: OOPS, bad thing

  • facility

    Helps to determine who is reporting in larger systems. Sablotron always sets this value to 2.

  • code

    An internal Sablotron code.

Each reported event falls into one of predefined categories, which define the event level. The valid levels include:

  • debug (0)

    all stuff

  • info (1)

    information for curious people

  • warn (2)

    warnings on suspicious things

  • error (3)

    huh, something is wrong

  • critical (4)

    very, very bad day...

The numbers in the parentheses are the internal level codes.
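Inside a handler it is often handy to turn a numeric level back into its name. The lookup below is a small sketch built only from the list above; level_name is a hypothetical helper, not something the package provides.

```perl
use strict;
use warnings;

# Level names in the order of their internal codes, as listed above.
my @LEVEL_NAMES = qw(debug info warn error critical);

# Return the name for a numeric level, or "unknown(N)" for anything else.
sub level_name {
    my ($level) = @_;
    return $LEVEL_NAMES[$level]
        if defined $level && $level =~ /\A\d+\z/ && $level < @LEVEL_NAMES;
    return "unknown(" . (defined $level ? $level : "undef") . ")";
}

print level_name(2), "\n";    # prints "warn"
```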

Messages handler - interface

To define a messages handler, you have to define the following functions (or methods, depending on the kind of registration, see "RegHandler").

MHMakeCode($self, $processor, $severity, $facility, $code)

This function is called whenever Sablotron needs to display a message. It helps you to convert the internal codes into your own space of numbers. After this call, Sablotron forgets its code and uses yours.

To understand parameters of this call see: "Messages handler - overview"

MHLog($self, $processor, $code, $level, @fields)

A Sablotron request to log some event.

$code

is the code previously returned by MHMakeCode

$level

is the event level (see "Messages handler - overview")

@fields

are text fields in the format "fldname: following text"

MHError($self, $processor, $code, $level, @fields)

is very similar to the MHLog function but it is called only when a bad thing happens (error and critical levels).
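Since each entry of @fields follows the "fldname: following text" format, a handler can split the entries into a hash before using them. The sketch below is one possible way to do that; parse_fields and the "_other" key are hypothetical, and the sample field names are made up for illustration.

```perl
use strict;
use warnings;

# Split "fldname: following text" strings into a hash keyed by field name.
# Strings without a colon are collected under the hypothetical key "_other".
sub parse_fields {
    my @fields = @_;
    my %parsed;
    for my $field (@fields) {
        if ($field =~ /\A([^:]+):\s*(.*)\z/s) {
            # Only the first colon separates the name from the text.
            $parsed{$1} = $2;
        }
        else {
            push @{ $parsed{_other} }, $field;
        }
    }
    return \%parsed;
}

# The field names and values here are made up for illustration.
my $info = parse_fields("msg: something happened", "line: 12");
print "$info->{msg} (line $info->{line})\n";
```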

Messages handler - example

A very simple message handler could look like this:

sub myMHMakeCode {
    my ($self, $processor, $severity, $facility, $code) = @_;
    return $code; # I can deal with internal numbers
}

sub myMHLog {
    my ($self, $processor, $code, $level, @fields) = @_;
    # LOGHANDLE must be a filehandle already opened for writing
    print LOGHANDLE "[Sablot: $code]\n" . (join "\n", @fields, "");
}

sub myMHError {
    myMHLog(@_);
    die "Dying from Sablotron errors, see log\n";
}

$sab = new XML::Sablotron();
$sab->RegHandler(0, { MHMakeCode => \&myMHMakeCode,
                      MHLog => \&myMHLog,
                      MHError => \&myMHError });

That's all, folks.
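The same handler can also be written as an object, which, unlike the anonymous form, can be unregistered later. The class below is a sketch under stated assumptions: MyMessageHandler is a hypothetical name, events are kept in memory instead of being written to a file, and the direct calls at the end merely stand in for the processor so that the code runs without Sablotron.

```perl
use strict;
use warnings;

package MyMessageHandler;    # hypothetical class name

sub new {
    my ($class) = @_;
    return bless { log => [] }, $class;
}

# Keep Sablotron's internal code unchanged.
sub MHMakeCode {
    my ($self, $processor, $severity, $facility, $code) = @_;
    return $code;
}

# Collect log events in memory instead of printing them.
sub MHLog {
    my ($self, $processor, $code, $level, @fields) = @_;
    push @{ $self->{log} },
        { code => $code, level => $level, fields => [@fields] };
    return;
}

# Record the event, then abort processing.
sub MHError {
    my ($self, $processor, $code, $level, @fields) = @_;
    $self->MHLog($processor, $code, $level, @fields);
    die "Sablotron error $code: " . join("; ", @fields) . "\n";
}

package main;

my $handler = MyMessageHandler->new;

# With the package installed, registration and removal would look like:
#   $sab->RegHandler(0, $handler);
#   ...
#   $sab->UnregHandler(0, $handler);

# Direct call, standing in for the processor:
$handler->MHLog(undef, 7, 1, "msg: hello");
print scalar(@{ $handler->{log} }), " event(s) logged\n";
```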

Scheme handler - overview

One of the great features of Sablotron is its support for scheme handlers. This feature allows you to reference data from any URL scheme. Every time the processor is asked for some URI (e.g. using the document() function), it looks for a handler that can resolve the required document.

Sablotron first asks the handler for the whole document at once. If the handler refuses this request, Sablotron "opens" a connection to the handler and tries to read the data piece by piece.

A handler can be used for the output buffers as well, so this mechanism also supports the "put" method.

Scheme handler - interface

SHGetAll($self, $processor, $scheme, $rest)

This function is called when the processor is trying to resolve a document. The processor expects the SHGetAll function to return the whole document.

If you're going to use the second way (giving chunks of the document), simply don't implement this function or return the undef value from it.

$scheme

holds the scheme extracted from the URI

$rest

holds the rest of the URI

SHOpen($self, $processor, $scheme, $rest)

This function is called before the first SHGet or SHPut request; when reading, that is right after SHGetAll has declined to return the document. Use it to pass some "handle" (that is, user data) to the processor. This data will be a part of each following request (SHGet, SHPut).

SHGet($self, $processor, $handle, $size)

This function returns the next chunk of data. The size of the data MUST NOT be greater than the $size parameter.

$handle is the value previously returned from the SHOpen function.

Return the undef value to say "No more data".

SHPut($self, $processor, $handle, $data)

This function stores a chunk of data given in the $data parameter.

SHClose($self, $processor, $handle)

You can close your internal connections, files, etc. using this function.

Scheme handler - example

See the test script (test.pl) included in this distribution.
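For orientation, here is a minimal sketch of a chunked scheme handler that serves documents from memory. It is not taken from test.pl. MemorySchemeHandler and the "mem" scheme are hypothetical, the exact form of $rest for a given URI is an assumption, and so is returning undef from SHOpen for an unknown document. The loop at the end stands in for the processor, so the code runs without Sablotron.

```perl
use strict;
use warnings;

package MemorySchemeHandler;    # hypothetical class name

# %docs maps the part of the URI after the scheme to the document text.
sub new {
    my ($class, %docs) = @_;
    return bless { docs => {%docs} }, $class;
}

# Decline the whole-document request so that the processor falls back to
# SHOpen/SHGet/SHClose.
sub SHGetAll { return undef }

# The handle is a small hash holding the text and the current read offset.
# Assumption: undef is an acceptable answer for an unknown document.
sub SHOpen {
    my ($self, $processor, $scheme, $rest) = @_;
    return undef unless exists $self->{docs}{$rest};
    return { text => $self->{docs}{$rest}, pos => 0 };
}

# Return at most $size characters, or undef when nothing is left.
sub SHGet {
    my ($self, $processor, $handle, $size) = @_;
    return undef if $handle->{pos} >= length($handle->{text});
    my $chunk = substr($handle->{text}, $handle->{pos}, $size);
    $handle->{pos} += length($chunk);
    return $chunk;
}

# Nothing to release for an in-memory document.
sub SHClose { return }

package main;

my $handler = MemorySchemeHandler->new(greeting => "<hello>world</hello>");

# With the package installed: $sab->RegHandler(1, $handler);
# The loop below stands in for the processor and reads 8 characters at a time.
my $handle = $handler->SHOpen(undef, "mem", "greeting");
my $text   = "";
while (defined(my $chunk = $handler->SHGet(undef, $handle, 8))) {
    $text .= $chunk;
}
$handler->SHClose(undef, $handle);
print "$text\n";    # prints "<hello>world</hello>"
```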

SAX handler

The SAX-like handler is not yet supported.

Miscellaneous handler

This handler was introduced in version 0.42 and may change in the near future. To avoid a namespace collision with the message handler, the miscellaneous handler uses the prefix 'XS' (as in eXtended features).

XHDocumentInfo($self, $processor, $contentType, $encoding)

This function is called when document attributes are specified via the <xsl:output> instruction. $contentType holds the value of the "media-type" attribute, and $encoding holds the value of the "encoding" attribute.

The return value of this callback is discarded.

Miscellaneous handler - example

Suppose a template like this:

<?xml version='1.0'?>
...
<xsl:output media-type="text/html" encoding="iso-8859-2"/>
...

In this case the XHDocumentInfo callback function is called with the values "text/html" and "iso-8859-2".
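A handler for this callback might simply store both values with the processor, as suggested in the description of Set/GetEncoding and Set/GetContentType. The sketch below uses the callback name from the interface description (XHDocumentInfo). DocInfoHandler is a hypothetical class, and FakeProcessor is only a stand-in that mimics the setters and getters so that the code runs without Sablotron.

```perl
use strict;
use warnings;

package DocInfoHandler;    # hypothetical class name

sub new { return bless {}, shift }

# Store the values announced by <xsl:output> with the processor that
# reported them.
sub XHDocumentInfo {
    my ($self, $processor, $contentType, $encoding) = @_;
    $processor->SetContentType($contentType);
    $processor->SetEncoding($encoding);
    return;    # the return value is discarded anyway
}

# Stand-in for the processor; it only remembers what it is given.
package FakeProcessor;

sub new            { return bless {}, shift }
sub SetContentType { my ($self, $value) = @_; $self->{content_type} = $value; return }
sub GetContentType { my ($self) = @_; return $self->{content_type} }
sub SetEncoding    { my ($self, $value) = @_; $self->{encoding} = $value; return }
sub GetEncoding    { my ($self) = @_; return $self->{encoding} }

package main;

# With the package installed: $sab->RegHandler(3, DocInfoHandler->new);
my $processor = FakeProcessor->new;
DocInfoHandler->new->XHDocumentInfo($processor, "text/html", "iso-8859-2");

my $header = $processor->GetContentType() . "; charset=" . $processor->GetEncoding();
print "$header\n";    # prints "text/html; charset=iso-8859-2"
```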

LICENSE

This package is subject to the MPL (or the GPL alternatively).

The same licensing applies to Sablotron.

AUTHOR

Pavel Hlavnicka; pavel@gingerall.cz

SEE ALSO

perl(1).