NAME

Parse::WBXML - event-driven support for the generation and parsing of WBXML documents

VERSION

version 0.005

SYNOPSIS

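use strict;
use warnings;
use Parse::WBXML;

my $wbxml = Parse::WBXML->new;
$wbxml->add_handler_for_event(
  # Raised at the start of each element
  start_element => sub {
    my ($self, $el) = @_;
    $self;
  },
  # Raised for character data
  characters => sub {
    my ($self, $data) = @_;
    $self;
  },
  # Raised at the end of each element
  end_element => sub {
    my ($self, $el) = @_;
    $self;
  },
);
# parse() is documented as taking a reference to the buffer (see "parse")
my $buffer = "wbxml data"; # placeholder for raw WBXML bytes
$wbxml->parse(\$buffer);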

DESCRIPTION

WARNING: this is an early alpha release. If you want WBXML support, please try the other modules in "SEE ALSO" first. The current API may change before the 1.0 release.

Provides a pure-Perl implementation of the WBXML compressed XML format. It is slower and less efficient than the libwbxml2-based alternatives (see "SEE ALSO"), but it supports streaming, SAX-like parsing.

This may be of some use in low-bandwidth situations where you want data as soon as it is available from the stream, in cases where the document is damaged and you want to recover as much data as possible, or if you just don't have libwbxml2 available.

ACCESSOR METHODS

charset

Returns the current charset, such as 'UTF-8'.

publicid

Returns the current public ID, which is the XML DTD identifier for this document, e.g. "-//WAPFORUM//DTD SI 1.0//EN".

version

Returns the current version as a string, e.g. "1.3".

attribute_codepage

Returns the current attribute codepage. This is the table used for all document-specific attribute tag lookups, and defaults to 0.

tag_codepage

Returns the current tag codepage. This is the table used for all document-specific element tag lookups, and defaults to 0.
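
As a rough illustration of how a codepage selects the lookup table, the sketch below resolves a tag ID against a two-level hash keyed first by codepage and then by ID. The table contents, %TAGS and lookup_tag are placeholders invented for this example; they are not part of Parse::WBXML and do not correspond to any real document type.

```perl
use strict;
use warnings;

# Made-up tables for illustration only: codepage => { tag id => name }
my %TAGS = (
    0 => { 0x05 => 'example-root', 0x06 => 'example-item' },
    1 => { 0x05 => 'example-extension' },
);

# Hypothetical helper: resolve a tag ID within the given codepage.
# The codepage defaults to 0, matching the documented default.
sub lookup_tag {
    my ($codepage, $id) = @_;
    $codepage = 0 unless defined $codepage;
    my $page = $TAGS{$codepage} or return;   # unknown codepage
    return $page->{$id};                     # undef if the ID is not listed
}

print lookup_tag(0, 0x05), "\n";   # example-root
print lookup_tag(1, 0x05), "\n";   # example-extension
```

The same ID resolves to a different name once the codepage changes, which is why the current codepage has to be tracked while parsing.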

METHODS

new

Constructor. Any arguments passed are currently ignored.

mb_to_int

Convert multi-byte sequence to an integer value.
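
For reference, the WBXML specification encodes multi-byte integers with seven payload bits per byte, most significant group first, and sets the high bit on every byte except the last. The sketch below decodes one such value from the front of a buffer. The helper name decode_mb_u_int32 and its buffer-consuming behaviour are assumptions made for this example, not a description of how mb_to_int is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Parse::WBXML: decode one multi-byte
# integer from the front of the buffer referenced by $bufref.
# On success the consumed bytes are removed and the value is returned.
# If the sequence is incomplete, the buffer is left untouched and an
# empty list is returned.
sub decode_mb_u_int32 {
    my ($bufref) = @_;
    my $value = 0;
    my $used  = 0;
    for my $byte (unpack 'C*', $$bufref) {
        ++$used;
        $value = ($value << 7) | ($byte & 0x7F);   # append 7 payload bits
        unless ($byte & 0x80) {                    # high bit clear: last byte
            substr($$bufref, 0, $used, '');
            return $value;
        }
    }
    return;   # ran out of data before the final byte
}

my $buffer = "\x81\x20rest";
my $n = decode_mb_u_int32(\$buffer);
print "$n\n";        # 160
print "$buffer\n";   # rest
```

Leaving the buffer untouched on an incomplete sequence suits streaming use, since the caller can append more data and retry.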

decode_string

Decodes the given string using the current "charset".

encode_string

Encodes the given string using the current "charset".
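
A charset-driven round trip can be sketched with the core Encode module. Whether Parse::WBXML uses Encode internally is an assumption here, and $charset merely stands in for the value returned by "charset".

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $charset = 'UTF-8';   # stand-in for the current charset

# Raw bytes as they might appear in a document: "caf" plus a two-byte
# UTF-8 sequence for e-acute.
my $bytes = "caf\xC3\xA9";

# Bytes to Perl characters, as decode_string is described as doing
my $text = decode($charset, $bytes);
print length($bytes), " bytes, ", length($text), " characters\n";   # 5 bytes, 4 characters

# Characters back to bytes, as encode_string is described as doing
my $round_trip = encode($charset, $text);
print $round_trip eq $bytes ? "round trip ok\n" : "mismatch\n";     # round trip ok
```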

parse

Parse as much as we can from the given buffer.

Takes a single scalar ref as parameter; this should point to the buffer to be processed.

queue_start

Queue the initial items for parsing.

process_queue

Process everything we can in the queue.

Takes a single scalar ref as parameter; this should point to the buffer to be processed.

next_queued

Return current item in the queue (without removing it).

mark_item_complete

Remove the current item from the queue.

push_queued

Queue some more items. More of a shift than a push.

parse_item

Parse the given item if we have a method for it.

parse_version

Deconstruct a version: a single byte containing the major version in the high nybble (stored as major minus one, per the WBXML specification) and the minor version in the low nybble.
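
As a worked example, the WBXML specification stores the major version minus one in the high nybble, so the byte 0x03 corresponds to version "1.3". The helper below is a minimal sketch of that arithmetic; split_version_byte is a name made up for this example and is not a Parse::WBXML method.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a WBXML version byte into a "major.minor" string.
sub split_version_byte {
    my ($byte) = @_;
    my $major = ($byte >> 4) + 1;   # high nybble holds major minus one
    my $minor = $byte & 0x0F;       # low nybble holds minor
    return "$major.$minor";
}

print split_version_byte(0x03), "\n";   # 1.3
print split_version_byte(0x17), "\n";   # 2.7
```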

parse_publicid

Look up the given public ID, which is either a token for a preset value or a reference to the string table.
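
The two cases can be sketched as follows. In the WBXML specification, a non-zero token selects a predefined public ID, while a zero token is followed by an offset into the string table. The table below is a partial list written from memory of the specification and should be checked against it; resolve_publicid and its arguments are hypothetical, and the string table is assumed to hold NUL-terminated strings.

```perl
use strict;
use warnings;

# Partial table of predefined public IDs (an assumption to verify
# against the WBXML specification before relying on it).
my %WELL_KNOWN_PUBLICID = (
    2 => '-//WAPFORUM//DTD WML 1.0//EN',
    4 => '-//WAPFORUM//DTD WML 1.1//EN',
    5 => '-//WAPFORUM//DTD SI 1.0//EN',
    6 => '-//WAPFORUM//DTD SL 1.0//EN',
);

# Hypothetical helper: resolve a public ID from its token, or from the
# string table when the token is zero.
sub resolve_publicid {
    my ($token, $strtbl, $offset) = @_;
    return $WELL_KNOWN_PUBLICID{$token} if $token != 0;   # undef if not listed
    my $end = index($strtbl, "\0", $offset);              # find the terminator
    return if $end < 0;                                   # string not complete
    return substr($strtbl, $offset, $end - $offset);
}

print resolve_publicid(5), "\n";
# -//WAPFORUM//DTD SI 1.0//EN
print resolve_publicid(0, "abc\0-//EXAMPLE//DTD Demo//EN\0", 4), "\n";
# -//EXAMPLE//DTD Demo//EN
```

Because the string table follows the public ID in the document header, a string-table reference can only be resolved once the table itself has been read.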

parse_charset

parse_strtbl

parse_strtbl_length

parse_strtbl_data

parse_body

parse_pi

parse_attribute

parse_attrvalue

parse_element

tag_from_id

attrstart_from_id

attrvalue_from_id

dump_from_buffer

Given a buffer of data, reports as much information as can be extracted.

next_tag_code_from_buffer

Extracts the next tag code from the buffer, returning an empty list if none found.
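
In the WBXML specification, a tag code is a single byte: bit 7 signals that attributes follow, bit 6 signals that the element has content, and the low six bits identify the tag within the current tag codepage. The sketch below splits a byte along those lines; split_tag_byte is a hypothetical helper and does not reflect the return value of next_tag_code_from_buffer.

```perl
use strict;
use warnings;

# Hypothetical helper: split a WBXML tag byte into its three parts.
sub split_tag_byte {
    my ($byte) = @_;
    return {
        id             => $byte & 0x3F,               # low six bits
        has_attributes => ($byte & 0x80) ? 1 : 0,     # bit 7
        has_content    => ($byte & 0x40) ? 1 : 0,     # bit 6
    };
}

my $tag = split_tag_byte(0xC5);
printf "id=0x%02X attributes=%d content=%d\n",
    @{$tag}{qw(id has_attributes has_content)};
# id=0x05 attributes=1 content=1
```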

next_attribute_item_from_buffer

Gets the tag, then processes attributes if any, then processes content if any.

NOTES

The format is probably better suited to Marpa::XS-style parsing than to a manual task stack/state machine, so the internals are likely to be rearranged in a future version.

SEE ALSO

AUTHOR

Tom Molesworth <cpan@entitymodel.com>

LICENSE

Copyright Tom Molesworth 2011-2012. Licensed under the same terms as Perl itself.