NAME
RTF::Tokenizer - Tokenize RTF
DESCRIPTION
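Splits a string of RTF into tokens (control words, text and groups), which are returned one at a time by get_token.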
SYNOPSIS
use RTF::Tokenizer;

# Optional: turn entities into numeric HTML entities
sub entity_handler {
    return '&#' . hex($_[0]) . ';';
}

# Read every token until the end of the input
my $rtf = RTF::Tokenizer->new($line);
#my $rtf = RTF::Tokenizer->new($line, \&entity_handler);
while (1) {
    my ($type, $value, $extra) = $rtf->get_token;
    # $extra is only set for control words that carry a value
    print join(', ', grep { defined } $type, $value, $extra), "\n";
    last if $type eq 'eof';
}

# Bookmarks: look ahead in the buffer, then rewind (fresh tokenizer)
$rtf = RTF::Tokenizer->new($line);
my ($type, $value, $extra);
$rtf->bookmark('save', '_font_table_original');
$rtf->jump_to_control_word('fonttbl');
($type, $value, $extra) = $rtf->get_token; # 'control', 'fonttbl'
$rtf->bookmark('retr', '_font_table_original');
$rtf->jump_to_control_word('rtf');
($type, $value, $extra) = $rtf->get_token; # 'control', 'rtf', 1
$rtf->bookmark('retr', '_font_table_original');
$rtf->bookmark('delete', '_font_table_original');
METHODS
new ( $data [, entity handling subroutine ] )
Creates an instance. The first argument is a string of RTF; the optional second argument is a reference to a subroutine that is called whenever an entity is found. By default an entity is converted into the character it represents, but a handler can return something else, such as an HTML entity (as in the synopsis above). The handler is passed the hex value of the entity.
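As a rough sketch of what such a handler could look like, the two subroutines below take a hex string and return a replacement. The names entity_to_char and entity_to_html are hypothetical; only the calling convention (one hex value as the argument) comes from the description above, and the first subroutine merely imitates the documented default rather than reproducing the module's own code.

```perl
use strict;
use warnings;

# Hypothetical handlers: each receives the entity's hex value as a string.

# Imitates the documented default: hex value to the character it represents.
sub entity_to_char {
    my ($hex) = @_;
    return chr(hex($hex));
}

# Returns a numeric HTML entity instead of the character.
sub entity_to_html {
    my ($hex) = @_;
    return '&#' . hex($hex) . ';';
}

print entity_to_html('e9'), "\n";   # prints &#233;
```

Either subroutine would presumably be passed to new as a reference, for example \&entity_to_html.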
get_token
Returns a list containing the token type (one of control, text, group or eof), the token data and, for a control word, the integer value associated with it (if there is one).
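The sketch below shows one way to consume such a list: loop until the type is 'eof' and treat the third element as optional. To keep it runnable without any RTF input, a closure over made-up tokens stands in for $object->get_token; the helper name describe_tokens and the canned token values are assumptions for illustration only.

```perl
use strict;
use warnings;

# Stand-in for $object->get_token: hands out canned tokens, then 'eof'.
# These token values are invented for the example.
my @canned = (
    [ 'control', 'rtf', 1 ],
    [ 'text',    'Hello' ],
    [ 'eof' ],
);
my $next_token = sub { return @{ shift(@canned) || ['eof'] } };

# Builds one printable line per token until 'eof' is seen.
sub describe_tokens {
    my ($get) = @_;
    my @lines;
    while (1) {
        my ($type, $value, $extra) = $get->();
        last if $type eq 'eof';
        my $line = "$type: $value";
        # The third element is only defined for control words with a value.
        $line .= " ($extra)" if defined $extra;
        push @lines, $line;
    }
    return @lines;
}

print "$_\n" for describe_tokens($next_token);
```

With a real tokenizer, the closure would be replaced by something like sub { $object->get_token }.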
bookmark ( action, name )
Manages named copies of the buffer, held in a hash in the object under the key 'name'. The action is one of 'save' (store a copy of the current buffer), 'retr' (restore the buffer from the saved copy) or 'delete' (discard the saved copy). Because each bookmark holds a copy of the data rather than a position in the buffer, it is a good idea to delete bookmarks once you are done with them, especially if the text is large. Font.pm contains a good example.
jump_to_control_word ( list of control words )
Scans forward through the buffer until it finds one of the given control words. The next call to get_token will then return that control word. The buffer up to that point is discarded (unless you have saved it with bookmark).
AUTHOR
Peter Sergeant <pete@clueball.com>
COPYRIGHT
Copyright 2002 Peter Sergeant.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.