NAME
Text::Parser::Multiline - Adds multi-line support to the Text::Parser object.
VERSION
version 0.800
SYNOPSIS
RATIONALE
Some text formats allow users to split a single line into multiple lines, with a continuation character at the beginning or at the end, usually to improve human readability.
To handle these types of text formats with the native Text::Parser class, the derived class would need a save_record method that would:
Detect if the line is continued, and if it is, save it in a temporary location
Keep appending (or joining) any continued lines to this temporary location
Once the line continuation stops, create a record and save it with the save_record method
It should also look for error conditions:
If the end of file is reached, and a joined line is still waiting incomplete, throw an exception "unexpected EOF"
If the first line of a text input happens to be a continuation of a previous line, that is impossible, since there is no previous line; so throw an exception
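The steps and error conditions above can be sketched in plain Perl, without Text::Parser at all. This is only an illustration of the bookkeeping a hand-written parser would need. The function name join_continued_lines, the two type strings, and the exact error messages are assumptions made for this sketch, not part of any library.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Parser): joins continued lines by hand.
# $type is 'join_next' (trailing back-slash, as in bash or Tcl) or
# 'join_last' (leading '+', as in SPICE).
sub join_continued_lines {
    my ( $type, @lines ) = @_;
    my @records;
    my $pending;    # temporary location for a joined line that is not yet complete

    for my $line (@lines) {
        chomp $line;
        if ( $type eq 'join_next' ) {
            # Strip a trailing back-slash; true if this line continues on the next
            my $continues = ( $line =~ s/\\\s*$// );
            $pending = defined $pending ? $pending . $line : $line;
            if ( !$continues ) {
                push @records, $pending;
                undef $pending;
            }
        }
        else {
            # Replace a leading '+' with one space; true if this line continues the last
            if ( $line =~ s/^\+\s*/ / ) {
                die "first line cannot be a continuation\n"
                    unless defined $pending;
                $pending .= $line;
            }
            else {
                push @records, $pending if defined $pending;
                $pending = $line;
            }
        }
    }

    if ( defined $pending ) {
        # A 'join_next' line that still expects more input is an error
        die "unexpected EOF\n" if $type eq 'join_next';
        push @records, $pending;
    }
    return @records;
}
```

Every parser for such a format would have to repeat this logic inside its own save_record, which is the duplication this extension is meant to remove.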
This is further complicated by the fact that some multi-line text formats indicate that the line continues after the current line (for example, with a back-slash character at the end of the line), whereas other text formats indicate that the current line is a continuation of the previous line. For example, in bash, Tcl, etc., the continuation character is \ (back-slash) which, if added to the end of a line of code, implies "there is more on the next line". In contrast, SPICE has a continuation character (+) at the beginning of the next line, indicating that the text on that line should be joined with the previous line.
This extension allows users to use the familiar save_record interface to save records, as if all the multi-line text inputs were joined.
OVERVIEW
To create a multi-line text parser you need to:
Determine if your parser is a 'join_next' type or a 'join_last' type. This depends on which line has the continuation character.
Recognize if a line has a continuation pattern
Know how to strip the continuation character and join with the last line
So here are the things you need to do if you have to write a multi-line text parser:
As usual, inherit from Text::Parser, never from this class (use parent 'Text::Parser')
Override the new constructor to add the multiline_type option by default. The option is described in the Text::Parser documentation.
Override the is_line_continued method to detect if there is a continuation character on the line.
Override the join_last_line method to join the previous line and the current line after stripping any continuation characters.
Implement your save_record as if you always get joined lines.
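Put together, the steps above might look like the following minimal sketch of a SPICE-like parser. The class name SpiceLikeParser is hypothetical, and the sketch assumes that the Text::Parser constructor accepts multiline_type as a key/value option and that the inherited save_record stores whatever single argument it is given. Check both against the Text::Parser documentation for your installed version.

```perl
package SpiceLikeParser;    # hypothetical class name, for illustration only
use strict;
use warnings;
use parent 'Text::Parser';

# Assumption: the base constructor takes key/value options.
sub new {
    my $class = shift;
    return $class->SUPER::new( multiline_type => 'join_last' );
}

# A line starting with '+' continues the previous line.
sub is_line_continued {
    my ( $self, $line ) = @_;
    return $line =~ /^\+/ ? 1 : 0;
}

# Drop the newline of the previous line, replace the leading '+' of the
# current line with a single space, and concatenate the two.
sub join_last_line {
    my ( $self, $last_line, $current_line ) = @_;
    chomp $last_line;
    $current_line =~ s/^\+\s*/ /;
    return $last_line . $current_line;
}

# Written as if every line is already joined: split into fields and store.
sub save_record {
    my ( $self, $line ) = @_;
    chomp $line;
    my @fields = split ' ', $line;
    return $self->SUPER::save_record( \@fields );
}

package main;
```

A caller would then create the object with SpiceLikeParser->new, pass a file name to read, and collect the results with get_records, exactly as with a single-line parser.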
REQUIRED METHODS
The following methods are required to compose this role into an object or a class. There are some default implementations for both these methods, but for most practical purposes you'd want to override those in your own parser class.
$self->is_line_continued($line)
Takes a string argument as input. Returns a boolean that indicates if the current line is continued from the previous line, or is continued on the next line (depending on the type of multi-line text format).
$self->join_last_line($last_line, $current_line)
Takes two string arguments. The first is the line previously read which is expected to be continued on this line. The function should return a string that has stripped any continuation characters, and joined the current line with the previous line.
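To make the two contracts concrete, here is a sketch of both methods for a 'join_next' format that uses a trailing back-slash, as in bash or Tcl. They are written as plain subs so the snippet runs on its own; in a real parser they would be methods of a class inheriting from Text::Parser, and the unused $self parameter is kept only to mirror that signature. The regular expressions are assumptions about the format, not something the role prescribes.

```perl
use strict;
use warnings;

# True if the line ends with a back-slash (optionally followed by whitespace
# or a newline), meaning it continues on the next line.
sub is_line_continued {
    my ( $self, $line ) = @_;
    return $line =~ /\\\s*$/ ? 1 : 0;
}

# Strip the continuation back-slash (and the newline after it) from the
# previous line, then append the current line unchanged.
sub join_last_line {
    my ( $self, $last_line, $current_line ) = @_;
    $last_line =~ s/\\\s*$//;
    return $last_line . $current_line;
}
```

If the current line itself ends with a back-slash, the returned string still carries it, so the next call to join_last_line strips it in turn.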
BUGS
Please report any bugs or feature requests on the bugtracker website http://rt.cpan.org/Public/Dist/Display.html?Name=Text-Parser or by email to bug-text-parser at rt.cpan.org.
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
AUTHOR
Balaji Ramasubramanian <balajiram@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2018 by Balaji Ramasubramanian.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.