NAME
Text::Parser::Multiline - Adds multi-line support to the Text::Parser object.
VERSION
version 0.919
SYNOPSIS
    use Text::Parser;

    my $parser = Text::Parser->new(multiline_type => 'join_last');
    $parser->read('filename.txt');
    print $parser->get_records();
    print scalar($parser->get_records()), " records were read although ",
        $parser->lines_parsed(), " lines were parsed.\n";
RATIONALE
Some text formats allow line-wrapping with a continuation character, usually to improve human readability. To handle such formats with the native Text::Parser class, the derived class would need a save_record method that would:

  * Detect if the line is wrapped or is part of a wrapped line. To do this, the developer has to implement a function named is_line_continued.

  * Join any wrapped lines to form a single line. For this, the developer has to implement a function named join_last_line.

With these two functions in place, the developer can implement save_record assuming that the line is already unwrapped.
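The division of labor described above can be sketched in plain Perl, independent of Text::Parser. This is only an illustration under assumptions: the unwrap function is hypothetical, the two code references stand in for is_line_continued and join_last_line, and a trailing backslash is assumed as the continuation character.

```perl
use strict;
use warnings;

# Hypothetical, format-agnostic unwrapping loop. The two code refs play the
# roles of is_line_continued and join_last_line.
sub unwrap {
    my ($is_continued, $join, @lines) = @_;
    my (@records, $pending);
    for my $line (@lines) {
        # Fold the current line into any partial record collected so far.
        my $current = defined $pending ? $join->($pending, $line) : $line;
        if ($is_continued->($current)) {
            $pending = $current;    # keep accumulating
        }
        else {
            push @records, $current;
            undef $pending;
        }
    }
    # A dangling continuation at the end of input is ignored in this sketch.
    return @records;
}

my @records = unwrap(
    sub { $_[0] =~ /\\\s*$/ },    # does the line end with a backslash?
    sub { my ($last, $line) = @_; $last =~ s/\s*\\\s*$//; "$last $line" },
    "a = 1 + \\", "2 + \\", "3", "b = 4",
);
print "$_\n" for @records;
```

Only the two callbacks know about the backslash; the loop itself never inspects the format, which is why a record-saving routine can treat every line it receives as already unwrapped.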
OVERVIEW
This role may be composed into an object of the Text::Parser class. To use this role, just set the multiline_type attribute. A derived class may set this in its constructor (or BUILDARGS if you use Moose). If this option is set, the developer should redefine the is_line_continued and join_last_line methods.
ERRORS AND EXCEPTIONS
Parsing multi-line input can run into the following error conditions (see Text::Parser::Errors):

  * If the end of file is reached while the line is still expected to be continued, an exception of Text::Parser::Errors::UnexpectedEof is thrown.

  * The first line of a text input cannot be wrapped from a previous line. If this condition occurs, an exception of Text::Parser::Errors::UnexpectedCont is thrown.
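The two conditions can be reproduced with a small standalone sketch. It does not use Text::Parser: the functions unwrap_trailing and unwrap_leading are hypothetical, the trailing-backslash and leading-plus conventions are assumed purely for illustration, and plain die strings stand in for the real exception classes.

```perl
use strict;
use warnings;

# Assumed convention 1: a trailing backslash means the line continues on the
# NEXT line.
sub unwrap_trailing {
    my @lines = @_;
    my (@records, $pending);
    for my $line (@lines) {
        my $current = defined $pending ? "$pending $line" : $line;
        if ($current =~ s/\s*\\\s*$//) {
            $pending = $current;    # marker stripped; wait for more input
        }
        else {
            push @records, $current;
            undef $pending;
        }
    }
    # Stand-in for Text::Parser::Errors::UnexpectedEof
    die "unexpected EOF: last line is still continued\n" if defined $pending;
    return @records;
}

# Assumed convention 2: a leading plus sign means the line continues the
# PREVIOUS line.
sub unwrap_leading {
    my @lines = @_;
    my @records;
    for my $line (@lines) {
        if ($line =~ s/^\+\s*//) {
            # Stand-in for Text::Parser::Errors::UnexpectedCont
            die "unexpected continuation on the first line\n" unless @records;
            $records[-1] .= " $line";
        }
        else {
            push @records, $line;
        }
    }
    return @records;
}

# Trigger each condition and capture the message instead of aborting.
my $eof_error  = eval { unwrap_trailing("a \\", "b \\"); 1 } ? '' : $@;
my $cont_error = eval { unwrap_leading("+ a", "b");      1 } ? '' : $@;
print $eof_error, $cont_error;
```

Which error can occur depends on where the format places its marker: a marker at the end of a line can only be left dangling at end of file, while a marker at the start of a line can only be misplaced on the very first line.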
METHODS TO BE IMPLEMENTED
These methods must be implemented by the developer in the derived class. There are default implementations provided in Text::Parser but they may not handle your target text format.
$parser->is_line_continued($line)
Takes a string argument containing the current line (also available through the this_line method) as input. Your implementation should return a boolean that indicates whether the current line is wrapped.
    sub is_line_continued {
        my ($self, $line) = @_;
        chomp $line;
        # True if the line ends with a backslash (trailing whitespace allowed)
        return $line =~ /\\\s*$/;
    }
The above example method checks whether a line is continued, using a backslash character (\) as the continuation marker.
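To see how such a check behaves, it can be exercised outside any parser object. The sketch below copies the example as a plain sub and passes undef for $self, which is unused; the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Standalone copy of the example check; $self is accepted but not used.
sub is_line_continued {
    my ($self, $line) = @_;
    chomp $line;
    return $line =~ /\\\s*$/ ? 1 : 0;
}

# Made-up sample lines: only a backslash at the very end (ignoring trailing
# whitespace) counts as a continuation marker.
for my $line ("total = 1 + \\\n", "2\n", "path\\to\\file\n") {
    (my $shown = $line) =~ s/\n//;
    printf "%-16s continued? %s\n", $shown,
        is_line_continued(undef, $line) ? 'yes' : 'no';
}
```

Backslashes in the middle of a line do not count, because the pattern is anchored to the end of the string.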
$parser->join_last_line($last_line, $current_line)
Takes two string arguments. The first is the previously read line, which is wrapped onto the next line (the second argument). The second argument should be identical to the return value of this_line. Neither argument will be undef. Your implementation should join the two strings, stripping any continuation character(s), and return the resulting string.
Here is an example implementation that joins the previous line terminated by a backslash (\) with the present line:
    sub join_last_line {
        my $self = shift;
        my ($last, $line) = (shift, shift);
        # Drop the trailing backslash and any whitespace after it
        $last =~ s/\\\s*$//g;
        return "$last $line";
    }
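A quick standalone check of this kind of join, again with the example copied as a plain sub and undef passed for the unused $self. The input strings are made up for illustration.

```perl
use strict;
use warnings;

# Standalone copy of the example join; $self is accepted but not used.
sub join_last_line {
    my $self = shift;
    my ($last, $line) = (shift, shift);
    $last =~ s/\\\s*$//g;    # \s* also swallows the newline after the backslash
    return "$last $line";
}

my $joined = join_last_line(undef, "total = 1 +\\\n", "2\n");
print $joined;
```

Because \s* also matches the newline that follows the backslash, the first line loses its line ending and the joined result keeps only the newline of the second line.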
SEE ALSO
BUGS
Please report any bugs or feature requests on the bugtracker website http://github.com/balajirama/Text-Parser/issues
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
AUTHOR
Balaji Ramasubramanian <balajiram@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2018-2019 by Balaji Ramasubramanian.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.