NAME

Text::Parser::Multiline - Adds multi-line support to the Text::Parser object.

VERSION

version 0.919

SYNOPSIS

use Text::Parser;

my $parser = Text::Parser->new(multiline_type => 'join_last');
$parser->read('filename.txt');
print $parser->get_records();
print scalar($parser->get_records()), " records were read although ",
    $parser->lines_parsed(), " lines were parsed.\n";

RATIONALE

Some text formats allow line-wrapping with a continuation character, usually to improve human readability. To handle such formats with only the native Text::Parser class, a derived class would need a save_record method that would:

  • Detect if the line is wrapped or is part of a wrapped line. To do this the developer has to implement a function named is_line_continued.

  • Join any wrapped lines to form a single line. For this, the developer has to implement a function named join_last_line.

With these two methods in place, the developer can implement save_record on the assumption that each line it receives is already unwrapped.
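
The two steps above can be illustrated with a minimal stand-alone sketch. It does not use Text::Parser and is not the module's internal implementation. The helper unwrap_lines is hypothetical, and the two subs are plain functions that only borrow the names of the methods described here. It assumes a backslash (\) continuation character and, as a further assumption, keeps any unfinished continuation at the end of input as a final record; a real parser might treat that case as an error.

```perl
use strict;
use warnings;

# Plain functions named after the two methods (assumption: backslash wrapping).
sub is_line_continued {
    my ($line) = @_;
    return $line =~ /\\\s*$/ ? 1 : 0;
}

sub join_last_line {
    my ($last, $line) = @_;
    $last =~ s/\s*\\\s*$//;    # drop the backslash and surrounding whitespace
    return "$last $line";
}

# Hypothetical helper: turns a list of raw lines into unwrapped records.
sub unwrap_lines {
    my @lines = @_;
    my @records;
    my $pending;    # text of a line that is still being continued
    for my $line (@lines) {
        chomp $line;
        $line = join_last_line($pending, $line) if defined $pending;
        if (is_line_continued($line)) {
            $pending = $line;
        }
        else {
            push @records, $line;
            undef $pending;
        }
    }
    # Assumption: an unfinished continuation is kept as the last record.
    push @records, $pending if defined $pending;
    return @records;
}

my @records = unwrap_lines("a = 1 + \\\n", "2 + \\\n", "3\n", "b = 4\n");
print "$_\n" for @records;
```

Here four input lines yield two records, which is the same kind of difference between records read and lines parsed that the SYNOPSIS reports.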

OVERVIEW

This role may be composed into an object of the Text::Parser class. To use this role, just set the multiline_type attribute. A derived class may set this in its constructor (or in BUILDARGS, if the class uses Moose). If this option is set, the developer should redefine the is_line_continued and join_last_line methods.
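
As a rough sketch of the arrangement described above, a derived class might look like the following. The class name MyContinuedParser is hypothetical, the backslash convention is an assumption, and overriding new in a plain (non-Moose) subclass is only one possible way to set multiline_type. The script takes the input file name from the command line and relies on the default save_record to store each unwrapped line.

```perl
package MyContinuedParser;    # hypothetical class name
use strict;
use warnings;
use parent 'Text::Parser';

# Turn on multi-line support for every instance of this class.
sub new {
    my ($class, %args) = @_;
    return $class->SUPER::new(%args, multiline_type => 'join_last');
}

# Assumption: a trailing backslash marks a line that continues on the next.
sub is_line_continued {
    my ($self, $line) = @_;
    return $line =~ /\\\s*$/ ? 1 : 0;
}

# Remove the continuation character and join the two lines with a space.
sub join_last_line {
    my ($self, $last, $line) = @_;
    $last =~ s/\s*\\\s*$//;
    return "$last $line";
}

package main;

my $parser = MyContinuedParser->new();
my $file   = shift @ARGV;
if (defined $file) {
    $parser->read($file);
    print $parser->get_records();
}
```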

ERRORS AND EXCEPTIONS

A parser that unwraps lines should also look for error conditions that can arise while doing so; see Text::Parser::Errors.

METHODS TO BE IMPLEMENTED

These methods should be implemented by the developer in the derived class. Text::Parser provides default implementations, but they may not handle your target text format.

$parser->is_line_continued($line)

Takes a string argument containing the current line (also available through the this_line method). Your implementation should return a boolean that indicates whether the current line is wrapped.

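sub is_line_continued {
    my ($self, $line) = @_;
    # True when the line ends with a backslash, ignoring trailing whitespace
    # (the line terminator counts as whitespace, so no chomp is needed).
    return $line =~ /\\\s*$/ ? 1 : 0;
}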

The above example method treats a line as continued if it ends with a backslash character (\), optionally followed by trailing whitespace.

$parser->join_last_line($last_line, $current_line)

Takes two string arguments. The first is the previously read line, which continues onto the next line (the second argument). The second argument should be identical to the return value of this_line. Neither argument will be undef. Your implementation should join the two strings, stripping any continuation character(s), and return the resulting string.

Here is an example implementation that joins the previous line, terminated by a backslash (\), with the present line:

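sub join_last_line {
    my ($self, $last, $line) = @_;
    # Strip the trailing backslash (and any whitespace after it) from the
    # previous line, then append the current line after a space.
    $last =~ s/\\\s*$//;
    return "$last $line";
}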

SEE ALSO

  • Text::Parser

  • Text::Parser::Errors

BUGS

Please report any bugs or feature requests on the bug tracker at http://github.com/balajirama/Text-Parser/issues

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

AUTHOR

Balaji Ramasubramanian <balajiram@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2018-2019 by Balaji Ramasubramanian.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.