NAME
TextFileParser - an extensible Perl class to parse any text file by specifying grammar in derived classes.
VERSION
version 0.1821905
SYNOPSIS
use strict;
use warnings;
use TextFileParser;
my $parser = TextFileParser->new;
$parser->read(shift @ARGV);
print $parser->get_records, "\n";
The above code reads a text file and prints its contents to STDOUT.
Here is another parser, derived from TextFileParser as the base class. It shows how little code a custom parser needs.
use strict;
use warnings;
package CSVParser;
use parent 'TextFileParser';
# Split each line on commas and store the fields as an array reference
sub save_record {
    my ($self, $line) = @_;
    chomp $line;
    my @fields = split /,/, $line;
    $self->SUPER::save_record(\@fields);
}
package main;
my $a_parser = CSVParser->new;
$a_parser->read(shift @ARGV);
foreach my $rec ($a_parser->get_records) {
    print $_, "\t" for @{$rec};
    print "\n";
}
METHODS
new
Takes no arguments. Returns a blessed reference to the object.
my $pars = TextFileParser->new;
read
Takes zero or one string argument: the name of the file to read. Throws an exception if the named file does not exist or cannot be read for any reason.
$pars->read($filename);
# The above is equivalent to the following
$pars->filename($filename);
$pars->read();
Returns once all records have been read, or earlier if an exception is thrown because of a parsing error. This method handles all open and close operations on the file, even if an exception is thrown.
use Try::Tiny;
try {
    $pars->read('myfile.txt');
} catch {
    print STDERR $_, "\n";
};    # Try::Tiny requires this trailing semicolon
You are better off not overriding this subroutine; override save_record instead. Intervening in the file open step is not possible for now. A new version will explain how you can do that.
filename
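Takes zero or one string argument with the name of a file. If an argument is given, it names the file that a later call to read with no arguments will open (see read above). Returns the name of the file that was last opened, if any, and undef if no file has been opened. This is most useful in generating error messages.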
lines_parsed
Takes no arguments. Returns the number of lines last parsed.
print $pars->lines_parsed, " lines were parsed\n";
This is also useful when generating error messages.
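As a rough sketch of how filename and lines_parsed might be combined in an error message, the hypothetical subclass below rejects blank lines. It assumes that lines_parsed already counts the current line while save_record runs, which this documentation does not guarantee. StrictParser, the message wording, and the sample data are all illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

package StrictParser;    # hypothetical subclass, for illustration only
use parent 'TextFileParser';

sub save_record {
    my ($self, $line) = @_;
    chomp $line;
    # Assumption: lines_parsed already includes the line being processed
    die sprintf("%s: blank line near line %d\n",
        $self->filename, $self->lines_parsed)
        if $line =~ /^\s*$/;
    $self->SUPER::save_record($line);
}

package main;

# Write a small sample file whose second line is blank
my ($fh, $sample) = tempfile(UNLINK => 1);
print {$fh} "first\n\nthird\n";
close $fh;

my $parser = StrictParser->new;
my $ok     = eval { $parser->read($sample); 1 };
my $error  = $ok ? '' : $@;
print STDERR $error unless $ok;
```

Using a plain eval keeps the sketch free of non-core dependencies; Try::Tiny, as shown under read, works equally well.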
save_record
Takes exactly one string argument. This method can be overridden in derived classes to extract the relevant information from each line and store records. In general, once the relevant data has been collected, you would call SUPER::save_record to store it. By default, this method saves the input string as the record.
See the Synopsis for how a derived class can write its own method to handle data.
package MyParser;
use parent 'TextFileParser';
sub save_record {
    my ($self, $line) = @_;
    # __extract_some_info is a placeholder for your own extraction routine
    my $data = __extract_some_info($line);
    $self->SUPER::save_record($data);
}
Here's an example of a parser that reads multi-line records: if a line starts with a '+' character, it is treated as a continuation of the previous line.
use strict;
use warnings;
package MultilineParser;
use parent 'TextFileParser';
sub save_record {
    my ($self, $line) = @_;
    chomp $line;
    # Lines that do not start with '+' begin a new record
    return $self->SUPER::save_record($line) if $line !~ /^[+]\s*/;
    # Strip the continuation marker and join with the previous record
    $line =~ s/^[+]\s*//;
    $self->__append_last_record($line);
}
sub __append_last_record {
    my ($self, $line) = @_;
    my $last_rec = $self->pop_record;
    $self->SUPER::save_record($last_rec . $line);
}
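The continuation rule described above can also be tried without the parser class. The sketch below applies it to an in-memory list of lines. join_continuations is a hypothetical helper and not part of TextFileParser; it also assumes that a '+' line with no preceding record should be kept as an ordinary record.

```perl
use strict;
use warnings;

# Hypothetical helper: merge lines starting with '+' into the previous record
sub join_continuations {
    my @records;
    for my $line (@_) {
        chomp(my $text = $line);
        if (@records and $text =~ s/^[+]\s*//) {
            $records[-1] .= $text;    # continuation: append to last record
        }
        else {
            push @records, $text;     # ordinary line: start a new record
        }
    }
    return @records;
}

my @joined = join_continuations("alpha\n", "+ beta\n", "gamma\n");
print "$_\n" for @joined;    # prints "alphabeta" then "gamma"
```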
get_records
Takes no arguments. Returns an array containing all the records that were read by the parser.
record_list_pointer
Takes no arguments and returns the reference to the array containing all the records. This may be useful if you want to re-order the records in some way.
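A minimal sketch of such a re-ordering follows. It assumes the returned reference points at the parser's live record array, so that changes made through it are visible to get_records; the sample file and its contents are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use TextFileParser;

# Write a small unsorted sample file
my ($fh, $sample) = tempfile(UNLINK => 1);
print {$fh} "pear\napple\nmango\n";
close $fh;

my $parser = TextFileParser->new;
$parser->read($sample);

# Sort the stored records in place through the array reference
my $records = $parser->record_list_pointer;
@{$records} = sort @{$records};

print $parser->get_records;
```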
last_record
Takes no arguments and returns the last saved record. Leaves the saved records untouched.
my $last_rec = $pars->last_record;
pop_record
Takes no arguments and returns the last saved record, removing it from the saved records in the process. To avoid losing data, call SUPER::save_record or simply save_record after a call to pop_record.
my $last_rec = $pars->pop_record;
my $uc_last = uc $last_rec;
$pars->save_record($uc_last);
AUTHOR
Balaji Ramasubramanian <balajiram@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2018 by Balaji Ramasubramanian.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.