NAME

Tie::CSV_File - ties a csv-file to an array of arrays

SYNOPSIS

use Tie::CSV_File;

tie my @data, 'Tie::CSV_File', 'xyz.dat';
print "Data in 3rd line, 5th column: ", $data[2][4];
untie @data;

# or to read a tab-separated file
tie my @data, 'Tie::CSV_File', 'xyz.dat', sep_char     => "\t",
                                          quote_char   => undef,
                                          eol          => undef, # default
                                          escape_char  => undef,
                                          always_quote => 0;     # default
                                      
# or to read a simple whitespace-separated file
tie my @data, 'Tie::CSV_File', 'xyz.dat', sep_re       => qr/\s+/,
                                          sep_char     => ' ',
                                          quote_char   => undef,
                                          eol          => undef, # default
                                          escape_char  => undef,
                                          always_quote => 0;     # default

$data[1][3] = 4;
$data[-1][-1] = "last column in last line";

[NOT YET IMPLEMENTED]
push @data, [qw/Jan Feb Mar/];
delete $data[3][2];

DESCRIPTION

Tie::CSV_File represents a regular CSV file as a Perl array of arrays. The first dimension represents the line number in the original file, the second dimension the column number. Both indices start at 0. Negative indices work as with ordinary arrays, e.g. $data[-1][-1] is the last field in the last line, and @{$data[1]} is the list of columns of the second line.

An empty field has the value '', while a nonexistent field has the value undef. For example, given the file

"first field",,
"last field"

"the above line is empty"

we can say

$data[0][0] eq "first field"
$data[0][1] eq ""
!defined $data[0][2] 

$data[1][0] eq "last field"

@{$data[2]}  # is an empty list ()
!defined $data[2][0]

$data[3][0] eq "the above line is empty"

!defined $data[$x][$y] # for every $x > 3, $y any 
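
The difference between an empty and a nonexistent field can be tried out without the module. The following is a minimal sketch that uses a hand-built array of arrays as a stand-in for the tied array; the sample rows and the helper name field_state are assumptions for illustration and not part of Tie::CSV_File. The helper checks the bounds first, because a plain Perl array would otherwise autovivify a row when a field beyond the last line is read.

```perl
use strict;
use warnings;

# Stand-in for a tied @data: built by hand, not by Tie::CSV_File.
my @data = (
    [ 'first field', '' ],   # second field is present but empty
    [ 'last field' ],
    [],                      # an empty line has no fields at all
);

# Hypothetical helper: classify a field without autovivifying rows.
# Only non-negative indices are handled in this sketch.
sub field_state {
    my ($rows, $x, $y) = @_;
    return 'missing' if $x < 0 || $x > $#$rows;
    my $row = $rows->[$x];
    return 'missing' if $y < 0 || $y > $#$row;
    my $value = $row->[$y];
    return 'missing' unless defined $value;
    return $value eq '' ? 'empty' : 'filled';
}

print field_state(\@data, 0, 0), "\n";   # filled
print field_state(\@data, 0, 1), "\n";   # empty
print field_state(\@data, 2, 0), "\n";   # missing (empty line)
print field_state(\@data, 7, 3), "\n";   # missing (beyond the last line)
```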

Note that the data can also be changed. At the moment this is only tested with direct element access:

$data[0][0]   = "first line, first column";
$data[3][7]   = "anywhere in the world";
$data[-1][-1] = "last line, last column";

Any other form of write access is not implemented yet:

[NOT IMPLEMENTED YET]
$data[0] = ["Last name", "First name", "Address"];
push @data, ["Schleicher", "Janek", "Germany"];
my @header = @{ shift @data };

But it will be implemented soon.

Only a small part of the file is held in memory at any time, so this module also works for large files. Please see the Tie::File module for details, as it is used to read the lines of the file.

However, it is not suited to lines with large fields, as all fields of a line are parsed even if you only want one of them.
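
The two points above can be illustrated with Tie::File alone. This is only a sketch of the general idea, not the module's actual implementation: the file contents are made up, and the comma split is a naive stand-in for real CSV parsing (it ignores quoting).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Tie::File;

# Write a small sample file (contents are invented for this sketch).
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "a,b,c\n", "d,e,f\n", "g,h,i\n";
close $fh or die "close failed: $!";

# Tie::File exposes the file as an array of lines and fetches
# them on demand instead of loading the whole file.
tie my @lines, 'Tie::File', $filename
    or die "cannot tie $filename: $!";

# Naive column split, fine for unquoted data only. The whole line is
# split into fields even though just one field is wanted.
my @fields = split /,/, $lines[1], -1;
my $field  = $fields[2];
print "line 2, column 3: $field\n";

untie @lines;
```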

CSV options for tying

Similar to Text::CSV_XS, you can add the following options:

quote_char {default: "}
eol {default: undef}
sep_char {default: ,}
escape_char {default: "}
always_quote {default: 0}

Please read the documentation of Text::CSV_XS for details.

Note that the binary option isn't available.

To make it easier to work with files whose fields are not always separated by the same characters (e.g. sometimes one whitespace character, sometimes more), there is the additional sep_re option (defaults to undef).

If it is specified, sep_char is ignored when reading; instead, something similar to a split at the separator is done to find the fields.

E.g., you can say

tie my @data, 'Tie::CSV_File', 'xyz.dat', sep_re       => qr/\s+/,
                                          quote_char   => undef,
                                          eol          => undef, # default
                                          escape_char  => undef,
                                          always_quote => 0;     # default
                                      

to read something like

   PID TTY          TIME CMD
1200 pts/0    00:00:00 bash
1221 pts/0    00:00:01 nedit
1224 pts/0    00:00:01 nedit
1228 pts/0    00:00:06 nedit
1318 pts/0    00:00:01 nedit
1605 pts/0    00:00:00 ps

Note that the value of sep_re must be a regexp object, e.g. one generated with qr/.../. A simple string produces an error.

Note also that sep_char is still used to write data.
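
The split between reading with sep_re and writing with sep_char can be pictured with Perl's built-in split and join. This is a rough sketch of the idea only; the variable names are chosen for illustration, and how the module itself treats leading whitespace (as in the header line) is not stated here.

```perl
use strict;
use warnings;

my $sep_re   = qr/\s+/;   # pattern used when reading
my $sep_char = ' ';       # character used when writing

my @lines = (
    '   PID TTY          TIME CMD',
    '1200 pts/0    00:00:00 bash',
);

my (@rows, @written);
for my $line (@lines) {
    # Reading: split at every run of whitespace. With plain split,
    # leading whitespace yields an empty first field.
    my @fields = split $sep_re, $line;
    push @rows, \@fields;

    # Writing: the fields are joined with the single sep_char.
    push @written, join $sep_char, @fields;
}

print scalar @{ $rows[0] }, " fields in the header line\n";
print "$_\n" for @written;
```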

EXPORT

None by default.

TODO

Implement the missing features for indirect writable access.

Possibility to pass (memory) options when tying, like mode, memory, dw_size, similar to Tie::File.

Implement binary mode.

An option like filter => sub { s/\s+/ / } that would specify a routine called before a line is processed. Perhaps process would even be a sensible name for this option.

Create constants for tab-separated, whitespace-separated, ... files.

Warn if sep_char isn't matched with a specified sep_re.

SEE ALSO

Tie::File Text::CSV Text::CSV_XS

AUTHOR

Janek Schleicher, <big@kamelfreund.de>

COPYRIGHT AND LICENSE

Copyright 2002 by Janek Schleicher

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.