NAME

Spreadsheet::Edit::IO - convert between spreadsheet and csv files

SYNOPSIS

use Spreadsheet::Edit::IO
      qw/convert_spreadsheet OpenAsCsv
         cx2let let2cx @sane_CSV_read_options @sane_CSV_write_options/;

# Open a CSV file or result of converting a sheet from a spreadsheet 
# DEPRECATED? 
#
my $hash = OpenAsCsv("/path/to/spreadsheet.odt!Sheet1");  # single-arg form
my $hash = OpenAsCsv(inpath => "/path/to/spreadsheet.odt", 
                     sheetname -> "Sheet1"); 
print "Reading ",$hash->csvpath()," with encoding ",$hash->encoding(),"\n";
while (<$hash->{fh}>) { ... }

# Convert CSV to spreadsheet
$hash = convert_spreadsheet(inpath => "mycsv.csv", cvt_to => "xls");
print "Resulting spreadsheet path is $hash->{outpath}\n";

# Convert a single sheet from a spreadsheet to CSV
$hash = convert_spreadsheet(inpath => "mywork.xls", sheetname => "Sheet1", 
                            cvt_to => "csv");
open my $fh, (openhandle($hash->{outpath}) ? "<&" : "<"), $hash->{outpath};
binmode $fh, $hash->{iolayers};
...

# Convert all sheets in a spreadsheet to CSV files in a subdir
$hash = convert_spreadsheet(inpath => "mywork.xls", allsheets => 1,
                            cvt_to => "csv");
system "ls -l ".$hash->{outpath};  # show resulting .csv files

# Translate between 0-based column index and letter code (A, B, etc.)
print "The first column is column ", cx2let(0), " (duh!)\n";
print "The 100th column is column ", cx2let(99), "\n";
print "Column BF is index ", let2cx("BF"), "\n";

# Extract components of "filepath!SHEETNAME" specifiers
my $path      = filepath_from_spec("/path/to/spreasheet.xls!Sheet1")
my $sheetname = sheetname_from_spec("/path/to/spreasheet.xls!Sheet1")

# Parse a csv file with sane options
my $csv = Text::CSV->new({ @sane_CSV_read_options, eol => $hash->{eol} })
  or die "ERROR: ".Text::CSV->error_diag ();
my @rows
while (my $F = $csv->getline( $infh )) {
  push @rows, $F;
}
close $infh or die "Error reading ", $hash->csvpath(), ": $!";

# Write a csv file with sane options
my $ocsv = Text::CSV->new({ @sane_CSV_write_options })
  or die "ERROR: ".Text::CSV->error_diag ();
open my $outfh, ">:encoding(utf8)", $outpath
  or die "$outpath: $!";
foreach (@rows) { $ocsv->print($outfh, $_) }
close $outfh or die "Error writing $outpath: $!";

DESCRIPTION

Convert between CSV and spreadsheet files using external programs, plus some utility functions

$hash = OpenAsCsv INPUT

$hash = OpenAsCsv inpath => INPUT, sheetname => SHEETNAME, ...

This is a thin wrapper for convert_spreadsheet followed by open (MAYBE 2B DEPRECATED?)

If a single argument is given it specifies INPUT; otherwise all arguments must be specified as key => value pairs, and may include any options supported by convert_spreadsheet.

INPUT may be a csv or spreadsheet workbook path or an open filehandle to one of those; if a spreadsheet, then a single "sheet" is converted, specified by either a !SHEETNAME suffix in the INPUT path, or a separate sheetname => SHEETNAME option.

The resulting file handle refers to a guaranteed-seekable CSV file; this will either be a temporary file (auto-removed at process exit), or the original INPUT if it was already a seekable csv file.

RETURNS: A ref to a hash containing the following:

{
 fh        => the resulting open file handle
 iolayers  => the iolayers (i.e. binmode arg) used by the file handle
 csvpath   => the path {fh} refers to, which might be a temporary file
 inpath    => original input path or open file handle
 sheetname => sheet name if the input was a spreadsheet
 tempdir   => temporary directory, only if specified in input arguments
}

convert_spreadsheet INPUT, outpath=>OUTPATH, OPTIONS

convert_spreadsheet INPUT, cvt_to=>suffix, OPTIONS

convert_spreadsheet INPUT, cvt_to=>"csv", allsheets => 1, OPTIONS

This converts from CSV to one of various spreadsheet formats or vice-versa.

RETURNS: A ref to a hash containing at least:

{
 outpath   => path of output file (or directory with 'allsheets')
 iolayers  => (i.e. binmode arg) needed to read output if CSV
 inpath    => path of original file with any !SHEETNAME suffix removed
 sheetname => sheet name if the input was a spreadsheet
 cvt_from  => input file type, as specified or detected
 cvt_to    => output file type, as specified or derived from outpath
}

cvt_from and cvt_to are the filename suffixes (sans dot) of the corresponding file types, e.g. "csv", "xls", "xlsx", "odt" etc.

INPUT may be a path or a pre-opened file handle. If cvt_from is is not specified then it is inferred from the INPUT path suffix, or if INPUT is a handle or has no suffix then the file content is examined.

If OUTPATH is not specified then results are returned in temporary file(s) which are auto-deleted at process exit, except that if no conversion is necessary (cvt_from is the same as cvt_to) then INPUT itself is returned as outpath.

If OUTPATH is specifed then it must have the appropriate file suffix (.csv .xls etc.) except with allsheets => 1 when OUTPATH may give a directory path which will be created if it does not exist.

Some vestigial support for spreadsheet formats exists but does not work well and is not documented here.

OPTIONS may include:

sheetname => "sheet name"

The workbook 'sheet' name used when reading or writing a spreadsheet.

An input sheet name may also be specified as "!sheetname" appended to the INPUT path.

allsheets => BOOL

Valid only with cvt_to => 'csv'. All sheets in the input spreadsheet are converted to separate .csv files. outpath, if specified, must be a directory, which will be created if necessary; if not specified, then a new sub-directory of tempdir will be created to contain the the resulting .csv files.

verbose => BOOL
use_gnumeric => BOOL # instead of libre/openoffice
tempdir => "/path/to/dir"

If tempdir is not specified a temporary directory will be created and auto-removed when the last reference goes out of scope.

sane_CSV_read_options

sane_CSV_write_options

@Spreadsheet::Edit::IO:sane_CSV_read_options contains the options you will always want to use with Text::CSV objects->new().

Specifically, quotes and embedded newlines are handled correctly.

You can append overrides, e.g. auto_diag or allow_whitespace. If you change quote_char, then escape_char must be set to the same value.

cx2let COLUMNINDEX

let2cx LETTERCODE

Functions which translate between spreadsheet-column letter codes ("A", "B", etc.) and 0-based column indicies.

filepath_from_spec EXPR

sheetname_from_spec EXPR

Functions which decompose strings giving a spreadsheet path and possibly sheetname as "FILEPATH!SHEETNAME", "FILEPATH|||SHEETNAME", or "FILEPATH[SHEETNAME]". sheetname_from_spec returns undef if the input does not have a a sheetname suffix.

form_spec_with_sheetname(PATH, SHEENAME)

Composes a combined string in a "preferred" format ("PATH!SHEETNAME" or one of the others; which is not specified).