NAME
Spreadsheet::Edit::IO - convert between spreadsheet and csv files
SYNOPSIS
use Spreadsheet::Edit::IO
qw/convert_spreadsheet OpenAsCsv
cx2let let2cx @sane_CSV_read_options @sane_CSV_write_options/;
# Open a CSV file or result of converting a sheet from a spreadsheet
# DEPRECATED?
#
my $hash = OpenAsCsv("/path/to/spreadsheet.odt!Sheet1"); # single-arg form
my $hash = OpenAsCsv(inpath => "/path/to/spreadsheet.odt",
sheetname -> "Sheet1");
print "Reading ",$hash->csvpath()," with encoding ",$hash->encoding(),"\n";
while (<$hash->{fh}>) { ... }
# Convert CSV to spreadsheet
$hash = convert_spreadsheet(inpath => "mycsv.csv", cvt_to => "xls");
print "Resulting spreadsheet path is $hash->{outpath}\n";
# Convert a single sheet from a spreadsheet to CSV
$hash = convert_spreadsheet(inpath => "mywork.xls", sheetname => "Sheet1",
cvt_to => "csv");
open my $fh, (openhandle($hash->{outpath}) ? "<&" : "<"), $hash->{outpath};
binmode $fh, $hash->{iolayers};
...
# Convert all sheets in a spreadsheet to CSV files in a subdir
$hash = convert_spreadsheet(inpath => "mywork.xls", allsheets => 1,
cvt_to => "csv");
system "ls -l ".$hash->{outpath}; # show resulting .csv files
# Translate between 0-based column index and letter code (A, B, etc.)
print "The first column is column ", cx2let(0), " (duh!)\n";
print "The 100th column is column ", cx2let(99), "\n";
print "Column BF is index ", let2cx("BF"), "\n";
# Extract components of "filepath!SHEETNAME" specifiers
my $path = filepath_from_spec("/path/to/spreasheet.xls!Sheet1")
my $sheetname = sheetname_from_spec("/path/to/spreasheet.xls!Sheet1")
# Parse a csv file with sane options
my $csv = Text::CSV->new({ @sane_CSV_read_options, eol => $hash->{eol} })
or die "ERROR: ".Text::CSV->error_diag ();
my @rows
while (my $F = $csv->getline( $infh )) {
push @rows, $F;
}
close $infh or die "Error reading ", $hash->csvpath(), ": $!";
# Write a csv file with sane options
my $ocsv = Text::CSV->new({ @sane_CSV_write_options })
or die "ERROR: ".Text::CSV->error_diag ();
open my $outfh, ">:encoding(utf8)", $outpath
or die "$outpath: $!";
foreach (@rows) { $ocsv->print($outfh, $_) }
close $outfh or die "Error writing $outpath: $!";
DESCRIPTION
Convert between CSV and spreadsheet files using external programs, plus some utility functions
$hash = OpenAsCsv INPUT
$hash = OpenAsCsv inpath => INPUT, sheetname => SHEETNAME, ...
This is a thin wrapper for convert_spreadsheet
followed by open
(MAYBE 2B DEPRECATED?)
If a single argument is given it specifies INPUT; otherwise all arguments must be specified as key => value pairs, and may include any options supported by convert_spreadsheet
.
INPUT may be a csv or spreadsheet workbook path or an open filehandle to one of those; if a spreadsheet, then a single "sheet" is converted, specified by either a !SHEETNAME suffix in the INPUT path, or a separate sheetname => SHEETNAME
option.
The resulting file handle refers to a guaranteed-seekable CSV file; this will either be a temporary file (auto-removed at process exit), or the original INPUT if it was already a seekable csv file.
RETURNS: A ref to a hash containing the following:
{
fh => the resulting open file handle
iolayers => the iolayers (i.e. binmode arg) used by the file handle
csvpath => the path {fh} refers to, which might be a temporary file
inpath => original input path or open file handle
sheetname => sheet name if the input was a spreadsheet
tempdir => temporary directory, only if specified in input arguments
}
convert_spreadsheet INPUT, cvt_to=>suffix, OPTIONS
convert_spreadsheet INPUT, cvt_to=>"csv", allsheets => 1, OPTIONS
This converts from CSV to one of various spreadsheet formats or vice-versa.
RETURNS: A ref to a hash containing at least:
{
outpath => path of output file (or directory with 'allsheets')
iolayers => (i.e. binmode arg) needed to read output if CSV
inpath => path of original file with any !SHEETNAME suffix removed
sheetname => sheet name if the input was a spreadsheet
cvt_from => input file type, as specified or detected
cvt_to => output file type, as specified or derived from outpath
}
cvt_from
and cvt_to
are the filename suffixes (sans dot) of the corresponding file types, e.g. "csv", "xls", "xlsx", "odt" etc.
INPUT may be a path or a pre-opened file handle. If cvt_from
is is not specified then it is inferred from the INPUT path suffix, or if INPUT is a handle or has no suffix then the file content is examined.
If outpath => OUTPATH is specifed in OPTIONS then results are returned there. OUTPATH must have the appropriate file suffix (.csv .xls etc.) except with allsheets => 1
when OUTPATH, if specified, must be a directory path which will be created if it does not exist.
If outpath => OUTPATH is not specified (or is undef) then results are returned in temporary file(s) which are auto-deleted at process exit, except that if no conversion is necessary (cvt_from
is the same as cvt_to
) then INPUT itself is returned as outpath
.
Some vestigial support for spreadsheet formats exists but does not work well and is not documented here.
Other OPTIONS may include:
- sheetname => "sheet name"
-
The workbook 'sheet' name used when reading or writing a spreadsheet.
An input sheet name may also be specified as "!sheetname" appended to the INPUT path.
- allsheets => BOOL
-
Valid only with
cvt_to => 'csv'
. All sheets in the input spreadsheet are converted to separate .csv files.outpath
, if specified, must be a directory, which will be created if necessary; if not specified, then a new sub-directory of tempdir will be created to contain the the resulting .csv files. - verbose => BOOL
- use_gnumeric => BOOL # instead of libre/openoffice
- tempdir => "/path/to/dir"
-
If
tempdir
is not specified a temporary directory will be created and auto-removed when your process exits.
sane_CSV_read_options
sane_CSV_write_options
@Spreadsheet::Edit::IO:sane_CSV_read_options contains the options you will always want to use with Text::CSV objects->new().
Specifically, quotes and embedded newlines are handled correctly.
You can append overrides, e.g. auto_diag
or allow_whitespace
. If you change quote_char
, then escape_char
must be set to the same value.
cx2let COLUMNINDEX
let2cx LETTERCODE
Functions which translate between spreadsheet-column letter codes ("A", "B", etc.) and 0-based column indicies.
filepath_from_spec EXPR
sheetname_from_spec EXPR
Functions which decompose strings giving a spreadsheet path and possibly sheetname as "FILEPATH!SHEETNAME", "FILEPATH|||SHEETNAME", or "FILEPATH[SHEETNAME]". sheetname_from_spec
returns undef
if the input does not have a a sheetname suffix.
form_spec_with_sheetname(PATH, SHEENAME)
Composes a combined string in a "preferred" format ("PATH!SHEETNAME" or one of the others; which is not specified).