NAME
Spreadsheet::Edit::IO - convert between spreadsheet and csv files
SYNOPSIS
use Spreadsheet::Edit::IO
qw/convert_spreadsheet OpenAsCsv
cx2let let2cx @sane_CSV_read_options @sane_CSV_write_options/;
# Open a CSV file or result of converting a sheet from a spreadsheet
# DEPRECATED?
#
my $hash = OpenAsCsv("/path/to/spreadsheet.odt!Sheet1"); # single-arg form
my $hash = OpenAsCsv(inpath => "/path/to/spreadsheet.odt",
sheetname -> "Sheet1");
print "Reading ",$hash->{csvpath}," with encoding ",$hash->{encoding},"\n";
while (<$hash->{fh}>) { ... }
# Convert CSV to spreadsheet
$hash = convert_spreadsheet(inpath => "mycsv.csv", cvt_to => "xlsx");
print "Resulting spreadsheet path is $hash->{outpath}\n";
# Convert a single sheet from a spreadsheet to CSV
$hash = convert_spreadsheet(inpath => "mywork.xls", sheetname => "Sheet1",
cvt_to => "csv");
open my $fh, "<", $hash->{outpath};
binmode $fh, ":encoding(":crlf:encoding(".$hash->{encoding}.")");
...
# Convert all sheets in a spreadsheet to CSV files in a subdir
$hash = convert_spreadsheet(inpath => "mywork.xls", allsheets => 1,
cvt_to => "csv");
system "ls -l ".$hash->{outpath}; # show resulting .csv files
# Translate between 0-based column index and letter code (A, B, etc.)
print "The first column is column ", cx2let(0), " (duh!)\n";
print "The 100th column is column ", cx2let(99), "\n";
print "Column BF is index ", let2cx("BF"), "\n";
# Extract components of "filepath!SHEETNAME" specifiers
my $path = filepath_from_spec("/path/to/spreasheet.xls!Sheet1")
my $sheetname = sheetname_from_spec("/path/to/spreasheet.xls!Sheet1")
# Parse a csv file with sane options
my $csv = Text::CSV->new({ @sane_CSV_read_options, eol => $hash->{eol} })
or die "ERROR: ".Text::CSV->error_diag ();
my @rows
while (my $F = $csv->getline( $infh )) {
push @rows, $F;
}
close $infh or die "Error reading ", $hash->csvpath(), ": $!";
# Write a csv file with sane options
my $ocsv = Text::CSV->new({ @sane_CSV_write_options })
or die "ERROR: ".Text::CSV->error_diag ();
open my $outfh, ">:encoding(utf8)", $outpath
or die "$outpath: $!";
foreach (@rows) { $ocsv->print($outfh, $_) }
close $outfh or die "Error writing $outpath: $!";
DESCRIPTION
Convert between CSV and spreadsheet files using external programs, plus some utility functions
$hash = OpenAsCsv INPUT
$hash = OpenAsCsv inpath => INPUT, sheetname => SHEETNAME, ...
This is a thin wrapper for convert_spreadsheet
followed by open
(MAYBE 2B DEPRECATED?)
If a single argument is given it specifies INPUT; otherwise all arguments must be specified as key => value pairs, and may include any options supported by convert_spreadsheet
.
INPUT may be a csv or spreadsheet workbook path; if a spreadsheet, then a single "sheet" is converted, specified by either a !SHEETNAME suffix in the INPUT path, or a separate sheetname => SHEETNAME
option.
The resulting file handle refers to a guaranteed-seekable CSV file; this will either be a temporary file (auto-removed at process exit), or the original INPUT if it was already a seekable csv file.
RETURNS: A ref to a hash containing the following:
{
fh => the resulting open file handle
encoding => the encoding used for the .csv file
csvpath => the path {fh} refers to, which might be a temporary file
sheetname => sheet name if the input was a spreadsheet
}
convert_spreadsheet INPUT, cvt_to=>suffix, OPTIONS
convert_spreadsheet INPUT, cvt_to=>"csv", allsheets => 1, OPTIONS
This converts between CSV and various spreadsheet.
RETURNS: A ref to a hash containing:
{
outpath => path to the output file (or directory with 'allsheets')
(a temporary file/dir or as you specified in OPTIONS).
encoding => the encoding used when writing .csv files
}
INPUT is the input file path; it may be a separate first argument as shown above, or else included in OPTIONS as inpath => INPUT
.
If outpath => OUTPATH
is specifed then results are always saved to that path. With allsheets
this is a directory, which will be created if necessary.
If outpath
is NOT specified in OPTIONS then, with one exception, results are saved to a temporary file or directory and the path returned as outpath
in the result hash. The exception is if no conversion is necessary (i.e. cvt_from
is the same as cvt_to
), when the input file itself is returned as outpath
.
In all cases outpath
in the result hash points to the results.
cvt_from
and cvt_to
are the filename suffixes (sans dot) of the corresponding file types, e.g. "csv", "xls", "xlsx", "odt" etc. These need not be specified when they can be inferred from INPUT or outpath
respectively.
Some vestigial support for spreadsheet formats exists but does not work well and is not documented here.
OPTIONS may also include:
- sheetname => "sheet name"
-
The workbook 'sheet' name used when reading or writing a spreadsheet.
An input sheet name may also be specified as "!sheetname" appended to the INPUT path.
- allsheets => BOOL
-
Valid only with
cvt_to => 'csv'
. All sheets in the input spreadsheet are converted to separate .csv files named "SHEETNAME.csv" in the 'outpath' directory. - input_encoding => ENCODING
-
Specifies the encoding of INPUT if it is a csv file.
ENCODING may be a comma-separated list of encoding names which will be tried in the order until one seems to work (requires pre-reading the input file). If only one encoding is specified it will be used without trying it first. The default is "UTF-8,UTF-16BE,UTF-16LE,windows-1252".
- output_encoding => ENCODING
-
Specifies the encoding to use when writing csv file(s). The default it 'UTF-8'.
- verbose => BOOL
- use_gnumeric => BOOL # instead of libre/openoffice (DEPRECATED)
'binmode' Argument For Reading result CSVs
It is not possible to control the line-ending style in output CSV files, but the following incantation will correctly read either DOS/Windows (CR,LF) or *nix (LF) line endings as a single \n:
open my $fh, "<", $resulthash->{outpath};
my $enc = $resulthash->{encoding};
binmode($fh, ":crlf:encoding($enc)");
@sane_CSV_read_options
@sane_CSV_write_options
These contain options you will always want to use with Text::CSV->new()
. Specifically, quotes and embedded newlines are handled correctly.
Not exported by default.
cx2let COLUMNINDEX
let2cx LETTERCODE
Functions which translate between spreadsheet-column letter codes ("A", "B", etc.) and 0-based column indicies. Not exported by default.
filepath_from_spec EXPR
sheetname_from_spec EXPR
Functions which decompose strings containing a spreadsheet path and possibly sheetname suffix, of the form "FILEPATH!SHEETNAME", "FILEPATH|||SHEETNAME", or "FILEPATH[SHEETNAME]". sheetname_from_spec
returns undef
if the input does not have a a sheetname suffix. Not exported by default.
form_spec_with_sheetname(PATH, SHEENAME)
Composes a combined string in a "preferred" format (currently "PATH!SHEETNAME"). Not exported by default.