NAME
Spreadsheet::Read::Ingester - ingest spreadsheets into a Perl data structure for faster, repeated processing
SYNOPSIS
use Spreadsheet::Read::Ingester;
# ingest raw file, store parsed data file, and return data object
my $data = Spreadsheet::Read::Ingester->new('/path/to/file');
# the returned data object has all the methods of a Spreadsheet::Read object
my $num_cols = $data->sheet(1)->maxcol;
# delete stored data files older than 30 days to save disk space
Spreadsheet::Read::Ingester->cleanup;
DESCRIPTION
This module is a simple wrapper for Spreadsheet::Read to make repeated ingestion of raw data files faster.
Processing spreadsheets and CSV files from raw data can be time consuming, especially with large data sets, and sometimes the same raw data file must be ingested repeatedly. This module saves time by ingesting and parsing the data once using Spreadsheet::Read and then immediately saving the parsed data to the user's home directory with Storable. Files are stored in the directory determined by File::UserConfig.
On subsequent ingestions of the original data file, Spreadsheet::Read::Ingester retrieves the data from the stored Perl data structure instead of the raw file.
Spreadsheet::Read::Ingester generates a unique file signature for the file, so if the original file changes, the data is reingested from the raw file and a new parsed file with a new signature is saved.
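The caching idea described above can be illustrated with core modules only. This is a minimal sketch, not the module's actual implementation: the helper names slow_parse, file_signature and ingest are hypothetical, the MD5 content digest is an assumed signature scheme, and a temporary directory stands in for the directory chosen by File::UserConfig.

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Spec;
use File::Temp qw(tempdir);
use Storable qw(store retrieve);

my $parse_count = 0;

# Hypothetical stand-in for the expensive Spreadsheet::Read parse:
# reads a simple comma-separated file into an array of rows.
sub slow_parse {
    my ($path) = @_;
    $parse_count++;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @rows;
    while ( my $line = <$fh> ) {
        chomp $line;
        push @rows, [ split /,/, $line ];
    }
    close $fh;
    return \@rows;
}

# Assumed signature scheme: an MD5 digest of the file content.
sub file_signature {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $sig = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $sig;
}

# Return the stored structure if one exists for this signature;
# otherwise parse the raw file and store the result with Storable.
sub ingest {
    my ( $path, $cache_dir ) = @_;
    my $cached = File::Spec->catfile( $cache_dir, file_signature($path) );
    return retrieve($cached) if -e $cached;
    my $data = slow_parse($path);
    store( $data, $cached );
    return $data;
}

# Helper for the demonstration: (over)write a file with the given text.
sub write_file {
    my ( $path, $text ) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $text;
    close $out;
}

my $work_dir  = tempdir( CLEANUP => 1 );
my $cache_dir = tempdir( CLEANUP => 1 );
my $file      = File::Spec->catfile( $work_dir, 'data.csv' );

write_file( $file, "a,b\n1,2\n" );
my $first  = ingest( $file, $cache_dir );    # parsed from the raw file
my $second = ingest( $file, $cache_dir );    # retrieved from the stored file

write_file( $file, "a,b\n1,3\n" );
my $third = ingest( $file, $cache_dir );     # content changed, so parsed again
```

Because the stored file is named after the signature, a changed raw file simply misses the cache and produces a second stored file, which is why old stored files accumulate until they are cleaned up.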
To access the data from the stored files and newly ingested files using the Spreadsheet::Read::Ingester object, consult the Spreadsheet::Read documentation for the methods it provides.
METHODS
new( $path_to_file )
my $data = Spreadsheet::Read::Ingester->new('/path/to/file');
Takes the same arguments as the new constructor in the Spreadsheet::Read module. Returns an object identical to the one returned by Spreadsheet::Read, with all of its corresponding methods.
cleanup( $days )
Spreadsheet::Read::Ingester->cleanup(0);
Deletes stored files from the user's application data directory. Takes an optional argument giving the minimum age, in days, a file must reach before it is deleted. Defaults to 30 days. Passing a value of 0 deletes all files.
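The age-based deletion described above might look roughly like the following. This is a sketch under stated assumptions rather than the module's own code: the function name cleanup_dir is hypothetical, file age is assumed to be measured from the modification time, and a temporary directory stands in for the application data directory.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Delete regular files in $dir whose modification time is at least
# $days days in the past. Returns the number of files removed.
sub cleanup_dir {
    my ( $dir, $days ) = @_;
    $days = 30 unless defined $days;    # default mirrors the documented 30 days
    my $removed = 0;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    for my $name ( readdir $dh ) {
        my $path = File::Spec->catfile( $dir, $name );
        next unless -f $path;           # skips '.', '..' and subdirectories
        my $age_days = ( time - ( stat $path )[9] ) / 86400;
        next if $age_days < $days;
        $removed += unlink $path;
    }
    closedir $dh;
    return $removed;
}

# Demonstration: one fresh file and one backdated to 40 days ago.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(fresh.dat stale.dat)) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot write $path: $!";
    close $fh;
}
my $stale = File::Spec->catfile( $dir, 'stale.dat' );
my $forty_days_ago = time - 40 * 86400;
utime $forty_days_ago, $forty_days_ago, $stale or die "utime failed: $!";

my $removed_default = cleanup_dir($dir);       # removes only stale.dat
my $removed_all     = cleanup_dir( $dir, 0 );  # removes everything left
```

Computing the age from stat rather than the -M file test keeps a threshold of 0 reliable, since -M measures age relative to script start and can be negative for files created during the run.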
REQUIRES
SUPPORT
Perldoc
You can find documentation for this module with the perldoc command.
perldoc Spreadsheet::Read::Ingester
Websites
The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources.
MetaCPAN
A modern, open-source CPAN search engine, useful to view POD in HTML format.
Source Code
The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :)
https://github.com/sdondley/Spreadsheet-Read-Ingester
git clone git://github.com/sdondley/Spreadsheet-Read-Ingester.git
BUGS AND LIMITATIONS
You can make new bug reports, and view existing ones, through the web interface at https://github.com/sdondley/Spreadsheet-Read-Ingester/issues.
INSTALLATION
See perlmodinstall for information and options on installing Perl modules.
SEE ALSO
AUTHOR
Steve Dondley <s@dondley.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2019 by Steve Dondley.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.