NAME
Excel-Template-XLSX - Create Excel .xlsx files starting from (one or more) template(s).
SYNOPSIS
use Excel::Template::XLSX;
my ($self, $workbook) = Excel::Template::XLSX->new('perl.xlsx', 'template1.xlsx', 'template2.xlsx', ...);
$self->parse_template();
# Add a worksheet, ... and anything else you would do with Excel::Writer::XLSX
$worksheet = $workbook->add_worksheet();
# Although Excel::Writer::XLSX says the workbook will automatically get
# closed during global destruction. This wrapper around Excel::Writer::XLSX may
# mess this up, and it is better to specifically close your workbook when you are done.
$workbook->close();
DESCRIPTION
This module is a companion to Excel::Writer::XLSX(EWX), or if you prefer, a wrapper to that module. It uses EWX as a base class. It opens an existing spreadsheet file (.xlsx format), and also creates a new EWX object. As it parses the template file(s), it calls EWX methods to re-create the template contents in the EWX object.
When parsing is complete, the workbook object is left open for the calling perl script to add additional content.
The purpose of this module is to separate the roles of content/presentation vs programming in an Excel document, in much the same way that HTML templating engines work. A user who is knowledgeable in Excel can create an Excel file for use as a template, without requiring the skill set of Perl or Excel::Writer::XLSX. Conversely, the Perl programmer who is creating dynamic content does not need design skills to layout the presentation in the template.
WHAT IT CAN DO
Cell Values (strings, numbers, dates, ... )
Cell Formulas
Cell Hyperlinks
Cell Formatting (borders, shading, fonts, font sizes, colors)
Column Widths
Row Widths
Headers and Footers
Simple template variables (via callback) See "template_callback"
WHAT IT CAN NOT DO
Excel::Template::Excel can not modify Excel files in place! It is not
intended to. Since the parser directly adds content to the EWX workbook object
as the contents are parsed, both the template, and the output file must
be open at the same time.
It may be possible to open the output file to a file handle, and
after parsing of the template is complete, write the contents of file
over the template. The author has not tried this.
It is not the design of this module to faithfully re-create the entire
contents of the template file in the EWX output. If you are using this
module to rewrite Excel files, you are on your own.
These items are completely dropped from the output file:
Images in the Sheet
Images in Headers/Footers
Charts
Shapes
Themes (gradients, fonts, fills, styles)
macros
modules (vba code)
And probably other things. See the tests (t directory of the distribution)
for examples of what does work.
SUBROUTINES AND METHODS
__podhead
Dummy subroutine to allow me to hide this pod documentation when using code folding in the editor.
new
Creates a new Excel::Template::XLSX object, and also creates a new Excel::Writer::XLSX object. A workbook object is created for the output file.
Returns the Template object, and the workbook object. Workbook object is also available as $self->{EWX}; If the caller is only expecting a single return value, then just the $self object is returned.
parse_template
Parses common elements of the Spreadsheet, such as themes, styles, and strings. These are stored in the main object ($self).
Finds each sheet in the workbook, and initiates parsing of each sheet.
Properties for the created workbook are set from the first template that has properties. Properties in subsequent workbooks are ignored.
template_callback
Place holder method for a callback routine to modify the content of the template before being written to the output spreadsheet.
This callback is activated for all shared string (both plain and rich text strings), and also for header/footer text.
The callback is supplied with the two parameters: The object name (since this is a method), and the text to be processed. This is passed as a reference to single scalar.
This method is called numerous times during processing (e.g. once for each unique string in the spreadsheet, so the user is advised to keep it efficient.
This callback approach does not force any particular templating system on the user. They are free to use whatever system they choose.
Note that templating can only do simple scalars. Complex templating (if-then- else, loops, etc) do not make sense in that the callback is supplied with the contents of a single cell. Having said that, remember that the full power of Excel::Writer::XLSX is available to the user to modify the template after it is processed.
# A snippet of code to replace [% template %] in the
# template spreadsheet with 'Output'
my ($self, $wbk) = Excel::Template::XLSX->new($output_xlsx, $template_xlsx);
use Template::Tiny;
my $template = Template::Tiny->new( TRIM => 1 );
$self->{template_callback} = sub {
my ($self, $textref) = @_;
$template->process($textref, { template => 'Output' }, $textref );
};
$self->parse_template();
_apply_tint
Applies tinting to a color object, if the tint attribute is encountered in parsing.
_base_path_for
Manipulates the path to a member in the zip file, to find the associated rels file.
_cell_to_row_col
Converts an A1 style cell reference to a row and column index.
_color
Parses color element (rgb, index, theme, and tint)
_extract_files
Called by parse_template to fetch the xml strings from the zip file. XML strings are parsed, except for worksheets. Individual worksheets are parsed separately.
_parse_alignment
Parses horizontal and vertical cell alignments in a sheet.
_parse_borders
Parses cell border and diagonal border styles. Called from _parse_styles. Returns an array of border styles, each one as a hash.
_parse_fills
Parses styles for cell fills (pattern, foreground and background colors. horizontal and horizontal and vertical cell alignments in a sheet.
Gradients are parsed, but since EWX does not support gradients, a pattern is substituted.
_parse_fonts
Parses font information (font name, size, super/sub scripts, alignment colors, underline, bold, italic, and strikeout attributes).
_parse_numbers
Parses styles for cell number formats (financial, decimal, exponential, date-time, ...)
_parse_protection
Parses locked and hidden attributes for a cell. These are only useful if the worksheet is locked.
This module does not lock the workbook or the worksheet.
_parse_shared_strings
Parses the shared strings file. Excel does not directly store string values with the cell, but stores an index into the shared strings table instead, to save memory, if a string value is referenced more than once. Shared strings also contain formatting if multiple formats are applied within a cell (See write_rich_string in EWX.
_parse_sheet
Parses an individual worksheet. This is done in two passes. See _parse_sheet_pass1 and _parse_sheet_pass2 for what elements are parsed. This is necessary because the parse order of XML::Twig callbacks are in the wrong order for some sheet information (header/footer information, hyperlinks, and merged cells).
_parse_sheet_pass1
Parses some elements in a worksheet ( pageMargins, headerFooter, hyperlinks, pageSetup, Merged Cells, Sheet Formatting Row and Column heights, Sheet selection, and Tab Color)
_parse_sheet_pass2
Parses cell contents (first by row, then by column). Cells can contain inline strings, string references, direct string values, formulas, and hyperlinks. Each cell may also contain formatting information. The format is in an index to formatting for borders, shading, alignment, font, and number formats.
_parse_final_pass
Make a final pass to process merge_ranges that do not have any cell contents. This is it for now, but more may be added in the future.
_parse_styles
Parses style information. Parses number formats directly. Calls subroutines to parse fonts, fills, and borders, alignment, and protection.
Finally, parses Cell Xfs elements to Combine fonts, borders, number formats, alignment, patterns, into a single format specification.
Calls EWX add_formats to create a format, and stores the format information in a FORMAT array within the object.
_parse_themes
Parses theme information. Some color settings are referenced by an index to the theme.
_parse_xml
Low level subroutine to parse an entire member of a zip file. Used for small files, such as xxx.xml.rels, where the entire file is parsed.
For larger files, XML::Twig::twig_handlers are used.
_rels_for
Returns the .rels file name for a sibling workbook or worksheet.
zzpodtail
Dummy subroutine to allow me to hide pod documentation when using code folding in the editor.
INSTALLATION
Install with CPAN
cpan Excel::Template::XLSX
or, use the standard Unix style installation.
Unzip and untar the module as follows:
tar -zxvf Excel::Template::XLSX-nnn.tar.gz
The module can be installed using the standard Perl procedure:
perl Makefile.PL
make
make test
make install # As sudo/root
BUGS
- Large spreadsheets may cause segfaults on perl 5.14 and earlier
-
This module internally uses XML::Twig, which makes it potentially subject to Bug #71636 for XML-Twig: Segfault with medium-sized document on perl versions 5.14 and below (the underlying bug with perl weak references was fixed in perl 5.15.5). The larger and more complex the spreadsheet, the more likely to be affected, but the actual size at which it segfaults is platform dependent. On a 64-bit perl with 7.6gb memory, it was seen on spreadsheets about 300mb and above. You can work around this adding
XML::Twig::_set_weakrefs(0)
to your code before parsing the spreadsheet, although this may have other consequences such as memory leaks.Please report any bugs to GitHub Issues at https://github.com/davidsclarke/Excel-Template-XLSX/issues.
SUPPORT
You can find this documentation for this module with the perldoc command.
perldoc Excel::Template::XLSX
You can also look for information at:
MetaCPAN
RT: CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Excel-Template-XLSX
Github
CPAN Ratings
DEBUGGING TIPS
Using the Perl debugger gets complicated because of XML::Twig. The objects created by XML::Twig are HUGE. Also, stepping through the code often results in exceeding a stack depth of >100. The author found it helpful to take advantage of the simplify() method in XML::Twig when using the debugger 'x' command to examine variables.
x $node->simplify()
Also, it is helpful to use the 'c' command to jump over XML::Twig subroutine calls and callbacks.
BUGS
Please report any bugs or feature requests to the author.
TO DO
Worksheet Activation
Table Formatting/Styles
Calculation Mode
REPOSITORY
The Excel::Template::XLSX source code is hosted on github: http://github.com/davidsclarke/Excel-Template-xlsx.
SEE ALSO
Excel::Writer::XLSX
This module does not provide much documentation on the capabilites of methods
for creating Excel content. The documentation provided with EWX is excellent,
and also has numerous examples included.
Spreadsheet::ParseXLSX
Although this module does not use Spreadsheet::ParseXLSX, the parsing and
comments regarding issues involved with parsing spreadsheets came from this module.
XML::Twig and Archive::Zip
Excel .xlsx files are zippped .xml files. These two modules are used to
unzip the .xlsx file, extract the members, and parse the relative portions
of the .xml files inside.
ACKNOWLEDGEMENTS
This module leverages the methods in Excel::Writer::XLSX, maintained by John McNamara to recreate the template.
The parser was developed using Spreadsheet::ParseXLSX as a starting point, maintained by Jesse Luehrs. This parser calls methods in EWX directly when a token is resolved rather than building up an object representing the parsed content.
LICENSE AND COPYRIGHT
Either the Perl Artistic Licence http://dev.perl.org/licenses/artistic.html or the GPL http://www.opensource.org/licenses/gpl-license.php.
AUTHOR
David Clarke dclarke@cpan.org