NAME
Spreadsheet::XLSX::Reader::LibXML::Styles - Read xlsx styles files with LibXML
SYNOPSIS
The following uses the 'styles.xml' file found in the t/test_files/xl/ folder. It also uses the XMLReader version of the Styles class. (Any version should do the same.)
#!/usr/bin/env perl
$|=1;
use Data::Dumper;
use MooseX::ShortCut::BuildInstance qw( build_instance );
use Spreadsheet::XLSX::Reader::LibXML::Error;
use Spreadsheet::XLSX::Reader::LibXML::XMLReader::Styles;
my $file_instance = build_instance(
package => 'StylesInstance',
superclasses => ['Spreadsheet::XLSX::Reader::LibXML::XMLReader::Styles'],
file => 'styles.xml', #( for excel zip format sheets )
error_inst => Spreadsheet::XLSX::Reader::LibXML::Error->new,
add_roles_in_sequence => [qw(
Spreadsheet::XLSX::Reader::LibXML::FmtDefault
Spreadsheet::XLSX::Reader::LibXML::ParseExcelFormatStrings
)],
);
print Dumper( $file_instance->get_format_position( 2 ) );
#######################################
# SYNOPSIS Screen Output
# 01: $VAR1 = {
# 02: 'applyNumberFormat' => '1',
# 03: 'fontId' => '0',
# 04: 'fonts' => {
# 05: 'color' => {
# 06: 'theme' => '1'
# 07: },
# 08: 'sz' => '11',
# 09: 'name' => 'Calibri',
# 10: 'scheme' => 'minor',
# 11: 'family' => '2'
# 12: },
# 13: 'numFmtId' => '164',
# 14: 'fillId' => '0',
# 15: 'xfId' => '0',
# 16: 'borders' => {
# 17: 'left' => 1,
# 18: 'right' => 1,
# 19: 'top' => 1,
# 20: 'diagonal' => 1,
# 21: 'bottom' => 1
# 22: },
# 23: 'borderId' => '0',
# 24: 'cellStyleXfs' => {
# 25: 'fillId' => '0',
# 26: 'fontId' => '0',
# 27: 'borderId' => '0',
# 28: 'numFmtId' => '0'
# 29: },
# 30: 'fills' => {
# 31: 'patternFill' => {
# 32: 'patternType' => 'none'
# 33: }
# 34: },
# 35: 'numFmts' => bless( {
# 36: 'name' => 'Excel_date_164',
# 37: 'uniq' => 86,
# 38: 'coercion' => bless( {
~ 180 lines hidden ~
# 219: }, 'Type::Coercion' )
# 220: }, 'Type::Tiny' )
# 221: };
#######################################
DESCRIPTION
This documentation is written to explain ways to use this module when writing your own excel parser or extending this package. To use the general package for excel parsing out of the box please review the documentation for Workbooks , Worksheets , and Cells.
This general class is written to get useful data from the sub file 'styles.xml' that is a member of a zipped (.xlsx) archive. The file to be read is generally found in the xl/ sub folder of the zip library. Sometimes it is found as a subset of a single xml tabular file. The styles.xml file contains the format and display options used by Excel for showing the stored data.
This documentation is for the general explanation of the Styles class. The example uses a class built with an XMLReader version at the core. Documentation specific to that parser can be found in the ~XMLReader::Styles documentation. To replace or augment this class you would need to understand how it is built on the fly using MooseX::ShortCut::BuildInstance. Next you should fork this code on github. Then add or change the parts you want and re-point the package to use the new elements in the correct circumstance using the $parser_modules variable maintained in the Spreadsheet::XLSX::Reader::LibXML class. (about line 35).
An Excel styles.xml file essentially contains two tiers of information. The first tier is a general collation of all style elements (borders, fonts, etc.) that apply to a specific style definition. The second tier is a listing of all settings for that style element. When building a format set for a specific cell all but the 'numFmts' elements are collected for informational purposes to be consumed 'as-is' by the end user of this package. The one exception to this is the 'numFmts' element which provides the definitions of how excel expects to transform stored numbers into some different string or number for display. In order to support the same view of a cell as provided by Excel this class also supplies methods to transform those numbers. The transformation build methods are extracted to the Spreadsheet::XLSX::Reader::LibXML::ParseExcelFormatStrings module. Excel also allows for localization so a number transformation will potientially look different in each localization option. As a consequence this class also has the localization element extracted to the Spreadsheet::XLSX::Reader::LibXML::FmtDefault module. Finally some parts of the package only want some parts of the format so a way to return only those sub format elements is also provided.
Required Method(s)
These are the primary way(s) to use this class. For additional Styles options see the Attributes section. All replacement classes must provide these methods. Methods used to manipulate the attributes are listed in each attribute.
get_format_position( $position, [$header] )
Definition: This will return the styles information from the identified $position (Counting from zero). the target position is usually drawn from the cell data stored in the worksheet. The information is returned as a perl hash ref. Since the styles data is in two tiers it finds all the subtier information for each indicated piece and appends them to the hash ref as values for each type key. If you only want a specific branch then you can add the branch $header key and the returned value will only contain that leg. If you know the second level position for that header then call get_sub_format_position as a quicker substitute.
Accepts: $position = an integer for the styles $position. (required)
Accepts: $header = the target header key (optional)
Returns: a hash ref of data
get_sub_format_position( $position, $header )
Definition: This will return the styles information from the identified $position (Counting from zero) for the specific $header. The information is returned as a perl hash ref. This call will not seek a second level just return the data from that header.
Accepts: $position = an integer for the styles $position (required)
Accepts: $header = a string for the header key (required)
Returns: a hash ref of data
get_default_format_position( $position, $header )
Definition: For any cell that does not have a unquely identified format excel generally stores a default format for the remainder of the sheet. This will return the two tiered default styles information. If you only want the default from a specific header then add the $header string to the method call. The information is returned as a perl hash ref.
Accepts: $header = a string for the header key (optional)
Returns: a hash ref of data
On-the-fly-roles
In order to let the end user define some behaviours for the styles implemented in this worksheet, roles are added to this class on the fly when the workbook is built for the first time. These roles are used to provide the functions "get_defined_excel_format( $integer )" in Spreadsheet::XLSX::Reader::LibXML::FmtDefault and "parse_excel_format_string( $string )" in Spreadsheet::XLSX::Reader::LibXML::ParseExcelFormatStrings that will be used by this class to get number formats (Type::Tiny Coercions). Changing which roles are supplied at run time is documented in the attributes "default_format_list" in Spreadsheet::XLSX::Reader::LibXML and "format_string_parser" in Spreadsheet::XLSX::Reader::LibXML To change the whole behaviour and interaction read paragraph 3 of the DESCRIPTION
Attributes
Data passed to new when creating an instance. For modification of these attributes see the listed 'attribute methods'. For more information on attributes see Moose::Manual::Attributes. It may be that these attributes migrate based on the reader type.
file
Definition: This needs to be the full file path to the styles file or an opened file handle . When set it will coerce to a file handle and then will open and read the primary settings in the styles.xml file and then maintain an open file handle for accessing specific styles position information.
Required: Yes
Default: none
Range an actual Excel 2007+ styles.xml file or open file handle (with the pointer set to the beginning of the file)
attribute methods Methods provided to adjust this attribute
get_file
Definition: Returns the value (file handle) stored in the attribute
set_file
Definition: Sets the value (file handle) stored in the attribute. Then triggers a read of the file level unique bits.
has_file
Definition: predicate for the attribute
error_inst
Definition: Currently all Styles readers require an Error instance. In general the package will share an error instance reference between the workbook and all classes built during the initial workbook build.
Required: Yes
Default: none
Range: The minimum list of methods to implement for your own instance is;
error set_error clear_error set_warnings if_warn
attribute methods Methods provided to adjust this attribute
get_error_inst
Definition: returns this instance
error
Definition: Used to get the most recently logged error
set_error
Definition: used to set a new error string
clear_error
Definition: used to clear the current error string in this attribute
set_warnings
Definition: used to turn on or off real time warnings when errors are set
if_warn
Definition: a method mostly used to extend this package and see if warnings should be emitted.
SUPPORT
TODO
1. Write a DOM version of the parser
AUTHOR
COPYRIGHT
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.
This software is copyrighted (c) 2014, 2015 by Jed Lund
DEPENDENCIES
SEE ALSO
Spreadsheet::ParseExcel - Excel 2003 and earlier
Spreadsheet::XLSX - 2007+
Spreadsheet::ParseXLSX - 2007+
All lines in this package that use Log::Shiras are commented out