NAME

Spreadsheet::Reader::ExcelXML::Styles - The styles interface

SYNOPSIS

#!/usr/bin/env perl
use Data::Dumper;
use MooseX::ShortCut::BuildInstance qw( build_instance );
use Types::Standard qw( ConsumerOf HasMethods Int );
use Spreadsheet::Reader::ExcelXML::Error;
use Spreadsheet::Reader::ExcelXML::Styles;
use Spreadsheet::Reader::ExcelXML::XMLReader::PositionStyles;
use Spreadsheet::Reader::ExcelXML::XMLReader;
use Spreadsheet::Reader::Format::FmtDefault;
use Spreadsheet::Reader::Format::ParseExcelFormatStrings;
use Spreadsheet::Reader::Format;
my	$workbook_instance = build_instance(
		package	=> 'Spreadsheet::Reader::ExcelXML::Workbook',
		add_attributes =>{
			formatter_inst =>{
				isa	=> 	ConsumerOf[ 'Spreadsheet::Reader::Format' ],# Interface
				writer	=> 'set_formatter_inst',
				reader	=> 'get_formatter_inst',
				predicate => '_has_formatter_inst',
				handles => { qw(
						get_formatter_region			get_excel_region
						has_target_encoding				has_target_encoding
						get_target_encoding				get_target_encoding
						set_target_encoding				set_target_encoding
						change_output_encoding			change_output_encoding
						set_defined_excel_formats		set_defined_excel_formats
						get_defined_conversion			get_defined_conversion
						parse_excel_format_string		parse_excel_format_string
						set_date_behavior				set_date_behavior
						set_european_first				set_european_first
						set_formatter_cache_behavior	set_cache_behavior
						get_excel_region				get_excel_region
					),
				},
			},
			epoch_year =>{
				isa => Int,
				reader => 'get_epoch_year',
				default => 1904,
			},
			error_inst =>{
				isa => 	HasMethods[qw(
									error set_error clear_error set_warnings if_warn
								) ],
				clearer		=> '_clear_error_inst',
				reader		=> 'get_error_inst',
				required	=> 1,
				handles =>[ qw(
					error set_error clear_error set_warnings if_warn
				) ],
				default => sub{ Spreadsheet::Reader::ExcelXML::Error->new() },
			},
		},
		add_methods =>{
			get_empty_return_type => sub{ 1 },
		},
	);
my	$format_instance = build_instance(
		package => 'FormatInstance',
		superclasses => [ 'Spreadsheet::Reader::Format::FmtDefault' ],
		add_roles_in_sequence =>[qw(
				Spreadsheet::Reader::Format::ParseExcelFormatStrings
				Spreadsheet::Reader::Format
		)],
		target_encoding => 'latin1',# Adjust the string output encoding here
		workbook_inst => $workbook_instance,
	);
$workbook_instance->set_formatter_inst( $format_instance );
my	$test_instance	=	build_instance(
		package => 'StylesInterface',
		superclasses => ['Spreadsheet::Reader::ExcelXML::XMLReader'],
		add_roles_in_sequence => [
			'Spreadsheet::Reader::ExcelXML::XMLReader::PositionStyles',
			'Spreadsheet::Reader::ExcelXML::Styles',
		],
		file => '../../../../t/test_files/xl/styles.xml',,
		workbook_inst => $workbook_instance,
	);
print Dumper( $test_instance->get_format( 2 ) );

#######################################
# SYNOPSIS Screen Output
# 01: $VAR1 = {
# 02: 'cell_style' => {
# 03:     'builtinId' => '0',
# 04:     'xfId' => '0',
# 05:     'name' => 'Normal'
# 06: },
# 07: 'cell_font' => {
# 08:     'name' => 'Calibri',
# 09:     'family' => '2',
# 10:     'scheme' => 'minor',
# 11:     'sz' => '11',
# 12:     'color' => {
# 13:         'theme' => '1'
# 14:     }
# 15:  },
# 16: 'cell_fill' => {
# 17:     'patternFill' => {
# 18:         'patternType' => 'none'
# 19:      }
# 20: },
# 21: 'cell_border' => {
# 22:      'diagonal' => undef,
# 23:      'bottom' => undef,
# 24:      'right' => undef,
# 25:      'top' => undef,
# 26:      'left' => undef
# 27: },
# 28: 'cell_coercion' => bless( {
~~ Skipped 142 lines ~~
#170:                             'display_name' => 'Excel_date_164',
#171:							  'name' => 'DATESTRING',
#172:                           }, 'Type::Tiny' ),
#173: 'applyNumberFormat' => '1',
#174: };
#######################################

DESCRIPTION

This documentation is written to explain ways to use this module. To use the general package for excel parsing out of the box please review the documentation for Workbooks , Worksheets , and Cells.

This role is written as the interface for getting useful data from the sub file 'styles.xml' that is a member of a zipped (.xlsx) archive or a stand alone XML text file containing an equivalent subset of information in the 'Styles' node. The styles.xml file contains the format and display options used by Excel for showing the stored data. The SYNOPSIS shows the (very convoluted) way to get this interface wired up and working. Unless you are trying to rewrite this package don't pay attention to that. The package will build it for you. This interface doesn't hold any of the functionality it just mandates certain behaviors below it. The documentation is the explanation of how the final class should perform when the layers below are correctly implemented.

Method(s)

These are the methods mandated by this interface.

get_format( ($position|$name), [$header], [$exclude_header] )

    Definition: This will return the styles information from the identified $position (counting from zero) or $name. The target position is usually drawn from the cell data stored in the worksheet. The information is returned as a perl hash ref. Since the styles data is in two tiers it finds all the subtier information for each indicated piece and appends them to the hash ref as values for each type key.

    Accepts position 0: dependant on the role implementation; $position = an integer for the styles $position. (from Spreadsheet::Reader::ExcelXML::XMLReader::PositionStyles), $name = a (sub) node name indicating which styles node should be returned (from Spreadsheet::Reader::ExcelXML::XMLReader::NamedStyles)

    Accepts position 1: $header = the target header key (use the "Attributes" in Spreadsheet::Reader::ExcelXML::Cell that are cell formats as the definition of range for this.) It will cause only this header subset to be returned

    Accepts position 2: $exclude_header = the target header key (use the "Attributes" in Spreadsheet::Reader::ExcelXML::Cell that are cell formats as the definition of range for this.) It will exclude the header from the returned data set.

    Returns: a hash ref of data

get_default_format( [$header], [$exclude_header] )

    Definition: For any cell that does not have a unquely identified format excel generally stores a default format for the remainder of the sheet. This will return the two tiered default styles information. The information is returned in the same format as the get_format method.

    Accepts position 0: $header = the target header key (use the "Attributes" in Spreadsheet::Reader::ExcelXML::Cell that are cell formats as the definition of range for this.) It will cause only this header subset to be returned

    Accepts position 1: $exclude_header = the target header key (optional at position 2) (use the "Attributes" in Spreadsheet::Reader::ExcelXML::Cell that are cell formats as the definition of range for this.) It will exclude the header from the returned data set.

    Returns: a hash ref of data

loaded_correctly

    Definition: When building a styles reader it may be that the file is deformed. This is the way to know if the reader thought the file was good.

    Accepts: Nothing

    Returns: (1|0)

Attributes

Data passed to new when creating an instance with this interface. For modification of this(ese) attribute(s) see the listed 'attribute methods'. For more information on attributes see Moose::Manual::Attributes. The easiest way to modify this(ese) attribute(s) is during instance creation before it is passed to the workbook or parser.

file

    Definition: This attribute holds the file handle for the file being read. If the full file name and path is passed to the attribute the class will coerce that into an IO::File file handle.

    Default: no default - this must be provided to read a file

    Required: yes

    Range: any unencrypted styles.xml file name and path or IO::File file handle with that content.

    attribute methods Methods provided to adjust this attribute

      set_file

        Definition: change the file value in the attribute (this will reboot the file instance and lock the file)

      get_file

        Definition: Returns the file handle of the file even if a file name was passed

      has_file

        Definition: this is used to see if the file loaded correctly.

      clear_file

        Definition: this clears (and unlocks) the file handle

cache_positions

    Definition: Especially for sheets with lots of stored formats the parser can slow way down when accessing each postion. This is because the are not stored sequentially and the reader is a JIT linear parser. To go back it must restart and index through each position till it gets to the right place. This is especially true for excel sheets that have experienced any significant level of manual intervention prior to being read. This attribute sets caching (default on) for styles so the parser builds and stores all the styles settings at the beginning. If the file is cached it will close and release the file handle in order to free up some space. (a small win in exchange for the space taken by the cache).

    Default: 1 = caching is on

    Range: 1|0

    Attribute required: yes

    attribute methods Methods provided to adjust this attribute

SUPPORT

TODO

    1. Nothing yet

AUTHOR

Jed Lund
jandrew@cpan.org

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

This software is copyrighted (c) 2016 by Jed Lund

DEPENDENCIES

SEE ALSO