NAME

Spreadsheet::XLSX::Reader::LibXML::ParseExcelFormatStrings - Parser of XLSX format strings

SYNOPSYS

#!/usr/bin/env perl
package MyPackage;
use Moose;

sub get_log_space{};

with 	'Spreadsheet::XLSX::Reader::LibXML::UtilFunctions',
		'Spreadsheet::XLSX::Reader::LibXML::FmtDefault';
# call 'with' a second time to ensure that the prior methods are recorded
with	'Spreadsheet::XLSX::Reader::LibXML::ParseExcelFormatStrings';

package main;

my	$parser 	= MyPackage->new( epoch_year => 1904 );
my	$conversion	= $parser->parse_excel_format_string( '[$-409]dddd, mmmm dd, yyyy;@' );
print 'For conversion named: ' . $conversion->name . "\n";
for my	$unformatted_value ( '7/4/1776 11:00.234 AM', 0.112311 ){
	print "Unformatted value: $unformatted_value\n";
	print "..coerces to: " . $conversion->assert_coerce( $unformatted_value ) . "\n";
}

###########################
# SYNOPSIS Screen Output
# 01: For conversion named: DATESTRING_0
# 02: Unformatted value: 7/4/1776 11:00.234 AM
# 03: ..coerces to: Thursday, July 04, 1776
# 04: Unformatted value: 0.112311
# 05: ..coerces to: Friday, January 01, 1904
###########################

DESCRIPTION

This documentation is written to explain ways to extend this package. To use the data extraction of Excel workbooks, worksheets, and cells please review the documentation for Spreadsheet::XLSX::Reader::LibXML, Spreadsheet::XLSX::Reader::LibXML::Worksheet, and Spreadsheet::XLSX::Reader::LibXML::Cell

This is a general purpose Moose Role that will convert Excel format strings into Type::Coercion objects in order to implement the conversion defined by the format string. It does allow for the format string to have up to four parts separated by semi-colons. The four parts are positive, zero, negative, and text. In the Excel version the text is just a pass through. Because excel does not record dates prior to 1900ish as numbers this will actually take the date format indicated by the number portion and try and use that output format to coerce a text date into the same values. All dates are processed into and then potentially back out of DateTime objects.

To replace this module just build a Moose::Role that has the following Primary Methods and Attributes. Then set the format_string_parser attribute with the new role name when initially starting Spreadsheet::XLSX::Reader::LibXML.

The decimal conversion to fractions initially uses the continued fraction algorythm to calculate a possible fraction for the pased $decimal value with the setting of 20 max iterations and a maximum denominator width defined by the format string. If that does not resolve satisfactorily it then calculates an over/under numerator with decreasing denominators from the maximum denominator (based on the format string) all the way to the denominator of 2 and takes the most accurate result. There is no early-out set in this computation so if you reach this point for multi digit denominators it is computationally intensive. (Not that continued fractions are computationally so cheap.). However, doing the calculation this way generally yields the same result as Excel. In some few cases the result is more accurate. I was unable to duplicate the results from Excel exactly.

requires

These are method(s) used by this Role but not provided by the role. Any class consuming this role will not build without first providing these methods prior to loading this role.

get_log_space

    Definition: Used to return the log space used by the code protected by ###LogSD. See Log::Shiras for more information.

get_excel_region

Primary Methods

These are the primary ways to use this Role. For additional ParseExcelFormatStrings options see the Attributes section.

parse_excel_format_string( $string )

    Definition: This is the method to convert Excel format strings into Type::Tiny objects with built in coercions. The type coercion objects are then used to convert unformatted values into formatted values using the assert_coerce method. Coercions built allow for the format string to have up to four parts separated by semi-colons. These four parts correlate to four different data input ranges. The four parts are positive, zero, negative, and text. If three substrings are sent then the data input is split to positive and zero, negative, and text. If two input types are sent the data input is split between numbers and text. One input type is a take all comers with the exception of dates. Dates always add a possible from-text conversion to process Excel pre-1900ish dates. This is because excel does not record dates prior to 1900ish as numbers. All dates are processed into and then potentially back out of DateTime objects. This requires deep or stacked coercions .

    Accepts: an Excel number format strings

    Returns: a Type::Tiny object with type coercions and pre-filters set for each input type from the formatting string

Attributes

Data passed to new when creating the Styles instance. For modification of these attributes see the listed 'attribute methods'. For more information on attributes see Moose::Manual::Attributes. Most of these are not exposed to the top level of Spreadsheet::XLSX::Reader::LibXML.

epoch_year

    Definition: This is the epoch year in the Excel sheet. It differentiates between Windows and Apple Excel implementations. For more information see DateTimeX::Format::Excel

    Default: 1900

    attribute methods Methods provided to adjust this attribute

      get_epoch_year

        Definition: returns the value of the attribute

datetime_dates

    Definition: It may be that you desire the full DateTime object rather than the finalized datestring when converting unformatted data to formatted date data. This attribute sets whether data coersions are built to do the full conversion or just to a DateTime object.

    Default: 0 = unformatted values are coerced completely to date strings (1 = stop at DateTime)

    attribute methods Methods provided to adjust this attribute

      get_date_behavior

        Definition: returns the value of the attribute

      set_date_behavior( $Bool )

        Definition: sets the attribute value (only new coercions are affected)

        Accepts: Boolean values

cache_formats

    Definition: In order to save re-building the coercion each time it is required, built coercions can be cached with the format string as the key. This attribute sets whether caching is turned on or not.

    Default: 1 = caching is on

    attribute methods Methods provided to adjust this attribute

      get_cache_behavior

        Definition: returns the value of the attribute

format_cash

    Definition: This is the format cache described in cache_formats. It stores pre-built formats for re-use.

    Default: {}

    attribute methods Methods provided to adjust this attribute

      has_cached_format( $format_string )

        Definition: returns true if the $format_string has a pre-built coersion already stored

      =set_cached_format( $format_string = $coercion )>

        Definition: sets the coersion object for $format_string key

      get_cached_format( $format_string )

        Definition: gets the coersion object stored against the $format_string key

SUPPORT

TODO

    1. Attempt to merge _split_decimal_integer and _integer_and_decimal

AUTHOR

Jed Lund
jandrew@cpan.org

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

This software is copyrighted (c) 2014 by Jed Lund

DEPENDENCIES

SEE ALSO