NAME

Spreadsheet::XLSX::Reader::LibXML::XMLReader::Worksheet - A LibXML::XMLReader worksheet base class

SYNOPSIS

	use Data::Dumper;
	use MooseX::ShortCut::BuildInstance qw( build_instance );
	use Types::Standard qw( Bool HasMethods );
	use Spreadsheet::XLSX::Reader::LibXML::Error;
	use Spreadsheet::XLSX::Reader::LibXML::XMLReader::Worksheet;
	my  $error_instance = Spreadsheet::XLSX::Reader::LibXML::Error->new( should_warn => 0 );
	my  $workbook_instance = build_instance(
	        package     => 'WorkbookInstance',
	        add_methods =>{
	            counting_from_zero         => sub{ return 0 },
	            boundary_flag_setting      => sub{},
	            change_boundary_flag       => sub{},
	            _has_shared_strings_file   => sub{ return 1 },
				get_shared_string_position => sub{},
				_has_styles_file           => sub{},
	            get_format_position        => sub{},
	            get_group_return_type      => sub{},
	            set_group_return_type      => sub{},
	            get_epoch_year             => sub{ return '1904' },
	            change_output_encoding     => sub{ $_[0] },
	            get_date_behavior          => sub{},
	            set_date_behavior          => sub{},
	            get_empty_return_type      => sub{ return 'undef_string' },
	            get_values_only            => sub{},
	            set_values_only            => sub{},
	        },
	        add_attributes =>{
	            error_inst =>{
	                isa => 	HasMethods[qw(
	                            error set_error clear_error set_warnings if_warn
	                        ) ],
	                clearer  => '_clear_error_inst',
	                reader   => 'get_error_inst',
	                required => 1,
				    handles =>[ qw(
						error set_error clear_error set_warnings if_warn
					) ],
	            },
	            empty_is_end =>{
	                isa     => Bool,
					writer  => 'set_empty_is_end',
	                reader  => 'is_empty_the_end',
	                default => 0,
	            },
	            from_the_edge =>{
	                isa     => Bool,
					reader  => '_starts_at_the_edge',
	                writer  => 'set_from_the_edge',
					default => 1,
	            },
			},
			error_inst => $error_instance,
	    );
	my $test_instance = Spreadsheet::XLSX::Reader::LibXML::XMLReader::Worksheet->new(
	        file              => 'xl/worksheets/sheet3.xml',
	        error_inst        => $error_instance,
	        sheet_name        => 'Sheet3',
	        workbook_instance => $workbook_instance,
	    );
	my $x = 0;
	my $result;
	while( $x < 20 and (!$result or $result ne 'EOF') ){
	    $result = $test_instance->_get_next_value_cell;
	    print "Collecting data from position: $x" . Dumper( $result );
		$x++;
	}

	###############################################
	# SYNOPSIS Screen Output
	# 01: Collecting data from position: 0
	# 02: $VAR1 = {
	# 03:           'r' => 'A2',
	# 04:           'row' => 2,
	# 05:           'col' => 1,
	# 06:           'v' => {
	# 07:                  'raw_text' => '0'
	# 08:                },
	# 09:           't' => 's'
	# 10:         };
	# 11: 
	# 12: Collecting data from position: 1
	# 13: $VAR1 = {
	# 14:           'r' => 'D2',
	# 15:           'row' => 2,
	# 16:           'col' => 4,
	# 17:           'v' => {
	# 18:                  'raw_text' => '2'
	# 19:                },
	# 20:           't' => 's'
	# 21:         };
	#		~~ Continuing ~~
	###############################################
	
    

DESCRIPTION

This documentation is written to explain ways to use this module when writing your own excel parser. To use the general package for excel parsing out of the box please review the documentation for Workbooks, Worksheets, and Cells

This module provides the basic connection to individual worksheet files for parsing xlsx workbooks. It does not provide a way to connect to chartsheets. It does not provide the final view of a given cell. The final view of the cell is collated with the role Spreadsheet::XLSX::Reader::LibXML::GetCell. This reader extends the base reader class Spreadsheet::XLSX::Reader::LibXML::XMLReader. The functionality provided by those modules is not covered here.

Modification of this module probably means extending a different reader or using other roles for implementation of the class. See lines 18 and on in the code here for the location to change and See line 54 in the code Spreadsheet::XLSX::Reader::LibXML for the way to repoint the package at a new module.

Attributes

Data passed to new when creating an instance. For access to the values in these attributes see the listed 'attribute methods'. For general information on attributes see Moose::Manual::Attributes. For ways to manage the instance when opened see the Public Methods. The remaining undocumented attributes are used internally for tracking state.

sheet_type

Definition: This will always be 'worksheet' for this module. It is provided as a simple introspection method for distinguishing between worksheets and chartsheets in case the circumstances are ambiguous.

Default: 'worksheet'

Range: 'worksheet'

attribute methods Methods provided to adjust this attribute

get_sheet_type

Definition: return the attribute value

sheet_rel_id

Definition: To coordinate information accross the various sub-files excel maintains a relId for sheets. This is the value that excel assigned to this sheet.

Range: a string

attribute methods Methods provided to adjust this attribute

rel_id

Definition: return the attribute value

sheet_id

Definition: When writing vbScript the sheet can be identified by a number instead of a name. This is that number.

Range: an integer

attribute methods Methods provided to adjust this attribute

sheet_id

Definition: return the attribute value

sheet_position

Definition: Even if there are chartsheets in the workbook you will only get a list of worksheets when using 'worksheet' methods. However, this position is the visible position of the worksheet in the workbook including chartsheets. This can be different than sheet_id

Range: an integer

attribute methods Methods provided to adjust this attribute

position

Definition: return the attribute value

sheet_name

Definition: This is the visible string expressed on the tab of the worksheet in the workbook.

Range: a String

attribute methods Methods provided to adjust this attribute

get_name

Definition: return the attribute value

_sheet_min_col

Definition: This is the minimum column in the sheet with data or formatting. For this module it is pulled from the xml file at worksheet/dimension:ref = "upperleft:lowerright"

Range: an integer

attribute methods Methods provided to adjust this attribute

_set_min_col

Definition: sets the attribute value

_min_col

Definition: returns the attribute value

has_min_col

Definition: attribute predicate

_sheet_min_row

Definition: This is the minimum row in the sheet with data or formatting. For this module it is pulled from the xml file at worksheet/dimension:ref = "upperleft:lowerright"

Range: an integer

attribute methods Methods provided to adjust this attribute

_set_min_row

Definition: sets the attribute value

_min_row

Definition: returns the attribute value

has_min_row

Definition: attribute predicate

_sheet_max_col

Definition: This is the maximum column in the sheet with data or formatting. For this module it is pulled from the xml file at worksheet/dimension:ref = "upperleft:lowerright"

Range: an integer

attribute methods Methods provided to adjust this attribute

_set_max_col

Definition: sets the attribute value

_max_col

Definition: returns the attribute value

has_max_col

Definition: attribute predicate

_sheet_max_row

Definition: This is the maximum row in the sheet with data or formatting. For this module it is pulled from the xml file at worksheet/dimension:ref = "upperleft:lowerright"

Range: an integer

attribute methods Methods provided to adjust this attribute

_set_max_row

Definition: sets the attribute value

_max_row

Definition: returns the attribute value

has_max_row

Definition: attribute predicate

_merge_map

Definition: This is an array ref of array refs where the first level represents rows and the second level of array represents cells. If a cell is merged then the merge span is stored in the row sub array position. This means the same span is stored in multiple positions. The data is stored in the Excel convention of count from 1 so the first position in both levels of the array are essentially placeholders. The data is extracted from the merge section of the worksheet at worksheet/mergeCells. That array is read and converted into this format for reading by this module when it first opens the worksheet..

Range: an array ref

attribute methods Methods provided to adjust this attribute

_set_merge_map

Definition: sets the attribute value

delegated methods This attribute uses the native trait 'Array'

_get_row_merge_map( $int ) => 'get'

Definition: returns the sub array ref representing any merges for that row. If no merges are available for that row it returns undef.

Public Methods

These are the methods provided by this class for use by the end user.

min_row

Definition: This returns the minimum row with data or formatting in the worksheet. It is separated from _min_row so that the package can modify the output between the functions to match the attribute "count_from_zero" in Spreadsheet::XLSX::Reader::LibXML

Accepts: nothing

Returns: the minimum row integer

max_row

Definition: This returns the maximum row with data or formatting in the worksheet. It is separated from _max_row so that the package can modify the output between the functions to match the attribute "count_from_zero" in Spreadsheet::XLSX::Reader::LibXML

Accepts: nothing

Returns: the maximum row integer

min_col

Definition: This returns the minimum column with data or formatting in the worksheet. It is separated from _min_col so that the package can modify the output between the functions to match the attribute "count_from_zero" in Spreadsheet::XLSX::Reader::LibXML

Accepts: nothing

Returns: the minimum column integer

max_col

Definition: This returns the maximum column with data or formatting in the worksheet. It is separated from _max_col so that the package can modify the output between the functions to match the attribute "count_from_zero" in Spreadsheet::XLSX::Reader::LibXML

Accepts: nothing

Returns: the maximum column integer

row_range

Definition: This returns the first and last row with data or formatting in the worksheet. It is separated from _min_row and _max_row so that the package can modify the output between the functions to match the attribute "count_from_zero" in Spreadsheet::XLSX::Reader::LibXML

Accepts: nothing

Returns: ( $min_row, $max_row ) as a list

col_range

Definition: This returns the first and last column with data or formatting in the worksheet. It is separated from _min_col and _max_col so that the package can modify the output between the functions to match the attribute "count_from_zero" in Spreadsheet::XLSX::Reader::LibXML

Accepts: nothing

Returns: ( $min_col, $max_col ) as a list

Private Methods

These are the methods provided by this class for use within the package but are not intended to be used by the end user. Other private methods not listed here are used in the module but not used by the package. If the private method is listed here then replacement of this module either requires replacing them or rewriting all the associated connecting roles and classes.

_load_unique_bits

Definition: This is called by Spreadsheet::XLSX::Reader::LibXML::XMLReader when the file is loaded for the first time so that file specific data can be collected. All the Attributes with a leading _ in the documentation are filled in at this point.

Accepts: nothing

Returns: nothing

_get_next_value_cell

Definition: This returns the worksheet file hash ref representation of the xml stored for the 'next' value cell. A cell is determined to have value based on the attribute "values_only" in Spreadsheet::XLSX::Reader::LibXML. Next is affected by the attribute "empty_is_end" in Spreadsheet::XLSX::Reader::LibXML. This method never returns an 'EOR' flag. It just wraps automatically.

Accepts: nothing

Returns: the cell (or value as requested)

_get_next_cell

Definition: This returns on every cell position whether there is data or not. For empty cells undef is returned. Next is affected by the attribute "empty_is_end" in Spreadsheet::XLSX::Reader::LibXML. This method never returns an 'EOR' flag. It just wraps automatically.

Accepts: nothing

Returns: undef, a value, or the cell

_get_col_row( $col, $row )

Definition: This is the way to return the information about a specific position in the worksheet. Since this is a private method it requires its inputs to be in the 'count from one' index.

Accepts: ( $column, $row ) - both required in that order

Returns: whatever is in that worksheet position as a hashref

_get_row_all( $row )

Definition: This is returns an array ref of each of the values in the row placed in their 'count from one' position. If the row is empty but it is not the end of the sheet then this will return an empty array ref.

Accepts: ( $row ) - required

Returns: an array ref

SUPPORT

github Spreadsheet::XLSX::Reader::LibXML/issues

TODO

1. Nothing yet

AUTHOR

Jed Lund
jandrew@cpan.org

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

This software is copyrighted (c) 2014, 2015 by Jed Lund

DEPENDENCIES

version

perl 5.010

Moose

MooseX::StrictConstructor

MooseX::HasDefaults::RO

Carp - confess

Type::Tiny - 1.000

Spreadsheet::XLSX::Reader::LibXML::XMLReader

Spreadsheet::XLSX::Reader::LibXML::CellToColumnRow

Spreadsheet::XLSX::Reader::LibXML::XMLReader::XMLToPerlData

Spreadsheet::XLSX::Reader::LibXML::GetCell

SEE ALSO

Spreadsheet::ParseExcel - Excel 2003 and earlier

Spreadsheet::XLSX - 2007+

Spreadsheet::ParseXLSX - 2007+

Log::Shiras

All lines in this package that use Log::Shiras are commented out