NAME

Data::Walk::Extracted - An extracted dataref walker

SYNOPSIS

    #!perl
	use Modern::Perl;
	use YAML::Any;
	use Moose::Util qw( with_traits );
	use Data::Walk::Extracted v0.011;
	use Data::Walk::Print v0.009;

	$| = 1;

	#Use YAML to compress writing the data ref
	my  $firstref = Load(
		'---
		Someotherkey:
			value
		Parsing:
			HashRef:
				LOGGER:
					run: INFO
		Helping:
			- Somelevel
			- MyKey:
				MiddleKey:
					LowerKey1: lvalue1
					LowerKey2:
						BottomKey1: 12345
						BottomKey2:
						- bavalue1
						- bavalue2
						- bavalue3'
	);
	my  $secondref = Load(
		'---
		Someotherkey:
			value
		Helping:
			- Somelevel
			- MyKey:
				MiddleKey:
					LowerKey1: lvalue1
					LowerKey2:
						BottomKey2:
						- bavalue1
						- bavalue2
						BottomKey1: 12354'
	);
	my $AT_ST = with_traits( 
			'Data::Walk::Extracted', 
			( 'Data::Walk::Print' ),
		)->new(
			match_highlighting => 1,#This is the default
		);
	$AT_ST->print_data(
		print_ref	=>  $firstref,
		match_ref	=>  $secondref,
		sort_HASH	=> 1,#To force order for demo purposes
	);
    
    ############################################################################
    #     Output of SYNOPSIS
    # 01:{#<--- Ref Type Match
    # 02:	Helping => [#<--- Secondary Key Match - Ref Type Match
    # 03:		'Somelevel',#<--- Secondary Position Exists - Secondary Value Matches
    # 04:		{#<--- Secondary Position Exists - Ref Type Match
    # 05:			MyKey => {#<--- Secondary Key Match - Ref Type Match
    # 06:				MiddleKey => {#<--- Secondary Key Match - Ref Type Match
    # 07:					LowerKey1 => 'lvalue1',#<--- Secondary Key Match - Secondary Value Matches
    # 08:					LowerKey2 => {#<--- Secondary Key Match - Ref Type Match
    # 09:						BottomKey1 => '12345',#<--- Secondary Key Match - Secondary Value Does NOT Match
    # 10:						BottomKey2 => [#<--- Secondary Key Match - Ref Type Match
    # 11:							'bavalue1',#<--- Secondary Position Exists - Secondary Value Matches
    # 12:							'bavalue2',#<--- Secondary Position Exists - Secondary Value Does NOT Match
    # 13:							'bavalue3',#<--- Secondary Position Does NOT Exist - Secondary Value Does NOT Match
    # 14:						],
    # 15:					},
    # 16:				},
    # 17:			},
    # 18:		},
    # 19:	],
    # 20:	Parsing => {#<--- Secondary Key Mismatch - Ref Type Mismatch
    # 21:		HashRef => {#<--- Secondary Key Mismatch - Ref Type Mismatch
    # 22:			LOGGER => {#<--- Secondary Key Mismatch - Ref Type Mismatch
    # 23:				run => 'INFO',#<--- Secondary Key Mismatch - Secondary Value Does NOT Match
    # 24:			},
    # 25:		},
    # 26:	},
    # 27:	Someotherkey => 'value',#<--- Secondary Key Match - Secondary Value Matches
    # 28:},
    ##############################################################################

    

DESCRIPTION

This module takes a data reference (or two) and recursively travels through it (or them). Where the two references diverge, the walker follows the primary data reference. At the beginning and end of each "node" the code will attempt to call a method using data from the current location of the node.

Definitions

node

Each branch point of a data reference is considered a node. The original top level reference is the 'zeroth' node.
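As a rough illustration of what "walking nodes" means, the following stand-alone sketch recurses through a data reference and calls one code ref on entering each node and another on leaving it. It is not the module's implementation: the name walk_sketch and the use of code refs instead of named methods are assumptions made only to keep the example short and free of Moose.

```perl
use strict;
use warnings;

# Hypothetical stand-alone walker: visits every node of $data, calling
# $before on the way down and $after on the way back up.
sub walk_sketch {
    my ( $data, $before, $after, $level ) = @_;
    $level //= 0;    # the top level reference is the 'zeroth' node
    $before->( $data, $level ) if $before;
    if ( ref $data eq 'HASH' ) {
        for my $key ( sort keys %$data ) {
            walk_sketch( $data->{$key}, $before, $after, $level + 1 );
        }
    }
    elsif ( ref $data eq 'ARRAY' ) {
        for my $item (@$data) {
            walk_sketch( $item, $before, $after, $level + 1 );
        }
    }
    $after->( $data, $level ) if $after;
    return $data;
}

# Record the order in which nodes are entered and left.
my @visited;
walk_sketch(
    { Helping => [ 'Somelevel', { MyKey => 'value' } ] },
    sub { my ( $node, $level ) = @_; push @visited, "in:$level" },
    sub { my ( $node, $level ) = @_; push @visited, "out:$level" },
);
```

Every "in" entry is paired with a later "out" entry at the same level, which is the point where an after method would get its turn.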

Caveat utilitor

This is not an extension of Data::Walk.

This module uses the 'defined or' operator ( //= ) and so requires perl 5.010 or higher.
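For readers who have not met the operator, here is a small sketch of how '//=' differs from '||='. The %settings hash and its values are made up for the illustration and are not taken from the module.

```perl
use strict;
use warnings;

my %settings = ( sort_HASH => 0 );

# '//=' assigns only when the left side is undefined, so an explicit 0
# (a defined but false value) is left alone.
$settings{sort_HASH}  //= 1;    # stays 0
$settings{sort_ARRAY} //= 1;    # was undefined, becomes 1

# '||=' would instead replace any false value, including 0.
my $with_or = 0;
$with_or ||= 1;                 # becomes 1
```

This is why 'defined or' suits Boolean settings: a caller who passes 0 on purpose keeps that 0.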

This is a Moose based data handling class. Many software developers will tell you Moose and data manipulation don't belong together. They are most certainly right in startup-time critical circumstances.

Recursive parsing is not a good fit for all data since very deep data structures will consume a fair amount of computer memory! As the module recursively parses through each level of data the code leaves behind a snapshot of the previous level that allows it to keep track of its location.

This class has no external effect! All output above is from Data::Walk::Print.

The primary_ref and secondary_ref are effectively deep cloned during this process. To leave the primary_ref pointer intact see "fixed_primary".

The "COPYRIGHT" is down lower.

Supported node walking types

ARRAY
HASH
SCALAR

Supported one shot "Attributes"

sort_HASH
sort_ARRAY
skip_HASH_ref
skip_ARRAY_ref
skip_SCALAR_ref
change_array_size
fixed_primary

What is the unique value of this module?

With the recursive part of data walking extracted the various functionalities desired when walking the data can be modularized without copying this code. The Moose framework also allows diverse and targeted data parsing without dragging along a Kitchen sink API for every implementation of this Class.

Acknowledgement of MJD

This is an implementation of the concept of extracted data walking from Higher-Order-Perl Chapter 1 by Mark Jason Dominus. The book is well worth the money! With that said I diverged from MJD purity in two ways. First, this is object oriented code, not functional code. Second, like the MJD equivalent, the code does nothing on its own, but unlike the MJD equivalent it looks for methods provided in a role or class extension at the appropriate places for action. The MJD equivalent expects to use a passed CodeRef at the action points. There is clearly some overhead associated with both of these differences. I made those choices consciously, and if that upsets you do not hassle MJD!

Extending Data::Walk::Extracted

All action taken during the data walking must be initiated by implementation of action methods that do not exist in this Class. They can be added with a traditionally incorporated Role (Moose::Role), by extending the class, or by attaching a Role with the needed functionality at run time using 'with_traits' from Moose::Util. See the internal method _process_the_data for the detail of how these methods are incorporated and review the "Recursive Parsing Flow" to understand how the methods are used.

What is the recommended way to build a role that uses this class?

First build a method to be used when the Class reaches a data "node" and another to be used when the Class leaves a data node (as needed). Then create the 'action' method for the role. This would preferably be named something descriptive like 'mangle_data'. Remember that if more than one Role is added to Data::Walk::Extracted then this method should be named with namespace considerations in mind. This method should compose any required node action methods and data references into a $passed_ref and possibly a $conversion_ref to be used by _process_the_data. Then the 'action' method should call:

$passed_ref = $self->_process_the_data( $passed_ref, $conversion_ref );

Afterwards the 'action' method should return anything of interest from the $passed_ref.

Finally, write some tests for your role!
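The shape of such an extension can be sketched without Moose. In the example below Toy::Walker is a hypothetical stand-in for Data::Walk::Extracted that only mimics the calling convention of _process_the_data (it calls the before method once and does no walking), and Toy::Mangler, mangle_ref and _mangle_before_method are invented names. A real role would use Moose::Role and the real class.

```perl
use strict;
use warnings;

# Stand-in for Data::Walk::Extracted so the sketch runs without Moose.
# It only mimics the calling convention; it does not walk anything.
package Toy::Walker;

sub new {
    my ( $class, %args ) = @_;
    my $self = {%args};
    return bless $self, $class;
}

sub _process_the_data {
    my ( $self, $passed_ref, $conversion_ref ) = @_;
    my $method = $passed_ref->{before_method};
    $passed_ref = $self->$method($passed_ref) if $method;
    return $passed_ref;
}

# Hypothetical extension that adds one public 'action' method.
package Toy::Mangler;
our @ISA = ('Toy::Walker');

# Node method: receives the $passed_ref and must return it.
sub _mangle_before_method {
    my ( $self, $passed_ref ) = @_;
    $passed_ref->{seen}++;
    return $passed_ref;
}

# Action method: composes $passed_ref and $conversion_ref, delegates to
# _process_the_data and returns the part of the result that is of interest.
sub mangle_data {
    my ( $self, %args ) = @_;
    my $passed_ref = {
        %args,
        before_method => '_mangle_before_method',
    };
    my $conversion_ref = { primary_ref => 'mangle_ref' };
    $passed_ref = $self->_process_the_data( $passed_ref, $conversion_ref );
    return $passed_ref->{seen};
}

package main;

my $seen = Toy::Mangler->new->mangle_data( mangle_ref => { a => 1 } );
```

The method names carry a 'mangle' prefix so that they are less likely to collide with methods from other roles attached to the same object.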

Methods

Methods used to write Roles

_process_the_data( $passed_ref, $conversion_ref ) - internal

Definition: This method is the primary method call used when extending the class and implementing some public method that will act when walking through a data structure. This module recursively walks through the passed primary_ref data structure. If provided it will check at each "node" if a secondary_ref matches at that level. Each time the walker reaches a "node" it will see if it can call a before_method. After the node has been processed it will attempt to call an after_method. For more details see the "Recursive Parsing Flow". Extensions or Roles that use this method are expected to compose and pass the following data to this method.
Accepts: $passed_ref and $conversion_ref
$passed_ref this ref contains key value pairs as follows;
primary_ref - a dataref that the walker will walk
secondary_ref - a dataref that is used for comparison while walking
before_method - a method name that will perform some action at the beginning of each node
after_method - a method name that will perform some action at the end of each node
[attribute name] - attribute names are accepted with temporary attribute settings. These settings are temporarily set for a single "_process_the_data" call and then the original attribute values are restored. For this to work the attribute must have the following prefixed methods: get_$name, set_$name, clear_$name, and has_$name.
$conversion_ref This allows a public method to accept different key names for the various keys listed above and then convert them later to the generic terms used by this Class.
Action: The passed data is first scrubbed, where the minimum acceptable list of passed arguments is 'primary_ref' and either of 'before_method' or 'after_method'. The list can also contain a 'secondary_ref' and a 'branch_ref' but they are not required. Any errors will be 'confess'ed using the passed names, not the data walker names. When naming the before_method and after_method for the role keep in mind possible namespace collisions with other role methods. After the data scrubbing the $passed_ref is sent to the recursive data walker. The before_method and after_method are allowed to change the primary_ref and secondary_ref. For more details see the "Recursive Parsing Flow".
An example
my $passed_ref = {
	print_ref => {
		First_key => 'first_value',
	},
	match_ref => {
		First_key => 'second_value',
	},
	before_method	=> '_print_before_method',
	after_method	=> '_print_after_method',
	sort_ARRAY	=> 1,#One shot attribute setter
};

my $conversion_ref = {
	primary_ref	=> 'print_ref',# generic_name => role_name,
	secondary_ref	=> 'match_ref',
};
Returns: the $passed_ref (only) with the key names restored to the original versions.
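The key name conversion described above can be pictured as a rename on the way in and the reverse rename on the way out. This is a minimal sketch of that idea, not the module's code: rename_keys_sketch and %to_generic are hypothetical names, and only a shallow copy of the top level keys is made.

```perl
use strict;
use warnings;

# Hypothetical helper: return a shallow copy of $ref with its top level
# keys renamed using %$map ( old_name => new_name ). Keys that are not
# in the map are kept unchanged.
sub rename_keys_sketch {
    my ( $ref, $map ) = @_;
    my %renamed;
    for my $key ( keys %$ref ) {
        my $new_key = $map->{$key} // $key;
        $renamed{$new_key} = $ref->{$key};
    }
    return \%renamed;
}

my $conversion_ref = {
    primary_ref   => 'print_ref',    # generic_name => role_name
    secondary_ref => 'match_ref',
};
my %to_generic = reverse %$conversion_ref;    # role_name => generic_name

my $passed_ref = { print_ref => { a => 1 }, before_method => '_before' };

# Role names in, generic names for the walker, role names back out.
my $generic  = rename_keys_sketch( $passed_ref, \%to_generic );
my $restored = rename_keys_sketch( $generic,    $conversion_ref );
```

Reversing the hash works here because every role name appears only once among the values of the $conversion_ref.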

_build_branch( $base_ref, @arg_list )

Definition: There are times when a Role will wish to reconstruct the branch that led to the current position of the Data Walker. This private method takes a data reference and recursively adds the branch to the front of it using the information in the branch ref.
Accepts: a list of arguments starting with the seed data reference $base_ref to build from. The remaining arguments are just the array elements of the branch ref. An example call would be:
$ref = $self->_build_branch( 
	$seed_ref, 
	@{ $passed_ref->{branch_ref}},
);
Returns: a data reference with the current path back to the start appended to the $seed_ref
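To make the idea concrete, here is a hedged sketch of how a branch could be rebuilt from branch ref entries of the form [ ref_type, key, position, level ]. It assumes that the ref_type of each entry names the container holding the element and that array positions before the current one are left undefined; the real _build_branch may differ on both points, and build_branch_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical re-creation of the idea: wrap $base_ref in one container per
# branch entry, working from the deepest entry back up to the top level.
sub build_branch_sketch {
    my ( $base_ref, @branch ) = @_;
    return $base_ref if !@branch;
    my ( $ref_type, $key, $position ) = @{ pop @branch };
    my $wrapped;
    if ( $ref_type eq 'HASH' ) {
        $wrapped = { $key => $base_ref };
    }
    elsif ( $ref_type eq 'ARRAY' ) {
        $wrapped = [];
        $wrapped->[$position] = $base_ref;    # earlier positions stay undef
    }
    else {
        die "Unsupported ref type: $ref_type";
    }
    return build_branch_sketch( $wrapped, @branch );
}

my $rebuilt = build_branch_sketch(
    'bavalue2',
    [ 'HASH',  'Helping', 0, 1 ],
    [ 'ARRAY', '',        1, 2 ],
    [ 'HASH',  'MyKey',   0, 3 ],
);
# $rebuilt now has the shape { Helping => [ undef, { MyKey => 'bavalue2' } ] }
```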

Public Methods

set_change_array_size( $bool )

Definition: This method is used to change the "change_array_size" attribute after the instance is created. This attribute is not used by this class! However, it is provided here so multiple Roles can share behavior rather than each setting this attribute differently. The intent is for this attribute to indicate if the array size should be changed when modifying an array. If the attribute is true the array will be reduced when prune is called and expanded when graft is called. If the attribute is false the array dimensions will remain the same after the prune and graft operations.
Accepts: a Boolean value
Returns: nothing

get_change_array_size()

Definition: This method returns the current state of the "change_array_size" attribute.
Accepts: nothing
Returns: $Bool value representing the state of the 'change_array_size' attribute

has_change_array_size()

Definition: This method is used to test if the "change_array_size" attribute is set.
Accepts: nothing
Returns: $Bool value indicating if the 'change_array_size' attribute has been set

clear_change_array_size()

Definition: This method clears the "change_array_size" attribute.
Accepts: nothing
Returns: nothing

set_fixed_primary( $bool )

Definition: This method is used to change the "fixed_primary" attribute after the instance is created.
Accepts: a Boolean value
Returns: nothing

get_fixed_primary()

Definition: This method returns the current state of the "fixed_primary" attribute.
Accepts: nothing
Returns: $Bool value representing the state of the 'fixed_primary' attribute

has_fixed_primary()

Definition: This method is used to test if the "fixed_primary" attribute is set.
Accepts: nothing
Returns: $Bool value indicating if the 'fixed_primary' attribute has been set

clear_fixed_primary()

Definition: This method clears the "fixed_primary" attribute.
Accepts: nothing
Returns: nothing

Attributes

Data passed to ->new when creating an instance. For modification of these attributes see "Public Methods". The ->new function will accept either fat comma lists or a complete hash ref that has the possible attributes as the top keys. Additionally some attributes that meet the criteria can be passed to _process_the_data and will be adjusted for just the run of that method call.
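The "adjusted for just the run of that method call" behavior amounts to a save, set, run, restore sequence built on the get_, set_, has_ and clear_ methods. The sketch below shows that sequence on a toy object. Toy::Attr and with_one_shot are hypothetical names, and the real class does this inside _process_the_data rather than through a helper like this.

```perl
use strict;
use warnings;

# Toy object with the four prefixed accessors for one attribute; a
# stand-in for a Moose attribute, used only to show save and restore.
package Toy::Attr;
sub new             { my $class = shift; my $self = {}; return bless $self, $class }
sub get_sort_HASH   { return $_[0]{sort_HASH} }
sub set_sort_HASH   { $_[0]{sort_HASH} = $_[1]; return }
sub has_sort_HASH   { return exists $_[0]{sort_HASH} }
sub clear_sort_HASH { delete $_[0]{sort_HASH}; return }

package main;

# Hypothetical helper: set $name to $value, run $code, then put the
# attribute back the way it was (including 'never set').
sub with_one_shot {
    my ( $obj, $name, $value, $code ) = @_;
    my ( $get, $set, $has, $clear ) =
        map "${_}_$name", qw( get set has clear );
    my $was_set = $obj->$has;
    my $old     = $was_set ? $obj->$get : undef;
    $obj->$set($value);
    my $result = $code->($obj);
    if ($was_set) {
        $obj->$set($old);
    }
    else {
        $obj->$clear;
    }
    return $result;
}

my $walker = Toy::Attr->new;
my $during = with_one_shot( $walker, 'sort_HASH', 1,
    sub { $_[0]->get_sort_HASH } );
```

The has_ and clear_ methods matter because an attribute that was never set has to end up unset again, which is different from being set to undef.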

sort_HASH

Definition: This attribute is set to sort (or not) Hash Ref keys prior to walking the Hash Ref node.
Default 0 (No sort)
Range Boolean values and sort coderefs.

sort_ARRAY

Definition: This attribute is set to sort (or not) Array values prior to walking the Array Ref node. Warning: this will permanently sort the actual data in the passed ref. If a secondary ref also exists it will be sorted as well!
Default 0 (No sort)
Range Boolean values and sort coderefs.
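A setting that accepts either a Boolean or a sort coderef can be handled with a check on the reference type. The sketch below shows one way to do that for hash keys, followed by a demonstration of why sorting an array through its reference changes the caller's data. It assumes the coderef is a regular Perl sort block using $a and $b, which this document does not state, and ordered_keys_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: decide the walking order of hash keys from a
# setting that may be false, a plain true value, or a sort coderef.
sub ordered_keys_sketch {
    my ( $hash_ref, $sort_setting ) = @_;
    my @keys = keys %$hash_ref;
    if ( ref $sort_setting eq 'CODE' ) {
        @keys = sort $sort_setting @keys;    # custom order via $a and $b
    }
    elsif ($sort_setting) {
        @keys = sort @keys;                  # plain true value: default sort
    }
    return @keys;
}

my %demo = ( a => 1, b => 2, c => 3 );
my @asc  = ordered_keys_sketch( \%demo, 1 );
my @desc = ordered_keys_sketch( \%demo, sub { $b cmp $a } );

# Sorting through a reference reorders the original array, not a copy.
my $data       = { list => [ 'b', 'c', 'a' ] };
my $same_array = $data->{list};
@$same_array = sort @$same_array;
```

After the last statement $data->{list} holds the sorted values, which is the effect the sort_ARRAY warning describes.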

skip_HASH_ref

Definition: This attribute is set to skip (or not) the processing of HASH Ref nodes.
Default 0 (Don't skip)
Range Boolean values.

skip_ARRAY_ref

Definition: This attribute is set to skip (or not) the processing of ARRAY Ref nodes.
Default 0 (Don't skip)
Range Boolean values.

skip_SCALAR_ref

Definition: This attribute is set to skip (or not) the processing of SCALARs of ref branches.
Default 0 (Don't skip)
Range Boolean values.

change_array_size

Definition: This attribute will not be used by this class directly. However the Data::Walk::Prune Role and the Data::Walk::Graft Role both use it so it is placed here so there will be no conflicts.
Default 1 (This usually means that the array will grow or shrink when a position is added or removed)
Range Boolean values.

fixed_primary

Definition: This attribute will leave the primary_ref data ref intact rather than deep cloning it. This also means that no changes made at lower levels will be passed upwards.
Default 0 = The primary ref is not fixed (and will be changed / deep cloned)
Range Boolean values.

Recursive Parsing Flow

First

before_method The class checks for an available 'before_method' using the test:

exists $passed_ref->{before_method};

If the test passes then the next sequence is run.

$method = $passed_ref->{before_method};
$passed_ref = $self->$method( $passed_ref );

Then, if the new $passed_ref is undef or contains the key 'bounce', the program deletes the key 'bounce' from the $passed_ref (as needed) and returns the $passed_ref directly back up the data tree. Do not pass 'Go', do not collect $200. Otherwise the $passed_ref is sent on to the node parser. If the $passed_ref is modified by the 'before_method' then the node parser will parse the new ref and not the old one.
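The first step can be sketched as a single method that runs the before method and then decides whether to keep walking. Toy::Bouncer, first_step_sketch and _stop_at_private are hypothetical names, and returning a second "keep walking" flag is a choice made for this illustration only; the real walker makes the decision internally.

```perl
use strict;
use warnings;

package Toy::Bouncer;
sub new { my $class = shift; my $self = {}; return bless $self, $class }

# Hypothetical before_method: stop descending into any node that is a
# hash containing a 'private' key.
sub _stop_at_private {
    my ( $self, $passed_ref ) = @_;
    my $node = $passed_ref->{primary_ref};
    if ( ref $node eq 'HASH' and exists $node->{private} ) {
        $passed_ref->{bounce} = 1;
    }
    return $passed_ref;
}

# Sketch of the first step only: returns ( $passed_ref, $keep_walking ).
sub first_step_sketch {
    my ( $self, $passed_ref ) = @_;
    if ( exists $passed_ref->{before_method} ) {
        my $method = $passed_ref->{before_method};
        $passed_ref = $self->$method($passed_ref);
        if ( !defined $passed_ref or exists $passed_ref->{bounce} ) {
            delete $passed_ref->{bounce} if defined $passed_ref;
            return ( $passed_ref, 0 );    # go straight back up the tree
        }
    }
    return ( $passed_ref, 1 );            # carry on to the node parser
}

package main;

my $toy = Toy::Bouncer->new;
my ( $ref_a, $walk_a ) = $toy->first_step_sketch(
    { primary_ref => { private => 1 }, before_method => '_stop_at_private' } );
my ( $ref_b, $walk_b ) = $toy->first_step_sketch(
    { primary_ref => { public => 1 }, before_method => '_stop_at_private' } );
```

The first call bounces and has its 'bounce' key removed; the second carries on to the node parser.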

Second

Determine node type and elements The current node is examined to determine its reference type. The relevant skip attribute is consulted and if this node should be skipped then the program goes directly to the "Fourth" step. If the node type is not skipped then a list is generated for multi-element nodes. SCALARs are considered 'SCALAR' types and are handled as single element nodes. Next, if the list should be sorted then the list is sorted. Finally the node is tested for an empty set. If the set is empty this is considered a 'base state' and the code also skips to the "Fourth" step; otherwise the code sends the list to the "Third" step.
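A rough sketch of the second step follows: classify the node with ref, honor a skip setting, and build the element list. The helper name node_elements_sketch and the %$skip hash keyed by node type are assumptions for the illustration; in the class the skip settings are the skip_HASH_ref, skip_ARRAY_ref and skip_SCALAR_ref attributes. Sorting is left out to keep the sketch short.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a node and list its elements.
# Non-reference values are treated as single element 'SCALAR' nodes.
# Other reference types (for example CODE) are not handled here.
sub node_elements_sketch {
    my ( $node, $skip ) = @_;
    my $type = ref($node) || 'SCALAR';
    return ( $type, [] ) if $skip->{$type};    # skipped: nothing to iterate
    my @list =
          $type eq 'HASH'  ? ( keys %$node )
        : $type eq 'ARRAY' ? ( 0 .. $#$node )
        :                    ($node);
    return ( $type, \@list );
}

my ( $array_type,  $array_list )  = node_elements_sketch( [qw( a b c )], {} );
my ( $scalar_type, $scalar_list ) = node_elements_sketch( 'value',       {} );
my ( $hash_type,   $hash_list )   =
    node_elements_sketch( { a => 1 }, { HASH => 1 } );
my ( $empty_type, $empty_list ) = node_elements_sketch( [], {} );
```

An empty list, whether from a skipped type or an empty container, is the 'base state' that sends the walker on to the after method.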

Third

Iterate through each element For each element a new $passed_ref is generated based on the data branch below that element. The secondary_ref is only constructed if it has a matching element at that node with the primary_ref. Non matching portions of the secondary_ref are discarded. Node matching for hashrefs is done by string compares of the key only. Node matching for arrayrefs is done by testing if the secondary_ref has the same array position available as the primary_ref. No position content compare is done! The current element is then documented by pushing an element to an array_ref kept as the key branch_ref in the $passed_ref. This branch_ref can be thought of as the stack trace that documents the node elements directly between the current position and the top level of the parsed data_ref. Past completed branches and future pending branches are not shown. The array element pushed to the branch_ref is an array_ref that contains four positions which describe the current element position. The values in each position are;

[
	ref_type, 
	hash key name or '' for ARRAYs,
	element sequence position (from 0),#For hashes this is only relevant if sort_HASH is called
	level of the node (from 1),#The zeroth level is the passed data ref
]

The new $passed_ref is then passed to the recursive (private) subroutine.

my $alternative_passed_ref = $self->_walk_the_data( $new_passed_ref );

When the values are returned from the recursion call the last branch_ref element is popped off and the returned value(s) is(are) used to replace the passed primary_ref and secondary_ref values in the current $passed_ref. The program then returns to the "Third" step for the next element.
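The bookkeeping of the third step can be sketched as follows: a branch entry is pushed before each descent and popped afterwards, and the secondary ref is only followed where it has a matching key or array position. This is an illustration under assumptions and not the module's _walk_the_data: walk_with_branch and @leaf_report are hypothetical names, hash keys are always sorted to keep the output stable, and the returned values are not fed back into the parent node.

```perl
use strict;
use warnings;

my @leaf_report;

# Hypothetical walker showing only the bookkeeping of the third step.
sub walk_with_branch {
    my ( $primary, $secondary, $branch ) = @_;
    my $level = @$branch + 1;    # level of the elements below this node
    if ( ref $primary eq 'HASH' ) {
        my $position = 0;
        for my $key ( sort keys %$primary ) {
            # Hash matching: string compare of the key only.
            my $match =
                ( ref $secondary eq 'HASH' and exists $secondary->{$key} )
                ? $secondary->{$key}
                : undef;
            push @$branch, [ 'HASH', $key, $position++, $level ];
            walk_with_branch( $primary->{$key}, $match, $branch );
            pop @$branch;
        }
    }
    elsif ( ref $primary eq 'ARRAY' ) {
        for my $position ( 0 .. $#$primary ) {
            # Array matching: does the same position exist?
            my $match =
                ( ref $secondary eq 'ARRAY' and $position <= $#$secondary )
                ? $secondary->[$position]
                : undef;
            push @$branch, [ 'ARRAY', '', $position, $level ];
            walk_with_branch( $primary->[$position], $match, $branch );
            pop @$branch;
        }
    }
    else {
        # Leaf: record the path taken and whether a secondary value came along.
        my $path = join '/',
            map { $_->[0] eq 'HASH' ? $_->[1] : $_->[2] } @$branch;
        push @leaf_report, [ $path, defined $secondary ? 1 : 0 ];
    }
    return;
}

walk_with_branch(
    { Helping => [ 'Somelevel', { MyKey => 'value' } ] },
    { Helping => ['Somelevel'] },
    [],
);
```

In this run the first leaf still has a secondary value, while the second does not because the secondary array has no position 1.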

Fourth

after_method The class checks for an available 'after_method' using the test:

exists $passed_ref->{after_method};

If the test passes then the following sequence is run.

$method = $passed_ref->{after_method};
$passed_ref = $self->$method( $passed_ref ); 

Fifth

Go up The $passed_ref (with changes) is passed back up to the next level.

GLOBAL VARIABLES

$ENV{Smart_Comments}

The module uses Smart::Comments with the '-ENV' option so setting the variable $ENV{Smart_Comments} will turn on smart comment reporting. There are three levels of 'Smartness' called in this module '### #### #####'. See the Smart::Comments documentation for more information.

SUPPORT

Data-Walk-Extracted/issues

TODO

Support recursion through CodeRefs
Support recursion through Objects
Add a Data::Walk::Top Role to the package
Add a Data::Walk::Thin Role to the package

AUTHOR

Jed Lund
jandrew@cpan.org

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

Dependencies

version
Carp
Moose
MooseX::StrictConstructor
MooseX::Types::Moose
Smart::Comments -ENV option set

SEE ALSO

Data::Walk
Data::Walker
Data::Dumper - Dump
YAML - Dump
Data::Walk::Print - or other action object
Data::Walk::Prune
Data::Walk::Graft
Data::Walk::Clone