NAME
Data::Walk::Extracted - An extracted dataref walker
SYNOPSIS
#! C:/Perl/bin/perl
use Modern::Perl;
use YAML::Any;
use Moose::Util qw( with_traits );
use Data::Walk::Extracted v0.007;
use Data::Walk::Print v0.007;
$| = 1;
#Use YAML to compress writing the data ref
my $firstref = Load(
'---
Someotherkey:
    value
Parsing:
    HashRef:
        LOGGER:
            run: INFO
Helping:
    - Somelevel
    - MyKey:
        MiddleKey:
            LowerKey1: lvalue1
            LowerKey2:
                BottomKey1: 12345
                BottomKey2:
                    - bavalue1
                    - bavalue2
                    - bavalue3'
);
my $secondref = Load(
'---
Someotherkey:
    value
Helping:
    - Somelevel
    - MyKey:
        MiddleKey:
            LowerKey1: lvalue1
            LowerKey2:
                BottomKey1: 12346
                BottomKey2:
                    - bavalue1
                    - bavalue3'
);
my $newclass = with_traits( 'Data::Walk::Extracted', ( 'Data::Walk::Print' ) );
my $AT_ST = $newclass->new(
    match_highlighting => 1,#This is the default
    sort_HASH => 1,#To force order for demo purposes
);
$AT_ST->print_data(
    print_ref => $firstref,
    match_ref => $secondref,
);
#######################################
# Output of SYNOPSIS
# 01:{#<--- Ref Type Match
# 02:    Helping => [#<--- Secondary Key Match - Ref Type Match
# 03:        'Somelevel',#<--- Secondary Position Exists - Secondary Value Matches
# 04:        {#<--- Secondary Position Exists - Ref Type Match
# 05:            MyKey => {#<--- Secondary Key Match - Ref Type Match
# 06:                MiddleKey => {#<--- Secondary Key Match - Ref Type Match
# 07:                    LowerKey1 => 'lvalue1',#<--- Secondary Key Match - Secondary Value Matches
# 08:                    LowerKey2 => {#<--- Secondary Key Match - Ref Type Match
# 09:                        BottomKey1 => '12345',#<--- Secondary Key Match - Secondary Value Does NOT Match
# 10:                        BottomKey2 => [#<--- Secondary Key Match - Ref Type Match
# 11:                            'bavalue1',#<--- Secondary Position Exists - Secondary Value Matches
# 12:                            'bavalue2',#<--- Secondary Position Exists - Secondary Value Does NOT Match
# 13:                            'bavalue3',#<--- Secondary Position Does NOT Exist - Secondary Value Does NOT Match
# 14:                        ],
# 15:                    },
# 16:                },
# 17:            },
# 18:        },
# 19:    ],
# 20:    Parsing => {#<--- Secondary Key Mismatch - Ref Type Mismatch
# 21:        HashRef => {#<--- Secondary Key Mismatch - Ref Type Mismatch
# 22:            LOGGER => {#<--- Secondary Key Mismatch - Ref Type Mismatch
# 23:                run => 'INFO',#<--- Secondary Key Mismatch - Secondary Value Does NOT Match
# 24:            },
# 25:        },
# 26:    },
# 27:    Someotherkey => 'value',#<--- Secondary Key Match - Secondary Value Matches
# 28:},
#######################################
DESCRIPTION
This module takes a data reference (or two) and recursively travels through it (or them). Where the two references diverge, the walker follows the primary data reference. At the beginning and end of each node the code will attempt to call a method using data from the current location of the node.
Beware: recursive parsing is not a good fit for all data, since very deep data structures will burn a fair amount of perl memory! As the module recursively parses through the levels, perl leaves behind snapshots of each previous level that allow it to keep track of its location.
This is an implementation of the concept of extracted data walking from Higher-Order Perl, Chapter 1, by Mark Jason Dominus. The book is well worth the money! With that said, I diverged from MJD purity in two ways. First, this is object-oriented code, not functional code, and moreover it is written in Moose. :) Second, like the MJD equivalent, the code does nothing on its own. Unlike the MJD equivalent, it looks for methods provided in a role or class extension at the appropriate places for action. The MJD equivalent expects to use a passed CodeRef at the action points. There is clearly some overhead associated with both of these differences. I made those choices consciously, and if that upsets you do not hassle MJD!
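For contrast, the CodeRef style described above can be sketched in plain Perl. This is neither the code from the book nor the code in this module; the name walk_ref and its 'before' / 'after' arguments are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical functional walker: the actions are passed in as CodeRefs
# instead of being looked up as methods on an object.
sub walk_ref {
    my ( $node, $before, $after, $depth ) = @_;
    $depth //= 0;
    $before->( $node, $depth ) if $before;
    if ( ref $node eq 'HASH' ) {
        # Sorted keys keep the walk order predictable.
        walk_ref( $node->{$_}, $before, $after, $depth + 1 )
            for sort keys %$node;
    }
    elsif ( ref $node eq 'ARRAY' ) {
        walk_ref( $_, $before, $after, $depth + 1 ) for @$node;
    }
    $after->( $node, $depth ) if $after;
    return;
}

# Collect every terminal (non-reference) value in walk order.
my @seen;
walk_ref(
    { Helping => [ 'Somelevel', { MyKey => 'lvalue1' } ], Someotherkey => 'value' },
    sub { my ($node) = @_; push @seen, $node if defined $node and !ref $node },
    undef,
);
print join( ', ', @seen ), "\n";
```

Each nested call adds a stack frame, which is the memory cost the warning above refers to.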
Default Functionality
This module does not do anything by itself but walk the data structure. It takes no action on its own during the walk. All the output above is from Data::Walk::Print.
Basic interface
The module uses five basic pieces of data to work:
- primary_ref - a dataref that the walker will walk
- secondary_ref - a dataref that is used for comparison while walking
- before_method - some action performed at the beginning of each node
- after_method - some action performed at the end of each node
- conversion_ref - a way to change the data ref naming used in the role to the name used in the base class. This allows the data to be named in a way unique to the role so that any bad callout can be caught but still be used generically by the base class.
An example
$passed_ref = {
    print_ref => {
        First_key => 'first_value',
    },
    match_ref => {
        First_key => 'second_value',
    },
    before_method => '_print_before_method',
    after_method  => '_print_after_method',
};
$conversion_ref = {
    primary_ref   => 'print_ref',# generic_name => role_name,
    secondary_ref => 'match_ref',
};
The minimum acceptable list of passed arguments is 'primary_ref' and either 'before_method' or 'after_method'. The list can also contain 'secondary_ref' and 'branch_ref', but they are not required. When naming the before_method and after_method for the role, keep in mind possible namespace collisions with other role methods. The input scrubber will use the $conversion_ref to test the $passed_ref for the correct key names. If the key names passed by the role differ from the generic names, the scrubber will change the keys prior to sending the $passed_ref to the data walker. Any errors will be 'croak'ed using the passed names, not the data walker names.
After the data scrubbing, the $passed_ref is sent to the data walker.
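The key translation can be sketched in plain Perl as follows. This is a minimal illustration under stated assumptions, not the module's actual scrubber: the function name scrub_input and its error messages are hypothetical, and only the renaming and the minimum-argument check described above are modeled.

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical stand-in for the input scrubber: rename role-specific keys
# to the generic names, then check the minimum argument list.
sub scrub_input {
    my ( $passed_ref, $conversion_ref ) = @_;
    $conversion_ref //= {};
    my %scrubbed = %$passed_ref;    # shallow copy; the caller's hash is untouched
    for my $generic ( sort keys %$conversion_ref ) {
        my $role_name = $conversion_ref->{$generic};
        next if $role_name eq $generic or !exists $scrubbed{$role_name};
        $scrubbed{$generic} = delete $scrubbed{$role_name};
    }
    # Report problems using the names the caller used, not the generic ones.
    my $primary_name = $conversion_ref->{primary_ref} // 'primary_ref';
    croak "'$primary_name' is required" unless exists $scrubbed{primary_ref};
    croak "one of 'before_method' or 'after_method' is required"
        unless exists $scrubbed{before_method} or exists $scrubbed{after_method};
    return \%scrubbed;
}

my $clean = scrub_input(
    {
        print_ref     => { First_key => 'first_value' },
        match_ref     => { First_key => 'second_value' },
        before_method => '_print_before_method',
    },
    { primary_ref => 'print_ref', secondary_ref => 'match_ref' },
);
print join( ', ', sort keys %$clean ), "\n";
```

After the call the role-specific names 'print_ref' and 'match_ref' are gone and the walker only ever sees the generic names.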
v0.007
- State: This code is still in beta and therefore the API is subject to change. I like the basics and will try to add rather than modify whenever possible in the future. Future development will focus on supporting additional branch types. API changes will only occur if the current functionality proves excessively flawed in some fashion. All fixed functionality will be defined by the test suite.
- Included: ArrayRefs and HashRefs are supported data walker nodes. Strings and Numbers are all currently treated as base states.
- Excluded: Objects and CodeRefs are not currently handled. They should cause the code to croak if the module encounters them (not tested). See "TODO".
Extending Data::Walk::Extracted
All action taken during the data walking must be initiated by implementing one or both of two possible methods: the before_method and the after_method. The methods are not provided by the base Data::Walk::Extracted class. They can be added with a Moose::Role or by extending the class.
- How to add Roles to the Class?
One way to incorporate a role into this class and then use it is the 'with_traits' function from Moose::Util.
- What is the recommended way to build a role that uses this class?
First, create the 'action' method for the role. This would preferably be named something descriptive like 'mangle_data'. This method should build a $passed_ref and possibly a $conversion_ref. The $passed_ref can include up to two data references, the name of a 'before_method' or an 'after_method' (or both), and possibly a 'branch_ref'. The $conversion_ref should contain key / value pairs that represent the translation of the $passed_ref keys used in the role to the names used by this class. This allows for generic handling of walking while still allowing multiple roles to coexist in the class when built. These two values are used as follows:
$result = $self->_process_the_data( $passed_ref, $conversion_ref );
Then build one or both of before_method and after_method for use when walking the data. For examples, review the code in Data::Walk::Print.
- What is the recursive data walking sequence?
- First: The class checks for an available 'before_method' using the test exists $passed_ref->{before_method}. If the test passes then the sequence $method = $passed_ref->{before_method}; $passed_ref = $self->$method( $passed_ref ); is run. If the new $passed_ref is undef or contains the key 'bounce', the program deletes the key 'bounce' from the $passed_ref (as needed) and then returns $passed_ref directly back up the data tree. Do not pass 'Go', do not collect $200. Otherwise $passed_ref is sent on to the node parser. If the $passed_ref is modified by the 'before_method' then the node parser will parse the new ref and not the old one.
- Second: The class determines what reference type the node is at the current level. Strings and Numbers are considered 'TERMINATOR' types and are handled as single element nodes. Then a list of the elements of that node is created and, if the list should be sorted, it is sorted. If the current node is 'undef' this is considered a 'base state' and the code skips to the "Fifth" step.
- Fourth: When the values are returned from the recursion call, the returned value(s) are used to replace the passed primary_ref and secondary_ref values in the current $passed_ref.
- Fifth: The class checks for an available 'after_method' using the test exists $passed_ref->{after_method}. If the test passes then the sequence $method = $passed_ref->{after_method}; $passed_ref = $self->$method( $passed_ref ); is run.
- Sixth: The $passed_ref is passed back up to the next level (with changes).
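The sequence can be sketched with a small plain-Perl class. This is a simplified illustration, not the module's implementation: the package Sketch::Walker and the method _shout_after_method are hypothetical, the secondary_ref, the skip attributes and sort_ARRAY are ignored, and the element-by-element recursion is assumed from the reference to "the recursion call" in the list.

```perl
use strict;
use warnings;

package Sketch::Walker;    # hypothetical class, not part of this distribution

sub new { my ( $class, %args ) = @_; return bless { sort_HASH => 0, %args }, $class }

sub walk {
    my ( $self, $passed_ref ) = @_;

    # First: run the before_method (if named) and honor a 'bounce' request.
    if ( exists $passed_ref->{before_method} ) {
        my $method = $passed_ref->{before_method};
        $passed_ref = $self->$method($passed_ref);
        return $passed_ref unless defined $passed_ref;
        if ( exists $passed_ref->{bounce} ) {
            delete $passed_ref->{bounce};
            return $passed_ref;
        }
    }

    # Second: find the node type and list its elements, sorted if requested.
    my $node = $passed_ref->{primary_ref};
    my $type = ref $node;    # '' for strings, numbers and undef
    if ( $type eq 'HASH' ) {
        my @keys = keys %$node;
        @keys = sort @keys if $self->{sort_HASH};
        for my $key (@keys) {
            # Recurse (assumed), then store the returned value (Fourth).
            my $child = $self->walk( { %$passed_ref, primary_ref => $node->{$key} } );
            $node->{$key} = $child->{primary_ref} if defined $child;
        }
    }
    elsif ( $type eq 'ARRAY' ) {
        for my $position ( 0 .. $#$node ) {
            my $child = $self->walk( { %$passed_ref, primary_ref => $node->[$position] } );
            $node->[$position] = $child->{primary_ref} if defined $child;
        }
    }

    # Fifth: run the after_method (if named).
    if ( exists $passed_ref->{after_method} ) {
        my $method = $passed_ref->{after_method};
        $passed_ref = $self->$method($passed_ref);
    }

    # Sixth: hand the (possibly changed) $passed_ref back up one level.
    return $passed_ref;
}

# Example after_method: upper-case every string terminator.
sub _shout_after_method {
    my ( $self, $passed_ref ) = @_;
    my $value = $passed_ref->{primary_ref};
    $passed_ref->{primary_ref} = uc $value if defined $value and !ref $value;
    return $passed_ref;
}

package main;

my $walker = Sketch::Walker->new( sort_HASH => 1 );
my $result = $walker->walk(
    {
        primary_ref  => { Helping => [ 'Somelevel', { MyKey => 'lvalue1' } ] },
        after_method => '_shout_after_method',
    }
);
print $result->{primary_ref}{Helping}[1]{MyKey}, "\n";
```

Because the after_method runs on the way back up, a change it makes to a terminator is written into the parent node before the parent's own after_method is called.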
Attributes
Data passed to ->new when creating an instance. For modification of these attributes see "Methods". The ->new function will accept either a fat comma list or a complete hash ref that has the attribute names as the top keys.
sort_HASH
- Definition: This attribute is set to sort (or not) Hash Ref keys prior to walking the Hash Ref node.
- Default 0 (No sort)
- Range Boolean values. See "TODO" for future direction.
sort_ARRAY
- Definition: This attribute is set to sort (or not) Array values prior to walking the Array Ref node. Warning: this will permanently sort the actual data in the passed ref. If a secondary ref also exists it will be sorted as well!
- Default 0 (No sort)
- Range Boolean values. See "TODO" for future direction.
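The warning above follows from how Perl references work, independent of this module. A minimal sketch with made-up data:

```perl
use strict;
use warnings;

my $data = { list => [ 'pear', 'apple', 'fig' ] };

# Sorting through the reference rewrites the caller's array ...
my $walked = $data->{list};
@$walked = sort @$walked;
print "@{ $data->{list} }\n";

# ... so sort a copy when the original order matters.
my $original = [ 'pear', 'apple', 'fig' ];
my $copy     = [ @$original ];
@$copy = sort @$copy;
print "@$original\n";
```

The copy shown here is shallow. For nested structures a deep copy (for example dclone from the core Storable module) would be needed before handing the data to a walker that sorts arrays.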
skip_HASH_ref
- Definition: This attribute is set to skip (or not) the processing of HASH Ref nodes.
- Default 0 (Don't skip)
- Range Boolean values.
skip_ARRAY_ref
- Definition: This attribute is set to skip (or not) the processing of ARRAY Ref nodes.
- Default 0 (Don't skip)
- Range Boolean values.
skip_TERMINATOR_ref
- Definition: This attribute is set to skip (or not) the processing of TERMINATORs of ref branches.
- Default 0 (Don't skip)
- Range Boolean values.
change_array_size
- Definition: This attribute will not be used by this class directly. However the Data::Walk::Prune Role and the Data::Walk::Graft Role both use it so it is placed here so there will be no conflicts.
- Default 1 (This usually means that the array position will be added or removed)
- Range Boolean values.
Methods
change_array_size_behavior( $bool )
GLOBAL VARIABLES
- $ENV{Smart_Comments}
The module uses Smart::Comments with the '-ENV' option, so setting the variable $ENV{Smart_Comments} will turn on smart comment reporting. Three levels of 'Smartness' are used in this module: '###', '####', and '#####'. See the Smart::Comments documentation for more information.
- $Carp::Verbose
The module uses Carp to die (croak), so the variable $Carp::Verbose can be set for more detailed debugging.
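The effect of $Carp::Verbose can be seen with Carp alone. In this sketch the package Sketch::Lib and its fail function are hypothetical stand-ins for any module that croaks:

```perl
use strict;
use warnings;

package Sketch::Lib;    # hypothetical library package
use Carp qw( croak );
sub fail { croak 'bad input' }

package main;

# Default: a one-line message reported from the caller's point of view.
my $short = do { eval { Sketch::Lib::fail() }; $@ };

# With $Carp::Verbose set, the same croak carries a full backtrace.
my $long = do { local $Carp::Verbose = 1; eval { Sketch::Lib::fail() }; $@ };

print "short:\n$short\nlong:\n$long";
```

Using local keeps the verbose setting confined to the block being debugged.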
SUPPORT
TODO
- Support recursion through CodeRefs
- Support recursion through Objects
- Allow the sort_XXX attributes to receive a sort subroutine
- Add a Data::Walk::Top Role to the package
- Add a Data::Walk::Thin Role to the package
AUTHOR
COPYRIGHT
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.
Dependencies
- version
- Carp
- Moose
- MooseX::StrictConstructor
- MooseX::Types::Moose
- Smart::Comments - -ENV option set
SEE ALSO
- Data::Walk
- Data::Walker
- Data::Dumper - Dump
- YAML - Dump
- Data::Walk::Print - or other action object