NAME

Data::Walk::Extracted - An extracted dataref walker

SYNOPSIS

#! C:/Perl/bin/perl
use Modern::Perl;
use YAML::Any;
use Moose::Util qw( with_traits );
use lib '../lib';
use Data::Walk::Extracted v0.05;
use Data::Walk::Print v0.05;

$| = 1;

#Use YAML to compress writing the data ref
my  $firstref = Load(
    '---
    Someotherkey:
        value
    Parsing:
        HashRef:
            LOGGER:
                run: INFO
    Helping:
        - Somelevel
        - MyKey:
            MiddleKey:
                LowerKey1: lvalue1
                LowerKey2:
                    BottomKey1: 12345
                    BottomKey2:
                    - bavalue1
                    - bavalue2
                    - bavalue3'
);
my  $secondref = Load(
    '---
    Someotherkey:
        value
    Helping:
        - Somelevel
        - MyKey:
            MiddleKey:
                LowerKey1: lvalue1
                LowerKey2:
                    BottomKey1: 12346
                    BottomKey2:
                    - bavalue1
                    - bavalue3'
);
# Apply the role
my $newclass = with_traits( 'Data::Walk::Extracted', ( 'Data::Walk::Print' ) );
# Use the combined class to build an instance
my $AT_ST = $newclass->new(
        match_highlighting => 1,#This is the default
        sort_HASH => 1,#To force order for demo purposes
);
# Walk the data with the Data walker
$AT_ST->walk_the_data(
    primary_ref     =>  $firstref,
    secondary_ref   =>  $secondref,
);

#######################################
#     Output of SYNOPSIS
# 01:{#<--- Ref Type Match
# 02:	Helping => [#<--- Secondary Key Match - Ref Type Match
# 03:		'Somelevel',#<--- Secondary Position Exists - Secondary Value Matches
# 04:		{#<--- Secondary Position Exists - Ref Type Match
# 05:			MyKey => {#<--- Secondary Key Match - Ref Type Match
# 06:				MiddleKey => {#<--- Secondary Key Match - Ref Type Match
# 07:					LowerKey1 => 'lvalue1',#<--- Secondary Key Match - Secondary Value Matches
# 08:					LowerKey2 => {#<--- Secondary Key Match - Ref Type Match
# 09:						BottomKey1 => '12345',#<--- Secondary Key Match - Secondary Value Does NOT Match
# 10:						BottomKey2 => [#<--- Secondary Key Match - Ref Type Match
# 11:							'bavalue1',#<--- Secondary Position Exists - Secondary Value Matches
# 12:							'bavalue2',#<--- Secondary Position Exists - Secondary Value Does NOT Match
# 13:							'bavalue3',#<--- Secondary Position Does NOT Exist - Secondary Value Does NOT Match
# 14:						],
# 15:					},
# 16:				},
# 17:			},
# 18:		},
# 19:	],
# 20:	Parsing => {#<--- Secondary Key Mismatch - Ref Type Mismatch
# 21:		HashRef => {#<--- Secondary Key Mismatch - Ref Type Mismatch
# 22:			LOGGER => {#<--- Secondary Key Mismatch - Ref Type Mismatch
# 23:				run => 'INFO',#<--- Secondary Key Mismatch - Secondary Value Does NOT Match
# 24:			},
# 25:		},
# 26:	},
# 27:	Someotherkey => 'value',#<--- Secondary Key Match - Secondary Value Matches
# 28:},
#######################################

DESCRIPTION

This module takes a data reference (or two) and recursively travels through it (or them). Where the two references diverge, the walker follows the primary data reference. At the beginning and end of each node the code will attempt to call a method using data from the current location of the node.
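The idea of calling something at the beginning and end of each node can be sketched in core Perl. This is only an illustration under stated assumptions, not the module's code: the sub name walk_with_hooks is hypothetical, it takes plain code refs where the module calls methods on the instance, and it walks a single data reference.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Data::Walk::Extracted: walk a data ref
# recursively and call optional hooks at the start and end of every node.
sub walk_with_hooks {
    my ( $node, $before, $after, $depth ) = @_;
    $depth //= 0;
    $before->( $node, $depth ) if $before;
    if ( ref $node eq 'HASH' ) {
        # Keys are sorted here only to make the demo output repeatable.
        walk_with_hooks( $node->{$_}, $before, $after, $depth + 1 )
            for sort keys %$node;
    }
    elsif ( ref $node eq 'ARRAY' ) {
        walk_with_hooks( $_, $before, $after, $depth + 1 ) for @$node;
    }
    # Anything else (string, number, undef) is a base state: no recursion.
    $after->( $node, $depth ) if $after;
    return;
}

my $data = {
    Helping      => [ 'Somelevel', { MyKey => 'lvalue1' } ],
    Someotherkey => 'value',
};
my @events;
walk_with_hooks(
    $data,
    sub { push @events, 'enter ' . ( ref $_[0] || 'TERMINATOR' ) . " at depth $_[1]" },
    sub { push @events, 'leave ' . ( ref $_[0] || 'TERMINATOR' ) . " at depth $_[1]" },
);
print "$_\n" for @events;
```

Every node, including each string terminator, produces one "enter" and one "leave" event, which is the pairing the before and after calls rely on.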

Beware: recursive parsing is not a good fit for all data, since very deep data structures will burn a fair amount of perl memory! As the module recursively parses through the levels, perl leaves behind snapshots of the previous level that allow perl to keep track of its location.

This is an implementation of the concept of extracted data walking from Higher-Order Perl Chapter 1 by Mark Jason Dominus. The book is well worth the money! With that said, I diverged from MJD purity in two ways. First, this is object oriented code, not functional code, and moreover it is written in Moose. :) Second, the code uses methods that are not included in the class to provide add-on functionality at the appropriate places for action. The MJD equivalent expects to use a passed CodeRef at the action points. There is clearly some overhead associated with both of these differences. I made those choices consciously, and if that upsets you do not hassle MJD!

Default Functionality

This module does not do anything by itself but walk the data structure. Because I want the code to do something every time I call it, a new instance will append a default set of functionality, "Data::Walk::Print", during BUILD using 'apply_all_roles' from Moose::Util. See "Extending Data::Walk::Extracted" for more details of extending this data walker.

Data::Walk::Print will print a perlish version of the primary data structure as it walks through. If a second data set is provided and the correct flag is set, it will add a comment string with matching information. Both the Data::Dumper Dump and YAML Dump functions are more mature than the default Data::Walk::Print function included here.

v0.05

State: This code is still in Beta state and therefore the API is subject to change. I like the basics and will try to add rather than modify whenever possible in the future. Future development will focus on supporting additional branch types. API changes will only occur if the current functionality proves excessively flawed in some fashion. All fixed functionality will be defined by the test suite.
Included: ArrayRefs and HashRefs are supported data walker nodes. Strings and Numbers are all currently treated as base states.
Excluded: Objects and CodeRefs are not currently handled. They should cause the code to croak if the module encounters them (not tested). See "TODO".

Extending Data::Walk::Extracted

All action taken during the data walking must be initiated by implementation of two possible methods: the before_method and the after_method. The methods are not provided by the base Data::Walk::Extracted class. They can be added with a Moose::Role or by extending the class.

How to add methods using a Role?

One way to incorporate a role into this class and then use it is the method 'with_traits' from Moose::Util. Warning: When the Data::Walk::Extracted class is used to create a new instance it will check (using Moose BUILD) if the instance can( 'before_method' ) or can( 'after_method' ). If neither method is available, the Default Role will be added to the instance using Moose::Util 'apply_all_roles' and the class will then 'carp' a warning.

What are the minimum requirements for use?

The role must provide one of either a before_method or an after_method.
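The availability check behind this requirement can be illustrated without Moose. The sketch below is an assumption-laden stand-in: the packages My::Walker and My::Walker::WithBefore and the sub has_walker_hook are hypothetical names, plain inheritance is used in place of a role, and only the can( 'before_method' ) / can( 'after_method' ) test is taken from the text.

```perl
use strict;
use warnings;

# Hypothetical stand-ins; none of these packages exist in the distribution.
package My::Walker {
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
}

package My::Walker::WithBefore {
    our @ISA = ('My::Walker');

    # Providing just one of the two methods meets the minimum requirement.
    sub before_method { my ( $self, $passed_ref ) = @_; return $passed_ref }
}

# Same question the text says is asked at build time:
# is at least one of the two methods available on the instance?
sub has_walker_hook {
    my ($instance) = @_;
    return ( $instance->can('before_method') || $instance->can('after_method') ) ? 1 : 0;
}

for my $class (qw( My::Walker My::Walker::WithBefore )) {
    my $instance = $class->new;
    my $status   = has_walker_hook($instance)
        ? 'a method is available'
        : 'no method found, so a default role would be applied';
    print "$class: $status\n";
}
```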

How does the class interact with these methods?

At each node the class calls $passed_ref = $self->before_method( $passed_ref ) before parsing the node and $passed_ref = $self->after_method( $passed_ref ) after parsing the node, when available. Both methods can either return a (possibly modified) $passed_ref or undef. If either method returns undef, then undef is immediately passed back up to the previous layer. So if the before_method returns undef, the data walker also skips parsing the node and attempting the 'after_method'.
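The call order and the effect of returning undef can be sketched as follows. This is a simplified illustration, not the module's implementation: process_node is a hypothetical name, the two methods are passed as code refs in a hash, and a log entry stands in for the real node parser.

```perl
use strict;
use warnings;

# Hypothetical sketch of the call order around one node.
sub process_node {
    my ( $passed_ref, $hooks, $log ) = @_;
    if ( $hooks->{before_method} ) {
        $passed_ref = $hooks->{before_method}->($passed_ref);
        # undef from before_method: skip parsing and skip after_method.
        return if !defined $passed_ref;
    }
    push @$log, 'parsed node';    # stands in for the node parser
    if ( $hooks->{after_method} ) {
        $passed_ref = $hooks->{after_method}->($passed_ref);
    }
    return $passed_ref;
}

# Case 1: before_method hands the ref back, so everything runs.
my @log;
my $kept = process_node(
    { primary_ref => 'value' },
    {
        before_method => sub { return $_[0] },
        after_method  => sub { push @log, 'after_method ran'; return $_[0] },
    },
    \@log,
);

# Case 2: before_method returns undef, so the node is skipped entirely.
my @skip_log;
my $skipped = process_node(
    { primary_ref => 'value' },
    {
        before_method => sub { return },
        after_method  => sub { push @skip_log, 'after_method ran'; return $_[0] },
    },
    \@skip_log,
);

print 'case 1 steps: ', join( ', ', @log ), "\n";
print 'case 2 steps: ', ( @skip_log ? join( ', ', @skip_log ) : 'none' ), "\n";
print 'case 2 result is ', ( defined $skipped ? 'defined' : 'undef' ), "\n";
```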

What does the $passed_ref contain?

Every time the extracted walker calls a method it will pass a master data ref specific to that layer. See below for more details.

Attributes

Data passed to ->new when creating an instance. For modification of these attributes see "Methods". The ->new function will accept either a fat comma list or a complete hash ref that has the attribute names as the top keys.

sort_HASH

Definition: This attribute is set to sort (or not) Hash Ref keys prior to walking the Hash Ref node.
Default 0 (No sort)
Range Boolean values. See "TODO" for future direction.

sort_ARRAY

Definition: This attribute is set to sort (or not) Array values prior to walking the Array Ref node. Warning: this will permanently sort the actual data in the passed ref. If a secondary ref also exists it will be sorted as well!
Default 0 (No sort)
Range Boolean values. See "TODO" for future direction.
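Because the sort is permanent, callers who care about the original array order may want to hand the walker a deep copy. The sketch below shows the effect with core Perl only. The sub sort_arrays_in_place is hypothetical and assumes a plain string sort, which may differ from what the module does internally; dclone comes from the core Storable module.

```perl
use strict;
use warnings;
use Storable qw( dclone );

# Hypothetical helper: sorts every array in a structure through its reference,
# which reorders the caller's data. Assumes arrays of plain strings.
sub sort_arrays_in_place {
    my ($node) = @_;
    if ( ref $node eq 'ARRAY' ) {
        @$node = sort @$node;
        sort_arrays_in_place($_) for @$node;
    }
    elsif ( ref $node eq 'HASH' ) {
        sort_arrays_in_place($_) for values %$node;
    }
    return $node;
}

my $data = { BottomKey2 => [qw( bavalue3 bavalue1 bavalue2 )] };

# Work on a deep copy so the original order survives.
my $copy = dclone($data);
sort_arrays_in_place($copy);

print 'original: ', join( ' ', @{ $data->{BottomKey2} } ), "\n";
print 'copy:     ', join( ' ', @{ $copy->{BottomKey2} } ), "\n";
```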

skip_HASH_ref

Definition: This attribute is set to skip (or not) the processing of HASH Ref nodes.
Default 0 (Don't skip)
Range Boolean values.

skip_ARRAY_ref

Definition: This attribute is set to skip (or not) the processing of ARRAY Ref nodes.
Default 0 (Don't skip)
Range Boolean values.

skip_TERMINATOR_ref

Definition: This attribute is set to skip (or not) the processing of TERMINATORs of ref branches.
Default 0 (Don't skip)
Range Boolean values.

Methods

walk_the_data( %args )

This method is used to build a data reference that will recursively parse the target data reference. Effectively it takes the passed reference(s) and walks vertically down each data branch. At each node it calls a 'before_method' and an 'after_method' if available. The detailed sequence is listed below.

First The class checks for an available 'before_method'. If available $passed_ref = $self->before_method( $passed_ref ) is called. If the new $passed_ref contains the key $passed_ref->{bounce} or is undef the program deletes the key 'bounce' from the $passed_ref (as needed) and then returns $passed_ref back up the tree. Do not pass 'Go' do not collect $200. Otherwise $passed_ref is sent on to the node parser. If the $passed_ref is modified by the 'before_method' then the node parser will parse the new ref and not the old one.
Second It determines what reference type the node is at the current level. Strings and Numbers are considered 'TERMINATOR' types and are handled as single element nodes. Then, any listing available for elements of that node is created and if the list should be sorted then the list is sorted. If the current node is 'undef' this is considered a 'base state' and the code skips to the "Fifth" step.
Third - building the $passed_ref For each element of the node a new dataset is built. The dataset consists of a "primary_ref", a "secondary_ref" and a "branch_ref". The primary_ref contains only the portion of the dataset that exists below the selected element of that node. The secondary_ref is only constructed if it has a matching element at that node with the primary_ref. Node matching for hashrefs is done by string compares of the key only. Node matching for arrayrefs is done by testing if the secondary_ref has the same array position available as the primary_ref. No position content compare is done! The secondary_ref would then be built like the primary_ref. The branch_ref will contain an array ref of array refs. Each of the top array positions represents a previously traveled node on the current branch. The lower array ref will have four positions which describe the element taken for that branch. The values in each position are: 0-ref type, 1-hash key name or '', 2-element sequence position (from 0), and 3-level of the node (from 1). The branch_ref arrays are effectively the linear (vertical) breadcrumbs that show how the parser got to that point. Past completed branches and future pending branches are not shown. The new dataset is then passed to the recursive (private) subroutine to be parsed in the same manner ("First").
Fourth When the values are returned from the recursion call the returned value(s) is(are) used to replace the passed primary_ref and secondary_ref values in the current $passed_ref.
Fifth - $passed_ref = $self->after_method( $passed_ref ) is called on the instance if available.
Sixth The $passed_ref is passed back up to the next level. (with changes)
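The dataset construction in the third step can be sketched in core Perl. This is a reading of the description above, not the module's private subroutine: build_child_refs is a hypothetical name, hash keys are always sorted here to keep the demo repeatable, and the four breadcrumb positions follow the order given in the text (ref type, hash key name or '', element position from 0, node level from 1).

```perl
use strict;
use warnings;

# Hypothetical sketch: build one child dataset per element of the current node.
sub build_child_refs {
    my ($passed_ref) = @_;
    my ( $primary, $secondary ) = @{$passed_ref}{qw( primary_ref secondary_ref )};
    my @branch = @{ $passed_ref->{branch_ref} || [] };
    my $level  = @branch + 1;
    my @children;
    if ( ref $primary eq 'HASH' ) {
        my $position = 0;
        for my $key ( sort keys %$primary ) {
            my %child = (
                primary_ref => $primary->{$key},
                branch_ref  => [ @branch, [ 'HASH', $key, $position++, $level ] ],
            );
            # Hash nodes match on the key string only.
            $child{secondary_ref} = $secondary->{$key}
                if ref $secondary eq 'HASH' and exists $secondary->{$key};
            push @children, \%child;
        }
    }
    elsif ( ref $primary eq 'ARRAY' ) {
        for my $position ( 0 .. $#$primary ) {
            my %child = (
                primary_ref => $primary->[$position],
                branch_ref  => [ @branch, [ 'ARRAY', '', $position, $level ] ],
            );
            # Array nodes match on position only; content is not compared.
            $child{secondary_ref} = $secondary->[$position]
                if ref $secondary eq 'ARRAY' and $position <= $#$secondary;
            push @children, \%child;
        }
    }
    return @children;
}

my $primary   = { BottomKey1 => 12345, BottomKey2 => [qw( bavalue1 bavalue2 bavalue3 )] };
my $secondary = { BottomKey1 => 12346, BottomKey2 => [qw( bavalue1 bavalue3 )] };

my @top_children = build_child_refs( { primary_ref => $primary, secondary_ref => $secondary } );
my ($array_child) = grep { ref $_->{primary_ref} eq 'ARRAY' } @top_children;
my @array_children = build_child_refs($array_child);

for my $child ( @top_children, @array_children ) {
    my $crumbs = join ' -> ', map { join ',', @$_ } @{ $child->{branch_ref} };
    my $match  = exists $child->{secondary_ref} ? 'yes' : 'no';
    print "branch [$crumbs] secondary available: $match\n";
}
```

The third array element has no secondary_ref because the secondary array is one element shorter, which corresponds to the "Secondary Position Does NOT Exist" comment in the SYNOPSIS output.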

%args

Arguments are accepted in either a hash or hashref style.

primary_ref

accepts a multilevel dataref - Mandatory
range HashRefs or ArrayRefs with string or number terminators

secondary_ref

accepts a multilevel dataref - Optional
range HashRefs or ArrayRefs with string or number terminators

branch_ref

default []
accepts an Array of Arrays, - Optional - discouraged

beware of messing with this since the module uses this for traceability
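Accepting %args in either a hash or hashref style usually comes down to a small normalizing step. The sketch below shows one common way to do that in core Perl. The sub normalize_args is hypothetical and is not claimed to be how walk_the_data handles its input.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either a key/value list or a single hash ref
# and always hand back a fresh hash ref.
sub normalize_args {
    my @args = @_;
    return +{ %{ $args[0] } } if @args == 1 and ref $args[0] eq 'HASH';
    die "Odd number of arguments passed\n" if @args % 2;
    return +{@args};
}

my $from_list = normalize_args( primary_ref => { Someotherkey => 'value' } );
my $from_ref  = normalize_args( { primary_ref => { Someotherkey => 'value' } } );

print 'list style keys:    ', join( ', ', sort keys %$from_list ), "\n";
print 'hashref style keys: ', join( ', ', sort keys %$from_ref ),  "\n";
```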

GLOBAL VARIABLES

$ENV{Smart_Comments}

The module uses Smart::Comments with the '-ENV' option so setting the variable $ENV{Smart_Comments} will turn on smart comment reporting. There are three levels of 'Smartness' called in this module '### #### #####'. See the Smart::Comments documentation for more information.

$Carp::Verbose

The module uses Carp to warn(carp) and die(croak) so the variable $Carp::Verbose can be set for more detailed debugging.

BUGS

Data-Walk-Extracted/issues

TODO

Support recursion through CodeRefs
Support recursion through Objects
Allow the sort_XXX attributes to receive a sort subroutine

SUPPORT

jandrew@cpan.org

AUTHOR

Jed Lund
jandrew@cpan.org

COPYRIGHT

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.

Dependencies

Modern::Perl
version
Carp
Moose
MooseX::Types::Moose
Smart::Comments -ENV option set
Data::Walk::Print - or other action object

SEE ALSO

Data::Walk
Data::Walker
Data::Dumper - Dump
YAML - Dump