NAME
CXC::Data::Visitor - Invoke a callback on every element at every level
of a data structure.
VERSION
version 0.12
SYNOPSIS
use CXC::Data::Visitor 'visit', 'RESULT_CONTINUE';
my %root = (
fruit => {
berry => 'purple',
apples => [ 'fuji', 'macoun' ],
} );
visit(
\%root,
sub ( $kydx, $vref, @ ) {
$vref->$* = 'blue' if $kydx eq 'berry';
return RESULT_CONTINUE;
} );
say $root{fruit}{berry}
results in
blue
DESCRIPTION
CXC::Data::Visitor::visit performs a depth-first traversal of a data
structure, invoking a provided callback subroutine on elements in the
structure.
Features
* The type of element passed to the callback (containers, terminal
elements) can be selected.
* The order of traversal at a given depth (i.e. within a container's
elements) may be customized.
* The callback can modify the traversal process.
* The complete path from the structure to an element (both the
ancestor containers and the keys and indexes required to traverse
the path) is available to the callback.
* Cycles are detected upon traversing a container a second time in a
depth first search, and the resultant action may be specified.
* Objects are treated as terminal elements and are not traversed.
* Containers that can be reached multiple times without cycling are
visited once per parent.
Overview
"visit" recursively traverses the container, $root, calling the passed
subroutine, $callback on each element, $element, which is allowed by the
"visit" option.
The traversal is depth-first, e.g. if $element is a container, $callback
is called on it and then its contents before processing $element's
siblings.
Each container's contents are traversed in sorted order. For hashes,
this is alphabetical, for arrays, numerical. (This may be changed with
the "key_sort" and "idx_sort" options).
For example, the default traversal order for the structure in the
"SYNOPSIS" is
+-------------------------+-----------------------+-----+
| path | value | idx |
+-------------------------+-----------------------+-----+
| $root{fruit} | \$root{fruit} | 0 |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 |
| $root{fruit}{apples}[0] | fuji | 0 |
| $root{fruit}{apples}[1] | macoun | 1 |
| $root{fruit}{berry} | purple | 1 |
+-------------------------+-----------------------+-----+
Containers that can be reached multiple times without cycling, e.g.
%hash = ( a => { b => 1 }, );
$hash{c} = $hash{a};
are visited once per parent, e.g.
{a}, {a}{b}
{c}, {c}{b}
$callback's return value indicates how "visit" should proceed (see
"Traversal Directives"). The simplest directive is to continue
traversal; additional directives abort the traversal, abort descent into
a container, revisit the current container immediately, revisit a
container after its contents are visited, and other obscure
combinations.
USAGE
"visit" has the following signature:
( $completed, $context, $metadata ) = visit( $root, $callback, %options )
The two mandatory arguments are $root, a reference to either a hash or
an array, and $callback, a reference to a subroutine which will be
invoked on visited elements. By default $callback is invoked on $root's
elements, not on $root itself; use the "VISIT_ROOT" flag change this.
$completed => *Boolean*
*true* if all elements were visited, *false* if $callback requested
a premature return.
$context
The variable of the same name passed to $callback; see the "context"
option.
$metadata => *hash*
collected metadata. See "Metadata".
Options
"visit" may be passed the following options:
context *optional*
Arbitrary data to be passed to "$callback" via the $context
argument. Use it for whatever you'd like. If not specified, a hash
will be created.
cycle => *constant|coderef*
How cycles within $root should be handled. See "Cycles".
visit => *constant*
Specify elements (by type) which will be passed to $callback. See
"Element Filters"
key_sort => *boolean* | $coderef
The order of keys when traversing hashes. If *true* (the default),
the order is that returned by Perl's "sort" routine. If *false*, it
is the order returned that Perl's "keys" routine.
If a coderef, it is used to sort the keys. It is called as
\@sorted_keys = $coderef->( \@unsorted_keys );
idx_sort => $coderef
By default array elements are traversed in order of their ascending
index. Use "idx_sort" to specify a subroutine which returns them in
an alternative order. It is called as
\@indices = $coderef->( $n );
where $n is the number of elements in the array.
sort_keys => *coderef*
*DEPRECATED*
An optional coderef which implements a caller specific sort order.
It is passed two keys as arguments. It should return -1, 0, or 1
indicating that the sort order of the first argument is less than,
equal to, or greater than that of the second argument.
revisit_limit
A container may be scanned multiple times during a visit to it. This
sets the maximum number of times the container is re-scanned during
a visit before "visit" throws an exception to avoid infinite loops.
This limit also applies to "RESULT_REVISIT_ROOT".
The defaults is 10. Set it to 0 to indicate no limit.
Callback
"visit" invokes $callback on selected elements of $root (see "Element
Filters"). $callback is invoked as
$directive = $callback->( $kydx, $vref, $context, \%metadata );
The arguments passed to $callback are:
$kydx
The location (key or index) of the element in its parent container.
This will be undefined when $callback is invoked on $root (see
"VISIT_ROOT").
$vref
A reference to the element. Use $vref->$* to extract or modify the
element's value. Do not cache this value; the full path to the
element is provided via the "$metadata" argument.
$context
A reference to data reserved for use by $callback. See the "context"
option.
$metadata
A hash of state information used to keep track of progress. While
primarily of use by "visit", some may be of interest to $callback.
See "Metadata"
Traversal Directives
"$callback" must return a constant (see "EXPORTS") indicating what
"visit" should do next. Not all constants are allowed in all contexts in
which $callback is invoked; see "Calling Contexts and Allowed Traversal
Directives".
Single Directives
RESULT_CONTINUE
Visit the next element in the parent container.
+-------------------------+-----------------------+-----+
| path | value | idx |
+-------------------------+-----------------------+-----+
| $root{fruit} | \$root{fruit} | 0 |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 |
| $root{fruit}{apples}[0] | fuji | 0 |
| $root{fruit}{apples}[1] | macoun | 1 |
| $root{fruit}{berry} | purple | 1 |
+-------------------------+-----------------------+-----+
RESULT_RETURN
Return immediately to the caller of "visit".
RESULT_STOP_DESCENT
If the current element is a hash or array, do not visit its
contents, and visit the next element in the parent container.
If the element is not a container, the next element in the container
will be visited (just as with "RESULT_CONTINUE").
For example, If "RESULT_STOP_DESCENT" is returned when
"$root{fruit}{apples}" is traversed, the traversal would look like
this:
+----------------------+-----------------------+-----+
| path | value | idx |
+----------------------+-----------------------+-----+
| $root{fruit} | \$root{fruit} | 0 |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 |
| $root{fruit}{berry} | purple | 1 |
+----------------------+-----------------------+-----+
RESULT_REVISIT_CONTENTS
Do not visit the next element in the parent container. restart with
the first element in the container. The order of elements is
determined when the container is visited, so starts within a visit
will have the same order.
For example, if "RESULT_REVISIT_CONTENTS" is returned the first time
"$root{fruit}{apples}[0]" is traversed, the traversal would look
like this:
+-------------------------+-----------------------+-----+-------+
| path | value | idx | visit |
+-------------------------+-----------------------+-----+-------+
| $root{fruit} | \$root{fruit} | 0 | 1 |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 | 1 |
| $root{fruit}{apples}[0] | fuji | 0 | 1 |
| $root{fruit}{apples}[0] | fuji | 0 | 2 |
| $root{fruit}{apples}[1] | macoun | 1 | 2 |
| $root{fruit}{berry} | purple | 1 | 1 |
+-------------------------+-----------------------+-----+-------+
To avoid inadvertent infinite loops, the number of revisits during a
traversal of a container is limited (see "revisit_limit").
Containers with multiple parents are traversed once per parent; The
limit is reset for each traversal.
RESULT_REVISIT_ROOT
Stop processing and re-start at $root. To avoid inadvertent infinite
loops, the number of revisits is limited (see "revisit_limit").
RESULT_REVISIT_ELEMENT
If the current element is not a container, the next element in the
container will be visited (just as with "RESULT_CONTINUE").
If the current element is a container, its contents will be visited,
and "$callback" will be invoked on it again afterwards.
During the call to $callback on the container prior to visiting its
contents,
$metadata->{pass} & PASS_VISIT_ELEMENT
will be true. During the followup visit
$metadata->{pass} & PASS_REVISIT_ELEMENT
will be true.
For example, If "RESULT_REVISIT_ELEMENT" is returned when
"$root{fruit}{apples}" is traversed, the traversal would look like
this:
+-------------------------+-----------------------+-----+----------------------+
| path | value | idx | pass |
+-------------------------+-----------------------+-----+----------------------+
| $root{fruit} | \$root{fruit} | 0 | PASS_VISIT_ELEMENT |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 | PASS_VISIT_ELEMENT |
| $root{fruit}{apples}[0] | fuji | 0 | PASS_VISIT_ELEMENT |
| $root{fruit}{apples}[1] | macoun | 1 | PASS_VISIT_ELEMENT |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 | PASS_REVISIT_ELEMENT |
| $root{fruit}{berry} | purple | 1 | PASS_VISIT_ELEMENT |
+-------------------------+-----------------------+-----+----------------------+
Mixed Directives
Some directives can be mixed with "RESULT_REVISIT_CONTENTS" and
"RESULT_REVISIT_ELEMENT" by performing a binary OR with them.
RESULT_STOP_DESCENT | RESULT_REVISIT_CONTENTS
If the current element is not a container, the next element in the
container will be visited (just as with "RESULT_CONTINUE").
If the current element is a hash or array, do not visit its
contents, and continue with the next element in the parent
container. For non-container elements, continue with the next
element in the parent container.
After all of the container's contents have been visited, start again
with the first element in the container.
For example, if "RESULT_STOP_DESCENT | RESULT_REVISIT_CONTENTS" is
returned when "$root{fruit}{apples}" is traversed when
"$metadata-"{visit} ==1>, the traversal would look like
+----------------------+-----------------------+-----+-------+
| path | value | idx | visit |
+----------------------+-----------------------+-----+-------+
| $root{fruit} | \$root{fruit} | 0 | 1 |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 | 1 |
| $root{fruit}{berry} | purple | 1 | 1 |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 | 2 |
| $root{fruit}{berry} | purple | 1 | 2 |
+----------------------+-----------------------+-----+-------+
RESULT_CONTINUE | RESULT_REVISIT_CONTENTS
Visit the remaining elements in the parent container, then start
again with the first element in the container.
For example, if "RESULT_CONTINUE | RESULT_REVISIT_CONTENTS" is
returned when $callback is first passed "$root{fruit}{apples}[0]",
the traversal would look like
+-------------------------+-----------------------+-----+-------+
| path | value | idx | visit |
+-------------------------+-----------------------+-----+-------+
| $root{fruit} | \$root{fruit} | 0 | 1 |
| $root{fruit}{apples} | \$root{fruit}{apples} | 0 | 1 |
| $root{fruit}{apples}[0] | fuji | 0 | 1 |
| $root{fruit}{apples}[1] | macoun | 1 | 1 |
| $root{fruit}{apples}[0] | fuji | 0 | 2 |
| $root{fruit}{apples}[1] | macoun | 1 | 2 |
| $root{fruit}{berry} | purple | 1 | 1 |
+-------------------------+-----------------------+-----+-------+
Calling Contexts and Allowed Traversal Directives
$callback's allowed return value depends upon the context it is called
in. $callback may be called on an element multiple times during
different stages of traversal.
When invoked on an element during a scan of its parent container
* The "pass" metadata attribute is set to "PASS_VISIT_ELEMENT"
* $callback must return one of
RESULT_REVISIT_CONTENTS
RESULT_RETURN
RESULT_CONTINUE
RESULT_CONTINUE | RESULT_REVISIT_CONTENTS
RESULT_STOP_DESCENT | RESULT_REVISIT_CONTENTS
RESULT_CONTINUE | RESULT_REVISIT_ELEMENT
When invoked on a container immediately after its contents have been visited
See "RESULT_REVISIT_ELEMENT".
* The "pass" metadata attribute is set to "PASS_REVISIT_ELEMENT"
* $callback must return one of
RESULT_RETURN
RESULT_CONTINUE
RESULT_REVISIT_CONTENTS
RESULT_CONTINUE | RESULT_REVISIT_CONTENTS
When invoked on $root before its contents have been visited
See "VISIT_ROOT".
* The "pass" metadata attribute is set to "PASS_VISIT_ELEMENT"
* $callback must return one of
RESULT_CONTINUE
RESULT_CONTINUE | RESULT_REVISIT_ELEMENT
RESULT_RETURN
RESULT_REVISIT_ROOT
RESULT_STOP_DESCENT
When invoked on the $root immediately after its elements have been visited
See "VISIT_ROOT" and "RETURN_REVISIT_ELEMENT".
* The "pass" metadata attribute is set to "PASS_REVISIT_ELEMENT"
* $callback must return one of
RESULT_RETURN
RESULT_CONTINUE
Metadata
$callback is passed a hash of state information ($metadata) kept by
CXC::Data::Visitor::visit, some of which may be of interest to the
callback:
$metadata has the following entries:
container
A reference to the hash or array which contains the element being
visited.
path
An array which contains the path (keys and indices) used to arrive
at the current element from $root.
ancestors
An array containing references to the ancestor containers of the
current element.
pass
A constant indicating the current visit pass of an element. See
"RESULT_REVISIT_ELEMENT".
visit
A unary-based counter indicating the number of times the element's
container has been scanned and its contents processed in a single
visit. This will be greater than 1 if the "RESULT_REVISIT_CONTENTS"
directive has been applied. It is *not* the number of times that the
element has been visited, as scans may be interrupted and restarted.
idx A zero-based index indicating the order of the element in its
container. Ordering depends upon how container elements are sorted;
see "key_sort" and "idx_sort".
Element Filters
The parts of the structure that will trigger a callback. Note that by
default the passed top level structure, $root is *not* passed to the
callback. See "VISIT_ROOT".
See "EXPORTS" to import the constants.
VISIT_CONTAINER
Invoke "$callback" on containers (either hashes or arrays). For
example, the elements in the following structure
$root = { a => { b => 1, c => [ 2, 3 ] } }
passed to "$callback" are:
a => {...} # $root->{a}
c => [...] # $root->{c}
VISIT_ARRAY
VISIT_HASH
Only visit containers of the given type.
VISIT_LEAF
Invoke "$callback" on terminal (leaf) elements. For example, the
elements in the following structure
$root = { a => { b => 1, c => [ 2, 3 ] } }
passed to "$callback" are:
b => 1 # $root->{a}{b}
0 => 2 # $root->{a}{c}[0]
1 => 3 # $root->{a}{c}[1]
VISIT_ALL
Invoke "$callback" on all elements except for $root. This is the
default.
VISIT_ROOT
Pass $root to $callback. To filter on one of the other values, pass
a binary OR of "VISIT_ROOT" and the other filter, e.g.
VISIT_ROOT | VISIT_LEAF
Specifying "VISIT_ROOT" on its own is equivalent to
VISIT_ROOT | VISIT_ALL
Cycles
CYCLE_DIE
Throw an exception (the default).
CYCLE_CONTINUE
Pretend we haven't seen it before. Will cause stack exhaustion if
$callback does handle this.
CYCLE_TRUNCATE
Truncate before entering the cycle a second time.
*$coderef*
Examine the situation and request a particular resolution. $coderef
is called as
$coderef->( $container, $context, $metadata );
where $container is the hash or array which has already been
traversed. See below for "$context" and "$metadata".
$coderef should return one of CYCLE_DIE, CYCLE_CONTINUE, or
CYCLE_TRUNCATE, indicating what should be done.
EXPORTS
This module uses Exporter::Tiny, which provides enhanced import
utilities.
Subroutines
The following subroutine may be imported:
visit
Constants
Constants may be imported individually or as groups via tags. The
available tags and their respective imported symbols are:
all Import all symbols.
results
RESULT_CONTINUE
RESULT_RETURN
RESULT_REVISIT_CONTAINER # deprecated alias for RESULT_REVISIT_CONTENTS
RESULT_REVISIT_CONTENTS
RESULT_REVISIT_ELEMENT
RESULT_REVISIT_ROOT
RESULT_STOP_DESCENT
cycles
CYCLE_CONTINUE
CYCLE_DIE
CYCLE_TRUNCATE
visits
VISIT_ALL
VISIT_CONTAINER
VISIT_LEAF
VISIT_ROOT
passes
PASS_REVISIT_ELEMENT
PASS_VISIT_ELEMENT
constants
Import tags "cycles", "passes", "results", "visits".
DEPRECATED CONSTRUCTS
* RESULT_REVISIT_CONTAINER is a deprecated alias for
"RESULT_REVISIT_CONTENTS".
SUPPORT
Bugs
Please report any bugs or feature requests to
bug-cxc-data-visitor@rt.cpan.org or through the web interface at:
<https://rt.cpan.org/Public/Dist/Display.html?Name=CXC-Data-Visitor>
Source
Source is available at
https://codeberg.org/CXC-Optics/p5-CXC-Data-Visitor
and may be cloned from
https://codeberg.org/CXC-Optics/p5-CXC-Data-Visitor.git
SEE ALSO
Please see those modules/websites for more information related to this
module.
* Data::Rmap
* Data::Traverse
* Data::Visitor::Lite
* Data::Visitor::Tiny
* Data::Walk
AUTHOR
Diab Jerius <djerius@cpan.org>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2024 by Smithsonian Astrophysical
Observatory.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007