NAME
Data::Path::XS - Fast path-based access to nested data structures
SYNOPSIS
use Data::Path::XS qw(path_get path_set path_delete path_exists);
my $data = { foo => { bar => [1, 2, 3] } };
path_get ($data, '/foo/bar/1'); # 2
path_set ($data, '/foo/bar/1', 42); # 42 (returns the value set)
path_exists($data, '/foo/baz'); # 0
path_delete($data, '/foo/bar/0'); # 1 (returns the deleted value)
# Pre-parsed path components (binary-safe; allows "/" in keys)
use Data::Path::XS qw(patha_get patha_set);
patha_get($data, ['foo', 'bar', 1]);
patha_set($data, ['foo', 'new'], 'value');
# Pre-compiled paths for hot loops
use Data::Path::XS qw(path_compile pathc_get);
my $cp = path_compile('/foo/bar/1');
my $other = { foo => { bar => [4, 5, 6] } };
pathc_get($data, $cp); # 42
pathc_get($other, $cp); # 5 - reuse across data
# Keyword syntax (compile-time optimized)
use Data::Path::XS ':keywords';
my $v = pathget $data, "/foo/bar/1";
pathset $data, "/foo/bar/1", 99;
pathdelete $data, "/foo/bar/1";
print "ok\n" if pathexists $data, "/foo/bar";
DESCRIPTION
Fast XS access to deeply nested Perl data structures via slash-separated paths (similar in shape to JSON Pointer, but without RFC 6901's ~0/~1 escaping). Four parallel APIs let you trade ergonomics against speed:
"STRING PATH API" - path_*, the general-purpose entry point.
"ARRAY PATH API" - patha_*, when path components are already parsed or may contain / or other special characters.
"COMPILED PATH API" - path_compile + pathc_*, when the same path is reused many times on different data.
"KEYWORDS API" - pathget/pathset/etc. as syntax via XS::Parse::Keyword, compiled to inline custom ops or (where possible) native Perl assignment ops.
All four APIs share the same path syntax ("PATH FORMAT") and the same container-dispatch semantics ("Numeric vs String Keys").
IMPORTING
use Data::Path::XS qw(path_get path_set ...); # function exports
use Data::Path::XS ':keywords'; # enable keyword syntax
use Data::Path::XS ':keywords', qw(path_get); # both
The :keywords tag installs lexically-scoped keyword hints; the keywords are visible only inside the importing scope. no Data::Path::XS; removes them. Function exports follow standard Exporter rules.
PATH FORMAT
Components are separated by /. A leading / is optional: "/foo/bar" and "foo/bar" are equivalent.
An empty string or "/" refers to the root. Repeated and trailing slashes ("//foo//") are tolerated and yield the same components.
Numeric components may address array elements when the parent container is an array; on a hash parent the same string is treated as a hash key. See "Numeric vs String Keys".
Negative indices work like Perl's native array access (-1 is the last element). See "Negative Array Indices".
No escaping is provided in the string API: keys containing / or the empty string cannot be expressed in a string path. Use the array API (e.g. patha_get($data, ['', 'a/b'])) for those.
UTF-8 keys are propagated correctly. The path SV's SvUTF8 flag (or, in the array API, each key SV's flag) is forwarded to hv_fetch/hv_store so "/café" matches hash keys stored under use utf8.
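The slash-handling rules above can be modelled in a few lines of pure Perl. This is only an illustrative sketch: split_path is a hypothetical helper, not a function exported by Data::Path::XS, and the module's XS parser may be implemented quite differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Path::XS): split a string path
# into components, dropping the empty pieces that leading, repeated, or
# trailing slashes would otherwise produce.
sub split_path {
    my ($path) = @_;
    return grep { length } split m{/}, $path;
}

my @a    = split_path('/foo/bar');      # ('foo', 'bar')
my @b    = split_path('foo/bar');       # ('foo', 'bar') - leading slash optional
my @c    = split_path('//foo//bar/');   # ('foo', 'bar') - extra slashes tolerated
my @root = split_path('/');             # ()             - the root
```

Because empty pieces are discarded, a sketch like this cannot express an empty-string key either, which mirrors why the array API is needed for that case.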
Numeric vs String Keys
All four APIs dispatch by parent container type, not by key shape:
my $h = { '0' => 'zero' };
path_get($h, '/0'); # 'zero' - hash key
pathget $h, "/0"; # 'zero' - same
my $a = ['x', 'y', 'z'];
path_get($a, '/0'); # 'x' - array index
pathget $a, "/0"; # 'x' - same
When autovivifying a missing intermediate, the type to create is chosen by the next component's shape: a numeric next component creates an array, otherwise a hash.
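As a rough illustration of that rule, the following pure-Perl sketch creates an array when the next component looks like a non-negative integer and a hash otherwise. The names is_index and sketch_set are hypothetical, the numeric test is a simplifying assumption, and no claim is made that the XS code works this way internally.

```perl
use strict;
use warnings;

# Assumption for this sketch: "numeric" means 0 or a non-negative integer
# without leading zeros.
sub is_index { $_[0] =~ /\A(?:0|[1-9][0-9]*)\z/ ? 1 : 0 }

# Hypothetical model of a set operation over pre-split components.
sub sketch_set {
    my ($root, $value, @keys) = @_;
    my $last = pop @keys;
    my $node = $root;
    for my $i (0 .. $#keys) {
        # The component that follows decides which container to create.
        my $next = $i < $#keys ? $keys[$i + 1] : $last;
        my $slot = ref $node eq 'ARRAY' ? \$node->[ $keys[$i] ]
                                        : \$node->{ $keys[$i] };
        $$slot = is_index($next) ? [] : {} unless ref $$slot;
        $node = $$slot;
    }
    if (ref $node eq 'ARRAY') { $node->[$last] = $value }
    else                      { $node->{$last} = $value }
    return $value;
}

my $data = {};
sketch_set($data, 'first', 'items', 0, 'name');
# $data is now { items => [ { name => 'first' } ] }
```

Here "items" is followed by 0, so an array is created under it; 0 is followed by "name", so a hash is created at index 0.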
STRING PATH API
path_get($data, $path)
Returns the value at $path, or undef if any component is missing. An empty path returns $data itself. Never autovivifies.
path_get($data, '/foo/bar');
path_get($data, ''); # returns $data
path_set($data, $path, $value)
Stores $value at $path, creating intermediate hashes/arrays as needed (see "Numeric vs String Keys" for the type-decision rule). Existing non-reference scalars at intermediate positions are silently replaced. Returns $value. Croaks on an empty path or on a path that cannot be navigated (e.g. through a tied container, see "Tied containers").
path_set($data, '/foo/bar', 42);
path_set($data, '/items/0/name', 'first'); # autovivifies array
path_delete($data, $path)
Deletes the value at $path and returns it, or undef if not found. Croaks on an empty path.
my $old = path_delete($data, '/foo/bar');
path_exists($data, $path)
Returns 1 if $path resolves to an existing element (using exists semantics: explicit undef values count as existing), 0 otherwise. The empty path always exists.
do_thing() if path_exists($data, '/foo/bar');
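The difference between exists semantics and a definedness check matters when a key holds an explicit undef. The snippet below shows the distinction with native Perl operators on a small structure; it does not call the module, and the variable names are illustrative only.

```perl
use strict;
use warnings;

my $data = { foo => { bar => undef } };

# Native-Perl view of "/foo/bar" (present, value undef) and "/foo/baz" (absent).
my $bar_exists  = exists  $data->{foo}{bar} ? 1 : 0;   # 1
my $bar_defined = defined $data->{foo}{bar} ? 1 : 0;   # 0
my $baz_exists  = exists  $data->{foo}{baz} ? 1 : 0;   # 0
```

A get returns undef for both paths, so an exists-style check is the one that can tell "stored as undef" apart from "not there".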
ARRAY PATH API
The patha_* functions take an arrayref of components instead of a slash-separated string. Use this when path pieces are already parsed, when keys may contain /, or when you want to address an empty-string key (['']).
Each key SV's SvUTF8 flag is honoured per component.
patha_get($data, \@path)
patha_get($data, ['foo', 'bar', 0]);
patha_get($data, []); # returns $data
patha_set($data, \@path, $value)
patha_set($data, ['foo', 'bar'], 42);
patha_delete($data, \@path)
patha_delete($data, ['foo', 'bar']);
patha_exists($data, \@path)
patha_exists($data, ['foo', 'bar']);
COMPILED PATH API
Pre-compile a path once, then reuse it for many lookups. The compiled object holds parsed components, pre-computed array indices, and the UTF-8 flag, so per-call overhead drops to the navigation itself.
path_compile($path)
Returns a compiled path object (a blessed reference). The object owns its own copy of the path string, so the caller may freely mutate or discard the source SV.
my $cp = path_compile('/users/0/name');
pathc_get($data, $compiled)
for my $record (@records) {
my $val = pathc_get($record, $cp);
}
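The parse-once, navigate-many idea can be approximated in pure Perl with a closure. This is a hedged sketch for intuition only: compile_path_sketch is a hypothetical name, the index test is simplified, and the real compiled object is an XS structure, not a closure.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a compiled path: the string is split once and
# the returned closure only walks the structure.
sub compile_path_sketch {
    my ($path) = @_;
    my @keys = grep { length } split m{/}, $path;
    return sub {
        my ($node) = @_;
        for my $k (@keys) {
            if (ref $node eq 'HASH') {
                return undef unless exists $node->{$k};
                $node = $node->{$k};
            }
            elsif (ref $node eq 'ARRAY') {
                return undef unless $k =~ /\A-?[0-9]+\z/ && exists $node->[$k];
                $node = $node->[$k];
            }
            else {
                return undef;   # cannot descend into a non-container
            }
        }
        return $node;
    };
}

my $getter  = compile_path_sketch('/users/0/name');
my @records = (
    { users => [ { name => 'Alice' } ] },
    { users => [ { name => 'Bob' } ] },
    { users => [] },
);
my @names = map { $getter->($_) } @records;   # ('Alice', 'Bob', undef)
```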
pathc_set($data, $compiled, $value)
pathc_set($data, $cp, 'new value');
pathc_delete($data, $compiled)
pathc_delete($data, $cp);
pathc_exists($data, $compiled)
pathc_exists($data, $cp);
KEYWORDS API
use Data::Path::XS ':keywords';
The keywords compile to either an inline custom op or, where the path allows, native Perl assignment ops. They never call into XSUB dispatch and so reach near-native speed.
pathget DATA, PATH
Get a value. Returns undef for missing paths and never autovivifies.
my $val = pathget $data, "/users/0/name";
pathset DATA, PATH, VALUE
Set a value, autovivifying intermediates as needed. Returns VALUE.
pathset $data, "/users/0/name", "Alice";
pathdelete DATA, PATH
Delete a value and return it.
my $old = pathdelete $data, "/users/0/name";
pathexists DATA, PATH
True if PATH exists.
print "found\n" if pathexists $data, "/users/0/name";
Constant vs Dynamic Paths
When pathset is called with a compile-time constant path that contains only string components (no numeric pieces) and does not carry the SvUTF8 flag (i.e. is not authored under use utf8), the keyword compiles directly to a native HELEM-chain assignment with autovivification - zero per-call overhead.
Because this uses Perl's native ops, error messages match Perl's (e.g. "Not a HASH reference") rather than this module's ("Cannot navigate to path"), and a non-reference intermediate causes a croak rather than being silently replaced.
In every other case (numeric component, UTF-8 path, non-constant path), the keyword falls through to a custom op with the same semantics as path_set.
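To see what the native-op behaviour looks like in practice, the snippet below performs the equivalent hash-element chain assignments in plain Perl under strict. It does not use the keywords themselves; it only shows the errors Perl raises for a non-reference intermediate and for an intermediate of the wrong reference type.

```perl
use strict;
use warnings;

# Intermediate is a plain string: native assignment dies under strict refs
# instead of replacing the string with a new hash.
my $data = { foo => 'plain string' };
my $ok   = eval { $data->{foo}{bar} = 1; 1 };
my $err  = $ok ? '' : $@;    # Can't use string ("plain string") as a HASH ref ...

# Intermediate is an array reference: Perl's own "Not a HASH reference".
my $other = { foo => [] };
my $ok2   = eval { $other->{foo}{bar} = 1; 1 };
my $err2  = $ok2 ? '' : $@;  # Not a HASH reference ...
```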
The other three keywords (pathget, pathexists, pathdelete) always use custom ops.
EDGE CASES
Empty Paths
The empty path ("", "/", "///") addresses the root:
path_get ($data, ""); # $data
path_exists($data, "/"); # 1
path_set ($data, "", $v); # croaks "Cannot set root"
path_delete($data, ""); # croaks "Cannot delete root"
Negative Array Indices
Negative indices behave like Perl's:
my $data = { arr => ['a', 'b', 'c'] };
path_get($data, '/arr/-1'); # 'c'
path_set($data, '/arr/-1', 'z'); # arr now ['a','b','z']
Out-of-range negative indices return undef (or false for exists).
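For reference, this is the native Perl array behaviour being mirrored. The snippet uses only core Perl and does not call the module.

```perl
use strict;
use warnings;

my @arr = ('a', 'b', 'c');

my $last = $arr[-1];                          # 'c' - counts from the end
$arr[-1] = 'z';                               # @arr is now ('a', 'b', 'z')

my $out_of_range = $arr[-10];                 # undef, no error when reading
my $oob_exists   = exists $arr[-10] ? 1 : 0;  # 0
```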
Leading Zeros
Strings with leading zeros are treated as hash keys, not array indices:
path_get($data, '/arr/007'); # $data->{arr}{'007'}
path_get($data, '/arr/0'); # $data->{arr}[0] (single zero ok)
Integer Overflow
Indices with more than 18 digits (9 on 32-bit perls) are treated as hash keys to prevent overflow:
path_get($data, '/arr/12345678901234567890'); # hash key
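The leading-zero and digit-count rules can be summarised as a small classifier. The sketch below is an approximation written for this document: looks_like_index is a hypothetical name, and counting only the digits (not the sign) against the limit is an assumption, since the text does not say how negative values are counted.

```perl
use strict;
use warnings;

# Hypothetical classifier: 1 if a component would be usable as an array
# index under the rules described above, 0 if it is treated as a hash key.
sub looks_like_index {
    my ($component, $max_digits) = @_;
    $max_digits //= 18;    # documented limit on 64-bit perls; 9 on 32-bit
    # A single zero, or digits without a leading zero, optionally negative.
    return 0 unless $component =~ /\A-?(0|[1-9][0-9]*)\z/;
    return length($1) <= $max_digits ? 1 : 0;
}

my %verdict = map { $_ => looks_like_index($_) }
    ('0', '42', '-1', '007', '12345678901234567890', 'name');
# '0', '42', '-1' => 1;  '007', the 20-digit string, 'name' => 0
```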
LIMITATIONS
Tied containers
Read and delete operations (path_get, path_exists, path_delete, and their array/compiled/keyword counterparts) work on tied hashes and arrays via the standard fetch/exists/delete magic.
Write operations (path_set, patha_set, pathc_set, and the pathset keyword) currently croak with a message of the form "Cannot ... on tied/magical hash" or "... on tied/magical array", rather than invoking the tied STORE method. For tied write targets, assign through native Perl syntax. This limitation may be relaxed in a future release.
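A minimal sketch of the native-syntax workaround, using the core Tie::StdHash class. CountingHash is a hypothetical tied class invented for this example; it counts STORE calls to show that ordinary Perl assignment does go through the tie.

```perl
use strict;
use warnings;
use Tie::Hash;    # core module; provides Tie::StdHash

# Hypothetical tied class that records how often STORE is invoked.
package CountingHash {
    our @ISA    = ('Tie::StdHash');
    our $stores = 0;
    sub STORE {
        my ($self, $key, $value) = @_;
        $stores++;
        $self->{$key} = $value;
    }
}

tie my %config, 'CountingHash';
my $data = { config => \%config };

# Native assignment into the tied hash invokes STORE.
$data->{config}{level} = 3;

my $level = $data->{config}{level};    # 3, read back through FETCH
```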
THREAD SAFETY
The module uses no global state and is safe in threaded programs as long as each thread operates on its own data. No locking is performed on shared structures.
Compiled-path objects own internal buffers and should not be shared across threads; create one per thread.
PERFORMANCE
Indicative numbers from bench/benchmark.pl on a single sample run (rate per second, higher is better):
Operation                   Pure Perl   Native Perl   Data::Path::XS
--------------------------  ----------  ------------  --------------
path_get shallow            2.1 M/s     35.4 M/s      22.6 M/s
path_get deep (5 levels)    0.8 M/s     7.0 M/s       8.6 M/s
path_get missing key        1.3 M/s     4.4 M/s       14.7 M/s
path_set deep existing      0.8 M/s     8.1 M/s       7.3 M/s
pathget kw const shallow    -           37.5 M/s      42.2 M/s
pathget kw const deep       -           7.3 M/s       8.5 M/s
pathexists kw const deep    -           6.3 M/s       10.2 M/s
The keyword API matches or exceeds native Perl on most workloads. The compiled API adds another ~20-35% on hot paths by skipping parsing. Run bench/benchmark.pl for a fuller comparison on your hardware.
SEE ALSO
Data::Diver - pure-Perl deep accessor with similar reach.
JSON::Pointer - RFC 6901 path syntax (with ~0/~1 escaping) over the same kinds of structures.
Data::DPath - XPath-like queries over data.
XS::Parse::Keyword - the keyword-plugin framework used to install the pathget/pathset/pathdelete/pathexists syntax.
AUTHOR
vividsnow
BUGS
Please report issues at https://github.com/vividsnow/perl5-data-path-xs/issues.
LICENSE
This is free software; you can redistribute it and/or modify it under the same terms as Perl itself.