NAME

Data::RadixTree::Shared - shared-memory compressed radix tree (prefix tree) for Linux

SYNOPSIS

use Data::RadixTree::Shared;

# anonymous shared mapping: 4096-node pool, 64 KiB label arena
my $t = Data::RadixTree::Shared->new(undef, 4096, 65536);

$t->insert("foo", 1);          # 1 (newly inserted)
$t->insert("foobar", 2);       # 1
$t->insert("foo", 9);          # 0 (key existed -> value updated)

$t->lookup("foo");             # 9
$t->get("foobar");             # 2  ('get' is an alias for lookup)
$t->exists("foo");             # true
$t->contains("fo");            # false ('contains' is an alias for exists)

# longest-prefix match: the longest stored key that is a prefix of the query
$t->insert("10",       100);
$t->insert("10.0",     200);
$t->insert("10.0.0",   300);
$t->longest_prefix("10.0.0.5");   # 300 (value of "10.0.0")
$t->longest_prefix("9");          # undef (no stored key is a prefix)

$t->delete("foo");             # 1 (removed); 'remove' is an alias
$t->count;                     # number of stored keys ('size' is an alias)
$t->clear;                     # empty the tree

# share across processes via a backing file
my $shared = Data::RadixTree::Shared->new("/tmp/routes.bin", 1 << 16, 1 << 20);

DESCRIPTION

A compressed radix tree (a radix-256, PATRICIA-style trie) in shared memory. It maps arbitrary byte-string keys to unsigned-integer values. Edge compression collapses chains of single-child nodes into one labelled edge, so insert and lookup are O(key length) regardless of how many keys are stored, and the structure stays compact.

Beyond exact lookup, a radix tree answers longest-prefix queries efficiently: given a query string, longest_prefix returns the value of the longest stored key that is a prefix of the query. This is exactly what routing tables, dispatch tables and autocomplete back-ends need (e.g. find the most specific CIDR-like prefix that covers an address).

Keys are raw bytes: a key containing wide characters croaks (encode to bytes first, e.g. utf8::encode or Encode::encode). Values are unsigned integers (a Perl UV, stored as 64-bit). Because the node pool and label arena live in a shared mapping, several processes share one tree: any process that opens the same backing file, inherits the anonymous mapping across fork, or reopens a passed memfd, sees the others' keys. A write-preferring futex rwlock with dead-process recovery guards every mutation, so concurrent inserts and deletes serialize cleanly.

lookup, get, exists, contains and longest_prefix are pure reads (no path compression) and take the read lock, so many processes can query in parallel. insert, delete and clear take the write lock.

v1 NOTE -- delete is lazy: delete unmarks the key (so lookup and exists no longer find it and count drops) but does not reclaim the node-pool slots or the label-arena bytes the key occupied. Size the capacities for your working set, or call clear to reset the whole tree and reclaim all space at once.

Capacity is fixed at creation. The node pool holds node_capacity nodes and the label arena holds arena_capacity bytes; insert croaks (after releasing the lock) when either would overflow. A single insert consumes at most two nodes and at most length(key) arena bytes, so size both with headroom for your key set. Linux-only. Requires 64-bit Perl.

METHODS

Constructors

my $t = Data::RadixTree::Shared->new($path, $node_capacity, $arena_capacity);
my $t = Data::RadixTree::Shared->new(undef, $node_capacity, $arena_capacity); # anonymous
my $t = Data::RadixTree::Shared->new_memfd($name, $node_capacity, $arena_capacity);
my $t = Data::RadixTree::Shared->new_from_fd($fd);

$path is the backing file (undef or omitted for an anonymous mapping). $node_capacity (default 4096) is the number of tree nodes the pool holds; it must be >= 2 (one slot is the reserved NIL sentinel, one is the root) and <= 2**24. $arena_capacity (default 65536) is the number of bytes the label arena holds; it must be >= 1 and <= 0xF0000000 (~3.75 GiB). new and new_memfd croak if either capacity is out of range. A freshly created tree is empty (count == 0).

When reopening an existing file or memfd, the stored geometry wins and the existing keys are preserved; the capacities you pass to new on a reopen are only used when the file is brand new. new_memfd creates a Linux memfd (transferable via its memfd descriptor); new_from_fd reopens one in another process.

Insert, lookup, prefix queries

my $new   = $t->insert($key, $value);   # 1 if newly inserted, 0 if updated
my $value = $t->lookup($key);           # the stored value, or undef
my $value = $t->get($key);              # alias for lookup
my $bool  = $t->exists($key);           # true if $key is stored
my $bool  = $t->contains($key);         # alias for exists
my $value = $t->longest_prefix($query); # value of the longest stored prefix, or undef

insert stores $value (default 1) under $key, returning 1 if the key was new or 0 if it already existed (the value is updated either way). It croaks if the node pool or label arena is exhausted -- the croak happens after the write lock is released, and the tree is left consistent (every already-inserted key is still present and queryable).

Because the capacity check runs before the key is looked up, in a full tree insert croaks even for an update of an existing key (which would need no new space); size the capacities with headroom, or call clear.

lookup (aliased get) returns the value stored under $key, or undef if the key is not present. exists (aliased contains) returns a boolean.

longest_prefix returns the value of the longest stored key that is a prefix of $query, or undef if no stored key is a prefix of it. (The empty string, if stored, is a prefix of every query.) It returns the matched key's value, not the key itself; recovering the matched key string is deferred to a future version.

Keys are byte strings; a key (or query) containing wide characters croaks before any lock is taken. Values are unsigned integers stored as 64-bit.

Delete and clear

my $removed = $t->delete($key);   # 1 if removed, 0 if absent
my $removed = $t->remove($key);   # alias for delete
$t->clear;                        # remove every key

delete (aliased remove) removes $key, returning 1 if it was present or 0 if it was absent. As noted above, delete is lazy in v1: it unmarks the key and decrements count, but does not reclaim node or arena space. Re-inserting the same key reuses its existing path at no extra cost; inserting many distinct keys after deletes still consumes fresh capacity. clear empties the tree and reclaims all node and arena space at once.

Size and stats

my $n = $t->count;   # number of stored keys
my $n = $t->size;    # alias for count

count (aliased size) is the current number of distinct stored keys. See "STATS" for the full stats hashref.

Lifecycle

$t->path; $t->memfd; $t->sync; $t->unlink;   # or Class->unlink($path)

sync flushes the mapping to its backing store (a no-op for anonymous and memfd trees, which have none); unlink removes the backing file (also callable as Class->unlink($path)); path returns the backing path (undef for anonymous, memfd, or fd-reopened trees) and memfd the backing descriptor -- the memfd of a new_memfd tree or the dup'd fd of a new_from_fd tree, and -1 for file-backed or anonymous trees.

STATS

stats() returns a hashref describing the tree:

  • keys -- the number of distinct stored keys (same as count).

  • nodes_used -- high-water number of node-pool slots ever allocated (includes the NIL sentinel and the root). With lazy delete this never decreases except on clear.

  • nodes_capacity -- the fixed node-pool capacity.

  • arena_used -- bytes of label arena consumed (never decreases except on clear).

  • arena_capacity -- the fixed label-arena capacity in bytes.

  • ops -- running count of operations that took the write lock (insert, delete, clear).

  • mmap_size -- bytes of the shared mapping.

SHARING ACROSS PROCESSES

The tree lives in a shared mapping, shared the same three ways as the rest of the family: a backing file (every process calls new($path, ...) on the same path), an anonymous mapping inherited across fork, or a memfd whose descriptor is passed to an unrelated process (over a UNIX socket via SCM_RIGHTS, or via /proc/$pid/fd/$n) and reopened with new_from_fd($fd). Because the mapping is shared, every process inserts into and queries the same tree. All mutation is serialized by the write lock, so the final contents are independent of how the processes interleave.

# parent and children share one radix tree with no coordination
my $t = Data::RadixTree::Shared->new(undef, 1 << 16, 1 << 20);   # before fork
unless (fork) { $t->insert("child-$_", $_) for 1 .. 1000; exit }
wait;
print $t->count, "\n";   # reflects the child's inserts

SECURITY

The mmap region is writable by all processes that open it, and a reopened file is trusted: validation checks the header geometry but cannot vet every internal node index against a maliciously crafted file. A hostile file could direct a walk to an out-of-range node index. Do not share backing files with, or reopen files from, untrusted processes.

CRASH SAFETY

Mutation is guarded by a futex-based write-preferring rwlock with PID-encoded ownership; if a holder dies, the next contender detects the dead owner and recovers. Because an insert performs all of its node/arena allocations under the lock after a capacity pre-check, a crash leaves the tree consistent up to the last completed insert or delete. Limitation: PID reuse is not detected (very unlikely in practice).

SEE ALSO

Data::DisjointSet::Shared, Data::Intern::Shared, Data::SortedSet::Shared, Data::SpatialHash::Shared, Data::Histogram::Shared, Data::CountMinSketch::Shared, Data::HyperLogLog::Shared, Data::BloomFilter::Shared, and the rest of the Data::*::Shared family.

AUTHOR

vividsnow

LICENSE

This is free software; you can redistribute it and/or modify it under the same terms as Perl itself.