NAME
Data::HashMap::Shared - Type-specialized shared-memory hash maps for multiprocess access
SYNOPSIS
use Data::HashMap::Shared::II;
# Create or open a shared map (file-backed mmap)
my $map = Data::HashMap::Shared::II->new('/tmp/mymap.shm', 100000);
# Keyword API (fastest)
shm_ii_put $map, 42, 100;
my $val = shm_ii_get $map, 42;
# Method API
$map->put(42, 100);
my $v = $map->get(42);
# Atomic counters (lock-free fast path)
shm_ii_incr $map, 1;
shm_ii_incr_by $map, 1, 10;
shm_ii_max $map, 1, 50; # monotonic: store max(current, 50)
# Compare-and-swap (all variants; byte-compare for string values)
shm_ii_cas $map, 1, 11, 42; # swap to 42 only if current == 11
# LRU cache (evicts least-recently-used when full)
my $cache = Data::HashMap::Shared::II->new('/tmp/cache.shm', 100000, 1000);
shm_ii_put $cache, 42, 100; # auto-evicts LRU entry if size > 1000
# TTL (entries expire after N seconds)
my $ttl_map = Data::HashMap::Shared::II->new('/tmp/ttl.shm', 100000, 0, 60);
shm_ii_put $ttl_map, 1, 10; # expires in 60s
shm_ii_put_ttl $ttl_map, 2, 20, 5; # per-key: expires in 5s
# Multiprocess
if (fork() == 0) {
my $child = Data::HashMap::Shared::II->new('/tmp/mymap.shm', 100000);
shm_ii_incr $child, 1; # atomic increment visible to parent
exit;
}
wait;
DESCRIPTION
Data::HashMap::Shared provides type-specialized hash maps stored in file-backed shared memory (mmap(MAP_SHARED)), enabling efficient multiprocess data sharing on Linux.
Linux-only. Requires 64-bit Perl.
Features
File-backed mmap for cross-process sharing
Futex-based read-write lock (fast userspace path)
Lock-free atomic counters (incr/decr under read lock)
Elastic capacity (starts small, grows/shrinks automatically)
Arena allocator for string storage in shared memory
Keyword API via XS::Parse::Keyword for maximum speed
Opt-in LRU eviction and per-key TTL (lock-free reads via clock eviction)
Stale lock recovery for both writers and readers (dead PIDs detected and drained automatically)
Variants
Integer Range and Wrapping
Integer keys and values are stored as fixed-width two's-complement integers: I16/SI16/I16S use a signed 16-bit range (-32768 .. 32767), I32/SI32/I32S a signed 32-bit range, and II/IS/SI a signed 64-bit range. A key or value outside the variant's range is silently truncated to the low bits (two's complement), with no warning: on an I16 map, $map->put(70000, ...) stores under key 4464 (70000 & 0xFFFF), so get(70000) and get(4464) address the same entry. incr/decr wrap the same way (32767 + 1 becomes -32768). Pick a variant wide enough for your data.
Constructor
my $map = Data::HashMap::Shared::II->new($path, $max_entries);
my $map = Data::HashMap::Shared::II->new(undef, $max_entries); # anonymous
my $map = Data::HashMap::Shared::II->new($path, $max_entries, $max_size);
my $map = Data::HashMap::Shared::II->new($path, $max_entries, $max_size, $ttl);
my $map = Data::HashMap::Shared::II->new($path, $max_entries, $max_size, $ttl, $lru_skip);
my $map = Data::HashMap::Shared::II->new_memfd($name, $max_entries, ...); # memfd-backed
my $map = Data::HashMap::Shared::II->new_from_fd($fd); # reopen memfd
my $fd = $map->memfd; # -1 if not memfd
Creates or opens a shared hash map backed by file $path. Passing undef as the path creates an anonymous MAP_SHARED|MAP_ANONYMOUS mapping that is inherited across fork but has no filesystem presence.
new_memfd creates an unlinked memfd-backed map whose file descriptor can be passed to another process (via SCM_RIGHTS, fork+exec, or duped+open). new_from_fd reopens such a descriptor. Both require a 64-bit Perl on Linux (memfd_create(2)).
$max_entries, $max_size, $ttl, and $lru_skip are used only when creating a new file; when opening an existing one, all parameters are read from the stored header and the constructor arguments are ignored. Multiple processes can open the same file simultaneously. Dies if the file exists but was created by a different variant or is corrupt.
Optional $max_size enables LRU eviction: when the map reaches $max_size entries, the least-recently-used entry is evicted on insert. Set to 0 (default) to disable. LRU uses a clock/second-chance algorithm: get sets an accessed bit (lock-free, no write lock), and eviction gives a second chance to recently accessed entries before evicting.
Optional $ttl sets a default time-to-live in seconds for all entries. Expired entries are lazily removed on access. Set to 0 (default) to disable. When TTL is active, get and exists check expiry. Expiry is measured against a monotonic clock (CLOCK_MONOTONIC_COARSE): TTLs track elapsed running time and do not advance while the system is suspended or hibernating.
Optional $lru_skip (0-99, default 0) reduces how often LRU promotion reorders the recency list: higher values skip more promotions. Promotion runs only when an operation updates an existing entry under the write lock (put/incr/get_or_set on a hit, the update family); plain get and get_with_ttl never promote — they set a lock-free accessed bit that the clock eviction consumes. Skipping promotions cuts write-lock churn for Zipfian (power-law) patterns where a few hot keys dominate. The LRU tail (eviction victim) is never skipped, preserving eviction correctness. Set to 0 for strict LRU ordering.
Zero-cost when disabled: with both $max_size=0 and $ttl=0, the fast lock-free read path is used. The only overhead is a branch (predicted away).
String Keys/Values and UTF-8
String-key variants (SS, SI, SI16, SI32) compare keys as raw bytes: two keys are the same entry if and only if they contain the same byte sequence. The SV UTF-8 flag is stored alongside the key so retrieval round-trips it to the returned SV, but it is not part of key identity. Consequences:
ASCII keys with a toggled UTF-8 flag hash and match the same entry (
use utf8,utf8::upgrade, andutf8::downgradeon ASCII are all equivalent from the map's point of view).Non-ASCII keys with different byte encodings are distinct.
"caf\xe9"(latin-1, 4 bytes) and"café"withuse utf8(5 UTF-8 bytes) are two different keys. If your input comes in mixed encodings, normalize withEncode::encode_utf8before use.
String-value variants (SS, IS, I16S, I32S) store the SV UTF-8 flag alongside each value and round-trip it on retrieval. The cas comparison of $expected against the stored value is byte-only — the UTF-8 flag on $expected is ignored (same rationale as string-key equality).
Sharding
my $map = Data::HashMap::Shared::II->new_sharded($path_prefix, $shards, $max_entries, ...);
Creates $shards independent maps (files $path_prefix.0, $path_prefix.1, ...) behind a single handle, each with up to $max_entries entries (total capacity is $shards * $max_entries). Per-key operations automatically route to the correct shard via hash dispatch. Writes to different shards proceed in parallel with independent locks. new_sharded requires a filesystem $path_prefix; anonymous (undef-path) sharded maps are not supported.
All operations work transparently on sharded maps: put, get, remove, exists, add, update, swap, take, incr, max, min, cas, cas_take, get_or_set, put_ttl, add_ttl, update_ttl, touch, persist, set_ttl, keys, values, items, to_hash, set_multi (method only), remove_multi (method only), get_multi (method only), get_with_ttl (method only), each, pop, shift, drain, clear, flush_expired, flush_expired_partial, size, stats (method only), reserve, and all diagnostic keywords.
Diagnostic counters and capacities reported for a sharded handle are aggregate totals across all shards: size, capacity, max_entries, max_size, tombstones, mmap_size, arena_used, arena_cap, and the stats eviction/expiry/recovery counts all sum over the shards. (ttl is the shared per-entry default, so it reports a single shard's value.) reserve $n pre-grows each shard to $n entries (not $n in total).
Cursors chain across shards automatically. cursor_seek routes to the correct shard based on key hash. $shards is rounded up to the next power of 2.
API
Replace xx with variant prefix: i16, i32, ii, i16s, i32s, is, si16, si32, si, ss.
my $ok = shm_xx_put $map, $key, $value; # insert or overwrite
my $ok = shm_xx_add $map, $key, $value; # insert only if key absent
my $ok = shm_xx_update $map, $key, $value; # overwrite only if key exists
my $old = shm_xx_swap $map, $key, $value; # put + return old value (undef if new)
my $ok = shm_xx_cas $map, $key, $expected, $desired; # compare-and-swap
my $v = shm_xx_cas_take $map, $key, $expected; # compare-and-remove; returns value on match, undef otherwise
my $n = $map->set_multi($k, $v, ...); # batch put under single lock, returns count
my $n = $map->remove_multi(@keys); # batch remove under single lock, returns count
my @v = $map->get_multi($k1, $k2, ...); # batch get under single lock with prefetch pipeline
my ($v, $ttl) = $map->get_with_ttl($key); # atomic snapshot; () if missing, $ttl is undef on non-TTL map, 0 = permanent; sets LRU clock bit
my $v = shm_xx_get $map, $key; # returns undef if not found
my $ok = shm_xx_remove $map, $key; # returns false if not found
my $ok = shm_xx_exists $map, $key; # returns boolean
my $s = shm_xx_size $map;
my $m = shm_xx_max_entries $map;
my @k = shm_xx_keys $map;
my @v = shm_xx_values $map;
my @items = shm_xx_items $map; # flat (k, v, k, v, ...)
while (my ($k, $v) = shm_xx_each $map) { ... } # auto-resets at end
shm_xx_iter_reset $map;
shm_xx_clear $map;
my $href = shm_xx_to_hash $map;
my $v = shm_xx_get_or_set $map, $key, $default; # returns value
get_or_set returns the existing value, or stores and returns $default when the key is absent. It returns undef only if the key is absent and the value cannot be stored — the map is at max_entries, or (string-value variants) the arena is full.
cas is available for all variants. Returns true when the stored value matched $expected and was atomically replaced with $desired; false if the key is missing or expired, the value did not match, or (string-value variants) the arena is full. See "String Keys/Values and UTF-8" for the byte-only comparison rule.
swap returns the previous value, or undef when the key did not exist (a fresh insert). undef is also returned when the new value cannot be stored — the map is already at max_entries capacity, or (string-value variants) the arena is full — in which case an existing key keeps its old value. swap by itself therefore cannot distinguish a fresh insert from a full-map failure; check exists or size first if that matters. On TTL maps, swap refreshes an existing entry's TTL to the default and assigns the default TTL on insert, but leaves a permanent entry (TTL 0) permanent.
Integer-value variants also have:
my $n = shm_xx_incr $map, $key; # returns new value
my $n = shm_xx_decr $map, $key; # returns new value
my $n = shm_xx_incr_by $map, $key, $delta;
my $n = shm_xx_max $map, $key, $desired; # store max(current, desired), return it
my $n = shm_xx_min $map, $key, $desired; # store min(current, desired), return it
A missing key is created starting from zero (Redis-style): the first incr returns 1, decr returns -1, and incr_by returns $delta. These die only when the key is new and the map is already at max_entries (no room to insert). The result wraps at the variant's integer width (see "Integer Range and Wrapping").
max/min atomically store max($current, $desired) / min($current, $desired) and return the resulting value; a missing key is inserted as $desired. The compare-and-set runs under a single lock acquisition, so there is no read-modify-write gap for a concurrent incr_by/cas/max/min on the same key: the result is monotonic (max never lowers, min never raises) and never clobbers a concurrent increment. The common "value already past the bound" case completes under a shared read lock, writing nothing. Like incr_by, they die only when the key is new and the map is at max_entries, and the result wraps at the variant's integer width.
LRU/TTL operations (put_ttl, add_ttl, and update_ttl require a TTL-enabled map):
my $ok = shm_xx_put_ttl $map, $key, $value, $ttl_sec; # per-key TTL (0 = permanent); requires TTL-enabled map
my $ok = shm_xx_add_ttl $map, $key, $value, $ttl_sec; # insert-if-absent with per-key TTL (0 = permanent)
my $ok = shm_xx_update_ttl $map, $key, $value, $ttl_sec; # overwrite-only with per-key TTL (0 = permanent)
my $ms = shm_xx_max_size $map; # LRU capacity (0 = disabled)
my $t = shm_xx_ttl $map; # default TTL in seconds
my $r = shm_xx_ttl_remaining $map, $key; # seconds left (0 = permanent, undef if missing/expired/no TTL)
my $ok = shm_xx_touch $map, $key; # refresh TTL to default (permanent entries stay permanent); promotes in LRU; false if no TTL/LRU
my $ok = shm_xx_persist $map, $key; # remove TTL, make key permanent; false on non-TTL maps
my $ok = shm_xx_set_ttl $map, $key, $sec; # change TTL without changing value (0 = permanent); false on non-TTL maps
my $n = shm_xx_flush_expired $map; # proactively expire all stale entries, returns count
my ($n, $done) = shm_xx_flush_expired_partial $map, $limit; # gradual: scan $limit slots
Atomic remove-and-return:
my $v = shm_xx_take $map, $key; # remove key and return value (undef if missing)
my ($k, $v) = shm_xx_pop $map; # remove+return from LRU tail / scan forward
my ($k, $v) = shm_xx_shift $map; # remove+return from LRU head / scan backward
my @kv = shm_xx_drain $map, $n; # remove+return up to N entries as flat (k,v,...) list
pop and shift remove from opposite ends: pop takes the LRU tail (oldest / least recently used) while shift takes the LRU head (newest / most recently used). On non-LRU maps, pop scans forward and shift scans backward. drain removes in pop order (tail-first). Useful for work-queue patterns and batch processing.
Cursors (independent iterators, allow nesting and removal during iteration):
my $cur = shm_xx_cursor $map; # create cursor
while (my ($k, $v) = shm_xx_cursor_next $cur) { ... }
shm_xx_cursor_reset $cur; # restart from beginning
shm_xx_cursor_seek $cur, $key; # position at specific key (best-effort across resize)
# cursor auto-destroyed when out of scope
shm_xx_each is also safe to use with remove during iteration. Resize/compaction is deferred until iteration ends.
Diagnostics:
my $cap = shm_xx_capacity $map; # current table capacity (slots)
my $tb = shm_xx_tombstones $map; # tombstone count
my $au = shm_xx_arena_used $map; # arena bytes used (0 for int-only)
my $ac = shm_xx_arena_cap $map; # arena total capacity (0 for int-only)
my $sz = shm_xx_mmap_size $map; # backing file size in bytes
my $ok = shm_xx_reserve $map, $n; # pre-grow (false if exceeds max)
my $ev = shm_xx_stat_evictions $map; # cumulative LRU eviction count
my $ex = shm_xx_stat_expired $map; # cumulative TTL expiration count
my $rc = shm_xx_stat_recoveries $map; # cumulative stale lock recovery count
my $p = $map->path; # backing file path (method only)
my $s = $map->stats; # hashref with all diagnostics in one call
# stats keys: size, capacity, max_entries, tombstones, mmap_size,
# arena_used, arena_cap, evictions, expired, recoveries, max_size, ttl
set_multi, get_multi, remove_multi, get_with_ttl, stats, path, sync, and unlink are method-only (no keyword form).
File management:
$map->sync; # flush the mmap to the backing file (msync MS_SYNC)
$map->unlink; # remove backing file (mmap stays valid)
Data::HashMap::Shared::II->unlink($path); # class method form
sync issues a synchronous msync(2) over the whole mapping (every shard, for sharded maps) and dies on error. Use it to force durability of a file-backed map; it is a no-op for anonymous mappings, which have no backing file. Changes are visible to other processes sharing the mapping without sync — it only affects on-disk persistence.
Crash Safety
If a process dies (e.g., SIGKILL, OOM kill) while holding the write lock, other processes detect the stale lock within 2 seconds and automatically recover. The writer's PID is encoded in the rwlock word itself (single atomic CAS, no crash window). On FUTEX_WAIT timeout, waiters kill($pid, 0) the holder and CAS-release the lock if it's dead.
Reader-side recovery uses a 1024-slot table in the shared mmap (one slot per process, claimed lazily on first lock; fork()'d children claim a fresh slot via pthread_atfork). On a writer-lock timeout the recovery scan CAS-claims each dead PID's slot, drains the waiter counts, and force-resets the reader counter once no live reader holds it — so a worker killed mid-incr_by no longer pins the rwlock indefinitely. If a live reader is concurrently present, the dead slot is left intact for the next recovery cycle (preserves the only record of the stuck counter). Beyond 1024 simultaneous handles per map, new handles skip slot tracking and fall back to the slow per-timeout drain.
The same path validates and rebuilds the LRU doubly-linked list if a dead writer left it inconsistent. stat_recoveries in stats counts every recovery event.
Recovery uses kill($pid, 0) for liveness, which cannot distinguish a reused PID from the original. Hitting a false "alive" requires a process to die in the brief window it holds a read lock and the kernel to cycle through the entire PID space back to that exact number within the ~2-second recovery window and hand it to a long-lived process — i.e. a runaway fork storm. Even then the effect is bounded: writers stall until the recycled process exits; reads are unaffected and no data is corrupted. Writer-crash recovery is immune (the writer PID lives in the lock word and is reclaimed independently of the slot table).
Limitation: PID-based recovery assumes all processes share the same PID namespace. Cross-container sharing (different PID namespaces) is not supported.
After recovery from a mid-mutation crash, the map data may be partially inconsistent (e.g., one entry was being updated when the writer died). Map structure (locks, LRU, free lists, counters) is restored, but the specific entry being mutated may have stale or partial bytes. Calling clear after detecting a stale lock recovery is recommended for safety-critical applications.
BENCHMARKS
Throughput versus other shared-memory / on-disk solutions, 25K entries, single process, Linux x86_64. All values in M ops/s (higher is better). Run perl -Mblib bench/vs.pl 25000 to reproduce.
Integer key → integer value (Shared::II):
BerkeleyDB LMDB Shared::II
INSERT 31 46 184
LOOKUP 35 40 383
INCREMENT 16 18 165
String key → string value, short (inline ≤ 7B, Shared::SS):
FastMmap BerkeleyDB LMDB SharedMem Shared::SS
INSERT 11 26 40 62 130
LOOKUP 10 32 34 146 213
DELETE 14 18 -- 32 68
String key → string value, long (~50-100B, Shared::SS):
BerkeleyDB LMDB SharedMem Shared::SS
INSERT 25 37 61 133
LOOKUP 30 33 125 229
LRU cache lookup (25K entries, lock-free clock eviction):
plain LRU
II 350 373 (lock-free, ~6% faster via clock)
SS 159 159
Cross-process (25K SS entries, 2 processes, ops/s):
Shared::SS SharedMem LMDB
READS 3,250,000 1,986,000 728,000
WRITES 2,801,000 826,000 95,000
MIXED 50/50 3,691,000 1,963,000 211,000
LMDB benchmarked with MDB_WRITEMAP|MDB_NOSYNC|MDB_NOMETASYNC|MDB_NORDAHEAD. BerkeleyDB with DB_PRIVATE|128MB cache.
Key takeaways:
10x faster lookups than LMDB for integer keys (lock-free seqlock path)
1.5x faster than Hash::SharedMem for short string lookups (inline strings, no arena overhead)
1.8x faster than Hash::SharedMem for long string lookups
4.5x faster cross-process reads than LMDB; 3.4x faster writes than SharedMem
LRU reads are lock-free (clock eviction) — no overhead vs plain maps
Atomic
incris 9x faster than get+put on competitorsStrings ≤ 7 bytes stored inline in node (zero arena overhead)
SEE ALSO
Data::Buffer::Shared - typed shared array
Data::Queue::Shared - FIFO queue
Data::PubSub::Shared - publish-subscribe ring
Data::ReqRep::Shared - request-reply
Data::Sync::Shared - synchronization primitives
Data::Pool::Shared - fixed-size object pool
Data::Stack::Shared - LIFO stack
Data::Deque::Shared - double-ended queue
Data::Log::Shared - append-only log (WAL)
Data::Heap::Shared - priority queue
Data::Graph::Shared - directed weighted graph
Data::BitSet::Shared - shared bitset (lock-free per-bit ops)
Data::RingBuffer::Shared - fixed-size overwriting ring buffer
AUTHOR
vividsnow
LICENSE
This is free software; you can redistribute it and/or modify it under the same terms as Perl itself.