NAME
Store::Directories - Manage a key/value store of directories with controls for concurrent access and locking.
SYNOPSIS
    use Store::Directories;
    # Create a new store at the given directory
    # (or adopt one that is already there)
    my $store = Store::Directories->init("path/to/store/");
    # (In this example, we create a new directory containing a text file
    # and then atomically increment the value written in the file)
    my $value = 1;
    # Get a directory with the key 'foo' in the store,
    # creating it if it doesn't exist yet
    my $dir = $store->get_or_add('foo', {
        # as an option, we can provide a subroutine to use to
        # initialize the directory contents if we create it
        # (but if the directory already exists, this won't be called)
        init => sub {
            my $dir = shift;
            open(my $fh, '>', "$dir/hello.txt") or die "could not open file: $!";
            print $fh $value;
            close $fh;
        }
    });
    {
        # Get an exclusive lock on the directory before reading/writing to it.
        # This ensures no other process is reading or modifying the directory
        # contents while we're working.
        my $lock = $store->lock_ex('foo');
        open(my $fh, '<', "$dir/hello.txt") or die "could not open file: $!";
        $value = <$fh>;
        close $fh;
        open($fh, '>', "$dir/hello.txt") or die "could not re-open file: $!";
        print $fh $value + 1;
        close $fh;
        # The lock is released once $lock is out-of-scope
    }
DESCRIPTION
Store::Directories manages a key/value store of directories and allows processes to assert shared (read-only) or exclusive (writable) locks on those directories.
Directories in a Store::Directories store are referenced by unique string "keys". Internally, the directories are named with hexadecimal UUIDs, so the keys you use to identify them can contain characters that would be illegal or unusual in filenames (web URLs are a common example).
Processes can perform operations on these directories in parallel by requesting "locks" on particular directories. These locks can be either shared or exclusive (to borrow flock(2) terminology). Lock objects are obtained with the lock_sh or lock_ex methods of a Store::Directories instance and are automatically released once they go out of scope.
Shared locks are used when a process wants to read, but not modify, the contents of a directory while being sure that no other process can modify the contents while it's reading. There can be multiple shared locks from different processes on a directory at once, but never at the same time as an exclusive lock.
Exclusive locks are used when a process wants to read and modify the contents of a directory while being sure that no other process can modify or read the contents while it's working. There can only be one exclusive lock on a directory at once, and there can't be any shared locks alongside it.
If a process requests a lock that is unavailable at the moment (due to another process already having an incompatible lock), then the process will block until the lock can be obtained (either by the other process dying or releasing its locks). Be aware that the order in which locks are granted is not necessarily the same order in which they were requested.
WARNING: The guarantees around locking make the assumption that every process is using this package and playing by its rules. Unrelated processes are free to ignore the rules and mess things up as much as they like.
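The shared/exclusive rules above follow the flock(2) model that the terminology is borrowed from. The sketch below illustrates that model using only core Perl on a scratch file. It is not how Store::Directories is implemented, and it assumes a platform where Perl's flock is backed by flock(2) (typical on Linux and the BSDs), so that independent handles in one process conflict the way separate processes would. The scratch file and variable names are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Scratch file standing in for a locked resource (illustration only).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Open the same file through independent handles, as separate processes would.
open(my $reader_a, '<',  $path) or die "could not open file: $!";
open(my $reader_b, '<',  $path) or die "could not open file: $!";
open(my $writer,   '+<', $path) or die "could not open file: $!";

# Two shared locks can be held at the same time.
my $got_sh_a = flock($reader_a, LOCK_SH | LOCK_NB) ? 1 : 0;
my $got_sh_b = flock($reader_b, LOCK_SH | LOCK_NB) ? 1 : 0;

# A non-blocking exclusive request is refused while shared locks are held.
my $got_ex_early = flock($writer, LOCK_EX | LOCK_NB) ? 1 : 0;

# After both shared locks are released, the exclusive request can succeed.
flock($reader_a, LOCK_UN);
flock($reader_b, LOCK_UN);
my $got_ex_late = flock($writer, LOCK_EX | LOCK_NB) ? 1 : 0;

print "shared a: $got_sh_a, shared b: $got_sh_b, ",
      "exclusive (early): $got_ex_early, exclusive (late): $got_ex_late\n";
```

Under the stated assumption, both shared requests should be granted, the early exclusive request refused, and the late one granted.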
PUBLIC METHODS
init DIRECTORY
Create and return a new Store::Directories instance in the given directory. Bookkeeping files and directory entries will be stored inside this directory. If a Store::Directories instance already exists in that directory, then this will simply adopt the one that's there.
path
Get the absolute path to this Store's directory.
get_or_add KEY, {OPTIONS}
Get the path to the directory referred to by KEY, creating it if it doesn't yet exist. Returns the absolute path to the directory.
OPTIONS is a hashref that can contain the following options:
init (subroutine ref)
A subroutine used to initialize the directory in the event that it gets created (if the directory already exists when get_or_add is called, this won't be called). It is called with the absolute path to the directory as the first argument and the key name as the second argument. An exclusive lock is active on the directory for the duration of the function. If the function dies, then the entire call to get_or_add will croak and the directory will not be created. If this isn't specified, an empty directory is created. (default: undef)
lock_sh (scalar ref)
Create a shared lock on the directory, storing it in the value referenced by this option. This works like calling the lock_sh method, but eliminates the possible race condition where another process can lock (or even remove) the directory between creating it and calling lock_sh. However, if the directory already exists, this may block until the lock can be obtained. (default: undef)
lock_ex (scalar ref)
Just like the lock_sh option, but for an exclusive lock. If both options are specified, only the exclusive lock is created and the shared lock is ignored. (default: undef)
Example:
    my $lock;
    my $dir = $store->get_or_add('foobar', {
        init => sub {
            my $dir = shift;
            # Initialize directory
        },
        lock_sh => \$lock
    });
NOTE: Keys matching the pattern /^__.*__$/ (that is, surrounded by double-underscores) are reserved by Store::Directories and cannot be used. Currently, the only key like this is __LISTING__, which is used internally to lock the list of directories (so that they can't be removed or added).
lock_sh KEY [NOBLOCK]
Create and return a new shared lock for the given key. This asserts that no other process can modify the corresponding entry until this lock goes out-of-scope.
This blocks until the lock can be obtained, so it will wait for any processes that already have an exclusive lock on this key to release their locks before returning. But if NOBLOCK is true, then this will not block, and may return undef if the lock couldn't be obtained.
This will croak if this process already has a lock (either kind) on this key, or if the key does not exist in the store.
lock_ex KEY [NOBLOCK]
Create and return a new exclusive lock for the given key. This asserts that no other process can read the corresponding entry until this lock goes out-of-scope.
This blocks until the lock can be obtained, so it will wait for any processes that have locks on this key to release them before returning. But if NOBLOCK is true, then this will not block, and may return undef if the lock couldn't be obtained.
This will croak if this process already has a lock (either kind) on this key, or if the key does not exist in the store.
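The NOBLOCK argument described above lends itself to a "try, and skip if busy" pattern. The following is a minimal sketch based only on the behaviour documented here. The scratch store location, the key 'foo', and the file name log.txt are placeholders chosen for illustration, and the snippet has not been checked against a particular release of the module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Store::Directories;

# Hypothetical scratch location so the sketch leaves nothing behind.
my $root  = tempdir(CLEANUP => 1);
my $store = Store::Directories->init($root);
my $dir   = $store->get_or_add('foo');

my $updated = 0;
{
    # A true second argument asks lock_ex not to wait for other processes.
    my $lock = $store->lock_ex('foo', 1);

    if (defined $lock) {
        # We hold the exclusive lock: safe to modify the directory.
        open(my $fh, '>>', "$dir/log.txt") or die "could not open file: $!";
        print {$fh} "updated by PID $$\n";
        close $fh;
        $updated = 1;
    }
    else {
        # Another process holds an incompatible lock; do not touch the directory.
        print "Directory 'foo' is busy; skipping this round.\n";
    }
    # If a lock was obtained, it is released when $lock goes out of scope here.
}
```

With a single process and a fresh store there is nothing to contend with, so the branch that writes log.txt is the one expected to run.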
remove KEY [SUB]
Remove the directory with the given key from the store. You MUST have an exclusive lock already on the directory before calling this.
SUB is a subroutine ref which, if specified, will be called immediately before deleting the directory. SUB is called with the path to the directory as the first argument and the key for the directory as the second argument.
If an error occurs while removing the directory from disk (from SUB failing, or otherwise), then the directory will still be removed from the store's index and a warning will be issued, since the directory left on disk may be in a degraded state.
get_locks KEY
Returns a hashref listing all of the current locks for the directory with the given KEY. Each key in the hash is the PID of a process and each corresponding value is true/false indicating whether or not the lock is exclusive.
get_listing
Returns a hashref listing all of the directories in the store. Each key in the hash is the key for that directory while the corresponding value is the absolute path to the directory.
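As an illustration of the two introspection methods, the sketch below walks the listing and reports who holds locks on each entry. It relies only on the return shapes described above (key to path for get_listing, PID to exclusive flag for get_locks). The scratch store and the keys 'alpha' and 'beta' are made up for the example, and the snippet has not been checked against a particular release of the module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Store::Directories;

# Hypothetical scratch store with two entries.
my $store = Store::Directories->init(tempdir(CLEANUP => 1));
$store->get_or_add($_) for qw(alpha beta);

# Hold a shared lock on one entry so there is something to report.
my $lock = $store->lock_sh('alpha');

my $listing = $store->get_listing;
for my $key (sort keys %$listing) {
    my $locks = $store->get_locks($key);

    # Describe each lock holder as "PID (shared|exclusive)".
    my @holders = map {
        "$_ (" . ($locks->{$_} ? 'exclusive' : 'shared') . ")"
    } sort { $a <=> $b } keys %$locks;

    printf "%s => %s [%s]\n",
        $key, $listing->{$key},
        @holders ? join(', ', @holders) : 'unlocked';
}
```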
get_in_dir KEY, SUB [INIT]
Get a shared lock for the directory with key KEY, then execute the subroutine reference SUB (calling it with the absolute path to the directory as the first argument and the key as the second argument). Returns whatever SUB returns. Essentially, this is just a convenient shortcut for something like this:
    my $dir = $store->get_or_add('foo');
    my $lock = $store->lock_sh('foo');
    my $val = do_whatever($dir, 'foo');
    # shortcut
    my $val = $store->get_in_dir('foo', \&do_whatever);
Naturally, your SUB subroutine shouldn't modify the contents of the directory or else you'll be violating the trust that Store::Directories (and other processes!) place in you.
The optional INIT argument is a subroutine used to initialize the directory in the event it doesn't yet exist when this is called. (Same semantics as the init option to get_or_add.)
run_in_dir KEY, SUB [INIT]
Get an exclusive lock for the directory with key KEY, then execute the subroutine reference SUB (calling it with the absolute path to the directory as the first argument and the key as the second argument). Returns whatever SUB returns. Essentially, this is just a convenient shortcut for something like this:
    my $dir = $store->get_or_add('foo');
    my $lock = $store->lock_ex('foo');
    my $val = do_whatever($dir, 'foo');
    # shortcut
    my $val = $store->run_in_dir('foo', \&do_whatever);
Unlike get_in_dir, your SUB subroutine is allowed to modify (or even delete!) the directory and its contents.
The optional INIT argument is a subroutine used to initialize the directory in the event it doesn't yet exist when this is called. (Same semantics as the init option to get_or_add.)
get_or_set KEY, GET, SET [INIT]
A combination of get_in_dir and run_in_dir. GET and SET are subroutine references. For the directory with key KEY, runs the GET subroutine under a shared lock and returns whatever it returns. But if GET returns undef, then it will call SET under an exclusive lock before trying GET again. (If it returns undef this time, then this method will just return undef.)
Both subroutines are called with the absolute path to the directory as the first argument, and the key as the second argument. If either of them dies, then this entire function will croak.
This is useful when you have multiple processes that may want to perform some operation in the same directory, but you want to make sure that operation is only performed once. GET can be made to return undef if it detects the operation has not been done yet, while SET performs the operation.
Be aware that GET may actually get called up to three times: first under the shared lock; then, if it returns undef, again immediately after upgrading to an exclusive lock (in case another process got to the exclusive lock first and already called SET for us); and, if that's still undef, a third and final time after SET has run.
The optional INIT argument is a subroutine used to initialize the directory in the event it doesn't yet exist when this is called. (Same semantics as the init option to get_or_add.)
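The "perform an operation only once" use case might look like the sketch below, in which GET reads a cached result file and SET produces it. The key 'report', the file name result.txt, and the scratch store location are hypothetical, and the snippet assumes (as the INIT description suggests) that the directory is created on demand when it doesn't exist yet. It has not been checked against a particular release of the module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Store::Directories;

# Hypothetical scratch store for the example.
my $store = Store::Directories->init(tempdir(CLEANUP => 1));

# GET: return the cached result, or undef if it hasn't been produced yet.
my $get = sub {
    my ($dir, $key) = @_;
    my $file = "$dir/result.txt";
    return undef unless -e $file;
    open(my $fh, '<', $file) or die "could not open file: $!";
    my $result = <$fh>;
    close $fh;
    return $result;
};

# SET: perform the one-time operation (here, just writing a marker string).
my $set = sub {
    my ($dir, $key) = @_;
    open(my $fh, '>', "$dir/result.txt") or die "could not open file: $!";
    print {$fh} "built for $key by PID $$";
    close $fh;
};

# On a fresh store GET finds nothing, so SET runs and GET is tried again.
my $result = $store->get_or_set('report', $get, $set);
print "$result\n";
```

Because GET can run more than once per call, it is kept free of side effects here; all writing happens in SET, which only runs under the exclusive lock.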
AUTHOR
Cameron Tauxe camerontauxe@gmail.com
LICENSE AND COPYRIGHT
This software is copyright (c) 2020 by Cameron Tauxe.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.