NAME

Directory::Queue::Simple - object-oriented interface to a simple directory-based queue

SYNOPSIS

use Directory::Queue::Simple;

#
# sample producer
#

my $dirq = Directory::Queue::Simple->new(path => "/tmp/test");
foreach my $count (1 .. 100) {
    my $name = $dirq->add("element $count\n");
    printf("# added element %d as %s\n", $count, $name);
}

#
# sample consumer (one pass only)
#

$dirq = Directory::Queue::Simple->new(path => "/tmp/test");
for (my $name = $dirq->first(); $name; $name = $dirq->next()) {
    # skip elements that another consumer has already locked
    next unless $dirq->lock($name);
    printf("# reading element %s\n", $name);
    my $data = $dirq->get($name);
    # one could use $dirq->unlock($name) to only browse the queue...
    $dirq->remove($name);
}

DESCRIPTION

This module is very similar to Directory::Queue but stores data differently in the filesystem, using fewer directories. Its API is almost identical.

Compared to Directory::Queue, this module:

  • is simpler

  • is faster

  • uses less space on disk

  • can be given existing files to store

  • does not support schemas

  • can only store and retrieve binary strings

  • is not compatible (at filesystem level) with Directory::Queue

Please refer to Directory::Queue for general information about directory queues.

CONSTRUCTOR

The new() method can be used to create a Directory::Queue::Simple object that will later be used to interact with the queue. The following attributes are supported:

path

the queue toplevel directory (mandatory)

umask

the umask to use when creating files and directories (default: use the running process' umask)

granularity

the time granularity for intermediate directories, in seconds; see "DIRECTORY STRUCTURE" (default: 60)
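
As an illustration, the three attributes can be given together. This is only a sketch: the scratch directory, the umask value and the one hour granularity are arbitrary choices, not recommendations.

```perl
use strict;
use warnings;
use Directory::Queue::Simple;
use File::Temp qw(tempdir);

# hypothetical location: a scratch directory that is removed at exit
my $path = tempdir(CLEANUP => 1) . "/queue";

my $dirq = Directory::Queue::Simple->new(
    path        => $path,   # mandatory
    umask       => 027,     # arbitrary example: no access for "others"
    granularity => 3600,    # arbitrary example: one hour instead of 60
);
printf("# queue %s holds %d element(s)\n", $dirq->path(), $dirq->count());
```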

METHODS

The following methods are available:

new()

return a new Directory::Queue::Simple object (class method)

copy()

return a copy of the object; this can be useful to have independent iterators on the same queue

path()

return the queue toplevel path

id()

return a unique identifier for the queue

count()

return the number of elements in the queue

first()

return the first element in the queue, resetting the iterator; return an empty string if the queue is empty

next()

return the next element in the queue, incrementing the iterator; return an empty string if there is no next element

add(DATA)

add the given data (a binary string) to the queue and return the corresponding element name

add_ref(REF)

add the given data (a reference to a binary string) to the queue and return the corresponding element name; this can avoid string copies with large strings

add_path(PATH)

add the given file (identified by its path) to the queue and return the corresponding element name; the file must be on the same filesystem as the queue and will be moved into it

lock(ELEMENT[, PERMISSIVE])

attempt to lock the given element and return true on success; if the PERMISSIVE option is true (which is the default), it is not a fatal error if the element cannot be locked and false is returned

unlock(ELEMENT[, PERMISSIVE])

attempt to unlock the given element and return true on success; if the PERMISSIVE option is true (which is not the default), it is not a fatal error if the element cannot be unlocked and false is returned

touch(ELEMENT)

update the access and modification times on the element's file to indicate that it is still being used; this is useful for elements that are locked for long periods of time (see the purge() method)

remove(ELEMENT)

remove the given element (which must be locked) from the queue

get(ELEMENT)

get the data from the given element (which must be locked) and return a binary string

get_ref(ELEMENT)

get the data from the given element (which must be locked) and return a reference to a binary string; this can avoid string copies with large strings

get_path(ELEMENT)

get the file path of the given element (which must be locked); this file can be read but must not be removed directly, use the remove() method instead

purge([OPTIONS])

purge the queue by removing unused intermediate directories, removing temporary elements that are too old and unlocking elements that have been locked for too long (stale locks); note: this can take a long time on queues with many elements; OPTIONS can be:

maxtemp

maximum time for a temporary element (in seconds, default 300); if set to 0, temporary elements will not be removed

maxlock

maximum time for a locked element (in seconds, default 600); if set to 0, locked elements will not be unlocked
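
The methods above can be combined as follows. This is a minimal sketch that uses only the calls documented in this section; the scratch directory, the report.txt file name and the purge thresholds are arbitrary and chosen for illustration.

```perl
use strict;
use warnings;
use Directory::Queue::Simple;
use File::Temp qw(tempdir);

# hypothetical scratch area holding both the queue and the file to add
my $tmp  = tempdir(CLEANUP => 1);
my $dirq = Directory::Queue::Simple->new(path => "$tmp/queue");

# add a large string by reference to avoid copying it
my $big = "x" x 1_000_000;
$dirq->add_ref(\$big);

# add an existing file: it is on the same filesystem and gets moved
my $file = "$tmp/report.txt";
open(my $fh, ">", $file) or die("cannot open $file: $!\n");
print($fh "report body\n");
close($fh) or die("cannot close $file: $!\n");
$dirq->add_path($file);

# browse only: lock, look at the file in place, then unlock
for (my $name = $dirq->first(); $name; $name = $dirq->next()) {
    next unless $dirq->lock($name);
    my $path = $dirq->get_path($name);
    my $size = -s $path;
    printf("# %s is stored in %s (%d bytes)\n", $name, $path, $size);
    $dirq->unlock($name);
}

# housekeeping with stricter limits than the defaults
$dirq->purge(maxtemp => 60, maxlock => 300);
printf("# %d element(s) left\n", $dirq->count());
```

Because every element is unlocked rather than removed, both elements are still in the queue at the end.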

DIRECTORY STRUCTURE

The toplevel directory contains intermediate directories that contain the stored elements, each of them in a file.

The names of the intermediate directories are time based: the element insertion time is used to create an 8-digit hexadecimal number. The granularity (see the new() method) is used to limit the number of new directories. For instance, with a granularity of 60 (the default), new directories will be created at most once per minute.
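
One way to picture this naming is sketched below. It assumes that the insertion time is rounded down to a multiple of the granularity before being formatted, which is consistent with the description above but is an assumption; directory_name() is a hypothetical helper, not part of the module's API.

```perl
use strict;
use warnings;

# hypothetical helper: map an insertion time (seconds since the Epoch)
# to an intermediate directory name, assuming the time is rounded down
# to a multiple of the granularity
sub directory_name {
    my($time, $granularity) = @_;
    # defensive guard in this sketch: avoid a modulo by zero
    $time -= $time % $granularity if $granularity;
    return(sprintf("%08x", $time));
}

# insertions 19 seconds apart fall in the same one minute directory...
print(directory_name(1300000000, 60), "\n");
print(directory_name(1300000019, 60), "\n");
# ...while the next second starts a new directory
print(directory_name(1300000020, 60), "\n");
```

Under this assumption, all the elements added within the same granularity window share one intermediate directory.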

Since there is usually a filesystem limit on the number of directories a directory can hold, there is a trade-off to be made. If you want to support many added elements per second, you should use a low granularity to keep directories small. However, in this case, you will create many directories and this will limit the total number of elements you can store.

The elements themselves are stored in files (one per element) with a 14-digit hexadecimal name SSSSSSSSMMMMMR where:

SSSSSSSS

represents the number of seconds since the Epoch

MMMMM

represents the microsecond part of the time since the Epoch

R

is a random digit used to reduce name collisions
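
The layout described above can be sketched as follows. element_name() is a hypothetical helper written for this illustration, not the module's own code; it only shows how the three fields fit together and how a name can be split back into them.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# hypothetical helper: build a 14-digit name from its three fields
sub element_name {
    my($seconds, $microseconds, $random) = @_;
    return(sprintf("%08x%05x%01x", $seconds, $microseconds, $random));
}

# build a name from the current time and a random hexadecimal digit
my($seconds, $microseconds) = gettimeofday();
my $name = element_name($seconds, $microseconds, int(rand(16)));
print("$name\n");

# split a name back into its numeric fields
my($s, $m, $r) = map { hex($_) }
    $name =~ /^([0-9a-f]{8})([0-9a-f]{5})([0-9a-f])$/;
printf("# seconds=%d microseconds=%d random=%d\n", $s, $m, $r);
```

Five hexadecimal digits are enough for the microsecond part since 999999 is f423f in hexadecimal.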

A temporary element (being added to the queue) will have a .tmp suffix.

A locked element will have a hard link with the same name and the .lck suffix.
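
The role of the two suffixes can be illustrated without the module. The following sketch is not the module's code: it assumes that an element is written under its .tmp name and then renamed, and it relies on the fact that link() fails when the target name already exists, so that only one process can create the .lck link. The element name is made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $name = "4d7c6d001e240a";    # hypothetical element name

# add: write under the .tmp name, then rename (assumed in this sketch)
# so that readers never see a partially written element
open(my $fh, ">", "$dir/$name.tmp") or die("cannot open: $!\n");
print($fh "payload\n");
close($fh) or die("cannot close: $!\n");
rename("$dir/$name.tmp", "$dir/$name") or die("cannot rename: $!\n");

# lock: the first link() succeeds, the second fails since .lck exists
my $first  = link("$dir/$name", "$dir/$name.lck");
my $second = link("$dir/$name", "$dir/$name.lck");
my $error  = $second ? "" : "$!";
printf("# first lock attempt: %s\n", $first ? "ok" : "failed");
printf("# second lock attempt: %s\n", $second ? "ok" : "failed ($error)");

# unlock: remove the hard link, the element itself stays in place
unlink("$dir/$name.lck") or die("cannot unlink: $!\n");
```

This is why lock() can simply return false when another process already holds the element.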

AUTHOR

Lionel Cons http://cern.ch/lionel.cons

Copyright CERN 2011