NAME

Argon

SYNOPSIS

# Start a manager on port 8000
cluster -p 8000

# Start a stand-alone node with 4 workers on port 8000
node -w 4 -p 8000

# Start a node and attach to a manager
node -w 4 -p 8001 -m somehost:8000

DESCRIPTION

Argon is a multi-platform distributed task processing system, designed with the goal of making the creation of a robust system simple.

USAGE

Argon systems are build from two pieces: managers and nodes. A manager (or cluster) is a process that manages one or more nodes. A node is a process that manages a pool of worker processes. A node can be stand-alone (unmanaged) or have a single manager. A manager does not need to know about nodes underneath it; nodes that are started with the -m parameter register their presence with the manager. If the node goes down or becomes unavailable, the manager will automatically account for this and route tasks to other nodes.

Stand-alone nodes

A stand-alone node does not register with a manager. It can accept tasks directly from clients. Tasks will be assigned or queued to worker processes. The number of worker processes is controlled with the -w parameter.

To start a basic node with 4 worker processes, listening on port 8000, use:

node -w 4 -p 8000

Note that by default, 4 workers are started, so -w isn't truly necessary here.

A node must know where to find any code that is used in the tasks it is given. This is accomplished with the -i parameter:

node -p 8000 -i /path/to/libs -i /path/to/otherlibs

As with any long-running process, workers started by the node may end up consuming a significant amount of memory. To address this, the node accepts the -r parameter, which controls the max number of tasks a worker may handle before it is restarted to release any memory it is holding. By default, workers may handle an indefinite number of tasks.

node -p 8000 -i /path/to/libs -r 250

Managed nodes

A managed node is one that registers itself with a manager/cluster process. The manager is added with the -m parameter:

node -p 8000 -i /path/to/libs -r 250 -m manager:8000

Once started, the node will connect to the server manager on port 8000 and attempt to register. Once registered, the node is immediately available to begin handling requests from the manager.

Although the node will technically still accept requests directly from clients in managed mode, this is bad practice and will cause inaccuracy in the manager's routing algorithm.

Managers

Managers (also called clusters) are servers that route tasks to the most available node. This is determined by analyzing the average processing time for a given node and comparing it with the number of tasks it is currently assigned.

Managers are started very simply:

cluster -p 8000

Managers do not execute arbitrary code and therefore do not need to know where any libraries are stored.

Queues

Nodes and managers both maintain a bounded queue. As requests come in, they are added to the queue. If the queue is full, the task is rejected.

The reason for this is that when the system is under high load this avoids the creation of a large backlog of tasks. A large backlog acts like a traffic jam, affecting system responsiveness for a much longer period as the backlog is cleared before it returns to normal operation.

Instead, rejected tasks are automatically retried by the client using an algorithm designed to prevent overloading the system with retry requests. By default, the client will retry an unlimited number of times (although this is configurable).

The size of the queue is controlled with -l (lower-case L, for limit) parameter. This parameter applies to both nodes and managers. By default, it is set to 64, although this value may not be optimal for your hardware and worker count. A good rule of thumb is to allow 8-16 slots in the queue per worker. For a node, this means the number of workers directly managed by the ndoe. For a cluster, this means the total number of workers expected to be available to it through its registered nodes.

Clients

The Argon::Client class provides a simple way to converse with an Argon system:

use Argon::Client;

my $client = Argon::Client->new(port => 8000, host => 'some.host.name');
$client->connect;
my $result = $client->process(
    class  => 'Some::Class', # with Argon::Role::Task
    params => [ foo => 'bar', baz => 'bat' ],
);

The only requirement is that all nodes in the system know where Some::Class is located. See the -i parameter above to node.

Multiplexing clients

"process" in Argon::Client does not return until the task has been completed. However, Argon is implemented using Coro, allowing the process method to yield to other threads while it waits for its result. This makes it extremely simple to process multiple tasks through multiple clients at the same time.

use Coro;
use Argon::Client;

# Assume a list of tasks, where each element is an array ref of C<[$class,
# $params]>.
my @tasks;

# Create a simple pool of client objects
my $clients = Coro::Channel->new();
for (1 .. 4) {
    my $client = Argon::Client->new(port => 8000, host => 'some.host.name');
    $clients->put($client);
}

# Loop over the task list
my @pending;
while (my ($class, $params) = pop @tasks) {
    # Get the next available client. This blocks until a client is
    # available from the Coro::Channel ($clients).
    my $client = $clients->get();

    # Send the client the task in a Coro thread, storing the return value
    # in @pending.
    push @pending, async {
        # Send the task
        my $result = $client->process(
            class  => $class,
            params => $params,
        );

        # Release the client back into the pool
        $clients->put($client);

        # Do something with result
        ...
    };
}

# Wait on each thread to complete
$_->join foreach @pending;

See bin/bench for a more robust implementation.

Task design

Tasks must use the Argon::Role::Task class. Tasks will be created by instantiating the class with the parameters provided to the "process" in Argon::Client method. Task classes must also have a run method which performs the task's work and returns the result.

CAVEATS

As with all such systems, performance is greatly affected by the size of the messages sent. Therefore, it is recommended to keep as much data used by a task as possible in a database or other network-accessible storage location. For example, design your task such that it accepts an id that can be used to access the task data from a database, and have it return an id which can be used to access the result.

AUTHOR

Jeff Ober mailto:jeffober@gmail.com

LICENSE

BSD license