Design of Disbatch 4

This documents the Disbatch Execution Node (DEN) protocol and schema. All DENs using the same MongoDB database must follow this, as well as the Disbatch Task Runners (DTR) used by the DENs and any Disbatch Command Interfaces (DCI) using the database.

Overview

The core components of Disbatch 4 are one or more DENs, the DTRs, one or more DCIs, and MongoDB which is used as the data store and for all message passing.

The DEN ensures the database is set up correctly and runs the appropriate number of tasks for each queue.

The DTR is called by the DEN when it claims a task. The DTR is responsible for loading the plugin and running the task, as well as updating the task document when the task completes.

The DCI provides a JSON REST API for the DENs, as well as a web browser interface to the API. An additional CLI tool interacts with this API.

Each DEN monitors one or more queues, which may be restricted to a subset of DENs, restricted from a subset of DENs, or available to all DENs.

Each queue uses a specific Disbatch plugin for its tasks. If the plugin listed for a queue is not available, the queue is ignored. The queue sets the limit of tasks to run across all DENs for that queue with the threads field.

A DEN may also limit the number of tasks to run on that DEN with the maxthreads field in its nodes collection's document.

Each task links to a single queue.

On startup, each DEN:

At a set interval (1 second), each DEN:

Orphaned Tasks

Before a DEN starts processing tasks, it must clean up any orphaned tasks that were not put into a completed state by setting their status to -6. It can also check for this periodically. A recommendation is checking for tasks with a status of -1 and an mtime of older than 5 minutes.

Task Lifecycle

Each task is initialised with its node as null (unclaimed) and status as -2 (queued).

DENs claim tasks from queues using findOneAndUpdate(filter, update, options), (which returns the task object), by putting them into a claimed state (setting status to -1 and node to the hostname of the DEN) until the per-DEN maxthreads and per-queue theads thresholds are reached. The DEN then notifies the DTR of the task, and the DTR puts the task into a running state (setting status to 0). When the plugin has finished, it reports back the status, stdout, and stderr of the task to the DTR. The DTR then updates the task's document in MongoDB with these values as well as the mtime.

findOneAndUpdate(filter, update, options)

See your MongoDB driver's documentation on its implemenation of findOneAndUpdate(). If it is not available, you can use findAndModify().

This ensures that there will be no race conditions amongst DENs, even in a sharded or replicated MongoDB cluster.

Database Collections

Nodes

DEN documents are in the nodes collection.

Specification

The following elements must be included when registering a DEN:

Each node document can also contain:

MongoDB will create an ObjectId for the node's _id.

Example
{
    "_id" : ObjectId("56fc05087aa3a33942e42a6a"),
    "node" : "mig01.example.com",
    "timestamp" : ISODate("2016-04-26T19:26:33.649Z"),
    "maxthreads": 5,
}

Queues

Queue documents are in the queues collection.

Specification

The following elements must be included when creating a queue:

The following elements may be included when creating a queue:

MongoDB will create an ObjectId for the queue's _id.

Example
{
    "_id" : ObjectId("571f8951b75bf335634ec271"),
    "plugin" : "Disbatch::Plugin::Demo",
    "name" : "demo",
    "threads" : 0,
    "sort" : "fifo"
}

Tasks

Task documents are in the tasks collection.

Specification

The following elements must be included when creating a task:

The following elements should be created by the DEN when the task finishes, and should be set to null when created:

MongoDB will create an ObjectId for the task's _id.

Example
{
    "_id" : ObjectId("571fac85ee63413233049fbd"),
    "params" : {
        "migration" : "oneoff",
        "user1" : "ashley@example.com",
        "user2" : "ashley@example.com",
        "commands" : "*",
    },
    "ctime" : ISODate("2016-04-26T17:59:33Z"),
    "status" : 1,
    "mtime" : ISODate("2016-04-26T18:37:40Z"),
    "queue" : ObjectId("54a700074b485f0b00000000"),
    "node" : "mig01.example.com",
    "stderr" : ObjectId("571fbac9d8590b78fe4830b4"),
    "stdout" : ObjectId("571fbac9d8590b78fe4830b2")
}

Task Status Codes

These are the standard status codes in Disbatch 4:

Formerly defined status codes that may be used for other needs:

You may use additional integer values for status codes. As a postive integer indicates that a task has finished, your plugin must return a positive integer for the status. Any unused negative value may be set when a task is queued to prevent the DEN from claiming it.

GridFS for Task stdout and stderr

Task stdout and stderr can be stored in the task as strings or by using MongoDB's GridFS specification.

As a document cannot be more than 16MB, GridFS will be needed to store stdout and stderr if they can cause the task document to exceed this size.

Disbatch uses the collections tasks.files and tasks.chunks instead of the default fs.files and fs.chunks, and the chunks are stored to ensure data is of type String and not BinData. Each file contains metadata: { task_id: task._id }, and the filenames are stdout or stderr.

Config file

On startup, the DEN, DCI, and DTR read a JSON format configuration file.

Mandatory settings are:
Optional MongoDB settings are:
Additional settings that may be specified are: