NAME

Net::OnlineCode - A rateless forward error correction scheme

SYNOPSIS

use strict;

# Base class only exports routines for doing xor
use Net::OnlineCode ':xor';

my @strings = ("abcde", "     ", "ABCDE", "\0\0\0\0\0");

# xor routines take a reference to a destination string, which is
# modified by xoring it with all the other strings passed in. The
# calculated value is also returned.

# "safe" xor routine is a pure Perl implementation
my $result1 = safe_xor_strings(\$strings[0], @strings[1..3]);

# "fast" xor routine is implemented in C
my $result2 = fast_xor_strings(\$strings[0], @strings[1..3]);

DESCRIPTION

This module implements the common functions required for the Net::OnlineCode::Encoder and Net::OnlineCode::Decoder modules. Apart from the two xor library routines shown above, there are no other user-callable methods or functions.

The remainder of this document will give a brief overview of the Online Code algorithms. For a programmer's view of how to use this collection of modules, refer to the man pages for the Encoder and Decoder modules.

ONLINE CODES

Briefly, Online Codes are a scheme that allows a sender to break up a message (eg, a file) into a series of "check blocks" for transmission over a lossy network. When the receiver has received and decoded enough check blocks, they will ultimately be able to recover the original message in its entirety.

Online Codes differ from traditional "forward error correcting" schemes in two important respects:

  • they are fast to calculate (on both sending and receiving end); and

  • they are "rateless", meaning that the sender can send out a (practically) infinite stream of check blocks. The receiver typically only has to correctly receive a certain number of them (usually a only small percentage more than the number of original message blocks) in order to decode the full message.

When using a traditional error-correction scheme, the sender usually has to set up the encoder parameters to take account of the expected packet loss rate. In contrast, with Online Codes, the sender effectively doesn't care about what the packet loss rate: it just keeps sending new check blocks until either the receiver(s) acknowledge the message as having been decoded, or until it has a reasonable expectation that the message should have been decoded.

ONLINE CODES IN MORE DETAIL

The fundamental idea used in Online Codes is to xor some number of message blocks together to form either auxiliary blocks (which are internal to the algorithm) or check blocks (which are sent across the network). Each check block is sent along with a block ID, which encodes information about which message blocks (or auxiliary blocks) comprise that check block. Initially, the only check blocks that a receiver can use are those that are comprised of only a single message block, but as more message blocks are decoded, they can be xored (or, in algebraic terms, "substituted") into each pending (unsolved) check block. Eventually, given enough check blocks, the receiver will be able to solve each of the message blocks.

ENCODING/DECODING STEPS

Encoding consists of two parts:

  • Before transmission begins, some number of auxiliary blocks are created by xoring a random selection of message blocks together.

  • For each check block that is to be transmitted, a random selection of auxiliary and/or message blocks (collectively referred to as "composite blocks") are xored together.

Decoding follows the same steps as in the Encoder, except in reverse. Each received check block can potentially solve one unknown auxiliary or message block directly. Further, every time an auxiliary or message block becomes solved, that value can be "substituted in" to any check block that has not yet been fully decoded. Those check blocks may then be able to solve more message or auxiliary blocks.

PSEUDO-RANDOM NUMBER GENERATORS AND BLOCK IDS

When the receiver receives a check block, it needs to know which message and/or auxiliary blocks it is composed of. Likewise, it needs to know which message blocks each auxiliary block is composed of. This is achieved by having both the sender and receiver side use an identical pseudo-random number generator algorithm. Since both sides are using an identical PRNG, they can both use it to randomly select which message blocks comprise each auxiliary block, and which composite blocks comprise each check block.

Naturally, for this to work, not only should the sender and receiver both be using the same PRNG algorithm, but they also need to be using the same PRNG seeds. This is where Block IDs (and also, for the auxiliary block mapping, File IDs) come in. For check blocks, the sender picks a (truly) random Block ID and uses it to seed the PRNG. Then, using the PRNG, it pseudo-randomly selects some number of composite blocks. It then sends the Block ID along with the xor of all the selected composite blocks. The sender then uses the Block ID to seed its own PRNG, so when it pseudo-randomly selects the list of composite blocks, it should be the same as that selected by the sender.

A similar scheme is used at the start of the transmission to determine which message blocks are xored to create the auxiliary blocks.

The upshot of this is that Block (and File) IDs only need to be a fixed size, regardless of how many message blocks there are, how many composite blocks are included in a check block, and so on. This makes it much easier to design a packet format and process it at the sending and receiving sides.

IMPLEMENTATION DETAILS

The module is a fairly faithful implementation of Maymounkov and Maziere's paper describing the scheme. There are some slight variations:

  • in the case where the epsilon value would produce a max degree (F) value greater than the number of message blocks, it (ie, epsilon) is increased until F <= # of message blocks. The code to do this is based on Python code by Gwylim Ashley.

  • duplicate check blocks are quashed (since they provide no new information)

  • the graph decoding algorithm is replaced with a functionally equivalent one

  • message blocks need not decoded immediately: rather, the calling application makes the choice about when to do the xors (this may be more time-efficient, particularly if the application is storing check blocks to secondary storage)

Apart from that, the original paper does not specify a PRNG algorithm. This module implements one using SHA-1, since it is portable across platforms and readily available.

IMPORTANT NOTE

This is the initial software release. It should be free of serious bugs, however it has a few shortcomings:

  • Memory usage is particularly high, especially if the number of check blocks runs into the thousands; and

  • The Decoder must see more check blocks than the original paper suggests should be the case

  • Although intended to be used with POE, the Decoder is currently not very suitable for use in a POE callback because it does not yield until it has done as much graph decoding as possible. If POE is being used to handle incoming network packets, the delay can cause it to drop packets. A future version will correct this by reducing the amount of work done per callback (so that control is returned to the POE event loop sooner, with the possibility to queue the remaining outstanding work with more calls to the Decoder object)

As a result, it is likely that any future version that tries to address these two points may result in changes to the calling interfaces. See the TODO file that came in this distribution for more details.

SEE ALSO

Wikipedia page describing Online Codes

PDF/PostScript links:

Github repository for Gwylim Ashley's Online Code implementation (various parts of my code are based on this)

This module is part of the GnetRAID project. For project development page, see:

https://sourceforge.net/projects/gnetraid/develop

AUTHOR

Declan Malone, <idablack@users.sourceforge.net>

COPYRIGHT AND LICENSE

Copyright (C) 2013 by Declan Malone

This package is free software; you can redistribute it and/or modify it under the terms of the "GNU General Public License" ("GPL").

The C code at the core of this Perl module can additionally be redistributed and/or modified under the terms of the "GNU Library General Public License" ("LGPL"). For the purpose of that license, the "library" is defined as the unmodified C code in the clib/ directory of this distribution. You are permitted to change the typedefs and function prototypes to match the word sizes on your machine, but any further modification (such as removing the static modifier for non-exported function or data structure names) are not permitted under the LGPL, so the library will revert to being covered by the full version of the GPL.

Please refer to the files "GNU_GPL.txt" and "GNU_LGPL.txt" in this distribution for details.

DISCLAIMER

This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the "GNU General Public License" for more details.