NAME
DoubleBlind - Perl extension for data-obfuscation in double-blind experiments.
SYNOPSIS
use DoubleBlind;
sub cb($$$) { my ($n, $id, $label) = (shift, shift, shift);
rename "f$id.txt", "g$label.txt" or die; }
print process_shuffled \&cb, 55, 1;
DESCRIPTION
The intent is to simplify double-blind experiments in a "friendly" environment, where it is known that the experimenter will not consciously try to break the "coding". (For example, this may work when one experiments on oneself, or when the generated "label" can be hidden from the subject.) Decoding is easy with a calculator, but (with the exception of major computational savants) cannot be done unconsciously.
Several items are generated; each one has a "secret" id (an integer from a user-specified interval) and a "public" label (a decimal fraction). A caller-supplied callback function is executed with these data; it is expected to prepare the experimental data and mark it with the label.
In the simplest case, the callback would do all the work itself. For example, given files with names f1.txt .. f55.txt, this code would rename them to files with names similar to g2342.461.txt:
sub cb($$$) {
# $n: call number, $id: secret id, $label: public label
my ($n, $id, $label) = (shift, shift, shift);
rename "f$id.txt", "g$label.txt" or die "rename f$id.txt failed: $!";
}
print process_shuffled \&cb, 55, 1;
(additionally, it would output the decoding instructions). In more complicated cases, the callback might, e.g., output instructions for a third party to label the experimental data.
As an additional convenience, the items are supplied to the callback in a randomized order (the call order is the argument $n to the callback above). For example, one could apply one of 55 transformations to each of the files above, based on the number $n.
It should work for up to 1e4 items. (For best results, use 0 for the start index if the number of items is a power of 10; the top item number should not exceed 999999.) Since no attempt at speed optimization is made, large collections of items may require some computational resources.
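A callback that leaves the actual work to a third party might look roughly like the sketch below. It only collects instruction lines, using the call number $n to pick a transformation. The array @instructions and the wording of the instruction are assumptions made for illustration; the callback is invoked directly with made-up values here, whereas in real use it would be passed to process_shuffled.

```perl
use strict;
use warnings;

my @instructions;    # hypothetical store for the third party's to-do list

# Callback taking the three documented arguments: call number, id, label.
sub cb {
    my ($n, $id, $label) = @_;
    push @instructions,
        "Apply transformation $n to f$id.txt; save the result as g$label.txt";
}

# Direct call with sample values, for illustration only.
cb(3, 285, '1766.433');
print "$_\n" for @instructions;
```

The instruction list contains the secret ids, so it should be seen only by the third party who prepares the data.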
process_shuffled($callback, $items, $start)
Generates $items items, each with an item ID and an item label. An item ID is one of $items consecutive integers starting at $start. An item label is a decimal fraction near 2000 with 3 digits after the decimal separator. The item ID can be recovered as the last N digits before the decimal separator in the square of the label (where N is the number of digits in the last, i.e. largest, item ID).
For example, the label 1766.433 (its square is 3120285.543489) may correspond to the id 285 if the ids are between 1 and 5000. (For decoding, the calculator should preferably keep an extra digit after the decimal separator when it displays the square; errors of up to 2 units at this position are tolerated.) In the absence of a calculator, the squaring can be done with Perl as in
perl -wle "print 1766.433**2"
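The whole decoding step (square the label, keep the last N digits before the decimal separator) can also be scripted. The following is a minimal sketch; decode_label is a hypothetical helper, not a function provided by this module, and it assumes the caller knows the largest id in use.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DoubleBlind): recover the id from a label.
# $max_id is the largest id in use; its digit count gives N.
sub decode_label {
    my ($label, $max_id) = @_;
    my $digits = length $max_id;            # N
    my $square = $label ** 2;
    # Drop the fractional part, then keep the last N digits.
    return int($square) % 10 ** $digits;
}

print decode_label(1766.433, 5000), "\n";   # prints 285
```

With ids between 1 and 5000, N is 4, so the last four digits of 3120285 give 0285, that is, the id 285.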
The callback is a reference to a function taking 3 arguments: the call number (increasing from 1 to $items), the id, and the label.
EXPORT
None by default.
SEE ALSO
The file ex.pl in the distribution contains a complete real-life example of usage to check which audio storing options are suitable for your acoustic environment. Together with instructions inside this script, one can create a CD with double-blind sample of
AUTHOR
Ilya Zakharevich, <ilyaz@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2008 by Ilya Zakharevich
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.