
NAME
Session::Token - Secure, efficient, simple random session token
generation
SYNOPSIS
Simple 128-bit session token
my $token = Session::Token->new->get;
## 74da9DABOqgoipxqQDdygw
Keep generator around
my $generator = Session::Token->new;
my $token = $generator->get;
## bu4EXqWt5nEeDjTAZcbTKY
my $token2 = $generator->get;
## 4Vez56Zc7el5Ggx4PoXCNL
Custom minimum entropy in bits
my $token = Session::Token->new(entropy => 256)->get;
## WdLiluxxZVkPUHsoqnfcQ1YpARuj9Z7or3COA4HNNAv
Custom alphabet and length
my $token = Session::Token->new(alphabet => 'ACGT', length => 100_000_000)->get;
## AGTACTTAGCAATCAGCTGGTTCATGGTTGCCCCCATAG...
DESCRIPTION
This module provides a secure, efficient, and simple interface for
creating session tokens, password reset codes, temporary passwords,
random identifiers, and anything else you can think of.
When a Session::Token object is created, 1024 bytes are read from
"/dev/urandom" (Linux, Solaris, most BSDs), "/dev/arandom" (some older
BSDs), or Crypt::Random::Source::Strong::Win32 (Windows). These bytes
are used to seed the ISAAC-32 pseudo-random number generator.
Once a generator is created, you can repeatedly call the "get" method on
the generator object and it will return a new token each time.
IMPORTANT: If your application calls "fork", make sure that any
generators are re-created in one of the processes after the fork since
forking will duplicate the generator state and both parent and child
processes will go on to produce identical tokens (just like perl's rand
after it is seeded).
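For example, a forking server might re-seed in the child process along these lines (a minimal sketch; re-creating the generator in the parent instead works just as well):
    my $generator = Session::Token->new;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: discard the duplicated state and re-seed from urandom so
        # that parent and child don't produce identical token streams.
        $generator = Session::Token->new;
    }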
After the generator context is created, no system calls are used to
generate tokens. This is one way that Session::Token helps with
efficiency. However, this is only important for certain use cases
(generally not web sessions).
ISAAC is a cryptographically secure PRNG that improves on the well-known
RC4 algorithm in some important areas. For instance, it doesn't have
short cycles or initial bias like RC4 does. The theoretical shortest
possible cycle in ISAAC is "2**40", although no cycles this short have
ever been found (and probably don't exist at all). The expected ISAAC
cycle length is "2**8295".
GENERATORS AND URANDOM
Developers must choose whether a single token generator will be kept
around and used to generate all tokens, or if a new Session::Token
object will be created every time a token is needed. As mentioned above,
this module accesses urandom in its constructor for seeding purposes,
but not subsequently while generating tokens.
Generally speaking the generator should be kept around and re-used.
Probably the most important reason for this is that generating a new
token from an existing generator cannot fail due to a full
file-descriptor table. Creating a new Session::Token object for every
token can fail because, as described above, the constructor needs to
open "/dev/urandom" and this will not succeed if all allotted
descriptors are in use, or if the read is interrupted by a signal. In
these events a perl exception will be thrown.
Programs that re-use a generator are more likely to be portable to
"chroot"ed environments where "/dev/urandom" may not be present.
Finally, accessing urandom frequently is inefficient because it requires
making system calls and because (at least on linux) reading from urandom
acquires a system-wide kernel lock.
On the other hand, re-using a generator may be undesirable because
servers are typically started immediately after a system reboot and the
kernel's randomness pool might be poorly seeded at that point.
Similarly, when starting a virtual machine a previously used entropy
pool state may be restored. In these cases all subsequently generated
tokens will be derived from a weak/predictable seed. For this reason,
you might choose to defer creating the generator until the first request
actually comes in, periodically re-create the generator object, and/or
manually handle seeding in some other way.
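For instance, a generator that is created lazily and refreshed periodically might look like the following sketch (the one-hour refresh interval is an arbitrary choice for illustration):
    my ($generator, $seeded_at);

    sub next_token {
        # Seed on first use and re-seed once an hour so tokens aren't
        # derived from a possibly poorly seeded boot-time pool forever.
        if (!$generator || time() - $seeded_at > 3600) {
            $generator = Session::Token->new;   # reads /dev/urandom here
            $seeded_at = time();
        }
        return $generator->get;
    }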
Programs that assume opening "/dev/urandom" will always succeed can
return session tokens based only on the contents of nulled or
uninitialised memory. This is not the case with Session::Token since its
constructor will always throw an exception if it can't seed itself. Some
modern systems provide system calls with fewer failure modes (ie
"getentropy(2)" on OpenBSD and "getrandom(2)" on linux). Future versions
of Session::Token will likely use these system calls when available.
CUSTOM ALPHABETS
Being able to choose exactly which characters appear in your token is
sometimes useful. This set of characters is called the *alphabet*. The
default alphabet size is 62 characters: uppercase letters, lowercase
letters, and digits ("a-zA-Z0-9").
For some purposes, base-62 is a sweet spot. It is more compact than
hexadecimal encoding which helps with efficiency because session tokens
are usually transferred over the network many times during a session
(often uncompressed in HTTP headers).
Also, base-62 tokens don't use "wacky" characters like base-64 encodings
do. These characters sometimes cause encoding/escaping problems (ie when
embedded in URLs) and are annoying because often you can't select tokens
by double-clicking on them.
Although the default is base-62, there are all kinds of reasons for
using another alphabet. For example, if your users are reading tokens
from a print-out or SMS or whatever, you may choose to omit characters
like "o", "O", and "0" that are easily confused.
To set a custom alphabet, just pass in either a string or an array of
characters to the "alphabet" parameter of the constructor:
Session::Token->new(alphabet => '01')->get;
Session::Token->new(alphabet => ['0', '1'])->get; # same thing
Session::Token->new(alphabet => ['a'..'z'])->get; # character range
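Building on the readability point above, here is one possible confusion-resistant alphabet (exactly which characters to exclude is a judgment call):
    # Base-62 minus characters that are easily confused when read from a
    # print-out: 0/O/o and 1/l/I
    my @alphabet = grep { !/[0Oo1lI]/ } ('a'..'z', 'A'..'Z', '0'..'9');
    my $token = Session::Token->new(alphabet => \@alphabet)->get;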
Constructor args can be a hash-ref too:
Session::Token->new({ alphabet => ['a'..'z'] })->get;
ENTROPY
There are two ways to specify the length of tokens. The most primitive
is in terms of characters:
print Session::Token->new(length => 5)->get;
## -> wpLH4
But the primary way is to specify their minimum entropy in terms of
bits:
print Session::Token->new(entropy => 24)->get;
## -> Fo5SX
In the above example, the resulting token contains at least 24 bits of
entropy. Given the default base-62 alphabet, we can compute the exact
entropy of a 5 character token as follows:
$ perl -E 'say 5 * log(62)/log(2)'
29.7709815519344
So these tokens have about 29.8 bits of entropy. Note that if we removed
one character from this token, it would bring it below our desired 24
bits of entropy:
$ perl -E 'say 4 * log(62)/log(2)'
23.8167852415475
The default minimum entropy is 128 bits. Default tokens are 22
characters long and therefore have about 131 bits of entropy:
$ perl -E 'say 22 * log(62)/log(2)'
130.992318828511
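In general, the token length for a given minimum entropy is the entropy in bits divided by the bits of entropy per character, rounded up. Here is a quick way to double-check the figures above ("token_length" is a hypothetical helper, not part of this module's API):
    use POSIX qw(ceil);

    sub token_length {
        my ($entropy_bits, $alphabet_size) = @_;
        return ceil($entropy_bits / (log($alphabet_size) / log(2)));
    }

    print token_length(24, 62), "\n";    ## 5
    print token_length(128, 62), "\n";   ## 22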
An interesting observation is that in base-64 representation, 128-bit
minimum tokens also require 22 characters and that these tokens contain
only 1 more bit of entropy.
Another Session::Token design criterion is that all tokens should be the
same length. The default token length is 22 characters and the tokens
are always exactly 22 characters (no more, no less). Instead of tokens
that are exactly "N" characters, some libraries that use arbitrary
precision arithmetic end up creating tokens of *at most* "N" characters.
A fixed token length is nice because it makes writing matching regular
expressions easier, simplifies storage (you never have to store length),
causes various log files and things to line up neatly on your screen,
and ensures that encrypted tokens won't leak token entropy due to length
(see "VARIABLE LENGTH TOKENS").
In summary, the default token length of exactly 22 characters is a
consequence of these decisions: base-62 representation, 128 bit minimum
token entropy, and fixed token length.
MOD BIAS
Some token generation libraries that implement custom alphabets will
generate a random value, compute its modulus over the size of an
alphabet, and then use this modulus to index into the alphabet to
determine an output character.
Assume we have a uniform random number source that generates values in
the set "[0,1,2,3]" (most PRNGs provide sequences of bits, in other
words power-of-2 size sets) and wish to use the alphabet "abc".
If we use the naïve modulus algorithm described above then 0 maps to
"a", 1 maps to "b", 2 maps to "c", and 3 *also* maps to "a". This
results in the following biased distribution for each character in the
token:
P(a) = 2/4 = 1/2
P(b) = 1/4
P(c) = 1/4
Of course in an unbiased distribution, each character would have the
same chance:
P(a) = 1/3
P(b) = 1/3
P(c) = 1/3
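A quick simulation makes the skew visible (perl's insecure "rand()" is used purely as a stand-in for a uniform 4-value source):
    my %count;

    for (1 .. 400_000) {
        my $v = int(rand(4));                 # uniform over 0..3
        $count{ ('a', 'b', 'c')[$v % 3] }++;  # naive modulus mapping
    }

    printf "%s: %.3f\n", $_, $count{$_} / 400_000 for sort keys %count;
    ## -> a: 0.500  b: 0.250  c: 0.250 (approximately)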
Bias is undesirable because it makes certain tokens obvious starting
points for guessing attacks while certain other tokens become very
unlikely. Unbiased tokens are all equally likely, so there is no obvious
place to start guessing.
Session::Token provides unbiased tokens regardless of the size of your
alphabet (though see the "INTRODUCING BIAS" section for a mis-use
warning). It does this in the same way that you might simulate producing
unbiased random numbers from 1 to 5 given an unbiased 6-sided die:
Re-roll every time a 6 comes up.
In the above example, Session::Token eliminates bias by only using
values of 0, 1, and 2 (the "t/no-mod-bias.t" test contains some more
notes on this topic).
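In code, the technique is plain rejection sampling. A rough sketch of the idea (not this module's actual implementation, and again using "rand()" only as a stand-in for a masked ISAAC byte):
    my @alphabet = ('a', 'b', 'c');

    sub unbiased_char {
        while (1) {
            my $v = int(rand(4));    # one value from the 0..3 source
            return $alphabet[$v] if $v < @alphabet;
            # a value of 3 is rejected and we "re-roll"
        }
    }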
Note that mod bias can be made arbitrarily small by increasing the
amount of data consumed from a random number generator (provided that
arbitrary precision modulus is available). Because this module
fundamentally avoids mod bias, it can use each of the 4 bytes from an
ISAAC-32 word for a separate character (excepting "re-rolls").
EFFICIENCY OF RE-ROLLING
Throwing away a portion of random data in order to avoid mod bias is
slightly inefficient. How many bytes from ISAAC do we expect to consume
for every character in the token? It depends on the size of the
alphabet.
Session::Token masks each byte with a bitmask equal to one less than the
smallest power of two greater than or equal to the alphabet size, so the
probability that any particular byte can be used is:
P = alphabet_size / next_power_of_two(alphabet_size)
For example, with the default base-62 alphabet "P" is "62/64".
In order to find the average number of bytes consumed for each
character, calculate the expected value "E". There is a probability "P"
that the first byte will be used and therefore only one byte will be
consumed, and a probability "1 - P" that "1 + E" bytes will be consumed:
E = P*1 + (1 - P)*(1 + E)
E = P + 1 + E - P - P*E
0 = 1 - P*E
P*E = 1
E = 1/P
So for the default base-62 alphabet, the average number of bytes
consumed for each character in a token is:
E = 1/(62/64) = 64/62 ≅ 1.0323
Because of the next power of two masking optimisation described above,
"E" will always be less than 2. In the worst case scenario of an
alphabet with 129 characters, "E" is roughly 1.9845.
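These figures are easy to verify (both helpers below are illustrative only, not part of this module):
    sub next_power_of_two {
        my ($n) = @_;
        my $p = 1;
        $p *= 2 while $p < $n;
        return $p;
    }

    sub expected_bytes_per_char {
        my ($alphabet_size) = @_;
        return next_power_of_two($alphabet_size) / $alphabet_size;
    }

    printf "%.4f\n", expected_bytes_per_char(62);    ## 1.0323
    printf "%.4f\n", expected_bytes_per_char(129);   ## 1.9845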
This minor inefficiency isn't an issue because the ISAAC implementation
used is quite fast and this module is very thrifty in how it uses
ISAAC's output.
INTRODUCING BIAS
If your alphabet contains the same character two or more times, this
character will be more biased than a character that only occurs once.
You should be careful that your alphabets don't repeat in this way if
you are trying to create random session tokens.
However, if you wish to introduce bias this library doesn't try to stop
you. (Maybe it should print a warning?)
Session::Token->new(alphabet => '0000001', length => 5000)->get; # don't do this
## -> 0000000000010000000110000000000000000000000100...
Due to a limitation discussed below, alphabets larger than 256 aren't
currently supported so your bias can't get very granular.
Aside: If you have a constant-biased output stream like the above
example produces, you can re-construct an un-biased bit sequence
with the von Neumann algorithm. This works by comparing pairs of bits.
If the pair consists of identical bits, it is discarded. Otherwise the
order of the differing bits determines an output bit, ie 00
and 11 are discarded but 01 and 10 are mapped to output bits of 0 and 1
respectively. This only works if the bias in each bit is constant (as it
is for all characters in a Session::Token token).
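As a sketch, a von Neumann extractor over a string of "0"/"1" characters could look like this (illustrative only; note that it discards most of its heavily biased input):
    # Compare bit pairs: discard 00 and 11, map 01 -> 0 and 10 -> 1
    sub von_neumann {
        my @bits = split //, shift;
        my $out = '';

        while (@bits >= 2) {
            my ($a, $b) = splice(@bits, 0, 2);
            $out .= $a unless $a eq $b;
        }

        return $out;
    }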
ALPHABET SIZE LIMITATION
Due to a limitation in this module's code, alphabets can't be larger
than 256 characters. Everywhere the above manual says "characters" it
actually means bytes. This isn't a Unicode limitation per se, just the
maximum size of the alphabet. If you like, you can map tokens onto new
alphabets yourself, as long as those alphabets aren't more than 256
characters long. Here is how to generate a 128-bit minimum entropy token
using the lowercase Greek alphabet (note that both forms of lowercase
sigma are included, which may not be desirable):
use utf8;
my $token = Session::Token->new(alphabet => [map {chr} 0..24])->get;
$token = join '', map {chr} map {ord($_) + ord('α')} split //, $token;
# ρφνδαπξδββφδοςλχτμγσψδψζειετ
Here's an interesting way to generate a uniform random integer between 0
to 999 inclusive:
0 + Session::Token->new(alphabet => ['0'..'9'], length => 3)->get
If you wanted to natively support high code points, there is no point in
hard-coding a limitation on the size of Unicode or even the (higher)
limitation of perl characters. Instead, arbitrary precision "characters"
should be supported with bigint. Here's an example of something similar
in lisp: isaac.lisp <http://hcsw.org/downloads/isaac.lisp>.
This module is not however designed to be the ultimate random number
generator and at this time I think changing the design as described
above would interfere with its goal of being secure, efficient, and
simple.
TOKEN TEMPLATES
String::Random has a method called "randpattern" where you provide a
pattern that serves as a template when creating the token. You define
the meaning of 1 or more template characters and each one that occurs in
the pattern is replaced by a random character from a corresponding
alphabet.
Andrew Beverley requested this feature for Session::Token and I
suggested approximately the following:
    use Session::Token;

    sub token_template {
        my (%m) = @_;

        %m = map { $_ => Session::Token->new(alphabet => $m{$_}, length => 1) } keys %m;

        return sub {
            my $v = shift;
            $v =~ s/(.)/exists $m{$1} ? $m{$1}->get : $1/eg;
            return $v;
        };
    }
In order to use "token_template" you should pass it key-vaue pairs of
the different token characters and the alphabets they represent. It will
return a sub that should be passed the template pattern and it will
return the resulting random tokens.
For example, here is how to create UUID version 4 (random) tokens:
    sub uuid_v4_generator {
        my $t = token_template(
            x => [ 0..9, 'a'..'f' ],
            y => [ 8, 9, 'a', 'b' ],
        );

        return sub {
            return $t->('xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx');
        };
    }
"uuid_v4_generator" returns a generator function that will return tokens
of the following form:
1b782499-9913-4726-a80a-25e7b2221a7c
90f85a64-d826-43bf-98e7-94ba87406bfb
b8b73175-3cce-4861-b43b-3dec5ed5d641
3afb64ab-6de3-4647-bbff-eb94dfa7d4b0
447d2001-2aec-4d32-9910-8c289ae34c48
Note that characters in the pattern which don't have template characters
defined ("-" and "4" in the above example) are passed through unchanged
to the output token.
SEEDING
This module is designed to always seed itself from your kernel's secure
random number source. You should never need to seed it yourself.
However if you know what you're doing you can pass in a custom seed as a
1024 byte long string. For example, here is how to create a "null
seeded" generator:
my $gen = Session::Token->new(seed => "\x00" x 1024);
This is done in the test-suite to compare against Jenkins' reference
ISAAC output, but obviously don't do this in regular applications
because the generated tokens will be the same every time your program is
run.
One valid reason for manually seeding is if you have some reason to
believe that there isn't enough entropy in your kernel's randomness pool
and therefore you don't trust "/dev/urandom". In this case you should
acquire your own seed data from somewhere trustworthy (maybe
"/dev/random" or a previously stored trusted seed).
VARIABLE LENGTH TOKENS
As mentioned above, all tokens produced by a Session::Token generator
are the same length. If you prefer tokens of variable length, it is
possible to post-process the tokens in order to achieve this so long as
you keep some things in mind.
If you randomly truncate tokens created by Session::Token, be careful
not to introduce bias. For example, if you choose the length of the
token as a uniformly distributed random length between 8 and 10, then
the output will be biased towards shorter token sizes. Length 8 tokens
should appear less frequently than length 9 or 10 tokens because there
are fewer of them.
Another approach is to eliminate leading characters of a given value in
the same way as leading 0s are commonly eliminated from numeric
representations. Although this approach doesn't introduce bias, the
tokens 1 and 01 are not distinct so it does not increase token entropy
given a fixed maximum token length which is the main reason for
preferring variable length tokens. The ideal variable length algorithm
would generate both 1 and 01 tokens (with identical frequency of
course).
Implementing unbiased, variable-length tokens would complicate the
Session::Token implementation especially since you should still be able
to specify minimum entropy variable-length tokens. Minimum entropy is
the primary input to Session::Token, not token length. This is the
reason that the default token length of 22 isn't hard-coded anywhere in
the Session::Token source code (but 128 is).
The final reason that Session::Token discourages variable length tokens
is that they can leak token information through a side-channel. This
could occur when a message is encrypted but the length of the original
message can be inferred from the encrypted ciphertext.
BUGS
Should check for biased alphabets and print warnings.
Would be cool if it could detect forks and warn or re-seed in the child
process (without incurring "getpid" overhead).
There is currently no way to extract the seed from a Session::Token
object. Note when implementing this: The saved seed must either store
the current state of the ISAAC round as well as the 1024 byte "randsl"
array or else do some kind of minimum fast forwarding in order to
protect against a partially duplicated output-stream bug.
Doesn't work on perl 5.6 and below due to the use of ":raw" (thanks CPAN
testers). It could probably use "binmode" instead, but meh.
On windows we use Crypt::Random::Source::Strong::Win32 which has a big
dependency tree. We should instead use a slimmer module like
Crypt::Random::Seed.
COMMAND-LINE APP
There is a command-line application called App::Session::Token which is
a convenience wrapper around Session::Token. You can generate session
tokens by running the "session-token" binary:
$ echo "Your password is `session-token`"
Your password is 8Yom6z4AeB1RXxCGzklJFt
It supports all the options of this module via command line parameters,
and multiple session tokens can be generated with the "--num" (aka "-n")
switch. For example:
$ session-token --alphabet ABC --entropy 32 --num 5
BACAACABCCCCAACBBBCAB
BCBACACBBCACCBABABCBA
ABBBCBABBACBBBCBBBCCA
AACCBBBCCAAACBABACABC
CCABCABBCCCAACAAACCAA
SEE ALSO
The Session::Token github repo
App::Session::Token
Presentation for Toronto Perl Mongers
There are lots of different modules for generating random data. If the
characterisations of any of them below are inaccurate or out-of-date,
please file a github issue and I will correct them.
Like this module, perl's "rand()" function implements a user-space PRNG
seeded from "/dev/urandom". However, perl's "rand()" is not secure. Perl
doesn't specify a PRNG algorithm at all. On linux, whatever algorithm it
uses is seeded with a mere 4 bytes from "/dev/urandom".
Data::Token is the first thing I saw when I looked around on CPAN. It
has an inflexible and unspecified alphabet. It tries to get its source
of unpredictability from UUIDs and then hashes these UUIDs with SHA-1. I
think this is bad design because some standard UUID formats aren't
designed to be unpredictable at all. This is acknowledged in RFC 4122
section 6: "Do not assume that UUIDs are hard to guess; they should not
be used as security capabilities (identifiers whose mere possession
grants access)." With certain UUIDs, knowing a target's MAC address or
the rough time the token was issued may help you predict a reduced area
of token-space to concentrate guessing attacks upon. I don't know if
Data::Token uses these types of UUIDs or the potentially secure "version
4" UUIDs, but because this wasn't addressed in the documentation and
because of an apparent misapplication of hash functions (if you really
had an unpredictable UUID, there would be no need to hash), I don't feel
good about using this module.
There are several decent random number generators like
Math::Random::Secure and Crypt::URandom but they usually don't implement
alphabets and some of them require you open or read from "/dev/urandom"
for every chunk of random bytes. Note that Math::Random::Secure does
prevent mod bias in its random integers and could be used to implement
unbiased alphabets (slowly).
String::Random has a neat regexp-like language for specifying random
tokens which is more flexible than alphabets. However, it uses perl's
"rand()" and its documentation fails to discuss performance, bias, or
security. See the "TOKEN TEMPLATES" section for a similar feature.
String::Urandom has alphabets, but it uses the flawed mod algorithm
described above and opens "/dev/urandom" for every token.
There are other modules like Data::Random, App::Genpass,
String::MkPasswd, Crypt::RandPasswd, Crypt::GeneratePassword, and
Data::SimplePassword but they use insecure PRNGs such as "rand()" or
mersenne twister, don't adequately deal with bias, and/or don't let you
specify generic alphabets.
Bytes::Random::Secure has alphabets (aka "bags"), uses ISAAC, and avoids
mod bias using the re-roll algorithm. It is much slower than
Session::Token (even when using Math::Random::ISAAC::XS) but does
support alphabets larger than 256 and might work in environments without
XS.
Neil Bowers has conducted a 3rd party review
<http://neilb.org/reviews/passwords.html> of various token/password
generation modules including Session::Token.
Leo Zovic has created a Common Lisp implementation of session-token.
AUTHOR
Doug Hoyte, "<doug@hcsw.org>"
COPYRIGHT & LICENSE
Copyright 2012-2016 Doug Hoyte.
This module is licensed under the same terms as perl itself.
ISAAC code:
By Bob Jenkins. My random number generator, ISAAC. Public Domain