NAME
Data::Password::Entropy - Calculate password strength
SYNOPSIS
use Data::Password::Entropy;
print "Entropy is ", password_entropy("pass123"), " bits."; # prints 31
if (password_entropy("mypass") < password_entropy("Ha20&09_X!t")) {
print "mypass is weaker. Unexpected, isn't it? :)";
}
DESCRIPTION
Information entropy, also known as password quality or password strength when used in discussions of information security, is a measure of how well a password resists brute-force attacks.
There are many ways to estimate a password's entropy. We use a simple, empirical algorithm. First, all characters of the string are split into several classes, such as digits, lower-case letters, upper-case letters and so on. All characters within one class are assumed to be equally likely to occur in a password. Mixing characters from different classes extends the number of possible symbols (the symbol base) of the password and thereby increases its entropy. Then we calculate the effective length of the password to enforce the following rules:
some orderliness decreases total entropy, so '1234' is a weaker password than '1342';
repeating sequences decrease total entropy, so 'a' x 100 is only insignificantly stronger than 'a' x 4 (the difference may even seem too insignificant).
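The first step, the symbol base, can be illustrated with a small standalone sketch. The class list, the class sizes and the helper names symbol_base and naive_entropy below are assumptions made for illustration only; they are not taken from the module's source, and the sketch deliberately omits the effective-length correction.

```perl
use strict;
use warnings;

# Hypothetical character classes and their sizes. The classes and sizes
# actually used by the module may differ.
my @classes = (
    [ qr/[0-9]/,                                   10 ],  # digits
    [ qr/[a-z]/,                                   26 ],  # lower-case letters
    [ qr/[A-Z]/,                                   26 ],  # upper-case letters
    [ qr/[\x20-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]/,  33 ],  # space and punctuation
    [ qr/[\x80-\xFF]/,                            128 ],  # all bytes above 127
);

# Sum the sizes of every class that occurs at least once in the string.
sub symbol_base {
    my ($bytes) = @_;
    my $base = 0;
    for my $class (@classes) {
        my ($re, $size) = @$class;
        $base += $size if $bytes =~ $re;
    }
    return $base;
}

# Naive estimate: raw length times log2 of the symbol base.
sub naive_entropy {
    my ($bytes) = @_;
    my $base = symbol_base($bytes);
    return 0 unless $base > 0;
    return length($bytes) * log($base) / log(2);
}

printf "%-8s base=%3d bits=%.1f\n", $_, symbol_base($_), naive_entropy($_)
    for qw(1234 1342 pass123 Pass123);
```

Under this naive estimate '1234' and '1342' score the same, because only the raw length and the symbol base are considered. That is the gap the effective length is meant to close.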
Do not expect too much: the algorithm does not check for weak passwords with a dictionary lookup (see Data::Password). It also cannot detect obfuscation like 'p@ssw0rd', sequences from a keyboard row or personally related information.
The probability of a character occurring depends only on the size of its character class. Perhaps the actual prevalence of each class should also be taken into account: it is very unlikely to find a control character in a password. However, common password policies do not allow control characters, spaces or extended characters in passwords, so they should not occur in practice.
Similarly, there is no well-defined approach to processing national characters. For example, the Greek letters block in the Unicode Character Database contains about 400 symbols, but they are not all used equally often. An intruder who knows that a password may contain Greek letters will not try α (Greek letter Alpha) with the same probability as ἆ (Greek small letter Alpha with psili and perispomeni). It might therefore be incorrect to use a whole UCD block or script as the base for calculating probabilities.
So the data is treated as a byte string, not a wide-character string, and all characters with codes higher than 127 form one class.
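To see what the byte-string view means for non-ASCII input, the following sketch encodes three Greek letters to UTF-8 with the core Encode module and counts the resulting bytes. The choice of UTF-8 is an assumption, and whether a caller needs to encode explicitly before calling the function is not stated in this document, so treat this as one way to make the byte interpretation explicit rather than a requirement.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Greek small letters alpha, beta, gamma as a wide-character string.
my $chars = "\x{3b1}\x{3b2}\x{3b3}";

# The same text as UTF-8 bytes: two bytes per letter.
my $bytes = encode('UTF-8', $chars);

printf "characters: %d, bytes: %d\n", length($chars), length($bytes);

# Count the bytes that fall into the single "above 127" class.
my $high = () = $bytes =~ /[\x80-\xFF]/g;
print "bytes above 127: $high\n";

# With the module installed, the byte string is what would be measured:
# print password_entropy($bytes), "\n";
```

Each Greek letter therefore contributes two symbols from the same class instead of one symbol from a large national alphabet.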
The character classes are based on the ASCII encoding. If your data uses something else, e.g. EBCDIC, you can try converting it with something like the Encode or Convert::EBCDIC modules.
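A minimal sketch of such a conversion with Encode follows. It assumes the input is in EBCDIC code page 37 (the 'cp37' encoding shipped with Encode); your data may use a different code page, and the EBCDIC input here is simulated by encoding a known string.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Simulate EBCDIC input by encoding a known string to code page 37.
my $ebcdic = encode('cp37', 'pass123');

# Decode the EBCDIC bytes to characters, then re-encode them as ASCII bytes.
my $ascii = encode('ascii', decode('cp37', $ebcdic));

printf "EBCDIC bytes (hex): %s\n", unpack('H*', $ebcdic);
printf "ASCII text:         %s\n", $ascii;

# With the module installed, the converted string is what would be measured:
# print password_entropy($ascii), "\n";
```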
FUNCTIONS
There is only one function in this package, password_entropy, and it is exported by default.
SEE ALSO
Data::Password, Data::Password::Manager, Data::Password::BasicCheck.
http://en.wikipedia.org/wiki/Password_strength
"A Conceptual Framework for Assessing Password Quality" by Wanli Ma, John Campbell, Dat Tran, and Dale Kleeman [PDF] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98.3266&rep=rep1&type=pdf
COPYRIGHT
Copyright (c) 2010 Oleg Alistratov. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
AUTHOR
Oleg Alistratov <zero@cpan.org>