NAME
Pg::Explain::StringAnonymizer - Class to anonymize sets of strings
VERSION
Version 2.7
SYNOPSIS
This module provides a way to turn a defined set of strings into an anonymized version of that set, with 4 properties:
the same original string should give the same output string (within the same input set)
strings shouldn't be very long
it shouldn't be possible to reverse the operation
generated strings should be easy to read, and easy to distinguish between themselves.
The first and third points can be achieved easily with a hashing function (md5, sha), but the generated hashes violate the fourth point, and sometimes also the second.
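To illustrate why plain hashes fall short of the readability and length goals, here is a minimal sketch using the core Digest::MD5 and Digest::SHA modules. It is only a demonstration of hash output and is not part of this module; the helper name hex_digests is hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );
use Digest::SHA qw( sha1_hex );

# Hypothetical helper: returns the hex md5 and sha1 digests of a string.
sub hex_digests {
    my ( $string ) = @_;
    return ( md5_hex( $string ), sha1_hex( $string ) );
}

for my $string ( 'a', 'depesz' ) {
    my ( $md5, $sha1 ) = hex_digests( $string );

    # Digests are deterministic and hard to reverse, but they are
    # 32 and 40 hex characters long and hard to tell apart by eye.
    printf "%-6s md5=%s (%d chars) sha1=%s (%d chars)\n",
        $string, $md5, length $md5, $sha1, length $sha1;
}
```

Even for a one-character input the digests are 32 and 40 characters of hexadecimal noise, which is what the fourth and second points are meant to avoid.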
Example of usage:
my $anonymizer = Pg::Explain::StringAnonymizer->new();
$anonymizer->add( 'a', 'b', 'c');
$anonymizer->add( 'depesz' );
$anonymizer->add( [ "any strings", "are possible" ] );
$anonymizer->finalize();
print $anonymizer->anonymized( 'a' ), "\n";
my $full_dictionary = $anonymizer->anonymization_dictionary();
METHODS
new
Object constructor, doesn't take any arguments.
add
Adds new string(s) to anonymization list.
Strings can be given either as a list or as an ArrayRef.
Note that no more elements can be added with add() once the set has been finalized (that is, after a call to the finalize() method).
Calling add() after finalize() raises an exception.
finalize
Finalizes string set creation, and creates anonymized versions.
It has to be called after at least one add() call, so that it has something to work on.
After running finalize(), no more strings can be added with add().
Also, the anonymized() and anonymization_dictionary() methods cannot be called before finalize().
anonymized
Returns the anonymized version of the given string, or undef if the string wasn't previously added to the anonymization set.
If it is called before finalize(), it raises an exception.
anonymize_text
Anonymizes the given text using the loaded dictionary of substitutions.
anonymization_dictionary
Returns a hash reference containing all input strings and their anonymized versions, like:
{
'original1' => 'anon1',
'original2' => 'anon2',
...
'originalN' => 'anonN',
}
If it is called before finalize(), it raises an exception.
INTERNAL METHODS
_hash
Converts the given string into an array of 32 integers in the range 0..31.
This is done by taking the sha1 checksum of the string (160 bits), splitting it into 32 5-bit long "segments", and converting each segment into an integer.
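The segmentation described above can be sketched as follows, using the core Digest::SHA module. This is an illustration under assumptions, not the module's actual code: the function name hash_to_segments is hypothetical, and the real _hash may order bits differently.

```perl
use strict;
use warnings;
use Digest::SHA qw( sha1 );

# Hypothetical helper, not the module's own _hash().
sub hash_to_segments {
    my ( $string ) = @_;

    # sha1() returns 20 raw bytes; 'B*' unpacks them into a
    # 160-character string of '0' and '1'.
    my $bits = unpack 'B*', sha1( $string );

    # 160 bits / 5 bits per segment = 32 segments. Each 5-bit chunk
    # is read as a binary number, giving a value in 0..31.
    return map { oct( "0b$_" ) } $bits =~ m/(.{5})/g;
}

my @segments = hash_to_segments( 'depesz' );
print scalar( @segments ), " segments: @segments\n";
```

Five bits per segment is what makes a 32-entry word dictionary sufficient: every segment value indexes exactly one word.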
_word
Returns n-th word from number-to-word translation dictionary.
_make_prefixes
Scans the given keys, and changes their values (in the ->{'strings'} hash) to the shortest unique prefix.
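One way to compute shortest unique prefixes over integer sequences is sketched below. It is a simple quadratic approach written for clarity and is not claimed to be how _make_prefixes works internally; the function name shortest_unique_prefixes and the input layout (a hash of key => ArrayRef of integers) are assumptions, and all sequences are assumed to be distinct and of equal length.

```perl
use strict;
use warnings;

# Hypothetical helper: for every key, find the shortest prefix of its
# integer sequence that no other key's sequence starts with.
sub shortest_unique_prefixes {
    my ( $sequences ) = @_;    # { key => [ integers ] }
    my %prefixes;

    for my $key ( keys %{ $sequences } ) {
        my @seq = @{ $sequences->{ $key } };
        my $len = 1;

        # Grow the prefix until no other key shares it
        # (or until the whole sequence is used).
        while ( $len < @seq ) {
            my $candidate = join ',', @seq[ 0 .. $len - 1 ];
            my $clashes = grep {
                $_ ne $key
                    && $candidate eq join ',', @{ $sequences->{ $_ } }[ 0 .. $len - 1 ]
            } keys %{ $sequences };
            last unless $clashes;
            $len++;
        }

        $prefixes{ $key } = [ @seq[ 0 .. $len - 1 ] ];
    }

    return \%prefixes;
}

my $prefixes = shortest_unique_prefixes(
    {
        a => [ 1, 2, 3 ],
        b => [ 1, 2, 4 ],
        c => [ 5, 6, 7 ],
    }
);
print "$_ => @{ $prefixes->{ $_ } }\n" for sort keys %{ $prefixes };
```

Here 'c' needs only its first element, while 'a' and 'b' share two leading elements and therefore keep all three.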
_stringify
Converts arrays of ints (prefixes for hashed words) into strings.
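As a rough illustration of how a number-to-word dictionary can turn such a prefix into a readable string, consider the sketch below. The word list, the underscore separator, and the function name stringify_prefix are all assumptions made for this example; the module's actual dictionary and output format may differ.

```perl
use strict;
use warnings;

# Hypothetical 32-entry dictionary, one word per value in 0..31.
my @WORDS = qw(
    alpha   bravo  charlie delta    echo    foxtrot golf    hotel
    india   juliet kilo    lima     mike    november oscar  papa
    quebec  romeo  sierra  tango    uniform victor  whiskey xray
    yankee  zulu   two     three    four    five    six     seven
);

# Hypothetical helper: maps each integer to its word and joins the
# words with an underscore (separator chosen for this sketch only).
sub stringify_prefix {
    my ( $prefix ) = @_;
    return join '_', map { $WORDS[ $_ ] } @{ $prefix };
}

print stringify_prefix( [ 0, 3, 31 ] ), "\n";    # prints "alpha_delta_seven"
```

Short prefixes translate into short, pronounceable strings, which is how the length and readability goals from the SYNOPSIS can be met together.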
AUTHOR
hubert depesz lubaczewski, <depesz at depesz.com>
BUGS
Please report any bugs or feature requests to depesz at depesz.com.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Pg::Explain::StringAnonymizer
COPYRIGHT & LICENSE
Copyright 2008-2023 hubert depesz lubaczewski, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.