NAME

Algorithm::Huffman - Perl extension that implements the Huffman algorithm

SYNOPSIS

use Algorithm::Huffman;

my $huff = Algorithm::Huffman->new(\%char_counting);
my $encode_hash = $huff->encode_hash;
my $decode_hash = $huff->decode_hash;

DESCRIPTION

This modules implements the huffman algorithm. The aim is to create a good coding scheme for a given list of different characters (or even strings) and their occurence numbers.

ALGORITHM

Please have a look to a good data compression book for a detailed view. However, the algorithm is like every good algorithm very easy.

Assume we have a heap (keys are the characters/strings; values are their occurencies). In each step of the algorithm, the two rarest characters are looked at. Both get a suffix (one "0", the other "1"). They are joined together and will occur from that time as one "element" in the heap with their summed occurencies. The joining creates a tree growing on while the heap is reducing.

Let's take an example. Given are the characters and occurencies.

a (15) b(7) c(6) d(6) e(5)

In the first step e and d are the rarest characters, so we create this new heap and tree structure:

a(15) de(11) b(7) c(6)

      de
     /  \
 "0"/    \"1"
   d      e

Next Step:

a(15) bc(13) de(11)

      de                bc
     /  \              /  \
 "0"/    \"1"      "0"/    \"1"
   d      e          b      c

Next Step:

a(15) bcde(24)

              bcde
            /      \
       "0"/          \"1"
        /              \
      de                bc
     /  \              /  \
 "0"/    \"1"      "0"/    \"1"
   d      e          b      c

Next Step unifies the rest:

                          Huffman-Table
                             /    \
                       "0"/          \"1"
                      /                  \
                  /                          \
             bcde                              a
           /      \
      "0"/          \"1"
       /              \
     de                bc
    /  \              /  \
"0"/    \"1"      "0"/    \"1"
  d      e          b      c

Finally this encoding table would be created:

Please note, that there is no rule defining what element in the tree is ordered to left or to right. So it's also possible to get e.g. the coding scheme:

METHODS

my $huff = Algorithm::Huffman->new( HASHREF )

Creates a new Huffman table, based on the given occurencies of characters. The keys of the given hashref are the characters/strings, the values are their occurencies.

A hashref is used, as such a hash can become quite large (e.g. all three letter combinations).

$huff->encode_hash

Returns a reference to the encoding hash. The keys of the encoding hash are the characters/strings passed at the construction. The values are their bit representation. Please note that the bit represantations are strings of ones and zeros is returned and not binary numbers.

$huff->decode_hash

Returns a reference to the decoding hash. The keys of the decoding hash are the bit presentations, while the values are the characters/strings the bitstrings stands for. Please note that the bit represantations are strings of ones and zeros is returned and not binary numbers.

EXPORT

None by default.

BUGS

There is no great parameter validation. That will be changed in future versions.

If a character/string has occurs zero times, it is still coded. At the moment, you have to grep them out before. I don't plan to change it, as it can realistic happen and they would play a role. (Imagine, you would code all three letter combinations found in some english texts, you still would have to code all ASCII characters, even if they don't occur in the texts you have analyzed. Reason is that they could occur in other texts and you would have to guarantee that you can code every text without any lost information)

It isn't tested with a big histogram of characters/strings.

There could be some others, as this code is still in the ALPHA stadium.

TODO

I'll need a encode and decode method based on the created internal huffman table should be implemented.

AUTHOR

Janek Schleicher, <bigj@kamelfreund.de>

COPYRIGHT AND LICENSE

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

To install Algorithm::Huffman, copy and paste the appropriate command in to your terminal.

cpanm

cpanm Algorithm::Huffman

CPAN shell

perl -MCPAN -e shell
install Algorithm::Huffman

For more information on module installation, please visit the detailed CPAN module installation guide.

	Global
`s`	Focus search bar
`?`	Bring up this help dialog

	GitHub
`g` `p`	Go to pull requests
`g` `i`	go to github issues (only if github is preferred repository)

	POD
`g` `a`	Go to author
`g` `c`	Go to changes
`g` `i`	Go to issues
`g` `d`	Go to dist
`g` `r`	Go to repository/SCM
`g` `s`	Go to source
`g` `b`	Go to file browse

	Search terms
module: (e.g. module:Plugin)
distribution: (e.g. distribution:Dancer auth)
author: (e.g. author:SONGMU Redis)
version: (e.g. version:1.00)