You are on page 1of 2

4/21/2018 4.

3 An Application of Binary Trees: Huffman Code Construction

Next:4.3.1 ImplementationUp:4. Binary TreesPrevious:4.2 Binary Trees

4.3 An Application of Binary Trees: Huffman Code Construction


REF.
David A Huffman. A method for the construction of minimum-redundancy codes. Proceedings of the IRE. Volume 40, Number 9, pp.
1098-1101, 1952.

Suppose we have messages consisting of sequences of characters. In each message, the characters are independent and appear with a
known probability in any given position; the probabilities are the same for all positions.

e.g. - Suppose we have a message made from the five characters a, b, c, d, e, with probabilities 0.12, 0.40, 0.15, 0.08, 0.25,
respectively.

We wish to encode each character into a sequence of 0s and 1s so that no code for a character is the prefix of the code for any other
character. This prefix property allows us to decode a string of 0s and 1s by repeatedly deleting prefixes of the string that are codes for
characters.

Symbol Prob code 1 code

2 a 0.12 000

000

b 0.40 001

11 c 0.15 010

01

d 0.08 011

001 e 0.25 100

10

In the above, Code 1 has the prefix property. Code 2 also has the prefix property.
The problem: given a set of characters and their probabilities, find a code with the prefix property such that the average length of a
code for a character is a minimum.

Code 1: (0.12)(3) + (0.4)(3) + (0.15)(3) + (0.08)(3) + (0.25)(3) = 3


Code 2: (0.12)(3) + (0.4)(2) + (0.15)(2) + (0.08)(3) + (0.25)(2) = 2.2

Huffman's algorithm is one technique for finding optional prefix codes. The algorithm works by selecting two characters a and b
having the lowest probabilities and replacing them with a single (imaginary) character, say x, whose probability of occurrence is the
sum of probabilities for a and b. We then find an optimal prefix code for this smaller set of characters, using this procedure
recursively. The code for the original character set is obtained by using the code for x with a 0appended for ''a'' and a 1 appended for
''b''.
We can think of prefix codes as paths in binary trees. For example, see Figure 4.8.

Figure 4.8: Code 1 and Code 2

http://lcm.csa.iisc.ernet.in/dsa/node88.html 1/2
4/21/2018 4.3 An Application of Binary Trees: Huffman Code Construction
The prefix property guarantees that no character can have a code that is an interior node, and conversely, labeling the leaves of any
binary tree with characters gives us a code with the prefix property for these characters.
Huffman's algorithm is implemented using a forest (disjoint collection of trees), each of which has its leaves labeled by characters
whose codes we desire to select and whose roots are labeled by the sum of the probabilities of all the leaf labels. We call this sum the
weight of the tree.
Initially each character is in a one-node tree by itself and when the algorithm ends, there will be only one tree, with all the characters
as its leaves. In this final tree, the path from the root to any leaf represents the code for the label of that leaf.
The essential step of the algorithm is to select the two trees in the forest that have the smallest weights (break ties arbitrarily).
Combine these two trees into one, whose weight is the sum of the weights of the two trees. To combine the trees, we create a new
node, which becomes the root and has the roots of the two given trees as left and right children (which is which doesn't matter). This
process continues until only one tree remains.
An example of Huffman algorithm is shown in Figure 4.9 for five alphabets.

Figure 4.9: An example of Huffman algorithm

4.3.1 Implementation
4.3.2 Sketch of Huffman Tree Construction

Next:4.3.1 ImplementationUp:4. Binary TreesPrevious:4.2 Binary Trees


eEL,CSA_Dept,IISc,Bangalore

http://lcm.csa.iisc.ernet.in/dsa/node88.html 2/2

You might also like