2. HUFFMAN CODING
• Huffman coding algorithm: a bottom-up
approach.
• Huffman coding is a procedure for generating a
binary code tree. The algorithm, invented by David
Huffman in 1952, ensures that the length of each
symbol's code reflects its probability of occurrence:
the more frequent the symbol, the shorter its code.
• Huffman coding can achieve effective data
compression by reducing the amount of redundancy
in the coding of symbols.
Rahul Khanvani For More Visit Binarybuzz.wordpress.com
3. Huffman Coding Algorithm
1. Initialization: Put all symbols on a list sorted according
to their frequency counts.
2. Repeat until the list has only one symbol left:
   1. From the list, pick the two symbols with the lowest
   frequency counts.
   2. Form a Huffman sub-tree that has these two symbols as
   child nodes and create a parent node.
   3. Assign the sum of the children's frequency counts to the
   parent and insert it into the list such that the order is
   maintained.
   4. Delete the children from the list.
3. Assign a codeword to each leaf based on the path from
the root.
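The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the slides' own implementation: the function name `huffman_codes` is hypothetical, a min-heap stands in for the sorted list, and ties between equal counts may be broken differently than on the slides, so individual codewords can differ while the code lengths stay the same.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a prefix code for a {symbol: frequency} mapping (hypothetical helper)."""
    tie = count()  # unique tie-breaker so the heap never compares the symbol lists
    # Step 1: a min-heap plays the role of the list sorted by frequency count.
    heap = [(f, next(tie), [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    codes = {sym: "" for sym in freqs}
    # Step 2: repeatedly merge the two entries with the lowest counts.
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        # Every leaf under the new parent gains one leading bit (step 3, done incrementally).
        for sym in group1:
            codes[sym] = "0" + codes[sym]
        for sym in group2:
            codes[sym] = "1" + codes[sym]
        # The parent carries the sum of its children's counts back into the "list".
        heapq.heappush(heap, (f1 + f2, next(tie), group1 + group2))
    return codes

freqs = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
codes = huffman_codes(freqs)
lengths = {sym: len(code) for sym, code in codes.items()}
total_bits = sum(freqs[sym] * lengths[sym] for sym in freqs)
print(codes)
print(lengths, total_bits)
```

For the counts used in the example slides (A 15, B 7, C 6, D 6, E 5), this sketch gives A a 1-bit code and every other symbol a 3-bit code.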
5. Constructing a Tree from the Nodes with
Minimum Occurrence
(11)
D(6) E(5)
6. Constructing a Tree from the Nodes with
Minimum Occurrence
17
C(6) (11)
D(6) E(5)
7. Re-Constructing the Tree from the Nodes with
Minimum Occurrence
17
(13)
B(7) C(6)
(11)
D(6) E(5)
8. Re-Constructing the Tree from the Nodes with
Minimum Occurrence
(39)
  A(15)
  (24)
    (13)
      B(7)
      C(6)
    (11)
      D(6)
      E(5)
9. Huffman Coding Result
Symbol  Count  Code
A       15     0
B       7      100
C       6      101
D       6      110
E       5      111
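Because no codeword in the table above is a prefix of another, a bit stream can be decoded without separators. The following Python sketch illustrates this with the codes from the table; the names `CODE_TABLE`, `encode`, and `decode` are hypothetical and the bits are kept as a text string for readability.

```python
# Code table copied from the result slide.
CODE_TABLE = {"A": "0", "B": "100", "C": "101", "D": "110", "E": "111"}

def encode(text, table=CODE_TABLE):
    """Concatenate the codeword of every symbol."""
    return "".join(table[ch] for ch in text)

def decode(bits, table=CODE_TABLE):
    """Read bits one at a time and emit a symbol whenever a codeword is complete."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free: the first match is the only possible one
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(out)

bits = encode("ABACADAE")
print(bits)
print(decode(bits))
```

The eight symbols of `ABACADAE` take 16 bits here, compared with 24 bits for a fixed 3-bit code.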
10. Comparison of Huffman and Shannon-Fano
Coding Algorithms
Symbol  Count  Shannon-Fano  Huffman   Shannon-Fano  Huffman
               Bit Size      Bit Size  Total Bits    Total Bits
A       15     2             1         30            15
B       7      2             3         14            21
C       6      2             3         12            18
D       6      3             3         18            18
E       5      3             3         15            15
Total                                  89            87
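The totals in the table are each symbol's count multiplied by its code length, summed over all symbols. A short Python check of that arithmetic, using only the numbers from the table (the variable and function names are illustrative):

```python
counts = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
shannon_fano_bits = {"A": 2, "B": 2, "C": 2, "D": 3, "E": 3}
huffman_bits = {"A": 1, "B": 3, "C": 3, "D": 3, "E": 3}

def total_bits(counts, bits_per_symbol):
    """Sum of count * code length over all symbols."""
    return sum(counts[sym] * bits_per_symbol[sym] for sym in counts)

sf_total = total_bits(counts, shannon_fano_bits)
hf_total = total_bits(counts, huffman_bits)
n_symbols = sum(counts.values())

print(sf_total, hf_total)                          # 89 87
print(sf_total / n_symbols, hf_total / n_symbols)  # average bits per symbol
```

Over the 39 symbols this works out to roughly 2.28 bits per symbol for Shannon-Fano and 2.23 for Huffman.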
11. Comparison Conclusion
• Shannon-Fano and Huffman coding are close in
performance.
• However, Huffman coding is always at least as
efficient as Shannon-Fano coding, so it has become
the predominant coding method of its type.
• Both algorithms take a similar amount of processing
power.
• It therefore seems sensible to choose the one that
gives slightly better performance.
• Huffman proved that his coding method cannot be
improved on by any other code that assigns a whole
number of bits to each symbol.
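One way to put the optimality claim in context is to compare the Huffman average code length with the entropy of the source, which is the lower bound for any symbol-by-symbol code. The Python sketch below does this for the example counts used earlier in the slides; it is an illustration added here, not part of the original comparison.

```python
import math

counts = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
huffman_bits = {"A": 1, "B": 3, "C": 3, "D": 3, "E": 3}

total = sum(counts.values())
# Shannon entropy of the empirical distribution, in bits per symbol.
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
# Average Huffman code length, weighted by how often each symbol occurs.
average_length = sum(counts[sym] * huffman_bits[sym] for sym in counts) / total

print(f"entropy         = {entropy:.3f} bits/symbol")
print(f"Huffman average = {average_length:.3f} bits/symbol")
```

The Huffman average (about 2.23 bits) sits slightly above the entropy (about 2.19 bits), and the remaining gap comes from the restriction to whole-bit code lengths.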
12. Huffman Coding Types:
• The construction of a code tree for
Huffman coding is based on a certain
probability distribution.
• Depending on how that distribution is obtained,
there are three variants:
– static probability distribution
– dynamic probability distribution
– adaptive probability distribution
13. Static probability distribution
• Coding procedures with static Huffman codes
operate with a predefined code tree.
• Provided that the source data correspond to the
assumed frequency distribution, an acceptable
coding efficiency can be achieved.
• It is not necessary to store the Huffman tree or
the frequencies within the encoded data.
• It is sufficient to keep them available within the
encoder and decoder software.
• Additionally, the coding tables do not need to be
generated at run-time.
14. Dynamic probability distribution
• Instead of using one static tree for every
type of data, the probability distribution can be
analysed dynamically for the data at hand.
• Codes generated from these code trees match
the actual data considerably better than codes
based on standard distributions.
• The major disadvantage of this procedure is
that the information about the Huffman tree has
to be embedded in the compressed file or
data transmission.
• A code table or the symbols' frequencies must
be part of the header data.
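The header requirement can be illustrated with a small Python sketch in which the symbol frequencies travel alongside the bit stream and the decoder rebuilds the same code from them. Everything here is an assumption made for illustration: the names `build_codes`, `compress`, and `decompress` are hypothetical, the header is a plain dictionary rather than a serialized byte layout, and the bits are kept as a text string.

```python
import heapq
from collections import Counter

def build_codes(freqs):
    """Deterministically build a prefix code from {symbol: count}."""
    # Sorting the symbols first makes tie-breaking identical for encoder and decoder.
    heap = [(f, i, [sym]) for i, (sym, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    codes = {sym: "" for sym in freqs}
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        for sym in group1:
            codes[sym] = "0" + codes[sym]
        for sym in group2:
            codes[sym] = "1" + codes[sym]
        heapq.heappush(heap, (f1 + f2, next_id, group1 + group2))
        next_id += 1
    if len(codes) == 1:  # a lone symbol still needs one bit
        codes = {sym: "0" for sym in codes}
    return codes

def compress(text):
    header = dict(Counter(text))  # the frequencies are stored with the data
    codes = build_codes(header)
    return header, "".join(codes[ch] for ch in text)

def decompress(header, bits):
    inverse = {code: sym for sym, code in build_codes(header).items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

header, bits = compress("AADCCDD")
print(header, bits)
print(decompress(header, bits))
```

For short inputs such as this one the header can easily outweigh the compressed payload, which is the overhead the slide describes.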
15. Adaptive probability distribution
• The adaptive coding procedure uses a code
tree that is continuously adapted to the
previously encoded or decoded data, starting
with an empty tree or a standard
distribution.
• Each encoded symbol is used to refine
the code tree. This way a continuous
adaptation is achieved and local variations
are compensated for at run-time.
16. Adaptive probability distribution
• Adaptive Huffman codes that start with an empty
tree use a special control character to
identify new symbols that are not yet
part of the tree.
• This variant needs the least header data,
but the attainable compression rate is
unfavourable at the beginning of the coding
or for small files.
17. Extended Huffman Coding
• Extended alphabet: For alphabet
S = {s1, s2, ..., sn}, if k symbols are grouped
together, then the extended alphabet is the set
of all n^k sequences of k symbols:
S^(k) = {s1s1...s1, s1s1...s2, ..., snsn...sn}
• Problem: If k is relatively large (e.g., k ≥ 3), then
for most practical applications where n ≫ 1, n^k
implies a huge symbol table that is impractical.
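The growth of the symbol table is easy to see by enumerating the grouped symbols. The Python sketch below uses a hypothetical helper, `extended_alphabet`, and small alphabets chosen only for illustration.

```python
from itertools import product

def extended_alphabet(symbols, k):
    """All sequences of k symbols drawn from the base alphabet."""
    return ["".join(group) for group in product(symbols, repeat=k)]

pairs = extended_alphabet("AB", 2)
print(pairs)

# Table size is n**k: it grows quickly with the group length k.
for n, k in [(2, 2), (3, 3), (256, 2), (256, 3)]:
    print(f"n = {n:3d}, k = {k}: {n ** k} extended symbols")
```

With a byte-oriented alphabet (n = 256), grouping only three symbols already requires a table of more than 16 million entries.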
18. Adaptive (Dynamic) Huffman Coding
• In adaptive Huffman coding, statistics are gathered and
updated dynamically as the data stream arrives.
19. Adaptive (Dynamic) Huffman Coding
1. Initial code: assigns symbols some initially
agreed-upon codes, without any prior knowledge
of the frequency counts.
2. Update tree: constructs an adaptive Huffman tree.
It basically does two things:
   1. increments the frequency counts for the symbols
   (including any new ones).
   2. updates the configuration of the tree.
3. The encoder and decoder must use exactly the
same initial code and update tree routines.
20. Notes on Adaptive Huffman Tree Updating
• Nodes are numbered in order from left to
right, bottom to top. The numbers in
parentheses indicate the counts.
• The tree must always maintain its sibling
property, that is, node counts never decrease
as the node numbers increase.
• When a swap is necessary, the farthest node
with count N is swapped with the node
whose count has just been increased to N + 1.
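The ordering rule can be checked mechanically once the node counts are listed in node-number order. The Python sketch below takes the sibling property to mean that counts never decrease along that numbering, which is an assumption about the definition the slides intend. The three count lists are one reading of the AADCC step in the example slides (order: NEW, leaf, internal node, leaf, A, internal node, root) and are meant only as an illustration.

```python
def counts_in_order(counts):
    """True if the counts never decrease in node-number order."""
    return all(a <= b for a, b in zip(counts, counts[1:]))

# Tree after "AADC": NEW(0), C(1), (1), D(1), A(2), (2), root(4)
before = [0, 1, 1, 1, 2, 2, 4]
# C is incremented to 2 in place: it now outranks nodes numbered after it.
after_increment = [0, 2, 1, 1, 2, 2, 4]
# C is swapped with D, the farthest node with count 1, and the parent counts are updated:
# NEW(0), D(1), (1), C(2), A(2), (3), root(5)
after_swap = [0, 1, 1, 2, 2, 3, 5]

for name, counts in [("before", before),
                     ("after increment", after_increment),
                     ("after swap", after_swap)]:
    print(f"{name:16s} {counts} ordered: {counts_in_order(counts)}")
```

Only the middle list fails the check, which is the situation in which a swap is needed.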
26. Another Example: Adaptive Huffman
Coding
• This example illustrates more implementation
details. It shows exactly which bits are sent, as
opposed to simply stating how the tree is
updated.
• An additional rule: if any character/symbol is to
be sent for the first time, it must be preceded by a
special symbol, NEW.
• The initial code for NEW is 0. The count for NEW
is always kept at 0 (the count is never
increased), hence it is always denoted as NEW:(0).
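The role of NEW can be sketched in Python with a deliberately simplified scheme. Instead of the incremental tree update with node swaps, this sketch rebuilds a Huffman code from the running counts before every symbol, which is a swapped-in simplification and not the update procedure described on the slides. The names `build_codes`, `encode`, and `decode` are hypothetical, and a first-time symbol is assumed to be sent as the current NEW code followed by a fixed 8-bit value.

```python
import heapq

NEW = "<NEW>"  # escape marker; its count is kept at 0 and never increased

def build_codes(counts):
    """Deterministic Huffman code for the current counts (same on both sides)."""
    heap = [(f, i, [sym]) for i, (sym, f) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    codes = {sym: "" for sym in counts}
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        for sym in group1:
            codes[sym] = "0" + codes[sym]
        for sym in group2:
            codes[sym] = "1" + codes[sym]
        heapq.heappush(heap, (f1 + f2, next_id, group1 + group2))
        next_id += 1
    if len(codes) == 1:  # only NEW is known: its initial code is 0
        codes = {sym: "0" for sym in codes}
    return codes

def encode(text):
    """Return one bit string per input symbol (characters below code point 256)."""
    counts, pieces = {NEW: 0}, []
    for ch in text:
        codes = build_codes(counts)
        if ch in counts:
            pieces.append(codes[ch])
        else:  # first occurrence: NEW code, then the raw 8-bit symbol
            pieces.append(codes[NEW] + format(ord(ch), "08b"))
        counts[ch] = counts.get(ch, 0) + 1
    return pieces

def decode(bits):
    counts, out, pos = {NEW: 0}, [], 0
    while pos < len(bits):
        inverse = {code: sym for sym, code in build_codes(counts).items()}
        current = ""
        while current not in inverse:
            current += bits[pos]
            pos += 1
        sym = inverse[current]
        if sym == NEW:  # the next 8 bits carry the new symbol itself
            sym = chr(int(bits[pos:pos + 8], 2))
            pos += 8
        out.append(sym)
        counts[sym] = counts.get(sym, 0) + 1  # same update as the encoder
    return "".join(out)

pieces = encode("AADCCDD")
print(pieces)
print(decode("".join(pieces)))
```

Because the decoder applies exactly the same count update after every symbol, both sides always hold the same code table, and the bits emitted for a repeated symbol such as D can differ from one occurrence to the next.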
27. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(1)
NEW:0 A: (1)
(2)
NEW:0 A: (2)
28. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(3)
A : (2)(1)
NEW:0 D: (1)
29. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(4)
A: (2)(2)
(1)
NEW:0 C: (1)
D: (1)
30. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(4)
A: (2)(2)
(1)
NEW:0 C: (1+1)
D: (1)
31. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(4)
A: (2)(2+1)
(1)
NEW:0 D: (1)
C: (2)
32. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(5)
A: (2) (3)
C : (2)(1)
NEW:0 D: (1)
33. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(6)
A: (2) (4)
C : (2)(2)
NEW:0 D: (2)
34. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(6)
A: (2) (4)
C : (2)(2)
NEW:0 D: (2+1)
35. Initial code assignment for AADCCDD
using adaptive Huffman coding.
(7)
D: (3) (4)
C : (2)(2)
NEW:0 A: (2)
36. Sequence of symbols and codes sent
to the decoder
Symbol  Code
NEW     0000 0000
A       0000 0001
A       0000 0001
NEW     0000 0000
D       0000 0100
NEW     0000 0000
C       0000 0011
C       0000 0011
D       0000 0100
D       0000 0100
It is important to emphasize that the code for a
particular symbol changes during the adaptive
Huffman coding process.