0% found this document useful (0 votes)
28 views

Huffman Tree

Huffman coding is a lossless data compression technique that uses variable-length codewords. It assigns shorter codewords to more frequent characters and longer codewords to less frequent characters. The document provides an example of Huffman coding applied to a text with various characters and their frequencies. It constructs the Huffman tree, assigns codewords to each character based on the tree, calculates the average code length, and determines the length of the Huffman encoded message in bits.

Uploaded by

Sakshi More
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
28 views

Huffman Tree

Huffman coding is a lossless data compression technique that uses variable-length codewords. It assigns shorter codewords to more frequent characters and longer codewords to less frequent characters. The document provides an example of Huffman coding applied to a text with various characters and their frequencies. It constructs the Huffman tree, assigns codewords to each character based on the tree, calculates the average code length, and determines the length of the Huffman encoded message in bits.

Uploaded by

Sakshi More
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 9

Huffman Coding-

 Huffman Coding is a famous Greedy Algorithm.


 It is used for the lossless compression of data.
 It uses variable length encoding.
 It assigns variable length code to all the characters.
 The code length of a character depends on how frequently it occurs in the given text.
 The character which occurs most frequently gets the smallest code.
 The character which occurs least frequently gets the largest code.
 It is also known as Huffman Encoding.

Prefix Rule-

 Huffman Coding implements a rule known as a prefix rule.


 This is to prevent the ambiguities while decoding.
 It ensures that the code assigned to any character is not a prefix of the code assigned to
any other character.
PRACTICE PROBLEM BASED ON HUFFMAN CODING-

Problem-

A file contains the following characters with the frequencies as shown. If Huffman Coding is
used for data compression, determine-
1. Huffman Code for each character
2. Average code length
3. Length of Huffman encoded message (in bits)

Characters Frequencies

a 10

e 15
i 12

o 3

u 4

s 13

t 1

Input : Structure/ Information

Char S T U C R E
Freq. 1 2 2 1 2 1
Information
Char I N F O R M A T
Freq. 2 2 1 2 1 1 1 1

Solution-

First let us construct the Huffman Tree.


Huffman Tree is constructed in the following steps-

Step-01:
Step-02:

Step-03:

Step-04:
Step-05:

Step-06:
Step-07:
Now,
 We assign weight to all the edges of the constructed Huffman Tree.
 Let us assign weight ‘0’ to the left edges and weight ‘1’ to the right edges.

Rule
 If you assign weight ‘0’ to the left edges, then assign weight ‘1’ to the right edges.
 If you assign weight ‘1’ to the left edges, then assign weight ‘0’ to the right edges.
 Any of the above two conventions may be followed.
 But follow the same convention at the time of decoding that is adopted at the time of
encoding.

After assigning weight to all the edges, the modified Huffman Tree is-

Now, let us answer each part of the given problem one by one-
1. Huffman Code For Characters-

To write Huffman Code for any character, traverse the Huffman Tree from root node to the
leaf node of that character.
Following this rule, the Huffman Code for each character is-
 a = 111
 e = 10
 i = 00
 o = 11001
 u = 1101
 s = 01
 t = 11000

From here, we can observe-


 Characters occurring less frequently in the text are assigned the larger code.
 Characters occurring more frequently in the text are assigned the smaller code.

2. Average Code Length-

Using formula-01, we have-


Average code length
= ∑ ( frequencyi x code lengthi ) / ∑ ( frequencyi )
= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 +
4 + 13 + 1)
= 2.52

3. Length of Huffman Encoded Message-

Using formula-02, we have-


Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= 58 x 2.52
= 146.16
≅ 147 bits

You might also like