
Algorithmics

CT065-3.5-3

Information Coding Techniques


Level 3 Computing (Software Engineering)
Learning Outcomes
By the end of this lesson you should be able to:
Briefly explain the Information Coding Techniques algorithms in the following area:
Huffman Coding

Module Code and Module Title Title of Slides Slide 3 (of 50)
Keywords
Huffman's algorithm
Prefix code
Optimal prefix code

Data Compression
Reducing the number of bits required to represent data is known as compression

In text compression, each character within a file is encoded using a certain number of bits (usually fewer than 8 bits per character)

Data Compression
Compression consists of 2 phases:
Encoding phase (compressing)
Data is converted using an encoding scheme
Decoding phase (decompressing)
Data is decoded to its original form using the same scheme that was used to encode it

Encoding/Decoding
We will use "message" in a generic sense to mean the data to be compressed

Input Message -> Encoder -> Compressed Message -> Decoder -> Output Message

The encoder and decoder need to understand a common compressed format.

Data Compression
Purpose:
To reduce the size of files stored on disk
(i.e. in effect increasing the capacity of
the disk)
To increase the effective rate of data
transmission (by transmitting less data)

Fixed Length Encoding
Concepts:
A fixed number of bits is used to represent each character in the encoding scheme
For example, a 3-bit code length would be required to uniquely represent 8 different characters
Every character gets the same code length, regardless of how frequently it occurs

Fixed Length Encoding - Example
Table 1 - A fixed length encoding scheme

Character Code Frequency Total Bits


a 000 10 3x10 = 30
e 001 15 3x15 = 45
i 010 12 3x12 = 36
s 011 3 3x3 = 9
t 100 4 3x4 = 12
sp (blank space) 101 13 3x13 = 39
nl (new line) 110 1 3x1 = 3

Total bits required for encoding 174
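
As a sanity check, the totals in Table 1 can be recomputed with a few lines of Python (frequencies copied from the table):

```python
# Frequencies from Table 1; every character takes 3 bits.
freqs = {"a": 10, "e": 15, "i": 12, "s": 3, "t": 4, "sp": 13, "nl": 1}
total_bits = sum(3 * f for f in freqs.values())
print(total_bits)  # 174
```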

Fixed Length Encoding
Prefix Code Tree
The codes in Table 1 can be represented by the binary tree
below:

[Figure: a binary code tree; the leaves at depth 3 are a, e, i, s, t, sp, nl. Total: 174 bits]

All characters are stored only on leaf nodes.

In the tree, a left branch represents 0 and a right branch represents 1. The path from the root to a leaf gives that character's code.

Fixed Length Encoding
Prefix Code Tree
If we further impose the condition that the tree is to be a strictly binary tree (i.e. all nodes are either leaves or have 2 children), then the codes can be represented with somewhat fewer bits, as shown below:

[Figure: a strictly binary tree in which nl sits at depth 2 (code length 2) while a, e, i, s, t, sp remain at depth 3. Moving nl up one level saves 1 bit, for a total of 173 bits]

Variable Length Encoding
A more efficient encoding scheme compared to fixed length encoding would be one that:
Allows the code lengths (fixed at 3 in the previous example) to vary from character to character, with the most frequently occurring characters having short codes

Huffman Encoding
The Huffman encoding algorithm was
created in 1952 and is named after its
inventor, David Huffman
It is a lossless encoding algorithm that is
ideal for compressing text or program files
The algorithm uses variable length codes

Huffman Coding

Huffman codes can be used to compress information
Like WinZip, although WinZip doesn't use the Huffman algorithm
JPEGs do use Huffman as part of their compression process

The basic idea is that instead of storing each character in a file as an 8-bit ASCII value, we will instead store the more frequently occurring characters using fewer bits and less frequently occurring characters using more bits
On average this should decrease the file size (and usually does)
Huffman Encoding - Example
Assume that we want to compress the
following piece of data using Huffman
encoding:
ACDABA
Since there are 6 characters, this
uncompressed text is 6 bytes or 48
bits long

Huffman Encoding - Example
Data:
ACDABA

With Huffman encoding, the file is searched for the most frequently appearing symbols (in this case the character A, which occurs 3 times) and then a prefix code tree is built that replaces the symbols with shorter bit sequences.

Huffman Encoding - Example
Data:
ACDABA

In this particular case, the algorithm would use the following encoding table: A=0, B=10, C=110, D=111.

Huffman Algorithm
How was the prefix code tree
constructed in the previous example?
Huffman's encoding algorithm constructs an optimal prefix code tree by repeatedly merging the two trees with the least weight

Huffman Algorithm
Given a character A with frequency f, the Huffman tree is constructed using a priority queue, Q, of nodes, with frequencies as keys.

Huffman Algorithm
Assume the number of characters to be
encoded is C
The Huffman algorithm can be described as follows:
Maintain a forest of trees (each tree
represents one character)
The weight of a tree = sum of frequencies
of its leaves

Huffman Algorithm

At the beginning of the algorithm, there are C single-node trees

At the end of the algorithm there is one tree, and this is an optimal Huffman tree
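
The merging procedure above can be sketched in Python using a min-heap as the priority queue. This is a minimal sketch, not the authoritative implementation; tie-breaking between equal weights is arbitrary, so other equally optimal trees are possible:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table by repeatedly merging the two
    lowest-weight trees held in a priority queue (min-heap)."""
    freq = Counter(text)
    # Heap entries are (weight, tie_breaker, tree); a tree is either a
    # single character or a (left, right) pair of subtrees.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # smallest weight
        f2, _, right = heapq.heappop(heap)   # next smallest
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, str):            # leaf: record its code
            codes[tree] = prefix or "0"      # lone-character edge case
        else:                                # internal: 0 left, 1 right
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
    walk(heap[0][2], "")
    return codes

print(huffman_codes("ACDABA"))  # {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
```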

Huffman Algorithm - Example

Let us construct an optimal prefix code tree for the following text:
ACDABA

Huffman Algorithm - Example
Maintain a forest with each character representing a tree within the forest and its frequency within the text representing the weight of the tree:

A B C D
3 1 1 1

Huffman Algorithm - Example
Merge the two trees with the smallest weights:

A,3   B,1   [C D],2  (a tree of weight 2 whose children are C,1 and D,1)

Note: We chose to merge trees C and D; in fact we could have chosen any two out of B, C and D. The trees are arbitrarily merged as either the right or left subtree.

Huffman Algorithm - Example

Merge the next two trees with the smallest weights, B,1 and [C D],2:

A,3   [B [C D]],3
Huffman Algorithm - Example
Merge the last two trees to produce the optimal prefix code tree of weight 6, with A,3 as the left subtree and [B [C D]],3 as the right subtree. Reading 0 for each left branch and 1 for each right branch gives:

A = 0, B = 10, C = 110, D = 111
Huffman Encoding - Example
Data: ACDABA
Codes: A=0, B=10, C=110, D=111
If these codes are used to compress the file,
the compressed data would look like this:
01101110100
This means that 11 bits are used instead of 48, a compression ratio of roughly 4 to 1 for this particular file
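
Applying the code table above to ACDABA is a simple lookup-and-concatenate; a minimal sketch:

```python
# Code table from the example: A=0, B=10, C=110, D=111.
codes = {"A": "0", "B": "10", "C": "110", "D": "111"}
encoded = "".join(codes[ch] for ch in "ACDABA")
print(encoded, len(encoded))  # 01101110100 11
```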

Huffman Coding Example 2
As an example, let's take the string:
duke blue devils
We first do a frequency count of the characters:
e:3, d:2, u:2, l:2, space:2, k:1, b:1, v:1, i:1, s:1
Next we use a greedy algorithm to build up a Huffman tree
We start with nodes for each character

e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 i,1 s,1



Huffman Coding

We then pick the nodes with the smallest frequency and combine them together to form a new node
The selection of these nodes is the greedy part
The two selected nodes are removed from the set, but replaced by the combined node
This continues until we have only 1 node left in the set



Huffman Coding

e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 i,1 s,1



Huffman Coding

e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 [i s],2

(here [i s],2 denotes the merged tree of weight 2 whose children are i,1 and s,1)



Huffman Coding

e,3 d,2 u,2 l,2 sp,2 k,1 [b v],2 [i s],2



Huffman Coding

e,3 d,2 u,2 l,2 sp,2 [k [b v]],3 [i s],2



Huffman Coding

e,3 d,2 u,2 [l sp],4 [k [b v]],3 [i s],2



Huffman Coding

e,3 [d u],4 [l sp],4 [k [b v]],3 [i s],2



Huffman Coding

e,3 [d u],4 [l sp],4 [[i s] [k [b v]]],5



Huffman Coding

[e [d u]],7 [l sp],4 [[i s] [k [b v]]],5



Huffman Coding

[e [d u]],7 [[l sp] [[i s] [k [b v]]]],9



Huffman Coding

[[e [d u]] [[l sp] [[i s] [k [b v]]]]],16



Huffman Coding

Now we assign codes to the tree by placing a 0 on every left branch and a 1 on every right branch
A traversal of the tree from root to leaf gives the Huffman code for that particular leaf character
Note that no code is the prefix of another code
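
The prefix property can be checked mechanically. A small sketch, using the ten codes this tree produces (listed in the table that follows):

```python
# The codes for e, d, u, l, sp, i, s, k, b, v in this example.
codes = ["00", "010", "011", "100", "101",
         "1100", "1101", "1110", "11110", "11111"]
# A code set is prefix-free if no code starts with another, different code.
prefix_free = not any(a != b and b.startswith(a) for a in codes for b in codes)
print(prefix_free)  # True
```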



Huffman Coding
The final tree of weight 16, as built above, yields the code table:

Character  Code
e          00
d          010
u          011
l          100
sp         101
i          1100
s          1101
k          1110
b          11110
v          11111



Huffman Coding

These codes are then used to encode the string
Thus, duke blue devils turns into:
010 011 1110 00 101 11110 100 011 00 101 010 00 11111 1100 100 1101

When grouped into 8-bit bytes:
01001111 10001011 11101000 11001010 10001111 11100100 1101xxxx

Thus it takes only 7 bytes of space compared to 16 characters * 1 byte/char = 16 bytes uncompressed
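
The byte grouping above can be reproduced by padding the bit string to a multiple of 8 and packing; a sketch (a real file format would also have to record how many pad bits were added):

```python
# The encoded pieces for "duke blue devils", as listed above.
pieces = ["010", "011", "1110", "00", "101", "11110", "100", "011",
          "00", "101", "010", "00", "11111", "1100", "100", "1101"]
bits = "".join(pieces)                  # 52 bits
padded = bits + "0" * (-len(bits) % 8)  # pad the final byte with zeros
packed = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
print(len(packed))  # 7 (bytes, versus 16 uncompressed)
```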



Huffman Coding
Uncompressing works by reading in the file bit by bit
Start at the root of the tree
If a 0 is read, head left
If a 1 is read, head right
When a leaf is reached, decode that character and start over again at the root of the tree
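
The loop above can be sketched directly, using the ACDABA tree from the earlier example (A=0, B=10, C=110, D=111):

```python
# The ACDABA code tree: internal nodes are (left, right) pairs,
# leaves are characters. Left branch = 0, right branch = 1.
tree = ("A", ("B", ("C", "D")))

def decode(bits, root):
    out, node = [], root
    for bit in bits:
        node = node[0] if bit == "0" else node[1]  # follow the branch
        if isinstance(node, str):                  # reached a leaf
            out.append(node)
            node = root                            # restart at the root
    return "".join(out)

print(decode("01101110100", tree))  # ACDABA
```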



Huffman Encoding - Applications
Huffman encoding is mainly used in compression programs like pkZIP, lha, gz, zoo, and arj

It is also used within JPEG and MPEG compression

