Chapter 5 Data Compression
Basic Compression
Data Compression
Redundancy
Variable Length Coding
Huffman encoding
Run Length Encoding (RLE)
Quantization (Lossy)
Data Compression
Two categories
• Information Preserving
– Error free compression
– Original data can be recovered completely
• Lossy
– Original data is approximated
– Less than perfect
– Generally allows much higher compression
Basics
Data Compression
– Process of reducing the amount of data required to
represent a given quantity of information
Data vs. Information
– Data and Information are not the same thing
– Data
• the means by which information is conveyed
• various amounts of data can convey the same
information
– Information
• “A signal that contains no uncertainty”
Redundancy
Redundancy
– “data” that provides no relevant information
– “data” that restates what is already known
• For example
– Consider that N1 and N2 denote the number of “data
units” in two sets that represent the same information
– where Cr is the “Compression Ratio”
• Cr = N1 / N2
• EXAMPLE
N1 = 10 and N2 = 1 data units can encode the same information. The Compression Ratio is
Cr = N1/N2 = 10 (or 10:1)
Implying 90% of the data in N1 is redundant
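A minimal sketch (Python; not part of the original slides) of this arithmetic, using the standard relative redundancy formula Rd = 1 − 1/Cr:

```python
def compression_ratio(n1: int, n2: int) -> float:
    """Compression ratio Cr = N1 / N2."""
    return n1 / n2

def relative_redundancy(n1: int, n2: int) -> float:
    """Fraction of the data in N1 that is redundant: Rd = 1 - 1/Cr."""
    return 1.0 - 1.0 / compression_ratio(n1, n2)

# Example from the slide: N1 = 10, N2 = 1
print(compression_ratio(10, 1))    # 10.0 -> 10:1
print(relative_redundancy(10, 1))  # 0.9  -> 90% of the data in N1 is redundant
```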
Variable Length Coding (1)
Our binary system (called natural binary) is not always that good at
representing data from a compression point of view
In natural binary we would need at least 2 bits per symbol to represent the four symbols a, b, c and d, assigning bits as
follows:
a=00, b=01, c=10, d=11
There are 35 pieces of data, that is 35 × 2 bits = 70 bits
Variable Length Coding (2)
Now, consider the occurrence of each symbol:
a, b, c, d
abaaaaabbbcccccaaaaaaaabbbbdaaaaaaa
a = 21/35 (60%)
b = 8/35 (23%)
c = 5/35 (14%)
d = 1/35 (3%)
Variable Length Coding (3)
Idea of variable length coding: assign fewer bits to the more frequent symbols and more bits to the less frequent symbols
a = 21/35 (60%)
b = 8/35 (23%)
c = 5/35 (14%)
d = 1/35 (3%)
For example, with codeword lengths of 1, 2, 3 and 4 bits for a, b, c and d, the total is 21×1 + 8×2 + 5×3 + 1×4 = 56 bits
So we have a compression ratio of 70:56, or 1.25, meaning 20% of the data using natural binary encoding is redundant
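The following sketch (Python; not part of the original slides) reproduces these totals, assuming a prefix-free variable length code with codeword lengths 1, 2, 3 and 4 bits:

```python
from collections import Counter

data = "abaaaaabbbcccccaaaaaaaabbbbdaaaaaaa"   # the 35-symbol sequence above

# Assumed variable-length code (lengths 1, 2, 3, 4); any prefix-free
# assignment with these lengths gives the same totals.
vlc = {"a": "0", "b": "10", "c": "110", "d": "1110"}

counts = Counter(data)
natural_bits = len(data) * 2                               # 2 bits/symbol: a=00, b=01, c=10, d=11
vlc_bits = sum(counts[s] * len(code) for s, code in vlc.items())

print(natural_bits, vlc_bits)                  # 70 56
print(f"Cr = {natural_bits / vlc_bits:.2f}")   # Cr = 1.25
```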
Huffman encoding
This is an example of error free coding: the information is completely the same, the data is different
• Prof. David A. Huffman developed an algorithm to take a data set and compute its “optimal” encoding (bit assignment)
– He developed this as a student at MIT
• There are other VLC techniques, such as Arithmetic coding and LZW coding (ZIP)
Huffman encoding
Huffman encoding of an image is the same idea as VLC, but uses the histogram to measure the frequency of occurrence of each gray level in the grayscale image.
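As a sketch (Python/NumPy; not part of the original slides), the gray-level probabilities p(i) = h(i)/n can be computed from the image histogram; the 8-level image here is hypothetical:

```python
import numpy as np

def gray_level_probabilities(image: np.ndarray, levels: int = 8) -> np.ndarray:
    """p(i) = h(i)/n: h(i) counts pixels with gray level i, n is the total pixel count."""
    h = np.bincount(image.ravel(), minlength=levels)   # histogram h(i)
    return h / image.size                              # n = total number of pixels

# Hypothetical 3-bit (8 gray level) image
img = np.random.randint(0, 8, size=(64, 64))
print(gray_level_probabilities(img))   # probabilities sum to 1.0
```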
Huffman encoding
• p(i) = h(i)/n is the probability of occurrence of gray level i
• h(i) is the frequency of occurrence of gray level i
• n is the total number of pixels in the image

Gray level    p(i) = h(i)/n
0             0.000
1             0.012
2             0.071
3             0.019
4             0.853
5             0.023
6             0.019
7             0.003

The table on the slide also lists, for each gray level, its Huffman codeword, the codeword length l, and l·p.
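A sketch of Huffman's algorithm (Python; not part of the original slides) applied to the probabilities in the table; the resulting codewords are one possible optimal assignment, not necessarily the ones on the original slide:

```python
import heapq
from itertools import count

def huffman_codes(probabilities: dict) -> dict:
    """Build a Huffman code (symbol -> bit string); symbols with probability 0 are skipped."""
    tiebreak = count()  # keeps heap comparisons on numbers only
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items() if p > 0]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)   # two least probable subtrees
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Gray-level probabilities from the table above
p = {0: 0.000, 1: 0.012, 2: 0.071, 3: 0.019, 4: 0.853, 5: 0.023, 6: 0.019, 7: 0.003}
for level, code in sorted(huffman_codes(p).items()):
    print(level, code)   # the most frequent level (4) gets the shortest codeword
```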
Run Length Encoding (RLE)
RLE is often more compact, especially when the data contains lots of runs of the same number(s)
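A minimal RLE sketch (Python; not part of the original slides), with hypothetical data containing long runs:

```python
def rle_encode(data):
    """Encode a sequence as (value, run_length) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]

def rle_decode(runs):
    """Invert rle_encode: expand each (value, run_length) pair."""
    return [value for value, length in runs for _ in range(length)]

data = [5, 5, 5, 5, 0, 0, 7, 7, 7]    # hypothetical data with runs
runs = rle_encode(data)
print(runs)                            # [(5, 4), (0, 2), (7, 3)]
print(rle_decode(runs) == data)        # True -> RLE is lossless
```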
Quantization (Lossy)
The natural binary range of the quantized data is smaller, so we could also use Huffman encoding to get a bit more compression.
Of course, the values are not the same; on “reconstruction” (multiplying by 5) we get only an approximation of the original:
RECOVERED = {10, 10, 15, 15, 20, 0, 5, 0, 10}
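A minimal quantization sketch (Python; not part of the original slides), assuming the step of 5 implied by the “multiply by 5” reconstruction; the original data here is hypothetical:

```python
STEP = 5  # assumed quantization step

def quantize(values, step=STEP):
    """Integer-divide by the step; this is the lossy part."""
    return [v // step for v in values]

def reconstruct(q_values, step=STEP):
    """Multiply back by the step; only approximates the original."""
    return [q * step for q in q_values]

original  = [12, 10, 17, 15, 21, 2, 5, 3, 11]   # hypothetical original data
recovered = reconstruct(quantize(original))
print(recovered)   # [10, 10, 15, 15, 20, 0, 5, 0, 10] -- close to, but not equal to, the original
```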
Lossy vs. Lossless
• For things like text documents and computer data files, lossy compression
doesn’t make sense
– An approximation of the original is no good!
• But for data like audio or images, small errors are not easily detectable by our senses
– An approximation is acceptable
• This is one reason we can get significant compression of images and audio,
vs. other types of data
– Lossless 10:1 is typically possible
– Lossy 300:1 is possible with no significant perceptual loss