Unit - 2 - Mathematical Preliminaries For Lossless Compression Models
Prepared and Edited by:- Divya Kaurani Designed by:- Kussh Prajapati
www.collegpt.com [email protected]
Modeling + Coding :
Lossless compression is usually built in two phases: a modeling phase, which builds a description of the structure or redundancy in the data, and a coding phase, which uses that model to encode the data (often the difference between the data and the model's predictions) in as few bits as possible.
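The modeling/coding split can be sketched with a tiny example (the sample values here are hypothetical): an order-1 model predicts each value as the previous one, and only the small residuals would be handed to the coder.

```python
# Sketch of the modeling + coding split (illustrative values, not from the notes):
# the model predicts each sample from the previous one, and only the residuals
# (prediction errors) go to the coder. Residuals cluster near zero, so they
# are cheaper to encode than the raw values.

data = [21, 22, 24, 24, 23, 22, 21, 20]  # hypothetical sample sequence

# Modeling step: predict each value as the previous value (order-1 model).
residuals = [data[0]] + [data[i] - data[i - 1] for i in range(1, len(data))]
print(residuals)  # [21, 1, 2, 0, -1, -1, -1, -1]

# Decoding reverses the model exactly, so no information is lost.
decoded = [residuals[0]]
for r in residuals[1:]:
    decoded.append(decoded[-1] + r)
assert decoded == data
```

Because the decoder applies the same model in reverse, the scheme is lossless; the coder only has to represent the residual stream compactly.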
Types of Models :
1. Physical Model :
These models exploit knowledge about the physical process that generated
the data. For example, in speech compression the physics of speech
production can inform the model; readings from electricity meters in an
area can be modeled from known usage patterns; and in image compression, a
physical model might account for the spatial correlations between
neighboring pixels. By understanding how the data is created, the model can
predict and potentially remove redundant information.
2. Probability Models :
These models analyze the statistical properties of the data, like the frequency of
occurrence of symbols or patterns. This analysis helps identify symbols or
sequences that appear more frequently and allows for efficient representation
using shorter codes.
Basic Probability Model: This assumes each symbol in the data stream is
independent and has an equal probability of occurring. It's a simple approach,
but may not be very effective for real-world data with inherent biases.
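A probability model in this sense can be sketched by estimating symbol probabilities from observed frequencies and computing the entropy, which bounds from below the average bits per symbol any lossless code can achieve under that model (the sample string here is made up):

```python
import math
from collections import Counter

# Sketch of a simple probability model: estimate each symbol's probability
# from its frequency, then compute the entropy -- the theoretical lower
# bound (bits/symbol) for a lossless code under this model.
text = "aaaabbc"  # hypothetical data
counts = Counter(text)
total = len(text)
probs = {s: c / total for s, c in counts.items()}
entropy = -sum(p * math.log2(p) for p in probs.values())
print(round(entropy, 3))  # 1.379
```

Frequent symbols like `a` get high probability and would receive short codewords; rare symbols like `c` get longer ones.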
3. Markov Model :
A Markov model, named after the mathematician Andrey Markov, is a stochastic
model used to model sequences of random variables where the probability of
each variable's value depends only on the state of the preceding variable. In
other words, it's a memoryless process where the future state depends only
on the current state and not on the sequence of events that preceded it.
Markov models are widely used in various fields such as natural language
processing, speech recognition, bioinformatics, finance, and more.
Example:
Imagine a simple weather model with states Sunny (S), Rainy (R), and Cloudy
(C). Each entry of the transition matrix gives the probability of moving
from one state to another. For example, if today is sunny, there is a 70%
chance of it being sunny tomorrow, a 20% chance of it being cloudy, and a
10% chance of it raining.
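This chain can be simulated in a few lines. Only the Sunny row below comes from the example above; the Rainy and Cloudy rows are made-up values chosen solely so that each row sums to 1:

```python
import random

# Sketch of the weather Markov chain. The "S" row matches the example
# (0.7 sunny, 0.2 cloudy, 0.1 rainy); the "C" and "R" rows are ASSUMED
# values, included only to make the chain runnable.
transition = {
    "S": {"S": 0.7, "C": 0.2, "R": 0.1},
    "C": {"S": 0.3, "C": 0.4, "R": 0.3},  # assumed
    "R": {"S": 0.2, "C": 0.4, "R": 0.4},  # assumed
}

def next_state(state, rng=random):
    # Sample tomorrow's weather given only today's (the Markov property:
    # the future depends only on the current state).
    states = list(transition[state])
    weights = [transition[state][s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]

random.seed(0)
forecast = ["S"]
for _ in range(6):
    forecast.append(next_state(forecast[-1]))
print(forecast)  # a 7-day sequence such as ['S', 'S', 'C', ...]
```

Note that `next_state` looks only at the current state, never at the earlier history, which is exactly the memoryless property described above.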
Prefix Codes :
Imagine a dictionary where each word (symbol) has a unique abbreviation
(codeword). If any codeword is the prefix of any other codeword, the code is
NOT a prefix code. In a prefix code, no codeword matches the beginning of
another, so the decoder can recognize each codeword the moment it ends.
Every prefix code is uniquely decodable: the encoded stream can be split
back into codewords in only one way.
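The prefix condition is easy to check mechanically. A minimal sketch (the codeword tables are hypothetical):

```python
# Sketch: a code is a prefix code iff no codeword is a prefix of another.
# Prefix codes can be decoded symbol by symbol with no look-ahead.

def is_prefix_code(codewords):
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False  # 'a' begins 'b', so decoding is ambiguous
    return True

good = ["0", "10", "110", "111"]  # no codeword begins another
bad = ["0", "01", "11"]           # "0" is a prefix of "01"
print(is_prefix_code(good))  # True
print(is_prefix_code(bad))   # False
```

With the `bad` table, a received "01" could be the codeword "01" or the codeword "0" followed by the start of something else, which is exactly the ambiguity prefix codes rule out.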
Algorithmic Information Theory (AIT): It's about finding the shortest way to describe
something. In data compression, it means trying to represent data using as little space
as possible while still keeping all the important information.
Minimum Description Length (MDL) Principle: This principle says the best
description of the data is the one that minimizes the combined length of the
model (or decompression method) and the data encoded using that model. In
data compression, it means choosing the method for which the compressed data
plus the description of the method itself is shortest.
AIT helps us understand how complex the data is and how much we can compress it
theoretically.
The MDL principle guides us in finding the most efficient compression method by
considering both the data and the description of how it's compressed.
So, these concepts help in creating compression methods that make data smaller
while still being able to recreate the original data accurately.
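The MDL trade-off can be sketched numerically. Here a binary string is described two ways: raw (1 bit per symbol, no model) versus a simple Bernoulli model whose single parameter is assumed to cost a fixed 8 bits to state; the data and the 8-bit model cost are both illustrative assumptions:

```python
import math

# Sketch of the MDL idea: total description length =
#   bits to describe the model + bits to describe the data GIVEN the model.
bits = ("0" * 15 + "1") * 4   # hypothetical data: 64 bits, mostly zeros
n = len(bits)
p = bits.count("1") / n       # estimated probability of a 1

def entropy(p):
    # Binary entropy: ideal bits per symbol under a Bernoulli(p) model.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

raw_length = n                    # no model: 1 bit per symbol
model_cost = 8                    # ASSUMED cost to encode the parameter p
data_cost = n * entropy(p)        # ideal code length under the model
mdl_length = model_cost + data_cost
print(raw_length, round(mdl_length, 1))  # 64 29.6
```

Because the modeled description (about 29.6 bits) is shorter than the raw 64 bits, MDL prefers the model here; for near-random data the extra model cost would not pay off and the raw description would win.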
All the Best
Visit: www.collegpt.com