Information Theory & Coding Techniques
© Ajinkya C. Kulkarni
Subject : DCOM
Unit 5
By
Ajinkya C. Kulkarni
[email protected]
Contents at a Glance
Introduction
Basics of Information theory
Entropy concept
Data compression
Variable length coding
Shannon-Fano coding
Huffman coding
Information theory
Information is conveyed by a set of symbols, where each symbol has its own probability of occurrence.
Information theory
I(p) = \log\left(\frac{1}{p}\right) = -\log p

measures the amount of information in the occurrence of an event of probability p.

Properties of I(p)
• I(p) ≥ 0 (a real nonnegative measure)
• I(p1 p2) = I(p1) + I(p2) for independent events
• I(p) is a continuous function of p
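As a quick illustration of this definition (a minimal sketch in Python, not from the slides), self-information in bits is just a base-2 logarithm:

```python
import math

def self_information(p: float) -> float:
    """Self-information I(p) = -log2(p) in bits for an event of probability p."""
    if not 0 < p <= 1:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(p)

print(self_information(1.0))    # 0.0 -> a certain event carries no information
print(self_information(0.125))  # 3.0 -> rarer events carry more information
# Additivity for independent events: I(p1*p2) == I(p1) + I(p2)
print(self_information(0.5 * 0.25), self_information(0.5) + self_information(0.25))  # 3.0 3.0
```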
Entropy
Entropy is the minimum average number of bits per symbol needed to code the source without distortion (the lower bound on the size of compressed data that can still be recovered exactly).
Entropy
Set of symbols (alphabet): S = {s1, s2, …, sN}, where N is the number of symbols in the alphabet.
Probability distribution of the symbols: P = {p1, p2, …, pN}
Terms in Entropy
H = -\sum_{i=1}^{N} p_i \log_2(p_i)
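A minimal sketch of this formula (assuming Python), summing -p·log2(p) over the alphabet:

```python
import math

def entropy(probs: list[float]) -> float:
    """Entropy H = -sum(p_i * log2(p_i)) in bits per symbol.
    Terms with p_i = 0 contribute nothing (convention 0*log 0 = 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))       # 1.0 bit  (fair binary source)
print(entropy([1 / 256] * 256))  # 8.0 bits (uniform over 256 symbols)
```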
Entropy for binary source: N=2
S = {0, 1}, with p0 = p and p1 = 1 - p

H = -[p \log_2(p) + (1 - p) \log_2(1 - p)]
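For example (a worked check, not on the original slide), H is largest for a fair source and shrinks as the source becomes biased:

H(0.5) = -[0.5 \log_2 0.5 + 0.5 \log_2 0.5] = 1 \text{ bit}, \quad H(0.1) = -[0.1 \log_2 0.1 + 0.9 \log_2 0.9] \approx 0.47 \text{ bits}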
Entropy for uniform distribution: pi=1/N
Uniform distribution of probabilities: pi=1/N:
H = -\sum_{i=1}^{N} \frac{1}{N} \log_2\left(\frac{1}{N}\right) = \log_2(N)

(all symbols s1 … sN are equally likely, pi = 1/N)
Examples:
N= 2: pi=0.5; H=log2(2) = 1 bit
N=256: pi=1/256; H=log2(256)= 8 bits
Entropy gives the minimum number of bits required for coding.
Entropy example
X is sampled from {a, b, c, d}
P: {1/2, 1/4, 1/8, 1/8}
Find entropy.
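One way to work out the answer (a worked calculation, not on the slide), using the entropy formula above:

H = \tfrac{1}{2}\log_2 2 + \tfrac{1}{4}\log_2 4 + \tfrac{1}{8}\log_2 8 + \tfrac{1}{8}\log_2 8 = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 \text{ bits/symbol}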
Data Compression
the process of coding that will effectively reduce the total
number of bits needed to represent certain information
Lossless vs Lossy Compression
If the compression and decompression
processes induce no information loss, then
the compression scheme is lossless;
otherwise, it is lossy.
Compression codes
Variable length coding
Shannon-Fano code
Huffman’s code
Shannon-Fano coding
A top-down approach
Calculate the entropy & Shannon-Fano code for the given information
Shannon-Fano Code: Example (1st step)
Symbol counts (out of 39): A = 15, B = 7, C = 6, D = 6, E = 5, so pi = count/39.
Sort the symbols by decreasing probability and split them into two groups of nearly equal total count:
{A, B}: 15 + 7 = 22 → first bit 0
{C, D, E}: 6 + 6 + 5 = 17 → first bit 1
Shannon-Fano Code: Example (2nd step)
Split each group again:
{A, B} → A (15) gets bit 0, B (7) gets bit 1
{C, D, E} → C (6) gets bit 0, {D, E} (6 + 5 = 11) gets bit 1
Shannon-Fano Code: Example (3rd step)
Split the remaining group {D, E}: D (6) gets bit 0, E (5) gets bit 1.
Shannon-Fano Code: Example (Result)
Reading the binary tree from root to leaf gives the codewords A = 00, B = 01, C = 10, D = 110, E = 111.
Average codeword length: L = (15·2 + 7·2 + 6·2 + 6·3 + 5·3)/39 = 89/39 ≈ 2.28 bits/symbol
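A minimal sketch of this top-down splitting procedure (assuming Python; the function name and structure are illustrative, not from the slides):

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, count) sorted by decreasing count.
    Returns a dict mapping each symbol to its binary codeword."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total, running = sum(c for _, c in symbols), 0
    # Find the split point that divides the total count most evenly.
    best_i, best_diff = 1, float("inf")
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs((total - running) - running)
        if diff < best_diff:
            best_i, best_diff = i, diff
    codes = {}
    for sym, code in shannon_fano(symbols[:best_i]).items():
        codes[sym] = "0" + code  # upper (more probable) group gets bit 0
    for sym, code in shannon_fano(symbols[best_i:]).items():
        codes[sym] = "1" + code  # lower group gets bit 1
    return codes

print(shannon_fano([("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
```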
Huffman's Coding: A bottom-up approach
Huffman Coding Algorithm
Arrange the symbols according to decreasing probability.
Combine the two least probable symbols into one node (their probabilities add).
Repeat the procedure until only one symbol remains.
Huffman Coding – Explained
Source alphabet A = {a, b, c, d, e}
Probability distribution: {0.2, 0.4, 0.2, 0.1, 0.1}
Symbols sorted by decreasing probability: b (0.4), a (0.2), c (0.2), d (0.1), e (0.1)
Merging the two least probable nodes at each step: d + e → 0.2; c + (d,e) → 0.4; a + (c,d,e) → 0.6; b + (a,c,d,e) → 1.0.
This gives codeword lengths b: 1, a: 2, c: 3, d: 4, e: 4.
Huffman Coding (Result)
Entropy:
H(S) = -[2 × 0.2 log2(0.2) + 0.4 log2(0.4) + 2 × 0.1 log2(0.1)] = 2.122 bits/symbol
Average Huffman codeword length:
L = 0.2*2+0.4*1+0.2*3+0.1*4+0.1*4 = 2.2 bits / symbol
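A minimal Huffman sketch (assuming Python's standard heapq module; names are illustrative). Tie-breaking among equal probabilities can produce a different tree than the one on the slide, but any Huffman code for this source has the same average length of 2.2 bits/symbol:

```python
import heapq
from itertools import count

def huffman(probs: dict[str, float]) -> dict[str, str]:
    """Build a Huffman code for {symbol: probability}; returns {symbol: codeword}."""
    tie = count()  # tie-breaker so the heap never compares the dict payloads
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)  # two least probable nodes
        p2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

probs = {"a": 0.2, "b": 0.4, "c": 0.2, "d": 0.1, "e": 0.1}
codes = huffman(probs)
avg_len = sum(p * len(codes[s]) for s, p in probs.items())
print(codes, avg_len)  # average length ≈ 2.2 bits/symbol
```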
Properties of Huffman code
Unique Prefix Property:
no codeword is a prefix of any other, which precludes any ambiguity in decoding (though the Huffman code itself is not unique)
Optimality:
minimum redundancy code - proved optimal for a given
data model
The two least probable symbols will have Huffman codes of the same length, differing only in the last bit.
Symbols that occur more frequently will have shorter
Huffman codes than symbols that occur less frequently.
The average code length L for an information source S satisfies H(S) ≤ L < H(S) + 1, i.e., it is strictly less than entropy + 1.
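Plugging in the Huffman example above as a quick check (not on the original slide):

H(S) = 2.122 \le L = 2.2 < H(S) + 1 = 3.122 \text{ bits/symbol}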
Shannon’s theorem
Source Coding theorem
Given a discrete memoryless source of entropy H, the average codeword length L for any source encoding satisfies
H ≤ L
Considering the Shannon-Fano code, with
M = number of messages to be transmitted
N = bits per word,
then M = 2^N.
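As a quick sanity check with the earlier Shannon-Fano example (a worked comparison, not on the slide):

H \approx 2.19 \text{ bits/symbol} \le L = 89/39 \approx 2.28 \text{ bits/symbol}; \quad \text{and with } N = 8 \text{ bits per word, } M = 2^{8} = 256 \text{ messages}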
Shannon-Hartley Theorem
Channel Capacity Theorem
Shannon-Hartley Theorem: Explanation
For a channel of bandwidth B (Hz) corrupted by additive white Gaussian noise, with average signal-to-noise power ratio S/N, the channel capacity is
C = B \log_2\left(1 + \frac{S}{N}\right) \text{ bits/s}
Trade-off between BW & SNR
Effect of SNR: for a fixed bandwidth B, capacity grows only logarithmically with S/N.
Effect of BW: for a fixed S/N, capacity grows linearly with B (though widening B also admits more noise power).
Trade-off: the same capacity C can be obtained by exchanging bandwidth for signal-to-noise ratio, and vice versa.
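A small sketch of this trade-off (assuming Python; variable names and the example numbers are illustrative):

```python
import math

def capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley capacity C = B * log2(1 + S/N) in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Roughly the same capacity from very different BW/SNR combinations:
print(capacity(3_000, 1000.0))  # ≈ 29.9 kbit/s: narrow bandwidth, high SNR (30 dB)
print(capacity(10_000, 6.94))   # ≈ 29.9 kbit/s: wider bandwidth, much lower SNR (≈ 8.4 dB)
```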
Assignment
A discrete memoryless source has S = {a, b, c, d, e} & P = {0.4, 0.19, 0.16, 0.15, 0.1}. Explain the Shannon-Fano algorithm to construct a code for this source.
Thank You!