IS502
SYSTEM
MULTIMEDIA OF DATA
COMPRESSION
Presenter Name: Mahmood A.Moneim
Supervised By: Prof. Hesham A.Hefny
Winter 2014
[Figure: compress produces compressed data; decompress produces decompressed data]
By Mahmood A.Moneim
Dictionary-based
Example: LZ family
Run-length code
Shannon-Fano Algorithm
To illustrate the algorithm, let's suppose the symbols
to be coded are the characters in the word HELLO. The
frequency count of the symbols is:
Symbol   H   E   L   O
Count    1   1   2   1
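A minimal Python sketch of Shannon-Fano coding for the HELLO example. It sorts symbols by descending count and recursively splits each group into two parts with totals as equal as possible, assigning 0 to the first part and 1 to the second. The function name `shannon_fano` and the tie-breaking rule (first best cut wins) are assumptions made for illustration; other tie-breaks give different but equally long codes.

```python
from collections import Counter


def shannon_fano(freqs):
    """Return {symbol: bitstring} for a {symbol: count} mapping."""
    # Sort by descending count; ties keep first-seen order (stable sort).
    symbols = sorted(freqs, key=lambda s: -freqs[s])
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(freqs[s] for s in group)
        # Find the cut that makes the two parts' totals as equal as possible.
        running, best_cut, best_diff = 0, 1, None
        for i in range(1, len(group)):
            running += freqs[group[i - 1]]
            diff = abs(total - 2 * running)
            if best_diff is None or diff < best_diff:
                best_cut, best_diff = i, diff
        left, right = group[:best_cut], group[best_cut:]
        for s in left:
            codes[s] += "0"
        for s in right:
            codes[s] += "1"
        split(left)
        split(right)

    split(symbols)
    return codes


counts = Counter("HELLO")
codes = shannon_fano(counts)
encoded = "".join(codes[ch] for ch in "HELLO")
print(codes, encoded, len(encoded))
```

With this tie-break, L gets a 1-bit code, H a 2-bit code, and E and O 3-bit codes, so HELLO takes 10 bits.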
Cont.
Entropy
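The slide content under this heading is not in the text. Assuming it refers to the standard Shannon entropy, H = sum of p_i * log2(1 / p_i) over all symbols, the sketch below computes it for the word HELLO. The function name `entropy` is an assumption made for illustration.

```python
import math
from collections import Counter


def entropy(message):
    """Shannon entropy in bits per symbol of the symbols in `message`."""
    counts = Counter(message)
    n = len(message)
    # p_i = c / n, so log2(1 / p_i) = log2(n / c).
    return sum((c / n) * math.log2(n / c) for c in counts.values())


h = entropy("HELLO")
print(round(h, 4))
```

For HELLO this is about 1.92 bits per symbol, so no symbol-by-symbol code can average fewer than about 1.92 bits per character on this message.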
Huffman code
Huffman code (illustrated with a manageable example):
Letter   Frequency (%)
A        25
B        15
C        10
D        20
E        30
Huffman code
Huffman code: Code formation
- Assign a weight to each character
- Merge the two lightest weights into one root
node whose weight is their sum
- Repeat until one tree is left
- Traverse the tree from the root to each leaf (at
each node, assign 0 to the left branch, 1 to the right)
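The steps above can be sketched in Python with a min-heap, using the A to E frequency table from the example. The function name `huffman_codes` and the tie-breaking counter are assumptions made for illustration; a different tie-break produces different bit patterns with the same code lengths.

```python
import heapq
from itertools import count


def huffman_codes(weights):
    """Return {symbol: bitstring} for a {symbol: weight} mapping."""
    tiebreak = count()  # unique counter so heap never compares tree nodes
    heap = [(w, next(tiebreak), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    # Merge the two lightest trees until a single tree is left.
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")  # 0 to the left
            walk(node[1], prefix + "1")  # 1 to the right
        else:                            # leaf: a symbol
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


weights = {"A": 25, "B": 15, "C": 10, "D": 20, "E": 30}
codes = huffman_codes(weights)
average_bits = sum(weights[s] * len(codes[s]) for s in weights) / 100
print(codes, average_bits)
```

A, D and E end up with 2-bit codes and B and C with 3-bit codes, for an average of 2.25 bits per letter.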
Huffman code
Huffman code: Code Interpretation
- No-prefix property: the code for any character
never appears as the prefix of another code
(Verify)
- The receiver keeps reading bits until they
match a code, then emits that character
- Exercise: extract the string from 01110001110110110111
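A minimal Python sketch of the receiver's procedure: accumulate bits until they match a code, emit the character, and start again. The code table for this slide is not in the text, so `assumed_codes` below is an assumption: it is one valid Huffman code for the A to E frequencies (2 bits for A, D, E and 3 bits for B, C), and the decoded string depends on that choice.

```python
def decode_prefix_code(bits, codes):
    """Decode a bit string using a prefix-free {symbol: code} table."""
    inverse = {code: sym for sym, code in codes.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:  # a complete code has been received
            out.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


# Assumed code table (not given in the slide text).
assumed_codes = {"A": "01", "B": "110", "C": "111", "D": "10", "E": "00"}
decoded = decode_prefix_code("01110001110110110111", assumed_codes)
print(decoded)
```

Because no code is a prefix of another, the first match is always the right one and no separators are needed between codes.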
Example. Find the Huffman codes and the compression ratio (C.R.) for Table 1,
assuming that the uncompressed representation takes 8 bits per character and
that the size of the Huffman table is not part of the compressed size.
Table 1:
Char   Freq
A      90
B      60
C      50
D      20
E      12
F      8
G      7
H      3
Huffman Codes (two equivalent assignments):
Char   Code     Equivalent code
A      00       11
B      01       10
C      10       01
D      111      000
E      1101     0010
F      11001    00110
G      110000   001111
H      110001   001110
Huffman Tree
            250
          /     \
       150       100
      /   \     /   \
     A     B   C     50
                    /  \
                  30    D
                 /  \
               18    E
              /  \
            10    F
           /  \
          G    H
Char   Freq   Huffman Code
A      90     00
B      60     01
C      50     10
D      20     111
E      12     1101
F      8      11001
G      7      110000
H      3      110001
C.R. = (250*8) / (2*90 + 2*60 + 2*50 + 3*20 + 4*12 + 5*8 + 6*7 + 6*3) = 3.29
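The arithmetic above can be checked with a short Python sketch. The frequencies and codes are taken from Table 1 and the Huffman tree; the variable names are assumptions made for illustration.

```python
# Frequencies and Huffman codes from Table 1.
freqs = {"A": 90, "B": 60, "C": 50, "D": 20, "E": 12, "F": 8, "G": 7, "H": 3}
codes = {
    "A": "00", "B": "01", "C": "10", "D": "111",
    "E": "1101", "F": "11001", "G": "110000", "H": "110001",
}

# 8 bits per character before compression.
uncompressed_bits = 8 * sum(freqs.values())
# Each character costs its code length after compression.
compressed_bits = sum(freqs[ch] * len(codes[ch]) for ch in freqs)
ratio = uncompressed_bits / compressed_bits
print(uncompressed_bits, compressed_bits, round(ratio, 2))
```

This gives 2000 bits uncompressed and 608 bits compressed, which is the ratio of about 3.29 stated above.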
Arithmetic compression
Arithmetic compression is based on
interpreting a character string as a single real
number
Letter   Frequency (%)   Subinterval [p, q]
A        25              [0, 0.25]
B        15              [0.25, 0.40]
C        10              [0.40, 0.50]
D        20              [0.50, 0.70]
E        30              [0.70, 1.0]
Arithmetic compression
Arithmetic compression: Coding CABAC
Generate subintervals of decreasing length; the
subintervals depend uniquely on the string's
characters and their frequencies.
An interval [x, y] has width $w = y - x$; the new
interval based on [p, q] is $x' = x + w \cdot p$,
$y' = x + w \cdot q$ (both computed from the old x).
Step 1: C gives [0.4, 0.5] within [0, 1]
based on p = 0.4, q = 0.5
Arithmetic compression
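Step 2: A gives [0.4, 0.425] within [0.4, 0.5]
based on p = 0.0, q = 0.25
Step 3: B gives [0.40625, 0.41] within [0.4, 0.425]
based on p = 0.25, q = 0.4
Step 4: A gives [0.40625, 0.4071875] within [0.40625, 0.41]
based on p = 0.0, q = 0.25
Step 5: C gives [0.406625, 0.40671875] within [0.40625, 0.4071875]
based on p = 0.4, q = 0.5
Final representation (midpoint)?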
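The interval narrowing for CABAC can be sketched in Python. Exact fractions are used here, as an illustrative choice, so that the interval bounds carry no floating-point rounding; the names `INTERVALS` and `arithmetic_encode` are assumptions, and the subintervals are those of the A to E table.

```python
from fractions import Fraction

# Subinterval [p, q] for each letter, from the frequency table.
INTERVALS = {
    "A": (Fraction(0), Fraction(1, 4)),
    "B": (Fraction(1, 4), Fraction(2, 5)),
    "C": (Fraction(2, 5), Fraction(1, 2)),
    "D": (Fraction(1, 2), Fraction(7, 10)),
    "E": (Fraction(7, 10), Fraction(1)),
}


def arithmetic_encode(message, intervals=INTERVALS):
    """Return the final interval (x, y) and the interval after each step."""
    x, y = Fraction(0), Fraction(1)
    steps = []
    for ch in message:
        p, q = intervals[ch]
        w = y - x
        # Both new bounds are computed from the old x.
        x, y = x + w * p, x + w * q
        steps.append((ch, x, y))
    return x, y, steps


low, high, steps = arithmetic_encode("CABAC")
for ch, x, y in steps:
    print(ch, float(x), float(y))
midpoint = (low + high) / 2
```

Any number inside the final interval identifies the string, provided the decoder also knows the string length or sees an end marker; the midpoint is one such choice.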
Arithmetic compression
Arithmetic compression: Extracting CABAC
N        Interval [p, q]   Width   Character   N-p      (N-p)/width
0.4067   [0.4, 0.5]        0.1     C           0.0067   0.067
0.067    [0, 0.25]         0.25    A           0.067    0.268
0.268    [0.25, 0.4]       0.15    B           0.018    0.12
0.12     [0, 0.25]         0.25    A           0.12     0.48
0.48     [0.4, 0.5]        0.1     C           0.08     0.8
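The extraction table can be reproduced with a short Python sketch: find the subinterval containing N, emit its letter, then rescale N to (N - p) / width and repeat. The names `INTERVALS` and `arithmetic_decode` are assumptions made for illustration. The sketch treats each subinterval as half-open (p <= N < q) and takes the message length as a parameter, since the number alone does not say where to stop.

```python
from fractions import Fraction

# Subinterval [p, q] for each letter, from the frequency table.
INTERVALS = {
    "A": (Fraction(0), Fraction(1, 4)),
    "B": (Fraction(1, 4), Fraction(2, 5)),
    "C": (Fraction(2, 5), Fraction(1, 2)),
    "D": (Fraction(1, 2), Fraction(7, 10)),
    "E": (Fraction(7, 10), Fraction(1)),
}


def arithmetic_decode(n, length, intervals=INTERVALS):
    """Recover `length` characters from the number n in [0, 1)."""
    out = []
    for _ in range(length):
        for ch, (p, q) in intervals.items():
            if p <= n < q:
                out.append(ch)
                n = (n - p) / (q - p)  # rescale to [0, 1) for the next step
                break
        else:
            raise ValueError("value outside [0, 1)")
    return "".join(out)


decoded = arithmetic_decode(Fraction("0.4067"), 5)
print(decoded)
```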
LZW Algorithm
LZW Compression
BEGIN
  s = next input character;
  while not EOF
  {
    c = next input character;
    if s + c exists in the dictionary
      s = s + c;
    else
    {
      output the code for s;
      add string s + c to the dictionary with a new code;
      s = c;
    }
  }
  output the code for s;
END
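A minimal Python sketch of the pseudocode above. The input string "ABABBABCABABBA" and the initial dictionary A = 1, B = 2, C = 3 are assumptions chosen for illustration, since the slide's worked example is not in the text; the function name `lzw_compress` is also an assumption.

```python
def lzw_compress(text, dictionary):
    """Return the list of LZW codes for `text`.

    `dictionary` maps every single input character to an initial code.
    """
    dictionary = dict(dictionary)  # work on a copy
    next_code = max(dictionary.values()) + 1
    if not text:
        return []
    s = text[0]
    output = []
    for c in text[1:]:
        if s + c in dictionary:
            s = s + c                      # extend the current match
        else:
            output.append(dictionary[s])   # output the code for s
            dictionary[s + c] = next_code  # add s + c with a new code
            next_code += 1
            s = c
    output.append(dictionary[s])           # output the code for the last s
    return output


# Assumed example input and initial dictionary.
codes = lzw_compress("ABABBABCABABBA", {"A": 1, "B": 2, "C": 3})
print(codes)
```

Under these assumptions the 14 input characters become 9 codes: 1 2 4 5 2 3 4 6 1.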
LZW Decompression
Cont.
The input code sequence for the decoder is 1 2 4 5 2 3 4 6 1.
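A minimal Python sketch of LZW decompression applied to this code sequence. Two things are assumed here because the slide's decoder table is not in the text: the sequence is read as single-digit codes (1, 2, 4, 5, 2, 3, 4, 6, 1), and the initial dictionary is 1 = A, 2 = B, 3 = C. The function name `lzw_decompress` is also an assumption.

```python
def lzw_decompress(codes, dictionary):
    """Rebuild the text from LZW codes.

    `dictionary` maps each initial code to a single character.
    """
    dictionary = dict(dictionary)  # work on a copy
    next_code = max(dictionary) + 1
    if not codes:
        return ""
    prev = dictionary[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Code not yet in the dictionary: it must be prev + prev[0].
            entry = prev + prev[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        # Mirror the encoder: add previous string + first char of this one.
        dictionary[next_code] = prev + entry[0]
        next_code += 1
        prev = entry
    return "".join(out)


# Assumed initial dictionary and single-digit reading of the code sequence.
decoded = lzw_decompress([1, 2, 4, 5, 2, 3, 4, 6, 1], {1: "A", 2: "B", 3: "C"})
print(decoded)
```

The decoder rebuilds the same dictionary as the encoder while it reads the codes, so the dictionary itself never has to be transmitted.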
QUESTIONS?