MM05_example-1
Example
Arithmetic Vs. Huffman Coding
Huffman coding builds a static table of all the characters and their
frequencies, then generates a code table from it.
Because the same static table is used for the whole coding process, Huffman
coding is rather fast, but its compression ratio is comparatively poor.
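As a minimal sketch of how a code table can be derived from a static frequency table, the snippet below computes Huffman code lengths with a priority queue. The weights are an assumption: they reuse the four-symbol alphabet of Example 1 later in this document, scaled to integers. The function name `huffman_code_lengths` is hypothetical.

```python
import heapq

def huffman_code_lengths(weights):
    """Return {symbol: code length in bits} for a static weight table."""
    # Each heap entry is (total weight, tie-breaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in weights}
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them gains one bit.
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, counter, syms1 + syms2))
        counter += 1
    return lengths

# Assumed weights: the probabilities 0.4, 0.3, 0.1, 0.2 scaled by 10.
WEIGHTS = {"A": 4, "B": 3, "C": 1, "#": 2}
lengths = huffman_code_lengths(WEIGHTS)
total_bits = sum(lengths[s] for s in "ABBC#")
print(lengths, total_bits)
```

With these weights the message ABBC# costs 11 bits, because every symbol must receive a whole number of bits.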
An ideal compression method would satisfy all of the features listed in the
table below.
Compression Method             Arithmetic   Huffman
Compression Ratio              Very Good    Poor
Compression Speed              Slow         Fast
Decompression Speed            Slow         Fast
Memory Space                   Very Low     Low
Compressed Pattern Matching    No           Yes
Permits Random Access          No           Yes
IR = { [0, p_1), [p_1, p_1 + p_2), [p_1 + p_2, p_1 + p_2 + p_3), …, [p_1 + … + p_{n-1}, p_1 + … + p_n) }
Putting $q_j = \sum_{i=1}^{j} p_i$, we can write IR = { [0, q_1), [q_1, q_2), …, [q_{n-1}, 1) }
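The cumulative bounds can be computed directly from the probabilities. The sketch below assumes the probabilities 0.4, 0.3, 0.1, 0.2 used in Example 1 and represents them as exact fractions; the helper names `cumulative_bounds` and `unit_intervals` are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate

def cumulative_bounds(probs):
    """Return [q_0, q_1, ..., q_n] with q_0 = 0 and q_j = p_1 + ... + p_j."""
    return [Fraction(0)] + list(accumulate(probs))

def unit_intervals(probs):
    """Return the intervals [q_{j-1}, q_j) that partition [0, 1)."""
    q = cumulative_bounds(probs)
    return list(zip(q[:-1], q[1:]))

probs = [Fraction(4, 10), Fraction(3, 10), Fraction(1, 10), Fraction(2, 10)]
for low, high in unit_intervals(probs):
    print(f"[{float(low)}, {float(high)})")
```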
Arithmetic Code
Coding
ArithmeticEncoding ( Message )
1. CurrentInterval = [0, 1);
While the end of message is not reached
   2. Read letter xi from the message;
   3. Divide CurrentInterval into subintervals IR_CurrentInterval;
   4. CurrentInterval = subinterval_i in IR_CurrentInterval;
Output any number from the CurrentInterval (usually its left boundary);
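The encoding loop can be sketched in a few lines of Python. This is an illustrative sketch, not a production coder: it assumes the symbol probabilities of Example 1, uses exact fractions instead of the fixed-precision integer arithmetic a real implementation would need, and the names `subdivide` and `arithmetic_encode` are hypothetical.

```python
from fractions import Fraction

# Assumed model: the symbol probabilities from Example 1.
PROBS = {"A": Fraction(4, 10), "B": Fraction(3, 10),
         "C": Fraction(1, 10), "#": Fraction(2, 10)}

def subdivide(low, high, probs):
    """Split [low, high) into one subinterval per symbol, proportional to probs."""
    width = high - low
    intervals = {}
    start = low
    for symbol, p in probs.items():
        end = start + width * p
        intervals[symbol] = (start, end)
        start = end
    return intervals

def arithmetic_encode(message, probs):
    """Return the final interval [low, high) for the message."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        # Narrow the current interval to the subinterval of the symbol read.
        low, high = subdivide(low, high, probs)[symbol]
    return low, high

low, high = arithmetic_encode("ABBC#", PROBS)
print(float(low), float(high))
```

Any number in the returned interval identifies the message; the left boundary `low` is the usual choice.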
Symbol:       A    B    C    #
Probability:  0.4  0.3  0.1  0.2
Example 1
input message: A B B C #
IR[0,1) = { [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1) }
IR[0,0.4) = { [0, 0.16), [0.16, 0.28), [0.28, 0.32), [0.32, 0.4) }
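The subintervals of an arbitrary current interval follow from scaling the cumulative bounds. Writing the current interval as $[l, h)$ (the symbols $l$ and $h$ are introduced here for illustration and do not appear in the original slides) and taking $q_0 = 0$:

```latex
\[
IR_{[l,h)} = \bigl\{\, [\, l + (h-l)\,q_{j-1},\; l + (h-l)\,q_j \,) : j = 1, \dots, n \,\bigr\},
\qquad q_j = \sum_{i=1}^{j} p_i .
\]
% Check for l = 0, h = 0.4 and q = (0.4, 0.7, 0.8, 1):
\[
IR_{[0,0.4)} = \{\, [0, 0.16),\; [0.16, 0.28),\; [0.28, 0.32),\; [0.32, 0.4) \,\}.
\]
```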
Xi   Current interval    Subintervals
A    [0, 1)              [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B    [0, 0.4)            [0, 0.16), [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B    [0.16, 0.28)        [0.16, 0.208), [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)
C    [0.208, 0.244)      [0.208, 0.2224), [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)
#    [0.2332, 0.2368)    [0.2332, 0.23464), [0.23464, 0.23572), [0.23572, 0.23608), [0.23608, 0.2368)
Final interval: [0.23608, 0.2368)
# marks the end of the input message, so encoding stops and returns the current interval [0.23608, 0.2368).
[Figure: Example 1, ABBC#. The interval [0, 1) is narrowed after seeing each of A, B, B, C, #, ending at [0.23608, 0.2368).]
Output for ABBC#: 0.23608 (the left boundary of the final interval).
• The size of the final range is 0.2368 - 0.23608 = 0.00072, which is the
product of the probabilities of the encoded symbols: 0.4 × 0.3 × 0.3 × 0.1 × 0.2.
• According to Shannon, in the best compression code the output length
contains a contribution of -log2(p) bits from the encoding of each symbol
whose probability of occurrence is p.
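To make the Shannon bound concrete for Example 1, the sketch below sums -log2(p) over the symbols of ABBC#. It assumes the logarithm is taken to base 2 (so the result is in bits) and the probabilities from Example 1; `ideal_bits` is a hypothetical helper name.

```python
import math

# Assumed model: the symbol probabilities from Example 1.
PROBS = {"A": 0.4, "B": 0.3, "C": 0.1, "#": 0.2}

def ideal_bits(message, probs):
    """Sum of -log2(p) over the symbols of the message."""
    return sum(-math.log2(probs[s]) for s in message)

bits = ideal_bits("ABBC#", PROBS)
print(f"{bits:.2f} bits")  # about 10.44
```

The same value is obtained as -log2(0.00072), the negative logarithm of the final range size, which is why a narrower final interval means a longer code.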
Arithmetic Code
Decoding
ArithmeticDecoding ( Codeword )
1. CurrentInterval = [0, 1);
While(1)
   2. Divide CurrentInterval into subintervals IR_CurrentInterval;
   3. Determine the subinterval_i of CurrentInterval to which
      Codeword belongs;
   4. Output letter xi corresponding to this subinterval;
   5. If xi is the symbol '#'
         Return;
   6. CurrentInterval = subinterval_i in IR_CurrentInterval;
Symbol Probability
A 0.4
B 0.3
C 0.1
# 0.2
With $q_j = \sum_{i=1}^{j} p_i$:
IR[0,1) = { [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1) }
Similarly, we repeat steps 2 to 6 of the algorithm until the output symbol is '#'.
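The decoding steps can be sketched as follows, again as an illustration under stated assumptions: the model is the Symbol/Probability table above, the codeword is an exact fraction, and `arithmetic_decode` is a hypothetical name.

```python
from fractions import Fraction

# Assumed model: the Symbol/Probability table of the decoding example.
PROBS = {"A": Fraction(4, 10), "B": Fraction(3, 10),
         "C": Fraction(1, 10), "#": Fraction(2, 10)}

def arithmetic_decode(codeword, probs, end_symbol="#"):
    """Decode symbols until end_symbol is output."""
    low, high = Fraction(0), Fraction(1)
    decoded = []
    while True:
        width = high - low
        start = low
        # Find the subinterval of the current interval containing the codeword.
        for symbol, p in probs.items():
            end = start + width * p
            if start <= codeword < end:
                break
            start = end
        else:
            raise ValueError("codeword is outside the current interval")
        decoded.append(symbol)
        if symbol == end_symbol:
            return "".join(decoded)
        low, high = start, end

print(arithmetic_decode(Fraction("0.23608"), PROBS))  # ABBC#
```

Exact fractions are used because the codeword 0.23608 lies exactly on a subinterval boundary; with binary floating point the comparison at that boundary could go either way.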