Arithmetic Coding: I J I J J
Arithmetic Coding: I J I J J
ji
Example
Letter
Probability
Interval
0.2
[0,0.2)
0.3
[0.2,0.5)
0.1
[0.5,0.6)
0.2
[0.6,0.8)
0.1
[0.8,0.9)
0.1
[0.9,1)
4.
0.2
0.23
0.233
a
0.2
0.26
e
0.5
0.23
i
0.236
0.5
4.
0.26
0.233
i
0.2336
0.23354 #
0.2336
0.236
Decompression
The shortest binary fraction in this
interval is 0.00111011110011 which is
0.23358154296875.
This number is more than 0.2 and less
than 0.5 so the first item is 'e'.
Two methods to stop decompression:
A special character for EOF.
Attaching the number of items.
4.
Scaling
The intervals in arithmetic coding are
shrinking very fast.
Many digits must be used to provide
sufficient precision.
Multiplications and divisions on very
long numbers are slow and
complicated.
4.
Example
Items:
a-10%
i-20%
r-30%
y-40%
Intervals:
a-[0,0.1)
i-[0.1,0.3)
r-[0.3,0.6)
y-[0.6,1)
Example (Cont.)
[0,1)
[0.6,1)
[0.2,1)
1
[0.2,0.28)
0
[0.4,0.56)
follow
[0.3,0.62)
follow
[0.1,0.74)
[0.164,0.292)
0ff = 011
[0.328,0.584)
[0.156,0.668)
follow
[0.3096,0.4632)
0f = 01
[0.6192,0.9264)
1
[0.2384,0.8528)
4.
Example (cont. )
The shortest fraction in the interval
[0.2384,0.8528) is 0.5 which is 0.1 in binary.
So the final output is 100110111.
Without scaling we would get:
y
a
i
[0,1) [0.6,1)
[0.6,0.64)
r
[0.604,0.612) [0.6064,0.6088)
The shortest binary fraction in this interval is
0.100110111 which is about 0.607422.
4.
an optimal code.
4.
Definitions
File X: x1,...,xn.
Probabilities of items in file X: p1,...,pn.
The items' set: a1,...,am.
xi{a1,...,am}.
Probabilities of items: q1,...,qm.
Frequencies of items: f1,...,fm.
qi=fi/n
4.
Proof of optimality
n
mf
i =1
i =1
i =1