Lecture 3: Distributed Lossless Compression

(Reading: NIT Ch. 10)

∙ Distributed lossless source coding


∙ Lossless source coding via random binning
∙ Time sharing
∙ Achievability proof of the Slepian–Wolf theorem
∙ Extension to more than two sources

© Copyright – Abbas El Gamal and Young-Han Kim

Distributed lossless compression system

Xn M
Encoder 
̂ n, X
X ̂n
 
Decoder
Xn M
Encoder 

∙ Two-component DMS (2-DMS) (X1 × X2, p(x1, x2))

∙ A (2^{nR1}, 2^{nR2}, n) code:
▸ Two encoders: m1(x1^n) ∈ [1 : 2^{nR1}) and m2(x2^n) ∈ [1 : 2^{nR2})
▸ One decoder: (x̂1^n, x̂2^n)(m1, m2)

∙ Probability of error: Pe(n) = P{(X̂1^n, X̂2^n) ≠ (X1^n, X2^n)}

∙ (R1, R2) is achievable if there exists a sequence of (2^{nR1}, 2^{nR2}, n) codes such that lim_{n→∞} Pe(n) = 0

∙ Optimal rate region R*: the closure of the set of achievable rate pairs (R1, R2)
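These definitions translate directly into a simulation harness. Below is a minimal Python sketch, not part of the original slides, that estimates Pe(n) for any candidate encoder/decoder triple by Monte Carlo sampling; all names and the n = 1 sanity check are illustrative.

```python
import random
from typing import Callable, Tuple

Seq = Tuple[int, ...]

def estimate_pe(sample_pair: Callable[[], Tuple[Seq, Seq]],
                enc1: Callable[[Seq], int],
                enc2: Callable[[Seq], int],
                dec: Callable[[int, int], Tuple[Seq, Seq]],
                trials: int = 10_000) -> float:
    """Monte Carlo estimate of Pe(n) = P{(X1^n-hat, X2^n-hat) != (X1^n, X2^n)}."""
    errors = 0
    for _ in range(trials):
        x1, x2 = sample_pair()                    # draw a block from the 2-DMS
        if dec(enc1(x1), enc2(x2)) != (x1, x2):   # separate encoding, joint decoding
            errors += 1
    return errors / trials

def sample_dsbs() -> Tuple[Seq, Seq]:
    # Toy n = 1 source: X2 = X1 xor Z with Z ~ Bern(0.1)
    x1 = random.randrange(2)
    return (x1,), (x1 ^ (random.random() < 0.1),)

# Sanity check: uncoded transmission is lossless, so the estimate is 0.0.
print(estimate_pe(sample_dsbs, lambda x: x[0], lambda x: x[0],
                  lambda m1, m2: ((m1,), (m2,))))
```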

 / 
Bounds on the optimal rate region
[Figure: (R1, R2)-plane sketch of the inner bound, with corner point (H(X1), H(X2)), and the outer bound, with corners determined by H(X1|X2) and H(X2|X1); the gap between the two bounds is marked "?"]

∙ Sufficient condition (compress each source individually):
R1 > H(X1), R2 > H(X2)

∙ Necessary condition (even for centralized compression):
R1 + R2 ≥ H(X1, X2)

∙ Can also show that
R1 ≥ H(X1|X2), R2 ≥ H(X2|X1) are necessary

 / 

Slepian–Wolf theorem
[Figure: the Slepian–Wolf rate region, bounded by R1 = H(X1|X2), R2 = H(X2|X1), and R1 + R2 = H(X1, X2), with corner points (H(X1), H(X2|X1)) and (H(X1|X2), H(X2))]

Theorem . (Slepian–Wolf )


The optimal rate region R ∗ is the set of (R , R ) such that

R ≥ H(X |X ),
R ≥ H(X |X ),
R + R ≥ H(X , X )

 / 
Example

∙ Doubly symmetric binary source DSBS(p): (X1, X2) with X1 ~ Bern(1/2) and X2 = X1 ⊕ Z, where Z ~ Bern(p) is independent of X1; equivalently, the joint pmf is

p(x1, x2):        x2 = 0       x2 = 1
x1 = 0           (1 − p)/2     p/2
x1 = 1            p/2         (1 − p)/2

∙ Individual compression: H(X1) + H(X2) = 2 bits/symbol-pair

∙ Slepian–Wolf coding: H(X1, X2) = 1 + H(p) bits/symbol-pair, which is strictly less than 2 whenever p ≠ 1/2
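The slide's numeric value of p did not survive extraction, so here is a quick numeric comparison with an illustrative crossover probability; only the formula H(X1, X2) = 1 + H(p) comes from the slide.

```python
import math

def h2(p: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.1  # illustrative; the slide's original value is not recoverable
print(f"individual compression: {2.0:.4f} bits/symbol-pair")        # H(X1) + H(X2)
print(f"Slepian-Wolf sum rate:  {1 + h2(p):.4f} bits/symbol-pair")  # 1 + H(p)
```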

 / 

Lossless source coding via random binning

∙ Codebook generation:
▸ Randomly assign an index m(x^n) ∈ [1 : 2^{nR}] to each sequence x^n ∈ X^n
▸ The set of sequences with the same index m form a bin B(m), m ∈ [1 : 2^{nR}]
▸ The bin assignments are revealed to the encoder and decoder

∙ Encoding:
▸ Upon observing x^n ∈ B(m), send the bin index m

∙ Decoding:
▸ Find the unique typical sequence x̂^n ∈ B(m)

[Figure: the bins B(1), ..., B(2^{nR}) partition X^n; decoding succeeds when the source sequence x^n ∈ Tε(n) is the only typical sequence in its bin]
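The scheme can be exercised end to end at a toy blocklength. The following Python sketch, with illustrative parameters and a crude robust-typicality test on the fraction of ones, brute-forces codebook generation, encoding, and decoding for a Bern(p) source; at such a small n the empirical error rate is visibly nonzero, since Pe(n) → 0 is an asymptotic guarantee.

```python
import itertools
import random
from collections import defaultdict

p, n, R, eps = 0.2, 14, 0.85, 0.75   # illustrative parameters
random.seed(0)
num_bins = 2 ** round(n * R)         # |[1 : 2^{nR}]|

# Codebook generation: assign every x^n a bin index uniformly at random.
bin_of = {x: random.randrange(num_bins)
          for x in itertools.product((0, 1), repeat=n)}

def typical(x):
    # Robust typicality for a binary source: empirical frequency of ones near p.
    return abs(sum(x) / n - p) <= eps * p

# The decoder pre-sorts the typical sequences by bin.
typical_in_bin = defaultdict(list)
for x, b in bin_of.items():
    if typical(x):
        typical_in_bin[b].append(x)

def decode(m):
    # Succeeds iff the received bin holds exactly one typical sequence.
    cands = typical_in_bin[m]
    return cands[0] if len(cands) == 1 else None

trials = 500
ok = sum(decode(bin_of[x]) == x
         for x in (tuple(int(random.random() < p) for _ in range(n))
                   for _ in range(trials)))
print(f"empirical success rate at n={n}, R={R}: {ok / trials:.2f}")
```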
 / 
Analysis of the probability of error

∙ We bound Pe(n) averaged over the random bin assignments

∙ Let M denote the random bin index of X^n, i.e., X^n ∈ B(M)

∙ Note that M ~ Unif[1 : 2^{nR}], independent of X^n

∙ Error events:

E1 = {X^n ∉ Tε(n)}, or
E2 = {x̃^n ∈ B(M) for some x̃^n ≠ X^n, x̃^n ∈ Tε(n)}

Thus

P(E) ≤ P(E1) + P(E2)
     = P(E1) + P(E2 | X^n ∈ B(1)),

where the last step follows by the symmetry of the random bin assignments

∙ By the LLN, P(E1) → 0

 / 

Analysis of the probability of error

∙ Consider

P(E2 | X^n ∈ B(1))
  = Σ_{x^n} P{X^n = x^n | X^n ∈ B(1)}
      · P{x̃^n ∈ B(1) for some x̃^n ≠ x^n, x̃^n ∈ Tε(n) | x^n ∈ B(1), X^n = x^n}
  ≤ Σ_{x^n} p(x^n) Σ_{x̃^n ∈ Tε(n), x̃^n ≠ x^n} P{x̃^n ∈ B(1) | x^n ∈ B(1), X^n = x^n}
  = Σ_{x^n} p(x^n) Σ_{x̃^n ∈ Tε(n), x̃^n ≠ x^n} P{x̃^n ∈ B(1)}
  ≤ |Tε(n)| · 2^{−nR}
  ≤ 2^{n(H(X)+δ(ε))} · 2^{−nR}

∙ Hence P(E2) → 0 as n → ∞ if R > H(X) + δ(ε)

 / 
Achievability via linear binning

∙ Let X be a Bern(p) source

∙ R = H(X) = H(p) is achieved via linear binning (hashing):
▸ Let H be a randomly generated nR × n binary parity-check matrix
▸ The encoder sends the syndrome HX^n, computed over GF(2)
▸ The decoder recovers X^n with high probability if R > H(p) (why? — see the sketch below)
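One way to answer the "(why?)": there are only about 2^{nH(p)} typical sequences, while the matrix provides 2^{nR} distinct syndromes, so for R > H(p) two typical sequences rarely collide. The brute-force Python sketch below makes this concrete; the parameters and the typicality test are illustrative.

```python
import itertools
import random
from collections import defaultdict

p, n, R, eps = 0.2, 14, 0.9, 0.75   # illustrative parameters
random.seed(1)
k = round(n * R)                    # number of parity checks, so R = k/n
H = [[random.randrange(2) for _ in range(n)] for _ in range(k)]

def syndrome(x):
    # H x^n over GF(2): one parity bit per row of H.
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in H)

def typical(x):
    return abs(sum(x) / n - p) <= eps * p

# Decoder's lookup: syndrome -> typical sequences with that syndrome.
table = defaultdict(list)
for x in itertools.product((0, 1), repeat=n):
    if typical(x):
        table[syndrome(x)].append(x)

def decode(s):
    cands = table[s]
    return cands[0] if len(cands) == 1 else None

trials = 300
ok = 0
for _ in range(trials):
    x = tuple(int(random.random() < p) for _ in range(n))
    ok += decode(syndrome(x)) == x
print(f"empirical success rate at n={n}, R={R}: {ok / trials:.2f}")
```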

 / 

Time sharing

Proposition .
If (R󳰀 , R󳰀 ), (R󳰀󳰀 , R󳰀󳰀 ) ∈ R ∗ , then (R , R ) = (αR󳰀 + αR
̄ 󳰀󳰀 , αR󳰀 + αR
̄ 󳰀󳰀 ) ∈ R ∗ for α ∈ [, ]

∙ The rate region R ∗ is convex


R

(R󳰀 , R󳰀 )

(R , R )
(R󳰀󳰀 , R󳰀󳰀 )

R

 / 
∙ Proof (time-sharing argument):
▸ Let C′k be a sequence of (2^{kR1′}, 2^{kR2′}, k) codes with Pe(k) → 0
▸ Let C′′k be a sequence of (2^{kR1′′}, 2^{kR2′′}, k) codes with Pe(k) → 0
▸ Construct a new sequence of codes by using C′αn for i ∈ [1 : αn] and C′′ᾱn for i ∈ [αn + 1 : n]
▸ By the union of events bound, Pe(n) ≤ Pe(αn) + Pe(ᾱn) → 0

∙ Remarks:
▸ Time division is a special case of time sharing (between the points (R1, 0) and (0, R2))
▸ The rate (capacity) region of any source (channel) coding problem is convex
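As a concrete instance, for the DSBS example from earlier the two Slepian–Wolf corner points are (1, H(p)) and (H(p), 1), and any time-shared mixture keeps the sum rate pinned at H(X1, X2) = 1 + H(p). A quick numeric check, with illustrative p and α:

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0, 1) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p, alpha = 0.1, 0.5                  # illustrative values
a = (1.0, h2(p))                     # corner (H(X1), H(X2|X1)) of the DSBS(p)
b = (h2(p), 1.0)                     # corner (H(X1|X2), H(X2))
r1 = alpha * a[0] + (1 - alpha) * b[0]
r2 = alpha * a[1] + (1 - alpha) * b[1]
print(f"time-shared point ({r1:.4f}, {r2:.4f}); "
      f"sum rate {r1 + r2:.4f} = 1 + h2(p) = {1 + h2(p):.4f}")
```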
 / 

Achievability proof of the S–W theorem (Cover )


∙ We show that the corner point (H(X ), H(X |X )) is achievable
∙ Achievability of the other corner point (H(X |X )), H(X )) follows similarly
∙ The rest of the region is achieved using time sharing

[Figure: the Slepian–Wolf region with the corner point (H(X1), H(X2|X1)) highlighted]
Achievability proof of the S–W theorem (Cover )
∙ We show that the corner point (H(X ), H(X |X )) is achievable
∙ Achievability of the other corner point (H(X |X )), H(X )) follows similarly
∙ The rest of the region is achieved using time sharing

R

H(X )

H(X |X )
R
H(X |X ) H(X )

 / 

Achievability proof of the S–W theorem (Cover )


∙ We show that the corner point (H(X ), H(X |X )) is achievable
∙ Achievability of the other corner point (H(X |X )), H(X )) follows similarly
∙ The rest of the region is achieved using time sharing

R

H(X )

H(X |X )
R
H(X |X ) H(X )

 / 
Achievability of (H(X1), H(X2|X1))

∙ Codebook generation:
▸ Assign a distinct index m1 ∈ [1 : 2^{nR1}] to each x1^n ∈ Tε(n)(X1), and m1 = 1 otherwise
▸ Randomly assign an index m2(x2^n) ∈ [1 : 2^{nR2}] to each x2^n ∈ X2^n
▸ The sequences with the same index m2 form a bin B(m2)

xn
  m nR
xn
B()

B(m ) Tє(n) (X , X )

B(nR )
 / 

∙ Encoding:
▸ Upon observing x1^n, encoder 1 sends the index m1(x1^n)
▸ Upon observing x2^n ∈ B(m2), encoder 2 sends m2

xn
  m nR
xn
B()

B(m ) (xn , xn )

B(nR )
 / 
Achievability of (H(X ), H(X |X ))
∙ Decoding (the sources are recovered successively):
▸ Declare x̂1^n = x1^n(m1) for the unique x1^n(m1) ∈ Tε(n)(X1)
▸ Then find the unique x̂2^n ∈ B(m2) ∩ Tε(n)(X2 | x̂1^n)
▸ If there is none or more than one, declare an error
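The three steps can be exercised on a toy 2-DMS. In the brute-force Python sketch below (all parameters illustrative), X1 ~ Bern(1/2) and X2 = X1 ⊕ Z with Z ~ Bern(p); since essentially every x1^n is typical for a uniform source, encoder 1 sends x1^n at rate 1 = H(X1), and encoder 2 bins at a rate slightly above H(X2|X1) = H(p). At this tiny blocklength the empirical error rate is still noticeably above zero.

```python
import itertools
import random
from collections import defaultdict

p, n, R2, eps = 0.2, 14, 0.9, 0.75   # illustrative parameters
random.seed(2)
num_bins = 2 ** round(n * R2)

# Encoder 2's codebook: a uniformly random bin index for every x2^n.
bin2 = {x: random.randrange(num_bins)
        for x in itertools.product((0, 1), repeat=n)}
members = defaultdict(list)          # bin index -> x2^n sequences in that bin
for x, b in bin2.items():
    members[b].append(x)

def cond_typical(x2, x1):
    # (x1, x2) jointly typical here iff the disagreement rate is close to p.
    d = sum(a != b for a, b in zip(x1, x2)) / n
    return abs(d - p) <= eps * p

def decode(x1, m2):
    # Successive decoding: x1^n is already recovered; search bin m2 for the
    # unique x2^n that is conditionally typical given x1^n.
    cands = [x2 for x2 in members[m2] if cond_typical(x2, x1)]
    return cands[0] if len(cands) == 1 else None

trials, ok = 200, 0
for _ in range(trials):
    x1 = tuple(random.randrange(2) for _ in range(n))
    x2 = tuple(a ^ (random.random() < p) for a in x1)
    ok += decode(x1, bin2[x2]) == x2
print(f"empirical success rate: {ok / trials:.2f}")
```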

xn
  m nR
xn
B()

B(m ) (̂xn , x̂n )

B(nR )
 / 

Analysis of the probability of error

∙ Let M and M denote the random bin indices for Xn and Xn
∙ Error events:
E = 󶁁(Xn , Xn ) ∉ Tє(n) 󶁑,
E = 󶁁̃xn ∈ B(M ) for some x̃ n ̸= Xn , (Xn , x̃ n ) ∈ Tє(n) 󶁑

Then,

P(E) ≤ P(E ) + P(E )

∙ P(E ) →  by LLN

 / 
Analysis of the probability of error

∙ By the symmetry of the random bin assignments,

P(E2) = P(E2 | X2^n ∈ B(1))
  = Σ_{(x1^n, x2^n)} P{(X1^n, X2^n) = (x1^n, x2^n) | X2^n ∈ B(1)}
      · P{x̃2^n ∈ B(1) for some x̃2^n ≠ x2^n, (x1^n, x̃2^n) ∈ Tε(n) | x2^n ∈ B(1), (X1^n, X2^n) = (x1^n, x2^n)}
  ≤ Σ_{(x1^n, x2^n)} p(x1^n, x2^n) Σ_{x̃2^n ∈ Tε(n)(X2 | x1^n), x̃2^n ≠ x2^n} P{x̃2^n ∈ B(1)}
  ≤ 2^{n(H(X2|X1)+δ(ε))} · 2^{−nR2}

∙ Hence, P(E2) → 0 as n → ∞ if R2 > H(X2|X1) + δ(ε)


∙ Remark: Achievability can also be proved without time sharing (see NIT Ch. 10)

 / 

Extension to more than two sources

Theorem .
The optimal rate region R ∗ (X , . . . , Xk ) for the k-DMS (X , . . . , Xk ) is the set of
(R , . . . , Rk ) such that

󵠈 Rj ≥ H(X(S)|X(S c )) for all S ⊆ [ : k]


j∈S

∙ For k = , R ∗ (X , X , X ) is the set of (R , R , R ) such that


R ≥ H(X |X , X ),
R ≥ H(X |X , X ),
R ≥ H(X |X , X ),
R + R ≥ H(X , X |X ),
R + R ≥ H(X , X |X ),
R + R ≥ H(X , X |X ),
R + R + R ≥ H(X , X , X )

 / 
Summary

∙ k-Component discrete memoryless source (k-DMS)


∙ Distributed lossless source coding for a k-DMS:
▸ Slepian–Wolf optimal rate region
▸ Random binning

∙ Time sharing

 / 

References
Cover, T. M. (). A proof of the data compression theorem of Slepian and Wolf for ergodic sources.
IEEE Trans. Inf. Theory, (), –.
Slepian, D. and Wolf, J. K. (). Noiseless coding of correlated information sources. IEEE Trans. Inf.
Theory, (), –.

 / 
