Lecture 6
Basak Guler
Differential Entropy

• So far we have only worked with discrete random variables.

• For a continuous random variable X with probability density function (PDF) f(x), the differential entropy is defined as:

  h(X) = -\int_S f(x) \log f(x) \, dx

where S is the set of all x for which f(x) > 0 (support set of X).
Example 26

• Example 26 (Uniform Random Variable). Let X ∼ U[a, b] be a uniform random variable with PDF

  f(x) = \begin{cases} \frac{1}{b-a} & \text{for } x \in [a, b] \\ 0 & \text{otherwise} \end{cases}

Find its differential entropy.

• Solution.

  h(X) = -\int_S f(x) \log f(x) \, dx = -\int_a^b \frac{1}{b-a} \log \frac{1}{b-a} \, dx = \log(b-a)
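• A quick numerical sanity check (not from the lecture), assuming base-2 logarithms and an arbitrary interval [a, b] = [2, 10]:

```python
# Estimate h(X) = -∫ f(x) log2 f(x) dx for X ~ U[a, b] by a Riemann sum
# and compare it to the closed-form answer log2(b - a) from Example 26.
import numpy as np

a, b = 2.0, 10.0                                # example interval, chosen arbitrarily
x = np.linspace(a, b, 1_000_001)
f = np.full_like(x, 1.0 / (b - a))              # uniform PDF on [a, b]
h_numeric = -np.sum(f * np.log2(f)) * (x[1] - x[0])
h_closed = np.log2(b - a)

print(h_numeric, h_closed)                      # both ≈ 3.0 bits since b - a = 8
```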
Example 27

• Example 27 (Gaussian Random Variable). Let X ∼ N(0, σ²) be a Gaussian random variable with mean 0 and variance σ². The corresponding PDF is:

  f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{x^2}{2\sigma^2}}

Find its differential entropy.
Example 27

• Solution.

  h(X) = -\int_S f(x) \log f(x) \, dx
       = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{x^2}{2\sigma^2}} \log\left( \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{x^2}{2\sigma^2}} \right) dx
       = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{x^2}{2\sigma^2}} \left( \log \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{x^2}{2\sigma^2} \log e \right) dx
       = -\log \frac{1}{\sqrt{2\pi\sigma^2}} \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{x^2}{2\sigma^2}} \, dx}_{=1} + \frac{\log e}{2\sigma^2} \underbrace{\int_{-\infty}^{\infty} x^2 \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{x^2}{2\sigma^2}} \, dx}_{=E[X^2]=\sigma^2+0^2=\sigma^2}
       = \frac{1}{2} \log(2\pi\sigma^2) + \frac{1}{2} \log e
       = \frac{1}{2} \log(2\pi e \sigma^2) \text{ bits}
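• A Monte Carlo sanity check of Example 27 (not from the lecture), assuming base-2 logarithms and an arbitrary variance σ² = 4:

```python
# Estimate h(X) = E[-log2 f(X)] for X ~ N(0, sigma^2) by sampling and compare
# it to the closed form (1/2) log2(2*pi*e*sigma^2) derived on the slide.
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 4.0                                   # example variance, chosen arbitrarily
x = rng.normal(0.0, np.sqrt(sigma2), size=1_000_000)

# log2 f(x) for the Gaussian PDF, written out explicitly
log2_f = -0.5 * np.log2(2 * np.pi * sigma2) - (x**2 / (2 * sigma2)) * np.log2(np.e)
h_mc = -np.mean(log2_f)                        # Monte Carlo estimate of E[-log2 f(X)]
h_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma2)

print(h_mc, h_closed)                          # both ≈ 3.05 bits for sigma^2 = 4
```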
Other properties of Differential Entropy

• Relative entropy (KL-distance) between two PDFs f(x) and g(x) is:

  D(f||g) = \int f(x) \log \frac{f(x)}{g(x)} \, dx \ge 0

• Joint and conditional differential entropy are defined similarly to the discrete case:

  h(X, Y) = -\int f(x, y) \log f(x, y) \, dx \, dy

  h(X|Y) = -\int f(x, y) \log f(x|y) \, dx \, dy = h(X, Y) - h(Y)

where f(x, y) is the joint PDF and f(x|y) is the conditional PDF.
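• An illustrative Monte Carlo sketch of D(f||g) ≥ 0 (not from the lecture), assuming SciPy is available; the densities N(0, 1) and N(1, 2) are arbitrary choices:

```python
# Estimate D(f || g) = E_f[log2(f(X)/g(X))] for two Gaussian PDFs by sampling
# from f, and confirm the estimate is non-negative (≈ 0.5 bits for this pair).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=1_000_000)             # samples from f = N(0, 1)
log_f = norm.logpdf(x, loc=0.0, scale=1.0)
log_g = norm.logpdf(x, loc=1.0, scale=np.sqrt(2.0))  # g = N(1, 2)
d_fg = np.mean(log_f - log_g) / np.log(2)            # convert nats to bits

print(d_fg)                                          # ≈ 0.5 bits, consistent with D(f||g) >= 0
```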
Other properties of Differential Entropy

• We also have the following properties, for any constant c:

  h(X + c) = h(X)   (translation does not change differential entropy)

  h(cX) = h(X) + \log|c|   (scaling adds \log|c|)
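• A quick numerical check of the scaling property (not from the lecture), assuming base-2 logarithms and an arbitrary choice X ∼ U[0, 1], c = 3:

```python
# For X ~ U[0, 1] the closed forms are h(X) = log2(1) = 0 bits and, since
# cX ~ U[0, c], h(cX) = log2(c). Verify that h(cX) = h(X) + log2|c|.
import numpy as np

c = 3.0
h_X = np.log2(1.0 - 0.0)            # differential entropy of U[0, 1]
h_cX = np.log2(c * 1.0 - c * 0.0)   # differential entropy of U[0, c]

print(h_cX, h_X + np.log2(abs(c)))  # both equal log2(3) ≈ 1.585 bits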
Gaussian Distribution Maximizes Differential Entropy

• Recall that (discrete) entropy was maximized by a uniform distribution.

• For a continuous random variable X with second-moment constraint E[X²] ≤ σ², differential entropy is maximized by the Gaussian distribution N(0, σ²):

  h(X) \le \frac{1}{2} \log(2\pi e \sigma^2)

with equality if and only if X ∼ N(0, σ²).
Gaussian Distribution Maximizes Differential Entropy

• Proof. Let g(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{x^2}{2\sigma^2}} be the Gaussian PDF for N(0, σ²).

• Then, \log g(x) = -\log \sqrt{2\pi\sigma^2} - \frac{x^2}{2\sigma^2} \log e and therefore, for any PDF f(x) with E_f[X^2] \le \sigma^2,

  0 \le D(f||g) = \int f(x) \log \frac{f(x)}{g(x)} \, dx
                = -h(X) - \int f(x) \log g(x) \, dx
                = -h(X) + \log \sqrt{2\pi\sigma^2} + \frac{\log e}{2\sigma^2} E_f[X^2]
                \le -h(X) + \frac{1}{2} \log(2\pi e \sigma^2)

  so h(X) \le \frac{1}{2} \log(2\pi e \sigma^2), with equality only when E_f[X^2] = \sigma^2 and D(f||g) = 0,

which occurs if and only if f(x) = g(x), which means f(x) should be a Gaussian distribution N(0, σ²).
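• An illustrative comparison (not from the lecture): among distributions with the same variance, the Gaussian has the larger differential entropy. Here we compare it to a variance-matched uniform distribution using the closed forms derived above:

```python
# Compare h(N(0, sigma^2)) = (1/2) log2(2*pi*e*sigma^2) against the entropy of a
# uniform U[-w/2, w/2] with the same variance (w^2/12 = sigma^2, so w = sqrt(12*sigma^2)).
import numpy as np

sigma2 = 1.0
h_gauss = 0.5 * np.log2(2 * np.pi * np.e * sigma2)

w = np.sqrt(12 * sigma2)        # width of the variance-matched uniform
h_unif = np.log2(w)             # differential entropy of U[-w/2, w/2]

print(h_gauss, h_unif)          # ≈ 2.047 vs ≈ 1.792 bits: the Gaussian is larger
```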
The Gaussian Channel

• Definition 39 (Gaussian Channel). The Gaussian channel is defined as:

  Y_i = X_i + Z_i

where X_i is the channel input (at time i), Y_i is the channel output, and Z_i ∼ N(0, N) is the noise, drawn i.i.d. from a Gaussian distribution. Z_i is independent of X_i.

[Figure: the channel input X_i is added to the noise Z_i to produce the channel output Y_i.]
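• A minimal simulation sketch of the channel in Definition 39 (not from the lecture); the ±√P input symbols are an arbitrary choice just to have a concrete input:

```python
# Simulate a few uses of the Gaussian channel Y_i = X_i + Z_i with i.i.d.
# noise Z_i ~ N(0, N) independent of the input X_i.
import numpy as np

rng = np.random.default_rng(0)
n = 10          # number of channel uses
P = 1.0         # input power
N = 0.25        # noise variance

X = np.sqrt(P) * rng.choice([-1.0, 1.0], size=n)   # channel input (illustrative)
Z = rng.normal(0.0, np.sqrt(N), size=n)            # i.i.d. noise Z_i ~ N(0, N)
Y = X + Z                                          # channel output

print(np.column_stack([X, Z, Y]))
```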
The Gaussian Channel

Intuition behind “Gaussian”:

• Channel noise is typically the aggregate of many small, independent random effects (e.g., thermal motion of electrons, interference from many sources).

• The central limit theorem says that the aggregate effect of a large number of such independent random effects will have an approximately Gaussian distribution.
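• A quick illustration of this central-limit intuition (not from the lecture), summing many small uniform "effects":

```python
# The sum of many small independent uniform effects behaves like Gaussian noise:
# its standardized skewness and kurtosis are close to the Gaussian values 0 and 3.
import numpy as np

rng = np.random.default_rng(0)
effects = rng.uniform(-0.5, 0.5, size=(100_000, 200))  # 200 small independent effects
noise = effects.sum(axis=1)                            # aggregate effect per sample

z = (noise - noise.mean()) / noise.std()
print(np.mean(z**3), np.mean(z**4))                    # ≈ 0 and ≈ 3, as for a Gaussian
```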
Gaussian Channel Capacity

• Recall the channel capacity for the discrete alphabet case:

  C = \max_{p(x)} I(X; Y)

• For the Gaussian channel, the input is subject to an average power constraint E[X^2] \le P, and the capacity is defined as:

  C = \max_{p(x): E[X^2] \le P} I(X; Y)

[Figure: the channel input X is added to the noise Z to produce the channel output Y.]
Gaussian Channel Capacity

• Theorem 37 (Gaussian Channel Capacity). The capacity of the Gaussian channel with average power constraint P and noise variance N is equal to:

  C = \frac{1}{2} \log\left(1 + \frac{P}{N}\right)    (5)
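• A small numerical illustration of Theorem 37 (not from the lecture), evaluating (5) at a few arbitrary signal-to-noise ratios:

```python
# Evaluate the Gaussian channel capacity C = (1/2) log2(1 + P/N) in bits per channel use.
import numpy as np

def gaussian_capacity(P, N):
    """Capacity of the Gaussian channel with power P and noise variance N."""
    return 0.5 * np.log2(1.0 + P / N)

for P, N in [(1.0, 1.0), (10.0, 1.0), (100.0, 1.0)]:
    print(P / N, gaussian_capacity(P, N))   # 0.5, ≈1.73, ≈3.33 bits per channel use
```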
Gaussian Channel Capacity

• Proof. Let Y = X + Z where X and Z are independent from each other, X ∼ p(x) and Z ∼ N(0, N).

• Then,

  I(X; Y) = h(Y) - h(Y|X)
          = h(Y) - h(X + Z|X)
          = h(Y) - h(Z|X)
          = h(Y) - h(Z)

where the last step follows since Z is independent of X.
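• An illustrative Monte Carlo sketch of this step (not from the lecture), assuming SciPy and a Gaussian input X ∼ N(0, P), which is the capacity-achieving choice:

```python
# Estimate I(X;Y) = h(Y) - h(Z) for Y = X + Z with X ~ N(0, P), Z ~ N(0, N),
# and compare against the capacity formula (1/2) log2(1 + P/N).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
P, N, n = 4.0, 1.0, 1_000_000

X = rng.normal(0.0, np.sqrt(P), size=n)
Z = rng.normal(0.0, np.sqrt(N), size=n)
Y = X + Z                                  # Y ~ N(0, P + N) for a Gaussian input

# h(Y) and h(Z) estimated as sample averages of -log2 of the true densities.
h_Y = -np.mean(norm.logpdf(Y, scale=np.sqrt(P + N))) / np.log(2)
h_Z = -np.mean(norm.logpdf(Z, scale=np.sqrt(N))) / np.log(2)

print(h_Y - h_Z, 0.5 * np.log2(1 + P / N))   # both ≈ 1.16 bits
```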
Gaussian Channel Capacity

• Now, note that the second moment of Y is:

  E[Y^2] = E[(X + Z)^2] = E[X^2] + 2E[X]E[Z] + E[Z^2] \le P + N

since X and Z are independent, E[Z] = 0, E[X^2] \le P, and E[Z^2] = N.

• Since the Gaussian distribution maximizes differential entropy for a given second moment, h(Y) \le \frac{1}{2} \log(2\pi e (P + N)), and therefore

  I(X; Y) = h(Y) - h(Z) \le \frac{1}{2} \log(2\pi e (P + N)) - \frac{1}{2} \log(2\pi e N) = \frac{1}{2} \log\left(1 + \frac{P}{N}\right)

with equality when X ∼ N(0, P), which gives the capacity in (5).
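• A quick simulation check of the second-moment step (not from the lecture), using a zero-mean Gaussian input with E[X²] = P as one admissible choice:

```python
# For Y = X + Z with independent zero-mean X (E[X^2] = P) and Z ~ N(0, N),
# verify that E[Y^2] ≈ P + N.
import numpy as np

rng = np.random.default_rng(0)
P, N, n = 4.0, 1.0, 1_000_000

X = rng.normal(0.0, np.sqrt(P), size=n)   # any zero-mean input with E[X^2] = P works
Z = rng.normal(0.0, np.sqrt(N), size=n)
Y = X + Z

print(np.mean(Y**2), P + N)               # both ≈ 5.0
```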