10 Gaussian Channel

10.1 Gaussian Channel
In communication theory it is often assumed that the transmitted signals are distorted by some noise. The most common assumption is additive Gaussian noise, i.e. the so-called Additive White Gaussian Noise (AWGN) channel. Even though the noise in reality is more complex, this model works well for modelling for example background noise or amplifier noise, and it can be complemented by e.g. impulse noise or other typical noise models when needed. In this chapter we will have a closer look at AWGN channels and see how the previous theory applies here. We will derive a fundamental limit on the signal to noise ratio (SNR), specifying when it is not possible to achieve reliable communication.
In the Gaussian channel the received symbol is modelled as
\[ Y = X + Z \]
where X is the information-carrying component and Z the noise component. The average power allocated to the variable X is defined as the second moment,
\[ P = E[X^2] \]
[Figure: the discrete-time Gaussian channel, Yi = Xi + Zi.]

In the model the input is subject to the power constraint
\[ E[X^2] \leq P \]
Without the power constraint in the definition we would be able to choose as many signal alternatives, placed as far apart, as we like. Then we would be able to transmit as much information as we like in a single channel use. With the power constraint we get a more realistic system, where we need to find other means than increasing the power to get a higher information throughput over the channel.1 To see how much information it is possible to transmit over the channel, we again maximize the mutual information between the transmitted variable X and the received variable Y, with the side condition that the power is limited by P.
\[ C = \max_{f(x):\, E[X^2] \leq P} I(X;Y) \]
Since Z is independent of X, the mutual information can be written as I(X;Y) = H(Y) − H(Y|X) = H(Y) − H(Z). With the statistics of the noise known to be a normal distribution with zero mean and variance N, we get
\[ H(Z) = \frac{1}{2}\log(2\pi e N) \]
We also know from the previous chapter that for a given mean and variance, the Gaussian
distribution maximizes the entropy. So, maximizing H(Y ) over all distributions of X
gives
\[ \max_{f(x):\, E[X^2] \leq P} H(Y) \leq \frac{1}{2}\log\big(2\pi e \sigma^2\big) \]
where we get equality for Y ∈ N(0, σ). Since Y = X + Z, we can use that the sum of two independent Gaussian variables is again Gaussian. Then, by letting X ∈ N(0, √P), we get the desired distribution on Y, where σ² = P + N. Hence, the information capacity is given by
\[ C = \max_{f(x):\, E[X^2] \leq P} I(X;Y) = \frac{1}{2}\log\big(2\pi e (P+N)\big) - \frac{1}{2}\log\big(2\pi e N\big) = \frac{1}{2}\log\Big(1 + \frac{P}{N}\Big) \]
Theorem 44 The mutual information for a Gaussian channel is maximized for X ∼ N(0, √P), as
\[ C = \max_{f(x):\, E[X^2] \leq P} I(X;Y) = \frac{1}{2}\log\Big(1 + \frac{P}{N}\Big) \]
Similar to the discrete case, we want to consider which code rates can give a code where the error probability goes to zero as the length of the codewords tends to infinity. To start with, we define the terminology of achievable code rate as below.

Definition 32 A code rate R is achievable if there exists a (2^{nR}, n) code that satisfies the power constraint, such that the probability of error Pe tends to zero as n → ∞. The channel capacity is the supremum of all achievable rates.
In a similar way as for the discrete case we can formulate the channel coding theorem.2

2 In this text the proof for the channel coding theorem for the Gaussian channel is omitted.
Theorem 45 The channel capacity of a Gaussian channel with power constraint P and noise variance N is
\[ C = \frac{1}{2}\log\Big(1 + \frac{P}{N}\Big) \]
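As a small numerical illustration of Theorem 45 (the code below is a sketch of my own, not from the text; the function name and the chosen values of P and N are arbitrary), the capacity can be evaluated directly from the formula:

```python
import math

def gaussian_capacity(P: float, N: float) -> float:
    """Capacity in bits per channel use of a Gaussian channel with
    power constraint P and noise variance N (Theorem 45)."""
    return 0.5 * math.log2(1 + P / N)

# Hypothetical values: P = 10, N = 1, i.e. SNR = 10 (10 dB).
P, N = 10.0, 1.0
print(f"SNR = {10 * math.log10(P / N):.1f} dB")
print(f"C   = {gaussian_capacity(P, N):.3f} bit per channel use")
```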
The terminology signal to noise ratio, SNR, is often used for the ratio between the signal power and the noise power. In this case the signal power is P, while the noise has the power E[Z²] = N. Hence, in this case SNR = P/N. Depending on the topic and the type of system considered, there are many different ways to define the SNR, and it is often important to be aware of the specific definition used in a text.
10.2 Parallel Gaussian Channels

In some cases there can be several independent parallel Gaussian channels used by the same communication system. In Figure 10.1 there are n such parallel channels. Each of the channels has a power constraint Pi = E[Xi²] and a noise variance Ni. The total power is P = Σi Pi.
[Figure 10.1: n parallel Gaussian channels, Yi = Xi + Zi for i = 1, ..., n.]
The total mutual information between the inputs and the outputs can be bounded as
\[ I(X_1,\ldots,X_n; Y_1,\ldots,Y_n) = H(Y_1,\ldots,Y_n) - \sum_i H(Z_i) \overset{(a)}{\leq} \sum_i H(Y_i) - \sum_i H(Z_i) \overset{(b)}{\leq} \sum_i \frac{1}{2}\log\Big(1 + \frac{P_i}{N_i}\Big) \]
where we have equality in (a) if the variables Xi are independent and in (b) if they are Gaussian. Since the variance of Xi is E[Xi²] = Pi, we can maximize the mutual information for a given set of Pi by using independent Gaussian variables Xi ∈ N(0, √Pi). To get the capacity
we maximize the above expression with respect to Pi. With the additional constraint Σi Pi = P we can use a Lagrange multiplier to achieve this. The maximization function is then given by
\[ J = \sum_{i=1}^{n} \frac{1}{2}\log\Big(1 + \frac{P_i}{N_i}\Big) + \lambda\Big(\sum_{i=1}^{n} P_i - P\Big) \]
Setting the derivative with respect to Pi equal to zero and solving for Pi gives Pi = −1/(2λ ln 2) − Ni = B − Ni, where the constant first term has been assigned to B. However, this does not take into account the constraint from reality, saying that Pi ≥ 0. From the Kuhn-Tucker conditions3 we can rewrite the first set of equations, while keeping the optimality, as
\[ P_i = \big(B - N_i\big)^+ \]
where
\[ (x)^+ = \begin{cases} x, & x \geq 0 \\ 0, & x < 0 \end{cases} \]
The above modification means that some channels may have too much noise and should not be used at all. We summarize the derivations as a theorem.
Theorem 46 The capacity of n parallel Gaussian channels with noise variances Ni and total power constraint P is
\[ C = \sum_{i=1}^{n} \frac{1}{2}\log\Big(1 + \frac{P_i}{N_i}\Big) \]
where
\[ P_i = \big(B - N_i\big)^+, \qquad (x)^+ = \begin{cases} x, & x \geq 0 \\ 0, & x < 0 \end{cases} \]
and B is chosen such that Σi Pi = P.
3 The Kuhn-Tucker method can be seen as a generalization of the Lagrange multiplier method, often used in non-linear optimization. It is also known under the name Karush-Kuhn-Tucker.
This method is often referred to as water filling, which is illustrated in the next example.
Example 60 Assume we have a system with four independent Gaussian channels with noise variances N1 = 2, N2 = 4, N3 = 6 and N4 = 3. The total power used in transmission is restricted to P = 6. The condition Pi = B − Ni is equivalent to B = Pi + Ni. Summing over the four channels gives
\[ 4B = P_1 + N_1 + P_2 + N_2 + P_3 + N_3 + P_4 + N_4 = \underbrace{P_1 + P_2 + P_3 + P_4}_{P = 6} + \underbrace{N_1 + N_2 + N_3 + N_4}_{15} = 21 \]
This gives B = 21/4. Using this result would require P3 = 21/4 − 6 = −3/4, which is not possible. The conclusion is that the third sub-channel has too much noise and cannot be used if optimizing according to the algorithm.
In a second attempt to find an optimal distribution of the available power we turn off sub-channel 3 and use the other three. Similarly as above we get
\[ 3B = \underbrace{P_1 + P_2 + P_4}_{P = 6} + \underbrace{N_1 + N_2 + N_4}_{9} = 15 \]
and B = 15/3 = 5. Hence, we get the power distribution
\[ P_1 = 3, \quad P_2 = 1, \quad P_3 = 0, \quad P_4 = 2 \]
as illustrated in the water-filling diagram below, where the power fills the sub-channels up to the common water level B = 5, while sub-channel 3, with N3 = 6 above the water level, is left unused.

[Figure: water filling for Example 60 with water level B = 5 over the noise levels N1 = 2, N2 = 4, N3 = 6, N4 = 3.]
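The procedure in Example 60 is easy to mechanize: compute B from the currently active sub-channels and drop any sub-channel whose power would become negative, until all remaining powers are non-negative. The sketch below is my own (names and structure are not from the text); it reproduces the numbers of the example and also evaluates the resulting capacity from Theorem 46.

```python
import math

def water_filling(noise, P_total):
    """Distribute P_total over sub-channels with noise levels `noise`
    according to the water-filling rule Pi = (B - Ni)^+."""
    active = list(range(len(noise)))            # sub-channels still in use
    while True:
        # Water level from the condition sum over active channels of (B - Ni) = P_total
        B = (P_total + sum(noise[i] for i in active)) / len(active)
        drop = [i for i in active if B < noise[i]]
        if not drop:                            # all active powers are non-negative
            break
        active = [i for i in active if i not in drop]
    powers = [B - n if i in active else 0.0 for i, n in enumerate(noise)]
    return powers, B

# Example 60: N = (2, 4, 6, 3) and P = 6 give P = (3, 1, 0, 2) with B = 5.
powers, B = water_filling([2, 4, 6, 3], 6)
C = sum(0.5 * math.log2(1 + p / n) for p, n in zip(powers, [2, 4, 6, 3]))
print(powers, B, round(C, 3))
```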
10.3 Band-Limited Gaussian Channel

To start with, a band limited signal is a signal whose frequency content is limited to a bandwidth W. For example, speech is normally located within the frequency band 0–4 kHz. By modulating the signal it can be shifted up in frequency and located in a higher band.
Still, it occupies a bandwidth of 4 kHz. In this way it is possible to allocate several bands of 4 kHz after each other, and in principle we can pack one voice signal in every 4 kHz of the frequency band.
To transmit e.g. a voice signal we can use analogue technology and transmit it as it is, but it is easier to process the signal if it is sampled and converted to digital data. Then it is possible to use a suitable source coding algorithm to reduce the redundancy. Following this there should also be a channel code to protect the information from channel errors. In this way it is possible to achieve much better quality at a lower transmission cost (i.e. bandwidth). Sampling the signal means taking values from the continuous signal at periodic time instants. The sampling frequency Fs means that there are Fs samples each second. If the continuous time signal x(t) is sampled with frequency Fs, the sample values are given by
\[ x_n = x\Big(\frac{n}{F_s}\Big) \]
For a band limited signal with a bandwidth of W, the sampling theorem states that Fs ≥ 2W is required to be able to reconstruct the original signal. So, a voice signal that is band limited to W = 4 kHz should be sampled with at least Fs = 8 kHz.

The next theorem is the celebrated sampling theorem, introduced by Harry Nyquist in 1928 [19] and further improved by Shannon in [22]. Actually, Nyquist studied the number of pulses that can be transmitted over a certain bandwidth, which can be seen as the dual of the sampling theorem as we know it today and as it is given next.
Theorem 47 Let x(t) be a band limited signal, fmax ≤ W. If the signal is sampled with Fs = 2W samples per second to form the sequence x(n/2W), it can be reconstructed as
\[ x(t) = \sum_{n=-\infty}^{\infty} x\Big(\frac{n}{2W}\Big)\,\mathrm{sinc}\Big(t - \frac{n}{2W}\Big) \]
where
\[ \mathrm{sinc}(t) = \frac{\sin(2\pi W t)}{2\pi W t} \]
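As a quick numerical check of Theorem 47 (a sketch of my own, with an arbitrarily chosen test signal and band limit), a signal can be sampled at Fs = 2W and rebuilt from the interpolation formula. Note that numpy's np.sinc(u) is sin(πu)/(πu), so the sinc(t) defined in the theorem corresponds to np.sinc(2*W*t).

```python
import numpy as np

W = 4.0                         # assumed band limit [Hz]
Fs = 2 * W                      # Nyquist rate
n = np.arange(-200, 201)        # finite sample window (approximation)

def x(t):
    # Test signal with all frequency content below W = 4 Hz
    return np.sin(2 * np.pi * 1.3 * t) + 0.5 * np.cos(2 * np.pi * 3.1 * t)

samples = x(n / Fs)             # x(n / 2W)

def reconstruct(t):
    # x(t) = sum_n x(n/2W) sinc(t - n/2W), with sinc(t) = sin(2*pi*W*t)/(2*pi*W*t)
    return np.sum(samples * np.sinc(2 * W * t - n))

t_test = np.linspace(-1.0, 1.0, 11)
err = max(abs(reconstruct(t) - x(t)) for t in t_test)
print(f"max reconstruction error on the test points: {err:.1e}")
```

The error is small but not exactly zero, since the infinite sum in the theorem is truncated to a finite sample window.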
We are now ready to define a channel model for band limited signals. Assume we have
a signal with highest frequency content fmax = W , giving the required bandwidth W .
Then, sampling the signal at the Nyquist rate we have the sampling frequency Fs = 2W .
The sampling time, i.e. the time between two samples, is
\[ T_s = \frac{1}{F_s} = \frac{1}{2W} \]
Sampling the signal x(t) gives the sampled sequence
\[ x_n = x(nT_s) = x\Big(\frac{n}{2W}\Big) \]
Definition 33 A band limited Gaussian channel consists of a band limited input signal x(t),
where fmax = W , additive white Gaussian noise ζ(t), and an ideal low-pass filter, as in the
following figure.
[Figure: the band limited Gaussian channel. The noise ζ(t) is added to the input x(t) and the sum passes through the ideal low-pass filter H(f).]
Since the signal x(t) is band limited to the bandwidth W, it passes the ideal filter without changes. That the noise is white means that its power spectral density occupies all frequencies with a constant value. This value is normally set to
\[ R_\zeta(f) = \frac{N_0}{2}, \qquad f \in \mathbb{R} \]
After the filtering we get the noise z(t) = ζ(t) ∗ h(t), which is also band limited, with power spectral density
\[ R_z(f) = \begin{cases} \dfrac{N_0}{2}, & -W \leq f \leq W \\ 0, & \text{otherwise} \end{cases} \]
The corresponding autocorrelation function is the inverse Fourier transform of the power spectral density function. In this case the noise autocorrelation function is
\[ r_z(\tau) = \frac{N_0}{2}\,\mathrm{sinc}(\tau) \]
To get a time discrete sequence the received signal is sampled at the Nyquist rate, Fs = 2W. Then the autocorrelation sequence for the noise becomes
\[ r_z\Big(\frac{n}{2W}\Big) = \begin{cases} \dfrac{N_0}{2}, & n = 0 \\ 0, & \text{otherwise} \end{cases} \]
This implies that the resulting sampled noise is normally distributed with zero mean and variance N0/2,
\[ z_n \in N\Big(0, \tfrac{N_0}{2}\Big) \]
Hence, we can use the previous theory for the Gaussian channel. For each sample transmitted the capacity is
\[ C = \frac{1}{2}\log\Big(1 + \frac{2\sigma_x^2}{N_0}\Big) \ \text{bit/sample} \]
The power of the transmitted signal is constrained to P. That means each transmitted sample has the energy σx² = P/(2W), which gives the capacity per sample
\[ C = \frac{1}{2}\log\Big(1 + \frac{P}{N_0 W}\Big) \ \text{bit/sample} \]
With Fs = 2W samples every second, the achievable bit rate becomes
\[ C = W\log\Big(1 + \frac{P}{N_0 W}\Big) \ \text{bit/second} \]
We formulate this result as a theorem.

Theorem 48 Let x(t) be a band limited signal, fmax ≤ W, and z(t) additive white Gaussian noise with power spectral density Rz(f) = N0/2, |f| ≤ W. The channel capacity, under the power constraint P, is
\[ C = W\log\Big(1 + \frac{P}{N_0 W}\Big) \ \text{bit/second} \]
Example 61 Today the dominating technology for fixed Internet access is ADSL (Asymmetric Digital Subscriber Line), which can give bit rates up to 26 Mb/s. But there is also a generation change ongoing, and the Internet providers are upgrading to VDSL (Very high speed DSL) equipment, which will enable bit rates up to the order of 150 Mb/s. The advantage of DSL technology is that it reuses the old telephone lines to access the households, so it is a relatively cheap technology to roll out. Compared with optical access networks (fibre to the home, FttH), where a new infrastructure of optical fibres must be dug to each house, this is an economically feasible technology.
In both ADSL and VDSL the speech signals are in the band 0 − 4 kHz and the data signals
are positioned from 25 kHz up to 2.2 MHz for ADSL and 17 MHz for VDSL (depending
on which band-plan is used). To do capacity calculations on the VDSL band we neglect
the speech band and assume that the available bandwidth is W = 17 MHz. The signalling
level is set by the regulators (standardized by ITU-T) to −60 dBm/Hz.4
4 Often the unit dBm/Hz is used for PSD. This means the power level expressed in mW, normalized with the bandwidth and expressed in dB, i.e. P_dBm/Hz = 10 log10(P_mW/Hz).
The absolute maximum of what is possible to transmit is found when the noise is as low as possible. The thermal noise, or Johnson-Nyquist noise, is the noise generated in electrical circuits. The thermal noise is typically white and at room temperature about −174 dBm/Hz. We can now calculate the power and the noise spectral density as
\[ P = 10^{-60/10}\cdot W \ \ \text{[mW]} \]
\[ N_0 = 10^{-174/10} \ \ \text{[mW/Hz]} \]
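Carrying the calculation through to a number (this completion is mine, it is not stated in the text): the SNR becomes 10^11.4, i.e. 114 dB, and the capacity over the 17 MHz band comes out at roughly 640 Mbit/s, far above the bit rates of deployed systems.

```python
import math

W = 17e6                              # assumed VDSL bandwidth [Hz]
psd_signal_dBm = -60.0                # transmit PSD [dBm/Hz]
psd_noise_dBm = -174.0                # thermal noise PSD [dBm/Hz]

P = 10 ** (psd_signal_dBm / 10) * W   # total transmit power [mW]
N0 = 10 ** (psd_noise_dBm / 10)       # noise PSD [mW/Hz]

snr = P / (N0 * W)                    # = 10**11.4
C = W * math.log2(1 + snr)            # bit/second

print(f"SNR = {10 * math.log10(snr):.1f} dB")   # 114.0 dB
print(f"C   = {C / 1e6:.0f} Mbit/s")            # roughly 640 Mbit/s
```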
In many communication systems of today a wide bandwidth is used to get high bit rates. The channel is typically not constant over the entire band; there are variations in both noise level and signal attenuation. One popular method to signal over such channels is OFDM (Orthogonal Frequency Division Multiplexing) modulation, used in e.g. WLAN (802.11), xDSL, DVB-T (digital TV) and the downlink of LTE (Long Term Evolution). The bandwidth is then divided into several sub-bands that can be used independently of each other. This gives a case very similar to the parallel Gaussian channels, which led to the water filling algorithm. To get a similar result we use the same approach with a Lagrange multiplier and set up the optimization function for n sub-channels with bandwidth W∆ as
\[ J = \sum_i C_i + \lambda\Big(\sum_i P_i - P\Big) = \sum_i W_\Delta \log\Big(1 + \frac{P_i}{N_{0,i} W_\Delta}\Big) + \lambda\Big(\sum_i P_i - P\Big) \]
Setting its derivative with respect to Pi equal to zero and solving for Pi we get
\[ P_i = -\frac{W_\Delta}{\lambda \ln 2} - N_{0,i} W_\Delta \]
The first term is constant and can be assigned as
\[ B = -\frac{W_\Delta}{\lambda \ln 2} \]
As in the previous case we cannot have negative powers, and we need to use the Kuhn-Tucker argument to achieve the water-filling algorithm,
\[ \begin{cases} P_i = \big(B - N_{0,i} W_\Delta\big)^+ \\ \sum_i P_i = P \end{cases} \qquad (10.1) \]
In many situations the signal is attenuated during the transmission. This is normally modelled by a filter g(t) affecting the signal x(t), and the received signal is instead
\[ y(t) = (g * x)(t) + z(t) \]
where G(f) is the frequency response of the filter. After sampling, the received signal power is Prec = P|G|², where |G(f)| = G is assumed to be constant over the considered bandwidth. Then the capacity becomes
\[ C = W\log\Big(1 + \frac{P|G|^2}{N_0 W}\Big) \]
For the OFDM type of channel the water filling argument corresponding to (10.1) becomes
\[ \begin{cases} P_i = \Big(B - \dfrac{N_{0,i} W_\Delta}{|G_i|^2}\Big)^+ \\ \sum_i P_i = P \end{cases} \]
where Gi is assumed to be constant over the sub-channel. The attenuation can, however, vary between sub-channels, and in this case both the noise level and the attenuation can be considered frequency dependent over the total bandwidth.
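A small sketch of how the frequency-selective water filling can be computed (my own code with invented sub-channel values, not taken from the text): the effective noise level N0,i·W∆/|Gi|² takes the place of Ni, and the resulting rate of sub-channel i is W∆ log2(1 + Pi|Gi|²/(N0,i W∆)).

```python
import math

def water_filling(eff_noise, P_total):
    """Water filling over sub-channels with effective noise levels
    n_i = N0_i * W_delta / |G_i|^2, i.e. Pi = (B - n_i)^+ with sum Pi = P_total."""
    active = list(range(len(eff_noise)))
    while True:
        B = (P_total + sum(eff_noise[i] for i in active)) / len(active)
        drop = [i for i in active if B < eff_noise[i]]
        if not drop:
            break
        active = [i for i in active if i not in drop]
    return [B - n if i in active else 0.0 for i, n in enumerate(eff_noise)], B

# Hypothetical example: equal noise PSD, different attenuation per sub-band.
W_delta = 1.0e4                                  # sub-channel bandwidth [Hz]
N0 = 1e-12                                       # noise PSD [W/Hz]
gains = [1.0, 0.5, 0.25, 0.1]                    # |G_i| per sub-channel
eff_noise = [N0 * W_delta / g ** 2 for g in gains]
powers, B = water_filling(eff_noise, 1e-6)       # total power 1 uW
rates = [W_delta * math.log2(1 + p / n) for p, n in zip(powers, eff_noise)]
print([f"{p:.2e}" for p in powers])
print([round(r) for r in rates])                 # bit/s per sub-channel
```

The loop mirrors the procedure in Example 60; for a large number of sub-channels a bisection search on the water level B is a common alternative.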
10.4 Fundamental Shannon Limit

One of the most famous results from information theory is the fundamental limit, or Shannon limit, which sets a requirement on the signal to noise ratio for reliable communication. In this section we derive this limit, first in the general case without restrictions. Reaching the limit requires that the coding rate goes to zero, and we will therefore also consider the case when the coding rate is fixed.
Consider a band limited Gaussian channel with bandwidth W and noise level N0 . If the
transmitted power constraint is P , then the capacity is given by
\[ C = W\log\Big(1 + \frac{P}{N_0 W}\Big) \ \text{bit/second} \]
If we do not have any other constraints we would like to use as much bandwidth as possible. In theory the available bandwidth is infinite, and therefore we let W → ∞ in the formula,
\[ C_\infty = \lim_{W\to\infty} W\log\Big(1 + \frac{P/N_0}{W}\Big) = \lim_{W\to\infty} \log\Big(1 + \frac{P/N_0}{W}\Big)^{W} = \log e^{P/N_0} = \frac{P/N_0}{\ln 2} \]
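The limit is easy to verify numerically (a sketch of my own, with an arbitrary value of P/N0): as W grows, W log2(1 + (P/N0)/W) flattens out at (P/N0)/ln 2.

```python
import math

P_over_N0 = 1e6                       # assumed value of P/N0 [Hz]
C_inf = P_over_N0 / math.log(2)       # limiting capacity [bit/s]

for W in [1e5, 1e6, 1e7, 1e8, 1e9]:
    C = W * math.log2(1 + P_over_N0 / W)
    print(f"W = {W:8.0e} Hz: C = {C:12.0f} bit/s  (C_inf = {C_inf:.0f} bit/s)")
```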
Assigning the achieved bit rate as Rb, it is required that this does not exceed the capacity, Rb < C∞. Further, assume that the signalling time is Ts and that in each signal k information bits are transmitted. Then
\[ P T_s = E_s = E_b k \qquad (10.2) \]
where Es is the average energy per transmitted symbol and Eb is the average energy per information bit. The quantity Eb is very important since it can be compared between different systems, without requiring the same number of bits per symbol or even the same coding rate. A largely system independent signal to noise ratio is therefore SNR = Eb/N0. From (10.2) we can write the energy per bit as
\[ E_b = \frac{P T_s}{k} \]
Considering the ratio between C∞ and the bit rate Rb we get
\[ \frac{C_\infty}{R_b} = \frac{P/N_0}{\ln 2}\cdot\frac{T_s}{k} = \frac{E_b/N_0}{\ln 2} > 1 \]
where we used that Rb = k/Ts and that reliable communication requires C∞ > Rb. Rewriting the above, we conclude that reliable communication requires the SNR to satisfy
\[ \frac{E_b}{N_0} > \ln 2 \approx 0.69 = -1.59 \ \text{dB} \]
The value −1.6 dB is the well known Shannon limit and constitutes a hard limit for when it is possible to achieve reliable communication. If the SNR is below this limit it is not possible to reach an error probability that tends to zero, independently of what system is used.
In the above calculations there are no limits on either bandwidth or coding rate. In fact, reaching this limit requires a coding rate that goes to zero and a computational complexity that goes to infinity. We will therefore also derive a corresponding limit for a fixed code rate. First, assume that a codeword consists of N samples and contains K information bits, giving a (2^K, N) code with rate
\[ R = \frac{K}{N} \]
The duration in time of a codeword can then be set to T, and assuming the Nyquist rate this is coupled to the number of samples through N = T Fs = 2WT samples per codeword. The information bit rate is the number of information bits in a codeword divided by its duration,
\[ R_b = \frac{K}{T} = 2W\frac{K}{N} = 2WR \]
Similarly, in each codeword we use an average energy of KEb, and the corresponding power is
\[ P = \frac{K E_b}{T} \]
With this at hand, we can rewrite the SNR in the capacity formula as
\[ \frac{P}{N_0 W} = \frac{K E_b}{T N_0 W} = 2WR\,\frac{E_b}{N_0 W} = 2R\,\frac{E_b}{N_0} \]
Then, since the bit rate must be less than the capacity, we get
\[ R_b = 2WR < W\log\Big(1 + 2R\frac{E_b}{N_0}\Big) \]
which gives
\[ 1 + 2R\frac{E_b}{N_0} > 2^{2R} \]
or, equivalently,
\[ \frac{E_b}{N_0} > \frac{2^{2R} - 1}{2R} \]
Using e.g. a code with rate R = 1/2, we can see that the limit is now shifted to
\[ \frac{E_b}{N_0} > 1 = 0 \ \text{dB} \]
To get a better communication environment we need to decrease the code rate. It can only be decreased down to zero, where we see that the bound has the limiting value
\[ \frac{E_b}{N_0} > \lim_{R\to 0}\frac{2^{2R}-1}{2R} = \lim_{R\to 0}\frac{2\cdot 2^{2R}\ln 2}{2} = \ln 2 = -1.59 \ \text{dB} \]
where we used l'Hospital's rule to derive the limit. From this we see that to reach the limit −1.59 dB the code rate must approach zero.
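To see how the bound moves with the code rate (again my own sketch), the limit (2^(2R) − 1)/(2R) can be tabulated in dB for a few rates; as R decreases it approaches the Shannon limit −1.59 dB.

```python
import math

def ebn0_limit_dB(R: float) -> float:
    """Minimum Eb/N0 in dB for reliable communication at code rate R."""
    return 10 * math.log10((2 ** (2 * R) - 1) / (2 * R))

for R in [1.0, 0.5, 0.25, 0.1, 0.01, 0.001]:
    print(f"R = {R:5.3f}: Eb/N0 > {ebn0_limit_dB(R):6.2f} dB")

# Limit as R -> 0: ln 2, i.e. about -1.59 dB
print(f"R -> 0 : Eb/N0 > {10 * math.log10(math.log(2)):6.2f} dB")
```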
10.5 Coding Gain and Shaping Gain

SNR gap

To be done.