
1118 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 5, MAY 2002

Fading Channels: How Perfect Need "Perfect Side Information" Be?

Amos Lapidoth, Senior Member, IEEE, and Shlomo Shamai (Shitz), Fellow, IEEE

Abstract—The analysis of flat-fading channels is often performed under the assumption that the additive noise is white and Gaussian, and that the receiver has precise knowledge of the realization of the fading process. These assumptions imply the optimality of Gaussian codebooks and of scaled nearest-neighbor decoding. Here we study the robustness of this communication scheme with respect to errors in the estimation of the fading process. We quantify the degradation in performance that results from such estimation errors, and demonstrate the lack of robustness of this scheme. For some situations we suggest the rule of thumb that, in order to avoid degradation, the estimation error should be negligible compared to the reciprocal of the signal-to-noise ratio (SNR).

Index Terms—Channel capacity, fading channels, generalized mutual information (GMI), mismatched decoding, nearest neighbor decoding, Rayleigh fading.

I. INTRODUCTION

A PASSBAND pulse amplitude modulated signal is said to experience "flat fading" if it is transmitted over a fading channel of a delay spread that is negligible compared to the symbol duration [1]. If the continuous-time output of the channel is match-filtered to the pulse shape and subsequently sampled, then the resulting samples can sometimes be modeled as

    Y_k = H_k x_k + Z_k,    k = 1, 2, ...    (1)

where x_k represents the channel input. Here {Z_k} is an ergodic complex-valued process representing additive noise, and {H_k} is an ergodic complex-valued process representing fading. We assume throughout that the process {H_k} is independent of the process {Z_k}, and that their laws do not depend on the input sequence.

If the modulating pulse is orthogonal to all its time shifts by integer multiples of the signaling period, then we can account for an average transmission power constraint by imposing an energy constraint on the input symbols. In particular, in discussing block coding for this channel, we shall typically require that the n-tuple (x_1(m), ..., x_n(m)) corresponding to each message m satisfy

    (1/n) Σ_{k=1}^{n} |x_k(m)|² ≤ P    (2)

for some constant P. Here^1 m ranges over the set of possible messages, and n and R denote the block length and rate of the code, respectively.

We say that the realization of the fading process is unknown to the receiver if the decoder must decide which message was transmitted based on the channel output sequence Y_1, ..., Y_n only. Such a decoder is thus a mapping that maps the received output sequence to the decision message. Similarly, we say that the realization of the fading process is known to the decoder if it may base its decision not only on the received sequence but also on the fading sequence H_1, ..., H_n. Such a decoder is thus a mapping that maps pairs of output and fading sequences to the decision message.

If the noise samples {Z_k} are independent and identically distributed (i.i.d.) zero-mean circularly symmetric^2 Gaussian random variables of variance σ², and if the fading process {H_k} is known only to the receiver, then the capacity of this channel is given by [2], [1]

    C_SI = E[log(1 + |H|² P / σ²)].    (3)

Here E denotes the expectation functional (in this case with respect to the fading process) and the subscript SI stands for side information (at the receiver).

This capacity can be achieved using a maximum-likelihood (ML) decoder that, given a received sequence y_1, ..., y_n and a fading realization sequence h_1, ..., h_n, assigns each message m the "metric"

    Σ_{k=1}^{n} |y_k - h_k x_k(m)|²    (4)

Manuscript received January 23, 2000; revised March 22, 2001. This work was conducted in part while A. Lapidoth was a resident at the Rockefeller Foundation Bellagio Study and Conference Center and also while the authors were visiting Lucent Technologies, Bell Labs, Murray Hill, NJ. The work of S. Shamai was supported in part by the Fund for the Promotion of Research at the Technion. The material in this paper was presented in part at the 1999 IEEE Workshop on Information Theory, Kruger National Park, South Africa, June 20-25, 1999.

A. Lapidoth is with the Department of Electrical Engineering, Swiss Federal Institute of Technology (ETH), Zurich CH-8092, Switzerland (e-mail: [email protected]).

S. Shamai (Shitz) is with the Department of Electrical Engineering, Technion–Israel Institute of Technology, Haifa 32000, Israel (e-mail: [email protected]).

Communicated by P. Narayan, Associate Editor for Shannon Theory.

Publisher Item Identifier S 0018-9448(02)02796-7.

^1 All logarithms in this paper are natural logarithms, and all rates are expressed in nats per complex symbol.

^2 The distribution of a complex random variable W is said to be circularly symmetric if for any deterministic φ ∈ [0, 2π) the distribution of the random variable e^{iφ}W is identical to the distribution of W.

0018-9448/02$17.00 © 2002 IEEE

Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY ROORKEE. Downloaded on April 11,2023 at 05:34:59 UTC from IEEE Xplore. Restrictions apply.
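As a concrete illustration of the channel model (1) and the perfect-side-information capacity (3), the sketch below simulates the channel and estimates C_SI = E[log(1 + |H|²P/σ²)] by Monte Carlo. The Rayleigh-fading choice (H ~ CN(0,1)) and all numerical parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, P, sigma2 = 100_000, 4.0, 1.0   # symbols, transmit power, noise variance

def cn(size, var, rng):
    """i.i.d. circularly symmetric complex Gaussian samples of variance var."""
    return np.sqrt(var / 2) * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

H = cn(n, 1.0, rng)       # illustrative Rayleigh fading: H ~ CN(0, 1)
Z = cn(n, sigma2, rng)    # additive noise
x = cn(n, P, rng)         # i.i.d. Gaussian inputs of power P

Y = H * x + Z             # flat-fading channel model (1)

# Perfect receiver side information capacity (3), in nats per complex symbol,
# estimated by averaging over the fading realizations:
C_SI = float(np.mean(np.log1p(np.abs(H) ** 2 * P / sigma2)))
print(f"C_SI ~ {C_SI:.3f} nats/symbol")
```

By Jensen's inequality the estimate falls strictly below the nonfading value log(1 + P E[|H|²]/σ²); the gap between the two is the cost of the fading.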
LAPIDOTH AND SHAMAI: FADING CHANNELS: HOW PERFECT NEED “PERFECT SIDE INFORMATION” BE? 1119

and chooses the message of least metric. We shall refer to such a decoder as a "scaled nearest neighbor decoder." The existence of capacity-achieving codebooks can be demonstrated by considering random coding ensembles where the codewords are chosen independently, each according to a product distribution whose marginal is a zero-mean circularly symmetric Gaussian distribution of variance P.

In this paper, we study the robustness of the above decoding rule with respect to the noise distribution and, more importantly, with respect to the receiver's accuracy in estimating the fading sequence. Specifically, we shall study the performance of a receiver that chooses the message m that minimizes the metric

    Σ_{k=1}^{n} |y_k - a_k e^{iθ_k^{(m)}} x_k(m)|²    (5)

where a_k ≥ 0 and θ_k^{(m)} can be thought of as magnitude and phase estimators of the fading at time k. (The case where a_k e^{iθ_k^{(m)}} = h_k corresponds to precise knowledge of the fading realization.)

The notation θ_k^{(m)} allows for "decision-directed" phase estimation. That is, we allow for the metrics associated with the different codewords to be based on different fading estimators. This flexibility may be useful in analyzing receivers that are structured as a bank of estimator-correlators. Note, however, that for technical reasons, we do not allow for the time-k magnitude estimator a_k to depend on the message whose metric is being computed. We exclude such cases not because they lack engineering interest, but because they require a separate analysis, which is beyond the scope of this paper.

The restrictions imposed on the fading estimators will be stated later. Here we merely list some possible examples.

1) The case

    a_k e^{iθ_k^{(m)}} = h_k    (6)

corresponds to a receiver having perfect side information but ignoring any non-Gaussianity or memory in the noise process {Z_k}.

2) The case

    a_k = E[|H_k|],    θ_k^{(m)} = arg H_k    (7)

corresponds to the case where a genie provides the decoder with the precise phase of the fading process, but the receiver is otherwise ignorant of the fading magnitude and merely uses its mean in the computation of the metric.

3) The fading estimator is a fixed deterministic complex number. Thus, a_k and θ_k^{(m)} are deterministic and do not depend on time or on the codeword whose metric is being computed.

4) More generally, we can consider a scenario where some side information is provided to the receiver, who sets

    (8)

Here, it is assumed that the fading process and the side information are jointly ergodic and independent of the inputs and additive noise.

5) A genie provides the decoder with a noisy measurement of the true fading level, with an error that is independent of the codebook, transmitted message, noise, and fading level. The decoder performs scaled nearest neighbor decoding based on the genie's noisy measurement.

6) The magnitude estimator is formed as a strictly causal shift-invariant mapping from the channel outputs, and the phase estimator is decision-directed, likewise formed as a strictly causal shift-invariant mapping. The noise is i.i.d.

7) The receiver gets its fading estimate from a "channel estimator" box, which—with the help of a genie—estimates the fading level in a strictly causal shift-invariant way based on the received signal and the true codeword. The noise is i.i.d.

Imprecise side information can often be analyzed using the classical information-theoretic tools by considering the mutual information between the channel inputs and the combination of the output symbols and any additional genie-provided information; see, e.g., [3], [4]. This approach, however, typically tacitly assumes optimal decoding, so that the receiver has knowledge of the joint distribution of the fading process and the genie-provided side information, and that based on this law, it performs optimal decoding.

Our approach is different in that we fix the decoding rule and study the resulting achievable rates.^3 To this end, we rely on the generalized mutual information (GMI) [6], which specifies the highest rate for which the average probability of error, averaged over the ensemble of Gaussian codebooks, converges to zero. This gives some indication of how a "typical" codebook that was designed for the channel with perfect side information and with Gaussian noise might behave under non-Gaussianity of the noise and under an imperfect fading level estimate (side information).

While it can be shown [6] that for all rates above the GMI, as the block length tends to infinity, the average probability of error—averaged over a Gaussian ensemble—tends to one, this does not imply that no rate higher than the GMI can be achieved by some codes. In this respect, the GMI lacks the "authority" of channel capacity. Our upper bounds should therefore not be interpreted as ultimate bounds on the achievable rates, but rather as bounds on the rates that are achievable by a "typical" codebook that was picked from a Gaussian ensemble without regard to the fading estimation errors.

In earlier work, the sensitivity to the Gaussian noise assumption (but with perfect side information) was studied for baseband signaling [7]. An extension to the passband case yields that for any ergodic noise process

    GMI = E[log(1 + |H|² P / σ²)]    (9)

^3 An analysis of a fixed linear receiver is also carried out, for example, in [5], but using a "signal-to-interference" (SIR) criterion rather than from the achievable-rates point of view.

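The decoding rule built around metric (5) can be made concrete in a few lines. The sketch below (codebook size, block length, and the estimate-noise level are illustrative assumptions) draws an i.i.d. Gaussian codebook, transmits one codeword through the channel (1), and decodes by minimizing the sum of |y_k - ĥ_k x_k(m)|², once with the true fading and once with a slightly noisy fading estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M, P, sigma2 = 64, 8, 10.0, 1.0   # block length, messages, power, noise var

def cn(shape, var, rng):
    """i.i.d. circularly symmetric complex Gaussian samples of variance var."""
    return np.sqrt(var / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

X = cn((M, n), P, rng)     # i.i.d. Gaussian codebook: x_k(m) ~ CN(0, P)
H = cn(n, 1.0, rng)        # illustrative Rayleigh fading
Z = cn(n, sigma2, rng)
m_true = 3
Y = H * X[m_true] + Z      # channel (1) driven by the transmitted codeword

def scaled_nn_decode(Y, X, H_hat):
    """Scaled nearest neighbor decoding: minimize the metric (5),
    sum_k |y_k - h_hat_k x_k(m)|^2, over the messages m."""
    metrics = np.sum(np.abs(Y[None, :] - H_hat[None, :] * X) ** 2, axis=1)
    return int(np.argmin(metrics))

m_perfect = scaled_nn_decode(Y, X, H)                   # perfect side information
m_noisy = scaled_nn_decode(Y, X, H + cn(n, 0.01, rng))  # mildly imperfect estimate
print(m_perfect, m_noisy)
```

Here the estimation-error variance (0.01) sits well below the reciprocal SNR (σ²/P = 0.1), the regime in which the paper's rule of thumb predicts little degradation.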

where σ² = E[|Z|²]. Thus, in terms of achievable rates, Gaussian codebooks and scaled nearest neighbor decoding make every noise distribution appear as though it were i.i.d. Gaussian.

In the present paper, we shall demonstrate that Gaussian codebooks and scaled nearest neighbor decoders (5) are far less robust with respect to imperfect side information.

In the next section, we discuss more rigorously the class of decoders under study. In Section III, we describe the GMI and compute it for the problem at hand. There, we shall see that in some scenarios the GMI can be disappointingly low. In an attempt to understand whether the culprit is the scaled nearest neighbor decoding rule or the choice of Gaussian signaling, we study in Section IV the performance of Gaussian signaling with optimal (ML) decoding. Since the general expressions for the different scenarios can be quite complicated, we study the asymptotic low signal-to-noise ratio (SNR) and high-SNR regimes in Sections V and VI, respectively. In each regime, we try to compare the capacity of the channel, the Gaussian mutual information (with optimal decoding), the GMI, and the highest rates achievable with nearest neighbor decoding (and arbitrary codebooks). Section VII concludes the paper with a brief summary and some conclusions.

II. ASSUMPTIONS AND THE DECODER STRUCTURE

In some systems, the decoder may learn something about the fading process not only from the received signal that is to be decoded, but also from other parts of the system. For example, in some cellular systems, a down-link decoder can gain some knowledge of the fading levels by studying the output of the down-link "control channel." In other cases, the receiver can use the previously decoded codeword to estimate the fading level based on the previously received signals [8], or the receiver can make use of training sequences, pilot signals, and the like. We can think of such information as being conveyed by some friendly genie. To treat such scenarios, we shall allow for the probability space under consideration to include not only processes such as the noise process and fading process, but also additional random variables and processes that describe the additional information provided by the genie. Moreover, since we shall be considering ensembles of Gaussian codebooks, we shall assume that those too are in the probability space.

We thus assume a probability space over which the i.i.d. codebook random variables X_k(m) are defined. Here, X_k(m) is the entry in the ensemble of codebooks that corresponds to the signal transmitted at time k if message m is to be conveyed. To simplify notation, we assume that our probability space includes all random variables X_k(m) for all natural numbers k and m. The probability space also includes the fading process and the noise process. The probability space can include many other random variables, thus allowing for various genies.

We next enumerate the main assumptions. To simplify notation, we shall use a single symbol to denote an n-tuple of random variables; for example, one symbol for the set of channel outputs Y_1, ..., Y_n, where n is the block length.

1) The decoder is symmetric in the sense that the probability of error, averaged over the ensemble of codebooks, is identical for all messages. Consequently, for the purpose of studying the ensemble-averaged probability of error we shall assume without loss of generality that the transmitted message is the first message. The transmitted symbols thus correspond to the first codeword and the received symbols are thus

    Y_k = H_k X_k(1) + Z_k.    (10)

2) The complex-valued fading process and the complex-valued noise process are ergodic.^4 Also, the fading, the noise, and the codebook are independent.

3) The noise is of zero mean and of variance σ²:

    E[Z] = 0 and E[|Z|²] = σ².    (11)

4) We require that the following ergodicity condition holds (with convergence in probability):

    (12)

5) The noise sample at time k is uncorrelated with the product of the transmitted symbol and the estimated fading symbol, i.e.,

    (13)

where the conjugate of the noise sample at time k enters the correlation.

6) For any incorrect message m ≠ 1 and at any time instant k, the random variable X_k(m) conditional on the random variables

    (14)

is distributed like the ensemble marginal, i.e., zero-mean circularly symmetric Gaussian of variance P.

7) The processes in question are jointly ergodic.

8) For any θ < 0

    (15)

We next briefly discuss some of these assumptions:

• Assumption 4 is an ergodicity assumption that is required for the analysis of the metric accumulated by the correct codeword. This condition is satisfied, for example, if the magnitude and phase estimators are time-invariant functions of the underlying jointly ergodic processes.

• Assumption 5 holds, for example, if the noise sequence is i.i.d. (and by (11) of zero mean) and the estimators are strictly causal.^5 It is also satisfied if the estimators are

^4 Throughout, when we write "ergodic" we tacitly assume stationarity.

^5 By "strictly causal" we mean that the estimator at time k may depend only on random variables of time index strictly smaller than k.

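Assumption 4 asks that the metric accumulated by the correct codeword stabilize. In the simplest special case (perfect side information, so the per-symbol metric reduces to (1/n) Σ_k |z_k|², together with i.i.d. Gaussian noise, both illustrative assumptions rather than the paper's general condition) the convergence to σ² is easy to observe numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 2.0

# With perfect side information the per-symbol metric of the correct
# codeword is (1/n) sum_k |y_k - h_k x_k|^2 = (1/n) sum_k |z_k|^2, which,
# by the ergodicity required in Assumption 4, converges to sigma^2.
for n in (100, 10_000, 1_000_000):
    Z = np.sqrt(sigma2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    avg = float(np.mean(np.abs(Z) ** 2))
    print(f"n = {n:>9}: (1/n) sum |z_k|^2 = {avg:.4f}")
```

The empirical averages tighten around σ² = 2 as the block length grows, which is exactly the concentration the correct-codeword analysis relies on.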

deterministic, or more generally if the noise and the estimated fading remain uncorrelated also when conditioned on the other random variables involved.

• Assumption 6 is required for the analysis of the metric accumulated by an incorrect codeword. It allows us to compute the conditional log moment-generating function (MGF) of the metric associated with an incorrect message m. Indeed, if for θ < 0 we define

    (16)

then this assumption guarantees that

    (17)

If we had not allowed for decision-directed phase estimation, this expression would have followed immediately from the expression for the log MGF of a chi-squared distributed random variable, and from the fact that the log MGF of a sum of independent random variables is the sum of their log MGFs. Indeed, in this case, the assumption is very natural in merely requiring that, for the ensemble under consideration, any incorrect codeword be independent of the channel outputs corresponding to the true codeword and of the magnitude estimator (the latter essentially precluding decision-directed magnitude estimators).

However, we can handle some decision-directed phase estimators. In this more general case, (17) can be verified by defining the per-symbol conditional log MGF and conditioning symbol by symbol, where the last equality in the resulting chain follows from the assumption and from the log MGF of chi-squared random variables.

Somewhat more restrictive than Assumption 6, but of more engineering sense, are the following two simultaneous assumptions.

a) For every incorrect message, the magnitude estimators are independent of that message's codeword. (Since the incorrect codewords are independent of the output, this condition can be replaced with the condition that conditional on the output sequence, the magnitude estimators and the incorrect codeword are independent. This condition thus essentially precludes decision-directed magnitude estimation.)

b) For every incorrect message and every time instant k, the phase estimator is independent of the phase of the corresponding codeword symbol conditional on the random variables (14). We are thus allowing for some strictly causal decision-directed phase estimators (if the noise is, e.g., i.i.d.).

• Assumption 7 allows us to compute the limit of the normalized log MGF as the block length tends to infinity. This limit is essential for the application of the Gärtner–Ellis theorem [9] to study the large-deviation probability that an incorrect codeword will accumulate a metric smaller than the one accumulated by the correct codeword. Indeed, using the ergodic theorem we obtain from (17) that, almost surely,^6

    (18)

• Assumption 8 is used to further simplify (18) to yield

    (19)

By conditioning and repeating the argument n times, it is seen that Assumption 8 is satisfied whenever the two pairs in question are independent of one another, e.g., when the noise process is i.i.d. and the magnitude estimator is strictly causal.

III. THE GENERALIZED MUTUAL INFORMATION

For a given channel and decoding rule, the GMI [10]–[12], [6] corresponding to the i.i.d. Gaussian input distribution is the

^6 From the ergodic theorem we obtain that for any θ this limit holds with probability one. Ostensibly, the event that this limit holds for all θ < 0 may not have probability one, as there are uncountably many such θ's. Nevertheless, we can show using a convexity argument that this is not the case, and the probability that this limit holds for all θ < 0 is still one. See [7] for details.

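The analysis above repeatedly invokes the log MGF of the squared magnitude of a circularly symmetric Gaussian. For W with E[|W|²] = v, the variable |W|² is exponential with mean v, so E[e^{θ|W|²}] = 1/(1 - θv) for θ < 1/v, and the log MGF is -log(1 - θv). This is the standard chi-squared fact the text appeals to; the check below is an illustration, with v and θ chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(3)
v, theta = 1.5, -0.4          # variance of W, transform parameter (theta < 0)

W = np.sqrt(v / 2) * (rng.standard_normal(500_000) + 1j * rng.standard_normal(500_000))
mgf_mc = float(np.mean(np.exp(theta * np.abs(W) ** 2)))

# |W|^2 is exponential with mean v, hence E[e^{theta |W|^2}] = 1/(1 - theta v)
# for theta < 1/v; the log MGF is therefore -log(1 - theta v).
mgf_exact = 1.0 / (1.0 - theta * v)
print(mgf_mc, mgf_exact)
```

The Monte Carlo average agrees with the closed form to within sampling error, which is all the exponent calculations of the Gärtner–Ellis step require of this building block.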

rate below which the average probability of error—averaged over the ensemble of i.i.d. Gaussian codebooks—decays to zero (as the block length tends to infinity), and above which this average tends to one.

A general expression for the GMI can be found in [12], [6]. In our setup, it is given by the following theorem.

Theorem 3.0.1: For the flat-fading channel (1), a decoder that bases its decision on the metric (5), and an i.i.d. Gaussian ensemble, if Assumptions 1)–8) hold, then the GMI is given by

    (20)

where

    (21)

and

    (22)

Proof: By (5) and (12), it follows that the metric accumulated by the correct codeword converges almost surely to a deterministic constant. The probability that an incorrect codeword accumulates a metric smaller than this constant decays exponentially in the block length, and the GMI is just the exponent. Using the Gärtner–Ellis theorem [9], we deduce the exponent from the limiting log MGF, which is computed via (19) as

    (23)

    (24)

    (25)

Corollary 3.0.1: Suppose that in addition to the assumptions of Theorem 3.0.1

    (26)

which, for example, holds if the fading estimate is the minimum mean squared error estimator of the fading with respect to some subset of the observations. Then (27), shown at the bottom of the page, holds, with equality if

    (28)

i.e., if the estimation error does not depend on the estimate (e.g., in a Gaussian regime with optimal minimum mean squared error estimators).

Proof: See Appendix A.

The following corollary addresses the case where a genie provides the decoder with the precise value of the fading phase, but with no information about the fading magnitude. Consequently, in computing the metric associated with each of the codewords, the decoder uses the mean of the fading magnitude.

Corollary 3.0.2: Consider the channel (1) where the fading process is ergodic and independent of the zero-mean variance-σ² ergodic noise process, and where the joint law of the processes does not depend on the input sequence. Consider the generalized mutual information corresponding to an i.i.d. Gaussian ensemble and a decoder that chooses the codeword that minimizes (5) with the magnitude and phase estimators given in (7). Then

    (29)

where

    (30)

    (27)

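Corollary 3.0.1's equality case, an estimation error that does not depend on the estimate, is exactly what jointly Gaussian minimum mean squared error estimation delivers. The sketch below is an illustration in the spirit of genie scenario 5; the measurement model S = H + N and the noise variance q are illustrative assumptions, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000

def cn(size, var, rng):
    """i.i.d. circularly symmetric complex Gaussian samples of variance var."""
    return np.sqrt(var / 2) * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

q = 0.5                       # illustrative measurement-noise variance
H = cn(n, 1.0, rng)           # fading, H ~ CN(0, 1)
S = H + cn(n, q, rng)         # genie's noisy measurement of the fading

H_hat = S / (1.0 + q)         # MMSE estimate E[H | S] in the jointly Gaussian case
err = H - H_hat

corr = complex(np.mean(err * np.conj(H_hat)))   # error-estimate correlation
mse = float(np.mean(np.abs(err) ** 2))          # should equal q / (1 + q)
print(abs(corr), mse)
```

The near-zero correlation between the error and the estimate is the orthogonality property that makes the corollary's bound tight in the Gaussian regime.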

Proof: For the decoder based on (5) and (7), the assumptions of the corollary imply all the assumptions of Theorem 3.0.1. The threshold follows from (21) and the log MGF from (24). Now we can optimize over θ to obtain (29). In fact, this optimization can be done by inspection, because the threshold and the log MGF that we obtained are identical to those that we would get on an ordinary (nonfading) complex Gaussian channel with the corresponding transmit power and noise variance.

The above analysis indicates that inaccuracy in the estimation of the fading level can be quite detrimental to reliable communication using Gaussian codebooks and nearest neighbor decoding. In fact, in some cases (see (29)), the generalized mutual information is bounded in the power!

As a rule of thumb, our results indicate that, for the communication scheme under consideration, the side information can be regarded as perfect only if the error in estimating the fading level is far smaller than the reciprocal of the SNR. This calls into question the robustness of the "perfect side information" assumption at high SNR.

In an attempt to separate the effects of the suboptimal Gaussian codebook and the suboptimal scaled nearest neighbor decoder, we proceed in the next section to study the performance of Gaussian codebooks with ML decoding.

IV. GAUSSIAN CODEBOOKS WITH GENIE-AIDED ML DECODING

In this section, we study the rates that are achievable using Gaussian signaling over a fading channel with some genie-provided receiver side information. As in [3], we do not restrict the receiver in any way, so that it may perform ML decoding. We thus study the mutual information (corresponding to i.i.d. Gaussian inputs) between the channel input and the combination of the channel outputs and side information.

The side information provided by the genie could take the form of a random variable, or perhaps a stochastic process. We shall see that the treatment can be quite general, as only conditional expectations with respect to the side information will enter our analysis. We shall assume throughout that the genie-provided side information is independent of the channel inputs and noise. The side information is, of course, typically correlated with the fading process.

Theorem 4.0.2: Let the channel inputs be i.i.d. zero-mean circularly symmetric Gaussian of variance P, and let the noise samples be independent of the inputs and i.i.d. of variance σ². Let the pair of the fading process and the side information be independent of the pair of input and noise processes. Let

    (31)

Then

    (32)

Proof: See Appendix B.

Note 1: In the special case where the fading is i.i.d., the phase of the fading is independent of its magnitude, and the side information constitutes the phase of the fading, the lower bound of the theorem coincides with (29).

To better understand the loss in performance due to the use of suboptimal codes and suboptimal decoding, we study two asymptotic regimes: the low-SNR regime and the high-SNR regime. In each case we try to compare the capacity of the channel, the mutual information achievable with Gaussian inputs and ML decoding, the rates achievable with Gaussian codebooks and suboptimal decoding, and the rates achievable with scaled nearest-neighbor decoding with arbitrary codebooks.

V. LOW-SNR REGIME

A. Channel Capacity

We begin with a study of the capacity of a fading channel at low SNR. Our result here is that for i.i.d. Gaussian noise and for ergodic fading processes of finite second moment, the capacity of the fading channel at low SNR is identical to the capacity of a nonfading Gaussian channel of equal average receive power. A special case of this result, when the fading coefficients are i.i.d. zero-mean Gaussians, was established by Verdú [13] as an application of his technique for computing the capacity per unit cost of memoryless channels. The continuous-time version of this theorem can be found in [14].

Theorem 5.1.1: Consider the fading channel (1) where the noise samples are i.i.d. zero-mean circularly symmetric Gaussian of variance σ² and independent of the ergodic fading process of finite second moment E[|H|²]. Further assume that the joint law of the fading and the noise does not depend on the channel input. Let C(P) denote the capacity of the channel with average transmit power P; see (2). Then

    lim_{P→0} C(P)/P = E[|H|²]/σ².    (33)

Proof: See Appendix C.

Loosely speaking, this theorem implies that asymptotically, at very low SNR, side information does not increase capacity. More precisely, it implies the following.

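A simple heuristic makes the "bounded in the power" phenomenon and the 1/SNR rule of thumb tangible: if the residual fading-magnitude uncertainty acts like additional noise whose variance grows linearly with P, the rate log(1 + (E|H|)²P / (σ² + Var(|H|)P)) saturates as P grows. This expression is an assumed illustration consistent with the qualitative claim in the text, not an equation quoted from the paper; the numbers use Rayleigh fading, for which E|H| = √π/2 and Var(|H|) = 1 - π/4 when E|H|² = 1.

```python
import math

# Assumed effective-SNR heuristic: treat the fading-magnitude estimation
# error as extra noise of variance Var(|H|) * P (illustrative, not the
# paper's formula).  Rayleigh-fading moments with E|H|^2 = 1:
E_mag = math.sqrt(math.pi) / 2
var_mag = 1.0 - math.pi / 4
sigma2 = 1.0

for P in (1.0, 10.0, 100.0, 1e4, 1e8):
    rate = math.log(1.0 + (E_mag ** 2) * P / (sigma2 + var_mag * P))
    print(f"P = {P:12.0f}:  rate = {rate:.3f} nats")

# The rate saturates at log(1 + (E|H|)^2 / Var(|H|)) as P grows:
ceiling = math.log(1.0 + E_mag ** 2 / var_mag)
print(f"ceiling = {ceiling:.3f} nats")
```

The saturation mirrors the text's warning: once the estimation-error power is comparable to 1/SNR, raising the transmit power no longer buys rate.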

Corollary 5.1.1: Under the conditions stated in the theorem, the relationship (33) continues to hold even when the decoder has access to some side information, provided that conditional on the fading process, the side information is independent of the pair of input and noise processes.

Proof: The conditional independence guarantees that the capacity with this side information is at most the perfect-side-information capacity (3), which, by the concavity of the logarithmic function, is at most P E[|H|²]/σ². The corollary now follows from Theorem 5.1.1.

B. Gaussian Mutual Information

In this subsection, we consider the mutual information between the terminals of channel (1) at low SNR under the assumptions that the additive noise is i.i.d. zero-mean circularly symmetric Gaussian, the fading process is i.i.d., and the channel input is i.i.d. zero-mean circularly symmetric Gaussian of variance P. We demonstrate that this mutual information is typically zero, unless the fading process is of nonzero mean.

Before stating this result more precisely, we provide a lemma which is useful for the analysis of low-SNR fading channels with Gaussian inputs. Loosely speaking, the lemma demonstrates that at low SNR almost any zero-mean input distribution achieves the capacity of the additive white Gaussian noise channel.

Lemma 5.2.1: Let V be a fixed zero-mean unit-variance complex random variable, and let the noise Z, a zero-mean circularly symmetric Gaussian of variance σ², be independent of V. Then

    lim_{P→0} (1/P) I(V; √P V + Z) = 1/σ².    (34)

Proof: See Appendix D.

With the aid of this lemma we can now derive the asymptotic mutual information corresponding to Gaussian signaling. The result does not depend heavily on the Gaussian assumption and holds for any scale family, e.g., uniform input distributions. We thus have, somewhat more generally, the following proposition.

Proposition 5.2.1: Let the random variables V, H, Z be fixed independent random variables of finite second moment, with Z a zero-mean circularly symmetric Gaussian of variance σ². Assume that V is of zero mean and of unit variance. Then

    lim_{P→0} (1/P) I(V; √P H V + Z) = |E[H]|²/σ².    (35)

Note 2: This result agrees with the results of Gallager and Médard [15] in the case that the fading is a zero-mean complex Gaussian.

Note 3: A comparison of Proposition 5.2.1 and Theorem 5.1.1 demonstrates that the capacity of nontrivial fading channels is typically not achieved by scale families of input distributions.

Note 4: Under some additional constraints (e.g., a fourth-moment constraint), the result can be extended to situations where the input distribution does not merely scale with the SNR, i.e., to situations where the distribution of the input may depend on the SNR in a controlled way.

Proof: By the chain rule

    (36)

By Lemma 5.2.1, the limiting behavior of the first term on the right-hand side (RHS) of (36) can be expressed as

    (37)

As for the second term on the RHS of (36), again by Lemma 5.2.1

    (38)

Now

    (39)

so that by the dominated convergence theorem, (38) and (39) imply

    (40)

The proof is concluded using (36), (37), and (40).

C. Gaussian Codebooks With Genie-Aided ML Decoding

In the previous subsection we saw that if the fading is of zero mean, then Gaussian codebooks perform poorly at low SNR, even if ML decoding is performed; see (35). This picture may change dramatically in the presence of side information. Indeed, if the receiver has access to some side information, then under the conditions of Theorem 4.0.2 the mutual information corresponding to i.i.d. Gaussian inputs is asymptotically lower-bounded as

    (41)

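The low-SNR slope of Theorem 5.1.1 can be checked against the perfect-side-information rate: E[log(1 + |H|²P/σ²)]/P should approach E[|H|²]/σ² as P → 0. A Monte Carlo sketch (the Rayleigh choice with E|H|² = 1 is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma2 = 1.0
g = rng.exponential(1.0, size=2_000_000)   # |H|^2 for Rayleigh fading, E|H|^2 = 1

# Theorem 5.1.1: C(P)/P -> E[|H|^2]/sigma^2 as P -> 0.  The perfect-side-
# information rate E[log(1 + |H|^2 P / sigma^2)] exhibits the same slope:
for P in (1.0, 0.1, 0.001):
    slope = float(np.mean(np.log1p(g * P / sigma2)) / P)
    print(f"P = {P:6.3f}:  rate / P = {slope:.4f}")
```

As P shrinks, the normalized rate climbs toward E[|H|²]/σ² = 1, the nonfading-channel slope the theorem predicts.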

In the special case where, in addition to the assumptions of Theorem 4.0.2, the noise is i.i.d. zero-mean circularly symmetric Gaussian and the side information consists of i.i.d. per-symbol observations of the fading, we can be even more precise.

Proposition 5.3.1: Let V be a zero-mean unit-variance complex random variable that is independent of the noise. Let the pair of complex random variables formed by the fading and the side information be independent of the input and noise, and assume the fading is of finite second moment. Then

    (42)

Proof: Using the chain rule, we decompose the mutual information into the information carried by the side information and a conditional mutual information term. To analyze the latter, consider the conditional error in estimating the fading from the side information. Conditional on the side information, the channel output can thus be expressed as the sum of a term driven by the conditional-mean estimate of the fading and a residual term that is of conditional mean zero given the side information. Consequently, by Proposition 5.2.1

    (43)

Moreover

    (44)

so that by (44) and the dominated convergence theorem it follows from (43) that

    (45)

D. Gaussian Codebooks and Scaled Nearest Neighbor Decoding

Here we shall analyze (20) in the asymptotic regime where the SNR tends to zero. In general, the fading magnitude estimator may depend on the SNR, but here we shall assume that it is bounded in the noise power. This seems like a good assumption, as the statistics of the fading process remain fixed in our analysis; it is only the power of the additive noise term that grows.

With this assumption, one can show that the optimal θ in (20) vanishes with the SNR. Consequently, we can expand the log MGF of (23) and the threshold of (21) and, optimizing over θ, obtain

    (46)

It can be readily verified that this expression agrees with the asymptotic expansion of (9) in the perfect side information case (6). Similarly, there is an agreement between (46) and the asymptotic expansion of (29) in the case where the genie provides the decoder with the phase (but not the magnitude) as in (7).

Finally, note that as a special case of (46) we obtain that if the side information is some ergodic process that conditionally on the fading is independent of the channel inputs and noise, and if the decoder is based on

    (47)

where

    (48)

then (46) reduces to

    (49)

Comparing (49) with (42), we see that for i.i.d. Gaussian inputs, if the fading and side information pairs are i.i.d. and independent of the input and noise, then at low SNR the mismatched decoder that is based on the metric (5) with (47) and (48) performs asymptotically as well as the optimal ML rule.

E. The Mismatch Capacity

In this section, we focus on the rates that are achievable at low SNR with a scaled nearest neighbor decoder but with arbitrary codebooks. This problem is an instance of the problem of computing the mismatch capacity of a channel; see [16] and references therein.

The general problem of computing the mismatch capacity of an arbitrary channel with an arbitrary decoding rule is

Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY ROORKEE. Downloaded on April 11,2023 at 05:34:59 UTC from IEEE Xplore. Restrictions apply.
1126 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 5, MAY 2002

generally believed to be very difficult. For example, it can be shown to be at least as difficult as the problem of computing the zero-error capacity of a general graph [17].

Lower bounds on can, however, be derived using random coding arguments [6] (and references therein). Any input distribution typically gives rise to a lower bound

where is the highest rate below which the average probability of error of the mismatched decoder—averaged over the ensemble of codebooks whose codewords are drawn independently and uniformly over the type defined by —decays to zero. For an expression for for channels over infinite input alphabets see [6].

The tightest bound that can be derived using this approach is denoted by , i.e.,

where the supremum is over all allowable input distributions. The computation of can be quite involved. Fortunately, at low SNR and in the presence of an input symbol of zero energy, this computation can be much simplified [6]. In fact, for such channels over finite alphabets, it can be shown that at low SNR one need only consider distributions that are concentrated at two input symbols, one of which is the zero-cost symbol. Even though our channel is not of finite alphabet, we restrict our attention here to rates that are achievable using binary signaling and define

(50)

where the supremum is over all distributions that are concentrated at two points, one of which is the zero energy symbol , and that satisfy

(51)

Theorem 5.5.1: Consider the fading channel (1) with side information , where the additive noise is i.i.d. and independent of the i.i.d. sequence . Assume that the joint law of the noise and the sequence does not depend on the input to the channel . Further assume that and that , for some . Consider a decoder that minimizes over all messages . Let be as in (50). Then

(52)

Proof: See Appendix E.

VI. LARGE SNRS

Our discussion of fading channels in the high-SNR regime will mostly focus on the Rayleigh-fading channel, whose time- output is given by

where denotes the channel input; the additive noise is i.i.d. ; the fading process is independent of with an i.i.d. distribution; and the joint law of does not depend on the channel input . We denote the capacity of the Rayleigh-fading channel subject to the average power constraint (2) by .

A. Channel Capacity

The capacity of the Rayleigh-fading channel was studied extensively in [18]–[20], and recently in [21] and [22]. At high SNR, the capacity grows double-logarithmically in the SNR. More precisely [21], [22]

(53)

where denotes Euler's constant, and the additive error term tends to zero as the SNR tends to infinity.

It should be noted that the double-logarithmic growth of channel capacity in the SNR is not specific to the Rayleigh-fading channel. It holds whenever the i.i.d. fading process is of finite variance and of finite differential entropy [21]. It even holds if the fading process is a general stationary and ergodic process of finite second moment, provided that it is of finite differential entropy rate [23].

With perfect side information, the behavior of the capacity changes dramatically. If the receiver has access to the pathwise realization of the fading process, then by (3) the capacity (in nats per complex symbol) is given at high SNR by [2]

(54)

In the next subsection, we study the behavior of capacity in the presence of imperfect side information in an attempt to bridge (53) and (54).

B. Capacity With Imperfect Side Information

In the previous subsection, we saw that at high SNR perfect side information increases capacity from a double-logarithmic dependence on the SNR to a logarithmic one; compare (53) to (54). In the present subsection, we shall study channel capacity with imperfect side information and try to assess its dependence on the SNR. Loosely speaking, we shall show that even in the presence of side information, if the side information is imperfect, capacity continues to grow only double-logarithmically in the SNR.

Some care, however, has to be exercised in interpreting this result. There may be situations where the side information, while not perfect, improves with the SNR and "becomes perfect" as the SNR tends to infinity. This case is not addressed here. It should, however, be noted that if the side information is based on noisy measurements of past fading levels, then even


at infinite SNR, it will not become perfect if the fading process has memory and is not predictable from its past.

We begin with a simple inequality.

Lemma 6.2.1: Let the random variables , , and the pair be independent. Let , then

(55)

Proof:

As a consequence of the above lemma we now have the following.

Proposition 6.2.1: Consider a Rayleigh-fading channel with the side information sequence , where the sequence is i.i.d. and independent of the input and of the additive noise. Then, whenever

(56)

In particular, if is bounded in the SNR, then the capacity of the Rayleigh fading channel with the side information increases only double-logarithmically in the SNR. In fact, it exceeds the capacity corresponding to the absence of side information by at most an additive constant that does not depend on the power.

Note 5: If are jointly Gaussian, then can be related to the mean squared error in estimating from via the relationship

(57)

C. Gaussian Mutual Information

We next study the mutual information between the terminals of a Rayleigh-fading channel when the input is Gaussian. We shall see that at high SNR this mutual information is typically much lower than channel capacity, thus demonstrating the poor performance of Gaussian codebooks at high SNR even with ML decoding. In fact, we shall show that whereas the capacity of the Rayleigh-fading channel is unbounded in the SNR (albeit only double-logarithmically increasing), the mutual information corresponding to Gaussian signaling is bounded in the SNR.

Consider the i.i.d. Rayleigh-fading channel, and assume no side information. Let . Here we study

where denotes the additive noise term, denotes the multiplicative noise, and are independent. By the chain rule we have

(58)

We study the first term on the RHS of (58) by noting that since and are independent and is of unit variance

and consequently

(59)

As to the other term on the RHS of (58) we compute

(60)

From (58)–(60) we conclude that

Since

where is Euler's constant, we conclude that

(61)

i.e., that the mutual information corresponding to a Gaussian input distribution is bounded in the SNR.

It is interesting to note that this result does not rely heavily on the Rayleigh fading assumption.

Proposition 6.3.1: Let be independent random variables, where and . Assume that has a density , and that its differential entropy satisfies . Then

(62)

Proof: See Appendix F.

Note 6: See [24] for an extension of this result to multi-antenna systems over stationary and ergodic fading channels of finite-entropy rate.

D. Gaussian Codebooks With Genie-Aided ML Decoding

In Proposition 6.3.1, we saw that on i.i.d. high-SNR fading channels and in the absence of side information, Gaussian codebooks perform poorly. They typically yield a mutual information that is bounded in the power. The question arises as to whether this behavior changes dramatically in the presence of side information. In this short subsection we note that it does not.

To see this, refer to (55) to note that the improvement in mutual information afforded by the genie is upper-bounded by the constant , so that if is bounded in the power, then so is .
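The contrast between the double-logarithmic growth without side information, (53), and the logarithmic growth with perfect side information, (54), can be checked numerically. The following small sketch (ours, for illustration only; the function name and parameters are our own) Monte Carlo estimates the perfect-side-information rate E[log(1 + |H|² SNR)] for i.i.d. Rayleigh fading, where |H|² is exponentially distributed with unit mean, and compares it with the high-SNR expansion log SNR − γ; the value log log SNR is printed alongside to show how much smaller the double-logarithmic growth is.

```python
import math

import numpy as np

GAMMA = 0.5772156649015329  # Euler's constant


def perfect_si_rate_mc(snr, n=200_000, seed=0):
    """Monte Carlo estimate of E[log(1 + |H|^2 * SNR)] in nats for
    i.i.d. Rayleigh fading, where |H|^2 ~ Exp(1) (unit mean)."""
    rng = np.random.default_rng(seed)
    gains = rng.exponential(size=n)  # |H|^2 samples
    return float(np.mean(np.log1p(gains * snr)))


for snr in (1e2, 1e4, 1e6):
    rate = perfect_si_rate_mc(snr)
    print(f"SNR={snr:>9.0f}  MC rate={rate:7.3f}  "
          f"log SNR - gamma={math.log(snr) - GAMMA:7.3f}  "
          f"log log SNR={math.log(math.log(snr)):5.3f}")
```

At high SNR the Monte Carlo estimate settles onto log SNR − γ, while log log SNR lags far behind, in line with the gap this subsection describes.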


As before, we caution that this conclusion only holds if is bounded in the SNR. It can be otherwise reversed.

E. The Mismatch Capacity

We next consider the behavior of the mismatch capacity at high SNR. We shall only consider the Rayleigh-fading channel where the fading process is i.i.d. zero-mean circularly symmetric Gaussian, and the noise is i.i.d. . We assume no side information available to the receiver.

Proposition 6.5.1: Irrespective of the SNR, the mismatch capacity of a Rayleigh-fading channel with a mismatched decoder that chooses the codeword that minimizes

is zero.

The intuition behind this proposition is that a coherent detector for a channel that rotates each symbol randomly is useless.

Proof: We shall demonstrate that even in the absence of additive noise, the mismatch capacity of the channel is zero. Assume then that

We will show that even for a codebook consisting of only two codewords, the maximal probability of error of the mismatched decoder is bounded from below by .

Let and be two codewords and assume without loss of generality

(63)

where denotes the -norm, i.e.,

Compute now the probability of error conditional on being transmitted and thus being received. Denoting this probability by , we have

where the inequality follows from (63) and the final equality by the symmetry of the circularly-symmetric Gaussian distribution.

Note 7: The result extends immediately to the case where are not necessarily Gaussian but have a different circularly symmetric distribution. The last equality is then justified by noting that if are independent and circularly symmetric then is circularly symmetric for any deterministic coefficients , and, consequently, has a symmetric distribution.

Note 8: The mismatched capacity is no longer zero if we allow for stochastic encoders. A classical example is to transmit uniformly (or close to uniformly) phase-distributed signals (not revealing this randomized strategy to the nearest neighbor decoder and letting the receiver find the closest codeword within this randomized set). This demonstrates, again, the merits of randomized coding strategies in the mismatched decoding regime [25].

Note 9: The situation changes dramatically if a genie informs the receiver of the phase rotation and the receiver compensates for that prior to performing nearest neighbor decoding. In this coherent scenario the mismatch capacity is unbounded. This can be demonstrated by sufficiently separating the energy of the codewords, as in the lower bound of Taricco and Elia [20].

VII. SUMMARY AND CONCLUSION

In the presence of perfect side information, the capacity of a flat-fading channel is achieved by Gaussian codebooks and scaled nearest neighbor decoding. In this paper, we studied the robustness of this communication scheme with respect to imprecisions in the side information. We have demonstrated that, as a rule of thumb, the side information can be regarded as "perfect" only if the second moment of the error is negligible compared to the reciprocal of the SNR. Otherwise, the achievable rates are reduced by the errors in estimating the fading levels. For example, if the SNR tends to infinity while the estimation error remains fixed, then the achievable rates are bounded in the power and do not grow to infinity.

To understand whether the lack of robustness is due to the Gaussian signaling or due to the suboptimal decoding rule, we studied various other combinations of codebooks and decoders in the low-SNR regime and in the high-SNR regime.

Consider first the case where side information is not present. In the low-SNR regime, the rates achievable with Gaussian codebooks can be significantly lower than channel capacity even if ML decoding is performed; compare (33) and (35). Similarly, in the high-SNR regime, the rates achievable by Gaussian codebooks can also be significantly lower than channel capacity. In fact, for a Rayleigh-fading channel, channel capacity tends to infinity with the SNR (albeit double-logarithmically) (53), whereas the mutual information corresponding to Gaussian signaling is bounded in the power (61).

The Gaussian input distribution is not the only one at fault here. In the absence of side information, nearest neighbor decoding also performs poorly, even with optimal codebooks, particularly if the fading is i.i.d. and circularly symmetric. For such


a scenario, no positive rate is achievable with a nearest neighbor decoder, irrespective of the SNR and of the codebook; see Section VI-E.

Of more interest to our analysis is the more complicated case where some (imprecise) side information is available to the decoder. At low SNR, the performance of Gaussian codebooks with ML decoding improves significantly, and the rates achievable by Gaussian codebooks are within a constant factor (that depends on the estimation error) of channel capacity; compare (42) to (33).

At high SNR, however, Gaussian codebooks perform poorly even in the presence of side information. As long as the mutual information between the true fading level and the side information is bounded in the SNR, Gaussian codebooks yield rates that are bounded in the power, whereas capacity typically grows to infinity; see Section VI-D.

There is no escape from concluding that for fading channels with imprecise side information, Gaussian codebooks perform poorly at high SNR, and in designing systems for such scenarios one needs to find new codes and not simply pick from the codes that were designed over the years for the Gaussian channel.

The performance of nearest neighbor decoders with side information depends critically on the accuracy of the phase estimates. On a Rayleigh-fading channel with no phase estimation, nearest neighbor decoding performs very poorly; see Section VI-E. It is interesting to note, however, that at low SNR and on Gaussian codebooks, the performance of the nearest neighbor decoder with side information can be very similar to that of the ML decoder; compare (49) with (42).

At high SNR, a scaled nearest neighbor decoder with perfect phase estimation can perform quite well. For example, on a Rayleigh-fading channel it allows one to achieve rates that are unbounded in the SNR (though not necessarily with Gaussian codebooks). It should be noted that perfect phase side information can also do wonders to Gaussian signaling, at times allowing for rates that are unbounded in the SNR.

It seems that on fading channels, if the side information is not precise enough (in the sense that we have attempted to quantify in this paper) then Gaussian codebooks and scaled nearest neighbor decoding may not be a good design approach. In such cases, codes that are specifically tuned to such channels are called for, and one should not use the standard codes that were designed for Gaussian channels. For such channels, it is also imperative to design receivers that take into account the statistical nature of the fading and the side information.

APPENDIX A
PROOF OF COROLLARY 3.0.1

Proof: To prove (27) first note that by (20), (22), and (25) the GMI can be rewritten as

Consequently, by exchanging the supremum over and the expectation we obtain the bound

This bound can be written more compactly as

where

and

where the last equality follows from (26).

To compute one can simply set the derivative with respect to to be zero and obtain that the supremum over is attained at

(64)

In our case, however, so that

and

which yields the desired bound (27).

As to the condition for equality, we note that if does not depend on then neither does , and the expectation and supremum are exchangeable.

APPENDIX B
PROOF OF THEOREM 4.0.2

The proof of Theorem 4.0.2 relies on Lemma B.0.1 given later in this appendix, which, loosely speaking, demonstrates that under Gaussian signaling, treating the estimation error as independent Gaussian noise is pessimistic and yields a lower bound on the mutual information. Before stating this lemma, we motivate it by first considering the nonfading case.

It is well known [26] that if and if is independent of with , then

(65)

For example, the Gaussian noise is the worst independent additive disturbance to Gaussian signaling. Perhaps a somewhat less known fact is that for (65) to hold, and need not be independent; it suffices that they be uncorrelated, i.e., that . This stronger statement is better suited for our

application, because it allows us to deal with situations where the estimation error is not necessarily independent of the fading but merely uncorrelated with it.

For completeness' sake we present here a short proof of (65).

Proof of (65): Let . Then for any

Choosing to be the coefficient of the linear estimator of from of the least mean squared error

gives the desired result.

For an alternative proof, note that since by [7] the rate specified in (65) can be achieved using a nearest neighbor decoder,7 it must also be achievable with an unrestricted decoding rule.

The following lemma extends this result to fading channels. See [3] for the case of real channels with no side information.

Lemma B.0.1: Let

where is independent of the pair . Assume that

and that conditional on , the random variables and are independent. Then

(66)

Proof: Let

(67)

so that

The hypotheses of the lemma guarantee that, conditional on , the random variable has a distribution. We will show that the lemma's hypotheses also guarantee that conditional on , the random variables and are uncorrelated, i.e.,

(68)

Once we show this, it will follow from the conditional version of (65) that

(69)

because the Gaussian noise is the worst uncorrelated disturbance to Gaussian signaling. By taking expectation with respect to , the lemma will thus follow from (69) once we additionally demonstrate that

(70)

To verify (68), we begin by noting that

(71)

Now conditional on , the random variables and are independent, and is of zero mean (it is, in fact, distributed). Consequently

(72)

Also, is independent of , and so that

(73)

Equations (71)–(73) combine to prove (68).

To verify (70) we first note that

Consequently, by (67)

thus proving (70). Here the second equality follows because and are independent.

With the aid of Lemma B.0.1 we are now ready to prove Theorem 4.0.2:

Proof:

7In [7], the noise is assumed to be independent of the input, but the proof goes through verbatim if it is merely uncorrelated with it.


where the first equality follows from the chain rule for mutual information, the next because are i.i.d., the next because by (31) is measurable and, hence, also , the next inequality because conditioning reduces differential entropy, the next equality by the definition of mutual information, and the last inequality by Lemma B.0.1.

APPENDIX C
PROOF OF THEOREM 5.1.1

Proof: We first observe that

(74)

This follows as in [14, p. 438] from the data processing theorem by considering the fading channel as a cascade of two channels: the first channel multiplies the input by the fading process, and the second adds the Gaussian noise. The capacity of the cascade of the two channels cannot exceed the capacity of the channel that adds the noise. Since its input is average power limited to , we can conclude that

where is independent of the fading process. In particular

thus establishing (74).

We now prove the reverse inequality. For any and let denote the mutual information between the input and the corresponding output , when are i.i.d. according to the law

Denoting by the mutual information (for this input law) between the input sequence and the output sequence , we have

(75)

It follows from [13, Lemma, p. 1023] that

(76)

where denotes the relative entropy between the distribution on corresponding to and the distribution on corresponding to . Here

if
(77)
otherwise.

It follows from (75) and (76) that

(78)

We now express the RHS of (76) as

(79)

where the integrations are carried out over the entire complex plane with respect to the Lebesgue measure . Here, is the distribution function of (which does not depend on by the stationarity assumption). Henceforth, we shall write for . Changing the integration variable to we obtain

(80)

where and are the density and differential entropy of the complex random variable , which is given by

(81)

and, consequently, since the circularly symmetric Gaussian distribution maximizes differential entropy subject to a second-moment constraint

(82)

It now follows from (80)–(82) that

so that by (78)

which combines with (74) to prove the theorem.
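The maximum-entropy step used in (82) — that among complex random variables with a given second moment the circularly symmetric Gaussian has the largest differential entropy — can be illustrated with a quick closed-form comparison. The sketch below is ours, not part of the paper: it compares the Gaussian entropy log(πe E|Z|²) with that of a uniform distribution on a disk of matching second moment, whose entropy is strictly smaller by exactly log(e/2) nats, independently of the second moment.

```python
import math


def h_complex_gaussian(second_moment):
    # Differential entropy (nats) of a circularly symmetric complex
    # Gaussian Z with E|Z|^2 = second_moment: h(Z) = log(pi e E|Z|^2).
    return math.log(math.pi * math.e * second_moment)


def h_uniform_disk(second_moment):
    # Uniform on a disk of radius R: density 1/(pi R^2), so
    # h = log(pi R^2); matching E|Z|^2 = R^2 / 2 gives R^2 = 2 E|Z|^2.
    return math.log(2.0 * math.pi * second_moment)


m2 = 1.7  # any second moment > 0; the gap does not depend on it
gap = h_complex_gaussian(m2) - h_uniform_disk(m2)
print(gap)  # log(e/2) = 1 - log 2, about 0.307 nats
```

The gap being a constant reflects that differential entropy shifts by log c² when a complex random variable is scaled by c, so the comparison between the two shapes is scale invariant.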


APPENDIX D
PROOF OF LEMMA 5.2.1

Proof: Upper-bounding the mutual information by the channel capacity and recalling that we obtain

where we have also used the inequality , which is valid for any .

To prove the reverse inequality, we shall use the result of [27] on the mutual information of multidimensional channels with weak input signals. We shall, however, need a truncation argument because the results of [27] assume peak limited inputs. Let be arbitrary (large). Let

if
otherwise

if
otherwise

and

By the data processing inequality and the chain rule for mutual information

(83)

where the first equality follows from the definition of ; the second equality because is independent of (and, hence, of ) and because whenever ; and the last equality because eliminating the bias in the input distribution to an additive noise channel does not affect mutual information.

We can now apply the results of [27, Theorem 1] to the study of the mutual information corresponding to the peak limited weak signal to obtain

(84)

The result now follows from (83) and (84) upon letting tend to infinity because

and

APPENDIX E
PROOF OF THEOREM 5.5.1

Proof: We shall prove the result for the case where the noise is of unit variance, i.e., . The general result then follows by normalizing the fading level and the side information by .

To simplify notation, we shall drop the time subscripts, e.g., we write rather than . Let

and let

Denoting by and the distributions on and when the channel input is and , respectively, we have [6]

(85)

which we now proceed to evaluate.

Denoting the joint distribution of and by we have

Consider now

Thus,

(86)

Computing the second term in (85) we have

Completing the square we obtain


(87)

(88)

where the inequality follows from the concavity of the logarithmic function and Jensen's inequality.

It now follows from (85), (86), and (88) that

We next note that this bound is, in fact, tight. This follows by noting that for the bound in (88) is tight. More precisely, whenever the moment-generating function of exists for some . See [9, Lemma 2.2.5 (c)].

APPENDIX F
PROOF OF PROPOSITION 6.3.1

Proof: Let . By the chain rule

(89)

As in the argument leading to (59) we note that since

(90)

To lower-bound we write

(91)

and use the entropy power inequality [26, Theorem 16.6.3]

This leads to the bound

and, consequently, by (91) to

(92)

From (89), (90), and (92) we conclude

(93)

We shall next investigate the behavior of (93) as the power tends to infinity. To this end, we note that without loss of generality we may assume that . Indeed, can be written as , where is arbitrary, and . The limit does not depend on . With a proper choice of we can guarantee that have any sign we wish. For we have

and consequently

ACKNOWLEDGMENT

The authors would like to thank the Shannon Theory Associate Editor Prakash Narayan, Sergio Verdú, and the anonymous referees for their valuable constructive comments.

REFERENCES

[1] E. Biglieri, J. Proakis, and S. Shamai (Shitz), "Fading channels: Information-theoretic and communications aspects," IEEE Trans. Inform. Theory, vol. 44, pp. 2619–2692, Oct. 1998.
[2] T. Ericson, "A Gaussian channel with slow fading," IEEE Trans. Inform. Theory, vol. IT-16, pp. 353–355, May 1970.
[3] M. Médard, "The effect upon channel capacity in wireless communications of perfect and imperfect knowledge of the channel," IEEE Trans. Inform. Theory, vol. 46, pp. 933–946, May 2000.
[4] G. Caire and S. Shamai (Shitz), "On the capacity of some channels with channel state information," IEEE Trans. Inform. Theory, vol. 45, pp. 2007–2019, Sept. 1999.
[5] J. Evans and D. Tse, "Large system performance of linear multiuser receivers in multipath fading channels," IEEE Trans. Inform. Theory, vol. 46, pp. 2059–2078, Sept. 2000.
[6] A. Ganti, A. Lapidoth, and I. E. Telatar, "Mismatched decoding revisited: General alphabets, channels with memory, and wide-band limit," IEEE Trans. Inform. Theory, vol. 46, pp. 2315–2328, Nov. 2000.
[7] A. Lapidoth, "Nearest neighbor decoding for additive non-Gaussian noise channels," IEEE Trans. Inform. Theory, vol. 42, pp. 1520–1529, Sept. 1996.
[8] R. Gallager, "Residual noise after interference cancellation on fading multipath channels," in Communications, Computation, Control, and Signal Processing: A Tribute to Thomas Kailath, A. Paulraj, V. Roychowdhury, and C. Schaper, Eds. Boston, MA: Kluwer Academic, 1997, pp. 67–77.
[9] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, 2nd ed. New York: Springer-Verlag, 1998.
[10] G. Kaplan and S. Shamai (Shitz), "Information rates of compound channels with application to antipodal signaling in a fading environment," AEÜ, vol. 47, no. 4, pp. 228–239, 1993.
[11] I. G. Stiglitz, "Coding for a class of unknown channels," IEEE Trans. Inform. Theory, vol. IT-12, pp. 189–195, Apr. 1966.
[12] R. Sundaresan and S. Verdú, "Robust decoding for timing channels," IEEE Trans. Inform. Theory, vol. 46, pp. 405–419, Mar. 2000.
[13] S. Verdú, "On channel capacity per unit cost," IEEE Trans. Inform. Theory, vol. 36, pp. 1019–1030, Sept. 1990.


[14] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.
[15] M. Médard and R. Gallager, "Bandwidth scaling for fading multipath channels," IEEE Trans. Inform. Theory, vol. 48, pp. 840–852, Apr. 2002.
[16] A. Lapidoth and P. Narayan, "Reliable communication under channel uncertainty," IEEE Trans. Inform. Theory, vol. 44, pp. 2148–2177, Oct. 1998.
[17] I. Csiszár and P. Narayan, "Channel capacity for a given decoding metric," IEEE Trans. Inform. Theory, vol. 41, pp. 35–43, Jan. 1995.
[18] J. Richters, "Communication over fading dispersive channels," MIT Res. Lab. Electron., Tech. Rep. 464, Nov. 30, 1967.
[19] I. Abou-Faycal, M. Trott, and S. Shamai (Shitz), "The capacity of discrete time Rayleigh fading channels," IEEE Trans. Inform. Theory, vol. 47, pp. 1290–1301, May 2001.
[20] G. Taricco and M. Elia, "Capacity of fading channels with no side information," Electron. Lett., vol. 33, pp. 1368–1370, July 31, 1997.
[21] A. Lapidoth and S. Moser, "Convex-programming bounds on the capacity of flat-fading channels," in Proc. Int. Symp. Information Theory (ISIT'01), Washington, DC, June 2001.
[22] ——, "On the fading number of multi-antenna systems," in Proc. 2001 IEEE Information Theory Workshop, Cairns, Australia, Sept. 2–7, 2001, pp. 110–111.
[23] ——, "On the capacity and fading number of multi-antenna systems over flat fading channels with memory," conf. paper, to be presented at the International Symposium on Information Theory (ISIT'02), Lausanne, Switzerland, June 30–July 5, 2002.
[24] ——, "Capacity bounds via duality with applications to multi-antenna systems on flat fading channels," paper, to be published.
[25] N. Merhav, G. Kaplan, A. Lapidoth, and S. Shamai (Shitz), "On information rates for mismatched decoders," IEEE Trans. Inform. Theory, vol. 40, pp. 1953–1967, Nov. 1994.
[26] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[27] V. V. Prelov and E. C. van der Meulen, "An asymptotic expression for the information and capacity of a multidimensional channel with weak input signals," IEEE Trans. Inform. Theory, vol. 39, pp. 1728–1735, Sept. 1993.
