Benedetto, Biglieri, Castellani, Digital Transmission Theory, Prentice-Hall, 1987 Appendix

det (AB) = det A · det B,  (B.12)

det A′ = det A,  (B.13)

det (cA) = c^N det A, for any number c.  (B.14)

A matrix is nonsingular if and only if its determinant is nonzero.

(c) The eigenvalues

Given an N × N square matrix A and a column vector u with N entries, consider the set of N linear equations

Au = λu,  (B.15)

where λ is a constant and the entries of u are the unknowns. There are only N values of λ (not necessarily distinct) such that (B.15) has a nonzero solution. These numbers are called the eigenvalues of A, and the corresponding vectors u the eigenvectors associated with them. Note that if u is an eigenvector associated with the eigenvalue λ then, for any complex number c, cu is also an eigenvector.

The polynomial a(λ) ≜ det (λI − A) in the indeterminate λ is called the characteristic polynomial of A. The equation

det (λI − A) = 0  (B.16)

is the characteristic equation of A, and its roots are the eigenvalues of A. The Cayley–Hamilton theorem states that every square N × N matrix A satisfies its characteristic equation. That is, if the characteristic polynomial of A is a(λ) = λ^N + a_1 λ^{N−1} + ⋯ + a_N, then

a(A) ≜ A^N + a_1 A^{N−1} + ⋯ + a_N I = 0,  (B.17)

where 0 is the null matrix (i.e., the matrix all of whose elements are zero). The monic polynomial μ(λ) of lowest degree such that μ(A) = 0 is called the minimal polynomial of A.

If f(x) is a polynomial in the indeterminate x, and u is an eigenvector of A associated with the eigenvalue λ, then

f(A) u = f(λ) u.  (B.18)

That is, f(λ) is an eigenvalue of f(A), and u is the corresponding eigenvector.

The eigenvalues λ_1, . . . , λ_N of the N × N matrix A have the properties

det (A) = ∏_{i=1}^{N} λ_i  (B.19)

and

tr (A) = ∑_{i=1}^{N} λ_i.  (B.20)

From (B.19), it is immediately seen that A is nonsingular if and only if none of its eigenvalues is zero.
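The eigenvalue properties (B.17)–(B.20) are easy to verify numerically. The following sketch assumes NumPy is available; the 2 × 2 matrix is a hypothetical example, not taken from the text.

```python
import numpy as np

# A hypothetical 2 x 2 matrix, used only to illustrate (B.17)-(B.20).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam = np.linalg.eigvals(A)

# (B.19)-(B.20): det(A) is the product, tr(A) the sum, of the eigenvalues.
assert np.isclose(np.prod(lam), np.linalg.det(A))
assert np.isclose(np.sum(lam), np.trace(A))

# Cayley-Hamilton (B.17): for N = 2 the characteristic polynomial is
# a(x) = x^2 - tr(A) x + det(A), and a(A) must be the null matrix.
residual = A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2)
assert np.allclose(residual, 0.0)
```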
Some Facts from Matrix Theory — Appendix B

(d) The spectral norm and the spectral radius

Given an N × N matrix A, its spectral norm ‖A‖ is the nonnegative number

‖A‖ ≜ sup_{x≠0} ‖Ax‖ / ‖x‖,  (B.21)

where x is an N-component column vector, and ‖u‖ denotes the Euclidean norm of the vector u:

‖u‖ ≜ √(∑_{i=1}^{N} |u_i|²) = √(u†u).  (B.22)

We have

‖AB‖ ≤ ‖A‖ · ‖B‖  (B.23)

and

‖Ax‖ ≤ ‖A‖ · ‖x‖  (B.24)

for any matrix B and vector x.

If λ_i, i = 1, . . . , N, denote the eigenvalues of A, the radius ρ(A) of the smallest disk centered at the origin of the complex plane that includes all these eigenvalues is called the spectral radius of A:

ρ(A) = max_i |λ_i|.  (B.25)

In general, for an arbitrary complex N × N matrix A, we have

ρ(A) ≤ ‖A‖  (B.26)

and

‖A‖ = √(ρ(A†A)).  (B.27)

If A = A†, then

ρ(A) = ‖A‖.  (B.28)

(e) Quadratic forms

Given an N × N square matrix A and a column vector x with N entries, we call a quadratic form the quantity

x†Ax = ∑_{i=1}^{N} ∑_{j=1}^{N} x_i* a_{ij} x_j.  (B.29)

B.3 SOME CLASSES OF MATRICES

Let A be an N × N square matrix.

(a) A is called symmetric if A′ = A.
(b) A is called Hermitian if A† = A.
(c) A is called orthogonal if A⁻¹ = A′.
(d) A is called unitary if A⁻¹ = A†.
(e) A is called diagonal if its entries a_{ij} are zero unless i = j. A useful notation for a diagonal matrix is A = diag (a_{11}, a_{22}, . . . , a_{NN}).
(f) A is called scalar if A = cI for some constant number c; that is, A is diagonal with equal entries on the main diagonal.
(g) A is called a Toeplitz matrix if its entries a_{ij} satisfy the condition

a_{ij} = a_{i−j}.  (B.30)

That is, its elements on the same diagonal are equal.
(h) A is called circulant if its rows are all the cyclic shifts of the first one:

a_{ij} = a_{(j−i) mod N}.  (B.31)

(i) A is called positive (nonnegative) definite if all its eigenvalues are positive (nonnegative). Equivalently, A is positive (nonnegative) definite if and only if for any nonzero column vector x the quadratic form x†Ax is positive (nonnegative).

Example B.1
Let A be Hermitian. Then the quadratic form f ≜ x†Ax is real.
In fact,

f* = (x†Ax)* = x′A*x* = x†A†x.  (B.32)

Since A† = A, this is equal to x†Ax = f, which shows that f is real. □

Example B.2
Consider the random column vector x = [x_1, x_2, . . . , x_N]′, and its correlation matrix

R ≜ E[xx†].  (B.33)

It is easily seen that R is Hermitian. Also, R is nonnegative definite; in fact, for any nonzero deterministic column vector a,

a†Ra = a†E[xx†]a = E[a†x x†a] = E[|a†x|²] ≥ 0,  (B.34)

with equality only if a†x = 0 almost surely, that is, only if the components of x are linearly dependent.

If x_1, . . . , x_N are samples taken from a wide-sense stationary discrete-time random process, and we define

r_{i−j} ≜ E[x_i x_j*],  (B.35)

it is seen that the entry of R in the ith row and the jth column is precisely r_{i−j}. This shows in particular that R is a Toeplitz matrix.

If G(f) denotes the discrete Fourier transform of the autocorrelation sequence (r_n), that is, G(f) is the power spectrum of the random process (x_n) [see (2.86)], the following can be shown.

(a) The eigenvalues λ_1, . . . , λ_N of R are samples (not necessarily equidistant) of the function G(f).
(b) For any continuous function ψ(·), we have the Toeplitz distribution theorem (Grenander and Szegő, 1958):

lim_{N→∞} (1/N) ∑_{i=1}^{N} ψ(λ_i) = ∫_{−1/2}^{1/2} ψ(G(f)) df.  (B.36)

Example B.3
Let C be a circulant N × N matrix whose first row is (c_0, c_1, . . . , c_{N−1}), each subsequent row being the cyclic shift of the row above it.  (B.37)

Let also w ≜ e^{j2π/N}, so that w^N = 1. Then the eigenvector associated with the eigenvalue λ_i is

u_i = [1, w^i, w^{2i}, . . . , w^{(N−1)i}]′  (B.38)

for i = 0, 1, . . . , N − 1. The eigenvalues of C are

λ_i = ∑_{k=0}^{N−1} c_k w^{ik},  i = 0, 1, . . . , N − 1,  (B.39)

and λ_i can be interpreted as the value of the Fourier transform of the sequence c_0, c_1, . . . , c_{N−1} taken at frequency i/N. □

Example B.4
If U is a unitary N × N matrix, and A is an arbitrary complex N × N matrix, pre- or postmultiplication of A by U does not alter its spectral norm; that is,

‖AU‖ = ‖UA‖ = ‖A‖. □  (B.40)

B.4 CONVERGENCE OF MATRIX SEQUENCES

Consider the sequence (A^n)_{n=1}^∞ of powers of the square matrix A.
As n → ∞, for A^n to tend to the null matrix 0 it is necessary and sufficient that the spectral radius of A be less than 1. Also, as the spectral radius of A does not exceed its spectral norm, for A^n → 0 it is sufficient that ‖A‖ < 1.

Consider now the matrix series

I + A + A² + ⋯ + A^n + ⋯.  (B.41)

For this series to converge, it is necessary and sufficient that A^n → 0 as n → ∞. If this holds, the sum of the series equals (I − A)⁻¹.

B.5 THE GRADIENT VECTOR

Let f(x) = f(x_1, . . . , x_N) be a differentiable real function of N real arguments. Its gradient vector, denoted by ∇f, is the column vector whose N entries are the derivatives ∂f/∂x_i, i = 1, . . . , N. If x_1, . . . , x_N are complex, that is,

x_i = a_i + jb_i,  (B.42)

the gradient of f(x) is the vector whose components are

∂f/∂a_i + j ∂f/∂b_i,  i = 1, . . . , N.

Example B.5
If a denotes a complex column vector, and f(x) ≜ ℜ[a†x], we have

∇f(x) = a. □  (B.43)

Example B.6
If A is a Hermitian N × N matrix, and f(x) ≜ x†Ax, we have

∇f(x) = 2Ax. □  (B.44)

B.6 THE DIAGONAL DECOMPOSITION

Let A be a Hermitian N × N matrix with eigenvalues λ_1, . . . , λ_N. Then A can be given the following representation:

A = UΛU⁻¹,  (B.45)

where Λ ≜ diag (λ_1, . . . , λ_N), and U is a unitary matrix, so that U⁻¹ = U†. From (B.45) it follows that

AU = UΛ,  (B.46)

which shows that the ith column of U is the eigenvector of A corresponding to the eigenvalue λ_i. For any column vector x, the following can be derived from (B.45):

x†Ax = ∑_{i=1}^{N} λ_i |y_i|²,  (B.47)

where y_1, . . . , y_N are the components of the vector y ≜ U†x.

BIBLIOGRAPHICAL NOTES

There are many excellent books on matrix theory, and some of them are certainly well known to the reader. The books by Bellman (1968) and Gantmacher (1959) are encyclopedic treatments in which details can be found about any topic one may wish to study in more depth. A modern treatment of matrix theory, with emphasis on numerical computations, is provided by Golub and Van Loan (1983).
Faddeev and Faddeeva (1963) and Varga (1962) include treatments of matrix norms and matrix convergence. The most complete reference on Toeplitz matrices is the book by Grenander and Szegő (1958). For a tutorial introductory treatment of Toeplitz matrices and a simple proof of the distribution theorem (B.36), the reader is referred to (Gray, 1971 and 1972). In Athans (1968) one can find a number of formulas for gradient vectors.

APPENDIX C
Variational Techniques and Constrained Optimization

In this appendix, we briefly list some of the optimization theory results used in the book. Our treatment is far from rigorous, because our aim is to describe a technique for constrained optimization rather than to provide a comprehensive development of the underlying theory. The reader interested in more details is referred to (Luenberger, 1969, pp. 171–190), from which our treatment is derived, or, alternatively, to (Gelfand and Fomin, 1963).

Let R be a function space (technically, it must be a normed linear space). Assume that a rule is provided assigning to each function f ∈ R a complex number φ[f]. Then φ is called a functional on R.

Example C.1
Let f(x) be a continuous function defined on the interval (a, b). We write f ∈ C(a, b). Then

φ_1[f] ≜ f(x_0),  a < x_0 < b,

φ_2[f] ≜ ∫_a^b f(x)h(x) dx,  h ∈ C(a, b),

and

φ_3[f] ≜ ∫_a^b f²(x) dx

are functionals on the space C(a, b). □

If φ is a functional on R, and f, h ∈ R, the functional

δφ[f; h] ≜ (d/dα) φ[f + αh] |_{α=0}  (C.1)

is called the Fréchet differential of φ. The concept of Fréchet differential provides a technique to find the maxima and minima of a functional. We have the following result: a necessary condition for φ[f] to achieve a maximum or minimum value for f = f_0 is that δφ[f_0; h] = 0 for all h ∈ R.

In many optimization problems the optimal function is required to satisfy certain constraints.
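As a small worked instance of definition (C.1), not in the original text, take the functional φ₃ of Example C.1 and compute its Fréchet differential directly:

```latex
\delta\varphi_3[f;h]
  \triangleq \left.\frac{d}{d\alpha}\int_a^b \bigl(f(x)+\alpha h(x)\bigr)^2\,dx\,\right|_{\alpha=0}
  = \left.\int_a^b 2\bigl(f(x)+\alpha h(x)\bigr)\,h(x)\,dx\,\right|_{\alpha=0}
  = 2\int_a^b f(x)\,h(x)\,dx .
```

Requiring δφ₃[f₀; h] = 0 for every h ∈ C(a, b) forces f₀ ≡ 0 on (a, b), which is indeed the (trivial) minimizing function of φ₃.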
We consider in particular the situation in which a functional φ on R must be optimized under n constraints given in the implicit form

ψ_1[f] = C_1,  ψ_2[f] = C_2,  . . . ,  ψ_n[f] = C_n,

where ψ_1, . . . , ψ_n are functionals on R, and C_1, . . . , C_n are constants. We have the following result: if f_0 ∈ R gives a maximum or a minimum of φ subject to the constraints ψ_i[f] = C_i, 1 ≤ i ≤ n, then there are n constants (Lagrange multipliers) λ_1, . . . , λ_n such that

δφ[f_0; h] + ∑_{i=1}^{n} λ_i δψ_i[f_0; h] = 0

for all h ∈ R.

Example C.2
As an application, one seeks the probability density function f(x) (so that f(x) ≥ 0 for −∞ < x < ∞) optimizing a given functional subject to the normalization constraint

∫_{−∞}^{∞} f(x) dx = 1.

The relevant Fréchet differential of the Lagrangian φ[f] + λψ[f] is computed from (C.1); requiring it to vanish for every h(x) yields the condition determining the optimal f_0. □

APPENDIX D
Transfer Functions of State Diagrams

A state diagram of a convolutional code such as that of Fig. 9.28 is a directed graph. To define a directed graph, we give a set V = {v_1, v_2, . . . } of vertices and a subset E of ordered pairs of vertices from V, called edges. A graph can be represented by drawing a set of points corresponding to the vertices and a set of arrows, one for each edge of E, each connecting two vertices. A path in a graph is a sequence of edges and can be denoted by giving the string of subsequent vertices included in the path. In the study of convolutional codes, we are interested in the enumeration of the paths with similar properties.

A simple directed graph is shown in Fig. D.1. There are three vertices, v_1, v_2, and v_3, and four edges (v_1, v_3), (v_1, v_2), (v_2, v_3), and (v_2, v_2). In this graph there are infinitely many paths between v_1 and v_3 because of the loop at the vertex v_2. One path of length 4 is, for instance, v_1 v_2 v_2 v_2 v_3. Each edge of a graph can be assigned a label. An important quantity, called the transmission between two vertices, is defined as the sum of the labels of all paths of any length connecting the two vertices. The label of a path is defined as the product of the labels of the edges of the path.
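This definition can be sketched numerically for the graph of Fig. D.1; the label values below are hypothetical, chosen with |L22| < 1 so that the path sum converges (cf. the closed form (D.1) derived next).

```python
# Transmission between v1 and v3 in the graph of Fig. D.1:
# edges v1->v3, v1->v2, the loop v2->v2, and v2->v3.
L13, L12, L22, L23 = 0.4, 0.5, 0.3, 0.6   # hypothetical labels

# Sum the path labels directly: either v1->v3, or v1->v2,
# k turns around the loop, then v2->v3 (k = 0, 1, 2, ...).
T = L13 + sum(L12 * L22**k * L23 for k in range(200))

# Summing the geometric loop series in closed form gives (D.1).
closed_form = L13 + L12 * L23 / (1 - L22)
assert abs(T - closed_form) < 1e-12
```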
For example, the label of the path v_1 v_2 v_2 v_2 v_3 is L_{12} L_{22}² L_{23}. The transmission between v_1 and v_3 in Fig. D.1 is then

T(v_1, v_3) = L_{13} + L_{12}L_{23} + L_{12}L_{22}L_{23} + L_{12}L_{22}²L_{23} + ⋯
            = L_{13} + L_{12}L_{23}(1 + L_{22} + L_{22}² + ⋯)
            = L_{13} + L_{12}L_{23}/(1 − L_{22}).  (D.1)

Notice that, in deriving (D.1), we have assumed that the labels are real numbers less than 1 in magnitude, so that the geometric series converges. Therefore, Fig. D.1 can be replaced by the scheme of Fig. D.2, with the label L̃_{13} given by (D.1).

Figure D.1 Example of a directed graph.
Figure D.2 Reduced graph used to compute the transmission T(v_1, v_3).

Given any graph, it is thus possible to compute the transmission between a pair of vertices by removing the intermediate vertices one by one and redefining the labels accordingly. As an example, the graph of Fig. D.3 replicates that of Fig. 9.32, with labels A, B, C, . . . , G. Using the result (D.1), the graph of Fig. D.3 can be replaced with that of Fig. D.4, in which the vertex d has been removed. By removing also the vertex b we get the graph of Fig. D.5, and finally the transmission between a and e:

T(a, e) = [ACG(1 − E) + ABFG] / (1 − E − CD + CDE − BDF).  (D.2)

This result, with a suitable substitution of labels, coincides with (9.104).

Figure D.3 Directed graph corresponding to the convolutional code state diagram of Fig. 9.32.
Figure D.4 First step in reducing the graph of Fig. D.3.
Figure D.5 Second step in reducing the graph of Fig. D.3.

There is another technique for evaluating T(a, e) that can be very useful in computations because it is based on a matrix description of the problem. Let us define by x_i the value of the accumulated path labels from the initial state a to the state i, as influenced by all other states. Therefore, the state equations for the graph of Fig. D.3 are

x_b = A + D x_c,
x_c = C x_b + F x_d,  (D.3)
x_d = B x_b + E x_d,
x_e = G x_c.

In this approach we have T(a, e) ≜ x_e, and therefore we can solve the system (D.3) and verify that x_e is given again by (D.2).

The set of equations (D.3) can be given a more general and formal expression. Define the two vectors

x ≜ (x_b, x_c, x_d, x_e)′,  x_0 ≜ (A, 0, 0, 0)′,  (D.4)

and the state transition matrix

         b  c  d  e
T =  b [ 0  D  0  0 ]
     c [ C  0  F  0 ]  (D.5)
     d [ B  0  E  0 ]
     e [ 0  G  0  0 ]

Using (D.4) and (D.5), the system (D.3) can be rewritten in matrix form as

x = Tx + x_0.  (D.6)

The formal solution to this equation can be written as

x = (I − T)⁻¹ x_0,  (D.7)

or as the matrix power series

x = (I + T + T² + T³ + ⋯) x_0.  (D.8)

Notice that this power series solution is very satisfying when the state diagram is viewed as describing a walk through a trellis. Each successive multiplication by T corresponds to a change in the state vector caused by going one level deeper into the trellis. Notice also that multiplication by T is very simple, since most of its entries are zero. When the number of states is small, the matrix inversion (D.7) is also useful to obtain the result directly in the form (D.2).

APPENDIX E
Approximate Computation of Averages

The aim of this appendix is to describe some techniques for the evaluation of bounds or numerical approximations to the average E[g(ξ)], where g(·) is an explicitly known deterministic function, and ξ is some random variable whose probability density function is not known explicitly, or is highly complex and hence very difficult to compute exactly. It is assumed that a certain amount of knowledge is available about ξ, expressed by a finite and usually small set of its moments. Also, we shall assume that the range of ξ lies in the interval [a, b], where both a and b are finite unless otherwise stated.
The techniques described hereafter are not intended to exhaust the set of possible methods for solving this problem. However, they are general enough to handle a large class of situations, and computationally efficient in terms of speed and accuracy. Also, instead of providing a single technique, we describe several, as we advocate that the specific problem at hand should determine the technique best suited to it from the viewpoint of computational effort required, accuracy, and applicability.

E.1 SERIES EXPANSION TECHNIQUE

In this section, we shall assume that the function g(x) is analytic at the point x_0. Hence, we can represent it in a neighborhood of x_0 using the Taylor series expansion

g(x) = g(x_0) + (x − x_0)g′(x_0) + (x − x_0)² g″(x_0)/2! + ⋯.  (E.1)

If the radius of convergence of (E.1) is large enough to include the range of the random variable ξ, and we define

c_k ≜ E[(ξ − x_0)^k],  (E.2)

then, averaging termwise the Taylor series expansion of g(ξ), we get from (E.2)

E[g(ξ)] = g(x_0) + c_1 g′(x_0) + c_2 g″(x_0)/2! + ⋯.  (E.3)

It can be seen from (E.3) that E[g(ξ)] can be evaluated on the basis of the knowledge of the sequence of moments (c_k)_{k=1}^∞, provided that the series converges. In particular, an approximate value of E[g(ξ)] based on a finite number of moments can be obtained by truncating (E.3):

E[g(ξ)] ≈ E_n[g(ξ)] ≜ ∑_{k=0}^{n} c_k g^{(k)}(x_0)/k!.  (E.4)

The error of this approximation is

E[g(ξ)] − E_n[g(ξ)] = ∑_{k=n+1}^{∞} c_k g^{(k)}(x_0)/k! = E[(ξ − x_0)^{n+1} g^{(n+1)}(x_0 + θ(ξ − x_0))/(n + 1)!],  (E.5)

where 0 < θ < 1. Depending on the characteristics of the specific application, either the second or the third term in (E.5) can be used to obtain a bound on the truncation error. In any case, bounds on the values of the derivatives of g(·) and of the moments (E.2) must be available.

As an example of application of this technique, let us consider the function

g(x) = erfc((h + x)/(√2 σ)),  (E.6)

where h and σ are known parameters. This function can be expanded in a Taylor series in the neighborhood of the origin by observing that (Abramowitz and Stegun, 1972, p. 298)

(d^k/dx^k) erfc(x) = (−1)^k (2/√π) H_{k−1}(x) e^{−x²},  (E.7)

where H_{k−1}(·) is the Hermite polynomial of degree (k − 1) (Abramowitz and Stegun, 1972, pp. 773–787). Thus,

g^{(k)}(0) = (−1)^k (2/√π) (√2 σ)^{−k} H_{k−1}(h/(√2 σ)) exp(−h²/(2σ²)),  (E.8)

and we finally get

E[g(ξ)] = erfc(h/(√2 σ)) + (2/√π) e^{−h²/(2σ²)} ∑_{k=1}^{∞} [(−1)^k c_k/(k! (√2 σ)^k)] H_{k−1}(h/(√2 σ)),  (E.9)

where c_k, k = 1, 2, . . . , are the central moments of the random variable ξ.

The proof that the series (E.9) is convergent, as well as an upper bound on the truncation error, can be obtained by using the following bound on the value of Hermite polynomials (Abramowitz and Stegun, 1972, p. 787):

|H_m(x)| ≤ β 2^{m/2} √(m!) e^{x²/2},  m = 1, 2, . . . ,  (E.10)

where

β ≅ 1.086435,  (E.11)

and the following bound on the central moments:

|c_k| ≤ E[|ξ − x_0|^k] ≤ χ^k,  k = 0, 1, . . . ,  (E.12)

where χ denotes the maximum value taken by |ξ − x_0|. The bound (E.12) is easily derived under the assumption that ξ is bounded. Using (E.10)–(E.12) term by term in (E.5), we get an explicit inequality (Prabhu, 1971) bounding the truncation error |R_n[g]| ≜ |E[g(ξ)] − E_n[g(ξ)]| and establishing the convergence of (E.9) for bounded ξ (Eq. (E.13)).

Alternatively, g(·) can be expanded in a series of orthogonal polynomials P_n(·) (with respect to a convenient weight function), g(x) = ∑_n a_n P_n(x), so that

E[g(ξ)] = ∑_{n=0}^{∞} a_n E[P_n(ξ)],  (E.22)

which can in turn be approximated by a finite sum. The computation of (E.22) requires knowledge of the "generalized moments" E[P_n(ξ)], which can be obtained, for example, as finite linear combinations of the central moments (E.2).

E.2 QUADRATURE APPROXIMATIONS

In this section we shall describe an approximation technique for E[g(ξ)] based on the observation that this average can be formally expressed as an integral:

E[g(ξ)] = ∫_a^b g(x) f_ξ(x) dx,  (E.23)

where f_ξ(·) denotes the probability density function of the random variable ξ. Having ascertained that the problem of evaluating E[g(ξ)] is indeed equivalent to the computation of an integral, we can resort to numerical techniques developed to compute approximate values of integrals of the form (E.23).
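Before developing quadrature rules, the truncated expansion (E.4) applied to the erfc example (E.6)–(E.9) can be sketched numerically. Here ξ is taken uniform on [−a, a], so the odd central moments vanish and c_k = a^k/(k + 1) for k even; the values of h, σ, and a are illustrative assumptions, not from the text.

```python
import math

h, sigma, a = 1.0, 1.0, 0.1          # hypothetical parameters
s = math.sqrt(2.0) * sigma

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via the standard recurrence."""
    hm, hc = 0.0, 1.0                # H_{-1} := 0, H_0 = 1
    for k in range(n):
        hm, hc = hc, 2.0 * x * hc - 2.0 * k * hm
    return hc

# Truncated series (E.4) with the derivatives (E.8), even terms only.
u = h / s
approx = math.erfc(u)
for k in range(2, 7, 2):
    ck = a**k / (k + 1)              # central moments of U(-a, a)
    gk = ((-1)**k * (2 / math.sqrt(math.pi)) * s**(-k)
          * hermite(k - 1, u) * math.exp(-u * u))
    approx += ck * gk / math.factorial(k)

# Brute-force midpoint average over the uniform density, for comparison.
M = 20000
brute = sum(math.erfc((h - a + 2 * a * (i + 0.5) / M) / s)
            for i in range(M)) / M
assert abs(approx - brute) < 1e-8
```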
The most widely investigated techniques for approximating a definite integral lead to the formula

∫_a^b g(x) f_ξ(x) dx ≈ ∑_{i=1}^{N} w_i g(x_i),  (E.24)

a linear combination of values of the function g(·). The x_i, i = 1, 2, . . . , N, are called the abscissas (or points, or nodes) of the formula, and the w_i, i = 1, 2, . . . , N, are called its weights (or coefficients). The set of abscissas and weights is usually referred to as a quadrature rule. A systematic introduction to the theory of quadrature rules of the form (E.24) is given in Krylov (1962).

The quadrature rule is chosen to render (E.24) as accurate as possible. A first difficulty with this theory arises when one wants to define how to measure the accuracy of a quadrature rule. Since we want the abscissas and weights to be independent of g(·), and hence the same for all possible such functions, the definition of what is meant by "accuracy" must be made independent of the particular choice of g(·). The classical approach is to select a number of probe functions and constrain the quadrature rule to be exact for them. Choosing the probe functions to be polynomials, we say that the quadrature rule (E.24) has degree of precision ν if it is exact whenever g(·) is a polynomial of degree ≤ ν (or, equivalently, whenever g(x) = 1, x, . . . , x^ν) and it is not exact for g(x) = x^{ν+1}.

Once a criterion of goodness for quadrature rules has been defined, the next step is to investigate which are the best quadrature rules and how they can be computed. The answer is provided by the following result from numerical analysis, slightly reformulated to fit our framework (see Krylov, 1962, for more details and a proof): given a random variable ξ with range [a, b] and all of whose moments exist, it is possible to define a sequence of polynomials P_0(x), P_1(x), . . . , with deg P_i(x) = i, that are orthonormal with respect to ξ; that is,

E[P_n(ξ)P_m(ξ)] = δ_{nm},  m, n = 0, 1, 2, . . . .  (E.25)

Denote by x_1, . . . , x_N the zeros of P_N(x); taking these as the abscissas of (E.24), with weights determined by the moments of ξ, yields the Gauss quadrature rule, which achieves the maximum degree of precision 2N − 1. As N → ∞, the RHS of (E.24) converges to the value of the LHS "for almost any conceivable function" g(·) "which one meets in practice" (Stroud and Secrest, 1966, p. 13). However, this is not true in practice, essentially because the moments of ξ needed in the computation are not known with infinite accuracy. Computational experience shows that the Cholesky decomposition of the moment matrix M is the crucial step of the algorithm for the computation of Gauss quadrature rules, since M gets increasingly ill-conditioned with increasing N. Round-off errors may cause the computed M to be no longer positive definite; its Cholesky decomposition then cannot be performed, because it implies taking the square root of negative numbers (Luvison and Pirani, 1979). In practice, values of N greater than 7 or 8 can rarely be achieved; the accuracy thus obtained is, however, satisfactory in most situations.

E.3 MOMENT BOUNDS

We have seen in the preceding section that the quadrature-rule approach allows E[g(ξ)] to be approximated in the form of a linear combination of values of g(·). This is equivalent to substituting, for the actual probability density function f_ξ(x), the discrete density

f̂_ξ(x) = ∑_{i=1}^{N} w_i δ(x − x_i),  (E.33)

where {w_i, x_i}_{i=1}^N are chosen so as to match the first 2N − 1 moments of ξ according to (E.31).

A more refined approach can be taken by looking for upper and lower bounds to E[g(ξ)], still based on the moments of ξ. In particular, we can set the goal of finding bounds to E[g(ξ)] that are in some sense optimum (i.e., they cannot be further tightened with the available information on ξ).
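The degree-of-precision property of Gauss rules is easy to check concretely. For ξ uniform on [−1, 1] the orthonormal polynomials of (E.25) are the normalized Legendre polynomials, so the Gauss rule of Section E.2 is the classical Gauss-Legendre rule, which NumPy supplies; this is a sketch, not the moment-matrix algorithm discussed above.

```python
import numpy as np

N = 4
nodes, w = np.polynomial.legendre.leggauss(N)
w = w / 2.0                       # absorb the uniform density f(x) = 1/2

# Degree of precision 2N - 1 = 7: the rule reproduces the moments
# E[xi^k] of U(-1, 1) exactly for k = 0, ..., 7 ...
for k in range(2 * N):
    exact = 0.0 if k % 2 else 1.0 / (k + 1)
    assert abs(np.sum(w * nodes**k) - exact) < 1e-12

# ... but not for k = 2N = 8, where E[xi^8] = 1/9 is missed.
err = abs(np.sum(w * nodes**8) - 1.0 / 9.0)
assert err > 1e-6
```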
The problem can be formulated as follows: given a random variable ξ with range in the finite interval [a, b], whose first M moments μ_1, μ_2, . . . , μ_M are known, we want to find the sharpest upper and lower bounds to the integral

E[g(ξ)] = ∫_a^b g(x) f_ξ(x) dx,  (E.34)

where g(·) is a known function and f_ξ(·) is the (unknown) probability density function of ξ. To solve this problem, we look at the set of all possible f_ξ(·) whose range is [a, b] and whose first M moments are μ_1, . . . , μ_M. Then we compute the maximum and minimum values of (E.34) as f_ξ(·) runs through that set. The bounds obtained are optimum, because it is certain that a pair of random variables exists, say ξ′ and ξ″, with range in [a, b] and meeting the lower and the upper bound, respectively, with the equality sign.

This extremal problem can be solved by using a set of results due essentially to the Russian mathematician M. G. Krein (see Krein and Nudel'man, 1977). These results can be summarized as follows.

(a) If the function g(·) has a continuous (M + 3)th derivative, and g^{(M+3)}(·) is everywhere nonnegative in [a, b], then the optimum bounds to E[g(ξ)] are of the form

∑_{i=1}^{N′} w_i′ g(x_i′) ≤ E[g(ξ)] ≤ ∑_{i=1}^{N″} w_i″ g(x_i″).  (E.35)

This is equivalent to saying that the two "extremal" probability density functions are discrete, which allows the upper and lower bounds to be written in the form of quadrature rules. If g^{(M+3)}(·) is nonpositive instead of nonnegative, it suffices to consider −g(·) instead of g(·).

(b) If M is odd, then N′ = (M + 1)/2 and N″ = (M + 3)/2. Also, {w_i′, x_i′}_{i=1}^{N′} is a Gauss quadrature rule, and {w_i″, x_i″}_{i=1}^{N″} is the quadrature rule having the maximum degree of precision achievable under the constraints x_1″ = a, x_{N″}″ = b. If M is even, then N′ = N″ = (M + 2)/2. Also, {w_i′, x_i′}_{i=1}^{N′} (respectively, {w_i″, x_i″}_{i=1}^{N″}) is the quadrature rule having the maximum achievable degree of precision under the constraint x_1′ = a (respectively, x_{N″}″ = b).

A technical condition involved in the derivation of these results requires that the Gram matrix of the moments μ_1, . . . , μ_M be positive definite. For our purposes, a simple sufficient condition is that the cumulative distribution function of ξ have more than M + 1 points of increase. If ξ is a continuous random variable, this condition is immediately satisfied. If ξ is a discrete random variable, it means that ξ must take on more than M + 1 values. As M + 1 is generally a small number, the latter requirement is almost always satisfied in practice; otherwise, the value of E[g(ξ)] can be evaluated explicitly, with no need to bound it.

Computation of moment bounds

Once the moments μ_1, . . . , μ_M have been computed, in order to use Krein's results explicitly the quadrature rules {w_i′, x_i′}_{i=1}^{N′} and {w_i″, x_i″}_{i=1}^{N″} must be evaluated. From the preceding discussion it will not be surprising that the algorithms for computing the moment bounds (E.35) bear a close resemblance to those developed for computing Gauss quadrature rules. Indeed, the task is still to find abscissas and weights of a quadrature rule achieving the maximum degree of precision, possibly under constraints on the location of one or two abscissas. Several algorithms are available for this computation (Yao and Biglieri, 1980; Omura and Simon, 1980). Here we shall briefly describe one of them, based on the assumption that Golub and Welsch's algorithm described in Section E.2 has been implemented. In particular, we shall show how the known moments of ξ must be modified for use as inputs to that algorithm, and how the weights and abscissas obtained at its output must be modified.

TABLE E.1 Computation of abscissas and weights for moment bounds (input moments for Golub and Welsch's algorithm, and the corresponding abscissa and weight transformations, for the lower and upper bounds with M odd and M even).
These computations are summarized in Fig. E.1 and Table E.1. By using the modified moments ν_i of the first column of Table E.1 as input to Golub and Welsch's (or an equivalent) algorithm, the abscissas and weights of the second column are obtained. Abscissas and weights for the moment bounds, to be used in (E.35), are then obtained by performing the operations shown in the third and fourth columns, respectively.

Figure E.1 Summary of the computations to be performed for the evaluation of moment bounds: moments → modify moments → Golub and Welsch's algorithm → {w_i, x_i} → modify weights and abscissas.

These computations will yield tighter bounds as the number M of available moments increases. However, as was discussed in the context of Gauss quadrature rules, M cannot be increased without bound because of computational instabilities. In practice, it is rarely possible to increase M beyond 15 or so. But this gives sufficient accuracy in most applications.

Example E.2 (Yao and Biglieri, 1980)
As an example of application of the theories presented in this section, consider the function g(x) given in (E.6). Using the expression (E.7) for the kth derivative of the error function, we get

g^{(k)}(x) = (−1)^k (2/√π) (√2 σ)^{−k} H_{k−1}((h + x)/(√2 σ)) exp(−(h + x)²/(2σ²)).  (E.36)

For a fixed integer k, the sign of the exponential factor in (E.36) does not depend on the value of x. On the contrary, H_{k−1}((h + x)/(√2 σ)), a polynomial having only simple zeros, changes sign whenever its argument crosses a zero. Hence, g^{(k)} is continuous and of constant sign in [a, b] if and only if [a, b] does not contain a root of the equation

H_{k−1}((h + x)/(√2 σ)) = 0

as an interior point. Furthermore, a simple sufficient condition for g^{(M+3)}(·) to be of constant sign in [a, b] is that the largest root of the preceding equation (with k = M + 3) be smaller than a; that is,

(h + a)/(√2 σ) > z_max^{(M+2)},  (E.37)

where z_max^{(M+2)} is the largest zero of H_{M+2}(z). Table E.2 shows the values of the largest zeros of the Hermite polynomials of degrees 3 to 20.

TABLE E.2 The largest zero z_max^{(m)} of the Hermite polynomial H_m(z)

m   z_max^{(m)}
3   1.22474
4   1.65068
5   2.02018
6   2.35060
7   2.65196
8   2.93064
9   3.19099
10  3.43616
11  3.66847
12  3.88972
13  4.10134
14  4.30445
15  4.49999
16  4.68874
17  4.87135
18  5.04836
19  5.22027
20  5.38748

As a simple numerical illustration, let a = −1, b = 1, and M = 3, with h and σ chosen so that condition (E.37) holds (with M = 3 this requires (h + a)/(√2 σ) > z_max^{(5)} = 2.02018). Then the lower bound uses N′ = 2 abscissas and the upper bound uses N″ = 3 abscissas with x_1″ = a and x_3″ = b; the corresponding weights follow from Table E.1, and the moment bounds for g(·) as in (E.6) take the form (E.35). Notice that the interval [a, b] must be small enough, through (E.37), for this technique to be applicable. □

E.4 APPROXIMATING THE AVERAGE E[g(ξ, η)]

Before ending this appendix, we briefly discuss the problem of evaluating approximations to the average E[g(ξ, η)], where g(·, ·) is a known deterministic function, and ξ, η are two correlated random variables with range in a region R of the plane. Exact computation of this average requires knowledge of the joint probability density function f_{ξη}(x, y) of the pair of random variables ξ, η. This may not be available, or the evaluation of the double integral

E[g(ξ, η)] = ∬_R g(x, y) f_{ξη}(x, y) dx dy  (E.39)

may be unfeasible. In practice, it is often far easier to compute a small number of joint moments

μ_{l,m} ≜ E[ξ^l η^m],  l, m = 0, 1, . . . ,  (E.40)

and use this information to approximate E[g(ξ, η)].

The first technique that can be used for this purpose is based on the expansion of g(ξ, η) in a Taylor series. The terms of this series involve products ξ^l η^m (l, m = 0, 1, . . . ), so that truncating the series and averaging it termwise provides the desired approximation.

Another possible technique uses cubature rules, a two-dimensional generalization of the quadrature rules discussed in Section E.2. With this approach, the approximation of E[g(ξ, η)] takes the form

E[g(ξ, η)] ≈ ∑_{i=1}^{N} w_i g(x_i, y_i).  (E.41)
As a generalization of the one-dimensional case, we say that the cubature rule {w_i, x_i, y_i}_{i=1}^N has degree of precision ν if (E.41) holds with the equality sign whenever g(x, y) is a polynomial in x and y of degree ≤ ν, but not for all polynomials of degree ν + 1. Unfortunately, the construction of cubature rules with maximum degree of precision is, in general, an unsolved problem, and solutions are available only in some special cases. For example, in Mysovskih (1968) a cubature rule with degree of precision 4 and N = 6 is derived. This is valid when the region R and the function f_{ξη}(·, ·) are symmetric relative to both coordinate axes; thus, we must have

f_{ξη}(x, y) = f_{ξη}(−x, y) = f_{ξη}(x, −y),  (x, y) ∈ R.  (E.42)

With these assumptions, μ_{l,m} = 0 if at least one of the numbers l and m is odd. The moments needed for the computation of the cubature rule are then μ_{2,0}, μ_{0,2}, μ_{4,0}, μ_{0,4}, and μ_{2,2}. Under the same symmetry assumptions, a cubature rule with N = 19 and degree of precision 9 can be obtained by using the moments μ_{2,0}, μ_{0,2}, μ_{4,0}, μ_{0,4}, μ_{2,2}, μ_{6,0}, μ_{0,6}, μ_{4,2}, μ_{2,4}, μ_{8,0}, μ_{0,8}, μ_{6,2}, μ_{2,6}, and μ_{4,4} (Piessens and Haegemans, 1975).

If a higher degree of precision is sought, or the symmetry requirements are not satisfied, one can resort to "good" cubature rules that can be computed through the joint moments (E.40). Formulas of the type

E[g(ξ, η)] ≈ ∑_{i=1}^{N_x} ∑_{j=1}^{N_y} w_{ij} g(x_i, y_{ij})  (E.43)

with degree of precision ν = min(2N_x − 1, 2N_y − 1) can be found by using the moments μ_{l,0}, l = 1, . . . , 2N_x, μ_{0,k}, k = 1, . . . , 2N_y, and the mixed moments μ_{l,k}, l = 1, . . . , N_x − 1, k = 1, . . . , N_y − 1. Equivalent algorithms for the computation of weights and abscissas in (E.43) were derived in Luvison and Navino (1976) and Omura and Simon (1980).

We conclude by commenting briefly on the important special case in which the two random variables ξ and η are independent.
In this situation, by using moments of ξ one can construct the Gauss quadrature rule {w_i, x_i}_{i=1}^{N_x}, and by using moments of η one can similarly obtain the Gauss quadrature rule {u_j, y_j}_{j=1}^{N_y}. Then it is a simple matter to show that the following cubature rule can be obtained:

    E[g(ξ, η)] ≈ Σ_{i=1}^{N_x} Σ_{j=1}^{N_y} w_i u_j g(x_i, y_j),    (E.44)

and this has degree of precision ν = min(2N_x − 1, 2N_y − 1).

APPENDIX F

The Viterbi Algorithm

Consider K real scalar functions λ_0(τ_0), λ_1(τ_1), . . . , λ_{K−1}(τ_{K−1}), whose arguments τ_0, . . . , τ_{K−1} can take on a finite number of values, and their sum

    λ(τ_0, τ_1, . . . , τ_{K−1}) ≜ Σ_{i=0}^{K−1} λ_i(τ_i).    (F.1)

In this appendix we consider the problem of computing the minimum of λ(·), that is, the quantity

    μ ≜ min_{τ_0, . . . , τ_{K−1}} λ(τ_0, . . . , τ_{K−1}).    (F.2)

Of course, this problem is trivial when the τ_i are in a sense "independent" (i.e., the set of values that each of them can assume does not depend on the values taken by the other variables). In this case, the solution of the problem is obvious:

    μ = Σ_{i=0}^{K−1} min_{τ_i} λ_i(τ_i),    (F.3)

and the minimization of a function of K variables is reduced to the minimization of K functions of one variable. In mathematical parlance, this situation corresponds to the case in which the range of the variables τ_0, τ_1, . . . , τ_{K−1} is the direct product of the ranges of the τ_i, 0 ≤ i ≤ K − 1.

Instead, let us consider a more general situation in which the range of τ_0, . . . , τ_{K−1} is something different from this direct product. In this case, the value taken on by any of the τ_i affects the range of the remaining variables, and (F.3) cannot be applied.

Example F.1
A tourist wants to travel by car from Los Angeles to Denver in five days. He does not want to drive for more than 350 miles in a day, and wants to spend every night in a motel of his favorite chain (say, Motel 60). With these constraints, the routes among which he can choose are summarized in Fig. F.1. The best choice will be the one minimizing the total mileage.
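A setting like Example F.1 — layered stages with a constrained set of admissible legs — can, when small, be minimized by exhaustive enumeration. The sketch below uses an invented layered route graph (the node names and mileages are hypothetical, not those of Fig. F.1):

```python
# Exhaustive minimization of lambda(tau_0, ..., tau_{K-1}): enumerate
# every admissible route and keep the shortest.  The layered graph
# below is invented for illustration; it is NOT the map of Fig. F.1.
edges = {
    "LA": {"B1": 300, "B2": 250},   # day 1: two admissible legs
    "B1": {"C1": 200, "C2": 340},   # day 2, and so on
    "B2": {"C1": 330, "C2": 280},
    "C1": {"DEN": 310},
    "C2": {"DEN": 260},
}

def all_routes(node, goal):
    """Yield (route, total length) for every admissible route node -> goal."""
    if node == goal:
        yield [node], 0
    for nxt, leg in edges.get(node, {}).items():
        for route, length in all_routes(nxt, goal):
            yield [node] + route, leg + length

best_route, best_len = min(all_routes("LA", "DEN"), key=lambda r: r[1])
print(best_route, best_len)   # -> ['LA', 'B2', 'C2', 'DEN'] 790
```

The number of routes enumerated this way grows exponentially with the number of stages, which is exactly the cost the Viterbi algorithm developed below avoids.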
Clearly, the total mileage of a given route is the sum of five terms, each representing the distance driven in a day. Hence, this shortest-route problem has the form of (F.2), where τ_i is the route followed on the ith day, and λ_i(τ_i) is its length. Also, (F.3) cannot hold, as, for example, the choice to travel from Los Angeles to Blythe on the first day will minimize λ_0(τ_0), but will result in a λ_1(τ_1) value of 228, which is not its minimum value. □

Figure F.1  Which is the shortest route from Los Angeles to Denver? (The figure shows the admissible five-day routes, with overnight stops such as Blythe, Williams, Gallup, Page, Durango, and Grand Junction, and each leg labeled with its mileage.)

When the "trivial" solution (F.3) is not valid, in principle one can solve (F.2) by computing all the possible values of λ(τ_0, . . . , τ_{K−1}) (which are finite in number) and choosing the smallest. In certain cases (for instance, in Example F.1) this can be done; but here we assume that this task is computationally impractical because of the large number of values to be enumerated. Hence we shall look for an algorithm that allows us to avoid the brute-force approach. To this end, we shall first describe a sequential algorithm to compute μ in (F.2), and then investigate under which assumptions such an algorithm can be simplified, and to what extent.

A sequential minimization algorithm

When the variables τ_0, . . . , τ_{K−1} are not "independent" (in the sense previously stated), if we attempt to minimize λ(·) with respect to τ_0 alone, our result will depend on the values of τ_1, . . . , τ_{K−1}. This is equivalent to stating that the minimum found will be a function of τ_1, . . . , τ_{K−1}, say μ_1(τ_1, . . . , τ_{K−1}). Observe also that the minimization of λ(·) with respect to τ_0 will involve only the function λ_0(τ_0) and not the remaining terms of the summation in (F.1). The resulting function of τ_1, . . . , τ_{K−1} can be further minimized with respect to τ_1. This operation will involve only μ_1(τ_1, . . . , τ_{K−1}) + λ_1(τ_1).
The result is now a function of τ_2, . . . , τ_{K−1}, say μ_2(τ_2, . . . , τ_{K−1}). Repeating this procedure a sufficient number of times, we finish with a function of τ_{K−1} alone, which can finally be minimized to provide us with the desired value μ.

This recursive procedure can be formalized in the following manner. Denoting by {τ_l | τ_{l+1}, . . . , τ_{K−1}}, l = 0, . . . , K − 2, the set of values that τ_l is allowed to take on once the values of τ_{l+1}, . . . , τ_{K−1} have been chosen, the procedure involves the following steps:

    μ_1(τ_1, . . . , τ_{K−1}) = min_{τ_0 ∈ {τ_0 | τ_1, . . . , τ_{K−1}}} λ_0(τ_0),

    μ_{l+1}(τ_{l+1}, . . . , τ_{K−1}) = min_{τ_l ∈ {τ_l | τ_{l+1}, . . . , τ_{K−1}}} [μ_l(τ_l, . . . , τ_{K−1}) + λ_l(τ_l)],   l = 1, . . . , K − 2,    (F.4)

    μ = min_{τ_{K−1}} μ_{K−1}(τ_{K−1}).

The Viterbi algorithm

Simplifications of the basic algorithm (F.4) can be derived from the specific structure of the sets {τ_l | τ_{l+1}, . . . , τ_{K−1}}. The simplest possible situation arises when, for all l = 0, . . . , K − 2, we have

    {τ_l | τ_{l+1}, . . . , τ_{K−1}} = {τ_l},    (F.5)

that is, the values that τ_l can take on are not influenced by those of τ_{l+1}, . . . , τ_{K−1}. This is the case in which we deal with "independent" variables, and we see that (F.4) reduces to (F.3).

A more interesting situation arises when we consider the second simplest case, in which, for all l = 0, . . . , K − 2,

    {τ_l | τ_{l+1}, . . . , τ_{K−1}} = {τ_l | τ_{l+1}},    (F.6)

that is, the values that each τ_l is allowed to take on depend only on τ_{l+1}. In this case, (F.4) simplifies to

    μ_1(τ_1) = min_{τ_0 ∈ {τ_0 | τ_1}} λ_0(τ_0),

    μ_{l+1}(τ_{l+1}) = min_{τ_l ∈ {τ_l | τ_{l+1}}} [μ_l(τ_l) + λ_l(τ_l)],   l = 1, . . . , K − 2,    (F.7)

    μ = min_{τ_{K−1}} μ_{K−1}(τ_{K−1}),

which is the celebrated Viterbi algorithm.

The Viterbi algorithm can be given an interesting formulation once the minimization problem to be solved has been stated as the task of finding the shortest route through a graph. To do this, let us describe the general model that, in its various forms, leads to the applications of the Viterbi algorithm in this book. Consider an information source generating a finite sequence ξ_0, ξ_1, . . . , ξ_{K−1} of independent symbols that can take on a finite number M of values.
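Before specializing to the shift-register model developed next, the recursion (F.7) can be made concrete in code. The sketch below runs it on a made-up toy trellis (states and branch lengths invented for illustration), keeping for every state only the shortest path reaching it:

```python
# The recursion (F.7) in code: at each step keep, for every state, only
# the survivor path and its metric mu_l.  The trellis is a hypothetical
# example: for each l it maps an admissible transition
# (state, next state) to its branch length lambda_l.
trellis = [
    {("S", "a"): 2, ("S", "b"): 5},
    {("a", "a"): 1, ("a", "b"): 4, ("b", "a"): 6, ("b", "b"): 1},
    {("a", "E"): 3, ("b", "E"): 2},
]

def viterbi(trellis, start="S"):
    mu = {start: (0, [start])}          # state -> (path metric mu_l, survivor)
    for branches in trellis:            # one add-compare-select step per l
        nxt = {}
        for (u, v), length in branches.items():
            if u not in mu:
                continue
            cand = mu[u][0] + length    # mu_l(tau_l) + lambda_l(tau_l)
            if v not in nxt or cand < nxt[v][0]:
                nxt[v] = (cand, mu[u][1] + [v])
        mu = nxt
    return min(mu.values())             # final minimization over tau_{K-1}

print(viterbi(trellis))   # -> (6, ['S', 'a', 'a', 'E'])
```

Each step performs one addition and one comparison per admissible branch, which is the operation count discussed under "Complexity of the Viterbi algorithm" below.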
These symbols are fed sequentially to a system whose lth output, say x_l, depends in a deterministic, time-invariant way on the present input and the L previous inputs:

    x_l = g(ξ_l, ξ_{l−1}, . . . , ξ_{l−L}).    (F.8)

The sequence (x_l) can be thought of as generated by a shift register, as shown in Fig. F.2. With this model, it is usual to define the state of the shift register at the emission of the symbol ξ_l as the vector of the symbols contained in its cells, that is,

    σ_l ≜ (ξ_{l−1}, . . . , ξ_{l−L}),    (F.9)

so we can say that the output x_l depends on the present input ξ_l and on the state σ_l of the shift register:

    x_l = g(ξ_l, σ_l).    (F.10)

When the source emits the symbol ξ_l, the shift register is brought to the state σ_{l+1} = (ξ_l, ξ_{l−1}, . . . , ξ_{l−L+1}). Now we can define the transition between these two states as

    τ_l ≜ (σ_l, σ_{l+1}).    (F.11)

Figure F.2  Generation of a shift-register state sequence.

Figure F.3  (a) Example of a shift-register state sequence. (b) The associated trellis diagram.

Determination of the range of the index l in (F.8) to (F.11) requires some attention, because for l < L and for l > K − 1 the function g(·) in (F.8) has a number of arguments that are not defined. A possible choice is to assume ξ_l = 0 for l < 0 and l > K − 1, and to define the function g(·) accordingly. In this case, l can be assumed to range from 0 to K + L − 1. Otherwise, we may want g(·) to be defined only when its L + 1 arguments belong to the range of the symbols ξ_l. In this situation, we must assume l in (F.8) to range from L to K − 1, which implies in particular that each state can take on M^L values and each transition M^{L+1} values.

Consider now a graphical representation of this process, in the form of a trellis. This is a treelike graph with remerging branches, where the nodes on the same vertical line represent the distinct states for a given l
(this index is commonly called the time), and the branches represent the transitions from one state to the next. An example should clarify this point.

Example F.2
Consider a shift-register state sequence with L = 2, M = 2, ξ_l ∈ {−1, 1}, and K = 5. For 2 ≤ l ≤ 5, the states σ_l can take on four values: (±1, ±1). For l < 2 and l > 5, we should also consider states including one or two zeros, and assume that g(·) is also defined when one or more of its arguments are zero. In this situation, Fig. F.3(b) depicts the trellis diagram for 2 ≤ l ≤ 5. Notice that the actual form of g(·) is irrelevant to the structure of the trellis diagram. □

Let us now return to the original minimization problem, reformulated by means of the trellis diagram just defined. It is sufficient to assume that the function to be minimized, the one defined in (F.1), has as its arguments the transitions (F.11) between states, so that the values λ_l(τ_l) can be associated with the branches of the trellis (they are usually called the lengths, or the metrics, of the branches). Stated this way, it is easy to see that the set of variables minimizing λ(·) corresponds to the minimum-length path through the trellis. Also, it is relatively simple to show that (F.6) holds in this situation; hence, the Viterbi algorithm can be applied. It suffices to observe that the sequence τ_{l+1}, . . . , τ_{K−1} corresponds to a path in the trellis from σ_{l+1} to σ_K, and that the set of transitions τ_l compatible with such a path is determined only by σ_{l+1}.

Example F.3
Let us illustrate the application of the Viterbi algorithm to a minimization problem formulated with a trellis diagram. Figure F.4 shows a trellis whose branches are labeled according to their respective lengths. Figure F.5 shows the five steps to be performed to determine the shortest path through this trellis, as briefly described in the following.
With reference to (F.7), the first step in the algorithm is to choose, for each τ_0 (or, equivalently, for each state σ_1), the branch leading to σ_1 and having the minimum length (this is trivial, because there is only one such branch in our example). For each value of σ_1, store this shortest path and its length, which corresponds to the value of μ_1 in (F.7).

Extend now the paths just selected by one branch. For each state σ_2, select the branch leading to it such that the sum of its length λ_1 plus the value of μ_1 just stored is at a minimum. Store these unique paths, together with their total lengths μ_2.

Extend again these paths by one branch. For each state σ_3, select the branch leading to it such that λ_2 + μ_2 is at a minimum, and store these minimum paths, together with their lengths.

Similar steps should be performed for l = 4 and l = 5; when l = 5 the algorithm is terminated. Then we are left with a single path, the shortest one, together with its length μ. These steps are illustrated in Fig. F.5. □

An interesting fact can be observed from this example. At step 4 (see Fig. F.5d) we see that all the paths selected so far have a common part. In fact, there is a merge, in that all these paths pass through a single node (the uppermost one for l = 3). Clearly, whatever happens from now on will not change anything before this merge. Hence we can deduce that the optimum path will certainly include the first three branches of the path depicted in Fig. F.5d.

Complexity of the Viterbi algorithm

Finally, let us briefly discuss the computational complexity of the Viterbi algorithm. Let N_s denote the maximum number of states in the trellis diagram for any l, and N_b the maximum number of branches leading from the nodes corresponding to time index l to the nodes corresponding to time index l + 1, for any l (e.g., N_s = 4 and N_b = 8 in the trellis diagram of Fig. F.4).
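These counts follow directly from the shift-register model; a quick sketch for the parameters of Example F.2 (the symbol alphabet {−1, +1} is taken from that example):

```python
from itertools import product

# Counting states and branches of a shift-register process: the state is
# (xi_{l-1}, ..., xi_{l-L}), so N_s = M**L and N_b = M**(L+1).  Using the
# parameters of Example F.2 (M = 2, L = 2, symbols in {-1, +1}):
M, L = 2, 2
symbols = (-1, +1)
states = list(product(symbols, repeat=L))
# feeding a new symbol x shifts it in and drops the oldest cell
transitions = [(s, (x,) + s[:-1]) for s in states for x in symbols]
print(len(states), len(transitions))   # -> 4 8
```

The printed counts match the N_s = 4 and N_b = 8 quoted above for the trellis of Fig. F.4.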
As far as memory is concerned, the algorithm requires no more than N_s storage locations, each capable of storing a surviving path and its length. The computations to be performed in each unit of time are no more than N_b additions and N_b comparisons. Since, for a shift-register sequence, N_s = M^L and N_b = M^{L+1}, the complexity of the Viterbi algorithm increases exponentially with the length L of the shift register. Notice that the amount of computation required for the minimization of the function λ(·) defined in (F.1) grows only linearly with K, whereas the exhaustive enumeration of all the values of λ(·) would require a number of computations growing exponentially with K.

Figure F.4  Trellis labeled with branch lengths.

BIBLIOGRAPHICAL NOTES

The Viterbi algorithm was proposed by Viterbi (1967) as a method for decoding convolutional codes (see Chapter 9). Since then, it has been applied to a number of minimization problems arising in the demodulation of digital signals generated by a modulator with memory (see Chapter 4) or in sequence estimation for channels with intersymbol interference (Chapters 7 and 10). A survey of applications of the Viterbi algorithm, as well as a number of details regarding its implementation, can be found in Forney (1973). The connections between the Viterbi algorithm and dynamic programming techniques were first recognized by Omura (1969).