State Variables for Engineers
PAUL M. DERUSSO ROB J. ROY CHARLES M. CLOSE

Department of Electrical Engineering


Rensselaer Polytechnic Institute

JOHN WILEY & SONS


New York • Chichester • Brisbane • Toronto
Copyright © 1965 by John Wiley & Sons, Inc. All Rights Reserved.
Reproduction or translation of any part of this work beyond
that permitted by Sections 107 or 108 of the 1976 United States
Copyright Act without the permission of the copyright owner
is unlawful. Requests for permission or further information
should be addressed to the Permissions Department, John
Wiley & Sons, Inc.
Library of Congress Catalog Card Number: 65-21443.
Printed in the United States of America.

20 19 18 17 16 15

ISBN 0 471 20380 7


Preface

The feedback control field has been given a strong impetus in the
theoretical direction. Stability theory, as formulated by Lyapunov, and
modern optimization theory, as developed by Bellman and Pontryagin,
and the works of Kalman, LaSalle, Merriam, and others are responsible for
the changes. These works rely heavily upon “state variable” formulations,
and it is apparent that advanced presentations of control theory must be
given from this viewpoint. Furthermore, practicing control engineers
must learn this viewpoint, because most of the current technical papers
in this field utilize a state variable formulation. Otherwise, the current
divergence of theory and practice will increase.
Acquiring this body of knowledge is difficult, since most of the topics
that are published in book form are contained in advanced books on
mathematics. Many engineering students and practicing engineers do not
have the level of mathematical sophistication necessary to comprehend these
advanced treatments. The purpose of this book is to provide these people
with the necessary self-contained transitional presentation. It is intended
to follow a conventional first course in feedback control systems, and to
provide the necessary background for advanced presentations of nonlinear
theory, adaptive system theory, sampled-data theory, and optimization
theory. Furthermore, it attempts to unite the state variable approach and
the more usual transfer function concepts, so that the reader can relate
the new material to what he already knows. For these reasons, extensive,
complicated proofs are omitted in favor of numerous examples. We have
not tried to include all possible topics, but rather have attempted to cover
basic principles so that the reader can subsequently investigate the literature
for himself. Thus some treatments are not as advanced as can be
found in the literature. After completing this book, the reader requiring
more depth in state variables, Lyapunov theory, or optimization theory
is encouraged to read Reference 1 of Chapter 5, References 26 and 28
of Chapter 7, and Reference 25 of Chapter 8, respectively, and the recent
technical papers on these topics.

This book is an outgrowth of notes prepared to meet the needs of a


first-year graduate course in modern automatic control theory initiated in
the Electrical Engineering Department of Rensselaer Polytechnic Institute
in 1962. Since its original presentation, the course has also been
given to advanced undergraduates. The book contains more material
than can be presented in one semester. At Rensselaer, we omit Sections
1.1 to 1.6, 2.1 to 2.6, and 3.1 to 3.8 because they have been covered in
previous courses, and we briefly cover Chapters 7 and 8 in our one-
semester course. The material of Chapter 7 is included in a subsequent
one-semester graduate course entitled “Nonlinear and Adaptive Systems,”
and extensive coverage of optimization theory is given in a subsequent
two-semester graduate course.
We are indebted to many people. First of all, Dr. Bernard A. Fleishman
of Rensselaer’s Department of Mathematics reviewed the entire manuscript
and made many suggestions. Drs. Chi S. Chang and Imsong Lee reviewed
Chapter 7, and Drs. Charles W. Merriam III and Frederick J. Ellert
reviewed and contributed to Chapter 8. In addition, Drs. Dean N. Arden,
Robert W. Miller, Dean K. Frederick, Howard Kaufman, and William
G. Tuel, Jr., and Mr. Hugh J. Dougherty made numerous suggestions
leading to improvement of the manuscript. Furthermore, Drs. Ellert and
Tuel supplied computer results used in several examples. The entire final
manuscript was patiently typed by Miss Rosana Laviolette, and portions
of the preliminary effort were typed by Miss Sandra Elliott and Mrs.
Joan Hayner.
Paul M. DeRusso
Rob J. Roy
Charles M. Close
Troy, New York
July, 1965
Contents

1 Time-Domain Techniques

1.1 Introduction 1
1.2 Classification of Systems 3
1.3 Resolution of Signals into Sets of Elementary Functions 8
1.4 The Singularity Functions 11
1.5 Resolution of a Continuous Signal into Singularity Functions 20
1.6 The Convolution Integral for Time-Invariant Systems 25
1.7 Superposition Integrals for Time-Varying Systems 32
1.8 Resolution of Discrete Signals into Sets of Elementary
Functions 35
1.9 Superposition Summations for Discrete Systems 37

2 Classical Techniques

2.1 Introduction 47
2.2 Representing Continuous Systems by Differential Equations 47
2.3 Reduction of Simultaneous Differential Equations 49
2.4 General Properties of Linear Differential Equations 52
2.5 Solution of First Order Differential Equations 53
2.6 Solution of Differential Equations with Constant Coefficients 54
2.7 Solution of Differential Equations with Time-Varying
Coefficients 68
2.8 Obtaining the Impulse Response from the Differential
Equation 71
2.9 Difference and Antidifference Operators 75
2.10 Representing Discrete Systems by Difference Equations 79
2.11 General Properties of Linear Difference Equations 82
2.12 Solution of Difference Equations with Constant Coefficients 83
2.13 Solution of Difference Equations with Time-Varying
Coefficients 94

3 Transform Techniques

3.1 Introduction 100


3.2 The Fourier Series and Integral 101
3.3 The Laplace Transform 109
3.4 Properties of the Laplace Transform 112
3.5 Application of the Laplace Transform to Time-Invariant
Systems 119
3.6 Review of Complex Variable Theory 125
3.7 The Inversion Integral 134
3.8 The Significance of Poles and Zeros 143
3.9 Application of the Laplace Transform to Time-Varying
Systems 146
3.10 Resolution of Signals into Sets of Elementary Functions 153
3.11 The Z Transform 158
3.12 Properties of the Z Transform 162
3.13 Application of the Z Transform to Discrete Systems 175
3.14 The Modified Z Transform 182

4 Matrices and Linear Spaces

4.1 Introduction 192


4.2 Basic Concepts 193
4.3 Determinants 202
4.4 The Adjoint and Inverse Matrices 211
4.5 Vectors and Linear Vector Spaces 214
4.6 Solutions of Linear Equations 223
4.7 Characteristic Values and Characteristic Vectors 232
4.8 Transformations on Matrices 245
4.9 Bilinear and Quadratic Forms 262
4.10 Matrix Polynomials, Infinite Series, and Functions of a
Matrix 270
4.11 Additional Matrix Calculus 286
4.12 Function Space 290

5 State Variables and Linear Continuous Systems

5.1 Introduction 313


5.2 Simulation Diagrams 314
5.3 Transfer Function Matrices 322
5.4 The Concept of State 325
5.5 Matrix Representation of Linear State Equations 329

5.6 Mode Interpretation—A Geometrical Concept 344


5.7 Controllability and Observability 349
5.8 Linear Fixed Systems—The State Transition Matrix 356
5.9 Linear Time-Varying Systems—The State Transition Matrix 362
5.10 Linear Time-Varying Systems—The Complete Solution 375
5.11 Impulse Response Matrices 381
5.12 Modified Adjoint Systems 388

6 State Variables and Linear Discrete Systems

6.1 Introduction 407


6.2 Simulation Diagrams 407
6.3 Transfer Function Matrices 410
6.4 The Concepts of State 413
6.5 Matrix Representation of Linear State Equations 415
6.6 Mode Interpretation 428
6.7 Controllability and Observability 431
6.8 The State Transition Matrix 432
6.9 The Complete Solution 441
6.10 The Unit Function Response Matrix 448
6.11 The Method of Least Squares 457

7 Introduction to Stability Theory and Lyapunov's Second Method
7.1 Introduction 469
7.2 Phase Plane Concepts 470
7.3 Singular Points of Linear, Second Order Systems 472
7.4 Variational Equations 479
7.5 Limit Cycles 488
7.6 Rate of Change of Energy Concept—Introduction to the
Second Method of Lyapunov 498
7.7 Stability and Asymptotic Stability 501
7.8 Lyapunov’s Second Method for Autonomous Continuous
Systems—Local Considerations 504
7.9 Asymptotic Stability in the Large 511
7.10 First Canonic Form of Lur’e—Absolute Stability 513
7.11 Stability with Disturbances—Practical Stability 518
7.12 Estimation of Transient Response 519
7.13 Relay Controllers for Linear Dynamics 520
7.14 Determination of Lyapunov Functions—Variable Gradient
Method 524
7.15 Nonautonomous Systems 527

7.16 Eventual Stability 529


7.17 Discrete Systems 531
7.18 Application to Pulse Width Modulated Systems 532

8 Introduction to Optimization Theory

8.1 Introduction 546


8.2 Design Requirements and Performance Indices 548
8.3 Necessary Conditions for an Extremum—Variational
Calculus Approach 551
8.4 Linear Optimization Problems 556
8.5 Selection of Constant Weighting Factors 569
8.6 Penalty Functions 573
8.7 Boundary Condition Iteration 574
8.8 Dynamic Programming 575
8.9 Control Signal Iteration 578
8.10 Control Law Iteration 580
8.11 Design Example 582
8.12 Singular Control Systems 590

Index 603
1
Time-Domain Techniques

1.1 INTRODUCTION

Physical systems are customarily represented by models consisting of


idealized components which can be precisely defined mathematically.
The choice of a suitable model, embodying all the features of a physical
system that are critical to its performance, may be difficult. If an overly
simplified model is used, the results obtained from it will not closely
approximate the behavior of the physical system. If an unnecessarily
complicated model is used, it may be difficult or even impossible to analyze.
Once a model is chosen, its characteristics are determined purely mathematically.
In some cases, a system may be directly characterized in some
way, without reference to any particular model. A number of different
ways of utilizing these mathematical characteristics or models are discussed
in this book. Like most textbooks, this one says very little about
the choice of a model, or about how a specific system should be characterized.
Instead, its purpose is to demonstrate the application of
mathematical tools after the system has been properly represented by a
model, or in some equivalent, purely mathematical manner.
The principal concern in the analysis of a system is the relationship
between certain inputs (or sources) and outputs (or responses). These
may be electrical, hydraulic, mechanical, thermal, or other types of
quantities, and they are normally real functions of time. In Fig. 1.1-1,
the inputs are denoted by v1(t), v2(t), . . . , vm(t), and the outputs by
y1(t), y2(t), . . . , yn(t). The desired relationship is one of cause and effect,
so that, when the inputs are known, the resulting outputs can be determined.

Fig. 1.1-1

The system is often, however, characterized in some implicit way,
instead of by a direct cause and effect relationship.
Systems with several inputs and outputs are conveniently handled by
matrices, which are discussed in Chapter 4. The first three chapters, therefore,
are in general restricted to systems with a single input and a single output.
All the techniques of the first three chapters can, however, be readily
extended to multiple input-output systems, as is done in Chapters 5 and 6.
The first three chapters are also in general restricted to linear deterministic
systems having no internal energy prior to the application of the
input. As shown in Section 1.4, any initial energy stored within the system
can be represented by added inputs. Nonlinear systems are considered
in the last two chapters.
Each of the first three chapters is concerned with one of the three
common methods of characterizing linear deterministic systems. Chapter
1 is based upon the system’s response to singularity or delta functions;
Chapter 2, upon the system’s differential or difference equation; and
Chapter 3, upon the system function related to the Fourier, Laplace,
or Z transforms. Both continuous and discrete and both fixed and varying
systems are treated. A major attempt is made to correlate the three
methods of characterizing systems.
Matrices are introduced in Chapter 4. Thus the basic mathematics
to be presented for the state variable approach are completed at this point
and are utilized throughout the remaining chapters. Chapter 5 introduces
state variable techniques from an engineering viewpoint, as applicable
to linear continuous systems. Linear discrete systems are similarly
considered in Chapter 6. Lyapunov stability theory, as applied to nonlinear
systems, is introduced in Chapter 7. Chapter 8 is an introductory
consideration of system design and optimization.
In essence, this book may be divided into three parts. The first consists
of Chapters 1 through 4, which present the basic mathematics utilized in
later chapters. Chapters 5 and 6 comprise the second part of the book
and present the basic concepts of state variables and some applications.
The third part of the book consists of Chapters 7 and 8, which are further
applications of the state variable viewpoint.

1.2 CLASSIFICATION OF SYSTEMS

Systems may be classified in a number of different ways. Consider the


system of Fig. 1.1-1b, with the single input v(t) and the single output y(t).
Assume that the system does not contain any independent sources and is
at rest, with no internal energy, before the signal is applied. The input-
output relationship is often indicated symbolically by
y(t) = Lv(t) (1.2-1)
where L is an operator that characterizes the system. It may be a function
of v, y, and t, may include operations such as differentiation and integration,
and may be given in probabilistic language. Equation 1.2-1 is really
nothing more than a shorthand way of saying that there is some cause
and effect relationship between v(t) and y(t).
A system is deterministic if, for each input v(t), there is a unique output
y(t).† In a nondeterministic or probabilistic system, there may be several
possible outputs, each with a certain probability of occurrence, for a given
input. Inputs to a system may also be given either as known functions
or as random functions. Random functions, such as noise, can be
described only in a statistical or probabilistic sense. If such a signal is an
input to a deterministic system, the output is not deterministic.
A system is nonanticipative if the present output does not depend on
future values of the input. In such a case, y(t0) is completely determined
by the characteristics of the system and by values of v(t) for t < t0. In
particular, if v(t) = 0 for all t < t0, then y(t) = 0 for t < t0. An anticipative
system would, on the other hand, violate the normal cause and
effect relationship.
A system is realizable if it is nonanticipative, and if y(t) is a real function
of time for all real v(t). This definition does not imply that there is
necessarily a known procedure for combining components to yield a given
realizable system.
Assume that the responses to two different inputs v1(t) and v2(t) are
y1(t) and y2(t), respectively. Let c1 and c2 denote two constants. A system
is linear if the response to v(t) = c1v1(t) + c2v2(t) is y(t) = c1y1(t) + c2y2(t)
for all values of v1, v2, c1, and c2. Expressing this definition symbolically,

L[c_1 v_1(t) + c_2 v_2(t)] = c_1 L[v_1(t)] + c_2 L[v_2(t)] \qquad (1.2-2)
Equation 1.2-2 is also known as the superposition principle. If it holds
only for inputs within a certain range, the system is linear only within
that range. Unless otherwise stated, a linear system is understood to be
linear for all inputs.
† See Section 5.4 for further discussion.

A system is time-invariant or fixed if the relationship between the input
and output is independent of time. If the response to v(t) is y(t), then the
response to v(t − λ) is y(t − λ). In such a system, the size and shape of
the output are independent of the time at which the input is applied.
Symbolically,

L[v(t - \lambda)] = y(t - \lambda) \qquad (1.2-3)

Several examples of these definitions are given below.


Example 1.2-1. A differentiator is characterized by the relationship

y(t) = \frac{d}{dt}\, v(t)

The system is linear since

\frac{d}{dt}\left[c_1 v_1(t) + c_2 v_2(t)\right] = c_1 \frac{d}{dt}\, v_1(t) + c_2 \frac{d}{dt}\, v_2(t)

The system is also realizable and time-invariant.

Example 1.2-2. A squarer is characterized by the relationship

y(t) = v^2(t)

The system is nonlinear because

\left[c_1 v_1(t) + c_2 v_2(t)\right]^2 \neq c_1 v_1^2(t) + c_2 v_2^2(t)

It is realizable and time-invariant.

Example 1.2-3. A system is characterized by the relationship

y(t) = t\, \frac{d}{dt}\, v(t)

The system is linear since

t\, \frac{d}{dt}\left[c_1 v_1(t) + c_2 v_2(t)\right] = c_1 t\, \frac{d}{dt}\, v_1(t) + c_2 t\, \frac{d}{dt}\, v_2(t)

It is realizable but time-varying because

t\, \frac{d}{dt}\, v(t - \lambda) \neq (t - \lambda)\, \frac{d\, v(t - \lambda)}{d(t - \lambda)} = y(t - \lambda)

Example 1.2-4. A system is characterized by the relationship

y(t) = v(t)\, \frac{d}{dt}\, v(t)

It is nonlinear since

\left[c_1 v_1(t) + c_2 v_2(t)\right] \frac{d}{dt}\left[c_1 v_1(t) + c_2 v_2(t)\right]
    \neq c_1 v_1(t)\, \frac{d}{dt}\, v_1(t) + c_2 v_2(t)\, \frac{d}{dt}\, v_2(t)

It is realizable and time-invariant.
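The linearity tests of Examples 1.2-1 through 1.2-4 can also be carried out numerically on sampled signals. The short Python sketch below is only an illustration; the test inputs, the constants, and the use of the NumPy library are assumptions, not part of the examples themselves. It approximates the differentiator and the squarer on a time grid and checks Eq. 1.2-2 directly.

    import numpy as np

    # Sampled time grid and test signals (assumed for illustration)
    t = np.linspace(0.0, 5.0, 2001)
    dt = t[1] - t[0]
    v1, v2 = np.sin(t), np.exp(-t)
    c1, c2 = 2.0, -3.0

    def differentiator(v):      # Example 1.2-1:  y(t) = dv/dt
        return np.gradient(v, dt)

    def squarer(v):             # Example 1.2-2:  y(t) = v^2(t)
        return v ** 2

    for name, L in [("differentiator", differentiator), ("squarer", squarer)]:
        lhs = L(c1 * v1 + c2 * v2)           # L[c1 v1 + c2 v2]
        rhs = c1 * L(v1) + c2 * L(v2)        # c1 L[v1] + c2 L[v2]
        print(name, "max superposition error:", np.max(np.abs(lhs - rhs)))

The error is at round-off level for the differentiator and large for the squarer, in agreement with the conclusions of the examples.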



Example 1.2-5. Prove or disprove the following statement. In a linear system, if the
response to v(t) is y(t), then the response to Re [v(t)] is Re [y(t)]. The symbol Re is read
"the real part of."

\mathrm{Re}[v] = \frac{v + v^*}{2}

where the asterisk superscript denotes the complex conjugate.
If y(t) = L[v(t)], the response to Re [v(t)] is

L\left[\frac{v + v^*}{2}\right] = \tfrac{1}{2}\left\{L[v] + L[v^*]\right\}

This equals Re [y(t)] if and only if

L[v^*] = \{L[v]\}^*

Although the last equation is not necessarily true, it is true if L is a real operator, i.e.,
if the responses to all real inputs are real. The proposed statement is valid, therefore,
for all realizable linear systems. By a similar proof, the phrases "the real part of" can
all be replaced by "the imaginary part of."

Example 1.2-6. Prove or disprove the following statement. In a linear system, if the
response to v(t) is y(t), the response to (d/dt)v(t) is (d/dt)y(t).
If y(t) = L[v(t)], the response to dv/dt is

L\left[\frac{dv}{dt}\right] = \frac{d}{dt}\, L[v], \qquad L \text{ not a function of } t

For a time-invariant system, L is not a function of time (although it may include


differentiation or integration with respect to time), and the last step is justified. The
proposed statement is therefore valid for time-invariant linear systems. By a similar
proof, differentiation may be replaced by integration.
For time-varying systems, the last step is not justified. A specific counter example is
the system of Example 1.2-3. The question is whether two linear operators commute.
The answer is, in general, no.

Example 1.2-7. Prove or disprove the following statement. The cascade connection of
two linear systems, as shown in Fig. 1.2-1, is linear.

Fig. 1.2-1

q(t) is the response of N1 to v(t), and y(t) is the response of N2 to q(t).

q(t) = L_1[v(t)]

y(t) = L_2[q(t)] = L_2\{L_1[v(t)]\}

Since the two component systems are linear, the responses to v(t) = c1v1(t) + c2v2(t)
are given by

q(t) = L_1[c_1 v_1(t) + c_2 v_2(t)] = c_1 L_1[v_1(t)] + c_2 L_1[v_2(t)]

y(t) = L_2\{c_1 L_1[v_1(t)] + c_2 L_1[v_2(t)]\}
     = c_1 L_2\{L_1[v_1(t)]\} + c_2 L_2\{L_1[v_2(t)]\}

By the definition of linearity, the last equation proves that the cascaded combination
of N1 and N2 forms a linear system. It is true, in fact, that any system composed only
of linear components is itself linear.

Systems may be further classified according to the types of signals that


are present. A continuous signal is one which is a function of a continuous
independent variable t. The signal must be uniquely defined at all values
of t within a given range, except possibly at a denumerable set of points.†
The definition of a continuous signal is broader than the mathematical
definition of a continuous function. The function shown in Fig. 1.2-2a,
for example, is not continuous for a < t < b because of the discontinuity
at t = t0, but it does represent a continuous signal for a < t < b by the
definition above.

Fig. 1.2-2

A discrete signal is one which is defined only at a sequence of discrete


values of the independent variable t. In many cases, the signal may be
zero except at these values of t, but this is not essential to the definition.
In many cases of practical interest, the instants at which the signal is
defined are equally spaced and can be denoted by t = t0 + kT, where
T is the time between instants, and where k takes on only integral values.
The signal is a function of the discrete independent variable k.
† A set of points is denumerable if the points can be listed on a one-to-one basis with
the positive integers.

A quantized signal is one which can assume only a denumerable number


of different values. Quantizing is simply rounding off to the nearest
acceptable value, similar to rounding off a number to the nearest integer.
Quantized signals may be either continuous or discrete.
Example 1.2-8. The function f1(t) in Fig. 1.2-2a represents a continuous signal. For
use in the other parts of the figure, seven discrete instants of time and four discrete
signal levels are shown. Rounding off the signal to the nearest level produces the
quantized continuous signal f2(t). Sampling f1(t) at the seven discrete instants of time
produces the discrete signal f3(t), while sampling f2(t) yields the quantized discrete
signal f4(t).

Example 1.2-9. Continuous signals that change only at discrete instants may be
completely described by equivalent discrete signals. Figure 1.2-3a shows a staircase
function whose value changes only at t = kT (k = 0, 1, 2, . . .), and Fig. 1.2-3b shows
a series of pulses of known width. These continuous signals are equivalent to the
discrete signals in parts (c) and (d) of the figure, respectively, in the sense that the original
signals can ideally be regenerated.

Fig. 1.2-3

Consider the discrete signal produced by sampling any finite bandwidth,


continuous signal at regular intervals of T seconds. Shannon’s famous
sampling theorem says that the discrete signal is equivalent to the continuous
signal, provided that all frequency components of the latter are less
than 1/2T cycles per second.¹ In this event, the original continuous
signal can theoretically be recovered from the discrete signal by linear
filtering. In practice, this cannot be precisely accomplished.
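The recovery of a band-limited signal from its samples can be illustrated numerically. The Python sketch below is only a rough demonstration under assumed conditions: a test signal whose highest component (3 cycles per second) lies below 1/2T, and the ideal low-pass (sinc) interpolation sum truncated to a finite block of samples.

    import numpy as np

    def x(t):                                    # assumed band-limited test signal
        return np.sin(2 * np.pi * t) + 0.5 * np.cos(2 * np.pi * 3 * t)

    T = 0.1                                      # 1/(2T) = 5 cycles per second
    k = np.arange(-200, 201)                     # finite block of samples
    samples = x(k * T)

    # Ideal reconstruction:  x(t) = sum over k of x(kT) sinc((t - kT)/T)
    t_test = np.linspace(-1.0, 1.0, 50)
    x_rec = np.array([np.sum(samples * np.sinc((tt - k * T) / T)) for tt in t_test])

    print("max reconstruction error:", np.max(np.abs(x_rec - x(t_test))))

The residual error here comes only from truncating the infinite interpolation sum, which is one reason the recovery cannot be precisely accomplished in practice.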
Example 1.2-10. The output of a digital computer is a quantized discrete signal.

Example 1.2-11. In pulse code modulation, each time increment of signal is quantized,
often to the nearest of eight levels, including zero. Each quantized increment is then

represented by three two-level pulses using the binary number system. Thus the signal
of Fig. 1.2-4a is quantized and coded as shown in parts (b) and (c), respectively.

1.3 RESOLUTION OF SIGNALS INTO SETS OF ELEMENTARY FUNCTIONS

The next chapter discusses the classical solution for the response y(t)
to a given input v(t). Depending upon the nature of both the system and

the input, classical techniques may involve considerable effort. One would
certainly expect, and it is true, that the responses to certain classes of
input functions could be determined more easily than the response to an
arbitrary input. Accordingly, a sensible procedure for linear systems is
to try to express a given arbitrary input as the sum of elementary functions.
If the response to each of the elementary functions is known, or if it can
be easily found, then the response to the arbitrary function can be found
by the superposition principle of Eq. 1.2-2.
Although this scheme is satisfactory for linear systems, whether they
are time-varying or not, it cannot be extended to nonlinear systems. Its
validity is based upon Eq. 1.2-2, which does not apply to nonlinear
systems. This section describes the scheme in general terms, and the next
few sections and parts of Chapters 2 and 3 deal with the important special
cases. The reader may wish to return to this section later.
Computational ease is not the only reason behind the proposed scheme.
The characteristics of a system may be expressed in several different ways,
e.g., by differential equations, or by the response to certain elementary
functions. Although it is usually possible to convert from one method of
characterization to another, it is not always easy to do so. Furthermore,
different methods of characterization give different insights into system
analysis, and certain problems are much more conveniently treated by a
particular approach, such as the one discussed here.
Suppose that the input functions under consideration can be decomposed
into a denumerable number of elementary functions denoted as
k_i (i = 0, 1, 2, . . .). Then

v(t) = \sum_{i} a_i\, k_i(t)

Each of the set of elementary functions k_i is a function of time t. To
distinguish one of the elementary functions from its neighbors, the parameter
λ is introduced. k_i is then regarded as a continuous function of t and
a discrete function of λ and is written k(t, λ). Each of the set of coefficients
a_i is a constant with respect to t. They represent the relative
strength of the elementary functions composing v(t) and are, of course,
different for different elementary functions. Since a_i is a discrete function
of λ, it is written a(λ). The input is then written

v(t) = \sum_{\lambda} a(\lambda)\, k(t, \lambda) \qquad (1.3-1)

k(t, λ) is called the component or elementary function, and a(λ) is known
as the spectral function of v(t) relative to k(t, λ). The response of a system
to the elementary function will be another function of t and λ, denoted by

K(t, λ). By superposition, the response of a linear system to v(t) is

y(t) = \sum_{\lambda} a(\lambda)\, K(t, \lambda) \qquad (1.3-2)

As discussed in Chapter 3, the Fourier series for a periodic function has
the form of Eq. 1.3-1. For nonperiodic functions, however, the resolution
of v(t) into a denumerable set of elementary functions is not possible. A
continuous set of elementary functions is then required. k(t, λ) and a(λ)
become continuous functions of the parameter λ, and Eq. 1.3-1 is replaced
by

v(t) = \int a(\lambda)\, k(t, \lambda)\, d\lambda \qquad (1.3-3)

k(t, λ) and a(λ) are still called the elementary and spectral functions,
respectively. If the response of a linear system to k(t, λ) is again denoted
by K(t, λ), the response to v(t) is

y(t) = \int a(\lambda)\, K(t, \lambda)\, d\lambda \qquad (1.3-4)

The parameter λ may or may not be a real variable. If it is a complex
variable, Eqs. 1.3-3 and 1.3-4 require integration in a complex plane, a
matter discussed in Chapter 3. The limits of summation and integration in
Eqs. 1.3-1 through 1.3-4 depend upon the set of elementary functions used,
but in general the integration is over the entire range of the parameter λ.
In the case of a real parameter, the limits would in general be from −∞
to +∞. Finally, by use of the impulse function discussed in the next
section, Eqs. 1.3-1 and 1.3-2 can be regarded as a special case of Eqs. 1.3-3
and 1.3-4.
The approach summarized by Eqs. 1.3-3 and 1.3-4 is practical only if
the following two requirements are met.

(i) There must be an easy way of finding the coefficients a(λ) for any
arbitrary function of time.
(ii) There must be an easy way of finding K(t, λ), the response to the
elementary function, for an arbitrary linear system.

An appropriate class of elementary functions k(t, λ) must be chosen with
these two requirements in mind.
The unit step and unit impulse functions, discussed in the next section,
are two different sets of k(t, λ) that meet both requirements. In each case,
all the functions in the set have exactly the same size and shape when they
are plotted versus t, regardless of the value of λ. The only difference
between different functions of the same set is that they occur at different
times. Because of this, the K(t, λ) functions are different functions of t
for different values of the parameter λ. Since time t is explicitly involved
in all steps of the solution, the use of the step and impulse functions leads
to "time-domain analysis."
In Chapter 3, the set of functions k(t, λ) has the exponential form e^{λt},
where λ is a complex parameter. Each e^{λt} is nonzero over the entire time
range of interest, which is either −∞ < t < ∞, or 0 < t < ∞. Two
functions corresponding to two different values of λ do, however, have
different sizes and shapes. The K(t, λ) functions are, for time-invariant
systems, independent of t, so that t is not explicitly involved in all steps
of the solution. The use of the exponential functions leads to "complex-
frequency-domain" analysis, which includes the Fourier and Laplace
transforms. The inverse Fourier and Laplace transforms are special
cases of Eq. 1.3-4.
Further general discussion of the resolution of functions into sets of
elementary functions is best deferred until after a careful consideration
of time-domain and complex-frequency-domain analysis. Additional
remarks are found in Section 3.10. The remainder of this chapter is
concerned with the resolution of signals into sets of singularity functions.

1.4 THE SINGULARITY FUNCTIONS

The singularity functions are a class of functions that forms the basis
for the time-domain analysis of systems. Every singularity function and
all its derivatives are continuous functions of time for all real values of t
except one. Furthermore, the singularity functions can be obtained from
one another by successive differentiation or integration. The singularity
function that is perhaps the most familiar is the unit step function, which
is denoted by U_{-1} and is shown in Fig. 1.4-1. As can be seen from the
figure,

U_{-1}(t) = \begin{cases} 0 & \text{for } t < 0 \\ 1 & \text{for } t > 0 \end{cases} \qquad (1.4-1)

Although the function is sometimes defined to have the values ½ or 1 at
the point of discontinuity, it is more commonly left undefined at this
point.
For any function of time, f(t − a) represents the function f(t) displaced
a units to the right. Hence it is logical to denote a unit step function with
its discontinuity at t = a by

U_{-1}(t - a) = \begin{cases} 0 & \text{for } t < a \\ 1 & \text{for } t > a \end{cases} \qquad (1.4-2)
Fig. 1.4-1

Note specifically that the discontinuity occurs when the quantity in


parentheses is zero. The unit step function is zero or 1, when its argument
(the quantity in parentheses) is negative or positive, respectively.
If the unit step function is successively integrated, other singularity
functions are obtained. The unit ramp function U_{-2} is

U_{-2}(t) = \int_{-\infty}^{t} U_{-1}(t)\, dt = \begin{cases} 0 & \text{for } t < 0 \\ t & \text{for } t > 0 \end{cases} \qquad (1.4-3)

and is shown in Fig. 1.4-2a. Another singularity function is

U_{-3}(t) = \int_{-\infty}^{t} U_{-2}(t)\, dt = \begin{cases} 0 & \text{for } t < 0 \\ t^2/2 & \text{for } t > 0 \end{cases} \qquad (1.4-4)

and is shown in Fig. 1.4-2b. The subscript notation is suggested by the
following symbolism, sometimes used in mathematics.

f^{-1}(t) = \int f(t)\, dt

f^{-2}(t) = \int f^{-1}(t)\, dt

Fig. 1.4-2
Fig. 1.4-3

As is evident in Chapter 3, the subscript also indicates the power of s in
the Laplace transform of the singularity function.

The Impulse Function

Successive integration, as in Eqs. 1.4-3 and 1.4-4, yields an infinite


number of singularity functions. The next logical step is to attempt to
form other singularity functions by successive differentiation of the unit
step function. The derivative of U_{-1}(t), however, is zero for t ≠ 0 and
does not exist for t = 0. To gain some insight into such a result, consider
a function that only approximates a unit step function, such as f_{-1}(t) of Fig.
1.4-3a. The derivative of f_{-1}(t) is the function f_0(t) in part (b) of the
figure.

f_0(t) = \frac{d}{dt}\, f_{-1}(t), \qquad f_{-1}(t) = \int_{-\infty}^{t} f_0(t)\, dt \qquad (1.4-5)

As the dimension Δ becomes smaller, f_{-1}(t) more closely approximates
the unit step. f_0(t) becomes a narrower and higher pulse, with the total
area underneath the curve remaining equal to unity. At least as long as
Δ remains nonzero, f_0(t) is the derivative of f_{-1}(t). Certainly

\lim_{\Delta \to 0} f_{-1}(t) = U_{-1}(t)

except possibly at t = 0. The limit of f_0(t) is denoted by

U_0(t) = \lim_{\Delta \to 0} f_0(t) = \begin{cases} 0 & \text{for } t \neq 0 \\ \infty & \text{for } t = 0 \end{cases}
Fig. 1.4-4

U_0(t) is called the unit impulse function, and its properties are shown by
the representation of Fig. 1.4-4. The number 1 alongside the arrow is
intended to infer that the total area underneath the "curve" is unity. In
the limit, Eq. 1.4-5 becomes

U_0(t) = \lim_{\Delta \to 0} \frac{d}{dt}\, f_{-1}(t), \qquad U_{-1}(t) = \lim_{\Delta \to 0} \int_{-\infty}^{t} f_0(t)\, dt

It would seem that these could in turn be rewritten as

U_0(t) = \frac{d}{dt}\, U_{-1}(t), \qquad U_{-1}(t) = \int_{-\infty}^{t} U_0(t)\, dt \qquad (1.4-6)

This last step does not, however, necessarily follow by rigorous mathematics.
Differentiation and integration are both limiting processes
themselves, and Eq. 1.4-6 is valid only if the differentiation and integration
are commutative with taking the limit as Δ approaches zero. That
interchanging the order of two limiting processes is not always valid can
be seen by noting that

\lim_{x \to 0}\, \lim_{y \to 0} \frac{y^2}{x^2 + y^2} \neq \lim_{y \to 0}\, \lim_{x \to 0} \frac{y^2}{x^2 + y^2}

The mathematical difficulties with the impulse function can be traced


to the fact that it is not a function at all, in the normal mathematical use
of the word. If f has a unique value corresponding to each value of t
lying in some domain D, f is said to be a function of t for that domain
of t. The function f is defined by listing in some way, as by equations,
tables, or graphs, the values of f corresponding to all values of t within
the domain D. A functional relationship is a point by point relationship
and can be visualized as a black box with an input and output slot, as in
Fig. 1.4-5. If any t_k within the domain D is thrown into the input slot, a
unique f_k falls out of the output slot. The domain D is often, but not

always, the set of all real numbers. The definition above makes the terms
function and single-valued function identical. A “multivalued function”
may be represented by two or more single-valued functions.
The unit step function is a perfectly good mathematical function. If one
starts to define the unit impulse function by the relationship

U_0(t) = 0 for t ≠ 0 and does not exist at t = 0

this is satisfactory also. But the definition of the unit impulse by taking
the limit of f_0(t), shown in Fig. 1.4-3b, must include the fact that the area
under the curve is unity. This can be indicated by

\int_{-\infty}^{t} U_0(t)\, dt = U_{-1}(t)

or, equivalently,

\int_{-\epsilon}^{\epsilon} U_0(t)\, dt = 1 \quad \text{for all } \epsilon > 0

But neither of the last two equations is a permissible way
of defining a function as a point by point relationship.

Fig. 1.4-5
The use of the impulse in system analysis produces results that can
invariably be justified by more conventional and advanced mathematics.
This involves, however, detailed consideration of changing the order of
limiting processes, and the use of Stieltjes and Lebesgue integrals in place
of the usual Riemann integral. It is now customary to regard the unit
impulse as a "distribution" or "generalized function."† The theory
includes the ordinary mathematical functions as special cases of distributions.
A distribution having the properties desired of the unit
impulse is possible. These properties are summarized as

U_0(t - \tau) = 0 \quad \text{for } t \neq \tau

\int_{-\infty}^{t} U_0(t - \tau)\, dt = U_{-1}(t - \tau) \qquad (1.4-7)

\int_{-\infty}^{\infty} f(t)\, U_0(t - \tau)\, dt = f(\tau) \quad \text{if } f(t) \text{ is a continuous function}

The first two properties have already been discussed. The third can be
proved by noting that, since the integrand is zero except at t = τ,

\int_{-\infty}^{\infty} f(t)\, U_0(t - \tau)\, dt = \int_{\tau - \epsilon}^{\tau + \epsilon} f(t)\, U_0(t - \tau)\, dt

† The standard work on this subject is Reference 2. A simpler and more readable
treatment is given in Reference 3.

Using the mean-value theorem, this becomes

f(\tau) \int_{\tau - \epsilon}^{\tau + \epsilon} U_0(t - \tau)\, dt = f(\tau)
This relationship is sometimes called the sampling property. As one


application, consider Eqs. 1.3-3 and 1.3-4. If k(t, λ) is a train of impulses,
the integrand is zero except at discrete times, and the integration can be
replaced by a summation. This explains the statement that Eqs. 1.3-1
and 1.3-2 can be considered as a special case of Eqs. 1.3-3 and
1.3-4.
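The sampling property can be checked numerically by replacing U_0(t − τ) with a rectangular pulse of width Δ and height 1/Δ, as in Fig. 1.4-3b. The Python sketch below uses an arbitrarily chosen continuous f(t) and value of τ; both are assumptions for illustration only.

    import numpy as np

    f = lambda t: np.cos(t) * np.exp(-0.1 * t)   # any continuous function (assumed)
    tau = 2.0
    t = np.linspace(tau - 1.0, tau + 1.0, 200001)
    dt = t[1] - t[0]

    for delta in [0.5, 0.1, 0.01]:
        # unit-area pulse occupying tau <= t < tau + delta
        pulse = np.where((t >= tau) & (t < tau + delta), 1.0 / delta, 0.0)
        integral = np.sum(f(t) * pulse) * dt
        print("delta =", delta, " integral =", round(integral, 6),
              " f(tau) =", round(f(tau), 6))

As Δ shrinks, the integral approaches f(τ), which is the content of the third line of Eq. 1.4-7.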

Representing Initial Stored Energy by Impulses

When a unit impulse is the input to a system with energy storing ele¬
ments, it serves to change the system’s stored energy instantaneously.
If a unit impulse of force is applied to a fixed mass M that has previously
been at rest, the velocity of the mass is

\frac{1}{M} \int_{-\infty}^{t} U_0(t)\, dt = \frac{1}{M}\, U_{-1}(t)

Thus 1/2M units of kinetic energy are instantaneously imparted to the
mass at t = 0. If a unit impulse of current flows into a fixed, initially
uncharged capacitance, the voltage is

e(t) = \frac{1}{C} \int_{-\infty}^{t} U_0(t)\, dt = \frac{1}{C}\, U_{-1}(t) \text{ volts}

Thus the current impulse instantaneously places a charge of 1 coulomb
on the capacitance, and 1/2C joules of energy in its electrostatic field.
Similarly, a source of U_0(t) volts placed across an inductance causes a
current

i(t) = \frac{1}{L} \int_{-\infty}^{t} U_0(t)\, dt = \frac{1}{L}\, U_{-1}(t) \text{ amperes}

corresponding to 1/2L joules of energy instantaneously placed in the


magnetic field.
These examples suggest that the effects of any initial energy stored
within a system may be represented by impulsive inputs. Recall that the
basic definitions of Section 1.2, such as those for linear and time-invariant
systems, assumed no internal energy before the external inputs are applied.
Fig. 1.4-6

If these definitions are to be extended to systems with initial internal


energy, such energy must be represented by added inputs. A capacitance
or inductance with initial energy at time t0 may be replaced by the equivalent
circuits of Fig. 1.4-6, as far as the response external to its terminals
for t > t0 is concerned. Figure 1.4-6 also gives equivalent circuits using
step functions. Their equivalence follows directly from Thevenin’s and
Norton’s theorems.
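The equivalence between initial stored energy and an impulsive input can be demonstrated numerically for a specific case. In the Python sketch below, a parallel RC circuit with assumed element values is integrated twice: once starting from an initial capacitor voltage e0 with no source, and once starting at rest with a current source Ce0U_0(t) approximated by a narrow unit-area pulse. The circuit, the numerical values, and the Euler integration are all assumptions made for illustration.

    import numpy as np

    R, C, e0 = 2.0, 0.5, 3.0          # assumed parallel RC circuit and initial voltage
    dt = 1e-4
    t = np.arange(0.0, 5.0, dt)

    def simulate(e_init, i_in):
        """Forward-Euler integration of  C de/dt = i_in(t) - e/R."""
        e = np.empty_like(t)
        e[0] = e_init
        for n in range(len(t) - 1):
            e[n + 1] = e[n] + dt * (i_in[n] - e[n] / R) / C
        return e

    # (a) initial stored energy, no source
    e_initial = simulate(e0, np.zeros_like(t))

    # (b) no stored energy; source C*e0*U0(t) approximated by a narrow pulse of width w
    w = 10 * dt
    pulse = np.where(t < w, 1.0 / w, 0.0) * C * e0
    e_impulse = simulate(0.0, pulse)

    print("max difference for t > w:",
          np.max(np.abs(e_initial[t > w] - e_impulse[t > w])))

Once the pulse has ended, the two terminal voltages agree to within the discretization error, as the equivalent circuits of Fig. 1.4-6 suggest they should.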

Another Use of the Impulse Function

The presentation of the unit impulse as the limit of a rectangular pulse


of unit area, as the width approaches zero, suggests that a system’s
response to a pulse can be approximated by the response to an impulse
with the same area. The idea is worthwhile, because the impulse response
is more easily calculated than is the pulse response. The approximation is
a good one if the width of the input pulse is small compared to the time
constants of the system. If, for example, a rectangular pulse of force is
applied to a mass, and if the mass does not move appreciably while the
pulse is applied, then the response of the system which includes the mass
will be the same as to an impulse of force of equal area. Any input pulse,
whether rectangular or not, serves to introduce energy into a system in
some way. If all the energy is introduced before the output has a chance
to change appreciably, the output can be closely approximated by the
impulse response. Clearly, for a given system, the approximation becomes
progressively better as the input pulse width becomes narrower. This is
the basis for the subject matter in the next section.
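The quality of this approximation is easy to examine for a first-order system. The Python sketch below assumes, purely for illustration, a system with impulse response h(t) = e^{-t} (a 1-second time constant) and compares its exact response to unit-area rectangular pulses of several widths with h(t) itself.

    import numpy as np

    h = lambda t: np.exp(-t)          # assumed impulse response, time constant 1 s

    def pulse_response(t, width):
        """Exact response to a rectangular input pulse of unit area and the given width."""
        during = (1.0 - np.exp(-t)) / width
        after = np.exp(-t) * (np.exp(width) - 1.0) / width
        return np.where(t <= width, during, after)

    t = np.linspace(0.0, 5.0, 501)
    for width in [1.0, 0.1, 0.01]:
        err = np.max(np.abs(pulse_response(t, width) - h(t))[t > width])
        print("pulse width", width, "s  ->  max error after the pulse:", round(err, 4))

The error falls roughly in proportion to the pulse width once the width is small compared with the 1-second time constant, in line with the discussion above.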
Fig. 1.4-7

Other Singularity Functions

The limit of the rectangular pulse of Fig. 1.4-3b is not the only way of
arriving at the unit impulse. Any function which possesses the desired
properties in the limit may be used in place of a rectangular pulse. Two
such functions are shown in Figs. 1.4-7a and 1.4-7b. In each case, as
Δ approaches zero, the unit impulse results. The derivative of the function
in part (b) of the figure is shown in part (c). In the limit as Δ approaches
zero, f_0(t) becomes the unit impulse, and f_1(t) becomes the unit doublet
of part (d). The unit doublet has the characteristics

U_1(t) = \begin{cases} 0 & \text{for } t \neq 0 \\ -\infty \text{ and } +\infty & \text{for } t = 0 \end{cases}
\qquad (1.4-8)
U_0(t) = \int_{-\infty}^{t} U_1(t)\, dt

The same complications regarding the rigorous treatment of the unit


impulse carry over to the unit doublet.
Successive differentiation of the unit doublet leads to an infinite number
of other singularity functions. In summary, all the singularity functions
are completely characterized by describing any one of them and using
the relationships

U_{n+1}(t - a) = \frac{d}{dt}\, U_n(t - a), \qquad U_{n-1}(t - a) = \int_{-\infty}^{t} U_n(t - a)\, dt \qquad (1.4-9)

The singularity functions that are the most useful as elementary functions
are the unit step and the unit impulse.

Determination of the Impulse and Step Response

Later sections of this chapter show how the response of a linear system
to any arbitrary input can be determined, once the response to a unit
impulse or unit step function is known. The terms impulse and step response
are always understood to mean the response to unit singularity functions
when the system contains no stored energy prior to the application of the
input.

Example 1.4-1. If the input to Fig. 1.4-8 is i_1(t) = U_0(t), it acts like an open circuit
except at t = 0, when it instantaneously inserts some energy into the circuit. Because
of the presence of the inductance, the impulse of current will all flow through the right-
hand resistance, causing an impulse in the response e_0(t). Since the current through the
RL branch will remain finite, the impulse of voltage will appear directly across the
inductance, creating a current of ½ ampere. For t > 0, the circuit may be redrawn as
in Fig. 1.4-9, where i(0+) = ½. The circuit has a time constant of 1 second, so

e_0(t) = -i(t) = -\left[\tfrac{1}{2} e^{-t}\right] \quad \text{for } t > 0

The complete impulse response is

e_0(t) = U_0(t) - \left[\tfrac{1}{2} e^{-t}\right] U_{-1}(t)

Since the circuit is time-invariant, the step response is the integral of the impulse
response, namely,

1 - \int_0^t \tfrac{1}{2} e^{-\lambda}\, d\lambda = \tfrac{1}{2}\left(1 + e^{-t}\right) \quad \text{for } t > 0

Fig. 1.4-8

Fig. 1.4-9
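The claim that the step response is the integral of the impulse response can be checked numerically for the expressions of Example 1.4-1. In the Python sketch below (a rough check only), the impulse U_0(t) contributes exactly 1 to the running integral and the ordinary part −½e^{-t} is integrated by a simple rectangle rule.

    import numpy as np

    t = np.linspace(0.0, 8.0, 8001)
    dt = t[1] - t[0]

    ordinary_part = -0.5 * np.exp(-t)                 # h(t) minus its impulse at t = 0
    step_numeric = 1.0 + np.cumsum(ordinary_part) * dt
    step_analytic = 0.5 * (1.0 + np.exp(-t))          # result quoted in the example

    print("max difference:", np.max(np.abs(step_numeric - step_analytic)))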

For most systems, the impulse and step responses are determined
analytically from the differential equations describing the system. The
relationship of the impulse response to the system’s differential equation
is discussed in detail in Chapter 2. Its relationship to the system function
of Laplace transform theory is covered in Chapter 3. In some cases, the
impulse or step response may be given or may be approximately determined
experimentally.

1.5 RESOLUTION OF A CONTINUOUS SIGNAL INTO SINGULARITY FUNCTIONS

Consider the function v(t) shown by the solid curve in Fig. 1.5-1a.
It can be crudely approximated by the staircase function, shown by the
broken line. The staircase function is the superposition of the five step
functions in part (b) of the figure.

v(t) \cong U_{-1}(t) + 2U_{-1}(t - 1) + U_{-1}(t - 2) - 3U_{-1}(t - 3) - U_{-1}(t - 4)

where ≅ stands for "approximately equal to." The approximation
becomes progressively better as smaller time intervals and more steps are
chosen in the staircase function.
The area underneath v(t) has been divided into four sections in Fig.
1.5-1c. In part (d) of the figure, each section is approximated by an
impulse function having an area equal to the area of the corresponding
section in part (c). Then

v(t) \cong 2U_0(t) + 3.5U_0(t - 1) + 3U_0(t - 2) + 0.5U_0(t - 3)

The last approximation is certainly not a good representation of v(t),
since parts (c) and (d) of the figure bear little resemblance to each other.
An approximation having values of only zero and infinity does not closely
represent the finite function v(t). If, however, the aim is to approximate
the response of a linear system to the input v(t), the procedure may be
satisfactory. As discussed in the last section, the intervals of time chosen

in the approximation must be small compared to the time constants of the


system.
The approximations illustrated in Fig. 1.5-1 should become exact as the
time intervals become infinitesimal in size. Consider the arbitrary function
v(λ) shown in Fig. 1.5-2a. The value of the function at some arbitrary
point t can be approximated by a series of step or impulse functions. The
dummy variable λ has been introduced so that it is possible to distinguish
between the particular point t and the variable representing the general
distance from the origin along the abscissa. In terms of step functions,

v(t) \cong \sum_{k=-\infty}^{\infty} \Delta v(k\,\Delta\lambda)\, U_{-1}(t - k\,\Delta\lambda)

where Δv(k Δλ) denotes the jump at the point k Δλ in the staircase
approximation of Fig. 1.5-2b. Since the factor U_{-1}(t − k Δλ) = 0 for
t < k Δλ, it ensures that the jumps in the staircase function to the right
of point t have no bearing on the value of v(λ) at the point t. Note also
that

\Delta v(k\,\Delta\lambda) \cong \left[\frac{dv(\lambda)}{d\lambda}\right]_{k\,\Delta\lambda} \Delta\lambda

Thus

v(t) \cong \sum_{k=-\infty}^{\infty} \left[\frac{dv(\lambda)}{d\lambda}\right]_{k\,\Delta\lambda} U_{-1}(t - k\,\Delta\lambda)\, \Delta\lambda \qquad (1.5-1)
All three of the approximations above become exact as Δλ approaches
zero. In the limit, Δλ becomes dλ, k Δλ becomes the continuous variable
λ, and the summation is replaced by an integration.

v(t) = \int_{-\infty}^{\infty} \frac{dv(\lambda)}{d\lambda}\, U_{-1}(t - \lambda)\, d\lambda

Since U_{-1}(t − λ) = 0 for λ > t,

v(t) = \int_{-\infty}^{t} \frac{dv(\lambda)}{d\lambda}\, d\lambda \qquad (1.5-2)

a result that could have been written down directly.
In terms of impulse functions,

v(t) \cong \sum_{k=-\infty}^{\infty} \left[v(k\,\Delta\lambda)\right] U_0(t - k\,\Delta\lambda)\, \Delta\lambda \qquad (1.5-3)

where the quantity [v(k Δλ)] Δλ is the area of the rectangular pulse
beginning at the point k Δλ in Fig. 1.5-2b. The approximation becomes
exact as Δλ approaches zero, yielding

v(t) = \int_{-\infty}^{\infty} v(\lambda)\, U_0(t - \lambda)\, d\lambda \qquad (1.5-4)
This result is simply a restatement of the sampling property of the unit
impulse, previously given in Eq. 1.4-7. As discussed in the last section,
Eq. 1.5-3 does not really represent a good approximation to v(t) for any
nonzero value of Δλ. As Δλ approaches zero, however, both the area of
Eq. 1.5-3 does not really represent a good approximation to v{t) for any
nonzero value of AA. As AA approaches zero, however, both the area of
the impulses and the spacing between them become zero, yielding exactly
the function v(t).
While Eqs. 1.5-2 and 1.5-4 are rather trivial results, the concepts lead to
very important expressions for the response of a linear system. Let the
input of a linear system be approximated by Eq. 1.5-1. Denote the re¬
sponse of the system to a unit step occurring at t = f by the symbol
r(t, f). Thus, when v(t) = U_x(t — f), y(t) = r(t, ft). Then, by the
superposition principle, the approximate response to an arbitrary input is
00
dv(X)
2/(0— 2 r(t, k AA) AA
k=— oo ~dX. k AA

The exact output is given by the limit as AA approaches zero.

r(t, A) dX (1.5-5)
dX

For nonanticipatory systems, r(t, A) = 0 for t < A, so

2/(0 = f r(U A) dX (1.5-6)

Next let the input be approximated by Eq. 1.5-3. Denote the response
of the system to a unit impulse occurring at t = f by the symbol h(t, ft).
By superposition, the approximate response to an arbitrary input is
oo

y(t) == ^ v(k AA)/?(t, k AA) AA


fe=— oo

The exact output is given by the limit as AA approaches zero.

y(t) ==| v(X)h(t, A) dX (1.5-7)


J — oo

Since, for nonanticipatory systems, h(t, A) = 0 for t < A,

y(t) = f v(X)h(t, A) dX (1.5-8)


J —oo

In many problems, the input is zero for negative values of time, in which
case the lower limit for the integrals in Eqs. 1.5-5 through 1.5-8 becomes
zero.
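Equation 1.5-8 can be exercised numerically on a simple time-varying system. The Python sketch below assumes, for illustration only, the first-order system dy/dt + t y = v(t) at rest for t < 0, whose impulse response is h(t, λ) = e^{-(t² − λ²)/2} for t ≥ λ; the direct solution of the differential equation is compared with the superposition integral.

    import numpy as np

    dt = 1e-3
    t = np.arange(0.0, 4.0, dt)
    v = np.sin(3.0 * t)                        # arbitrary input, zero for t < 0

    # direct forward-Euler solution of dy/dt = v(t) - t*y
    y_ode = np.zeros_like(t)
    for n in range(len(t) - 1):
        y_ode[n + 1] = y_ode[n] + dt * (v[n] - t[n] * y_ode[n])

    # Eq. 1.5-8:  y(t) = integral from 0 to t of v(lambda) h(t, lambda) d(lambda)
    y_sup = np.zeros_like(t)
    for i in range(len(t)):
        lam = t[: i + 1]
        y_sup[i] = np.sum(v[: i + 1] * np.exp(-(t[i] ** 2 - lam ** 2) / 2.0)) * dt

    print("max difference:", np.max(np.abs(y_ode - y_sup)))

The two solutions agree to within the discretization error, illustrating that h(t, λ) for a time-varying system genuinely depends on both of its arguments.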
Fig. 1.5-3

There are two situations which sometimes cause difficulty in the use of
these equations. If the input v(t) has discontinuities, its derivative, which
appears in the integrand of Eq. 1.5-6, will contain impulses. In the
evaluation of the integral, the sampling property of Eqs. 1.4-7 and 1.5-4
is useful. The derivative of the input shown in Fig. 1.5-3a has impulses
of value A and B, respectively, at t = 0 and t0. Thus Eq. 1.5-6 reduces to

y(t) = A\, r(t, 0) + B\, r(t, t_0) + \int_{0}^{t} \frac{dv_1(\lambda)}{d\lambda}\, r(t, \lambda)\, d\lambda \qquad (1.5-9)

where dv_1/dλ stands for dv/dλ with the impulses removed. Their effect
is accounted for by separate terms. The last equation follows directly
from the sampling property of impulses, or alternatively from breaking
the function of Fig. 1.5-3a up into the three components shown in part (b).
The impulse response of a system may occasionally contain an impulse
itself. Then the integrand of Eq. 1.5-8 contains an impulse. In this case,
the impulse response can be written as the sum of two terms,

h(t, \lambda) = k\, U_0(t - \lambda) + h_1(t, \lambda)

where h_1 does not contain an impulse. Then the response to an arbitrary
input v(t) is given by

y(t) = k\, v(t) + \int_{-\infty}^{t} v(\lambda)\, h_1(t, \lambda)\, d\lambda \qquad (1.5-10)

1.6 THE CONVOLUTION INTEGRAL FOR TIME-INVARIANT SYSTEMS

The preceding derivations are valid for all linear systems. For time-
invariant systems, a further simplification is possible. The symbols h(t)
and r(t) are customarily used to stand for the responses to a unit impulse
and unit step, respectively, applied at t = 0.

h(t) = h(t, 0) and r(t) = r(t, 0)

By the definition of a time-invariant system given in Eq. 1.2-3,

h(t, \lambda) = h(t - \lambda, 0) = h(t - \lambda), \qquad r(t, \lambda) = r(t - \lambda, 0) = r(t - \lambda) \qquad (1.6-1)

Equations 1.5-5 through 1.5-8 then lead to Eqs. 1.6-2 and 1.6-3 in Table
1.6-1. Equations 1.6-4 and 1.6-5 in the table are easily obtained from the
first two by the substitution of variable τ = t − λ. For nonanticipatory
systems with v = 0 for t < 0, the limits on all the integrals become 0 and t.
Even in this case, however, it is not incorrect to use the wider limits, and
this is occasionally done to facilitate the proofs of some theorems.

Table 1.6-1

General Equation | Simplification when System Is Nonanticipatory | Simplification when Input Is Zero for t < 0 | Equation Number
y(t) = \int_{-\infty}^{\infty} v(\lambda)\, h(t - \lambda)\, d\lambda | Upper limit is t | Lower limit is 0 | (1.6-2)
y(t) = \int_{-\infty}^{\infty} \frac{dv(\lambda)}{d\lambda}\, r(t - \lambda)\, d\lambda | Upper limit is t | Lower limit is 0 | (1.6-3)
y(t) = \int_{-\infty}^{\infty} v(t - \tau)\, h(\tau)\, d\tau | Lower limit is 0 | Upper limit is t | (1.6-4)
y(t) = \int_{-\infty}^{\infty} \frac{dv(t - \tau)}{d(t - \tau)}\, r(\tau)\, d\tau | Lower limit is 0 | Upper limit is t | (1.6-5)
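Each entry of Table 1.6-1 is also the limit of a finite sum, which is how the convolution is usually evaluated on a computer. The Python sketch below assumes, for illustration, a system with h(t) = e^{-t} and a unit-step input applied at t = 0, and approximates Eq. 1.6-2 by a convolution sum multiplied by the step size.

    import numpy as np

    dt = 1e-3
    t = np.arange(0.0, 6.0, dt)
    v = np.ones_like(t)                    # unit step applied at t = 0
    h = np.exp(-t)                         # assumed impulse response

    # y(t) ~ sum over k of v(k dt) h(t - k dt) dt
    y = np.convolve(v, h)[: len(t)] * dt

    print("max error versus 1 - e^{-t}:",
          np.max(np.abs(y - (1.0 - np.exp(-t)))))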
Equations 1.6-4 and 1.6-5 suggest another interpretation for time-
domain analysis. Consider Eq. 1.6-4 for a nonanticipatory system with an
input that is zero for t < 0.

y(t) = \int_0^t v(t - \tau)\, h(\tau)\, d\tau

y(t) and v(t) represent the output and input, respectively, at the instant of
time t. Since v(t − τ) represents the input τ seconds before this instant,
τ is sometimes called the age variable. As τ increases, v(t − τ) represents
the input further and further back into the past. For τ = 0 and t, respectively,
v(t − τ) is the input at the instants t and 0, respectively. The
equation above indicates that the entire past history of the input contributes
to the output at the instant t. The past history is weighted,
however, by the factor h(τ). In fact, h(τ) is sometimes called the system
weighting function instead of the impulse response. This interpretation
in terms of a weighting function will, perhaps, become clearer later when
the graphical interpretation of the equation is considered. For stable
systems, h(τ) approaches zero as τ approaches infinity. For such systems,
the more recent past is weighted more heavily than the far distant past.
Equations 1.6-2 and 1.6-4 may be rewritten in still another way if
desired. Simply use the fact that, for time-invariant linear systems,

h(\tau) = \frac{dr(\tau)}{d\tau} \quad \text{and} \quad h(t - \lambda) = \frac{dr(t - \lambda)}{d(t - \lambda)}

For the case of time-invariant systems with zero inputs for t < 0, all
the formulas can be summarized as follows:

y(t) = \frac{dv(t)}{dt} * r(t) = v(t) * \frac{dr(t)}{dt} = v(t) * h(t) \qquad (1.6-6)

The asterisk symbolizes the convolution of the functions on either side of
it. Convolution is mathematically defined as

f_1(t) * f_2(t) = \int_0^t f_1(t - \lambda)\, f_2(\lambda)\, d\lambda
             = \int_0^t f_1(\lambda)\, f_2(t - \lambda)\, d\lambda \qquad (1.6-7)

When the more general limits of integration of −∞ and +∞ are used,
the term convolution and the symbol f_1(t) * f_2(t) are still used by some
workers.

Example 1.6-1. For the circuit and voltage source shown in Fig. 1.6-1, find the
current i(t). Assume that the circuit has no stored energy for t < 0.
In the preceding terminology, v(t) = e(t) and y(t) = i(t). The current response to
a step of voltage is

r(t) = \frac{1}{R}\left[1 - e^{-(R/L)t}\right] \quad \text{for } t > 0

The equation

i(t) = \int_0^t \frac{de(\lambda)}{d\lambda}\, r(t - \lambda)\, d\lambda

will be used. Since the character of the input is different for 0 < t < 1 and for 1 < t,
the solution must be carried out in two parts. For 0 < t < 1,

i(t) = \int_0^t (1)\, \frac{1}{R}\left[1 - e^{-(R/L)(t - \lambda)}\right] d\lambda
     = \frac{1}{R}\left[t - \frac{L}{R}\left(1 - e^{-(R/L)t}\right)\right]

For 1 < t,

i(t) = \int_0^1 (1)\, \frac{1}{R}\left[1 - e^{-(R/L)(t - \lambda)}\right] d\lambda
     = \frac{1}{R} - \frac{L}{R^2}\left(e^{R/L} - 1\right)e^{-(R/L)t}

Note that the two solutions are identical for t = 1, as expected. Also note that the second
solution is not the first solution with t replaced by 1.
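A numerical check of Example 1.6-1 is straightforward. The Python sketch below assumes R = L = 1 (so that r(t) = 1 − e^{-t} and de/dλ is 1 on 0 < λ < 1 and zero elsewhere, as in the integrals above) and compares a discrete evaluation of the superposition integral with the two closed-form expressions.

    import numpy as np

    dt = 1e-3
    t = np.arange(0.0, 4.0, dt)
    dv = np.where(t < 1.0, 1.0, 0.0)               # derivative of the source voltage
    r = 1.0 - np.exp(-t)                           # step response with R = L = 1

    i_numeric = np.convolve(dv, r)[: len(t)] * dt
    i_exact = np.where(t < 1.0,
                       t - (1.0 - np.exp(-t)),
                       1.0 - (np.e - 1.0) * np.exp(-t))

    print("max difference:", np.max(np.abs(i_numeric - i_exact)))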

Example 1.6-2. Figure 1.6-2 shows a cascade connection of two identical circuits.
Assuming that the impulse response of each one individually is h(t) = te^{-t} for t > 0,
and that the second circuit does not load down the first, find the impulse response of
the entire combination.

Fig. 1.6-2

When
e_1(t) = U_0(t)
then
e_2(t) = te^{-t} \quad \text{for } t > 0
and

e_3(t) = \int_0^t e_2(\lambda)\, h(t - \lambda)\, d\lambda
       = \int_0^t \left(\lambda e^{-\lambda}\right)\left[(t - \lambda)\, e^{-(t - \lambda)}\right] d\lambda = \frac{t^3 e^{-t}}{6} \quad \text{for } t > 0

The last expression represents the impulse response of the entire combination.

Example 1.6-3. Assume that the individual circuits of Fig. 1.6-2 are amplifiers, each
having the step response r(t) = -Ae^{-t/RC} for t > 0. Find the step response for the
two stages in cascade, assuming that the second does not load down the first. Extend
the result to n stages.
When
e_1(t) = U_{-1}(t)
then
e_2(t) = -Ae^{-t/RC} \quad \text{for } t > 0
and

e_3(t) = \int_{0^-}^{t} \frac{de_2(\lambda)}{d\lambda}\, r(t - \lambda)\, d\lambda
       = e_2(0^+)\, r(t) + \int_{0^+}^{t} \frac{de_2(\lambda)}{d\lambda}\, r(t - \lambda)\, d\lambda

The last expression has a separate term as a result of the discontinuity in e_2(t) at t = 0.
This is similar to Eq. 1.5-9.

e_3(t) = A^2 e^{-t/RC} + \int_{0^+}^{t} \left[-Ae^{-(t - \lambda)/RC}\right] \frac{A}{RC}\, e^{-\lambda/RC}\, d\lambda
       = A^2 e^{-t/RC}\left[1 - \frac{t}{RC}\right] \quad \text{for } t > 0

The last expression is the step response for two stages in cascade. For three stages in
cascade, the step response is

e_4(t) = A^2\left[-Ae^{-t/RC}\right] + \int_{0^+}^{t} \frac{d}{d\lambda}\left[A^2 e^{-\lambda/RC}\left(1 - \frac{\lambda}{RC}\right)\right]\left[-Ae^{-(t - \lambda)/RC}\right] d\lambda
       = -A^3 e^{-t/RC}\left[1 - \frac{2t}{RC} + \frac{1}{2}\left(\frac{t}{RC}\right)^2\right] \quad \text{for } t > 0

For n stages in cascade, the step response is

(-A)^n e^{-t/RC}\left[1 + \sum_{k=1}^{n-1} \frac{(n-1)!\,(-t/RC)^k}{(n - k - 1)!\,(k!)^2}\right] \quad \text{for } t > 0

The last expression can be more easily obtained by means of the Laplace transform,
and it is derived in Chapter 3. The expression has been plotted for various values
of n.
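The two-stage result of Example 1.6-3 can also be confirmed numerically. The Python sketch below assumes A = 2 and RC = 1 for illustration, evaluates the superposition integral with the separate term for the discontinuity at t = 0, and compares the result with A²e^{-t/RC}[1 − t/RC].

    import numpy as np

    A, RC = 2.0, 1.0                               # assumed amplifier values
    dt = 1e-3
    t = np.arange(0.0, 6.0, dt)

    r = -A * np.exp(-t / RC)                       # single-stage step response
    de2 = (A / RC) * np.exp(-t / RC)               # derivative of e2(t) for t > 0

    # e3(t) = e2(0+) r(t) + integral of de2(lambda) r(t - lambda) d(lambda)
    e3_numeric = (-A) * r + np.convolve(de2, r)[: len(t)] * dt
    e3_exact = A ** 2 * np.exp(-t / RC) * (1.0 - t / RC)

    print("max difference:", np.max(np.abs(e3_numeric - e3_exact)))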

Graphical Interpretation of the Convolution Integrals

The graphical evaluation of the quantity

f_1(t) * f_2(t) = \int_0^t f_1(\lambda)\, f_2(t - \lambda)\, d\lambda \qquad (1.6-8)

is based on the fact that the integral of a function between two limits
represents the area under a curve between those limits. For concreteness,
let

f_1(t) = \begin{cases} 1 & \text{for } 0 < t < 1 \\ 0 & \text{elsewhere} \end{cases}

f_2(t) = \frac{1}{R}\left[1 - e^{-(R/L)t}\right] \quad \text{for } t > 0

The convolution of these two functions was found analytically in Example


1.6-1. As shown in Fig. 1.6-3, f_2(t − λ) plotted versus (λ − t) is the
mirror image of the original function about the vertical axis. In plotting
this quantity versus λ, it is shifted t units to the right. The multiplication
of f_1(λ) and f_2(t − λ) produces the last curve in Fig. 1.6-3. The shaded
area represents the integral in Eq. 1.6-8 for a typical value of t. In
summary, to find f_1(t) * f_2(t), fold one of the functions about the vertical
axis, slide it a distance t into the other function, and take the area underneath
the product curve.

Fig. 1.6-3

Fig. 1.6-4
Graphically determining this area for different values of t will lead to a
plot of f_1(t) * f_2(t) versus t. In the example above, both the analytical
and the graphical approaches indicate that the response monotonically
increases from 0 at t = 0 to 1/R as t → ∞, but that the rate of increase
suddenly decreases at t = 1. Note that the result would not be changed
if the limits of integration were changed to −∞ and +∞. The integrand
is zero except for 0 < λ < t, since f_1(λ) = 0 for λ < 0, and f_2(t − λ) = 0
for λ > t. For anticipatory systems with inputs for t < 0, the limits
would have to be −∞ and +∞.
The graphical evaluation of the convolution integrals is particularly
useful when the analytic evaluation proves to be difficult, or when the
input or impulse response is given graphically instead of analytically. It
also reinforces the previous interpretation of the impulse response as a
weighting function. Consider the formula

r*t f*t
y{t) = v(X)h(t — X) dX = v(t — r)/l(r) dr
Jo Jo

Typical v and h functions are shown in Fig. 1.6-4. v(t — r) represents the
input t seconds before the “present” time t. Thus t = 0 corresponds to
the present; r > 0, to past time; and t < 0, to future time, as indicated
in the figure. Since v(t — r) is multiplied by h(r), the past values of the
input are weighted by the impulse response.
Sec. 1.6 The Convolution Integral for Time-Invariant Systems 31

A Useful Property of the Convolution Integrals

The convolution of two integrable functions fx(i) and f2(t) was given by

oo
/i(0*/2(0= A(t - X)f2(X) dX
-oo

oo
= AWUt - A) dX (1.6-9)
' — oo

where the more general limits of — oo and -f oo have been used. Differ¬
entiating both formulas with respect to t, it is found that fff) * ff{t) =
fi(t) */2(0, where the prime denotes differentiation. In general,

m * f?\t) = f\kXt) * m
(1.6-10)
m ftkxo f[~kXt) * m
* =

where the superscripts (k) and (—k) denote the kth derivative and integral,
respectively.5 The first of these two expressions is written out explicitly
below.
00 oo
•(k)
/i(A)/?’(r - A) dX = I f[kXX)Ut - X) dX (1.6-11)
-oc -oo

where

/lW(A) = /l(A)
dXk
d>
AkX‘ - A) =
d(t - Xf
Mt - A)

The use of Eq. 1.6-11 for k = 1, together with the fact that h{t) = (dr/dt),
enables Eqs. 1.6-3 and 1.6-5 to be derived from Eqs. 1.6-2 and 1.6-4,
respectively. Equation 1.6-11 can also be helpful in the evaluation of the
convolution integrals, whether using analytical, graphical, or computer
techniques.5 Another application is the generalization of the impulse
sampling property of Eq. 1.5-4 to higher-order singularity functions.
Letting f2(t) = U0(t), Eqs. 1.6-11 and 1.5-4 yield

oo
dkA(t)
MX)uk(t - X) dX = for k = 0, 1, 2, . . (1.6-12)
-00 dtk
A modification of Eq. 1.6-11 is used in Section 2.8 to relate a system’s
impulse response to its differential equation. Note that

A tf(t — X) 1 = 4/2O — A) d(l - X) _ df2{t - A)


dX M ‘ d(t — X) dX ' d(t — X)
32 Time-Domain Techniques

In general,
d%(t - X)
[/»(< - A)] = (-1)'
dXk d(t - Xf
Equation 1.6-11 can then be written

r.
J — oo
AW
dk
_dXk
Mt - X) dX = (-If
00

—oo _d A
.AW [A(t A)] dl

(1.6-13)
For the special case where f2(t) = U0(t), this becomes
oo
r d*
AW C/0(t - A) dA = (-1)*^© (1.6-14)
' — 00 dA/c dtk

1.7 SUPERPOSITION INTEGRALS


FOR TIME-VARYING SYSTEMS

The formulas of Section 1.5 are valid for all linear systems, whether
they be fixed or varying. The formulas of Section 1.6 are based upon
Eq. 1.6-1, which is valid only for fixed systems. The analogous results
are now established for varying systems.
Equation 1.6-1 indicated that for fixed systems the impulse and step
responses are a function only of r, the time that has elapsed since the
application of the impulse or step function. For varying systems, these
responses are functions of two variables. These may be taken as the present
time t and the time A at which the singularity function was applied or,
alternatively, the present time t and the elapsed time r. It must be clearly
understood which pair of variables is being used. Thus h(t, p) could be
interpreted as the response at time t resulting from an impulse applied
at either time p or time t — p. Since both interpretations are used in the
literature, this causes some confusion. In this book, the symbol /?(/, /3)
is always given the first of the two interpretations. When the second
interpretation is intended, the symbol h*(t, p) is used.f Specifically,

h(t, A) = response at time / to a unit impulse at time A


/?*(/, r) = h(t, t — r) == response at time t to a unit impulse
r seconds earlier (time t — r)
r(t, A) = response at time / to a unit step at time A ^
r*(b t) = r(t, t — t) = response at time / to a unit step t
seconds earlier (time t — r)
t The asterisk subscript is not a common notation. In most references, the correct
interpretation must be deduced from the context in which the symbol appears.
Sec. 1.7 Superposition Integrals for Time-Varying Systems 33

Equations 1.5-5 through 1.5-8 lead to the results in the first half of Table
1.7-1. The last half of the table is obtained by the substitution of variable
t = t — X. In Eqs. 1.7-2 and 1.7-3 the h(t, 2) and r(t, X) functions are more
commonly used, while in Eqs. 1.7-4 and 1.7-5 the h*(t, r) and r*(7, t) func¬
tions lead to more compact results. Since the term convolution is restricted
to formulas like those in Table 1.6-1, the equations in Table 1.7-1 are

Table 1.7-1

Simplification
When System Is Simplification when Equation
General Equation Nonanticipatory v = 0 for t < 0 Number

r oo
2/(0 = v(X)h{t,X)dX
J — 00
0-7-2)
f* 00
2/(0 = v(X)h*(t, t — 7) dX
J— oo
Upper limit is t Lower limit is 0
OO
dv(X)
2/(0 = — r(t, X) dX
— co
dX
(1.7-3)
OO
dv{X)
2/(0 = r*(t, t — X) dX
• 00 dX

00
y{t) = v(t — r)h(t, t —t) dr
' — 00
(1.7-4)
00
y(t) = v(t — r)h*(t,T) dr
I— oo
Lower limit is 0 Upper limit is t
00
dv(t — r)
y(t) = r(/, t — t) dr
d(t — r)
J — 00
(1.7-5)
00
du{t — r)
2/(0 r*(t, r)
1— 00 £/(t — t)

usually called superposition integrals. Table 1.6-1 for fixed systems can, of
course, be regarded as a special case of Table 1.7-1. For fixed systems,
h(t, X) reduces to h(t — X), and h*(t, r) to h{r).
h*(t, t) can be interpreted as a weighting function for varying systems,
exactly corresponding to the weighting function h(j) for fixed systems.
The graphical interpretation of the convolution integrals can also be
extended to the general superposition integrals. The discussion associated
with Fig. 1.6-4 applies to varying systems if hfr) is replaced by h*(t, r).
To find
2/(0 = I v(t — t)/i*0, t) dr
Jo
34 Time-Domain Techniques

fold v(t) about the vertical axis, and slide it forward a distance t to form
v(t — t) versus r. Multiply the resulting curve by h*(t, r), and take the
area underneath the product curve. The only difference from the fixed
system procedure is that, since h*(t, r) is a function of time, a different
h*(t, t) curve must be used for each value of t considered. The adjoint
technique of Chapter 5 proves helpful when using h*{t, r) for a fixed t
and varying r.
Equations 1.6-9 and 1.6-11 are mathematical identities that may be used
whenever the integrands have the proper form. One of the functions must
be an explicit function of 7, and the other an explicit function of t — 7,
where t is treated as a parameter. Starting with Eq. 1.7-5

dv(t — r)
r*(t, t) dr
d(t — r)
Eq. 1.6-11 gives

2/(0 v(t - t) -J- 0*0, r)] dr


J —00 dr

A comparison of this result with Eq. 1.7-4 infers, as is true, that

h*(t, r) =-f [r,(t,r)]


dr
(1.7-6)
r*0, t) = /7*(t, a) da
J —oo

where the lower limit becomes zero for nonanticipatory systems. Also

h(t. A) = - 3- [r(t. A)]


aA
00
(1.7-7)
r(t, X) = h(t, a) da

where the upper limit becomes t for nonanticipatory systems. Equation


1.7-7 serves to relate Eqs. 1.7-2 and 1.7-3. The reader should be reminded,
however, that replacing the input v(t) of a varying system by dv(f)\dt
does not necessarily produce a new output of dy{t)!dt. (See Examples
1.2-6 and 1.2-3.)
t - 2
Example 1.7-1. If the impulse response of a linear system is h(t, X) = —-— for
7 < t, find the response to v{t) = (re-')U-dJ). t

By Eq. 1.7-2,

y(t) =
Sec. 1.8 Resolution of Discrete Signals into Sets of Elementary Functions 35

Finding the impulse response of a varying system can be difficult. It is shown in


Section 2.8 that the impulse response in this example describes a system characterized
by the differential equation
d2y dy
t*-£ + 4t+2y = v(t)
at2 dt

The same example is solved in Example 3.9-3 using the Laplace transform.

1.8 RESOLUTION OF DISCRETE SIGNALS INTO


SETS OF ELEMENTARY FUNCTIONS

The only discrete signals considered in this book are those which are
defined at equally spaced instants, denoted by t = tQ + kT. In this
chapter, k is a time index which takes on only integral values, and T is
the time between instants. While t0 may have any value, the time origin
may be chosen to make t0 = 0, and this is consistently done. It is also
possible to choose the time scale so that T = 1, as in Sections 2.9 through
2.12, but this is not done in this chapter.
A discrete signal consisting of a series of finite numbers may arise
naturally, as in the case of a digital computer, or it may result from
sampling a continuous function, as in Figs. 1.2-2c and 1.8-lc. If the ideal
periodic switch in Fig. 1.8-16 is assumed to close every T seconds for an
infinitesimal time,/2(r) is zero except at the sampling instants. In terms
of the unit delta function
(1 for / = kT
S(t — kT) = (1.8-1)
|p for t t6 kT

the signal f2(t), shown in part (c) of the figure, can be expressed as
00 oo

/,(«) =/i(0 2 6(t -kT)= 2 MkTWt - kT) (1.8-2)


k=— co k=—cc

Consider the more practical situation where the ideal switch in Fig.
1.8-16 remains closed for AT seconds. The signal f2(t), shown in part (d)
of the figure, is a continuous signal that may be approximated by a dis¬
crete signal. As discussed in Sections 1.4 and 1.5, a narrow pulse may be
approximated by an impulse of equal area, provided that the time con¬
stants of the system which follows are large compared to the pulse width.
The signal of Fig. 1.8-1 d can be approximated by
00

f.(t)= 2 [A(kT) ST]U0(t - kT) (1.8-3)


k=— oo

Note that, if AT = T, Eq. 1.8-3 describes the staircase function shown in


36 Time-Domain Techniques

flit)

flit) h(t)
—-—>

(b)

h(t) h(t) \
>

—N
\

X
\ \
\

\ \
\

\ \
\ \
\

\
\

\
\

\ \
\

,<rrl
1

0 T 2T 3T AT 5 T 6T 0 T 2T c T AT bT 6T
(c) (d)

h(t) 2 U0(t - kT)

OO oo oo °o oo oo
v / v /k /^ I k A
1 1 1 1 1 1
t
0 T 2T 3T AT 5T 6T
(f)

>
_is

o
o
/
/
0 ^

CO
\
8^-
\
\

\
\

0 T 2T 3T AT bT bT
(h)
(g)

Fig. 1.8-1

part (e). Such a result can be produced in practice by following the


sampling switch by a “zero order hold” or “boxcar generator” circuit.
This might be done to approximately recover the original signal if only the
sampled signal were available. Equation 1.8-3 is then essentially identical
with Eq. 1.5-3.
In Eq. 1.8-3, the time scale may be adjusted so that AT = 1, or the
constant multiplying factor of A T may be incorporated into the gain of the
system which follows. In either case, the factor AT disappears from
Eq. 1.8-3, and the signal of Fig. 1.8-ld can be approximated by the signal
Sec. 1.9 Superposition Summations for Discrete Systems 37

of part (g), which in turn is given by


oo oo

MO = f*(0 = MO I U„(t — kT) = 2 fi(kT) U0(t - kT) (1.8-4)


k=— oo ic=—cc
oo

where 2 U0(t — kT) represents the impulse train shown in part (/). The
k=— oo
idealized sampling switch of part (h), together with the symbol fx*(/),
is defined by Eq. 1.8-4. Such a switch modulates the input by a train
of unit impulses to produce a discrete output signal composed of impulses
of varying area. This is the sampling device assumed in books on sampled-
data control systems, and it is useful in later parts of this book. The
degree of approximation involved in going from part id) to part (g) of
Fig. 1.8-1 is best discussed in terms of transform techniques.6 It should
be noted that all of the ideal switches discussed in this section are linear,
time-varying components.

1.9 SUPERPOSITION SUMMATIONS


FOR DISCRETE SYSTEMS

If the input to a linear system initially at rest is the discrete signal given
in either Eq. 1.8-2 or 1.8-4, the output can be found by superposition.
In this section, any sampling switch needed to produce the discrete input
is not considered to be a part of the system under consideration. The only
information needed about the system is its response to a unit delta or
impulse function, whichever is appropriate.

Time-Invariant Systems

The response of a fixed system depends only upon the time that has
elapsed since the application of the input. Let h{t) and d(t) denote the
response to a unit impulse and delta function, respectively, applied at
t = 0. Then the response to U0(t — kT) or to d(t — kT) is h(t — kT)
or d(t — kT), respectively.
Except in trivial cases, the delta response d(t) is identically zero for
oo

continuous systems. Any input for which |y(l)| dt = 0 does not insert
J — oo

any energy into the system and hence cannot cause an output to occur.
In discrete systems, such as a digital computer, a finite discrete input is
converted to a finite discrete output. In such a case, d(t) is defined only
at the instants / = t0 + nT, where n is an integer. The term t0 represents
a possible constant time delay between the input and output sequences.
Since the chief concern is with the form of the output, t0 will be ignored
38 Time-Domain Techniques

C
O Vh
5 s /.—N
G JO
<?i G <N m Tf VOI
t
G % On On o\ 6) Os
cr ^ i
Os
* h^H —i
w Z

G on o
O

~ V +-> -*—> -t->
3
oj
o Oh
c Vi 1 O E *|| .1 o £ O
-3 Uh •S —< l_ rn Ui
£ 1 a>
Uh «
Vi
a<i>
£X

G
C
o
J-h
<u
£
^
.23
£ ^
Dh 00
<[>
^
N
oo
®N
V•
—i
Vh
<D
£
N
C/5
a>
Dh .£
^
<l> o Dh o Oh
00 ^ 3“ o
N G P hJ
w H-J P
V5
3
c •- &■ <D
.2 g 2 ■*—»
"
™ -2 S « E +->

O oo a, o i II
3
00 E R
O
G >>. o Lh ||
33 00 '-2 «i -V
<o
N I-H
hi
u
i-H
<u
Dh - C <U <u a> £ ^ a> N
.§1 g Oh
Dh’~<
00
o
£ C/5 Dh
Dh
Dh 00
£ oo

«I ° P hJ
Oh’~

h4
O
Table 1.9-1

hi
hi hi hi
hi

hi
p hi
P
I cn I
i
3
Oh
P hi p ov p hi
-*->
3 a ''S’"
a"
o W
8 Hi sZIJ 8H oo 8Hj 8H
3 Ai
II II II II
a>
,,—^ /—S /—^
g
HH
hi
c
hi
si
3 hi hi
00

hi K
■J*
hi

‘-o p° P°
3 p p P
Dh
C
vs/

«W 8W! 8W’
r4i -se

i-H
S»/ *
Sec. 1.9 Superposition Summations for Discrete Systems 39

in this chapter, and the delta response will be written d(nT). For non-
anticipatory systems, d{nT) = 0 for n < 0. By superposition, the response
of a discrete system to the input of Eq. 1.8-2 is given by Eq. 1.9-1 in
Table 1.9-1. Equation 1.9-2 is obtained by replacing the dummy variable
k by n — k. Note that these equations define the output only at discrete
instants of time. For nonanticipatory systems with v(t) = 0 for t < 0,
the limits of summation become 0 and n for both equations, although it
is never incorrect to use the wider limits of — oo and + oo.
Consider a continuous system whose input is given by Eq. 1.8-4. The
output is the continuous function of time given by Eq. 1.9-3 or 1.9-4.
In the simplifications listed for nonanticipatory systems and for inputs
that are zero for t < 0, the statement “Upper limit is kT = is literally
correct only when t is an integral multiple of T. The summation is only
over discrete values of k. At time t = (n + y)T, where n is an integer
and 0 < y < 1, the statement above should be interpreted as k = n.
Equations 1.9-3 and 1.9-4 give the exact output of a continuous system
to a train of weighted impulses. When such an input is used to approxi¬
mate the output of a physical sampler, however, these equations likewise
involve an approximation. The degree of approximation depends upon
the width of the pulses from the sampler compared to the time constants
of the system.
If the output of the continuous system is desired only when t is an
integral multiple of T, Eqs. 1.9-3 and 1.9-4 reduce to Eqs. 1.9-5 and 1.9-6,
respectively. The latter equations would be used if, for example, the
output of the system were to be followed by another sampling switch.
For nonanticipatory systems with v(t) = 0 for t < 0, the limits of sum¬
mation again become 0 and n. The reader has no doubt noticed the
similarities between Tables 1.6-1 and 1.9-1. In fact, Eqs. 1.9-3 through
1.9-6 can be treated as special cases of Eqs. 1.6-2 and 1.6-4.
Equations 1.9-1 and 1.9-2 are identical with Eqs. 1.9-5 and 1.9-6,
respectively, except for the use of different symbols. It is therefore
instructive to compare the discrete and continuous systems shown in
parts (a) and (b) of Fig. 1.9-1. The relationship of y(t) to ift) in part (b)

v(t) = 2 u(kT)8(t - kT)


d(kT)
y(nT) = 2
^ k = - °°
v(kT)d(nT - kT)
k = -

(a)

v(t) v*(t) y(t)


h(t)

(b)
Fig. 1.9-1
40 Time-Domain Techniques

at instants which are integral multiples of T is exactly the same as the


relationship of y(nT) to v(kT) in part (a), provided that h(kT) = d(kT).
It is shown in Chapter 2 that a discrete system can be described by a
difference equation, and thus so can the configuration of Fig. 1.9-16,
provided that the output is desired only at t = nT. Difference equations
may be solved classically, as in Chapter 2, or by the Z transform, as in
Chapter 3.
Example 1.9-1. In Fig. 1.9-1 a, T = 1,
[2k for k > 1
d(k) =
(0 for k < 0
and the input is given by
(k for k > 0
v(k) = !
[0 for k < 0
Find an expression for the discrete output signal y{ri).
Using Eq. 1.9-2,

2/(0 = 2 O — k)2k for n > 1


fc=l

Thus
y(n) = 0 for n < 1
2/(2) = 2
2/(3) = 8
2/(4) = 22

The calculation of y(n) for large values of n can be greatly simplified if the expression
can be put in closed form. It can be shown, although not without some thought, that

J(« - k)2k = 2[2n - n - 1]


k=1

It is shown in Example 2.12-6 that the delta response d{k) in the last
example describes a system characterized by the difference equation
y{k + 2) — 3 y(k + 1) + 2 y(k) = 2 v(k + 1) — 2 v(k)
One disadvantage of the superposition summations in Table 1.9-1 is that
it is difficult and often impossible to express the answer in closed form.
When the preceding problem is solved in Example 3.13-2 by the use of
the Z transform, the solution is easily obtained in closed form.
Example 1.9-2. If, in Fig. 1.9-16, T = 1,
(2( for t > 1
hit) =
(O for / < 0
and
v(t) = tU^{t)
then the output at the instants t = n is
y(n) = 2[2n — n — 1] for n = 0, 1, 2, ...
Sec. 1.9 Superposition Summations for Discrete Systems 41

v(t) v*(t) q (t) y(t)


h(t) h(t)

(a)

v(t) v*(t)
-> q(t) q*(t)
h(t)
y(t)
h(t)

(b)

Fig. 1.9-2

Example 1.9-3. Figure 1.9-2# shows a sampling switch followed by two identical sys¬
tems in cascade, each having the impulse response

h(t) = te~l for t > 0

Assuming that the second system does not load down the first, and that the sampling
interval is T = 1, find the output y(t) when

v{t) = <r*U-x{t)

From Example 1.6-2, the impulse response of the two systems in cascade is t3e~ll6.
Using Eq. 1.9-3,
-t t
y{t) = > e k--- -I (/ - *)’ for t > 0
k=0 6 fc=o
Thus
y(t) = Ue_V6 for 0 < t < 1
y{t) = U3 + (t — l)3]e~76 for 1 < t < 2

Example 1.9-4. Repeat the previous example, with the added sampling switch shown
in Fig. 1.9-2b.

q(t) = V €~\t - k)€-“-*> = e-f y {t - k)


k=0 fc=0

m
y(t) = 2 (m - k) (t — m)e~ (t — m)

m=0 L fc=0

Using this approach, y(t) is not easily evaluated in closed form. For present considera¬
tions, it is sufficient to note that

y(t) = 0 for 0 < t < 1

y(t) = e~‘(t — 1) for 1 < t < 2

It should be observed that the results of the last two examples are not
identical, not even at the sampling instants of t = n. Expressions for
y(t) in closed form are easily found by means of the modified Z transform
in Example 3.14-1.
42 Time-Domain Techniques

Co <[>
+3 X
ctJ C
ex' _■?
W Z

g .SS °
•2 - V
n g**- £ «
1.§
—1 uo
.3
X J5 s-,
o X || o
s—
i_ N<0 U,<u -v
cl s £
Ui <D J-H . 1-4 a>
<D N <L> <U N D
S<D O oo ^ oo cl 2.
X £ g-1 00 £ 00

ONi
• r-H ON

£
J-H
<D o o
• t-H
O cl.2
D
00
N X X X
X
e •" S' w
.2 g 2 -*-* •*«* a3 CO

X 0) <3
3 -<-» r>.
O >-.
oo .X1 •I o a> £ * I1 l_o
X L> 1—1 u* £
X 00 X J_<L> NO s_ £"< ctf »-ha> Lt* u u
Oh - c <3 ■$■ oo S-1 00 00 ^ <D N

e ! g
Cl, oo
O Cl oo ^O —1oo
CL'~ CL •-' CL’~
bo *5 o J X D D X
* Z
Table 1.9-2

a
a,
3
o

8
8
8(Aj I
I 8H,|
II

*
References 43

Graphical Interpretation of the Convolution Summations

The equations in Table 1.9-1 are known as convolution or superposition


summations. Their graphical interpretation is similar to the graphical
interpretation of the convolution integrals in Section 1.6. The graphical
evaluation of Eq. 1.9-3 is carried out by folding h(t) about the vertical
axis, sliding it a distance t into the discrete signal v(kT), and then summing
up all values of the discrete product curve (instead of taking the area
underneath the product curve). In the case of the convolution summations,
however, the graphical evaluation is really identical with the analytic
evaluation and is therefore seldom used. The interpretation of h(t — kT)
as a weighting function may, however, be useful.

Time-Varying Systems

The equations in Table 1.9-1 are restricted to fixed linear systems,


except for the presence of ideal sampling switches. The extension of these
results to time-varying systems is parallel to the discussion of Section 1.7.
Define

d(nT, kT) — the response at time nT to b(t — kT)


d*(nT, kT) = the response at time nT to a unit delta function
occurring kT seconds earlier [time (n — k)T]
h(t, kT) — the response at time t to U0(t — kT)
h*(t, kT) = the response at time t to a unit impulse occurring
kT seconds earlier [time (t — kT)]

With these definitions, the equations of Table 1.9-1 are replaced by those
of Table 1.9-2 for time-varying systems. The interpretation of h*(t,kT)
as a weighting function is still valid.

REFERENCES

1. C. E. Shannon, ‘‘Communication in the Presence of Noise,” Proc. JRE, Vol. 37,


January 1949, p. 11.
2. L. Schwartz, Theorie des Distributions, Hermann et Cie., Paris, 1950-51.
3. P. W. Ketchum and R. Aboudi, “Schwartz Distributions,” Second Midwest
Symposium on Circuit Theory, 1956.
4. L. B. Arguimbau, Vacuum-Tube Circuits, John Wiley and Sons, New York, 1948,
p. 161.
5. R. E. Scott, Linear Circuits, Addison-Wesley Publishing Company, Reading,
Mass., 1960, Section 13-5.
6. E. E Jury, Sampled-Data Control Systems, John Wiley and Sons, New York, 1958,
Chapter 9.
44 Time-Domain Techniques

Problems

1.1 Under what conditions does the differential equation

dny dy dmv dv
a”dr+'" + J, + dF< +••■ +b1Jt +b0v

describe a linear system? a time-invariant system? a nonanticipatory


system ?
1.2 Assume that two different linear systems are connected in cascade, as in
Fig. 1.2-1. If the systems are time-invariant, and if the second system
does not load down the preceding one, will interchanging their order
affect the relationship between y{t) and v(t) ? Repeat for time-varying
systems. Prove your answers.
1.3 Show that lim /0(7) = U0(t), where f0(t) is the function in Fig. \A-la.
A->0
1.4 Find the step response of the circuit in Fig. PI.4.

2a

1.5 Find the response of Fig. 1.4-8 when the current source has the waveform
shown in Fig. PI.5, and also when ix{t) = 3e~t for t > 0.

h(t)

1.6 Check the results of Example 1.6-1 by considering the voltage source to
be the sum of two singularity functions, and by finding the response to
each one.
Problems 45

1.7 A fixed linear system containing no initial stored energy has the input
and impulse response shown in Fig. PI.7. Find the response at t = 4

u(t) h(t)

seconds both analytically and graphically. Roughly sketch y{t) versus


t for t > 0.

1.8 The cross-correlation functions


P 00

^12(0 = + T) dr
J — 00
C* 00
^21(0 = f2(r)fl(t + T) dr
J — 00

are important in the analysis of systems subjected to aperiodic inputs.


Prove that </>21(7) = <^>i2( — t). If

/i(0 = (t - - 1) - it - l)C/_! (7—2)

and f2{t) = 2e~tU_1(t), find and sketch


1.9 Prove Eq. 1.6-10.
1.10 Derive Eq. 1.7-7 by letting v{t) = !/_!(/ — r) and 2/(r) = r(t, r) in Eq.
1.7-2. Then derive Eq. 1.7-6 by using the substitution A = t— r.
1.11 A system is described by the differential equation

(7 + 1) + y = (t + 1M0

Noting that the left-hand side of this equation is the derivative of (t + 1 )y,
find hit. A), h*(t, t), r(t, A), and r+(/, r). Find the response to v{t) =

1.12 Repeat the previous problem for the differential equation

dy y dv
— -(-- == — -)- v
dt t + 1 dt

1.13 In the cascade connection of Fig. 1.2-1, assume that the impulse responses
of Nx and N2 are denoted by hx{t. A) and h2(t, A), respectively. Derive
an expression for the impulse response of the cascaded combination.
46 Time-Domain Techniques

1.14 Assume that three linear components are placed in cascade. If the first
is an ideal integrator, the third an ideal differentiator, and the middle
one characterized by the impulse response /z*(/, r) = te~iT+t) for r > 0,
find the impulse and step responses of the combination.
1.15 In Fig. PI.15, MO = (t - Ve-'U^it) and MO =* e^fcos (tt/2)/ -
sin (tt/2)/]I/MO cos [(tt/2)/ + (tt/4)]£/_i(0- The sampling switch
has a period T = 1 second. Find 2/0*70 when i>(0 = C/MO-

v(t) ^ 9*(0
h-i(t)

Fig. PI.15

1.16 A sampling switch is added just before MO Tig. PI. 15. The switches
operate synchronously with a period T = 1 second, and MO = (1/2)*
and MO = 2 cos (tt/3)/1 for t > 0. Find y(nT) when i?(0 = C/_2(0» and
also when v(0 = sin (tt /3)1.
1.17 In Problem 1.15, find 2/(0 for 0 < t < 3.
1. 8 A certain feedback control system samples the input r>(0 at t = nT, where
= J, and n — 0, 1, 2, .... The response to v(t) = U_x(t) is
2/(0 = (e-yT/6)[l — e~nT], where t = (n + y)T, and 0 < y < 1. Find
and sketch 2/(0 for 0 < t <2 T when i?(0 — e~2t
2
Classical Techniques

2.1 INTRODUCTION

The classical method of describing a linear system is by a differential


or difference equation relating the output to the input. A differential
equation is used for continuous systems, and a difference equation for
discrete systems. Since a continuous system may be initially described
by a set of simultaneous equations, Section 2.3 considers solving such a
set for the single desired differential equation.
An explicit method exists for solving both time-invariant and time-
varying differential equations of the first order, and it is presented in
Section 2.5. For higher order equations, a general and explicit method of
solution exists only for the time-invariant case. This method is given in
Section 2.6, and higher order time-varying equations are deferred to the
following section. Section 2.8 relates the material of Chapters 1 and 2.
It presents methods of finding the impulse response from the differential
equation for both fixed and varying systems.
The treatment of difference equations is roughly parallel to that for
differential equations, except that a preliminary section on difference
and antidifference operations is included. The entire chapter is restricted
to linear systems. Only analytical methods of solution are considered,
although in many applications computer techniques are the most practical
approach.1

2.2 REPRESENTING CONTINUOUS SYSTEMS


BY DIFFERENTIAL EQUATIONS

A continuous system is often described by a differential equation


relating the output y(t) to the input v(t). The general form of such an
47
48 Classical Techniques

equation! is
in—1
dny y . . dy . , dm'v , dv
an + an-1 t + • ' • + ax — + a0y = b m-h ’ ‘ + b1— + b0v
dt7 dt n- 1 dt dtm dt
(2.2-1)
Since the input v(t) is presumably known, the right side of this equation
can be represented by F(t), which is often called the forcing function.

dy dn 1y dy
a + an-i ~ \ + • ’ ' + ai~ + aoV — F(t) (2.2-2)
dtn dt 1 dt
For linear systems, the a/s and b/s cannot be functions of v or y but may
be functions of t. For fixed linear systems, these coefficients must be
constants. The proof of these two statements is similar to Examples 1.2-3
and 1.2-4.
A system’s differential equation may be given, or it may have to be
found from a model of the system. In the latter case, a set of simultaneous
differential equations are written directly from the model. The next
section gives an example and discusses the solution of the simultaneous
equations to yield a single equation relating y{t) and v(t).
The operator p is often used to indicate differentiation and is defined
byj
PMO] = y MO] (2-2-3)
dt
If c1 and c2 are constants,
dm+nv
pm(pn v) = pm+nv =
dtm+n
(2.2-4)
pm(c ffil + C2v2) = c1pmv1 + c2pmv2
(p + c/){p + c2)u = [p2 + (c1 + c2)p + cxc2]v
where m and n are non-negative integers. In fact, in most respects the
operator p may be treated as an algebraic quantity. The most notable
exception is that it does not, in general, commute with functions.
p(tv) 5* t(pv)
p(l\V2) ^ VjffD2)

Using the operator p, Eqs. 2.2-1 and 2.2-2 become


(anprl + an__xpn 1 + • • • + axp + a0)y(t)
= (bmpm H-+ bxp + b0)v(t) = F(t) (2.2-5)
f For a differential equation to uniquely describe a system, it must be understood that
the system is nonanticipative.
%D is another commonly used symbol for the differentiating operator. It is not used
in this book, however.
Sec. 2.3 Reduction of Simultaneous Differential Equations 49

The quantities in parentheses preceding y and v are themselves operators.


For fixed linear systems, where the coefficients are constants, the last
equation is written symbolically as

Mp)y(t) = B(p)v(t) = F(t) (2.2-6)

For varying linear systems, where the coefficients are functions of time,
A and B are time-varying operators. This is shown by writing

A(p, t)y(t) = B(p, t)v{t) = F(t) (2.2-7)

The formal definition of the operators A and B follows from a comparison


of the last three equations.

2.3 REDUCTION OF SIMULTANEOUS


DIFFERENTIAL EQUATIONS

A continuous model can be described mathematically by a set of


simultaneous differential equations. Basically, one set of equations is
needed to characterize each of the components, and another set to describe
their interconnection. These two types of equations are usually, however,
combined by inspection when mathematically describing the model.
The resulting set of differential equations can then be reduced, although
not always without difficulty, to yield a single equation relating the
output and input. The solution of simultaneous time-invariant equations
is far easier than that of time-varying equations and is considered first.
Example 2.3-1. An equation relating the output voltage e2 to the source voltage ex
is desired for the circuit of Fig. 2.3-1.
Summing up the currents leaving nodes 3 and 2, respectively,

„ de 3 (1
c1T +
clt u +
1 \
/ 3_
C2
_ de2
Ci—
dt
+
1

Rx
£2

Fig. 2.3-1
50 Classical Techniques

and
* ~
3 1
de v de 2 (\ 1\ 1
I—+—U+-
c'Hi + *S\ + (Cx + C2) ——|- e2 dt
at \Ri R2J L

-c,p
dt

Differentiating the second equation term by term to remove the integral sign, and using
the operator p = djdt, these equations become, after assuming for simplicity that all
resistances, capacitances, and the inductance have values of 1 ohm, 1 farad, and 1
henry, respectively,
(p + 2)e3 — (p + l)e2 ex
~(p2 + p)e3 + {Ip2 + 2p + l)e2 = (p2)e1

Premultiply each term in the first equation by the operator p2 + p, and each term in the
second equation by p + 2, and then add the two equations. Since

{p2 + p)(p + 2)e3 = (p + 2 )(p2 + p)e 3


the terms involving e3 are eliminated. Then

(p + 2)(2p2 + 2p + \)e2 — (p2 + p)(p + \)e2 = [(p2 + p) + p2(p + 2)K


or
(p3 + 4p2 + 4p + 2)e2 = (p3 + 3p2 + p)ex

which is the desired result.

The procedure used in the preceding example is valid for any two time-
invariant equations. If L represents an operator which is a function only
of /?, the equations can be written symbolically as

Ln(p)yi(t) + L12(p)y2(t) = Fx(() 3


La(p)yi(t) + L22{p)y2{t) = F2{t)

Premultiply the first equation by L21, and the second by Ln, and subtract.
Since L2lL11y1 =

(LnL22 L21L12)y2 — F21F1 L1XF2 (2.3-2)


Similarly,
(LuL22 F21L12)y1 = L22F1 L12F2 (2.3-3)

Each of the last two equations has only one independent variable and can
be solved by the methods of Section 2.6.
Consider next the general case of n simultaneous equations. Symbol¬
ically,
LnVi + L12y2 + * * • + Llnyn — F^t)
+ F22y2 T- • • • T* L2nyn — F2{t)

LniVi + Ln2y2 + • • * + Lnnyn — Fn(t)


Sec. 2.3 Reduction of Simultaneous Differential Equations 51

As long as the Lij operators depend only upon p, the solution obtained
by Cramer’s rule can be shown to be valid.f

^(p)yi=I.Ski(p)Fk(t) (2.3-5)
k= 1

where A(p) is the differential operator found by evaluating the determinant

E-12 ... r
^-11 ^ln

C2i ... T
A(p) = T22 ^2 n (2.3-6)

... T
Lni Ln2 nn

Aki(p) ^ the operator found by evaluating the kith cofactor of this deter¬
minant, i.e., the determinant A(p) with the A:th row and zth column
removed, multiplied by (—l)fc+h
This section has thus far been restricted to time-invariant differential
equations. Obtaining a single differential equation relating the output
and input of a varying system is more difficult. Suppose that a system
is represented by the simultaneous equations

Kpy i) + p(t2y2) = Fi(t)


p(tyi) + t(py2) = F2(l)

To eliminate y1 from these equations, one might try premultiplying the


first equation by pt and the second by tp, and then subtracting the two.
If this is done,
pt\pyi) + pKpt2y2) = ptfi(0
tpipty i) + tp(tpy2) = tpF2(t)

However, pFipyf) = (t2p2 + 2tp)yY, while tp(pty^) = t{tp2 + 2p)yp, hence


subtracting the last two equations will not eliminate yx.
Equations 2.3-2 and 2.3-3 are not valid for varying systems, because

L21(p, t)Llx(p, t)yx{t) ^ Ln(p, t)L21(p, t)yx(t)

Similarly, Eq. 2.3-5 does not hold. While a single differential equation
relating the output and input can often be obtained, no simple formula
of the type of Eq. 2.3-5 exists which is generally applicable. Also, the
difficulty in determining a single differential equation often makes other
methods advisable. One of these methods is discussed in Chapter 5.

f Those in need of a review of determinants and Cramer’s rule should read Sections 4.3
and 4.6, respectively.
52 Classical Techniques

2.4 GENERAL PROPERTIES OF


LINEAR DIFFERENTIAL EQUATIONS

Any «th order linear differential equation can be written in the form of
Eq. 2.2-5, repeated below.

(anpn + a^p”-1 H-+ ayp + a0)y(t) = F(t) (2.4-1)

Because all the remarks of this section are valid for both fixed and varying
systems, tjae a/s can, in general, be functions of t. If the right side of the
last equation is identically zero,

(anpn + an~\Pn~1 + + a0)y(t) = 0 (2.4-2)

which is called a homogeneous differential equation. Equation 2.4-1 is


called a nonhomogeneous or inhomogeneous equation.
Equation 2.4-2 can have, at most, n linearly independent solutions.
The concept of linear independence is precisely defined in Chapter 4.
In brief, n objects are said to be linearly dependent (or just dependent)
if at least one of them can be expressed as a linear combination of the
remaining ones. If the objects are not dependent, they are said to be
independent. A necessary and sufficient condition for n solutions of
Eq. 2.4-2 to be independent is that their Wronskian does not vanish.
If 2/i, y<i, • . . , yn represent n solutions, the Wronskian is given by the
determinant
yi y2 ’ " yn
W(t) = PVl py2 ' ’ ‘ pVn (2.4-3)

pn~xyi pn~xy2 • • • pn~lyn


The most general solution of Eq. 2.4-2 is

yH — + K2y2 + * * * + Knyn (2.4-4)

where the K/s are arbitrary constants. The subscript H refers to the
solution of the homogeneous equation.
The last equation says that, once n independent solutions are known,
any other solution can be expressed as a linear combination of these n
solutions. For fixed systems, there is a general method of finding n
independent solutions to the homogeneous differential equation. For
varying systems there is, unfortunately, no such general method.
The most general (or “complete”) solution of the nonhomogeneous
Eq. 2.4-1 is
y = yH + yP (2.4-5)
Sec. 2.5 Solution of First Order Differential Equations 53

where yH is given in Eq. 2.4-4 and is the solution of the related homo¬
geneous equation. yP is any one solution, no matter how arrived at,
which satisfies Eq. 2.4-1, and is known as the particular or particular
integral solution. yH is called the complementary solution. Any method
of finding yP, including guesswork, is allowable. In fixed circuits with a
sinusoidal source, for example, a-c steady-state circuit theory might be
employed. The next section shows an explicit general method of finding
yP once yH is known.
As yP contains no arbitrary constants, y and yH both contain n such
constants. Their evaluation requires a knowledge of initial or boundary
conditions, and it is discussed in Section 2.6.

2.5 SOLUTION OF FIRST ORDER


DIFFERENTIAL EQUATIONS

Any linear first order differential equation can be written in the form
of Eq. 2.5-1.

f- + a(t)y = F(t) (2.5-1)


dt

For convenience, it is assumed that the coefficient of dyjdt has been made
equal to unity. The coefficient a may in general be a function of t. Such
an equation may always be solved by the introduction of the integrating
factor eJa(t)dt. Both sides of Eq. 2.5-1 are multiplied by this factor.

dy. e5a(t)dt+ a(f)y€Ia(t)dt= F^€!a(t)dt (2.5-2)


dt

The left side of the last equation is the time derivative of y€lait)df So

y€iad) dt_ jF(t)eSa(f)dt dt + c, c = constant (2.5-3)

Although the integration on the right side of Eq. 2.5-3 may sometimes be
difficult, the method constitutes an explicit procedure for solving the
differential equation, for both the time-invariant and the time-varying
cases. It will give both the complementary and the particular solutions
in the answer.
In evaluating |'a(t) dt, it is not necessary to include an arbitrary con¬
stant of integration. The reader should convince himself that including
such a constant will not increase the generality of the final solution.
54 Classical Techniques

Example 2.5-1. Solve dyfdt — ty = t for y{t).


The integrating factor is e~ltdt — e~t2/2. Multiplying the original equation by the
integrating factor,

4
dt
= tc*'*

ye-*2'2 = - je-t2,2(-tdt) + c = -e-‘2/2 + c

y — — 1 + ce*2/2

Although the well-known separation of variables method is sometimes simpler than


the use of the integrating factor, it is not applicable to all first order equations. It could
not be used in this example.

2.6 SOLUTION OF DIFFERENTIAL EQUATIONS


WITH CONSTANT COEFFICIENTS

Linear time-invariant systems are represented by linear differential


equations with constant coefficients. The nih order nonhomogeneous
and homogeneous equations are given by Eqs. 2.4-1 and 2.4-2, respec¬
tively, where the a?s are constants. First order equations were discussed
in Section 2.5, while the methods of this section are applicable to all orders.

Homogeneous Differential Equations

Consider the nth order equation

(anpn + an_xpn~x + • • • + axp + a0)y = 0 (2.6-1)


/
Assume that the solutions are of the form y = er\ where r is a constant
to be determined. Substituting the assumed solution into Eq. 2.6-1, there
results
(anrn + a^r71-1 + • • • + axr + a0)eTt = 0

For the last equation to be satisfied for all values of t,

anrn + an-irn~x + • • • + axr + a0 = 0 (2.6-2)

Equation 2.6-2 is called the characteristic or auxiliary equation and can be


written down directly from Eq. 2.6-1. The left side is an nth order alge¬
braic polynomial, so Eq. 2.6-2 has n roots. Denoting these roots by rl9
r2, . . . , rn, the corresponding solutions to Eq. 2.6-1 are

Vi = £ri\ 2/2 = *T2t, • • • , Vn = crnt


Sec. 2.6 Solution of Differential Equations with Constant Coefficients 55

If these n solutions are linearly independent, the most general solution


to the homogeneous differential equation is

yH = + K2er * + • • • + Kner** (2.6-3)

If the rf s are all different, it turns out that W(t) in Eq. 2.4-3 does not
vanish, so that the n individual solutions are independent. If r1 = r2,
then yx = erit and y2 = terii are independent solutions. If a root, say rl9
is repeated k times, so that rx — r2 = • • • = rk, then the most general
solution is

yH = Kxer i* + K2t€ri* + • • • +

+ Kk+1erjc+lt + ‘ • ‘ + Knernt (2.6-4)

Thus finding yH involves only solving for the roots of an nth order equa¬
tion. In the event that some of the roots are complex, however, the
solution should be written in another form.
Since the coefficients in Eq. 2.6-2 are real, any complex roots must
occur in complex conjugate pairs. If one root is r1 = a + jfi, where
a and ft are real, another root must be r2 = a — jf. Then

K1eri' + K2er2t = ^(K^* + K^~m)


= ea<[(X1 + Kf) cos pt + j(K± — K2) sin ft]
= eat[A cos ft + B sin fit]
For a real system, the af s are real numbers, and yH is a real function of
time. This means that, when the arbitrary constants are evaluated,
A and B must be real numbers, which in turn means that Kx and K2 must
be complex conjugates. Since two trigonometric terms of the same
frequency may be written as a single term with a phase angle, it is also
possible to write
K^ + K2erit = Ke** COS (ft + </>)

Nonhomogeneous Differential Equations—Method of Undetermined


Coefficients

Consider the equation


(anpn + an_1pn~1 + * • • + axp + a0)y = F(t) (2.6-5)

whose solution is
V = Vn + Vp (2.6-6)
The complementary solution yH is found by replacing F(t) by zero and
solving the resulting homogeneous equation as before. There are two
standard methods of solving for the particular integral solution yP, the
56 Classical Techniques

method of undetermined coefficients and the variation of parameters


method.
The method of undetermined coefficients can be used only when the
forcing function F{t) possesses a finite number of linearly independent
derivatives. F(t) may be a polynomial in positive integral powers of t,
or a combination of simple exponential, sinusoidal, or hyperbolic functions.
If F(t) is, for example, In t or yjt, the method is not applicable (unless a
solution in the form of an infinite series is assumed). The basic procedure
is to assume that y is a linear combination of the terms in F(t) and its
derivatives, each term being multiplied by an undetermined constant. The
assumed solution is then substituted into Eq. 2.6-5, and the undetermined
constants are so chosen that the equation is satisfied for all values of t.
A modification of this procedure is necessary if a term in F(t) has
exactly the same form as a term in the complementary solution. Physi¬
cally, this is the familiar resonance phenomenon, where the system is being
excited at one of its natural modes. For example, the equation

g + 3^ + 2 y=l+c-‘
dt2 dt
for which yH = + A2e_2<, is not identically satisfied by yP =
A + Be~\ regardless of the choice of A and B. It is reasonable to expect,
however, that the term resulting from exciting the system at a natural
mode would decay more slowly with time than would otherwise be the
case. It is thus logical to try as a solution

yP = A + Bter1

When this solution is substituted into the differential equation, it is found


that the equation is satisfied identically for A = \ and B = — 1.
The general procedure when a term in F(t) is identical in form with a
term in yH is to multiply by t the corresponding terms in the assumed
yP. The same procedure is followed when a term in F(t) is tn times a
term in yH. If, however, the term in F(t) corresponds to a repeated root
of the characteristic equation (an rath order root, for example), then the
corresponding terms in yP are multiplied by trn.
Example 2.6-1. Find the general solution of

d2y dy
— + 2 — + y = tc-(
dt2 dt

The characteristic equation is r2 + 2r + 1 = 0, so = r2 = — l,and yH = 4-


Koe~l. Although normally yP = Ate~( 4- Be~\ in this example the characteristic equa¬
tion has a double root at —1; hence

yP = At3e~l + Bt2e~f
Sec. 2.6 Solution of Differential Equations with Constant Coefficients 57

Note that, although repeated differentiation of At3e_i also leads to the terms C/e_< and
De~\ these terms are not included in the assumed yP. The reason is that these terms
are solutions of the related homogeneous differential equation and would therefore
vanish when they are substituted in the left side of the original differential equation.
Substituting yP as determined above,

— + 2 — + yP = (0)r3e-‘ + (0)r2€-4 + 6 Ate-* + 2 Be^ = t€~*


dt2 dt

Thus A = B = 0, and the complete solution is

y = Kde-1 + K*r* + |r3e-4

Nonhomogeneous Differential Equations—Variation of Parameters

Unlike the method of undetermined coefficients, the variation of


parameters method of finding yP will work whether or not the forcing
function F(t) has a finite number of independent derivatives. Also unlike
the first method, it will work even if the a- s in Eq. 2.6-5 are functions of
time, although this is of no concern in this section.
The variation of parameters method attempts to find the particular
integral solution in terms of the complementary solution. Since this
method may be less familiar to the reader, consider first the first order
differential equation
(aYp + a0)y = F(t) (2.6-7)

The complementary solution satisfying the homogeneous differential


equation
{axp + a0)y = 0 (2.6-8)

has only the single term yH = Kyx. The particular integral solution is
assumed to have the form
yP = uy1 (2.6-9)

where all three symbols in Eq. 2.6-9 are functions of t. To find u, Eq.
2.6-9 is substituted into Eq. 2.6-7, giving

+ uyf) + a{iuy1 = F(t)

where the dots denote derivatives with respect to t. Rearranging yields

a1uy1 + u(a1y1 + a0y x) = F(t)

In the last equation, the factor in parentheses is zero, because y1 satisfies


Eq. 2.6-8. Therefore
du
F(t) (2.6-10)
dt a1y1
58 Classical Techniques

Example 2.6-2. Find the complete solution to

dy e 31
— + 3y = —
dt t

The homogeneous solution is yH = Ke-3t, i.e., yx = e_3t. Assuming yP — ue~3t,

du g( e~3t 1
dt t t

u = In t and yP = Ke~3t In t
The complete solution is
y = Ke~3t + e~3t In t

Next consider the second order differential equation

(a2p2 + axp + aQ)y = F(t) (2.6-11)

The complementary solution satisfying the homogeneous differential


equation
(a2p2 + axp + a0)y = 0 (2.6-12)
consists of two parts.
Vh = KiVi +

The particular integral solution is assumed to be

Vp = UiVi + w2?/2 (2.6-13)

where uY and u2 are undetermined functions of t. To evaluate ux and u2,


two conditions are needed. One of these is that Eq. 2.6-13 must satisfy
Eq. 2.6-11, but the other condition may be chosen in any way that appears
convenient.
yP = Wi yx + u1y1 + u2y2 + u2y2

The expressions for yP and yP are less cumbersome if

dddx + d2y2 = 0 (2.6-14)

Equation 2.6-14 is therefore taken as the second of the two needed


conditions.
yp = Uiih + U2y 2

yp = u1ij1 + u1y1 + u2y2 + u2y2

Substituting into Eq. 2.6-11,

+ uiyx + u2ij2 + u2y2) + a1(u1y1 + u2y2)

+ tfoOE/i + u2y2) = F(t)


Sec. 2.6 Solution of Differential Equations with Constant Coefficients 59

Rearranging,
a2(Miih + w2</2) + ul(a2ij1 + a1yl + aay^
+ u2{a2y2 + a1y2 + a0y2) = F(t)
Since y2 and y2 both satisfy Eq. 2.6-12,
m
«i*/i + u2y2 = (2.6-15)
Q2
To obtain explicit formulas for ux and u2, Eqs. 2.6-14 and 2.6-15 are solved
simultaneously. There results

Wi = -*/2r(0
u« =
y,m (2.6-16)
~ ViVi) a2(2/i*/2 - ViVs)
It is worth noting that, since yx and y2 are two independent solutions of
Eq. 2.6-12, Eq. 2.4-3 implies that yxy2 — y1y2 ^ 0. Since the denominator
of Eq. 2.6-16 is not zero, u1 and u2 can always be explicitly found.
Example 2.6-3. Find the complete solution to

d2y 1
-T7 + y = -

The homogeneous solution can be written as

Vn ~ Ki cos t + K2 sin t
Thus
yx = cos t and y2 = sin t
In Eq. 2.6-16,
2/i2/2 - ViVi = cos21 + sin2 t = 1
sin t cos t
-, «2 =-
t t
The complete solution is
sin t cos t
y = Kx cos t + K2 sin t — (cos t) dt + (sin t) dt

The integrals appearing in the answer cannot be expressed in terms of a finite number of
elementary functions. The integrals can be expressed by infinite series, however, and
happen to be well-tabulated functions—the “sine integral” and “cosine integral”
functions that have many applications.

Consider finally the n\h order differential equation shown in Eq. 2.6-5.
The complementary solution has the form

Vh — K-iVi + K*y2 + * ' ' + Knyn

The assumed particular integral solution is

Vp = "Mi + "2y2 + ' ' ’ + unyn (2.6-17)


60 Classical Techniques

where the u - s are functions of /. The derivatives of the u/s are found by
solving simultaneously the following n equations.

“iVi + u2y2 + • • • + unyn = 0


tiiVi + w22/2 + • * * + dnyn = 0

n—1 Fit)
UiVi + umI 1 + • • • + unynn 1 =
an
where dots and superscripts denote derivatives with respect to t. The first
w — 1 of these conditions are arbitrarily selected to put the result in a
tractable form. The last of the equations is found by substituting the
assumed yP into Eq. 2.6-5 while making use of the first n — 1 equations.
The set of equations above can be solved in terms of determinants by
Cramer’s rule.
WJt)F(t)
ui for i — 1, 2, . . . , n (2.6-18)
anW(t)
where W is the determinant
Vi y2 • • • Vn
Vi 2/2 • • • Vn
mt) = (2.6-19)
. ,71—1 . .n—1 „ .n—1
Vi y2 Vn
and where Wni(t) is the mth cofactor. By Eq. 2.4-3, W(t) is the Wronskian,
which does not vanish if yx through yn are independent solutions of the
homogeneous equation.
Example 2.6-4. Find the complete solution to
d3y d2y
+ 3 + y =
dt3 dt2
The homogeneous solution is
yH = K.e-t + KtU~* + K^

1 t t2
W(t) = e~3< -1 (—/ + 1) (->2 + 20 2e~3t
1 (t - 2) (/2 - At + 2)
ux = (|e3<)(/26-29(e-9 = bj2
u 2 = ae39(-27e-2t)(e-9 = -t

*3 = (*€80(€-a*)(e-*) = *
«i = h3
U2 = —¥2
«3 = y
yP = (i?3)C-4) + (—*/*)(/€-*) + t3e -t
Sec. 2.6 Solution of Differential Equations with Constant Coefficients 61

Thus the complete solution is

y = Kl€~* + K2te-t + K3t*e-* + ^3e-^

A comparison of the two standard methods of finding y P reveals that


the method of undetermined coefficients is often the simpler. It has the
disadvantages, however, of being valid for only a limited class of forcing
functions and of not applying for time-varying systems.
The variation of parameters method always gives an explicit expression
for yP once yH is known. This expression takes the form of a single time
integration of a known function, which is one reason why yP is commonly
called the particular integral solution. In some cases, the integral may be
difficult to evaluate, and numerical methods mav have to be used. The
important point, however, is that the method is valid for all forcing
functions and can be extended to the solution of time-varying systems.

The Particular Solution for the Input est

The general time-invariant system is described by Eq. 2.2-1 or 2.2-6.


Using the A and B operators,
Mp)y(t) = B(p)v(t) (2.6-20)

If v(t) = est, the particular solution has the form yP{t) = H(s)est, where
H is not a function of t. Substituting these expressions into the differential
equation gives
A(/?)[//(T)esi] = 5(/?)[esi]

Since pkest = skest, B(p)est = estB(s), where B(s) is a function formed by


replacing the differentiation operator p by s.

H(sy‘A(s) = estB(s) or H(s) = — (2.6-21)


A(s)
Thus H(s) can be written down by inspection of the differential equation.
Also
yp(t) (2.6-22)
H(s) =
_ v(t) _U(0=e
H(s) is usually called the system function or transfer function. The
fact that the response to est can be found so easily suggests decomposing
an arbitrary input into a set of est functions. In the terminology of Section
1.3, the elementary function would be k(t, A) = eH, and the response to
the elementary function would be K(t, X) = H{X)eXt. Carrying out this
suggestion leads to the Laplace transform technique of Chapter 3.
For a time-invariant circuit, the system function H(s) can be determined
directly from the model, instead of first writing the differential equation.
62 Classical Techniques

In the particular solution for v{t) = est, all currents and voltages have the
form e(t) = E(s)est and i(t) = I(s)est. The three passive circuit elements
are shown in Table 2.6-1. When the preceding expressions for e and i are

Table 2.6-1

Element Defining Equation Z(s)

R e = Ri ' R
—wv—
di
e =L — sL
dt

1
c 1 = c dt— sC
—ii—
inserted into the defining equation for the circuit element, the ratio of the
voltage to the current can be found. This ratio, called the impedance
and denoted by Z(s), is independent of t. Because of this fact, the relation¬
ship between yP(t) and v(t) can be found by the same basic rules as apply
to d-c circuits.
Example 2.6-5. For the circuit of Fig. 2.6-1, find the particular solution eP(t) when
the input current is i(t) — est.

(R + sLXl/sC) sL + R
E(s) = -I(s) =- I(s)
R + sL + 1 /sC s2LC + sRC + 1
Thus
sL T R
H(s) =
s2LC + sRC + 1

and the particular solution when i(t) = e*1 is

sL T R
eP(t) = —-est
P s2LC + sRC + 1

Fig. 2.6-1
Sec. 2.6 Solution of Differential Equations with Constant Coefficients 63

Equation 2.6-21 can be used to construct the differential equation


relating the input and output from a knowledge of the system function.
For the example above, the differential equation is

(.LCp2 + RCp + \)e(t) = (Lp + R)i(t)

The variable s is usually called the complex frequency. When 5 is


replaced by jco, the expressions for Z(joo) in Table 2.6-1 are those used
in a-c steady-state analysis. The particular solution when v = ejcot can be
found from
yp(t) B(jco)
H(jco) = (2.6-23)
_ v(t) J v(t)=e
J(Ot
A(jco)

The particular solution to a sinusoidal input can be found directly from


H(j co) by using Example 1.2-5. Since cos cot = Re [e3"*], and since the
response to v(t) = ejcit is yP(t) = H(joj)ejoit, the response to v(t) = cos cot
is
yP = Re [H(jco)ei(at]

If the complex quantity H(jco) is expressed in polar form by \H(jco)\ eje, the
response can be written
yP = \H(j<o)\ cos (cot + 6)

H(jco) is often called the frequency spectrum of the system and is basic
to a-c steady-state circuit theory.

The Evaluation of the Arbitrary Constants

The arbitrary constants in the complementary solution are evaluated


from a knowledge of boundary conditions or initial conditions. In most
system problems, the values at t — 0+ of y(t) and the first n — 1 derivatives
are used to evaluate the arbitrary constants in the solution to an «th
order differential equation. The 0+ indicates that the value of the de¬
pendent variable and its derivatives are known an infinitesimal interval
after the reference time t = 0. The quantities are normally found by
examining the stored energy in the system at t — 0. It is important to
emphasize that the arbitrary constants depend upon the forcing function
used and cannot be evaluated until after yP has been found.
Example 2.6-6. A simplified model of a vacuum-tube amplifier with high-frequency
compensation is shown in Fig. 2.6-2. The voltage applied to the grid is

eft) = U_ft) volts

For negative values of t there is no stored energy associated with L and C. Find the
output voltage eft).
64 Classical Techniques

Fig. 2.6-2 (gm R =


1800 ohms.) v

The circuit can be described mathematically by writing two node equations. Applying
KirchhofTs current law at nodes 2 and 3,

de2 1 \ 1
C17 + Rez I ~Re°=

1 / 1 1
e-> dt\ =0
~Re2+\Rea + L

The second equation can be differentiated term by term to remove the integral sign.
Solving the first equation algebraically for e3 and substituting the result into the second
equation gives
d2e2 Rde2 1 _ ldex R
~dF + L~dt + LC62 ~ ~C \~dt +

As expected, this is the same differential equation as the one found for Fig. 2.6-1 by a
different method. Inserting the given numerical values and expressing / in microseconds
rather than seconds, there results

d2e o deo dex


—: + 36 — + 800co = -400 — - 14,400^x
dt2 dt “ dt
For t > 0,
d‘2eo deo
-- + 36 — + 800eo = -14,400
dr2 dt

The characteristic equation is


r2 + 36r + 800 = 0
which has roots at
r = —18 ±/'21.8
Hence
(e2)jj = e~18,[Ah cos 21 .St + K2 sin 21.8/]

The particular integral solution can be found by examining the differential equation,
or by using d-c steady-state theory.
14,400
02)p = -18
800
The complete solution is

e2 = —18 + cos 21.8/ + K2 sin 21.8/] for / > 0


Sec. 2.6 Solution of Differential Equations with Constant Coefficients 65

Also

= e-18t[(21.8i5r2 - 187Q cos 21.St - (21.8Kx + 18AT2) sin 21.8?] for t > 0
at

Kx and K2 can be evaluated from the last two equations if e2 and de2/dt are known at
t = 0 + . From the problem statement, the voltage e2 and the current through L were
zero for negative values of t. These values cannot change instantaneously unless there
is an impulse of current into C or an impulse of voltage across L, so they must remain
zero at t = 0 + . To find the value of de2/dt at t = 0 + , note that, since the inductor
current remains zero, the current through C must be ic = i = —gm = —0.01. Since
4 = C (defdt),

de 2 - 0.01 volts volts


-400 x 106 = - 400
^(0+) 25 x lO"12 second microsecond
Thus
0 = -18 + Kl

-400 = 21.8Ko - 18^


giving
Kx = 18 and K2 = -3.5
The solution is
= -18 + e-18'[18 cos 21.St - 3.5 sin 21.8/]

Example 2.6-7. Repeat the previous example with e_1(t) = (\cos 4 \times 10^7 t)\,U_{-1}(t).
The form of the complementary solution is not changed, and the particular solution
can be found from the system function

H(j\omega) = \frac{-g_m(j\omega L + R)}{1 - \omega^2 LC + j\omega RC}

For the given values of L, R, and C, and for \omega = 4 \times 10^7,

H(j\omega) = (-13.1)\epsilon^{-j71°}

The complete solution, when t is in microseconds, is

e_2(t) = -13.1\cos(40t - 71°) + \epsilon^{-18t}[K_1\cos 21.8t + K_2\sin 21.8t]

The arbitrary constants can again be evaluated from e_2(0+) and (de_2/dt)(0+).

0 = -13.1\cos 71° + K_1
-400 = -13.1(40)\sin 71° + 21.8K_2 - 18K_1

giving K_1 = 4.3 and K_2 = 7.9.

Homogeneous Initial Conditions

In some important problems, the initial conditions are homogeneous.


For the nth order differential Eq. 2.6-5, homogeneous initial conditions
are

y(0) = \frac{dy}{dt}(0) = \cdots = \frac{d^{n-1}y}{dt^{n-1}}(0) = 0 \quad (2.6-24)

The variation of parameters method, previously used to find a particular
solution, can be easily extended to give the complete solution which
satisfies these homogeneous initial conditions. For n = 1, Eqs. 2.6-9
and 2.6-10 are combined as

y(t) = y_1(t)[u_1(t) - u_1(0)] = y_1(t)\int_0^t \frac{F(z)}{a_1(z)y_1(z)}\,dz \quad (2.6-25)

where y_1(t) is the complementary solution, and z is a dummy variable
of integration. The upper limit yields the particular solution previously
discussed, while the lower limit yields a specific arbitrary constant for the
complementary solution. Note that y(0) = 0.
For n = 2, Eqs. 2.6-13 and 2.6-16 are combined as

y(t) = y_1(t)[u_1(t) - u_1(0)] + y_2(t)[u_2(t) - u_2(0)]
     = -y_1(t)\int_0^t \frac{y_2(z)F(z)}{a_2(z)W(z)}\,dz + y_2(t)\int_0^t \frac{y_1(z)F(z)}{a_2(z)W(z)}\,dz \quad (2.6-26)

where y_1(t) and y_2(t) are two solutions of the homogeneous equation, and
where W(t) is the Wronskian

W(t) = y_1(t)\dot{y}_2(t) - \dot{y}_1(t)y_2(t)

The previous comments about the upper and lower limits of integration
still apply. Note that y(0) = 0, and

\dot{y}(t) = \dot{y}_1(t)[u_1(t) - u_1(0)] + \dot{y}_2(t)[u_2(t) - u_2(0)] + [y_1(t)\dot{u}_1(t) + y_2(t)\dot{u}_2(t)]

The last bracket is always zero by Eq. 2.6-14, so \dot{y}(0) = 0 as required.


In the general case of an nth order differential equation with homogeneous
initial conditions, Eqs. 2.6-17 and 2.6-18 yield

y(t) = y_1(t)[u_1(t) - u_1(0)] + \cdots + y_n(t)[u_n(t) - u_n(0)]
     = \sum_{i=1}^{n} y_i(t)\int_0^t \frac{W_{ni}(z)F(z)}{a_n(z)W(z)}\,dz \quad (2.6-27)

where W(t) and W_{ni}(t) are defined in Eq. 2.6-19. Since they are based on
the variation of parameters method, Eqs. 2.6-24 through 2.6-27 are valid
for varying as well as for fixed systems. There is an important application
in Section 2.8, where these equations are used to find the impulse response
of a time-varying system.
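The following fragment (an illustrative sketch, not drawn from the text) evaluates Eq. 2.6-26 by numerical quadrature for the simple choice y'' + y = 1, using y_1 = cos t, y_2 = sin t, a_2 = 1, and W = 1. The formula should reproduce y(t) = 1 − cos t, which satisfies the homogeneous initial conditions.

    # Check of Eq. 2.6-26 for y'' + y = 1 with zero initial conditions.
    import numpy as np
    from scipy.integrate import quad

    def y_from_eq_2_6_26(t, F=lambda z: 1.0):
        y1, y2 = np.cos, np.sin
        W = 1.0                                   # Wronskian of cos t and sin t
        I1, _ = quad(lambda z: y2(z) * F(z) / W, 0.0, t)
        I2, _ = quad(lambda z: y1(z) * F(z) / W, 0.0, t)
        return -y1(t) * I1 + y2(t) * I2

    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, y_from_eq_2_6_26(t), 1.0 - np.cos(t))   # the two columns agree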

The Physical Significance of the Complementary and Particular Solutions

The form of the complementary solution depends only upon the system
and not upon the input. The characteristic equation depends only upon
the parameters of the system, and the roots of the characteristic equation
determine the kind of terms appearing in the complementary solution.
In the event that there is no external source (e.g., the system may be excited
only by some initial stored energy within it), the complementary solution
becomes the complete solution. Thus the complementary solution
represents the natural behavior of the system, when it is left unexcited.
For this reason, the complementary solution is also called the free or
unforced response.
If the free response of a system increases without limit as t approaches
infinity, the system is said to be unstable. This is the case if the character¬
istic equation has a root with a positive real part, since the complementary
solution then contains a term which increases exponentially with t. Roots
with negative real parts, on the other hand, lead to terms that become
zero as t approaches infinity. Purely imaginary roots, if they are simple,
lead to sinusoidal terms of constant amplitude in the complementary
solution. This case of constant amplitude oscillation in the complemen¬
tary solution, which is characteristic, for example, of LC circuits, is usually
considered a stable and not an unstable response. Repeated imaginary
roots lead to terms of the form t^n \cos(\omega t + \phi), which is an unstable
response. If the roots of the characteristic equation are plotted in a com¬
plex plane, the following statement can be made. For a stable system,
none of the roots can lie in the right half-plane, and any roots on the
imaginary axis must be simple. If all the roots of the characteristic
equation lie in the left half-plane, the complementary solution approaches
zero as t approaches infinity and is identical with the transient response
of the system.
The magnitudes of the terms in the complementary solution, i.e., the
arbitrary constants, depend upon two things, one of which is the input.
The other is the past history of the system before the input was applied.
This history can be completely summarized by a knowledge of the energy
stored within the system at the time the input is applied.
The form of the particular solution is dictated by the forcing function,
as can be clearly seen from the method of undetermined coefficients.
The only time the system has any influence upon this form is when a term
in the forcing function duplicates a term in y_H. In this event, the system
is being excited in one of its natural modes, a phenomenon called resonance.

Since the form of the particular solution depends upon the input, it is
also called the forced response. If all the roots of the characteristic
equation lie in the left half-plane, the forced solution is identical with the
steady-state solution. The magnitudes of the terms in the forced solution
depend upon both the input and the system parameters.
It is customary to think of the forced component of the solution as
being immediately established by the application of the input. The free
component, which is the complementary solution, adjusts itself through
the proper evaluation of the arbitrary constants to provide for the proper
transition from the unexcited system to a system under the dominance
of the input. Some like to think of the system as initially resisting
the wishes of the input, by means of the complementary solution. The
magnitudes of the arbitrary constants depend upon how greatly the
character of the input differs from the natural behavior of the system.

2.7 SOLUTION OF DIFFERENTIAL EQUATIONS


WITH TIME-VARYING COEFFICIENTS

The differential equation describing a time-varying system is given by
Eqs. 2.2-1, 2.2-2, or 2.2-7, where the a_i and b_i coefficients are now functions of time.

a_n(t)\frac{d^n y}{dt^n} + \cdots + a_1(t)\frac{dy}{dt} + a_0(t)y
   = b_m(t)\frac{d^m v}{dt^m} + \cdots + b_1(t)\frac{dv}{dt} + b_0(t)v = F(t) \quad (2.7-1)

A(p, t)y(t) = B(p, t)v(t) = F(t)

For first order equations, the method of Section 2.5 always yields a
solution. For higher order equations, however, an explicit solution
cannot be found, in general.

The Complementary Solution

Consider Eq. 2.7-1 with F(t) replaced by zero, and examine the method
of Section 2.6 for finding y_H. Solutions of the form y = \epsilon^{rt}, where r is a
constant, were assumed, resulting in the equation

a_n r^n + \cdots + a_1 r + a_0 = 0

But, since the a's are now functions of time, the roots of this equation
are functions of time, violating the assumption just made. In fact, there
is no general method of finding y_H in terms of elementary functions.

It is sometimes helpful to know that, if all but one of the n independent
solutions to an nth order homogeneous equation are known, the remaining
one can be found by a modification of the variation of parameters
method.2
One of the special cases which can be solved is Euler's equation

[a_n(b + ct)^n p^n + a_{n-1}(b + ct)^{n-1}p^{n-1} + \cdots + a_1(b + ct)p + a_0]y(t) = 0 \quad (2.7-2)

where the a_i's, b, and c are constants. By using the substitution of variable

(b + ct) = \epsilon^z, \quad \text{i.e.,} \quad z = \ln(b + ct)

the equation can be reduced to one with constant coefficients.


Example 2.7-1. Find the general solution to

t^2\frac{d^2 y}{dt^2} + t\frac{dy}{dt} + y = 0

Let \epsilon^z = t and \epsilon^z\,dz = dt. Then

\frac{dy}{dt} = \epsilon^{-z}\frac{dy}{dz}, \qquad \frac{d^2 y}{dt^2} = \epsilon^{-2z}\left(\frac{d^2 y}{dz^2} - \frac{dy}{dz}\right)

The new equation becomes

\frac{d^2 y}{dz^2} + y = 0

the solution of which is

y = K_1\cos z + K_2\sin z = K_1\cos(\ln t) + K_2\sin(\ln t)
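A brief symbolic check of this Euler-equation solution (an illustration only, using SymPy) is sketched below; it confirms that the general solution satisfies the original time-varying equation.

    # Symbolic check of Example 2.7-1.
    import sympy as sp

    t, K1, K2 = sp.symbols('t K1 K2', positive=True)
    y = K1 * sp.cos(sp.log(t)) + K2 * sp.sin(sp.log(t))

    residual = t**2 * sp.diff(y, t, 2) + t * sp.diff(y, t) + y
    print(sp.simplify(residual))     # 0, so t^2 y'' + t y' + y = 0 is satisfied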

Occasionally, an nth order homogeneous differential equation can be
reduced to differential equations of lower order. An equation in which
y and its first (k − 1) derivatives are absent can be reduced to an equation
of order n − k by the substitution of variable z = d^k y/dt^k. Another case
is that in which the operator can be expressed in factored form, as in the
following example.

Example 2.7-2. The differential equation

[tp^2 + (t^2 + 1)p + 2t]y = 0

may be written as

(tp + 1)(p + t)y = 0

This is equivalent to the two first order equations

(tp + 1)v = 0, \qquad (p + t)y = v

The first of these two equations may be solved for v as a function of t, and the second
then solved for y, using the method of Section 2.5.

Considerable work has been done on differential equations with co¬


efficients that are periodic in time. A second order equation of this nature

can be transformed into Mathieu’s or Hill’s equation.3 Higher order


equations may be handled by the Floquet Theory.4
When the coefficients are polynomials in t, an infinite series solution
may be obtained. Bessel’s and Legendre’s equations are the best-known
examples. When the coefficients are rational fractions (i.e., the quotient
of two polynomials), an infinite series solution can still be found. The
infinite series has, in such a case, only a limited region of convergence;
hence, if a solution is required for all time, a number of different series
must be used.
Finally, a number of methods for approximating y_H are available.5†
These methods usually assume that the a_i coefficients either vary slowly
with t or have a small variation compared to their mean value.

The Particular Solution

Although the method of undetermined coefficients is not suitable for
time-varying differential equations, the variation of parameters method is.
All steps in the derivations of Eqs. 2.6-10, 2.6-16, and 2.6-18 are valid
whether or not the a_i's are functions of time. Thus y_P can always be
found once y_H is known. Unfortunately, as has just been pointed out,
y_H cannot, in general, be found.
Example 2.7-3. Find the complete solution to

t^2\frac{d^2 y}{dt^2} + t\frac{dy}{dt} - y = \frac{1}{t}

The homogeneous equation

t^2\frac{d^2 y}{dt^2} + t\frac{dy}{dt} - y = 0

has the form of Euler's equation. Using \epsilon^z = t,

y_H = K_1\epsilon^z + K_2\epsilon^{-z} = K_1 t + \frac{K_2}{t}

Using Eq. 2.6-16, with y_1 = t and y_2 = 1/t,

y_1\dot{y}_2 - \dot{y}_1 y_2 = -\frac{2}{t}

\dot{u}_1 = \frac{(-1/t)(1/t)}{t^2(-2/t)} = \frac{1}{2t^3} \quad \text{and} \quad u_1 = -\frac{1}{4t^2}

\dot{u}_2 = \frac{(t)(1/t)}{t^2(-2/t)} = -\frac{1}{2t} \quad \text{and} \quad u_2 = -\tfrac{1}{2}\ln t

y_P = u_1 y_1 + u_2 y_2 = -\frac{1}{4t} - \frac{\ln t}{2t}

y = K_1 t + \frac{1}{t}\left(K_2 - \tfrac{1}{4} - \tfrac{1}{2}\ln t\right)

† See Section 5.9.

The identification of the complementary and particular solutions with the free and
forced response is still valid for time-varying differential equations.

2.8 OBTAINING THE IMPULSE RESPONSE


FROM THE DIFFERENTIAL EQUATION

Chapter 1 presents methods of finding the response of a system to an


arbitrary input, once the impulse response is known. Although the impulse
response is sometimes given directly, it is important to be able to
find it from the system’s differential equation. Consider, therefore, the
equation
a_n\frac{d^n y}{dt^n} + a_{n-1}\frac{d^{n-1}y}{dt^{n-1}} + \cdots + a_1\frac{dy}{dt} + a_0 y = U_0(t) \quad (2.8-1)

Since the system is at rest for t < 0, y = 0 for t < 0. Because the forcing
function U_0(t) is zero except at t = 0, the solution for all t > 0 consists
only of the complementary solution

y = K_1 y_1 + K_2 y_2 + \cdots + K_n y_n \quad (2.8-2)

To evaluate the n arbitrary constants, n initial conditions are needed.
For a fixed system, where the a_i's are constants, these can be found directly
from the coefficients. The nth derivative of the solution, but none of the
lower order derivatives, contains an impulse at t = 0. This is the only
way Eq. 2.8-1 can be satisfied at t = 0. If one of the lower order derivatives
were to contain an impulse, then d^n y/dt^n would contain a singularity of
higher order. Since a_n(d^n y/dt^n) does contain a unit impulse at t = 0,
d^{n-1}y/dt^{n-1} must jump from 0 to 1/a_n at t = 0, and all lower order
derivatives must be continuous at the origin. The n initial conditions must
be

y(0+) = \dot{y}(0+) = \cdots = y^{(n-2)}(0+) = 0, \qquad y^{(n-1)}(0+) = \frac{1}{a_n} \quad (2.8-3)

A fixed linear system is characterized by Eq. 2.2-1 or 2.2-6 as

a_n\frac{d^n y}{dt^n} + \cdots + a_1\frac{dy}{dt} + a_0 y = b_m\frac{d^m v}{dt^m} + \cdots + b_1\frac{dv}{dt} + b_0 v

or

A(p)y(t) = B(p)v(t) = F(t) \quad (2.8-4)

Unless b_m = b_{m-1} = \cdots = b_1 = 0, the procedure above must be modified.
When v(t) = U_0(t), the right side of Eq. 2.8-4 contains singularity
functions of several orders. One convenient approach is to assume that,
for small non-negative values of time, y(t) can be represented by a Taylor
series, as in the following example.

Example 2.8-1. Find the impulse response for

\frac{d^2 y}{dt^2} + 3\frac{dy}{dt} + 2y = \frac{1}{2}\frac{dv}{dt} + v

Let v = U_0(t).

y = [c_0 + c_1 t + c_2 t^2 + \cdots]U_{-1}(t)

where the c_i's are constants to be determined.

\frac{dy}{dt} = c_0 U_0(t) + [c_1 + 2c_2 t + \cdots]U_{-1}(t)

\frac{d^2 y}{dt^2} = c_0 U_1(t) + c_1 U_0(t) + [2c_2 + \cdots]U_{-1}(t)

Substituting into the differential equation and collecting terms,

U_1(t)[c_0] + U_0(t)[c_1 + 3c_0] + U_{-1}(t)[2c_2 + 3c_1 + 2c_0] + \cdots = \tfrac{1}{2}U_1(t) + U_0(t)

Equating corresponding coefficients,

c_0 = \tfrac{1}{2}
c_1 + 3c_0 = 1, \quad \text{so} \quad c_1 = -\tfrac{1}{2}

Immediately after the impulse,

y(0+) = c_0 = \frac{1}{2}, \qquad \frac{dy}{dt}(0+) = c_1 = -\frac{1}{2}

From the original differential equation, the complementary solution for t > 0 is

y = K_1\epsilon^{-t} + K_2\epsilon^{-2t}

Evaluating K_1 and K_2 by the two initial conditions above, K_1 = \tfrac{1}{2} and K_2 = 0.
The impulse response is

h(t) = \tfrac{1}{2}\epsilon^{-t} \quad \text{for } t > 0

The reader familiar with the Laplace transform may wish to compare this example with
Example 3.5-2.
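A quick cross-check of this result can be made through the transfer-function route, which is an assumption of the sketch below rather than the method used in the example: H(s) = (0.5s + 1)/(s² + 3s + 2) reduces to 1/[2(s + 1)], so h(t) = ½ε^{-t}.

    # Numerical check of Example 2.8-1 via the system's transfer function.
    import numpy as np
    from scipy import signal

    sys = signal.TransferFunction([0.5, 1.0], [1.0, 3.0, 2.0])
    t, h = signal.impulse(sys, T=np.linspace(0.0, 5.0, 100))
    print(np.max(np.abs(h - 0.5 * np.exp(-t))))    # essentially zero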

The Impulse Response for Time-Varying Systems

The previous approach cannot be readily extended to time-varying
systems, where the operators A and B are functions of time.

a_n(t)\frac{d^n y}{dt^n} + \cdots + a_0(t)y = b_m(t)\frac{d^m v}{dt^m} + \cdots + b_0(t)v \quad (2.8-5)

A(p, t)y(t) = B(p, t)v(t) = F(t)

One method which is valid for varying as well as for fixed systems is
closely related to Eqs. 2.6-24 through 2.6-27. For nonanticipatory
systems having zero input for t < 0, the response to any arbitrary input
is given by Eq. 1.7-2.

y(t) = \int_0^t v(\lambda)h(t, \lambda)\,d\lambda \quad (2.8-6)

Note the similarity of this equation and Eqs. 2.6-25 through 2.6-27. Since
y_i(t) is not a function of z, and F(z) and W(z) do not depend upon i, Eq.
2.6-27 may be rewritten as

y(t) = \int_0^t \left[\frac{1}{a_n(z)W(z)}\sum_{i=1}^{n} y_i(t)W_{ni}(z)\right]F(z)\,dz \quad (2.8-7)

where the y_i's are n independent solutions to the homogeneous differential
equation. The quantity in brackets is known as the one-sided Green's
function and is denoted by6

g(t, z) = \frac{1}{a_n(z)W(z)}\sum_{i=1}^{n} y_i(t)W_{ni}(z) \quad (2.8-8)

Then

y(t) = \int_0^t F(z)g(t, z)\,dz \quad (2.8-9)

First, compare Eqs. 2.8-6 and 2.8-9 for the special case F(t) = v(t),
which occurs when the operator B(p, t) = 1. Then

h(t, \lambda) = g(t, \lambda) \quad \text{for } 0 \le \lambda \le t

Of course,

h(t, \lambda) = 0 \quad \text{for } t < \lambda

W_{ni}(\lambda)/a_n(\lambda)W(\lambda) then represents the proper values of the arbitrary
constants in the impulse response. Green's function is frequently used in
mathematical literature, and it must satisfy a number of useful properties.6
The factors in Eq. 2.8-8 may be written explicitly as determinants by
Eq. 2.6-19.

W(z) = \begin{vmatrix} y_1(z) & y_2(z) & \cdots & y_n(z) \\ \dot{y}_1(z) & \dot{y}_2(z) & \cdots & \dot{y}_n(z) \\ \vdots & & & \vdots \\ y_1^{(n-1)}(z) & y_2^{(n-1)}(z) & \cdots & y_n^{(n-1)}(z) \end{vmatrix} \quad (2.8-10)

\sum_{i=1}^{n} y_i(t)W_{ni}(z) = \begin{vmatrix} y_1(z) & y_2(z) & \cdots & y_n(z) \\ \dot{y}_1(z) & \dot{y}_2(z) & \cdots & \dot{y}_n(z) \\ \vdots & & & \vdots \\ y_1^{(n-2)}(z) & y_2^{(n-2)}(z) & \cdots & y_n^{(n-2)}(z) \\ y_1(t) & y_2(t) & \cdots & y_n(t) \end{vmatrix}
   = (-1)^{n-1}\begin{vmatrix} y_1(t) & y_2(t) & \cdots & y_n(t) \\ y_1(z) & y_2(z) & \cdots & y_n(z) \\ \dot{y}_1(z) & \dot{y}_2(z) & \cdots & \dot{y}_n(z) \\ \vdots & & & \vdots \\ y_1^{(n-2)}(z) & y_2^{(n-2)}(z) & \cdots & y_n^{(n-2)}(z) \end{vmatrix} \quad (2.8-11)

In general, F(t) ≠ v(t), and Green's function and the impulse response
are not identical. Let the only restriction on Eq. 2.8-5 now be m ≤ n.

F(t) = B(p, t)v(t) = [b_m(t)p^m + \cdots + b_1(t)p + b_0(t)]v(t) \quad (2.8-12)

where p = d/dt. Equation 2.8-9 can be used to find the impulse response,
once Green's function is known. If v(t) = U_0(t - \lambda), then y(t) = h(t, \lambda).

h(t, \lambda) = \int_0^t [B(p, z)U_0(z - \lambda)]\,g(t, z)\,dz \quad (2.8-13)

where now p = d/dz. Equation 2.8-13 is used only to find the impulse
response for 0 < \lambda < t, and the integrand is zero except at z = \lambda. The
limits on the integral may therefore be changed to -\infty and +\infty.
This equation is not nearly so formidable as it looks. A typical term in
B(p, z)U_0(z - \lambda) is

b_k(z)\frac{d^k}{dz^k}U_0(z - \lambda) = b_k(z)\frac{d^k}{dz^k}U_0(\lambda - z)

The integral involving this typical term can be evaluated by Eq. 1.6-14 as
follows.

\int_{-\infty}^{\infty} g(t, z)b_k(z)\frac{d^k}{dz^k}[U_0(\lambda - z)]\,dz = (-1)^k\frac{d^k}{d\lambda^k}[g(t, \lambda)b_k(\lambda)] \quad (2.8-14)

Equation 2.8-13 can therefore be rewritten as

h(t, \lambda) = \sum_{k=0}^{m}(-1)^k\frac{d^k}{d\lambda^k}[g(t, \lambda)b_k(\lambda)] \quad (2.8-15)

The same result can be obtained in a more elegant way by the concept of
the adjoint operator.6
In summary, Eq. 2.8-8 gives h(t, \lambda) = g(t, \lambda) for 0 < \lambda < t when
F(t) = v(t). In general, h(t, \lambda) is found from Eq. 2.8-13 or 2.8-15. These
equations constitute a general method of finding the impulse response from
a linear differential equation. The principal limitation is that there is no
perfectly general method of finding the solutions y_1, \ldots, y_n to a varying,
homogeneous differential equation.
Example 2.8-2. Find the impulse response of a system described by the following
differential equation.7

t^2\frac{d^2 y}{dt^2} + 4t\frac{dy}{dt} + 2y = v

Since this is a form of Euler's equation, the complementary solution can be found by
the substitution of variable t = \epsilon^z, as discussed in Section 2.7.

Thus y_1(t) = 1/t and y_2(t) = 1/t^2. To determine Green's function, note that

W(z) = \begin{vmatrix} \dfrac{1}{z} & \dfrac{1}{z^2} \\[4pt] -\dfrac{1}{z^2} & -\dfrac{2}{z^3} \end{vmatrix} = -\frac{1}{z^4}

\sum_{i=1}^{2} y_i(t)W_{2i}(z) = (-1)\begin{vmatrix} \dfrac{1}{t} & \dfrac{1}{t^2} \\[4pt] \dfrac{1}{z} & \dfrac{1}{z^2} \end{vmatrix} = \frac{z - t}{t^2 z^2}

g(t, z) = \frac{1}{a_2(z)W(z)}\sum_{i=1}^{2} y_i(t)W_{2i}(z) = (-z^2)\frac{z - t}{t^2 z^2} = \frac{t - z}{t^2}

Since the operator B(p, t) = 1 in this example,

h(t, \lambda) = \frac{t - \lambda}{t^2} \quad \text{for } t \ge \lambda
This result is used in Examples 1.7-1 and 3.9-4.
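The following symbolic sketch (an illustration only) superposes this impulse response over a trial input v(t) = t by Eq. 2.8-6 and confirms that the resulting y(t) satisfies the time-varying differential equation; the test input is an arbitrary choice.

    # Check of Example 2.8-2 with SymPy: y(t) = integral of v(lam) h(t, lam) d(lam).
    import sympy as sp

    t, lam = sp.symbols('t lam', positive=True)
    h = (t - lam) / t**2
    v = lam                                  # the trial input v evaluated at time lam

    y = sp.integrate(v * h, (lam, 0, t))     # Eq. 2.8-6 with zero initial energy
    residual = t**2*sp.diff(y, t, 2) + 4*t*sp.diff(y, t) + 2*y - t
    print(sp.simplify(y), sp.simplify(residual))   # t/6 and 0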

2.9 DIFFERENCE AND ANTIDIFFERENCE OPERATORS

Difference equations arise when functions of a discrete variable are
considered. Let y(k) denote a function† that is defined only for integral
values of k. As discussed in Chapter 1, any function that is defined only
at equal intervals can be expressed in this way by a suitable change of scale.
The shifting operator E is defined by

E[y(k)] = y(k + 1) \quad (2.9-1)

The repeated application of this operator gives

E^2[y(k)] = E[Ey(k)] = y(k + 2)

or, in general,

E^n[y(k)] = y(k + n) \quad (2.9-2)

for any positive integer n. The difference operator‡ \Delta is defined by

\Delta y(k) = y(k + 1) - y(k) \quad (2.9-3)

Since the last equation may be rewritten as

\Delta y(k) = (E - 1)y(k)

† In many books, the function is written y_k.
‡ This is the forward difference operator, as distinguished from the backward difference
operator \nabla, which is occasionally used, and which is defined by \nabla y(k) = y(k) - y(k - 1).

the operators \Delta and E are related by

\Delta = E - 1 \quad (2.9-4)

\Delta y(k) is called the first difference of the function y(k). Higher order
differences are defined as

\Delta^2 y(k) = \Delta[\Delta y(k)] = y(k + 2) - 2y(k + 1) + y(k)
\Delta^3 y(k) = \Delta[\Delta^2 y(k)] = y(k + 3) - 3y(k + 2) + 3y(k + 1) - y(k)

or, in general,

\Delta^n y(k) = \sum_{r=0}^{n}(-1)^r\binom{n}{r}y(k + n - r) \quad (2.9-5)

where \binom{n}{r} represents the binomial coefficients. The last equation is
consistent with Eqs. 2.9-2 and 2.9-4, for

\Delta^n y(k) = (E - 1)^n y(k) = \sum_{r=0}^{n}(-1)^r\binom{n}{r}E^{n-r}y(k)

The operators E and \Delta obey the usual algebraic laws, such as

\Delta[c\,y(k)] = c\,\Delta y(k)
\Delta^m[y(k) + z(k)] = \Delta^m y(k) + \Delta^m z(k) \quad (2.9-6)
\Delta^m\Delta^n y(k) = \Delta^n\Delta^m y(k) = \Delta^{m+n}y(k)

where c is a constant, and where m and n are positive integers. Equations
2.9-6 remain valid if every \Delta is replaced by E. The operators commute
with each other, but not, in general, with functions.

\Delta[y(k)z(k)] \ne y(k)\,\Delta z(k)

The operator \Delta for discrete functions is somewhat analogous to the
differentiation operator p = d/dt for continuous functions. To make this
clear, consider the continuous function f(t), whose derivative is

\frac{df(t)}{dt} = \lim_{T \to 0}\frac{f(t + T) - f(t)}{T} \quad (2.9-7)

If the function is considered only at the discrete instants t = nT (n = 0,
1, 2, \ldots), the shifting and differencing operators are defined by

Ef(t) = f(t + T)
\Delta f(t) = f(t + T) - f(t) \quad (2.9-8)

Then

\frac{df(t)}{dt} = \lim_{T \to 0}\frac{\Delta f(t)}{T} \quad (2.9-9)

and, in general,

\frac{d^m f(t)}{dt^m} = \lim_{T \to 0}\frac{\Delta^m f(t)}{T^m} \quad (2.9-10)

It is not surprising, therefore, that there are differencing formulas similar
to but not identical with the common differentiation formulas. For
example,

\Delta[y(k)z(k)] = y(k + 1)\,\Delta z(k) + z(k)\,\Delta y(k)

\Delta\!\left[\frac{y(k)}{z(k)}\right] = \frac{z(k)\,\Delta y(k) - y(k)\,\Delta z(k)}{z(k)z(k + 1)} \quad (2.9-11)

The differentiation of polynomials is analogous to the differencing of
“factorial polynomials.”† The factorial polynomial of order m is defined
as

(k)^{(m)} = k(k - 1)(k - 2)\cdots(k - m + 1) \quad (2.9-12)

where m is any positive integer. Note that (k)^{(m)} contains exactly m
factors. Using the definition of the difference operator, it is found that

\Delta(k)^{(m)} = m(k)^{(m-1)} = mk(k - 1)(k - 2)\cdots(k - m + 2) \quad (2.9-13)

Because of the analogy between \Delta and p, it is logical to ask if there is an
operator for discrete functions that is analogous to the integration
operator p^{-1}, where

p^{-1}[f(t)] = \int f(t)\,dt + c = \int_{t_0}^{t} f(\lambda)\,d\lambda + K \quad (2.9-14)

where c and K are constants of integration. The lower limit t_0 is arbitrary
and forms part of the constant of integration. Specifically,

c = K - \left[\int f(t)\,dt\right]_{t=t_0}

The quantity

y(t) = p^{-1}[f(t)] \quad (2.9-15)

is the general solution to the equation

p\,y(t) = f(t) \quad (2.9-16)

Equivalently,

p\,p^{-1}f(t) = f(t) \quad (2.9-17)

† Any ordinary polynomial can be expressed as the sum of factorial polynomials, if
desired. See, for example, Reference 8.

By analogy, an antidifference operator \Delta^{-1} should exist such that

y(k) = \Delta^{-1}f(k) \quad (2.9-18)

is the general solution to

\Delta y(k) = f(k) \quad (2.9-19)

or, equivalently, such that

\Delta\,\Delta^{-1}f(k) = f(k) \quad (2.9-20)

Since

\Delta\left[\sum_{n=0}^{k-1}f(n) + K\right] = [f(k) + f(k - 1) + \cdots + f(0) + K] - [f(k - 1) + \cdots + f(0) + K] = f(k)

the antidifference operator satisfying Eqs. 2.9-19 and 2.9-20 must be

\Delta^{-1}f(k) = \sum_{n=0}^{k-1}f(n) + K \quad (2.9-21)

The antidifference operator may be rewritten as

\Delta^{-1}f(k) = \sum^{n=k-1}f(n) + c = \sum^{n=k}f(n - 1) + c \quad (2.9-22)

where the summation is still with respect to the dummy variable n. The
lower limit in Eq. 2.9-22 is unspecified because any fixed number of the
terms f(0), f(1), f(2), \ldots in Eq. 2.9-21 can be combined with the constant
of summation, K, to form the new constant c. The lower limit may be
selected in whatever manner appears to be the most convenient. The
arbitrary lower limit in Eq. 2.9-22 is the analog of the arbitrary value of
t_0 in Eq. 2.9-14.
The reader may recall that the exact evaluation of integrals may be
tricky or difficult, and in some cases downright impossible. Since there is
no conceptual difficulty in summing a finite number of terms, he may be
tempted to conclude that a similar problem does not exist for the operator
A-1. A brute force calculation of A~lf(k) from Eq. 2.9-22 over a wide
range of values of k is often, however, unnecessarily tedious. As k
increases without limit, so does the number of terms in the summation.
Whenever possible, A~xf(k) should be expressed in closed form.
The summation of a finite series is one of the topics usually included in
books on finite differences. Among the available techniques are short
tables of summation formulas, summation by parts (analogous to but
not identical with integration by parts), the use of Bernoulli polynomials,
and the use of partial fractions. For example, the telescopic series

\sum_{n=2}^{k}\frac{1}{n(n - 1)}

can be easily summed by noting that

\frac{1}{n(n - 1)} = \frac{1}{n - 1} - \frac{1}{n}

The result is

\sum_{n=2}^{k}\frac{1}{n(n - 1)} = 1 - \frac{1}{k} \quad (2.9-23)

The summation of some of the very simple functions (e.g., \sum^{n=k} 1/n) cannot be
expressed in closed form in terms of elementary functions. In some of
these cases, an approximate summation can be obtained.
It is useful to note that, when factorial polynomials are used, Eq. 2.9-13
gives

\Delta^{-1}(k)^{(m)} = \frac{1}{m + 1}(k)^{(m+1)} + K

or

\Delta^{-1}[k(k - 1)\cdots(k - m + 1)] = \sum^{n=k-1} n(n - 1)\cdots(n - m + 1) = \frac{1}{m + 1}k(k - 1)\cdots(k - m) + K \quad (2.9-24)
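A short numerical illustration of the forward difference and these two summation results (not part of the original text; the function names are chosen only for this sketch) follows.

    # Illustration of Eqs. 2.9-13 and 2.9-23.
    from math import prod

    def delta(f):
        """Forward difference operator: (delta f)(k) = f(k+1) - f(k)."""
        return lambda k: f(k + 1) - f(k)

    def factorial_poly(k, m):
        """(k)^(m) = k(k-1)...(k-m+1)."""
        return prod(k - r for r in range(m))

    m = 3
    for k in range(2, 7):
        # Eq. 2.9-13: delta (k)^(m) equals m (k)^(m-1)
        print(delta(lambda j: factorial_poly(j, m))(k), m * factorial_poly(k, m - 1))

    # Eq. 2.9-23: sum_{n=2}^{k} 1/(n(n-1)) = 1 - 1/k
    k = 10
    print(sum(1.0 / (n * (n - 1)) for n in range(2, k + 1)), 1.0 - 1.0 / k)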
In the following sections, the reader will notice a striking resemblance
between the solution of differential and difference equations.! This has
already been suggested by the analogy of the operators p and A, and p~x
and A-1. Although there are some dissimilarities, the major concepts in
the solution of differential equations have their parallel in the solution of
difference equations. For this reason, the treatment of difference equations
is somewhat shorter and more concise than the treatment of differential
equations.

2.10 REPRESENTING DISCRETE SYSTEMS


BY DIFFERENCE EQUATIONS

The general form of a difference equation relating the output y(k) to the
input v(k) of a discrete system is

a_n y(k + n) + a_{n-1}y(k + n - 1) + \cdots + a_1 y(k + 1) + a_0 y(k)
   = b_m v(k + m) + \cdots + b_1 v(k + 1) + b_0 v(k) \quad (2.10-1)

t In fact, difference equations may be used to obtain approximate solutions of ordinary


and partial differential equations. From another point of view, they form a bridge
between ordinary and partial differential equations.

or, equivalently,

[c_n\Delta^n + c_{n-1}\Delta^{n-1} + \cdots + c_1\Delta + c_0]y(k) = [d_m\Delta^m + \cdots + d_1\Delta + d_0]v(k) \quad (2.10-2)

Either of the two equations may be easily obtained from the remaining
one by Eqs. 2.9-1 through 2.9-5. The second is a closer analog to the
differential equation 2.2-1, but the first is easier to use and is the one more
commonly given. For a linear system, the a_i's, b_i's, c_i's, and d_i's cannot
be functions of y or v, but they may be functions of k. For fixed linear
systems, these coefficients must be constants.
Using the shifting operator E, Eq. 2.10-1 may be rewritten as

(a_n E^n + a_{n-1}E^{n-1} + \cdots + a_1 E + a_0)y(k) = (b_m E^m + \cdots + b_1 E + b_0)v(k) = F(k) \quad (2.10-3)

Since the input v(k) is presumably known, the right side of Eq. 2.10-1
is represented by the known forcing function F(k). For fixed linear
systems, where the coefficients are constants, the last equation is written
symbolically as

A(E)y(k) = B(E)v(k) = F(k) \quad (2.10-4)

Difference equations arise when functions of a discrete variable are considered.
Systems which can be described by difference equations include
computers, sequential circuits, and systems with time-delay components.
Any system whose input v(t) and output y(t) are defined only at the
equally spaced intervals t = kT is described by the difference equation

a_n y(kT + nT) + a_{n-1}y(kT + nT - T) + \cdots + a_1 y(kT + T) + a_0 y(kT)
   = b_m v(kT + mT) + \cdots + b_1 v(kT + T) + b_0 v(kT)

where k, m, and n are integers. By this equation, the value of the output at
the instant t = (k + n)T is expressed in terms of the past outputs from
t = kT to (k + n - 1)T, and in terms of the inputs from t = kT to
(k + m)T. For nonanticipatory systems, m ≤ n. If the time scale is
adjusted so that T = 1, this equation is identical with Eq. 2.10-1.
Difference equations also arise in ways other than those mentioned
above. The right side of Eqs. 2.10-3 and 2.10-4 need not always be a
direct function of the input, and the discrete variable k may be an index
of position instead of an index of time. The following two examples†
show how difference equations result whenever there is a repetition at
equal intervals of position or at equal intervals of time.

† These examples are somewhat similar to those of Reference 9.



Example 2.10-1. An example of the repetition of structure at equal intervals of
position is the cascade connection of m identical T-sections shown in Fig. 2.10-1. The
currents are a function of the position index k, and i(k) is desired for k = 0, 1, 2, \ldots, m.

Fig. 2.10-1

The equation for the kth mesh is

-R_2 i(k - 1) + 2(R_1 + R_2)i(k) - R_2 i(k + 1) = 0

This is a second order, homogeneous difference equation which is solved for i as a
function of k in Example 2.12-1. If the resistance R_2 is replaced by a capacitance C,
the equation for the kth mesh becomes a differential-difference equation. The solution
for i as a function of the continuous variable t and the discrete variable k is carried out
in Example 3.7-3.

Example 2.10-2. Systems subjected to periodic inputs or periodic switching are
examples of a repetition at equal intervals of time. Figure 2.10-2 shows a circuit
subjected to a square wave of voltage. Assuming that there is no initial stored energy,
find an expression for the current.

Fig. 2.10-2

While the solution could be found by calculating the response for each cycle in
succession, this would be a very tedious process. The steady-state solution could be
obtained by use of the Fourier series, but the answer would be an infinite series instead
of being in closed form. Examine, therefore, the kth cycle, and for convenience move
the time origin to the start of the kth cycle. The differential equations describing the
circuit are

\frac{di_1}{dt} + i_1 = 1 \quad \text{for } 0 < t < 1

\frac{di_2}{dt} + i_2 = 0 \quad \text{for } 1 < t < 2

where the subscripts 1 and 2 denote the first and second halves of the cycle, respectively.
Since the current is a function of the cycle under consideration, it is again a function of

the discrete variable k as well as the continuous variable t. The solution is carried out
in Example 2.12-3. A first order difference equation must be solved, along with the
above differential equations.

2.11 GENERAL PROPERTIES OF


LINEAR DIFFERENCE EQUATIONS

An nth order linear difference equation can be written in the form of
Eq. 2.10-3, repeated below.

(a_n E^n + a_{n-1}E^{n-1} + \cdots + a_1 E + a_0)y(k) = F(k) \quad (2.11-1)

where a_n \ne 0, a_0 \ne 0, and where all the a_i's are defined for all integral
values of k of interest. The remarks of this section are valid for both fixed
and varying equations; hence the a_i's can in general be functions of k.
If a_0 = 0, a_1 \ne 0, and a_n \ne 0, the equation is of order n − 1. In contrast
to differential equations, the order of a difference equation is defined as
the difference between the lowest and highest power of E. If the operator
\Delta is used, as in

(c_n\Delta^n + c_{n-1}\Delta^{n-1} + \cdots + c_1\Delta + c_0)y(k) = F(k) \quad (2.11-2)

the order of the equation cannot be determined by inspection. For
example, the equation (\Delta^2 + 3\Delta + 2)y(k) = 0 is equivalent to

(E^2 + E)y(k) = 0

which is a first and not a second order equation.
Equation 2.11-1 is called a nonhomogeneous difference equation, while

(a_n E^n + a_{n-1}E^{n-1} + \cdots + a_1 E + a_0)y(k) = 0 \quad (2.11-3)

is an nth order homogeneous equation. Difference equations are also
called recurrence formulas. Equation 2.11-1 can be rewritten as

y(k + n) = -\frac{1}{a_n}[a_{n-1}y(k + n - 1) + \cdots + a_1 y(k + 1) + a_0 y(k) - F(k)] \quad (2.11-4)

If y(0) through y(n − 1) are known, y(k) may be explicitly found for all
k ≥ n by the repeated application of Eq. 2.11-4. Thus, in contrast to a
differential equation, y(k) may be calculated from the difference equation
itself for any value of k (given enough patience or a computer) in terms
of the first n values of y(k).† The values of y(0) through y(n − 1), or
equivalent information, must be given in order for the solution of Eq.
2.11-1 to be unique.

† This is the basis for one method of obtaining numerical solutions of differential
equations.

Use of the iterative process suggested by Eq. 2.11-4 is not usually


considered to constitute a “solution” of Eq. 2.11-1. Most of the remainder
of this chapter is concerned with finding an expression for y(k) in closed
form that satisfies Eq. 2.11-1 or 2.11-3.
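As a small illustration of the iteration suggested by Eq. 2.11-4 (not part of the original text; the coefficients, initial values, and input below are arbitrary choices), consider the second order equation y(k+2) + 5y(k+1) + 6y(k) = F(k):

    # Iterative evaluation of a difference equation from its first n values.
    def iterate(F, y0, y1, kmax):
        y = [y0, y1]
        for k in range(kmax - 1):
            # Eq. 2.11-4 with a2 = 1, a1 = 5, a0 = 6
            y.append(-(5.0 * y[k + 1] + 6.0 * y[k] - F(k)))
        return y

    print(iterate(lambda k: 1.0, 0.0, 1.0, 8))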
A homogeneous difference equation of order n has n linearly independent
solutions. If a_n \ne 0 and a_0 \ne 0, the independent solutions of Eq. 2.11-3
can be denoted y_1(k), y_2(k), \ldots, y_n(k). A necessary and sufficient condition
for the solutions to be independent is that

W(k) = \begin{vmatrix} y_1 & y_2 & \cdots & y_n \\ Ey_1 & Ey_2 & \cdots & Ey_n \\ \vdots & & & \vdots \\ E^{n-1}y_1 & E^{n-1}y_2 & \cdots & E^{n-1}y_n \end{vmatrix} \ne 0 \quad (2.11-5)†

The most general solution of Eq. 2.11-3 is

y_H = C_1 y_1(k) + C_2 y_2(k) + \cdots + C_n y_n(k) \quad (2.11-6)

where the C_i's are arbitrary constants that are independent of k.


The most general solution of Eq. 2.11-1 is

y = yH + yp
where yH is given in Eq. 2.11-6, and where yP is any one solution satisfying
Eq. 2.11-1. The components yH and yP are called the complementary
and particular solutions, respectively. Since yP contains no arbitrary
constants, y contains n such constants, which must be evaluated by initial
or boundary conditions, as in Examples 2.12-1 and 2.12-4.

2.12 SOLUTION OF DIFFERENCE EQUATIONS


WITH CONSTANT COEFFICIENTS

Explicit procedures exist for finding the closed-form solution of difference
equations with constant coefficients. The nth order nonhomogeneous
and homogeneous difference equations are given by Eqs. 2.11-1 and
2.11-3, respectively, where the a_i's are constants.

Homogeneous Difference Equations

Consider the nth order equation

(a_n E^n + a_{n-1}E^{n-1} + \cdots + a_1 E + a_0)y(k) = 0 \quad (2.12-1)

† The function W(k) is analogous to the Wronskian, but it is often called Casorati's
determinant and given the symbol C(k).

By analogy with Section 2.6, it is reasonable to assume solutions of the
form

y(k) = \epsilon^{rk} \quad (2.12-2)

where r is a constant to be determined.


Example 2.12-1. Solve Example 2.10-1 for i as a function of k.
The equation for the kth mesh in Fig. 2.10-1 is

-R_2 i(k - 1) + 2(R_1 + R_2)i(k) - R_2 i(k + 1) = 0 \quad (2.12-3)

There must be two independent solutions to this second order difference equation. For
the assumed solution i(k) = \epsilon^{rk},

[-R_2\epsilon^{-r} + 2(R_1 + R_2) - R_2\epsilon^{r}]\epsilon^{rk} = 0

The expression in the brackets, which is a polynomial in \epsilon^r, is zero if

\cosh r = \frac{R_1 + R_2}{R_2}

Since \cosh(-\theta) = \cosh\theta, the two allowable values of r are given by r = \pm\theta, where

\cosh\theta = \frac{R_1 + R_2}{R_2} \quad (2.12-4)

The general solution is

i(k) = C_1\epsilon^{k\theta} + C_2\epsilon^{-k\theta}

The two boundary conditions needed to evaluate the arbitrary constants C_1 and C_2
are found by examining the first and last meshes in Fig. 2.10-1.

(R_1 + R_2)i(0) - R_2 i(1) = e
i(m) = 0 \quad (2.12-5)

Using the second boundary condition, C_1\epsilon^{m\theta} + C_2\epsilon^{-m\theta} = 0; hence

C_2 = -C_1\epsilon^{2m\theta}

and

i(k) = C_1[\epsilon^{k\theta} - \epsilon^{2m\theta}\epsilon^{-k\theta}] = -2C_1\epsilon^{m\theta}\sinh(m - k)\theta

Using the first boundary condition,

(R_1 + R_2)[-2C_1\epsilon^{m\theta}\sinh m\theta] - R_2[-2C_1\epsilon^{m\theta}\sinh(m - 1)\theta] = e

Since

R_1 + R_2 = R_2\cosh\theta

and

\sinh(m - 1)\theta = \sinh m\theta\cosh\theta - \sinh\theta\cosh m\theta

the last equation becomes

2C_1 R_2\epsilon^{m\theta}[-\sinh m\theta\cosh\theta + \sinh m\theta\cosh\theta - \sinh\theta\cosh m\theta] = e

Thus

2C_1\epsilon^{m\theta} = \frac{-e}{R_2\sinh\theta\cosh m\theta}

The final answer is then

i(k) = \frac{e\sinh(m - k)\theta}{R_2\sinh\theta\cosh m\theta} \quad (2.12-6)

for k = 0, 1, 2, \ldots, m, where \theta is given by Eq. 2.12-4. Example 3.7-3 gives the solution
for i as a function of t and k when the resistance R_2 in Fig. 2.10-1 is replaced by a
capacitance C.
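Equation 2.12-6 is easy to check numerically; the sketch below (values of e, R_1, R_2, and m are arbitrary assumptions for illustration) confirms that the closed form satisfies every mesh equation and both boundary conditions.

    # Numerical check of Eq. 2.12-6 for the ladder of Example 2.12-1.
    import numpy as np

    e, R1, R2, m = 10.0, 1.0, 2.0, 6
    theta = np.arccosh((R1 + R2) / R2)

    def i(k):
        return e * np.sinh((m - k) * theta) / (R2 * np.sinh(theta) * np.cosh(m * theta))

    # interior mesh equations should all be (numerically) zero
    print(max(abs(-R2 * i(k - 1) + 2 * (R1 + R2) * i(k) - R2 * i(k + 1))
              for k in range(1, m)))
    # boundary conditions: first-mesh equation residual and i(m)
    print((R1 + R2) * i(0) - R2 * i(1) - e, i(m))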

While the form of the assumed solution given in Eq. 2.12-2 happens to
be the most convenient one for Example 2.12-1, it is not the form that is
usually assumed. Recall from the example that the allowable values of r
were the roots of a polynomial in \epsilon^r. If \beta = \epsilon^r, the assumed solution
becomes

y(k) = \beta^k \quad (2.12-7)

where \beta is a constant to be determined. Substituting Eq. 2.12-7 into
Eq. 2.12-1 and using the fact that E^m\beta^k = \beta^m\beta^k, the following characteristic
or auxiliary equation results.

a_n\beta^n + a_{n-1}\beta^{n-1} + \cdots + a_1\beta + a_0 = 0 \quad (2.12-8)

In practice, this equation is written down directly from an inspection of
the difference equation, Eq. 2.12-1.
If the n roots of the characteristic equation are distinct and are denoted
by \beta_1, \beta_2, \ldots, \beta_n, the most general solution of the homogeneous difference
equation is

y_H = C_1\beta_1^k + C_2\beta_2^k + \cdots + C_n\beta_n^k \quad (2.12-9)

If the \beta_i's are distinct, it can be shown that the individual solutions
y_1 = \beta_1^k, y_2 = \beta_2^k, \ldots, y_n = \beta_n^k do satisfy Eq. 2.11-5 and are therefore
independent.10 If a root \beta_1, for example, is repeated m times, then the
most general solution is

y_H = C_1\beta_1^k + C_2 k\beta_1^k + \cdots + C_m k^{m-1}\beta_1^k + C_{m+1}\beta_{m+1}^k + \cdots + C_n\beta_n^k \quad (2.12-10)

Any complex roots of a characteristic equation with real coefficients must
occur in complex conjugate pairs. If there are first order roots at \beta_1 = \rho\epsilon^{j\theta}
and \beta_2 = \rho\epsilon^{-j\theta}, where \rho and \theta are real numbers, the solution contains the
terms

C_1\beta_1^k + C_2\beta_2^k = \rho^k[A\cos\theta k + B\sin\theta k] = C\rho^k\cos(\theta k + \phi) \quad (2.12-11)

where A, B, C, and \phi are arbitrary real constants.
Roots of the characteristic equation at \beta = 0 deserve special mention.
If a_0 = 0, a_1 \ne 0, and a_n \ne 0 in Eq. 2.12-1, the characteristic equation
has a first order root at the origin. Since the order of the difference

equation is only n − 1, and the order of the characteristic polynomial is n,
the root at the origin is extraneous and should be disregarded. Higher
order roots at \beta = 0 should likewise be disregarded.

Nonhomogeneous Difference Equations—Method


of Undetermined Coefficients

Consider the nth order difference equation

(a_n E^n + a_{n-1}E^{n-1} + \cdots + a_0)y(k) = F(k) \quad (2.12-12)

whose solution has the form

y(k) = y_H + y_P \quad (2.12-13)

The complementary solution y_H is found by solving the related homogeneous
equation, Eq. 2.12-1. The particular solution y_P can be found
by the same two methods used for differential equations, i.e., undetermined
coefficients, and variation of parameters.
The method of undetermined coefficients can be used only when the
repeated application of the operator E to the forcing function F(k)
produces a finite number of linearly independent terms. F(k) may be a
polynomial, exponential, sinusoidal, or hyperbolic function (e.g., k^m + \cdots
+ b_1 k + b_0, c^k, \sin\theta k or \cos\theta k, \sinh\theta k or \cosh\theta k), or it may be a
product or linear combination of such functions. The solution is assumed
to be a linear combination of the terms in F(k), F(k + 1), F(k + 2), \ldots,
each term being multiplied by an undetermined constant. The undetermined
constants are chosen so that the assumed solution satisfies Eq.
2.12-12 for all values of k.
If a term in F(k), F(k + 1), F(k + 2), \ldots has exactly the same form
as a term in y_H, a modification must be made in the form of the particular
solution that would otherwise be assumed. All the terms in y_P corresponding
to the duplicated term in y_H must be multiplied by the lowest
power of k that will remove the duplication.
Example 2.12-2. Find the general solution to

(E^2 + 2E + 1)y(k) = k(-1)^k

The characteristic equation is

\beta^2 + 2\beta + 1 = 0

so \beta_1 = \beta_2 = -1, and y_H = C_1(-1)^k + C_2 k(-1)^k. Although normally y_P =
[Ak + B](-1)^k would be assumed, the double root of the characteristic equation at -1 requires

y_P = [Ak^3 + Bk^2](-1)^k

Then

Ey_P = -[A(k + 1)^3 + B(k + 1)^2](-1)^k
E^2 y_P = [A(k + 2)^3 + B(k + 2)^2](-1)^k

Substituting these expressions into the left side of the difference equation yields

[(6k + 6)A + 2B](-1)^k = k(-1)^k

so that A = \tfrac{1}{6} and B = -\tfrac{1}{2}. The complete solution is

y(k) = [C_1 + C_2 k - \tfrac{1}{2}k^2 + \tfrac{1}{6}k^3](-1)^k
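A direct substitution check of this solution (an illustration only; the values of C_1 and C_2 are chosen arbitrarily) is sketched below.

    # Check of Example 2.12-2: the complete solution satisfies the difference equation.
    C1, C2 = 2.0, -1.0

    def y(k):
        return (C1 + C2 * k - 0.5 * k**2 + k**3 / 6.0) * (-1.0)**k

    for k in range(6):
        lhs = y(k + 2) + 2.0 * y(k + 1) + y(k)
        print(lhs, k * (-1.0)**k)        # the two columns agree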

Example 2.12-3. Solve Example 2.10-2 for i as a function of t and k.
The response of the circuit in Fig. 2.10-2 during the kth cycle is given by

\frac{di_1}{dt} + i_1 = 1 \quad \text{for } 0 < t < 1
\frac{di_2}{dt} + i_2 = 0 \quad \text{for } 1 < t < 2 \quad (2.12-14)

where the subscripts 1 and 2 denote the first and second halves of the cycle, respectively.
The solutions of these differential equations are

i_1(t, k) = 1 + C_1(k)\epsilon^{-t} \quad \text{for } 0 \le t \le 1
i_2(t, k) = C_2(k)\epsilon^{-t} \quad \text{for } 1 \le t \le 2 \quad (2.12-15)

The factors C_1 and C_2 are constants with respect to t but may be functions of k, the
cycle under consideration. Since the current cannot change instantaneously,

i_1(1, k) = i_2(1, k) \quad \text{and} \quad i_1(0, k + 1) = i_2(2, k)

Since the current is zero at the start of the first cycle,

i_1(0, 1) = 0

Inserting each of these three conditions into Eq. 2.12-15 gives, respectively,

C_2(k) = C_1(k) + \epsilon \quad (2.12-16)
1 + C_1(k + 1) = C_2(k)\epsilon^{-2} \quad (2.12-17)
C_1(1) = -1 \quad (2.12-18)

Equations 2.12-16 and 2.12-17 yield the first order difference equation

C_1(k + 1) - \epsilon^{-2}C_1(k) = -1 + \epsilon^{-1}

The complementary solution is

C_{1H}(k) = C\epsilon^{-2k}

Assuming a particular solution of

C_{1P}(k) = a

and substituting it into the difference equation yields

a(1 - \epsilon^{-2}) = -1 + \epsilon^{-1}

Hence

a = \frac{-1}{1 + \epsilon^{-1}}

The complete solution is

C_1(k) = C\epsilon^{-2k} - \frac{1}{1 + \epsilon^{-1}}

By Eq. 2.12-18,

-1 = C\epsilon^{-2} - \frac{1}{1 + \epsilon^{-1}}, \quad \text{so} \quad C = \frac{-\epsilon^2}{1 + \epsilon}

Finally,

C_1(k) = -\frac{\epsilon + \epsilon^{2(1-k)}}{1 + \epsilon}

and

i_1(t, k) = 1 - \frac{\epsilon + \epsilon^{2(1-k)}}{1 + \epsilon}\,\epsilon^{-t} \quad \text{for } 0 \le t \le 1

i_2(t, k) = \frac{\epsilon^2 - \epsilon^{2(1-k)}}{1 + \epsilon}\,\epsilon^{-t} \quad \text{for } 1 \le t \le 2

If k approaches infinity, the steady-state response is seen to be

i_1(t) = 1 - \frac{\epsilon}{1 + \epsilon}\,\epsilon^{-t} \quad \text{for } 0 < t < 1

i_2(t) = \frac{\epsilon^2}{1 + \epsilon}\,\epsilon^{-t} \quad \text{for } 1 < t < 2 \quad (2.12-19)

The number of cycles needed to reach the steady state is given by the requirement
\epsilon^{2(1-k)} \ll \epsilon.
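A cycle-by-cycle check (not part of the original text) is easy to make: iterate the first order difference equation for C_1(k) directly and compare it with the closed form just obtained.

    # Check of Example 2.12-3: iterate C1(k+1) = exp(-2) C1(k) - 1 + exp(-1).
    import numpy as np

    def C1_closed(k):
        return -(np.e + np.exp(2.0 * (1.0 - k))) / (1.0 + np.e)

    C1 = -1.0                                        # C1(1), Eq. 2.12-18
    for k in range(1, 8):
        print(k, C1, C1_closed(k))                   # the two values agree each cycle
        C1 = np.exp(-2.0) * C1 - 1.0 + np.exp(-1.0)  # Eqs. 2.12-16 and 2.12-17 combined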

Nonhomogeneous Difference Equations—Variation of Parameters

If the complementary solution has been found, variation of parameters
is an explicit method of finding y_P regardless of the nature of F(k).
Whereas the application of the method to differential equations resulted
in the integration of a known function of t, its application to difference
equations results in the summation of a known function of k.
Consider the first order equation

(a_1 E + a_0)y(k) = F(k) \quad (2.12-20)

The complementary solution has only the single term y_H = Cy_1(k). The
particular solution is assumed to have the form

y_P = \mu(k)y_1(k) \quad (2.12-21)

Substituting Eq. 2.12-21 into Eq. 2.12-20 gives

a_1\mu(k + 1)y_1(k + 1) + a_0\mu(k)y_1(k) = F(k)

Rearranging,

a_1[\mu(k + 1)y_1(k + 1) - \mu(k)y_1(k + 1)] + \mu(k)[a_1 y_1(k + 1) + a_0 y_1(k)] = F(k)

The expression in the first bracket is y_1(k + 1)\,\Delta\mu(k), while the second
bracket vanishes, since y_1(k) is a solution of the related homogeneous
equation. Therefore

a_1 y_1(k + 1)\,\Delta\mu(k) = F(k)

whose solution, by Eq. 2.9-22, is

\mu(k) = \Delta^{-1}\!\left[\frac{F(k)}{a_1 y_1(k + 1)}\right] = \sum^{n=k}\frac{F(n - 1)}{a_1 y_1(n)} \quad (2.12-22)
Example 2.12-4. Find the complete solution to

(E + 3)y(k) = \frac{(-3)^k}{k(k + 1)}

The complementary solution is

y_H = C(-3)^k, \quad \text{i.e.,} \quad y_1(k) = (-3)^k

The assumed form of the particular solution is

y_P = \mu(k)(-3)^k

By Eq. 2.12-22,

\mu(k) = \sum^{n=k}\frac{(-3)^{n-1}}{(n - 1)n(-3)^n} = -\frac{1}{3}\sum_{n=2}^{k}\frac{1}{n(n - 1)}

By Eq. 2.9-23,

\mu(k) = -\frac{1}{3}\left(1 - \frac{1}{k}\right)

and the complete solution is

y(k) = \left[C - \frac{1}{3} + \frac{1}{3k}\right](-3)^k = \left[C_1 + \frac{1}{3k}\right](-3)^k
Next consider the second order equation

(a_2 E^2 + a_1 E + a_0)y(k) = F(k) \quad (2.12-23)

whose complementary solution has the form

y_H = C_1 y_1(k) + C_2 y_2(k)

The particular solution is assumed to be

y_P = \mu_1(k)y_1(k) + \mu_2(k)y_2(k) \quad (2.12-24)

One of the two conditions needed to evaluate \mu_1 and \mu_2 is that Eq. 2.12-24
must satisfy Eq. 2.12-23. By analogy to Eq. 2.6-14, the second condition
is arbitrarily chosen to be

y_1(k + 1)\,\Delta\mu_1(k) + y_2(k + 1)\,\Delta\mu_2(k) = 0 \quad (2.12-25)

Since

\mu_i(k + 1) = \mu_i(k) + \Delta\mu_i(k)

it follows from Eq. 2.12-25 that

Ey_P = y_1(k + 1)[\mu_1(k) + \Delta\mu_1(k)] + y_2(k + 1)[\mu_2(k) + \Delta\mu_2(k)]
     = y_1(k + 1)\mu_1(k) + y_2(k + 1)\mu_2(k)

and

E^2 y_P = y_1(k + 2)[\mu_1(k) + \Delta\mu_1(k)] + y_2(k + 2)[\mu_2(k) + \Delta\mu_2(k)]

Substituting these expressions into Eq. 2.12-23 and rearranging gives

a_2[y_1(k + 2)\,\Delta\mu_1(k) + y_2(k + 2)\,\Delta\mu_2(k)]
   + \mu_1(k)[a_2 y_1(k + 2) + a_1 y_1(k + 1) + a_0 y_1(k)]
   + \mu_2(k)[a_2 y_2(k + 2) + a_1 y_2(k + 1) + a_0 y_2(k)] = F(k)

Since y_1(k) and y_2(k) are solutions of the related homogeneous equation,

y_1(k + 2)\,\Delta\mu_1(k) + y_2(k + 2)\,\Delta\mu_2(k) = \frac{F(k)}{a_2} \quad (2.12-26)

Solving Eqs. 2.12-25 and 2.12-26 simultaneously,

\Delta\mu_1(k) = \frac{-y_2(k + 1)F(k)}{a_2[y_1(k + 1)y_2(k + 2) - y_1(k + 2)y_2(k + 1)]}

\Delta\mu_2(k) = \frac{y_1(k + 1)F(k)}{a_2[y_1(k + 1)y_2(k + 2) - y_1(k + 2)y_2(k + 1)]} \quad (2.12-27)

so

\mu_1(k) = -\sum^{n=k}\frac{y_2(n)F(n - 1)}{a_2[y_1(n)y_2(n + 1) - y_1(n + 1)y_2(n)]}

\mu_2(k) = \sum^{n=k}\frac{y_1(n)F(n - 1)}{a_2[y_1(n)y_2(n + 1) - y_1(n + 1)y_2(n)]} \quad (2.12-28)

If y_1 and y_2 are two independent solutions to the homogeneous equation,
as was assumed, then Eq. 2.11-5 proves that the denominators in Eqs.
2.12-27 and 2.12-28 do not vanish. Thus \mu_1(k) and \mu_2(k) can always be
explicitly found.
Example 2.12-5. Solve Example 2.12-2 by variation of parameters.

(E^2 + 2E + 1)y(k) = k(-1)^k

Let

y_P = \mu_1(k)y_1(k) + \mu_2(k)y_2(k)

where

y_1(k) = (-1)^k, \qquad y_2(k) = k(-1)^k

Note that

y_1(k)y_2(k + 1) - y_1(k + 1)y_2(k) = -1

By Eq. 2.12-28,

\mu_1(k) = \sum^{n=k} n(-1)^n(n - 1)(-1)^{n-1} = -\sum^{n=k} n(n - 1) = -k(k - 1) - \sum^{n=k-1} n(n - 1)

Using Eq. 2.9-24, with m = 2,

\mu_1(k) = -k(k - 1) - \tfrac{1}{3}k(k - 1)(k - 2) = -\tfrac{1}{3}(k^3 - k)

Similarly,

\mu_2(k) = -\sum^{n=k}(-1)^n(n - 1)(-1)^{n-1} = \sum^{n=k}(n - 1) = \tfrac{1}{2}k(k - 1)

Then

y_P = -\tfrac{1}{3}(k^3 - k)(-1)^k + \tfrac{1}{2}(k^2 - k)k(-1)^k = [\tfrac{1}{6}k^3 - \tfrac{1}{2}k^2 + \tfrac{1}{3}k](-1)^k

The complete solution is

y(k) = [C_1 + C_2 k - \tfrac{1}{2}k^2 + \tfrac{1}{6}k^3](-1)^k

agreeing with Example 2.12-2.

Consider finally the nth order difference equation, Eq. 2.12-12, whose
complementary solution has the form

y_H = C_1 y_1(k) + C_2 y_2(k) + \cdots + C_n y_n(k)

The assumed particular solution is

y_P = \mu_1(k)y_1(k) + \mu_2(k)y_2(k) + \cdots + \mu_n(k)y_n(k) \quad (2.12-29)

The following n − 1 conditions are arbitrarily selected to simplify the
solution.

y_1(k + 1)\,\Delta\mu_1(k) + y_2(k + 1)\,\Delta\mu_2(k) + \cdots + y_n(k + 1)\,\Delta\mu_n(k) = 0
y_1(k + 2)\,\Delta\mu_1(k) + y_2(k + 2)\,\Delta\mu_2(k) + \cdots + y_n(k + 2)\,\Delta\mu_n(k) = 0
\vdots \quad (2.12-30)
y_1(k + n - 1)\,\Delta\mu_1(k) + y_2(k + n - 1)\,\Delta\mu_2(k) + \cdots + y_n(k + n - 1)\,\Delta\mu_n(k) = 0

Substituting Eq. 2.12-29 into Eq. 2.12-12, and making use of Eqs. 2.12-30,
there results

y_1(k + n)\,\Delta\mu_1(k) + y_2(k + n)\,\Delta\mu_2(k) + \cdots + y_n(k + n)\,\Delta\mu_n(k) = \frac{F(k)}{a_n} \quad (2.12-31)

Solving Eqs. 2.12-30 and 2.12-31 in terms of determinants by Cramer's rule,

\Delta\mu_i(k) = \frac{W_{ni}(k + 1)F(k)}{a_n W(k + 1)} \quad (2.12-32)

or

\mu_i(k) = \sum^{r=k}\frac{W_{ni}(r)F(r - 1)}{a_n W(r)} \quad (2.12-33)

where W(k) is the determinant

W(k) = \begin{vmatrix} y_1(k) & y_2(k) & \cdots & y_n(k) \\ Ey_1(k) & Ey_2(k) & \cdots & Ey_n(k) \\ \vdots & & & \vdots \\ E^{n-1}y_1(k) & E^{n-1}y_2(k) & \cdots & E^{n-1}y_n(k) \end{vmatrix} \quad (2.12-34)

and where W_{ni}(k) is the ni-th cofactor. By Eq. 2.11-5, W(k) does not
vanish if y_1(k) through y_n(k) are independent solutions of the homogeneous
equation.
Although the summation involved may be difficult to express in closed
form, Eq. 2.12-33 gives an explicit solution for y_P once y_H is known.
Unlike the method of undetermined coefficients, the variation of parameters
method is applicable for all forcing functions and can be extended
to the solution of difference equations whose coefficients are functions
of k.

The Physical Significance of the Complementary


and Particular Solutions

Assume that the input-output relationship of a system is given by the


difference equation
A(E)y(k) = B(E)v(k) = F(k) (2.12-35)

The form of the complementary solution depends only upon the parameters
of the system, specifically upon the roots of the characteristic
equation

A(\beta) = 0

If the roots \beta_1, \beta_2, \ldots, \beta_n are distinct, the complementary solution

y_H = C_1\beta_1^k + C_2\beta_2^k + \cdots + C_n\beta_n^k

increases without limit as k approaches infinity only if one of the roots


has a magnitude greater than unity. If the roots of the characteristic
equation are plotted in a complex plane, roots inside and outside of a
unit circle drawn about the origin correspond to terms that approach zero
and that increase without limit, respectively, as k approaches infinity.
Simple roots on the unit circle yield terms of constant magnitude, but
repeated roots on the unit circle again yield terms that increase without
limit.
Since the complementary solution is the complete solution when there
is no external source, it is also called the free or unforced response. The
system is unstable if the free response increases without limit, i.e., if the
characteristic equation has roots outside the unit circle or repeated roots
on the unit circle.
The arbitrary constants in the complementary solution depend upon
the initial or boundary conditions, which summarize the past history of
the system, and upon the input. They cannot be evaluated until both the
complementary and the particular solutions have been found.

Since the form of the particular solution depends upon the input, it is
also called the forced response. If all the roots of the characteristic
equation are inside the unit circle, then the complementary and particular
solutions are identical with the transient and steady-state solutions,
respectively.

The Particular Solution for the Input z^k

Some insight into the relationship between the classical and transform
solutions of difference equations is gained by examining the particular
solution of Eq. 2.12-35, which is written out below, when v(k) = z^k.

(a_n E^n + a_{n-1}E^{n-1} + \cdots + a_1 E + a_0)y(k) = (b_m E^m + \cdots + b_1 E + b_0)v(k) \quad (2.12-36)

Assuming that y_P(k) = H(z)z^k, where H(z) is not a function of k, and noting
that E^m y_P(k) = H(z)z^m z^k, there results

H(z)A(z)z^k = B(z)z^k

where A(z) is the function formed from A(E) by replacing the operator
E by the parameter z. Thus

H(z) = \frac{b_m z^m + \cdots + b_1 z + b_0}{a_n z^n + \cdots + a_1 z + a_0} \quad (2.12-37)

H(z) is the system function for a discrete system and can be written down
by inspection of the difference equation. It plays an important role in
the application of the Z transform. Note also that

H(z) = \left[\frac{y_P(k)}{v(k)}\right]_{v(k)=z^k} \quad (2.12-38)
Obtaining the Delta Response from the Difference Equation

The delta response, defined in Section 1.9 and denoted by d(k), is the
response of a discrete system initially at rest to

v(k) = \begin{cases} 1 & \text{for } k = 0 \\ 0 & \text{for } k \ne 0 \end{cases} \quad (2.12-39)

In comparing this equation with those of Chapter 1, remember that the
difference equations of this chapter assume that the time scale has been
adjusted so that T = 1. Equation 2.12-36 must be solved subject to the
input of Eq. 2.12-39 and to the condition that

y(k) = 0 \quad \text{for } k < 0 \quad (2.12-40)

The general technique is illustrated by the following example, the answer


of which is used in Examples 1.9-1 and 3.13-3. The technique is also
applicable for finding the impulse response of a continuous system sub¬
jected to an input v*(t), provided that the response is desired only at
equally spaced, discrete instants.
Example 2.12-6. Find the delta response to

y(k + 2) - 3y(k + 1) + 2y(k) = 2v(k + 1) - 2v(k)

The right side of this equation is zero for k ≥ 1, so the delta response is the complementary
solution

y(k) = C_1 + C_2(2)^k \quad \text{for } k \ge 1

To evaluate the two arbitrary constants, the values of y(1) and y(2) are needed.
Substituting k = -2 in the difference equation gives

y(0) - 0 + 0 = 0 - 0

Hence y(0) = 0. For k = -1,

y(1) - 0 + 0 = 2 - 0

Then y(1) = 2. For k = 0,

y(2) - 3(2) + 0 = 0 - 2

Thus y(2) = 4. Using the last two results to evaluate C_1 and C_2,

2 = C_1 + 2C_2
4 = C_1 + 4C_2

so C_1 = 0 and C_2 = 1. The delta response is

d(k) = \begin{cases} 2^k & \text{for } k \ge 1 \\ 0 & \text{for } k \le 0 \end{cases}
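The same answer is obtained by simply iterating the difference equation from rest, as the short sketch below illustrates (an illustration only; the indexing scheme is a convenience of this fragment).

    # Recursive check of Example 2.12-6.
    def v(k):                               # the unit "delta" input of Eq. 2.12-39
        return 1.0 if k == 0 else 0.0

    y = {-2: 0.0, -1: 0.0}                  # system initially at rest
    for k in range(-2, 10):
        y[k + 2] = 3.0*y[k + 1] - 2.0*y[k] + 2.0*v(k + 1) - 2.0*v(k)

    print([y[k] for k in range(0, 8)])      # 0, 2, 4, 8, 16, ... i.e. 2^k for k >= 1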

2.13 SOLUTION OF LINEAR DIFFERENCE EQUATIONS


WITH TIME-VARYING COEFFICIENTS

Consider the nth order difference equation

[a_n(k)E^n + \cdots + a_1(k)E + a_0(k)]y(k) = F(k) \quad (2.13-1)

As with differential equations, a solution can always be found for first
order equations but cannot in general be found for higher order equations.

The Complementary Solution

Consider the first order, homogeneous equation

a_1(k)y(k + 1) + a_0(k)y(k) = 0 \quad (2.13-2)



where a_1(k) \ne 0 and a_0(k) \ne 0. This can be rewritten as

y(k + 1) = a(k)y(k) \quad (2.13-3)

Letting k = 0, 1, 2, \ldots,

y(1) = a(0)y(0)
y(2) = a(1)a(0)y(0)
y(3) = a(2)a(1)a(0)y(0)

or in general

y(k) = [a(0)a(1)\cdots a(k - 1)]y(0) = y(0)\prod_{n=0}^{k-1}a(n)

so that the general solution to Eq. 2.13-3 is

y_H(k) = C\prod^{n=k-1}a(n) = C\prod^{n=k}a(n - 1) \quad (2.13-4)

The lower limit of the indicated product is arbitrary, because any fixed
number of the factors a(0), a(1), a(2), \ldots may be combined with the
arbitrary constant C.
The solution to a homogeneous equation of order greater than one
cannot in general be found in terms of elementary functions, since the
procedure based on Eqs. 2.12-7 and 2.12-8 is invalid when the coefficients
are functions of k. It is sometimes helpful to know that, if all but one
of the independent solutions are known, the remaining one can then be
found.11
As with differential equations, there are a number of special cases
which do have an explicit solution. If the equation can be put in the
form

a_n f(k + n)y(k + n) + \cdots + a_1 f(k + 1)y(k + 1) + a_0 f(k)y(k) = 0

where the a_i's are constants, the substitution z(k) = f(k)y(k) reduces it
to a difference equation with constant coefficients. The procedure is
somewhat analogous to that used for Euler's differential equation, but
the change of variable is with respect to the dependent rather than the
independent variable. This is usually the case when a varying equation
is solved by a change of variable.†
Sometimes, the operator can be expressed in factored form, in which
case the difference equation can be reduced to equations of lower order.
For example,

[kE^2 + (k^2 + k + 1)E + k]y(k) = 0

† Certain nonlinear equations, such as y^2(k + 2) + y^2(k + 1) + y^2(k) = 0, can likewise
be reduced to a linear difference equation by a change of the dependent variable.

may be written as

(kE + 1)(E + k)y(k) = 0

which is equivalent to the two first order equations

(kE + 1)z(k) = 0
(E + k)y(k) = z(k)

each of which may be explicitly solved.


Many of the special cases that have been systematically investigated
can be classified according to the nature of the coefficients, such as the
case of periodic coefficients.12 Considerable work has been done in
obtaining solutions in the form of an infinite series.13

The Particular Solution

The variation of parameters method, including Eqs. 2.12-22, 2.12-28,
and 2.12-33, is valid whether or not the a_i's are functions of k. In the
right side of these three equations, the symbols a_1, a_2, and a_n are understood
to be a_1(n - 1), a_2(n - 1), and a_n(r - 1), respectively. Although
y_P can always be found once y_H is known, y_H cannot in general be
determined.

Example 2.13-1. Find the complete solution to

[(k + 1)E - k]y(k) = k + 1

By Eq. 2.13-4, the complementary solution is

y_H = C\prod_{n=1}^{k-1}\frac{n}{n + 1} = C\frac{(k - 1)!}{k!} = \frac{C}{k}

The particular solution is assumed to have the form y_P = \mu(k)/k. By Eq. 2.12-22,

\mu(k) = \sum^{n=k}\frac{F(n - 1)}{a_1(n - 1)y_1(n)} = \sum_{n=1}^{k} n = \frac{k(k + 1)}{2}

where the last expression follows from the formula for an arithmetical progression or
from Eq. 2.9-24. The complete solution is

y(k) = \frac{C}{k} + \frac{k + 1}{2}
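A last numerical sketch (not from the text; the starting value y(1) below is arbitrary) iterates the time-varying recurrence directly and compares it with this closed form.

    # Iterative check of Example 2.13-1: (k+1) y(k+1) - k y(k) = k + 1.
    y1 = 4.0                        # arbitrary starting value y(1)
    C = y1 - 1.0                    # fixed by y(1) = C/1 + (1+1)/2

    y = y1
    for k in range(1, 8):
        print(k, y, C / k + (k + 1) / 2.0)      # the two columns agree
        y = (k * y + k + 1.0) / (k + 1.0)       # recurrence solved for y(k+1)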
REFERENCES

1. A. Ralston and H. S. Wilf, Numerical Methods for Digital Computers, John Wiley
and Sons, New York, 1960, Part III.
2. E. L. Ince, Ordinary Differential Equations, Dover Publications, New York, 1956,
Section 5.22.
3. Ibid., Section 7.4.
4. Ibid., Section 15.7.

5. B. K. Kinariwala, “Analysis of Time-Varying Networks,” IRE Intern. Conv.


Record, Vol. 9, Pt. 4, March 1961, pp. 268-276.
6. K. S. Miller, “Properties of Impulsive Responses and Green’s Functions,” IRE
Trans. Circuit Theory, Vol. CT-2, March 1955, pp. 26-31.
7. D. Graham, E. J. Brunelle, Jr., W. Johnson, and H. Passmore, III, “Engineering
Analysis Methods for Linear Time Varying Systems,” Report ASD-TDR-62-362,
Flight Control Laboratory, Wright-Patterson Air Force Base, Ohio, January 1963,
pp. 126-127.
8. C. R. Wylie, Jr., Advanced Engineering Mathematics, Second Edition, McGraw-
Hill Book Company, New York, 1960, p. 137.
9. M. F. Gardner and J. L. Barnes, Transients in Linear Systems, John Wiley and
Sons, New York, 1942, Chapter IX.
10. Wylie, op. cit., Section 5.5.
11. H. Levy and F. Lessman, Finite Difference Equations, Sir Isaac Pitman and Sons,
London, 1959, Chapter 6.
12. T. Fort, Finite Differences and Difference Equations in the Real Domain, Oxford
University Press, London, 1948, Chapters XIII and XIV.
13. L. M. Milne-Thomson, The Calculus of Finite Differences, Macmillan and Company,
London, 1933, Chapters 14-16.

Problems

2.1 If y_1 and y_2 denote two solutions of a linear, second order differential
equation, prove that these solutions are independent if and only if the
Wronskian of Eq. 2.4-3 does not vanish.
2.2 Solve the following differential equations. In part (e), the substitution
y = (1/x)(dx/dt) is helpful.
(a) (p^2 + 3p + 2)y = 0, \quad y(0) = 1, \; \dot{y}(0) = 4
(b) (p^2 + 3p + 2)y = \sin(t + \pi/4), \quad y(0) = 1, \; \dot{y}(0) = 4
(c) (p^3 + p^2 + p + 1)y = U_{-1}(t), \quad y(0) = \dot{y}(0) = \ddot{y}(0) = 0
(d) (tp2 + 2p)y = y{0) = 0, y{0) =i
(e) py - y^2 = 1, \quad y(0) = 0
2.3 Find the differential equation describing Fig. P2.3 by first determining
the system function H{s).

Fig. P2.3

2.4 Find the most general solution of the following homogeneous differential
equation by noting that one solution is y_1(t) = t.

(\t3p2 + tp - \)y = 0

2.5 Solve Example 1.6-1, using only the classical solution of differential
equations.
2.6 The circuit in Fig. P2.6 is originally operating in the steady state with the
switch K open. If the switch closes at t = 0, find expressions for the
currents for t > 0.


Fig. P2.6

2.7 Find the impulse response of the systems described by the following
differential equations.

(a) (p^2 + 2p + 2)y = (p^2 + 3p + 3)v
(b) (p^3 + 2p^2 + 3p + 2)y = (p + 4)v
(c) (p + t)y = v
(d) (tp^2 + 2p)y = v

2.8 Solve Problem 2.2d by using the superposition integrals of Chapter 1 and
the result of Problem 2.7d.
2.9 Find an expression for h(t, X) if a system is described by (ap + \)y =
(ap + \)v, where a is a constant, but where a may be a function of t.
2.10 Prove Eqs. 2.9-6, 2.9-11, and 2.9-13.

2.11 Sum the series

\sum \frac{1}{(n + 1)(n + 2)}

2.12 Starting with Eq. 2.9-11a, derive the formula for summation by parts:

\sum^{k} u(n)\,\Delta v(n) = [u(n)v(n)]^{k+1} - \sum^{k} v(n + 1)\,\Delta u(n)
Problems 99

2.13 Find the most general solution of the following difference equations.
Which of these equations represent stable systems? Rewrite these equa¬
tions in terms of the difference operator A.

O) y{k + 2) + 5y(k + 1) + 6y{k) = 0


(b) y{k + 2) — 2(sinh 6)y(k + 1) — y(k) = 0
(c) y{k + 2) — Ay{k + 1) + y{k) = 0, where —1 < A <1

2.14 Solve the following difference equations, and comment on the stability
of the systems they represent.

(a) (E2 + E + i)y(k) = sin (A:tt/2), 2/(0) = 0, 2/(1) =1


(b) (A2 + A + IMA:) = 0, y{0) = 1, A^O) = 0
(c) (E2 + 3E + 2)y{k) = (-l)fc, 2/(0) = y{ 1) = 1
(<d) {E2 + 2E + l)y(k) = (E + 2)v(k), y(k) = v(k) = 0 for k < 0,
v(k) = 1 for k > 0
(e) {E2 + E + 1 )y(k) — v(k), y{k) = v{k) = 0 for k < 0,
v(k) = ( —1)& for k > 0.

2.15 Determine the delta response of the system described by

(E3 + 2 E2 + 3 E + 2 )y{k) = (E + 4)v(k)

2.16 Solve Problem 2.14 by the superposition summations of Chapter 1.


2.17 Find the complete solution of the following difference equations.

(a) (E - k)2y(k) = k!
0b) [E2 + (2k + \)E + k2]y(k) = k\
Transform Techniques

3.1 INTRODUCTION

In Chapter 1, continuous signals are resolved into sets of singularity


functions. Once the response of a linear system to a singularity function
is known, the response to an arbitrary input can be found by superposition.
In Section 3.2, continuous signals are decomposed into sets of sinusoidal
functions by means of the Fourier series and integral. A system’s response
to a sinusoidal function can be easily found by a-c steady-state theory,
after which the response to an arbitrary input again follows by super¬
position. Because of certain limitations in the Fourier methods, the
Fourier integral is generalized in Section 3.3 to obtain the Laplace trans¬
form. The properties and applications of the Laplace transform are
considered in some detail, and the transform methods are related to the
convolution integrals of Chapter 1 and the differential equations of
Chapter 2.
Section 1.3 suggested that the time domain and transform techniques
are only two special cases of the general problem of decomposing signals
into sets of elementary functions. This matter is considered further in
Section 3.10.
The Z transform of Section 3.11 can be conveniently used for systems
with discrete inputs, provided that the output is desired only at certain
discrete instants. The Z transform is related to the superposition surm
mations of Chapter 1 and the difference equations of Chapter 2. The
modified Z transform of Section 3.14 permits the calculation of the output
at any instant of time.
100
Sec. 3.2 The Fourier Series and Integral 101

f2(t)

fM

3.2 THE FOURIER SERIES


AND INTEGRAL

Periodic and nonperiodic functions can be decomposed into sets of


sinusoidal functions by the Fourier series and integral, respectively. A
function is periodic, with a period T', if it is continually repeated every
T seconds over all values of t. Of the functions shown in Fig. 3.2-1, only
f^t) is periodic.
102 Transform Techniques

An infinite series, known as the Fourier series, may be written for any
single-valued periodic function which has in any one period only a finite
number of maxima, minima, and discontinuities. The area underneath
any one cycle of the curve is required to be finite. The Fourier series for a
function /(f) with a period T is
00

/(() = a0 + 2(an cos na>0t + bn sin noj0t) (3.2-1)


n=1

where co0 = lirjT is called the angular frequency of the first harmonic,
and where 1 ot/2
a0 — /(0 dt
T J-T/2
2 0TI2
an /(f) cos nco0t dt (n 0) (3.2-2)
T J-T/2
2 fT/2
bn = - /(f) sin nco0t dt
T J-T/2
The series converges to /(f) for values of t at which the function is
continuous. At a finite discontinuity, the series converges to the average
of the values of/(f) on either side of the discontinuity. The series con¬
verges uniformly for all f, if/(f) satisfies the restrictions of the previous
paragraph and in addition remains finite. If only a finite number of terms
are taken, the corresponding finite Fourier series approximates /(f) with
the least mean square error.
Since two trigonometric terms of the same frequency may be combined
into a single term with a phase angle, Eq. 3.2-1 may be rewritten as

(f) = a0 + 2 An cos (nco0t + 6n) (3.2-3)


n=1
where
An ~ V^n
(3.2-4)
0„= -tan

An and 0n represent the magnitude and phase angle (with respect to a pure
cosine wave) of the nth harmonic.

The Complex Form of the Fourier Series

The complex form is easily obtained from the trigonometric form by


use of the identities
cos + «-*)
(3.2-5)
sin <j> = — («** - £-*+)
2j
Sec. 3.2 The Fourier Series and Integral 103

Equation 3.2-3 becomes


oo

m= 2 n=— oo
c,**-** (3.2-6)

where
1 fTI 2
C« = f dt (3.2-7)
1 J-T/2

Equations 3.2-6 and 3.2-3 are related by c0 = a0 and

C, = ~ (n * 0)

so that the magnitude of cn is one-half the magnitude of the wth harmonic,


and the angle of cn is the phase angle of the harmonic relative to a cosine
wave. Since cn completely defines f(t) in frequency domain terms, it is
called the frequency spectrum off(t).
Example 3.2-1. In the circuit of Fig. 3.2-2a find the steady-state component of the
output voltage e0(t).

eo (0
104 Transform Techniques

For the calculation of the steady-state response, the square wave of input voltage
may be assumed to exist for both positive and negative values of time. Then the
frequency spectrum of the input, found from Eq. 3.2-7 with co0 = n, is
1 _ £—jmr

Ijnrr

cn = 0, n even
1
c„ = -— , n odd V

JH7T

Thus
1 ±00 1

e(0 = z + V -ein7Tt
z n=±T±*,'.j'n7T

By a-c steady-state circuit theory, the response to ejU)t is

€i(ot €jUot-tan—la,)

1 +j(JD V1 + (JO2

By superposition, and using Eq. 3.2-5,


co 2
e0(t) — y —. — cos (mrt — tan-1 mr)
M2
The broken line in Fig. 3.2-2b is the sum of the first two nonzero terms in the last
expression. The exact steady-state output, representing the entire infinite series, is

-e~l for 0 < t < 1


1 +€
e»(,) =
-e~l for 1 < t < 2
U +e
and is shown by the solid line.f

The preceding example contains three essential steps: finding the


frequency spectrum of the input, multiplying by the a-c steady-state
transfer function to obtain the output frequency spectrum, and construct¬
ing the output as a function of time. The frequency spectrum of /(f)
is often denoted by F(co), rather than by cn. Then, for a fixed linear system
with input v(t) and output y(t),

Y(co) = V(co)H(jco) (3.2-8)

where H(joo) is the a-c steady-state system function defined in Eq. 2.6-23.
Two limitations of the Fourier series method are that y(t) is expressed
as an infinite series rather than in closed form, and that only the steady-
state component is obtained. Because of this, the method is generally
t This closed-form solution is not obtained directly from the infinite series, but from
the classical solution of Example 2.12-3. e0 = di/dt, where i is given in Eq. 2.12-19.
Sec. 3.2 The Fourier Series and Integral 105

used only when the frequency spectra are desired, as in wave filters, and
not to calculate the output waveform as a function of time.

The Fourier Integral

The Fourier series of Eqs. 3.2-6 and 3.2-7 can be rewritten with co = nco0.
'T! 2 i 'T/2
-n 1
, dt = — \ f(t)e-j0it dt
Aco T A00 J-T/2 2tt J-T/2
00

m= i ej(ot A to
C0=— 00 Aco
The symbol Aco denotes the spacing between lines in the frequency
spectrum, and it is equal to co0 = 2it/T.
A nonperiodic function of time can be obtained from a periodic one by
letting T approach infinity. Since Aco then approaches zero, the formerly
discrete frequency spectrum becomes a continuous one, containing all
possible frequencies. The quantity g(co) = cn/Aco is introduced because
cn itself vanishes as T approaches infinity. Taking the limit as T approaches
infinity and as Aco becomes dco,
I f00
g(*>) = r- ft*-’ dt
2rr J—oo
/* 00

m = g(o>)*iwt dco
J — CO

To obtain the form most commonly used, let G(co) = 2iTg(co).

G{oS) = [m f{ty-^dt = ^[fm


J — cc
(3.2-9)
1 f00
f(t) = — G(co)e,“* dm = &~\G(co)]
277 J— 00
The last two equations are the direct and inverse Fourier transform,
respectively. G(co) is called the frequency spectrum of f(t), and the
expression for /(/) is often called the Fourier integral. The reader may
notice that the Fourier integral has the form of Eq. 1.3-3 with 2 = co,
and k(t, X) = e1(0t.
Equations 3.2-9 can be used for nonperiodic signals in the same manner
that the complex Fourier series is used for periodic signals. The conditions
necessary for the existence of a Fourier series carry over to the Fourier
integral. The function of time must be single-valued, with a finite number
(* oo

of maxima, minima, and discontinuities, and with |/(0I dt remaining


00
finite.
106 Transform Techniques

fit)
)\

1
L

L L
2 2
(a)

|G(to)|

|G(co)|
A

>■ co
0

(c)

Fig. 3.2-3

Example 3.2-2. Find the frequency spectrum of the rectangular pulse shown in
Fig. 3.2-3a.
'Ll 2
1 sin (coL/2)
G(co) - | - e-jcot dt =
-LI2 ^ coL/2

G(co), which is real because /(V) is an even function, is illustrated in Fig. 3.2-3b. In the
limit as L approaches 0, fit) becomes the unit impulse, and G(co) is shown in part (c)
of the figure. Thus the unit impulse contains all frequencies in equal strength.
This example demonstrates the principle of reciprocal spreading. The narrower the
function of time is made, the more its frequency spectrum spreads out, and the greater
is the bandwidth required to reproduce it faithfully. If the sharp corners in the function
Sec. 3.2 The Fourier Series and Integral 107

of time are smoothed, the high-frequency components in its frequency spectrum are
reduced. For example if, f(t) = (1V27r)e~t2/2, which is the Gaussian-type pulse of
unit area shown in Fig. \A-la, G(co) = e_c°2/2.

Example 3.2-3. Find the Fourier transform of the unit step function.
00

It should be expected that this example cannot be solved since | dt is not


finite. If Eq. 3.2-9 is used, ■ 00

f* 00
t 00
G(co) €-ja)t dt =
Jo L-jnJ
cannot be evaluated, since the upper limit is not uniquely defined, i.e., the defining
integral does not converge. If the reader is persistent, he might note that
100
€-at€-jmt gt _
1
J5'[e_a< for t > 0] (a ^ 0)
'0 a +JW
which, as a approaches 0, reduces to 1 /jco. But this limit is not, mathematically, the
Fourier transform of the unit step function. To find the function of time corresponding
to G(co) = 1/yco, Eq. 3.2-9 gives

cos cot sin cot


m=j , —:-b - dco
2rr J— qo JO) L yco co

sin cot j —i for t < 0


dco =
co \ i for t > 0

The purpose of this example is to point out that some of the very common functions of
time do not have Fourier transforms. The attempted use of the convergence factor
€~at suggests the heuristic derivation of the Laplace transform in the next section.

When analyzing a fixed linear system, it is convenient to denote the


Fourier transform of the input v(t) and output y{t) by V(co) and Y(co),
respectively. Using the system function defined in Eq. 2.6-23, the response
to v(t) = ej(ot is y(t) = H(jco)eja}t. Then, by the superposition principle,

Y(co) = V(co)H( jco) (3.2-10)


and
y(t) = .^[VUoWCjoj)} (3.2-11)

In contrast to the Fourier series, the Fourier integral of Eq. 3.2-11 gives
the complete response in closed form, provided that the system contains
no initial stored energy.
Example 3.2-4. For the circuit of Fig. 3.2-4o, plot the frequency spectrum of the
input and output, and find e2(t). Assume that there is no stored energy in the capacitance
at t = 0.
1
Ex(co) jajt dt
1 + jco

E2(co) = Ejco) 1 - 1
1 + jco (1 + jco)2
108 Transform Techniques

$E2(u)

(b)

Fig. 3.2-4

The frequency spectra are plotted in part (b) of the figure. The evaluation of

00
1
eft) = — (ho
2tt oo (1 +jcoy

involves a difficult integration. Tables of Fourier transforms do exist,1 and the use of
complex variable theory is often helpful. It can be shown by the methods of Section
3.6 that
' j CO
-ja>t
1
eft) = d(joj)
Irrj oo (1 + /to) ■
Sec. 3.3 The Laplace Transform 109

where the integration has been carried out with respect to the complex variable 5. The
details of the integration are not pursued further here, but the reader should realize
that the direct evaluation of the inverse Fourier transform is seldom easy.

3.3 THE LAPLACE TRANSFORM

The Laplace transform largely overcomes the chief difficulties encoun¬


tered in the use of the Fourier transform. The difficulty with the con¬
vergence of the direct transform can be remedied by replacing e~jcot by
€-(<r+]co)t jn £q. 3 2-9. This inserts a convergence factor e~at into the
integrand of G(oo) and permits the integral to be evaluated in some cases
where it would not otherwise exist. This modification is suggested by the
efforts of the previous section to obtain the Fourier transform of the unit
step function.

The Two-Sided Laplace Transform

In Eq. 3.2-9, let G(co) = F(jco), and write


-oo
F(jco) (r)€-to‘ dt
J—00
1 riv>
m — F{ja>yat d{ja>)
Z.7TJ J — j 00

In the second expression, the variable of integration has been changed to


jco. Replacing jco by s = a + jco,
r 00
F(S) = I f(t)t~stdt =&n [/CO] (3.3-1)
J — 00
and , r„+i«
fit) = T. F(sy*ds = ^u'iFis)] (3.3-2)
2lTj Ja—joo

where and cSfn-1 stand for the direct and inverse two-sided Laplace
transform.
This presentation is a heuristic one, illustrating the relationship between
the Laplace and Fourier transforms, rather than emphasizing mathe¬
matical rigor. In Eq. 3.3-1, the value of a = Re s is chosen to make the
integral converge, if possible. In Eq. 3.3-2, it is understood that the
integration is carried out with a within the same range that ensures the
convergence of Eq. 3.3-1. Since s is a complex variable, Eq. 3.3-2 involves
integration in the complex plane. A detailed discussion of this is postponed
until after a review of complex variable theory. Two examples of the use
of Eq. 3.3-1 are presented below.
110 Transform Techniques

Example 3.3-1. Determine the Laplace transform of

for t < 0
/(') =
for t > 0, for real a
' 00 00
-is—a)t
zate-st rff
F(s) = _

_-(s - a)_|0 s — a

with a region of convergence given by a > a. The function in this example does not have
a Fourier transform if a is positive but does have a Laplace transform.

Example 3.3-2. Determine the Laplace transform of

for t < 0, for real a


m =
for t > 0
-—(s—a)t

F(s) — — eate-st fa —

■ oo s — a — 00 s — a

with a region of convergence given by o < a.


It appears from these two examples that two different functions of time have the
same Laplace transform. The two regions of convergence, however, are mutually
exclusive. The region of convergence must be specified in order to determine which
function of t corresponds to the function (s — a)-1.

It should be clear that the Fourier and Laplace transforms of a non¬


periodic function are essentially identical if the range of convergence for
a includes a = 0. In such a case, the Fourier transform is found by
replacing s by jco in the Laplace transform. The phenomenon of two
different functions of time having the same Fourier transform never
occurs. If a is positive, ^^[iKjco — a)] must be the function of Example
3.3-2 and not that of Example 3.3-1. The unbounded function has no
Fourier transform and could not therefore be the correct answer. It is the
introduction of the convergence factor e~at that results in the uniqueness
problem.

The One-Sided Laplace Transform

It is usually possible to restrict oneself to functions that are zero for


negative values of time. The reason is that the response of a linear system
can be determined for all t > 0 from a knowledge of the input for t > 0
and the energies stored in the system at t — 0. The history of the system
prior to some reference time is adequately summarized by the stored
energies at that time, as far as the system’s future behavior is concerned.
The two-sided Laplace transform of Eq. 3.3-1 is therefore usually re¬
placed by the normal or one-sided Laplace transform of Eq. 3.3-3. The
latter equation simply says that f(t) is zero for negative values of t, or else
Sec. 3.3 The Laplace Transform 111

that f(t) for t < 0 is of no concern and can hence be assumed to be zero.
I* 00
F(S)= /(t)*-1 dt = X[f(t)] (3.3-3)
Jo
A subscript I is not used, since the Laplace transform is assumed to be the
one-sided transform, unless otherwise indicated. The application of
Eq. 3.3 -3 to several common functions of time leads to the results in
Table 3.3-1:f
Table 3.3-1

Region of
fit) for t > 0 F(s) Convergence

u0(t) 1 G > — 00
1
u_ i(0 — a > 0
s
1
t G > 0
72

e~at
i
g > —a
s + a
P
sin fit G > 0
S2 + p2
s
cos fit G > 0
s2 + fS2

e-at sjn P
g > —a
(s + a)2 + p2
s + a
e~at COS pt g > —a
(s + a)2 + p2
e~a t f n—1
1
{n - 1)! g > —a
(s + a)n
n = 1,2,...

The expression for the inverse transform is the same whether the one¬
sided or two-sided transform is used.
■j (V-fj'oo
m = A F(sy‘ ds = y-’[F(S)] (3.3-4)
2ttJ Ja—jco

The integration must still be carried out with a within the range that
ensures the convergence of Eq. 3.3-3.
j A very extensive table appears in Reference 2. Tables of moderate size can be found
in several of the other references.
112 Transform Techniques

fi(t)

hit) hit)

Fig. 3.4-1

3.4 PROPERTIES OF THE LAPLACE TRANSFORM

Several important theorems concerning the existence and uniqueness


of the one-sided Laplace transform may be found in standard references
and can be partially summarized as follows:3 A function f(t) must be
defined for all t > 0, except possibly at a denumerable set of points, in
order for its transform F(s) to exist. Every such f(t), which is piecewise
continuous and which is of exponential order, has a Laplace transform.!
For such functions, the integral in Eq. 3.3-3 converges absolutely for
a > c, where the constant c is known as the abscissa of absolute con¬
vergence.
Two functions of time have the same Laplace transform if and only
if they are identical for all t > 0, except possibly at a denumerable set of
points. In Fig. 3.4-1, all three functions have the same Laplace transform,
(s + l)-1. Except for these trivial cases, however, there is a one-to-one

t A function fit) is of exponential order if some real number k exists suchthat

lim f(t)ekt — 0
t-+ oo

Examples of functions not of exponential order are e'2 and tl. A function of
nonexponential order may or may not have a Laplace transform.
Sec. 3.4 Properties of the Laplace Transform 113

correspondence between/(/) and F(s). This means that Table 3.3-1 may
be used to find the inverse as well as the direct transform, which in many,
but not all, cases obviates the need for Eq. 3.3-4. Note that the uniqueness
problem of Examples 3.3-1 and 3.3-2 no longer exists, because the two
functions given there do not have the same one-sided transform.
The use of residue theory in connection with Eq. 3.3-4 provides a
powerful and relatively simple method of finding /(/). As shown in
Section 3.7, the use of Eq. 3.3-4 in the one-sided transform yields a
function of time that is zero for t < 0, and that is continuous except for
possible singularity functions for t > 0. Thus FF~Y [l/Os + D] is identified
as the function f\{t) in Fig. 3.4-1, rather than f2(t) or /3(f).
The functions encountered in system theory often have discontinuities
at t — 0. It is usually convenient to define the value off(t) at t = 0 to be
the limit of f(t) as t approaches zero through positive values. This limit
is symbolized as/(0+), and for the function fx(t) in Fig. 3.4-1 has a value of
unity. Although it is not necessary to adopt this convention, most
engineers do so, for it leads to the fewest difficulties. Consistent with this
convention, and with the fact that only the response for t > 0 is usually
desired, the definition of the direct Laplace transform can be written as

R > 0, A > 0 (3.4-1)


R-* oo JA
A->0

Other Properties of the Laplace Transform

Table 3.4-1 lists the most useful properties of the Laplace transform.
Proofs in most cases follow directly from Eq. 3.4-1 and can be found in
most references.4 Equations 3.4-2 through 3.4-7 are the basis for the
solution of fixed integral-differential equations by means of the Laplace
transform, as discussed in the next section. Equations 3.4-8 through
3.4-12 are useful in finding the direct and inverse transforms of given
functions.
Example 3.4-1. Find if-qi/U + 2)2].

if-1 — t for t >0


s

by Table 3.3-1. Using Eq. 3.4-8 with a = 2

if-1 = te~2t, t > 0


(s + 2)
114 Transform Techniques

Table 3.4-1

Equation
Property Number

+mo] = n/m + n/2( oj = w + f2(*) (3.4-2)

^[fl/(OJ = «^[/(0] = oF(j) (3.4-3)

r</ i
if = JF(J) -/(0+) (3.4-4)

if = 5wF(1y) — y(0+)

df
(3.4-5)
's i(0+) ^(0+)
H?)
if /(t) flfr (3.4-6)
5
F(s) /~1(0+)
if /(r) dr
I —CO 5 5

where f 1(0 +) = lim (3.4-7)


i—0+
•^[^‘/(Ol = F(s + a) (3.4-8)

mm] = - js f(s) (3.4-9)

• 00

y«r
se F( r ) dr (3.4-10)

&[f(t - a)U_x{t - a)] = e~asF(s) (3.4-11)

if = a F(as) (3.4-12)
/(i
a,

if Ad - W) dx = Fx{s)Fls) (3.4-13)
L Jo
1 Cx+j oo
nfiiOMO] = r-. F1(w)F2(s — w) dw
^‘Trj J x— j co
where w = x + jy, and where x must be greater
than the abscissa of absolute convergence for fx(t)
over the path of integration (3.4-14)
lim/(7) = lim sF(s) provided that the limit exists (3.4-15)
►0 s—*- co

lim/(r) = lim sF(s) provided that sF(s) is analytic


<—► co s—*-0
on the jm axis and in the right half of the s-plane. (3.4-16)
Sec. 3.4 Properties of the Laplace Transform 115

Example 3.4-2. Find SF\t sin r] and JS?[(sin t)/t].


1
JF[sin t) =
S2 + 1

by Table 3.3-1. Using Eq. 3.4-9,


2s
T£[t sin t] --
as s2 + 1 (s2 + l)2
Using Eq. 3.4-10,
' oo
sin t oo
se dr = [tan-1 r] s
r2 + 1
7T
=-tan-1 s = tan-1
2

Example 3.4-3. Figure 3.4-2 shows a periodic function of time fit), with period
T. Let fit) denote the first cycle only,/2(0 the second cycle only, etc. If
PT

Fi(s) = ^[/x(01 = f(th~st dt


Jo

fit)

find an expression for


1 00
F(s) = 5f[f{t)] = I f(t)e-* dt
Jo

Since
fit) = fiit) + fit) +/3(0 + '
ana since
fit) = fit - T)U_iit - T)
UO = fit - 2T)U_iit - IT)

Eq. 3.4-11 yields


Fis) = Flis) + e~TsFiis) + e-^’Ffs) +
= Fiis)[ 1 + e~Ts + e~2Ts + - • •]
Since
1
= 1 + X + X2 + for Id < 1
1 — X

the expression for Fis) may be written

Eiis)
Fis) = .-Ts
for Re s >0
1
116 Transform Techniques

This result is very useful when finding the response of a system to a periodic function.
Unlike the Fourier series method, both the steady-state and the transient response
may be found in closed form.5

Equation 3.4-12 is helpful in an understanding of time and frequency


scaling. Equations 3.4-13 and 3.4-14 are convolution formulas, and they
emphasize the fact that
^[/i(0/2(0] *

The last two properties in Table 3.4-1 are known as the initial and final
value theorems, respectively. The meaning of the word analytic is dis¬
cussed in Section 3.6.

Impulses in f(t)

The presence of impulses in f{t) introduces some difficulty in the


interpretation of Laplace transform theorems. As discussed in Section
1.4, impulses are not functions in the usual mathematical sense. The
uniqueness property previously discussed is not valid if there are impulses
in one of the two functions at the “denumerable set of points.” This does
not cause confusion, however, because it is easy to recognize whether or
not the inverse transform contains impulses. Perhaps the greatest con¬
fusion occurs when dealing with impulses at the time origin. The entry
in Table 3.3-1 stating that JF[U0(t)] = 1 assumes that the impulse occurs
immediately after t — 0. The transform of an impulse occurring im¬
mediately before t = 0 is zero. These statements are consistent with
Eq. 3.4-4. Since the unit impulse is the derivative of the unit step, the
transform of an impulse occurring immediately after t = 0 is given by
FF[U0(t)] = s{\js) — 0 = 1; for an impulse immediately before t — 0,
&[U0(t)\ = s(l/s) -1=0.
In the analysis of systems, either viewpoint yields correct results if
consistently used. An impulse serves to instantaneously insert stored
energy into the system. If JF[U0(t)] is taken to be unity, which is the usual
practice, the initial stored energy in the system should be calculated
excluding the effect of the impulse. A similar viewpoint is adopted for
higher order singularity functions occurring at the origin. FF[Un{t)] = sn,
and the initial stored energy is calculated excluding the effect of the
singularity function.

Partial Fraction Expansions

In many cases, F(s) is the quotient of two polynomials with real co¬
efficients. In such cases, the inverse transform can always be found from
Sec. 3.4 Properties of the Laplace Transform 117

Table 3.3-1, using a partial fraction expansion.4 Suppose that

P(s) P(s)
m = (3.4-17)
Q(s) (s - - sr+1) — sn)

where the denominator polynomial Q{s) has distinct roots ^r+1, . . . , sni
and a root sx which is repeated r times. If (2(X) is of higher order than
-P(V), then
Kr_ i K„
Hs) = + H-- +
(S ~ S1) (s - sf)' (s - sj
K r+l Kn
+ + +
S — S r+l S -
where
Kt = [(j - O' = r + 1, . . . , n)
Kr = [(5 - s.fFis)]^
d
Krl = (s - sf)rF(s)
_ds -1 S=Sl
and, in general,

1
Kr-k = (s - s1)rF(s) (k = 0, 1, 2, . . . , r — 1)
kl [_dsl S=S1

Once the partial fraction expansion has been determined, the inverse
transform of each term follows from Table 3.3-1.
Example 3.4-4. Find the inverse transform of

s3 + 1
F(s) =
S3 + 5
Since the denominator polynomial is not of greater order than the numerator poly¬
nomial, a preliminary step of long division is necessary.

F(s) = 1 + s(s2 ++ 11)


The second of the two terms can be expanded in partial fractions as above, yielding

1 (■\/2i2)e~j7T/4 (V2/2)ejV/4
F(s) = 1 + -s
s-J S+j
Using Table 3.3-1 and Eq. 3.2-5,

fit) = U0(t) + — v 2 cos

A partial fraction expansion of F(s) should be a mathematical identity


for all values of 5. The correctness of a particular expansion can be easily
checked by recombining the individual terms over a common denominator.
118 Transform Techniques

The expansion in the last example would certainly not be correct without
the 1 obtained by long division, since F(s) approaches unity for large
values of s. Whenever the order of the numerator equals or exceeds that of
the denominator, long division terms are needed to describe properly the
function for large values of s.

The Relationship between the Fourier and Laplace Transforms

A further comparison of the Fourier and one-sided Laplace transforms


is interesting, partly because tables for one of the transforms can then be
applied to the other. Consider a transformable function f{t), with
JF[f(t)] = F(s), and f{t)] = G(co). How can the direct Fourier trans¬
form be found from a table of Laplace transforms? To answer this
question, consider the following three cases. If f(t) = 0 for t < 0,
G(co) = F(jco). If fit) = 0 for t > 0,

Letting y = — t,
r oo
G(a>) = f{—y)ea>y dy
00
dt = FR(-jco)

where fR{t) = /(—t) and is the original function reflected about the
vertical axis. If f(t) is nonzero for both positive and negative values of
time, it can be decomposed into two parts, each of which falls into one of
the two previous cases.
Example 3.4-5. Find the Fourier transform of the function shown in Fig. 3.4-3#,
given by

Let
m =m + m
where the last two functions are shown in parts (b) and (c) of the figure.

Finally,
2
G(co) = Giiw) + G2(co) =
w2+ 1
Sec. 3.5 Application of the Laplace Transform to Time-Invariant Systems 119

fit)

hit) f2R(t)

Fig. 3.4-3

3.5 APPLICATION OF THE LAPLACE TRANSFORM


TO TIME-INVARIANT SYSTEMS

There are two basic methods of analyzing linear, time-invariant systems


using the Laplace transform. Such systems can be described by a set of
integral-differential equations with constant coefficients. These equations
can be transformed into algebraic equations by Eqs. 3.4-2 through 3.4-7.
This approach is illustrated by the following example. A second method,
associated with the system function concept, is described later in this
section.

Example 3.5-1. Find an expression for the response eft) for t > 0 in the circuit of
Fig. 3.5-1. The source eft) is a known function of time, and the current iL and voltage
ec are assumed to be known at the instant t = 0.
The circuit is described by the two loop equations

+ 3/2 + 2
/ h^ = 0
120 Transform Techniques

112 ec(t)
o
A
if
2h
112 S2 (t)

212

o
Fig. 3.5-1

Let Ifs), I2(s), and Efs) stand for the transforms of 4(0, 4(0, and eft), respectively.
Transforming the differential equations term by term yields the following algebraic
equations.
(25 + 3)4(0 ~ (25 + 2)4(5) = Efs) + 214(0+) - 4(0+)]

— (25 + 2)4(0 + I 25 + 3 H—14(0 —-4 40+) — 2(4(0+) — (2(0+)]

Note that
4(0+) — 4(0+) — 4(0+)
and
‘t
24-40+) = 2 lim 4(0 dr = ec(0+)
<--o+ ■ 00

Since 4(0 and 4(0 are the only unknowns, the two algebraic equations may be solved
simultaneously for 4(0- There results

(252 + 2s)Efs) - 2/z(0+)5 - (25 + 3)er(0+)


Efs) = 44) =
452 + 95 + 6

If the circuit contained no stored energy at t = 0, 4(0+) = +(0+) = 0, and

2s2 + 25
eft) JS?-1 Efs)
452 + 95 + 6

The terms involving initial stored energy summarize the history of the
circuit for t < 0. Such terms frequently occur when a switching operation
is performed at the reference time t = 0, adding or disconnecting elements.
It is best to transform the differential equations immediately, rather than
first differentiating one of them, or solving them simultaneously. If the
equations are transformed immediately, the 0+ terms are always directly
related to the initial stored energy, and they vanish if the system contains
none.

The Impulse Response

The Laplace transform offers a convenient method of finding the impulse


response from the system’s differential equation.
Sec. 3.5 Application of the Laplace Transform to Time-Invariant Systems 121

Example 3.5-2. A system is described by the differential equation of Example 2.8-1.

d%V . i dy , o 1 dv ,
dt2 dt 2 dt

Determine its impulse response.


Assume that the system is initially at rest, and that v is a unit impulse occurring
immediately after t = 0. Then
dy
(0+) =2/(0+) =0
dt
and the transformed equation is

s2Y(s) + 3 sY(s) + 2 Y(s) = - + 1

so
Ks + 2)
Y(s) =
s2 + 3s + 2 s + 1
and
2/(0 = *e-*

This answer agrees with the impulse response found in Example 2.8-1.
Alternatively, if v is a unit impulse occurring immediately before t = 0, the transform
of the right side of the differential equation is zero, but (dy/dt)(0+) = — £ and 2/(0+) =
£, as in Example 2.8-1. The complete transformed equation is

s2Y(s) - | + 1 + 3sY(s) - 1 + 2T(0 = 0

which gives the same result for y(t) as before. This alternative approach is not recom¬
mended because of the difficulty of finding the necessary initial conditions.

Systems Containing No Initial Stored Energy

The assumption of no initial stored energy within the system is one that
is made throughout much of this book. It is not necessary in such cases
to write differential equations describing the system first and then trans¬
form them. Consider the typical electrical elements shown in Table 2.6-1.
When the element has no stored energy, the ratio E{s)jl{s) — Z(s) is the
simple algebraic quantity given in the last column. Each element can
therefore be characterized by an impedance Z(s), which is not a function
of voltage, current, or time. Thus, when elements are interconnected,
the rules for d-c and a-c steady-state analysis carry over to Laplace
transform analysis. The relationship between the transformed output
and input can be found by the solution of algebraic equations.
Example 3.5-3. Find an expression for e2(t) for / > 0 in the circuit of Fig. 3.5-1,
assuming that there is no initial stored energy.
The circuit is redrawn in Fig. 3.5-2, each element being labeled with its impedance,
and transformed voltages and currents being used. Any method from d-c circuit theory
122 Transform Techniques

2/s

Ei(s)

may be applied. Perhaps the easiest is to write a single node equation at node 3.

1 1
Efs) 1 + —-: + = E x(s)
2s T 2 1 + 2/s
or
(2s + 2)0 + 2)
EAs) =-EAs)
J 452 + 95 + 6 '
Then
1
E‘2 0) =
1+2/5
Efs)

or
2s2 + 2s
e2(t) = Efs)
4 s2 + 95 + 6

agreeing with the result of Example 3.5-1.


This algebraic method of solving for the transform of the output may be extended to
systems with initial stored energy. Any initial energy may be represented by added
sources, as discussed in Section 1.4. Figure 1.4-6 shows how this may be done for
energy stored in a capacitance or inductance. When this has been done, each passive
element may again be characterized by its impedance.

The System Function H(s)

There is a direct relationship between the input v(t) and output y(t)
for a linear system with no initial stored energy. The system function
(or “transfer function”) of a fixed system is defined as the ratio of the
transformed output to the transformed input.

H(s) =
r(s) (3.5-1)
V(s)
or
2/(0 = Se-i[V(s)H(s)] (3.5-2)

assuming that there is no initial stored energy. A review of Section 2.6


indicates that the Z(s) and H(s) of that section are the same impedance
Sec. 3.5 Application of the Laplace Transform to Time-Invariant Systems 123

and system functions defined in this section. Recall that H(s)est is the
forced or particular integral response to the input est. From Eqs. 2.6-21
and 2.6-22,
B(s) yp(t)
H(s) = (3.5-3)
-40) /
v(t)=e
st

where A(s) and B(s) are formed from the operators A(p) and B(p) in the
differential equation
Mp)y(t) = B(p)v{t) (3.5-4)

Thus H(s) can be written down by inspection of the system’s differential


equation. For the system of Example 3.5-2,
1
H(s) =
is + 1 2

s2 + 3s + 2 S + 1

which the reader will recognize as the transform of the impulse response.
The use of Eq. 3.5-2 constitutes the same basic three-step procedure
used for the complex Fourier series and the Fourier integral. The input
function is transformed (this time into the complex frequency domain)
and is then multiplied by the system function. The resulting transformed
output is then converted back to the time domain. In fact, when s is
replaced by jo) in the system function, H(jco) is the same system function
that was used with the Fourier series and integral. From Eq. 2.6-23,

H{jco) = \H(ja>)\ (3.5-5)


- v(t) -
The forced response to v(t) = cos cot is yP(t) = \H(jco)\ cos (cot + 6).
Another useful interpretation of the system function results when
Eq. 3.5-2 is used for the special case of v(t) = U0(t), y(t) = h(t). Since
s?[u0(t)] = l,
h(t) = .Sf-'IffO)] (3.5-6)

Thus the inverse transform of the system function is the impulse response.
This is one reason for using the symbol h(t) for the impulse response.
Similarly, using Eq. 3.5-2 with v(t) = U_±(t) and y(t) = r(t),

(3.5-7)

Example 3.5-4. Solve Example 1.6-3 by use of the Laplace transform. The example
considers n identical amplifiers in cascade, as in Fig. 3.5-3, each having the step response
r(t) = —Ae~t,RC for t > 0.
Since
—A
s + 1 IRC
124 Transform Techniques

En(s)

Fig. 3.5-3

the system function for each amplifier is


As
s + 1 IRC

Assuming that the individual amplifiers are not loaded down by the presence of suc¬
ceeding stages, the system function for the combination is

En{s) E2(s) E3(s) E„(s) ■As


Efs) Efs) Efs) En^(s) s + 1 IRC

The step response of the combination is

1 / —As
s \s T 1 /RC/
The inverse transform is6
j n^Hn~ 1 )\(-t/RC)k
(-A)ne~tlRC
£i(n - k -1)!(*!)2J
agreeing with the result given in Example 1.6-3.

The Convolution Integrals

By Eqs. 3.5-2, 3.4-13, and 3.5-6,


rt
2/(0 = se-'ivisWis)] = - 2) dA
= I v(t - 2)/?(2) dX (3.5-8)

This result is identical with the convolution integrals of Eqs. 1.6-2 and
1.6-4 for nonanticipatory systems with v(t) = 0 for / < 0. Also,

H(s)
y(t) = J?-1 sK(s)
s J
The inverse transform of the last factor in the brackets is r(t) by Eq. 3.5-7.
By Eq. 3.4-4,

y-'lsFfs)] = —
dt
ifr(0+) = 0. Using Eq. 3.4-13,

2/(0 = P^rP r(t - X) dX = f ~ .A) r(X) dX


dX 'o d{t — 2)
Sec. 3.6 Review of Complex Variable Theory 125

If t>(0+) 0, the response to the step function y(0+)C/_1(t) must be added


to the result above, giving

2/(0 = v(0+)r(t) + r(t - X) dX


Jo dA

= v(0+)r(t) + f dff - ,A) r{X) dX


Jo d(t — A)
(3.5-9)

Since v(t) is regarded as zero for t < 0 in the one-sided transform, i>(0+)
is the size of a discontinuity at the origin. Then dv(A)/dA will have an
impulse of value y(0+) at A = 0. To be consistent with the definition of
the one-sided transform in Eq. 3.4-1, however, Eq. 3.5-9 should more
properly be written

y(t) = v(0+)r(t) + fJo+ dA


r(t - A) dA

so that such an impulse does not contribute to the integral. Equation


3.5-9 is equivalent to the convolution integrals of Eqs. 1.6-3 and 1.6-5 for
nonanticipatory systems with v(t) = 0 for t < 0. It must be remembered
that an impulse in dv{t)!dt at the origin is to be included in the integrand
of Eqs. 1.6-3 and 1.6-5, but not in the integrand of Eq. 3.5-9. If the reader
is troubled by impulses in dv{t)jdt, he may wish to review the discussion
associated with Eq. 1.5-9.

3.6 REVIEW OF COMPLEX VARIABLE THEORY

There are many readable textbooks devoted to the subject of complex


variables.7 This section presents, without detailed proofs, only the barest
skeleton of the part of the theory needed for an understanding of transform
techniques. Readers who have not previously been exposed to complex
variable theory will probably need to consult one of the references.

Functions of a Complex Variable

Since the theory is applied to the Laplace transform, the independent


variable is taken to be
5 = a + jco (3.6-1)

where a and a> are the real and imaginary parts of 5, respectively. As for
any complex quantity, values of 5 can be represented graphically in a
complex plane, shown in Fig. 3.6-1. It is sometimes convenient to think of
an “extended s-plane” as the surface of a sphere of infinite radius, with the
126 Transform Techniques

« (imaginary axis)
A

s-plane

cr (real axis)

Fig. 3.6-1

point at infinity diametrically opposite the origin, but such an interpreta¬


tion is not used in this book.
The dependent variable is denoted by

F(s) = R+jX (3.6-2)

R and X are the real and imaginary parts, respectively, of F(s), and they
are real functions of the two real variables a and co.

Analytic Functions

The derivative of F(s) is defined as

dF(s) = limAF(s) = Hm F(s + As) - F(s)


(3.6-3)
ds As->0 As As->0 As
As represents a small distance in the s-plane and may be taken in any
arbitrary direction. The usefulness of the derivative concept is greatly
increased if the derivative turns out to be independent of the direction of
As. Equation 3.6-3 is independent of the direction of As at a point s0,
if and only if
dR _ dX
(3.6-4)
da dco
and
dR _ _ dX
(3.6-5)
do da
and if F(s) and these partial derivatives are continuous in some neighbor¬
hood of s0. The last two equations are known as the Cauchy-Riemann
conditions for a unique derivative.
A function F(s) is said to be analytic at a point in the s-plane if and only
if it is single-valued and has a finite and unique derivative at and in the
Sec. 3.6 Review of Complex Variable Theory 127

oo

neighborhood of that point.| The vast majority of the commonly en¬


countered functions are analytic except at a finite number of points. The
quotient of two polynomials in 5, for example, is analytic for all finite
values of 5 except where the denominator becomes zero. All the usual
rules of differentiation hold for analytic functions.

Integration in the Complex Plane

Integration along a contour C in the 5-plane (called a contour integral)


can be interpreted in terms of real integrals by writing

F(s) ds = (R T j X) {d(7 -f- j dco)

(R da — X dco) + j (X da + R dco)
Jc c
The values of a and co are related by the contour along which the integra¬
tion is carried out.
A contour integral of F(s) between two fixed points in the 5-plane is not,

in general, independent of the path taken. In Fig. 3.6-2, F(s) ds and


r Jc,
F(s) ds are not necessarily equal. If the path of integration is a closed
Jc2
contour, this fact is indicated by a circle superimposed upon the integral
sign. An arrow is sometimes shown on the circle to indicate that the
integration is to be carried out in a specific direction, either clockwise or
counterclockwise. The sign of the integral is positive if the direction of the
integration is such that the area enclosed lies to the left of the contour.

f The phrase “analytic at and in the neighborhood of a point in the 5-plane” is more
correct mathematically, but it is also more cumbersome.
128 Transform Techniques

In Fig. 3.6-2,
(j> F(s) ds = F(s) ds — F(s) ds

The sign of the C2 integral is negative, since reversing the direction of


integration along a path reverses the sign of the contour integral. Since
the integrals along contours C1 and C2 are not necessarily equal, the
integral around a closed path is not necessarily zero.
Two theorems attributed to Cauchy lay the foundation for the practical
evaluation of contour integrals. The first theorem states that, if F(5) is
analytic on and inside a closed contour C in'the 5-plane,

F(s) ds = 0 (3.6-6)
'c
A corollary to this theorem is that

F(s) ds = F(s) ds (3.6-7)


Ci JCz

in Fig. 3.6-2, if F(s) is analytic on and between the two paths. Another
corollary deals with multiple connected regions. In Fig. 3.6-3a, F(s) is
assumed to be analytic on the closed contours C, C1? and C2, and in the
shaded region R. It is not, however, necessarily analytic inside Cx and C2.
In Fig. 3.6-3b, cuts labeled C3 through C6 are constructed so as to give
a simply connected region. The distance between corresponding ends of
C3 and C4, and also C5 and C6, is assumed to be infinitesimal. By Cauchy’s
theorem.
f
1C3
ra
e
Cs) ds —

J Ci
r%
F(s) ds

F(s) ds + F(s) ds
/c. Cs

r
/%

F(s) ds F(s) ds = 0
C2 kJ Ce
CO co
A C

(a)
Fig. .6-3
Sec. 3.6 Review of Complex Variable Theory 129

since F(s) is analytic on and inside the closed contour formed by the sum
of these individual contours. By Eq. 3.6-7, the integration along C3 and
C4, and also along C5 and C6, cancel, so

F(s) ds = (p F(s) ds + C> F(s) ds (3.6-8)


Ci J c2

A second theorem by Cauchy states that, if F(s) is analytic on and inside


a closed contour C, and if 50 is any point inside C,

F(so) = -- <f ds (3.6-9)


2ttj J c s — s0
and
dnF n F(s)
n
= — (>
n+1
ds (3.6-10)
ds s=s0 2rrj c(s - s0)
Since the integrands are not analytic at the point s0, these equations can be
used to evaluate integrals that cannot be handled by Eq. 3.6-6. Their
primary importance, however, is in the derivation of the Taylor and
Laurent series and the residue theorem. Note also that Eq. 3.6-9 leads
to the surprising conclusion that, if a function F(s) is defined on a closed
contour and is analytic on and inside that contour, then the value of the
function is automatically determined at all interior points.

Taylor’s Series

If F(s) is analytic at a point s0, then it can be represented by the following


infinite power series in the vicinity of

F(s) = b„ + b^s - s0) + b2(s - 50)2 + • • • + bn(s - s0)n H-


(3.6-11)
where
b0 = F(s0)
1 \ dnF(s) (3.6-12)
bn
nil dsn
The form of Eq. 3.6-11 is seen to be the same as the Taylor series for a
function of a real variable. If s0 happens to be zero, the series is called
Maclaurin’s series.
Taylor’s series converges to F(s) for all points in the 5-plane inside a
circle of convergence. This is the largest circle that can be drawn about
50 without enclosing any points at which F(s) is not analytic. For all
points outside the circle of convergence, the series diverges.
The coefficients in Taylor’s series can always be evaluated by Eqs.
3.6-12, since all the derivatives of an analytic function exist. If F(s)
130 Transform Techniques

and the first k — 1 derivatives are zero at 5 = 50, the first k terms in Eq.
3.6-11 are missing, and F(s) is said to have a zero of order k at s0.
A uniqueness theorem states that, if two power series represent F{s)
in the neighborhood of 50, then they must be identical. The coefficients
may therefore be found in any convenient way, and not necessarily from
Eq. 3.6-12. In the examples which follow, long division is used.
The uniqueness theorem of the previous paragraph leads to other useful
results. If a series representation of F(s) is valid in any arbitrarily small
region, it is the unique representation wherever it converges. Also, if an
analytic function is specified throughout any arbitrarily small region, it is
then uniquely determined throughout the entire 5-plane. The last state¬
ment is known as the principle of analytic continuation.

Example 3.6-1. Expand F(s) = 1/(5 + 1) in a power series about 5 = 1, i.e., in powers
Of (5 — 1).

By long division,

1 1 1
- (5 - 1) + - (5 - l)2 - — (5 - l)3 + * • *
5 —f- 1 2 —(5 — 1) 2 4 8 16

CO
A
Circle of

Fig. 3.6-4

The same series results if Eqs. 3.6-11 and 3.6-12 are used with s0 = 1. The function
F(s) is analytic except at the point 5 = — 1, shown by the cross in Fig. 3.6-4. The circle
of convergence indicates that the series converges for \s — 1| <2.

Example 3.6-2. Expand F(s) = s/'[(s — 1)(5 — 3)] about 5=1.


Although F(s) is not analytic at 5 = 1, the function 5/(5 — 3) is analytic and may
be expanded in a Taylor series. By long division,

5 1 + (5 — 1) 3 3
5-3 -2 + (5 - 1)
Sec. 3.6 Review of Complex Variable Theory 131

which converges for Is — 1| <2. Dividing each term of the series by (s — 1),

0 - l)(s - 3) s — 1 4 8

which converges for 0 < |s — 1| <2.

Laurent’s Series

Even if F(s) is not analytic at the point 50, it can be represented by an


infinite series in powers of (s — ^0). In this case, however, the series
contains both positive and negative powers of (s — s0).

F(s) = b0 + - s0) + b2(s - s0)2 + • • •

+ 6_i(s - Sq)"1 + b_2(s - s0)-2 + * • * (3.6-13)


where

bk = — (j> f ds - for all k (3.6-14)


2ttJ Jc (s - s0)*+1

The hrst part of the series is known as the ascending part; the second part,
the principal or descending part. The series converges between two circles
of convergence, both centered at s0. F(s) must be analytic between these
circles, which are labelled Cx and C2 in Fig. 3.6-5. If there is an “isolated
singularity” of F(s) at s0, the circle C2 may shrink to infinitesimal size, and
the Laurent series then represents F(s) in the vicinity of s0. As discussed
later in this section, the coefficient b_x is particularly important and is
called the residue of F(s) at s0.
In the use of Eq. 3.6-14, the contour C may be any closed contour
between C1 and C2, shown in Fig. 3.6-5. Since this equation is difficult
to evaluate, however, the coefficients in Laurent’s series are normally

Fig. 3.6-5
132 Transform Techniques

found by some other means. Any convenient method may be used, be¬
cause the representation of F(s) by a series in powers of (5 — s0) in any
given region is unique. Example 3.6-2 is an example of a Laurent series,
where C2 is arbitrarily small, and Cx is a circle of radius just under 2.

Classification of Singularities

Singularities of F(s), also called singular points, are points in the 5-plane
at which F(s) is not analytic. If nonoverlapping circles, no matter how
small, can be drawn around each singular point, the points are called
isolated singularities. The function F(s) = (5 + l)/[s3(s2 + 1)] has
isolated singularities at 5 = 0, -f-j, and —j. The function F(s) = 1/
[sin (77/5)] has isolated singularities at 5 = 1, J, . . . , but a nonisolated
singularity at the origin. Fortunately, the commonly encountered
functions have only isolated singularities. The reader should remember
that F(s) can be represented by a Laurent series in the vicinity of every
isolated singularity.
An isolated singularity is classified further by examining the Laurent
series written about it. If the principal or descending part of the series has
an infinite number of terms, the singularity is called an essential singularity.
Otherwise, the singularity is called a pole. The order of the pole is equal
to the number of terms in the principal part of the series. If F(s) has a
pole of order n at 50, then there are no terms in the principal part of the
power series for (5 — s0)nF(s), so that (5 — s0)nF(s) is analytic at 50.

The Residue Theorem

Letting k — —1 in Eq. 3.6-14 gives

F(s) ds = 277jh_1 (3.6-15)


c
where is the residue of F(s) at s0. The contour of integration is shown
in Fig. 3.6-5 and is understood to enclose no singularities other than an
isolated singularity at s0. If 50 is not a singular point, then b_x is zero, and
this equation reduces to Eq. 3.6-6. In Example 3.6-2, if C is a circle of
radius 2 about the origin,

C> F(s) ds = 2tt/’(-J) = -7rj


3 C

Suppose the contour of integration encloses several isolated singular¬


ities. Since nonoverlapping circles can be drawn about each of them, the
integral along contour C can be replaced by the sum of the integrals
Sec. 3.6 Review of Complex Variable Theory 133

around the individual circles, as in Eq. 3.6-8 and Fig. 3.6-3. Applying
Eq. 3.6-15 to each individual circle,

F(s) ds = 2irj ^ residues (3.6-16)

where the summation includes the residues at all the singularities inside C.

The Evaluation of Residues

Because Equation 3.6-16 is very valuable in finding the inverse Laplace


transform, it is important to be able to calculate residues quickly and
easily. Residues can always be found by writing a Laurent series about
each singularity, and selecting the coefficient of the (s — s0)_1 term.
Several special formulas are useful, however, and are derived next.
If F(s) has a pole of order n at 5 = s0, the function g(s) = (s — s0)nF(s)
is analytic at s0 and may be expanded in a Taylor series about s0.

g(s) = g(s0) + g'(s0)(s - s0) + •

g<n> Oo) (s — s0)n +


+ (s - s0)n 1 +
-(» - 1)!J L n! J

where the primes and superscripts in parentheses denote differentiation


with respect to s. Then

g(s) -n+l
F{s) = n
= g(s0)(s - s0) n + g'(s0)(s - s0y
(s - So)
~g(n ^(Sp)' £(n)(s0)
+ (s - s0) 1 + + •
.(n - 1)!J L n

The last expression is the Laurent series for F(s), so the residue at s0 is

1 r dn~x
b-i = — (s - so)"F(s) (3.6-17)
{n - 1)! ds -1 S=S 0

For a first order pole, where n = 1, this reduces to

= [(s - s0)F(s)]s=So (3.6-18)

The next approach is valid only when F(s) can be expressed as

P(s)
F(s) = (3.6-19)
Q(s)
where both P(s) and Q(s) are analytic at s0, and where P(s0) ^ 0. If F(s)
has a first order pole at s0, then Q(s0) = 0, but Q'(s0) ^ 0. Writing a
134 Transform Techniques

Taylor series for both P(s) and Q(s) and carrying out the indicated long
division,
P(s0) + P'(s„)(s -$„) + ■■■
f(s) =
Q'(so)(s - So) + ^ <2"(s0)(s - S0)2 +

P(Sq)
0 - s0) 1 +
QXso)
The residue at 50 is
P(sp) '
b -1 =
(3.6-20)
QXs0)
This approach becomes cumbersome when applied to higher order poles.

3.7 THE INVERSION INTEGRAL

When the Laplace transform F(s) is known, the function of time can be
found by Eqs. 3.3-2 and 3.3-4, rewritten below.

f{t) = — lim
277j R-* oo Jc—jR
P F(s)€s< ds (3.7-1)

As this equation applies equally well to both the two-sided and the one¬
sided transform, it is convenient to treat them together.
Recall from Section 3.3 that the defining equation for the direct trans¬
form F(s) converges only for certain values of <r, i.e., only within a certain
region of the 5-plane. Using the principle of analytic continuation, how¬
ever, this is sufficient to uniquely define F(s) throughout the entire 5-plane,
except at the singular points. Since the factor est is analytic throughout
the entire finite 5-plane, the function F(s)est can be integrated without
difficulty along any path that does not include any singularities of F(s).
If the concept of an extended 5-plane mapped on the surface of an infinite
sphere is used, the point at infinity must be avoided, since est has an
essential singularity there. In the application of Eq. 3.7-1, a semicircular
detour could be made around the point at infinity. An easier interpretation
of Eq. 3.7-1, however, is to consider only the finite 5-plane, and to apply
the limiting process of R —► oo after the integration has been carried out.
This is the procedure that is followed here.
In Section 3.3, it was indicated that the path of integration in Eq. 3.7-1
is restricted to values of a for which the direct transform formula con¬
verges. In the case of the two-sided Laplace transform, the region of
convergence must be specified in order to uniquely determine the inverse
Sec. 3.7 The Inversion Integral 135

co
A
E D

V 0
c
< >
B
H

G A

Fig. 3.7-1

transform. In the two-sided transform, the regions of convergence for


functions of time that are zero for t < 0, or zero for t > 0, or that fall
into neither category, have the form a > ac, a < ac, or aCi < a < ac2,
respectively. In the one-sided transform, the region of convergence is
always given by o > crc, where ac is called the abscissa of absolute con¬
vergence.
From the point of view of rigorous mathematics, Eqs. 3.3-1 and 3.3-3
should be taken as the formal definition of the Laplace transform. The
correctness of Eq. 3.7-1, and the above restriction on the path of integra¬
tion should then be proved. The proof, in essence, shows that, if Eqs.
3.3-1 and 3.7-1 are applied consecutively to a function f(t), the end result
is the same function f{t) at all points of continuity.8 For a discontinuity
at the origin, the result is J[/(0+) + /(0—)]. In the one-sided transform,
the consecutive application of Eqs. 3.3-3 and 3.7-1 yields f{t) for t > 0,
zero for t < 0, and 0+) for t = 0.
The path of integration in Eq. 3.7-1 is usually taken as a straight vertical
line, denoted by ABD in Fig. 3.7-1. EFG and DHA are semicircles, whose
radius will later become infinite. Since the direct integration of F(s)est
along ABD is difficult, another procedure is normally used. For most
commonly encountered functions,

00 JDEFGA
for t > 0, and

lim F(s)est ds = 0
R^oo Jdiia

for t < 0. In this event, path ABD may be changed to ABDEFGA for
136 Transform Techniques

t > 0, and to ABDHA for t < 0. Using the residue theorem of Eq.
3.6-16,
1 1
f(t) = — lim O' F(s)est ds
2irj R-+co Jabdefga

[residues of F(s)e at the singularities


= 2 to the left of ABD] for t > 0
(3.7-2)

i
m lim O F(s)est ds
2rrj r -* 00 jABDHA

[residues of F(s)est at the singularities


= -2 to the right of ABD] for t < 0
(3.7-3)

It is necessary to know when Eqs. 3.7-2 and 3.7-3 may be substituted


for Eq. 3.7-1. The following theorem helps to answer this question.9

Theorem 3.7-1. If 5 is replaced by the real quantity R, and if lim F(R)


R-* co
approaches zero at least as fast as N/R, where N is a finite number,
then Eqs. 3.7-2 and 3.7-3 are valid.! Note that this theorem includes the
special case of the quotient of two polynomials, provided that the order
of the denominator exceeds the order of the numerator.

In using Eqs. 3.7-2 and 3.7-3, the residue formulas developed in Section
3.6 prove helpful. For a pole of order n at s0, the residue of F(s)est is

1 d n—1 st
n-
~(s~ so)nF(s)< (3.7-4)
(n- 1)! .ds s=so

For a first order pole at s0, the residue of F(s)est is given by

[(5- - s0)F(sy%=So (3.7-5)


or
P(s> .St
(3.7-6)
_(d/ds)Q(s) J S=So
where F(s) = P(s)IQ(s), and P(s) and Q(s) are analytic at s0.
Example 3.7-1. F(s) = — l/b2(s — 1)]. The region of convergence is given by
0 < o < 1 and is shaded in Fig. 3.7-2. Find f(t).

f The theorem as stated is really unnecessarily restrictive. For example, Eqs. 3.7-2
and 3.7-3 are valid if F(s) is any meromorphic function (the ratio of two functions that
are analytic throughout the finite s-plane) that approaches zero uniformly as \s\ co.
If there is ever any doubt, Eqs. 3.7-2 and 3.7-3 may be used formally to find a function
of time. If the transform of this answer is found to be the original F(s), then the validity
of the answer is established.
Sec. 3.7 The Inversion Integral 137

co

Since Theorem 3.7-1 is satisfied, Eqs. 3.7-2 and 3.7-3 may be used. For t > 0,
fit) = [residue of F(s)est at 5 = 0]

= t + 1
s=0
For t < 0,
fit) = —[residue of Fis)est at 5 = 1]

The complete function is shown in Fig. 3.7-3. The region of convergence in this example
indicates that the two-sided Laplace transform must have been used in obtaining F(s).
Poles to the left and right of this region yield, respectively, the components of /(/)for
positive and negative time.
fit)

Fig. 3.7-3
138 Transform Techniques

Example 3.7-2. F(s) = (s + tf)/[0 + a)2 + /?2] = 0 4- a)/[(s + a — jP)(s -+■ a + j(3)].
If the region of convergence is given by o > —a, find f(t).
Equations 3.7-2 and 3.7-3 may again be used. For t > 0,

fit) =2 [residues of F(s)est at s = —a ± j/3]

(s + a)est (5 + a)est
+
_s + a + jfj_ s=-a+jfi _s + a — jfj_

= a+JP)t -|- ^e( a ofi)t — €-at cos fit

For t < 0, f{t) = 0, since there are no singularities inside path ABDHA in Fig. 3.7-1.

When the one-sided Laplace transform is used, the region of convergence


is given by a > crc, as can be expected from Eq. 3.3-1. The abscissa of
absolute convergence ac must be to the right of all the poles of F(s).
Otherwise, the region of convergence would include a singular point,
at which F(s) certainly does not exist. For this reason, a statement of the
region of convergence is normally omitted in the one-sided transform.
Also for this reason, the inversion integral always yields /(f) = 0 for
t < 0. The rest of this section is restricted to the one-sided transform.
For the case where F(s) is the quotient of two polynomials, with the
numerator of lower order than the denominator, the residue and partial
fraction methods are essentially identical. Although the partial fraction
method is restricted to rational functions of s, the more powerful residue
method is not. Several typical examples involving nonrational functions
are now considered.
Example 3.7-3. Figure 3.7-4 shows an RC filter consisting of m identical T-sections
in cascade, excited by a unit step of voltage. Find the current ir(t) at any point on the
filter (r = 0, 1, 2, ..., m). Assume that there is no initial stored energy.

—WVl/wv --WVW-WV—AA^-t-vW-vVA-f-vW-^1
R/2 w* R/2 R/2 R/2 R/2

eX = U.x(t) ir-i Irn— 1


ri
cv
>
Fig. 3.7-4

Characterizing each element by its impedance, the transformed equation for the
rth mesh is
1 1
— Ir-i(s) +(/?-!—— ITC5)-t;T+i(A — 0
sC sC sC

This is a second order, algebraic difference equation. As discussed in Chapter 2, two


boundary conditions are needed. These can be obtained by examining the first and
last meshes.
R 1 1 1
Sec. 3.7 The Inversion Integral 139

By the method of Example 2.12-1, the solution of the difference equation, subject to the
boundary conditions above, is
C sinh [(m — r)0]
Ir(s) =
(sinh 0)(cosh md)

where cosh 0 = 1+ (RC/2)s. The hyperbolic functions are analytic for all finite values
of the complex variable 6. Whenever sinh 0 = 0, [sinh (m — r)0]/sinh 0 is finite, so the
only poles of Ir(s) occur when cosh md = 0. Letting 0 = a + y/3,

cosh md = cosh men cos m(3 + y sinh men sin mfi

This expression is zero only when m(3 = ±(2k — l)(rr/2) and men — 0, where k =
1, 2, 3, ... . The poles are at
.(2k - 1)tt-
»- ±) 2m

(2k — 1)77
cosh 0 = cos
2m
2 (2k - 1)77
j = — 1 + cos
RC 2m
The ± sign is not needed because cos (—<f>) = cos </>. Also, after k = m, the values of
5 in the last equation are repeated. Thus there are m poles, as might be expected from
the fact that Fig. 3.7-4 has m energy-storing elements. The poles lie on the negative
real axis of the j-plane between the origin and —4/RC.
The integral of F(s)est along path DEFGA in Fig. 3.7-1 can be shown to vanish for
t > 0 as R —00, so f(t) is the sum of the residues of F(s)est at the m poles. Define

P(s) = C sinh (m — r)d

Q(s) = (sinh 0)(cosh md)


Note that

dQ(s) dQ dd RC
-= —--= [(sinh d)(m sinh md) + (cosh 0)(cosh md)]
ds dd ds 2 sinh 0

At the poles, where cosh md = 0, and sinh md = j sin [(2k — \)(tt/2)] = y( —l)fc+1,

dQ , RCm\
ds s=s.
= ./(-Dfc+1(— J

The fact that the last expression is not zero indicates that the poles are of the first order;
hence Eq. 3.7-6 may be used.

(2k - 1)77 (2k - IV77


P(s0) = jC sin (m — r) = jC( — l)fe+1 cos
2m 2m
Then
7 m (2k — 1)77 ^[-i+cos(2-fc—1)77]f
COS
M " IS, fc=i
^ 2m
Example 3.7-4. Find

SF-
J2(52 + J + 1)
140 Transform Techniques

The procedure most commonly used is to find, in the normal manner, that

1 V3 77
se- — t — 1 H-- e-</2 cos — t H—
_s2(s2 + s + 1) V3 2 6,

for t > 0. Then, by Eq. 3.4-11,

V3 V3
JV- t — 2 H-- e <f-1)/2 COS — t--1- U^{t - 1)
S2(S2 +5+1) V3 \ 2 2 6,

which is the previous function of time shifted along tjie time axis one unit in the positive
t direction.
If, however, the inversion integral were to be used directly,

i -s(t-l)
m= 2-nj
ds
ABD s\s + 1 ~j)(s + 1 + j)

where path ABD in Fig. 3.7-1 passes to the right of all the poles.

:sU-l)
ds
s2{s + 1 — j){s + 1 +/)

would be expected to vanish along path DEFGA only for t — 1 > 0, and along path
DHA only for t — 1 <0, since the real part of s{t — 1) should be negative. The reader
should verify that this is indeed the case.

Therefore, for t < l,/(r) = 0, while for t > 1,

f(t) [residues of F(s)est at 5 = 0, —1 ± j]

-s(t-l) -s(t-l) -sU-1)


+ +
ds S2 +5+ 1 s=0 _s2(s + 1 +/)_ S=_1+J- L s'\s + 1 -/)_ s= 1 j

2 /V 3 V3 77 v
= t — 2 4-- e <( 1)/2 cos I — t-1-
v 3 \ 2 2 6/

Example 3.7-5. Find


The function F(s) = \/Vs is a double-valued function, as a result of the square root
operation. If 5 is represented in polar form as rej9, then rei(0+2^) js another acceptable
representation. However, YT+++277) = — V re;0, giving rise to two different values for
Vs. A double-valued function is not analytic, and the bulk of the theorems from com¬
plex variable theory do not apply.
In order to make the function analytic, arbitrarily restrict the angle of 5 to the range
— 77 < 0 < 77, and also exclude the point 5 = 0. This is done formally by the construc¬
tion of a branch cut along the negative real axis, as shown in Fig. 3.7-5. The end of the
branch cut, which in this case is at the origin, is called a branch point. A branch cut
can never be crossed, and so the branch cut ensures that —77 < 0 < 77, thereby making
F(s) single-valued. The basic inversion integral

f{t) = -L lim F(s)est ds


l7TJ it—>00 ABD
Sec. 3.7 The Inversion Integral 141

co
A

Branch cut
e <j

Fig. 3.7-5

still applies, but Fig. 3.7-1 must be modified, as shown in Fig. 3.7-6. By the residue
theorem.

F(s)est ds + F(s)est ds + I F(s)est ds + I F(s)est ds — 0


ABB DEF JFGHI JlJA

since no singularities are enclosed. For t > 0, it can be shown that the second and
fourth integrals vanish as R approaches infinity. Then

f(t) = — lim — F(s)est ds


R—>oo 2tt/ Jfghi
On the infinitesimal circle GH, let 5 = rej9 = r cos 6 + jr sin 6, and
-TT er(cos 0)<€jr(sin 0)t
F(s)est ds = -Jr*je dd
’GH on r'Ae*9/2

which vanishes as r approaches zero. On the straight line FG, let 5 = — u, sA = jiM,
and ds = —du, where u and iM are real, positive quantities. Note carefully that the
branch cut requires V — 1 = j and not —j for this path. Then
' oo ut
0 1 1
F(s)est ds — — du
FG oo juX/2 J Jo u
CO
142 Transform Techniques

Combining these results gives


1
/(') = u~M du
2 Vj

1
V nt

The substitution y = ut was made, and the evaluation of the last integral follows either
from the definition of the gamma function or from a standard table of definite integrals.
Examples of finding the inverse transform of double-valued functions when the path
shown in Fig. 3.7-6 encloses poles of F(s) are discussed in Reference 10.

An important application of the residue theorem is based upon the


complex convolution theorem of Eq. 3.4-14. If F^s) and F2(s) represent
the one-sided transforms offft) and /2(/), respectively, then

»)/.(«)] = f. f Fi(w)F2(s - w) dw (3.7-7)


Ittj jc

where w is a complex variable, and where s is treated as an independent


parameter in carrying out the integration. It is understood that the real
part of 5 must be large enough that all poles of F2(s — w) in the w-plane
are to the right of the poles of The contour of integration is any
path from c — jco to c + joo in the shaded strip of Fig. 3.7-7.
f This restriction on 5 ensures that the defining formula for the direct transforms of
MO, and MOffO converges.
Sec. 3.8 The Significance of Poles and Zeros 143

If the integral along the infinite semicircles DFA and DHA vanishes,
the contour ABD may be replaced by either of the two closed contours in
the figure. Then, by the residue theorem,
&mm\ = 2 [residues of F1(w)F2(s — w) at the poles of /^(w)]
= — 2 [residues of F1(\\ )F2(s — w) at the poles of F2(s — u’)]
(3.7-8)
Example 3.7-6. Find 3F[teat] by the lise of the complex convolution theorem.
If fiil) = t and f2(t) = eat, Eq. 3.7-8 gives

1
ST[teat] = residue of at w = 0
w2(5 — w — a)

1 1
dw\s — w — a, w =0 is ~ a)‘
Alternatively,
1
JT[teat] = residue of at w = s — a
w2(s — w — a)

1
_w j Hj=s_a is - a)‘

A very important application of Eq. 3.7-8 is in connection with the Z transform of


Section 3.11.

3.8 THE SIGNIFICANCE OF POLES AND ZEROS

Poles and zeros are defined in Section 3.6. In brief, poles and zeros of
F(s) are values of s for which 1 jF{s) and F(s), respectively, become zero.
Suppose that F(s) can be written as the quotient of two factored parts.

F(s) = K(S ~ Zl)(s ~ Zz)' • • (s ~ ^ (3.8-1)


(s - PjXs - ■ ■ ■ (s - pn)
Then z1 through zm are zeros of F(s), and through pn are poles. If the
factor (s — zf) or (s — p0) is repeated r times, F(s) has a zero or pole, respec¬
tively, of order r. If m < n, F(s) has a zero of order n — m at infinity.
When plotted in the s-plane, poles are denoted by crosses, and zeros by
circles. A second order pole or zero is denoted by two superimposed
crosses or circles, respectively. The pole-zero plot for
10Cs2 -2^ + 2)
F(s) =
s(s + l)2
is shown in Fig. 3.8-1. The fact that there are three poles and two zeros
in the finite s-plane indicates that F(s) has a first order zero at infinity.
The pole-zero plot completely describes Fis), except for the multiplying
constant K = 10.
144 Transform Techniques

co
A

1 o

-n-x ->- <T


-i 1

Fig. 3.8-1

The pole-zero plots of greatest interest are those representing inputs,


outputs, and system functions. For realizable signals and systems, the
inverse transforms of V(s), Y(s), and H(s) must be real functions of time.
This in turn requires that any complex zeros or poles must occur in
complex conjugate pairs. Furthermore, the residues of complex conjugate
poles must themselves be complex conjugates.
Each pole in F(s) gives rise to a term in the inverse transform f(t) =
1 [iF(j)]. The form of the term is completely determined by the location
and order of the pole, although the size of the term depends upon the
location of the other poles and zeros of F(s). Poles in the right half-plane
give rise to terms that increase without limit as t approaches infinity,
while left half-plane poles yield terms that decay to zero. This is true even
for higher order poles. A pole of order r at 5 = —a, for example, will
produce a tr^xe~at term in f(t), but lim tr~x<rat — 0. The distance of poles
t-+ oo
from the vertical axis indicates how fast the corresponding terms in f(t)
grow or decay as t increases. For first order poles in the left half-plane,
this distance is the reciprocal of the time constant. First order poles on
the imaginary axis yield sinusoidal terms of constant amplitude, but
higher order poles produce terms that increase without limit as t ap¬
proaches infinity. The distance of first order complex conjugate poles
from the horizontal axis equals the angular frequency of oscillation of the
corresponding terms in f(t).

The Free and Forced Response

The output of a fixed linear system initially at rest is given by Eq. 3.5-2,
repeated below.
y(t) = (3.8-2)
Sec. 3.8 The Significance of Poles and Zeros 145

The poles of V(s) depend only upon the input and give rise to the forced
response, which is the particular solution in the classical language of
Chapter 2. The poles of H(s) depend only upon the system and yield the
free or natural response. This is identical with the complementary
solution of Chapter 2, with all the arbitrary constants evaluated. Since
the size of the terms in the free response depends upon the poles and zeros
of both V(s) and H(s), the arbitrary constants depend upon both the input
and the system, as stated in Section 2.6.

Example 3.8-1 Find the output voltage e{t) if the circuit of Fig. 3.8-2 has
no initial stored energy, and if i(t) = (1 + sin t)U_ft).

Fig. 3.8-2

The system function is the input impedance

1 s
Z(s) =
2/s + 5 + 3 (s + DO + 2)

The Laplace transform of the input is

_ 1 1 _ 52 + 5 + 1

5 52 + 1 S(S2 + 1)
Hence
s2 -F 5 + 1
E(s) =
(s + 1)0 + 2)02 + 1)
Then
1
e(t) = &~l[E(s)] = - e-* ~ - e-2t + —= cos
2 VlO

The first two terms in the output constitute the free response and are produced by the
poles of Z(s). The last term is the forced response, produced by the poles of I(s) at
s = ±j. There is no term in the forced response due to the pole at the origin, because
it is cancelled by a zero of Z(s). This can be physically explained from the circuit by
noting that the inductance acts like a short circuit to the d-c component of the source.
In this example, the free and forced responses are the transient and steady-state com¬
ponents, respectively. This is always the case if H(s) has only left half-plane poles, and
if the input does not vanish as t approaches infinity.

Stability

Since SP-^His)] = h(t), the poles of H(s) determine the form of the
impulse response. Because an impulsive input simply inserts some energy
146 Transform Techniques

into the system instantaneously, the nature of h{t) indicates the stability
of the system. The system is unstable if h{t) increases without limit as t
approaches infinity. Thus a system is unstable if H(s) has poles in the right
half-plane or higher order poles on the imaginary axis. If there are first
order poles on the imaginary axis, the free response contains oscillations
of constant amplitude. Such a system is usually, but not always, con¬
sidered to be stable. If H(s) has only left half-plane poles, the system is
unquestionably stable. These conclusions agree with those reached in
Section 2.6. By Eq. 3.5-3, the poles of H(s) are the roots of the character¬
istic equation A(s) = 0.

3.9 APPLICATION OF THE LAPLACE TRANSFORM


TO TIME-VARYING SYSTEMS

Previous sections of this chapter have discussed in some detail the use
of transform techniques and the system function H(s) in the analysis of
fixed systems. This section investigates the application of these techniques
to time-varying systems.
The first of the two basic transform methods for fixed systems was to
transform the system’s differential equations into algebraic equations.
Although any differential equation may be transformed term by term,
an algebraic equation results only if the coefficients of the differential
equation are constants. Otherwise, another differential or integral
equation results. Equations 3.4-9 and 3.4-10 are useful in solving homo¬
geneous differential equations whose coefficients are polynomials in t.
Example 3.9-1. Solve the equation

d2y dy
t —- -1—1— (t — \)y = 0
dt2 dt
when 2/(0+) — 1.
Transforming this equation term by term, using Eq. 3.4-9, yields

d dY(s)
- - [s*Y(s) - s - 2/(0+)] + [sY(s) - 1] + —+ Y(s) = 0
ds ds
Simplifying,
dY{s)
(s + 1) —— + Y(s) = 0
ds

which is a first order differential equation in Y(s). This can be easily solved by the
separation of variables, or by the method of Section 2.5, giving
K
Y(s) =
s + 1
Then
2/(0 = K*~*
Sec. 3.9 Application to Time-Varying Systems 147

Choosing K = 1 in order to satisfy the condition 2/(0+) = 1,

2/(0 = €"*

Example 3.9-2. The differential equation of the previous example is changed slightly
to read
d2y dy
+ -t* + ty = 0
1 dt2 at
with 2/(0+) = 1.
The reader will recognize this equation as Bessel’s equation of zero order, whose
solution is y = J0(t). The transformed equation is

d dY(s)
- - [s2Y(s) - s - 2/(0+)] + |sY(s) - 1] - -fd = 0
ds ds
Simplifying,
dY(s)
(s2 + 1) + jT(0 = 0
ds
whose solution is
K
Y(s) =
Vs2 + 1
The constant K, needed to satisfy the condition 2/(0+) = 1, can be found by the initial
value theorem of Eq. 3.4-15.
sK
lim —— = K
S->00 Vs2 + 1

so K — 1, and
1
2/(0 = &-1
vV + l
Although the inverse transform appears in standard tables, note instead that, since the
solution is known to be J0(t),

-2Vo«] = — ■■ ■
Vs! + 1

The last example gives the easiest method of finding the Laplace
transform of the Bessel function. In general, the use of Eq. 3.4-9 yields
a differential equation in Y(s) of order equal to the degree of the highest
polynomial in t. It should be emphasized that the resulting equation
cannot always be easily solved, and that the preceding examples are not
typical of the difficulties encountered.
Each term in a differential equation with variable coefficients is the
product of one known and one unknown function of time. In trans¬
forming such a term, Eq. 3.4-14 can be used. The result is a complicated
integral equation in y(T). Since the resulting equation is very rarely easier
to solve than the original one, this method is not pursued further.
At this point, one might seek other integral transforms helpful in
solving those differential equations that are not amenable to the Laplace
148 Transform Techniques

(3.9-17)
3 u.
o 0 m NO
ON 1-H *—< ’—1 I
£ I 1 1 I
On On ON ON On On ON On
m rb co rb rb rb m rb

y(t) = lT~x[V{s)H{t, 5)]


C/3
r-
3
<u
4->
C/3

C/2
txO
3
‘>7 03
u
3
>■
03
P

3 t-
o

(3.9-8)
i
c7 rb 3" 77 Co t-~
- £ On

ON ON On ON
Table 3.9-1

On ON
si m rb rb rb rb C*~i

w z
= j^-i[r(5)//(5)]

3
o
Laplace transform

o
03 3
Response by the

C/3 3
3 C/3 -0 CL 3
o 3 C/3
3 C/3 3
03
Oh _o <3 03 O
C/3 s- 4—*
03 +-> c/3
O 3
I— 0
C/3 3 O C/3 <U
03 O s_ 3 C/3
C
4—>
C/3 Cu 5J0 .0 O 3 3 O
3
s— 03
03 -*—»
L—» c CL O Cl
X<D
im

3 3 Cl
CL CL 3 C/3
—s 03 L— 03
E GO U,
Sec. 3.9 Application to Time-Varying Systems 149

transform. This has been done, and a number of other transforms are
available for certain special cases. The best-known of these are the Mellin
and Hankel transforms.11 General integral transforms are discussed
further in Section 3.10.

The System Function

The second basic transform method for fixed systems was based upon
the system function H(s). For systems with no initial stored energy, the
response to an arbitrary input is given by

y{t) = &-'[V( s)H{s))

Zadeh has shown that the system function concept can be extended to
time-varying systems.12 When the characteristics of a system are changing
with time, the system function must be an explicit function of t as well as s.
One would expect that the response of a varying system with no initial
stored energy would be given by

y(t) = 5)]
In order to understand how the time-varying system function results from
a generalization of H(s), first examine Table 3.9-1. Equations 3.9-1
through 3.9-8 summarize the three principal methods of characterizing
fixed systems. They follow directly from Eq. 2.2-6, the definition of the
impulse response, Eq. 1.6-2, Eq. 1.6-4, Eq. 3.5-6, Eq. 3.5-6, the discussion
following Eq. 2.6-20, and Eq. 3.5-2, respectively. For varying systems,
Eqs. 3.9-9 through 3.9-13 follow directly from Eqs. 2.2-7, 1.7-1, 1.7-1,
1.7-2, and 1.7-4, respectively. Equations 3.9-14 through 3.9-17 are derived
in this section.
In the differential equations 3.9-1 and 3.9-9, p = djdt. While the oper¬
ators A and B are functions only of p for fixed systems, they are explicit
functions of both p and / for varying systems. For example,

A{p, t)y(t) = [an{t)pn + • • • + ax{t)p + a0(t)]y(t)


Equations 3.9-2, 3.9-10, and 3.9-11 define the impulse response. A careful
distinction must be made between the symbols h and h* as discussed in
Section 1.7. h(t, 2) is the response at time f to a unit impulse at time 2,
while h*(t, r) is the response at time t to a unit impulse occurring r
seconds earlier. The dummy variables 2 and r in these equations are
related by r = t — 2. The limits on the superposition integrals of Eqs.
3.9-3, 3.9-4, 3.9-12, and 3.9-13 may usually be modified, if desired. For
nonanticipatory systems with v(t) = 0 for t < 0, limits of zero to t may be
used.
150 Transform Techniques

For fixed systems, H(s) = S?[h(t)], where the transformation is with


respect to t. Since h(t) is the response at time t to a unit impulse at time
zero (i.e., t seconds earlier), the age variable r has been used in Eq. 3.9-5.

H(s) = &[h(r)\ (3.9-5)

where the transformation is now with respect to r. By analogy, the system


function for the variable case is defined as

H(us)=&[h*(u r)] (3.9-14)


where the transformation is still with respect to r, and where t is regarded
as an independent parameter. The system function is not the transform
of h(t, X) with respect to X.
f* oo r oo
H{t, S) = /l*(f, r)e~ST dr = h(t, t — r)e~sr dr
Jo Jo

= f hit, X)6~S(t-V dX y* J¥[h(t, X)]


J — 00
The definition of Eq. 3.9-14 forms the bridge between the time-domain
and transform techniques. The response to the input

v(t) = €st for — oo < t < oo

can be found by Eq. 3.9-13.

y(t) = f" esU~r)h*(t, r) dr


J — OO
For nonanticipatory systems, the lower limit may be changed to zero, and

y(t) = est "h*(t, r)e-ST dr = estH(t, s)


Jo
Since the response to es< has been shown to be estH(t, 5),

A(p, t)[H(t, s)est] = B(p, t)est (3.9-16)

Since the exponential input was assumed to exist for all t, H(t, s)est really
represents only the forced or particular integral response. A similar
comment applies to Eq. 3.9-7, as is inferred by Eq. 3.5-3.
From Eq. 3.9-12, the response to an arbitrary input is
r 00
y{t) = v(X)h(t, X) dX
J — 00

K(s)€sA ds h(t, X) dX
c
where C is the contour of integration used for the evaluation of the
inversion integral of Section 3.7. Assuming that the order of the two
Sec. 3.9 Application to Time-Varying Systems 151

integrations may be reversed,

2/(0 = ij L V(s) ,s).


h(t, X) dX ds

By Eq. 3.9-12, the factor in brackets is the response to esi, namely,


H(t, s)esl.

2/(0 = — V(s)H(t, s)€5' ds = ^-1[E(s)/f(t, s)] (3.9-17)


277j Jc
Once H(t, s) is known, the response to an arbitrary input can be found
from the standard tables and techniques of Laplace transform theory. In
taking the inverse transform with respect to v, t is treated as an independent
parameter. Note that
70) 7^ V(s)H(t, s)

since 70) = X£[y(t)] certainly cannot be a function of t.


Example 3.9-3. If H(t,s) = \/(t2s2), find the response to v(t) = te tU_x(t). The next
example shows that this system function corresponds to the differential equation

d2y dy
t2 —• +41-b 2y = v
dt2 dt
Write
1 1
2/(0 = ^-l[LVV(s)yH{Vt,, 0171 = t%
- ^_1 + ])2j

The residue of est/[s2(s + l)2] at s = 0 is

d
7s (s + 1)2J =0
and at ^ = — 1 is

= (/ + 2)€-‘
s=—1
Thus

e -t

which agrees with Example 1.7-1.

Obtaining the System Function from the Differential Equation

Equation 3.9-17 gives a relatively easy method of calculating the re¬


sponse, once the system function is known. Determining the system
function is usually the most difficult part of the entire procedure. The
three common ways of characterizing a system are by the differential
equation, the impulse response, and the system function. There is, of
course, no problem if the system function happens to be given. Similarly,
152 Transform Techniques

if the impulse response is given, H(t, s) can be easily determined by Eq.


3.9-14. If the differential equation is given, one method is to find first the
impulse response by the techniques of Section 2.8.
Example 3.9-4. Find H(t, s) for a system characterized by

dy
+ 4t -f + 2y = v
dt

The impulse response was found in Example 2.8-2.

h{t, 2) = -- forY > A

Then

h*(t, r) = h(t, t - t) = - for r > 0

1 i
m, s) = - &ir\ =
t2S2

An alternative method is based upon Eq. 3.9-16. This is a time-varying


differential equation which can be solved directly for the system function.
The equation is of the same order as the differential equation used to find
the impulse response, but in some cases it is easier to solve. Since the
repeated differentiation of est is so simple, the form of Eq. 3.9-16 may be
simplified somewhat. A typical term in B(p, t)est is

bk(t)pkest = bk(t)skest
so
B(p, t)est = estB(s, t) (3.9-18)

where B(s, t) is a function of s and t, and not an operator. A typical term


in A(p, t) [H(t, s)esi] is
ak(t)pl[H(t, i)e8']

If/ and g are two functions of time,

pVs) = 2 (kJ(pk-rf)(p'g)
'k'
where I ) represents the binomial coefficients, i.e.,

(c + df = r=o
^ (/v I c k-r dr
\ r>
Letting/ = H{t, 5) and g = est
k /k
Pk[H(t, s)es<] = 2 (l) [pk~rH(t, 5)][s^csi]
r=0 \'/

= (k)pk-VH(t,s) = €s'(p + s)“H(t, s)


r=0 V r/
Sec. 3.10 Resolution of Signals into Sets of Elementary Functions 153

Thus
A(p, s)est] = estA(p + s) (3.9-19)

where A(p + 5, t) is an operator, as well as a function of s and t. Sub¬


stituting Eqs. 3.9-18 and 3.9-19 into Eq. 3.9-16 gives

A(p + s, t)H(t, s) = B(s, t) (3.9-20)

The methods of Sections 2.7 and 2.8 are often helpful in solving this
equation.

Example 3.9-5. Find H(t,s) for a system characterized by

d2y dy
t2 — + 4/ - + 2y = v
dt2 dt
In operator notation,
(t2p2 + 4tp 3- 2)y = v
Using Eq. 3.9-20,
[t2(p + s)2 + 4 t(p + s) + 2\H(t,s) = 1
or
[t2p2 + {1st2 -f- 4t)p + (t2s2 + 4ts + 2)\H(t, s) = 1

The reader should verify that the solution


1
H(t, s)
t2s2

does satisfy the last equation and does agree with the result of the previous example.
For this particular system, solving the differential equation in H(t, s) is more difficult
than solving the original differential equation for the impulse response. For other
systems, however, it may be simpler.

The reader should realize that the H(t, s) found in the last example is
only the forced or particular integral solution of Eq. 3.9-20. From the
derivation of Eq. 3.9-16, however, this is exactly what is desired. Al¬
though a different H(t, 5) results if a complementary solution is added, it
can be shown that the response calculated by Eq. 3.9-17 is not affected.12
In many problems, the exact determination of H(t, s) is so difficult that
recourse to approximation methods is necessary. Standard approximating
techniques may be used when the parameters either vary slowly with time
or have a small variation compared to their mean value.13

3.10 RESOLUTION OF SIGNALS INTO


SETS OF ELEMENTARY FUNCTIONS

Both the superposition integrals of Chapter 1 and the inverse transform


methods of this chapter may be regarded as special cases of the resolution
of signals into sets of elementary functions.14 As background for the
154 Transform Techniques

material of this section, the reader may wish to review Section 1.3.
Equations 1.3-3 and 1.3-4 express the input and output of a system
initially at rest as

v(t) = I a(X)k(t, X) dX
c
(3.10-1)
y(t) = I a(X)K(t, X) dX
Jc
If X is a real variable, the integration is in general from X = — oo to + go.
If X is a complex variable, the integration is over a contour C in the com¬
plex plane. k(t, X) represents the family of elementary functions into
which v(t) is decomposed. The spectral function a(X) is a measure of the
relative strength of the elementary functions comprising v(t). K(t, X)
is the system’s response to k(t, X).
In order for Eq. 3.10-1 to be useful, there must be a simple method of
finding a(X) for an arbitrary input. Because of the linearity of Eq. 3.10-1,
one would expect a(X) to be expressible in the following form.
OO

a(X) = v(t)k \X, t) dt (3.10-2)


' — OO

k~\X, t) is known as the inverse of k(t, X) and is not necessarily easy to


find. One helpful relationship between k(t, X) and k~\X, t) is obtained
by letting v{t) = U0(t — z) in Eq. 3.10-2. The corresponding a(X) is
00

k \X, t)U0(t — z) dt = k J(2, z)


' — OO

Inserting this result into the first of Eq. 3.10-1,


f
U0(t — z) = /c_1(A, z)k(t, X) dX (3.10-3)
Jc
In order for a given choice of k(t, X) to be fruitful, it must be possible to
find a fc_1(2, z) satisfying this equation.
Much of Sections 1.5 through 1.7 is based upon the choice of k{t, X) =
U0(t — X), where X is a real variable. Then k~\X, t) = U0{X — t), and
Eqs. 3.10-2 and 3.10-1 become
OO

a(X) — v(t)U0(X — t) dt = v(X)


— oo

00 (3.10-4)
v(t) = a(X)U0(t — X) dX
J — oo
’ oo

= v(X)U„(l - A) cO
J — 00

The latter rather trivial result is identical with Eq. 1.5-4.


Sec. 3.10 Resolution of Signals into Sets of Elementary Functions 155

The inverse two-sided Laplace transform! is based upon the choice of


k(t, 2) = eu!2irj for — oo < t < oo, and k~\X, t) = e~u, where X is
the complex frequency variable previously denoted by 5. Equations 3.10-2
and 3.10-1 become
oo

a(X) = I v{t)e u dt = j£?n[ v(t)]


-oo
(3.10-5)
1
v(t) = a(A)eA< dX = £ iTVWl
2nj Jo
which agree with the usual equations for the direct and inverse transform.
The contour C in the complex 2-plane is described in Section 3.7. When
the above choices of k(t, X) and k~\X, t) are substituted into Eq. 3.10-3,
the right side becomes
1
Xzeudx = '&Ir1[€~XK]
2rrj Jc

This does equal U0(t — z) as required.


K(t, X), the response to k(t, 2) when 2 is treated as a parameter, is usually
called the characteristic function. When k(t, X) = U0(t — 2), K(t,X)
is the impulse response h(t, 2). From Eqs. 3.10-1 and 3.10-4, the expression
for y(t) then becomes
oo
y{i) = v(X)h(t, 2) dX (3.10-6)
'—00

agreeing with Eq. 1.5-7. When/c(t, 2) = eul2nj, K(t, X) = (l/27rj)H(t, X)eu


by Eq. 3.9-16. Thus, for the Laplace transform approach,

1
y{ 0 = a(X)H(t, X)eudX
2Trj Jc
= ^Ir1[a(2)/7(t, 2)] (3.10-7)
where a(X) = u[v(t)\, agreeing with Eq. 3.9-17.
For any choice of elementary functions k(t, 2), the characteristic function
K(t, 2) can be found once the impulse response h(t, 2) is known. Using
Eq. 3.10-6 with v(t) = k(t, z) and y(t) = K(t, z),
oo

K(t, z) — k(X, z)h(t, 2) dX (3.10-8)


-oo

Conversely,

h(t, z) = k 1(2, z)K(t, 2) dX (3.10-9)


c
t For the Fourier transform, k(t, X) = €Ut/2Tr for — oo < t < oo, where A is the real
variable previously denoted by oo. For the one-sided Laplace transform, k(t, 2) =
for 0 < t < oo. For generality, the discussion is carried out in terms of the
two-sided Laplace transform.
156 Transform Techniques

For the special case of k{t, X) = ex72vj and k~\X, t) = these two
equations should relate the system function H(t, s) of Section 3.9 to the
impulse response. For this special case, K(t, z) = {\j2zTj)H(t, z)ezt by
Eq. 3.9-16. Substituting these values into Eq. 3.10-8 gives

— H(t, z)C = — Ch(t, X) dX


277rj 27rj
Thus,

H(t,z) = I” h(t, X)€-d,i-» dX


J — CO

Letting £ = t — k,

H(t, z) = f /?*(f, £)e 2? d£ = J?u[h*(t, |)] (3.10-10)


J — oo

agreeing with Eq. 3.9-14. Equation 3.10-9 becomes

h(t,z) = — X^'dX
IttJ Jc

= — f H(t, X)ew-Z) dX
2ttj Jc
Then

h*(t, z) = h(t, t — z) = H(t, X)eXz dX


2nj JC

= &ll-1[H(t,X)] (3.10-11)

agreeing with Eq. 3.9-15.


Equations 3.10-1, 3.10-2, 3.10-3, 3.10-8, and 3.10-9 apply to any choice
of elementary functions k(t, X). Although the only special cases con¬
sidered in this book are the superposition integrals of Chapter 1 and the
better-known integral transforms, some of the less well-known integral
transforms (e.g., the Mellin, Hankel, Hilbert, Euler, and Whittaker
transforms) occasionally prove useful. In fact, it might seem worthwhile
to make a systematic study of many types of elementary functions k(t, X),
with the aim of inventing new integral transforms. Such a study would,
however, be severely limited by the general difficulty of finding inverse
functions k~l(X, t) that satisfy Eq. 3.10-3. Even then, the characteristic
function K(t, X) must be relatively easy to find, if the new transforms are
to be useful.
Sec. 3.10 Resolution of Signals into Sets of Elementary Functions 157

The spectral function a(K) for the input v(t) is often denoted by V(X).
Using this notation, Eqs. 3.10-2 and 3.10-1 become

oo

V(X) = v(t)k~\x, t) dt
-oo

v(t) = V(X)k(t, X) dX
c

y{t) = V(X)K(t, X) dX (3.10-12)


Jc

= Y(X)k(t, X) dX

where Y{X) is the spectral function or transform of the output y(t) with
respect to the set of elementary functions k(t, X). The relationship
between Y(X) and V(X) becomes particularly simple if the characteristic
function K(t, X) happens to have the form K(t, X) = H(X)k{t, X). When
this occurs, the elementary functions k(t, X) are said to be “eigenfunctions”
of the system under consideration.

response to k(t, X)
H(X) =
k(t, X)

and is called the system function with respect to k(t, X). In such a case,

V(X)H(X)k(t, X) dX

Thus Y(X) = V(X)H{X). Equation 3.10-12 can then be rewritten

V(X) = I r(t)/c_1(2, t) dt
J — co

Y(X) = V(X)H(X) (3.10-13)

y(t) = Y(X)k(t, X) dX
Jc
This specifically describes the three steps associated with the Fourier and
Laplace transforms: the direct transform, multiplication by the system
function, and the inverse transform.
The forced response of a fixed linear system to eu always has the form
H(X)eu, as shown in Eq. 2.6-22.| For such systems, the previous equations

t Zadeh points out that the statement is also true for a certain class of nonlinear
systems.15
158 Transform Techniques

become
I* oo
V(X) = \ v(i)<Tlidt = S?u[v(t)]
J — o0

Y(X) = V(X)HO) (3.10-14)

2/(0 = f f r(A)€"dA = JS?n'W)]


277/ J(7

For varying linear systems, the system function H is a function of t as well


as 2, as shown in the previous section.

3.11 THE Z TRANSFORM

The Laplace transform, which in earlier sections was used to solve


differential equations and to determine a system's response to continuous
signals, is also applicable to the solution of difference equations and the
determination of the response to discrete signals.16 The Z transform is,
however, better suited for the latter purposes. Although some early
workers, including Laplace himself, had used a similar concept, the Z
transform has only recently attracted wide attention from engineers,
principally in its application to sampled-data control systems.
To provide motivation, the Z transform is first related to the more
familiar Laplace transform and to the discrete
u(t) / v*(t) y(t) signals of Section 1.8. When doing this, it
h(t)
is better to consider a discrete signal consist¬
Fig. 3.11-1 ing of a train of weighted impulses, rather
than a series of finite numbers, since the
Laplace transform of the latter is identically zero. Any problem in¬
volving sequences of finite numbers can be solved by inspection, however,
once the corresponding problem with impulses has been solved.
Figure 3.11-1 shows a linear system preceded by an ideal sampling
switch with a period T. From Eq. 1.8-4,
oo oo

v*(t) = v(t) 2 U0(t — nT) = 2 v(nT)U0(t - nT) (3.11-1)


n=—oo n=—oo

where the lower limit becomes zero if ift) = 0 for t < 0. The two-sided
and one-sided Laplace transforms of Eq. 3.11-1 are
oo

F„*(s) = 2 v(nT)e~nTs
n=— oo
oo (3.11-2)
F*(s) = 2 v(nT)e~nTs
n=0
Sec. 3.11 The Z Transform 159

Since the variable in both cases is really e~nTs, it is convenient to define

2 = esT (3.11-3)

Although it is contrary to normal mathematical nomenclature, it has


become customary to define

V(Z) = [V*(s)]es^z (3.11-4)

V(z) is not V(s) with s replaced by 2, but is V*(s) with 5 replaced by


(1/TO ln Equation 3.11-2 then becomes
00

Vu(z) = J v(nT)z~"
n—— 00
„ (3.11-5)
V(z) = ^v(nT)z~n
n=0

The response of the configuration of Fig. 3.11-1, and those of more


complicated configurations, are deferred until the properties of the Z
transform have been investigated. As the reader proceeds, he should note
the similarities between the properties of the Z and Laplace transforms.

The Two-Sided Z Transform

The formal definition of the two-sided transform is


00

F(z) = &nU(t)] = &nU*(t)] = lf{nT)z~n (3.11-6)


n——oo
or, equivalently,
F(z) = {Sru[f*(t)]}e,T=z (3.11-7)

where T is a known constant. Since T(2) depends only upon the value of
f(t) at the instants t = nT, it can be equally well regarded as the Z trans¬
form of either f(t) or
An alternative interpretation is based upon the fact that the exponent
of 2 in Eq. 3.11-6 denotes a particular element in the infinite sequence
. . . ,/( — . . . ,f{nT), .... The variable 2 is often regarded
as an “ordering variable” for the infinite sequence.
Equation 3.11-6 defines F(z) only for those values of 2 for which the
infinite series converges. For other values of the complex variable 2,
F(z) is defined by the principle of analytic continuation. Whenever ^(2)
is written in the infinite series form of Eq. 3.11-6, the values off{nT) can
be seen by inspection. If F{z) is given in closed form,f{nT) can be found
by the techniques of the next section.
160 Transform Techniques

Example 3.11-1. Find the Z transform of

|0 for / < 0
^ ieat for t > 0
By Eq. 3.11-6,
oo oo aT

f(z) =y tmTz-n =y —
n=0 n=0
This infinite series converges for

aT
aT
< 1, i.e., |z| >

and can be written in closed form as


1
F(z) = aT
1 - (eaTf) z -

Example 3.11-2. Find the Z transform of

— eat for t < 0


m =
|0 for t > 0
Again using the basic definition,

F(z) = - J zanTz~n = - ^ (t-aTz)7


?2 = —CO 772 = 1

00 00
= - 2 (e~aTz)n+1 = -e-aTz^ (e-aTz)n
n=0 n=0

The infinite series converges for \z\ < eaT and can be written in closed form as

F(z) = -e~aTz
1 — e z z — e a!1

If F(2) is given in closed form, as in the preceding examples, the region


of convergence must be specified in order to determine which function
of time corresponds to the function of 2. Note that the two regions of
convergence are mutually exclusive, and that only one of the two functions
of time is bounded.

The One-Sided Z Transform

If the history of the system prior to t = 0 is summarized by appropriate


boundary conditions,/(/) can be considered to be zero for t < 0. The
one-sided transform, which is the one used henceforth unless otherwise
stated, is defined as
00

F(z) = 2C[f(t)] = 2 f(nT)z~a (3.11-8)


71=0

or, equivalently,
F(z) = {J?[f*(t)]}€,r=z (3.11-9)
Sec. 3.11 The Z Transform 161

Based upon this equation, a table of Z transforms such as Table 3.11-1


can be determined. A limitation of Eq. 3.11-8 is that there may be some
difficulty in expressing the infinite series in closed form. One method that
overcomes this difficulty is discussed in the next section.

Table 3.11-1

Region of
m F(z) Convergence

z
£/_i(0 z - 1
1*1 > 1
Tz
t 1*1 > 1
(z - l)2
T2z(z + 1)
t2 \z\ > 1
(2 - 1)3
Z
t,at 1*1 > €
z„ — e aT

z sin (IT
sin [It 1*1 > 1
z2 — 2z cos pT + 1
z(z — cos 0D
cos [It 1*1 > 1
z2 — 2z cos [IT + 1

It is sometimes necessary to find the two-sided transform from a table


of one-sided transforms.
0 oo

&nlf(t)] = I f(nT)z-n + J f{nT)z~n - /(0)


n=— oo n=0

The presence of the last term compensates for the fact that n — 0 appears
in both of the summations. Letting m = —n,

co oo

2Tii[/(01 = I f(~mT)zm + 2 f(nT)z~n — /(O)


m=0 ot=0

= + &\f( o] -m (3.H-10)

For the special case of an even function of time,/(—/) = f(t), and

= F(z-') + F(z) -/(0) (3.11-11)

where F(z) =fF[f(t)\.


162 Transform Techniques

3.12 PROPERTIES OF THE Z TRANSFORM

The properties of the one-sided Laplace transform have their parallel


in the one-sided Z transform. Although a few properties are proved,
most of the proofs are left as exercises for the reader.

Theorem 3.12-1. f{t) must be defined for all t = nT (n = 0, 1, 2, . . .)


in order for F(z) to exist.

Theorem 3.12-2. The definition of Eq. 3.11-8 converges absolutely for


121 > c, where c is known as the radius of absolute convergence.

Theorem 3.12-3. Two functions of time have the same Z transform if


and only if they are identical for all t = nT (n = 0, 1,2,.. .). Thus F(z)
contains no information about f(t) except at the sampling instants. The
inverse transform i^-1[T(z)] can be uniquely defined as either/*(/) or
f(nT); the latter definition is used in this book. If Tables 3.11-1 and
3.12-1 are used to find the inverse transform, every t in the left-hand
column should be replaced by nT.

Theorem 3.12-4. The Z transform of f(tjT) is independent of the sam¬


pling interval T. From the definition of Eq. 3.11-8,
00

nntiT)] = 2 /(«>-»
71=0

which does not depend upon T. This theorem is useful in finding the in¬
verse transform by partial fractions, particularly if the available tables
assume T = 1.

Other Properties of the Z Transform

Table 3.12-1 lists the most useful properties of the Z transform. Equa¬
tions 3.12-1 through 3.12-5 are the basis for the solution of fixed difference
equations, as discussed in the next section.
To prove Eq. 3.12-3, for example, let m = n + 1, and write
oo oo

&U(t + T)] = 2 f(nT + T)z~n = z 2 f(mT)z~“


n—0 m=1
00

= 2 2 f(mT)z-™ - zf{0) = zF(z) - zf(0)


m=0

Equation 3.12-6 offers a method of summing a finite or infinite series.


Equations 3.12-7 through 3.12-10 are useful when finding the direct and
Sec. 3.12 Properties of the Z Transform 163

Table 3.12-1

Equation
Property Number

^[/i(0 +/2( 0] = &lfi (0] + nUO] = F^Z) + F2(z) (3.12-1)


%[af(t)] = aF(z) (3.12-2)
&lf(t + T)] = zF(2) - z/(0) (3.12-3)
iT[/(; + 2F)] = z*F(z) - z2/(0) - zf(T) (3.12-4)
2[f(t + mT)] = zmF(z) - zmf(0) - zm~1f(T) - ■■■ - zf(mT - T) (3.12-5)

3* 2 f(kTy = f— F(z) (3.12-6)


_k=0 J Z — i
n*-atf(0] = F^z) (3.12-7)

ntf(t)]=-Tz-F(z) (3.12-8)

Z yw l_ CF(2)
dz (3.12-9)
T
Z[f(t - kT)U_x(t - kT)] = 2-^(2) (3.12-10)

3V/(()] = F (3.12-11)
a-
n
Z 2 fx(t - kT)f2(kT) = Fl(z)f2(2) (3.12-12)
k=0

1 I F1(w)F2(z/w)
^[/i(0/2(0] = J-. <j>
W
where the contour of integration separates the
poles of Fj(w) from those of F2(z/w) (3.12-13)
lim f(t) = lim F(z), provided that the limit exists (3.12-14)
t—>-0 z—> co

lim /(0 = lim (--)f(z),


<->oo Z->1 \ 2 /

provided that [(z — \)/z]F(z) is analytic on and out¬


side the unit circle (3.12-15)

inverse transforms of given functions. Equation 3.12-11 shows the effect


of a change of scale in the z-plane, while Eqs. 3.12-12 and 3.12-13 are
convolution formulas. The former one is proved and discussed in Section
3.13. Equations 3.12-14 and 3.12-15 are known as the initial and final
value theorems, respectively.
Table 3.12-2 lists three properties that further relate the Laplace and Z
transforms. F(s) and F*(s) represent the Laplace transforms of/(/) and
164 Transform Techniques

Table 3.12-2

Equation
Property Number

F(s)
F(z) = 2 residues of at the poles of F(s) (3.12-16)
1 — esJz
00
Ink 2tm
F*(s) = - 2 F[s +j = F* s + j (3.12-17)
lc=— oo T T
If F(s) = F1(s)F2*(s), then F(z) = F1(z)F2(z) (3.12-18)

/*(/), respectively. To prove the first property, rewrite Eq. 3.11-1 as

/*( 0 =/(0<(0
where z(0 denotes the impulse train
00
1(0 = 2 (/„(< - nT)
72=— 00
Then
= SF[f(t)i(t)]

which suggests the use of the complex convolution theorem of Eqs. 3.7-7
and 3.7-8. Using the result of Example 3.4-3,

1
/(s) 1 _ e-r*

This function has first order poles at

= 1

Ts = d= j2rrk

2rrk
S = ±j for k = 0, 1, 2, . . .
T
Using Eqs. 3.7-7 and 3.7-8,

1 F(w)
F*(s) = dw
2-rrj Jc 1 - <rr(s-”’>
F(w)
= 2 residues of | _ ^—Ts^wT
at the poles of F(w) (3.12-19)

Replacing els by z,

F(w)
r(*) = 2 residues of — at the poles of F(w)
i - e Tz
Sec. 3.12 Properties of the Z Transform 165

This is identical with Eq. 3.12-16, if the dummy variable w is changed back
to 5. The expression provides a method of obtaining F(z) directly in closed
form, without having to sum an infinite series. Furthermore, in the
analysis of systems, F(s) rather than f(t) may be given. In this case, Eq.
3.12-16 permits the calculation of F(z) without having to find f(t) as an
intermediate step.
Example 3.12-1. Find 2?[te~at].

1
F(z) = residue of at s = —a
(s + a)\ 1 e-'z-1)
d 1 Te~aTz
ds \ 1 — esTz~1 (Z — c-aTy
S——a

The same result follows from Table 3.11-1 and Eq. 3.12-7 or 3.12-8.

Example 3.12-2. Find F(z) corresponding to F(s) = [5(25 + 3)]/[(5 + 1)2(5 + 2)].
F(s)
The residue of at .y = — 1 is
[1 - esTz~l]

5(25 + 3) -T,
-Te-Tz
7sT .y-1
_ds ((5 + 2)(1 — €8rz~1)j_| s_ 1 (z — e~T)2

The residue at s — —2 is

5(25 + 3) 2z
- 2T
.(5 + 1)2(1 - esTz~1) J s=-2 Z — €

Hence
2z Te-Tz
F(z) =
2T
Z — €

(z - e-T)2

The same result can be obtained by finding

5(25 + 3)
Z’-1 = 2e-2t - te-
_(s + 1)2(5 + 2)_

and by looking up the Z transform of the two functions of time in the tables.

The proof of Eq. 3.12-17 follows from a simple modification of Eq.


3.12-19. As shown in Eq. 3.7-8, it is possible to replace the sum of the
residues of T(w’)/[1 — g-z’f*-*')] at the poles of /7(h’) by minus the sum
T(s — w)
of the residues of the same function at the poles of 1/[1 ]•
These poles are located at
. 2jrk
W s ±J for k = 0, 1, 2, .
T
The residue of F(w)/[ 1 — e at one of these first order poles is,
by Eq. 3.6-20,

f F(w) } 1 / .2nk
= ~ ~ E 5 ± J —
\(djdw)[l - €-r,S-"']l„=s±)(2rt/r, T \ T
166 Transform Techniques

Thus
00
. 2nk\
F*0) = - I F\s+j
T k=— oo T7
which is identical with the first half of Eq. 3.12-17. An alternative proof
of this result is suggested in one of the problems at the end of the chapter.
If 5 is replaced by s -f- jilim/T),

p*
. 2?TU 2v-(k + n)^j
s +j
T
The right side may be written

1 00
. 2 tt m \
2 F(s+j = F*(s)
T m=—co T 7
which proves the second half of Eq. 3.12-17.|
Although perhaps the most important application of Eq. 3.12-17 is in
the proof of other theorems, the equation also gives considerable insight
into the sampling of continuous signals. An assumed plot of the frequency
spectrum of a continuous signal f(t) is shown in Fig. 3.12-1#. It is assumed
that the signal is band-limited, so that it contains no frequency com¬
ponents above coc radians per second. The frequency spectrum for F*(s)
is given by

and is shown in parts (b) and (c) of the figure for two different values of
the sampling interval T. The plot consists of the spectrum of the con¬
tinuous signal repeated every 2tt/T radians per second. If 77/T > coc, as
assumed in part (b), the original shape of \F(jco)\ is not destroyed by the
t Equations 3.12-16 and 3.12-17 are given in the form usually used in the literature, but
they are not consistent if fit) has a discontinuity at t = 0. Equation 3.12-16 is correct
if the value of the function at the origin is defined to be f(0-f), while Eq. 3.12-17
assumes that this value is M/(0—) + /"(0 + )]. The inconsistency can be explained by
recalling that Eq. 3.7-8 requires that the integral vanish along DFA and DHA in Fig.
3.7-7, and by carefully examining the integration along these semicircles. Unless
otherwise stated, it is customary to use f(0+) as the function’s value at t — 0. Since
/(0—) = 0 in the one-sided transform, Eq. 3.12-17 should then read

F*(s) = tf{ 0+) + -


1 v
If U/-
/ 2”k\
/ fc=-oo \ 1 /

In the proof of Eq. 3.12-17 that is suggested in Problem 3.31, the Fourier series for
the impulse train assumes that /'(/) is an even function of time. The one-sided Laplace
transform then includes the effect of only half of the impulse at the origin, so the extra
term i/(0 + ) is again needed.
Sec. 3.12 Properties of the Z Transform 167

IWujl

I F*(jw)\

\F*(jo,)\
A

■>- co
-3tr/T -tr/T tt IT 3tr/T

Fig. 3.12-1

sampling process. If, however, n/T < coc as in part (c), the original shape
of \F(jaj)\ no longer appears in the frequency spectrum of the sampled
signal. Thus the continuous signal f{t) can be theoretically reproduced
from the sampled signal f*(t) by linear filtering if and only if tt/T > coc.
This statement agrees with Shannon’s sampling theorem.17
The reconstruction of the continuous signal could be accomplished by
an ideal low-pass filter, but approximating an ideal filter requires a large
number of components and results in a great time delay. In many sampled-
data systems, the circuit following the sampling switch does act as a crude
low-pass filter, reducing the high-frequency components and smoothing
out the signal.18
If the 5-plane is divided into horizontal strips as in Fig. 3.12-2, Eq.
3.12-17 indicates that the pole-zero pattern in each strip is the same.
Values of F*(s) at corresponding points in different strips are identical.
Thus F*(s) is uniquely described by its values in the strip —n/T < w <
tt/T.

It is useful to note that the left half of the 5-plane maps into the interior
168 Transform Techniques

co
/V

Fig. 3.12-2

of the unit circle in the 2-plane, as shown in Fig. 3.12-3. If s = a + jco and
2 = peje, where 2 has been written in polar form, the transformation

2 = esT gives

(3.12-20)
Then p = eaT and 6 = coT. For a < 0, a = 0, or a > 0, the magnitude
of 2 is p < 1, p=l (the unit circle), or p > 1, respectively. Note
also that, if co is replaced by co + 2tt/T, the value of 2 is unchanged. Thus
each horizontal strip in Fig. 3.12-2 maps into the entire 2-plane. This is, of
course, consistent with the previous conclusion that the values of F*(s)
for —v/T < co < n/T uniquely determine F*(^) and hence F(z). It also
explains why there are a finite number of poles of F(z), despite an infinite
number of poles of F*(s).
Equation 3.12-18, which is used in the next section, follows directly
from Eq. 3.12-17. If F(s) = F1(s)F2*(s), then

CO
/V

s-plane

Fig. 3.12-3
Sec. 3.12 Properties of the Z Transform 169

Since

f** (* +j^y) =

for all integral values of k, it follows that

1 00 / %Tk\
F*(s) = F2*(s) - 2 FAs + j—7 =
1 lc=-oo \ T 1
or
F(2) = ^(2)^(2)

The Inverse Transform

There are three common ways of finding f(nT), or equivalently /*(/),


when F(V) is given: by infinite series, partial fraction expansion, and an
inversion integral. As shown in Examples 3.11-1 and 3.11-2, when the
two-sided Z transform is used, the region of convergence must be specified
in order to uniquely determine f(t) at the sampling instants. This am¬
biguity is best resolved by the third of the three inversion methods; thus
the initial discussion is restricted to the one-sided transform.
The first method reconstructs the infinite series appearing in the defini¬
tion of Eq. 3.11-8, repeated below.

F(z) =/(0) + f(T)z~1 + • • • +f{nT)z~n + • • • (3.12-21)

The values of f{nT) can then be determined by inspection. Equation


3.12-21 can be regarded as a Taylor series written about the point at
infinity. If f(z) denotes the function formed from F(z) by replacing 2
by I/2,
f(z) =/(0) +/(7> + * * * + f(nT)zn + • • •

is a Taylor series about the origin. From Eq. 3.6-12,

dnf( 0)
dz

As discussed in Section 3.6, however, it is usually easier to find the coeffi¬


cients by long division. If f(z) is the quotient of two polynomials, they
should both be arranged in ascending order to obtain an infinite series of
the proper form. When dealing directly with F(z), therefore, both the
numerator and denominator must be written in ascending powers of
(2-1).
170 Transform Techniques

22 2z -2
Example 3.12-3. F{z) — . Determinef(nT).
(z - 2)0 - l)2 1 - 4Z-1 + 5z~2 - 2z
By long division,
F(z) = 2z-2 + 8z-3 + 222-4 H-
Thus
m 0
AT) 0
f(2T) 2
/(3 T) 8

/(4 T) 22

This method is often easier than any other when f(nT) is desired for only a few values
of n. A disadvantage is that it does not give a general expression for the «th term in
closed form.

A second method is based upon the partial fraction expansion of


F(z)/z and the use of a short table of transforms, such as Table 3.11-1.
F(z) itself is not expanded in partial fractions, because the functions of 2
appearing in the table have the factor 2 in their numerators.
2z
Example 3.12-4. F(z) = . Find/(«T).
(z - 2)(z - 1y

Using the techniques of Section 3.4,

F(z) 2
(z - 2)0 - l)2 z-2 0 ~ If z - 1

z z z
F(z) = 2
z-2 0 - l)2 2 - 1

Assuming that T = 1, and using Table 3.11-1,

f(n) = 2[2n — n — 1] for n > 0

By Theorem 3.12-4, if T is not necessarily unity,

f(nT) = 2[2W — n — 1] for n > 0

The reader should verify that, for n = 0, 1, 2,..., this result checks the answer of the
previous example.
2O2 — 2z —1)
Example 3.12-5. F(z) = — ^ — . Find f(nT).

The partial fraction expansion is

F(z) B C D
+ + 7—:—+
0 -,/)2 0 —j) (2 + y')2 0 + j)
where
1 —F / 1 —/
A =-- , C = -- , B = D = 0
2 2
Sec. 3.12 Properties of the Z Transform 171

Then

F(z) = L±2
2 (~jz - l)2
, i -./
2
m
(jz - l)2

1 -j €~^/2Z
-)--
1 +/ d7Xi2Z
~Y~ (e-^/2z - l)2 2 (e0nt2z - l)2

Using Table 3.11-1, and Eq. 3.12-7,

f(nT) = |[V2 e-jirl*ne3irnl2 -f V2 ejn-/4/;e-3>^/2]

= V2 n cos j- n — — j for n > 0

This example is typical of the manipulations used when F(z) contains complex conju¬
gate poles.

The third method of finding f(nT) is to use the inversion integral, which
may be derived in several ways. For generality, the derivation is per¬
formed in terms of the two-sided transform. From Section 3.7, the
inversion integral for the two-sided Laplace transform is

fit) = f f F(sy• <h


2rrj Jc

where the contour C is any path from c — joo to c + j00 within the region
of convergence ac < a < oy . For convenience, the contour is taken as
a straight vertical line. On the basis of the discussion associated with
Figs. 3.12-2 and 3.12-3, it is logical to break this contour up into the
individual sections . . . , —37t/T < co < —rr/T, —tt\T < to < tt/T,
tt\T < co < 377/r, .... Also, since the purpose of the derivation is to
obtain f{nT) from F(z), t is replaced by nT.
1 00 rc+U(2fc+l)jr]/T
f(nT) = -?- T F(s)enTsds (3.12-22)
2tTj *=-oo Jc+Wk-MlT

The right side of this equation must be rewritten so as to include F(z), or


equivalently F*(s), which is given in terms of F(s) by Eq. 3.12-17. This is
accomplished by replacing the dummy variable 5 by s + j{j27rk\T) in Eq.
3.12-22. Since F2*nk = 1,
00 'c+jn/T
1 2rrk nTs
/oo = — 2 Z7TJ fc=-oo J c-jn/T
F s + j
T
ds

Interchanging the order of the summation and integration,


T C C+jn/T -j 00
2irk .nTs
f(nT) = - 2 r l .v +j ds
2ttJ J C-jn/T T k=- 00 T
'c+jn/T
T
F*(s)€nTs ds
2rrj J c-jn/T
172 Transform Techniques

A
2-plane

>

Fig. 3.12-4

Letting z = eTs and dz = Tz ds, and recalling from Eq. 3.12-20 that the
vertical line a — c (—77/T < 00 <7t/T) corresponds to a circle of radius
ecT in the 2-plane,

finT) = (> F(z)zn~1 dz (3.12-23)


2lzj

Figure 3.12-4 shows two circles of radii px = exp (Toc ) and p2 = exp
(Tac ). The closed contour in Eq. 3.12-23 must be confined to the shaded
area, which is the region of convergence for the direct Z transform.
F(z) is analytic at all points in this region. By the residue theorem of
Eq. 3.6-16,
finT) = ^ [residues of Ffyz71-1] (3.12-24)

at the poles in the region \z\ < px. This equation is perfectly general,
and it can be used for both positive and negative values of n.

Example 3.12-6. F(z) = z[(2 - €)z - 1 ]/[(z - €)(z - l)2], with the region of con-
vergence 1 < \z\ < e. Find f(nT).
zn[(2 - e)z - 1]
F(z)zn-1
"(T- e)(z - iy

For n > 0, the only pole inside the contour of integration is at 2 = 1. The residue at
this pole is
d zn[( 2 — e)z — 1]
n + 1
dz Z — € Jz=l
Hence
finT) = n + 1 for n > 0
Sec. 3.12 Properties of the Z Transform 173

For n — — 1, F(z)zn~l has an additional pole at 2 = 0. The residue at z = 1 is zero, and


the residue at 2 = 0 is
(2 - e)z - 1
= e"1
(2 - e)(2 - l)2 2=0
Thus
f(-T) =

For n = —2, the residue at the pole at 2 = 1 is —1. The residue at the second order
pole at the origin is
- d (2 - e)2 - 1
= 1 + e-2
_dz (2 — e)(z — l)2
=0

SO f( — 2T) = e-2.

The procedure above for calculating f(nT) for negative values of n is


unnecessarily tedious. If F{z)zn~1 has a second or higher order zero at
infinity, then19

— (> F(z)zn 1 dz
2ttJ 3 c
= —2 [residues of F(z)zn~1 at the poles outside contour C] (3.12-25)

In the preceding example, the residue at the pole at 2 = e, the only one
outside of the contour of integration, is

V[(2 - e)z - 1]~


— en
- - 1)2 _

Thus f(nT) = en for n < 0. It is seen that poles of F(z) in the regions
\z\ < pY or \z\ > p2 in Fig. 3.12-4 give rise to components of f(nT) for
n > 0 and n < 0, respectively. In the one-sided Z transform, the dis¬
cussion of Section 3.4 indicates that p2 -> 00, and f(nT) becomes zero for
n < 0. The region of convergence is then the entire 2-plane outside of
the circle of radius pu which encloses all the poles of F(z).

Example 3.12-7. The one-sided transform is F(z) = 22/(2 — 2)(2 — l)2. Find f(nT).
The residue of F(z)zn-X at 2 = 2 is

2zn
= 2(2")
- 1h 2=2

and at 2 = 1 is
d / 227
= -2 (n + 1)
dz\z — 2, J 2=1
so
f(nT) = 2[2n - n - 1] for n > 0

agreeing with the result of Examples 3.12-3 and 3.12-4.


174 Transform Techniques

The Significance of Poles and Zeros

The nature of the terms in f(nT) depends upon the location of the poles
of F(z). Table 3.12-3, showing the effect of pole position in the one-sided
transform, can be constructed from Eq. 3.12-24 or its equivalent. Poles

inside or outside the unit circle yield terms that approach zero or infinity,
respectively, as t approaches infinity. First order poles on the unit circle
(except at z = 1) yield sinusoidal terms of constant amplitude, but higher
order poles produce terms that increase without limit as t approaches
infinity.
Sec. 3.13 Application of the Z Transform 175

When the two-sided transform is used, the form off{nT) depends upon
the region of convergence as well as the location of the poles. If the func¬
tion of time is known to be bounded, the region of convergence must
include the unit circle. In this case, poles inside and outside the unit
circle yield components off (nT) for positive and negative time, respectively.

3.13 APPLICATION OF THE Z TRANSFORM


TO DISCRETE SYSTEMS

The two basic methods of analyzing fixed continuous systems using


the Laplace transform are the transformation of differential equations into
algebraic equations and the system function concept. Similar methods
exist for the analysis of fixed systems with discrete signals using the Z
transform.

The Solution of Difference Equations

Any fixed linear system whose input v(t) and output y(t) are defined
only at the discrete instants t = kT can be described by the difference
equation

$$a_q y(kT + qT) + a_{q-1} y(kT + qT - T) + \cdots + a_1 y(kT + T) + a_0 y(kT)$$
$$= b_m v(kT + mT) + \cdots + b_1 v(kT + T) + b_0 v(kT) \qquad \text{(3.13-1)}$$

where the aᵢ's and bᵢ's are constants. By the use of Eqs. 3.12-1 through
3.12-5, the difference equation can be transformed into an algebraic
equation in z, which can be solved for Y(z). y(kT) can then be found by
the inversion methods of the last section. Note that the transformed
equation will contain y(0) through y(qT − T). These terms represent the
q boundary conditions needed to determine the arbitrary constants in the
classical solution.
Example 3.13-1. Find a general expression for Y(z) when

$$a_3 y(k+3) + a_2 y(k+2) + a_1 y(k+1) + a_0 y(k) = b_3 v(k+3) + b_2 v(k+2) + b_1 v(k+1) + b_0 v(k)$$

and when y(k) and v(k) are zero for k < 0.

Using Eqs. 3.12-1 through 3.12-5 with T = 1, the transformed equation becomes

$$[a_3 z^3 + a_2 z^2 + a_1 z + a_0]\,Y(z) - z^3[a_3 y(0)] - z^2[a_3 y(1) + a_2 y(0)] - z[a_3 y(2) + a_2 y(1) + a_1 y(0)]$$
$$= [b_3 z^3 + b_2 z^2 + b_1 z + b_0]\,V(z) - z^3[b_3 v(0)] - z^2[b_3 v(1) + b_2 v(0)] - z[b_3 v(2) + b_2 v(1) + b_1 v(0)]$$

The constants y(0), v(0), ..., y(2), v(2) are not directly given. Substituting k = −3, −2,
and −1 into the original difference equation, however, and noting that y(k) and v(k) are
zero for k < 0,

$$a_3 y(0) = b_3 v(0)$$
$$a_3 y(1) + a_2 y(0) = b_3 v(1) + b_2 v(0)$$
$$a_3 y(2) + a_2 y(1) + a_1 y(0) = b_3 v(2) + b_2 v(1) + b_1 v(0)$$

The transformed equation then becomes

$$Y(z) = \frac{b_3 z^3 + b_2 z^2 + b_1 z + b_0}{a_3 z^3 + a_2 z^2 + a_1 z + a_0}\,V(z)$$

By generalizing the result of the last example, it is possible to write
down the transformed output by inspection of the difference equation for
any nonanticipatory system that is initially at rest. For such systems,
m ≤ q in Eq. 3.13-1, and y(k) and v(k) are zero for k < 0. The boundary
condition terms disappear, and

$$Y(z) = \frac{b_m z^m + \cdots + b_1 z + b_0}{a_q z^q + \cdots + a_1 z + a_0}\,V(z) \qquad \text{(3.13-2)}$$

This equation is valid for any value of T.


Example 3.13-2. Solve

y(k + 2) — 3 y(k + 1) + 2 y(k) = 2 v(k + 1) — 2v(k)

when the system is initially at rest, and when

$$v(k) = \begin{cases} k & \text{for } k \ge 0 \\ 0 & \text{for } k < 0 \end{cases}$$

From Eq. 3.13-2 and Table 3.11-1,

$$Y(z) = \frac{2(z-1)}{z^2 - 3z + 2}\,V(z) = \frac{2}{z-2}\cdot\frac{z}{(z-1)^2} = \frac{2z}{(z-2)(z-1)^2}$$

From Example 3.12-7,

y(n) = 2[2ⁿ − n − 1]   for n ≥ 0

which agrees with the answer to Example 1.9-1.
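
As a quick numerical check (ours, not from the text), the Python sketch below iterates the difference equation of Example 3.13-2 directly and compares the result with the closed form 2[2ⁿ − n − 1].

```python
def simulate(N):
    v = lambda k: k if k >= 0 else 0
    y = [0, 0]                          # system initially at rest: y(0) = y(1) = 0
    for k in range(N - 2):
        # y(k+2) = 3y(k+1) - 2y(k) + 2v(k+1) - 2v(k)
        y.append(3*y[k+1] - 2*y[k] + 2*v(k+1) - 2*v(k))
    return y

closed_form = [2*(2**n - n - 1) for n in range(10)]
assert simulate(10) == closed_form
```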

The System Function

Equation 3.13-2 indicates that there is a direct relationship between
the transformed input and output of a system initially at rest. The system
function (or "transfer function")† is defined as

$$H(z) = \frac{Y(z)}{V(z)} \qquad \text{(3.13-3)}$$

† H(z) is sometimes called the "pulsed transfer function" or "sampled transfer function"
to distinguish it from H(s).

and can be written down directly from the difference equation, as in Eq.
3.13-2. In Section 2.12, it was shown that H(z)zᵏ is the forced response
to v(kT) = zᵏ. From Eqs. 2.12-37 and 2.12-38,

$$H(z) = \frac{b_m z^m + \cdots + b_1 z + b_0}{a_q z^q + \cdots + a_1 z + a_0} = \left[\frac{y_p(kT)}{v(kT)}\right]_{v(kT)=z^k} \qquad \text{(3.13-4)}$$

Equation 3.13-3 enables the output y(nT) to be found by the same three-
step procedure used for the Fourier and Laplace transforms. For a
discrete system subjected to the unit delta function, v(kT) = 1 for k = 0
and v(kT) = 0 for k ≠ 0, so V(z) = 1, and

$$d(nT) = \mathscr{Z}^{-1}[H(z)] \qquad \text{(3.13-5)}$$

where d(nT) is the response to the unit delta function. Equations 3.13-4
and 3.13-5 provide a convenient way of finding a system’s delta response
from its difference equation.
Example 3.13-3. If the discrete system of Fig. 1.9-la is described by the difference
equation of Example 3.13-2, find its delta response.

$$d(n) = \mathscr{Z}^{-1}\left[\frac{2}{z-2}\right]$$

This form does not appear in Table 3.11-1, but, by the initial value theorem of Eq.
3.12-14,

$$d(0) = \lim_{z\to\infty} \frac{2}{z-2} = 0$$

By Eq. 3.12-3,

$$\mathscr{Z}[d(k+1)] = \frac{2z}{z-2}$$

This form does appear in Table 3.11-1, hence

d(k + 1) = 2^{k+1}   for k ≥ 0

or

d(k) = 2ᵏ   for k ≥ 1

which agrees with the answer to Example 2.12-6.

For a continuous system preceded by a sampling switch, as in Fig.


3.11-1, the Laplace transformed output is

Y(s) = H(s)V*(s) (3.13-6)

where H(s) is the transform of the impulse response h(t). By Eq. 3.12-18,

Y(z) = V(z)Z[h(t)]   (3.13-7)



By comparison with Eq. 3.13-3,

H(z) = Z[h(t)] = Z[h*(t)]   (3.13-8)

It is important to note that, while Eq. 3.13-6 can be solved for the output
as a continuous function of time, Eq. 3.13-7 yields the output only at
the sampling instants. The calculation of the output between sampling
instants is deferred until the next section. The determination of H{z)
for more complicated configurations is illustrated by the following
examples.

Example 3.13-4. Find H(z) = Y(z)/V(z) for the configurations of Fig. 3.13-1. The
systems are initially at rest, and the sampling switches operate synchronously.

[Fig. 3.13-1. (a) Two elements h₁(t) and h₂(t) in cascade, each preceded by a sampling switch: v(t) → v*(t) → h₁(t) → q(t) → q*(t) → h₂(t) → y(t). (b) The same cascade with a single sampling switch at the input: v(t) → v*(t) → h₁(t) → q(t) → h₂(t) → y(t).]

In part (a) of the figure, Q(z) = H₁(z)V(z) and Y(z) = H₂(z)Q(z); so H(z) =
H₁(z)H₂(z). In part (b), H(z) is the Z transform corresponding to H₁(s)H₂(s). This
can be written

H(z) = Z[h(t)]

where h(t) = L⁻¹[H₁(s)H₂(s)], or, by Eq. 3.12-16,

$$H(z) = \sum \text{residues of } \frac{H_1(s)H_2(s)}{1 - \epsilon^{sT}z^{-1}} \text{ at the poles of } H_1(s)H_2(s)$$

The shorthand notation

H(z) = H₁H₂(z)

is often used, but the reader must remember that H₁H₂(z) ≠ H₁(z)H₂(z).
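
To make the distinction concrete (this sketch is ours, not part of the original text), take h₁(t) = h₂(t) = ε⁻ᵗ and T = 1. The table entries Z[ε⁻ᵗ] and Z[tε⁻ᵗ] are assumed; SymPy then shows that H₁(z)H₂(z) and H₁H₂(z) differ.

```python
import sympy as sp

z = sp.symbols('z')
a = sp.exp(-1)                        # epsilon^{-T} with T = 1

H1 = z/(z - a)                        # Z[exp(-t)] sampled with T = 1 (table entry)
H1H2 = a*z/(z - a)**2                 # Z[t*exp(-t)], the transform of L^{-1}[H1(s)H2(s)]
print(sp.simplify(H1*H1 - H1H2))      # z/(z - exp(-1)): nonzero, so H1(z)H2(z) != H1H2(z)
```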

Example 3.13-5. In the previous example, find y(t) for t = nT if h₁(t) = h₂(t) = tε⁻ᵗ
for t ≥ 0, T = 1, and v(t) = ε⁻ᵗU₋₁(t).
For the configuration of part (a) of Fig. 3.13-1,

$$Y(z) = V(z)H_1(z)H_2(z) = \frac{(z\epsilon)^3}{(z\epsilon - 1)^5}$$

By Eqs. 3.12-7 and 3.12-24,

$$y(n) = \epsilon^{-n}\left[\text{residue of } \frac{z^{n+2}}{(z-1)^5} \text{ at } z = 1\right] = \epsilon^{-n}\,\frac{1}{4!}\left[\frac{d^4}{dz^4}\,z^{n+2}\right]_{z=1} = \frac{1}{4!}(n+2)(n+1)(n)(n-1)\epsilon^{-n} \quad \text{for } n \ge 0$$

which agrees with the answer to Example 1.9-4 at the sampling instants.
For the configuration of part (b) of the figure, h(t) = t³ε⁻ᵗ/6 by Example 1.6-2, so

$$H(z) = \mathscr{Z}\left[\frac{t^3\epsilon^{-t}}{6}\right] = \frac{(z\epsilon)(z^2\epsilon^2 + 4z\epsilon + 1)}{6(z\epsilon - 1)^4}$$

Then

$$Y(z) = \frac{(z\epsilon)^2(z^2\epsilon^2 + 4z\epsilon + 1)}{6(z\epsilon - 1)^5}$$

and

$$y(n) = \frac{1}{6\cdot 4!}\left[\frac{d^4}{dz^4}\left(z^{n+3} + 4z^{n+2} + z^{n+1}\right)\right]_{z=1}\epsilon^{-n} = \frac{n^2(n+1)^2}{24}\,\epsilon^{-n} \quad \text{for } n \ge 0$$

which agrees with the answer to Example 1.9-3.
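
As a numerical cross-check (ours, not in the original), part (a) can also be computed by cascading the two discrete convolutions of Eq. 3.13-11 and comparing with the closed form above.

```python
import math

N = 12
e = math.exp(1)
v  = [e**-k for k in range(N)]          # v(kT) = exp(-k), T = 1
h1 = [k*e**-k for k in range(N)]        # h1(kT) = h2(kT) = k*exp(-k)

def conv(a, b, n):
    return sum(a[k]*b[n - k] for k in range(n + 1))

q = [conv(v, h1, n) for n in range(N)]  # output of the first sampled block at t = nT
y = [conv(q, h1, n) for n in range(N)]  # output of the second block at t = nT

closed = [(n + 2)*(n + 1)*n*(n - 1)/24 * e**-n for n in range(N)]
assert all(abs(a - b) < 1e-9 for a, b in zip(y, closed))
```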

Example 3.13-6. Figure 3.13-2 shows a common sampled-data system, labeled with
Laplace transformed quantities. Find an expression for Y(z).

[Fig. 3.13-2. A sampled-data feedback loop: the error Q(s) = V(s) − W(s) is sampled to give Q*(s), which drives G(s) to produce Y(s); the feedback signal W(s) is Y(s) passed through H(s).]

$$Q(s) = V(s) - G(s)H(s)Q^*(s)$$

Using Eq. 3.12-18,

$$Q(z) = V(z) - GH(z)\,Q(z)$$

$$Y(z) = G(z)Q(z) = \frac{G(z)V(z)}{1 + GH(z)}$$

where the symbol GH(z) is explained in Example 3.13-4.

The Convolution Summations

Equation 3.12-12, rewritten below, provides a bridge between the time-
domain and transform techniques. If F(z) = F₁(z)F₂(z), then

$$f(nT) = \sum_{k=0}^{n} f_1(nT - kT)\,f_2(kT) \qquad \text{(3.13-9)}$$

The proof of this convolution theorem follows from the basic definition
of the Z transform,

$$F_1(z) = \sum_{n=0}^{\infty} f_1(nT)z^{-n}$$

$$F_2(z) = \sum_{n=0}^{\infty} f_2(nT)z^{-n}$$

The product of the two infinite series is

$$F_1(z)F_2(z) = [f_1(0)f_2(0)] + z^{-1}[f_1(0)f_2(T) + f_1(T)f_2(0)] + \cdots + z^{-n}\left[\sum_{k=0}^{n} f_1(nT - kT)f_2(kT)\right] + \cdots$$

thus proving Eq. 3.13-9.
For nonanticipatory systems initially at rest, this equation can be com-
bined with Eqs. 3.13-3 through 3.13-8. For a discrete system with input
v(kT), output y(kT), and delta response d(kT), Y(z) = V(z)Z[d(kT)].
By Eq. 3.13-9,

$$y(nT) = \sum_{k=0}^{n} v(kT)\,d(nT - kT) = \sum_{k=0}^{n} v(nT - kT)\,d(kT) \qquad \text{(3.13-10)}$$

agreeing with Eqs. 1.9-1 and 1.9-2. For the continuous system and
sampling switch of Fig. 3.11-1, Y(z) = V(z)Z[h(t)]; hence the output at
the sampling instants is

$$y(nT) = \sum_{k=0}^{n} v(kT)\,h(nT - kT) = \sum_{k=0}^{n} v(nT - kT)\,h(kT) \qquad \text{(3.13-11)}$$

agreeing with Eqs. 1.9-5 and 1.9-6.
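
The convolution theorem is easy to exercise numerically. The sketch below (ours, with arbitrary sample values) checks that the coefficient of z⁻ⁿ in F₁(z)F₂(z) is the convolution sum of Eq. 3.13-9; numpy.convolve multiplies the two truncated series.

```python
import numpy as np

f1 = np.array([1.0, 0.5, 0.25, 0.125])      # f1(nT), n = 0..3 (arbitrary samples)
f2 = np.array([2.0, -1.0, 0.0, 3.0])        # f2(nT), n = 0..3

product = np.convolve(f1, f2)               # coefficients of F1(z)F2(z)
direct  = [sum(f1[n - k]*f2[k] for k in range(n + 1)) for n in range(4)]
assert np.allclose(product[:4], direct)
```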

The Free and the Forced Response

Consider the output

$$y(nT) = \mathscr{Z}^{-1}[V(z)H(z)] \qquad \text{(3.13-12)}$$

As in Section 3.8, the poles of V(z) and H(z) give rise to the forced (or
particular) and free (or complementary) response, respectively. A system
is unstable if its free response increases without limit as t approaches
infinity. Table 3.12-3 indicates that a system is unstable if and only if
H(z) has poles outside the unit circle, or multiple order poles on the unit
circle. From Eq. 3.13-4, the poles of H(z) are the roots of the characteristic
equation

$$a_q z^q + \cdots + a_1 z + a_0 = 0$$

Hence the above conclusions about stability agree with those of Section
2.12.

[Fig. 3.13-3. A response y(t) that remains bounded at the sampling instants but contains growing oscillations between them.]

A systematic procedure for determining whether or not a polynomial
contains roots on or outside the unit circle does exist.²⁰ Alternatively, the
bilinear transformation z = (1 + w)/(1 − w) can be used to map the
interior of the unit circle into the left half of the w-plane. Routh's
criterion can then be used to determine stability. The references also
discuss the application of graphical techniques in the z-plane for stability
analysis.²¹
It is theoretically possible for the free response to contain growing
oscillations, as in Fig. 3.13-3, even though it remains bounded at the
sampling instants. This situation rarely occurs, however.²²
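
With numerical root finders available, the characteristic polynomial can also be tested directly. The sketch below (ours, not from the text) simply checks whether all roots lie strictly inside the unit circle; as a simplification it lumps first order poles on the unit circle, which give bounded but non-decaying terms, with the unstable cases.

```python
import numpy as np

def is_stable(coeffs):
    """coeffs = [a_q, ..., a_1, a_0]; True iff all roots lie strictly inside the unit circle."""
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) < 1.0))

print(is_stable([1, -3, 2]))    # z^2 - 3z + 2: roots 1 and 2 -> not stable (False)
print(is_stable([1, -0.5]))     # z - 0.5: root 0.5 -> stable (True)
```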

Time-Varying Systems

The analysis of time-varying systems by the Laplace transform in Section


3.9 has its parallel when using the Z transform. The time-varying system
function is defined as

$$H(nT, z) = \mathscr{Z}[d^*(nT, kT)] \qquad \text{(3.13-13)}$$

for a discrete system, where the transformation is with respect to kT, and
where d*(nT, kT) is the response at time nT to a unit delta function
occurring kT seconds earlier. For the configuration of Fig. 3.11-1, the
system function is

$$H(nT, z) = \mathscr{Z}[h^*(nT, kT)] \qquad \text{(3.13-14)}$$

Then

$$y(nT) = \mathscr{Z}^{-1}[V(z)H(nT, z)] \qquad \text{(3.13-15)}$$

The principal difficulty with this scheme is finding H(nT, z). If, however,
the delta or impulse response is known, this can be done by Eq. 3.13-13
or 3.13-14.

3.14 THE MODIFIED Z TRANSFORM

The use of the Z transform in the analysis of Fig. 3.11-1 yields the
output y(t) only at the sampling instants t = nT. Although a number of
methods have been suggested for obtaining the output between sampling
instants, the most useful one seems to be the modified Z transform.
The calculation of Z[f(t)] is based upon samples of f(t) at the instants
t = nT. In order for the Z transform to contain information about the
function of time at other instants, f(t) should be sampled at t = (n + γ)T,
where γ is a real, independent parameter which may assume any value
between zero and one. Parts (a) and (b) of Fig. 3.14-1 show an arbitrary
function f(t) and a train of unit impulses i(t).

[Fig. 3.14-1. (a) An arbitrary function f(t); (b) a train of unit impulses i(t) spaced T seconds apart; (c) f*(t) = f(t)i(t); (d) f(t + γT); (e) f*(t, γ) = f(t + γT)i(t).]

Recall that the product f(t)i(t) gives

$$f^*(t) = \sum_{n=-\infty}^{\infty} f(nT)\,U_0(t - nT) \qquad \text{(3.14-1)}$$

In order to sample f(t) at the instants t = (n + γ)T, the plot of i(t) can
be moved forward γT seconds. Alternatively, the plot of f(t) can be
moved backwards γT seconds, as in part (d) of the figure. The product
f(t + γT)i(t) gives

$$f^*(t, \gamma) = \sum_{n=-\infty}^{\infty} f(nT + \gamma T)\,U_0(t - nT) \qquad \text{(3.14-2)}$$

The Laplace and Z transforms of Eq. 3.14-2 are

$$\mathscr{L}[f^*(t, \gamma)] = \sum_{n=0}^{\infty} f(nT + \gamma T)\,\epsilon^{-nTs} \qquad \text{(3.14-3)}$$

$$\mathscr{Z}[f^*(t, \gamma)] = \mathscr{Z}[f(t + \gamma T)] = \mathscr{Z}_\gamma[f(t)] = F(z, \gamma) = \sum_{n=0}^{\infty} f(nT + \gamma T)z^{-n} \qquad \text{(3.14-4)}$$

The various nomenclatures introduced in Eq. 3.14-4 all stand for the Z
transform of f(t + γT), otherwise known as the modified Z transform
of f(t).† Note that

$$F(z) = \lim_{\gamma \to 0} F(z, \gamma) \qquad \text{(3.14-5)}$$

If f(t) is given, the modified Z transform can be expressed as an infinite
series directly from Eq. 3.14-4. A better method, however, is to form
f(t + γT) and find the ordinary Z transform of this function by standard
tables. Tables of modified transforms do exist, but they save only a
small amount of effort.²³
If F(s) is given, F(z, γ) can be found, if desired, without the intermediate
step of determining f(t). By a proof similar to that used in deriving
Eq. 3.12-16,

$$F(z, \gamma) = \sum \text{residues of } \frac{F(s)\,\epsilon^{\gamma Ts}}{1 - \epsilon^{Ts}z^{-1}} \text{ at the poles of } F(s) \qquad \text{(3.14-6)}$$
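
As an illustration of Eq. 3.14-6 (this sketch is ours, not part of the text), take F(s) = 1/(s + 1) and T = 1. The only pole is the simple pole at s = −1, so the residue is obtained by evaluating (s + 1) times the integrand there, and the result should match the ordinary transform of ε^{−(t+γ)}.

```python
import sympy as sp

s, z, g = sp.symbols('s z gamma', positive=True)
F = 1/(s + 1)
integrand = F*sp.exp(g*s)/(1 - sp.exp(s)/z)        # F(s) eps^{gamma T s}/(1 - eps^{Ts} z^{-1}), T = 1

F_z_gamma = ((s + 1)*integrand).subs(s, -1)        # residue at the simple pole s = -1
table = sp.exp(-g)*z/(z - sp.exp(-1))              # Z[exp(-(t + gamma))] from the tables, T = 1
print(sp.simplify(F_z_gamma - table))              # 0
```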

If the ordinary inverse transform of F(z, γ) is taken, with γ treated as
an independent parameter,

$$\mathscr{Z}^{-1}[F(z, \gamma)] = f(nT + \gamma T) \qquad \text{(3.14-7)}$$

† Many authors define the modified Z transform as $F(z, m) = z^{-1}\sum_{n=0}^{\infty} f(nT + mT)z^{-n}$.
This is the same as the definition of Eq. 3.14-4, except for the multiplying factor of z⁻¹,
but it is not quite as convenient to use. Note that $F(z) = \lim_{m \to 0} zF(z, m)$.

For the configuration of Fig. 3.11-1, one would expect that

$$Y(z, \gamma) = V(z)H(z, \gamma) \qquad \text{(3.14-8)}$$

where

$$H(z, \gamma) = \mathscr{Z}[h(t + \gamma T)] = \sum_{n=0}^{\infty} h(nT + \gamma T)z^{-n} \qquad \text{(3.14-9)}$$

is the modified Z transform of the impulse response h(t). To prove this
result, form the product

$$V(z)H(z, \gamma) = [v(0) + v(T)z^{-1} + v(2T)z^{-2} + \cdots][h(\gamma T) + h(T + \gamma T)z^{-1} + h(2T + \gamma T)z^{-2} + \cdots]$$

$$= [v(0)h(\gamma T)] + z^{-1}[v(0)h(T + \gamma T) + v(T)h(\gamma T)] + \cdots + z^{-n}\left[\sum_{k=0}^{n} v(kT)\,h(nT + \gamma T - kT)\right] + \cdots$$

By Eq. 1.9-3, the coefficient of z⁻ⁿ is y(nT + γT), so the infinite series
above is identical with the definition of Y(z, γ), thus proving Eq. 3.14-8.
Finally,

$$y(nT + \gamma T) = \mathscr{Z}^{-1}[V(z)H(z, \gamma)] \qquad \text{(3.14-10)}$$

Example 3.14-1. Solve Example 3.13-5 for y(t, γ). In Fig. 3.13-1, h₁(t) = h₂(t) = tε⁻ᵗ
for t ≥ 0, T = 1, and v(t) = ε⁻ᵗU₋₁(t).
For the configuration of part (a) of Fig. 3.13-1,

$$Q(z) = V(z)H_1(z)$$

$$Y(z, \gamma) = V(z)H_1(z)H_2(z, \gamma)$$

where

$$V(z) = \frac{z\epsilon}{z\epsilon - 1} \qquad H_1(z) = \frac{z\epsilon}{(z\epsilon - 1)^2}$$

$$H_2(z, \gamma) = \mathscr{Z}[(t + \gamma)\epsilon^{-(t+\gamma)}] = \frac{\epsilon^{-\gamma}(z\epsilon)}{(z\epsilon - 1)^2} + \frac{\gamma\epsilon^{-\gamma}(z\epsilon)}{z\epsilon - 1}$$

Using Eqs. 3.12-7 and 3.12-24,

$$y(n, \gamma) = \mathscr{Z}^{-1}\left[\frac{\epsilon^{-\gamma}(z\epsilon)^3}{(z\epsilon - 1)^5} + \frac{\gamma\epsilon^{-\gamma}(z\epsilon)^3}{(z\epsilon - 1)^4}\right] = \epsilon^{-n}\epsilon^{-\gamma}\left[\frac{1}{4!}\frac{d^4}{dz^4}(z^{n+2}) + \frac{\gamma}{3!}\frac{d^3}{dz^3}(z^{n+2})\right]_{z=1}$$

$$= \frac{1}{4!}(n+2)(n+1)(n)(n-1+4\gamma)\,\epsilon^{-(n+\gamma)} \quad \text{for } n \ge 0$$

As a check, note that, when γ = 0 or 1, y(n + γ) reduces to y(n) or y(n + 1), respec-
tively, in Example 3.13-5. For 0 ≤ t < 1, n = 0 and t = γ, while for 1 ≤ t < 2,
n = 1 and t = 1 + γ, so

y(t) = 0   for 0 ≤ t < 1

y(t) = (t − 1)ε⁻ᵗ   for 1 ≤ t < 2

agreeing with the answer to Example 1.9-4.


For the configuration of part (b) of Fig. 3.13-1,

Y(z, y) = V(z)H1H2(z, y)
where
(t + y)3e
HxH2{z, y)=2t

Then

(ze)2(z2e2 + 4ze + 1) 3y(ze)2(ze + 1) 3y2(ze)2 y3(ze)2


2/(«, y) = —
- 4-;-- + ;-ttt +
6 (ze — l)5 (ze - \y (z€ — l)3 (ze — l)s

n\n + l)2 y(n + \)(n)(2n + 1) 3 y2(n + 1 )n y3


= e-(«+y) - -1-1-1-(n -f 1)
24 12 12 6
and

2/(0 = t3 —
for 0 < / < 1
o

2/(0 = [2/3 - 3/2 + 3/ - 1] — for 1 < t < 2


6

agreeing with the answer to Example 1.9-3.
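
A direct numerical check of part (a) is easy to set up (this sketch is ours, not from the text). Between sampling instants the second block responds to the impulse train q*(t), so y(n + γ) = Σₖ q(k)h₂(n + γ − k); the result should match the closed form found above.

```python
import math

e = math.exp(1)
h = lambda t: t*e**-t if t >= 0 else 0.0        # h1(t) = h2(t) = t*exp(-t)
v = lambda t: e**-t if t >= 0 else 0.0

N, gamma = 8, 0.3
q = [sum(v(j)*h(k - j) for j in range(k + 1)) for k in range(N)]        # q(kT)
y = [sum(q[k]*h(n + gamma - k) for k in range(n + 1)) for n in range(N)]

closed = [(n + 2)*(n + 1)*n*(n - 1 + 4*gamma)/24 * e**-(n + gamma) for n in range(N)]
assert all(abs(a - b) < 1e-9 for a, b in zip(y, closed))
```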

Example 3.14-2. Find an expression for Y(z, γ) for the configuration of Fig. 3.13-2.
As in Example 3.13-6,

$$Q(z) = \frac{V(z)}{1 + GH(z)}$$

Then

$$Y(z, \gamma) = Q(z)G(z, \gamma) = \frac{G(z, \gamma)V(z)}{1 + GH(z)}$$

REFERENCES

1. G. A. Campbell and R. M. Foster, Fourier Integrals for Practical Applications,


D. Van Nostrand Company, Princeton, New Jersey, 1948.
2. A. Erdelyi (editor), Tables of Integral Transforms, Vol. I, McGraw-Hill Book
Company, New York, 1954.
3. R. V. Churchill, Operational Mathematics, Second Edition, McGraw-Hill Book
Company, New York, 1958, Chapters 1 and 6.
4. M. F. Gardner and J. L. Barnes, Transients in Linear Systems, John Wiley and
Sons, New York, 1942, Chapters V, VI, and VIII.

5. S. Seshu and N. Balabanian, Linear Network Analysis, John Wiley and Sons, New
York, 1959, Section 5.5.
6. Gardner and Barnes, op. cit., p. 346, Eq. 2.1362.
7. R. V. Churchill, Complex Variables and Applications, Second Edition, McGraw-
Hill Book Company, New York, 1960.
8. B. van der Pol, and H. Bremmer, Operational Calculus Based on the Two-Sided
Laplace Integral, Second Edition, Cambridge University Press, Cambridge, 1955,
Section VI. 11.
9. W. R. LePage, Complex Variables and the Laplace Transform for Engineers,
McGraw-Hill Book Company, New York, 1961, Chapter 10.
10. S. Goldman, Transformation Calculus and Electrical Transients, Prentice-Hall,
Englewood Cliffs, N.J., 1949, Section 11.8.
11. J. A. Aseltine, Transform Method in Linear System Analysis, McGraw-Hill Book
Company, New York, 1958, Chapter 17.
12. L. A. Zadeh, “Frequency Analysis of Variable Networks,” Proc. IRE, Vol. 38, No.
3, March 1950, pp. 291-299.
13. D. Graham, E. J. Brunelle, Jr., W. Johnson, and H. Passmore, III, “Engineering
Analysis Methods for Linear Time Varying Systems,” Report ASD-TDR-62-362,
Flight Control Laboratory, Wright-Patterson Air Force Base, Ohio, January,
1963, pp. 132-150.
14. L. A. Zadeh, “A General Theory of Linear Signal Transmission Systems,” J.
Franklin Inst., Vol. 253, April 1952, pp. 293-312.
15. L. A. Zadeh, “Time-Varying Networks I,” IRE Intern. Conv. Record, Vol. 9, Pt.
4, March 1961, pp. 251-267.
16. Gardner and Barnes, op. cit., Chapter IX.
17. C. E. Shannon, “Communication in the Presence of Noise,” Proc. IRE, Vol. 37,
No. 1, January 1949, pp. 10-21.
18. J. G. Truxal, Automatic Feedback Control System Synthesis, McGraw-Hill Book
Company, New York, 1955, Section 9.2.
19. W. Kaplan, Advanced Calculus, Addison-Wesley Publishing Company, Reading,
Mass., 1952, p. 569.
20. M. Marden, The Geometry of the Zeros of a Polynomial in a Complex Variable,
American Mathematical Society Mathematical Surveys, No. III, New York, 1949,
p. 152.
21. Truxal, op. cit., Section 9.6.
22. J. R. Ragazzini and G. F. Franklin, Sampled-Data Control Systems, McGraw-Hill
Book Company, New York, 1958, p. 93.
23. E. I. Jury, Theory and Application of the z-Transform Method, John Wiley and Sons,
New York, 1964, pp. 289-296.

Problems

3.1 Derive Eqs. 3.2-6 and 3.2-7.


3.2 A periodic waveform is to be approximated by a finite trigonometric
series of n terms. Show that the approximation having the least mean
square error consists of the first n terms in the Fourier series.
3.3 Find the Fourier series for an infinite train of unit impulses, spaced T
seconds apart, as in Fig. 1.8-1 /.

3.4 By extending the result of Example 3.2-1, find and sketch the complete
output voltage for Fig. 3.2-2#. Assume that the circuit contains no initial
stored energy. Compare the answer with Example 2.12-3.
3.5 If the system function of a fixed linear system is H(s) = s/(s2 + 0.2s +
100), and if the input is the periodic waveform shown in Fig. P3.5, find and
sketch the Fourier series representing the steady-state output.

v(t)

Fig. P3.5

3.6 Find the Fourier transform off(t) = (cos + L/2)U_1( — t + L/2).


Sketch the frequency spectrum for L = 2n, L = 4tt, and L -* oo.
3.7 Prove the Fourier energy theorem:

$$\int_{-\infty}^{\infty} [f(t)]^2\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} |G(\omega)|^2\,d\omega$$

3.8 Show that, if f₂(t) = (d/dt)f₁(t), then G₂(ω) = jωG₁(ω).


3.9 Show that the ideal low-pass filter described by Fig. P3.9 is not a realiz¬
able system, by examining the impulse response h(t).

[Fig. P3.9. The ideal low-pass filter: |H(jω)| = 1 for −ω_c < ω < ω_c and zero elsewhere.]

3.10 Verify the entries in Table 3.3-1.


3.11 Prove the properties in Table 3.4-1.
3.12 Find =^[cos (It] by using Eq. 3.4-5 with n = 2. Using this result, find
^[sin (It] by Eq. 3.4-6. Then find £’[te~t sin (t + rr/4)] by Eqs. 3.4-8 and
3.4-9.

3.13 Find the inverse transform of the following functions by partial fraction
expansions, and Tables 3.3-1 and 3.4-1.

1
Fx{s)
(s + 1)V + 4)

F2(s)
(s + 1)V + 4)

F3(s)
0 + 1)V + 4)
252 + F
F*(s)
s(s2 +2^+5)

3.14 Find 1 from the tables of Laplace transforms.


jco3 + co2 + 2

3.15 Solve Examples 2.6-6 and 2.6-7 and Problem 2.2 by the Laplace transform.
3.16 A system is described by the differential equation

(p³ + 3p² + 4p + 2)y(t) = (p² + 2p + 3)v(t).

Find the impulse response h(t), and evaluate h(0+), ḣ(0+), and ḧ(0+).
If it is possible, find these initial conditions by the initial value theorem
of Eq. 3.4-15. Check the results by the methods of Chapter 2.
3.17 The circuit in Fig. P3.17 illustrates the application of the Laplace trans¬
form when there is initial stored energy. The circuit is originally operating

[Fig. P3.17. A circuit containing d-c sources, resistances, an inductance, a capacitance, and the switch K; the element values are given in the figure.]

in the steady state with the switch K open. If the switch closes at t = 0,
shorting out a resistance, find expressions for the currents for t > 0.
Represent the energy stored in the inductance and capacitance at t = 0
by added sources.
3.18 Write the Laurent series for each of the following functions about the
singularity at 5 = — 1. The quantity t should be treated as an independent
parameter.
$$\text{(a)}\ \frac{\epsilon^{st}}{s + 1} \qquad\qquad \text{(b)}\ \frac{s\,\epsilon^{st}}{(s + 1)^2}$$

Find the residue at 5 = —1 from the series, and also by Eqs. 3.6-17
through 3.6-20.

3.19 If F(s) = (b_m s^m + ··· + b₁s + b₀)/(a_n s^n + ··· + a₁s + a₀), show that

$$\frac{1}{2\pi j}\oint_C \frac{F'(s)}{F(s)}\,ds = Z - P$$

where the contour C encloses the right half of the 5-plane, and where Z
and P denote the number of zeros and poles, respectively, of F(s) in the
right half-plane. Fligher order poles and zeros should be counted
according to their multiplicity. This result leads to the Nyquist stability
criterion.
3.20 If the one-sided transform is F(s) = s/[(s + 2)(s − 1)], find f(t) by the
inversion integral. Repeat for the two-sided transform

F(s) = s/[(s + 2)(s − 1)]

when the region of convergence is given by −2 < σ < 1.


3.21 Solve Problem 3.13 by the use of residues.
3.22 Show that the residue and partial fraction methods are essentially
identical, when F(s) is the quotient of two polynomials, with the de¬
nominator polynomial of higher order.
3.23 Show that ∫(1/s²)ε^{st} ds vanishes on contour DEFGA in Fig. 3.7-1 for
t > 0, and vanishes on DHA for t < 0. Generalize your proof so as to
verify Theorem 3.7-1.
3.24 Use the Laplace transform to find the output voltage for Fig. 3.2-2a.
Since E0(s) has an infinite number of poles, the following procedure, based
upon Reference 5, is suggested. First find the transient response by evalu¬
ating the proper residue. Next find the complete response for the first
cycle by neglecting the source voltage for t > 2. The steady-state response
for the first cycle is found by subtracting the transient response from this
result. Since the steady-state response is periodic, it is the same for all
cycles. Compare the answer with the solutions to Example 2.12-3 and
Problem 3.4.
3.25 Two time-varying systems are characterized by the system functions
H₁(t, s) and H₂(t, s), respectively. If the systems are connected in cascade,

[Fig. P3.25. Two systems h₁ and h₂ in cascade, with intermediate signal q(t) and output y(t).]

as in Fig. P3.25, show that the overall system function

H(t, s) ≠ H₁(t, s)H₂(t, s).

Find a general expression for H in terms of H₁ and H₂.



3.26 For a system described by the differential equation

dv
y — + v
t -f- 1 dt

find h*(t, τ) and H(t, s). Find the step response using transform tech-
niques, and compare the answer with Problem 1.12.
3.27 Find the system function H(t, s) corresponding to each of the following
differential equations.

(a) (tp2 + 2p)y = v


(b) (ty + 4tp + 2)y ={p + \)v
Find the response of each system to v = ε⁻ᵗU₋₁(t).
3.28 Verify the entries in Table 3.11-1 by Eq. 3.11-6, and also by Eq. 3.12-16.
3.29 Find the two-sided Z transform of the following functions by Eq. 3.11-6,
and also by Table 3.11-1. Give the regions of convergence.

(a) f₁(t) = ε^{−|t|} for all t


(b) f2(t) = d for t < 0, and t + 1 for t > 0

3.30 Prove those properties in Table 3.12-1 which are not derived in this
chapter.
3.31 Prove the first half of Eq. 3.12-17 by replacing f(t) by its complex Fourier
series, before transforming the relationship f*(t) = f(t)i(t). Problem 3.3
and Eq. 3.4-8 are helpful.
3.32 If Z[f(t)] = z/[(z + 2)(4z² + 1)], and if f(t) is known to be a bounded
function, find f(nT). Calculate the numerical values of f(nT) for −2 ≤
n ≤ 4 by the inversion integral and by modifying the partial fraction
method in view of Eq. 3.11-10.
3.33 Find f{nT) for the following one-sided Z transforms, using Eq. 3.12-21,
Eq. 3.12-24, and partial fraction expansions.

$$\text{(a)}\ F_1(z) = \frac{z}{(z + 2)(4z^2 + 1)} \qquad\qquad \text{(b)}\ F_2(z) = \frac{2z^3 - 2z^2 + z}{(z - 1)(z^2 - z + 1)}$$

3.34 Solve Problems 2.14 and 2.15 by the Z transform, again commenting
on the stability of the systems.
3.35 In Fig. 3.13-2, G(s) = 1 /(s + 1), H(s) = 1, and the ideal sampling
switch has a period T given by e~T =
(a) Find the step response r(nT) by the Z transform.
(b) Find the step response r[(n + y)T] by the modified Z transform.
(c) Determine the difference equation which relates v(t) and y(t) at the
sampling instants.

3.36 In Fig. 3.13-2, G{s) = [K(s + 1.04)]/[s(s + 0.692)], H(s) = 1, and the
period of the sampling switch is T = 1.
(a) By a root locus diagram, determine the values of the real constant K
for which the system is stable.
(b) Let v(t) = 0, but assume that some disturbance produces the signal
q(0) = q(1) = 1. For K = 1, find y(nT).
3.37 A sampling switch is added at the right of H{s) in Fig. 3.13-2, and it
operates in synchronism with the original one. Find expressions for Y(z)
and Y(z, y).
3.38 A "sampler with a zero order hold" is described by the output-input
relationship y(nT + γT) = v(nT) for integral values of n and for 0 ≤
γ < 1. It samples the input every T seconds and maintains the last
sampled value between sampling instants, as in Fig. 1.8-1c.

[Fig. P3.38. (a) An ideal sampler followed by an element h₁(t): v(t) → v*(t) → h₁(t) → y(t). (b) The sampler with a zero order hold, with input v(t) and output y(t).]

(a) If the component is to be represented by the configuration of Fig.
P3.38a, find h₁(t) and H₁(s) for the element following the ideal sampler.
(b) Find H(z, γ) for the system in Fig. P3.38b.
3.39 Solve Problems 1.15 through 1.17 by the Z transform.
3.40 Formally prove Eq. 3.14-6.
4
Matrices and Linear Spaces

4.1 INTRODUCTION

The purpose of this chapter is to introduce the remaining analytical


tools required in the following chapters. Generally, these analytical
tools come under the general heading of linear algebra. More specifically,
the topics of interest are matrices, determinants, linear vector spaces,
linear transformations, characteristic value problems, and functions of a
matrix.
The reason for the use of these analytical techniques in the study of
systems is due primarily to the great deal of information which is required
to describe completely a large-scale system. This information, which may
consist of sets of differential or difference equations, can be expressed
conveniently in the compact notation of matrices. Once this notation is
adopted, the analysis of a system is largely the analysis of the properties
of the matrix. The advent of high-speed digital computers has made this
approach practical.
The topics chosen for inclusion in this chapter by no means exhaust the
detailed knowledge which exists about linear algebra. Rather, the topics
were chosen for their special pertinence to the study of control systems.
In particular, the topics of vector spaces and linear transformations have
special meaning, since the properties of a system may be made more
evident by the proper choice of a coordinate system. It then becomes
desirable to know the proper linear transformation which yields the
desired coordinate system.
The characteristic value problem lies at the heart of the analysis of linear
systems by a matrix approach. If a matrix is used to describe the structure
of a system, the characteristic values of that matrix describe the normal
modes of response of the system. Perhaps the characteristic value problem

and the associated topic of functions of a matrix are the most important
topics in this chapter.

4.2 BASIC CONCEPTS

The set of linear equations

$$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = y_1$$
$$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = y_2$$
$$\vdots$$
$$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = y_m$$

constitutes a set of relationships between the variables x₁, x₂, ..., xₙ and
the variables y₁, y₂, ..., yₘ. This relationship, or linear transformation
of the x variables into the y variables, is completely characterized by the
ordered array of the coefficients aᵢⱼ. If this ordered array is denoted by
A, and written as

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \qquad \text{(4.2-1)}$$

then it will be shown that the set of linear equations can be written as
Ax = y by a suitable definition of the “product Ax.” Certainly, this
expression is considerably simpler in form than the set of linear equations.
This is one of the major reasons for using matrices. A matrix equation,
or a set of matrix equations, contains a great deal of information in a
compact form. Without the use of this compact notation, the task of
analyzing sets of linear equations is quite cumbersome.
Consider then, the rectangular array of ordered elements of Eq. 4.2-1.
The typical element in the array may be a real or complex number, or a
function of specified variables. A matrix is a rectangular array such as
shown, but distinguished from simply a rectangular array by the fact
that matrices obey certain rules of addition, subtraction, multiplication,
and equality. The elements of the matrix a₁₁, a₁₂, ..., aᵢⱼ are written
with a double subscript notation. The first subscript indicates the row
where the element appears in the array, and the second subscript indicates
the column. A matrix is denoted here by a boldface letter A, B, a, b, etc.,
or by writing the general element [aᵢⱼ] enclosed by square brackets. The
columns of the matrix are called column vectors, and the rows of the matrix
are called row vectors. A matrix with m rows and n columns is called an

(m x n) matrix or is said to be of order m by n. For a square matrix


(m = n) the matrix is of order n.

Principal Types of Matrices

(a) Column matrix. An (m x 1) matrix is called a column matrix or a


column vector, since it consists of a single column and m rows. It is
denoted here by a lower-case, boldface letter, as

$$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}$$
(b) Row matrix. A matrix containing a single row of elements, such as


a (1 x n) matrix, is called a row matrix or a row vector.
(c) Diagonal matrix. The principal diagonal of a square matrix consists
of the elements au. A diagonal matrix is a square matrix all of whose
elements which do not lie on the principal diagonal are zero.

$$\text{Diagonal matrix} = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}$$

(d) Unit matrix. A unit matrix is a diagonal matrix whose principal


diagonal elements are equal to unity. The unit matrix here is given the
symbol I.
$$\mathbf{I} = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}$$

(e) Null matrix. A matrix which has all of its elements identically equal
to zero is called a zero or null matrix.
(/) Transpose matrix. Transposing a matrix A is the operation whereby
the rows and columns are interchanged. The transpose matrix is denoted

by Aᵀ. Thus, if A = [aᵢⱼ], then Aᵀ = [aⱼᵢ], i.e., the element of the ith
row, jth column of A appears in the jth row, ith column of Aᵀ. If A is an
(m × n) matrix, then Aᵀ is an (n × m) matrix.

Special Types of Matrices

(a) Symmetric matrix. A square matrix, all of whose elements are real,
is said to be symmetric if it is equal to its transpose, i.e., if

A = Aᵀ   or   aᵢⱼ = aⱼᵢ   (i, j = 1, ..., n)

(b) Skew-symmetric matrix. A real square matrix is said to be skew-
symmetric if

A = −Aᵀ   or   aᵢⱼ = −aⱼᵢ   (i, j = 1, ..., n)

This, of course, implies that the elements on the principal diagonal are
identically zero.
(c) Conjugate matrix. If the elements of the matrix A are complex
(aᵢⱼ = αᵢⱼ + jβᵢⱼ), then the conjugate matrix B has elements bᵢⱼ =
αᵢⱼ − jβᵢⱼ. This is written in the form B = A*.
(d) Associate matrix. The associate matrix of A is the transposed
conjugate of A, i.e., associate of A = (A*)ᵀ.
(e) Real matrix. If A = A*, then A is a real matrix.
(f) Imaginary matrix. If A = −A*, then A is pure imaginary.
(g) Hermitian matrix. If a matrix is equal to its associate matrix, the
matrix is said to be Hermitian, i.e., if A = (A*)ᵀ, then A is Hermitian.
(h) Skew-Hermitian matrix. If A = −(A*)ᵀ, then A is skew-Hermitian.

Elementary Operations

Addition of Matrices. If two matrices A and B are both of order


(m x n), where m and n are two given integers not necessarily different,
then the sum of these two matrices is the matrix C. The i, jth element of
C = A + B is defined by
cᵢⱼ = aᵢⱼ + bᵢⱼ   (4.2-2)

Example 4.2-1. Evaluate the sum of the matrices indicated.

2 "4 2"
-3 0 1

Addition of matrices is commutative and associative, i.e.,

A + B = B + A Commutative
A + (B + C) = (A + B) + C Associative

Subtraction of Matrices. The difference of two matrices A and B both,
say, of order (m × n), where m and n are two given integers not necessarily
different, is the matrix D, where the i, jth element of D = A − B is defined
by

dᵢⱼ = aᵢⱼ − bᵢⱼ   (4.2-3)
Example 4.2-2. Evaluate the difference of the matrices indicated.

“ 2 4" “2 -2“ “0 6“
3 1 _3 0_ -6 1_

Equality of Matrices. Two matrices A and B, which are of equal order,


are equal if their corresponding elements are equal. Thus

A = B (4.2-4)
if and only if aᵢⱼ = bᵢⱼ.

Multiplication of Matrices. The definition of the product of two


matrices A and B comes about in a natural way from the study of linear
transformations. Consider then the linear transformation

$$y_1 = a_{11}x_1 + a_{12}x_2$$
$$y_2 = a_{21}x_1 + a_{22}x_2 \qquad \text{(4.2-5)}$$

The elements y₁ and y₂ can be considered as components of a vector y.
Similarly, x₁ and x₂ can be considered as components of a vector x.
Equation 4.2-5 can then be visualized as the matrix A transforming the
vector x into the vector y. This transformation can be written in the form

y = Ax   (4.2-6)

where

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \qquad \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad \text{and} \qquad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

Equation 4.2-6 could also have been written

$$y_i = \sum_{j=1}^{2} a_{ij}x_j, \qquad i = 1, 2 \qquad \text{(4.2-7)}$$

This leads to the definition of the product of an (m × n) matrix by an
(n × 1) matrix or column vector as

$$\mathbf{Ax} = \left[\sum_{j=1}^{n} a_{ij}x_j\right] \qquad \text{(4.2-8)}$$

This product is referred to as the postmultiplication of A by x, or the
premultiplication of x by A. It is important that this distinction be made,
since in general multiplication is not commutative, i.e., AB ≠ BA.

Assume now that the vector x is formed by the linear transformation

$$x_1 = b_{11}z_1 + b_{12}z_2$$
$$x_2 = b_{21}z_1 + b_{22}z_2 \qquad \text{(4.2-9)}$$

or

$$x_j = \sum_{k=1}^{2} b_{jk}z_k, \qquad j = 1, 2$$

The relationship between the vectors y and z can be obtained by substi-
tuting Eq. 4.2-9 into 4.2-7. The result of this substitution is

$$y_i = \sum_{j=1}^{2} a_{ij}\sum_{k=1}^{2} b_{jk}z_k, \qquad i = 1, 2 \qquad \text{(4.2-10)}$$

Since the order in which the summations are taken can be interchanged,

$$y_i = \sum_{k=1}^{2}\left(\sum_{j=1}^{2} a_{ij}b_{jk}\right)z_k, \qquad i = 1, 2 \qquad \text{(4.2-11)}$$

The transformation from the column vector z to the column vector y
can be written in matrix notation as

y = ABz   (4.2-12)

The product AB can be viewed as the matrix C where

C = AB   (4.2-13)

or

$$\mathbf{C} = \left[\sum_{j=1}^{2} a_{ij}b_{jk}\right]$$

A typical element c_{ik} of C is the summation shown inside the brackets.
The first subscript inside the brackets is the row index of the matrix
product, and the last subscript inside the brackets is the column index.
In order for the product to be defined, the number of columns of A must
equal the number of rows of B. Proceeding to the general case, the product
of two matrices, A(m × n) by B(n × p), is defined in terms of the typical
element of the product C as

$$\mathbf{C} = \mathbf{AB} = \left[\sum_{k=1}^{n} a_{ik}b_{kj}\right] \qquad \text{(4.2-14)}$$

Thus the i, jth element of C is the sum of the products of the elements of the
ith row of A and the corresponding elements of the jth column of B. The
resulting matrix C is (m × p).
If the number of columns of A is equal to the number of rows of B, the
two matrices are said to be conformable, in that the product AB exists.
For the case where A is (m x n) and B is (n x m), then both products AB

and BA exist. The product AB is (m x m), and the product BA is (n x n).


They are, of course, generally not equal. Even in the case where m = n,
and hence both products AB and BA are (m x m), the two products are
not necessarily equal. However, if they are equal, i.e., AB = BA, the
two matrices are said to commute.
Example 4.2-3. Evaluate the products of the matrices indicated.

(a)
$$\mathbf{AB} = \begin{bmatrix} 1 & -1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 2 & 6 \end{bmatrix}$$

(b)
$$\mathbf{BA} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 5 & 3 \\ 2 & 2 \end{bmatrix} \ne \mathbf{AB}$$

(c)
$$\begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$
Note that AB = [0] does not imply that A = [0] or B = [0].
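
The products of Example 4.2-3 (as reconstructed above) are easily checked with NumPy; this sketch is an addition to the text, illustrating that matrix multiplication is not commutative and that AB = [0] does not force A or B to be the null matrix.

```python
import numpy as np

A = np.array([[1, -1], [2, 2]])
B = np.array([[1, 2], [0, 1]])
print(A @ B)                     # [[1 1] [2 6]]
print(B @ A)                     # [[5 3] [2 2]]: not equal to A @ B

C = np.array([[1, 1], [2, 2]])
D = np.array([[-1, 1], [1, -1]])
print(C @ D)                     # the null matrix, although neither C nor D is null
```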

From the preceding discussion it may be seen that matrix multiplication


is associative and distributive, but not in general commutative.

A(BC) = (AB)C        Associative

A(B + C) = AB + AC
(A + B)C = AC + BC   Distributive        (4.2-15)

Scalar Multiplication. Premultiplication or postmultiplication of a
matrix by a scalar multiplier k multiplies each element of the matrix by k.
The typical element of the product kA is kaᵢⱼ.
Example 4.2-4. Perform the indicated multiplication of the matrix by the scalar 2.
$$2\begin{bmatrix} 1 & -1 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 2 & -2 \\ 4 & 6 \end{bmatrix}$$

Multiplication of Transpose Matrices. The product of two transpose
matrices, Bᵀ and Aᵀ, is equal to the transpose of the product of the orig-
inal two matrices B and A, taken in reverse order, i.e.,

BᵀAᵀ = (AB)ᵀ   (4.2-16)

This is easily shown by taking the transpose of C = AB. The typical
element of the product AB is given by

$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} \qquad i = 1, \ldots, m; \quad j = 1, \ldots, p$$

where it is assumed that A is (m × n) and B is (n × p). The typical element
of Cᵀ is

$$c^T_{ij} = c_{ji} = \sum_{k=1}^{n} a_{jk}b_{ki} = \sum_{k=1}^{n} b^T_{ik}a^T_{kj}$$

Therefore Cᵀ = (AB)ᵀ = BᵀAᵀ. The transpose of AB is equal to the
product of the transposed matrices taken in reverse order.

Multiplication by a Diagonal Matrix. Postmultiplication of a matrix A


by the diagonal matrix D is equivalent to an operation on the columns of
A. Premultiplication of a matrix A by the diagonal matrix D is an opera¬
tion on the rows of A. Obviously, premultiplication or postmultiplication
by the unit matrix I leaves the matrix unchanged, i.e.,

IA = AI = A

Example 4.2-5. Evaluate the given matrix products.

Postmultiplication:
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} d_{11} & 0 \\ 0 & d_{22} \end{bmatrix} = \begin{bmatrix} a_{11}d_{11} & a_{12}d_{22} \\ a_{21}d_{11} & a_{22}d_{22} \end{bmatrix}$$

Premultiplication:
$$\begin{bmatrix} d_{11} & 0 \\ 0 & d_{22} \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} a_{11}d_{11} & a_{12}d_{11} \\ a_{21}d_{22} & a_{22}d_{22} \end{bmatrix}$$

Products of Partitioned Matrices. It is sometimes convenient to con¬


struct a matrix from elements which are matrices, or to reduce a given
matrix to another matrix whose elements are submatrices of the original
matrix. For example, by drawing the vertical and horizontal dotted lines
in matrix A, which is of order (3 x 3), a (2 x 2) matrix can be written using
the submatrices A4, A2, A3, and A4.

aw a\2 ai3
'A! A3
A =
h>l a22 a23
_A2 a4_
_a3\ a32 a33_
where
a.21 a 22 a23
A-i — [/hi] A2 — A3 — [^12 ^13] A4 —

L^3iJ a32 °33

Assume that the matrix B, which is also of order (3 x 3) is partitioned


in the same manner, yielding

$$\mathbf{B} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} = \begin{bmatrix} \mathbf{B}_1 & \mathbf{B}_3 \\ \mathbf{B}_2 & \mathbf{B}_4 \end{bmatrix}$$

Clearly, the sum of matrices A and B can be expressed in terms of the
submatrices as

$$\mathbf{A} + \mathbf{B} = \begin{bmatrix} \mathbf{A}_1 + \mathbf{B}_1 & \mathbf{A}_3 + \mathbf{B}_3 \\ \mathbf{A}_2 + \mathbf{B}_2 & \mathbf{A}_4 + \mathbf{B}_4 \end{bmatrix}$$

The product of A and B can also be expressed in terms of the submatrices
as

$$\mathbf{AB} = \begin{bmatrix} (\mathbf{A}_1\mathbf{B}_1 + \mathbf{A}_3\mathbf{B}_2) & (\mathbf{A}_1\mathbf{B}_3 + \mathbf{A}_3\mathbf{B}_4) \\ (\mathbf{A}_2\mathbf{B}_1 + \mathbf{A}_4\mathbf{B}_2) & (\mathbf{A}_2\mathbf{B}_3 + \mathbf{A}_4\mathbf{B}_4) \end{bmatrix}$$

In general, the product of two matrices can be expressed in terms of the


submatrices only if the partitioning produces submatrices which are
conformable. The grouping of columns in A must be equal to the grouping
of rows in B. If this condition is satisfied (it is assumed that A and B
are originally conformable), then the submatrices formed by the partition¬
ing may be treated as ordinary elements.
Example 4.2-6. Evaluate the indicated matrix product by means of partitioning.

$$\mathbf{AB} = \begin{bmatrix} 0 & 2 & 3 \\ 1 & -1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 3 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix}$$

$$= \begin{bmatrix} 0 & 2 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 3 \\ -1 & 2 & 1 \end{bmatrix} + \begin{bmatrix} 3 \\ 0 \end{bmatrix}\begin{bmatrix} 0 & 1 & -1 \end{bmatrix}$$

$$= \begin{bmatrix} -2 & 4 & 2 \\ 2 & -2 & 2 \end{bmatrix} + \begin{bmatrix} 0 & 3 & -3 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} -2 & 7 & -1 \\ 2 & -2 & 2 \end{bmatrix}$$
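
The partitioned product of Example 4.2-6 can be checked numerically (our sketch, not part of the text); the column grouping of A must match the row grouping of B.

```python
import numpy as np

A = np.array([[0, 2, 3], [1, -1, 0]])
B = np.array([[1, 0, 3], [-1, 2, 1], [0, 1, -1]])

A1, A2 = A[:, :2], A[:, 2:]          # column grouping of A ...
B1, B2 = B[:2, :], B[2:, :]          # ... matching the row grouping of B
blockwise = A1 @ B1 + A2 @ B2
assert np.array_equal(blockwise, A @ B)
```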

Differentiation of a Matrix. The usual ideas of differentiation and


integration associated with scalar variables carry over to the differentiation
and integration of rfiatrices and matrix products, provided that the original
order of the factors involved is preserved. Let Aif) be an (m x n) matrix
whose elements aift) are differentiable functions of the scalar variable t.
The derivative of A(t) with respect to the variable t is defined as

dau(t) da12(t) daln(t)


dt dt dt
da21(t) da22(t) da2n(f)
4 [A(t)] = A(0 = (4.2-17)
dt dt dt dt

daml(t) dnm2(0 mn(0

dt dt dt

From this definition it is evident that the derivative of the sum of two
matrices is the sum of the derivatives of the matrices, or

$$\frac{d}{dt}[\mathbf{A}(t) + \mathbf{B}(t)] = \dot{\mathbf{A}}(t) + \dot{\mathbf{B}}(t) \qquad \text{(4.2-18)}$$

The derivative of a matrix product is formed in the same manner as the
derivative of a scalar product, with the exception that the order of the
product must be preserved. Thus, as typical examples,

$$\frac{d}{dt}[\mathbf{A}(t)\mathbf{B}(t)] = \dot{\mathbf{A}}(t)\mathbf{B}(t) + \mathbf{A}(t)\dot{\mathbf{B}}(t) \qquad \text{(4.2-19)}$$

and

$$\frac{d}{dt}[\mathbf{A}^3(t)] = \dot{\mathbf{A}}(t)\mathbf{A}^2(t) + \mathbf{A}(t)\dot{\mathbf{A}}(t)\mathbf{A}(t) + \mathbf{A}^2(t)\dot{\mathbf{A}}(t) \qquad \text{(4.2-20)}$$

Integration of a Matrix. Similar to the definition of the derivative of a
matrix, the integral of a matrix is defined as the matrix of the integrals
of the elements of the matrix. Thus

$$\int \mathbf{A}(t)\,dt = \begin{bmatrix} \int a_{11}(t)\,dt & \int a_{12}(t)\,dt & \cdots & \int a_{1n}(t)\,dt \\ \int a_{21}(t)\,dt & \int a_{22}(t)\,dt & \cdots & \int a_{2n}(t)\,dt \\ \vdots & & & \vdots \\ \int a_{m1}(t)\,dt & \int a_{m2}(t)\,dt & \cdots & \int a_{mn}(t)\,dt \end{bmatrix} \qquad \text{(4.2-21)}$$
The operator notation Q = ∫( ) dt is commonly used to signify the
integral of a matrix. When the superscript t and the subscript t₀ are affixed
to Q, they indicate the upper and lower limits of the integration. Thus

$$\mathbf{Q}_{t_0}^{t}(\mathbf{A}) = \int_{t_0}^{t} \mathbf{A}(t)\,dt \qquad \text{(4.2-22)}$$

Example 4.2-7. Find $\mathbf{Q}_0^t(\mathbf{A})$, if

$$\mathbf{A} = \begin{bmatrix} t & 1 \\ 1 & t \end{bmatrix}$$

$$\mathbf{Q}_0^t(\mathbf{A}) = \begin{bmatrix} \int_0^t t\,dt & \int_0^t dt \\ \int_0^t dt & \int_0^t t\,dt \end{bmatrix} = \begin{bmatrix} t^2/2 & t \\ t & t^2/2 \end{bmatrix}$$
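
The matrix product rule of Eq. 4.2-19 and the integral of Example 4.2-7 can be checked with SymPy (this sketch is ours; the second factor B is an arbitrary choice, not from the text).

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[t, 1], [1, t]])
B = sp.Matrix([[sp.exp(t), 0], [t**2, 1]])       # an arbitrary second factor

lhs = sp.diff(A*B, t)
rhs = sp.diff(A, t)*B + A*sp.diff(B, t)          # product rule with the order preserved
assert sp.simplify(lhs - rhs) == sp.zeros(2, 2)

print(A.applyfunc(lambda aij: sp.integrate(aij, (t, 0, t))))   # Matrix([[t**2/2, t], [t, t**2/2]])
```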

4.3 DETERMINANTS

The theory of determinants is also useful when dealing with the solution
of simultaneous linear algebraic equations. Determinant notation simpli¬
fies the solution of these equations, reducing the solution to a set of rules
of procedure. As an example, the set of equations

$$a_{11}x_1 + a_{12}x_2 = y_1$$
$$a_{21}x_1 + a_{22}x_2 = y_2 \qquad \text{(4.3-1)}$$

can be solved by finding an expression for x₁ from the first equation, and
then substituting this expression into the second equation. The result of
performing this operation is the solution

$$x_1 = \frac{y_1 a_{22} - y_2 a_{12}}{a_{11}a_{22} - a_{12}a_{21}} \qquad x_2 = \frac{y_2 a_{11} - y_1 a_{21}}{a_{11}a_{22} - a_{12}a_{21}}$$

This solution assumes that the denominator (a₁₁a₂₂ − a₁₂a₂₁) is not zero.
Equation 4.3-1 can be written in matrix notation as

Ax = y   (4.3-2)

where

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad \text{and} \qquad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$$

The determinant of the matrix A is written as

$$|\mathbf{A}| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$$

and the value of the determinant is defined to be (a₁₁a₂₂ − a₁₂a₂₁). A
determinant is only defined for a square array of elements, and the order
of the determinant is equal to the number of rows or columns of elements.
Thus this determinant is called a second order determinant. If the deter-
minants

$$|\mathbf{A}_1| = \begin{vmatrix} y_1 & a_{12} \\ y_2 & a_{22} \end{vmatrix} = (y_1 a_{22} - y_2 a_{12})$$

$$|\mathbf{A}_2| = \begin{vmatrix} a_{11} & y_1 \\ a_{21} & y_2 \end{vmatrix} = (a_{11}y_2 - y_1 a_{21})$$

are formed, then the solution to Eq. 4.3-1 can be expressed in terms of
these determinants as
lAii
*i =
IA |
provided |A| ^ 0.
The definition of the value of the determinant of any square matrix A
follows. The determinant of the (n x n) square matrix A, written as |A|,
has a value which is the algebraic sum of all possible products of n elements
which contain one and only one element from each row and column,
where each product is either positive or negative depending upon the
following rule: Arrange the possible products in ascending order with
respect to the first subscript, e.g., ar3a22a31 • • • . Define an inversion as the
occurrence of a greater integer before a smaller one. The sign of the
product is positive if the number of inversions of the second subscript
is even; otherwise it is negative. For example, the sequence 321 has three
inversions: 3 before 2, 3 before 1, and 2 before 1.

Example 4.3-1. Evaluate the determinant

$$|\mathbf{A}| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$

The possible products and their signs are

Possible Products    Number of Inversions    Sign
a₁₁a₂₂a₃₃                    0                 +
a₁₁a₂₃a₃₂                    1                 −
a₁₂a₂₁a₃₃                    1                 −
a₁₂a₂₃a₃₁                    2                 +
a₁₃a₂₁a₃₂                    2                 +
a₁₃a₂₂a₃₁                    3                 −

The determinant of A is then

$$|\mathbf{A}| = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - (a_{11}a_{23}a_{32} + a_{12}a_{21}a_{33} + a_{13}a_{22}a_{31})$$

For a determinant of order n, there are n! such products.

From the preceding definition, the following properties of a determinant


can be established:

1. The value of the determinant is unity if all the elements on the
principal diagonal (a₁₁, a₂₂, ..., aₙₙ) are unity and all other elements are
zero.

2. The value of the determinant is zero if all the elements of any row
(or column) are zero, or if the corresponding elements of any two rows
(or two columns) are equal or have a common ratio.
3. The value of the determinant is unchanged if the rows and columns
are interchanged.
4. The sign of the determinant is reversed if any two rows (or two
columns) are interchanged.
5. The value of the determinant is multiplied by a constant k if all the
elements of any row (or column) are multiplied by k.
6. The value of the determinant is unchanged if k times the elements of
any row (or column) are added to the corresponding elements of another
row (or column).

Minors and Cofactors

Minors. If the ith row and jth column of the determinant |A| are
deleted, the remaining n − 1 rows and n − 1 columns form a determinant
|Mᵢⱼ|. This determinant is called the minor of element aᵢⱼ. For example,
if the determinant |A| is given by

$$|\mathbf{A}| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$

the minor of element a₁₂ is

$$|\mathbf{M}_{12}| = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$$

A minor of |A|, whose diagonal elements are also diagonal elements of
|A|, is called a principal minor of |A|. These minors are of particular
importance, as is seen later.

Cofactors. The cofactor of the element aᵢⱼ is equal to the minor of
aᵢⱼ, with the sign (−1)^{i+j} affixed to it. Thus the cofactor of aᵢⱼ, written
Cᵢⱼ, is defined as

$$C_{ij} = (-1)^{i+j}\,|\mathbf{M}_{ij}| \qquad \text{(4.3-3)}$$

The cofactor C₁₂ of the previous example is

$$C_{12} = (-1)^3\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$$

Example 4.3-2. Evaluate the minors and cofactors of the third order determinant
shown in Example 4.3-1.

$$|\mathbf{M}_{11}| = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} = a_{22}a_{33} - a_{23}a_{32} = C_{11}$$
$$|\mathbf{M}_{12}| = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} = a_{21}a_{33} - a_{23}a_{31} = -C_{12}$$
$$|\mathbf{M}_{13}| = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} = a_{21}a_{32} - a_{22}a_{31} = C_{13}$$
$$|\mathbf{M}_{21}| = \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} = a_{12}a_{33} - a_{13}a_{32} = -C_{21}$$
$$|\mathbf{M}_{22}| = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} = a_{11}a_{33} - a_{13}a_{31} = C_{22}$$
$$|\mathbf{M}_{23}| = \begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} = a_{11}a_{32} - a_{12}a_{31} = -C_{23}$$
$$|\mathbf{M}_{31}| = \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix} = a_{12}a_{23} - a_{13}a_{22} = C_{31}$$
$$|\mathbf{M}_{32}| = \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix} = a_{11}a_{23} - a_{13}a_{21} = -C_{32}$$
$$|\mathbf{M}_{33}| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} = C_{33}$$

Laplace Expansion of a Determinant

If the results of Examples 4.3-1 and 4.3-2 are compared, it is seen that
the determinant of the third order matrix A can be expressed in terms of the
elements of a single row or column and their respective cofactors. Thus

$$|\mathbf{A}| = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}$$
$$= a_{21}C_{21} + a_{22}C_{22} + a_{23}C_{23}$$
$$= a_{31}C_{31} + a_{32}C_{32} + a_{33}C_{33}$$
$$= a_{11}C_{11} + a_{21}C_{21} + a_{31}C_{31}$$
$$= a_{12}C_{12} + a_{22}C_{22} + a_{32}C_{32}$$
$$= a_{13}C_{13} + a_{23}C_{23} + a_{33}C_{33}$$

The Laplace expansion formula for the determinant of any (n x n) matrix


A is a direct generalization of the preceding observation. This formula
states that the determinant of a square matrix A is given by the sum of the

products of the elements of any single row or column and their respective
cofactors. Thus

$$|\mathbf{A}| = \sum_{i=1}^{n} a_{ij}C_{ij} \qquad j = 1, \text{ or } 2, \ldots, \text{ or } n \quad \text{(column expansion)}$$

$$|\mathbf{A}| = \sum_{j=1}^{n} a_{ij}C_{ij} \qquad i = 1, \text{ or } 2, \ldots, \text{ or } n \quad \text{(row expansion)} \qquad \text{(4.3-4)}$$

Example 4.3-3. Find the determinant of A, where

$$\mathbf{A} = \begin{bmatrix} 0 & -1 & 1 \\ 1 & 2 & 0 \\ 2 & 0 & 2 \end{bmatrix}$$

Using the formula established in Example 4.3-1,

$$|\mathbf{A}| = [(0)(2)(2) + (-1)(0)(2) + (1)(1)(0)] - [(0)(0)(0) + (-1)(1)(2) + (1)(2)(2)] = -2$$

Using the Laplace expansion along the first row,

$$|\mathbf{M}_{11}| = 4 \quad C_{11} = 4 \qquad |\mathbf{M}_{12}| = 2 \quad C_{12} = -2 \qquad |\mathbf{M}_{13}| = -4 \quad C_{13} = -4$$

$$|\mathbf{A}| = (0)(4) + (-1)(-2) + (1)(-4) = -2$$

Using the Laplace expansion along the second column,

$$|\mathbf{M}_{12}| = 2 \quad C_{12} = -2 \qquad |\mathbf{M}_{22}| = -2 \quad C_{22} = -2 \qquad |\mathbf{M}_{32}| = -1 \quad C_{32} = 1$$

$$|\mathbf{A}| = (-1)(-2) + (2)(-2) + (0)(1) = -2$$

Using the six fundamental properties of a determinant, the value of the determinant
can be found by reducing the determinant to a diagonal determinant. The value of the
determinant is then the product of the diagonal elements. The following steps show
how this method is applied.

Step 1. Multiply the third row by (−½) and add this to the first row.

$$|\mathbf{A}| = \begin{vmatrix} -1 & -1 & 0 \\ 1 & 2 & 0 \\ 2 & 0 & 2 \end{vmatrix}$$

Step 2. Add the new first row to the second row.

$$|\mathbf{A}| = \begin{vmatrix} -1 & -1 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 2 \end{vmatrix}$$

Step 3. Subtract the third column from the first column.

$$|\mathbf{A}| = \begin{vmatrix} -1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{vmatrix}$$

Step 4. Subtract the first column from the second column.

$$|\mathbf{A}| = \begin{vmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{vmatrix}$$

The value of the determinant is then

$$|\mathbf{A}| = a_{11}a_{22}a_{33} = (-1)(1)(2) = -2$$
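
The Laplace expansion lends itself to a simple recursive routine; the sketch below (an addition, not part of the text) expands along the first row and reproduces the determinant of Example 4.3-3.

```python
def det(M):
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]   # delete row 1 and column j+1
        total += (-1)**j * M[0][j] * det(minor)          # a_{1j} * C_{1j}
    return total

A = [[0, -1, 1], [1, 2, 0], [2, 0, 2]]
print(det(A))   # -2
```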

The second property of a determinant states that the value of a deter¬


minant is zero if the corresponding elements of any two rows (or two
columns) are equal or have a common ratio. In terms of the Laplace
expansion of a determinant this can be formulated as

$$\sum_{j=1}^{n} a_{kj}C_{ij} = 0 \qquad k \ne i \qquad \text{(elements in the $i$th row replaced by elements in the $k$th row)}$$

$$\sum_{j=1}^{n} a_{jk}C_{ji} = 0 \qquad k \ne i \qquad \text{(elements in the $i$th column replaced by elements in the $k$th column)} \qquad \text{(4.3-5)}$$

Using the Kronecker delta notation,

$$\delta_{ik} = \begin{cases} 0 & i \ne k \\ 1 & i = k \end{cases}$$

Eqs. 4.3-4 and 4.3-5 can be combined into the more useful relationship

$$\sum_{j=1}^{n} a_{kj}C_{ij} = |\mathbf{A}|\,\delta_{ik}$$
$$\sum_{j=1}^{n} a_{jk}C_{ji} = |\mathbf{A}|\,\delta_{ik} \qquad \text{(4.3-6)}$$

This relationship is of importance in the derivation of Cramer’s rule, as is


seen in Section 4.6.

Pivotal Condensation¹

Chio’s method of pivotal condensation is a convenient procedure for


evaluating a given determinant by computation of a set of second order
determinants. The method is as follows.
Choose any element ai} in the determinant as the pivot term. Select any
element aik which is in the same row as au, and any element aQj which is

in the same column as aij. The elements aqk, aqj, aik, and aij are then
used to form a second order determinant, with the elements kept in their
proper order. Form all such second order determinants with the pivot
term as one of the elements. The original determinant can now be ex¬
pressed as an n — 1 order determinant using the second order determinants
as elements, and 1 /a^r2 as a multiplying factor. The position of the new
elements can be found by subtracting one from each of the subscripts
of the element in the second order determinant that lies on a diagonal
with the pivot term, if this element lies below the pivot term; the sub¬
scripts are unchanged if this diagonal element lies above the pivot term.
By repeating this procedure, the value of a determinant of high order
can be computed by successively reducing the order of the determinant
by one.
Example 4.3-4. Write the given determinant as a determinant of second order deter-
minants by means of pivotal condensation.

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = \frac{1}{a_{11}}\begin{vmatrix} \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} & \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix} \\ \begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} & \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} \end{vmatrix}$$

Example 4.3-5. Evaluate

$$|\mathbf{A}| = \begin{vmatrix} 1 & 3 & 4 & 2 \\ 4 & 5 & 6 & 1 \\ 3 & 5 & 8 & 9 \\ 4 & 6 & 2 & 5 \end{vmatrix}$$

using pivotal condensation.
Using the term a₁₁ as the pivot,

$$|\mathbf{A}| = \frac{1}{1^2}\begin{vmatrix} \begin{vmatrix} 1 & 3 \\ 4 & 5 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 4 & 6 \end{vmatrix} & \begin{vmatrix} 1 & 2 \\ 4 & 1 \end{vmatrix} \\ \begin{vmatrix} 1 & 3 \\ 3 & 5 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 3 & 8 \end{vmatrix} & \begin{vmatrix} 1 & 2 \\ 3 & 9 \end{vmatrix} \\ \begin{vmatrix} 1 & 3 \\ 4 & 6 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 4 & 2 \end{vmatrix} & \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} \end{vmatrix} = \begin{vmatrix} -7 & -10 & -7 \\ -4 & -4 & 3 \\ -6 & -14 & -3 \end{vmatrix}$$

Again using the new a₁₁ as the pivot term, after multiplying the first and second columns
by −1,

$$|\mathbf{A}| = \frac{1}{7}\begin{vmatrix} \begin{vmatrix} 7 & 10 \\ 4 & 4 \end{vmatrix} & \begin{vmatrix} 7 & -7 \\ 4 & 3 \end{vmatrix} \\ \begin{vmatrix} 7 & 10 \\ 6 & 14 \end{vmatrix} & \begin{vmatrix} 7 & -7 \\ 6 & -3 \end{vmatrix} \end{vmatrix} = \frac{1}{7}\begin{vmatrix} -12 & 49 \\ 38 & 21 \end{vmatrix} = -302$$

Note that each step required only the computation of a set of second order determinants.
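
The condensation step is short enough to program directly. The sketch below (ours, not from the text) always pivots on the (1,1) element, which is assumed nonzero, and reproduces the determinant of Example 4.3-5; exact rational arithmetic avoids roundoff in the divisions.

```python
from fractions import Fraction

def det_chio(M):
    n = len(M)
    if n == 1:
        return M[0][0]
    p = Fraction(M[0][0])                                  # pivot term a_11 (assumed nonzero)
    condensed = [[M[0][0]*M[i][j] - M[0][j]*M[i][0]        # the second order determinants
                  for j in range(1, n)] for i in range(1, n)]
    return det_chio(condensed) / p**(n - 2)

A = [[1, 3, 4, 2], [4, 5, 6, 1], [3, 5, 8, 9], [4, 6, 2, 5]]
print(det_chio(A))    # -302
```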

To show how the method of pivotal condensation was obtained, a


proof is given for a third order determinant. The extension to high order
order determinants is then evident. Let

#11 #12 #13

|A| = #21 #22 #23

#31 #32 #33

Multiply all the rows of the determinant, except the row containing the
pivot term, by the value of the pivot term. This multiplies the determinant
by the pivot term raised to the n — 1 power. To keep the value of the
determinant unchanged, the resulting determinant is divided by the pivot
term raised to the n — 1 power. Using a±1 as the pivot term, the result of
performing this step is
a ii a 12 a 13
1
|A| = #21#11 #22#11 #23#11
a ii
#31#11 #32#11 #33#11

The first row is then multiplied by the term directly beneath the pivot
term (in the original determinant) and is then subtracted from the row
containing that term.

#11 #12 #13

0 (#22#11 #21#12) (#23#11 #21#13)

#31#11 #32#11 #33#11

This process is repeated until all the terms in the column of the pivot
term are zero except for the pivot term, i.e.,

#11 #12 #13

0 (# 22# 11 #21#12) (#23#11 #21#13)

0 (#32#11 #12#3l) (#33#11 #13#3l)

The resulting determinant is then seen to be equal to a determinant of


order n — 1 divided by the pivot term raised to the (n — 2) power.

#11 #12 #11 #13

1 #21 #22 #21 #23

#11 #11 #12 #11 #13

#31 #32 #31 #33



The method of pivotal condensation is quite useful in finding the solution


to a set of linear nonhomogeneous equations and in the analysis of
circuits. In the latter application, the method of pivotal condensation is a
rapid procedure for eliminating unnecessary nodes from an admittance
matrix formulation of a circuit.
It is interesting to note that the evaluation of an nth order deter-
minant by the Laplace expansion rule generally requires (n!)(n − 1)
multiplications. The method of pivotal condensation requires (n³/3 + n²
− n/3) multiplications; hence this method is a more efficient procedure.
For example, if n = 6, then the Laplace expansion requires 3600 multi-
plications. The method of pivotal condensation requires only 106
multiplications.

Product of Determinants

It may be shown that the determinant of the product of two square


matrices A and B of order n is the product of the determinants |A| and
|B|. That is, if C = AB, then

|C| = |A| • |B| (4.3-7)

Example 4.3-6. Evaluate the product of the given determinants.

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}\cdot\begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix} = \begin{vmatrix} (a_{11}b_{11} + a_{12}b_{21}) & (a_{11}b_{12} + a_{12}b_{22}) \\ (a_{21}b_{11} + a_{22}b_{21}) & (a_{21}b_{12} + a_{22}b_{22}) \end{vmatrix}$$

Derivative of a Determinant

As a consequence of Eq. 4.3-4 (Laplace expansion formula), the deriv-
ative of a determinant with respect to one of its elements is equal to the
cofactor of that element, i.e.,

$$\frac{\partial |\mathbf{A}|}{\partial a_{ij}} = C_{ij} \qquad \text{(4.3-8)}$$

If the elements are functions of a parameter, then the derivative of the
determinant with respect to the parameter is the sum of n (n × n) determi-
nants, in which the ith (i = 1, 2, ..., n) is the original determinant except
that each element in the ith row (or column) is replaced by its derivative.
For example, if the determinant is given by

$$|\mathbf{A}| = \begin{vmatrix} a_{11}(x) & a_{12}(x) \\ a_{21}(x) & a_{22}(x) \end{vmatrix} = a_{11}(x)a_{22}(x) - a_{21}(x)a_{12}(x)$$

then the derivative with respect to x is given by

$$\frac{d}{dx}|\mathbf{A}| = a_{11}(x)\frac{d}{dx}[a_{22}(x)] + a_{22}(x)\frac{d}{dx}[a_{11}(x)] - a_{21}(x)\frac{d}{dx}[a_{12}(x)] - a_{12}(x)\frac{d}{dx}[a_{21}(x)]$$

$$= \begin{vmatrix} a_{11}(x) & \dfrac{d}{dx}[a_{12}(x)] \\ a_{21}(x) & \dfrac{d}{dx}[a_{22}(x)] \end{vmatrix} + \begin{vmatrix} \dfrac{d}{dx}[a_{11}(x)] & a_{12}(x) \\ \dfrac{d}{dx}[a_{21}(x)] & a_{22}(x) \end{vmatrix} \qquad \text{(4.3-9)}$$

4.4 THE ADJOINT AND INVERSE MATRICES

If A is a square matrix and Cᵢⱼ is the cofactor of aᵢⱼ, then the matrix
formed by the cofactors Cⱼᵢ is defined as the adjoint matrix of A, i.e.,

Adj A = [Cⱼᵢ]   (4.4-1)

Thus the adjoint matrix is the transpose of the matrix formed by replacing
the elements aᵢⱼ by their cofactors.
Example 4.4-1. Find the adjoint matrix of

$$\mathbf{A} = \begin{bmatrix} 1 & 0 & -2 \\ 2 & 3 & 0 \\ 1 & 2 & -1 \end{bmatrix}$$

C₁₁ = −3   C₁₂ = 2    C₁₃ = 1
C₂₁ = −4   C₂₂ = 1    C₂₃ = −2
C₃₁ = 6    C₃₂ = −4   C₃₃ = 3

Therefore

$$\text{Adj }\mathbf{A} = [C_{ji}] = \begin{bmatrix} -3 & -4 & 6 \\ 2 & 1 & -4 \\ 1 & -2 & 3 \end{bmatrix}$$
Equation 4.3-6 indicates that

$$\left[\sum_{j=1}^{n} a_{kj}C_{ij}\right] = |\mathbf{A}|\,\mathbf{I} \qquad \text{(4.4-2)}$$

Therefore the product of the matrix [aᵢⱼ] and the adjoint matrix [Cⱼᵢ] is
equal to

$$[a_{ij}][C_{ji}] = |\mathbf{A}|\,\mathbf{I} \qquad \text{(4.4-3)}$$

which is a diagonal matrix with all its elements equal to the determinant
of the coefficient matrix A. If both sides of Eq. 4.4-3 are divided by |A|,

provided |A| ≠ 0,

$$\frac{\mathbf{A}\,(\text{Adj }\mathbf{A})}{|\mathbf{A}|} = \mathbf{I} \qquad \text{(4.4-4)}$$

From this equation it seems natural to define the matrix (Adj A)/|A| as
the inverse or reciprocal of A, such that

$$\mathbf{A}\mathbf{A}^{-1} = \mathbf{I} \qquad \text{(4.4-5)}$$

where

$$\mathbf{A}^{-1} = \frac{\text{Adj }\mathbf{A}}{|\mathbf{A}|} \qquad (|\mathbf{A}| \ne 0) \qquad \text{(4.4-6)}$$

It is evident that A⁻¹A = I, i.e., that a matrix and its inverse commute.
Only square matrices can possess inverses.
Example 4.4-2. Find the inverse of A and verify that AA⁻¹ = I, if

$$\mathbf{A} = \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix}, \qquad |\mathbf{A}| = 1, \qquad \mathbf{A}^{-1} = \frac{\text{Adj }\mathbf{A}}{|\mathbf{A}|} = \begin{bmatrix} 2 & -3 \\ -1 & 2 \end{bmatrix}$$

To show that AA⁻¹ = I,

$$\mathbf{A}\,\frac{\text{Adj }\mathbf{A}}{|\mathbf{A}|} = \begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 2 & -3 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \mathbf{I}$$
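
Equation 4.4-6 translates directly into a small routine; the sketch below (ours, not from the text) builds the adjoint from the cofactor definition and reproduces the inverse of Example 4.4-2.

```python
import numpy as np

def adjoint(A):
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1)**(i + j) * np.linalg.det(minor)   # cofactor C_ij
    return C.T                                               # Adj A = [C_ji]

A = np.array([[2.0, 3.0], [1.0, 2.0]])
A_inv = adjoint(A) / np.linalg.det(A)
print(A_inv)                         # [[ 2. -3.] [-1.  2.]]
print(A @ A_inv)                     # the unit matrix
```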

Products of Inverse Matrices

The products of a string of inverse matrices obey the same rules of
transposition as do the products of transpose matrices. To show this,
consider the product C = AB. Premultiply both sides of the equation
by B⁻¹A⁻¹ and postmultiply both sides by C⁻¹. This results in the
relationship

B⁻¹A⁻¹ = C⁻¹   (4.4-7)

Example 4.4-3. Find the inverse matrix of the product AB where

$$\mathbf{A} = \begin{bmatrix} 1 & -2 \\ 0 & 2 \end{bmatrix} \qquad \mathbf{B} = \begin{bmatrix} 1 & -1 \\ 2 & 1 \end{bmatrix}$$

The product AB is

$$\mathbf{AB} = \begin{bmatrix} -3 & -3 \\ 4 & 2 \end{bmatrix}$$

The inverse matrix (AB)⁻¹ is given by

$$(\mathbf{AB})^{-1} = \frac{\text{Adj}\,(\mathbf{AB})}{|\mathbf{AB}|} = \frac{1}{6}\begin{bmatrix} 2 & 3 \\ -4 & -3 \end{bmatrix}$$

Alternatively, the separate inverses A⁻¹ and B⁻¹ are given by

$$\mathbf{A}^{-1} = \frac{1}{2}\begin{bmatrix} 2 & 2 \\ 0 & 1 \end{bmatrix} \qquad \mathbf{B}^{-1} = \frac{1}{3}\begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix}$$

The product B⁻¹A⁻¹ is given by

$$\mathbf{B}^{-1}\mathbf{A}^{-1} = \frac{1}{6}\begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} 2 & 2 \\ 0 & 1 \end{bmatrix} = \frac{1}{6}\begin{bmatrix} 2 & 3 \\ -4 & -3 \end{bmatrix}$$

Thus (AB)⁻¹ = B⁻¹A⁻¹, which agrees with Eq. 4.4-7.

Derivative of the Inverse Matrix

For a t-value at which A(t) is differentiable and possesses an inverse,
the derivative of A⁻¹(t) is given by

$$\frac{d}{dt}[\mathbf{A}^{-1}(t)] = -\mathbf{A}^{-1}(t)\,\dot{\mathbf{A}}(t)\,\mathbf{A}^{-1}(t) \qquad \text{(4.4-8)}$$

This relationship can be derived by considering

$$\frac{d}{dt}[\mathbf{A}^{-1}(t)\mathbf{A}(t)] = \frac{d}{dt}\,\mathbf{I} = [\mathbf{0}]$$

This is

$$\frac{d}{dt}[\mathbf{A}^{-1}(t)]\,\mathbf{A}(t) + \mathbf{A}^{-1}(t)\,\dot{\mathbf{A}}(t) = [\mathbf{0}]$$

Hence

$$\frac{d}{dt}[\mathbf{A}^{-1}(t)] = -\mathbf{A}^{-1}(t)\,\dot{\mathbf{A}}(t)\,\mathbf{A}^{-1}(t)$$

Note that this is not the same as (dA/dt)⁻¹.

Example 4.4-4. Find the derivative of the inverse of A, where

$$\mathbf{A}(t) = \begin{bmatrix} 0 & 1 \\ -1 & -t \end{bmatrix}$$

The inverse of A is

$$\mathbf{A}^{-1}(t) = \begin{bmatrix} -t & -1 \\ 1 & 0 \end{bmatrix}$$

Clearly,

$$\frac{d}{dt}[\mathbf{A}^{-1}(t)] = \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}$$

Checking this result with Eq. 4.4-8,

$$\frac{d}{dt}[\mathbf{A}^{-1}(t)] = -\begin{bmatrix} -t & -1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} -t & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}$$
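
Equation 4.4-8 is also easy to verify symbolically for the A(t) of Example 4.4-4; this SymPy sketch is an addition to the text.

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[0, 1], [-1, -t]])
Ainv = A.inv()

lhs = sp.diff(Ainv, t)
rhs = -Ainv * sp.diff(A, t) * Ainv           # Eq. 4.4-8
assert sp.simplify(lhs - rhs) == sp.zeros(2, 2)
print(lhs)                                   # Matrix([[-1, 0], [0, 0]])
```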

Some Special Inverse Matrices

Involutary matrix. If AA = I (a matrix is its own inverse), the matrix


is said to be involutary.
Orthogonal matrix. If A-1 = AT, the matrix A is said to be an orthog¬
onal matrix.
Unitary matrix. If A = {(A*)r}_1, then A is unitary. (A is equal to
the inverse of the associate matrix of A.)

4.5 VECTORS AND LINEAR VECTOR SPACES

The rows and columns of a matrix are called row vectors and column
vectors, respectively, in Section 4.2. This is simply an extension of the
more familiar concept of vectors in two- or three-dimensional spaces
to an ^-dimensional space. When n is greater than three, the geometrical
visualization becomes obscure, but the terminology associated with the
familiar coordinate systems is still quite useful. For example, the coordi¬
nate system having the unit vectors
-o~
~r -(T
0
0 l
0
0 0
0
0 5 0
> *2 — J • • • > Ifl
• • •

• •
0
0 0
1

can be thought of as an ^-dimensional system with mutually orthogonal


coordinate axes.

Scalar Products

The scalar product (or inner product) of two vectors x and y is written
as (x, y) and is defined as

(x, y) = (x*)ry = x1*y1 + x2*y2 + • • • + xn*yn


= yTx* xTy* (4.5-1)

Note that the complex conjugates of the components of x are indicated,


since these components may be complex, in general. For the case where
Sec. 4.5 Vectors and Linear Vector Spaces 215

both x and y are real, Eq. 4.5-1 reduces to the more familiar form
n

<x> y) = 2 XiVi = XlVl + XiV2 + • • • + XnVn


i= 1

The scalar product can then (for real x and y) be written as

<x, y) = xry = yTx = <y, x) (4.5-2)


Example 4.5-1. Find the scalar product (x, y) of the vectors

Y
.A. -
-
"1 + f
y —
\7 -
"i + r
2 -V -
The scalar product (x*)ry is equal to <x, y> = (1 — j)(1 + j) + 2(2j) = 2 + 4y.
This is not equal to, but is the conjugate of, x^y* = (1 + /)(! — j) + 2(—2j) — 2— 4j.

Outer Product

If the (n x 1) column vector x is denoted by x) and the (1 x m) row


vector (y*)T is denoted by (y, then the outer product x)(y is the (n x m)
matrix.f

xiVi* XiV2* ' ' ■ XlVm*

xiVi* X2V2* ' ' ' x2Vm*


x><:y = x(y*)2’ = (4.5-3)

• • • '1/ ^
XnVl* XnV* ^no m

Orthogonal Vectors

Two vectors x and y are said to be orthogonal if their scalar product


(x, y) is equal to zero, i.e.,
<x, y) = 0 (4.5-4)

Length of a Vector

The length of a vector x, denoted by ||x|| and sometimes called the


norm of x, is defined as the square root of the scalar product of x and x,
i.e.,
||x|| = V(x, x) = Vx1*x1 + x2*x2 + • • • + Xn*Xn (4.5-5)

As a consequence of this definition, it can be shown that

||x + y|| < ||x|| + ||y|| (Triangle inequality) (4.5-6)


and
|(x, y)| < ||x|| • ||y|| (Schwarz inequality) (4.5-7)

t The outer product is often called the dyadic product.


216 Matrices and Linear Spaces

Unit Vectors

A vector is said to be a unit vector if its length is unity, so that (x, x) = 1.


A unit vector can be obtained from the vector x by dividing each compo¬
nent of the vector x by its length. Thus

X = (4.5-8)
X, x>
Example 4.5-2. Find the unit vector x corresponding to the vector x, where
1 +j 2

.2
v
The scalar product <x, x) = 10. Therefore

~1 +,/2 "

A
VIo
X =

2-7
V io

Linear Independence

A set of m vectors x{, having n components xu, x2i, . . . , xni, is said


to be linearly independent provided that no set of constants Jclt k2, . . . , km
exists (at least one kt must be nonzero) such that
Mr + k2x 2 + • • • kmxm = 0 (4.5-9)
On the basis of the concept of linear independence of vectors, several
important definitions can be stated as follows:
Singular matrix. A square matrix is said to be singular if the rows or
columns are not linearly independent. In this case |A| = 0. If |A| ^ 0,
then the matrix is said to be nonsingular. Hence only nonsingular matrices
possess inverses.
Degeneracy.f If the rows (columns) of a singular matrix are linearly
related by a single relationship, the matrix is said to be simply degenerate,
or of degeneracy 1. If more than a single relationship connects the rows
(columns), the matrix is said to be multiply degenerate. If there are q
such relationships, the matrix is of degeneracy q.
Rank. The rank r of a matrix A is the largest square array in A with a
nonvanishing determinant. If the order of a square matrix is n, then
r — n — q (4.5-10)
Clearly, an (n x n) matrix has rank r < n, only if the matrix is singular,
t The term nullity is often used in place of degeneracy.
Sec. 4.5 Vectors and Linear Vector Spaces 217

Sylvester's law of degeneracy. The degeneracy of the product of two


matrices is at least as great as the degeneracy of either matrix, and at most
as great as the sum of the degeneracies of the matrices.

A clear illustration of Sylvester’s law is given by the product of two


diagonal matrices A and B,

a1b1
@2^2

^3^3
AB

where
A = B =

Assume that A has m zero elements and B has q zero elements, where
q < m. If the zero elements of A do not occur where the zero elements of
B are located, then the total number of zeros in the product AB is m + q.
If the zero elements of B are located where the zero elements of A occur,
then the total number of zero elements in the product AB is m.
Returning to Eq. 4.5-9 and linear independence, the latter can be
expressed in terms of the rank of the matrix formed by the elements of the
m vectors xl5 x2, . . . , xw. This matrix is

^11 X12 x1 m

X21 X22 X 2m
A = m < n

Xn 1 Xni
X nm

If the rank of the matrix A associated with these m vectors is less than
m, i.e., r < m, then there are only r vectors of the set which are linearly
independent. The remaining m — r vectors can be expressed as a linear
combination of these r vectors. Therefore a necessary and sufficient
condition for the vectors to be linearly independent is that the rank of A
be equal to m.

Gramian

The Gramian of a set of vectors is formed by assuming that a relation¬


ship such as Eq. 4.5-9 does exist. By successively taking the scalar products
218 Matrices and Linear Spaces

of x,- and both sides of Eq. 4.5-9, the set of equations

*!> + k2{X2, x2> + • • • + fcm(x1; Xj = 0


kx{x2, xx) + k2{x2, x2> + • • • + km(x2, xj = 0

ki(xm, X,) + k2(xm, x2) H-+ km(xm, xm) = 0

is obtained. As shown in the next section, this set of homogeneous


equations possesses a nontrivial solution for the k{, only if the determinant
of the coefficient matrix [<xi? x,->] vanishes. This determinant is called the
Gramian or Gram determinant, and it is

<*1, Xj) <Xi, X2> • • • <XX, XTO>

<X2, Xt) (X2> X2) • • • <x2, xj


G = (4.5-11)

<xm, Xx> <XM, X2> • • • m’

Therefore a set of vectors is linearly dependent if and only if the Gramian


of the set of vectors is equal to zero. Note that, if the vectors are
orthogonal, the Gramian is a diagonal determinant.

Linear Vector Space3

A linear vector space S consists of a set of vectors such that

(a) the sum of any two vectors in S is also a vector in the set;
(b) every scalar multiple of a vector in the set is a vector in the set; and
(c) the rules for forming sums of vectors, and products of vectors with
scalars have the following properties:

(1) x + y = y + x for any x and y in S;


(2) (x + y) + z = x + (y + z) for any x, y, and z in S;
(3) there exists a vector equal to 0 in S, such that x + 0 = x for any
x in S;
(4) for any x in S, there exists a vector y in S such that x + y = 0;
(5) lx = x for any x in S;
(6) a{bx) = (ab)x for any scalar a and b, and any x in S;
(7) (a + b)x — ax + bx for any scalar a and b, and any x in S;
(8) a(x + y) = ax + ay for any scalar a, and any x and y in S.

The most common example of a linear vector space is the set of all vectors
contained within a three-dimensional Euclidian space. For example, all
the forces acting on a space vehicle constitute a vector space, where the
forces are described as vectors in the particular coordinate system chosen.
Sec. 4.5 Vectors and Linear Vector Spaces 219

If a set of vectors xl5 x2, . . . , xm are contained in the space S, then the
set of all vectors y which are linear combinations of these vectors, i.e.,

y = Mi + ^2*2 + ‘ (4.5-12)

is a vector space. The dimension of the space is the maximum number of


linearly independent vectors in the space.
If only r of the x, vectors in Eq. 4.5-12 are linearly independent, the
dimension of the space that can be generated by these vectors is equal to r,
the rank of the set of the vectors. For example, consider the vectors

T " 0 " T
Xi = 1 x2 = 1 x3 = 2
_0 _ _1 _ _1 _

These three vectors are linearly dependent, as x1 + x2 = x3. A vector y,


which is the linear combination of these three vectors,

y = ^ixi k2X2 T k3x3 = (k1 + At3)x1 + (k2 + A;3)x2

cannot have all three of its components specified independently. Only


two of the components of y can be specified independently. The third
component must be a function of the other two. The dimension of the
space generated by these particular xl5 x2, and x3 is equal to two.
In an ^-dimensional space, the n components of any vector y can be
specified independently if the vector y is generated by a set of vectors of
rank n. In that case, the set of n linearly independent vectors is said to
“span the space.” These n linearly independent vectors can also be used
to form a basis in the space. A basis of a space is a set of vectors such
that any vector in the space is a unique linear combination of the basis
vectors. A basis is, in essence, a coordinate system.
When the components of a vector y are given, it is necessary to indicate
the basis or coordinate system with respect to which the components are
specified. For example, the location of a point in a three-dimensional
space could be specified with respect to a rectangular coordinate system,
a cylindrical coordinate system, a spherical coordinate system, etc.
The statement that y has the components 1, 0, 2, i.e.,

is meaningless, unless the basis is also specified.


220 Matrices and Linear Spaces

Example 4.5-3. Show that the vectors

T T ~r
Xl = 1 , x2 = 2 , x3 = 3
_1_ _3_ _2_

specified in terms of the orthogonal basis (1,0,0), (0,1,0), (0,0,1) span a three-
dimensional space. Specify any vector y of this space in terms of the vectors xx, x2, x3,
and in terms of the basis vectors.
Since the Gramian of these three vectors is

3 6 6
G = 6 14 13 5*0
6 13 14

the three vectors are linearly independent, and therefore they span the three-dimensional
space. Thus any vector y in the three-dimensional space can be expressed as a linear
combination of these three vectors as y = A^Xj + k2x2 + k3x3.
If the vectors x1? x2, and x3 are used as a basis for the three-dimensional space, then,
relative to this basis, the vector y is

Fig. 4.5-1
Sec. 4.5 Vectors and Linear Vector Spaces 221

Relative to the particular orthogonal system of coordinates (1,0, 0), (0, 1,0), and
(0, 0, 1), the vector y is
(k i + A'2 + k3)
y = (k1 + 2k2 + 3A:3)

(ki + 3A:2 + 2k3)_

These vectors are illustrated in Fig. 4.5-1 for a specific y.

Gram-Schmidt Orthogonalization of a Vector Set

If a set of m linearly independent vectors is given, an orthogonal set


of m linearly independent vectors can be found, expressed in terms of the
original set of vectors. If the length of each vector in the orthogonal set
is unity, the set is said to be an orthonormal set.
Assume that the set xl9 x2, . . . , xm is given, and that an orthonormal
set yls y2, . . . , ym is desired. The procedure used to obtain this set is as
follows: Select any one of the x* vectors as the first vector in the yf set.
For example, select y1 = xx. Select another vector x2 from the original
set. Let y2 = x2 — kyx, where k is to be chosen such that y2 is orthogonal
to y1? i.e., such that (yls y2) = <y1? x2) — k{ylf yx) = 0. Therefore

k = (yi, x2)

<yi» yi)
Note that k — 0 if and only if (y1? x2) = 0, i.e., x2 is orthogonal to yl5
in which case take y2 = x2. Generally,

<yi» x2)
y2 = x2
<yi, yi) Yl
The geometrical idea behind the process is that any vector, in particular x2,
may be decomposed into two components, one of which is parallel to yx
and the other perpendicular to yx. To obtain the latter, which is called
y2 here, the component of x2 in the direction of yx is subtracted from x2.
In a similar fashion, y3 is written as y3 = x3 — k2y2 — ^yi, i.e., as the
component of x3 perpendicular to the plane defined by yx and y2. Ana¬
lytically, if the vector y3 is to be orthogonal to both y2 and yx, this leads to
the equations
(yi, x3) = k2(y,, y2> + ylt y,> = kx(yx, y,>

(yz* *3> = k2(y2, y2> + *i<y2, yi> = k2(y2, y2)


or
<y2, x3> (yi. x3>
y3 = x3 - y2 - --: yi
<y2, y2) <yi. yi>
222 Matrices and Linear Spaces

This procedure leads to they'th member of the set, yy, as yx = xx;

3—1
y<>-v
y> = xs - 2 y, O' = 2, 3m)
(=i <y<, y<>

The y/s now form an orthogonal set, and the unit vectors

y< (4.5-13)
ft =

form an orthonormal set.


Example 4.5-4. Using the set of vectors of Example 4.5-3, form an orthonormal
set Yj.
Choosing yi = xl5
rn rr
yi = l i
and y'-V3
[_i_ [lj
Then, since <ylf x2>/<y1, yx> = 2,

T ~r _-r "-r
l
y-2 = 2 - 2 i = 0 and y2 = —= 0
V2
_3_ _i_ i_ i_
Similarly,
T ~-r T r sin
y3 = 3 — isi 0 - 2 1 = 1
2 i 1 21
and
-r
A

y3 2
V6
-1
The orthonormal basis is then

T ~-r ~-r
1 A
i i
1 0 2
’ y!-V2
Yl_ V3
LiJ lj
’ y3 “ V6
L-iJ
Reciprocal Basis

The solution for the k/s of Eq. 4.5-12 can be simplified by defining a
set of vectors r,- such that

<r<, x3.) = du (i,j =1,2, . . . , m) (4.5-14)

Given a basis for a space, it is not difficult to show that there always exists
a set of vectors such that a set of relations of the form of Eq. 4.5-14 is
Sec. 4.6 Solutions of Linear Equations 223

satisfied. The vectors r1} r2, . . . , rm are linearly independent and thus
span the ra-dimensional space spanned by the basis vectors xt. Therefore
they constitute a basis for the space. Owing to the relationship (Eq.
4.5-14) between this basis and the basis formed by the xt- vectors, the basis
formed by the ri vectors is called a reciprocal basis.
The principal use of the reciprocal basis is in finding the constants ki
of Eq. 4.5-12. If the scalar products of both sides of Eq. 4.5-12 are taken
with rl9 the result is

<fi, y> = &i(ri, xx) + k2{r1; x2> + • • • + km(r1( xm>

However, by the definition of the reciprocal basis, the scalar products


(r1} x2), . . . , (rl9 xw) vanish so that k1 = (rl9 y), and in general

*,= <r»y> (4.5-15)

Equation 4.5-12 can then be written in the form


m

y = 2, (Ti- y>x; (4.5-16)


i=1
The scalar product of (ri9 y) is the component of the vector y in the direc¬
tion of the vector x*. This equation proves to be extremely useful when
the mode characterization of systems is discussed in the next chapter.

4.6 SOLUTIONS OF LINEAR EQUATIONS

Cramer’s Rule

Assume that the nonhomogeneous system of n equations

Vuxi + ^12^2 + * ‘ • + alnxn = yx


a2\X\ + 022X2 + * * ' + a2nxn ~ V2

^nlXl ^n2X2 "E H- @nnXn Vn

is given. The a{j and yi are a known set of numbers. The n unknowns xi
are to be determined. Note that the number of equations is equal to the
number of unknowns. By the use of determinants or matrices, a systematic
procedure exists by which the unknowns generally can be determined.
The set of equations, Eq. 4.6-1, can be written in the compact forms,
n
2 aijxj = yit /=1,2, ...,n or Ax = y (4.6-2)
3= 1
224 Matrices and Linear Spaces

where
*1 Vi
x2 V2
• •
X = and y —
• •

• •

_y n_

By multiplying both sides of Eq. 4.6-2 by the cofactor Cik and summing
over all values of i between 1 and n, it follows that
n n n

1 2, cikaa )xi = 2 CikVi’ k = 1,2, ... ,n (4.6-3)


3 = 1 U=1 1=1
or
(Adj A)Ax = (Adj A)y
However, the term in parentheses in Eq. 4.6-3 is equal to zero, except
when j — k (see Eq. 4.3-6). When j = k, this term is equal to the deter¬
minant of A. Therefore

lAl ** = 2 c»*&> k — 1, 2, . . . , w or |A| x = (Adj A)y (4.6-4)


i=l
If the determinant |A| is not identically zero, the n equations are linearly
independent, i.e., none of the equations can be written as a linear com¬
bination of the other equations. Then the unknown xk is uniquely given by
n

2 ^ikVi
i=l
Xk = k — 1, 2, . . . , n or x = — y = A xy (4.6-5)
|A| |A|
The numerator of the expression for xk is simply the determinant of A
with the /cth column replaced by the column formed by the right side of
Eq. 4.6-1. Thus Cramer’s rule for a solution by determinants can be
stated as follows:
For a set of n linear algebraic equations in n unknowns aq, x2, . . . , xn,
a solution for the unknowns exists if A is a nonsingular matrix. The value of
a given variable, xk, for example, is the quotient of two determinants. The
denominator of the quotient is the determinant of the coefficient matrix.
The numerator of the quotient is the determinant of the coefficient matrix
with the A:th column replaced by the column consisting of the right-hand
members of the set of n equations.
Example 4.6-1. Determine the solution for the two simultaneous equations
a?i 4- 2x2 = 0

xx — lx2 = 2
using Cramer’s rule.
Sec. 4.6 Solutions of Linear Equations 225

The coefficient matrix A is given by

"1
A =
1

The determinant of A is equal to —4. Replacing the first column of A by the right-hand
members of the simultaneous equations, the first unknown x1 is

0 2
2 -2
*i = = 1
-4

Replacing the second column of A by the right-hand members of the simultaneous


equations, the second unknown x.2 is
1 0
1 2
x2
-4
Since

and A-1 is easily determined to be

the x's are given by matrix methods as


1
_1

1
_1
o

" r
H*

_1

_x2_ _2_ i
L $J
1
i

This solution, which agrees with that determined by Cramer’s rule, can be verified by
inspection.

Homogeneous System of Linear Equations

If the right-hand members of Eq. 4.6-1 are zero, the set of equations is
said to be homogeneous. In this case, the numerator of Eq. 4.6-5 vanishes.
Consequently, if the determinant |A| does not vanish, the set of equations
has only the trivial solution x = 0, i.e., aq = 0, x2 = 0, . . . , xn = 0. If
the determinant |A| does vanish, two or more rows or columns of A are
linearly related. Then a ^-parameter family of solutions can be obtained,
where q is the degeneracy of A.
Assume that the rank of the coefficient matrix is r. The values of r
variables can then be expressed in terms of the other q = n — r variables
by the following procedure:

1. Omit q = n — r equations, such that the coefficient matrix of the r


unknowns has a nonvanishing determinant.
226 Matrices and Linear Spaces

2. Form r equations with the r unknowns on the left side of the equation,
and the remaining q — n — r unknowns on the right side.
3. Solve for the r unknowns in terms of the q = n — r unknowns by the
use of Cramer’s rule.

These steps yield q independent solutions, as can be seen by writing


Eq. 4.6-2 for the homogeneous case as

Ax = [ax a2 • • • a Jx = + x2a2 + • • * + xnan = 0

where a1? a2, . . . , an denote the n columns of A. If A is of degeneracy q,


then q linear dependency relations of the form

^lal + ^2a2 + • * * + ^Aan = 0


can be determined. Hence Ax = 0 has q independent solutions.
In order to view the situation geometrically, let b2, b2, . . . , bn denote
vectors which are the transposed rows of A. Thus A = [bx b2 • • * bn]r,
and the homogeneous equations can be written as [bx b2 • • • bn]rx = 0.
This is the same as
<bi, x) = 0
<b2, x) = 0
(4.6-6)

<bn, x) = 0
If x is to satisfy Eq. 4.6-6, and hence Ax = 0, it must be simultaneously
orthogonal to all the vectors b1? b2, . . . , bn. But if the rank of A is n,
then the b/s are linearly independent and utilize all the available n dimen¬
sions. No vector can be found which is orthogonal to all the b’s. However,
if A is of rank n — 1, there is one linear dependency relationship between
the b’s. Thus one of the n dimensions of the linear vector space is not
occupied and is available for x. Similarly, if the rank of A is n — 2, two
dimensions are available. Hence there are two linearly independent vectors
which satisfy Eq. 4.6-6. In general, if the degeneracy of A is q, so that the
rank of A is n — q, then q dimensions of the linear vector space are avail¬
able for x. The number of linearly independent solutions of the homo¬
geneous system of linear equations is equal to q.
The preceding discussion was directed toward the case in which the
number of equations and number of unknowns are equal. If the number
of unknowns, n, exceeds the number of equations, m, the rank of A is less
than n. Nontrivial solutions always exist for this case. If m > n, non¬
trivial solutions exist only if the rank of A is less than n. In both cases,
the number of linearly independent solutions is n — r, where r, assumed
less than n, is the rank of A. These solutions can be found by the three
steps indicated above.
Sec. 4.6 Solutions of Linear Equations 227

The solution vectors x generally are not orthogonal to one another.


However, the Gram-Schmidt orthogonalization procedure can be used to
orthogonalize the solution vectors.
Example 4.6-2. Solve
4xx + 2x2 + x3 + 3x4 = 0

6xx + 3x2 + x3 + 4x4 = 0


2xx + x2 + x4 = 0

The rank of the coefficient matrix is two, i.e., the highest order array having a non¬
vanishing determinant is obtained by omitting the first and third columns and the
second row. Consequently, a set of linearly independent equations is
2x2 + 3x4 = —4xx — x3

x2 + x4 — —2xx

Using Cramer’s rule, x2 = x3 — 2xx and x4 = —x3.


Since xx and x3 are arbitrary, infinitely many sets of solutions exist. However, only
q = 2 of these are independent. For example, let xx = x3 = 1. Then x2 = x4 = —1.
Thus a solution vector is
1
-1
Xi =
1
-1

where xx denotes the first solution vector. Similarly, xx = 1, x3 = 2 leads to a second


solution vector

Since the Gramian


4 5
G = 7^ 0
5 9
xx and x2 are linearly independent. A further choice for xx and x3 leads to a vector x
which is not linearly independent of xx and x2. For example, xx = 2, x3 = 3 yields a
vector which is the sum of xx and x2.
Since <x1} x2> = 5 0, xx and x2 are not mutually orthogonal. Using the Gram-
Schmidt orthogonalization procedure,

-1 5
and x2' = i
1 3

-1 —3

are determined. xx' and x2' are mutually orthogonal.


228 Matrices and Linear Spaces

Normalization of x/by Hx/H = 2 and of x2'by ||x2'|| = Vl 1/2 yields the orthonormal
solutions
f ’-l"

-1 1 5
and x./ — .—
1 2vTl 3

-1 -3

The fact that xl5 x2, x/, x2', x/ and x2' all are solutions to the original equations can be
determined by substitution.

A particularly useful special case of homogeneous systems is one in


which the rank of the coefficient matrix is equal to n — 1. Under this
condition one independent solution can be"found, since q — 1. It can
be shown that the unknowns are proportional to the cofactors of their
coefficients in any non zero row of the coefficient matrix.f Thus a solu¬
tion is given by
Xj = kCii (j = 1, 2, . . . , n) (4.6-7)

where k is an arbitrary scalar and i takes any of the values 1, 2, . . . , n.


Example 4.6-3. Solve
xx — 2x2 + x3 — xi = 0
xx -f- x2 — 3x3 = 0
2xx 4- x2 + x4 = 0
x2 + x3 + x4 = 0

The rank of the coefficient matrix is equal to 3. Therefore the unknowns are pro¬
portional to the cofactors of any row of the coefficient matrix. Calculating C4j,

-2 1 -1

Cn — 1 -3 0 = -2
1 0 1

1 1 -1
C 42 1 -3 0 = -10
2 0 1
1 -2 -1

C43 — 1 1 0 -4
2 1 1
1 -2 1
C 44 — 1 1 -3 = 14
2 1 0
n
t Compare Eq. 4.3-6, ^ akjCij = |A|<5,-*., and Eq. 4.6-2, aijxj = yiy under the con-
j=i i=1
ditions that |A| = y,- = 0, / = 1, 2,. . ., n.
Sec. 4.6 Solutions of Linear Equations 229

The solution to this set of equations is

x4 = —k, x2 = —5k, x3 — —2k, x4 = Ik

These equations comprise one independent solution and all scalar multiples of it.

Dependent Nonhomogeneous Case

The general procedure for the homogeneous case can be utilized for the
nonhomogeneous case in which the number of equations is less than the
number of unknowns. Such a case is

*11*1 + *12*2 + • • • + alnxn = yx


*21*1 + *22*2 H-+ *2n*n = V*
(4.6-8)

*ml*l *ra2*2 "E T *mn*n Vm

where m < n. These equations can be written in the homogeneous form

*11*1 +* 12*2 H-+ *1 nXn + VlXn+l = 0

*21*1 + *22*2 H-+ *2n*n + V&n+l = 0


(4.6-9)

*ml*l + *m2*2 + * * * + *mn*n + 2/m*n+1 = 0

Equation 4.6-9 may be solved using the procedure outlined for homo¬
geneous systems. If xn+1 is then set equal to —1, the solution to Eq.
4.6-8 can be obtained. If it turns out that Eq. 4.6-9 has a solution only
when xn+1 — 0, then Eq. 4.6-8 has no solution. Consequently, Eq. 4.6-8
has a solution only when the coefficient matrix of Eq. 4.6-8 and the
coefficient matrix of Eq. 4.6-9, called the augmented matrix, have equal
rank.

Example 4.6-4. Solve


x1 + 2x2 — 3x3 — 4x4 = 6

x1 + 3x2 + x3 — 2x4 = 4
2xx + 5^2 — 2x3 — 5x4 = 10

The coefficient matrix and the augmented matrix are of rank 3; therefore the set of
equations possesses a solution. The square array formed by eliminating the third
column has a nonvanishing determinant. The set of equations can be written as

x4 -f 2x2 — 4x4 = 6 + 3x3

x4 4- 3x2 — 2x4 = 4 — x3
2xx -f 5x2 — 5x4 = 10 4- 2x3
230 Matrices and Linear Spaces

The determinant of the new coefficient matrix is equal to unity. Therefore


(6 + 3x3) 2 -4
x1 = (4 - x3) 3 -2 — 10 T 1 1
(10 + 2x3) 5 -5
1 (6 + 3x3) —4
x2 = 1 (4 — x3) —2 = —2 — 4x3

2 (10 + 2x3) -5
and
12 (6 + 3x3)
x4 = 1 3 (4-x3) = 0
2 5 (10 + 2x3)
Since x3 is arbitrary, the complete solution is then
xx = 10 -f- 11c
x2 = —2 — 4c

x3 = c
xA 0
where c is an arbitrary constant.

Example 4.6-5. Solve the equations


Qnxi T aX2x2 — yx
a2Xxx 4r a22x2 = y2
for the following cases:
(a) nonhomogeneous, i.e., y # 0;
(b) homogeneous, i.e., y =
; 0

(c) dependent nonhomogeneous, i.e., a21 — a22 = y2 = 0 .


Assuming that a12 ^ 0 ^ a22, the equations can be written in the form

an Vi
x9 -x, H-
a12 aX2

a21 V'2
X2 -xx H-
a22 a22

These equations can be represented in an a^avplane as in Fig. 4.6-1. As indicated in


the following, such a representation aids in clarifying the points of this section.
(a) Nonhomogeneous. Two possibilities exist for this case. If |A| = 0, then axxa22 —
^12^21 ~~ OI"
axx a2i
a 12 a 22

Thus the slopes of the two lines of Fig. 4.6-1 are equal, i.e., the lines are parallel. Conse¬
quently the lines do not intersect, and no solution vector exists.
If | A| 7^ 0, the lines intersect at one and only'one point, given by the solution

Vla22 V 2^12 V2a\\ y ia2\


= X2 =
I A|
These are the components of the solution vector.
Sec. 4.6 Solutions of Linear Equations 231

x2

(b) Homogeneous. Again two possibilities exist. Since yx = y2 = 0, both lines pass
through the origin. If the slopes are unequal, i.e., if |A| ^ 0, then there is one solution.
It is the trivial one, xx = x2 = 0.
If |A| =0 and q = 1, then both lines have the same slope. Thus the lines coalesce,
and any point on the lines is a solution. Since the equation of the coalesced lines is
a 11x1 + a12x2 = 0 = a2ixx + o22x2, the one independent solution is given by

x2 aii a2i
xx aX2 @22

In terms of the preceding discussion, n = 2, but r = 1. Hence omit one of the original
equations (either one), and the solution is

a 9i
Xo —-x1
a22
which are equal.
Note that
a ii @21
bx and b2
@12 @22

They are linearly related and hence parallel in the plane. The solution x is orthogonal
to both bi and b2.
232 Matrices and Linear Spaces

(c) Dependent nonhomogeneous. The equation is axxxx + aX2x2 = yx. This can be
written in homogeneous form as axxxx + aX2x2 + x3yx = 0. In order to solve as a
homogeneous equation, the form a12x2 = — axxxx — x3yx is written. The solution is

aw V\
x2 —-xx — x3 —
a 12 tx12

This is the equation of a plane in three dimensions, and any point on the plane is a
solution. In particular, along the line in the plane for which x3 = —1, the solution is

an . Vi
x2 --xx H-
a i2 &12

Any point on this line is a solution. For this simple case, of course, this equation is
simply the original equation in rewritten form.

4.7 CHARACTERISTIC VALUES AND


CHARACTERISTIC VECTORS f

The topic of characteristic values and characteristic vectors is an ex¬


tremely important one, as the dynamic behavior of a linear system is
dependent upon the characteristic values of the system. It is important
that the reader understand this topic, as much of system analysis and
synthesis depends upon the solution of characteristic value problems.
Consider the vector-matrix equation

y = Ax (4.7-1)

where y and x are column vectors, and A is a square (n x n) matrix. This


equation can be viewed as a transformation of the vector x into the
vector y. The question arises whether there exists a vector x, such that the
transformation A produces a vector y, which has the same direction in
vector space as the vector x. If such a vector x exists, then y is proportional
to x, or
y = Ax = Ax (4.7-2)

where X is a scalar of proportionality. This is known as a characteristic


value problem, and a value of A, e.g., X.t, for which Eq. 4.7-2 has a solution
X; 0, is called a characteristic value of A. The corresponding vector
solution x^ 0 is called a characteristic vector of A associated with the
characteristic value Xt.

f The terms eigenvalues and eigenvectors are frequently used in place of characteristic
values and characteristic vectors, respectively. The terms latent roots and latent vectors
are also used.
Sec. 4.7 Characteristic Values and Characteristic Vectors 233

Equation 4.7-2 can be written in the form of a homogeneous set of linear


equations as
On - A)x1 + a12x2 + • • • + alnxn = 0
021*1 + O22 - A)x2 + • • • + a2nxn = 0

0*1*1 + o„2*2 + ' • • + (ann - A)xn = 0


or
[21 - A]x = 0 (4.7-3)

This system of homogeneous equations has a nontrivial solution if, and


only if, the determinant of the coefficients vanishes, i.e., if

121 - A| = 0 (4.7-4)

The Characteristic Equation

The nth. order polynomial in 2, given by Eq. 4.7-4, is called the character¬
istic equation corresponding to the matrix A. The general form for the
equation is

P(2) = 2n + a1Xn~1 -{- a2Xn~2 + • * • + Qn-i^ T an — 0 (4.7-5)

The roots of the characteristic equation are precisely the characteristic


values of A. When all the characteristic values of A are different, A is said
to have distinct roots. When a characteristic value occurs m times, the
characteristic value is said to be a repeated root of order m. When a
characteristic root is of the form a + jft, the root is said to be complex.
Complex roots must occur in conjugate pairs, assuming that the elements
of A are real.
The coefficients ax and an of the characteristic equation, Eq. 4.7-5,
are of special interest. If 2 is set equal to zero, then

P(0) = |-A| =aB (4.7-6)


Therefore
a„ = (-l)"|A| (4.7-7)

If the polynomial P(X) is written in the factored form (assuming the


characteristic values are distinct)

m =a- — 4> - • a — aj
and 2 is again set equal to zero, it follows that

p(0) = (-i)" iai = (—lnu-'-y


or
AiA2A3 ■ • • An = |A| (4.7-8)
234 Matrices and Linear Spaces

The product of the characteristic values is equal to the determinant of A.


Note that, if any of the characteristic values is equal to zero, A is singular.
By expanding the factored form of the characteristic equation, the
coefficients of the various powers of A can be obtained in terms of the
characteristic values. For example, the coefficient of An_1 is

#1 = (^1 + ^2 + * * * + An)

If the determinant |AI — A| is also expanded, it is found that the coefficient


of An_1 is equal to the negative sum of the diagonal elements of A, i.e.,

ax = —(Ax + A2 -F * • * + An) = —(tfii + #22 + * ’ * + ann) (4.7-9)

Thus the sum of the diagonal elements of a square matrix is equal to the
sum of its characteristic values. Because of its importance, the sum of the
diagonal elements of a matrix is given a title, namely, trace of the matrix.
Hence the above can be written as

h + A2 + • • • + Xn = trace of A = Tr [atj] = #n + #22 + • • • + ann


(4.7-10)

If the trace of Ak (A multiplied by itself k times) is given the symbol


then a useful recursive formula for expressing the coefficients of the
characteristic equation in terms of the various Tk can be written as

#i = T1
#2 == \{a1T1 + T2)
as = ~i(^2^i + a\T2 + T3) ^ 7-11)

an ~-(fln-l^l + an- 2^2 + * * * + #lFn_i + Tn)


n
This formula provides an alternative means to Eq. 4.7-4 for determining
the characteristic equation.! In particular, it is more helpful if a computer
is to be used to determine the coefficients in the characteristic equation.
Example 4.7-1. Find the characteristic equation of A, where

Using Bocher’s formula,

ax = -Tx = -(2 + 1 - 1) = -2

f This formula is called Bocher’s formula. See Reference 4.


S<?c. 4.7 Characteristic Values and Characteristic Vectors 235

The matrix product of A with itself yields

'5 3 r
A2 = 4 2 3
4-2 7

Thus a2 = —?(a1T1 + T2) = —5. Similarly,

T4 -4 17
A3 = 13 3 11
13 11 3

so that a3 = — \{a2Tx + axT2 + T3) = 6. Thus the characteristic polynomial is

23 - 2A2 - 5A + 6 = (a - 1)(2 + 2)(A - 3) = 0

The characteristic values are 2X = 1, 22 = —2, and 73 = 3.

Modal Matrix

For each of the n characteristic values 2,(z =1,2,..., n) of A, a


solution of Eq. 4.7-3 for x can be obtained, provided that the roots of
Eq. 4.7-4 are distinct. The vectors x„ which are the solutions of

[2,1 - A]x, = 0 (j = 1, 2, . . . , n) (4.7-12)

are the characteristic vectors of A. Since Eq. 4.7-12 is homogeneous,


/c,x„ where ki is any scalar, is also a solution. Thus only the directions of
each of the x, are uniquely determined by Eq. 4.7-12. The matrix formed
by the column vectors /:,x, is called the modal matrix.f
For this case of distinct characteristic values, columns of the modal
matrix can be taken to be equal, or proportional, to any nonzero column
of Adj [2,1 — A]. This is based upon the fact that [2,1 — A] is of rank
n — 1. The rank must be less than n, because of Eq. 4.7-4. However,
it cannot be less than n — 1, because all the (n — 1) rowed minors of
12,1 — A| would also be zero. This would in turn require (see Eq. 4.3-8)

-jt f|AI — A|
d/1 '

indicating that 2, is a repeated root of Eq. 4.7-4. But this has been ruled
out by the assumption that the characteristic values are distinct. Thus
[2,1 — A] is of rank n — 1, and application of Eqs. 4.6-7 and 4.4-1 then
shows that the columns of the modal matrix are proportional to any
t The term “mode” is used because, as later chapters show, the modes of dynamic
behavior of a linear system can be expressed in terms of motion along the characteristic
vectors.
236 Matrices and Linear Spaces

nonzero column of Adj [XJl — A]. Since the columns of Adj [At-I — A]
are linearly related for a given X^ each choice of Xt specifies only one
column of the modal matrix.
Example 4.7-2. Find the characteristic values and a modal matrix corresponding to
the matrix A, where
~2 -2 3“
A = 1 1 1
1 3 -1

The characteristic equation is, from Eq. 4.7-4,

A — 2 2 -3
-1 A — 1 -1 = 0 = A3 — 2A2 - 5A f 6 = (A - 1)(A + 2)(A - 3)
—1 —3 A T 1

The characteristic values are Ax = 1, A2 = —2, and A3 = 3.


The adjoint matrix Adj [Al — A] is

"(A2 - 4) (-2A + 7) (3A - 5)


(A + 2) (A2 - A - 5) (A + 1)
_(A + 2) (3A - 8) (A2 - 3A -f 4)

For Xx — 1, the adjoint matrix is


"-3 5 -2"
3-5 2
3 -5 2_

For A = —2, the adjoint matrix 5

"o ii -ir
0 1 -1
0 -14 14
For A = 3, the adjoint matrix is
'5 1 4“
5 1 4
5 1 4

Since the characteristic vectors are uniquely determined only in direction, these vectors
can be multiplied by any scalar and still satisfy Eq. 4.7-12. Consequently, a modal
matrix is
"-i ii r
M = l l l
1 -14 1

Each column of the modal matrix is a characteristic vector which spans a one¬
dimensional vector space. The three columns of the modal matrix form a basis in the
corresponding three-dimensional space.
Sec. 4.7 Characteristic Values and Characteristic Vectors 237

The preceding discussion considered the modal matrix when the char¬
acteristic values of A are distinct. For the case in which there is a repeated
characteristic value and A is nonsymmetric, the determination of the
number of independent modal columns is not quite as clear. The reason
for the ambiguity is that there is no unique correspondence between the
order of a repeated root of the characteristic equation and the degeneracy
of the corresponding characteristic matrix [XiI — A].
If Xi is a repeated root of order p, the degeneracy of the characteristic
matrix cannot be greater than p, and the dimension of the associated
vector space spanned by the corresponding x2 is not greater than p. The
problem arises when there is a repeated root of order p, and the de¬
generacy q of [A*I — A] is less than p. Only q < p linearly independent
solutions to Eq. 4.7-12 can be found. The dimension of the associated
vector space for the xt is less than p, and p linearly independent char¬
acteristic vectors corresponding to A,- cannot be obtained. Only when the
(n x n) matrix A is symmetric is the degeneracy of [AJ — A] definitely
equal to p for a p-fold root, so that p linearly independent characteristic
vectors can be found.
For the case in which the degeneracy of [AZI — A] is equal to one
(simple degeneracy), the modal column can be chosen to be proportional
to any nonzero column of Adj [AtI — A]. This is the only column that
can be obtained for the set of p equal roots. For the case where the
degeneracy of [2*1 — A] is equal to q > 1, Adj [AtI — A] and all its
derivatives up to and including

3C2 {Adj [XI - A}|w,


are null matrices.! The q linearly distinct solutions for the modal columns
can be obtained from the columns of differentiated adjoint matrices
which are non-null. For example, if q = p (so-called “full degeneracy”),
then the p linearly distinct modal columns can be obtained from the
nonzero columns of

{Adj m - A]}L=i(
Cl A

This matrix (called a derived or differentiated adjoint), being of rank /?,


can be placed into the form of the matrix product CB, where the p
columns of the rectangular matrix C and the p rows of the rectangular
matrix B are linearly independent. The p columns of C can be selected
for the required modal columns.6

t The proof of this can be found in Reference 5.


238 Matrices and Linear Spaces

Example 4.7-3. Find the characteristic vectors of the given A matrices.


Repeated characteristic roots and simple degeneracy.

1 0“
0 1
-3 3

The roots of the characteristic equation |2I — A| =0 are equal to 1, 1, 1, a triple


unity root. However, the degeneracy of the characteristic matrix

~ 1 -1 O'

[AI — A] |A=1 = 0 1 -1
-1 3 -2

is equal to one. Therefore there is only one modal column corresponding to the triple
root. Since
~1 -2 r
Adj [2,1 - A] |Ai=1 1 -2 1
1 -2 1

the only linearly independent characteristic vector is

It can span only a one-dimensional space. The determination of two additional vectors
for this case is considered in the next section.
Repeated characteristic roots and multiple degeneracy.

"2 1 r "2-2 -1 -1 "


>2

>

A = 1 2 l -1 2-2 -1
II
1

_1

o
o

_0 0 i_
1
i

The characteristic values of this matrix are 1, 1, 3, a double root at 2 = 1 and a single
root at 2 = 3. The degeneracy of [2,1 — A] for 2t- = 1 is equal to two. Since the charac¬
teristic matrix has “full degeneracy,” it is possible to obtain a linearly independent
vector solution for each of the repeated roots.
The adjoint matrix is
-(2 - 2)(2 - 1) (2 - 1) (2 - 1)
Adj [21 — A] — (2-1) (2 - 2)(2 - 1) (2 - 1)
0 0 a-m - 3)

Substitution of 2, = 1 in any column of the adjoint matrix yields a null column. The
first derivative of the adjoint matrix is
"22-3 1 1 "
d
{Adj [21 - A]} 1 22-3 1
1).
0 0 22 -4
Sec. 4.7 Characteristic Values and Characteristic Vectors 239

Evaluating this matrix at A = 1 yields

■-1 1 1 " "-I 1 "

"1 -1 0 “

1 -1 1 = 1 1
_0 0 1_

0 0 — 2_ 0 — 2_

Hence the two modal columns corresponding to the repeated root at A = 1 are given by

~-r " r
Xl = i Xo = l
o_ 2_

The characteristic vector or modal column for A,- = 3 can be chosen to be proportional
to any nonzero column of the matrix Adj [A,I — A], Thus

The complete modal matrix is then


r
M = i
o
Symmetric Matrices

The case where the matrix A is real and symmetric occurs so frequently
when dealing with linear circuits that special attention should be paid to
the form of the characteristic values and characteristic vectors associated
with such a matrix.
A fundamental property of real symmetric matrices is that the char¬
acteristic values of a real symmetric matrix must be real. This can be
shown by assuming that the characteristic values are complex. They must
then occur in complex conjugate pairs, as must the characteristic vectors.
Thus Ax = Ax and A*x* = Ax*. Premultiplying the first equation by
x*r and postmultiplying the transposed form of the second equation
by x gives A(x, x) = (x, Ax) and A*(x, x)= (x, ATx>. Noting that,
for a symmetric matrix A = AT, these expressions require

(A — A*) (x, x) = 0.

Since (x, x) # 0, A = A* and hence the characteristic values are real.


A second property of real symmetric matrices is that the characteristic
vectors form an orthogonal set. Let Aj and A2 be two distinct character¬
istic values with associated characteristic vectors xx and x2. Then
AjXj = Ax1 and A2x2 = Ax2. If the first of these relationships is transposed
and postmultiplied by x2, then
240 Matrices and Linear Spaces

Premultiplying the second relationship by xxT yields

A2X1TX2 = XxTAx2

If Eq. 4.7-13 is subtracted from this expression, then

(h ~ Ai)(Xi, x2) = 0 (4.7-14)

Since it was assumed that Ax A2, it follows that

<xl9 x2) = 0 (4.7-15)

Equation 4.7-15 shows that the characteristic vectors of a symmetric


matrix form an orthogonal set. If the matrix has n distinct characteristic
values, these vectors form an orthogonal "basis in the ^-dimensional
space. Furthermore, it can be shown that, if a symmetric matrix has a
characteristic value Xi of order p, then associated with this characteristic
value are p linearly independent characteristic vectors. As was shown
earlier, this is not generally true for a nonsymmetric matrix.
In a similar manner it can be shown that the characteristic values of a
Hermitian matrix are real, and that the characteristic vectors are orthog¬
onal, such that
<xl5 X2) = X!*rX2 = 0

Example 4.7-4. Show that the characteristic vectors of the symmetric matrix A are
orthogonal, where
"-3 2~
A =
2 0

The characteristic equation is A2 + 3A — 4 = 0, and the characteristic values are


Ax = 1 and A2 = —4. The adjoint matrix is
"A 2
Adj [AI - A]
2 A + 3
A suitable set of characteristic vectors is

and x2

Clearly, <x1? x2> = 0

Diagonalizing a Square Matrix

Consider the case in which the modal matrix M is nonsingular, such


that its inverse M_1 exists.! Under this restriction, the solution to Eq.

t This is always the case if the characteristic values of A are distinct, or in the case of a
symmetric A matrix. See Reference 7.
Sec. 4.7 Characteristic Values and Characteristic Vectors 241

4.7-12 can be combined to form the single equation

^ixn ^2X12 ^nXln

^1X21 ^•2X22 ^nX2n

}lxn\ KXn2 * ‘ ' ^nXnn

. . . • • •
S3

a in X11
rH
rH
al2 X12 Xln
• • • • • •
— *21 a22 a2 n X21 x22 X2n
(4.7-16)

• • • • • •
anl 0' n2 @nn Xnl Xn 2 Xnn_

or
MA = AM
where

is a diagonal matrix composed of the characteristic values 21? A2, . . . , Xn.


Since M_1 exists, the diagonal matrix A can be found by premultiplying
both sides of Eq. 4.7-16 by M_1, yielding

A = M_1AM (4.7-17)

Higher powers of A can be diagonalized in the same manner. For


example,
A2 = (M-1AM)(M-1AM) = MXA2M (4.7-18)

A transformation of the type B = Q-1AQ, where A and B are square


matrices and Q is a nonsingular square matrix, is called a collineatory
or similarity transformation. More is said about this type of trans¬
formation in the next section, when transformations are discussed.
The significance of this transformation is evident when the set of linear
equations y = Ax of Eq. 4.6-2 is examined. If the transformation
x = Mq is made, then Eq. 4.6-2 can be written as

y = AMq (4.7-19)

Premultiplying both sides of Eq. 4.7-19 by M_1 gives

M-1y = M-1AMq = Aq (4.7-20)


242 Matrices and Linear Spaces

Letting z = M-1y yields


z = Aq (4.7-21)
or
zi = K<h
^2 == ^-2^2

In terms of the new coordinate system ql9 q2, q2, ... , qn, the set of equations
described by Eq. 4.7-21 is uncoupled. Note that the qi coordinates are
in the direction of the characteristic vectors. These coordinates are called
the normal coordinates of the system. By transforming to the normal
coordinates, the characteristic values, and hence the modes of the system,
are isolated. It is for this reason that M is called the modal matrix.
Recognition of the characteristic vectors as uncoupling the coordinates
forms the core of the mode interpretation of linear systems.
The columns of the modal matrix M form a basis, and the rows of
M_1 form a reciprocal basis, in the original space. If the columns of the
modal matrix are called ul5 u2, . . . , un, and the reciprocal basis is denoted
by rl9 r2, ..., rn, then any vector y can be expressed as

y = <ri, y)«i + <r2, y)u2 4-+ <r„, y)u„ (4.7-22)

To illustrate the equivalence of this form and the use of the normal
coordinates, consider

y = MM-1y = [ux u2 • • • uJM^y

Z2

= [«1 u2 • •• uj = ^lUi + 22u2 + • • • + znun

If this expression is compared with Eq. 4.7-22, it i$ evident that z{ =


(rz, y) and therefore the rows of M_1 form a reciprocal basis.
This point could have been made by recognizing the fact that, if the
modal columns uu u2, . . . , un constitute a proper basis, then, in view of
the definition of the reciprocal basis (rt-, u,-) = dij9 the rows of M_1
must constitute the reciprocal basis, i.e., MM-1 = I. However, the
details of the preceding observation are instructive, as the reader will
Sec. 4.7 Characteristic Values and Characteristic Vectors 243

find both forms in the literature. Some authors prefer expressing a


vector in terms of scalar products, while others use the normal coordinate
form. The reader should be aware of the fact that both forms produce
identical results. The scalar product form is a little more geometric in
interpretation, possibly giving the reader an intuitive or physical feeling
for the results. The normal coordinate form is written slightly more
compactly and is generally used for proofs and derivations. These points
will be made evident when the dynamic behavior of linear systems is
discussed in Chapter 5.

Example 4.7-5. For each of the cases below, show that the product M_1AM is a
diagonal matrix with its elements equal to the characteristic values.
Distinct roots (see Example 4.7-2).

"2 -2 3“ “-1 11 r “-15 25 -10“

M
1 1 1 M = 1 1 l 0 2 -2

S!

II
1 3 -1 1 -14 i_ 15 3 12_

Consequently,
"1 0 0 "

M AM 0-2 0
0 0 3 /

Repeated roots and multiple degeneracy (see Example 4.7-3).

"2 1 r "-1 1 r ~-l 1 0“


A = 1 2 i M = 1 1 l M-1 = i 0 0 -1
_0 0 i_ 0 -2 0 1 1 1_

“1 0 0 ~

M_1AM = 0 1 0
0 0 3

This procedure yields a diagonal form when the degeneracy of the characteristic matrix
[A J — A] is equal to the order of the root. If it is not, a similar procedure yields a more
general Jordan form discussed in the next section.

Example 4.7-6. For the linear transformation given, show that the columns of M form
a basis and the rows of M-1 form a reciprocal basis. Express y in terms of the charac¬
teristic vectors.
f 0 11

The characteristic equation |AI — A| = 0 is then

A2 + 3A + 2 = 0
244 Matrices and Linear Spaces

The roots of this equation are 2X = — 1 and 22 = —2. The adjoint matrix is

2 + 3 1
Adj [21 - A]
-2 2
The characteristic vectors are
" r ■ r
Xi = and x2 =
-1 -2

A normalized set of characteristic vectors are

1/V2' 1/V5
Ui and u2 =
-1/V2 -2/V5
The modal matrix is
1/V2 l/v%'
M =
-l/v7! -2/V5
The inverse modal matrix is
2^2 V2
M1
— V5 -V5

Note that the rows of M_1 form a reciprocal basis

2V2" -V5
_1

and r2 =
1 <N

L. _
<
>

Ul
1
1

such that
<rl5 Uj> = 1 <r2, uj> = 0

(ru u2> = 0 <r2, u2> = 1

The vector y, expressed in terms of the vector x, is

x2

If the characteristic vectors u* are used as a basis, then the vector y can be expressed as

y = <1+ y)ux + <r2, y)u2 = MM_1y = [ux u2]M-xy

= (2V2 yx + V2 y2)ux + ( — V5 yx — V5 y2)u2

= (—2V2 xx — V2 ^2)ui + (2 V75 xx + 2V5 £2)u2

Note that substitution for ux and u2 yields

yx = (-2V2 x, - V2 z2)(1/V2) + (2V5 xx + 2V5 x2X\/V5) = x2

y2 = (-2A/2 xx - V2 x2)( — I/V2) + (2V5 + 2V5 x2X-2/V5) = -2xx - 3x2


Sec. 4.8 Transformations on Matrices 245

4.8 TRANSFORMATIONS ON MATRICES

Before discussing the subject of transformed matrices, certain ele¬


mentary operations on a matrix are considered. These operations are:

1. Interchange of any two rows (columns).


2. Addition to a row (column) of a multiple of another row (column).
3. Multiplication of a row (column) by a nonzero constant.

Performing any one of these elementary operations on a square matrix


is equivalent to premultiplication or postmultiplication of the matrix
by a nonsingular matrix, such that the rank of the transformed matrix is
equal to the rank of the original matrix.

Operation 1. This operation is equivalent to renumbering the rows


(columns), and therefore cannot change the rank of the matrix. Let Qi
denote the (n x n) unit matrix with the zth and y'th rows interchanged, and
let A be any (n x n) matrix. Clearly, premultiplication of A by Qx produces
a matrix with the zth and y'th rows interchanged. Postmultiplication of
A by Qi produces a matrix with the zth and y'th columns interchanged.
For example,
"1 0 O ' *11 *12 *13 11 *12 *13

0 0 1 *21 *22 *23 31 *32 *33

_0 1 0 _ _*31 *32 * _ 33 21 *22 *23

"*11 *12 *13 "1 0 ' O 11 *13 *12

*21 *22 *23 0 0 1 21 *23 *22

_*31 *32 * _
33 _0 1 0_ 31 *33 *32

Operation 2. If k times the y'th row is added to the zth row of A, this is
equivalent to premultiplication of A by Q2, where Q2 is the (n x n) unit
matrix with the element k in the zth row and y'th column (z y). In a
similar fashion, the addition of k times the zth column to the yth column
is equivalent to postmultiplication of A by Q2. For example,

"1 0 O ' "*11 *12 *13 *11 *12 *13

0 1 0 *21 *22 *23 = *21 *22 *23

0 k 1_ _*31 *32 * _
33 _*3i + kan *32 ~F ^*22 *33 + ka23_

"*11 *12 * _
13 '1 0 O' "*n *12 + ^*13 *13

*21 *22 *23 0 1 0 = *21 *22 “b kz723 *23

_*31 *32 * _
33 _0 k 1_ _*31 «32 + ktf33 * _
33
246 Matrices and Linear Spaces

Operation 3. If the zth row of A is multiplied by a constant k, this is


equivalent to premultiplying A by Q3, where Q3 is the (n x n) unit matrix
with k substituted for the zth element on the principal diagonal. Similarly,
multiplication of the zth column of A by a constant k is equivalent to
postmultiplication of A by Q3. For example,

'1 0 0" an 012 013 011 012 013

0 1 0 021 022 023 = 021 022 023

_i
k_ ka33_

a
_0 0 _031 032 033_ ka%2

CO
rH
0n 012 013 "1 0 0"
~0n 012 ka13
0 1 0 — ko 23
021 022 023 021 022

_031 032 033_ _0 0 k_ *.031 032 ^033_

The determinant of Q3 is k, so that, if the transformed matrix is to


maintain the same rank as A, the constant k must be nonzero.

Equivalent Matrices

Two matrices A and B are said to be equivalent if one matrix can be


obtained from the other matrix by a sequence of the elementary operations.
Since any sequence of these operations on the rows of A may be performed
by premultiplication of A by a corresponding sequence of matrices
P; (z = 1,2, 3,. . .) which are all nonsingular, such a sequence of operations
on the rows of A corresponds to premultiplication of A by a nonsingular
matrix P; and similarly any sequence of operations on the columns
of A is equivalent to postmultiplication of A by a nonsingular matrix
Q. Consequently, matrix B can be said to be equivalent to matrix A
if and only if two nonsingular matrices P and Q exist such that

B = PAQ (4.8-1)

Two matrices which are equivalent have the same rank. Conversely, it
may be shown that two (m x /?) matrices are equivalent if and only if they
have the same rank.

Normal Form

Any matrix A of rank r > 0 can be reduced to an equivalent matrix


of the form
Sec. 4.8 Transformations on Matrices 247

where Ir is the (r x r) unit matrix. These forms are called normal forms,
or canonical forms. Note that, if the nonsingular matrix A can be reduced
to a unit matrix by a sequence of operations on the rows of A, then
P = A-1, Q = I. This is another method of finding A-1. In general,
if A is reduced to a unit matrix by a sequence of elementary operations,
then
A = P !PAQQ 1 = P-XIQ 1 = P !Q 1 (4.8-2)

Thus every nonsingular matrix can be expressed as a product of elementary


matrices.

Example 4.8-1. Reduce the following matrices to the normal form.


Nonsingular matrix.

A =

Step 1. Interchange rows 2 and 1.


1_

1
1
r-H
O

~—2 —3"
_i
O

2 —3_
L
1

Step 2. Add 3 times the second row to the first row.

"1 3" ~-2 -3' ~-2 O'


i_

_i
r-H
O

o
_1

i
i

Step 3. Multiply the first row by


1_

1__1
i

i
o

<N o
o

1 0“
1
1

_i
r-H

r-H
O
H-k

o
L
1

This reduction was performed using only row operations, in order to illustrate how
the inverse of a matrix may be obtained. The sequence of elementary operations was

O' 1 3' 0 r
0 1 0 1 1 0
The product PA is
in
i_

i
o

“_3
2 ~1 O'
_i

—i
o

1 2 —3_
i

Therefore P = A-1.
Singular matrix.
2 —3'
2 -1
-3 4

Step 1. Add the first row to the third row.

"1 0 0~ " 1 2 -3" ~ 1 2 — 3~


0 1 0 -1 2 -1 = -1 2 — 1
1 0 1 -1 -3 4_ 0 -1 1_
248 Matrices and Linear Spaces

Step 2. Add the first row to the second row.

“1 0 O' ' 1 2 -3' "1 2 -3"


1 1 0 -1 2 -1 = 0 4 -4
0 0 1_ 0 -1 1_ _0 -1 1_

Step 3. Add the second column to the third column.

‘1 2 -3' "1 0 O' "1 2 -1"


0 4 -4 0 1 1 =r 0 4 0
_0 -1 1 _0 0 1_ _0 -1 0_

Step 4. Add the first column to the third column.

"1 2 -1" "1 0 r 'i 2 O'


0 4 0 0 1 0 =
0 4 0

_0 -1 0 _0 0 l X) -1 0_

Step 5. Add (—2) times the first column to the second column.

~1 2 0~ "1 -2 0" "1 0 0~


0 4 0 0 1 0 = 0 4 0
_0 -1 0_ _0 0 1_ _0 -1 0_

Step 6. Multiply the second row by

"1 0 0~ '1 0 0~ "1 0 0"


0 1 0 0 4 0 = 0 1 0
4
_0 0 1 _0 -1 0 _0 -1 0_

Step 7. Add the second row to the third row.

i o on
_j
1_
T—H

o
o

o
o

I2 i
0 1 0 0 1 0 —
0 1 0 = “ |°
i_

-1
o
o
o

0 1 1_ 0 0_ _0 0 0_
r

The elementary matrices P and Q are

"1 0 0" "1 0 0" "1 0 O’ '1 0 0" ~4 0 O'


p = 0 1 0 0 1 0 1 1 0 0 1 0 _ 1
— 4 1 1 0
w
_0 1 1 0 0 1 0 0 1 1 0 1 5 1 4_

'1 0 O' "1 0 r 'i -2 0" "1 -2 r


Q = 0 1 1 0 1 0 0 1 0 = 0 1 i
0 0 1_ 0 0 l _o 0 1_ 0 0 i_

The transformation PAQ yields

"4 0 0“ “ 1 2 -3“ '1 -2 1 " '1 0 0“


o
>

1 1 0 -1 2 -1 0 1 1 = 0 1 0
ii

_5 1 4_ -1 -3 4_ 0 0 1_ _0 0 0_
Sec. 4.8 Transformations on Matrices 249

which is the normal form


I, ! 0
0 0

The transformation B = PAQ is the most general kind of matrix trans¬


formation. Other transformations are defined in terms of the relationship
between P and Q. Specifically, these transformations are

Collineatory (Similarity):

B = Q_1AQ, or P = Q 1
Orthogonal:
B = QtAQ = Q_1AQ, or P = QT = Q 1
Congruent:
B = QrAQ, or P = QT

If A is a Hermitian matrix, the following transformations are defined:

Conjunctive:
B = Q*rAQ, or P = Q*T
Unitary:
B = Q*rAQ = Q'AQ, or P = Q*T = Q1

Collineatory (Similarity) Transformation

Consider the linear transformation

y = Ax (4.8-3)

where both x and y are defined in an /7-dimensional space in terms of


a basis z,-. Assume now that the basis is to be changed to the set of vectors
Wj. This is a generalized problem in coordinate transformations. Let
x' and y' form the coordinates of x and y, respectively, in the new basis.
Since and form two bases in the /7-dimensional space, there must
exist a nonsingular P such that

x' = Px X = P-1x'
, , , (4.8-4)
y' = Py y = P V

The relationship between y' and x' in the new coordinate system is to
be found. To obtain this relationship, premultiply both sides of Eq.
4.8-3 by P, forming Py = PAx. Using Eq. 4.8-4,

y' = PAP-V (4.8-5)


or
y' = Bx' where B = Q~XAQ, P = Q"1 (4.8-6)
250 Matrices and Linear Spaces

The matrix B, relating x' and y' in the new coordinate system, is obtained
from A by a similarity transformation.
Similarity transformations have the extremely important property
that the characteristic values are invariant under such a transformation.
To show this, let B = PAQ. Then

B — AI = PAQ - AI = P(A - AP *Q *)Q

The corresponding determinants are

|B - AI| = |P| • |Q| • |A - AP^Q-1!

Since P = Q_1, the product of the determinants |P| and |Q| is equal to
unity. It follows that
|B - AI| = |A - AI| (4.8-7)

Therefore the characteristic values are invariant under a similarity trans¬


formation. If x/ is a characteristic vector corresponding to the char¬
acteristic value Xi of Q iAQ, then xz- = Qx/ must be a characteristic
vector of A corresponding to the same characteristic value Ai of A.

Orthogonal Transformation

Consider the linear transformation of Eq. 4.8-4,

x = Qx', P 1 = Q

where the vector x is defined in terms of an orthogonal coordinate system.


If the new coordinate system is also orthogonal, then the length of the
vector x' in the new coordinate system must be identical with the length
of the vector x in the original coordinate system. Therefore

<X, X) = <x', x'>

Expressing this relationship in terms of the matrix Q,

X TX = xTx = xTQTQx
This requires that QTQ = I, or

Qr = Q1 (4.8-8)
Therefore, if a transformation from one mutually orthogonal basis to
another mutually orthogonal basis is made, the transformation matrix
Q which relates a vector in the new coordinate system to a vector in the
original coordinate system must satisfy the relation shown in Eq. 4.8-8.
The transformation is then called an orthogonal transformation. The
matrix Q is called an orthogonal matrix. An orthogonal transformation
is a special case of a similarity transformation. Lengths and angles are
preserved.
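As a small numerical illustration (a sketch only; the rotation angle and the vector below are arbitrary choices), the Python fragment checks that Q^T Q = I and that the length of x = Qx' equals the length of x'.

    import numpy as np

    theta = 0.3
    Q = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # a rotation is orthogonal
    x_prime = np.array([3.0, 4.0])
    x = Q @ x_prime                                    # x = Q x'

    print(np.allclose(Q.T @ Q, np.eye(2)))             # True
    print(np.linalg.norm(x_prime), np.linalg.norm(x))  # both 5.0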

As a consequence of Eq. 4.8-8, it is seen that

|Q^T| · |Q| = |Q|^2 = 1   or   |Q| = ±1          (4.8-9)


An interpretation of the minus sign for the determinant of Q is that an
orthogonal transformation may be obtained by a rotation and a reflection.
Lengths and angles are still preserved if both a rotation and a reflection
are involved in an orthogonal transformation. It should also be noted
that the cosines of the angles between the axis i' and the axes 1, 2, . . . , n are
denoted, respectively, by the elements q_i1, q_i2, . . . , q_in of the Q matrix.
These quantities are called direction cosines, or the directions of the new
axes with respect to the old axes.

Unitary Transformation

If A is a Hermitian matrix (a_ij = a_ji*), then a treatment corresponding


to that shown for an orthogonal transformation can be used. For this
case, it can be shown that the transformation from one orthonormal basis
to another orthonormal basis is accomplished through the use of a unitary
transformation of the form x = Qx', where Q is a unitary matrix defined
by
Q*^T Q = I   or   Q*^T = Q^-1                    (4.8-10)
The determinant of a unitary matrix has unit magnitude. Thus a unitary transformation
preserves lengths in the complex sense.

Congruent Transformation

Two matrices A and B are called congruent if one is obtained from the
other by a sequence of pairs of elementary operations, each pair consisting
of an elementary row transformation followed by the same elementary
column transformation. As a consequence of the definitions of elementary
transformations, it follows that two matrices are congruent if there exists a
nonsingular matrix Q such that
B = Q^T AQ                                       (4.8-11)
Congruency is a special case of equivalence, so that congruent matrices
have equal rank. A congruency transformation is a transformation to
a new basis such that, if two vectors x and y are related in the original
basis by Eq. 4.8-3, the vectors x' and y' in the new basis are related by
the equation
y' = Bx' = Q^T AQ x'                             (4.8-12)
Transformations of this type are useful when dealing with quadratic
forms, which are discussed in the next section.

Conjunctive Transformation

Two (n x n) Hermitian matrices A and B are conjunctive if one can be


obtained from the other by a sequence of pairs of elementary trans¬
formations, each pair consisting of a column transformation and the
corresponding conjugate row transformation. In view of the definitions
of elementary transformations, two (n x n) Hermitian matrices A and B
are conjunctive (Hermitely congruent) if there exists a nonsingular matrix
Q such that
B = Q*^T AQ                                      (4.8-13)
A conjunctive transformation is a transformation to a new basis such
that, if two vectors x and y are related in the original basis by Eq. 4.8-3,
the vectors x' and y' in the new basis are related by the equation

y' = Bx' = Q*^T AQ x'                            (4.8-14)

Transformation to a Diagonal Form

Quite frequently it is advantageous to transform the coordinate system


of a problem to a new coordinate system, such that a linear transformation
in the new coordinate system is dependent upon a diagonal matrix. For
this reason, the conditions under which the matrix B = PAQ can be
reduced to a diagonal form are now discussed.

Congruent Transformations

Real Symmetric Matrices. The real symmetric matrix A of rank r


can be reduced to a congruent diagonal matrix having the canonical
form
          [I_p        0       0]
Q^T AQ =  [ 0    -I_(r-p)     0]                 (4.8-15)
          [ 0        0        0]

The integer p is called the index of the matrix, and the integer s = p -
(r - p) = 2p - r is called the signature of the matrix. Two (n x n) real
symmetric matrices are congruent if and only if they have the same rank
and the same signature or index.
Example 4.8-2. Reduce the given symmetric A to the canonical form of Eq. 4.8-15.
    [1  2  2]
A = [2  3  5]
    [2  5  5]

Step 1. Subtract the third row from the second row.

[1  0  0] [1  2  2]   [1  2  2]
[0  1 -1] [2  3  5] = [0 -2  0]
[0  0  1] [2  5  5]   [2  5  5]

Step 2. Subtract the third column from the second column.

[1  2  2] [1  0  0]   [1  0  2]
[0 -2  0] [0  1  0] = [0 -2  0]
[2  5  5] [0 -1  1]   [2  0  5]

Step 3. Subtract twice the first row from the third row.

[ 1  0  0] [1  0  2]   [1  0  2]
[ 0  1  0] [0 -2  0] = [0 -2  0]
[-2  0  1] [2  0  5]   [0  0  1]

Step 4. Subtract twice the first column from the third column.

[1  0  2] [1  0 -2]   [1  0  0]
[0 -2  0] [0  1  0] = [0 -2  0]
[0  0  1] [0  0  1]   [0  0  1]

Step 5. Multiply the second row by 1/√2 and the second column by 1/√2.

[1   0    0] [1  0  0] [1   0    0]   [1  0  0]
[0  1/√2  0] [0 -2  0] [0  1/√2  0] = [0 -1  0]
[0   0    1] [0  0  1] [0   0    1]   [0  0  1]

Step 6. Interchange the second column with the third column and the second row
with the third row.

[1  0  0] [1  0  0] [1  0  0]   [1  0  0]
[0  0  1] [0 -1  0] [0  0  1] = [0  1  0]
[0  1  0] [0  0  1] [0  1  0]   [0  0 -1]

In this case the index of the matrix is p = 2, the rank is r = 3, and the signature is s = 1.

    [1  0  0] [1  0 -2] [1   0    0] [1  0  0]   [1  -2     0  ]
Q = [0  1  0] [0  1  0] [0  1/√2  0] [0  0  1] = [0   0    1/√2]
    [0 -1  1] [0  0  1] [0   0    1] [0  1  0]   [0   1   -1/√2]
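The arithmetic of this example can be confirmed with a few lines of Python (NumPy assumed); the sketch simply evaluates Q^T AQ for the A and Q found above.

    import numpy as np

    A = np.array([[1.0, 2.0, 2.0],
                  [2.0, 3.0, 5.0],
                  [2.0, 5.0, 5.0]])
    Q = np.array([[1.0, -2.0,  0.0],
                  [0.0,  0.0,  1.0/np.sqrt(2.0)],
                  [0.0,  1.0, -1.0/np.sqrt(2.0)]])

    print(np.round(Q.T @ A @ Q, 10))   # diag(1, 1, -1): rank 3, index 2, signature 1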

Skew-Symmetric Matrices. A skew-symmetric matrix of rank r can


be reduced to the canonical form of Eq. 4.8-16 by the congruent trans-
formation B = Q^T AQ. The result is

B = Q^T AQ = diag (D_1, D_2, . . . , D_m, 0, . . . , 0)          (4.8-16)

where
        [ 0  1]
D_i =   [-1  0]        and     m = (1/2) rank A

Two (n x n) skew-symmetric matrices are congruent if and only if they


have the same rank.
Complex Symmetric Matrices. An (n x n) complex symmetric matrix of
rank r can be reduced by a congruency transformation to the canonical
form
           [I_r  0]
Q^T AQ =   [ 0   0]                              (4.8-17)

Two (n x n) complex symmetric matrices are congruent if and only if they
have equal rank.
Hermitian Matrices. An (n x n) Hermitian matrix of rank r can be
reduced to the canonical matrix of Eq. 4.8-18 by a conjunctive trans-
formation.

              [I_p        0       0]
B = Q*^T AQ = [ 0    -I_(r-p)     0]             (4.8-18)
              [ 0        0        0]

The index p and signature s are the same as defined for a real symmetric
matrix.
Skew-Hermitian Matrices. An (n x n) skew-Hermitian matrix of rank
r can be reduced to the canonical matrix of Eq. 4.8-19 by a conjunctive
transformation.

              [jI_p        0       0]
B = Q*^T AQ = [ 0     -jI_(r-p)    0]            (4.8-19)
              [ 0         0        0]
Similarity Transformations

In the previous section it was shown that a transformation of the type
M^-1 AM produces a diagonal matrix Λ if the matrix A has n distinct
characteristic values. For the case where repeated roots are involved, this
transformation is still possible provided the matrix [λI - A] has full
degeneracy. This similarity transformation is also always possible if the
matrix A is real and symmetric, a case which is prevalent in the study of
linear circuits. Since the characteristic vectors of a real symmetric (or
Hermitian) matrix are mutually orthogonal, there always exists a real
orthogonal matrix such that

Q^-1 AQ = Q^T AQ = diag (λ_1, λ_2, . . . , λ_n)              (4.8-20)

However, this is not generally the case when nonsymmetric matrices are
involved, and most matrices found in the analysis of control systems are
nonsymmetric matrices.

When an (n x n) nonsymmetric matrix has repeated characteristic values,


there may be less than n linearly independent characteristic vectors;
thus a similarity transformation to a diagonal form may be impossible.
However, it can be shown that any square matrix A can be transformed
by means of a similarity transformation to a Jordan canonical matrix
having the following properties:8

1. The diagonal elements of the matrix are the characteristic values


of A.
2. All the elements below the principal diagonal are zero.
3. A certain number of unit elements are contained in the superdiagonal
(the elements immediately to the right of the principal diagonal) when the
adjacent elements in the principal diagonal are equal.

A typical Jordan form is

    [λ_1  1    0    0    0    0 ]
    [ 0  λ_1   1    0    0    0 ]
J = [ 0   0   λ_1   0    0    0 ]
    [ 0   0    0   λ_1   0    0 ]
    [ 0   0    0    0   λ_2   1 ]
    [ 0   0    0    0    0   λ_2]

Note that the "ones" occur in blocks of the form

[λ_i   1                  ]
[     λ_i   1             ]
[          λ_i   .        ]
[                 .     1 ]
[                      λ_i]

These are called Jordan blocks.

The number of Jordan blocks associated with a given characteristic
value λ_i in the Jordan form resulting from a similarity transformation of A
is equal to the number of characteristic vectors associated with the
characteristic value, i.e., q, the degeneracy of [λ_iI - A]. Unfortunately,
however, the orders of the Jordan blocks are not easily determined.†

† This is an extremely complicated problem of linear algebra.9,10

The result is that it is not clear whether the Jordan form given above or the
form

    [λ_1  1    0    0    0    0 ]
    [ 0  λ_1   0    0    0    0 ]
J = [ 0   0   λ_1   1    0    0 ]
    [ 0   0    0   λ_1   0    0 ]
    [ 0   0    0    0   λ_2   1 ]
    [ 0   0    0    0    0   λ_2]

would be the result of the transformation J = M^-1 AM. The number of
"ones" associated with a given λ_i is the order of λ_i in the characteristic
equation minus the degeneracy of [λ_iI - A]. However, this does not
clearly define the situation, since both of the above forms have two "ones"
associated with λ_1.
It is useful to know that in the case of full degeneracy no "ones" are
present, as was shown in the preceding section. Also, in the case of simple
degeneracy (q = 1), all the superdiagonal elements are unity. For the
cases which fit neither of these categories, the level of the discussion
presented here dictates a trial and error determination of J and M based
upon

AM = MJ                                          (4.8-21)

Let the columns of M be denoted by x_1, x_2, . . . , x_n. Then there is a
Jordan block of order m associated with λ_i if and only if the m linearly
independent vectors x_1, x_2, . . . , x_m satisfy the equations

Ax_1 = λ_i x_1               or   (λ_iI - A)x_1 = 0
Ax_2 = λ_i x_2 + x_1         or   (λ_iI - A)x_2 = -x_1
  .                                                              (4.8-22)
  .
Ax_m = λ_i x_m + x_(m-1)     or   (λ_iI - A)x_m = -x_(m-1)

These expressions apply for each Jordan block.


Example 4.8-3. Show that the matrix

    [λ_1   1 ]
A = [ 0   λ_2]

cannot be reduced to a diagonal form by means of a similarity transformation, if
λ_1 = λ_2.
Let

    [q_11  q_12]                       1   [ q_22  -q_12]
Q = [q_21  q_22]     and     Q^-1 =   ---  [-q_21   q_11]
                                      |Q|

where |Q| = q_11 q_22 - q_21 q_12. The similarity transformation Q^-1 AQ is then

            1  [ q_22(q_11 λ_1 + q_21) - q_12 q_21 λ_2     q_22(q_12 λ_1 + q_22) - q_12 q_22 λ_2]
Q^-1 AQ =  --- [-q_21(q_11 λ_1 + q_21) + q_11 q_21 λ_2    -q_21(q_12 λ_1 + q_22) + q_11 q_22 λ_2]
           |Q|

If the nondiagonal terms are to vanish, then

q_12 q_22 (λ_1 - λ_2) + q_22^2 = 0
q_11 q_21 (λ_1 - λ_2) + q_21^2 = 0

If λ_1 = λ_2, then q_22 and q_21 must vanish. However, if these terms vanish, then |Q| = 0,
which violates the similarity transformation. Therefore the matrix A cannot be
diagonalized by a similarity transformation if λ_1 = λ_2. Since A is already a Jordan
form, no further transformations are considered.

Example 4.8-4. Reduce the A matrix of Example 4.7-3 with repeated characteristic
roots and simple degeneracy to Jordan form. Determine M.
Since
    [0  1  0]
A = [0  0  1]
    [1 -3  3]

has a characteristic value λ = 1 of order three and only simple degeneracy, there is only
one linearly independent characteristic vector. Hence the Jordan form contains one
Jordan block. Also, the order of λ, minus the degeneracy, indicates two "ones" in the
Jordan form. Hence the Jordan form must be

    [1  1  0]
J = [0  1  1]
    [0  0  1]

The characteristic vector x_1 is given by the first of Eqs. 4.8-22, which is

x_21 = x_11
x_31 = x_21
x_11 - 3x_21 + 3x_31 = x_31

where x_11, x_21, and x_31 denote the elements of x_1. These equations yield x_11 = x_21 = x_31.
Thus
      [1]
x_1 = [1]
      [1]

is a characteristic vector. Note that it must be, and is, the same characteristic vector
determined in Example 4.7-3.
Now, considering x_2, the second of Eqs. 4.8-22 gives

x_22 = x_12 + 1
x_32 = x_22 + 1
x_12 - 3x_22 + 3x_32 = x_32 + 1

Substitution of the first two equations into the third yields x_12 = x_12. Hence x_12 is
arbitrary. Let x_12 = 1. Then
      [1]
x_2 = [2]
      [3]

The third of Eqs. 4.8-22 gives

x_23 = x_13 + 1
x_33 = x_23 + 2
x_13 - 3x_23 + 3x_33 = x_33 + 3

Again, substitution of the first two equations into the third yields an arbitrary com-
ponent, since it gives x_13 = x_13. Hence let x_13 = -1. Then

      [-1]
x_3 = [ 0]
      [ 2]
Thus
    [1  1 -1]
M = [1  2  0]
    [1  3  2]
As a check,
       [ 4 -5  2]
M^-1 = [-2  3 -1]
       [ 1 -2  1]
and
           [1  1  0]
M^-1 AM =  [0  1  1]
           [0  0  1]
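A quick numerical check of this example (a Python/NumPy sketch, not part of the original development) verifies that M^-1 AM reproduces the Jordan form J.

    import numpy as np

    A = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, -3.0, 3.0]])
    M = np.array([[1.0, 1.0, -1.0],
                  [1.0, 2.0,  0.0],
                  [1.0, 3.0,  2.0]])

    print(np.round(np.linalg.inv(M) @ A @ M, 10))   # the 3 x 3 Jordan block for lambda = 1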
Example 4.8-5. Reduce
    [1  0  0]
A = [1  1  0]
    [2  3  2]
to Jordan form and determine M.
Evaluation of |λI - A| yields the characteristic equation (λ - 1)^2 (λ - 2) = 0.
Considering the degeneracy of [λI - A] for λ = 1,

          [ 0  0  0]
[I - A] = [-1  0  0]
          [-2 -3 -1]

is of degeneracy one. Hence J consists of two Jordan blocks. One is a first order block
corresponding to λ = 2. The second is a second order block with a single "one." It is

    [2  0  0]
J = [0  1  1]
    [0  0  1]

The characteristic vector x_1 corresponding to λ = 2 is given by

x_11 = 2x_11
x_11 + x_21 = 2x_21
2x_11 + 3x_21 + 2x_31 = 2x_31

Choosing x_31 = 1, arbitrarily, leads to

      [0]
x_1 = [0]
      [1]
Considering x_2 corresponding to λ = 1,
x_12 = x_12
x_12 + x_22 = x_22
2x_12 + 3x_22 + 2x_32 = x_32

Choosing x_22 = 1, arbitrarily, leads to

      [ 0]
x_2 = [ 1]
      [-3]
In order to determine a second vector corresponding to λ = 1, Eq. 4.8-22 gives
x_13 = x_13
x_13 + x_23 = x_23 + 1
2x_13 + 3x_23 + 2x_33 = x_33 - 3

Arbitrarily choosing x_23 = 0 gives

      [ 1]
x_3 = [ 0]
      [-5]
Thus
    [0  0  1]
M = [0  1  0]
    [1 -3 -5]
As a check,
       [5  3  1]
M^-1 = [0  1  0]
       [1  0  0]
and
               [2  0  0]
M^-1 AM = J =  [0  1  1]
               [0  0  1]
Example 4.8-6. Reduce
    [0  0  1  0]
A = [0  0  0  1]
    [0  0  0  0]
    [0  0  0  0]
to Jordan form, and determine M.
Evaluation of |λI - A| yields the characteristic equation λ^4 = 0. The degeneracy of
[λI - A] for λ = 0 is two. Thus there are two Jordan blocks, and J has two "ones" in
the superdiagonal. These requirements are satisfied by either

    [0  0  0  0]            [0  1  0  0]
J = [0  0  1  0]    or  J = [0  0  0  0]
    [0  0  0  1]            [0  0  0  1]
    [0  0  0  0]            [0  0  0  0]

Equations 4.8-22 can be used on a trial and error basis to determine the correct form.
Assume that the correct Jordan form is the first one given, i.e., one consisting of
(1 x 1) and (3 x 3) Jordan blocks. The first of Eqs. 4.8-22 gives

x_31 = 0
x_41 = 0
0 = 0
0 = 0
Thus associated with the (1 x 1) Jordan block is

      [x_11]
x_1 = [x_21]
      [ 0  ]
      [ 0  ]
where x_11 and x_21 cannot both be zero, but are otherwise arbitrary.
Now, considering the (3 x 3) Jordan block, the first of Eqs. 4.8-22 gives
x_32 = 0
x_42 = 0
0 = 0
0 = 0
Hence
      [x_12]
x_2 = [x_22]
      [ 0  ]
      [ 0  ]
where x_12 and x_22 cannot both be zero and must be chosen so that x_1 and x_2 are linearly
independent.
The second of Eqs. 4.8-22 gives
x_33 = 0 + x_12
x_43 = 0 + x_22
0 = 0 + 0
0 = 0 + 0
Thus
      [x_13]
x_3 = [x_23]
      [x_12]
      [x_22]

Using the third of Eqs. 4.8-22,

x_34 = 0 + x_13
x_44 = 0 + x_23
0 = 0 + x_12
0 = 0 + x_22

The last two expressions violate x_2 ≠ 0. Hence the proper Jordan form cannot be the
one consisting of (1 x 1) and (3 x 3) Jordan blocks.
Assuming now that the correct Jordan form consists of two (2 x 2) Jordan blocks,
the first of Eqs. 4.8-22 yields
      [x_11]
x_1 = [x_21]
      [ 0  ]
      [ 0  ]
where x_11 and x_21 cannot both be zero.
The second of Eqs. 4.8-22 gives
x_32 = 0 + x_11
x_42 = 0 + x_21
0 = 0 + 0
0 = 0 + 0
Thus
      [x_12]
x_2 = [x_22]
      [x_11]
      [x_21]

where x_1 and x_2 must be linearly independent.

Now considering the vectors associated with the second (2 x 2) Jordan block, the
first of Eqs. 4.8-22 gives
x_33 = 0
x_43 = 0
0 = 0
0 = 0
Thus
      [x_13]
x_3 = [x_23]
      [ 0  ]
      [ 0  ]

where x_13 and x_23 must be chosen so that x_1 and x_3 are linearly independent.
The second of Eqs. 4.8-22 gives
x_34 = 0 + x_13
x_44 = 0 + x_23
0 = 0 + 0
0 = 0 + 0

Thus
      [x_14]               [x_11  x_12  x_13  x_14]
x_4 = [x_24]    and    M = [x_21  x_22  x_23  x_24]
      [x_13]               [ 0    x_11   0    x_13]
      [x_23]               [ 0    x_21   0    x_23]

where the components of x_1, x_2, x_3, and x_4 must be chosen so that these vectors are
linearly independent.
As a simple check, let
    [1  0  0  0]
M = [0  0  1  0]
    [0  1  0  0]
    [0  0  0  1]
Then
       [1  0  0  0]
M^-1 = [0  0  1  0]
       [0  1  0  0]
       [0  0  0  1]
and
           [0  1  0  0]
M^-1 AM =  [0  0  0  0]
           [0  0  0  1]
           [0  0  0  0]
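The trial-and-error conclusion can likewise be checked numerically; the sketch below (Python/NumPy, using the particular M of the check above) shows that M^-1 AM consists of two (2 x 2) Jordan blocks.

    import numpy as np

    A = np.zeros((4, 4))
    A[0, 2] = 1.0
    A[1, 3] = 1.0
    M = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])

    print(np.linalg.inv(M) @ A @ M)   # two 2 x 2 Jordan blocks for lambda = 0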

There are many other canonical forms which can be obtained, the Jordan
form being a special case of the more general hypercompanion form.
A fairly complete listing of these forms is given in the literature.11,12
For most of the problems that the reader will face, knowledge of the
Jordan form is adequate.

4.9 BILINEAR AND QUADRATIC FORMS

An expression of the form

B = a_11 x_1 y_1 + a_12 x_1 y_2 + · · · + a_1n x_1 y_n
  + a_21 x_2 y_1 + a_22 x_2 y_2 + · · · + a_2n x_2 y_n
  + . . . . . . . . . . . . . . . . . . . . . . . . .
  + a_n1 x_n y_1 + a_n2 x_n y_2 + · · · + a_nn x_n y_n

where all components are real, is called a bilinear form in the variables
x_i, y_j. This form can be written compactly as

B = Σ_(i=1)^n Σ_(j=1)^n a_ij x_i y_j             (4.9-1)

or in matrix form as

                        [a_11  a_12  · · ·  a_1n] [y_1]
                        [a_21  a_22  · · ·  a_2n] [y_2]
B = [x_1 x_2 · · · x_n] [  ·     ·            ·  ] [ · ]  = x^T Ay = ⟨x, Ay⟩
                        [a_n1  a_n2  · · ·  a_nn] [y_n]
                                                                 (4.9-2)
The matrix A is called the coefficient matrix of the form, and the rank
of A is called the rank of the form.
If the vector x is equal to the vector y, then Eq. 4.9-2 becomes

Q = x^T Ax = ⟨x, Ax⟩                             (4.9-3)

Q is called a quadratic form in x_1, x_2, . . . , x_n. An alternative expression
for Q is the double summation

Q = Σ_(i=1)^n Σ_(j=1)^n a_ij x_i x_j             (4.9-4)

Note that the coefficient for the term x_i x_j (i ≠ j) is equal to (a_ij + a_ji).
This coefficient would be unchanged if both a_ij and a_ji are set equal to
(1/2)(a_ij + a_ji). Therefore the matrix A can be said to be a symmetric matrix
without any loss in generality.
If the matrix A is a Hermitian matrix, such that a_ij* = a_ji, then the
corresponding Hermitian form is defined as

H = x*^T Ax = Σ_(i=1)^n Σ_(j=1)^n a_ij x_i* x_j = ⟨x, Ax⟩,   (a_ij* = a_ji)     (4.9-5)

The theorems which are developed for a real quadratic form have a
set of analogous theorems for the case of a Hermitian form. Since the
proofs of the analogous theorems require only minor changes from
the proofs for the real quadratic case, the latter theorems are stated
without proof.

Transformation of Variables
The linear transformation x = By, where B is an arbitrary (n x n)
nonsingular matrix, transforms the quadratic form of Eq. 4.9-3 into a
quadratic form in the variables y_1, y_2, . . . , y_n. This form is

Q = y^T B^T ABy                                  (4.9-6)

Q = y^T Cy    where    C = B^T AB                (4.9-7)

Since C = B^T AB is a congruent transformation, it follows that the rank
of the quadratic (Hermitian) form is unchanged under a nonsingular
transformation of the variables.

Reduction to Diagonal Form

In many instances, it is desirable to express Q as a linear combination


of the squares of the coordinates, with no cross-terms present. The matrix
A can be reduced to a diagonal matrix by use of the congruent trans¬
formation shown in Eq. 4.9-7. A particularly useful transformation
occurs when B is an orthogonal matrix (B^T = B^-1), such that the trans-
formation is orthogonal. As shown in the preceding section, this can be
accomplished for symmetric matrices by choosing B to be equal to the
normalized modal matrix M. Thus the linear transformation x = My
produces the quadratic form

Q = y^T M^-1 AMy = y^T Λy = ⟨y, Λy⟩
  = λ_1 y_1^2 + λ_2 y_2^2 + · · · + λ_n y_n^2    (4.9-8)

If the symmetric matrix A is of rank r < n, the modal matrix can still
be formed such that the transformation shown yields a diagonal matrix
with the diagonal terms equal to the characteristic values of A. For this
situation the modal matrix M is not unique. There are infinitely many
ways in which a set of m orthogonalized characteristic vectors corre-
sponding to a characteristic value of order m can be chosen. Note that,
if there is a zero characteristic value of order m, then there are only n - m
nonzero terms in the quadratic form. The matrix A is then of rank
r = n - m.
Example 4.9-1. Reduce Q to a linear sum of squares, where Q = ⟨x, Ax⟩ and

    [3  1  1]
A = [1  0  2]
    [1  2  0]

The characteristic values for A and associated normalized characteristic vectors are
given by:
                          [ 1]
λ_1 = 1,    u_1 = (1/√3)  [-1]
                          [-1]

                          [2]
λ_2 = 4,    u_2 = (1/√6)  [1]
                          [1]

                          [ 0]
λ_3 = -2,   u_3 = (1/√2)  [ 1]
                          [-1]

The normalized modal matrix is then

    [ 1/√3   2/√6    0  ]                [1/√3  -1/√3  -1/√3]
M = [-1/√3   1/√6   1/√2]    and  M^-1 = [2/√6   1/√6   1/√6]
    [-1/√3   1/√6  -1/√2]                [ 0     1/√2  -1/√2]

The transformation

x_1 = (1/√3)y_1 + (2/√6)y_2                    y_1 = (1/√3)x_1 - (1/√3)x_2 - (1/√3)x_3
x_2 = -(1/√3)y_1 + (1/√6)y_2 + (1/√2)y_3       y_2 = (2/√6)x_1 + (1/√6)x_2 + (1/√6)x_3
x_3 = -(1/√3)y_1 + (1/√6)y_2 - (1/√2)y_3       y_3 = (1/√2)x_2 - (1/√2)x_3

leads to the quadratic form

Q = λ_1 y_1^2 + λ_2 y_2^2 + λ_3 y_3^2 = y_1^2 + 4y_2^2 - 2y_3^2

The reduction to a sum of squares can also be approached by the Lagrange technique
of repeated completion of the square. This technique is demonstrated as follows:

Q = 3x_1^2 + 2x_1 x_2 + 2x_1 x_3 + 4x_2 x_3
  = 3x_1^2 + 2x_1(x_2 + x_3) + (1/3)(x_2 + x_3)^2 + 4x_2 x_3 - (1/3)(x_2 + x_3)^2
  = [√3 x_1 + (1/√3)(x_2 + x_3)]^2 - (1/3)(x_2 - 5x_3)^2 + 8x_3^2

Let
y_1 = √3 x_1 + (1/√3)x_2 + (1/√3)x_3
y_2 = x_2 - 5x_3
y_3 = x_3
Then
Q = y_1^2 - (1/3)y_2^2 + 8y_3^2

The matrix B which performed the transformation x = By is the triangular matrix

    [1/√3  -1/3  -2]
B = [  0     1    5]
    [  0     0    1]

This is not an orthogonal matrix, and therefore the new coordinate system does not
have mutually orthogonal unit vectors. However, the congruent transformation
B^T AB does reduce the quadratic form to a sum of squares.
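Because A is symmetric, the orthogonal diagonalization of this example can be reproduced directly with a numerical eigenvalue routine. The Python sketch below (NumPy assumed) obtains an orthonormal modal matrix and evaluates M^T AM; the columns may differ from the hand calculation by ordering and sign.

    import numpy as np

    A = np.array([[3.0, 1.0, 1.0],
                  [1.0, 0.0, 2.0],
                  [1.0, 2.0, 0.0]])

    lam, M = np.linalg.eigh(A)         # orthonormal characteristic vectors
    print(lam)                         # [-2.  1.  4.]
    print(np.round(M.T @ A @ M, 10))   # diagonal matrix of characteristic values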

The results of the preceding discussion show that a real quadratic
form can be reduced by a real nonsingular transformation to the form
shown in Eq. 4.9-8 or, for the case where the congruent transformation
is nonorthogonal, to the form

Q = α_1 z_1^2 + α_2 z_2^2 + · · · + α_p z_p^2 - α_(p+1) z_(p+1)^2 - · · · - α_n z_n^2     (4.9-9)

The number of positive terms p is called the index of the quadratic form.
If the quadratic form is of rank r, then only r terms are present in
Eq. 4.9-9. If the nonsingular transformation

w_i = √(α_i) z_i      (i = 1, 2, . . . , r)
w_i = z_i             (i = r + 1, . . . , n)

is introduced, then Eq. 4.9-9 becomes

Q = w_1^2 + w_2^2 + · · · + w_p^2 - w_(p+1)^2 - · · · - w_r^2        (4.9-10)

Equation 4.9-10 can be viewed as a direct consequence of a transformation
to the canonical form of Eq. 4.8-15.
In a similar manner, a Hermitian form of rank r can be reduced to the
diagonal forms

Q = α_1 z_1* z_1 + α_2 z_2* z_2 + · · · + α_r z_r* z_r
  = w_1* w_1 + w_2* w_2 + · · · + w_p* w_p - w_(p+1)* w_(p+1) - · · · - w_r* w_r     (4.9-11)

The latter form follows from the definition of the canonical matrix of
Eq. 4.8-18.

Definite and Semidefinite Forms

The quadratic form Q = (x, Ax) is said to be positive definite if it is


non-negative for all real values of x, and is zero only when the vector x is
a null vector. For these conditions to be satisfied, it is clear from Eq.
4.9-10 that A must be a nonsingular matrix, and that the index (number
of positive terms) and rank of the quadratic form must be equal. From
Eq. 4.9-8, it is evident that a quadratic form is positive definite if and only
if the characteristic values of the nonsingular matrix A are all positive.
Either of these conditions can be used to define a positive definite quadratic
form.
If the quadratic form (x, Ax) is positive definite, the matrix A is also
said to be positive definite. A real symmetric matrix is positive definite
if and only if there exists a nonsingular matrix C such that

A = C^T C                                        (4.9-12)

To show this, let A be reduced to a unit matrix by a congruent trans-
formation. Hence A can be written as A = B^T I_n B. Since I_n^2 = I_n and
I_n^T = I_n, A = (B^T I_n)(I_n B). Let C = I_n B. Then A = C^T C. These
steps are also valid when A is reduced to the canonical form

[I_r  0]
[ 0   0]

by a congruent transformation. In the latter case, C is of rank r, and A
is positive semidefinite.
The quadratic form is called positive semidefinite if it is non-negative.
It can be zero when the vector x is not zero. This case arises when A
is singular. From Eq. 4.7-8 it follows that, if A is singular and of rank r,
then it must possess n - r characteristic roots equal to zero. There are
then n - r terms of Eq. 4.9-8 which are identically zero, even when the
associated y_i components are nonzero.
Analogous to the definitions above, the quadratic form may be negative
definite and negative semidefinite. The conditions required for these
forms, as well as the corresponding Hermitian cases, are listed in Table
4.9-1.

Determination of Positive Definiteness by Use of the


Principal Minors

It is advantageous to be able to determine whether or not a quadratic
form is positive definite without solving the characteristic value problem,
or reducing A to a canonical form. This can be done by examination of
the leading principal minors of A. A principal minor of A is a minor of A
whose diagonal elements are also diagonal elements of A. The mth
leading principal minor of A, denoted by Δ_m, is defined as the determinant
obtained by deleting the last n - m columns and rows of A. Since A is
symmetric, the leading principal minors are

                     |a_11  a_12|         |a_11  a_12  a_13|
Δ_1 = a_11,    Δ_2 = |a_12  a_22|,  Δ_3 = |a_12  a_22  a_23|, . . . , Δ_n = |A|
                                          |a_13  a_23  a_33|
                                                                 (4.9-13)

These leading principal minors of A are also called the discriminants
of the quadratic form.
It is now shown that a real quadratic form is positive definite if and only
if all the leading principal minors of A are positive. The starting point of
this proof is Eq. 4.9-12, A = C^T C. Without any loss in generality, let

C = Λ^(1/2) D                                    (4.9-14)
Table 4.9-1  Conditions for Definite and Semidefinite Forms
(real quadratic form Q = ⟨x, Ax⟩ with A real symmetric; Hermitian form H = ⟨x, Ax⟩
with A Hermitian)

Positive definite:      Q > 0 for every x ≠ 0; all characteristic values λ_i > 0;
                        all leading principal minors Δ_1, Δ_2, . . . , Δ_n > 0;
                        A = C^T C (A = C*^T C in the Hermitian case) with C nonsingular;
                        rank = index = n.

Positive semidefinite:  Q ≥ 0 for every x, and Q = 0 for some x ≠ 0; all λ_i ≥ 0 with
                        at least one λ_i = 0; A = C^T C with C of rank r < n;
                        rank = index = r.

Negative definite:      Q < 0 for every x ≠ 0; all λ_i < 0; the leading principal minors
                        alternate in sign, Δ_1 < 0, Δ_2 > 0, Δ_3 < 0, . . . ;
                        (-A) is positive definite.

Negative semidefinite:  Q ≤ 0 for every x, and Q = 0 for some x ≠ 0; all λ_i ≤ 0 with
                        at least one λ_i = 0; (-A) is positive semidefinite.

where Λ is a diagonal matrix whose diagonal elements are λ_1, λ_2, . . . , λ_n,
and D is a triangular matrix of the form

    [1  d_12  d_13  · · ·  d_1n]
    [0   1    d_23  · · ·  d_2n]
D = [0   0     1    · · ·  d_3n]
    [·   ·     ·            ·  ]
    [0   0     0    · · ·   1  ]

Substitution of Eq. 4.9-14 into Eq. 4.9-12 yields

A = D^T ΛD

    [1     0     0    · · ·  0] [λ_1   0    0   · · ·   0 ] [1  d_12  d_13  · · ·  d_1n]
    [d_12  1     0    · · ·  0] [ 0   λ_2   0   · · ·   0 ] [0   1    d_23  · · ·  d_2n]
  = [d_13  d_23  1    · · ·  0] [ 0    0   λ_3  · · ·   0 ] [0   0     1    · · ·  d_3n]
    [·     ·     ·           ·] [ ·    ·    ·           · ] [·   ·     ·            ·  ]
    [d_1n  d_2n  d_3n · · ·  1] [ 0    0    0   · · ·  λ_n] [0   0     0    · · ·   1  ]
                                                                 (4.9-15)

If a new variable y is defined by the linear transformation

y = Dx                                           (4.9-16)

then the real quadratic form Q = ⟨x, Ax⟩ can be expressed as

Q = ⟨y, Λy⟩ = λ_1 y_1^2 + λ_2 y_2^2 + · · · + λ_n y_n^2          (4.9-17)

The real quadratic form Q is positive definite if and only if all the
diagonal elements λ_1, λ_2, . . . , λ_n are positive. However, since D has been
chosen to be a triangular matrix with unity diagonal elements, the deter-
minant of D and D^T is equal to unity. Hence

|A| = |D^T| |Λ| |D| = |Λ| = λ_1 λ_2 · · · λ_n                    (4.9-18)

If the variable x_n is set equal to zero, then the variable y_n is also zero.
The quadratic form obtained by setting x_n = 0 is then

Q_1 = λ_1 y_1^2 + λ_2 y_2^2 + · · · + λ_(n-1) y_(n-1)^2

By a similar argument, the discriminant of this quadratic form, Δ_(n-1), is
equal to
Δ_(n-1) = λ_1 λ_2 · · · λ_(n-1)                                  (4.9-19)

In general, the discriminant Δ_k, obtained by setting x_(k+1) = x_(k+2) = · · · =
x_n = 0, is
Δ_k = λ_1 λ_2 · · · λ_k                                          (4.9-20)

Solving for the elements λ_1, λ_2, . . . , λ_n,

λ_1 = Δ_1
λ_2 = Δ_2 / Δ_1
λ_3 = Δ_3 / Δ_2                                                  (4.9-21)
 ·
 ·
λ_n = Δ_n / Δ_(n-1)

Clearly, if all the elements λ_i are to be positive, then all the leading principal
minors Δ_1, Δ_2, . . . , Δ_n must be positive. Therefore a real quadratic form
Q = ⟨x, Ax⟩ is positive definite if and only if all the leading principal
minors of A are positive.
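The leading-principal-minor test is easy to apply numerically. The short Python sketch below (a hypothetical helper, not from the text) computes the Δ_k directly and compares the verdict with the characteristic-value test.

    import numpy as np

    def is_positive_definite(A):
        """Leading-principal-minor (discriminant) test for a real symmetric A."""
        n = A.shape[0]
        return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    print(is_positive_definite(A))             # True
    print(np.all(np.linalg.eigvalsh(A) > 0))   # agrees: all characteristic values > 0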
For the case of Hermitian forms, a set of analogous statements can be
made. A summary of the useful statements regarding real quadratic and
Hermitian forms is given in Table 4.9-1. The statements about the negative
definite forms can be proved by requiring (—A) to be the matrix of a
positive definite form.

4.10 MATRIX POLYNOMIALS, INFINITE SERIES,


AND FUNCTIONS OF A MATRIX

In this section some of the basic ideas regarding matrix polynomials


and infinite series are developed. With the introduction of a few modi¬
fications, the theorems regarding matrix polynomials and infinite series
are directly analogous to the theorems of scalar variables. In addition,
some important theorems regarding functions of a matrix are presented.
These theorems are essential to the solution of linear vector-matrix
differential equations.

Powers of Matrices

The matrix product AAA • • • A, where A is a square matrix of order n,


can be written as Afc, where k is the number of factors involved in the
product. The multiplication of powers of a matrix follows the usual rules
for scalar algebra. The matrix A^0 is defined as the unit matrix of order n.

A^k A^m = A^(k+m)
(A^k)^m = A^(km)                                 (4.10-1)
A^0 = I_n

If the power to which the matrix is to be raised is negative, the same
rules apply if the matrix is nonsingular, so that its inverse exists. That is,

(A^-1)^m = A^-m                                  (4.10-2)

A set of similar rules applies in the case where a fractional power of a
matrix is to be computed. Thus, if A^m = B, where A is a square matrix,
then A is an mth root of B. The number of mth roots of a matrix de-
pends upon the nature of the matrix, there being no general rule as to how
many mth roots the matrix B possesses.
Example 4.10-1. Find the square root of A, where

    [a_11  a_12]
A = [a_21  a_22]
Let
      [b_11  b_12]^2
B^2 = [b_21  b_22]    = A

Then
[b_11^2 + b_12 b_21      b_12(b_11 + b_22) ]   [a_11  a_12]
[b_21(b_11 + b_22)       b_12 b_21 + b_22^2] = [a_21  a_22]

Equating like terms yields

b_11^2 + b_12 b_21 = a_11        b_12(b_11 + b_22) = a_12
b_21(b_11 + b_22) = a_21         b_12 b_21 + b_22^2 = a_22

This is a set of four nonlinear simultaneous equations which has no general solution.
A pair of numerical examples illustrate the ambiguity involved.
Let
    [4  1]
A = [0  1]

Then b_21 = 0, b_11 = ±2, b_22 = ±1, and b_12 = ±1/3. The square root of A is then

      [2  1/3]
B = ± [0   1 ]
As a second example, let
    [4  0]
A = [0  4]
One possible answer to this problem is
      [2  0]
B = ± [0  2]

However, this is not the only solution. Another solution is

    [b_11    b_12]
B = [b_21   -b_11]

with b_11^2 + b_12 b_21 = 4. Therefore there are an infinite number of square roots of A.

Matrix Polynomials

Consider a polynomial of order n, where the argument of the polynomial
is the scalar variable x, i.e.,

N(x) = p_n x^n + p_(n-1) x^(n-1) + · · · + p_1 x + p_0           (4.10-3)

If the scalar variable x is replaced by the (n x n) square matrix A, then the
corresponding matrix polynomial is defined by

N(A) = p_n A^n + p_(n-1) A^(n-1) + · · · + p_1 A + p_0 I_n       (4.10-4)

Note that the last term is multiplied by the nth order unit matrix I_n.
Example 4.10-2. Let N(x) = 3x^2 + 2x + 1 and

    [2  1]
A = [0  2]
Determine N(A).

N(A) = 3A^2 + 2A + I

       [4  4]     [2  1]   [1  0]   [17  14]
   = 3 [0  4] + 2 [0  2] + [0  1] = [ 0  17]

Factorization of a Matrix Polynomial

The polynomial N(x) may be written in the factored form

N(x) = p_n (x - λ_1)(x - λ_2) · · · (x - λ_n)

where λ_1, λ_2, . . . , λ_n are the roots of the polynomial N(x) = 0 and are
all assumed to be distinct. Similarly, the factored form of a matrix poly-
nomial is
N(A) = p_n (A - λ_1 I)(A - λ_2 I) · · · (A - λ_n I)              (4.10-5)

This form is used later to prove Sylvester's theorem.

Infinite Series of Matrices

Consider the infinite series in the scalar variable x,

S(x) = a_0 + a_1 x + a_2 x^2 + · · · = Σ_(k=0)^∞ a_k x^k

If the argument of the infinite series is replaced by the square nth order
matrix A, then the infinite series of A can be written as

S(A) = a_0 I_n + a_1 A + a_2 A^2 + · · · = Σ_(k=0)^∞ a_k A^k     (4.10-6)

It may be shown that this series converges as k approaches infinity if all
the corresponding scalar series S(λ_i), i = 1, 2, . . . , n, converge, where
the λ_i's are the characteristic values of A. Because the topic of convergence
of matrix series is in general rather involved, a detailed discussion of it is
omitted.
Some of the more important infinite series of matrices are as follows:

Geometric series:

G(A) = I + aA + a^2 A^2 + · · · = Σ_(k=0)^∞ a^k A^k              (4.10-7)

Exponential function:

exp A = I + A + A^2/2! + A^3/3! + · · · + A^k/k! + · · ·
                                                                 (4.10-8)
exp (-A) = I - A + A^2/2! - A^3/3! + · · · + (-1)^k A^k/k! + · · ·

It can be shown that this series is absolutely and uniformly convergent.13
Although the scalar product e^x e^y can be written as either e^x e^y or
e^y e^x, the corresponding matrix product e^A e^B cannot be written as e^B e^A
unless A and B commute. Then

e^A e^B = e^B e^A = e^(A+B),   if AB = BA                        (4.10-9)

Clearly, this condition is satisfied if B = A, or B = -A. If B = -A,
then Eq. 4.10-9 becomes

e^A e^(-A) = e^(A-A) = I                                         (4.10-10)

From Eq. 4.10-10 it can be concluded that (e^A)^-1 = e^(-A), or that e^(-A) is
the inverse matrix of e^A.

Sine function:

sin A = A - A^3/3! + A^5/5! - · · · = (exp [jA] - exp [-jA]) / 2j          (4.10-11)

Cosine function:

cos A = I - A^2/2! + A^4/4! - · · · = (exp [jA] + exp [-jA]) / 2           (4.10-12)
where the complex exponential is defined by setting A equal to jA in
Eq. 4.10-8.

exp (jA) = (I - A^2/2! + A^4/4! - · · ·) + j(A - A^3/3! + A^5/5! - · · ·)
         = cos A + j sin A                                       (4.10-13)
exp (-jA) = cos A - j sin A

Hyperbolic sine:

sinh A = A + A^3/3! + A^5/5! + · · · = (exp A - exp (-A)) / 2    (4.10-14)

Hyperbolic cosine:

cosh A = I + A^2/2! + A^4/4! + · · · = (exp A + exp (-A)) / 2    (4.10-15)

Trigonometric Identities

The matrix trigonometric identities, which are analogous to the corre-
sponding scalar trigonometric identities, can be established by use of the
preceding definitions of the matrix trigonometric functions. For example,
the identity
cosh^2 A - sinh^2 A = I                                          (4.10-16)
can be derived from

cosh^2 A = [(exp A + exp (-A)) / 2]^2 = [exp 2A + exp (-2A)] / 4 + I/2

sinh^2 A = [(exp A - exp (-A)) / 2]^2 = [exp 2A + exp (-2A)] / 4 - I/2

The (2 x 2) real matrix analogous to the scalar j = √(-1) is defined by

      [0  -1]
J_0 = [1   0]                                                    (4.10-17)

Note that J_0^2 = -I, J_0^3 = -J_0, J_0^4 = I, . . . , etc. This matrix is useful in
finding certain trigonometric identities. For example, if A = aJ_0, then
Eq. 4.10-11 can be written as

sin (aJ_0) = aJ_0 + (a^3/3!)J_0 + (a^5/5!)J_0 + · · · = J_0 sinh a          (4.10-18)

Similarly,
cos (aJ_0) = I cosh a                                            (4.10-19)
Cayley-Hamilton Theorem

The generalization of Eq. 4.7-18 to arbitrary powers of A produces an inter-
esting and useful relationship. If N(λ) is a polynomial in λ of the form

N(λ) = λ^n + c_1 λ^(n-1) + · · · + c_(n-1) λ + c_n

then this generalization shows that the polynomial using A as the variable
is
N(A) = A^n + c_1 A^(n-1) + · · · + c_(n-1) A + c_n I = MN(Λ)M^-1

      [N(λ_1)                         ]
  = M [        N(λ_2)                 ] M^-1                     (4.10-20)
      [                ·              ]
      [                      N(λ_n)   ]

where λ_1, λ_2, . . . , λ_n are the zeros of the polynomial N(λ).
If the polynomial chosen is the characteristic polynomial, i.e., if
N(λ) = P(λ), then N(λ_1) = N(λ_2) = · · · = N(λ_n) = 0. It follows that

P(A) = [0]    where    P(λ) = |λI - A|                           (4.10-21)

This statement is known as the Cayley-Hamilton theorem. The theorem
states that "the matrix A satisfies its own characteristic equation." The
preceding proof is based on the assumption that A has distinct character-
istic roots. However, it can be shown that this theorem holds true for any
square matrix.14 This theorem is of considerable importance when
calculating various functions of the matrix A.
Example 4.10-3. The matrix

    [ 0   1]
A = [-2  -3]

has the characteristic polynomial

P(λ) = λ^2 + 3λ + 2

Show that P(A) = [0].
Substituting A for the variable λ gives

P(A) = A^2 + 3A + 2I

    [-2  -3]     [ 0   1]     [1  0]   [0  0]
  = [ 6   7] + 3 [-2  -3] + 2 [0  1] = [0  0]

Therefore P(A) = null matrix = [0].

Example 4.10-4. Find A^-1 for the matrix of the preceding example by using the
Cayley-Hamilton theorem.
Since A satisfies its own characteristic equation,

A^2 + 3A + 2I = [0]
or
A + 3I + 2A^-1 = [0]
Therefore
                          [-3/2  -1/2]
A^-1 = -(1/2)A - (3/2)I = [  1     0 ]

This is often a convenient way of computing the inverse of a matrix.
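As a numerical aside (Python/NumPy, not part of the text), the same inverse follows in two lines from the characteristic equation.

    import numpy as np

    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])
    # A^2 + 3A + 2I = 0  implies  A^-1 = -(A + 3I)/2
    A_inv = -(A + 3.0 * np.eye(2)) / 2.0
    print(A_inv)                               # [[-1.5 -0.5] [ 1.   0. ]]
    print(np.allclose(A @ A_inv, np.eye(2)))   # True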

Reduction of Polynomials

By means of the Cayley-Hamilton theorem, it is possible to reduce any
polynomial of the nth order matrix A to a linear combination of I, A,
A^2, . . . , A^(n-1), or a polynomial whose highest degree in A is n - 1. This
is best shown by example.
Example 4.10-5. Find N(A) = A^4 + A^3 + A^2 + A + I if

    [ 0   1]
A = [-2  -3]

The characteristic equation of A is λ^2 + 3λ + 2 = 0. Therefore

A^2 + 3A + 2I = [0]    or    A^2 = -3A - 2I
Consequently,

A^4 = 9A^2 + 12A + 4I = 9(-3A - 2I) + 12A + 4I = -15A - 14I

Similarly,
A^3 = -3A^2 - 2A = -3(-3A - 2I) - 2A = 7A + 6I
Hence
N(A) = (-15A - 14I) + (7A + 6I) + (-3A - 2I) + A + I

                   [-9  -10]
     = -10A - 9I = [20   21]

This is a polynomial of first degree in A.

Sylvester's Theorem†

Sylvester's theorem is a useful method for obtaining a function of a
matrix, if the function can be expressed as a matrix polynomial. The
following is a statement of Sylvester's theorem, valid when A possesses
n distinct roots:

If N(A) is a matrix polynomial in A, and if the square matrix A possesses
n distinct characteristic values, the polynomial in A can be written as

N(A) = Σ_(i=1)^n N(λ_i) Z_0(λ_i)                                 (4.10-22)
where
Z_0(λ_i) = Π_(j≠i) (A - λ_j I) / Π_(j≠i) (λ_i - λ_j)

From the Cayley-Hamilton theorem, it is known that any matrix poly-
nomial N(A) can be represented by a polynomial in A, whose highest
degree is n - 1. Thus N(A) can be written as

N(A) = a_1 A^(n-1) + a_2 A^(n-2) + · · · + a_(n-1) A + a_n I     (4.10-23)

In order to prove Sylvester's theorem, Eq. 4.10-23 is written in the form

N(A) = c_1 [(A - λ_2 I)(A - λ_3 I) · · · (A - λ_n I)]
     + c_2 [(A - λ_1 I)(A - λ_3 I) · · · (A - λ_n I)]
     + · · ·
     + c_k Π_(j≠k) (A - λ_j I)
     + · · ·
     + c_n [(A - λ_1 I)(A - λ_2 I) · · · (A - λ_(n-1) I)]
or

N(A) = Σ_(k=1)^n c_k Π_(j≠k) (A - λ_j I)                         (4.10-24)

Since there is one factor missing from each of the product terms, N(A) is
clearly a polynomial of degree n - 1, with n arbitrary constants of com-
bination. If the characteristic vectors of A are denoted by u_1, u_2, . . . , u_n,
then postmultiplying Eq. 4.10-24 by u_i yields the relation

N(A)u_i = Σ_(k=1)^n c_k [ Π_(j≠k) (A - λ_j I) ] u_i              (4.10-25)

However, since Au_i = λ_i u_i or (A - λ_i I)u_i = 0, all the terms except the
ith are zero. The ith term is not zero, since it does not contain the factor
(A - λ_i I). Therefore

N(A)u_i = c_i [ Π_(j≠i) (A - λ_j I) ] u_i = c_i [ Π_(j≠i) (λ_i - λ_j) ] u_i     (4.10-26)

If the characteristic values of A are distinct, then N(A)u_i = N(λ_i)u_i.
Therefore
c_i = N(λ_i) / Π_(j≠i) (λ_i - λ_j)
Consequently,

N(A) = Σ_(i=1)^n N(λ_i) Π_(j≠i) (A - λ_j I) / Π_(j≠i) (λ_i - λ_j)

This concludes the proof of Sylvester's theorem.

† The proof of this theorem closely follows the proof given in Reference 15.

Example 4.10-6. Calculate e^A, using Sylvester's theorem, for the A matrix of Example
4.10-5.
Since e^A can be expressed as a convergent series in A, Sylvester's theorem can be used
directly on e^A, rather than on the infinite series representation of A. Certainly, if the
infinite series for e^A converges, then e^A can be determined by Sylvester's theorem.
Therefore
e^A = Σ_(i=1)^2 e^(λ_i) Z_0(λ_i)
where
Z_0(λ_1) = (A - λ_2 I) / (λ_1 - λ_2)    and    Z_0(λ_2) = (A - λ_1 I) / (λ_2 - λ_1)
Since
    [ 0   1]
A = [-2  -3]

and λ_1 = -1 and λ_2 = -2, it follows that

           [ 2   1]                [-1  -1]
Z_0(λ_1) = [-2  -1]     Z_0(λ_2) = [ 2   2]
Consequently,
      [ 2e^-1 - e^-2          e^-1 - e^-2    ]
e^A = [-2(e^-1 - e^-2)      -(e^-1 - 2e^-2)  ]

Example 4.10-7. Calculate A^k, using Sylvester's theorem, for the A matrix of the
previous example.

A^k = Σ_(i=1)^2 λ_i^k Z_0(λ_i)

Using Z_0(λ_1) and Z_0(λ_2) as determined in the previous example,

      [ 2(-1)^k - (-2)^k          (-1)^k - (-2)^k   ]
A^k = [-2(-1)^k + 2(-2)^k        -(-1)^k + 2(-2)^k  ]
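Sylvester's formula is also convenient to evaluate numerically. The sketch below (Python with NumPy and SciPy assumed) builds e^A from the two terms of Eq. 4.10-22 and compares the result with a library matrix exponential.

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])
    lam1, lam2 = -1.0, -2.0
    Z1 = (A - lam2 * np.eye(2)) / (lam1 - lam2)   # Z0(lambda_1)
    Z2 = (A - lam1 * np.eye(2)) / (lam2 - lam1)   # Z0(lambda_2)

    eA = np.exp(lam1) * Z1 + np.exp(lam2) * Z2
    print(np.allclose(eA, expm(A)))               # True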
Since it can be shown that16

Π_(j≠i) (A - λ_j I) / Π_(j≠i) (λ_i - λ_j) = Adj [λI - A] / [dP(λ)/dλ] |_(λ=λ_i)     (4.10-27)

where P(λ) is the characteristic polynomial of A, Eq. 4.10-22 can also be
written as

N(A) = Σ_(i=1)^n N(λ_i) Adj [λ_i I - A] / [dP(λ)/dλ]|_(λ=λ_i)    (4.10-28)

Sylvester's Theorem - Confluent Form

When the matrix A contains repeated characteristic values, Eq. 4.10-28
must be modified. The modified form of Eq. 4.10-28 is called the confluent
form of Sylvester's theorem.† Assume a characteristic value λ_i of order
s. The contribution to N(A) from the ith root λ_i can be shown to be

[1/(s - 1)!] { d^(s-1)/dλ^(s-1) [ N(λ) Adj (λI - A) / Π_(j≠i) (λ - λ_j)^(s_j) ] } |_(λ=λ_i)     (4.10-29)

The sum of the contributions of all the roots with different values is then
N(A). Hence

N(A) = Σ [1/(s - 1)!] { d^(s-1)/dλ^(s-1) [ N(λ) Adj (λI - A) / Π_(j≠i) (λ - λ_j)^(s_j) ] } |_(λ=λ_i)     (4.10-30)

where the summation is taken over all the roots, with repeated roots taken
only once.
Equation 4.10-30 is the confluent form of Sylvester's theorem. A
typical term of the summation, corresponding to a multiple root λ_i, can
be expanded into the form

[1/(s - 1)!] { d^(s-1)/dλ^(s-1) [ N(λ) Adj (λI - A) / Π_(j≠i) (λ - λ_j)^(s_j) ] } |_(λ=λ_i)
        = Σ_(k=1)^s [N^(k-1)(λ_i) / (k - 1)!] Z_(s-k)(λ_i)       (4.10-31)
where
N^(k)(λ_i) = d^k N(λ)/dλ^k |_(λ=λ_i)
and
Z_k(λ_i) = (1/k!) d^k/dλ^k [ Adj (λI - A) / Π_(j≠i) (λ - λ_j)^(s_j) ] |_(λ=λ_i)

† A proof of the theorem for this form can be found in Reference 17.

Example 4.10-8. Find the general form for any matrix function of A, where the
matrix function can be expressed as a matrix polynomial in

    [ 0  1  3]
A = [ 6  0  2]
    [-5  2  4]
Evaluation of

|λI - A| = P(λ) = λ^3 - 4λ^2 + 5λ - 2 = (λ - 1)^2 (λ - 2)

shows that the characteristic equation has a double root at λ = 1, and a single root at
λ = 2. The contribution to N(A) from the single root is

N(2) Adj (λI - A)|_(λ=2) / (2 - 1)^2
Since
               [(λ^2 - 4λ - 4)    (λ + 2)           (3λ + 2)  ]
Adj [λI - A] = [(6λ - 34)         (λ^2 - 4λ + 15)   (2λ + 18) ]
               [(12 - 5λ)         (2λ - 5)          (λ^2 - 6) ]
and
                      [ -8   4   8]
Adj [λI - A]|_(λ=2) = [-22  11  22]
                      [  2  -1  -2]

the contribution to N(A) from the single root at λ = 2 is then

     [ -8   4   8]
N(2) [-22  11  22]
     [  2  -1  -2]

The contribution to N(A) from the double root at λ = 1 is

{ d/dλ [ N(λ) Adj [λI - A] / (λ - 2) ] } |_(λ=1)

or, using Eq. 4.10-31, this is N(1)Z_1(1) + [dN(λ)/dλ]|_(λ=1) Z_0(1), where

Z_1(1) = { d/dλ [ Adj (λI - A) / (λ - 2) ] } |_(λ=1) = -Adj [λI - A]|_(λ=1) - [(d/dλ)Adj (λI - A)]|_(λ=1)

Z_0(1) = [ Adj (λI - A) / (λ - 2) ] |_(λ=1) = -Adj [λI - A]|_(λ=1)
Since
                        [2λ - 4     1       3 ]
(d/dλ){Adj [λI - A]} =  [  6      2λ - 4    2 ]
                        [ -5        2      2λ ]

         [  7  -3  -5]   [ 2  -1  -3]   [ 9  -4  -8]
Z_1(1) = [ 28 -12 -20] + [-6   2  -2] = [22 -10 -22]
         [ -7   3   5]   [ 5  -2  -2]   [-2   1   3]

         [  7  -3  -5]
Z_0(1) = [ 28 -12 -20]
         [ -7   3   5]

Therefore the contribution to N(A) from the double root at λ = 1 is given by

N(1)Z_1(1) + [dN(λ)/dλ]|_(λ=1) Z_0(1)

The sum of these contributions is then

            [ -8   4   8]        [ 9  -4  -8]                     [  7  -3  -5]
N(A) = N(2) [-22  11  22] + N(1) [22 -10 -22] + dN(λ)/dλ|_(λ=1)   [ 28 -12 -20]
            [  2  -1  -2]        [-2   1   3]                     [ -7   3   5]

As a particular example, let N(A) = e^(At), or N(λ) = e^(λt). Then

N(2) = e^(2t),    N(1) = e^t,    and    dN(λ)/dλ|_(λ=1) = t e^t
and
         [ (9e^t + 7te^t - 8e^(2t))      (-4e^t - 3te^t + 4e^(2t))      (-8e^t - 5te^t + 8e^(2t))   ]
N(A) =   [ (22e^t + 28te^t - 22e^(2t))   (-10e^t - 12te^t + 11e^(2t))   (-22e^t - 20te^t + 22e^(2t))]
         [ (-2e^t - 7te^t + 2e^(2t))     (e^t + 3te^t - e^(2t))         (3e^t + 5te^t - 2e^(2t))    ]

Cayley-Hamilton Technique

An alternative, and often simpler, procedure for evaluating a function
of a matrix is obtained by making use of the Cayley-Hamilton theorem.
First, consider the case where N(A) is a matrix polynomial which is of
higher degree than the order of A. If N(λ) is divided by the characteristic
polynomial of A, then

N(λ)/P(λ) = Q(λ) + R(λ)/P(λ)                                     (4.10-32)
where R(λ) is the remainder. Then, if Eq. 4.10-32 is multiplied by P(λ),
the result is
N(λ) = Q(λ)P(λ) + R(λ)                                           (4.10-33)
Now, if P(λ) = 0, Eq. 4.10-33 becomes
N(λ) = R(λ)                                                      (4.10-34)
Correspondingly, since P(A) = [0] by the Cayley-Hamilton theorem, the
matrix function N(A) is then equal to R(A).
Example 4.10-9. Solve the problem of Example 4.10-5 using the Cayley-Hamilton
technique.
N(A) = A^4 + A^3 + A^2 + A + I
or
N(λ) = λ^4 + λ^3 + λ^2 + λ + 1
The characteristic polynomial of

    [ 0   1]
A = [-2  -3]

is λ^2 + 3λ + 2 = 0. Dividing N(λ) by P(λ),

(λ^4 + λ^3 + λ^2 + λ + 1) / (λ^2 + 3λ + 2) = λ^2 - 2λ + 5 + (-10λ - 9) / (λ^2 + 3λ + 2)

The remainder R(λ) is then R(λ) = -10λ - 9. Hence N(A) = R(A) = -10A - 9I,
which is the same result obtained in Example 4.10-5.

The preceding technique is valid only for the case in which N(A) is a
polynomial function of A. When F(A) is desired, where F(λ) is an analytic
function of λ in a region about the origin, an extension of the previous
method can be used. If F(λ) is an analytic function in a region, it can be
expressed by an infinite power series in λ, which converges in the region
of analyticity. Therefore the function F(A) can be expressed as a poly-
nomial in A of degree n - 1. Consequently, the remainder R(λ) of
Eq. 4.10-33 must be a polynomial of degree n - 1. It follows that, if
F(λ) is an analytic function of λ in that region,
F(λ) = Q(λ)P(λ) + R(λ)                                           (4.10-35)
where P(λ) is the characteristic polynomial of A, and R(λ) is a polynomial
of the form
R(λ) = α_0 + α_1 λ + α_2 λ^2 + · · · + α_(n-1) λ^(n-1)           (4.10-36)
The coefficients α_0, α_1, . . . , α_(n-1) can be obtained by successively sub-
stituting λ_1, λ_2, . . . , λ_n into Eq. 4.10-35. Since P(λ_i) = 0, the equations

F(λ_1) = R(λ_1)
F(λ_2) = R(λ_2)
  ·                                                              (4.10-37)
  ·
F(λ_n) = R(λ_n)

are obtained. Equation 4.10-37 describes a set of n linear equations in
n unknowns. Therefore a unique solution can be obtained for all the
coefficients of the polynomial R(λ).
It now remains to show that

Q(λ) = [F(λ) - R(λ)] / P(λ)                                      (4.10-38)

is an analytic function of λ. Since the zeros of the denominator of Q(λ)
are also zeros of its numerator, the function Q(λ) is analytic in the region
of analyticity of F(λ). Therefore Eq. 4.10-35 is valid for all values of λ
in the region of analyticity of F(λ). Consequently A may be substituted
for the variable λ, if the region of analyticity includes all the characteristic
values of A. This substitution yields
F(A) = Q(A)P(A) + R(A)                                           (4.10-39)

Since P(A) is identically zero by the Cayley-Hamilton theorem, it follows
that
F(A) = R(A)                                                      (4.10-40)
which is the desired result.
Before proceeding to some examples of this technique, the problem
of repeated characteristic roots should be investigated. Obviously, if A
possesses a characteristic value λ_i of order s, only one linearly independent
equation can be obtained by substituting λ_i into Eq. 4.10-35. The re-
maining s - 1 linear equations, which must be obtained in order to solve
for the α_i's, can be found by differentiating both sides of Eq. 4.10-35.
Therefore, if A has a characteristic value of order s, a set of linear equa-
tions of the form

d^k F(λ)/dλ^k |_(λ=λ_i) = d^k R(λ)/dλ^k |_(λ=λ_i)        k = 0, 1, . . . , s - 1      (4.10-41)

must be obtained in order to find a unique solution for the coefficients
of the polynomial of Eq. 4.10-36.
Example 4.10-10. Find e^(At), using as A the matrix of Example 4.10-9.
Since A is a second order matrix, the polynomial R(λ) is of first order, i.e.,
R(λ) = α_0 + α_1 λ

Therefore the two linear equations obtained by substituting λ_1 and λ_2 into Eq. 4.10-35
are
F(λ_1) = R(λ_1)                     F(λ_2) = R(λ_2)
e^(λ_1 t) = α_0 + α_1 λ_1           e^(λ_2 t) = α_0 + α_1 λ_2
e^-t = α_0 - α_1                    e^-2t = α_0 - 2α_1

Solving for α_0 and α_1,

α_0 = 2e^-t - e^-2t
α_1 = e^-t - e^-2t
Hence
       [α_0    0 ]   [  0       α_1  ]
F(A) = [ 0    α_0] + [-2α_1   -3α_1  ]  = α_0 I + α_1 A

       [ 2e^-t - e^-2t           e^-t - e^-2t    ]
     = [-2(e^-t - e^-2t)       -(e^-t - 2e^-2t)  ]
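The coefficients α_0 and α_1 found above can be checked for any particular instant t; the Python sketch below (NumPy and SciPy assumed, t chosen arbitrarily) compares α_0 I + α_1 A with a library evaluation of e^(At).

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])
    t = 0.7                                       # arbitrary sample instant
    a0 = 2.0 * np.exp(-t) - np.exp(-2.0 * t)      # alpha_0
    a1 = np.exp(-t) - np.exp(-2.0 * t)            # alpha_1

    eAt = a0 * np.eye(2) + a1 * A                 # R(A) = alpha_0 I + alpha_1 A
    print(np.allclose(eAt, expm(A * t)))          # True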

Example 4.10-11. Determine e^(At), where A is the matrix used in Example 4.10-8.
This matrix has a double root at λ = 1 and a single root at λ = 2. Since this is a third
order matrix, the polynomial R(λ) is R(λ) = α_0 + α_1 λ + α_2 λ^2. The three equations for
the α_i's are given by F(λ_1) = R(λ_1) and

dF(λ)/dλ |_(λ=λ_1) = dR(λ)/dλ |_(λ=λ_1)

where λ_1 = 1, and by F(λ_2) = R(λ_2), where λ_2 = 2. Thus the α's are specified by

[e^t   ]   [1  1  1] [α_0]
[te^t  ] = [0  1  2] [α_1]
[e^(2t)]   [1  2  4] [α_2]

Solving for the α's,

[α_0]   [1  1  1]^-1 [e^t   ]
[α_1] = [0  1  2]    [te^t  ]
[α_2]   [1  2  4]    [e^(2t)]

It is instructive to solve this set of three simultaneous equations by using the Cayley-
Hamilton theorem to find the inverse of the coefficient matrix. The characteristic
polynomial of the coefficient matrix is λ^3 - 6λ^2 + 4λ - 1 = 0. Hence

                                           [1  1  1]
C^3 - 6C^2 + 4C - I = [0]    where    C =  [0  1  2]
                                           [1  2  4]
Then
                        [ 0  -2   1]
C^-1 = C^2 - 6C + 4I =  [ 2   3  -2]
                        [-1  -1   1]
Therefore
α_0 = -2te^t + e^(2t)
α_1 = 2e^t + 3te^t - 2e^(2t)
α_2 = -e^t - te^t + e^(2t)
Hence
e^(At) = α_0 I + α_1 A + α_2 A^2
where
    [ 0  1  3]                [ -9   6  14]
A = [ 6  0  2]    and  A^2 =  [-10  10  26]
    [-5  2  4]                [ -8   3   5]

The result is

           [ (9e^t + 7te^t - 8e^(2t))      (-4e^t - 3te^t + 4e^(2t))      (-8e^t - 5te^t + 8e^(2t))   ]
e^(At) =   [ (22e^t + 28te^t - 22e^(2t))   (-10e^t - 12te^t + 11e^(2t))   (-22e^t - 20te^t + 22e^(2t))]
           [ (-2e^t - 7te^t + 2e^(2t))     (e^t + 3te^t - e^(2t))         (3e^t + 5te^t - 2e^(2t))    ]

This matrix checks with the result obtained in Example 4.10-8 using Sylvester's
theorem. A considerable amount of labor is involved in either method, but this is
usually the case when dealing with a matrix of order higher than two. Generally, the
Cayley-Hamilton technique requires much less labor than the use of Sylvester's
theorem.

Example 4.10-12. Generalize the preceding discussion on the Cayley-Hamilton
technique so that any analytic function of A can be generated. Assume that A has
distinct characteristic values.
The n equations which determine the coefficients α_i can be written in matrix form as

[F(λ_1)]   [1  λ_1  λ_1^2  · · ·  λ_1^(n-1)] [α_0    ]
[F(λ_2)]   [1  λ_2  λ_2^2  · · ·  λ_2^(n-1)] [α_1    ]
[  ·   ] = [·   ·     ·              ·     ] [  ·    ]
[F(λ_n)]   [1  λ_n  λ_n^2  · · ·  λ_n^(n-1)] [α_(n-1)]

or f(λ) = Cα, where C is the coefficient matrix shown. Consequently, α = C^-1 f(λ).
Therefore
F(A) = Σ_(i=0)^(n-1) α_i A^i

where the α_i's are determined from the equation α = C^-1 f(λ).

An alternative procedure, again assuming distinct roots, is to diagonalize A by means
of a similarity transformation, and then utilize the generalization of Eq. 4.10-20 to the
case of an analytic function. A simple example illustrates this procedure.
In Example 4.7-6, the matrix

    [ 0   1]
A = [-2  -3]

was analyzed, and the modal matrices M and M^-1 were found to be

    [ 1/√2    1/√5]             [ 2√2    √2]
M = [-1/√2   -2/√5]     M^-1 =  [-√5    -√5]
Hence
              [-1   0]
A = MΛM^-1 = M[ 0  -2] M^-1

From the generalization of Eq. 4.10-20, F(A) = MF(Λ)M^-1. If F(A) = e^(At) is desired,
then
         [e^-t     0   ]
F(A) = M [ 0    e^-2t  ] M^-1
Performing the indicated matrix multiplications yields

           [ 2e^-t - e^-2t           e^-t - e^-2t    ]
e^(At) =   [-2(e^-t - e^-2t)       -(e^-t - 2e^-2t)  ]

which checks with the result found in Example 4.10-10.


Once the modal matrices are obtained, this is a very convenient procedure for finding
an analytic function of a matrix. However, the procedure requires a complete charac¬
teristic vector analysis.

4.11 ADDITIONAL MATRIX CALCULUS

The usual ideas of differentiation and integration associated with scalar


variables previously were shown to carry over to the differentiation and
integration of matrices and matrix products, provided that the original
order of the factors involved is preserved. However, because such opera¬
tions are frequently performed on the exponential function of a matrix
and on quadratic forms in later chapters, they are considered specifically
at this point.

Differentiation of the Exponential Function

If A is a constant matrix and t is a scalar variable, then the exponential
function e^(At) is defined, similarly to Eq. 4.10-8, as the infinite series

e^(At) = exp (At) = I + At + A^2 t^2/2! + A^3 t^3/3! + · · ·     (4.11-1)

This series is absolutely and uniformly convergent for all values of the
scalar variable t. The derivative of the exponential function e^(At) with
respect to t is then the term by term differentiation of Eq. 4.11-1, or

d/dt [e^(At)] = A + A^2 t + A^3 t^2/2! + · · · = Ae^(At) = e^(At) A        (4.11-2)

If the operator notation p = d/dt is used, then it follows that

d^k/dt^k [e^(At)] = p^k (e^(At)) = A^k e^(At) = e^(At) A^k       (4.11-3)

In general, if N(p) is a polynomial of the differential operator p, then

N(p) e^(At) = N(A) e^(At) = e^(At) N(A)                          (4.11-4)


Often the situation arises where the polynomial operator N(p) must
operate on the matrix product e^(At) B(t). It is assumed that the product
AB(t) exists, but that the product B(t)A does not. In this case,

p[e^(At) B(t)] = e^(At) pB(t) + e^(At) AB(t) = e^(At) [pI + A] B(t)

p^2[e^(At) B(t)] = e^(At) p^2 B(t) + 2e^(At) A pB(t) + e^(At) A^2 B(t) = e^(At) (pI + A)^2 B(t)

In general,
p^k[e^(At) B(t)] = e^(At) (pI + A)^k B(t)                        (4.11-5)
Consequently,
N(p)[e^(At) B(t)] = e^(At) N(pI + A) B(t)                        (4.11-6)

Integration of the Exponential Function

The integral of the exponential function e^(At), where A is a constant
matrix, can be found by integrating the infinite series expression for
e^(At), i.e.,

∫_0^t e^(At) dt = ∫_0^t I dt + ∫_0^t At dt + ∫_0^t (A^2 t^2/2!) dt + ∫_0^t (A^3 t^3/3!) dt + · · ·

              = It + At^2/2! + A^2 t^3/3! + A^3 t^4/4! + · · ·
Hence
A ∫_0^t e^(At) dt = e^(At) - I

Assuming that A is nonsingular,

∫_0^t e^(At) dt = A^-1 (e^(At) - I) = (e^(At) - I) A^-1          (4.11-7)

Example 4.11-1. Find ∫_0^t e^(At) dt, where

    [ 0   1]
A = [-2  -3]

The result of Example 4.10-10 shows that

           [ 2e^-t - e^-2t           e^-t - e^-2t    ]
e^(At) =   [-2(e^-t - e^-2t)       -(e^-t - 2e^-2t)  ]
Therefore
                  [-2e^-t + e^-2t/2 + 3/2      -e^-t + e^-2t/2 + 1/2]
∫_0^t e^(At) dt = [ 2e^-t - e^-2t - 1           e^-t - e^-2t        ]

This result can be checked by the application of Eq. 4.11-7. Since

       [-3/2  -1/2]
A^-1 = [  1     0 ]

                     [-3/2  -1/2] [ 2e^-t - e^-2t - 1         e^-t - e^-2t      ]
A^-1 (e^(At) - I) =  [  1     0 ] [-2e^-t + 2e^-2t           -e^-t + 2e^-2t - 1 ]

                     [-2e^-t + e^-2t/2 + 3/2      -e^-t + e^-2t/2 + 1/2]
                  =  [ 2e^-t - e^-2t - 1           e^-t - e^-2t        ]
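Equation 4.11-7 can also be verified numerically. The sketch below (Python with NumPy and SciPy; the value of t is arbitrary) compares A^-1(e^(At) - I) with an element-by-element numerical integration of e^(Aτ).

    import numpy as np
    from scipy.linalg import expm
    from scipy.integrate import quad

    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])
    t = 1.3

    closed_form = np.linalg.inv(A) @ (expm(A * t) - np.eye(2))
    numerical = np.array([[quad(lambda s, i=i, j=j: expm(A * s)[i, j], 0.0, t)[0]
                           for j in range(2)] for i in range(2)])
    print(np.allclose(closed_form, numerical))    # True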

The solution to a linear time-varying matrix differential equation often
depends upon the exponential function

exp [ ∫_0^t A(λ) dλ ]

This exponential function is defined as the infinite series

exp [ ∫_0^t A(λ) dλ ] = Σ_(k=0)^∞ (1/k!) [ ∫_0^t A(λ) dλ ]^k     (4.11-8)

Example 4.11-2. Find

                                     [t  0]
exp [ ∫_0^t A(λ) dλ ]    if    A =   [0  t]

The integral of this matrix is

                  [t^2/2    0  ]
∫_0^t A(λ) dλ =   [  0    t^2/2]
Consequently,
                                          [t^2/2    0  ]^k     [e^(t^2/2)      0     ]
exp [ ∫_0^t A(λ) dλ ] = Σ_(k=0)^∞ (1/k!)  [  0    t^2/2]     =  [    0      e^(t^2/2) ]

Differentiation of Quadratic Forms

The stability analysis of dynamical systems often requires the differ-
entiation of a quadratic form. If the quadratic form is Q = ⟨x, Ax⟩,
where A is a symmetric matrix, then

grad_x Q = 2Ax                                                   (4.11-9)

where grad_x denotes the vector operator

           [∂/∂x_1]
           [∂/∂x_2]
grad_x =   [  ·   ]                                              (4.11-10)
           [∂/∂x_n]

Also, frequently useful is the matrix operator formed by taking the
outer product of grad_x and grad_y, yielding

                       [∂^2/∂x_1∂y_1   ∂^2/∂x_1∂y_2   · · ·   ∂^2/∂x_1∂y_m]
(grad_x)(grad_y)^T =   [∂^2/∂x_2∂y_1   ∂^2/∂x_2∂y_2   · · ·   ∂^2/∂x_2∂y_m]        (4.11-11)
                       [      ·               ·                      ·    ]
                       [∂^2/∂x_n∂y_1   ∂^2/∂x_n∂y_2   · · ·   ∂^2/∂x_n∂y_m]

Example 4.11-3. Evaluate the gradient of Q = ⟨x, Ax⟩, where

    [1  3]
A = [3  2]

Since Q = x_1^2 + 2x_2^2 + 6x_1 x_2, the gradient of Q with respect to the variables x_1, x_2 is
given by
           [∂Q/∂x_1]   [2x_1 + 6x_2]
grad_x Q = [∂Q/∂x_2] = [4x_2 + 6x_1]
Note that
        [1  3] [x_1]   [2x_1 + 6x_2]
2Ax = 2 [3  2] [x_2] = [6x_1 + 4x_2]
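The identity grad_x Q = 2Ax can be spot-checked with a finite-difference gradient; the Python sketch below (NumPy assumed, the point x_0 arbitrary) does so for this example.

    import numpy as np

    A = np.array([[1.0, 3.0],
                  [3.0, 2.0]])

    def Q(x):
        return x @ A @ x                 # quadratic form <x, Ax>

    x0 = np.array([0.5, -1.0])
    grad_exact = 2.0 * A @ x0            # 2Ax

    h = 1e-6                             # central-difference step
    grad_numeric = np.array([(Q(x0 + h * e) - Q(x0 - h * e)) / (2.0 * h)
                             for e in np.eye(2)])
    print(grad_exact, grad_numeric)      # the two gradients agree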
If both A and the variables x_1, x_2, . . . , x_n are functions of time, then
the derivative of the quadratic form with respect to t is given by

d/dt [Q(t)] = d/dt ⟨x(t), A(t)x(t)⟩
            = ⟨ẋ(t), A(t)x(t)⟩ + ⟨x(t), Ȧ(t)x(t)⟩ + ⟨x(t), A(t)ẋ(t)⟩

Since A is a symmetric matrix,

Q̇(t) = 2⟨x(t), A(t)ẋ(t)⟩ + ⟨x(t), Ȧ(t)x(t)⟩
     = ⟨grad_x Q, ẋ(t)⟩ + ⟨x(t), Ȧ(t)x(t)⟩                       (4.11-12)

If A is independent of time, this reduces to

Q̇(t) = ⟨grad_x Q, ẋ(t)⟩ = 2⟨Ax(t), ẋ(t)⟩                         (4.11-13)


Example 4.11-4. Find the time derivative of Q(t) = ⟨x(t), A(t)x(t)⟩, where

       [t^2  2t]
A(t) = [2t    1]

Since Q(t) = t^2 x_1^2(t) + x_2^2(t) + 4t x_1(t) x_2(t), the derivative of Q(t) with respect to t is

Q̇(t) = 2t x_1^2(t) + 2t^2 x_1(t) ẋ_1(t) + 2x_2(t) ẋ_2(t) + 4x_1(t) x_2(t) + 4t ẋ_1(t) x_2(t) + 4t x_1(t) ẋ_2(t)

     = ẋ_1(t)[2t^2 x_1(t) + 4t x_2(t)] + ẋ_2(t)[2x_2(t) + 4t x_1(t)] + 2t x_1^2(t) + 4x_1(t) x_2(t)

Note that Q̇(t) is the sum of

                                         [t^2  2t] [ẋ_1(t)]
2⟨x(t), A(t)ẋ(t)⟩ = 2[x_1(t)  x_2(t)]    [2t    1] [ẋ_2(t)]
                  = ẋ_1(t)[2t^2 x_1(t) + 4t x_2(t)] + ẋ_2(t)[2x_2(t) + 4t x_1(t)]
and
                                        [2t  2] [x_1(t)]
⟨x(t), Ȧ(t)x(t)⟩ = [x_1(t)  x_2(t)]     [2   0] [x_2(t)]
                 = 2t x_1^2(t) + 4x_1(t) x_2(t)

4.12 FUNCTION SPACE

This section is devoted to extending the vector concepts of Section 4.5


to the problem of determining a basis in "function space." In the pre-
ceding discussion it was shown that any n linearly independent vectors
form a basis in n-dimensional space, such that all vectors in that space can
be described by the linear combination c_1 x_1 + c_2 x_2 + · · · + c_n x_n.
Consider now a set of n functions f_1(t), f_2(t), . . . , f_n(t), which are defined
over the interval (a, b), such that no function f_i(t) is a multiple of any of
the other n - 1 functions over this interval. Certainly, a linear combina-
tion c_1 f_1(t) + c_2 f_2(t) + · · · + c_n f_n(t) does not describe all functions
which can be defined over the interval (a, b). However, a basis can be
chosen such that any function satisfying regularity conditions to be stated
later can be expressed as a linear combination of the members of the
basis.
The topic considered here has become important in the area of adaptive
and self-optimizing systems. As the systems which are to be controlled
become more complex, less is known about their internal mechanisms
and parameter interrelationships. The identification and modeling of
such systems is an important first step for adaptive or optimum control.
The employment of a convenient orthonormal basis of functions to
describe the performance of a system has become an increasingly useful
approach.

Scalar (Inner) Product

The scalar product of two real-valued functions f(t) and g(t) over the
interval (a, b) is defined as†

⟨f, g⟩ = \int_a^b f(t)g(t)\, dt     (4.12-1)

If the functions are complex functions of a real variable t, the definition
of the scalar product is modified as

⟨f, g⟩ = \int_a^b f(t)g^*(t)\, dt     (4.12-2)

Norm of a Function

The norm of a real-valued function is analogous to the length of a
vector and is defined to be

Norm f = ‖f‖ = ⟨f, f⟩^{1/2} = \left[ \int_a^b f^2(t)\, dt \right]^{1/2}     (4.12-3)

A normalized function is a function whose norm is unity. If the norm of a
function is zero, then f(t) must also be zero, except for a finite number of
points, in the interval (a, b).

† It is assumed that all functions satisfy the Lebesgue condition \int_a^b |f(t)|^2\, dt < \infty.

Orthogonal Functions

Two functions f(t) and g(t) are orthogonal over the interval (a, b) if
their scalar product vanishes, i.e.,

⟨f, g⟩ = 0     (4.12-4)

A set of normalized functions φ_1(t), φ_2(t), . . . is said to be an orthonormal
set if the members of the set obey the relation

⟨φ_i, φ_j⟩ = δ_{ij}     (4.12-5)

Similar to the approach used in the Gram-Schmidt orthogonalization
procedure, a set of n linearly independent functions can be used to derive
a suitable orthonormal set of functions.†
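As an illustration (not from the original text), the Gram-Schmidt procedure can be applied numerically to the linearly independent functions 1, t, t² on the interval (0, 1); the resulting orthonormal set is related to the shifted Legendre polynomials.

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g, a=0.0, b=1.0):
    """Scalar product <f, g> = integral of f(t) g(t) dt over (a, b)."""
    return quad(lambda t: f(t) * g(t), a, b)[0]

def gram_schmidt(fs):
    """Orthonormalize a list of functions with respect to inner()."""
    phis = []
    for f in fs:
        coeffs = [inner(f, p) for p in phis]   # projections onto earlier phis
        g = (lambda t, f=f, cs=coeffs, ps=list(phis):
             f(t) - sum(c * p(t) for c, p in zip(cs, ps)))
        nrm = np.sqrt(inner(g, g))
        phis.append(lambda t, g=g, nrm=nrm: g(t) / nrm)
    return phis

phis = gram_schmidt([lambda t: 1.0, lambda t: t, lambda t: t**2])
# the Gram matrix <phi_i, phi_j> should be the identity
print(np.round([[inner(pi, pj) for pj in phis] for pi in phis], 6))
```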

Orthogonal Functions as a Basis in Function Space

If the infinite orthonormal set of functions φ_1(t), φ_2(t), . . . are considered
coordinate vectors or coordinate functions in a space which has an
infinite number of dimensions, then, by analogy with vector space, a
function f(t) can be considered to be a vector in this space, and the
components of this function in terms of the coordinate functions are
given by

c_k = ⟨f, φ_k⟩     (4.12-6)

These components of the function with respect to the coordinate functions
are called the expansion coefficients, or Fourier coefficients, of the function
relative to the orthonormal set φ_k(t).

If the function f(t) is approximated by a linear combination of n orthonormal
functions

f(t) \approx \sum_{k=1}^{n} a_k φ_k(t)     (a < t < b)     (4.12-7)

then the best approximation in the “least squares” sense is obtained by
letting a_k = c_k. This can be shown by minimizing S, the square of the
norm of the difference between f(t) and \sum_{k=1}^{n} a_k φ_k(t), i.e.,

S = \left\| f(t) - \sum_{k=1}^{n} a_k φ_k(t) \right\|^2 = \int_a^b \left[ f(t) - \sum_{k=1}^{n} a_k φ_k(t) \right]^2 dt

† A set of functions f_1, f_2, . . . , f_n are said to be linearly dependent if there exists a
relationship \sum_{i=1}^{n} c_i f_i = 0 for all values of the scalar argument of the functions, where
the c_i's are not all zero. If such a relationship does not exist, the functions are said to be
linearly independent.
Setting the derivative of S with respect to a_j equal to zero,

\int_a^b f(t)φ_j(t)\, dt = \int_a^b \left[ \sum_{k=1}^{n} a_k φ_k(t) \right] φ_j(t)\, dt

Since ⟨f, φ_j⟩ = c_j and ⟨φ_j, φ_k⟩ = δ_{jk}, it follows that c_j = a_j. Therefore
the coefficients a_k should be adjusted to the expansion coefficients c_k.
This approximation by means of a minimization of the mean square
error is known as an approximation “in the mean.”

Since S cannot be negative, it follows that

\int_a^b f^2(t)\, dt - \sum_{k=1}^{n} c_k^2 \geq 0     (4.12-8)
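A small numerical experiment (not part of the original) illustrates the least-squares property and Eq. 4.12-8: the expansion coefficients c_k = ⟨f, φ_k⟩ of an arbitrary f with respect to an orthonormal trigonometric set satisfy Bessel's inequality.

```python
import numpy as np
from scipy.integrate import quad

a, b = 0.0, 2*np.pi
# an orthonormal set on (a, b): 1/sqrt(2*pi), cos(kt)/sqrt(pi), sin(kt)/sqrt(pi)
phis = [lambda t: 1.0/np.sqrt(2*np.pi)] + \
       [lambda t, k=k: np.cos(k*t)/np.sqrt(np.pi) for k in (1, 2)] + \
       [lambda t, k=k: np.sin(k*t)/np.sqrt(np.pi) for k in (1, 2)]

f = lambda t: t * (2*np.pi - t)          # arbitrary function to be approximated

def inner(g, h):
    return quad(lambda t: g(t) * h(t), a, b)[0]

c = np.array([inner(f, p) for p in phis])     # Fourier coefficients c_k = <f, phi_k>
norm_f_sq = inner(f, f)
print(norm_f_sq - np.sum(c**2) >= 0)           # Bessel's inequality (Eq. 4.12-8): True
```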
Equation 4.12-8 is known as Bessel’s inequality. Since the terms in the
approximating series are orthonormal, the addition of orthonormal
terms φ_{n+1}(t), φ_{n+2}(t), . . . must decrease the mean square error between
the function and the approximation. However, even though the summation
\sum_{k=1}^{\infty} c_k^2 converges to a positive number which is not greater than
\int_a^b f^2(t)\, dt, this positive number may not be identically equal to the
integral. A good illustration of this point is the fact that a Fourier series
integral. A good illustration of this point is the fact that a Fourier series
for a given function may consist of the mutually orthogonal cosine set
if the function is even, or of the mutually orthogonal sine set if the function
is odd. If the function is neither odd nor even over the interval in question,
however, both the cosine set and the sine set are required to represent
the function. Neither set alone generally converges to the function.
A given orthonormal set φ_1(t), φ_2(t), . . . , φ_n(t), . . . is called “complete”
if any piecewise continuous function f(t) can be approximated in the
mean with an arbitrarily small error by a sufficiently large number of
terms, i.e.,

\int_a^b \left[ f(t) - \sum_{k=1}^{n} c_k φ_k(t) \right]^2 dt < \epsilon     (for a complete set of orthonormal functions)     (4.12-9)

\|f\|^2 = \sum_{k=1}^{\infty} c_k^2     (4.12-10)

Equation 4.12-10 is known as the “completeness relation.” A sufficient
condition for the completeness of a set of orthonormal functions is that
Eq. 4.12-10 be satisfied for all continuous functions f(t) over the interval
(a, b). It is important to note that the completeness relation does not
imply that

f(t) = \lim_{n \to \infty} \sum_{k=1}^{n} c_k φ_k(t)

Certainly the series converges to f(t) in the mean, such that the mean
square error over the interval (a, b) tends to zero. The series does represent
f(t) at a given point if f(t) is a continuous function throughout the
interval, and if the series converges uniformly in the interval (a, b). However,
even when a complete set of orthonormal functions is available, the
convergence of the series is a rather involved problem and is not treated
here.18

Weighted Orthogonal System

A weighting function w(t) may be added to the definition of the orthonormal
functions:

\int_a^b φ_i(t) φ_j(t) w(t)\, dt = δ_{ij}

The weighting function is generally selected to emphasize a region of
interest in the overall interval (a, b). The functions φ_k(t) are then said to
be orthonormal relative to the weighting function. The Fourier coefficients
of a function f(t) with respect to the weighting function are given by

c_k = \int_a^b f(t) φ_k(t) w(t)\, dt

Example 4.12-1. A given function f(t) is to be approximated by a series of orthonormal
functions. The orthonormal functions are to be composed of polynomials in t. If the
interval of interest is 0 < t < ∞ and the weighting function is e^{-t}, the orthonormal
functions are known as Laguerre polynomials.

The Laguerre polynomials are

L_0 = 1
L_1 = -t + 1
L_2 = \frac{t^2}{2} - 2t + 1
. . . . . . . . .

These polynomial functions are orthonormal with respect to the weighting function
since

\int_0^\infty L_i(t) L_j(t) e^{-t}\, dt = δ_{ij}

For a given function f(t), the expansion coefficients c_k are given by

c_k = \int_0^\infty f(t) L_k(t) e^{-t}\, dt
If, for example, the function f(t) = e^{-t} is to be represented by a Laguerre series, the
expansion coefficients of the first five terms are

c_0 = \tfrac{1}{2}, \quad c_1 = \tfrac{1}{4}, \quad c_2 = \tfrac{1}{8}, \quad c_3 = \tfrac{1}{16}, \quad c_4 = \tfrac{1}{32}

Thus

f(t) \approx \sum_{k=0}^{4} c_k L_k = \tfrac{31}{32} - \tfrac{13}{16}t + \tfrac{1}{4}t^2 - \tfrac{1}{32}t^3 + \tfrac{1}{768}t^4

The square of the norm of the function relative to the weighting factor is

\|f\|_w^2 = \int_0^\infty f^2(t) w(t)\, dt = \int_0^\infty e^{-3t}\, dt = \tfrac{1}{3}

The mean square error of approximation is

S = \|f\|_w^2 - \sum_{k=0}^{n} c_k^2 = \tfrac{1}{3} - \sum_{k=0}^{n} c_k^2

As each orthonormal term is added to the approximation, the mean square error is
decreased, as shown in Table 4.12-1.

Table 4.12-1

No. of     S         S, %       Approximation   Approximation    Maclaurin Series
Terms                           at t = 2        Error at t = 2   1 − t + t²/2! − t³/3! + t⁴/4!
                                                                 at t = 2

1        0.0833     24.99        0.5             +0.3647          +1.0
2        0.0208      6.24        0.25            +0.1147          −1.0
3        0.0052      1.56        0.125           −0.0103          +1.0
4        0.0013      0.39        0.1042          −0.0311          −0.3333
5        0.0003      0.09        0.115           −0.0203          +0.3333

A calculation of f(t) for a given value of t illustrates an interesting point. The approximation
for f(t) as each orthonormal term is added is also given in Table 4.12-1 for t = 2.
The actual value of f(t) at t = 2 is e^{-2} = 0.1353. Note that the error at t = 2 actually
increases as the fourth term is added, and then decreases as the fifth term is added.
This is in contrast to the mean square error over the entire interval, which decreases as
each term is added. A Maclaurin series expansion for f(t) at t = 2 is extremely poor,
as shown.
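The entries of Table 4.12-1 can be reproduced numerically; the sketch below (not in the original) uses numpy's Laguerre polynomials, which are the same polynomials L_k used above.

```python
import numpy as np
from numpy.polynomial.laguerre import Laguerre
from scipy.integrate import quad

f = lambda t: np.exp(-t)            # function to be approximated
w = lambda t: np.exp(-t)            # Laguerre weighting function

# expansion coefficients c_k = integral of f(t) L_k(t) e^{-t} dt over (0, inf)
c = [quad(lambda t, k=k: f(t) * Laguerre.basis(k)(t) * w(t), 0, np.inf)[0]
     for k in range(5)]
print(np.round(c, 5))                # [0.5, 0.25, 0.125, 0.0625, 0.03125]

norm_f_sq = quad(lambda t: f(t)**2 * w(t), 0, np.inf)[0]    # = 1/3
for n in range(1, 6):
    S = norm_f_sq - sum(ck**2 for ck in c[:n])              # mean square error
    approx_at_2 = sum(ck * Laguerre.basis(k)(2.0) for k, ck in enumerate(c[:n]))
    print(n, round(S, 4), round(approx_at_2, 4))            # matches Table 4.12-1
```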

REFERENCES

1. F. Chio, Memoire sur les Fonctions Connues sous le Nom de Resultantes ou de


Determinants, Turin, 1853, p. 11.
2. L. A. Pipes, Matrix Methods for Engineering, Prentice-Hall Inc., Englewood Cliffs,
N.J., 1963, pp. 10-12.
3. B. Friedman, Principles and Techniques of Applied Mathematics, John Wiley and
Sons, New York, 1956, pp. 2-5.
4. M. Bocher, Introduction to Higher Algebra, The Macmillan Co., New York, p. 296.

5. R. A. Frazer, W. J. Duncan, and A. R. Collar, Elementary Matrices, Cambridge,


University Press, 1938, p. 62.
6. Ibid., p. 65.
7. E. A. Guillemin, The Mathematics of Circuit Analysis, John Wiley and Sons, New
York, 1949, pp. 113-114.
8. F. B. Hildebrand, Methods of Applied Mathematics, Prentice-Hall Inc., Englewood
Cliffs, N.J., 1952, p. 61.
9. L. S. Pontryagin, Ordinary Differential Equations, Addison-Wesley Co., Reading
Mass., 1962, pp. 95, 291 ff.
10. F. R. Gantmacher, Applications of the Theory of Matrices, Interscience Division,
John Wiley and Sons, New York, 1959, pp. 301-303.
11. F. Ayres, Theory and Problems of Matrices, Schaum Publishing Co., New York,
1962, pp.188-195, 203-214.
12. H. W. Turnbull and A. C. Aitken, An Introduction to the Theory of Canonical
Matrices, Blackie and Son, London, 1932.
13. E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations,
McGraw-Hill Book Co., New York, 1955, p. 65.
14. Frazer, Duncan, and Collar, op. cit., p. 70.
15. Hildebrand, op. cit., pp. 65-67.
16. Frazer, Duncan, and Collar, op. cit., p. 78.
17. Ibid., p. 83.
18. R. Courant and D. Hilbert, Methods of Mathematical Physics, Vol. I, Interscience
Division, John Wiley and Sons, New York, 1953, Chapters II, V, and VI.

Problems

4.1 Perform the elementary operations A + B, A — B, and AB on the


following matrices.
1 2 3 -4 1
A = 0
jLt -5 B 1 5 0
4 0 2 -2 3

"«n a12 f>n 0 '


A = B
_a21 a22_ 0 t>22_

r
1
Cx|

CO

"l
A = B 0
3 2 1 i
4.2 What are the conditions on the elements atj and bij of the (2 x 2) matrices
A and B such that AB = BA?
4.3 Compute AB, where

1 2 1 2 1 2 1 2
0 1 2 1 2 1 0 1
B =
0 0 0 1 0 1 0 0
0 0 1 0 1 0 0 0

4.4 Under what conditions is (A ± B)2 = A2 + 2AB + B2 ?


4.5 Given the matrix equation AB = AC. Under what conditions is B ^ C ?

4.6 Find I A(/) dt and (d/dt)A(t) for


Jo
cos t t2 ■
A =
1 tanh t

4.7 Find the value of the following determinant by using only the definition
of a determinant.
-7 -4 3 2
3 2 -5 2
IA
6 4 0 -4
6 4 1 -5

4.8 Find all the minors and cofactors of the determinant given in Problem
4.7. Show that the Laplace expansion of the determinant along any row
is equal to the expansion along any column.
4.9 Find the value of the determinant shown in Problem 4.7 by the method
of pivotal condensation.
4.10 Show that the product of the determinants |A|, |B|, is equal to the de¬
terminant of the product, |AB|, where

3 0 2 "l -1 4“
A = -2 -1 -1 B = 2 3 0
-1 -3 5 5 0 2

4.11 Find the derivative d/dx of the determinant

x2 — 1 X — 1 1

|A| X4 X3 2x -f- 5
X + 1 X2 X

4.12 Find Adj A of


1 1 1 0
2 3~ 1 2 l"
2 3 3 2
0 1 2 A = 2 1 0 A =
1 3 3 2
0 0 1 -1 0 1
4 6 7 4

4.13 Prove that, if A is of order n and rank n — 1, then Adj A is of rank 1.


4.14 Prove that, if A is of order n and rank <n — 1, then Adj A = 0.
4.15 Show that |Adj A| = |A|n_1 if A is of order n.
4.16 Find A-1 for the matrices given in Problem 4.12.
4.17 Repeat Problem 4.16 using partitioned matrices.

4.18 The inverse of a square nonsingular matrix can be found by a technique


of pivotal condensation. First, the following array is set down:

n
A
'I
#in 0 0 • • • 0
hi ^12 a13

a22 a23 &2n 0 -1 o - • 0


721

hi @n2 hn 0 0 0 • • • -1

1 0 0 • • • 0 0 0 0 • • • 0

0 1 0 • • • 0 0 0 0 • • • 0

0 0 0 • • • 1 0 0 0 • • • 0

0 0 0 • • • 0 1

Using the principles of pivotal reduction, the array to the left of the
broken line is eliminated. The inverse matrix appears in the lower
right-hand box, where each element is divided by the element in the
2n + 1 row and n + 1 column. Find the inverse matrix of

2
A =
1
by pivotal reduction and verify the result using A 1 = Adj A/|A|. Why
does this technique work?
4.19 Repeat Problem 4.16 using the technique of pivotal reduction.
4.20 Find A-1 and (d/dt)[A-1] for

2t - 2 t + 2 -3
A = 31 - 1 t -1
-1 At — 3 -t + 1

4.21 If matrix A is symmetric, show that only (n/2)(n + 1) cofactors need be


computed in obtaining A”1.
4.22 An (m x n) matrix A is said to have a right inverse B if AB = I, and a
left inverse C if CA = I. Prove that B exists only if A is of rank m, and
that C exists only if A is of rank n.
4.23 Show that (AB)-1 = B-1A-1, where

1 2 3 “2 3 4~
A = 2 4 5 B = 4 3 1
1_

1 2 4
i

4.24 Show that, if the symmetric matrix A is nonsingular, then the inverse
matrix A-1 is also symmetric.

4.25 Find the inner and outer products of the following pairs of vectors:

1 -1
1 and y = 1
2 1

1 +j -1
x ■i -j and y = 1 +j
2 +J2 1 ~j

2
-1
and y =
1
1

2+y3 1 -f- j 2
1 -/2 0
x and y =
3 1 +y
0 2

4.26 Find the value of a which makes ||x — ay|| a minimum. Show that
for this value of a the vector x — ay is orthogonal to y and that
IIx — ay ||2 + || ay ||2 = ||x||2. The vector ay is called the projection of
x on y. Draw a diagram for the case where x and y are two-dimensional.
4.27 Find the projection of the vector (1, 1, 1) on the plane
xx + 2x2 + 3x3 — 0.

4.28 Assuming a three-dimensional space, prove that the four vectors (1,0, 0),
(0, 1, 0), (0, 0, 1), and (1, 1, 1) form a linearly dependent set, but that any
three of these vectors form a linearly independent set.
4.29 Express the vector y = (6, 3) in terms of the basis vectors xx = ( —1, 2),
x2 = (1, 3). Determine the reciprocal basis.
4.30 Using as a set of basis vectors xx = (1, 1, 1), x2 = (1, 0, 0), and x3 =
(0, 1, 0), find an orthonormal basis by use of the Gram-Schmidt orthog-
onalization procedure.
4.31 What is the dimension of the vector space spanned by the following sets
of vectors ?
(a) xx =(1,2,2, 1), x2 =(1,0,0, 1), x3 = [3, 4, 4, 3]
(b) Xj =(1,1,1), x2= (1,0,1), x3= (1,2,1)
(c) Xj =(1,0, 1,0), x2 = (0, 0, 5, 0), x3 = (10,0, 1,0),
x4 = (5, 0, 7, 0)
(rf) xx = (1,0,0), x2= (0,1,0), x3 = (0,0,1), x4= (1,1,1)

4.32 Find the rank and degeneracy of the following matrices.

'2 -2 3" 7 4-1


A = 10 -4 5 B = 4 7-1

_1

_i

-'t
VO
IT)

1
1

i
i
4.33 Find the degeneracy of the product AB where A and B are the matrices
given in Problem 4.32.
4.34 Let Sx be a subspace of the ^-dimensional space S. Show that any vector
x which is not in S± can be represented by

x = y + yi
where y is a vector in and yx = x — y is orthogonal to all vectors in
Sv Draw a three-dimensional picture illustrating this theorem. (See
Problem 4.26.)
4.35 Show that, if the ^-dimensional vector x is orthogonal to a set of basis
vectors of an /i-dimensional space, then x = 0.
4.36 Solve the following set of equations using Cramer’s Rule.

x_1 + x_2 + x_3 + x_4 = 10
3x_1 + x_2 - x_3 - 4x_4 = -14
2x_1 + 2x_2 - x_3 + x_4 = 7
x_1 + 3x_2 + 4x_3 - x_4 = 15

4.37 Find the solution to the equation y = Ax using the inverse matrix A-1,
where
1 2 3" x1 ”f
A = -1 0 2 x = x2 y = 3
0 1 0 x3 5

4.38 A convenient pivotal reduction scheme for computing the n unknowns


in an «th order system is as follows: Add to the coefficient matrix an
additional column consisting of the negative of the right-hand members.
Add to this array an additional n + 1 rows which have unity in the
diagonal element and all other terms equal to zero.

a12 *13 * * * *1 n ~Vi


a21 a22 *23 *2 n -V2

&nl ®n2 *n3 .* • * *nn Vn

0 0 • • •
1

0 1 0 • • •
0
0
0)
o|
1 n + 1

0 0 0 • • • 0 1J

Choose any element amk as the pivot term, where m and k are less than
or equal to n. Using the method of pivotal condensation, the rows and
columns of this array are reduced one at a time until only column n + 1
remains. This column consists of n + 1 elements clt c2, . . . , cn+1. The
solution for the k\h unknown xk is then given by

Cfc
xk = -
Cn+1

Solve the following set of equations using the method of pivotal


condensation.

ry> , ry* I ry*

2x1 — x2 + x3 = — 1

Show why this technique works.


4.39 Solve the following sets of linear equations.

(a) x2 + x2 -f x3 + x4 = 0
x± + x2 + x3 — x4 = 4

X1 + X2 ~ X3 + ^4 = —4

xx — x2 + x3 + £4 = 2

(b) xx + 2x2 + 9x3 = 0


2x1 + 2x3 = 0
2>x1 5^ = 0

(c) 2x1 + x3 — x4 = 0
xx + 3x2 + 2x3 + 4x4 = 0
xx + x2 + x3 + x4 = 0

4.40 Find the characteristic values and characteristic vectors for the following
matrices.
3
1 2
1 3
(a) 0 2
1
-1 1
2

4.41 Find the characteristic vectors, the modal matrix, and the diagonal form
for the matrices

0 1 o " ~2 -2 3 ” 7 4 -f
(a) 0 0 1 (b) 1 1 1 (c) 4 7 -1
-2 -4 -3 1 3 -1 -4 -4 4

4.42 Show that


"2 2 l" 2 1 -f
A = 1 3 1 and B = 0 2 -1
1 2 2 -3 -2 3

have the same characteristic values but are not similar matrices.
4.43 If a characteristic equation /?(2S) = 0 with a multiple root of order s
has degeneracy*/, whereq > 1, prove that the adjoint matrix Adj [ASI — A]
and all its derivatives up to and including (da~2/dkq~2)[Adj (21 — A)]a=As
are null.
4.44 The matrix
0 0
1 0
4 1

has multiple roots. Show that by adding a perturbation matrix

0 0“

0 e2 0
0 o e3

all the characteristic values of A are distinct. Show that as e{ approaches


zero all the characteristic vectors collapse into one characteristic vector.
4.45 Find the sum and product of the characteristic values of A, where

1 0 1
0 0 2
A =
0 5 1
5 1 0

4.46 If the «th order matrix A has a characteristic value At- repeated q times,
show that the rank of — A is not less than n — q, and that the as¬
sociated invariant vector space (null space) is not greater than q.
4.47 Prove that the characteristic values of a unitary matrix [A-1 = (Ar)*j
and an orthogonal matrix (A-1 = AT) have an absolute value equal to
unity.
4.48 Prove that, if Xi is a nonzero characteristic value of A, then i A|/Af is a
characteristic value of Adj A.
4.49 If the matrix A has characteristic values A1? . . ., An, show that
{a) Am has characteristic values Axm, . . . , Xnm.
(,b) A-1 has characteristic values 1/AX,. . . , \\Xn
(c) AT has characteristic values A1? . . . Xn.
0d) kA has characteristic values k\, . . . , kln.

4.50 If A is a triangular matrix, such that the elements either above or below
the principal diagonal are zero, show that the elements of the principal
diagonal are the characteristic values of A.
4.51 A vector xk, for which (A — ^ 0, but for which (A — 2il)kxk = 0,
is called a “generalized characteristic vector” or a characteristic vector
of rank k associated with the characteristic value 2Z-. These generalized
characteristic vectors are useful in determining the modal matrix for
systems which have repeated characteristic values.
{a) Show that characteristic vectors of different rank are linearly inde¬
pendent.
(b) Find the generalized characteristic vectors for

1 0 0
A = 1 1 0
2 3 2

(c) Find a modal matrix M for the matrix of part (b).


4.52 The adjoint transformation A* of the linear transformation A is defined
as (Ax, y> = <x, A*y>. Show that (A*)* = A.
4.53 Solve the following set of linear equations by using only elementary
row transformations.
xx + 2x2 x3 = 2
7.xx + x2 + %3 = 7
5.rx — 3x2 + 3xs = 8
4.54 Reduce
0 1
A = -1 0
-2 3

to the canonical form shown in Eq. 4.8-16.


4.55 Reduce
6J j i +j
A = j y -i

L-i +j i j

to the canonical form shown in Eq. 4.8-19.


4.56 Reduce
3 2
2 5
5 1

to the canonical form of Eq. 4.8-15.



4.57 Introduction to Linear Codes: Suppose that four binary digits {c^, c2,
c3, c*}, where each c may be either zero or one, are to be stored in
a computer memory. However, one or more of the digits may be in¬
advertently changed during the read or write cycle. Thus, the numbers
read out of memory, {xx, x2, x3, x4}, may be in error. It is possible to al¬
leviate this situation somewhat by introducing “parity check bits”
c5, c6, c7 as specified below. The encoded data (cl5 . . . , c7) are then
stored in seven bits of memory.

Noise = €
Fig. P4.57

The system shown in Fig. P4.57 illustrates this procedure. c5, c6, and
c7 are computed from cl5 c2, c3, c4 in the encoder. The numbers xlt. .., x7
are later read out of the memory and passed through a decoder.

Cl

^2 e2 x3
Vi
c = € = X = y = V2

• . . J/3_

_C1_ _e7_ x7

x = c + e

c5, c6, c7 are chosen as follows:


c5 is chosen so that the parity of (number of ones in) the sequence
*i = {?2, c3, c4, c5} is even.
c6 is chosen to make the parity of s2 = {c4, c2, c4, c6} even.
c7 is chosen to make the parity of s3 = {a?1} x3, x4, x7j even.
Errors are detected by checking the parity of the sequences corresponding
to slt s2, s3 in the vector x Define the vector y such that

1 if the parity of s± is odd


0 if the parity of sx is even

and likewise for y2 and s2, and y3 and s3.


It turns out that, if no more than one error occurs (i.e., x and c differ
in no more than one position), then it is possible to reconstruct c merely
by computing y. This is called a Hamming code. The problem is

formulated in terms of vector spaces and linear transformations. All


arithmetic is done modulo 2:

0+0=0 00=0
0 + 1=1 0-1=0
1+0 = 1 1-0=0
1+1=0 1-1=1

0a) Show that the mapping from the x vectors to the y vectors is a linear
transformation T. Find the matrix A of this transformation T with respect
to the natural basis in V7 and the natural basis in V3 (the seven and three-
dimensional spaces, respectively).
(b) List the eight error vectors e which correspond to
(1) no error;
(2) an error in the ith place only for / = 1, 2, . . . , 7.
Note: x = c + e defines e
(c) Show that the vectors c are the null space of A. Find the matrix

of the form whose column vectors are a basis for this null space.

(d) If the only possible errors are those listed in (b), show that it is always
possible to detect which c has occurred by computing y. For each y list
the corresponding e.
(e) In general, any matrix of ones and zeros defines a “linear code.”
There are other 3 by 7 matrices which define linear codes which have the
same error-correcting properties as A. How many are there, and what
is their relationship to A?
(/) In general, if a code defined by an (m x n) matrix A of a linear trans¬
formation T from Vn into Vm is to correct for a possible set of n-
dimensional error vectors {0, el5 . . . , ep}, what condition must be placed
upon the transformed vectors . . . , T(ev)} in Vm? Prove your
answer using the linearity property for T.
(g) Is it possible to construct a single error-correcting code such as the
one above from a (3 x 8) matrix? a (3 x 6) matrix? Tell why, referring to
your answer to (/).
4.58 If the coefficient matrix A of the linear transformation y = Ax is singular,
then the null space of A is the vector space formed by the vectors which
satisfy Ax = 0. Find the null space of

1 2 -2
A = 1 3 3
0 1 5

4.59 This problem is meant to illustrate the meaning of characteristic vector,


and of characteristic value in terms of two-dimensional spaces.
(a) Consider the linear transformation which is rotation of vectors by
an angle 6 as shown in Fig. P4.59a.

(1) Find the matrix of T with respect to e1 and e2.


(2) Find all characteristic vectors, and characteristic values.
(b) Consider the linear transformation 7Y-.) = Ae, where

k\ (k2 &i)
0 k2

(1) Find the characteristic vectors and characteristic values.


(2) Sketch, in the elt e2 plane, the characteristic vectors Pi and p2
and their transformations T( px) and T( p2).
(3) Suppose k1 = k2. Describe the situation
(i) geometrically;
(ii) in term of characteristic vectors and characteristic values.
(c) To see how it is possible that only one characteristic vector exists,
consider the linear transformation obtained as follows: Stand a deck
of playing cards on edge so that you are looking at the deck sideways.
Draw a vector a on the edge of the deck. Now “skew” the deck as shown
in Fig. P4.59b and note the “new” vector a' = T{a) obtained. What is
the matrix of T with respect to e1 and e2?
This transformation has an element of “shear” not present in (a)
or (b).
(1) Find the (second order) characteristic vector, and the characteristic
value.
(2) Find the Jordan normal form.

(d) Consider the transformation which is the result of first performing (b)
for k1 = V2 and k2 = 1/V2; and then performing («) for 0 =45°.
(1) What is the matrix of this transformation?

T
=£>

(2) Find the (second order) characteristic vector and the characteristic
value, and interpret in terms of (c).
4.60 A second order orthogonal matrix must be of the form

cos 0 sin 6 cos 0 sin 6'


or An
—sin 6 cos 6 sin 6 —cos 0

(a) Prove that this is true.


(b) Consider the linear transformation y = Ax, where A = A^. What
can be said about the components of a vector in the original coordinate
system with respect to the same components in the transformed co¬
ordinate system ? How is the relationship between two different vectors
in the original coordinate system affected? Draw a picture of this
transformation.
(c) Repeat (b) for A = An.
4.61 Problem 4.59 illustrated the difference between linear transformations
whose Jordan forms have ones on the superdiagonal and those which do
not. This was done in two dimensions by means of an analog involving
a pack of playing cards. In a similar manner (not necessarily involving
cards), develop analogies in three dimensions for Jordan forms:

"l 0 0~ "l 1 o"

(fl) 0 1 1 and (b) 0 1 1


0 0 1 0 0 1

Make sketches of both (a) and (b), indicating clearly the directions of
all characteristic vectors.

4.62 Find the Jordan form and a modal matrix for the following matrices.

4 2 2 0
-1 0 -1 -1
(«) (b)
-1 0 1 1
1 2 1 3

2 0 0 0 1 0 0 0
0 1 1 0 0 0 1 0
(c) id)
0 1 3 2 0 1 0 0
V

0 1 -1 2 0 -1 -1 -1

0 0 0 0
0 0 0 0
0 0 0 1
10 0 0

4.63 Determine if the following matrices are: (1) positive definite, (2) positive
semidefinite, (3) negative definite, (4) negative semidefinite

'2 1 l" 3 1 1
2 6 —3
1 3 0 1 7 1
0) 0b) 6 6 —3
1 0 1 1 1 -1
3 3

~2 1 l" "-1 -1 -l”


(c) 1 3 2 id) -1 -4 -1
1 2 1 -1 -1 -2

1 1 1
(e) 1 3 1
1 1 1

4.64 Consider the quadratic equation <x, Ax) = 1, x real, A(2 x 2), real,
symmetric with an > 0. The equation defines a curve in the x-plane.
(a) Show that, if A is positive definite, the curve is an ellipse.
(1) What significance have the characteristic vectors in terms of the
orientation of the ellipse?
(2) What significance have the characteristic values ?
(3) Draw a sketch to illustrate (1) and (2).
(4) What happens if there is only one characteristic value?
(b) Show that, if |A| < 0, the curve is a hyperbola. Repeat (1), (2), and
(3) above for the hyperbola.

(c) What happens if one of the characteristic values is zero ?


0d) Sketch curves defined by the equation for the following values of A.
Find characteristic values and characteristic vectors in each case, and
indicate these quantities on your sketch.
5 3.
2 2

A =
3. 5.
2 2

4.65 Consider the function Q(x) = (x, Ax>, where A is a real, symmetric
matrix.
(a) Show that x = 0 is a stationary point.
(b) Under what condition on A is x = 0 a relative maximum? Prove it.
(c) Under what condition on A is x = 0 a relative minimum? Prove it.
(<d) If A is nonsingular and fits neither of these conditions, then what is
x = 0?
(e) What happens if A is singular with either single or multiple degen¬
eracy ?
(/) Illustrate your answers to (b) through (e) by means of sketches of the
level curves of (1) to (4) below.

-2 r
(1) A = (2) A =
i -l

1 1 1
(3) A = (4) A =
1 1 1
4.66 For the matrix
r
-3

find (a) A1^, {b) cos A, (c) sinh A.

4.67 Find eA by both the Cayley-Hamilton theorem and Sylvester’s theorem


for
7 3 6 6
1 1 1
2 2
4 2 4 4
{a) A = 1 1 3.
0b) A =
2 2
3 5 0 -1
3. 1
2" 2
2-2 4 5

~0 1 o" “2 1 o"
(c) A = 0 0 1 (d) A = 1 2 0
1 -3 3 1 1 1

4.68 Find A10 where A is given by

~1 0 o' ~1 i o'
3

{a) A = 0 2 2 (b) A = 3 -1 0
0 2 2 4 4 1
3 3_

4.69 Equation 4.11-7 defines the integral zAt dt in terms of A 1. If A is


Jo
singular, how can this integration be performed ?
4.70 Prove Eq. 4.11-9.
4.71 Prove that ||f(x) — g(x)\\ = \\f(x)\\ + ||(g(V)|| [/ and g are defined over
the interval (a, b)] only if f(x) and g(x) are orthogonal over the interval
(a, b).
4.72 Show that the mean value of a function over a given interval is always
less than or equal to the rms value of the function over the same
interval.
4.73 Show that Problem 4.72 is a special case of the Schwarz inequality
b
f{x)g{x) dx < f\x) dx g2(x) dx

Prove that this equality exists if and only if g(x) = kf(x), where k is a
constant.
4.74 Prove that the Laguerre polynomials are orthonormal over the interval
(0, co).

4.75 One technique which has been proposed for linear process identification
using only the input-output operating record of the process is to model
the process with a set of orthogonal transfer functions. In this technique,
the output of the model of the process is taken as a linear sum of the
outputs of the orthogonal transfer functions. The input to the model is
the same as the input to the process. See Fig, P4.75. The coefficients

Fig. P4.75

ax, a2, . . . , ay are adjusted until the mean square difference between the
output of the process, y(t), and the output of the model, z(t), is a
minimum. The set of orthogonal transfer functions is then said to “model”
the process.
(a) If
N
z(t) = 2
1
show that the optimum settings for the coefficients are given by

lim — I y(t)Zj(t) dt
00 T time average of yiOz^t)
a, =
mean square value of Zj(t)
\t) dt

(,b) A set of n orthonormal functions is to be constructed for the interval


0 < t < co using as a set of linearly independent transfer functions

1
FIs) = i — 1
5 + 5V

Show that the orthonormal transfer functions (called Kautz orthogonal


filters) are given by

®100 = -r-1-
+ Si
n—1

_no- sk)
$n0) = ^- , n > 2
IT 0 + Sk)
k=1
(c) If a weighting factor w(t) = e 2<xt is used, show that the orthonormal
transfer functions are given by

V 2(5-! + a)
#i0) =
s + s1

V 2(s2 + a)(5 — 2 a — ^
<f>20) =
(s + 51)(5- + S2)
n
_ ITO - 2a - sk)
®«0) = V2(sn + a) -- > n > 2
ITO + sk)
k—1

(d) If the orthonormal functions are constructed from transfer functions


having pairs of complex poles at s = — a< ± jdi and a weighting factor

e~2<xt, show that the orthonormal functions are

s — a + \s±\
3^0) = V2(a + aj)
(s + ax)2 + p*
s — a — Is-jJ
OgCi') = V 2(a + ax)
0 + a^2 + ft2

_ (J - a + W) n K* - 2a - a,)2 + p2]
022_i = V2(a + a^)- k~X -
I! K* + «*)2 + Pi?}
*=i \
0 - a - 1^1) XX [0 - 2a - afc)2 + Pk2]
/c=1
®2i = ^2(a + oti)

IT k* + **)2 + /Vi
fc=l
where i = 1, 2, . . ., n\2 and n = number of complex poles.

\sk\2 = O + afc)2 + pk2

Hint: Parseval’s theorem states that


Pc+j 00
fiit)hit) dt = F1(s)F2( -s) ds
J c—j CO

If the weighting factor w(t) = e 2at and a set of orthogonal functions


<f>i(t) are used, then
' 00 'c+j CO

<f>iit)<f>jit) r2at dt = $>i(s + a)03-(—5 + a) ds = d tj


2nj Jc- j 00
5
State Variables and Linear
Continuous Systems

5.1 INTRODUCTION

In this chapter the matrix techniques and vector space concepts of


Chapter 4 are brought to bear upon the problem of multiple input-
multiple output systems. Because of the increasing demands for well-
designed, complex systems, a search has been made for techniques which
are useful in the design of such systems. Not only must these techniques
be amenable to computation, but, more importantly, they must also be
amenable to conceptual thinking. Most practicing engineers think in
intuitive terms, and any proposed technique must enable the engineer
to grasp the concept of the technique and relate it to his previous
training.
The analysis and synthesis of multiple input-multiple output systems,
hereafter called multivariable systems, is a formidable task. The calculation
and control of interrelated effects in a multivariable system is a
complicated, exhausting process and is best done by an electronic computer.
However, the engineer who programs the computer must be aware of all
the possible techniques to solve his problem. For these reasons, and
others, the search for techniques to handle these problems has centered
on the “state variable” approach.
From a mathematical viewpoint, the state variable approach is the use
of matrix and vector methods to handle the large number of variables
which enter into such problems. As such, these are not new methods,
but rather they are the rediscovery of existing mathematical techniques.
They aid considerably in the solution of linear multivariable problems.
More importantly, however, the state variable approach aids conceptual
thinking about these problems, and nonlinear system problems as well.
Furthermore, it provides a unifying basis for thinking about linear and
313

nonlinear problems. These two topics are frequently treated as somewhat


unrelated by the engineer.

5.2 SIMULATION DIAGRAMS

A convenient method of representing the mathematical equations


governing a system is a block diagram, similar to the diagram sometimes
drawn before simulating a system on an analog computer. Such a diagram
consists of blocks, with the function of the simulated element indicated
inside the block. The basic elements most frequently utilized are ideal
integrators, ideal amplifiers, and ideal
adders, as shown in Fig. 5.2-1. The word
“ideal” is used because in actual practice
such factors as phase shift, sign inversion,
and loading must be taken into account
before a workable analog computer simu¬
lation can be determined.
Fig. 5.2-1

The approach used to generate a block diagram of a linear differential
equation is to integrate successively the highest derivative to obtain all
the lower order derivatives and the dependent variable.
The block diagram is completed by satis¬
fying the requirements of the differential equation, i.e., multiplying the
derivatives by their respective coefficients and summing these terms to
“close the loop.” Some illustrations serve to clarify this method.

Example 5.2-1. Draw a block diagram for the system governed by the differential
equation

\ddot{y} + a\dot{y} + by = v

where v is the input, and y the output.

Solving for the highest derivative \ddot{y},

\ddot{y} = v - a\dot{y} - by

Integrating \ddot{y} twice, both \dot{y} and y are obtained, as shown in Fig. 5.2-2a. The loop is then
closed by satisfying the requirement of the differential equation

\ddot{y} = v - a\dot{y} - by

The completed diagram is shown in Fig. 5.2-2b.


Figure 5.2-2b is essentially the form of an analog computer diagram for this system.
However, an analog computer diagram would also have to account for the sign change
inherent in the integrators, amplifiers, and adders, the initial conditions of the system,
and any time and amplitude scaling factors. Thus Fig. 5.2-2b exhibits only the mathe¬
matical description of the system with zero initial conditions, and not its practical
simulation.
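As an aside (not in the original text), the two-integrator diagram of Example 5.2-1 can be simulated directly by numerically integrating the outputs of the two integrators; the values a = 3, b = 2 and the unit step input are arbitrary choices for illustration.

```python
a, b = 3.0, 2.0            # arbitrary coefficients for illustration
v = lambda t: 1.0          # unit step input
dt, T = 1e-3, 10.0

y, ydot = 0.0, 0.0         # integrator outputs (zero initial conditions)
for k in range(int(T / dt)):
    yddot = v(k * dt) - a * ydot - b * y   # closing the loop: y'' = v - a y' - b y
    ydot += yddot * dt                     # first integrator
    y    += ydot * dt                      # second integrator

print(y)   # approaches the steady-state value v/b = 0.5
```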
Fig. 5.2-2

Example 5.2-2. Draw a block diagram for the differential equation

\ddot{y} + a\dot{y} + by = \dot{v} + v

The only change that must be made to the previous block diagram is the addition of a
block which provides the \dot{v} term, as shown in Fig. 5.2-3a. However, a block containing
differentiation is not generally utilized. Differentiators are noise-accentuating devices
and therefore are not employed in an analog computer simulation.

Fig. 5.2-3
Fig. 5.2-4

This can be simulated as in Fig. 5.2-3b. The output of the first integrator is \dot{y} - v.
By adding v, \dot{y} is obtained. The diagram is then completed as in the previous example.
In essence, the \dot{v} input of Fig. 5.2-3a has been shifted to the right of the first integrator,
and the differentiation and integration operators cancel.

Example 5.2-3. Draw a block diagram for the system governed by the differential
equation

\ddot{y} + a\dot{y} + by = e\ddot{v} + d\dot{v} + cv

where a, b, c, d, and e are constants.

Using the method of the previous example, the input to the first integrator is assumed
to be \ddot{y} - e\ddot{v} - d\dot{v}, as shown in Fig. 5.2-4a. The term -dv can be canceled by adding
dv to the output of the first integrator, as in Fig. 5.2-4b.

The term -ev can be canceled by adding ev at the output of the second integrator.
This is shown in Fig. 5.2-4c. However, when an attempt is made to close the loop to
satisfy the differential equation, it is seen that the derivative \dot{y} does not appear by itself
at any point in the diagram. Point 1 in Fig. 5.2-4d has the value cv - a\dot{y} - by + ae\dot{v}
and does not satisfy the requirements of the differential equation

\ddot{y} - e\ddot{v} - d\dot{v} = cv - a\dot{y} - by     (5.2-1)

An additional term ae\dot{v} is returned to the input of the first integrator. The system can
still be simulated in the form shown, however, but with a modification of the blocks
containing the multiplying factors c, d, and e.
Assume, then, the form shown in Fig. 5.2-5a, where b_0, b_1, and b_2 are the proper
multiplying factors. Call the input to the first integrator \dot{q}. The values of other points
in Fig. 5.2-5a are indicated. The block diagram is a simulation of the differential
equation

\dot{q} = b_0 v - a(q + b_1 v) - b\left[ \int (q + b_1 v)\, dt + b_2 v \right]     (5.2-2)

Recognition that the output of this simulation must be equal to y gives

y = \int (q + b_1 v)\, dt + b_2 v     (5.2-3)

Then

q = \dot{y} - b_1 v - b_2\dot{v}
\dot{q} = \ddot{y} - b_1\dot{v} - b_2\ddot{v}     (5.2-4)

Substituting Eqs. 5.2-2 and 5.2-3 into Eq. 5.2-4 yields

\ddot{y} - b_1\dot{v} - b_2\ddot{v} = b_0 v - a(\dot{y} - b_2\dot{v}) - by     (5.2-5)

Collecting terms,

\ddot{y} - b_2\ddot{v} - (b_1 + ab_2)\dot{v} = b_0 v - a\dot{y} - by     (5.2-6)
Fig. 5.2-5

Comparing Eq. 5.2-6 with Eq. 5.2-1, the requirements for b_0, b_1, and b_2 are found by
equating like terms. They are

b_2 = e
b_1 + ab_2 = d     or     b_1 = d - ae     (5.2-7)
b_0 = c

The completed diagram is shown in Fig. 5.2-5b.


This is a general procedure for linear systems and is utilized in Section 5.5 to derive
the standard form of a linear vector matrix differential equation. It is equivalent to
canceling the output of the first integrator in Fig. 5.2-4d caused by the undesired input
aeu, if the loop were closed. The cancellation is accomplished by modifying the gain of
the d block. Comparison of Figs. 5.2-4d and 5.2-5b reveals this viewpoint.
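A quick check (not in the original): under one consistent reading of Fig. 5.2-5b, the gains of Eq. 5.2-7 reproduce the original differential equation exactly. The state equations below are read off that interpretation of the diagram (q = output of the first integrator, r = output of the second), and the step response is compared with the original transfer function for arbitrary numerical coefficients.

```python
import numpy as np
from scipy import signal

a, b, c, d, e = 2.0, 5.0, 1.0, 3.0, 0.5      # arbitrary test coefficients
b2 = e
b1 = d - a * e                               # Eq. 5.2-7
b0 = c

# q' = b0*v - a*(q + b1*v) - b*(r + b2*v),  r' = q + b1*v,  y = r + b2*v
A = np.array([[-a, -b],
              [1.0, 0.0]])
B = np.array([[b0 - a*b1 - b*b2],
              [b1]])
C = np.array([[0.0, 1.0]])
D = np.array([[b2]])

t = np.linspace(0, 10, 1001)
_, y_diag = signal.step(signal.StateSpace(A, B, C, D), T=t)
_, y_orig = signal.step(signal.TransferFunction([e, d, c], [1.0, a, b]), T=t)
print(np.max(np.abs(y_diag - y_orig)))   # essentially zero
```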

Time-Varying Systems

Time-varying systems can be represented by utilizing blocks with time-


varying gains. If derivatives of the input are contained in the differential
equation, the procedure indicated in Example 5.2-3 becomes more com¬
plicated. It is reserved for a problem at the end of the chapter.

Example 5.2-4. Draw a block diagram for the time-varying system governed by the
differential equation \ddot{y} + a(t)\dot{y} + b(t)y = v.

Using amplifiers with time-varying gains, this is simulated as shown in Fig. 5.2-6.

Fig. 5.2-6

Nonlinear Systems

By utilizing blocks known as multipliers and function generators, a
diagram can be drawn for systems which are nonlinear. These two blocks
are shown in Fig. 5.2-7.

Example 5.2-5. Draw the block diagram for the system whose differential equation is

\ddot{y} + a\dot{y} + by^3 = v

The simulation diagram for this system is shown in Fig. 5.2-8a. This system can also
be represented as in Fig. 5.2-8b. The block indicated by ( )^3 takes the place of the two
multiplier blocks. In this case the function to be generated is f(y) = y^3.

Example 5.2-6. Draw a block diagram for

\ddot{y} + a\dot{y}y + by^3 = v

This is a case where both a function generator and a multiplier must be used. The
simulation diagram is shown in Fig. 5.2-9.

Example 5.2-7. If the functional relationship between the input y and the output e of a
block is defined as e = d(y), draw a block diagram for the differential equation

\ddot{y} + d(y) + by = v

The nonlinearity is ideal limiting as shown in Fig. 5.2-10.


Fig. 5.2-8

Fig. 5.2-10 (ideal limiting characteristic, e = d(y))
Fig. 5.2-11

Often, when the function to be generated cannot be expressed in a convenient mathe¬


matical form, the input-output relationship of the desired function is drawn directly in
the block. In most cases this is done where the input-output relationship is piecewise
linear. This is done in the diagram shown in Fig. 5.2-11 a, which is more informative
than Fig. 5.2-11 b. Figure 5.2-11 a indicates at a glance what the nonlinearity is, without
having to refer to another defining diagram for d(y).

Multivariable Systems

For multivariable systems the approach is substantially the same.


Examples 5.2-8 and 5.2-9 illustrate two cases.

Example 5.2-8. Diagram the multivariable system

\ddot{y}_1 + 3\dot{y}_1 + 2y_2 = v_1
\ddot{y}_2 + \dot{y}_1 + y_2 = v_2

It is obtained by following the principles previously discussed. The simulation
diagram is shown in Fig. 5.2-12.

Fig. 5.2-12

Example 5.2-9. Diagram the multivariable system

\dot{y}_1 + y_1 = v_1 + 2v_2
\ddot{y}_2 + 3\dot{y}_2 + 2y_2 = v_1 + \dot{v}_2 + v_2

This can be handled by transferring the \dot{v}_2 term to the left side of the equation as in
Example 5.2-2. The simulation diagram is shown in Fig. 5.2-13.

Fig. 5.2-13

5.3 TRANSFER FUNCTION MATRICES

The concept of the transfer function H(s) was introduced for single
input-single output fixed linear systems in Section 3.5. For systems which
have more than one input or output, transfer functions between various
input-output terminals become of interest. The principle of the transfer
function is simply extended to cover this more general case. The transfer
function H_{ij}(s) is the transfer function between input terminal j and output
terminal i and is defined by

H_{ij}(s) = \frac{Y_i(s)}{V_j(s)}, \qquad V_k(s) = 0, \quad k \neq j     (5.3-1)

If the elements H_{ij}(s) are ordered into an array, where the first subscript i
denotes the row and the second subscript j denotes the column, then this
array is called a transfer function matrix.

Example 5.3-1. A system with two inputs and two outputs is governed by the set of
differential equations

\ddot{y}_1 + 5\dot{y}_1 + 6y_1 = \ddot{v}_1 + 3\dot{v}_1 + 4\dot{v}_2 + 8v_2
\dot{y}_2 + y_2 = \dot{v}_1 + 2\dot{v}_2 + 2v_2

Determine the transfer function matrix and draw a block diagram.

Taking the transform of both equations, assuming zero initial conditions, yields

(s + 2)(s + 3) Y_1(s) = s(s + 3) V_1(s) + 4(s + 2) V_2(s)
(s + 1) Y_2(s) = s V_1(s) + 2(s + 1) V_2(s)

or

Y_1(s) = \frac{s}{s + 2} V_1(s) + \frac{4}{s + 3} V_2(s)

Y_2(s) = \frac{s}{s + 1} V_1(s) + 2 V_2(s)

The transfer function matrix H(s) is then

H(s) = \begin{bmatrix} H_{11}(s) & H_{12}(s) \\ H_{21}(s) & H_{22}(s) \end{bmatrix}
= \begin{bmatrix} \dfrac{s}{s + 2} & \dfrac{4}{s + 3} \\[2mm] \dfrac{s}{s + 1} & 2 \end{bmatrix}

The transfer function diagram is shown in Fig. 5.3-1.

Fig. 5.3-1
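As a check (not part of the original text), the transfer function matrix of Example 5.3-1 can be obtained by solving the transformed equations symbolically:

```python
import sympy as sp

s, V1, V2, Y1, Y2 = sp.symbols('s V1 V2 Y1 Y2')

eqs = [sp.Eq((s + 2)*(s + 3)*Y1, s*(s + 3)*V1 + 4*(s + 2)*V2),
       sp.Eq((s + 1)*Y2, s*V1 + 2*(s + 1)*V2)]
sol = sp.solve(eqs, [Y1, Y2])

# H_ij(s) is the coefficient of V_j in Y_i (the other input set to zero)
H = sp.Matrix([[sp.simplify(sp.expand(sol[Y1]).coeff(V1)),
                sp.simplify(sp.expand(sol[Y1]).coeff(V2))],
               [sp.simplify(sp.expand(sol[Y2]).coeff(V1)),
                sp.simplify(sp.expand(sol[Y2]).coeff(V2))]])
sp.pprint(H)   # [[s/(s+2), 4/(s+3)], [s/(s+1), 2]]
```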
Example 5.3-2. A two input-two output system is described by the differential equations

\dot{y}_1 + y_2 = v_1 + v_2
\dot{y}_2 + y_1 = v_2

Determine the transfer function matrix and a block diagram.

Transforming the equations assuming zero initial conditions,

s Y_1(s) + Y_2(s) = V_1(s) + V_2(s)
s Y_2(s) + Y_1(s) = V_2(s)

Solving for Y_1(s) and Y_2(s),

Y_1(s) = \frac{s}{(s + 1)(s - 1)} V_1(s) + \frac{1}{s + 1} V_2(s)

Y_2(s) = \frac{-1}{(s + 1)(s - 1)} V_1(s) + \frac{1}{s + 1} V_2(s)

Hence

H(s) = \begin{bmatrix} \dfrac{s}{(s + 1)(s - 1)} & \dfrac{1}{s + 1} \\[2mm] \dfrac{-1}{(s + 1)(s - 1)} & \dfrac{1}{s + 1} \end{bmatrix}

A transfer function diagram for this system is shown in Fig. 5.3-2.

It should be noted that the transfer functions 1/(s + a) and s/(s + a), commonly
shown on diagrams of this sort, can be obtained from a single integrator. This is shown
in Fig. 5.3-3. An important point to be stressed here is related to the fact that the transfer
function diagram of Fig. 5.3-2 uses three integrators. A conclusion that the system is of

third order is incorrect. The original differential equations require only two integrators
for simulation, as shown in Fig. 5.3-4. The simulation diagram, obtained from the
original differential equations, always shows the correct order of the system. However,
the transfer function diagram should be used with the physical problem always in view;
otherwise an incorrect conclusion about the order of the system may be made. The

Fig. 5.3-4

transfer function diagram often masks the physical properties of a system. This point is
covered at greater length in Section 5.7, when the controllability and observability of a
multivariable system are discussed.

5.4 THE CONCEPT OF STATE

Consider a single input-single output, linear, electrical network whose


structure is known. The input to the network is the time function v(t),
and the output of the network is the time function y(t). Since the network
is known, complete knowledge of the input v(t) over the time interval
— oo to t is sufficient to determine the output y(t) over the same time
interval. However, if the input is known only over the time interval t0 to t,
then the currents through the inductors and the voltages across the
capacitors at some time t_1, where t_0 \leq t_1 \leq t (usually t_1 = t_0), must be
These currents and voltages constitute the “state” of the network at time
t_1. In this sense, the state of the network is related to the memory of the
network. For a purely resistive network (zero memory), only the present
input is required to determine the present output.
For another example of the state of a system, consider the solution of a
linear constant coefficient differential equation for t > t0. Once the form
of the complete solution is obtained in terms of arbitrary constants, these
constants can then be determined by the fact that the system must satisfy
boundary conditions at time t0. No other information is required. The
boundary conditions can be termed the state of the system at time t0.
Heuristically, the state of a system separates the future from the past, so
that the state contains all the relevant information concerning the past
history of the system required to determine the response for any input.

The idea of state is a fundamental concept and therefore cannot be


defined any more than, for example, the word “set” can be defined in
mathematics. The most that can be done is to state the properties required
of a system whose behavior involves the notion of state. The systems
which are considered here are classified as deterministic systems. A
deterministic system is defined by the following:!

(1) There is a class of time functions \(t) called admissible input


functions;
(2) for each time, a set Xt is defined whose elements x(t) are the possible
states at time t; and
(3) to each pair v(t), x(t) is assigned at least one time function, called
an output function, and for every t' > t, a unique element x(t') contained
within X_{t'}.

These sets of states and assignment must satisfy the following consistency
conditions for a state-determined system:

(1) The admissible input functions must be such that, if v_1(t) and v_2(t)
are admissible input functions, then

v_3(t) = v_1(t)     t < t_0
v_3(t) = v_2(t)     t \geq t_0

is an admissible input function. Figure 5.4-1 illustrates the scalar case.


(2) The manner in which a system reaches a present state does not
affect the future output. The present state of a system and the present

t Item 2 allows for the possibility of a time-varying state space, and item 3 involves the
concept of a final or terminal state for each initial state at time t. These ideas differ
somewhat from the more general ideas expressed by Zadeh and Desoer,1 whose concept
of state evolves from the characterization of a system by a listing of all observable input-
output pairs.

Fig. 5.4-2

and future inputs to a system uniquely determine the present and future
outputs. Thus, for every x(t_0) contained in X_{t_0} and admissible input
functions v_1(t), v_2(t) with v_1(t) = v_2(t) for t \geq t_0, any output function
associated with x(t_0) and v_1(t) is identical with any output function associated
with x(t_0) and v_2(t) for t \geq t_0. This is illustrated in Fig. 5.4-2 for
the scalar case.
(3) If the initial state of a system and the input v(t), t \geq t_0, are given,
then the output y(t), t > t0, is uniquely determined. Assume that the

Fig. 5.4-3

known input and output time functions are divided into two time intervals.
For the second interval, there may be many initial states for which the
given input function over this interval yields the given output function.
However, at least one of these possible initial states must be the terminal
state of the first interval. This is illustrated in Fig. 5.4-3. More formally,
for any x(t_0) contained in X_{t_0} and input function v(t), let X_1 contained in
X_{t_1}, t_1 > t_0, be the set of all states for which the output function associated
with v(t) and x contained in X_1 is the same as the output function associated
with v(t) and x(t_0) for t \geq t_1. Then x(t_1) is contained in X_1.
These three consistency conditions can be written as a pair of equations,
which are called the state equations. They are

y(t_0, t) = g[x(t_0); v(t_0, t)]     (5.4-1)

x(t) = f[x(t_0); v(t_0, t)]     (5.4-2)

where both g and f are single-valued functions. Equation 5.4-1 states that
the output y over the time interval t_0 to t is a single-valued function of the
state at the beginning of the interval and the input v over this time interval.
The state at the end of the interval is said, in Eq. 5.4-2, to be a single¬
valued function of the same argument. These two equations define a
state-determined system.
For most of what follows, the outputs of the integrators in the simulation
diagram are used as the components of the state vector. The state vector
is defined in terms of an n-dimensional state space, whose coordinates are
x_1, x_2, . . . , x_n. The motion of the tip of the state vector in state space is
called the trajectory of the state vector.
Although the outputs of the integrators in the simulation diagram form

a natural state vector, these variables may not be physically measurable


in a system. In order to control successfully a multivariable system, the
control laws must be in terms of measurable information about the system.
More is said in this regard later.

5.5 MATRIX REPRESENTATION OF LINEAR STATE EQUATIONS

In the previous section the state equations of a continuous deterministic


system were defined to be

y(t) = g[x(t_0), v(t_0, t)]

x(t) = f[x(t_0), v(t_0, t)]     (5.5-1)

If the system can be described by a set of linear ordinary differential
equations, the state equations can be written as

\dot{x}(t) = A(t)x(t) + B(t)v(t)

y(t) = C(t)x(t) + D(t)v(t)     (5.5-2)

where A(t), B(t), C(t), and D(t) are, in general, time-varying matrices, and

x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad
v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{bmatrix}, \quad
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_r \end{bmatrix}

A general diagram for these equations is shown in Fig. 5.5-1. If the system
is fixed, i.e., nontime-varying, then the matrices A(t), B(t), C(t), and D(t)
are constant and can be written simply as A, B, C, and D.

Fig. 5.5-1
Example 5.5-1. Write a set of state equations for the system of Fig. 5.2-12.

For the x's, v's, and y's indicated in Fig. 5.2-12,

\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} =
\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -3 & -2 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -1 & -1 & 0 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} +
\begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}

\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + [0]
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}

Thus

A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & -3 & -2 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & -1 & -1 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \quad
D = [0]
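These matrices can be entered directly and the multivariable system simulated; the step inputs below are an arbitrary illustration, not part of the original example.

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0, 1, 0, 0],
              [0, -3, -2, 0],
              [0, 0, 0, 1],
              [0, -1, -1, 0]], dtype=float)
B = np.array([[0, 0], [1, 0], [0, 0], [0, 1]], dtype=float)
C = np.array([[1, 0, 0, 0], [0, 0, 1, 0]], dtype=float)

v = lambda t: np.array([1.0, 0.5])          # arbitrary step inputs v1, v2

sol = solve_ivp(lambda t, x: A @ x + B @ v(t),
                (0.0, 10.0), np.zeros(4), max_step=0.01)
y = C @ sol.y                                # outputs y1(t), y2(t)
print(y[:, -1])                              # output values at t = 10
```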
Example 5.5-2. Determine a set of state equations for a general nontime-varying
linear network consisting of resistors, inductors, and capacitors.
Without loss of generality, the network can be represented as in Fig. 5.5-2a. Define
the state variables as
x4, x2,. . . , xm — voltages across the resistive network terminal
pairs connected to Cu C2,. . . , Cm, respectively; and

xm+1, xm+2,. . . , xn = currents into the upper resistive network terminals


connected to Lm+l, Lm+2,. . . , Ln, respectively

Then the network can be described by \dot{x} = Ax, or

\dot{x}_i = \sum_{j=1}^{n} a_{ij} x_j

where the a_{ij}'s are to be determined.
Assuming that the network is initially at rest, x_i = 0, i = 1, 2, . . . , n. If a unit step
of voltage is applied in series with the yth capacitor at t = 0, initially all capacitors
appear as short circuits and all inductors appear as open circuits. This is true because
capacitor voltages and inductor currents cannot change instantaneously in response to a
finite signal. The current through the zth capacitor at t = 0+ is

i_{C_i}(0+) = -C_i\dot{x}_i(0+) = -C_i a_{ij} x_j(0+)

This value is the same as the constant current which would flow through the zth terminal
pair, if all capacitors were replaced by short circuits and all inductors replaced by open
circuits, and a source of 1 volt applied as abo.ve. Thus
Or ..
Clij
C*)

~Ci
i, j, 1,2, m

where is the current through the short circuit replacing Cu due to the unit step
voltage source replacing C3. Note that the resultant network which must be analyzed to
determine these au's is a purely resistive network.
Fig. 5.5-2
For this same step of voltage applied in series with the y'th capacitor at t = 0, the
voltage across the /th inductor is

e_{L_i}(0+) = -L_i\dot{x}_i(0+) = -L_i a_{ij} x_j(0+)


Thus
0Lif i = m + 1» m + 2,.. ., n
hi ~
Li j = 1, 2, . . . , m
where a,-,- is the voltage across the /th open-circuited inductor terminals in response to
the 1-volt source replacing C3. All capacitors and inductors are short- and open-
circuited, respectively, so that, again, analysis of the purely resistive network is sufficient.
If instead of the above, a unit step of current is applied in parallel with the y'th
inductor at t = 0, the voltage across the /th inductor at t = 0+ is

e_{L_i}(0+) = -L_i\dot{x}_i(0+) = -L_i a_{ij} x_j(0+)



Thus
ru
@ij i, j= m + m + 2, . . . , n
Li

where ri} is the voltage across the /th open-circuited terminal pair in response to a unit
current source applied to the /th terminal pair.
For the same current applied in parallel with the y'th inductor at t = 0, the current
through the /th capacitor at t = 0+ is

i_{C_i}(0+) = -C_i\dot{x}_i(0+) = -C_i a_{ij} x_j(0+)


Thus
fin i = 1, 2, . . . , m
Ci j = m + 1, m + 2, . . . , n

where fin is the current through the /th short-circuited terminal pair in response to a unit
current applied to the /th terminal pair.
By these simple steps, the A matrix for a general time-invariant, linear RLC network
can be found by examination of purely resistive networks. As a specific illustration of
the procedure, the reader can confirm that the A matrix for the circuit of Fig. 5.5-2b is

A = \begin{bmatrix} \dfrac{-2}{(R_1 + R_2)C} & 0 \\[2mm] 0 & \dfrac{-2R_1R_2}{(R_1 + R_2)L} \end{bmatrix}

Using the description of Eq. 5.5-2, an nth order ordinary differential


equation characterizing a system can be rewritten as a set of n first order
differential equations in terms of the state variables x.
Example 5.5-3. Determine a state variable representation for the system described by

\ddot{y} + a\dot{y} + by = v

where v(t) is the input and y(t) is the output.

A simulation diagram for this system is shown in Fig. 5.5-3. Convenient choices for
the state variables are the outputs of the integrators, i.e., y and \dot{y}. Let x_1 = y, x_2 = \dot{y},
so that

\dot{x}_1 = x_2

\dot{x}_2 = -bx_1 - ax_2 + v

Fig. 5.5-3
The vector matrix representation for this system is

\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} =
\begin{bmatrix} 0 & 1 \\ -b & -a \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} +
\begin{bmatrix} 0 \\ 1 \end{bmatrix} v

y = [1 \;\; 0] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + [0]v

Thus

A = \begin{bmatrix} 0 & 1 \\ -b & -a \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = [1 \;\; 0], \quad D = [0]
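In code form (an illustration, not part of the text), this representation can be built and its step response compared with the original second-order transfer function 1/(s² + as + b); the values a = 3, b = 2 are arbitrary.

```python
import numpy as np
from scipy import signal

a, b = 3.0, 2.0
A = np.array([[0.0, 1.0], [-b, -a]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

t = np.linspace(0, 10, 1001)
_, y_ss = signal.step(signal.StateSpace(A, B, C, D), T=t)
_, y_tf = signal.step(signal.TransferFunction([1.0], [1.0, a, b]), T=t)
print(np.max(np.abs(y_ss - y_tf)))   # essentially zero
```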

The generalization of the procedure of Example 5.5-3 to the case of an
nth order single input-single output, linear, constant coefficient system,
described by

(p^n + \alpha_{n-1}p^{n-1} + \cdots + \alpha_1 p + \alpha_0)y
= (\beta_n p^n + \beta_{n-1}p^{n-1} + \cdots + \beta_1 p + \beta_0)v     (5.5-3)

where p = d/dt, is now considered. A simulation diagram for this system
is shown in Fig. 5.5-4. The outputs of the integrators are chosen as the
state variables. The constants a_i and b_i must be determined in terms of the
\alpha's and \beta's in order to relate the diagram to Eq. 5.5-3.
a’s and /Ts in order to relate the diagram to Eq. 5.5-3.
By inspection of the simulation diagram,

y — x i + b0v
+ bkv, k < n (5.5-4)
= ~ (¥i + +*’*+ an-\xr) + bnv

Differentiating y once yields


py — xx + b0v
v

Fig. 5.5-4
Substituting for \dot{x}_1 from Eq. 5.5-4 gives

py = x_2 + b_1 v + b_0 pv     (5.5-5)

Following this procedure, the second and higher derivatives of y are given
by

p^2 y = \dot{x}_2 + b_1 pv + b_0 p^2 v = x_3 + b_2 v + b_1 pv + b_0 p^2 v
. . . . . . . . .
p^{n-1} y = x_n + b_{n-1}v + b_{n-2}pv + \cdots + b_0 p^{n-1}v     (5.5-6)
p^n y = -(a_0 x_1 + a_1 x_2 + \cdots + a_{n-1}x_n) + b_n v + b_{n-1}pv + \cdots + b_0 p^n v

Substituting for y, py, . . . , p^{n-1}y from Eqs. 5.5-4, 5.5-5, and 5.5-6 into
Eq. 5.5-3, and comparing the result with the expression for p^n y as given by
Eq. 5.5-6, the expressions for the a_i and b_i are given by

a_i = \alpha_i     (5.5-7)

and

b_0 = \beta_n
b_1 = \beta_{n-1} - \alpha_{n-1}b_0
b_2 = \beta_{n-2} - \alpha_{n-1}b_1 - \alpha_{n-2}b_0     (5.5-8)
. . . . . . . . .
b_n = \beta_0 - \alpha_{n-1}b_{n-1} - \alpha_{n-2}b_{n-2} - \cdots - \alpha_0 b_0

Equation 5.5-8 is a convenient, if not explicit, form of an expression for the
b_i's. For a given numerical case, the b's can be found by successive
substitution. Since Eq. 5.5-8 indicates that the \beta's can be written as

\begin{bmatrix} \beta_n \\ \beta_{n-1} \\ \beta_{n-2} \\ \vdots \\ \beta_0 \end{bmatrix} =
\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
\alpha_{n-1} & 1 & 0 & \cdots & 0 \\
\alpha_{n-2} & \alpha_{n-1} & 1 & \cdots & 0 \\
\vdots & & & & \vdots \\
\alpha_0 & \alpha_1 & \cdots & \alpha_{n-1} & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}

the b's can also be determined by premultiplying both sides of this
expression by the inverse of the coefficient matrix. As a consequence of
these expressions, one form of the matrices A, B, C, and D for a system

described by Eq. 5.5-3 is given by

0 1 0 • • •
0

0 0 1 • • •
0
A = (5.5-9)
• • • • • • • • • • • •
1

— — a2 • • •
— a0 ai

b0 1 0 0 O' ftn

ftn-1
bi an-1 1 0 0
ft n—2
b2 ^rc—2 ^n—1 1 0
D
=
• • •
B
• • •

• • •

• • • a n—1 ftc
_ ^0 ai

• • •
[10 0 0]
o

II
o

Pn

These state equations characterize the system in the so-called standardform.


Example 5.5-4. Represent

y + 3i) + Ay + y = 2v + 3v + v + 2v

in standard form.
From Eq. 5.5-8,

b0 = fin = 2
bi = fin—i - v-n-ibo = 3 — 3(2) = -3
b 2 = fin-2 — *n-ibi — a n_2b0 = 1 — 3(—3) — 4(2) = 2
bo = fin-2 — *n-ib2 — an_2^i — a„_360 = 2 — 3(2) — 4(—3) — 1(2) = 6

Thus Eqs. 5.5-2 and 5.5-9 give as a vector matrix representation for this system

~xf ' 0 1 0~ ~xf "-3"


x2 = 0 0 1 x2 + 2
_x3_ -1 -4 — 3_ _x3_ 6_

Xi

y = [1 0 0] x2 + 2y
xo

The complete simulation diagram is shown in Fig. 5.5-5.


336 State Variables and Linear Continuous Systems

Fig. 5.5-5

State Equations—Partial Fractions Technique

An alternative form for the state equations can be obtained by a partial


fraction expansion of the transfer function of the system. Consider a
single input-single output system whose transfer function is given by H(s).
Assume that the transfer function can be formed as the ratio of two poly¬
nomials as N(s)/D(s), where the order of N(s) is at most equal to the order
of D(s). If it is assumed that D(s) has only distinct zeros, then D(s) can be
written as

D(s) = dn(s - A^O — Aa) • • • (j — A„)

where A* is a characteristic value of the system. The transfer function H(s)


can then be expanded into the form

H(s) = d0 + + •'• + — (5.5-10)


S — A1 S — A2 S —

where d0 = lim H(s)


s—► 00

and

N(s)
Ci = (s - A,.)
D(s) s=A,

The output T(s) = H(s) V(s) can be expressed as

n c
Y(s) = d0V(s) + 2 —s- K(s) (5.5-11)
s Aj 2=i —
Sec. 5.5 Matrix Representation of Linear State Equations 337

Consequently, the simulation diagram of Fig. 5.5-6 can be drawn, where


the state variable xi satisfies the first order differential equation

*i - Kxi = v
and
n

y = 2 cixi + do V
2=1

The state equations are then

x-, «1 1

1
o
Xc
0 • • • 1
0 22 • • • 0 •
+ V

.


0 0 • • •
1_
s

Xn _Xn_ 1
(5.5-12)
Xi

Xo

y = [ci f-. + d0v

Xn

Fig. 5.5-6
338 State Variables and Linear Continuous Systems

or
x = Ax + vl
(5,5-13)
y = (c, x) + d0v

This method of determining a state variable representation of the system


leads to state equations in the normal form of Eq. 4.7-21.

Partial Fractions Expansion—Repeated Roots

For the case where the roots of the denominator polynomial D{s) are
not distinct, the partial fraction technique leads to a nondiagonal Jordan
canonical form. As an example of this, consider the case where D(s) is of
the form
D(s) = dn(s - Xy(s - A2)(j - A3) • • • 0 — K)

The partial fraction expansion of H(s) = N(s)/D(s) is then

cn 12 Ik
H(s) — d0 + + k-1 + * • • +
(s - Kf (* - 0 - ^i)
Co Co e„
+ -+-2— + • • • +
S — 2o S /i-: s — Xn
n
<T3- C„■
— d0 + 2 +
f1 (s - xyk~i+l) ' i“* s - A,
2 (5.5-14)

where
d0 = lim H(s)
S-> 00

1 v-1 kN(s)
clA = (s - AJ
U - 1)! ds*-1 D(s)J s=;.x
and
N(s)
Ci = (s - A,)
D(s) s=A,

The output T(s) = //(s)K(s) is then

Y(s) = d9V(.s) + 2cuX,(s) + 2 c, V(s) (5.5-15)


A=1 *= 2
where
ns) ns)
V(s) = and X((s) =
(s - xy-i+i - s - a.
However, S/s) can be written in terms of Xj+1(s) as

1
n(s) = n+i(s) 7 = 1.fc — 1 (5.5-16)
(s - Aj)
Sec. 5.5 Matrix Representation of Linear State Equations 339

As a consequence of Eqs. 5.5-15 and 5.5-16 the simulation diagram of


the system can be drawn as shown in Fig. 5.5-7. The state equations of
this system are
x, *1 0
Xc
1 0 0 x2 0
0 •
0 *1 1

0
Xr ** + 1 v
Xk+1
0 0 0 Aj 0 0 0
%-t-l
1
0 0 0 0 A2 0 0 • •


• • • • • • •

• •
0 0 0 0 0 0
Xn _*n _ 1
(5.5-17)
340 State Variables and Linear Continuous Systems

Xi

x2

Xr
y — [cn ci2 -Ik C«,
cj + d0v
Xk+1

or
x = Jx + bv
(5.5-18)
y = <c, x) + d0v

where J, b, and c are defined by the equivalence of Eqs. 5.5-17 and 5.5-18.
The partial fraction technique is appealing when the system has only one
input and one output. The result of this approach is an A matrix which is
either in diagonal form or a nondiagonal Jordan form. This form is
particularly convenient when the dynamic behavior of the system is to be
found. However, this approach presents certain difficulties. Most of
these difficulties can be overcome by the general technique of converting
the state equations in standard form (if they are available) to the normal
form.

Normal Form

The method of partial fraction expansion is useful when the state equa¬
tions of a single input-single output system are to be found. In this case
B, and C are vectors and D, v, and y are scalars. When the system is
multivariable, the method of partial fraction expansion can become un¬
wieldy. It then becomes desirable to be able to convert the state equations
in standard form to a form where the A matrix is either a diagonal matrix
or a more general Jordan matrix. This conversion can be accomplished
by means of a similarity transformation.
Consider the fixed system defined by the state equations in the standard
form
x = Ax + Bv
(5.5-19)
y = Cx + Dv

It is assumed that the characteristic values of A are distinct. If the linear


transformation x = Mq is introduced, where M is the modal matrix, then
Sec. 5.5 Matrix Representation of Linear State Equations 341

Eq. 5.5-19 can be written as


Mq = AMq -f Bv
(5.5-20)
y = CMq + Dv
Premultiplication of the first of these expressions by M_1 yields
q = M-1AMq + M^Bv
Since M is the modal matrix, the similarity transformation M_1AM results
in a diagonal Jordan matrix A. The principal diagonal elements of A are
the characteristic values A1? A2, . . . , An. Consequently,
q = Aq + B„v
(5.5-21)
y = C„q + D„V
where A = M-1AM, B„ = M *B, C„ = CM, and D„ = D. Equation
5.5-21 is known as the normal form for the state equations. In this form
the differential equations in terms of the state variables qu q2, . . . ,qn
are uncoupled. That is, they are of the form qt — Xiqi + f, where f is the
forcing function applied to the zth state variable. This procedure can be
applied when the standard form for the state equations is known, and as
such it is a more general approach than the partial fraction expansion of the
transfer function. It is also shown in later sections that the transfer function
approach can lead to serious misunderstanding about the system perform¬
ance.
In the case of repeated characteristic values, A may be replaced by a
nondiagonal Jordan matrix. It should be noted however, that the occur¬
rence of repeated roots in the characteristic equation |AI — A| = 0 does
not of necessity imply a nondiagonal Jordan matrix in the normal form.
Section 4.8 showed, for example, that, if the degeneracy of the character¬
istic matrix is equal to the order of the root, then sufficient characteristic
vectors can be obtained such that the modal matrix is nonsingular. The
matrix A can then be transformed by the similarity transformation M_1AM
into a diagonal matrix.
Example 5.5-5. Write the state equations for the dynamics of Fig. 5.5-8a in normal
form.
For the state variables indicated in Fig. 5.5-8a, the standard form of the state equations
is

x +

y = [1 0]x + 2v
The modal matrices M and M-1 associated with the A matrix are

" 1 1 " 2 1
M = and M_1 =
-1 —2 _ -1 -1_
342 State Variables and Linear Continuous Systems

(a)

(b)

(c)
Fig. 5.5-8

The normal form matrices are


0 “

A = M XAM =
2

B = M B = n

Cn = CM = [1

Dn — d o — 2
Sec. 5.5 Matrix Representation of Linear State Equations 343

The normal form for the state equations is

q = Aq + bnv
y = <cq> + d0v

where q = M_1x, or qx = 2a?! + x2, q2 = —xx — x2. The block diagram for these
equations is shown in Fig. 5.5-86.
The transfer function of this single input-single output system is

2s2 + Ss + 11 2s2 + Ss T 11
H(s) =
s2 + 3s + 2 (5 + 1)(^ + 2)
The partial fraction expansion of H(s) is then

d0 = lim H(s) = 2
s—► 00

c, = H(s)(s + Dl,..! = 5
c2 = H(s)(s + 2)|»__2 = —3
Consequently
5 3
H(s) = 2 +
s + 1 s + 2
The state equations of this system are

"-1 0“ ~r
X = x +
0 -2 1
y — [5 — 3]x + 2v

This form is represented in Fig. 5.5-8c.


Note that the bt and of the partial fraction expansion are identical with the c* and
bi, respectively, of the normal form. This interchange of input and output gain does not
affect the validity of either approach. The important factor is the overall gain 6jC,-.
Certainly the xt of the partial fraction expansion could be redefined so that the two
methods would be identical. However, if this is done, the partial fraction expansion
technique for repeated poles of the transfer function does not produce a Jordan matrix.
This point is illustrated in Problem 5.13.

Example 5.5-6. Write the state equation for the system of Fig. 5.2-13 in normal form.
For the state variables indicated in Fig. 5.2-13, xz is already in normal form. Thus
only xx and x2 need be considered. Then

" 0 r xx "0 v1
+
-2 -3 x2 1 v2

pi
y 2 = [i 0]
L^2_
or

c = [1 0], D = [0]

From Example 5.5-5,

“ 1 r 2 r "-1 O'
M = , M-1 = , A =
-1 — 2_ -1 -1 0 -2_
344 State Variables and Linear Continuous Systems

Hence

B = MB = Cn = CM = [1 1]

The diagram for these equations is shown in Fig. 5.5-9.

5.6 MODE INTERPRETATION—A GEOMETRICAL CONCEPT2

In this section a geometrical interpretation is applied to linear fixed


systems. The principal advantages of the approach are its conceptual
simplicity, and that it offers the engineer an intuitive “feel” for the nature
of solutions of linear systems. The importance of this feel or insight is not
to be overlooked.
As an introduction to the mode concept, consider the differential equa¬
tion
y + + 2y = fit)

Taking the Laplace transform,

Y(s) = F(s) , sy(0) + 2/(0) + 3y(0)


(s + l)(s + 2) (s + l)(s + 2)
Hence

F(s)
2/(0 = se -l
+ [22/(0) + 2/(0)]€-‘ - [2/(0) + 2/(0)]<T2'
_fs T- l)fs + 2T
Sec. 5.6 Mode Interpretation—A Geometrical Concept 345

If the terms of the response involving <rl and e~2t are called “modes” of
the system, then the solution y(t) may or may not exhibit these modes,
depending upon the initial conditions and the zeros of F(s). For example,
if the initial conditions are such that 2y(0) + 2/(0) = 0, then the initial
condition response does not exhibit the mode If, in addition, F(s)
has a zero at s = — 1, then the complete solution for y(t) does not exhibit
the mode The mode e~l is completely suppressed. Consequently, if
the initial conditions are such that a zero of the response transform occurs
at the same point as does a pole of the transfer function of the system, and
if a zero of the forcing function transform also occurs there, then the mode
corresponding to that pole does not appear in the system response. This
pole-zero cancellation, although somewhat obvious, leads to the general
concepts which are to be discussed.

Linear Fixed Homogeneous Systems

Assume for simplicity that a system has distinct characteristic roots


(eigenvalues) 2l5 A2, . . . , An. To each characteristic root 'ki there is a
characteristic vector (eigenvector) u^. Let the characteristic vectors be
normalized, so that |uj2 = 1. (This is equivalent to setting the length of
each of the characteristic vectors equal to unity.) For the linear unforced
system
x = Ax (5.6-1)

the characteristic vectors are defined by

Au* = ^uz, <uf,uf)=l (/=1, ...,») (5.6-2)

Since the characteristic roots are distinct, the characteristic vectors are
linearly independent. Therefore x(t) can be uniquely expressed as a linear
combination of these characteristic vectors:
n

X(() = 2 ai(0ui (5.6-3)


1=1

The general form for the oq(?) is given by

*<(0 = cd,t (5.6-4)

where the c/s are constants. Thus

X(t) = 2 ciei,tUt. (5.6-5)


1=1

Evaluating Eq. 5.6-5 at t = 0 yields


n
x(0) = 2 C{u; (5.6-6)
i=l
346 State Variables and Linear Continuous Systems

The constants c{ can be determined by utilizing the reciprocal basis r*


defined by the relation
(Ti> Uj) = da (i,j = 1, ...,«)

By forming the scalar product of both sides of Eq. 5.6-6 with the con¬
stant ci is found to be
c, = <r„ x(0)> (5.6-7)

Therefore the initial condition response of a linear fixed system can be


written as

x(0 = 2 <r„ x(0))e'ii,ui (5.6-8)


i=1 V

The scalar product (ri? x(0)) represents the magnitude of excitation of the
ith mode of the system due to the initial conditions. If the initial conditions
initially lie along the zth characteristic vector, then only the ith mode is
excited. The scalar products (r3, x(0)), where i ^ j, are identically zero.
Therefore, for a linear, fixed, unforced system with distinct characteristic
values, the initial condition response is given by a linear weighted sum of
the modes where is a characteristic root of the A matrix.
Example 5.6-1. To illustrate the use of the mode expansion technique analyze the
system considered at the beginning of this section.
For this system and dC OC ^ j OC “ OC 2 j

The characteristic vectors and a reciprocal basis are

1/V5'
u2 =
-2/V5
and
1 <N

<
1_

>

1
<N

1
_1

j r2 —
1
>
1

_
1

Thus the solution for the initial condition response is given by

x(o = 2 eXitvLi
i=1

= llV2y(0) + V2y(0) ]e-'u,

- [V5 y(0) + V5 y(P)h~z‘ u2

Note that, if the initial condition vector is set equal to one of the characteristic vectors
(or any factor times the characteristic vector), then the scalar product (r*, x(0)> vanishes
for all but the component associated with that characteristic vector. For example, if the
initial conditions are such that y{0) = 1,2/(0) = — 1, then the initial condition vector
lies along ux and only the mode involving is excited.
Sec. 5.6 Mode Interpretation—A Geometrical Concept 347

Linear, Fixed, Forced Systems

So far, only the homogeneous case case been discussed. For the case in
which a forcing function is applied to the system, the same general prin¬
ciples apply if the forcing function is first decomposed among the
vectors as
n
Bv = f(f) = 2//0U, (5.6-9)
i=1

where ft(t) = (r*, f(t)> = (ri? Bv(t)). Utilizing the convolution integral for
linear fixed systems, the general expression for x(t), assuming distinct
characteristic values, is then given by

x(0 = 2 (ij, x(0))e;l*'uj +[ 2 (rs> Bv(r)>€A‘<<-r)ui dr (5.6-10)


i= l JO i—1

The importance of Eq. 5-6.10 lies in that the effect of the forcing function
on each mode is considered independently. The amount of excitation of the
zth mode due to the forcing function is given by

| Vi, Bv(r))e'l<(<-'>ui dr
Jo

If the forcing function is so selected that it always lies along the direction
of one of the characteristic vectors, then only one mode is excited by the
forcing function. This situation occurs in circuits that have symmetrical
or balanced properties. In these circuits, the modes are referred to as
symmetrical or antisymmetrical, such that a “symmetrical” forcing func¬
tion excites only the symmetrical modes.3

Complex Characteristic Roots

If the coefficient matrix possesses a complex characteristic root Ax, then


A2 = is also a complex characteristic root. The characteristic vectors
ux and u2 are also complex conjugates, such that u2 = uA*. The following
relationships then exist between the characteristic vectors and the recip¬
rocal basis vectors:

— al + jlh 22 — &1 jPi


“i = + A" u2 = < - >l" (5.6-11)
2rj = r/ + jti 2r2 = r/ - yr,"

<r/, »i'> = 1 (r/', u/> = 0


(5.6-12)
W, O = 0 <rr, «i"> = 1
348 State Variables and Linear Continuous Systems

The normalization condition is given by

«, <) + «, u/> = 1 (5.6-13)

By utilizing these expressions, the unforced solution for a system with k


pairs of complex characteristic roots is given by
k

x(0 = 2 e<M{lXr/> x(0)) cos + (r/, x(0)) sin


i=l

+ [<r/', x(0)> cos Pit - <r/, x(0)) sin /VK'} (5.6-14)


The amplitude and phase of each oscillatory mode depends on the initial
conditions. If the initial conditions are equal to u/, then from Eqs. 5.6-12
and 5.6-14 the solution is \
x(t) = eai*[(cos Pit)ui — (sin pxt)ii/'] (Mode 1) (5.6-15)
If the initial conditions are equal to u/', then similarly the solution is

x(t) = eai<[(sin P±t)u/ + (cos pit)u/'] (Mode 2) (5.6-16)


Note that, for a complex characteristic root, the motion in phase space
takes the form of an exponential spiral in the u/, u/' plane.
Example 5.6-2. Assume that the system under investigation has an A matrix given by

Determine the initial condition response of each of its modes.


The characteristic roots of this matrix are
— —1 +j
A2 = -1 -j

The characteristic vectors are found by successively substituting the A/s into Adj [Al -A].
Since
"A + 2 1
Adj [AI - A] =
-2 A

the normalized characteristic vectors are given by


1 1 +/ 1 -/
U, = —7= Uo = —7=
V6 -2 V6 -2
Notice that u2 = u^.
The reciprocal basis is found from Eq. 5.6-12 as

r/ = V/- 0
6 V6
T
The complete solution is then

x(/) = e-‘{[<r/, x(0)> cos t + <r/', x(0)> sin /]u/


+ [(r/', x(0)> cos t — <r/, x(0)> sin du/'}
Sec. 5.7 Controllability and Observability 349

If the initial conditions are so chosen that x(0) = u/, then only the mode

x(0 = e_t[(cos Ouf — (sin/)u1//] (Mode 1)

is excited. If the initial conditions are so chosen that x(0) = u/', then the mode

x(r) = e~'[(sin 0ui' + (cos 0ui"] (Mode 2)

is excited. The difference between these two modes is the amplitude and phase of the
damped oscillations.

The mode expansion is a geometric representation of the diagonalized


form of the A matrix. This representation shows that, for distinct charac¬
teristic roots, the modes of the system are uncoupled and are independently
excited by the initial conditions and the input forcing function. The
amount of excitation is measured by the scalar product of the reciprocal
basis and the vector in question (initial conditions or input forcing func¬
tion). For systems which do not have distinct characteristic roots, the
modes are not necessarily uncoupled. The Jordan canonical form is about
the best that can be accomplished in the way of diagonalizing the A matrix
in this case.

5.7 CONTROLLABILITY AND OBSERVABILITY

The increasing emphasis on linear multivariable control systems has


caused a thorough re-examination of the intuitive concepts which have
been handed down from the early studies on single input-single output
systems. This is due to the large increase in design effort which must be
expended in going from the single variable control system to the multi-
variable system. Most of these early concepts can be extended to multi-
variable problems by the use of matrix techniques; however, there are
some important tools of analysis and synthesis which must be examined
with greater care. Transform techniques and transfer functions have
been widely used, since they enable the engineer to work with algebraic
equations rather than differential equations. However, when single
variable subsystems are connected together to form multivariable systems,
the transfer functions involved must be examined with care lest some
degree of freedom be lost in the combination. In this section, some of the
problems connected with multivariable systems are outlined. For a more
complete discussion the reader is referred to References 4 to 9.
Consider the set of system equations

x = Ax + Bv
(5.7-1)
y = Cx + Dv
350 State Variables and Linear Continuous Systems

It is assumed that this system has n state variables, m inputs, p outputs,


and distinct characteristic values. In terms of the mode expansion tech¬
nique of Section 5.6, the solution to Eq. 5.7-1 is given by (see Eq. 5.6-10)
n 't n
x(0 = 2 <r„ x(0)>€Ai<ui + 2 <r„ dr (5.7-2)
JO i= 1

If the forcing function is expressed as the sum of its components, i.e.,

Bv = b1v1 + b2v2 + b3v3 + • * • + bmvm (5.7-3)


where

then the contribution to x(t) due to the forcing function can be expressed
as
t 72 LYl
2 2 <ri> bJ.)uXT)€'iilt“,)Ui dr
Jo 1=1 0=1

If the scalar product (iy, b;> is zero for some mode for ally, then the input
is not coupled to that mode, and, regardless of what the forcing function is,
there is no way for the input to excite or control this mode. Obviously,
a criterion for complete controllability of a linear time-invariant system
is that the scalar products (ly, b3-) do not vanish for ally.f
A system which is not observable has dynamic modes of behavior which
cannot be ascertained from measurements of the available outputs.8
Considering only the initial condition excitation of the modes, the output y
is given by Eqs. 5.7-1 and 5.7-2 as
n

y = c2 \ri. x(0))J\ (5.7-4)


4=1

assuming that A has distinct 2’s. A condition that the ith mode disappear
in all the outputs is
(c u,-) = 0 j,for all j (5.7-5)
t This is a necessary and sufficient condition if A has distinct A’s. If A has repeated
/Ts and is diagonalizable, then this is necessary but not sufficient. In this case or if A
can only be transformed to a more general Jordan form, then one should consider the
more basic requirement for controllability, which is that there be an input \(t),
0 < t < T < oo, such that x(0) can be forced to x(T) = 0. This leads to the necessary
and sufficient condition that the column vectors of the matrix [B, AB, . . . , A(n-1)B]
span the state space of the system.
Sec. 5.7 Controllability and Observability 351

where c5- is a column vector comprised of the elements of theyth row of C.


This condition extends to the case where a forcing function is applied.!
If the system equations have distinct A’s and are written in the normal
form
q = Aq + Bnv
(5.7-6)
y = cnq + Dnv

then each state variable qt of the system represents a different mode of


behavior. In this form, the conditions for controllability and observa¬
bility become quite clear. Controllability is a function of the coupling
between the inputs and the various modes of the system. A particular
mode (or state variable) cannot be controlled if the input is not coupled
into this mode. This would be the case if there were a zero row in the
matrix.
On the other hand, a particular mode of behavior would not be observed
in any of the outputs of the system if there were no coupling between that
mode and any of the outputs. This would be the case if there were any
zero columns of the Cn matrix. Therefore, if the system equations can be
written in the normal form with distinct A’s, the conditions for complete
controllability and observability are:

Controllability: no z^ro rows of Bn


Observability: no zero columns of Cn

Generally speaking, a system can always be divided into four subsystems


which display these concepts. This is shown in Fig. 5.7-1. The subsystems
have the following characteristics:

System S*: completely controllable and completely observable


System Sc: completely controllable but unobservable
System S°: uncontrollable but completely observable
System Sf: uncontrollable and unobservable

The requirement that a system contain only subsystems S* is that the


entire system is controllable and observable. It should be pointed out that
disregard of subsystems Sc, S°, and Sf may lead the system designer into a
position where the overall system is unstable, while subsystem S* is
perfectly well-behaved. If, for example, subsystem S° contains unstable
modes, then excitation of these modes by nonzero initial conditions yields

t If A has repeated A’s, the basic requirement for observability is that, for some
T > 0 and all initial states x(0), knowledge of A, C and y(t), 0 < t < T, suffices to
determine x(0). This leads to the necessary and sufficient condition for observability
that the columns of [C*r, A*TC*T, . . . , (A*r)(n-1,C*7'] span the system state space.
352 State Variables and Linear Continuous Systems

Fig. 5.7-1

an unstable output. Similarly, even though subsystem Sc is unobservable,


the required forcing function to subsystem S* may cause the state variables
of Sc to become excessively large, damaging the system. The same type of
reasoning holds true for subsystem Sf.

Composite Systems

When a system is comprised of some or all of the components S*, Sc, S°,
or Sf, the overall system may or may not exhibit the properties of these
subsystems, depending upon the connections between the subsystems.
Example 5.7-1.

System 1: xx — —xx + vl System 2: x2 = —2x2 + v2

Vi = X1 +

Now let the output of system 1 be connected to the input of system 2, and let the output
of the system be taken as y = x2 — yx. This is shown in Fig. 5.7-2. Determine the
controllability and observability characteristics of the composite system.

Fig. 5.7-2
Sec. 5.7 Controllability and Observability 353

The state variable formulation of the composite system is

x1
+
x2

y = [—l Vi

Using the mode expansion technique, Ax = — 1, A2 = —2, and

1/V2
u, =
_1/V2_

1
r2 =
1
For this system,
(r„b> = V 2 <r2, b) = 0
<c, ux> = 0 (c, u2> = -1
Therefore the mode e~2tu2 is not controllable, and the mode is not observable.
Thus, even though the individual subsystems are both observable and controllable, the
overall system is neither observable nor controllable.
An interesting point arises if transfer functions are utilized. The transfer function of
the first system is (s 4- 2 )/(s + 1). The transfer function of the second system is l/(s + 2).
Therefore the overall transfer function from vx to x2 is
X2(s) _ 1
V2{s) s + 1

The overall transfer function is then


_ F(s) s + 2 1
--+- = -1
Vl(s) s + 1 s + 1

This confirms the previous analysis that the mode e~2'u2 cannot be excited and the mode
cannot be observed. This analysis by means of transfer functions shows that the
use of a transfer function for an overall system may mask some of the modes of the
subsystems.

Gilbert has developed a theorem which lists some of the important


controllability and observability aspects of a general linear feedback
system.8

Theorem 5.7-1. Let the linear subsystems Sa and Sb be connected in


the feedback arrangement of Fig. 5.7-3. The cascade connection SaSb
is denoted Sc, and the cascade connection SbSa is denoted S0.

(a) The order n of the system is equal to the sum of the orders of Sa and
Sb, i.e., n = na + nb.
(b) A necessary and sufficient condition that the feedback system be
controllable (observable) is that SC(S0) be controllable (observable).
354 State Variables and Linear Continuous Systems

ya = y

yb V6
sb

Fig. 5.7-3

(c) A necessary but not sufficient condition that the feedback system be
controllable (observable) is that both Sa and Sb be controllable (observable).
(<d) If Sa and Sb are both controllable (observable), any uncontrollable
(unobservable) coordinates of the feedback system are uncontrollable
(unobservable) coordinates of SC(S0) and originate in Sb.

The importance of this theorem lies in the fact that controllability and
observability can be determined from the individual open-loop subsystems
Sa and Sb. This is a great aid in analysis.

Transfer Functions

The use of transfer functions, or transfer function matrices in the case of


multivariable systems, is widespread among practicing engineers, for two
reasons. One is that transfer functions allow the use of algebraic equations
rather than differential equations. The second is due to the smaller size of
the transfer function matrix. For example, if a system has n state variables,
m < n inputs, and p < n outputs, then the size of the A matrix is (n x n),
while the size of the H(s) matrix is (m x /?)<(« x n). In using these
matrices however, there are two common errors. The first error is the
failure to realize that the transfer function of a composite system may mask
the modes of the subsystems. This was illustrated in Example 5.7-1, where
the pole 5 = —2 of system 2 was canceled by the zero of system 1.
The second error is that, although the elements of the transfer function
matrix may indicate all the modes of the subsystems, it may fail to indicate
the modes of the composite system. A transfer function matrix represents
only the observable and controllable part of the system, namely S'*.f A
system may consist of subsystems which individually are observable and
controllable, but the composite system may be neither observable nor
controllable. This was illustrated in Example 5.7-1. Thus the order of the
system may be underestimated, and an incorrect conclusion about the
stability of the system may result.

t If A has repeated characteristic values, this may not even be true. For example,
consider such a case in which A is diagonalizable.
Sec. 5.7 Controllability and Observability 355

Example 5.7-2. Determine the transfer function matrix and the order of the system
described by
Vi + ayx = vx + v2
y-z + {a + b)y2 + aby2 = vx + avx + v2 a ^ b

Transforming these equations for zero initial conditions,

Vx(s) V2(s)
4
Y1(s) = - -i +
s + a s + a

Vx(s) V2(s)
Y2(s) = -^-r +
s + b (s + a)(s + b)
The transfer function matrix is

■ 1 1 “1 1 " “0 0 ”
5 + a s + a 1 1
H0) = = 0 7- + 1
1 1 b — a a — b
_s + b (s + a)(s + b)_ s + a s + b

The immediate, but erroneous, conclusion is that this matrix represents a second
order system. However, if a simulation of this system is made, the minimum number of
integrators required is three. This is shown in Fig. 5.7-4. This fact is evident from the
original differential equations.

To be sure of the order of a system, the use of a theorem by Gilbert


always gives the minimum order of the differential equations corresponding
to an H(s) with simple poles.

Theorem 5.7-2. Let H(s) be expanded into partial fractions as


m i/'

H(s) = 2 + D t5'7’7)
i—l S — At-

where
K- = lim (s — A*)H(s) and D = lim H(s)
S-+A,- s-» oo

It is assumed that the elements of H(s) have, at most, m finite simple poles,
where m < oo. The rank of the zth pole, ri9 is defined as the rank of the
356 State Variables and Linear Continuous Systems

matrix K^. Then the system cannot be expressed by a set of differential


equations of order less than
m

n = Xri
i=1

As an example of this theorem consider Example 5.7-2. Let = —a


and 22 = —b. Then r1 = 2, r2 = 1, and n = r1 + r2 = 3. The minimum
order of the system is three.

In conclusion it should be stated that, when dealing with multivariable


systems, controllability and observability play an important role in the
analysis and synthesis of such systems. When using transform techniques,
a firm grasp must be kept on the physical problem, lest the mathematical
manipulations obscure the true nature of the system.f

5.8 LINEAR FIXED SYSTEMS—THE STATE TRANSITION MATRIX

The homogeneous differential equation for a linear fixed system is given


in vector matrix form by
x = Ax (5.8-1)

The matrix A is a constant (n x n) coefficient matrix, and the vector x is an


(n x 1) column matrix consisting of the n state variables aq, x2, . . . , xn.
Similarly to the solution for scalar differential equations, the solution to
Eq. 5.8-1 is given by
x(0 = eA,‘-r)x(r) (5.8-2)
where the matrix <=A* is defined by Eq. 4.11-1 to be the infinite series
A 2^2 a 3^3
eA = I + At + — + — + ■■ • (5.8-3)
2! 3!
Equation 5.8-2 can be substituted into Eq. 5.8-1 to verify that it is a
solution. Note that, at / = r, the matrix eA(*_T) is equal to I, the identity
matrix. Therefore the boundary conditions are satisfied, i.e.,

x(t) = €AOx(t) = Ix(t) = x(t)

The matrix 4>(/) = eAf is called the state transition matrix of the system
described by Eq. 5.8-1. Mathematicians prefer the term fundamental
matrix.% The nomenclature state transition matrix is more descriptive of
t For complete discussion of this problem, including the case where the characteristic
values are not distinct, the reader is referred to References 5, 8, and 10.
± For a rigorous discussion of fundamental matrices and their role in the solution of
differential equations, see Reference 11 or 12.
Sec. 5.8 Linear Fixed Systems—The State Transition Matrix 357

the use of this matrix and is generally preferred by the engineering


community.
The state transition matrix “describes” the motion of the tip of the state
vector from its initial position in state space, and as such it describes the
transition of the state of the system. Since the vector x(t) describes all the
time functions x±(t), x2(t),. . . , xn(t), a great deal of information is available
in this vector. Necessarily, the computation of the state transition matrix
is greater than usually encountered in solving for the dependent variable
in a linear differential equation. However, the additional information
available enables the system designer to utilize more sophisticated design
techniques.
The calculation of the state transition matrix cj>(7) may be performed in
several different ways. The principal methods are Sylvester’s theorem and
the Cayley-Hamilton technique, which were both considered in Section
4.10, and the infinite series method, the frequency-domain method, and
the transfer function method. Another method is illustrated in Example
5.9-2 and Problem 5.26.

Infinite Series Method

From the definition of eAt (Eq. 5.8-3), the state transition matrix 4>(7)
can be calculated by the infinite series
\2t2 A 3t3
<K0 = I + A? + — + — + • • •
2! 3!
Unless disappears for some small value of k, this method is the most
laborious. Once the summation is performed, the closed form of each
series for each element of the<J>(0 matrix must be found. This is generally
not a simple task, and, unless the order of cj>(?) is sufficiently low or the
analyst is sufficiently clever, the task may be insurmountable. A simple
example points out the difficulty involved.
Example 5.8-1. Find the state transition matrix <f>(0 for the matrix

'0 - 2'

A =
1 -3

The powers of A can be found by successive multiplication by A, so that

-2 6 '6 14'
A2 = and A3 =
-3 7 7 -15

Therefore, from Eq. 5.8-3, 4>(/) is given by

"1 0“ "0 -2" "-2 6“ t2 '6 14“


<M') = + / + — + +
7_ 2! 3!
7
>n
r-

_0 1_ _1 —3_ -3
i
358 State Variables and Linear Continuous Systems

Collecting terms yields

212 6r3 612 14/3


1 — — + — + ■ ■ ■ -21 -b +
2! 3! 2! 3!
4>(0 =
312 It3 It2 15r3
t-- + — + * * * 1 - 3t + — - +
2! 3! 2! 3!

By recognizing the infinite series for each element (this is the principal drawback of this
method),
r2e-J - e~2t 2(e~2t 6-01
<*>(') =
2e~2t

Frequency-Domain Method

If the Laplace transform of both sides of Eq. 5.8-1 is taken, the result is
sX(s) - x(0) = AX(j). Thus

X(s) = [si - At'xiO) (5.8-4)

Taking the inverse transform of both sides of Eq. 5.8-4 gives

x(0 = - AJ-^xCO) (5.8-5)

Comparing Eqs. 5.8-5 and 5.8-2, the conclusion is reached that

4>(0 = eA< = 2&~i[sl - A]-1 (5.8-6)

This method may be the most convenient to use for many problems. The
obvious difficulty is finding the inverse of [si — A].

Example 5.8-2. For the same A matrix as in the previous example, calculate <J>(0
using the frequency-domain approach.
Since
^ 2
[si ~ A]
-1 5 + 3

the inverse matrix [si — A]-1 is given by


s + 3 —2
1 s_
4>(s) = [si - A]-1
s2 + 3s + 2

where <E>(s) is the transform of <}>(0- T aking the inverse transform of <h(s) element by
element, the expression for is found to be the same as previously computed in
Example 5.8-1.
Sec. 5.8 Linear Fixed Systems—The State Transition Matrix 359

Transfer Function Method

The zth term x^t) of Eq. 5.8-2 can be written as the summation

*«(') = i 4<0*/0) (5.8-7)


3=1

assuming r = 0. The element </>^(7) can be determined by placing a unit


initial condition on state variable xj and zero initial conditions on all other
state variables. The time function xt(t) is then equal to </>tJ(0- In terms
of the simulation diagrams discussed in Section 5.2, this is equivalent to
placing a unit initial condition on the output of integrator j, and observing
the output of integrator i. However, placing a unit initial condition on the
output of an integrator is equivalent to placing a unit impulse on the input
to the same integrator. Therefore </>^(t) is the time response of the output
of integrator /, when a unit impulse is placed on the input to integrator j,
and all other integrators have zero initial conditions. Thus Otj{s) can be
interpreted as the transfer function from the input to a summer at the
input of integrator j to the output of integrator i. The collection of these
transfer functions forms <£(5) = [si — A]-1. The state transition matrix
in the frequency domain, O(s), can therefore be determined by an inspection
of the simulation diagram.

Example 5.8-3. The simulation diagram for the system described by the A matrix used
for the previous two examples is shown in Fig. 5.8-la. Determine

(a)

Integrator > Integrator


V + 2 . \ 1
1
s + 3
—i i—=► -2 irw l
360 State Variables and Linear Continuous Systems

With no loss in generality, the system can be redrawn in the frequency-domain form
shown in Fig. 5.8-16. All the transfer functions are then immediately evident.

Vs s + 3
#u0)
2 s2 + 3s + 2
s(s + 3)

-2
s(s + 3) __ -2
#12 (S)
“ 2 s2 + 3s + 2
+ s(s + 3)
1
s(s + 3) 1
#2lO)
2 52 4- 3s + 2
+ s(s + 3)
1
s + 3 _ s
#2 2Cf)
2 s2 + 3s + 2
s(s + 3)
Hence
s + 3 -2
1 s_
<£(s) =
s2 + 3s + 2

This matrix checks with ^(s) found in Example 5.8-2 by computation of [si — A]-1.
The calculation of <J>(7) is then simply the inverse transform of ^(s), element by element.

Linear Fixed Systems—The Complete Solution

The complete set of state variable equations for a linear fixed system is
given by
x = Ax + By
(5.8-8)
y = Cx + Dv

These equations can be interpreted in the following manner. A is the


essential matrix of the system, as the structure of this matrix decides the
nature of the state transition matrix. The nature of all solutions, whether
forced or unforced, depend upon this matrix. Clearly, if all the character¬
istic values of this matrix have negative real parts, then the solution to
Eq. 5.8-1 approaches zero as t approaches infinity. However, if one of the
characteristic values has a positive real part, then at least one of the state
variables becomes unbounded as t approaches infinity. If the characteristic
values which lie along the imaginary axis (zero real part) are simple,
4>(?) does not approach the null matrix as t approaches infinity, but it is
bounded.
Sec. 5.8 Linear Fixed Systems—The State Transition Matrix 361

B is a coupling matrix; the structure of this matrix determines how the


input is coupled to the various state variables. C is also a coupling matrix,
coupling the state variables to the output. Thus the first term of y(t) in
Eq. 5.8-8 represents the coupling of the state variables into the various
components of the output vector. D again is a coupling matrix, as it
directly couples the input vector to the output vector. The structure of
this matrix determines how the input forcing functions are distributed
among the various outputs. In most physical systems, D is a null or zero
matrix, so that the term Dv is usually zero.
The complete solution for x(t) and y(/) in Eq. 5.8-8 can be obtained by a
variety of approaches. For purposes of illustration, the method of varia¬
tion of parameters is used here. The method of adjoint systems is used
later to obtain the complete solution for the linear time-varying case.
The homogeneous differential equation x = Ax has the solution
xH(t) = 4>(7 — t)x(t) for r > r from Eq. 5.8-2. Analogous to Eq. 2.6-17,
the assumed particular solution is xP(7) — t)U(0x(t). The total
solution is

x(t) = xH(t) + Xp(t) =4>(/ - t)x(t) +4>0 - t)U(0x(t)

This is more conveniently written as

x(0 = 4>0 - r)[I + U(0]x(t) = <(>(< - r)z(t) (5.8-9)

where r is a constant and the vector z(t) is to be determined.


Substitution of Eq. 5.8-9 into Eq. 5.8-8 yields

[<t>0 - t) — AcJ>(? - t)]z(0 + <(>(/ - T)z(0 = Bv(0

However, the first term is a zero matrix, since cj>(7 — r) is a matrix whose
columns are solutions of Eq. 5.8-1. Thus

z(t) = ^(t - t)By(0 (5.8-10)

Integrating Eq. 5.8-10 from r to t is indicated by

z(t) — z(t) = | 4> x(2 — r)Bv(A) dX


From Eq. 5.8-9,

— r)x(t) = <|>“1(0)x(t) + 4> 2(2 — r)Bv(A) dX


JT

Finally, since
_ A(<—t) — A(A—t)
<(>(0) = I and <\>(t — r)4> (2 — t) = € 7,e
362 State Variables and Linear Continuous Systems

premultiplication by <)>(/ — r) yields

x(t) = 4>(t — t)x(t) j<\>(t — 2)Bv(2) dX (5.8-11)

The solution for y{t) follows by substituting Eq. 5.8-11 into Eq. 5.8-8. It is

y(0 = C<f>(« - t)x(t) + Pc<J>(1 - A)Bv(A) d/L + Dv(t) (5.8-12)

Equations 5.8-11 and 5.8-12 form the solutions to Eq. 5.8-8. The first term
of Eq. 5.8-11 represents the initial condition response of the system state
variables, while the second term represents the forced response. Note that
the second term of Eq. 5.8-11 is a convolution integral similar to that in
Eq. 1.6-2.
Example 5.8-4. Assume that the system whose A matrix is given by Example 5.8-1 is
subject to a unit step forcing function at t — 0 (see Fig. 5.8-la). Find the output
2/(0 = «i(0-
For this example,
'0 —2' '2e-* - e--2i 2(e~2t - e-*)'
A = , <*>(0 =
1 -3 €,-t' - e-2t 2e~2t -
O'
B = , C = [l 0], D = [0]
1
Therefore, from Eq. 5.8-12,

y(0 = C<|>(0x(0) + C<|\>(t - 2)B£/_1(2)dk


Jo
1-

-1

01l(O 012(0 *i(0) 0ll(/ “ 2) 012(/ 2) ~0 ~

dX
©
+

= [i 0]
_x2(0)_ Jo _ 1_

= 0nOK(O) + <f>n(t)x2(0) + 012(r - K) dX


'o
= (2e~t - e-2<K(0) + 2(e~2t - «r0*8(0) + 2[e~2^ - edX
Jo
= (2e~t - e-2T»i(0) + 2(e_2( - e^x^O) + 2e~{ - e"2* - 1 t >0

5.9 LINEAR TIME-VARYING SYSTEMS—


THE STATE TRANSITION MATRIX

For the case where the A matrix is not fixed but varies with time, the
homogeneous matrix differential equation of a linear system is

x = A(t)x (5.9-1)
Sec. 5.9 Linear Time-Varying Systems—The State Transition Matrix 363

It is indeed tempting to return to the scalar case and form an analogy


between the solution to the scalar differential equation

x = a(t)x (5.9-2)

considered in Section 2.5, and the solution to Eq. 5.9-1. The solution to
Eq. 5.9-2 is
x(t) = [exp b(t)]x(t) (5.9-3)

where r is some fixed time instant and

b(t) a(X) dX

The analogous solution for Eq. 5.9-1 would be


t*t

x(0 = exp A(A) dX x(t) (5.9-4)


J T

However, if Eq. 5.9-4 is substituted into Eq. 5.9-1, it is seen that Eq. 5.9-4
is the correct solution if and only if

B(t) _ dB(t) B(t)


t —^ c (5.9-5)
dt dt
where

B(0 = A(2) dX

Unfortunately, Eq. 5.9-5 is not always valid. In fact, it is seldom valid.


Two obvious, but trivial, cases where it is valid are those in which A is a
constant matrix and those in which A is a diagonal matrix. For the former
case the solution is known, and for the latter case the state equations are
uncoupled, so that Eq. 5.9-3 can be used for each of the x/s.
It can be shown that the requirement of Eq. 5.9-5 can be written in terms
of A as the commutativity condition13

A(t1)A(t2) = A(t2)A(t1) for all tx and t2 (5.9-6)

If Eq. 5.9-6 is satisfied, then Eq. 5.9-4 is the solution to Eq. 5.9-1 and the
state transition matrix is given by

<J>(t, t) = exp 1 A(X) dX (5.9-7)

The Matrizant

For the general case, where the commutativity condition of Eq. 5.9-6 is
not satisfied, the state transition matrix is not given by Eq. 5.9-7. However,
364 State Variables and Linear Continuous Systems

the solution to Eq. 5.9-1 can be obtained by a method known as the Peano-
Baker method of integration.14 This method follows.
Let the conditions x(r) be given. Then, by integrating Eq. 5.9-1, the
integral equation

x(t) = x(r) + f A(2)x(2) dX (5.9-8)

is obtained. This equation is sometimes called a vector Volterra integral


equation. It can be solved by repeated substitution of the right side of the
integral equation into the integral for x. For example, the first iteration is

x(() = x(r) + A(2) x(r) + A(s)x(s) ds dX (5.9-9)

The expressions can be simplified somewhat by the introduction of the


integral operator Q, where Q is defined by

Using this operator notation, Eq. 5.9-8 can be written as

x(0 = x(r) + Q(A)x(2) (5.9-10)

If the process shown in Eq. 5.9-9 is continued, then x(t) is obtained as the
Neumann series

x(0 = [I + Q(A) + Q(AQ(A)) + Q(AQ(AQ(A))) + ■ • -]x(r)


(5.9-11)

The first term in the parentheses is I, the unit matrix. The second term is
the integral of A between the limits r and t. The third term is found by
premultiplying Q(A) by A and then integrating this product between the
limits r and t. The other terms are found in like manner. If the elements
of A remain bounded between the limits of integration, then this series is
absolutely and uniformly convergent. This series defines a square matrix
G(A) which is called the matrizant.

G(A) = I + Q(A) + Q(AQ(A)) + Q(AQ(AQ(A))) + • ■ •


(5.9-12)

If both sides of Eq. 5.9-12 are differentiated with respect to t, the funda¬
mental property of the matrizant

- G(A) = AG(A) (5.9-13)


dt
Sec. 5.9 Linear Time-Varying Systems—The State Transition Matrix 365

is obtained. Therefore G(A) is indeed the solution to Eq. 5.9-1 and, as such
represents the desired state transition matrix for a time-varying system.
Thus
4>(*. r) = G(A) (5.9-14)
or
x(r) = G(A)x(t) = 4>(r, t)x(t) (5.9-15)

Clearly, if A is a constant matrix, then

(t~ T) A 2 a--)3 AS A 0-r)


4>(h t) = I + (t — r)A 4- A + A + €
2! 3!
Example 5.9-1. Find <f>(t, r) for the first order system x — —tx.
Obviously, the answer is <f>(t, r) = exp [(r2 — f2)/2]. This could be obtained by direct
integration of the given equation, or by recognizing that the commutant condition
(Eq. 5.9-6) holds and therefore Eq. 5.9-7 can be applied.
Using Eq. 5.9-12, the matrizant is seen to be

it2 - T
G(A) = <fi(t, r) = 1 - —

Recognizing this as the form for the infinite series e-2, the state transition matrix is then
given by

<Kt, r) = exp

This simple example points out the difficulty in using the matrizant
approach. Unless the series (Eq. 5.9-12) converges rapidly, the com¬
putation becomes quite lengthy.
An interesting alternative solution was proposed by Kinariwala.13 The
approach is to decompose A(t) into two matrices, A0(t) and Ax(t), where
A0(f) satisfies the commutant condition of Eq. 5.9-6. Thus

A(0 — A0(0 + A i(t) (5.9-16)


or
x = [A0(0 + Ax(0]x (5.9-17)

Ax(t) is interpreted as a perturbation upon A0(/).


The unperturbed equation
x0 = A0(/)x0 (5.9-18)
has the solution
x0(0 = 4>o(^, t)x(t) (5.9-19)
where
- rt
4 >oO, t) = exp A0(2) dX
_Jr
366 State Variables and Linear Continuous Systems

The solution to Eq. 5.9-17 is assumed to be of the same form as Eq. 5.9-19,
but with successive corrections added to take the perturbations into ac¬
count.
The perturbations Ai(0x(0 are equivalent to the forcing function term
Bv of the previous section. By direct analogy, using the superposition
theorem, the solution for x(t) is then given by
rt
x(0 = 4>o(*> t)x(t) + <J>o0, ^)A1(A)x(A) dX (5.9-20)
J T

Again, this is a Yolterra equation of the same form as Eq. 5.9-8. Using
the same process of iteration as previously described, the Neumann series
for<|>(/, r) is found to be

4>(A T) = [I + Q(<t>oAi) + Q(<t>0AiQ(<|>oAi)) + * • ']<t>o(A T)


(5.9-21)
This expression, in conjunction with Eq. 5.9-15, yields x(t).
If Ax(?) is relatively small, then only the first few terms of Eq. 5.9-21
are necessary for an adequate approximation. A good first choice for
A0(t) reduces the number of terms required. However, such a choice is
often difficult to determine. Needless to say, the analytical solution of
time-varying differential equations is generally quite involved—so in¬
volved that the engineer invariably uses a computer to perform the task.
In many cases, the state transition matrix can be easily obtained by choos¬
ing a set of state variables that are not immediately obvious. It may well be
worth the trouble to determine if a set of state variables can be found such
that Eq. 5.9-7 can be applied. An example illustrates this point.
Example 5.9-2. The homogeneous differential equation of a system is given by y —
2ty — (2 — t2)y = 0. Determine <f>(7, t).
The simulation diagram for this system is shown in Fig. 5.9-1 a. The obvious choice of
state variables is = y and x2 = y. The resulting A (?) matix is
r
21
However, A(/1)A(/l2) ^ A(t2)A(t1). Therefore Eq. 5.9-7 cannot be applied.
An examination of the differential equation shows that it can be rewritten as
d
ij - 2ty - (2 - t2)y = y - — (ty) - y — t(y — ty) = 0
dt
If the state variables are now chosen to be zv = y, z2 = y — ty, then

Since

z‘i — + tz2
dt
Sec. 5.9 Linear Time-Varying Systems—The State Transition Matrix 367

(a)

(b)
Fig. 5.9-1

then 22 = ?! + tz2. The expression for zx = y can be obtained from the definition of z2.
Thus zx — y = z2 -f- tzx. The matrix differential equation is then

«1 t r
2_ _1 t_ _22_

The simulation diagram is shown in Fig. 5.9-1 b. The new A(t) matrix

t r
A (t) =
1 t

does obey the commutant condition, i.e., A(tx)A(t2) = A(t2)A(tx). Therefore Eq. 5.9-7
can be used.
The state transition matrix 4>(r, r) is given by

4>(f, r) = exp A(o) do = exp B


-Jt

where

t T

The matrix eB can be found by any one of the techniques previously discussed. However,
it is instructive to find eB by use of Eq. 4.10-20. Thus

eB = MeAM 1
368 State Variables and Linear Continuous Systems

The characteristic values of B are Ali2 = [(72 — r2)/2] ± (/ — r), and the modal matrices
are
-1 r "1 r
M = (/ — T) M_1 — ^
1 -i_ 2(7 - r) _1 -i_
Therefore

t2 - T2 0
i r exp + (7 “ r) T T
72 - T2
i -l
exp — (r — r) 1 -1
0

and

’cosh (/ — r) sinh (t — r)'


<|>(/, r) = e('2--2)/2 y
sinh (t — r) cosh (7 — r)

The matrix <j>(7, r) does satisfy the differential equation (dldt)[<\>(t, r)] = A(r)4>(7, t),
since
'0 1
-j 4>(T r) = *4>(T t) + <t>(7>T)
dt 1 0
and
7 1‘ ‘o r
A(/)4>(/, r) = <t>0, T) = t) + 4>0, t)
1 r 1 o

Although almost all the time-varying problems that the control engineer faces must be
solved by either an analog or digital computer, it is also true that careful prior inspection
of the system can greatly reduce the computation time. The obvious state variables may
not be the state variables that should be used to control the system, or used to simulate
the system. In many cases, combinations of the obvious state variables may be more
valuable. The foregoing example illustrates this point.

The state transition matrix can also be found by using the simulation
diagram method of Section 5.8. For a time-varying system, the ith state
variable xj(t) is given by Eq. 5.9-22, similarly to Eq. 5.8-7.

*,(0 = I tK(t) (5.9-22)


3=1

The term </>^(/, t) can be found by obtaining the transfer function O^(t, s)
between the input to integrator j and the output of integrator /. Note that
this is a time-varying transfer function (see Section 3.9) and as such may be
difficult to evaluate. If the system were actually simulated on an analog
computer, a unit initial condition placed on integrator j at time r1 would
produce the response <f>i:j(t, tj) at the output of integrator /. A series of runs
and a cross plotting of the results are necessary if r) for all r < r is to
be obtained. This point is discussed in detail in Sections 5.11 and 5.12.
Sec. 5.9 Linear Time-Varying Systems—The State Transition Matrix 369

Properties of the State Transition Matrix

Property 1: By definition,
4»Oo, to) = I (5.9-23)

Property 2: The group property of a state transition matrix is

4>(4, t0) = <}>(t2, O+Oi. U) (5.9-24)

This can be shown from the relations x(r2) = t1)x(t1) = 4>(/2, toMto)
and x(0 = <!>(/!, f„)xOo)- Then x(t2) = <f>(/2,0+Oi, foM'o)- Therefore
<&(h, t0) = 4>fo h)4>(/i, t0)-

Property 3:
4>(h,t2) =4>-1(t2,t1) (5.9-25)

This property can be obtained from the expression1(/2, t±) = 1=


<K*i> h)- Since t2)<\>(t2, tj) = 4>(^, it follows that<J>_1(f2, ^)<|>(/2, tx)
= <K*i> tj). Postmultiplying by c\>~\t2, tj) yields tj =
A)*
For fixed systems.
Property 4:
<K* + r) = <|>(04>W (5.9-26)

Since 4>(t) = eAt for fixed systems, <(>(/ + t) = cA(<+t) = <j>(^)<j>(T).


Property 5:
cj>-1(0=cf>(-0 (5.9-27)

Let t = —t in Property 4. Then <j>(0) = I = cj>(0<4>(— 0 or 4>(~0 =

The Inverse State Transition Matrix—The Adjoint System

The inverse state transition matrix <|>-1(f, r) plays an important role in


the general solution of a time-varying system, in obtaining the impulse
response matrix H(/, r), and in the solution of optimal control problems.
The usefulness of this matrix is related to Eq. 5.9-25, which states

<K*, r) = 4r*(T, 0
The behavior of the system with respect to the variable t is a function of the
dynamics of the original system. The behavior of the system with respect
to the second variable t is a function of the dynamics of the system for
which 4>-1(?, r) is the state transition matrix. This system is known as the
adjoint to the original system.
370 State Variables and Linear Continuous Systems

If the original system is defined by x = A(t)x, then the adjoint system is


defined by
a = — aA(7) (5.9-28)
where a is a row vector, or
a = — AT(t)a (5.9-29)

where a is a column vector. This can be derived from Eq. 4.4-8, which
indicates that

j r)] = 4»“V. r) = -^‘O, rr)


at

Since <j>(?, r) = A(J)<Kb t), it follows that

r) = -<\>~\t, r)A(0 (5.9-30)

Therefore <J>_1(/, r) is the state transition matrix for the system whose
unforced differential equation is given by

a = — aA (t)

where a is a row vector. If the transposes of both sides of Eq. 5.9-30 are
taken, then|

[4>-\t,r)]T = [^(fr)]-1 = AT(t)[<\>T(t, t)]_1

Thus [4>T(/, t)]_1 is the state transition matrix for the system whose un¬
forced differential equation is given by

a = -AT(0<x (5.9-31)
where a is a column vector.
Example 5.9-3. Compare the state transition matrices of x -f tx = 0 and the adjoint
equation.
From Example 5.9-1,

x(t) = x(t) exp

or
/r2 - t2\
<Kt, r) = exp I —-— I t > t

= 0 t < T

The adjoint differential equation is given by a — /a = 0. The solution to this equation


is given by

a(/) = a(r) exp

f If A(t) contains complex elements, the conjugate transpose [A*(r)]r is taken.


Sec. 5.9 Linear Time-Varying Systems—The State Transition Matrix 371

or
It2 - r2\
r) = exp I —-— I t > t

= 0 t < T

If the variables / and r are interchanged in </>_1(7, r), then

/r2 - t2\
t) = exp I—-— I T > t

= 0 r <C t

A comparison of , t) with <f>{t, r) shows that the two expressions are identical
except that they are valid over different intervals. The physically realizable output
a(t) of the adjoint system represents the physically unrealizable portion of the solution
x(t). A physically realizable output of a system implies that the observation variable
is greater than the application variable.

The Adjoint Operator

The nth order homogeneous differential equation

y(n) + an_ i(t)y(n~1] + • • • + a0y = 0 (5.9-32)

can be written as Lny — 0, where Ln is the linear differential operator


defined by

= pn + y ak{t)pk, pk = J-
k—0 at
and the ak(t) are real. The linear adjoint differential operator is defined as

L* = (-1 )>” +I(-1)‘A(0 (5.9-33)


Tc=0

wherepkan(t) signifies that pk operates on the product of ak{t) and the depen¬
dent variable. Consequently, the linear adjoint differential equation
Ln * a = 0 can be written as

(— l)n/?na + (— l)n_1/?n_1[tfn_1(0a] + ‘ * ' + tfo(0a = 0

If the original differential equation, Eq. 5.9-32, is cast in the form


x = A(/)x, the A matrix is

T 0 1 0 0
0 0 1 0
A = (5.9-34)
0 0 0 1
-a0(t) —#i(0 —a2(t) • * ‘ ~an-i(t)_
372 State Variables and Linear Continuous Systems

For the adjoint system, a = — AT(/)a and — AT is given by

0 0 • • • 0 floO)
-1 0 • • • 0 «i(0
0 -1 • • • 0 a2(t) (5.9-35)

_ 0 0 • • • -1
To show that the operator notation and the matrix notation are equivalent,
the following observations are made on the adjoint matrix formulation.

«i 0o(O<*n
a2 — <*i + a1(t)ccn
ax ~ a<- 1 +
Differentiating an,

= '^■n—1 "F , l(0^n]


dt
Substituting for an_ls

= a«-2
dt
If this process of differentiation and substitution is performed (« — 1)
times, it can be shown that

(-1)V<X» + I1(-l)VK(0a„] = 0 (5.9-36)


fc=0

which is exactly the linear adjoint differential equation.


Alternatively, the adjoint operator can be derived from the definition15

(a, Lx) = <L*Ta, x>


This definition is often useful in formulating existence and uniqueness
criteria for the solution of differential equations.

Linear Systems with Periodic Coefficients16

Consider the linear system


x = A(r)x (5.9-37)

where A(/) is a continuous periodic matrix with constant period 7 ^ 0,


such that
A(/ +T) = A(r)
Because of this relationship, if r) is a state transition matrix for
Eq. 5.9-37, then so iscj>(7 + T, r), and there exists a constant nonsingular
Sec. 5.9 Linear Time-Varying Systems—The State Transition Matrix 373

matrix C such that <J>(/ + T, r) — <f>(t, r)C. This can be shown rather
easily, since the derivative of this expression yields

4>(' +T,t)= <j>(t, r)C = A(04>(^ t)C = A(0<f>(/ + T, r)


Thusc[>(r + T, r) also satisfies Eq. 5.9-37.
It can be shown that there exists a constant matrix B (called a logarithm
of matrix C), such that C = €BT. It follows that <J>(r + T, r) = r)eBT.
<Kh r) may or may not be periodic. This depends upon certain properties
of B.
If P(r) is defined as
P(0 = r)e-Bt (5.9-38)
then
P(t + T) = 4>(t + T, r)e-B(i+r)
= 4>f<, T)€Br€'-B«+r) = <J>(r, t)(~bi = Pd)

P(0 is nonsingular, since <J>(/, t) and are nonsingular for — oo < t <
oo. Therefore, for a periodic system, P(t) is a nonsingular periodic matrix
with period T.
It is interesting to note that P(/) is the solution to the differential equa¬
tion17
P(0 = A(/)P(0 ~ P(0» (5.9-39)

Thus B is such that Eq. 5.9-39 has a periodic solution of period T.


Example 5.9-4. Consider the first order differential equation x — a(t)x = 0, where a(t)
is periodic with period T. Determine the requirements on b of Eq. 5.9-39 for p{t) to
be periodic with period T.
Equation 5.9-39 is
p = [a{t) — b]p
The solution to this equation is

p(t) = p(0) exp J J [a(X) - b] dX

The term p{t) is periodic, with period T, if b is equal to the average value of apt) over an
interval T, i.e.,

b =
i
-
rTa(X) dX
1 Jo
Any integral multiple of l^j may be added to this value.

Although B generally cannot be determined uniquely, since BE is one of


the complex logarithms of C, the A matrix does determine all the properties
associated with B that are invariant under a similarity transformation.
Specifically, the set of characteristic roots of C are uniquely determined by
A. These characteristic roots (A1? X2, . . . , A„) are called the multipliers
associated with A. These multipliers are all nonzero, since the determinant
374 State Variables and Linear Continuous Systems

of C is nonzero. The characteristic roots of B are called characteristic


exponents.18
Although this discussion is somewhat superficial, its importance lies in
the fact that, for a periodic system, the knowledge of the state transition
matrix over one period determines the state transition matrix for — oo <
t < oo. For example, if <£(/, t) is known over the t interval (0 < t < T),
then C = <t>-1(0, T)<f>(r, r) and B = (log C)/T. Since P(?) is periodic with
period T, then cj>(/, t) is known for all t.

Example 5.9-5.19 The equation

y + [a2 + b2 sgn (cos t)]y = 0


where
( 1, a > 0
sgn a = ( 0, a = 0
( — 1, a < 0

is to be investigated for the two sets of values a2 = 1, b2 = 1, and a2 = 4.6, b2 = 5.0.


This differential equation can be written as

y + oj\t)y = 0

where the periodic coefficient eo2(0 is shown in Fig. 5.9-2. Note that it is piecewise
constant. The A matrix for this equation is given by (xx — y, x± = x2)

o r
A =
-0j\t) 0

For each of the intervals where eo\t) is constant, <f> is given by

1 .
cos cot —sin cot
<f> = _ A]-1 co
—oo sinojt coscot

Since co(t) is periodic, <|>(r, r) must also be periodic.


Since co2(t) is periodic and piecewise constant, 4>(3tt/2, — tt/2) can be found by solving
for 4>(7t/2, —7t/2) and 4>(37t/2, n/2). The state transition matix <4>(37t/2, —7t/2) is then
4>(3tt/2, —Trll) = 4>(37r/2, 7r/2)4>(W2, — 7r/2).

oo2(t)

Fig. 5.9-2
Sec. 5.10 Linear Time-Varying Systems—The Complete Solution 375

For the interval —tt/2 to tt/2, oj = V2 for a2 — b2 = 1, and co = 3.1 for a2 = 4.6,
b2 = 5.0. Thus

/77 77 1 ”—0.259 0.683” ”—0.951 -0.099"

II
\2'-2) = -1.37 —0.259_ 0.959 —0.951_

IN
>
3
II
For the interval tt/2 to 3tt/2, oj = 0 for a2 = 62 = 1, and co = j0.634 for a2 = 4.6,
b2 = 5.0. Thus
/ 37T A - ”i 77 "3.73 5.66”

U ’V ~
_0 1 a>=0
2.21 3.73_ co=J0.634

For the complete cycle, the state transition matrix is


"E"

” 4.56 0.131" “1.88 -5.75"


i

\ 2 ’ 2 ) ' — 1.37 —0.259_ a2=b2=l 1.41 — 3.78_

Since x[(3rr/2)/:] = 4>fc(377-/2, — 7t/2)x( —7t/2), boundedness of the periodic samples of


x(7) depends upon whether the characteristic values of 4> have an absolute magnitude
less than one. Periodicity of x(t) depends upon whether the characteristic values of <}>
have an absolute magnitude equal to one. For a2 = b2 = 1, the characteristic equation
121 — <J>| = 0 is given by 22 — 4.32 — 0.99 = 0. The roots are 212 = 2.15 ± 2.38.
Fora2 = 4.6, b2 = 5.0, the characteristic equation |2l — <j>| = 0 is given by 22 + 1.92 +
1.0 = 0. The roots are 2X 2 = —0.95 ± y'0.317. Therefore, for the first set of constants
(a2 = b2 = 1), the initial condition response is unstable. For the second set of constants
(a2 = 4.6, b2 = 5.0), the response is just periodic (within slide rule accuracy) since
|2j| = 1. (Note that 22 + 22 + 1 = 0 has a double root at 2 = 1.)
The conclusion to be reached from this example is that a system need not have a
periodic solution, even though the coefficients of the governing differential equation are
periodic. The solution can be stable, unstable, or periodic.

5.10 LINEAR TIME-VARYING SYSTEMS—THE COMPLETE SOLUTION

Using the results of the previous section, the complete solution to the
state equations for a linear time-varying system can be obtained. The
state equations for a linear time-varying system are given by

x = A(/)x + B(/)v
(5.10-1)
y = C(f)x + D(1)V
Equation 5.9-30 states that

4>_1(b t) = t)A(/)

If the first of Eqs. 5.10-1 is premultiplied by 4>-1(f, t) and Eq. 5.9-30 is


postmultiplied by x, the resulting equations are

t)x = r)A(0x + r)B(f)v


(5.10-2)
t)x = -<J>_1(b t)A(I)x
376 State Variables and Linear Continuous Systems

Adding these two expressions yields

4 MrV. t)x] = <►-*(/. t)B(0v (5.10-3)


at
Integration of Eq. 5.10-3 from r to t is indicated by

4T\t, r)x(0 - t)x(t) = JVu t)B(A)t(A) dX (5.10-4)

Since <j>_1(r, r) = I,

X(0 = t)x(t) + 4>(«, T)J Vv, t)B(A)v(A) (5.10-5)

Using the fact that

4>(A r) = <J>0, t)c|>(t, A) = X)


x(t) becomes

x(0 = 4>(b t)x(t) + f <\>(t, 2)B(2)v(2) dX (5.10-6)

The expression for y(t) is obtained by substituting Eq. 5.10-6 into the
second of Eqs. 5.10-1, to yield

y(t) = C(t)<(>(t, t)x(t) + J C(r)<t>((, A)B(A)v(A) dX + D(t)y(t) (5.10-7)

Equations 5.10-6 and 5.10-7 represent the complete solution to Eq.5. 10-1.
These equations are quite similar to Eqs. 5.8-11 and 5.8-12, respectively.
For a fixed system, the response depends upon t — r, or the time
difference between application of cause and observation of effect. The
result is the convolution integrals of Eqs. 5.8-11 and 5.8-12. The state
transition matrix for a fixed system has only one variable, namely the time
difference between application of cause and observation of effect. How¬
ever, for a time-varying system, the solution depends upon both t and r.
For a time-varying system, the cause-effect relationship is varying with
time, and therefore when the cause is applied and when the effect is ob¬
served are significant. The state transition matrix for a time-varying
system has two variables, one being the time of application of the cause,
and the other being the time of observation of the effect. Therefore
Eqs. 5.10-6 and 5.10-7 depend upon superposition integrals and not upon
convolution integrals.

Adjoint Solutions and Integrating Factors

For a scalar differential equation of the form

x — a(t)x + ift) (5.10-8)


Sec. 5.10 Linear Time-Varying Systems—The Complete Solution 377

the standard method of solution is by use of the integrating factor


exp J [—a{t)} dt, as in Section 2.5. With this method, the solution can be
expressed as

Jrdi dX = dX (5.10-9)

Performance of the integration on the left side yields

f^a(a) da
€ —
X(t) — € —STa(<x) da x(t) = a(a) da
v(X) dX
-V
Thus x(t) can be written as
t n t
x(t) = €ST aU) docx(r) + a(a) dav(X) dX (5.10-10)
*) T

Notice the similarity between Eqs. 5.10-10 and 5.10-6. c(>_1, the state
transition matrix for the adjoint system, is the integrating factor used for
the solution of Eq. 5.10-1. This is a fundamental property of the state
transition matrix for the adjoint system.

Example 5.10-1. Consider the time-varying system described by the differential


equation y + ty + y = v(t). The block diagram for this system is shown in Fig. 5.10-1.
The A matrix for the state variables indicated in Fig. 5.10-1 is

A(0 =
Determine <J>0> T)-
A(t) does not satisfy the commutativity condition, since

12 "-I

A(C)A (t2) = A(?2)A(r1)


— 1 + t]ti t2 —1 + t\t2

Therefore the state transition matrix r) is not given by Eq. 5.9-7. The state transition
matrix could be obtained by use of the matrizant, but the series solution obtained by
this method is difficult to express in closed form. However, by use of Eq. 5.9-22, a
set of differential equations can be obtained which are solvable, if the integrating factor
approach is used.

Integrator Integrator
2 1

Fig. 5.10-1
378 State Variables and Linear Continuous Systems

The homogeneous form of the given differential equation can be written as

y + ~r (yt) = 0
dt

By integrating this equation once, the first order differential equation y — — ty + cx is


obtained, where cx is a constant of integration. This equation is in the same form as
Eq. 5.10-8. The integrating factor is exp (t2/2). (Note that this integrating factor is the
solution to the adjoint differential equation a = ta.) From Eq. 5.10-10,

y(t) = y{r)^-t2)^ + Cle-*2/2 pe^2/2 dk

For the requirements of this example, the expression is best left in this form. The
integrated form is quite complicated, as evidenced by

y(t) = y(r) eV2 t2)/2

This illustrates an interesting point. Even when an analytical solution can be obtained
for a time-varying differential equation, this solution may not be easily evaluated. In
general, the solution for a time-varying differential equation is obtained by a computer
simulation, as the analytical solution is either difficult to obtain or difficult to evaluate.
The derivative y{t) is

y(t) = —ty(r)eV2 <2)/2 + Cl i


(A2—£2)/2

The solution for the state transition matrix can now be obtained by application of Eq.
5.9-22. If a unit initial condition is placed on integrator 2, and zero initial conditions
on all other integrators, then the time response of the output of integrator 1, y(t), is
equal to <f>l2(t, r). Therefore let y(r) = 1 and y(r) = 0.
The expression for y(t) gives cx = 1. Therefore

j ft
t) €~t2/ 2 V2/2 dl = e(A2-i2)/2dX
* T

In a similar manner, the other terms of the state transition matrix are obtained from the
following:
<M', T) = V(0 when y(r) = 0, y(r) = 1
r) = y(t) when y(f) = 0, y(r) = 1
r) = y(t) when y(r) = 1 ,y(r) = 0
Thus

r) = + T dX

e(t*-«>)/2 + t £U>-(>)/2 M
j’Zl('. r) = T — t

t) = 1 — t\ €*A“ ^l*dX
Sec. 5.10 Linear Time-Varying Systems—The Complete Solution 379

The state transition matrix is given by

e(rS-<*)/2 + TJ«W‘-<’,/a<W
4>(h t) =
r — t(f>n(t, r) 1 — rfls(t, r) _

Another Use of the Adjoint System

The adjoint system is useful for studying the effects of forcing functions
and initial conditions on linear combinations of the state variables. Con¬
sider an inner product of the adjoint variables a and Lx, where

L = - A

the system operator. This inner product is taken as

(a, Lx) = I ar(2)[x(2) — A(2)x(2)] dl (5.10-11)


10

Integrating the first term by parts yields

(a, Lx) = a T(X)x(K) - [oC(2) + oC (A)A(A)]x(A) dX (5.10-12)


to 31o
But
[<xT(A) + ar(A)A(A)] = [a(A) + Ar(A)a(A)]r = (L*a)r = 0r

(5.10-13)
from Eq. 5.9-29. Then, from Eqs. 5.10-11, 5.10-12, and 5.10-1,

aT(t)x(t) = aT(t0)x(t0) + f aT(2)B(2)v(2) dl (5.10-14)


J to

Note that, for an unforced system, i.e., if v(2) = 0, aTx is constant.


Equation 5.10-14 can be used to determine linear combinations of the
a^(0’s, once Eq. 5.10-13 has been solved for a(A) corresponding to t0 <
A < t.
In order to solve Eq. 5.10-13 for a(A), appropriate boundary conditions
must be specified on a(2). The boundary conditions which should be
specified depend upon the problem to be solved. For example, if the effect
of the forcing function and/or initial conditions on xx(t) is to be determined,
appropriate boundary conditions are ax(?) = 1 and a2(t) — a3(/) = • • • =
an(t) — 0. Solution of the adjoint system subject to these boundary
conditions, and the subsequent substitution of this solution into Eq.
5.10-14, permit this equation to be solved for the desired result.
380 State Variables and Linear Continuous Systems

Example 5.10-2. A terminal control system described by x = —(1 /T)x + m(t) starts at
ai(0) = 0 in response to m{t) = KU^{t). The state of the system at time t = t0 (a
fixed value) is desired to be #0(70) = €iti~to)IT, at which point m{t) is set equal to zero.
The system is then to “coast” to x = 1 at t = tx (also a fixed time instant). This is the
desired terminal point. When x(t0) is compared with x°(t0), however, it is found that,
owing to disturbances, x(t0) < #°(r0). How much longer should m(t) = K be maintained
so that x(tj) = 1 ?
Let At be the time duration beyond t0 for which m(t) = K. Then Eq. 5.10-14 becomes
't0+At
— <x(t0)x(t0) + I JCcc(X) dX

The adjoint equation is


dec 1
-jr = — a, a(tj) = 1, t0 < / < tl
dA 1
The solution for a(A) is
a (A) = €U-*i )/T
Then
'to + At
x(tx) = 1 = e(*o-D/Tc(r0) + iHTdX

Performance of the integration and solution for At yield

€(tl to)/T — x(t0) *°(fo) - x(to)


A t = Tin 1 + = Tin 1 +
XT KT

As a second illustration of the choice of boundary conditions on a(/l),


ai(r) = a2(t) = 1, a3(/) = a4(r) = • • • = ocn(t) = 0 enables Eq. 5.10-14 to
be solved for xx(t) -f x2(t). In the time-varying case, of course, this gener¬
ally must be done by simulation because of the difficulty in analytically
solving time-variable differential equations.
With respect to simulation, the adjoint problem with the boundary
conditions a(7) may be converted into an initial value problem by the
change in variable,
X = t-t1 (5.10-15)

Since X runs forward in “time” from t0 to t, tx runs “backward in time” in


the range 0 < tx < t — t0. The boundary conditions on a(A) at X = t
correspond to boundary conditions on a(/4) at tx — 0. Thus they are
initial conditions in the simulation. With the change in independent vari¬
able of Eq. 5.10-15, the adjoint equation becomes

d- -- — 'l} = AT(t - /,)«(( - «,) (5.10-16)


dt1
where the change in variable removes a minus sign.
tx is time on the simulator, usually an analog computer. The simulator
time runs from t± = 0 to tx = t — t0, as stated above. Thus the simula¬
tion method is most useful when / is some fixed, finite value of terminal
Sec. 5.11 Impulse Response Matrices 381

time T. In this case, Eq. 5.10-16 becomes

da(Tj- tp = aT(T_ _ (5.10-17)


dt1

The coefficient matrix of Eq. 5.10-17 is the transposed A matrix of the


original system, with t replaced by T — tx. Thus this system can be simu¬
lated by reversing the inputs and outputs of each of the simulation elements
of the simulation of x = A (Ox, and replacing the time t of any time-
varying gains by T — tv This is discussed further in the next two sections.

5.11 IMPULSE RESPONSE MATRICES

The impulse response matrix of a linear system characterizes the dynamic


equations of the system in the sense that, given the input vector from
— oo to t and the impulse response matrix H(7, r), the output vector at
time t can be uniquely determined. Equation 5.11-1 represents the vector-
matrix form of the superposition integral given in Eq. 1.7-2.

y(0 = r U(t, X)y(X)dX (5.11-1)


J — oo

Equation 5.10-7 can be rewritten in the same form as

y(0 = P C(04>(f, A)B(2)v(2) dX + D«v(0 (5.11-2)


J — 00
or

y(0 = f [C(04>(f, A)B(A) + U0(t - A)D(()]v(A) dl (5.11-3)


J —CO

Comparison of Eqs. 5.11-1 and 5.11-3 shows that the impulse response
matrix H(f, r) is given by

H(t, r) = C(04>a t)B(t) + U0(t - r)D(0, t ^ T


(5.11-4)
= 0 t < T

For a fixed system, Eq. 5.11-4 can be written as

H(/) = C<|>(0B + U0(t) D /> 0


(5.11-5)
= 0 t < 0

Example 5.11-1. Find the impulse response matrix of the RLC circuit shown in Fig.
5.5-2b. The inputs are v^t) and v2(t), and the outputs are zffi) and z2(/).
382 State Variables and Linear Continuous Systems

From Example 5.5-2, the A matrix of this circuit is

o
(Ri + *2)C
A =
2
0 (Ri + R2)LJ
Since A is diagonal, <f> = eA< is easily determined.
Owing to the symmetry of the circuit, the relationships between the elements of the
coupling matrices are:

b ii = b 12 (vx and v2 have the same effect upon ±x)


b 21 = b 22 (v1 and v2 have opposite effects upon x2)

C11 = C21 («! has the same effect upon ix and i2)
c22 = — c12 (x2 has opposite effects upon i\ and i2)
du = d22 (vx couples into ix in the same manner as r2 couples into z2)
d\2 = d21 {vx couples into i2 in the same manner as r2 couples into

The general expression for the impulse response matrix is then

cn C 22 ^n(0 o bn bn dn d12
H(t) = + tuo
cn C22 0 (f>22 (t)_ b22 b22 d\2 d\\
1 1 -1 r dn di2
— ^llCll^ll(0 + b 22c 22(f> 22{t) + U0(t)
1 1 1 -l dn dn
From the state equations
x = Ax + Bv
y = Cx + Dv

it is possible to obtain B, C, and D without resorting to loop equations. The method is


similar to that used in Example 5.5-2 to determine the elements of A. Note that

±i = anXi + aX2x2 + bnVx -f bi2v2

for all t. Assume that the network is initially at rest (xx = x2 = yx = v2 — 0). If vx is
suddenly made 1 volt by application of a unit step of voltage, XjfO-l-) = ^2(0+) =
y2(0+) — 0, since the capacitor voltage and inductor current cannot change instan¬
taneously. Thus
4(0+)
bn= -
C
Vi=U_x(0

This is the same as replacing the capacitor by a short circuit and the inductor by an open
circuit (the network is now a purely resistive network) and writing

bn =
4(0 = _ gey
Cvi(t) C

where gcl is the short-circuit transfer conductance between the input vx and the short-
circuited capacitor. Thus
1
bn = -
(/+ + R2)C
Sec. 5.11 Impulse Response Matrices 383

In a similar fashion,

b = - 6l^ ^
22 Lv2(t) L

where ai2 is the voltage gain between the input v2 and the open-circuited terminals of the
inductor. The capacitor and vx are shorted. Therefore

R>
bn —
(Ri + R2)L
Using similar arguments,

h 1
cn — —, *2 = v, = v2 = 0 or cn =
Xi Ri + R2
Is R*
C 22 — , *1 = vx = v2 = 0 or c22
X« Ri + Ri
h 1
d1 i = -, — X0 = v2 = 0, or dn
Ri + Ri

, _ h _
a 12 — — » *®i — — yx = 0, or d12 = 0

Therefore
-1 2« t r 2RiR2t —i r
R,2
H(0 = \j <*i + -R2)C
i i
+ (Ri + R2)L
i -l
(Ri + R*)2C (*i + *2)2L
u0(t) T O'
+
Ri + R2 0 1

Simulation Difficulties

In some control system applications, particularly in missile and space


work, it is necessary to determine elements of H(t, r), the impulse response
matrix of a time-varying system. One illustration of the use of H(7, r) is in
evaluating the mean square outputs of a system in response to random
signals.
Consider the case represented in Fig. 5.11-la, where the random signal
\(t) is applied to the system with the impulse response matrix H(7, r) at
t = 0 by the closing of the switch. Then

y(0 = pHjiy, t)v(t) dr ( . 5 11-6)


Jo
Let Y be the autocorrelation matrix y(t)yT(t), where the bar denotes the
statistical average. Thus
Vi VlV2 ' • • ViVv

y&i y<i
CO

Y = (5.11-7)

_VvV\ VvV* • • • Vv .
384 State Variables and Linear Continuous Systems

v(t)
► Hift r) ^y(t)
t= 0
(a)

l-<— - H (t, T) - —>1


1 1
1
1
n(t) | ! y(t)
-1 H2(t, T) -v(t) Hi (t, T)
White noisei K i
applied at
t = — oo

(b)
Fig. 5.11-1

Substitution of Eq. 5.11-6 yields

Y = Hi(t, t1)v(t1) drx V:z'(r2)H1T(t, r2) dr2


Loo _J o
Since H^/, t) is a deterministic impulse response, Y can be written as

Y = HiO, Tj) I vCt^v21^) HSXt, t2) dr2 dr1 (5.11-8)

Equation 5.11-8 can be used to evaluate the mean square outputs of the
system, assuming that H^r, t) and the autocorrelation matrix vCr^v2^)
are known, and that the integrations can readily be performed.
In many cases, Eq. 5.11-8 is too complex to permit analytical evaluation.
In the case of time-varying systems for example, it is frequently impossible
to determine a closed-form expression for Hr). In such cases, Y can be
determined by simulation techniques which have the advantage of not
requiring random signal generators.21 These simulation methods are
simplest when Y is to be determined at a fixed instant of time. This is the
case considered in the remainder of this discussion.
The basis for making random signal generators unnecessary in the
simulation is that Eq. 5.11-8 depends upon the autocorrelation matrix of
the random input, rather than upon the random input itself. Thus Y is
the same for Figs. 5.11-1# and 5.11-16, if the shaping filter is chosen so
that the autocorrelation matrix of v is the same in both cases. With
respect to Fig. 5.11-16,

v(tP = Pri H2(tj, A1)n(A1) dXx


J — CO
Then
T1 T2
Tt
''(t1)vt(t2) = I H2(Tl, Xx) I n(A1)nr(^2) H/(t2, X2) dX2 dXx
-oo -oo
Sec. 5.11 Impulse Response Matrices 385

Since n(A1)n2 (A2) = N^XC/^Ai — A2)] for white noise,

H2(t 1? A^N^H2t(t2, A,) cU, (5.11-9)


J — 00

after performing the integration with respect to A2. Equation 5.11-9 is an


integral equation which can be solved for the impulse response matrix of
the shaping filter necessary to make Figs. 5.11-la and 5.11-16 equivalent
with respect to Y. The solution of this equation is frequently easy,
particularly in the stationary case where transform techniques can be
used.
Assuming that the shaping filter has been appropriately chosen, Eq.
5.11-8 can then be applied to the representation of Fig. 5.11-16 by replacing
v(ti)v^7(t2)
and H^, r) by and H(r, t), respectively. H(f, r) is
the impulse response matrix of the combination of the shaping filter, the
switch, and the system. Since n^Jn^rg) = NfoXC/ofo — r2)], application
of Eq. 5.11-8 to the configuration of Fig. 5.11-16 yields

(5.11-10)

after performing the integration with respect to r2, setting tx = r, and


specializing the result to t = T. The lower limit, r = — oo, is required
because n(t) is applied at t = — oo.
Equation 5.11-10 indicates that Y can be determined by applying a
deterministic impulse (or its equivalent initial conditions), rather than a
noise source, multiplying the appropriate simulator outputs together in
pairs, and integrating the results. This assumes that the simulator outputs
are the impulse responses of the appropriately chosen shaping filter, the
switch, and the system. Note that the switch makes H(J, r) time-varying,
even if the shaping filter and system are time-invariant.
Of particular importance is the fact that the variable of integration in
Eq. 5.11-10 is r. Thus the generation of H(T, r) by the simulator should
produce H(t, r) for fixed t and variable r. Unfortunately, straightforward
simulation produces H(/, r) for variable t and fixed r, since t is the response
time and r is the time of application of the impulse. Thus the straight¬
forward simulation would have to be run many times, each for a different
t, and the data replotted for fixed t and variable r. Multiplication and
integration of the replotted data are then required to determine the
elements of Y.
To illustrate the bookkeeping task involved in this direct simulation
method, assume an m input, p output system. The expression for yt{T) is,
386 State Variables and Linear Continuous Systems

Impulse response-original system

Fig. 5.11-2a
Sec. 5.11 Impulse Response Matrices 387

Fig. 5.11-2b

from Eq. 5.11-1,


m

Vi(T) = 2 htj(T, A)i;,-(A) dX, i = 1, 2, . . . , p (5.11-11)


3=1

If a unit impulse is placed on the y'th input at time r < T, then

Vi(T) = /^(T, r), T > T, 1=1,2,...,/? (5.11-12)

Consequently, a unit impulse on the yth input produces a column of the


impulse response matrix for a fixed value of r. The unit impulse must be
repeatedly applied to theyth input for various values of r in order to obtain
h{j(T, t) for a variable r.
The impulse response

h(t, r)

of the system ij + ty + y = 0 is shown in Fig. 5.11-2. Figure 5.11 -2a


shows the results of a set of simulation runs of h(t, r) versus t for various
values of r. Figure 5.11-26 shows the results of cross-plotting these runs
for fixed values of l. This would have to be performed for each of the
elements hiS(t9 r). To obtain all the columns of the impulse response
matrix, a set of simulation runs must be performed for each of the m
388 State Variables and Linear Continuous Systems

inputs. If 10 simulation runs were sufficient to provide enough points to


produce a function of r, then 10m simulation runs would have to be
performed. This would provide mp sets of curves of the type shown in
Fig. 5.11-2a. For each of the mp sets of curves, a cross-plot would have
to be performed to produce mp curves of the type shown in Fig. 5.11-26.
In the following section, it is shown that a single computer run of a
modified adjoint system produces a row of the impulse response matrix
H(t, r) for a fixed value of t and variable r.

5.12 MODIFIED ADJOINT SYSTEMS20 22

The difficulty associated with determining the impulse response matrix


H(7, r) for variable r was indicated in the previous section. It is due to
the fact that the response variable is t rather than r. Since the state
transition matrix of the adjoint system is related to <J>(7, r), the state
transition matrix of the original system, by an interchange of t and r and
a transposition, a transposed adjoint system would seem to answer this
difficulty.! Indeed it does, after a change of time axes.
Consider Eq. 5.10-14 with t — T and t0 equal to — go. The equation
becomes

<xr(T)x(T) = (X:z’(t)B(t)v(t) dr
J — 00

since x(— co) = 0. Now suppose that ct'(T) is chosen as

«T(7’) = yr) ci2(T) ••• cin(T)]

where the cis’s are the indicated elements of the system C matrix. Then

y£T) = MD ci2(T) • • • cin(T)]x(T)

can be obtained.J That is,


cn (T)
ca(T)

Vi (T) = <XT(t)B(t)v(t) dr. if a(T) =

t The term system is used in this section to denote the combination of the shaping filter,
switch, and system of Fig. 5.11-16.
t The D matrix is assumed to be a null matrix, since this is true for any practical com¬
binations of shaping filters and systems. Otherwise an infinite y\T) would result.
Sec. 5.12 Modified Adjoint Systems 389

Comparing this expression with the zth row of

y(T) = H(T, t)v(t) dr


i —co
it is apparent that
{n(T,r)}mro,v = aT(r)B(r) (5.12-1)

if the adjoint system has the boundary conditions

ca(T)
cdT)
a(T) = (5.12-2)

CirtT)

In order to make these boundary conditions initial conditions for the


simulation, let r = T — t1 as was done in Eq. 5.10-15. Then analogous
to Eq. 5.10-17,
da(T — rx)
= AT(T - tMT ~ tx) (5.12-3)
drx

Also, Eqs. 5.11-10, 5.12-1, and 5.12-2 become

oo
Y = H(T, T - rx)N(T - t,)Ht(T, T - Tl) dr. (5.12-4)

{H(T, T - rOJithrow = aT(T - r,)B(T - t,) (5.12-5)

cn(T)
cdT)
“(0) = (5.12-6)

_CiniT)_

Remembering from Section 5.10 that Eq. 5.12-3 can be simulated by


reversing the inputs and outputs of each of the simulation elements of
x = A(t)x and replacing the time t of any time-varying gains by T — tl9
Eq. 5.12-5 is simulated by multiplying the adjoint signals by B(E— tf).
Of course, the adjoint system must have the initial conditions of Eq.
5.12-6. A little thought reveals that the initial conditions of Eq. 5.12-6
390 State Variables and Linear Continuous Systems

Column Column Column

(a)

(b)
Fig. 5.12-1

can be established by applying a unit impulse at tx = 0 through gains of

cn(T - h)
ci2(r ^l)

JiJJ - h)_

Since B and C are, respectively, the input and output matrices of the
original system, the net result of this argument is that the ith row of
H(7", T — t) required for Eq. 5.12-4 is generated for variable r by reversing
the inputs and outputs of each of the simulation elements of

x = A (t)x + B (t)\
y = c(t)x
and replacing the time t of any time-varying gains by T — tv The resulting
system is called the modified adjoint system. It is compared with the
original system in Fig. 5.12-1.
Sec. 5.12 Modified Adjoint Systems 391

If the original system has m inputs and p outputs, the modified adjoint
system has p inputs and m outputs. Each simulation run on the modified
adjoint system produces m outputs. These m outputs comprise a row of
H(r, T — r), where the running variable of simulation t is equal to r.
The particular row of the matrix is determined by the input on which the
unit impulse is placed at the beginning of the simulation run.
Example 5.12-1. For the differential equation y + ty + y = 0, the curves of Figs.
5.11-2a and 5.11-26 show the cross-plotting operation required to obtain h(T, r). Show
that the modified adjoint system produces the desired result in one simulation run.
The impulse response of the system is given by

h(t,r) = j\(A2-*2)/2dA r < t

= 0 T > t

If T is substituted for t and the change in variable, r = T — t1} is made, then the im¬
pulse response h{T, T — rx) is
pT
h(T, T - rj) = e(A2-T2)/2 di Ti>0
JT-rx

= 0 TX < 0

The simulation diagram of the original equation is shown in Fig. 5.12-2a. Thus the
modified adjoint system has the diagram shown in Fig. 5.12-26. The differential equation
corresponding to Fig. 5.12-26 is
a + (T — ?i)a = 0

(a)

(b)
Fig. 5.12-2
392 State Variables and Linear Continuous Systems

The response of the modified adjoint system to an impulse at tx = 0 is given by the


solution of
a + (T - Oi = U0(ti)
Let w = a. For tx > 0,
w + (T — tx)w = 0, w(0) = 1
Thus

jjKfj) = 6c = exp - Ttxj, tx > 0

Then

a(O =J exp ^ > 0

Let (3 = T — A. This yields /**(?!,()), the impulse response of the modified adjoint
system, as
rT
h*(h, 0) = a(0 = €U2-t2)/2 ^1, tx>0
jT-ty

This is identical with the expression for h(T, T — tx) if tx is written for tx.
Thus the impulse response of the modified adjoint system observed over the tx axis
is exactly the impulse response of the original system, observed over the rx axis. However,
the tx axis here is the time axis of observation on the computer, while the rx axis is the time
axis of application of impulses. Therefore a simulation run on the tx axis produces
exactly the same results as does a cross-plot sketched on the rx axis. This is shown in
Fig. 5.12-3, where a series of runs of h*(tu 0) are shown. Note that, if the r axis in Fig.
Sec. 5.12 Modified Adjoint Systems 393

5.11-2b were changed to r1 = T — r, these curves would be identical with the curves
of Fig. 5.12-3. Therefore the modified adjoint system produces in one simulation
run what would take many simulation runs of the original system.

Example 5.12-2. The A, B, and C matrices of a given system are

" 0 1 “ eft) 0
, B = , c =
_bft)_ 0 c2(0_

The simulation diagram for this system is shown in Fig. 5.12-4a. Determine the diagram
for the modified adjoint system.
Since H(t, r) = C(t)<$>(t, t)B(t), the order of H(7, r) is (2 x 1), or two rows and one
column. The application of a unit impulse at time r = r2 produces an output on both
yx and y2, thus giving all the elements of H(7, t*), since H(f, r) is a single column matrix.
However, this is for variable t and fixed r = t*.

(a)

(b)
Fig. 5.12-4
394 State Variables and Linear Continuous Systems

If the inputs and outputs of the simulation diagram are reversed and the change in
variable t = T — tx is made, the result is Fig. 5.12-46. Note that at each terminal the
number of inputs and of outputs are interchanged. This fact provides a rapid partial
check of the correctness of an adjoint simulation diagram. The vector-matrix differential
equations for this system are given by

0 1 cifT 11) 0
[ax a2] = [ax a2] + hr r2]
a2\(T tf) a22{T tf) 0 c2(T - tf)

MT
q = [ai a2]
bfT
or
aT = ctTA(T - tf) + rTC{T - tf)

q = *Tb(T - L)

If a unit impulse is placed on one of the inputs at time zero, the output q{tf represents
a row (one element in this case) of the original response matrix H(T, T — tf). Therefore
two simulation runs are required to produce both rows of the impulse response matrix.

REFERENCES

1. L. A. Zadeh, and C. A. Desoer, Linear System Theory—The State Space Approach,


McGraw-Hill Book Co., New York, 1963, pp. 23-31.
2. Ibid., pp. 311-326.
3. B. Friedland, O. Wing, and R. Ash, Principles of Linear Networks, McGraw-Hill
Book Co., 1961, pp. 58-64.
4. R. E. Kalman, “Canonical Structure of Linear Dynamical Systems,” Proc. Natl.
Acad. Sci., Vol. 48, No. 4, April, 1962 pp. 596-600.
5. R. E. Kalman, Y. C. Ho, and K. S. Narendra, “Controllability of Linear Dynamical
Systems,” Contrib. Differential Equations, Vol. I, No. 2, 1962, pp. 189-213.
6. Y. C. Ho, “What Constitutes a Controllable System,” IRE Trans. Auto. Control,
Vol. AC-7, No, 3, April 1962, p. 76.
7. E. B. Lee, “On the Domain of Controllability for Linear Systems,” IRE Trans.
Auto. Control, Vol. AC-8, No. 2, April 1963, pp. 172-173.
8. E. G. Gilbert, “Controllability and Observability in Multivariable Control Systems,”
J. Soc. Ind. Appl. Math.—Control Ser., Ser. A, Vol. 1, No. 2, 1963, pp. 128-151.
9. I. M. Horowitz, “Synthesis of Linear, Multivariable Feedback Control Systems,”
IRE Trans. Auto. Control, Vol. AC-5, No. 2, June 1960, pp. 94-105.
10. R. E. Kalman, “Mathematical Description of Linear Dynamical Systems,” J. Soc.
Ind. Appl. Math.—Control Ser., Ser. A, Vol. 1, No. 2, 1963, pp. 152-192.
11. E. A. Coddington, and N. Levinson, Theory of Ordinary Differential Equations,
McGraw-Hill Book Co., New York, 1955, Chapter 3.
12. R. A. Struble, Nonlinear Differential Equations, McGraw-Hill Book Co., New York,
1962, Chapter 4.
13. B. K. Kinariwala, “Analysis of Time Varying Networks,” IRE Inter. Conv. Record,
1961, Pt. 4, pp. 268-276.
14. Pipes, L. A., Matrix Methods for Engineering, Prentice-Hall Inc., Englewood Cliffs,
N.J., 1963, pp. 90-92.
Problems 395

15. B. Friedman, Principles and Techniques of Applied Mathematics, John Wiley and
Sons, New York, 1956, p. 43.
16. Coddington and Levinson, op. cit., p. 65.
17. Stuble, op. cit., p. 64.
18. Coddington and Levinson, op. cit., p. 80.
19. L. A. Pipes, J. Appl. Phys., Vol. 25, pp. 1179-1185.
20. M. Issac, “Deterministic and Stochastic Response of Linear Time-Variable Systems,
General Electric, LMED Technical Report, No. R62 EML1, March 1962.
21. J. H. Laning, and R. H. Battin, Random Processes in Automatic Control, McGraw-
Hill Book Co., New York, 1956, Chapter 6.
22. R. Sussman, “A Method of Solving Linear Problems by Using the Adjoint System,”
Internal Tech. Memor. No. M-2, Electronics Research Laboratory, University of
California, Berkeley, California.

Problems

5.1 Draw the simulation diagram for the following systems.

(a) y + 3y + 5y = v

(b) y + 3y + 5y = v + 2v + v

(c) yx + 3yL + 2yz = vx + 2v2 + 2v2

y2 + 42/j_ + 3y2 =v2+ 3v2 + v±

5.2 What are the differential equations for the system shown in Fig. P5.2?

v\

Fig. P5.2 (Continued on p. 396)


396 State Variables and Linear Continuous Systems

Fig. P5.2 (Continued)

5.3 Draw the simulation diagram for the following differential equations.
(a) y + ky + (co2 + e cos t)y = v (Generate cos t by simulation.)
(b) y + y2y + 5y = v
5.4 Show that the simulation in Fig. P5.4 satisfies Eq. 5.5-3.

Fig. P5.4
Problems 397

5.5 Find the transfer function matrix and draw the transfer function diagram
for the systems described below. Comment on the number of integrators
required.

{a) y1 + 3yx + 2y1 = vx + 2v1 + v2 + v2


Vz + 2y2 = -v1 - 2v1 + v2

(b) 2/i + 2/i = v± + 2v2


Vz + 3y2 + 2y2 = v2 + v2 - vx

(c) yx + 2y2 + y1=v1 + + v2


Vz + V\ + 2/2 = vz + ^1
(^0 Hi + + 2^ = 3^ + 4^ + 8i>2
Vz + 32/2 - 42/i ~ 2/i = + 2i)2 + 2r>2

5.6 Write the vector matrix equations for the system shown in Fig. P5.4 in
terms of the state variables indicated.
5.7 Write the vector matrix equations for the systems of Problem 5.5.
5.8 Find the vector matrix equations for the following systems using the partial
fraction technique. Show that these equations can be determined from the
equations obtained by simulation techniques by a coordinate transforma¬
tion.

(a) y + 3y + 2y = v
(b) ‘y + 4y + 5y + 2y = v
(c) 'y + Ay + 6y + My = v
(d) y1 - IO2/2 + Vi = vx

Vz + 6y2 = v2

5.9 Find the general expression for the elements of the A(/) matrix for an
RLC network if the indicators and capacitors are functions of time.
5.10 Given the system defined by the time-varying differential equation

n— 1 n
Pny +
k=0
2 <*n-lit)pky =
*=0
2 Pn-k(t)pkV

Show that this system has the state equations

-±1~ - 0 10 0 - xx "bp

±2 0 0 1 0 Xz bz

— +
• • •

. 1
A. an an—1 • • • “al- .xn_ A_

y = X! + b0v
398 State Variables and Linear Continuous Systems

where

b0(o = m
2 ^ r ln -f- in — i\
bi(t) = Pi(t) ~I I _ . ai_r_m(t)pmbr(t)
r=0 m—0 \ ^ l 1

5.11 Find the simulation diagram for the system defined by the time-varying
differential equation

n d3y d2y dy d2v dv


t2_+(cost)_+2_ +(sin ,yy^t»—+icos0-+v

Write the vector matrix equation x = Ax + Bv.


5.12 Using the partial fractions expansion technique (neglect d0), but letting
n
Y(s) = 2 W)
i=1
where xt(t) satisfies the differential equation xi — Xixi = b^t), find the
corresponding A, B, and C.
5.13 Repeat Problem 5.12 for the case where there is a repeated pole of order k.
Show that the ones of the Jordan matrix are replaced by bjbi_l5 i =
? k
5.14 Find the state transition matrix by each of the five methods presented for
the following systems.

(a) y + 4y + 3y = 0
(b) y + 2y + y = 0
(c) y + 2y + 2y = 0
(d) y + 4y + 6y + 4y = 0

5.15 Assume that a unit step is applied to the systems described by Problem
5.14u, b, and c. What is the output y(t)l
5.16 Assume that an input, cos cot, is applied to the system of Problem 5.14c.
What is the output y(t)l For what value of oj does y(t) have its largest
peak amplitude? Does y(t) ever become unbounded?
5.17 In the vicinity of the operating point (TV = 1, C = 1, 7? = 0) of a nuclear
reactor, it can be shown that the linearized dynamic equations are given by:

x1 = — x± + x2 + x3
ry* - ry* _ ryt

x3 = Kxx

where xx = N — 1, x2 = C — 1, x3 = R, and N = normalized power


level, R = normalized reactivity, C = normalized delayed neutron con¬
centration, K = normalized temperature coefficient of reactivity. Find
the state transition matrix for the linearized set of equations. For what
Problems 399

values of K is this system stable? For a given set of initial conditions,


what can be said about the relative magnitudes of the unforced responses
N(t), R(t), and C(t)?
5.18 Show, by substituting Eq. 5.6-3 into Eq. 5.6-1 and utilizing Eq. 5.6-2 and
the fact that the characteristic vectors are independent, that the general
form for the az(/) in Eq. 5.6-3 is given by
5.19 Show that, by substituting Eq. 5.6-3 into Eq. 5.6-1 and utilizing the recip¬
rocal basis, the result of Problem 5.18 can be obtained.
5.20 For the system defined by the differential equation y +5y +6y = /(/)
carry through the steps outlined in Example 5.6-1. Verify your result by
utilizing a Laplace transform method. What should be the forcing func¬
tion so that only one mode is excited?
5.21 Prove Eq. 5.6-10 by decomposing f(/) into an infinite sum of unit impulse
functions.
5.22 Prove Eq. 5.6-14.
5.23 Assume that the forcing function to a linear fixed system is given by
f(0 = Re [%ej(ot]
where g is a real vector and eo is the angular frequency of the forcing
function. Show that the output of the system is given by
n (r ■ q)
x(t) = Re 2 ' ” . (tA,t ~
i=i h ~J<o

If all the A/s have negative real parts, show that in the steady state the
peak amplitude of oscillation of the zth mode is given by

<Xi, g)
jW - A;

for real Vs, and *s eclualt0


<r»» 8> , (r**, 8> *
--- U; + --— u,-*
JO) - JO) - V

in the u'u" plane for complex A,-.


5.24 Using the results of Problem 5.23, show that, when is complex (A* =
— at- + jfc) and fa » act (high Q), for p{ a> the peak amplitude in the
steady state is given by the expression for the peak amplitude when the
Vs are real and negative. What can be said about the requirements for
obtaining the largest steady-state oscillation for a particular mode?
5.25 (a) Show that eA< can be represented by

2 eMut>(u
i=1

if A has distinct characteristic values. This is called the spectral representa¬


tion of eAh
400 State Variables and Linear Continuous Systems

(b) Since x(7) = eAt x(0), for an unforced system, show that x(/)is equal to

2 x(o))cA<<uf
i=l

5.26 (a) Show that the state transition matrix eM can be found by the use of
the modal matrix as ext = where A is a constant matrix. Why
is the transformation x = M(/)q not particularly useful for the system
x = A (t)x?
(b) Using the result of part (a), show that the complete solution for the
matrix differential equation x = Ax + Bv is given by

x(0 = I
J— oo
MeA(^-^M-1Bv(A) dX

(c) Using the result of part (b) and the fact that the columns of M are the
characteristic vectors and the rows of M_1 are the reciprocal basis
vectors rif show that
rt n
x(0 = 2 (ri’ Bv(A))eA^-A>U; dX
J— 00 i=1

5.27 The transform of the state transition matrix of a fixed system is given by
= fsl — A]-1. Why is the inverse state transition matrix transform
<E>_1(s) not equal to [si — A] ?
5.28 For the system shown in Fig. P5.28, solve for the initial condition response
x(t) using the mode expansion technique, and check using the state transi¬
tion matrix.

Fig. P5.28

5.29 Consider the constant resistance network shown in Fig. P5.29, where
L(t)/C(t) = R2; L(t), C(t) > 0. Let v(t) be a voltage source, xL be the flux
linkages of the inductor (x1 — LiL), x2 be the charge across the capacitor
(x2 = Cvc), y(t) be the current to the network. Choose as state variables
xi = (^i + #2)/2» x2' = (aq — x2)/2. Write the differential equations for
x{ and x2' and show that x/ is Sc and x2 is S°.
Problems 401

5.30 Consider the following two subsystems, both of which are observable and
controllable:^ = ol1x1 + v(t), x2 = cc2x2 + v(t). If these two systems are
connected in parallel, such that the output y = x1 — x2, what are the
conditions required for the overall system to be both observable and
controllable?
5.31 Given the cascade connection shown in Fig. P5.31, show that8

(a) n = na + nb
(Jd) A^, . . . , Aw . . . , Ana, Ajj,, . . . ,
(c) A necessary (but insufficient) condition for the controllability (observ¬
ability) of S is that both Sa and Sb be controllable (observable).
(d) If Sa and Sb are both controllable (observable), any uncontrollable
(unobservable) modes of S must originate in Sb(Sa).

Fig. P5.31

5.32 Given the parallel connection shown in Fig. P5.32, show that8
(a) n = na + nb
(.b) Aj, . . . , An Aja, . . . , Ana, Ajj,, . . . , Anb

(c) A necessary and sufficient condition that S be controllable (observable)


is that both Sa and Sb be controllable (observable).

Fig. P5.32
402 State Variables and Linear Continuous Systems

5.33 Show that the transfer function H(.s') of a system S can be written as8
n*
H(i) = 2 + D
i=1

where the * indicates S*, and the matrices have rank one. Hint: Start
with the expression H(s') = COB + D where O = [jl — A]-1.
5.34 Consider the system

'-3 r ‘1 r V "i r

x + v,' y =
1 — 3_ _1 i_ j -i_

(a) Find the response of x to x(0), v(r) = 0, in terms of the “modes” of


the system. Sketch the trajectories for various initial conditions in both
the (qx, q2) modal plane, and the xv x2 plane.
(b) Comment on observability and controllability.
5.35 Consider the linear fixed system

x = Ax + Bv
y = Cx + Dv

The characteristic values of A are not distinct.


(a) Suppose that the Jordan form for A is M_1AM = J = a (2x2)
diagonal matrix whose diagonal elements are equal; and that B is (2 x 2),
C is (2 x 2), D = [0]. Draw a block diagram for the system in terms of
the new state variables q = M-1x. What simplification results if C is
(1 x 2)7 Draw a simplified block diagram for

C = [1 -2]

(b) Suppose that the Jordan form for A is M-1AM = J = a (2 x 2) matrix


with a one on the superdiagonal (and equal diagonal elements). B is
(2x1), C is (1x2), D = [0], Draw a block diagram for the system in
terms of new state variables q = M_1x.
(c) Draw a block diagram for
~-l 1
A = [_ 0 -1
1
0
, c = [0 1]
Comment on observability and controllability for this system.
5.36 Consider the unstable system

X = X' -f v |l>| < 1


y = x

(a) Find the region of state space for which the system is controllable.
(b) Find the minimum time required to return a state in the controllable
region to the origin.
Problems 403

5.37 Q&nsider the system

*1 = ~\xi + + |^3 + 2K0


O'1 - 2 ai _____ 8 2 /y»
^2 — 3^1 3^2 3 .3
= x1 — x2 — 2x3 — v(t)

2/(0 = f^l “t" ^2 ~f 3^3

(a) Find the state transition matrix 4>(/).


(b) Suppose v(t) = 0 and ^(0), :r2(0), ^3(0) are given. Write x(/) in terms
of the modes of the system.
(c) Make a change of state variable x = Mq such that the new state vari¬
ables q are “uncoupled.” Draw a block diagram for the system in terms
of the “new” state variables.
(d) Comment on observability and controllability, and separate the system
into S*, S°, Sc, and Sf.
5.38 Consider the system of Fig. P5.38.
(a) Write the vector matrix differential equation describing the system,
defining the output of the integrators as the state variables.
(b) a and b are constant parameters. Sketch those areas in the a, b pa¬
rameter plane for which the system is completely controllable, partially
controllable, and completely uncontrollable.
(c) Assume that a is fixed. Sketch the areas in the b, c parameter plane for
which the system is completely observable, partially observable, and com¬
pletely unobservable.

Fig. P5.38

5.39 Find the state transition matrix for the following systems.

(a) y + ty — v
(b) y + y/t = V
(c) y + ty + y = v
(d) t2y + ty + y = v

(e) y - (1/02/ + [1 + 0/t2)]y = v


404 State Variables and Linear Continuous Systems

5.40 Prove that if <J>(7, r) is a solution to the matrix equation

x = A(t)x
then

|<J>(7, T)l = 14>(0, r)| exp trace A(A) dX

From this, what can be concluded about the nonsingularity of <J>(t, r) so


that its inverse exists? What does this imply about the independence of
solutions? Hint: Find the Wronskian, w(/), and show that the differential
equation for the Wronskian (w(7) = r)f) is given by w = (trace A)w.
5.41 Find t) f°r the systems of Problem 5.39. Show that r) =
4>(r, t). Show that the use of the matrix notation and that of a linear
differential adjoint operator are equivalent.
5.42 What physical significance can be attached to the interchange of rows and
columns as expressed in Eq. 5.9-29?
5.43 Show that, if C is a nonsingular matrix and T is a real number ^ 0, then
there exists a matrix B such that C = exp BE for a periodic system.
5.44 For the equation y + Pi(t)y + />2(/)2/ = 0 (Hill’s equation), where pi(t)
and p2(t) are continuous functions of time and periodic with period T, a
solution which satisfies y(t + T) = Xy(t), where A is a characteristic multi¬
plier, is called a normal solution. If y-^t) and y2(t) are a given base of real
solutions to this equation, show that

yi(t + T) = any±( t) + a12y2(t)

y2(t + T) = tf2i2/i(0 + ^22^(0


so that
a-tii-i A a2\
= 0
a 12 a22 A

for the system to possess nontrivial solutions.


5.45 With the results of Problem 5.44, show that

lim \y(t)\ = oo, | A| > 1


■ 00

lim |?/(0I = 0, |A| < 1


■ 00

Periodic solutions exist for A = ±1.

5.46 For the equation y + p(t)y = 0, where p(t) is continuous with period T,
what are the conditions that this equation admit of periodic solutions?
5.47 A state transition matrix of a system satisfies the equation

4>(f +T,r) = 4>(/, r)A (A = C = ,BT)

where A is a diagonal matrix consisting of characteristic multipliers. Show


that periodic solutions exist only if the diagonal elements = ±1. If
Xi — 1, show that the period of the solution is T. If Ae — —1, show that
the period of the solution is 2T.
Problems 405

5.48 Find the impulse response matrix for the systems of Problem 5.5.
5.49 Find the impulse response for the systems of Problem 5.14.
5.50 For the system with the following A, B, C, and D matrices, find the impulse
response matrix by use of the simulation diagram. Check your solution
using Eq. 5.11-5.

'-1 r ~2~ "3"


A = , B = , C = [1 1], D =
-1 -l _1 _0_

5.51 Find the impulse response matrix for the RLC circuit shown in Fig. P5.51.

5.52 Find the impulse response matrix H(7) for the system

"-1 0 o' ~1 o'


x 0 -3 0 x + 0 1
0 0 —5_ _1 1

1 1 0
y = 0 1 1
0 0 1

5.53 Obtain Eq. 5.10-7 by the method of “variation of parameters” used in


Section 5.8.
5.54 (a) Find H(s) for the following system and determine if the system is
controllable and/or observable.

0 1 o ' '-l'
x = 5 0 2 x + 1
_ —2 0 —2_ -1
y = [—2 1 0] x

(b) Show that the intersection of the controllable and observable spaces
is the space generated by the transfer function matrix.
406 State Variables and Linear Continuous Systems

5.55 Show that an impulse response matrix H(/, r) is realizable by a finite


dimensional dynamical system if and only if there exist two continuous
matrices P(/) and Q(r) such that

H(t, r) = P(OQ(t) for all t, r

Hint: See Section 2.8.


5.56 The impulse response matrix (for a fixed T) of the system

x = A (t)x + B(t)v
y = C(t)x
is given by
H(T, r) = c(D4>(r, t)B(t) t > r
= 0 T < r

((a) Show that the impulse response matrix of the adjoint system

a = — aA(/) + vC(0 (a and q are row vectors)


q = aB(0
is given by
H*(/, T) = C(T)<t>(r, 0B(/) t > T
= 0 t < T

(b) If the change in variable tx — T — t is made for the adjoint system,


and the change in variable tx — T — r is made for the original system,
show that
H*(fr, 0) = QT)4>(r, T - tJB(T - tx > 0
= [0] < o
and
H(T, T - Tj) = C(D<t>(r, T - r1)B(r - Tj) T, > 0

= [0] TJ < 0

(c) From the result of part b, show that observation of the response of
the modified adjoint system over the tx axis (the running variable of the
simulator) is identical with observing the cross-plot at time T of the response
of the original system over the axis. In addition, show that an impulse
placed on one of the inputs to the adjoint system produces a row of the
desired impulse response matrix.
(d) Show that the result of the preceding parts proves that the complete
modified adjoint system is obtained by interchanging the inputs and out¬
puts of the original system and making the change in variable t = T — tv
6
State Variables and Linear
Discrete Systems

6.1 INTRODUCTION

The state variable viewpoint is applied to linear discrete time processes


in this chapter. Much of the effort is directed toward sampled-data
systems. It is also shown by example that the state variable approach is
quite useful in dealing with linear sequential systems, which naturally
arise out of the theory of coding. Viewed in this fashion, the state
variable approach is a unifying concept, as both continuous and discrete
systems fall within its general framework.
The theory of linear discrete time systems follows the theory of linear
continuous systems closely. Therefore, much of what is said in this
chapter is based on the preceding chapters. The similarity between the
theory of linear continuous systems and the theory of linear discrete
systems is indicated.

6.2 SIMULATION DIAGRAMS

The basic building blocks required to construct a block diagram of a


system described by linear difference equations are the adder, the amplifier,
and the unit delay. The adder and the ampli¬
fier are the same blocks that were used for y(kT + T)
continuous systems, and the unit delay for
difference equations is somewhat analogous to
the integrator for differential equations. It is Fig. 6.2-1
shown in Fig. 6.2-1. The input to the unit
delay is y{kT + T), and the corresponding output is y(kT). Thus the
input to the unit delay appears at its output one period later, or delayed

407
408 State Variables and Linear Discrete Systems

by T. It should be noted that unit delays of the order generally required


in control systems are quite difficult to obtain in practice, and seldom is
a system actually simulated in real time by this method. For sequential
circuits, where the variables are either binary in nature or take on discrete
values, this type of simulation can be performed in real time.
The approach used to generate a block diagram of a linear difference
equation is to assume that the variable y(nT + kT) is available, and then
successively pass this variable through unit delays until y(kT) is obtained.
The block diagram then is completed by satisfying the requirements of the
difference equation, or “closing the loop.”

Example 6.2-1. Find the simulation diagram for the system governed by the difference
equation y(JcT + IT) + ay{kT + T) + by{kT) = v(kT).
The first step is to solve for y{kT + IT), as

y{kT + 2T) = v(kT) - ay(kT + T) - by{kT)

The terms y{kT + T) and y{kT) are obtained as shown in Fig. 6.2-2a. Assuming ideal
distortionless delays, a signal which appears at terminal 1 appears at terminal 2 one time
period later, and at terminal 3 two time periods later. Similarly, a signal at terminal 2
appears at terminal 3 one time period later. The completed block diagram (Fig. 6.2-2b)
is obtained by satisfying the requirements of the difference equation.
If the initial conditions are given in terms of y{0) and y{T), then y{0) is the initial signal
at the output of the first delay, and y(T) is the initial output of the second delay. After
one time period, y(T) appears at the output of the first delay unit. After two time periods,
the output of the first delay is y(2T) = n(0) — ay{T) — by{0).
If a comparison is made between Figs. 5.2-2b and 6.2-2'o, it is evident that similar
rules hold for constructing block diagrams of difference equations and differential
equations. The integrator used in simulating differential equations is analogous to the
unit delay used in simulating difference equations.

(a)

(b)
Fig. 6.2-2
Sec. 6.2 Simulation Diagrams 409

v(kT)

Example 6.2-2. Find the simulation diagram for the nth order difference equation
y(nT + kT) + a n-Xy{nT + kT — T) + • • • + oc0y(kT)
=finv(nT + kT) 4- fin-i v(nT + kT — T) + • • • + fi0v(kT)
The general simulation diagram for this system is shown in Fig. 6.2-3, by analogy to
Fig. 5.5-4. The a's and b’s of the block diagram are given by Eqs. 5.5-7 and 5.5-8.
For the specific case of the difference equation
y{kT + 3 T) + 3 y(kT + 2 T) + 4 y{kT + T) + y{kT) = 2 v{kT + 3 T)
+ 3 v(kT + 2 D + v(kT + T) + 2 v(kT)
b0 = fin = 2
b\ = fin-1 an-1^0 = ^
b% ~ fin-2 aw-l6i an-2^0 = 2

63 — fin—3 1^2 *-^n—2^1 3^0 ^

The simulation diagram for this system is shown in Fig. 6.2-4. A comparison of this

v(kT)

Fig. 6.2-4
410 State Variables and Linear Discrete Systems

diagram with Fig. 5.5-5 shows that the only difference between the two diagrams is that
the integrators of Fig. 5.5-5 have been replaced by unit delays.

Example 6.2-3. Find the simulation diagram for the system governed by the difference
equations

2/i (kT + T) + y2{kT) = vx{kT) + 2 v2(kT)


y2{kT + 2T) + 3 yx{kT + T) + 2 y2(kT) = v2(kT + T) + v2{kT) + vx{kT)

These equations can be rewritten as

yi(kT +T) = Vl(kT) + 2v2(kT) - y2(kT)


y2{kT + IT) - v2(kT + T) = v2{kT) + vx(kT) - 2yi{kT + T) - 2y2(kT)

Using y2(kT + 2T) — v2{kT + T) as the input to one delay chain, the block diagram
appears as shown in Fig. 6.2-5. The approach is similar to that used for continuous
systems.

6.3 TRANSFER FUNCTION MATRICES

The transfer function H(z) for a single input-single output discrete time
system is equal to the ratio of the Z transforms of the output and input of
the system. For multivariable systems, the transfer function between
various input-output terminals is similarly defined. Thus

His(z)
m. (6.3-1)
J4(X) — ^5 k 7^ j
V,(z) ’
Sec. 6.3 Transfer Function Matrices 411

where Yfz) is the Z transform of the output at terminal i, and Vfz) is the Z
transform of the input at terminal j.
The transfer function matrix is simply the ordered array of these
transfer functions, where i denotes the row and j denotes the column in
which Hij(z) appears. If the transfer function matrix H(z) is known, then
the output vector transform Y(z) is given by

Y(z) = H(z)V(z) (assuming zero initial conditions) (6.3-2)

where V(z) is the column matrix of the Z transform of the input vector
v(kT).
Example 6.3-1. Find the transfer function matrix for the two input-two output system
described by the difference equations

yfkT + 3F) + 6yfkT + 2 T) + 11 yfkT + T) + 6yx(kT)


= vfkT + T) + vfkT) + v2(kT)
y2(kT +2 D + 5 y2(kT + T) -F 6y2(kT) = vfkT + T) + v2{kT)

Taking the Z transform of both sides of these equations, assuming zero initial con¬
ditions,
o3 + 6z2 + 11a + 6)^(2) = (z + 1)^(2) + v2(z)
(Z2 + 5 z + 6) Y2(z) =(z + 1 )V2(z)

Since 23 + 6z2 + l\z 4- 6 = (2 + 1)(2 + 2)(2 4- 3) and 22 + 5z + 6 = (2 4- 2){z + 3),

1 1
H 11(f) = z \ zrr, : 4: Hl2(z) = -
(2 4- 2)(2 + 3) 12V ; (2 4- 1)0 + 2)(2 + 3)

2 + 1
H2l(z) = 0 H22(z) = -
22V ; (2 + 2)(2 + 3)
or
1 1
(2 4- 2)(z 4- 3) (2 4- 1)(Z + 2)(2 + 3)
H(2) =
(g + 1)
0
(2 4- 2)(2 + 3)

The transfer function block diagram appears in Fig. 6.3-1.

VT (z)

Fig. 6.3-1
412 State Variables and Linear Discrete Systems

Fig. 6.3-2

Example 6.3-2. Find the transfer function matrix for the system governed by the
difference equations
yx{kT + IT) + yx{kT + T) + y2{kT + T) = vx(kT + T) + vx{kT) + v2{kT)
y2{kT + T) 4- yx(kT) = v2(kT)
Transforming both sides of these equations, assuming zero initial conditions,
(z2 + z)Y1(z) + zY2(z) = (z + 1)^(2) + v2(z)
Yx(z) + zY2(z) = V2(z)
Solving for Yx(z) and Y2(z),
(2 + 1) V2(Z) (2+1)
Y1(z) = —-V1(z) T2(2) = Vi(z)
Z2 + 2 — 1 z(z2 +2—1)

The transfer function matrix is


2 + 1
0
22 + 2 — 1
H(2) =
-(2 + 1) 1

_2(22 +2—1) 2_

The transfer function block diagram is shown in Fig. 6.3-2.

The unit delay, represented by the Laplace transform e~sT, corresponds


to 1 /z in the z domain. Thus the integrator, as represented by 1/s for
continuous systems, has the unit delay as its analog in the diagram of
discrete systems. Hence the transfer functions l/(z + a) and z/(z + a) are
obtained in a manner similar to that used for continuous systems (see
Fig. 6.3-3.).
The same note of caution that was injected into Section 5.3 should be
added here. Inspection of only the transfer function matrix may lead to
incorrect conclusions about the order of a system, as the transfer function
relates only the controllable and observable aspects of the system. It may

Fig. 6.3-3
Sec. 6.4 The Concept of State 413

mask system properties which would be obtained from the physical


system.

6.4 THE CONCEPT OF STATE1

The state of a discrete time system can be intuitively defined as the


minimum amount of information about the system which is necessary to
determine both the output and the future states of the system, if the input
function is known. More precisely, a set of variables x qualifies as a state
vector, if two single-valued functions f and g can be found such that
x(kT +T) = i[x(kT),\(kT)]
y(kT) = g[x(kT), v(kT)]
where y(kT) = the output vector at time kT, and v{kT) = the input vector
at time kT. It is interesting to note that these requirements for the state
of a sequential device were independently determined by Huffman and by
Moore while working on related but different problems.2’3
In most of the subsequent material, the outputs of the delay elements of
the simulation diagram of a system are taken as the state of the system.
These outputs provide a convenient and sufficient choice for the state
vector.
Example 6.4-1. The capacitor step charger of Fig. 6.4-1 is so designed that the voltage
on the capacitor increases in steps each time the input pulse appears.4 Assume that
initially there is no charge on capacitor C2. The first input pulse charges Cx through
diode Zh to a voltage V0. After the input pulse disappears, the charge on C\ distributes
itself between C\ and C2 according to the inverse ratio of the capacitances. The second
pulse again charges C\ to a voltage V0, but the subsequent additional charge which is
placed on C2 is less than that from the first pulse, owing to the previous charge left on C2.
The input pulses are continued, and it is expected that the voltage changes appearing on
capacitor C2 diminish asymptotically. Evaluate these changes and verify that the voltage
on C2 is a state variable.
If x(k) is the voltage on capacitor C2 after the /cth input pulse, then the difference
equation for x(k) is given by the principle of conservation of charge as C\ V0 + C2x(k) =
(G + C2)x(k + 1) or
G CiV,o
x{k + 1) — x{k) =
C i + C2 G + c2

v = negative
input
pulses,
amplitude
Vo

Fig. 6.4-1
414 State Variables and Linear Discrete Systems

Fig. 6.4-2

Boolean Identities
And Or
0-0 = 0 0 + 0 = 0
0-1 = 0 0+1 = 1
1-0 = 0 1+0=1
1-1 = 1 1 + 1 = 1
Complement
0= 1 1 = 0

Unit
AB
A(k+l) delay AW And

Or
AB +AB
Logical
inverter And
AB

Logical
inverter

Unit
g(k + i) delay
B(k)
Fig. 6.4-3
Sec. 6.5 Matrix Representation of Linear State Equations 415

Solving this equation in the standard fashion indicated in Section 2.12 yields

,k~\
C,
x{k) =
(51 -(- C‘>,

Therefore the capacitor voltage increases in discrete steps as shown in Fig. 6.4-2. The
step heights are given by
Cx C2
x(k + 1) — x{k) = V0
A -f- CJ yCx + C2

The state of this circuit is given by the capacitor voltage x{k). Note that this state
takes on only discrete values. The next state of the system, x(k + 1), is uniquely deter¬
mined by the present state of the system and the present input. The general state
equations for this system are given by Eqs. 6.4-1. In this case, both f and g are linear,
single-valued, scalar functions.

Example 6.4-2.b A modulo 4 counter is designed so that it cycles through the counts
00, 01, 10, 11, 00, . . . . The circuit for this device is shown in Fig. 6.4-3, and the
reader familiar with sequential circuits can verify that this circuit does indeed cycle
through these counts. The outputs of the delay elements represent the state of the
system, as well as the outputs of the system. Write the state equations.
The logical equations for this counter are normally written in the form

a(1c+\)T = (Ab + AB)kT

B«+1) = (BfT

where the superscripts denote time instants and are not exponents. This is an auton¬
omous sequential circuit, and the general state equations for such a circuit are

x[(k + 1)F] = f[x(AT)] x = state (output of delay elements)


y{{kT)] = g[x(AT)] y = output (outputs of delay elements)

6.5 MATRIX REPRESENTATION OF LINEAR STATE EQUATIONS

The general form of the state equations for a multivariable discrete time
system were given by Eq. 6.4-1 as

x[(H 1)71 = f[x(kT), \(kT)]


y (kT)= g [x(kT),\(kT)]

If the system is linear, then Eq. 6.5-1 can be written as the set of linear
vector-matrix difference equations

x[(k + 1)71 = A (kT)x(kT) + B(kT)\(k T)


y (kT) = C(kT)x(kT) + T>(kT)y(kT)

A(kT), B(AT), C(kT), and D(fc7’) have been indicated as time-varying


matrices. If the system is nontime-varying, these matrices can be written
416 State Variables and Linear Discrete Systems

as the constant matrices A, B, C, and D. The general block diagram,


similar to Fig. 5.5-1, is shown in Fig. 6.5-1.
For a system described by a set of nth order difference equations, the
form shown in Eq. 6.5-2 can be obtained by writing the given equations as
a set of first order difference equations.

Example 6.5-1. Express the second order difference equation

y[(k + 2)71 + ay[{k + 1)7] + by(kT) = v{kT)

represented in Fig. 6.2-2b as a set of first order difference equations.


Let
y(kT) = xx{kT)

y[(k + 1)7 = x&k + 1)7] = xt(kT)


Then
(kT+ T)~ 0 1 xx{k 7)
+ v{kT)
x2(kT + 7) —b —a x2{kT)

xi(kT)
y(kT) = [ 1 0]
x2(kT)

The variables a;1(/:7)and x2{kT) are the outputs of the delay elements, and they represent
the state of the system. These equations are of the general form of Eq. 6.5-2, where

0 r "0“
A = , B = [1 0], D = [0]
—6 a 1

Example 6.5-2. The general form for an nth order difference equation is given in
Example 6.2-2. The simulation diagram appears in Fig. 6.2-3. Find the A, B, C, and
D matrices.
Analogous to the continuous case, the A, B, C, D matrices for this system are given
by Eq. 5.5-9.
Sec. 6.5 Matrix Representation of Linear State Equations 417

Linear Binary Sequential Networks!

An interesting application of the state variable approach is the analysis


of linear binary sequential networks. A linear binary sequential network
consists of pure time delays and modulo 2 summing junctions. A modulo
2 summing junction is an exclusive-OR function having the logical
equation f = (xx + x2)x1x2 = xxx2 + x±x2. The output of such a summer
is zero if the two inputs xx and x2 are the same, and one if the inputs are
different. Although this discussion is limited to modulo 2 networks, the
same type of analysis can be used for modulo p networks.6-8 These linear
modular sequential networks have found a limited application in error-
correcting codes, computer circuits, and in certain types of radar systems.9
For the purpose of illustrating the use of the state variable approach in
the analysis of linear binary sequential networks, consider the network of
Fig. 6.5-2. This network consists of four unit delays (a four-element
shift register) and a single modulo 2 summing junction (a logical exclusive-
OR circuit). The equations of this network are

xfk + 1) = x2(k)
x2(k + 1) = xz(k)
xz(k + 1) = Xi(k)
xjjc + 1) = x^k) © xz(k) © xx(k)
In matrix form,
"0 10 0 "

0 0 10
x(k + 1) = x(/c) = Ax(k) mod 2
0 0 0 1
10 11

The sequence of states through which this network will pass is x(0),
Ax(0), A2x(0), .... If A is nonsingular, then each state x(A:) has a unique
preceding state x(k — 1). For a mod 2 network, the determinant |A| is
either one or zero. In this particular example |A| = 1, so that A is

Fig. 6.5-2
f Readers completely unfamiliar with sequential networks should omit this section.
418 State Variables and Linear Discrete Systems

nonsingular. Hence the inverse A-1 does exist.


"0 1 1 r
1 0 0 0
A-1 =
0 1 0 0
_0 0 1 o_
Since there are four state variables, each of which can have the value
zero or one, there are 24 = 16 possible states of the system. Therefore the
state sequence of an initial state is either periodic with period P < 15 or
goes to an equilibrium state, where the equilibrium state is defined by
x(k + 1) = x(k). The trivial case where all the state variables are zero is
called the null state. All other equilibrium states are called finite equilib¬
rium states.

Consider the case in which the system reaches an equilibrium state. It is


assumed that A is nonsingular, such that Ak is not equal to the null matrix.
This removes the possibility of a nonzero initial state going to the null
state. For the finite equilibrium case x(k + 1) = Ax(k) = x(k). Since
(A — I)x(/:) = 0, the existence of an equilibrium requires that the char¬
acteristic equation |AI — A| = 0 have at least one unity root. For the
network of Fig. 6.5-2, the characteristic equation is A4 + A3 + A2 -f 1 = 0
mod 2. This can be factored into (A + 1)(A3 + A + 1) = 0 mod 2. There¬
fore the characteristic equation has one unity root, and there exist initial
states for which this network goes to an equilibrium state. One of these
initial states is x(0) = (1, 1, 1, 1).
When the network has a periodic state sequence of period P, then
x(k + P) = AFx(k) = x(/c) mod 2
For this condition, |AI — Ap\ = 0 mod 2, must have at least one unity
root. The integer P can be found by determining the smallest integer such
that Ap — I mod 2. If the characteristic polynomial /?(A) divides Ap — 1
without remainder, then Ap — 1 = p(X)q(X), where q(X) is the quotient
polynomial. Therefore Ap — I = P(A)Q(A) = [0] mod 2, and Ap =
I mod 2. Hence the technique to determine the length of the minimum
periodic sequence of a network is to determine the smallest integer P such
that p(X) divides Ap — 1 without remainder. For this particular problem,
the length of the minimum periodic sequence is seven. A typical sequence
is given by
‘O' “0" "0" "1" T "0" "1"

0 0 1 1 0 1 0
—> —>

0 1 1 0 1 0 0
i 1 0 1 0 0 0
Sec. 6.5 Matrix Representation of Linear State Equations 419

Fig. 6.5-3

The class of networks of order n, whose minimum periodic sequence is


equal to 2n — 1, are known as maximal-period networks. The necessary
and sufficient condition for a network to have only maximal-period
sequences is that the characteristic polynomial be irreducible and not a
divisor for Xk — 1, k < 2n — 1. (An irreducible polynomial /(A) is one
which cannot be factored into the form /?(A)g(A), except for the trivial
factorization when g(A) is a constant.) The network of Fig. 6.5-3 has only
maximal-period sequences. The equations of this network are

0 10 0
0 0 10
x(k + 1) = x(k)
0 0 0 1
110 0

|AI — A| mod 2

= A4 4- A3 + 1 = 0 mod 2

Since A4 © A3 © 1 is an irreducible polynomial, and since it is a prime


factor of A15 — 1, the minimum length of the periodic sequence is fifteen.
The prime factors of A15 — 1 are

A15 _ i = (24 + A3 + 1)(A4 + A + 1)(A4 + A3 + A2 + A + 1)


x (A2 + A + 1)(A - 1)

Since the cycle structure of such a network is completely determined by the


characteristic polynomial (assuming that the network matrix A is non¬
singular and that the characteristic polynomial is also the minimum
polynomial of the network), it is possible to synthesize these linear
sequential networks analogously to ordinary lumped element network
synthesis. A set of synthesis procedures is given in Reference 6.

State Equations-Partial Fractions Technique

The partial fractions technique for deriving the state equations for a
linear, fixed, discrete time process follows the same procedure as that
used for continuous systems. For a single input-output system with
420 State Variables and Linear Discrete Systems

transfer function H(z), the transform of the output Y(z) is given by Y(z) =
H(z)V(z). If the denominator polynomial of the transfer function H(z)
has distinct roots Xl9 A2, . . . , An, and if the order of the numerator poly¬
nomial of H(z) is less than the order of its denominator polynomial, then
H(z) can be written as the partial fraction expansion

n c-
m = 2 -A-
i=i 2 — At
The output Y(z) is given by

n*) = 2i —^
z — Ai
v(b
Therefore y(kT) can be written as a sum of terms of the form
n
y(kT) = 2 cixi(kT)
2=1

where the xjiJcT) must satisfy the first order difference equation

*il(fc + i)7i = hx&T) +


The state equations for the system can then be written in the form

*d(k + 1)71 xx{kT)


0 ... Q
x2[(k + 1)T] K x2(kT)
• 0 ... o
— + v(kT)

• 0 0 • • • x
,'"n
*„[(* + 1)71 _xn(kT)
xi(kT) (6.5-3)
x2(kT)

[y(kT)] = [Cl cs cJ

xn(k T)
or
x[(fc + 1)J] = Ax(kT) + B v(kT)
(6.5-4)
y(kT) = Cx(k T)

where the matrices A, B, and C are defined by the equivalence of Eqs.


6.5-3 and 6.5-4. Note that the symbol A is used in place of A, to denote
that A is diagonal. Also note B = b = 1, a vector.
This form of the state equations is of particular importance when
dealing with concepts and proofs, since the diagonal form enables one to
Sec. 6.5 Matrix Representation of Linear State Equations 421

make concise statements about the properties of the system. This form is
particularly convenient when dealing with forcing functions, as the corre¬
sponding state transition matrix is also of diagonal form. However, there
does arise a computational difficulty, because the transfer function H(z)
gives no information about the initial conditions of the system. In fact,
to compute the initial conditions on the x/s in this form, one must find
y(0), y(l), . . ., y(n — 1) and then solve a set of simultaneous equations to
find the relationships between the boundary conditions on the yf s and those
on the state variables of Eq. 6.5-4.
Example 6.5-3. Find the A, B, and C matrices for the sampled-data system of Fig.
6.5-4 a.
With respect to the sampled input and output, the transfer function H(z) of this
system is given by
Y(z) , 1
—— = H(z) = ---—=
V (z) 2 — 1 2 — e al

The output transform T(2) is then

V(z) V(z)e~aT
Y(z) = H(z)V(z) = -!-4- ■ = Xt(z) + X2(z)e~°T
2 — 1 2 — e~al

where the difference equations for xx and x2 are given by

x^k + 1)T] - xfkT) = v(kT)


x2[(k + 1)F] - e-“Tx2(kT) = v{kT)

V(s) / V(z) _ (1—e-a2)s + a


e~sT Y(z)
T s(s + a) T

(a)

X2 l(k+l)T) X2 (kT) -aT

O.
\f —

v(kT)
r~aT
Q -^y(kT)
J+

+^xi \(k+l)T\ x\(kT)

(b)
Fig. 6.5-4
422 State Variables and Linear Discrete Systems

The matrix equations for this system are


x(k + 1 )T = A x(kT) + By(kT)
y (kT) = C x(kT)
where
'1 0 " T
A = , B =
—aT
0 1
The simulation diagram for this system is shown in Fig. 6.5-46.

For the frequently arising sampled-data case in which the numerator


polynomial of H(z) is of the same order as the denominator polynomial of
H(z), the D matrix is not zero. Since D represents a feedthrough term, it
can be found by dividing the numerator polynomial by the denominator
polynomial, stopping after the first term. The first term represents the
feedthrough, or D matrix, term. A simple example of this is H(z) —
1/(1 — 2_1), which represents the z transform corresponding to 1 js.
Using the above criterion for handling this problem,

H(z) = -5- = 1 +
2—1 1
This operation is equivalent to
D = D = lim H(z)
Z~* 00

The remaining transfer function Hx(z) = H(z) — D can be handled in the


same manner as previously described. Actually, once D is found, there is
no need to find H^z), since H^z) has the same poles and the same residues
at these poles as does H(z). Therefore, when the order of the numerator
polynomial of H(z) is of the same order as the denominator polynomial
of H(z) and the poles of H(z) are of first order (distinct roots of the
denominator polynomial), the general state equations are
Aik + 1)7] = Ax(kT) + By (kT)
y(kT) = Cx(kT) + Dy(kT) ' " ’

where A is the diagonal matrix whose elements are the characteristic roots
A, k2, . . . , Xn of the denominator polynomial of H(z), and
B = column matrix whose elements are equal to one
C = row matrix all of whose elements are equal to ci = (* - K)m\z^
D = d{) — single-element matrix = lim H(z)
2—>00

A general block diagram is shown in Fig. 6.5-5.


For the case where the roots of the denominator polynomial of H(z) are
not distinct, the procedure to be followed is the same as that given for the
continuous case in Section 5.5. Rather than repeat the procedure here,
an illustrative example is given.
Sec. 6.5 Matrix Representation of Linear State Equations 423

Example 6.5-4. Find the A, B, C, and D matrices for the system whose transfer function
is
4z3 _ 12Z2 + l3z - 7
= —7-7^7-
(z - 1 )2(z - 2)

This transfer function can be expressed in the partial fraction form


Ci Co Co
H(z) =- T- H--F d0
(z- \y (z - 1) (z- 2)
where
d0 = lim H(z) = 4
Z—*■ 00
Cl=(z- l)2H(z)jz=1 = 2

c2 = ^
dz
[(2 -1 mow = i
c3 = (z- 2)H(z)jz=2 = 3

Therefore the output Y(z) can be written


2 V(z) V(z) 3 V(z)
r<z) = 7-{rz + —4 + —^4 + 4V(z) = 2 Xfz) + X2(z) + 3 X3(z) + 4 V(z)
{z — \y 2—1 2 — 2
Since Xfz) = X2(z)/(z — 1),

[(k + 1)TJ - xfkT) = x2(kT)


424 State Variables and Linear Discrete Systems

Fig. 6.5-6

The state equations are


x[(k + 1)T] = Jx(A:T) + b v{kT)
y(kT) = Cx(kT) + d0v(kT)
where
"1 1 0" "0"
J = 0 1 0 , b = 1 [2 3], d0 — 4
0 0 2 1

The simulation diagram for this system is shown in Fig. 6.5-6.

For the multivariable case where there are multiple inputs or outputs,
the transfer function matrix H(z) can be used in a similar manner to find
the state equations. However, the approach is not so clear-cut as in the
single input-single output case, as there is generally a greater freedom of
choice in assigning elements to the B and C matrices.
Example 6.5-5. Find the A, B, C, and D matrices for the system whose transfer function
diagram is given in Fig. 6.5-7.
The outputs Fi(z) and F20) are given by
z 1
Y1(z)
0 + 1 )(z + 2)
FiO) + V2(z)
2 + 2

Y*(z) = Vx(z) + 4Vt(z)


0 + DO + 3)
Expanding into partial fractions yields

1 2
Yi(z) = - K(z) + Vx(z) + V2{z)
z + 1 Z + 2 2 + 2
9 .
i 2
Y2(z) = V1(z) ~ V2(z) + VM + 4 no)
z + 1 2 + 3
Sec. 6.5 Matrix Representation of Linear State Equations 425

V2(z)

Using these equations, Fig. 6.5-8 was drawn. It is readily apparent that the corre¬
sponding state equations are

*i[(* + 1)7T ■1 0 O' xfkT) 1 O'


vfkT)
*.[(* + 1)71 0 -2 0 xfkT) + 2 1
vfkT)
xf{k + 1)71 0 0 -3 _xfkT)_ ' 29
.
0
xfkT)
'yiikTJ -1 1 O' 0 O' vfkTj
1
xfkT) +
yfkT) 2 0 1 1 4 vfkT)
xfkT)

Fig. 6.5-8
426 State Variables and Linear Discrete Systems

Notice that, although the A matrix in the preceding example is easily


found, the B and C matrices are by no means unique. Certainly, some of
the elements in B could be interchanged with some of the elements in C
and the conditions of the transfer function matrix would still be satisfied.
Without any knowledge of the physical properties of the system, either
choice is equally valid. As long as all the transfer products btjcki remain
the same, B and C can be arranged in any number of ways. The use of the
transfer function, or the transfer function matrix, to obtain the state
equations of the system is at best a compromise, and it should be used only
if the original difference equations are not available.
If the original difference equations are available and A can be
diagonalized, a diagonal form for A can be obtained by use of the modal
matrix M. Assume then, that the state equations are available in the form
of Eq. 6.5-6 below, and that it is desired that a new set of state variables
be obtained such that the A matrix is a diagonal matrix A.

x[(k + 1)71 = A x(kT) + B\(kT)


y(kT) = C\(kT) + X>\{kT) ^

Define a new set of state variables q, such that

x(kT) = Mq (kT) (6.5-7)

Equation 6.5-6 can be rewritten in terms of these normal coordinates as

MqP + 1)7] = AMq(7E) + Bv(£E)


y {kT) = CMq {kT) + D v(kT)

Premultiplying both sides of the first equation by M-1 yields

q[(7 + 1)7] = M-JAMq(/cr) + M^Bv^E)

However, since M !AM = A, vector-matrix state equations in normal


form are
q [(* + l)r] = Aq(kT) + B ny(kT)
y(kT) = C„q(kT) + Dv{kT) ( ' ' }

where Bn = M-1B, the normal form input matrix, and Cn = CM, the
normal form output matrix. This is a general procedure, and, since it
originates from the difference equations of the system, it is to be preferred
over the transfer function matrix approach.

Example 6.5-6. The simulation diagram of Fig. 6.5-9 represents the original system
whose transfer function matrix was given in Example 6.5-5. Determine the state
equations in normal form.
Sec. 6.5 Matrix Representation of Linear State Equations 427

vi(kT) V2(kT)

The vector-matrix equations for this system are

"*i[(* + 1)7T "-1 0 O' “*i (kT)~ "1 O'


~vx{kT)~
xA(k + 1)T] — -1 -2 0 x2(kT) + 1 1
_v2(kT)_
_xz[{k + 1 )T}_ -1 0 — 3_ MkT)_ 1 0_

~xx(k T)~ r-
~yi(kT)' 1 O' ~vx(kT~
xi(kT)
_y*(kT)_ 0 ~ 4 _v2(kT
MkT)_

For the A matrix above, the modal matrix M, and its inverse M1 are given by

' 2 0 0" " i 0 0~


M = -2 -1 0 , M-1 = -1 -1 0
-1 0 1_ 0 1_

Substituting into Eq. 6.5-8, the resulting normal form matrices are

'-1 0 O' " \ 0~


A = 0 -2 0 , B„ = -2 -1
3
0 0 — 3_ 2 0
~ —2 -1 0" ~0 O'
c = , D =
1 0 -3 1 4

A check of all the transfer products bijCki shows that this set of matrices represents the
same system as that of Example 6.5-5.
The new state variables q are related to the old state variables x by the relationship
q = M-1x, or
qi = hxi q* = —(*1 + x*) q3 = ixi + *3
428 State Variables and Linear Discrete Systems

The use of the relationship q = M_1x removes the previous difficulty of finding the
initial conditions on the state variables q in terms of the known system initial conditions.

Finding the modal matrix may involve no more labor than any of the
other methods for finding the state transition matrix. In view of the
advantages of having a diagonal A matrix, the normal form is quite
desirable. It is also interesting to note that the mode expansion technique,
to be considered next, and the normal form produce the same effect of
uncoupling the state equations. In this respect, they are identical
approaches but are written in different forms. The form in which the
system equations are expressed is frequently one of personal preference
and familiarity.

6.6 MODE INTERPRETATION

The concept of expanding the response of a linear fixed system into the
sum of responses along the characteristic vectors of the A matrix can also
be applied to discrete systems with distinct characteristic values. The
development follows directly from the equations in normal form.
From Eq. 6.5-8,
q (T) = Aq(0) + B„v(0) (6.6-1)
and
q(2 T) = Aq(D + B XT) (6.6-2)
Substitution of Eq. 6.6-1 into Eq. 6.6-2 yields

q(2 T) = A2q(0) + ABnv(0) + B ny(T) (6.6-3)


Similarly,
q(3 T) = Aq(2 T) + Bwv(2T)
and Eq. 6.6-3 give

q(3D = A3q(0) + A2B„v(0) + AB„v(0 + B„v(2r)


Continuation of this procedure leads to
fc_1

q(kT) = A*q(0) + I (6.6-4)


3 =0

Then, since q = M_1x and B„ = M-1B,


Jc-l

x(kT) = MAfcM_1x(0) + J MA^^M^BvOT) (6.6-5)


3=0

Now, recalling from Section 4.7 that the columns of

M = [uj u2 • • • u„]
Sec. 6.6 Mode Interpretation 429

form a basis and that the rows of M-1, where


T T

M1 = (r,- = column vectors)

n _

form a reciprocal basis, Eq. 6.6-5 becomes

x(kT) = [ux u2 • • •
u„]A*
**n x(0)

T
■n

k-1

+ 2 [“i u; uJAk—j—1 B vOT)


3= 0

■ n
This can be rewritten as

x(/cT) = 2 [<r,., x(0)>A* + 2>i, By(3T)>/f-I-1]u, (6.6-6)


2=1 3=0

Equation 6.6-6 is the discrete system analog of Eq. 5.6-10.


The “modes” of the system are given by the terms of Eq. 6.6-6 for
i = 1,2, . . . , n. The mode expansion technique separates or uncouples
these modes, so that the response x(kT) is expressed as a linear weighted
sum of the modes. Each mode is directed along the characteristic vector
Uj, defined in terms of the state space zlf x2, . . . , xn. For an unforced
system, the amount of excitation of each mode is given by the scalar
product (r{, x(0)). For a forced system, the scalar product (rt-, Bv) is the
amplitude of the forcing function that is coupled to the zth mode.
In effect, the characteristic vectors represent a new coordinate system,
such that each mode of the system is directed along one of the coordinate
axes. The normal form for the state equations performs the same task but
expresses it in slightly different form. Thus if the original state equation is

x[(k + 1 )T] = A x(kT) + Bv( kT)


430 State Variables and Linear Discrete Systems

the transformation q = M-1x, where M is the normalized modal matrix,


results in the new state equation

q[(k + 1)71 = Aq (JcT) + M^Bv^r)


In this form the state variables ql9 q2, . . . , qn are uncoupled, since A is a
diagonal matrix. The state variable qt is directed along the characteristic
vector U;. The scalar product (rf, x(0)> in the mode expansion is simply
the zth component of the normal form column vector q(0) = M_1x(0), and
the scalar product (ri9 Bv> is the zth component of the column vector
M_1Bv.
Example 6.6-1. Analyze the system of Fig. 6.6-1 using both the mode expansion
technique and the normal form of the state equations.

Fig. 6.6-1

The A matrix for this system is

A =

The characteristic roots are = —2, and A2 = —1. The normalized characteristic
vectors are

1/V2 I/V5
iii = u> =
-1/V2 -2/V5
The reciprocal basis is then
1_

1
1m
>

2V2
1

12 —
1_
<1

_-y/5_
to
1

The normalized modal matrix M and its inverse Mr1 are given by

1/V2 I/V5 ' 2V2 V2


M = M 1 =
_1

1
>

_ —1/V 2 -2/V5_ -V~5_


1

Note that the rows of M-1 are the reciprocal basis vectors r,.
The scalar products (ri5 x(0)> are
<rx, x(0)> = 2V2 + V2x2(0)
<r2,x(0)) = — V5x!(0) - V5 «2(0)
Sec. 6.7 Controllability and Observability 431

The initial conditions on the q vector are

2 V2 ^(0) + V2 z2(0)
q(0) = M-1x(0) =
-V5^(0) - V5x2(0)

The forcing function Bv(kT) is the vector b times the scalar v(kT), where

b =

Thus the scalar products <r2, Bv(AT)) are

<rl5 Bv(AT)) = V2 v(kT)


<r2, Bv{kT)) = - Vs v{kT)

The forcing function B„v(A:r) is


V2
Bn\(kT) = M_1by(A:r) = v(kT)
— V~5
The general expression for the time response x(kT) is

fc-i
x(kT) = [2V2*1(0) + V2z2(0)](-l)*u1 + 2 V2 (-1 )<^-1)y(yT)u1
3=0
k—1
+ [-V5^(0) - Vs z2(0)](-2)*u2 + Ji(-V5)(-2)t'-,-1'v(jT)tt2
3=0

The general expression for the time response q(kT) is, from the above and Eq. 6.6-4,

k-1
(-1)*?1(0) + 2 V2 (-l)IM-I»t)(;T)
i=o
q (kT) =
A:—1
(-2)*?!(0) + 2(-V5)(-2)<*-)-1,b(;T)
3=0

where ^(O) = 2^2 ^i(O) + V2 cc2(0) and q2(0) = — V5 xx(0) — V5 ^2(0).


Obviously both methods are equivalent, the q coordinates of the normal form being
the normalized characteristic vectors of the mode expansion.

6.7 CONTROLLABILITY AND OBSERVABILITY

The controllability and observability concepts presented in Section 5.7


carry over directly to the linear discrete system, so there is little need for
further discussion on these points. To restate the principal ideas:

Controllability is a function of the coupling between the inputs to the


system and the various modes of the system. If the system equations can
432 State Variables and Linear Discrete Systems

be written with distinct A’s in the normal form

q[(k + 1)71 = A q(kT) + Bnv(k T)


y(kT) = Cnq(kT) + B\(kT)

then all the modes are controllable if there are no zero rows of Bn. Stated
in terms of the mode expansion method, this means that none of the scalar
products <rf, Bv) vanishes.
Observability is a function of the coupling between the modes of the
system and the output of the system. All the modes of the system are
observable if there are no zero columns of Cn.\ Alternatively, this require¬
ment could be stated as: The kih mode is not observable if all the scalar
products (c*, uk) = 0 for all Vs, where the vector c* constitutes the zth
row of the original C matrix.
For a sampled-data system there is an additional requirement. If the
continuous system has a partial fraction expansion which contains the
term p/[(s + a)2 + p2], and if the sampling interval T = ir/p, then the
Z transform of this term,

og _r_
= 0
2-2 aT
L(s + of + p\ 1 - lz~\-aT COS PT + Z-V

The system may even be unstable, with a < 0, but this fact could not be
inferred from observations of the output. These are called “hidden oscilla¬
tions,” and they occur when the zeros of the oscillation coincide exactly
with the time that the system is sampled.10-12 In this situation the system
is neither completely controllable nor completely observable. Therefore
the additional requirement for complete controllability and observability
of sampled-data systems is that, if a characteristic root of the continuous
system is — a ± jfi, then T ^ v/p.

6.8 THE STATE TRANSITION MATRIX

The state transition matrix for the linear discrete time system is inves¬
tigated in this section. Similarly to the continuous case, the state transition
matrix is the fundamental matrix of Eq. 6.8-1 below, subject to the condi¬
tion that<|>[(/c0, k0)T] = I, the unit matrix. Consider, then, the time-vary¬
ing state difference equation

x[(k + 1)T] = A (kT)x(kT) (6.8-1)


If the initial conditions x(k0T) are known, then

xp0 + 1)31 = A(k0T)x(k0T)


Sec. 6.8 The State Transition Matrix 433

Similarly,
x[(k0 + 2)T] = A l(k0 + l)T]A(k0T)x(k0T)

By a process of iteration, the continued product13

x(kT) = XT A(nT)x(k0T) (k > /c0) (6.8-2)


n=1co
is obtained.
Since the state transition matrix <{>[(/:, k0)T] is defined by the relationship

x(kT) = 4>[(/c, k0)T]x(k0T) (6.8-3)


then
k—1
<4>[(fe, k0)T] = n A(nT) (k > fc0) (6.8-4)
n=ko

= I (k = fe0)

This process of obtaining the state transition matrix by iteration is similar


to the iterative procedure for computing the matrizant of the analogous
continuous system.
For the case where A(kT) is a constant matrix A0, the state transition
matrix 4>0[(/c, k0)T] is

4>oP, k0)T] = A{0k~k(>) A(kT) = A0, a constant matrix (6.8-5)

This is analogous to the continuous case, where the solution for a fixed
system depends only upon time differences; whereas for a time-variable
case the solution depends upon both the time of application of cause and
the time of observation of effect.
For the time-varying case where A (kT) can be written as the sum of two
matrices A0 and Ai(kT), a perturbation technique can be used to obtain
the state transition matrix. This procedure is useful if the time-varying
matrix Ax(kT) represents a small perturbation upon the constant matrix
A0. For this case
x[(k + 1 )T] = [A0 + A ^kTftxQcT) (6.8-6)

This equation can be viewed as a constant system A0 with a forcing function


y{kT) applied, where \(kT) = A^kT^ikT). Thus

x[(k + 1)7] = A0x(kT) + \(kT), \(kT) = A1{kT)x{kT) (6.8-7)

The solution x(kT) for the system of Eq. 6.8-7 can be determined by the
same process of iteration used to give Eqs. 6.6-4 and 6.8-2. Thus

x(kT) = A(t*Mk0T)+ 2, A[k-n-1>y(r>T) (6.8-8)


434 State Variables and Linear Discrete Systems

Substituting \(nT) = AL(nT)x(nT) into Eq. 6.8-8 yields

x(kT) = A{„~ko)x(k0T) + 2 A(<f-’!-llA1(n7')x(n7’) (6.8-9)


n=k o

This is a summation equation, and it can be solved by the usual methods of


iteration.
The first iteration is
fc-i
x(kT) = A^~ko)x(k0T) + 2 A'lf-"-1>A1(nr)
n=k o
•ai—l
X Aon_#:o,x(fc0r) + 2 Ai""llA1(mT)x(mT)
m=k o
Further iterations yield
x(kT) = [I + S(<J>0Aj) + S(<MiS(4>0Ai)
+ S(<|>0A1S(<i>0A1S(<t>()A1))) + • ' -l+o^oD (6.8-10)
where the S indicates a summation of all terms to the right and<f>0 is given
by Eq. 6.8-5. The state transition matrixk0T) can then be written as
<\>[(k, k0)T] =
[I + + S^A.S^oA,)) + Si^A.Si^A.Si^AJ)) + • • -]<j>0
(6.8-11)
which properly reduces to <J>0 for the case when the system is fixed, i.e.,
MkT) = [0].
Equation 6.8-11 for discrete systems is analogous to Eq. 5.9-21 for con¬
tinuous systems. If Ax represents a small perturbation upon the A0, then
this series is rapidly convergent, and only a few terms are required to find
<(>[(k, k0)T]. The advantage of using this form for the state transition
matrix is that the general time-varying <$> is expressed in terms of successive
corrections upon a constant 4>.

Properties of the State Transition Matrix

The state transition matrix for discrete systems has a set of properties
which are directly analogous to the properties listed for a continuous
system in Section 5.9. Namely,
4>[(k0, k0)T] = I (6.8-12)
4>[(*2, ko)T] = <1>[(*2, K)T) (6.8-13)
4>[(*i, K)T] = 4>-i[(fr2, kx)T] (6.8-14)
For fixed systems,
<\>[(k + n)] = <j>(A-)c|>(«) (6.8-15)
<|>(A:) =4>~1(-^) (6.8-16)
Sec. 6.8 The State Transition Matrix 435

Computation of c|>

In general, the computation of the state transition matrix for the time-
varying case is a formidable task. Clearly, for any large value of n,
Eq. 6.8-4 becomes most unwieldy. In certain cases, where the difference
equations of the system can be handled, an analytical solution can be
obtained (see Section 2.13). However, this occurs rarely. Use of a com¬
puter is generally the best method to obtain a solution.
For the fixed system, an analytical solution can generally be obtained.
Equation 4.10-20 provides one method of computing the state transition
matrix. Some others follow.

1. Cayley-Hamilton Method. For the discrete time case, the Cayley-


Hamilton procedure can be used for computing Ak. Here the /(A*) to be
used is Ak rather than the eXit used for the continuous case.
Example 6.8-1. Compute <J>(U for the difference equation
y[{k + 2)7”] + 5 y[{k + 1)7”] + 6y{kT) = 0
The A matrix for this system is

A =

assuming x-dJcT) = y(kT), x2(kT) = y[(k + 1)7]. The characteristic equation |AI — A|
= 0 has two characteristic roots Ax = —2, A2 = —3. Therefore
F(AX) = Ay = (—2)k = a0 -f oqAj = a0 — 2cc1
F(A2) = Ay = ( — 3)k = a 0 + ax/l2 = a0 — 3ax
From these two equations, a0 = 3(—2)k — 2(—3)k and oc1 = (—2)k — (—3)k. Hence
F(A) = Ak = a0I + axA
or
' 3(—2)fc - 2(—3)* (-2Y-C-3)k -
<KU =
_—6[(—2)fc - (-3)*] —2(—2)k + 3( — 3)fc_

Example 6.8-2. Compute <t>(£) for the system whose A matrix is given by

A =

The characteristic equation |AI — A| = 0 has two characteristic roots located at — 1.


For this case of repeated roots, the conditions which must be used to obtain the a’s are
~n—1
dsF(A) d* d’
dAs dAs
lMr X = \t = 5T*[ct0 +
s = 0, \,2, . . . ,pi — 1
_r=0

where is the order of the root. Hence ( — l)fc = a0 + cclA1 — a0 — ax and —k(—l)k —
ax, or ax = — k{— l)fc and a0 = ( — l)fc(l — k). Therefore

ao al —k ~
<t>(A) = a0I -(- axA =
-«1 a0 — 2ax 1 + k
436 State Variables and Linear Discrete Systems

Example 6.8-3. It is informative to take up the case in which the A matrix may have
complex roots. For this reason, determine <f>(&) corresponding to

0 r
A =
-2 2

The characteristic values of this matrix are 1 ± j. For purposes of evaluating Afc, the
polar form V2 e±^/4 is most useful. The computation of Afc is then

f(A,) = (2)*/V*»/4 = <*„ + «,+

F(A,) = (2)*/2e-«r/4 = «„ + -ya.

Adding and subtracting these equations yields (2)fc/2 cos (Jar14) = a0 + ax and ax =
(2)fc/2 sin (/C77-/4), or a0 = (2)fc/2[cos (kir/4) — sin (/T77-/4)]. Since

a0 ai
<4>(A-) = a„I + axA =
— 2ax a0 + 2ax
then

<J>(T) = (2)fc/2
(kir kir
cos-b sin —
4 4

2. Frequency-Domain Method. The Z transform, analogously to the


Laplace transform, can be used to find the state transition matrix of the
equation
x[(A: + l)T] = A x(kT) (6.8-17)

Transforming both sides of Eq. 6.8-17, zX(z) — zx(0) = AX(z), where use
has been made of Eq. 3.12-3. Thus

X(z) = (zI - A)-1zx(0) (6.8-18)

or X(z) = 4>(z)x(0), where

<p(z) = (zl - A)-1* (6.8-19)

This form is slightly different from the analogous form ^(s) = (si — A)-1
for continuous systems. The state transition matrix is given by the inverse
z transform of 4>(z), or

4>(/c) = £F-l[(zl - A)-Jz]


= 2 residues of (zl — A)_1zfc (6.8-20)

Example 6.8-4. Using the A matrix of Example 6.8-1, determine <|>(T) by the fre¬
quency-domain method.
Sec. 6.8 The State Transition Matrix 437

For this case,


z -1 ”
(zl - A)
6 z -f" 5
so that
z + 5 r
—6 z_
(zl - A)-1
(z 4- 2)(z 4- 3)

Determining the sum of the residues of (zl — AWz* gives

" 3(—2)fc - 2( —3)fc (-2)*-(-3)* -


4>(k) =
_—6[(—2)fc - (-3)*] —2(—2)k + 3( —3)fc

Example 6.8-5. Using the A matrix of Example 6.8-2, determine <\>(k) by the frequency-
domain method.
For this case,
"z -1 “
(zl - A)
1 z -f- 2
so that
"z + 2 r
-l z_
(zl - A)-1
(Z + l)2

Determination of the sum of the residues of (zl — A)_1zfc yields

—k
c|>(/c) = (~l)k
1 + k

3. Transfer Function Method. As for the continuous case, the simula¬


tion diagram can be used to obtain the terms O^(z). The basis for this
method is that the solution to Eq. 6.8-17 is

x(kT) = 4>(/:)x(0) (6.8-21)

Therefore xt{kT) is given by

z>(kT) =2,<f>ii(k)xj(0) (6.8-22)


3=1

If all the state variables except the yth are set equal to zero, and a unit
initial condition is placed on xjt then the response at the ith state variable
xfkT) represents the term ^ifk). Therefore the transfer function from the
output of the yth delay to the output of the zth delay represents the term
O^(z). This is slightly different from the continuous case, where the
transfer function was calculated between the input to they'th integrator and
the output of the zth integrator. The difference is due to the fact that the
transfer function in the continuous case is expressed as the transform of the
438 State Variables and Linear Discrete Systems

impulse response (in the continuous case, a unit impulse input establishes a
unit initial condition immediately), while in the discrete case it is the trans¬
form of the unit initial condition response. If an analogy is desired, then
the transfer function from the input of the yth delay to the output of the zth
delay must be multiplied by 2 to obtain Oi5(z). Since what is generally
desired is (zl — A)-1 = z-1<i>(z), the transfer function from the input of
the yth delay to the output of the zth delay is perfectly suitable.
Example 6.8-6. The simulation diagram of Fig. 6.8-la represents the system of Example
6.8-1. Determine <&(k).
Figure 6.8-16 is the same diagram redrawn for the convenience of computing transfer
functions. Since 1/(1—loop transfer function) is equal to z(z + 5)/(z2 -F 5z -f 6), the

(a) (b)
Fig. 6.8-1

various transfer functions z~1<&ij(z) can be obtained by multiplying the forward transfer
function from j to i by z(z + 5)/(z2 + 5z + 6). By performing this operation, the matrix
(zl — A)-1 = z_1<i>(z) is obtained. Thus
z + 5 1

—6 z_
[zl - A]-1 = z~'®(z)
z2 + 5z + 6

The inverse transform of 4>(z) gives 4>(£) as in Example 6.8-4.

Adjoint System—The State Transition Matrix

Similar to the adjoint of an unforced linear continuous system, there


exists an adjoint corresponding to an unforced discrete linear system. The
adjoint operator L* is defined in terms of the system operator L by

(a, Lx) = (L*a, x) (6.8-23)

where a is the adjoint vector and the inner product denotes!


k—1
(a, b) = 2 aT[(i + l)T}b(iT)
i=lc 0

f In the case of complex elements, the transpose is replaced by the conjugate transpose.
Sec. 6.8 The State Transition Matrix 439

Considering (a, Lx),

(a, Lx) = 2 <xr[0' + 1 )7]{x[(i + 1)7] - A(iT)x(/T)}


i=k o
jc_2

= a.T(kT)x(kT) + 2 aT[(i + l)T]x(i + 1)7]


i—kor-1
7c—1

- ar(k07)x(k07) - 2 «r[(i + l)7]A(i7)x(i7)


£=fr()
fc—1

= 2 {aT(lT) - aT[(* + l)]A(iT)}x(iT)


*'=fco

+ ar(kT)x(kT) - ctT(k07)x(fc07) (6.8-24)


Then by identifying

L*a = aT(iT) - ar[(i + l)7]A(i7) (6.8-25)

so that the adjoint equation is

otr(i7) = aT[(i + l)7]A(i7) (6.8-26)


the unforced system equation

x[(i + 1)71 = A(/7>0T)


can be multiplied by aT[(i + 1)7"] and combined with Eq. 6.8-26 to yield

a T[{i + l)7]x[(i + 1)7] = ar(i7)x(i7)


Then by iteration
aT(kT)x(kT) = aT(k0T)x(k0T) (6.8-27)
Substitution of Eqs. 6.8-25 and 6.8-27 into Eq. 6.8-24 yields Eq. 6.8-23.
Thus Eq. 6.8-26 rewritten as

a[(i + 1)T] = [A-\iT)]Ta(iT) (6.8-28)


is the equation for the adjoint system.
Since the system state transition matrix <\>[(k, k0)T] must satisfy Eq.
6.8-1,
4>[(* + 1, K)T] = A(/c7")4>[(/c, £0)7"]
Taking the inverse and transposing,!

{<t>_1P + 1. k0)T]}T = [A-1(/c7)]r{<|>-1[(k, k0)7]}T (6.8-29)


Comparison of Eqs. 6.8-28 and 6.8-29 indicates that the state transition
matrix of the adjoint system is

{<t>-*[(A, k0)T]}T = 4>T[(k0, k)T] (6.8-30)


t If A(kT) contains complex elements, the conjugate transpose must be taken.
440 State Variables and Linear Discrete Systems

Note that the presentation here is the reverse of that of Chapter 5. There
the continuous analogs of Eqs. 6.8-28 and 6.8-30 were defined, and the
continuous analog of Eq. 6.8-23 resulted. Here Eqs. 6.8-28 and 6.8-30 are
obtained from the definition of the adjoint operator, Eq. 6.8-23.
For the discrete time adjoint system, the state transition matrix

{♦"'[(ft, K)T]}T
can be found by iteration of Eq. 6.8-28. The result is

{4>-1[(k, k0)T]}T = n [A-1(nT)f


n=k o
k > k0 (6.8-31)

The state transition matrix for the original system, for k0 < k, can be
found by reverse iteration of Eq. 6.8-1, i.e., Eq. 6.8-1 can be written as
x(nT) = A~\nT)x[{n + 1)7"], and this expression can then be iterated
from n = k down to n — k0. The result is

x(k0T) = IT A~\nT) x(/cT)


_n=&o
so that
Jc-1

<t>[(/c0, k)T) = n A \nT) k > k0 (6.8-32)


n=k(

These two equations show the validity of Eq. 6.8-14, namely,

4>-1[(k,k0)T]=<S>[(k0,k)T] k> k0

However, Eq. 6.8-31 was obtained by a forward iteration or running the


adjoint system forward in time from k0T to kT, while Eq. 6.8-32 was
obtained by a reverse iteration, or running the original system backward
in time from kT to k0T. This is shown in Fig. 6.8-2.

-1 (nT)
4>Po, k)T =n=k
VA o

k0)T\ = n A (nT)

ko ♦ +k
k-1
[(ko, k)T] = n A(nT)
n — ko

^ k~l -1
<)>-1 \(k, k)T] = n A l(nT)
n=kn

Fig. 6.8-2
Sec. 6.9 The Complete Solution 441

Note that the form of the state transition matrix for increasing time
(Eq. 6.8-4) is different from the form for the state transition matrix for
decreasing time (Eq. 6.8-32). The reason for this difference is that reversing
the time direction for a discrete system entails an inverse A matrix
{x(kT) = Ar\kT)x[{k -f 1)T]}. Reversing the time direction for a con¬
tinuous system simply entails reversing the sign of the A matrix [<d/d(—t) =
—dldt]. A set of alternative forms of the state transition matrix is shown
in Table 6.8-1.
Table 6.8-1

Original System Adjoint System (Transposed)

Forward Direction Forward Direction

<t>{(k, k0)T] = ff MnT) k > k0 Q-'Kk, k0)T] =f[A-\nT) k > k0


n=k0 n=k0

<t> [(*o. k)Tl = fj>


n=k
(nT) k0 > k <t>-1[(*o, k)T] = ffA_1(«7’)
n=k
k0 > k

Reverse Direction Reverse Direction


fa_l ft—1

4>[(*o, k)T] = n A-\nT) k > k0 <t>—1 [(A'o, k)T] = n A(«7") k > k0


n=k0 n=k0

<S>l(k, k0)T] =if A-^nr) k0 > k k0)T] = jjA(nT) k0 > k


n=k n=k

6.9 THE COMPLETE SOLUTION

The complete solution to the set of state equations

x[(k + 1)71 = A (kT)x(kT) + B(kT)\(kT)


(6.9-1)
y (kT) = C (kT)x(kT) + D(kT)\(kT)

can be found by a process of iteration and induction, similar to the method


used to obtain the state transition matrix. However, to illustrate the anal¬
ogy between discrete and continuous systems, the adjoint system is used in
a method similar to the integrating factor method.
The state transition matrix of the adjoint system (c}>—1 [(/c, k^)T]}T must
satisfy Eq. 6.8-29. Taking the transpose and postmultiplying by A {kT)
yields
<tk0)T] =4>-'[(k + 1, k0)T]A(kT) (6.9-2)
442 State Variables and Linear Discrete Systems

If Eq. 6.9-1 for x is premultiplied by <|>-1[(A: + l,k0)T] and Eq. 6.9-2


is postmultiplied by x(kT) and the difference between the two equations
taken, the result is

Q-'Wc + 1, k0)T]x[(k + 1)71 K)T]x(kT)


= 4+ 1, k0)T]B(kT)\(kT) (6.9-3)

If k is replaced by m in Eq. 6.9-3 and both sides are then summed from k0 to
k — 1, the result is

4>-1[(/c, k0)T]x(kT) - 4>_1[(/c0, k0)T]x(k0T)


\
k—1

= 2 <(>_1[(m + 1, /c0)T]B(mT)v(mT) (6.9-4)


m=k o

Since 4>_1Po> k0)T] — I, premultiplication by <f>p, k0)T] gives

x(feT) = <|>p, k0)T]x(kBT) + 2 4>P, m + l)T]B(mT)v(mT) (6.9-5)


m=ko

The first term on the right side of Eq. 6.9-5 represents the initial condition
response of the system, while the second term represents a superposition
summation of the effects of the forcing function. This equation is analo¬
gous to Eq. 5.10-6 for the continuous case.
When the system under investigation is fixed, Eq. 6.9-5 can be written as
the sum of an initial condition response and a convolution summation:
k—1

x(kT) = cf>(/c — k0)x(k0T) -f J <b(k — m — l)Bv(mT) (6.9-6)


m=k o

The corresponding output y(kT) is obtained by substitution of Eq. 6.9-5


or 6.9-6 into Eq. 6.9-1. Thus

y(kT) = C(fcT)4>[(fe, k0)T]x(k0T)


k-1

+ 2 C(kT)4>[(k, m + l)T]B(mT)v(mT) + D(kT)y(kT) (6.9-7)


m=ko

for a time-varying system, or

y {kT) = C<j>(fc - k0)x(k0T) + 2. C<S>(k -m- l)Bv(mT) + D y(kT)


m=ko

(6.9-8)
for a fixed system.
For the case in which the system is fixed, it is frequently convenient to
use the frequency-domain approach. For this case, Y(z) can be found by
transforming Eq. 6.9-1 directly. The result of this operation is
X(z) = (zl - A)-1zx(0) + (zl - A)-1BY(z) (6.9-9)

Y(z) = C(zl - A)_1zx(0) + [C(zl - A)_1B + D]V(z) (6.9-10)


Sec. 6.9 The Complete Solution 443

An example is now given which illustrates the various approaches that can
be taken.

Example 6.9-1. Determine the solution to the difference equation

y(k + 2) + 5 y(k + 1) + 6 y(k) = 1 (T = 1)

by each of the methods indicated, below.

1. Classical Solution—Time Domain. Assume yH(k) = ftk. Then (/?2 + 5(3 + 6)j3k =
0, or /h = —2 and /?2 = —3. The homogeneous solution is then yH(k) = Ci(—2)k +
C2( — 3)fc. Assume that yP(k) = C3. Then C3 + 5C3 + 6C3 = 1, or C3 = A. The
total solution is
y(k) = Cx(—2)fc + C2(—3)fc + A

Using the initial conditions y(0) and y( 1),

2/(0) = Cx + C2 + A and 2/(1) = -2CX - 3C2 + A

The constants Cx and C2 are

Cx = 32/(0) + 2/(1) - i, C2 = i - 22/(0) - 2/(1)

Substituting these constants into the total solution gives the complete solution

y(k) = [3(—2)* - 2(-3)%(0) + [(-2)* - (-3)%(1) + li(—3)fc - K~2)fc + A]

2. State Variables Technique—-Time Domain. From Example 6.8-1, the state


transition matrix for this system is given by

3(-2)* - 2(—3)* (—2)fe - (—3)fc '


4>(*) =
— 6[{—2)k - (—3)fc] — 2(—2)fc + 3(-3)fc_

The B, C, and D matrices are


'O'

B = C = [1 0], D = [0]
1
The output y(k) is given by Eq. 6.9-8 as
k—1
y(k) = C<t>(A:)x(0) + 2 C4>[(^ -m - \)]By(mT) + Dy(kT)
m=0
which, for this case, reduces to
Jc-l
y(k) = <f>u(k)y(0) + ^(l) + J 0i2(k - m - 1)
m=0

The computation of the summation for a forced input may involve some skill in
finding a closed-form expression for the resulting series. The summation formula

ki 1 — ak
X ak-m-1 _ -- (sum of a geometric series)
m=0
1 — a

is of particular use in expressions of this type.13 Since 012(k — m — 1) = (—2)k m 1 —


( —3)fc_m_1, the summation in this case is equal to

1 _ (—2)k 1 — ( —3)fc 1 1 1
-_1 - u - -—L_iL = - + - (-3)fc - - (-2)*
1 - (-2) 1 - (-3) 12 4 3
444 State Variables and Linear Discrete Systems

The complete solution is then

y(k) = [3(-2)* - 2(-3)%(0) + [(-2)* - (-3)%(1) + [i(-3)k - K~2)k + A]

The expression above for the sum of a geometric series is particularly helpful if the
matrix summation
k-1
TO— 1
m=0

is to be found. The Cayley-Hamilton method can be applied where F(2t) = (1 — 2/)/


(1 - K).
A useful relation in finding the closed-form expression, if one exists, for a summation
is the formula for summation by parts, analogous to the formula for integration by parts.
This formula is given by (see Problem 2.12)

N N
^ u(/c) Av(k) — [uij^vik)]1^1 — ^ v(k +1) Au(k)
M M

As an example, the summation


N
prk r ^ l
o
can be found by setting

u(k) = k, A v{k) = rk. Then A u(k) = 1,

k 1 pk — J pk
v(k) = ^ rn + Q = + Ci — + Co
r - 1 r-1

For convenience, let C2 = 0. Then

N 2T+1 1 N
±1
kC 1
^ krk =
o r - 1
1
- [NrN+2 - (A + 1 )rN+1 -hr] r ^ 1
(r - 1)

3. Standard Z Transform Technique. Taking the Z transform of both sides of the


given difference equation, there results

Z2[¥(z) - y(0) - z-h/( 1)] + 5z[Y(z) - 2/(0)] + 6Y(z) =


z - 1
or
zKz - i) + A*y(C) + ?/(i) + 5?/(0)]
Y(z) =
(z + 2)(z -f- 3)

The poles of this function are at 2 = 1, —2, —3. Since

y(k) = ^ residues of Y(z)zk~1 at poles of Y(z)

there are three residues to compute.

Ri = (z- \)Y(z)zk~1\z=1 =A
Rz = (z + 2)Y(z)zk~'\z=_2 = [-1 + 3t/(0) + y(m~2)k

R3 = (z + 3)T(2)2fc-1L=_3 = [| - 2t/(0) - 2/(l).l(—3)fc


Sec. 6.9 The Complete Solution 445

The complete solution is then

y(k) = [3( —2)* - 2(-3)*M0) + [(-2)* - (-3)%(1)

+ [i(-3)s - K-2)‘+ A]

4. State Space Technique—Frequency Domain. From Eq. 6.9-10, the Z transform


of the output y{k) is

Y(z) = C(zl - A)-12x(0) + [C(zl - A)_1B + D]V(z)

From Example 6.8-4, the matrix (2I — A)-1 is given by

2 T 5 1
-6
(zl - A)-1 =
(2 + 2)(z + 3)

The B, C, and D matrices were given in part 2 of this example as

O'
B = C = [1 0], D = [0]
1
The transform of the output, Y(z), is then

x 2(2 + 5)
Y(z) = - — y(0) +
+ + 3)
y( i) +
(2 + 2)(2 + 3) (2 2)(2 (2 - 1)(2 + 2)(2 + 3)

This 2 transform corresponds to the Y(z) found in part 3 of this example. Therefore the
answer is identical with that given in the three preceding parts of this example.

5. State Variables Technique—Normal Form. The modal matrix M is found by


successive substitution of the characteristic values of A into the matrix Adj [21 — A].
The characteristic values of A are = —2, 22 = —3, and Adj [21 — A] is given by

2 + 5 1
Adj [21 - A] =

Therefore M is given by
' 1 r ' 3
M = and M 1 =
-2 -3 -2
Since
-2 O'
M XAM = A =
0 -3

the transformation q = M_1x leads into the form

q (k + 1) = Aq(*) + B nv(k) (T = 1)
y (k) = Cnq (k) + Dv(/:)

where Bn = M_1B and Cn = CM. Therefore qx = 3xx + x2, q2 — —2xx — x2, and

1
B,i = and Cn = [1 1]
-1
The output y(/c) is then
k-1
y (*) = C„A‘q(0) + J C„A‘-”*-1Bnv(m)
m=0
446 State Variables and Linear Discrete Systems

from Eq. 6.6-4 and the above. Since A is a diagonal matrix, finding Ak means simply
raising the elements of A to the k\h power. This is one of the advantages of using this
method. It follows that

y(k) = (-2)^(0) + (-3)*?2(0) +2 - (-3)*-”—}


771=0

or
m = (-2)^(0) + (-3)^2(0) + [K-3)fc - K~2f + A]

If it is desired that y(k) be expressed in terms of the original initial conditions, then
qi(0) = 3aq(0) + *2(0) = 3i/(0) + 2/(1) and qz(Q) = — 2aq(0) — s2(0) = — 22/(0) — 2/(1)
yield
y{k) = [32/(0) + 2/(l)](—2)fc - [22/(0) + 2/(l)]( —3)fc + [£(-3)* - K~2)k + A]

6. Mode Expansion Method. Since


1 1 3 1'
M = and M 1 =
-2 -3 -2 -1
from part 5, it follows that
' r 1'
u, = and u2 =
-2_ •3
form a basis, and
'3' -2 '

r, = and r2 =
1 -1
form a reciprocal basis. Thus, from Eq. 6.6-6,

y(k) = xx(k) = [3%(0) + a?2(0)](—2)k + [—2aq(0) - *2(0)](—3)*


k—1
+ 2 [(—2)fc_m_1 - (-S)*-™-1]
771=0

This is the same expressions as for y(k) in terms of the </>’s in part 2. Thus performance
of the indicated summations and substitution of 2/(0) = z+0) and 2/(1) = «2(0) yields
the desired result.

7. State Variables Technique—Partial Fraction Expansion. Since


1 1 1
H(z) =

z2 + 5z + 6 z + 2 z + 3
then
V(z) V(z)
Y(z) = —4 - 4 - = xx(z) + xt(z)
z+2 z+3

or y(k) = xx(k) + x2(k), where xx and x2 satisfy the first order difference equations

xjjc + 1) + 2 xx{k) = v(k) and x2(k + 1) + 3 x2(k) = —v(k)

Therefore the A, B, C, and D matrices are


-2 O' ' r
A = B = C = [1 1], D = [0]
0 -3 -l

Since these are exactly the same matrices which were derived in part 5, y(k) is then

y(k) = (—2)kx1(0) + (—3Ar2(0) + [£(-3)* - i(~2)k + Ar]


Sec. 6.9 The Complete Solution 447

However, since the transfer function gives no clue to the relationship between y and
x, this relationship must be obtained from examination of y{k). This is necessary
because the known initial conditions are in terms of 2/(0) and y{ 1). The desired relation¬
ships can be found by substituting k = 0 and k = 1 into y(k) = xx{k) + x2(k). This
gives 2/(0) = 3^(0) + z2(0) and 2/(0 = aq(l) + #2(0* Now, using the equations for
x^k + 0 and x2(k + 0 at k = 0, these expressions become

2/(0) = ^(0) + x2(0) and 2/(0 = — 2xx(0) — 3x2(0)

Solving these two simultaneous equations yields

»i(0) = 32/(0) + 2/(0 and »2(0) = -22/(0) - 2/(0

which are the same relationships found in part 5 for the q's. Although this procedure is
not too difficult for a second order system, it may prove to be laborious for higher order
systems with many inputs and outputs.

In looking through these various approaches, there are advantages and


disadvantages to each. For single input-output fixed systems, the standard
Z transform approach is certainly the easiest to use. For multiple input-
output systems, a matrix technique is advisable. Which one of the matrix
techniques to use is a different question. Since time-varying systems are
almost impossible to solve analytically, the time-domain formulation of a
time-varying system is best for computer purposes. For the fixed case,
the Z transform of the state equations is quite useful, since numerous
Z transform tables are available. The use of the Z transform bypasses some
of the difficulties in evaluating the summation forms that are obtained in
a time-domain computation. It appears that the normal form is perhaps the
best form conceptually, for mathematical proofs, and for time-domain
analysis of the unforced system. When the system is subject to a forcing
function, the Z transform of the normal form is most convenient.

Stability of Fixed Discrete Systems

For a discrete fixed system, the state transition matrix approaches zero
as k approaches infinity, if the characteristic values of A are located inside
the unit circle. If a characteristic value lies on the unit circle and is of
order one, then 4>(k) is bounded as k approaches infinity. For any charac¬
teristic values which lie outside the unit circle or for multiple characteristic
values which lie on the unit circle, <\>{k) becomes infinite as k approaches
infinity. These statements can be proved from the Cayley-FIamilton
method of obtaining A/c, which depends upon obtaining certain elements
oim such that
448 State Variables and Linear Discrete Systems

where n is the order of the A matrix. The am are obtained from the equa¬
tions
to—1

or

x—ki dXv k=ki

where p = 0, 1, 2, . . . , r — 1 for the case where the characteristic value


is of order r. When the characteristic valuesv are distinct,7 the elements
of Ak contain linear combinations of elements such as (Af)fc. These ele¬
ments vanish as k approaches infinity if |2J < 1, become bounded if
\2.i\ = 1, and become unbounded if |Aj > 1. When there are multiple
characteristic values, the elements of Ak contain linear combinations of
elements such as fcAJ-1. Clearly, these elements are unbounded for
|AJ > 1 and approach zero as k approaches infinity for < 1.

6.10 THE UNIT FUNCTION RESPONSE MATRIX

The output y(kT) of a linear time-varying discrete system can be written


in terms of the A, B, C, and D matrices as
k-1
y(kT) = 2 C(/cT)<j>[(/c, m + l)T]B(mT)\(mT) + D(kT)\(kT) (6.10-1)

by setting k0 equal to — oo in Eq. 6.9-7. In terms of the unit function


response matrix H[(/:, m)T], the output y(kT) is given by the superposition
summation
k
y(kT) = 2 H[(k, m)T]v(mT) (6.10-2)
m=— oo

A comparison of Eqs. 6.10-1 and 6.10-2 shows that the unit function re¬
sponse matrix is

C(fcD4>[(k. k0 + l)T]B(/c0T) k ^ k0 + 1
H[(/c, k„)T] =
D (kT) k = k0
= [0] k < k0 (6.10-3)

For a fixed system the unit function response matrix is

k ^ k0 -f 1
k — ko
Sec. 6.10 The Unit Function Response Matrix 449

In the frequency domain, from Eq. 6.9-10, the unit function response
matrix H(z) for a fixed system is given by

H(z) = COI - A)-1B + D (6.10-5)


Example 6.10-1. The system of Fig. 6.10-1 represents a simple sampled-data system.
The block (1 — e~sT)/s is commonly called a zero order hold, since it takes the input
sample at time kT and provides this value as its output until time (k + 1 )T. Determine
the unit function response matrix H(kT) assuming T = 1.

r~ -y(kT)
v(t) 1 — e
— sT
1 ■
s(s + 1J
-y(t)
V
T= 1

Fig. 6.10-1

The transfer function G(z) of the forward transmission is

1
G(z) = (1 - z-1)^
s\s + 1) 2 — e”1 2 — 1

Using this transfer function, the sampled-data system of Fig. 6.10-1 can be redrawn as
the discrete time sy stem of Fig. 6.10-2. In this figure, the forward path has been broken

y(kT)

up into its partial fraction expansion. The closed-loop system state equations can then
be written down by inspection. They are
1_

Xx (k + 1) 1 - e—1_
m
rH

+
o

_x2(k + 1)_ _x2(k)


i

xx(k)
y(k) = [i i]
x2(k)_
450 State Variables and Linear Discrete Systems

From Eq. 6.10-4, the unit function matrix H(&) is

Mk- i) <f>l2(k- ly V1 - 1
H(/c) = [1 1]
_4>2l(k 1) 4*22^ 1).
Since
a0 + ax ax — a^-1
<J>(A: — 1) = A(A:_1) = a0I + axA =
ai a0
then H(k) = a0e_1 + ax(l — e-1), where a0 and cn1 are to be determined. The roots of
the characteristic equation

|AI — A| = 22 — 2 + (1 — c-1) = 0

are Ax 2 = Ne±id where N — 0.795, 0 = 0.680 radian. Applying the Cayley-Hamilton


method, where F(kt) — Afc_1, the equations

]\j(k—l)€j(k—l)0 = a0 + oi^ei0

]\f(Jc-l)e-j(k-l)6 = ao +

are obtained. Solving for a0 and ax,

a0 = A(fc-1)[cos (k — \)0 — cot 6 sin (k — 1)0]

A^-2) sin [(k - 1)0]


ai =-—a- sin 0
The unit function response is then

H(k) = €_17V^_1) [cos (A: — 1)0 — cot 0 sin (k — 1)0]

(1 e-i)N(*-2)
+ sin (& — 1)0
sin 0
or
#(£) = (O.795)(fc-O|o.368 cos [0.680(/c - 1)] + 0.724 sin [0.680(A - 1)]}

This result can be checked by using conventional feedback methods.

_ G(z) _ 0.368(2 + 0.72) fW = 0.795


7/(2 ” 1 + G(z) ~ (z - Nejd)(z - Ne-jd) { 0 = 0.680 radian

Taking the inverse Z transform,

(Nejd + 0.72)ej(k-Dd - (Ne~jd + 0.12)e~j(<k-~De


H(k) = 0.368N<k~D
N(eje - c-je)

After some manipulation, this can be written as

0.72 \
H(k) = 0 MSN'*-1' cos (k — 1)0 + ^cot 0 + sin (k — 1)0
N sin 0/

= (0.795)(fc_1){0.368 cos [0.680(A - 1)] + 0.724 sin [0.680(A: - 1)]}

For a time-varying system, the elements of the unit function response can
be obtained by simulation in a manner analogous to that used for con¬
tinuous systems. For an m input-/? output system, the ith output y^kT) is
Sec. 6.10 The Unit Function Response Matrix 451

given by
k m

Vi(kT)= 2 2,hii[(k,n)T]v£nT) (6.10-6)

Theyth column of H[(/c, n)T] can be obtained by setting all the inputs except
theyth equal to zero, and placing a unit function on theyth input at time
nT. The outputs of the system are h^Kk, n)T] for i = 1,2,...,/? and for
fixed j. In order to obtain the complete response /zi;[(/c, n)T] as a function
of both k and n, a set of runs must be performed, each run starting at a
different time nxT, n2T,... . The results of these runs must then be cross-
plotted to obtain the variation with respect to nT, the point in time of
application. Proceeding to different values of /', these tests must be repeated
until all m columns of the unit function response matrix are obtained. This
is the same problem that was presented in the last chapter where the impulse
response matrix H(f, r) was obtained by a similar cross-plotting procedure.
In this discrete case, however, the difference equations can be solved on a
digital computer and the necessary cross-plotting also done by the com¬
puter. Thus the discrete modified adjoint system is not discussed.
Example 6.10-2. In Section 5.11, the differential equation y + ty + y = 0 was ana¬
lyzed, and the impulse response h(T, t) was obtained by performing a set of simulation
runs for different values of r and cross-plotting the results for fixed t = T. The results of
the simulation runs are shown in Fig. 5.11 -2a, and the results of the cross-plotting are
shown in Fig. 5.11-26. Perform the same task, but for the discrete version of the same
differential equation. This method is often used to solve a time-varying differential
equation numerically by either a desk calculator or a simple digital computer routine.
A discrete version of the equation can be obtained in the following manner. Since

d2y y(t + h) — 2 y(t) + y(t — h)


= lim
h-+0 h2

y(t + h) — y{t)
h

an approximate solution can be obtained by letting h — T and t = kT, and using a


“small” value for T. Thus

yikT +T)~ 2y(kT) + y{kT - T) kT[y(kT + T) — y(kT)] , „ _ .


-—-1---b y(kT) = 0
T
or
(1 + kT*)[y(kT + T)] + (T2 - kT2 - 2)[y(kT)\ + [y(kT — F)] = 0

The initial conditions y{r) = 0, yU) = 1 are replaced by

V(k0T) = 0

y(k0T+ T)-y(k0T)
T
or y(k0T) = 0 and y(k0T + T) = T.
452 State Variables and Linear Discrete Systems

k T—>-
Fig. 6.10-3a

Using a value of T — 0.05 and 60 = 0, 4, 8, 12, , 48, the points shown on Fig.
6.10- 3tf are obtained. Obtaining these points is a relatively simple task for a digital
computer, and the cross-plotting can also be performed by the computer. A desk
calculator can also be used. However, care must be taken to use sufficient accuracy
in computing each point, as the round-off errors can build up rapidly. The interested
reader can consult any of the many texts written on numerical solution of differential
equations.14’15
As a comparison of the accuracy that can be obtained by simple numerical methods
the results of the continuous system simulation run and the results of the approximate
numerical solution for r = k0 = 0 are listed in Table 6.10-1. The numerical solution
is given to three places, while the simulation run is given to two places, this being the
accuracy of reading from the original recording. For this comparison, the discrete
simulation reproduces the results of the continuous simulation within the accuracy of the
recording of the continuous information.
A crossplot of the points at KT = 2.0 is shown in Fig. 6.10-36. The points of Fig.
6.10- 36 represent 6(2.0, k0T), the unit response of the system as kT = 2.0 as a function
of the time of application of the unit input. A comparison of the cross-plot obtained by
discrete simulation and the cross-plot obtained by the continuous simulation (Fig.
5.11- 26) is shown in Table 6.10-2. This comparison shows that the discrete simulation is
fairly good, but not within the accuracy of the continuous system. The differences
between the continuous and discrete systems are due to the accumulation of round-off
error and the basic approximation involved. Because these effects are more noticeable
farther out along a simulation run, a cross-plot at kT = KT shows these effects more than
a cross-plot at kT < KT.
Sec. 6.10 The Unit Function Response Matrix 453

Table 6.10-1

T = 0.05

k h(k, 0) h(t, 0) k h(k, 0) h(t, 0)

0 0 0 22 0.751 0.75
2 0.100 0.10 24 0.766 0.76
4 0.198 0.20 26 0.771 0.77
6 0.291 0.29 28 0.769 0.77
8 0.380 0.38 30 0.760 0.76
10 0.461 0.46 32 0.746 0.75
12 0.534 0.53 34 0.727 0.73
14 0.597 0.60 36 0.705 0.71
16 0.651 0.65 38 0.679 0.68
18 0.695 0.69 40 0.653 0.65
20 0.728 0.72

Figure 6.10-36 can also be obtained by a discrete simulation of the modified adjoint
differential equation. The original differential equation is y + ty + y = 0, and the
adjoint differential equation is a — (d/dt)(t(x.) + a = 0 or a — to. = 0. Making the
change in variable t = T0 — tu the modified adjoint differential equation is then

d2o do.
+ (T0 — 11) = 0
~dtT2 dtx

0.7
1-T”
1
o

_ • • T = 0.05
0.6


0.5 _ •

P 0.4
Cross-plot of unit function •
o' response at KT = 2.0
C\j
^ 0.3 — • —

0.2 —


0.1 — —

_1_J__1 -1
0 0.5 1.0 1.5 2.0
kT-^
Fig. 6.10-36
454 State Variables and Linear Discrete Systems

Table 6.10-2

T = 0.05

k h(2.0, k0T) h(2.0, r)

0 0.653 0.65
4 0.628 0.63
8 0.603 0.60
12 0.574 0.56
16 0.541 0.53
20 0.501 0.48
24 0.450 0.43
28 0.385 0.37
32 0.297 0.28
36 0.175 0.17
40 0 0

The discrete simulation of this equation is found by the procedure used to find the
discrete simulation of the original differential equation. The resulting difference
equation is

[1 + (K - k')T2]y(k'T + T) + [-2 - (K - k')T2]y(k'T) + y(k'T - T) = 0

where KT = T0 is the fixed end time. A comparison of the points obtained by this
simulation and those obtained from the continuous system simulation is shown in
Table 6.10-3.

Table 6.10-3

KT = 1.0, T = 0.1

k hjc'*(k', 0) 0)

0 0 0
1 0.100 —

2 0.191 0.18
3 0.276 —

4 0.350 0.33
5 0.420 —

6 0.487 0.47
7 0.550 —

8 0.612 0.60
9 0.673 —

10 0.733 0.72
Sec. 6.10 The Unit Function Response Matrix 455

Transmission Matrices

For single input-output systems, the transmission matrix is sometimes


used to describe the unit response of the system. The transmission matrix
is simply an ordered array of the elements h(iT, jT), where i indicates the
row and j indicates the column where the element is located. The general
form for the transmission matrix is then

" h(0, 0) 0 0 0
h(T, 0) h{T, T) 0 0
Hx(kT, k0T) = h(2T, 0) h(2T, T) h(2T, 2T) 0

h{m T, 0) h(mT, T) h(mT, 2T) • • • h(mT, mT)


(6.10-7)

All elements to the right of the element h(iT, iT) are zero, since the system
is assumed to be physically realizable, or nonanticipative. If the input
v(kT) is ordered into a column vector whose components are i>(0), v{T),
v{2T),. . . , v(mT) then the output of the system can be written as

y(kT) = HT{kT, k0T)\(k0T) (6.10-8)

It is understood that the output y(kT) is ordered into a column vector


whose components are y(0), y(T), y(2T),. .. , y(mT). The ith component
of the column vector y(kT) is then

y(iT) = 2 h[(i, k0)T]v(k0T) (6.10-9)


fc0=o

For a fixed system, the elements of the transmission matrix are h(iT —
jT), the argument being the difference between the time of observation and
the time of application. Thus

m 0 0 • • • 0

h(T) m 0 • • • 0
H T(kT) = h(2T) h{T) h{ 0) 0

h(mT) h(mT — T) h(mT — 2 T) • • • h( 0)


(6.10-10)
These matrices are not very useful when dealing with systems with several
inputs and outputs, but they have some application when dealing with
single input-single output systems or systems comprised of interconnected
single input-single output systems.
456 State Variables and Linear Discrete Systems

Example 6.10-3. Repeat Example 1.9-1 using the transmission matrix description.
y(k), k = 0, 1, 2, ... , can be written as

y{k) = DT(k)\(k)
where
“2/(0)-

y{ i)

y(k) y{ 2)

and from Example 1.9-1,


0
0 0
1
2 0 0
2
4 2 0 0 •
D T(k) = and \(k) 3
8 4 2 0 0

Note that by defining


v(0) 0 *
y(l) v(0) 0 ••
yi<k) v{2) y(l) v(0) 0

and
^(0)
d{ 1)
d(/c) = d{ 2)

y(k) can also be written as y(k) = \T(k)d(k).

Example 6.10-4. Write an expression for y(kT), k — 0, 1, 2,... , for the system of
Example 3.13-6.
Defining the indicated transmission matrices and vectors as above,

q(kT) = \(kT) - GHT(kT)q(kT)


or
q(kT) = [I + GHT{kT)]~x\(kT)
Then
y(kT) = Gr(A:r)[I +
Sec. 6.11 The Method of Least Squares 457

6.11 THE METHOD OF LEAST SQUARES

At this point in the presentation of the preceding chapter, the evaluation


of the mean square outputs of continuous systems was developed from the
viewpoint of the modified adjoint system. As indicated in the previous
section, the discrete modified adjoint system has limited value, because a
digital computer can be programmed to overcome the analogous difficulty
encountered with continuous systems. Thus this discussion of linear
discrete systems departs from paralleling the preceding chapter to consider
the general topic of least squares. This topic has considerable importance
in the areas of communications, control, numerical analysis, prediction,
and others, and, upon completion of this section, the reader is encouraged
to investigate the continuous analogy to this section.16,17
Suppose that the output y(kT) of a linear, stationary, discrete system is
given by a weighted sum of the present and a finite number of past values
of the input v[(k — m)T], m — 0, 1, 2, . . ., M — 1. This can be written
in terms of the weighting factors h(mT) as

y(kT) = v(kT)h(0) + v[(k - 1 )T]h(T) + • • •


jyj_j
+ v[(k - M - 1 )T]h[(M - 1)T] = £ v[(k - m)T]h(mT)
m=0

In order to determine the system-weighting factors, J > M sets of measure¬


ments of v[(k — m)T], m = 0, 1, . . ., M — 1, and the corresponding
y{kT) could be made. This would yield J equations of the form

*hA + v12h2 + • • • + vlMhM = yi


v2ih1 + v22h2 + • • • + v2MhM = y2 ^

+ VJ2^2 + * * * + Vjm^m = VJ

where the subscript on the y's and the first subscript on the v's denote a
particular set of measurements. The subscript on the /z’s and the second
subscript on the v's denote the argument (m + 1 )T. If only M sets of
measurements are made, i.e., J = M, unique solutions exist for the h^ but
noise and measurement errors generally cause Eq. 6.11-1 to yield incorrect
values. Thus, in such an experimental situation, many measurements are
usually taken, and more equations than the number of unknowns are found.
Hence J > M. Now, however, values for the ht cannot be found which
satisfy all the equations. For example, substitution of the values h-f,
h2°, . . . , hm° for the unknown hx, h2, . . . , hm in the left side of Eq. 6.11-1
might yield yf, y2°, . . . ,yj° which differ from yv y2, ... ,yj by et =
yf — y., i = 1,2,...,/. Faced with this dilemma, one might attempt
458 State Variables and Linear Discrete Systems

to determine the hi in such a way that each of Eqs. 6.11-1 is at least approxi¬
mately valid, and such that some measure of the total approximation error
is as small as possible. For example, the h*° can be determined such that
the ei have the smallest possible mean square deviation, i.e.,

/ = £*'
=1
i
= f (Vi9 - i—1
Vif

is a minimum. This is known as the method of least squares. It is, in


essence, an attempt to find the “best” values for the ht.
The preceding problem can be viewed geQmetrically. Let the vectors
v1? v2,. . . , yM denote the columns of

*>11 *>12 *>1 M

*>21 *>22 *>2 M


V =

VJ\ *>J2 VJM

Then the vector


Vi
yf

is given by y° = h1°\1 + h2°\2 + • • • + hM°\M. The problem becomes one


of determining hf, h2°,..., hM°, such that ||c||2 = || y° — y ||2 is a minimum,
where
*/i
^2 y2

e = and y =
• •

• •

Jj_
Thus the problem is to determine
hf
Sec. 6.11 The Method of Least Squares 459

Fig. 6.11-1

such that y° has the smallest possible deviation in norm from y. This is
represented for the case M = 2 in Fig. 6.11-1. In the general case, the set
of all linear combinations of y1} v2, . . . , vM forms a space R°, and the
orthogonal projection of y on R° is the vector in R° which is the closest to y.
This is a simple generalization of what is geometrically apparent for
M — 2 in Fig. 6.11 -1. Thus h° is to be chosen so that the linear combination
yo _ oYi hfv2 + • * * + hM°\M is the orthogonal projection of y on
R°.
Given a space R° and a vector y, which is in general not contained in
R°, y can always be represented in the form

y = y° — e

where the vector y° belongs to R° and e is orthogonal to R°. This is the


geometrical idea behind the Gram-Schmidt orthogonalization procedure
of Section 4.5. Taking vl5 v2, . . • , vM as a basis in R°, y° = h1°\1 +
h2°\2 + • • • + hM°\M, where h-f, h2°, . . . , hM° are to be determined. The
vector e = y° — y must be orthogonal to R°, for which it is necessary and
sufficient that

<e, Vi) = <y° - y, Vi) = 0, / = 1, 2,. . ., M

Substitution for y° yields

{h1°v1 + h2°\2 + ’ * * + hM°\M - y, v^) = 0, i = 1, 2,. . ., M


460 State Variables and Linear Discrete Systems

or
/q°(vi, Vi) + h2°(y2, yx) H-+ yx) = (y, vi>
v2) + h2°(y2, v2) + • • • + hM°(\M, v2) = <y,
(6.11-2)

^i°(vi, vm) + h2°(\2, \M) + • • • + , y^) — <y, vM)

The determinant of this set of equations is the Gram determinant G


defined by Eq. 4.5-11, and it is nonzero as long as v1} v2, . . . , \M are
linearly independent. This is, in essence, an observability condition.
Assuming this, the value of h°, as determinedly least squares, is given by

" <Vi, Vi) <V2, Vi) • • Gi-i, Vi) <y, vx) < v<+i, vx> • • • <Vjf, V!>“

<Vi, v2> <v2, V2> • • v2> <y, v2> V2> • • • (yM> v2>

_<Vi, yM) <V2, yM) • • • <v<_!, \M) (y, vM) <▼<+!» ^M> • • •

(6.11-3)
for i = 1, 2, . . . , M.
The corresponding minimum value of the mean square deviation,
l— ||e||2, can also be determined from geometrical considerations. For
M = 2, it is the square of the magnitude of the altitude of the parallel¬
epiped determined by vl9 v2, and y. In general, it is the square of the mag¬
nitude of the altitude of the hyperparallelepiped determined by vl5 v2, . . . ,
vM, y. If Vy is used to denote the volume of this hyperparallelepiped, then
Vy = V || e ||, or

where V is the volume determined by vls v2, . . . , \M. But Vy2 is Gy, the
Gramian of v1? v2, . . . , \M, y, and V is G, the Gramian of v1? v2, . . . , yM.
This is readily apparent for the case in which the vectors determining the
hyperparallelepiped are orthogonal. For the nonorthogonal case, the
Gram-Schmidt orthogonalization procedure can be used to arrive at
the same conclusion. Thus
2
! mm min —

is the minimum value of the mean square deviation.


It is useful to rewrite Eq. 6.11-2 as VrVh° = \Ty. Then Eq. 6.11-3 is

h° = (VTV)-1VTy (6.11-4)

where the observability condition is that VTV be nonsingular. It is


interesting now to assume that one additional measurement yJ+1 is made,
References 461

which is supposed to equal vJ+11hf + vJ+12hf + • • • + vJ+1MhM°,


and determine a new least squares estimate of h° based on these J + 1
measurements. Let — —

VJ+1,2

V =

_VJ+1, M
Then Eq. 6.11-4 becomes

/ V y
V1
16+1 = ([VT ! V] I [VT 1 V] (6.11-5)
Lv T J -Jjr+l-
where the subscript on hj+1 indicates that J + 1 measurements are used.
Equation 6.11-5 provides a means of updating the least squares estimate
of h°. However, it is not satisfactory from a computational viewpoint if
the updating is to be continued, because of the matrix inversion required
for each new estimate.
In an attempt to avoid repeated matrix inversions, let Pj = (V2 V)_1.
Then define
V
p^+i = [Vr | v] = [Y1 V + vv7Ti-l
]
,T

so that
j+i [P^1 + vv21]-1 (6.11-6)
Direct substitution of

Pj+1 = ?J- PJV{VT*JV + irV11?, (6.11-7)


into Eq. 6.11-6 indicates that Eq. 6.11-7 is a valid expression for P^+i-
Then, denoting the h° based on the first J measurements by h^0, Eq. 6.11-5
can be written, after simplification, as
hj+i = hj° + P^v(vTPjV + 1 T\yJ+1 - vrh/) (6.11-8)
Since (vrPjV + 1) is a scalar, repeated matrix inversions are not required
to update the estimation of h. The updated h° is the former h° plus a
weighting of the difference between the new value of y and the estimate of
y based on J measurements.

REFERENCES
1. J. E. Bertram, “The Concept of State in the Analysis of Discrete Time Control
Systems” 1962 Joint Automatic Control Conference, New York University, June
27-29, 1962, Paper No. 11-1.
2. D. A. Huffman, “The Synthesis of Sequential Switching Circuits,” J. Franklin
Inst., March-April 1954, pp. 161-190, 275-303.
462 State Variables and Linear Discrete Systems

3. E. F. Moore, Gedanken-Experiments on Sequential Machines, Automata Studies,


Princeton University Press, Princeton, N.J., 1956, pp. 129-153.
4. J. Millman and H. Taub, Pulse and Digital Circuits, McGraw-Hill Book Co.,
New York, 1956, pp. 346-353.
5. M. Phister, Logical Design of Digital Computers, John Wiley and Sons, New York,
1958, Chapters 5 and 6.
6. B. Elpas, “The Theory of Autonomous Linear Sequential Networks,” Sequential
Transducer Issue, IRE Trans. Circuit Theory, Vol. CT-6, No. 1, March 1959.
7. B. Friedland, “Linear Modular Sequential Circuits,” Sequential Transducer Issue,
IRE Trans. Circuit Theory, Vol. CT-6, No. 1, March 1959.
8. J. Hartmanis, “Linear Multivalued Sequential Coding Networks,” Sequential
Transducer Issue, IRE Trans. Circuit Theory, Vol. CT-6, No. 1, March 1959.
9. D. A. Huffman, “A Linear Circuit Viewpoint on Error-Correcting Codes,” IRE
Trans. Inf or. Theory, Vol. IT-2, September 1956, pp. 20-28.
10. R. H. Barker, “The Pulse Transfer Function and Its Application to Sampling Servo
Systems,” Proc. IEE, Vol. 99, IV (1952), pp. 302-317.
11. J. R. Ragazzini and G. F. Franklin, Sampled Data Control Systems, McGraw-Hill
Book Co., New York, 1958, pp. 93-94, 136, 199, 217-218.
12. R. E. Kalman, Y. C. Ho. and K. S. Narendra, “Controllability of Linear Dynamical
Systems”, in Contributions to Differential Equations, Vol. I, No. 2, John Wiley and
Sons, New York, 1963.
13. F. B. Hildebrand, Methods of Applied Mathematics, Prentice-Hall, Inc.,
Englewood Cliffs, N.J., 1952, Chapter 3.
14. T. Fort, Finite Differences, Oxford University Press, 1948, pp. 244-245.
15. M. Salvador and M. Baron, Numerical Methods in Engineering, Prentice-Hall, Inc.,
Englewood Cliffs, N.J., 1952.
16. R. E. Kalman and R. S. Bucy, “New Results in Linear Filtering and Prediction
Theory,”/. Bas. Eng. March 1961.
17. Y. C. Ho, “On the Stochastic Approximation Method and Optimal Filtering
Theory,” J. Math. Anal. Appl., Vol. 6, No. 1, February 1963.

Problems

6.1 Draw the simulation diagrams for the following systems.

(a) y{k + 1) — 2y(k) cosh a + y(k — 1) = v(k)


(b) V2y(k) + k Vy(k) + y(k) = v(k)
(c) S2y(k) + k Ay(k) + y{k) = v(k)
(d) Vi(k + 2) + 2yfk + 1) + y2(k + 1) + y-fjc) — vfk)
y2{k + 2) + 3 y2{k + 1) + y2(k) + 3 yfk) = v2(k)

6.2 Find the transfer function matrix for the system shown in Fig. P6.2.
6.3 A synchronous sequential machine accepts a serial binary coded decimal
input. After every fourth pulse, the output of the machine sends out a
signal which tells whether the last four inputs formed a correct binary
coded decimal number (0 through 9). What is the minimum number of
states of this machine?
6.4 If timing pulses tx, t2, t3, tA, tx, t2, t3, tx, . . . are available, what is the
answer to Problem 6.3?
Problems 463

Fig. P6.2

6.5 An electronic lock is designed such that, after the input sequence 101011,
the lock is opened. How many states are required in order to build this
machine?
6.6 What is a suitable set of state variables for the sampled-data system of
Fig. P6.6?

Fig. P6.6

6.7 An ^-dimensional A matrix can be transformed into a diagonal matrix if


the n characteristic roots of A are distinct. Similarly, when dealing with
a mod p sequential network, the coefficient matrix can be transformed
into a diagonal form if the n roots are distinct modulo-// (called incon-
gruent modulo-//). Using Fermat’s theorem,
Ap_1 = 1 mod p 0 < A < // — 1
464 State Variables and Linear Discrete Systems

Show that, if the matrix Ar can be diagonalized, then the periods of such
a system divide r(p — 1).
6.8 The state equation of a linear modular sequential system is given by
~3 r
x(k + 1) = x(k) mod 5
0 2
(a) Show that every state sequence has a period which divides 4.
(b) Show that there must be six sequences of period 4, covering 24 different
states. Find these sequences.
6.9 Find the minimum cycle length for the linear binary networks of Fig. P6.9.

(a)

(b)
Fig. P6.9

6.10 Set up the matrix equations for the system of Fig. P6.10 in
(a) standard A, B, C, D form
(b) normal form

Fig. P6.10
Problems 465

6.11 Using partial fractions expansion, find the matrix equations for the
following systems.
z2 + 2
(a) H(z) ^3 + ?z2 + Hz + 8 (b) H(z) =
z2 + 2z + 2

z3(z + 3)
(c) H(z) = z4 + 523 + 9z2 + 7z + 2

6.12 For the discrete system shown in Fig. P6.12, (a) which modes are con¬
trollable? (b) which modes are observable?

V2(k)

52

Fig. P6.12

6.13 Given the difference equation


y(k + 1) — 3 y(k) + 2 y{k — 1) = 1 + ak
466 State Variables and Linear Discrete Systems

(a) Solve for y(k) by the classical approach.


(b) Draw a simulation diagram for the equation.
(c) Find y(k) by the mode expansion method.
id) Find y(k) by use of the state transition matrix.
(e) Find y(k) by use of the standard Z transform approach.
6.14 Show that the relationships listed in Table 6.8-1 are correct.
6.15 Show that the relationships described in Eq. 6.8-12 through Eq. 6.8-16 are
correct.
6.16 The output of an unforced discrete system is the series of numbers 0, 1, 1,
2, 3, 5, 8, ... , such that each number is the sum of the two preceding
numbers. (These are called the Fibonacci numbers.)
(a) Find the system which generates these numbers.
(b) Find the state transition matrix <J>(&).
(c) Find an expression for the A:th number y(k).

6.17 The inputs to the system x = Ax + Bv consist of a set of piecewise con¬


stant signals, such that over the interval nT </<(« + \)T the inputs
are constant. These signals are obtained by sampling the inputs at time
nT and holding these values until time (n + 1 )T, when a new set of samples
are taken. Show that the continuous system with sample and hold can be
replaced by the discrete system
x[(« + 1)T] = Axx(nT) + B^wT)
where
rr
A± = eAT and Bx = B dr
Jo
6.18 Using the discrete system found in Problem 6.17, solve for the output of
the single input-single output system shown in Fig. P6.18. Check your
answer using the standard Z transform.

V(°) / ,
T= 1
1 — e~sT
s
2
s(s + 2)
->

Fig. P6.18

6.19 Consider a network containing any number of switches operating in syn¬


chronism. During the first half-cycle (0+ < t < T{~\ some switches are
open and some are closed. In the second half-cycle (7\+ < t < T ), the
positions of the switches are reversed. At t = T, the original condition is
restored and the cycle repeated. The system is characterized by x =
Axx + B^ (nT < t < nT + T-J) and x = A2x + B2v (nT + T1 < t <
(n + 1)T.
(a) Find the solutions x(nT 4- T{~) and x[(/2 + 1)T-].
(b) If it is assumed that x(nT + 7\_) = x(nT + 7\+), x(nT~) = x(«T+)
and v = constant, find the difference equation for the system
x[(n + l)Tj = A x(nT) + Bv(«r)
Problems 467

6.20 For the system described by the difference equation V2?/ + 2Vy -f 2y — 1,
(a) Find the matrix formulation in terms of the standard A, B, C, D
matrices. Solve for y{k).
(b) Repeat (a) using normal form representation.
6.21 The switch in the network of Fig. P6.21 is closed at t = 0, 2, 4, . . . and
opened at t = 1, 3, 5, . . . .

7=1 ampere %

(a) Find the difference equation for the inductor current at the end of the
k\h open-close cycle.
(b) Solve the difference equation using the state transition matrix, and
draw a sketch of the inductor current as a function of k/2.
(c) Find the unit response matrix and the transmission matrix of this
circuit.
6.22 For the system shown in Fig. P6.22, (a) find the state equation formulation;
(b) solve for <!>(/:); (c) find the unit response H(&).

6.23 A system is defined by the following difference equations.

xi(k) ~ x2(h — 1) + y(Jk — 1) = 0


xfjk) — x2(k + 1) =0
x1 (k) — 2xfk + 1) + v{k) = 0

Find the unit response H(&).


6.24 For the system shown in Fig. P6.24a:
(a) Find the A matrix.
(b) Find 4>(A:). Interpret your answer in terms of signals in the block
diagram.
468 State Variables and Linear Discrete Systems

(c) Find the transfer function Y(z)/V(z) of the system shown in Fig.
P6.24 b.
(d) Find the unit response of this system by dividing out the transfer
function. Compare the results of this division with part (b).

6.25 (a) Find the matrix for the system of Problem 6.10 by time domain
techniques. (b) Repeat (a) using Z transform methods.
6.26 Derive the discrete analog of Eq. 5.10-14 and, from the result, the modified
adjoint discrete system.
6.27 (a) Determine a discrete approximation for the differential equation

1
y + -y = o

(b) Using T = 0.05, solve for y{kT) from k = 0 to k = 5 for k0 =0, 1,


2, 3, 4, 5.
(c) Check the results of part (b) by calculating h(0.25, 0) of the original
system.
(d) Find the discrete modified adjoint of part (a).
(e) Check the results of part (b) by calculating hr*(k\ 0).
7
Introduction to Stability Theory
and Lyapunov’s Second Method

7.1 INTRODUCTION

The precise definition of stability in the case of a nonlinear system is not


simple. The intuitive concept of stability, which for the most part is
adequate for time-invariant linear systems, fails when one attempts to
extend it to nonlinear systems. A time-invariant linear system is either
stable or unstable, depending upon the location of the zeros of the charac¬
teristic equation in the s-plane. System stability is independent of
the initial conditions or system inputs. This is not true for nonlinear
systems.
Whether the response of a nonlinear system is bounded or unbounded
may depend upon the initial conditions or the forcing function. Further¬
more, a nonlinear system may exhibit oscillations of constant peak value.
Should one refer to such a system as being stable or unstable? Intuitively
one would say unstable if it is a control system and the oscillations are
sufficiently large that the system performance is not satisfactory. A state¬
ment such as this is very vague, however, and leads to confusion. In
treating nonlinear systems, it is necessary to make careful distinctions
between the various intuitive concepts of stability.
The purpose of this chapter is to introduce the reader to the more
rigorous definitions of stability and to Lyapunov’s methods, and the role
that the state variable approach plays. The word “introduce” is carefully
chosen, since books have already been exclusively devoted to stability
theory and Lyapunov’s methods, as the references at the end of the chapter
indicate.
469
470 Introduction to Stability Theory and Lyapunov's Second Method

7.2 PHASE PLANE CONCEPTS

The response of a physical system may be illustrated by a plot of the


system response versus time. It may also be illustrated by using time as a
parameter, and plotting the interrelationships of the behavior of the state
variables of the system in state space. The latter is a geometrical represen¬
tation of the system behavior. It is difficult to visualize for systems of
third or higher order, but it is particularly useful for second order systems.
For example, consider the linear oscillator illustrated in Fig. 7.2-1 and
described by the differential equation

+ o?y = 0 (7.2-1)
dr

where to is the frequency of oscillation and is positive. Defining the state


variables xx(t) = y(t), x2(t) = xx(t) — y{t) permits the system description
of Eq. 7.2-1 to be written as

x2 -f co2^ = 0 (7.2-2)

Multiplication of Eq. 7.2-2 by x2 gives


d_ rx 2 2 *1
0 = x2x2 + co2x1x 2 = X2X2 + 0)2X1X1 = + CD* — (7.2-3)
dt 2 2 .

Integration of Eq. 7.2-3 yields

x2 -f- co2#-,2 = c2 (7.2-4)

where c = constant. Thus, in any solution to Eq. 7.2-1, y(t) = xx{t) and
y(t) — x2(t) are related by Eq. 7.2-4, the equation for an ellipse. Several
solutions for various values of c, corresponding to various initial values of
xx and x2, are shown in Fig. 7.2-2. A solution in the x2 versus xx plane is
called a trajectory, and the aqa^-plane is called the phase plane.
Time is a parameter along any one of the trajectories, and for the par¬
ticular state variables chosen in this example, i.e., x2{t) — xx(t), increasing
time corresponds to clockwise motion in the aqavplane, if the axes are as
chosen in Fig. 7.2-2. As is illustrated later, however, this need not always

Fig. 7.2-1
Sec. 7.2 Phase Plane Concepts 471

y = *2

*2 = C 3

Fig. 7.2-2

be the case. The clockwise motion of the trajectories in this example is


evident from the fact that positive x2i i.e., xx > 0, requires x±(t) to be in¬
creasing, and negative x2, i.e., xx < 0, requires xx{t) to be decreasing.
The time necessary for the solution to move from point a to point b on a
trajectory can be computed from

(7.2-5)

For example, let point a be defined by (0, c3) and point b by (c3/co, 0) on the
c — c3 trajectory. Thus a and b are one-fourth of a period apart. Elimina¬
tion of x2 from Eq. 7.2-5 by substitution of Eq. 7.2-4 for c = c3 yields

(7.2-6)

where T is the period of oscillation. Equation 7.2-6 is the well-known


expression for the period of a linear oscillator, T =
The trajectory corresponding to c = 0 is the trivial solution correspond¬
ing to xx = x2 = 0. This solution is called a singular solution, and the
point (0, 0) is called a singular point or equilibrium point. It is a point of
equilibrium or rest for the linear oscillator. In the case under considera¬
tion it is called a center, because all the trajectories form closed paths
about it.
The phase plane trajectories of Fig. 7.2-2, corresponding to the behavior
of the system described by Eq. 7.2-1, were obtained without explicitly
solving Eq. 7.2-1 for y{t). From Fig. 7.2-2 one can see that any solutions
to Eq. 7.2-1 are periodic and constant in amplitude, since the trajectories
are closed paths. If the trajectories were paths continuously approaching
the origin, one would intuitively say that the system is “stable.” If the
472 Introduction to Stability Theory and Lyapunov’’s Second Method

trajectories were paths continuously diverging from the origin, one would
intuitively say that the system is “unstable.” This representation of the
behavior of a second order system and indication of “system stability”
without actually solving the system differential equation is an important
use of the phase plane approach in connection with nonlinear systems. It
is necessary, however, to provide more precise definitions of stability. This
is done in Section 7.7.

7.3 SINGULAR POINTS OF LINEAR, SECOND ORDER SYSTEMS


V

The center singular point observed for the linear oscillator is only one of
four possible singular points. The remaining types of singular points can
be illustrated by means of the system of Fig. 7.3-1. Notice that, for the case
of zero damping (| = 0), this system is the same as the linear oscillator.
Thus, for £ = 0, the origin of the aqa^-plane is a center, and the trajectories
are ellipses. For ^ 0, however, the trajectories are altered significantly.
The differential equation for the system of Fig. 7.3-1 is

77 + 2foj LL+co2y = 0 (7.3-1)


dt2 dt
Using the indicated but not unique definition of state variables given in
Fig. 7.3-1, this can be written as
dC-% — Xa

(7.3-2)
x2 = — c OiX1 — 2^ cox 2

In matrix notation, this is x = Ax, where


0 1
A =
— co2 —2 £co

The characteristic values can be determined as the roots of


s2 + 2 tjcos + co2 = 0 (7.3-3)

Fig. 7.3-1
Sec. 7.3 Singular Points of Linear, Second Order Systems 473

or from \sl — A| = 0. Either approach yields

(7.3-4)
s2 = —£<jj + coV £2 — 1

Thus at this point one could plot the response y{t) for various values of £
and co, since the general solution of Eq. 7.3-1 is

y(t) = cxeSlt + c2eS2t

where cx and c2 are arbitrary constants. The purpose here, however, is to


illustrate phase plane principles which are useful for nonlinear problems.
Hence the nature of the possible singular points is now considered.
Singular points are points of dynamic equilibrium. In essence, they
correspond to positions of rest for the system. They may be stable. For
example, a pendulum hanging vertically with no kinetic energy is in a
stable equilibrium position. Equilibrium points can also be unstable.
A pendulum which has its mass at rest directly above its point of support is
in equilibrium. In this case, however, the slightest disturbance will set the
pendulum into oscillation. Nevertheless, without a disturbance in either
case, the pendulum is at rest. Thus the significant feature of a singular or
equilibrium point is that all the derivatives characterizing the system be¬
havior are zero. In other words, the derivatives of the state variables are
zero. For the linear, second order system of Fig. 7.3-1, the singular point
is given by setting xx = x2 = 0. Equation 7.3-2 then reveals that the origin
of the x1x2-plane is the only singular point, assuming that co is nonzero.
The slope of the trajectory at any point in the aqavplane is given by

[cox-l + 2£x2]
(7.3-5)

Thus, at a singular point, the trajectory slope is indeterminate. This is true


for all singular points and is not merely a result of the particular choice of
state variables in this example. Along the aq-axis, where x2 is zero, the slope
is infinite. This, however, is due to the particular choice of state variables
and requires that all trajectories in this example cross the a^-axis vertically.
At all ordinary points (points which are not singular points), Eq. 7.3-5
indicates that the trajectory slope is unique. The Cauchy-Lipschitz theorem
then guarantees that only one solution curve passes through a given point
in the phase plane.| Thus trajectories cannot cross one another.
t A sufficient condition for this is that f(x) be a Lipschitz function in a region R con¬
taining the point, i.e., for every pair of points x and y in R, ||f(x) — f(y)|| < K ||x — y||,
where A is a positive constant.1 Note that this is not satisfied where f(x) is a multivalued
function of x. In this case, the afs are not a set of state variables for the system.
474 Introduction to Stability Theory and Lyapunov's Second Method

In final preparation for investigating the nature of the singular points


of the system of Fig. 7.3-1, consider the trajectories which are straight lines
passing through the origin. Since they are straight lines passing through
the origin, they must be described by x2 = mxlm Hence x2 = so that
the slope of any straight-line trajectories is

m — — — — (7.3-6)
x1 xx

Equating i2/^i as given by Eqs. 7.3-5 and 7.3-6 yields

[cox-l -f 2£x2] ' [oo + 2£m\


m = —co-= —co-
x2 m
Thus the equation for the slope of the trajectories which are straight lines

1S m2 + 2 Scorn + co2 = 0 (7.3-7)

Since Eqs. 7.3-7 and 7.3-3 have the same roots, the slopes of any straight-
line trajectories are equal to the characteristic values given by Eq. 7.3-4.
This result is, of course, dependent upon the choice of the state variables,
although the characteristic values themselves are independent of the choice
of state variables. These straight-line trajectories correspond to the modes
of the response, and could have been obtained from Eq. 5.6-10.
With these preliminary thoughts completed, the nature of the singular
points for linear, second order systems is now considered.

Center: £ = 0, co2 > 0

This case was investigated in the preceding section. It was found that the
origin is a center, and that the trajectories are ellipses about the origin.
No straight-line trajectories exist, since there are no real roots to Eq. 7.3-7
for £ = 0, i.e., there are no modes or characteristic vectors corresponding
to real characteristic values.

Focus: |£| < 1, co2 > 0

For this case, it is not possible to carry out directly the procedure of the
previous section to determine the equation for the trajectories. However,
it is again possible to say that there are no straight line trajectories, since
Eq. 7.3-7 has no real roots for |£| < 1.
The substitution of x2 — zxx and dx2 = z dx1 + xx dz permits Eq. 7.3-5
to be written as
dxx z dz
(7.3-8)
z“ -f- 2h,coz -}- oo
Sec. 7.3 Singular Points of Linear, Second Order Systems 475

Equation 7.3-8 can be integrated to yield an expression for the trajectories


in the xxz-plane. The corresponding result in the aqavplane is

21 x2 + £wx1
(x2 -j- £ojx1)2 + co2( 1 — £2)xf = c2 exp tan -i
U/i - ? (;
Wi -

where c is a constant. This rather formidable expression becomes

2f
_
- tan 1 Z2
2 . 2 2
z2 + % = c exp (7.3-9)
Wl - I2 2 1-1
upon introducing the coordinates

'"V i -12 o' xi

£oj 1 Xc

Using a polar coordinate system in the z^-plane, let

z1 — r cos (/>
z2 = r sin </>

In polar coordinates, Eq. 7.3-9 becomes

£<f>
r = c exp —
Wl - f2J
Thus the trajectories in the z^-plane are logarithmic spirals. If 1 > £ > 0,
the radius decreases as <f> becomes more negative (which is the direction of
increasing t), corresponding to Fig. 1.3-2a. This is the case of a stable focal
point. The unstable focal point corresponding to 0 > £ > — 1 is illus¬
trated in Fig. 1.3-2b. As |£| is increased from zero toward unity, the rate
at which the radius changes with <f is increased. This corresponds to
moving the system poles, as given by Eq. 7.3-4, away from the joo axis of

Fig. 7.3-2
476 Introduction to Stability Theory and Lyapunov's Second Method

the 5-plane and agrees with the result one would intuitively expect. The
trajectories about the focal point in the z^-plane are distorted versions
of Fig. 7.3-2 and are illustrated in Fig. 7.3-8. Example 5.6-2 is a specific
case of a stable focal point.

Node: |£| > 1, w2 > 0

In the situation in which the origin is a nodal point, there are two trajec¬
tories which are straight lines. The slopes of these trajectories are given
by the roots of Eq. 7.3-7 and are the characteristic values of Eq. 7.3-4.
These two trajectories are illustrated in Fig. 7.3-3 for £ > 1. The remain¬
ing trajectories about a nodal point in the aqa^-plane are more difficult to
evaluate.
In order to determine the remaining trajectories, it is useful to introduce
the transformation z = M_1x or, more specifically,

«1
(7.3-10)
_ x 2_

The zx- and z2-axes are the straight-line trajectories shown in Fig. 7.3-3.
Thus the z1z2-plane is a distortion of the xxx2-plane. Using the transforma¬
tion of Eq. 7.3-10 and Eq. 7.3-4, Eq. 7.3-2 becomes, for the case of a nodal
point, the normal form equations

«1 = Sfa
(7.3-11)
z2 = SXZ2

Elimination of time as a variable yields

dz2 _ s1 dz1
z2 s2 zx

x2

Fig. 7.3-3
Sec. 7.3 Singular Points of Linear, Second Order Systems 477

Therefore the trajectories in the zYz2-plane are described by


z2 = C(21)ai/a* (7.3-12)
For £ > 1,^! < s2 < 0, so that sjsz > 1. This corresponds to the case of a
stable node. The trajectories in the z^-plane appear as in Fig. 7.3-4#. In
the case of an unstable node where £ < —lis2>s1>0, so that 1 >
Si/s2 > 0. The corresponding ^^-trajectories are shown in Fig. 7.3-4&.
These curves illustrate the fact that the solution curves for a nodal point
all (except the curve along the z2-axis) have the same limiting direction at
the nodal point. In returning to the aqavplane, the trajectories which are
not straight lines are distorted as shown in Fig. 7.3-8. Example 5.6-1
is a specific case of stable node.

Saddle Point: co2 < 0

If the sign of the gain in the outside feedback loop of the system in
Fig. 7.3-1 is changed, the singular point becomes what is known as a
saddle point. For this case, the characteristic equation is
s2 + 2£ | co | s — | co21 = 0
The corresponding characteristic values are

= — £ |co| — |co| Vl2 + 1


$2 = —£ M + \w\ V<P + l
They correspond to one negative real root and one positive real root. Thus
there are two straight-line trajectories as shown in Fig. 7.3-5.
478 Introduction to Stability Theory and Lyapunov's Second Method

*2

Fig. 7.3-5

In order to determine the remaining trajectories, it is again useful to


introduce the transformation of Eq. 7.3-10. The resulting normal form
equations are the same as Eq. 7.3-11, so that Eq. 7.3-12 also characterizes
the z^-plane trajectories for a saddle point. In this case, however, if
£ M > 0, then Sxjsi < — 1, and, if £ |co| < 0, then — 1 < sjs2 < 0. The
corresponding z^-plane trajectories are shown in Fig. 7.3-6# and 7.3-66,
respectively. The aqa;2-plane trajectories are shown in Fig. 7.3-8.

Summary
The conditions on Eq. 7.3-1 for the various types of singular points are
summarized in Fig. 7.3-7. Not considered are the case for which ||| = 1
and the case for which oo2 = 0. The trajectories for these cases are
contained in Fig. 7.3-8. The corresponding expressions for the trajectories
are given in the problems.

22 22

Fig. 7.3-6
Sec. 7.4 Variational Equations 479

The time response of the linear, second order system is easily determined,
so that it is possible to compute the phase plane trajectories from the time
response. For example, Eqs. 5.6-8 and 5.6-14 could have been utilized.
This was not done in this section, because it is not generally possible for
nonlinear systems, and the interest in studying the phase plane trajectories
for linear systems stems from their relationship to nonlinear cases. In non¬
linear cases, past studies often resorted to graphical methods of deter¬
mining the trajectories, such as the method of isoclines, Lienard’s method,
and the phase plane delta method.2-6 In the light of present-day computer
simulation capabilities, however, such graphical techniques lose much of
their appeal. Except for a brief discussion of the method of isoclines in the
next section, the graphical techniques are not pursued further in this book.
This is because of the introductory nature of this chapter and because of
the belief that the proper use of the phase plane method, in a case where the
trajectories cannot be determined analytically, is in conjunction with a
computer simulation. The trajectories can be measured from the simula¬
tion, and the phase plane approach enables one intelligently to determine
proper parameter changes to improve system behavior. For the same rea¬
sons, some of the more sophisticated methods for determining the values
of time along a trajectory are not considered.7

7.4 VARIATIONAL EQUATIONS

As stated in the previous section, there is interest in the phase plane


trajectories of the linear, second order system because of their relationship
to the trajectories of a nonlinear, second order system. In small neighbor¬
hoods about its singular points, a nonlinear system behaves similarly to a
linear system. It is possible to utilize this property to determine the stability
480 Introduction to Stability Theory and Lyapunov s Second Method

y + 2£coy + co2y = 0

Singular Point Eigenvalues Phase Plane

i vX2
> 'Sl
Center 'X \ - T
£ = 0. co2 >0
1
> ' S2
V

_ vX2

X S2
Stable focus
»—7—j-Xi
0<£ <1, co2>0
XS2

XS2
Unstable focus
(
4
\ > Ti
-1 <£<0, co2>0
X si
\ JJ

\ t .*2 \ \

Stable node J-1-xl


—X—X-
1 < £, o;2 > 0 Sl S2

\ = S'2Xi

\X2 = SiXi

/ X9 = S9Xi
\x2 /
/ / ‘
r -/= sixi

Unstable node
-X—X—
£ < -1, co2> 0 S1 S2

Fig. 7.3-8
Sec. 7.4 Variational Equations 481

y + 2£ooy + co2y = 0

Singular Point Eigenvalues Phase Plane

X9 = S9Xl

Saddle point

J
v v
co2<0 A A 1 ^ X
(figure is for £ = 0) si S2

n *2 = SiXi

U2 \ \

Si, S2
S= l V 1 I > X1
stable

X2 = —OOXl

X2 = COXl
/ / i

si, S2
f=-l
unstable a ) 1 > *1

N. N. )
[X2 n\^X2 = — 2£coxi + c

co2 = 0 /
X ?\ \ N. > *1
> 0 -2£w

X'
^2 / ^

GO2 = 0
?/\ V
X A A
£w <0 2£w
y^X2 = 2£wxi + c

Fig. 7.3-8 (Continued)


482 Introduction to Stability Theory and Lyapunov s Second Method

of the singular points of a nonlinear system. Stability in larger regions is


a subject of later sections.
Consider an 77th order dynamical system represented by the state
equations
x = f(x) (7.4-1)

This is the autonomous case, that is, one in which the independent variable
t does not appear explicitly. It corresponds to a system which is both
unforced and time-invariant. Equation 7.4-1 is an abbreviated representa¬
tion of
X1 =f1(x1, x2, . . . , xn)

x2 = fz{Xi, x2, . . . , xn)

Xn = fn(Xl, X2, • • • , Xn)

Thus x and f(x) are /7-dimensional vectors. The singular points of Eq.
7.4-1 are given by x = 0. Hence they are the solutions of

f(x) = 0 (7.4-2)

In the case of a linear system, f(x) is a linear function of x. Thus Eq. 7.4-2
has only the solution x = 0, provided that the determinant of the coeffi¬
cient matrix of the linear system is nonzero. This is true if there are no
free integrators. Then only one singular point exists for the corresponding
linear system, and it is at the origin of the state space. However, in the
nonlinear case, more than one singular point can exist, as is evidenced by
the possibility of more than one solution to Eq. 7.4-2. Denoting the zth
solution by xie, Eq. 7.4-2 becomes f(xie) = 0, z = 1,2,.... If each of the
components of f(x) can be expanded in a Taylor series about the zth
singular point, the result of considering only the linear terms is

— (X - X<e) = X = J(xie)(x - xj (7.4-3)


at
where J(xie) is the Jacobian matrix evaluated at x = xie, that is,

a/, 5/1 a/i


dxx dx2 dxn

Sf, a/2 a/2


J(x«) = dx-L dx2 d0Cn (7.4-4)

dfn 8Jn dfn

dxx dx2 d*n. X=X,


e
Sec. 7.4 Variational Equations 483

In terms of u = x — xie, a variable measured from the singular point,


Eq. 7.4-3 are the variational equations

U = (7.4-5)

The variational equations are linear homogeneous differential equations


which, for the autonomous case under consideration, have constant
coefficients. The stability of Eq. 7.4-5 can thus be determined from the
roots of
1*1 - J(xJI = 0 (7.4-6)

The question then arises as to how the stability of the system of Eq. 7.4-5
relates to the stability of Eq. 7.4-1.
It is possible to state that, if any of the roots of Eq. 7.4-6 has positive
real parts, then the zth singular point is unstable. If all the roots of Eq.
7.4-6 have negative real parts, then the zth singular point is stable.8 (In
this case the singular point is actually what is known as asymptotically
stable. However, asymptotic stability is not defined until Section 7.7.)
These statements about the stable and unstable behavior of the singular
points of a nonlinear system are given here without proof. They are
most easily proved by using Lyapunov’s method, which is introduced in
Section 7.8.
For the case in which the variational equations have roots with zero real
parts, it is impossible to distinguish between stability and instability of the
singular point based on the linear approximation.9 One can intuitively
view this situation as a borderline case, in which the effect of the ignored
nonlinear terms can result in either stable or unstable behavior.
Although the preceding statements concerning the stability of the sin¬
gular points of a nonlinear system, as indicated by the variational equations,
are valid regardless of the order of the system, a second order example is
chosen so that a phase plane may be used to illustrate the results.
Example 7.4-1. Figure 7.4-1 is a simplified pictorial representation of a relaxation
circuit.10 Its differential equation is

dvr d2i di
-Keq = - KRC-± = - KRCL — = L — + RLi + v (7.4-7)
9 dt dt2 ,dt

The negative resistance characteristic is illustrated in Fig. 7.4-2 and represented by

v = —rj + r3/3 (7.4-8)

The objective is to determine the behavior of the circuit.


The substitution of Eq. 7.4-8 into Eq. 7.4-7 yields the system equation
484 Introduction to Stability Theory and Lyapunov's Second Method

Fig. 7.4-1

where a = 1/2KRC, b2 = 2(rx - Rl)/KRCL, and c2 = rJKRCL. Defining the state


variables by xx = i and x2 = xx permits Eq. 7.4-9 to be written as
xx — x2

b2 (7.4-10)
x2 — — xx — c2xx3 — 2 ax2

Equating the right side of Eq. 7.4-10 to zero determines that there are three singular
points located at
(1) xx = 0 (2) xx = b/cV2 (3) xx = -b/cVl
x, = 0 X, 0 Zo = 0

The Jacobian matrix of Eq. 7.4-4 evaluated at the first singular point is
"0 1 '

Ji =
p2l 2 -2 a

For this singular point, Eq. 7.4-6 becomes


s —1 _ c2
s2 + las — — = 0
—b2/2 s + 2a
This corresponds to the case of a saddle point in Fig. 7.3-8, assuming rx > RL.
At the second singular point, the Jacobian
matrix is
T 0 0 "
J2 =
_—b2 -2a _
At this singular point, Eq. 7.4-6 becomes
s —1
= s2 + las + b2 = 0
b2 s + la

Thus the nature of this singular point depends


upon the relative values of a and b, since £ = a/b.
Assuming 0 < £ < 1, the singular point is a
stable focus. The same result is easily obtained
for the third singular point. Thus the phase
plane behavior of the circuit of Fig. 7.4-1 in the
Sec. 7.4 Variational Equations 485

X2 = i

neighborhood of its singular points is as shown in Fig. 7.4-3. Notice that this does not
indicate the system stability or response for an arbitrary initial condition, but merely
indicates the behavior in a small region about each of the singular points. In fact, the
trajectories as illustrated are not precisely correct, since the trajectories about a given
singular point will be distorted from the curves determined for the linear case by
neighboring singular points.
The determination of the trajectories in larger regions of the phase plane is not an
easy task. One approach is to use the method of isoclines, which consists in determining
curves (for linear systems, they are straight lines) which connect points in the phase
plane where the trajectory slopes are equal. One can then sketch through these points
with the appropriate slope and estimate the trajectories.
This procedure is illustrated for this example by using Eq. 7.4-10 to write

dx 2 x2 (b2/2)x1 — c2xx3 — 2 ax2


(7.4-11)
dx ^ X\ x2

Equation 7.4-11 gives the slope at any point in the phase plane. Assuming that the slope
is a constant k, Eq. 7.4.11 becomes

(b2/2) xx — c2xd — lax2


k =
Xo

This can be rewritten in the form


a?j(62/2 — c2x2)
x2 = (7.4-12)
k + la

If c were zero (corresponding to a linear system), Eq. 7.4-12 would be the equation of a
straight line, and the isoclines could easily be evaluated. For this case, however, it is
necessary to pick a value for k and then compute the value of x2 for various values of xx.
The results are the isoclines shown in Fig. 7.4-4 for the arbitrarily selected case of a =
c = 1, b = 2. Sketching through each of the isoclines with the appropriate slope permits
the trajectories of Fig. 7.4-5 to be determined.
486 Introduction to Stability Theory and Lyapunov's Second Method

Fig. 7.4-4
Sec. 7.4 Variational Equations 487

II
K
Separatrix
488 Introduction to Stability Theory and Lyapunov's Second Method

The trajectories of Fig. 7.4-5 are useful in that they indicate the system behavior for
any of the initial conditions considered therein. They show two stable conditions.
Thus the system could be used as a flip-flop. However, for the parameters chosen, the
design is poor in the sense that an extremely large triggering signal is required to change
the system from one stable condition to the other. This is revealed by the fact that the
triggering signal would have to drive the system states across the curve labeled separa-
trix. A separatrix is a trajectory which passes through singular points and divides the
phase plane into regions of different character to the trajectories. The separatrix of Fig.
7.4-5 separates the trajectories about the left and right focal points. Since the negative
characteristic value of the saddle point is approximately the slope of the separatrix,
the phase plane analysis reveals that the size of the triggering signal can be altered by
changing a. Adjustment of b would also change th£ triggering signal requirements,
but it has the simultaneous effect of changing the separation of the singular points.

In concluding this section, it is worth emphasizing the proper utilization


of the phase plane approach. The trajectories of Fig. 7.4-5 can be obtained
much more easily from a simulation of Eq. 7.4-9 than by the method of
isoclines utilized here. The proper place for the phase plane approach is
in conjunction with the simulation, to indicate the appropriate parameter
changes necessary to improve system behavior with respect to the magni¬
tude of the stable states, the size of the triggering signal required, the
transient response, etc. All this information is readily available in the
equations and figures determined in the example.

7.5 LIMIT CYCLES

Nonlinear systems may have particular trajectories called limit cycles,


which are isolated closed curves in the phase plane corresponding to
periodic motions. The linear oscillator considered in Section 7.2 has closed
trajectories, but they are not isolated and hence are not limit cycles. The
amplitude of the periodic motion in the linear oscillator depends upon the
initial conditions, and an infinite number of closed curves exist. A non¬
linear system has a fixed number (possibly infinite) of limit cycles, each of
which separates the phase plane into two regions where the character of
the motion is different.
Limit cycles may be stable, unstable, or semistable. Figure 7.5-lu
illustrates a stable limit cycle. All near trajectories approach the limit
cycle as time approaches infinity. A stable limit cycle corresponds to a
stable periodic motion in a physical system. Figure 7.5-17? illustrates an
unstable limit cycle. The near trajectories move away from the closed
curve. In other words, all near trajectories approach the limit cycle as time
approaches minus infinity. In the case of a semistable limit cycle, for in¬
creasing positive time all the near trajectories on one side of the limit cycle
Sec. 7.5 Limit Cycles 489

approach it, while those on the other side of the limit cycle leave it. Such
a case is shown in Fig. 7.5-lc.
The existence or nonexistence of limit cycles in the behavior of a system
is of fundamental importance to engineers. For example, a control system
engineer generally desires systems without limit cycles, although small-
amplitude oscillations are sometimes acceptable. On the other hand, an
engineer designing an oscillator would definitely want a system with a
stable limit cycle. Several theorems are available to guide the engineer in
this respect.

Poincare’s Index1213

The index n of a closed curve in the phase plane is given by N, the total
number of centers, foci, and nodes enclosed, minus S, the number of
saddle points enclosed. That is, n = N — S. A necessary condition for a
490 Introduction to Stability Theory and Lyapunov's Second Method

closed curve to be a limit cycle is that its index be +1. This criterion is not
sufficient, however. As an illustration, consider Example 7.4-1. A large
closed curve enclosing the two focal points and the saddle point in Fig.
7.4-5 has an index of +1. Thus a limit cycle conceivably could exist which
encloses all the singular points. This is not the case, however, as can be
shown by Bendixson’s negative criterion.

Bendixson’s Negative Criterion

This criterion is sometimes useful for proving that no limit cycle exists
in a region of the phase plane. Consider tile equations xx — fx{x^ x2),
x2 = f2(x1, x2). The slope of any trajectory is given by

dx2 __ £2 = /2O1, s2)

dx 1 xx /iOj, x2)
This can be rewritten as fx(xl5 x2) dx2 — f2(x1, x2) dxx = 0. Thus, around a
limit cycle,

(> [fi(xi> ^2) dx2 — f2(xx, x2) dxx\ = 0 (7.5-1)

By Gauss’ theorem,! Eq. 7.5-1 becomes

3/1 + 3/2 dxx dx2 = 0 (7.5-2)


_dxx dx2_

If the integrand of Eq. 7.5-2, i.e.,


j _ dfi_ +% (7.5-3)
dxx dx2

does not change sign or vanish identically within a region of the phase
plane, the integral of Eq. 7.5-2 cannot be zero. Since Eq. 7.5-2 applies
along a limit cycle, no limit cycle can exist within a region of the phase
plane in which / does not change sign or vanish identically.
Example 7.5-1. As an illustration of the use of this theorem, determine I for Example
7.4-1.
Use of Eqs. 7.4-10 and 7.5-3 yields I = −2a. Since this does not change sign, nor is it
zero anywhere in the x₁x₂-plane, no limit cycle can exist for the flip-flop of Fig. 7.4-1.
Thus the circuit of Fig. 7.4-1 cannot oscillate. This is an important property for a flip-
flop.

Example 7.5-2. The system of Fig. 7.5-2 is characterized by

ẋ₁ = x₂
ẋ₂ = −x₁ + 2ζx₂(1 − x₂²/3a) (7.5-4)

† See any standard text on advanced calculus or electromagnetic theory.



Fig. 7.5-2

Assuming 0 < ζ < 1, this system has only one singular point, an unstable focus, at
x₁ = x₂ = 0. Thus its index is +1 and a limit cycle could exist. Apply Bendixson's
negative criterion.
Evaluation of I given by Eq. 7.5-3 yields

I = 2ζ(1 − x₂²/a)

Since I does not change sign or identically vanish for |x₂| < √a, no limit cycle can exist
which is wholly contained within the region specified by |x₂| < √a, assuming a > 0.
A limit cycle which is not wholly contained within this region does exist, however, since
Eq. 7.5-4 corresponds to the Rayleigh equation14 ÿ − 2ζẏ(1 − ẏ²/3a) + y = 0. Notice,
from the expression for I, that no limit cycle can exist for negative values of a.
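The divergence test of Eq. 7.5-3 is easy to mechanize. The following sketch is an added illustration, not part of the original example; it evaluates I symbolically for the system of Eq. 7.5-4, where the symbol names zeta and a stand for the parameters of Example 7.5-2 and the sympy library is assumed to be available.

import sympy as sp

x1, x2, zeta, a = sp.symbols('x1 x2 zeta a', real=True)
f1 = x2
f2 = -x1 + 2*zeta*x2*(1 - x2**2/(3*a))

# Bendixson's quantity I = df1/dx1 + df2/dx2; no limit cycle can lie wholly
# inside a region where I neither changes sign nor vanishes identically.
I = sp.simplify(sp.diff(f1, x1) + sp.diff(f2, x2))
print(I)    # an expression equivalent to 2*zeta*(1 - x2**2/a)

For ζ > 0 the printed expression changes sign only on the lines x₂ = ±√a, which reproduces the conclusion reached above.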

Poincare-Bendixson Theorem15

The Poincare-Bendixson theorem states that, if a trajectory remains


inside a finite region without approaching any singular points, then the
trajectory is either closed or approaches a closed trajectory. Such a
closed trajectory need not be a limit cycle, e.g., the closed trajectories about
a center.
If a region can be determined for which the theorem applies, then it is of
considerable use. Unfortunately, however, the determination of such a
region is often very difficult. One approach which is occasionally useful
is to choose two concentric circles Cx and C2, which define a region R as
shown in Fig. 7.5-3. If there are no singular points in R or on Cx and C2,
and if trajectories enter R through every point of C1 and C2, then there is at
least one closed trajectory in the region R. The same is true if the trajec¬
tories leave R.
Example 7.5-3. Use the concept of the Poincare-Bendixson theorem to enlarge the
known region in which the Rayleigh equation cannot possess a limit cycle.

From Eq. 7.5-4, the slope of any trajectory is given by

dx₂/dx₁ = [−x₁ + 2ζx₂(1 − x₂²/3a)]/x₂

Let C₁ be a circle of radius r with its center at the origin, so that x₁² + x₂² = r². Then
the slope of the circle is given by

dx₂/dx₁ = −x₁/√(r² − x₁²) = −x₁/x₂

The difference between the slope of the trajectory and the slope of the circle at a common
point is

δ = 2ζ(1 − x₂²/3a)

For all |x₂| < √(3a), δ > 0, so that the slope of the trajectory is more positive than the
slope of the circle. This indicates that all trajectories pass out of C₁ for any r < √(3a).
Thus the system of Fig. 7.5-2 has no limit cycles within the region defined by r < √(3a).
This region is larger than that defined by Bendixson’s negative criterion.
In this example, it is not possible to use similar reasoning to determine a circle C2 for
which all trajectories enter, in an attempt to locate the limit cycle. The necessary curve
is more complicated.16
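Although a trapping region is difficult to construct analytically here, the conclusion of the theorem is easy to observe numerically. The sketch below is an added illustration with assumed values of ζ and a; it integrates Eq. 7.5-4 from one initial state near the unstable focus and one well outside, and both motions settle onto the same closed orbit.

import numpy as np
from scipy.integrate import solve_ivp

zeta, a = 0.5, 1.0                      # assumed values, 0 < zeta < 1, a > 0

def rayleigh(t, x):
    x1, x2 = x
    return [x2, -x1 + 2*zeta*x2*(1 - x2**2/(3*a))]

for x0 in ([0.1, 0.0], [4.0, 0.0]):     # one start inside, one outside
    sol = solve_ivp(rayleigh, (0.0, 40.0), x0, max_step=0.01)
    amp = np.max(np.abs(sol.y[1, sol.t > 30.0]))
    print(x0, 'late amplitude of x2:', amp)   # essentially the same for both starts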

Poincare’s Successor Function17,18

The successor function introduced by Poincare is useful for proving


certain theorems relating to limit cycles. More important to engineers,
however, is the fact that the concept can be used to derive conditions on
system parameters for which limit cycles can exist. This is particularly
true for piecewise linear systems, where expressions for the trajectories
in various regions of the phase plane can be determined.
The successor function can be illustrated by means of Fig. 7.5-4. In
Fig. 7.5-4a is shown a curve C, which is intersected by a trajectory first at

Fig. 7.5-4

point a, and next at point b. Thus point b is the successor to point a. If


u is a parameter such as arc length on C, points a and b can be written as
functions of u in the form a = a(wa), b = a(w6). Furthermore, since a and b
are points on a solution curve, they are related by this solution. Thus it is
possible to write ub as a function of ua in the form ub = g(ua). The function
g(u) is the successor function. In the case of a limit cycle, points a and b
coincide as shown in Fig. 1.5-Ab. Such points are given by the solution of
u ~ g(u)' The use of the successor function concept to determine condi¬
tions on system parameters for which limit cycles can or cannot exist is
demonstrated by means of an example.
Example 7.5-4. The system to be considered is the relay control system with rate
feedback shown in Fig. 7.5-5a. The relay is assumed to have deadband and hysteresis as
indicated by the characteristic of Fig. 7.5-5b. The relay characteristic has odd symmetry,
so that m(σ) = −m(−σ), where σ = −(x₁ + kx₂). The relay output m(σ) can have the
values 0, +1, or −1, depending upon the value of σ and its past history. The system
dynamics follow linear relationships for each of these values, so that the system is
piecewise linear. That is, the phase plane can be divided into the three regions

m(σ) = 0, Region I
m(σ) = −1, Region II
m(σ) = +1, Region III

and the trajectories in each of these regions correspond to those of a linear system. Use
the successor function to determine the conditions for a limit cycle.
The trajectories in each of the three regions can be determined from the state equations
ẋ₁ = x₂ and ẋ₂ = −x₂ + m(σ). The trajectory slopes are given by

dx₂/dx₁ = [−x₂ + m(σ)]/x₂ (7.5-5)

In Region I, m(σ) is zero, so that Eq. 7.5-5 becomes

dx₂/dx₁ = −1
Fig. 7.5-5

Fig. 7.5-6

This can be integrated directly to give the expression for the trajectories in Region I.
They are the straight lines

x₂ = −x₁ + c₁, c₁ = constant (7.5-6)

shown in Fig. 7.5-6. It should be observed that there is an infinity of singular points in
Region I given by |x₁| < d, x₂ = 0.
In Region II, Eq. 7.5-5 is
dx₂/dx₁ = −(1 + x₂)/x₂ (7.5-7)

Integration of Eq. 7.5-7 yields

x₁ + x₂ = ln (1 + x₂) + c₂, c₂ = constant (7.5-8)

The trajectories in Region II are shown in Fig. 7.5-6. They are shown only for x₂ > −1,
since Eq. 7.5-8 is valid only in this range. Since the slope of the trajectories is zero at
x₂ = −1, as indicated by Eq. 7.5-7, no trajectories can cross x₂ = −1. Thus no limit
cycle can exist outside the range |x₂| < 1. The trajectories in Region III have the same
shape as those in Region II, but they are rotated by 180°.
The three regions in Fig. 7.5-6 are divided by the four switching lines as indicated,
assuming k > 0. These lines are the locus of points at which the relay output switches
value. The reader should coordinate these switching lines with the relay characteristic
of Fig. 7.5-5b and verify that such is the case.
A possible stable limit cycle is shown in Fig. 7.5-6. It is yet to be determined if a limit
cycle actually exists. In order to do this, a successor function will be determined.
Because of the symmetry which exists in this example, however, the successor function
need span only one-half of the possible limit cycle. That is, a successor function will be
determined which relates the point [x₁₀ = −(hd + kx₂₀), x₂₀] to the point [x₁₂ = hd −
kx₂₂, x₂₂]. If x₁₂ = −x₁₀ and x₂₂ = −x₂₀, then a limit cycle exists.
From Eq. 7.5-6, it is apparent that the points (x₁₀, x₂₀) and (x₁₁, x₂₁) are related by

x₂₁ − x₂₀ = −(x₁₁ − x₁₀) (7.5-9)


Also, since the two points are on switching lines, the expressions for these switching lines
can be used to eliminate x₁₀ and x₁₁ from Eq. 7.5-9. That is, since x₁₀ + kx₂₀ = −hd
and x₁₁ + kx₂₁ = d, Eq. 7.5-9 can be written as

x₂₁ = x₂₀ − d(1 + h)/(1 − k) (7.5-10)

Similarly, Eq. 7.5-8 and the appropriate switching line expressions can be used to relate
x₂₁ and x₂₂ by

(1 − k)(x₂₂ − x₂₁) − d(1 − h) = ln [(1 + x₂₂)/(1 + x₂₁)] (7.5-11)

Equations 7.5-10 and 7.5-11 can be combined to eliminate x₂₁ and thereby yield a
relationship between x₂₀ and x₂₂. After some manipulation, it can be written as

(1 + x₂₂) exp [−x₂₂(1 − k)] = [1 + x₂₀ − d(1 + h)/(1 − k)] exp [−x₂₀(1 − k) + 2dh]
(7.5-12)

Although this expression cannot be solved explicitly for x₂₂ in terms of x₂₀, it is in essence
a successor function relating these two coordinates. If x₂₂ = −x₂₀ satisfies Eq. 7.5-12,

Fig. 7.5-7

there is a limit cycle. This is true since this equality would also necessitate x12 = —x10,
because the two points are on lines of equal slope, symmetrically located about the
origin.
In order to determine if any solutions to Eq. 7.5-12 and x₂₂ = −x₂₀ exist, let

p₀(x) = [1 + x − d(1 + h)/(1 − k)] exp [−x(1 − k) + 2dh]

and p₂(x) = (1 − x) exp [x(1 − k)]. If there is a value of x for which p₀(x) = p₂(x),
then that value of x is the x₂₀ = −x₂₂ satisfying Eq. 7.5-12, since p₀(x) and p₂(x)
correspond to the right and left sides of this equation, respectively, when x₂₀ = x and
x₂₂ = −x. Typical curves of
p₀(x) and p₂(x) with an intersection, and hence a limit cycle, are shown in Fig. 7.5-7
for x in the range d(1 + h)/(1 − k) < x < 1. The upper limit was specified previously
in connection with Eq. 7.5-8. The lower limit is a consequence of the fact that the Region
I portion of any limit cycle trajectory cannot intersect the region of singular points,
otherwise the system would come to rest. The initial conditions for which no periodic
motions exist are shown in Fig. 7.5-9 in Area I.
From the values of p₀(x) and p₂(x) at x = d(1 + h)/(1 − k), it is apparent that the
curves do not intersect unless

d(1 + h)/(1 − k) < 1 − e⁻²ᵈ (7.5-13)

Thus Eq. 7.5-13 is a necessary condition for a limit cycle. The values of d and h satisfying
Eq. 7.5-13 for various values of k are those less than the values on the curves of Fig.
7.5-8. For k > 0.5, no limit cycle can exist.
In order to show that a limit cycle exists if Eq. 7.5-13 is satisfied, assume that x₂₀ =
x = x_a in Fig. 7.5-7. Then p₀(x_a) = p₂(x_b) corresponds to a solution of Eq. 7.5-12, so
that the resulting x₂₂ = −x_b. By symmetry, x_b is the value of x₂₀ for the next half-cycle, so that the
new x₂₂ = −x_c. This procedure continues until x₂₀ = −x₂₂ = x₀. This is a stable limit
cycle, since the same result is achieved starting from x_a′.
For the case illustrated, the phase plane can be divided into three areas as in Fig.
7.5-9. If the initial conditions are such that the initial point is in Area I, the system is

Fig. 7.5-8

stable and the trajectory goes to the singular region. If the initial point is in Area II,
the trajectory approaches the stable limit cycle from within. If the initial point is in
Area III, the trajectory approaches the stable limit cycle from outside.
This example has demonstrated the use of the successor function for determining
conditions under which limit cycles can or cannot exist. Typical results are those of
Fig. 7.5-8, which indicate the amount of rate feedback, in terms of the relay deadband
and hysteresis, required to prevent the existence of a limit cycle. Figure 7.5-7 illustrates
a method of determining the size of the limit cycle if one exists.
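The graphical intersection of Fig. 7.5-7 can also be carried out numerically. The short sketch below is an added illustration with assumed values of d, h, and k; it first checks the necessary condition of Eq. 7.5-13 and then solves p₀(x) = p₂(x) for the limit-cycle value x₂₀ = x₀.

import numpy as np
from scipy.optimize import brentq

d, h, k = 0.1, 0.2, 0.1                        # assumed relay and rate-feedback parameters

def p0(x):   # right side of Eq. 7.5-12 with x20 = x
    return (1 + x - d*(1 + h)/(1 - k))*np.exp(-x*(1 - k) + 2*d*h)

def p2(x):   # left side of Eq. 7.5-12 with x22 = -x
    return (1 - x)*np.exp(x*(1 - k))

lower = d*(1 + h)/(1 - k)                      # lower limit of x in Fig. 7.5-7
assert lower < 1 - np.exp(-2*d), 'Eq. 7.5-13 violated: no limit cycle'
x0 = brentq(lambda x: p0(x) - p2(x), lower + 1e-9, 1 - 1e-9)
print('limit-cycle value x20 =', x0)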

Summary

This section has considered limit cycles, and some of the theorems and
methods which may be useful in determining their existence. They are, in
essence, methods of estimating if a system has periodic motion without
explicitly evaluating the system response. Unfortunately, the phase plane
approach is limited primarily to unforced, second order systems. Theoreti¬
cally at least, the concepts may be extended to higher order phase space to
treat systems of higher than second order.5,19 In general, however, the
extension has not been an overwhelming success. This is primarily due to
the problem of determining, presenting, and visualizing trajectories in
phase space. For this reason, most of the theory has not been extended to
include higher order systems.

7.6 RATE OF CHANGE OF ENERGY CONCEPT—INTRODUCTION


TO THE SECOND METHOD OF LYAPUNOV

Contrasted with the techniques already presented in this chapter, the


second or direct method of Lyapunov can be used to determine the
stability of the behavior of higher order systems. It applies to systems
which may be forced or unforced, linear or nonlinear, stationary or time-
varying, and deterministic or stochastic. However, much remains to be
done to make the method useful and practical in many cases. For this
reason, and because of the introductory nature of this chapter, the dis¬
cussion which follows is not so all-inclusive.
The philosophy of Lyapunov’s second method is the same as most of the
stability methods utilized by control engineers, that is, to answer the ques¬
tion of stability without solving the characterizing differential equation.
The major limitation of the method is one of determining a suitable
Lyapunov function required to indicate stability. If one can be found, then
stability is verified. If one cannot be found, however, this does not
necessarily indicate instability. It may be a reflection on the experience
and ingenuity of the user of the method. This difficulty is gradually being

alleviated as more methods for generating possible Lyapunov functions


are developed. Although first published in a Russian journal in 1892 and
translated into French in 1907, it does not appear that Lyapunov’s second
method was employed to investigate the stability of the responses of
nonlinear control systems until 1944.20,21 The method was available in the
United States by 1949 but did not become widely known to engineers here
until 1960.22,23 Thus extensive experience with the
method is yet to be acquired in this country.
As an introduction to the second method of
Lyapunov, consider the mechanical configuration of
Fig. 7.6-1, in which the unit mass is permitted to
move in the y direction only. The spring and friction
effects are nonlinear, as represented by the functional
force equations F_spring = −k(y) and F_friction = −b(ẏ).
Thus a summation of the forces acting on the mass
indicates that the system can be described by the nonlinear differential
equation ÿ + b(ẏ) + k(y) = 0. In the conservative or dissipationless case,
b(ẏ) is zero and the total energy of the system is constant.
Studying first the behavior of this conservative case in the phase plane,
let x₁ = y and x₂ = ẋ₁. Then the system is described by

ẋ₁ = x₂
ẋ₂ = −k(x₁) (7.6-1)

Assuming k(0) is zero and k(x₁) ≠ 0 for x₁ ≠ 0, there is only one equilib-
rium point, a center, located at the origin of the x₁x₂-plane. The trajec-
tories enclose the center and are given by the solutions of the equation

dx₂/dx₁ = −k(x₁)/x₂ (7.6-2)

Equation 7.6-2 can be integrated by separation of variables to yield

x₂²/2 + ∫₀^{x₁} k(x₁) dx₁ = c, c = constant (7.6-3)

Thus the trajectories take the form shown in Fig. 7.2-2. The exact shape
depends upon k(y).
The total energy of this conservative system is the kinetic x₂²/2, plus the
potential

∫₀^{x₁} k(x₁) dx₁

Thus the total energy is

E(x₁, x₂) = x₂²/2 + ∫₀^{x₁} k(x₁) dx₁ (7.6-4)

Equations 7.6-3 and 7.6-4 yield E(x₁, x₂) = c. In other words, the x₁x₂
trajectories are contours of constant total energy for this conservative
system. Alternatively, the time rate of change of the system energy is zero,
as is also shown by

dE/dt = x₂ẋ₂ + k(x₁)ẋ₁ = x₂[ẋ₂ + k(x₁)] = 0 (7.6-5)

where ẋ₂ = −k(x₁) from Eq. 7.6-1.


For the case with dissipation [x₂b(x₂) > 0, for x₂ ≠ 0], the system is
described by

ẋ₁ = x₂
ẋ₂ = −[k(x₁) + b(x₂)] (7.6-6)

The trajectories are given by the solution of

dx₂/dx₁ = −[k(x₁) + b(x₂)]/x₂
This expression could be graphically integrated by the method of isoclines


to determine solutions.
If only information about the stability of the system is desired, however,
a simpler approach is to evaluate the time rate of change of energy. The
rate of change of energy, as defined in Eq. 7.6-5, is dE/dt = x₂[ẋ₂ + k(x₁)].
Using Eq. 7.6-6, the rate of change of energy for the nonconservative case
becomes dE/dt = −x₂b(x₂). If x₂b(x₂) is positive for any nonzero x₂,
this shows that the energy is always decreasing along any solution of
Eq. 7.6-6, except on the line x₂ = 0. Therefore, if a trajectory for this case
were superimposed on those of Fig. 7.2-2, the result would appear as in
Fig. 7.6-2. The motion of the system is from one contour of constant
energy to a contour of lesser constant energy, except on the line x₂ = 0.

Fig. 7.6-2

On this line, dx₂/dx₁ is infinite if x₁ ≠ 0. Thus no trajectory can terminate


there unless x1 = 0, and hence the system approaches the equilibrium point
at the origin.
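The energy argument can be checked numerically for any particular spring and friction characteristics. The sketch below is an added illustration with assumed k(x₁) and b(x₂); it integrates Eq. 7.6-6 and shows that the total energy of Eq. 7.6-4 decays along the computed motion.

import numpy as np
from scipy.integrate import solve_ivp

k = lambda x1: x1 + 0.5*x1**3            # assumed spring characteristic, k(0) = 0
b = lambda x2: 0.4*x2                    # assumed friction, x2*b(x2) > 0 for x2 != 0

def f(t, x):
    return [x[1], -(k(x[0]) + b(x[1]))]  # Eq. 7.6-6

sol = solve_ivp(f, (0.0, 20.0), [1.5, 0.0], max_step=0.01)
x1, x2 = sol.y
E = 0.5*x2**2 + 0.5*x1**2 + 0.125*x1**4  # Eq. 7.6-4 evaluated for the assumed k
print(E[0], E[-1])                       # E decays from its initial value toward zero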
The preceding discussion is a demonstration of the intuitive reasoning
that, if the time rate of change of energy of an isolated physical system is
negative for every possible state, except for a single equilibrium state, then
the energy will continually decrease until it assumes its minimum value at
the equilibrium state.23 This agrees with one’s intuitive concept of stability.
A problem arises, however, when one attempts to convert this intuitive
concept into a rigorous mathematical technique for determining stability.
One of the difficulties is that there is no natural way of defining energy
when the system equations are given in purely mathematical form. In
order to circumvent this difficulty, the concept of a Lyapunov function is
introduced.
A Lyapunov function is a scalar function of the state of the system. As
in the preceding discussion, it may sometimes be taken as the total energy
of the system. This need not always be the case, however. By means of
theorems due to Lyapunov and others, it is possible to specify various types
of stability, corresponding to various conditions satisfied by the Lyapunov
function. Various types of stability are considered in the next section.

7.7 STABILITY AND ASYMPTOTIC STABILITY

The preceding sections of this chapter have considered the “stability” of


autonomous nonlinear systems. In the course of these considerations,
several significant points should have occurred to the reader familiar only
with stability in connection with linear systems.
First of all, the stability of a linear system is a property of the system
itself and is not influenced by the states of the system or the input signals.
A linear system is stable or unstable in all of state space. Stability of the
behavior of a nonlinear system, however, as indicated by the system’s
variational equations, is a concept applicable only in an infinitesimal
region about each of the singular points of the system. Furthermore, a
nonlinear system may have both stable and unstable singular points and
hence may exhibit stable behavior in some regions of state space and
unstable behavior in others. Thus, in considering the stability of a non¬
linear system, one must associate the concept with a region of state space.
As will be seen, this region may extend to all of state space in some cases.
Secondly, excluding the cases of the free integrator and the undamped
linear oscillator, if a “stable” linear system is perturbed from an equilib¬
rium point, the system returns to the equilibrium point after an infinite

time interval. However, a perturbed nonlinear system may return to the


equilibrium point, move to some other stable equilibrium point, exhibit
limit cycle oscillations, or have one or more of its states increase without
bound. The intuitive concept of stability is not sufficient to characterize
these possibilities.
Finally, how are stable, unstable, and semistable limit cycles explained
by the intuitive concept of stability? Note that here one is considering the
stability of a closed trajectory, not the stability of a system. In the case of a
linear system, because stability is a property of the system alone, one often
talks about “system stability.” In the case of a nonlinear system, however,
stability is not a property of the system alone. Hence the proper approach
is not to consider stability of the system, but to consider stability of motions
or trajectories. That is, the precise concepts of stability relate to devia¬
tions or perturbations of the states of a system about some specific
motion.
In order to emphasize these concepts, assume that uₛ(t) is a specific
solution of
u̇ = g(u, t) (7.7-1)
so that
u̇ₛ = g(uₛ, t) (7.7-2)
From a precise stability viewpoint, the appropriate question to ask is, “If
u(t) is suddenly perturbed from uₛ(t), how does the resulting solution u(t)
behave with respect to uₛ(t)? Does u(t) remain in the neighborhood of
uₛ(t) as time passes, or does it diverge from uₛ(t)?” Mathematically this is
equivalent to introducing the variable x = u − uₛ, which measures the
deviation of u(t) from the specific solution uₛ(t). Then Eq. 7.7-1 becomes

u̇ = ẋ + u̇ₛ = g[x + uₛ, t] (7.7-3)

Substitution of Eq. 7.7-2 into Eq. 7.7-3 yields the differential equation for
the perturbed motion x(t) as

ẋ = g[x + uₛ, t] − g[uₛ, t] (7.7-4)

Equation 7.7-4 has an equilibrium point at the origin, since it has the
trivial solution x = 0. Thus one can now consider stability of the origin
of the equivalent system
ẋ = f(x, t), f(0, t) = 0 (7.7-5)

where f(x, t) = g[x + uₛ, t] − g[uₛ, t]. Unfortunately, Eq. 7.7-5 is often
more complicated than the original version, Eq. 7.7-1. Furthermore, the
specific solution of Eq. 7.7-1 is required to determine Eq. 7.7-5. However,
Eq. 7.7-5 does permit the definitions of stability to be formulated in terms
of the stability of an equilibrium state at the origin, and most of the

stability theorems have been so formulated in recent works. A similar


approach is utilized in the definitions which follow.
The equilibrium state of Eq. 7.7-5 is stable if, subsequent to a small
perturbation from the equilibrium, all motions of the system remain in a
correspondingly small neighborhood of the equilibrium. With reference
to Fig. 7.7-1, this can be written in precise mathematical terms as
follows.24,25
The equilibrium of Eq. 7.7-5 is stable if, given any ε > 0, there exists a
δ(ε, t₀) > 0 such that ‖x₀‖ < δ implies ‖x(t; x₀, t₀)‖ < ε for all t > t₀.
x(t; x₀, t₀) is the response at time t to a sudden perturbation x₀ = x(t₀),
which exists at time t₀, and ‖x‖ is the Euclidean length of the vector x,
i.e., ‖x‖² = x₁² + x₂² + ⋯ + xₙ². In other words, a bound ε on the
perturbed response is first specified. If one can then find a bound δ on the
size of the perturbation x₀, so that the perturbed response x(t; x₀, t₀) due
to any x₀ within the bound δ always remains within its bound ε, then the
equilibrium is said to be stable. The equilibrium is unstable if it is not
stable.
As indicated previously, this stability concept is a local one. It is
sometimes referred to as stability in the small, since one does not know
a priori how small it may be necessary to choose δ. For example, a stable
equilibrium point could be surrounded in the phase plane by an unstable
limit cycle. If δ is chosen too large, the perturbed response would not
return to the vicinity of the equilibrium point.
A center singular point, characteristic of a linear oscillator, is stable
according to the definition above, since one can make the amplitude of the
oscillation arbitrarily small by decreasing the initial condition. A stronger
type of stability is desired for most control systems (except perhaps for
regulators), however. Usually, in that case, one would desire a perturbed

system to return to the equilibrium point. This corresponds to asymptotic


stability. Such motion is characteristic of stable foci and nodes. Again
this is a local concept, defined in the Lyapunov sense as follows.24,25
The equilibrium of Eq. 7.7-5, defined to be at the origin, is asymptotically
stable if, in addition to the equilibrium being stable, there exists a δ₀(t₀) >
0 such that, if ‖x₀‖ < δ₀, the solution x(t; x₀, t₀) approaches 0 as t
approaches infinity. Asymptotic stability is also represented in Fig. 7.7-1.
It is characterized by the fact that the perturbed response approaches the
equilibrium point as time approaches infinity.
There are generalizations of these stability definitions, and in fact many
more stability definitions are in the mathematical literature. The general¬
izations and additional definitions which are useful to the engineer are
presented in later sections. The next section considers the application of
Lyapunov’s second method to investigate the behavior of autonomous
continuous systems.

7.8 LYAPUNOV’S SECOND METHOD FOR AUTONOMOUS


CONTINUOUS SYSTEMS—LOCAL CONSIDERATIONS

As indicated previously, the stability or asymptotic stability of the


equilibrium at the origin of the autonomous system

ẋ = f(x), f(0) = 0 (7.8-1)

can often be determined using Lyapunov’s second method. If the equilib¬


rium is not at the origin, the method used to arrive at Eq. 7.7-5 can be
utilized so that an equilibrium at the origin can be considered. f(x) is
assumed to be continuous, and such that the existence, uniqueness, and
continuous dependence of the solutions upon the initial conditions are
assured.
Lyapunov’s second method requires the utilization of a continuous
scalar function of the state variables V(x), in conjunction with Eq. 7.8-1,
the state equations. Depending upon the properties of V(x) and its time
derivative, instability or various forms of stability of the equilibrium can
be proved. Two types of V(x) of particular importance are the semidefinite
and the definite forms.†
The function V(x) is semidefinite in a neighborhood about the origin if
it is continuous and has continuous first partial derivatives, and if it has
the same sign throughout the neighborhood, except at points at which it is
zero. A V(x) ≥ 0 is positive semidefinite, while a V(x) ≤ 0 is negative
semidefinite.
† See also Section 4.9.

The function V(x) is definite in a neighborhood about the origin if it is


continuous and has continuous first partial derivatives, and if it has the
same sign throughout the neighborhood, and is nowhere zero, except
possibly at the origin. A V(x) > 0 for x ≠ 0 is positive definite, while a
V(x) < 0 for x ≠ 0 is negative definite.
These definitions are illustrated by considering examples in an n-
dimensional state space where V(x) = V(x₁, x₂, . . . , xₙ).

Example 7.8-1. V(x) = (x₁ + x₂)² + x₃², n = 3.


This V(x) is positive semidefinite, since it is positive except at the origin and at the points
x₁ = −x₂, x₃ = 0, where it is zero.

Example 7.8-2. V(x) = x₁² + x₂², n = 2.


This V(x) is positive definite, since it is positive except at the origin, where it is zero.

Example 7.8-3. V(x) = x₁² + x₂², n = 3.


This V(x) is positive semidefinite, since, for x₁ = x₂ = 0, it is zero for any x₃.

Example 7.8-4. V(x) = x₁² + x₂² + ⋯ + xₙ².


This V(x) is positive definite for arbitrary n.

Example 7.8-5. V(x) = −(x₁² + x₂² + ⋯ + xₙ²).


This V(x) is negative definite for arbitrary n. The negative of any positive definite
function is negative definite.
Example 7.8-6. V(x) = Σᵢ,ⱼ₌₁ⁿ qᵢⱼxᵢxⱼ = ⟨x, Qx⟩, qᵢⱼ = qⱼᵢ.

Sylvester's theorem states that the necessary and sufficient conditions for V(x) to be
positive definite are that all the successive principal minors of Q be positive, i.e.,

q₁₁ > 0,   det [q₁₁ q₁₂; q₂₁ q₂₂] > 0,   . . . ,   det Q > 0

This theorem was proved in Section 4.9.
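Sylvester's test is convenient to apply numerically. The following sketch is an added illustration, not part of the example; the particular Q matrix is assumed merely to show the computation.

import numpy as np

Q = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 3.0]])          # an assumed symmetric matrix, q_ij = q_ji

# Successive (leading) principal minors of Q.
minors = [np.linalg.det(Q[:k, :k]) for k in range(1, Q.shape[0] + 1)]
print(minors)
print(all(m > 0 for m in minors))        # True here, so <x, Qx> is positive definite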

In addition to the nature of the function V(x), Lyapunov’s second


method requires consideration of the time derivative of V(x) along the
system trajectories. The time derivative of V(x), denoted by W, is

W = dV/dt = (∂V/∂x₁)ẋ₁ + (∂V/∂x₂)ẋ₂ + ⋯ + (∂V/∂xₙ)ẋₙ (7.8-2)

Along any trajectory corresponding to Eq. 7.8-1,

ẋ₁ = f₁(x₁, x₂, . . . , xₙ)
ẋ₂ = f₂(x₁, x₂, . . . , xₙ)
. . . . . . . . . . . . . .
ẋₙ = fₙ(x₁, x₂, . . . , xₙ)

Thus, along any system trajectory, Eq. 7.8-2 is

W = (∂V/∂x₁)f₁(x) + (∂V/∂x₂)f₂(x) + ⋯ + (∂V/∂xₙ)fₙ(x)

In terms of the gradient of V(x), denoted by grad V(x), the time derivative
of V(x) becomes
W = ⟨grad V, f⟩ (7.8-3)

The function W is the time derivative of V(x) for the differential equation
of Eqs. 7.8-1. Note, however, that the solutions of Eqs. 7.8-1 are not
required to evaluate W.
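Because W is obtained by differentiation and substitution only, it is easily formed symbolically. The sketch below is an added illustration using the system of Eq. 7.6-6 with an assumed linear spring k(x₁) = x₁ and friction b(x₂) = c x₂.

import sympy as sp

x1, x2, c = sp.symbols('x1 x2 c', real=True)
V = x2**2/2 + x1**2/2                      # Eq. 7.8-4 with k(u) = u
f = sp.Matrix([x2, -(x1 + c*x2)])          # Eq. 7.6-6 with the assumed k and b
gradV = sp.Matrix([sp.diff(V, x1), sp.diff(V, x2)])

W = sp.simplify(gradV.dot(f))              # W = <grad V, f>, Eq. 7.8-3
print(W)                                   # -c*x2**2: negative semidefinite for c > 0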
Lyapunov’s stability theorem can be written in terms of the functions
V(x) and W.

Lyapunov’s Stability Theorem.26-28 Given the differential system of Eqs.


7.8-1, the equilibrium is stable if it is possible to determine a definite V(x),
such that V(0) = 0 and W is semidefinite of sign opposite to V(x) or
vanishes identically. If V(x) satisfies the requirements of Lyapunov’s
stability theorem, it is called a Lyapunov function.
The requirements of this theorem on the Lyapunov function can be
viewed on an intuitive basis by considering the nonlinear mechanical
system of Section 7.6. Assume that F(x) is the total energy in the con¬
servative case, as in Eq. 7.6-4. This equation is written below as Eq. 7.8-4.

x 2 CXl
E(xlt x2) = V{xx, x2) = — + k(u) du (7.8-4)
2 Jo

Note that F(0) = 0, and F(x) is positive definite since any physical spring
is such that k(u) has the same sign as u. From Eq. 7.6-5, the time derivative
of F(x) is
dE
=~= W = x2[xt + kizj] (7.8-5)
dt

For the conservative case, x2 = —k(x1) and W vanishes identically. This


is the situation in which the origin is a center, and the system moves
around a trajectory which is a contour of constant energy. Thus, given a
perturbation from the origin, the system remains in the vicinity of the
origin and is stable.
For the nonconservative case, Eqs. 7.6-5 and 7.6-6 give W = −x₂b(x₂).
If x₂b(x₂) ≥ 0, corresponding to a negative semidefinite W, the energy can
never increase. The perturbed system remains in the vicinity of the origin
and is stable. If x₂b(x₂) > 0, x₂ ≠ 0, the energy must decrease. The
perturbed system returns to the origin and is asymptotically stable. This

requirement on W for asymptotic stability is indicated in Lyapunov’s


asymptotic stability theorem.

Lyapunov's Asymptotic Stability Theorem.29-31 Given the differential


system of Eqs. 7.8-1, the equilibrium is asymptotically stable if it is
possible to determine a definite V(x), such that V(0) = 0 and W is definite
of sign opposite to V(x).
Example 7.8-7. Figure 7.8-1 represents a second order system with nonlinear feedback.
The objective is to determine conditions on a and g( ) for which the equilibrium is
asymptotically stable.

Fig. 7.8-1

The state equations are ẋ₁ = x₂ and ẋ₂ = −g(x₁) − ax₂, where it is assumed that
g(x₁) = 0 only at x₁ = 0, so that the only equilibrium is at x₁ = x₂ = 0. From the
similarity of these equations to those of Eqs. 7.6-6,

V(x) = x₂²/2 + ∫₀^{x₁} g(u) du

is suggested as a possible Lyapunov function. Note that V(0) = 0, and V(x) for x ≠ 0
is positive if the g(u) versus u characteristic is anywhere in the first and third quadrants
as shown by the solid curve in Fig. 7.8-2. The latter condition can be expressed as

g(u)/u > 0, u ≠ 0 (7.8-6)

V(x) is positive definite if Eq. 7.8-6 is satisfied.


Now, considering the requirement on the time derivative of V(x), W is given by
W = g(x₁)x₂ − x₂[g(x₁) + ax₂] = −ax₂². This is negative semidefinite for a > 0,
indicating a stable equilibrium.
The equilibrium is asymptotically stable if a > 0 and Eq. 7.8-6 is satisfied. This does
not come directly from Lyapunov's theorem, since W is not negative definite. However,
everywhere that W does not comply with the definition of negative definiteness, i.e.,
x₂ = 0, x₁ ≠ 0, dx₂/dx₁ is infinite. Hence no trajectory could terminate on the x₁-axis
except at the origin.
Note that this equilibrium is still asymptotically stable for the nonlinear characteristic
shown dotted in Fig. 7.8-2. V(x) would also be positive definite for this case, since the
area under the characteristic is still positive for any u measured from the origin. However,
this characteristic introduces other equilibria, which may not be asymptotically stable.


Fig. 7.8-2

Example 7.8-8.32,33 Barbasin investigated the stability of the third order nonlinear
differential equation

d³y/dt³ + a d²y/dt² + g(dy/dt) + h(y) = 0 (7.8-7)

The function g(r) is assumed to be continuous, and h(u) to be continuously differentiable.


It is also assumed that g(r) = 0 and h(u) = 0 only at r = 0 and u = 0, respectively.
The state variables are as defined in Fig. 7.8-3, so that the state equations

ẋ₁ = x₂
ẋ₂ = x₃ (7.8-8)
ẋ₃ = −h(x₁) − g(x₂) − ax₃

indicate a single equilibrium at x1 = x2 = x3 = 0. Use Lyapunov’s second method to


determine the conditions for asymptotic stability.

Fig. 7.8-3

A suitable Lyapunov function for this system is†

V(x) = a ∫₀^{x₁} h(r) dr + ∫₀^{x₂} g(r) dr + ½(ax₂ + x₃)² + x₂h(x₁)

Note that V(0) is zero. The time derivative of V(x) is

W = ah(x₁)ẋ₁ + g(x₂)ẋ₂ + (ax₂ + x₃)(aẋ₂ + ẋ₃) + x₂ [dh(x₁)/dx₁] ẋ₁ + h(x₁)ẋ₂

Substitution for ẋ₁, ẋ₂, and ẋ₃ as defined by Eq. 7.8-8 yields

W = −x₂² [a g(x₂)/x₂ − dh(x₁)/dx₁]

W satisfies the conditions for negative definiteness, except where x₂ = 0, if

a g(x₂)/x₂ − dh(x₁)/dx₁ > 0, x₂ ≠ 0 (7.8-9)

Then, using reasoning similar to that of Example 7.8-7, the equilibrium of the system of
Fig. 7.8-3 is asymptotically stable if V(x) is positive definite. These requirements on
V(x) are now examined.
If a and h(u) are such that
a > 0
h(u)/u > 0, u ≠ 0 (7.8-10)

then the conditions for positive definiteness of the first term of the V-function are
satisfied. Furthermore, satisfaction of Eq. 7.8-10, in conjunction with Eq. 7.8-9, yields
g(x₂)/x₂ > 0 for x₂ ≠ 0. This means that the second term of the V-function is positive
definite if Eqs. 7.8-9 and 7.8-10 are satisfied. Also, the third term of the V-function
satisfies the positive definite conditions, with the exception that it is zero for ax₂ = −x₃.
Even if this is the case, however, V(x) is not zero since the second term is positive.
Thus it remains to consider the last term in the V-function for x₂ ≠ 0. When x₂ ≠ 0,
it is possible to write the V-function in the form

V(x) = ½(ax₂ + x₃)² + {[2G(x₂) + x₂h(x₁)]² + 4 ∫₀^{x₁} h(u) [∫₀^{x₂} (a g(r)/r − dh(u)/du) r dr] du} / [4G(x₂)]

where

G(x₂) = ∫₀^{x₂} g(r) dr

The sum of the first term of V(x) and the first term of the fraction in V(x) is always
positive for nonzero x₂. Also, the integration with respect to r is positive because of
Eq. 7.8-9. Hence the integration with respect to u is positive. Thus it is now possible to
say that V(x) is positive definite and the equilibrium is asymptotically stable if Eqs.
7.8-9 and 7.8-10 are satisfied.
It is interesting to note that the criteria for asymptotic stability of the equilibrium
contain two different types of linearizations.23,34 These are dh(u)/du and h(u)/u. These

† This Lyapunov function is derived in Example 7.14-2.



linearizations are represented in Fig. 7.8-4 at a point u = u0. The first linearization is
the slope of the function, while the second is the slope of the intercept. If h(u) were
voltage and u were current for a nonlinear resistance, then dh/du would be the a-c or
dynamic resistance, while h(u)/u would be the d-c or static resistance, both evaluated
at u = u₀.
If the dynamic and static linearizations are the same, as they would be for a linear
h(y) function, and if g(ẏ) is linear, then it is possible to replace h(y) by a₀y and g(ẏ) by
a₁ẏ in Eq. 7.8-7. It then becomes

d³y/dt³ + a d²y/dt² + a₁ dy/dt + a₀y = 0 (7.8-11)

and the stability conditions of Eqs. 7.8-9 and 7.8-10 are

a > 0
a₀ > 0 (7.8-12)
aa₁ − a₀ > 0

These are precisely the Routh-Hurwitz conditions for stability of the linear system
described by Eq. 7.8-11. Unfortunately, however, when one attempts to analyze a
nonlinear system by introducing some type of linearization, it is usually not evident
which type of linearization should be used. This is illustrated by this example, in which
both types of linearizations are involved.
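The algebra leading to the expression for W in this example can be verified symbolically for any particular characteristics. The sketch below is an added check with assumed functions g(r) = r + r³ and h(u) = 2u.

import sympy as sp

x1, x2, x3, a, r = sp.symbols('x1 x2 x3 a r', real=True)
g = lambda s: s + s**3                    # assumed g, g(0) = 0
h = lambda s: 2*s                         # assumed h, h(0) = 0

V = (a*sp.integrate(h(r), (r, 0, x1)) + sp.integrate(g(r), (r, 0, x2))
     + sp.Rational(1, 2)*(a*x2 + x3)**2 + x2*h(x1))
xdot = {x1: x2, x2: x3, x3: -h(x1) - g(x2) - a*x3}    # Eq. 7.8-8

W = sp.simplify(sum(sp.diff(V, v)*xdot[v] for v in (x1, x2, x3)))
print(W)     # equivalent to -x2**2*(a*(1 + x2**2) - 2), the form of Eq. 7.8-9

With these assumed characteristics, a g(x₂)/x₂ = a(1 + x₂²) and dh/dx₁ = 2, so Eq. 7.8-9 is satisfied for a ≥ 2.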

In concluding this section, it should be emphasized that the stability or


asymptotic stability of the equilibrium at the origin of autonomous systems
has been considered. The conclusions about such an equilibrium are local
ones, valid only in a small neighborhood of the equilibrium. However,
one might feel that the asymptotic stability of the system of Fig. 7.8-3 for
the conditions of Eqs. 7.8-9 and 7.8-10 is more than local, since the
“equivalent” linear system described by Eq. 7.8-11 is stable everywhere if
Eq. 7.8-12 is satisfied. Such is indeed the case, as is verified in the next
section.

7.9 ASYMPTOTIC STABILITY IN THE LARGE

The discussion of the preceding section indicated the possibility of an


equilibrium being asymptotically stable, regardless of the point x0 from
which the motion originates. In such a case, the equilibrium is said to be
asymptotically stable in the large. This form of stability is extremely
important to the engineer, since it guarantees that all perturbed motions
will return to the equilibrium point. Furthermore, this type of stability
is related to the effect of disturbances upon system motions, which is
considered in the next section.

Theorem 7.9-1.35,36 If the differential system of Eq. 7.8-1 is asymptotically


stable, i.e., if V(x) > 0 and W < 0 for all x ≠ 0, and if V(x) approaches
infinity as ||x|| approaches infinity, then the origin is asymptotically stable
in the large.

The requirement for asymptotic stability in the large, beyond those of


asymptotic stability, is that V(x) approach infinity as x approaches infinity
in any direction. This requirement can be intuitively comprehended by
returning to the nonlinear mechanical system considered in Section 7.6
and again in Section 7.8. The curves of constant energy are given in
Eq. 7.8-4 by
E(x₁, x₂) = V(x₁, x₂) = x₂²/2 + ∫₀^{x₁} k(u) du = c, c = constant (7.9-1)

Suppose that k(u) is such that the potential energy term does not become
infinite as x1 approaches infinity, but approaches a finite limit c0. If
c0 < c, the curves of constant V may not all be closed curves as in Fig.
7.6-2, but they could appear as in Fig. 7.9-1. In this case the origin would
be asymptotically stable, but not asymptotically stable in the large. It
would not be asymptotically stable in the large, since for sufficiently large
x0 it is possible for the system to move from a state of higher energy to a
state of lower energy without approaching the equilibrium at the origin.
The requirement for asymptotic stability in the large removes this
possibility.37

Example 7.9-1. Determine the conditions for which the equilibrium of the system of
Fig. 7.8-1 is asymptotically stable in the large.
In this case, Eq. 7.8-6 must be replaced by g(u)/u > α > 0, u ≠ 0, where α is a constant.
This ensures that V(x) approaches infinity as ‖x‖ approaches infinity.

Example 7.9-2. Determine the conditions for which the equilibrium of the system of
Fig. 7.8-3 is asymptotically stable in the large.

Fig. 7.9-1

The conditions of Eqs. 7.8-9 and 7.8-10 are replaced by


a > 0
h(u)/u ≥ α₁ > 0, u ≠ 0

a g(x₂)/x₂ − dh(x₁)/dx₁ ≥ aα₂ − α₁ > 0, x₂ ≠ 0

where α₁ and α₂ are constants.

Example 7.9-3.38 The Euler equations

Ixω̇x − (Iy − Iz)ωyωz = Mx
Iyω̇y − (Iz − Ix)ωzωx = My
Izω̇z − (Ix − Iy)ωxωy = Mz

are the equations of motion of a rigid body written about the principal axes of inertia.
Ix, Iy, and Iz denote the moments of inertia about these axes; ωx, ωy, and ωz, the angular
velocities; and Mx, My, and Mz, the external torques.
Assuming that the rigid body is a space vehicle tumbling in orbit, it is desired to stop
the tumbling by applying control torques proportional to the angular velocities. The
control torques are Mx = −kxωx, My = −kyωy, and Mz = −kzωz. Lyapunov's
method can be used to determine the stability of the responses.
Choosing the state variables x₁ = ωx, x₂ = ωy, and x₃ = ωz permits the system to be
described by
ẋ = A(x)x (7.9-2)
where

        │ −kx/Ix           (Iy − Iz)x₃/Ix    0               │
A(x) =  │ 0                −ky/Iy            (Iz − Ix)x₁/Iy  │ (7.9-3)
        │ (Ix − Iy)x₂/Iz   0                 −kz/Iz          │

Note that the system has an equilibrium at x = 0. Let V(x) be the positive definite
quadratic form of Example 7.8-6, i.e., V(x) = ⟨x, Qx⟩, and choose

Q = diag (Ix², Iy², Iz²) (7.9-4)

so that V(x) = Ix²x₁² + Iy²x₂² + Iz²x₃², the square of the norm of the total angular
momentum. The corresponding time derivative of V(x) is W = ⟨ẋ, Qx⟩ + ⟨x, Qẋ⟩.
Using Eq. 7.9-2 yields W = ⟨A(x)x, Qx⟩ + ⟨x, QA(x)x⟩. This can be rewritten as

W = ⟨x, [Aᵀ(x)Q + QA(x)]x⟩ (7.9-5)

Equation 7.9-5 is of the form W = −⟨x, Px⟩, where

−P = Aᵀ(x)Q + QA(x) (7.9-6)

Thus W is negative definite if P is positive definite. Then, since V(x) is positive definite,
and because of its form must approach infinity as x approaches infinity, the equilibrium
is asymptotically stable in the large, if P is positive definite. Substitution of Eqs. 7.9-3
and 7.9-4 into Eq. 7.9-6 yields

P = diag (2kxIx, 2kyIy, 2kzIz)

If each of the k's is positive, P is positive definite and the equilibrium is asymptotically
stable in the large.
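A numerical check of this result is straightforward. The sketch below is an added illustration with assumed inertias and gains; it integrates the controlled Euler equations and shows that V = Ix²ωx² + Iy²ωy² + Iz²ωz² decays to zero.

import numpy as np
from scipy.integrate import solve_ivp

Ix, Iy, Iz = 1.0, 2.0, 3.0               # assumed principal moments of inertia
kx, ky, kz = 0.5, 0.5, 0.5               # assumed positive rate-feedback gains

def euler(t, w):
    wx, wy, wz = w
    return [((Iy - Iz)*wy*wz - kx*wx)/Ix,
            ((Iz - Ix)*wz*wx - ky*wy)/Iy,
            ((Ix - Iy)*wx*wy - kz*wz)/Iz]

sol = solve_ivp(euler, (0.0, 60.0), [1.0, -2.0, 0.5], max_step=0.01)
V = (Ix*sol.y[0])**2 + (Iy*sol.y[1])**2 + (Iz*sol.y[2])**2
print(V[0], V[-1])                       # V decays from its initial value to nearly zero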

7.10 FIRST CANONIC FORM OF LUR’E—ABSOLUTE STABILITY

General stability criteria can be derived which are applicable to classes


of system configurations. Several standard configurations have been
considered. The most useful of these seems to be the first canonic form
of Lur’e, studied by Lur’e, Letov, LaSalle and Lefschetz, and others.40-43
It is the only standard form considered here.
The first canonic form of Lur’e can be useful for investigating the
stability of systems which can be manipulated into the form shown in
Fig. 7.10-ltf. The equivalent form obtained by combining Gx(s), G2(s),
and H(s), so that G(s) = — G1(s)G2(s)H(s), is given in Fig. 7.10-16. By
expanding G(s) in partial fractions, as in Section 5.5, the system rep¬
resentation takes the form in Fig. 7.10-1c. The λ's are the characteristic
values or eigenvalues and are assumed to be distinct and negative real.†
The b's are the residues of G(s) at its poles. A single free integrator in the
linear dynamics is included in the formulation.

† The case in which some of the λ's occur in complex conjugate pairs with negative real
parts can be handled in a somewhat similar fashion.

Fig. 7.10-1

Figure 7.10-1c indicates that the systems to be considered are char-


acterized by
ẋ = Ax + k(σ)b
ż = b₀k(σ) (7.10-1)
σ = ⟨1, x⟩ + z

A is a diagonal matrix whose elements are the characteristic values λ₁,
λ₂, . . . , λₙ, and b and 1 are vectors with components b₁, b₂, . . . , bₙ and
1, 1, . . . , 1, respectively. Since σ = ⟨1, x⟩ + z, it is possible to write σ̇ as

σ̇ = ⟨λ, x⟩ − rk(σ) (7.10-2)

where the scalar r = −(b₀ + b₁ + ⋯ + bₙ) and the vector λ has com-
ponents λ₁, λ₂, . . . , λₙ.
A Lyapunov function for this system is

V(x, z, σ) = ⟨x, Qx⟩ + ∫₀^σ k(σ) dσ

where Q is positive definite and k(σ) is restricted to the class for which

k(σ)/σ ≥ ε > 0, σ ≠ 0, ε constant (7.10-3)

Note that σ = 0 together with x = 0 requires z = 0, so that V(x, z, σ) is


positive definite for all x, z, and σ. Furthermore, V(x, z, σ) approaches
infinity as ‖x‖, σ, or z approaches infinity. The latter condition, i.e.,
V(x, z, σ) approaches infinity as z approaches infinity, is observed from the
fact that, if z approaches infinity in Fig. 7.10-1c, either σ approaches
infinity or one of the x's approaches minus infinity.
The time derivative of V(x, z, σ) is W = ⟨ẋ, Qx⟩ + ⟨x, Qẋ⟩ + k(σ)σ̇.
Substitution of Eqs. 7.10-1 and 7.10-2 yields

W = ⟨x, [AᵀQ + QA]x⟩ + k(σ)[⟨b, Qx⟩ + ⟨x, Qb⟩]
+ k(σ)[⟨λ, x⟩ − rk(σ)]

Similarly to Eq. 7.9-6, it is desirable to choose a positive definite P such


that −P = AᵀQ + QA. In this case, however, A is diagonal so that
Aᵀ = A. Furthermore, since Q is symmetrical, ⟨b, Qx⟩ = ⟨x, Qb⟩. Then

W = −⟨x, Px⟩ + k(σ)[2bᵀQ + λᵀ]x − rk²(σ)

W is a quadratic form in x and k(σ) and is negative definite if and only if44

r > [Qb + ½λ]ᵀP⁻¹[Qb + ½λ] (7.10-4)



Assume that Q is given by

Q = diag (q₁₁, q₂₂, . . . , qₙₙ), qᵢᵢ > 0, i = 1, 2, . . . , n

where all the nondiagonal elements are zero. Then Eq. 7.10-4 becomes
r > −S (7.10-5)
where S is given by

S = (1/4) Σᵢ₌₁ⁿ (bᵢ + λᵢ/qᵢᵢ)² / (λᵢ/qᵢᵢ)

This condition, in conjunction with Eq. 7.10-3, is sufficient for asymptotic


stability in the large. However, it may be more demanding than necessary.
Thus it is desirable to determine the qᵢᵢ which are the least demanding on r.
S is negative since λᵢ < 0. Thus the least restriction on r is to have |S|
a minimum. If bᵢ > 0, choosing qᵢᵢ such that λᵢ/qᵢᵢ = −bᵢ makes the ith
term of S zero. If bᵢ < 0, the ith term of |S| is a minimum for λᵢ/qᵢᵢ = bᵢ.
The corresponding value of the ith term is bᵢ. Thus, if the first m of the bᵢ
are positive and the remaining n − m are negative, Eq. 7.10-5 becomes

−b₀ > b₁ + b₂ + ⋯ + bₘ (each term on the right positive) (7.10-6)
It is readily apparent that Eq. 7.10-6 is applicable only if (—b0) is more
positive than the sum of the positive residues of G(s). Since Eq. 7.10-3,
together with Eq. 7.10-6, form sufficient, but not necessary, conditions for
asymptotic stability in the large, failure to satisfy these conditions does not
necessarily mean that the system is unstable. It may mean that the
criterion does not apply in the particular case under consideration.
If Eq. 7.10-6 is satisfied, the equilibrium is asymptotically stable in the
large for any nonlinear characteristic satisfying Eq. 7.10-3. Stability of
this type, i.e., asymptotic stability in the large for somewhat arbitrary k(σ),
is called absolute stability. Various other absolute stability criteria can be
derived by modifying the previous procedure (e.g., by choosing a different
Q), or the Lyapunov function.45,46
Example 7.10-1. The absolute stability of the equilibrium of the system with G(s) =
−(s + 3)/[(s + 2)(s − 1)] is to be investigated.
Since G(s) does not have a pole at the origin, Eq. 7.10-6 is of no direct use. However
it can be utilized by making use of a technique known as pole-shifting.47 The pole-
shifting technique can be illustrated by means of Fig. 7.10-2. The feedforward and feed-
back φ's cancel so that the basic system and its conditions for stability remain unchanged.

Fig. 7.10-2

However, the feedback around G(s) can be utilized to move the poles of G'(s), since

G'(s) = G(s)/[1 − φG(s)]

In the example under consideration,

G'(s) = −(s + 3)/[s² + (1 + φ)s + (3φ − 2)]

so that choosing φ = 2/3 results in G'(s) having a pole at the origin. Now Eq. 7.10-6 can
be applied.
For φ = 2/3,

G'(s) = −(s + 3)/[s(s + 5/3)] = −(9/5)/s + (4/5)/(s + 5/3)

Thus b₀ = −9/5 and b₁ = 4/5, so that Eq. 7.10-6 indicates absolute stability if

k'(σ)/σ = [k(σ) − 2σ/3]/σ ≥ ε > 0

Since k'(σ) is restricted to the first and third quadrants by Eq. 7.10-3, k(σ) must be
restricted to the unshaded area of Fig. 7.10-3. If k(σ) is replaced by a linear gain k, the
Routh-Hurwitz condition on k for stability is precisely that of Fig. 7.10-3. Aizerman
has shown that this is true in general for second order systems.48

Fig. 7.10-3
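The partial-fraction step of the example is easily reproduced numerically. The sketch below is an added illustration; it forms the pole-shifted transfer function G'(s) with φ = 2/3 and reads off the residues b₀ and b₁ used in Eq. 7.10-6.

from scipy.signal import residue

# G(s) = -(s + 3)/(s**2 + s - 2); with phi = 2/3 the shifted denominator is
# s**2 + (1 + phi)*s + (3*phi - 2) = s*(s + 5/3).
num = [-1.0, -3.0]
den = [1.0, 5.0/3.0, 0.0]

r, p, _ = residue(num, den)
print(dict(zip(p, r)))      # residue -9/5 at s = 0 (b0) and 4/5 at s = -5/3 (b1)

Since −b₀ = 9/5 exceeds the single positive residue b₁ = 4/5, Eq. 7.10-6 is satisfied.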

7.11 STABILITY WITH DISTURBANCES—PRACTICAL STABILITY

Previous sections considered system motions from the viewpoint of a


perturbation suddenly moving the system from its equilibrium state, and
the perturbing force then immediately disappearing. Stability is the
property of the motion which indicates that the effect of a small per¬
turbation upon the motion is not large. Asymptotic stability further
indicates that the effect of a small perturbation disappears with time.
Asymptotic stability in the large indicates that the effect of the perturbation
tends to disappear, independently of its size. In the physical world of real
systems, however, many disturbances are not of the impulsive type. In
fact, they are often stochastic in nature. The preceding discussion says
nothing about disturbances which are not impulsive.
Total stability is a form of stability relating to constantly acting per¬
turbations. In essence, total stability means that the system motion will
remain near the equilibrium if it is not “too far” away initially, and if the
perturbations are not “too large.” Considering the unperturbed system

ẋ = f(x), f(0) = 0 (7.11-1)

and denoting the perturbations by u(x, t), the perturbed system is

ẋ = f(x) + u(x, t) (7.11-2)

Theorem 7.11-1. The equilibrium of Eq. 7.11-1 is totally stable, if for every
ε > 0 there is a δ(ε) > 0 and a δ₁(ε) > 0, such that, if ‖x₀‖ < δ and
‖u(x, t)‖ < δ₁ for all x and t > t₀, then ‖x(t; x₀, t₀)‖ < ε for all t > t₀.49-53

Total stability is analogous to stability and asymptotic stability in the


sense that one does not know how large x0 and u(x, t) can be in practice.
It would be useful to have a form of total stability analogous to asymptotic
stability in the large. Such a form is provided by LaSalle and Lefschetz.53
They refer to total stability as practical stability and hence refer to the
desired form of stability as strong practical stability. This type of stability
is the condition under which the equilibrium is totally stable and, in
addition, each motion x(t; x₀, t₀) of Eq. 7.11-2 for each disturbance
u(x, t) in a region U lies ultimately in a closed bounded region R containing
the equilibrium. The motions are assumed to start at t0 in a region R0
contained in R.

Theorem 7.11-2.53 Let V(x) be a scalar function which for all x has contin-
uous first partial derivatives and be such that V(x) approaches infinity as
‖x‖ approaches infinity. If W, the time derivative of V(x), evaluated for

Eq. 7.11-2 is such that W < −ε < 0 for all x outside R₀, for all u(x, t) in
U, and all t > t₀, and if V(x) < V(q) for all x in R₀ and all q outside R,
then Eq. 7.11-1 possesses a strong practical stability.

This theorem provides an indication of the sizes of x0 and the dis¬


turbance for which total stability can be realized.
Example 7.11-1. Returning to Example 7.9-3, the effects of disturbance torques such
as solar radiation are to be considered. Thus the external torques are

Mx = −kxωx + ux
My = −kyωy + uy
Mz = −kzωz + uz

where the control torques are as previously considered, and the disturbance torques on
the x, y, and z axes are ux, uy, uz, respectively.
Using the same V(x) function as in Example 7.9-3, W for the perturbed system is

W = −2[Ixx₁(kxx₁ − u₁) + Iyx₂(kyx₂ − u₂) + Izx₃(kzx₃ − u₃)]

where u₁, u₂, and u₃ have been written for ux, uy, and uz, respectively. Let u₁ₘ, u₂ₘ, and
u₃ₘ denote the largest absolute values for the disturbances in the region U. Choosing
the region R₀ to be the interior of an ellipsoidal surface containing the three points
(u₁ₘ/kx, 0, 0), (0, u₂ₘ/ky, 0), and (0, 0, u₃ₘ/kz), then W < −ε < 0 for all x outside R₀,
all u(x, t) in U, and all t > t₀. Also, by choosing R to be some region larger than R₀,
V(x) < V(q) for all x in R0 and all q outside R. Thus the system possesses a strong
practical stability. Note that, if the maximum disturbances are increased, the same
regions R₀ and R can be maintained by increasing the gains kx, ky, and kz.
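The practical-stability conclusion can also be observed numerically. The sketch below is an added illustration that adds assumed bounded disturbance torques to the simulation used for Example 7.9-3; the state no longer goes to zero, but it remains inside a region whose size is set by the disturbance bound divided by the gain, as the theorem predicts.

import numpy as np
from scipy.integrate import solve_ivp

Ix, Iy, Iz = 1.0, 2.0, 3.0                  # assumed inertias
kx = ky = kz = 0.5                          # assumed gains
u = lambda t: 0.05*np.array([np.sin(0.3*t), np.cos(0.2*t), 1.0])   # assumed disturbances

def f(t, w):
    wx, wy, wz = w
    ux, uy, uz = u(t)
    return [((Iy - Iz)*wy*wz - kx*wx + ux)/Ix,
            ((Iz - Ix)*wz*wx - ky*wy + uy)/Iy,
            ((Ix - Iy)*wx*wy - kz*wz + uz)/Iz]

sol = solve_ivp(f, (0.0, 200.0), [1.0, -1.0, 0.5], max_step=0.05)
print(np.max(np.abs(sol.y[:, -1000:]), axis=1))   # bounded, on the order of 0.05/0.5 = 0.1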

7.12 ESTIMATION OF TRANSIENT RESPONSE

If a Lyapunov function is known for an autonomous system with an


asymptotically stable equilibrium, then the Lyapunov function can be used
to estimate a limit on the transient response.23 This aspect of Lyapunov’s
method is best introduced by means of the simple linear system of Fig.
7.12-1. The system speed of response is determined by T, the closed-loop
time constant. It is the reciprocal of the loop gain.
The system is characterized by ẋ₁ = −x₁/T. A suitable V(x) function
for stability analysis is V(x₁) = x₁²/2. The rate of change of V(x₁) is
W = dV/dt = x₁ẋ₁ = −x₁²/T. As expected, V(x₁) and W indicate that
the system is asymptotically stable in the large, if T is positive.

Fig. 7.12-1

The point to be emphasized here is not stability, however. The system


speed of response is of interest. A measure of this is the normalized rate
at which the Lyapunov function changes. In this case, for example,

−(dV/dt)/V = −W/V = 2/T (7.12-1)

Integration of Eq. 7.12-1 yields V = V(x₁₀) exp (−2t/T) for the variation of the


Lyapunov function with time along a motion. Since the Lyapunov
function is, or is similar to, depending upon its choice, the system energy,
it depends generally upon the squares of the state variables. For that
reason, the time constant for the variation in the Lyapunov function is
one-half the system time constant. Therefore — W/V is a measure of the
system speed of response in this example.
For more general systems, the situation is not as clear as in this simple,
linear, first order system. Generally, — W/V changes as the states of the
system change. This is illustrated in the following example.
Example 7.12-1. For the system of Example 7.8-7,

−W/V = 2a / [1 + (2/x₂²) ∫₀^{x₁} g(u) du]

Discuss the effects of the system parameters on the speed of response.
From the viewpoint of speed of response, the designer should choose a as large as
possible. One must be careful about drawing conclusions concerning the function g( ),
however, since its effect on −W/V varies with the state variables x₁ and x₂.
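The state dependence of −W/V is easy to display numerically. The sketch below is an added illustration with assumed values a = 2 and g(x₁) = x₁ + x₁³; it integrates the system of Example 7.8-7 and evaluates −W/V along the motion.

import numpy as np
from scipy.integrate import solve_ivp

a = 2.0
g = lambda x1: x1 + x1**3                    # assumed nonlinear feedback characteristic

def f(t, x):
    return [x[1], -g(x[0]) - a*x[1]]

sol = solve_ivp(f, (0.0, 10.0), [1.0, 0.0], max_step=0.01)
x1, x2 = sol.y
V = 0.5*x2**2 + 0.5*x1**2 + 0.25*x1**4       # x2**2/2 plus the integral of g(u) du
ratio = a*x2**2/V                             # -W/V for this system
print(ratio.min(), ratio.max())               # -W/V is far from constant along the motion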

7.13 RELAY CONTROLLERS FOR LINEAR DYNAMICS

The concept of the previous section can be used for the design of relay
controllers for controlled elements described by

ẋ = Ax + Bm (7.13-1)

where m is the input to the controlled elements and is restricted to


|mᵢ(t)| ≤ Mᵢ. By determining a Lyapunov function for the unforced
system, i.e., m = 0, and then choosing m to make W as negative as
possible, the resulting system removes initial errors rapidly.23,39
Lyapunov functions for asymptotically stable in the large linear systems
can be determined by choosing V(x) to be the positive definite quadratic
form V(x) = ⟨x, Qx⟩, where Q is a positive definite, symmetric matrix
which satisfies
AᵀQ + QA = −P (7.13-2)

and P is any symmetric, positive definite matrix. Then

W = ⟨ẋ, Qx⟩ + ⟨x, Qẋ⟩ = −⟨x, Px⟩ (7.13-3)

which is negative definite.


For the system with control, substitution of Eq. 7.13-1 into Eq. 7.13-3
yields W = −⟨x, Px⟩ + ⟨Bm, Qx⟩ + ⟨x, QBm⟩. Since Q is symmetric,
W can be written as W = −⟨x, Px⟩ + 2⟨m, BᵀQx⟩. The first term of W
is negative definite by choice of P. The proper choice of m is to make
W as negative as possible. Thus it is to set each component of m to its
maximum magnitude, with a sign opposite that of the corresponding
component of BᵀQx. Therefore the proper choice of each component of
m is
mᵢ(t) = −Mᵢ sgn [BᵀQx]ᵢ

This leads to a relay controller operating on a linear combination of the


state variables.
Example 7.13-1. The objective is to determine a relay controller for the linear system
in which
      │  0    1 │          │ 0  0 │
A =   │ −b   −a │ ,   B =  │ 0  1 │ ,   a > 0, b > 0

Choose P to be the unit matrix. Then the Q matrix can be determined as

      │ [a² + b(1 + b)]/(2ab)    1/(2b)        │
Q =   │ 1/(2b)                   (1 + b)/(2ab) │

BᵀQx can be determined to be

         │ 0                            │
BᵀQx =   │ x₁/(2b) + (1 + b)x₂/(2ab)    │
Thus m₁ = 0 and

m₂(t) = −M₂ sgn [x₁(t) + (1 + b)x₂(t)/a]

where the positive constant 1/(2b) has been removed from each of the terms within the
brackets, since it does not affect the sign of m₂(t).
The resultant control system is asymptotically stable in the large by design. Further¬
more, if u is the maximum value of a disturbance at the point indicated in Fig. 7.13-1,
then the system has a strong practical stability since, for ẋ₁ = x₂ and ẋ₂ = −bx₁ −
ax₂ + m₂ + u, W is given by

W = −(x₁² + x₂²) − (1/b)[M₂ sgn (x₁ + (1 + b)x₂/a) − u][x₁ + (1 + b)x₂/a]
Strong practical stability is a general result for systems of this type.39

Fig. 7.13-1
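The two computational steps of the example, solving Eq. 7.13-2 for Q and forming the switching signal BᵀQx, can be mechanized as in the sketch below. This is an added illustration with assumed values of a, b, and M₂; scipy's continuous Lyapunov solver is used in place of the hand calculation.

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

a, b, M2 = 1.0, 2.0, 1.0                     # assumed plant parameters and relay level
A = np.array([[0.0, 1.0], [-b, -a]])
P = np.eye(2)                                # P chosen as the unit matrix, as in the example

# solve_continuous_lyapunov solves A X + X A^T = rhs, so pass A^T and -P
# to obtain A^T Q + Q A = -P (Eq. 7.13-2).
Q = solve_continuous_lyapunov(A.T, -P)
print(Q)                                     # agrees with the closed-form Q above

def m2(x):
    s = Q[1, 0]*x[0] + Q[1, 1]*x[1]          # second component of B^T Q x
    return -M2*np.sign(s)                    # relay law of Example 7.13-1

print(m2([1.0, -0.5]))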

It is important to note that the method, as presented, applies only to


controlled elements which have feedback around the integrators. If the
controlled elements possess free integrations, then the A matrix is singular
and one cannot satisfy Eq. 7.13-2. This is made evident by letting b
approach zero in this example.
For the case in which the controlled elements have a single free integra¬
tion, the procedure above can be utilized with slight modification.
Although presented here for the single input (m a scalar) case, the method is
readily extended to the multiple input case.
If the state equations describing the linear dynamics are determined using the partial fractions technique of Section 5.5, the result can be put in the form shown in Fig. 7.13-2. The elements comprising G(s) can be characterized by Eq. 7.13-1. In this case, however, m is a scalar, and B = b, a vector. Thus ẋ = Ax + bm, where A may or may not be diagonal, depending upon how far one carries out the partial fraction expansion. The free integrator can be characterized by ż = b₀m. Now let V(x, z) = (x, Qx) + z², which is positive definite, for Q as chosen in Eq. 7.13-2. Then

W = −(x, Px) + 2m[(b, Qx) + b₀z]

Fig. 7.13-2

Therefore the proper choice of m is

m(t) = −M sgn [(b, Qx) + b₀z]                  (7.13-4)

where M is the largest possible value for m(t).

Example 7.13-2. A relay controller is to be designed for the dynamics of Fig. 7.13-3a.

Y(s)/M(s) = 1/[s(s + 1)] = 1/s − 1/(s + 1)

State equations are defined for the system of Fig. 7.13-3b as ẋ = −x + m and ż = m. Note that the first equation also applies to Fig. 7.13-3a. z, defined in Fig. 7.13-3b, is x + y of Fig. 7.13-3a. Assuming P = 2, then Q = 1. Since b₀ = 1 and b is a scalar

Fig. 7.13-3

equal to unity, m(t) = −M sgn (x + z). Neither z nor x + z is directly measurable, since the system of actual interest is the one of Fig. 7.13-3a, not the one of Fig. 7.13-3b. Thus it is necessary to obtain x + z from x + z = y + 2x. The resulting system is illustrated in Fig. 7.13-3c.
If u is the maximum value of a disturbance input to the linear elements, as shown in Fig. 7.13-3c, then since

W = −2[x² + M(x + z) sgn (x + z) − u(x + z)]

the system has a strong practical stability which can be made as strong as desired by increasing M. This capability, of course, is bounded by the saturation tendencies of the linear dynamics.
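A rough simulation of this controller can be built directly from the state equations above. The sketch below (illustrative only; the relay level M = 1, the step size, and the initial errors are assumed values) realizes the plant of Fig. 7.13-3a as ẋ = −x + m together with the free integration ẏ = x, and switches on the measurable signal y + 2x = x + z.

import numpy as np

M, dt, T_end = 1.0, 1e-3, 10.0
x, y = 1.0, -2.0                        # assumed initial errors
for _ in range(int(T_end / dt)):
    m = -M * np.sign(y + 2.0 * x)       # switching signal x + z = y + 2x
    x += dt * (-x + m)                  # state of the 1/(s + 1) factor
    y += dt * x                         # output of the free integration
print(x, y)                             # both are driven toward a small neighborhood of zero

The residual chatter about the switching line is the expected relay behavior and shrinks with the integration step.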

7.14 DETERMINATION OF LYAPUNOV FUNCTIONS—VARIABLE GRADIENT METHOD

Lyapunov’s second method has been used in preceding sections for the determination of stability and for the design of relay systems. Other than for linear systems and the nonlinear case considered in Problem 7.21, no method has been presented for the determination of Lyapunov functions. Since various theorems guarantee the existence of Lyapunov functions for the stable and asymptotically stable (locally or in the large) cases if f(x) satisfies the Lipschitz condition† (locally or in the large), it is important to consider possible methods of determining such functions.54 At least three methods are presented in the literature.55-60 The variable gradient method is considered here.
Given the existence of a Lyapunov function, its gradient must exist. From grad V, V(x) and W may be determined, since, for ẋ = f(x), Eq. 7.8-3 indicates

W = dV/dt = (grad V, f) = (grad V, ẋ)                  (7.14-1)

and hence

V(x) = ∫₀^x (grad V, dx)                               (7.14-2)

The upper limit of integration does not imply that V(x) is a vector. It indicates that the integral is a line integral to an arbitrary point x = (x₁, x₂, . . . , xₙ) in state space. For the scalar function V(x) to be obtained uniquely from the line integral of the vector grad V, V(x) must be independent of the path of integration. The necessary and sufficient conditions for this are‡

∂(grad V)ᵢ/∂xⱼ = ∂(grad V)ⱼ/∂xᵢ        i, j = 1, 2, . . . , n        (7.14-3)

where (grad V)ᵢ is the component of grad V in the i direction. That is,

grad V = [ (grad V)₁ ]
         [ (grad V)₂ ]
         [     ⋮     ]
         [ (grad V)ₙ ]

† The Lipschitz condition implies continuity of f(x) in x.
‡ See any standard text on vector calculus.

Since V(x) is independent of the path of integration if Eq. 7.14-3 is satisfied, any path may be chosen. A particularly simple path is the one indicated by

V(x) = ∫₀^{x₁} [grad V(u₁, 0, 0, . . . , 0)]₁ du₁ + ∫₀^{x₂} [grad V(x₁, u₂, 0, . . . , 0)]₂ du₂ + · · ·
       + ∫₀^{xₙ} [grad V(x₁, x₂, x₃, . . . , xₙ₋₁, uₙ)]ₙ duₙ                  (7.14-4)

This path is utilized in what follows.


Assume grad V = [α]x, where

[α] = [ α₁₁(x)  α₁₂(x)  · · ·  α₁ₙ(x) ]
      [ α₂₁(x)  α₂₂(x)  · · ·  α₂ₙ(x) ]
      [   ⋮       ⋮               ⋮   ]
      [ αₙ₁(x)  αₙ₂(x)  · · ·  αₙₙ(x) ]

As the term “variable gradient” implies, grad V is assumed to have n undetermined components. The α's consist of a constant and a variable part, which is a function of the state variables. That is, αᵢⱼ = αᵢⱼc + αᵢⱼᵥ(x), where the subscripts c and v denote constant and variable, respectively. Without loss of generality, the coefficients αᵢᵢᵥ are restricted to be functions of xᵢ only. Furthermore, to assist in the verification that V(x) = constant represents closed surfaces as required (see Section 7.9), [α] is chosen independent of xₙ and αₙₙ is set equal to unity. This is a slight loss in generality. With these restrictions, [α] is

[α] = [ α₁₁c + α₁₁ᵥ(x₁)                    α₁₂c + α₁₂ᵥ(x₁, x₂, . . . , xₙ₋₁)   · · ·   α₁ₙc + α₁ₙᵥ(x₁, x₂, . . . , xₙ₋₁) ]
      [ α₂₁c + α₂₁ᵥ(x₁, x₂, . . . , xₙ₋₁)  α₂₂c + α₂₂ᵥ(x₂)                     · · ·   α₂ₙc + α₂ₙᵥ(x₁, x₂, . . . , xₙ₋₁) ]
      [   ⋮                                  ⋮                                           ⋮                               ]
      [ αₙ₁c + αₙ₁ᵥ(x₁, x₂, . . . , xₙ₋₁)  αₙ₂c + αₙ₂ᵥ(x₁, x₂, . . . , xₙ₋₁)   · · ·   1                                ]

With [α] as defined above, the steps in the variable gradient method of determining a Lyapunov function are:

1. Assume a gradient of the form grad V = [α]x.
2. Determine W from W = (grad V, f).
3. Subject to Eq. 7.14-3, constrain W to be at least negative semidefinite.
4. Determine V(x) from Eq. 7.14-4, and check that it represents closed surfaces.

This procedure is now illustrated by examples.


Example 7.14-1. Determine a Lyapunov function for the system of Example 7.8-7.
The system to be considered is second order and is defined by ẋ₁ = x₂ and ẋ₂ = −g(x₁) − ax₂. Then grad V is

grad V = [ [α₁₁c + α₁₁ᵥ(x₁)]x₁ + [α₁₂c + α₁₂ᵥ(x₁)]x₂ ]
         [ [α₂₁c + α₂₁ᵥ(x₁)]x₁ + x₂                  ]

and

W = [α₁₁c + α₁₁ᵥ(x₁) − aα₂₁c − aα₂₁ᵥ(x₁) − g(x₁)/x₁] x₁x₂
    − [α₂₁c + α₂₁ᵥ(x₁)]x₁g(x₁) − [a − α₁₂c − α₁₂ᵥ(x₁)]x₂²

Equations 7.14-3 yield that

∂(grad V)₁/∂x₂ = α₁₂c + α₁₂ᵥ(x₁)

equals

∂(grad V)₂/∂x₁ = α₂₁c + ∂[α₂₁ᵥ(x₁)x₁]/∂x₁

This requirement can be satisfied and W simplified considerably by choosing α₁₂c = α₂₁c = α₁₂ᵥ = α₂₁ᵥ = α₁₁c = 0 and α₁₁ᵥ(x₁) = g(x₁)/x₁. Then W = −ax₂², which is negative semidefinite for a > 0.
For these restrictions on the α's, grad V is

grad V = [ g(x₁) ]
         [  x₂   ]

Then, from Eq. 7.14-4, V is given by

V(x) = ∫₀^{x₁} g(u₁) du₁ + ∫₀^{x₂} u₂ du₂

Integration of the second term yields

V(x) = ∫₀^{x₁} g(u) du + x₂²/2

For g(u) as defined in Example 7.8-7, V(x) is positive definite. The Lyapunov function determined here is precisely the one used in Example 7.8-7.
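The four steps of the variable gradient method can be checked symbolically. The following sketch (not from the text) repeats Example 7.14-1 for the assumed illustrative nonlinearity g(x₁) = x₁ + x₁³ and a = 1, verifying the curl condition of Eq. 7.14-3, the form of W, and the line integral of Eq. 7.14-4.

import sympy as sp

x1, x2, u = sp.symbols('x1 x2 u', real=True)
a = 1
g = lambda s: s + s**3                            # assumed g(x1)

f = sp.Matrix([x2, -g(x1) - a*x2])                # x1' = x2, x2' = -g(x1) - a*x2
grad_V = sp.Matrix([g(x1), x2])                   # gradient selected in the example

# Eq. 7.14-3: the cross partials of grad V must agree.
assert sp.diff(grad_V[0], x2) - sp.diff(grad_V[1], x1) == 0

# W = (grad V, f) reduces to -a*x2**2, negative semidefinite for a > 0.
assert sp.simplify(grad_V.dot(f) + a*x2**2) == 0

# Eq. 7.14-4: integrate along the coordinate path to recover V(x).
V = sp.integrate(grad_V[0].subs({x1: u, x2: 0}), (u, 0, x1)) \
    + sp.integrate(grad_V[1].subs(x2, u), (u, 0, x2))
print(sp.simplify(V))                             # x1**2/2 + x1**4/4 + x2**2/2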

Example 7.14-2. Determine a Lyapunov function for the system of Example 7.8-8.
The system to be considered is third order and is defined by ẋ₁ = x₂, ẋ₂ = x₃, and ẋ₃ = −h(x₁) − g(x₂) − ax₃. For a third order case, grad V is

grad V = [ [α₁₁c + α₁₁ᵥ(x₁)]x₁ + [α₁₂c + α₁₂ᵥ(x₁, x₂)]x₂ + [α₁₃c + α₁₃ᵥ(x₁, x₂)]x₃ ]
         [ [α₂₁c + α₂₁ᵥ(x₁, x₂)]x₁ + [α₂₂c + α₂₂ᵥ(x₂)]x₂ + [α₂₃c + α₂₃ᵥ(x₁, x₂)]x₃ ]
         [ [α₃₁c + α₃₁ᵥ(x₁, x₂)]x₁ + [α₃₂c + α₃₂ᵥ(x₁, x₂)]x₂ + x₃                   ]

For i = 1, j = 2, Eq. 7.14-3 yields

α₁₂c + ∂[α₁₂ᵥ(x₁, x₂)x₂]/∂x₂ + x₃ ∂[α₁₃ᵥ(x₁, x₂)]/∂x₂ = α₂₁c + ∂[α₂₁ᵥ(x₁, x₂)x₁]/∂x₁ + x₃ ∂[α₂₃ᵥ(x₁, x₂)]/∂x₁

This can be satisfied simply by α₁₂c = α₂₁c = α₁₃ᵥ = α₂₃ᵥ = 0, α₁₂ᵥ = dh(x₁)/dx₁, and α₂₁ᵥ = h(x₁)/x₁. Similarly, for i = 1, j = 3, and the α's above, Eq. 7.14-3 yields

α₁₃c = α₃₁c + ∂[α₃₁ᵥ(x₁, x₂)x₁]/∂x₁ + x₂ ∂[α₃₂ᵥ(x₁, x₂)]/∂x₁

This can be satisfied simply by α₁₃c = α₃₁c = α₃₁ᵥ = α₃₂ᵥ = 0. Similarly, for i = 2, j = 3, and the α's above, Eq. 7.14-3 yields α₂₃c = α₃₂c. This can be satisfied simply by α₂₃c = α₃₂c = a. For these α's, W is

W = [α₁₁c + α₁₁ᵥ(x₁)]x₁x₂ + [dh(x₁)/dx₁]x₂² + h(x₁)x₃ + [α₂₂c + α₂₂ᵥ(x₂)]x₂x₃ + ax₃²
    − [ax₂ + x₃][ax₃ + h(x₁) + g(x₂)]

It is now apparent that some of these choices for the α's were determined while observing W.
Now W can be put into a rather simple form by choosing α₁₁c = 0, α₁₁ᵥ(x₁) = ah(x₁)/x₁, α₂₂c = a², and α₂₂ᵥ(x₂) = g(x₂)/x₂. Then W becomes

W = −[a g(x₂)/x₂ − dh(x₁)/dx₁] x₂²

The corresponding grad V is

grad V = [ ah(x₁) + [dh(x₁)/dx₁]x₂     ]
         [ h(x₁) + a²x₂ + g(x₂) + ax₃  ]
         [ ax₂ + x₃                    ]

Then, from Eq. 7.14-4, V(x) is

V(x) = ∫₀^{x₁} ah(u₁) du₁ + ∫₀^{x₂} [h(x₁) + a²u₂ + g(u₂)] du₂ + ∫₀^{x₃} (ax₂ + u₃) du₃

The integrations yield

V(x) = a ∫₀^{x₁} h(u) du + ∫₀^{x₂} g(u) du + ½(ax₂ + x₃)² + x₂h(x₁)

The V(x) determined here is precisely the one utilized in Example 7.8-8. The conditions for asymptotic stability are given there.

From these examples, the usefulness of the variable gradient method should be evident. The only particular difficulty is in the selection of the α's. This choice is somewhat arbitrary, and yet it can affect both the results and the ease in obtaining them.

7.15 NONAUTONOMOUS SYSTEMS

The preceding discussion of Lyapunov’s second method has been limited to autonomous cases. If one considers the nonautonomous case

ẋ = f(x, t),        f(0, t) = 0 for t ≥ 0                  (7.15-1)

the stability definitions remain unchanged. However, the definition of a positive definite function must be slightly modified.
The function V(x, t) is positive definite in a neighborhood about the origin if it is continuous and has continuous partial derivatives with respect to all arguments, V(0, t) = 0 for t ≥ 0, and a positive definite (according to the definition of Section 7.8) function Va(x) exists such that V(x, t) ≥ Va(x) for all x in the neighborhood and all t ≥ 0.
Note that now W = dV/dt is given by

W = ∂V/∂t + (∂V/∂x₁)f₁(x, t) + (∂V/∂x₂)f₂(x, t) + · · · + (∂V/∂xₙ)fₙ(x, t)

In terms of the gradient of V(x, t),

W = ∂V/∂t + (grad V, f)

If W ≤ 0 in the neighborhood, then V(x, t) is a Lyapunov function for Eq. 7.15-1 in the neighborhood.
With this definition of a positive definite V(x, t), the previous theorems on the stabilities of autonomous systems apply also for nonautonomous systems. In the case of asymptotic stability, V(x, t) must be dominated by a positive definite Vb(x), i.e., V(x, t) ≤ Vb(x). A negative definite W(x, t) is also defined in terms of some negative definite scalar Wa(x) as W(x, t) ≤ Wa(x). Even though the previous theorems apply, the difficulty in determining Lyapunov functions has limited the number of useful applications of Lyapunov’s second method in nonautonomous cases.
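For a specific V(x, t) the rate W is computed exactly as in the autonomous case except for the added ∂V/∂t term. The short symbolic sketch below (an assumed scalar illustration, not an example from the text) forms W for ẋ = −(2 + cos t)x with V(x, t) = (2 + sin t)x².

import sympy as sp

x, t = sp.symbols('x t', real=True)
f = -(2 + sp.cos(t)) * x                 # assumed nonautonomous system x' = f(x, t)
V = (2 + sp.sin(t)) * x**2               # assumed time-varying Lyapunov candidate

W = sp.simplify(sp.diff(V, t) + sp.diff(V, x) * f)
print(W)   # the coefficient of x**2 is cos(t) - 2*(2 + sin(t))*(2 + cos(t)), at most -1

Here V(x, t) is bounded below by Va(x) = x² and above by Vb(x) = 3x², and W is negative definite, so the previous theorems give asymptotic stability in the large.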
Example 7.15-1. The stability of the linear time-varying system of Fig. 7.15-1 is to be investigated.
The state equations are ẋ = A(t)x, where

A(t) = [   0       1 ]
       [ −γ(t)    −a ]

Fig. 7.15-1

Assume V(x, t) = (x, Q(t)x), where

Q(t) = [ β + 2a/(βγ(t))      1   ]
       [       1            2/β  ]

Then

V(x, t) = [β + 2a/(βγ(t))]x₁² + 2x₁x₂ + (2/β)x₂²

Let Va(x) = ε₁x₁² + 2x₁x₂ + ε₂x₂². Va(x) is positive definite, and certainly V(x, t) ≥ Va(x) if

β + 2a/(βγ(t)) ≥ ε₁ > 1        and        2/β ≥ ε₂ > 1

Now, considering W,

W = (ẋ, Q(t)x) + (x, Q̇(t)x) + (x, Q(t)ẋ)

Substitution of A(t)x for ẋ yields W = −(x, P(t)x), where −P(t) = Aᵀ(t)Q(t) + Q(t)A(t) + Q̇(t). W is negative definite if P(t) is positive definite. Sufficient conditions for this, and for some Vb(x) to dominate V(x, t), and hence for asymptotic stability in the large, are a > γ(t) > 0, 2 > β > 0, and γ̇(t) ≥ 0.

7.16 EVENTUAL STABILITY

The types of stability discussed in previous sections are concerned with


specific motions. The specific motion determines an equilibrium, and the
effects of perturbations on the equilibrium are studied.
In some cases, however, an equilibrium does not exist. For example, it
is extremely difficult to specify an equilibrium for many types of adaptive
control systems in which the system, its environment, and the desired
responses are continually changing. This eliminates Lyapunov stability
concepts.
For cases of this type, LaSalle and Rath consider eventual stability.61
Essentially eventual stability states that, if a system behaves properly for a
sufficiently long time, it can be expected to behave properly in the future.

The origin of ẋ = f(x, t) is eventually stable if, given ε > 0, there exist δ and T such that ||x₀|| < δ implies ||x(t; x₀, t₀)|| < ε for all t ≥ t₀ ≥ T.61
Lyapunov’s second method can be extended to study eventual stability.
LaSalle and Rath contribute several theorems to this extension. One of
these, and their example relating to adaptive control, are presented here.

Theorem 7.16-1.61 Assume for the system ẋ = f(x, z, t), ż = g(x, z, t) that f(x, z, t) is bounded for bounded x and z and all t ≥ 0, and assume also the existence of a scalar function V(x, z) such that:

1. V(x, z) is positive definite and has continuous first partial derivatives for all x and z.
2. V(x, z) approaches infinity as ||x||² + ||z||² approaches infinity.
3. W ≤ −Va(x) + h₁(t)q(x, z) + h₂(t)V(x, z), where Va(x) is continuous and positive definite for all x; q(x, z) is continuous for all x, z; and

∫₀^∞ |hᵢ(t)| dt < ∞,        i = 1, 2

Under these conditions, the state x = z = 0 is eventually stable, and, given r > 0, there is a T_r such that ||x(t₀)||² + ||z(t₀)||² ≤ r² for some t₀ ≥ T_r implies that z(t) is bounded and x(t) approaches 0 as t approaches infinity.

If, in addition,

4. For some K and some 0 < a < 1, |q(x, z)| ≤ KVᵃ(x, z),

then all solutions z(t; z₀, t₀) are bounded and all x(t; x₀, t₀) approach 0 as t approaches infinity.

Example 7.16-1.61,62 It is desired to control the velocity x₁ of a body moving through a viscous fluid, the viscosity of which is incompletely described. The uncontrolled system is ẋ₁ = −β(t)x₁. It is known only that β(t) is bounded and β(t) approaches β₀, an unknown constant, as t approaches infinity. This implies

∫₀^∞ |β(t) − β₀| dt < ∞

The desired velocity is x₁ = k, a known constant.
If β(t) were known, the system described by

ẋ₁ = −β(t)x₁ + [β(t) − a]x₁ + ak,        a > 0

would solve the problem, i.e., the steady-state solution would be x₁ = k. Since β(t) and β₀ are unknown, however, replace β(t) in the brackets by an adjustable feedback gain z₁(t). Thus

ẋ₁ = −β(t)x₁ + (z₁ − a)x₁ + ak,        a > 0

Let V(x₁, z₁) = ½[(x₁ − k)² + (z₁ − β₀)²]. Then W = dV/dt is given by

W = −a(x₁ − k)² + [z₁ − β₀][x₁(x₁ − k) + ż₁] − (β − β₀)x₁(x₁ − k)

For ż₁ = −x₁(x₁ − k), W becomes

W = −a(x₁ − k)² − (β − β₀)x₁(x₁ − k)

Denoting Va(x) = a(x₁ − k)², h₁(t) = √2 |k| · |β(t) − β₀|, h₂(t) = 2|β(t) − β₀|, and q(x, z) = √V(x₁, z₁), then

W ≤ −Va(x) + h₁(t)q(x, z) + h₂(t)V(x, z)

Thus the controlled system ẋ₁ = −β(t)x₁ + (z₁ − a)x₁ + ak, ż₁ = −x₁(x₁ − k) satisfies all the conditions of the theorem, so that the state x₁ = k, z₁ = β₀ is eventually stable. Furthermore, for all initial states of x₁ and z₁, z₁ is bounded and x₁ approaches k as t approaches infinity.
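The behavior claimed by the theorem is easy to observe in simulation. The sketch below uses assumed numbers (β(t) = 1 + 2ε^{−0.5t} approaching β₀ = 1, desired velocity k = 2, gain a = 1, zero initial conditions) and integrates the controlled system with a simple Euler step.

import numpy as np

a, k = 1.0, 2.0
beta = lambda t: 1.0 + 2.0 * np.exp(-0.5 * t)   # assumed viscosity variation

dt, T_end, t = 1e-3, 30.0, 0.0
x1, z1 = 0.0, 0.0                               # assumed initial state
for _ in range(int(T_end / dt)):
    dx1 = -beta(t) * x1 + (z1 - a) * x1 + a * k
    dz1 = -x1 * (x1 - k)
    x1 += dt * dx1
    z1 += dt * dz1
    t += dt
print(x1, z1)   # x1 approaches k = 2 and the adjustable gain z1 settles near beta0 = 1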

7.17 DISCRETE SYSTEMS

Chapter 6 indicated that the state variable approach is applicable to


discrete systems as well as continuous systems. Thus the discussion now
turns to the application of the second method of Lyapunov to discrete
systems. In the present considerations, there is little difference between
this and the case of continuous systems. Hence the discussion is quite
brief, indicating only the salient features and indicating the method by
means of example.
In the case of discrete systems, the state equations are difference
equations of the general form

Ex(kT) = x[(k + 1)T] = f(x, kT)                  (7.17-1)

The equilibrium is given by the solutions of f(0, kT) = 0 for all k, and
f(x, kT) is assumed to be real and continuous with respect to x for fixed k.
The fundamental stability definitions, given in preceding sections for
continuous systems, are applicable to discrete systems. A Lyapunov
function used for the investigation of the stability of an equilibrium of a
discrete system need only be defined for the discrete instants of time kT.
Similarly, the definiteness requirements need hold only for these discrete
instants. Thus, instead of the total derivative of the Lyapunov function,
which is of interest for continuous systems, one considers the total
difference†

W = ΔV = V{x[(k + 1)T], (k + 1)T} − V[x(kT), kT]                  (7.17-2)

† W can be divided by T to determine the rate of increase of V; however, this does not change the definiteness of the function.

With these modifications, the previous theorems on the application of Lyapunov’s second method to continuous systems may be applied to discrete systems.
For linear, autonomous difference equations, a Lyapunov function can be determined by a procedure analogous to that used in Sections 7.9, 7.10, and 7.13. Let V(x) be the positive definite quadratic form V(x) = (x, Qx). Then, corresponding to Eq. 7.17-2, W = ((Ex), Q(Ex)) − (x, Qx). For linear, autonomous difference equations, Eq. 7.17-1 can be written as Ex = Ax, so that W = (x, AᵀQAx) − (x, Qx) or W = (x, (AᵀQA − Q)x), where T denotes the transpose and should not be confused with the sampling interval. Let P be any symmetric, positive definite matrix, and let Q be the unique solution of the linear equation −P = AᵀQA − Q. Then W = −(x, Px), which is negative definite. This procedure is successful only if the absolute values of all characteristic values of A are less than unity.
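The discrete Lyapunov equation −P = AᵀQA − Q is linear in Q and can be solved with standard routines. The sketch below (not from the text) uses scipy for an assumed illustrative A whose characteristic values have magnitude less than unity, and confirms that the resulting Q is positive definite.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.5, 0.2],
              [0.0, 0.8]])              # assumed matrix with |characteristic values| < 1
P = np.eye(2)                           # any symmetric, positive definite choice

# solve_discrete_lyapunov(F, R) returns X satisfying F X F' - X + R = 0,
# so F = A' and R = P gives A'QA - Q = -P.
Q = solve_discrete_lyapunov(A.T, P)

assert np.allclose(A.T @ Q @ A - Q, -P)
assert np.all(np.linalg.eigvalsh(Q) > 0)   # Q positive definite, so V(x) = (x, Qx) qualifies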

7.18 APPLICATION TO PULSE WIDTH MODULATED SYSTEMS

Pulse width modulated systems provide an excellent illustration of the application of Lyapunov’s second method to nonlinear discrete systems.63,64 Consider the system represented in Fig. 7.18-1. The pulse width modulator output is

m(t) = sgn β(k)        for kT ≤ t < kT + δ(k)
m(t) = 0               for kT + δ(k) ≤ t < (k + 1)T

β(k) and δ(k) are yet to be determined functions of the state variables of the linear dynamics.
Expanding G(s) in partial fractions, as in Sections 5.5 and 7.10, ẋ = Ax + mb, where A is the diagonal matrix with elements λ₁, λ₂, . . . , λₙ.

Fig. 7.18-1

The general solution of the state equations is

x(t) = ε^{(t−t₀)A} x(t₀) + ∫_{t₀}^{t} m(τ) ε^{(t−τ)A} b dτ

where t₀ is an arbitrary initial instant. Substitution of the following two sets of values for t₀ and t

t₀ = kT,                t = kT + δ(k)
t₀ = kT + δ(k),         t = (k + 1)T

yields

x[kT + δ(k)] = ε^{δ(k)A} x(kT) + (sgn β) ∫_{kT}^{kT+δ(k)} ε^{[kT+δ(k)−τ]A} b dτ

x[(k + 1)T] = ε^{[T−δ(k)]A} x[kT + δ(k)]

Combining these equations gives

x[(k + 1)T] = ε^{TA} x(kT) + (sgn β) ε^{[T−δ(k)]A} ∫_{kT}^{kT+δ(k)} ε^{[kT+δ(k)−τ]A} b dτ

The integration results in

x[(k + 1)T] = ε^{TA} x(kT) + (sgn β) ε^{TA} A⁻¹[I − ε^{−δ(k)A}] b

where I is the unit matrix. If T ≪ 1/(−λᵢ), i = 1, 2, . . . , n, then

[1 − ε^{−δ(k)λᵢ}]/λᵢ ≈ δ(k)

so that

x[(k + 1)T] = ε^{TA} x(kT) + (sgn β) δ(k) ε^{TA} b
It is assumed that this difference equation satisfactorily characterizes
system behavior.
In order to determine the modulator characteristics for stability, choose as a Lyapunov function V(x) = (x, x). Then

W = −(x, x) + ([ε^{TA}x + (sgn β) δ(k) ε^{TA}b], [ε^{TA}x + (sgn β) δ(k) ε^{TA}b])

which can be rewritten as

W = −(x, [I − ε^{2TA}]x) + 2(sgn β) δ(k)(x, ε^{2TA}b) + δ²(k)(b, ε^{2TA}b)

The first term is negative definite if λᵢ < 0, i = 1, 2, . . . , n. Considering the second term, let sgn β = −sgn (x, ε^{2TA}b). Then

W = −(x, [I − ε^{2TA}]x) − 2δ(k) |(x, ε^{2TA}b)| + δ²(k)(b, ε^{2TA}b)

W has its most negative value for any x if

δ(k) = |(x, ε^{2TA}b)| / (b, ε^{2TA}b)

Since δ(k) cannot exceed T, the best choice of δ(k) is

δ(k) = T sat [ |(x, ε^{2TA}b)| / (T(b, ε^{2TA}b)) ]
For stable linear dynamics and the modulator characterized by these
definitions, the equilibrium is asymptotically stable in the large.
Example 7.18-1. Determine the characteristics of a pulse width modulator for the linear dynamics of Fig. 7.18-2a, where

G(s) = (2s + 3)/[(s + 1)(s + 2)] = 1/(s + 1) + 1/(s + 2)

Fig. 7.18-2

For the state variables defined in Fig. 7.18-2b, (x, ε^{2TA}b) = ε^{−2T}(x₁ + ε^{−2T}x₂) and (b, ε^{2TA}b) = ε^{−2T}(1 + ε^{−2T}). Thus the sign of the modulator output is

sgn β = −sgn [x₁ + ε^{−2T}x₂]

and the pulse width is

δ(k) = T sat [ |x₁ + ε^{−2T}x₂| / (T(1 + ε^{−2T})) ]

Note that neither x₁ nor x₂ exists as such in Fig. 7.18-2a. However, by the methods of Chapter 5, x₁ = 2z₁ + z₂ and x₂ = −(z₁ + z₂). Then, if z₁ and z₂ can be measured by appropriate sensors, the proper switching signals for the pulse width modulator can be obtained from a linear combination of these measurements.
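The modulator law of this example is simple to exercise numerically. The sketch below (illustrative only; T = 0.1 and the initial state are assumed values) evaluates sgn β and δ(k) at each sampling instant and propagates the approximate difference equation derived above for the diagonal A with λ₁ = −1, λ₂ = −2.

import numpy as np

T = 0.1
lam = np.array([-1.0, -2.0])
b = np.array([1.0, 1.0])
w = np.exp(2 * T * lam) * b                      # epsilon^{2TA} b for the diagonal A

x = np.array([1.0, -0.5])                        # assumed state at t = kT
for k in range(50):
    s = x @ w                                    # (x, eps^{2TA} b)
    delta = T * min(1.0, abs(s) / (T * (b @ w))) # delta(k) = T sat[|.| / (T (b, eps^{2TA} b))]
    sgn_beta = -np.sign(s)
    # x[(k+1)T] = eps^{TA} x(kT) + (sgn beta) delta(k) eps^{TA} b
    x = np.exp(T * lam) * (x + sgn_beta * delta * b)
print(x)                                         # driven toward the origin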

REFERENCES

1. R. A. Struble, Nonlinear Differential Equations, McGraw-Hill Book Co., New York,


1962, Chapter 2.
2. A. A. Andronow and C. E. Chaikin, Theory of Oscillations, Princeton University
Press, Princeton, N.J., 1949.
3. A. Lienard, “Etude des Oscillations Entretenues,” Rev. gen elec., Vol. 23, 1928, pp.
901-946.
4. R. N. Buland, “Analysis of Nonlinear Servos by Phase Plane-Delta Method,”
J. Franklin Inst., Vol. 257, 1954, pp. 37-48.
5. Y. H. Ku, Analysis and Control of Nonlinear Systems, The Ronald Press Co.,
New York, 1958.
6. P. S. Hsia, “A Graphical Analysis for Nonlinear Systems,” Proc. IEE, Vol. 99,
Pt. II, 1952, pp. 120-165.
7. D. Graham and D. McRuer, Analysis of Nonlinear Control Systems, John Wiley
and Sons, New York, 1961, pp. 304-313.
8. L. S. Pontryagin, Ordinary Differential Equations, Addison-Wesley Publishing Co.,
Reading Mass., 1962, pp. 201-213.
9. Struble, op. cit., p. 177.
10. L. M. Vallese, “A Note on the Analysis of Flip-Flops,” Proceedings of the Sym¬
posium on Nonlinear Circuit Analysis, Polytechnic Institute of Brooklyn, Vol. VI,
1956, pp. 347-365.
11. L. M. Vallese, “On the Synthesis of Nonlinear Systems,” Proceedings of the
Symposium on Nonlinear Circuit Analysis, Polytechnic Institute of Brooklyn, Vol.
II, 1953, pp. 201-214.
12. Struble, op. cit., pp. 172-177.
13. N. Minorsky, Nonlinear Oscillations, D. Van Nostrand Co., Princeton, N.J., 1962,
pp. 77-80.
14. A. A. Andronow and C. E. Chaikin, Theory of Oscillations, Princeton University
Press, Princeton, N.J., 1949, Appendix C, pp. 343-347.
15. N. Minorsky, op. cit., pp. 84-91.
16. J. P. LaSalle, “Relaxation Oscillations,” Quart. Appl. Math., Vol. 7, 1949, pp.
1-19.
17. Pontryagin, op. cit., pp. 223-236.
18. N. Minorsky, op. cit., pp. 165-169.
19. S. T. Bow and J. E. Van Ness, “Use of Phase Space in Transient-Stability Studies,”
Trans. AIEE, Vol. 77, Pt. II, 1958, pp. 187-191.
20. A. M. Lyapunov, “Probleme general de la stabilite du mouvement,” Ann. fac. sci.
Univ. Toulouse, Vol. 9, 1907, pp. 203-474.
21. A. I. Lur’e and V. N. Postaikow, “Concerning the Stability of Regulating Systems,”
Prikl. Math. Mekh., Moscow, Vol. 8, 1944, pp. 246-248.
22. Reprint of Reference 20, Ann. Math. Study No. 17, 1949, Princeton University
Press.
23. R. E. Kalman and J. E. Bertram, “Control System Analysis and Design via the
Second Method of Lyapunov: I, Continuous-Time Systems; II, Discrete-Time
Systems,” Trans. ASME, Ser. D, J. Basic. Eng., Vol. 82, 1960, ASME Papers No. 59-
NAC-2 and -3, RIAS Monograph M 59-13.
24. S. Lefschetz, Differential Equations: Geometric Theory, Interscience Division,
John Wiley and Sons, New York, 1957, p. 78.
25. Pontryagin, op. cit., p. 202.

26. J. LaSalle and S. Lefschetz, Stability by Lyapunov's Direct Method with Applications,
Academic Press, New York, 1961, p. 37-38.
27. Struble, op. cit., p. 161.
28. W. Hahn, Theory and Application of Liapunov's Direct Method, Prentice-Hall, Inc.,
Englewood Cliffs, N.J., 1963, pp. 14-15.
29. Ibid., p. 15.
30. Struble, op. cit., p. 164.
31. LaSalle and Lefschetz, loc. cit.
32. Hahn, op. cit., pp. 47-48.
33. E. A. Barbasin, “Stability of the Solution of a Certain Nonlinear Third-Order
Equation” (in Russian), Prikl. Mat. Mekh., Vol. 16, 1952, pp. 629-632.
34. W. J. Cunningham, An Introduction to Lyapunov's Second Method (AIEE Work Session in Lyapunov's Second Method, edited by L. F. Kazda), Sept. 1960, p. 30.
35. Hahn, op. cit., p. 15.
36. LaSalle and Lefschetz, op. cit., p. 67.
37. E. A. Barbasin and N. N. Krasovskii, “Concerning the Stability of Motion as a
Whole,” Dokl. Akad. Nauk SSSR, Vol. 36, No. 3, 1953.
38. E. I. Ergin, V. D. Norum, and T. G. Windeknecht, “Techniques for Analysis of
Nonlinear Attitude Control Systems for Space Vehicles,” Aeronautical Systems
Division, Dir./Aeromechanics, Flight Control Lab, Wright-Patterson AFB, Ohio,
Rept. No. ASD-TDR-62-208, Vol. II, June 1962.
39. J. P. LaSalle, “Stability and Control,” J. SIAM, Ser. A on Control, Vol. 1, No. 1,
1962, pp. 3-15.
40. A. I. Lur’e, Some Nonlinear Problems in the Theory of Automatic Control, Gos-
tekhizdat (in Russian), 1951; English translation, Her Majesty’s Stationery Office,
London, 1957.
41. A. M. Letov, Stability in Nonlinear Control Systems, Princeton University Press,
Princeton, N.J., 1961.
42. LaSalle and Lefschetz, op. cit., pp. 75-105.
43. J. P. LaSalle, “Complete Stability of a Nonlinear Control System,” Proc. Nat.
Acad. Sci., U.S., Vol. 48, No. 4, April 1962, pp. 600-603.
44. LaSalle and Lefschetz, op. cit., p. 85.
45. J. E. Gibson, Nonlinear Automatic Control, McGraw-Hill Book Co., New York,
1963, pp. 324-326.
46. G. Franklin and B. Gragg, “Discussion of Stability Analysis of Nonlinear Control
Systems by the Second Method of Liapunov,” Trans. IRE, AC-1, October 1962, pp.
129-130.
47. Gibson, op. cit., pp. 328-334.
48. M. A. Aizerman, Lectures on Theory of Automatic Regulation (in Russian), Moscow,
1958.
49. G. N. Dubosin, “On the Problem of Stability of a Motion under Constantly Acting
Perturbations,” Trudy gos. astron. Inst., Sternberg, 1940.
50. I. I. Vorovich, “On the Stability of Motion with Random Disturbances,” Izv. Akad.
Nauk. SSSR, Ser. mat., 1956.
51. Hahn, op. cit., p. 107.
52. LaSalle, loc. cit.
53. LaSalle and Lefschetz, op. cit., pp. 121-126.
54. Hahn, op. cit., pp. 68-82.
55. D. R. Ingwerson, “A Modified Liapunov Method for Nonlinear Stability Analysis,”
Trans. IRE, AC-6, May 1961.

56. G. P. Szego, “A Contribution to Liapunov’s Second Method: Nonlinear Auton¬


omous Systems,” Paper No. 61-WA-192, ASME Annual Winter Meeting, Nov¬
ember 1961.
57. G. P. Szego, “On a New Partial Differential Equation for the Stability Analysis of
Time Invariant Control Systems,” J. SIAM, Ser. A on Control, Vol. 1, No. 1,
1962. pp. 63-75.
58. Hahn, op. cit., pp. 78-82.
59. S. G. Margolis and W. G. Vogt, “Control Engineering Applications of V. I.
Zubov’s Construction Procedure for Lyapunov Functions,” Trans. IEEE, AC-8
No. 2, April 1963, pp. 104-113.
60. D. G. Schultz and J. E. Gibson, “The Variable Gradient Method for Generating
Liapunov Functions,” Trans. AIEE, Vol. 81, Pt. II, 1962, pp. 203-210.
61. J. P. LaSalle and R. J. Rath, “Eventual Stability,” Proc. Fourth Joint Automatic
Control Conference, 1963, pp. 468-470, 1963 IFAC Conference.
62. E. R. Rang, “Adaptive Controllers Derived by Stability Considerations,” Minnea-
polis-Honeywell MPG Report 1529-TR9, March 1962.
63. W. L. Nelson, “Pulse Width Relay Control in Sampling Systems,” ASME Paper
No. 60-JAC-4, 1960.
64. T. T. Kadota and H. C. Bourne, “Stability Conditions of Pulse-Width-Modulated
Systems through the Second Method of Lyapunov,” Trans. IRE, AC-6, September
1961, pp. 266-276.

Problems
7.1 Show that the trajectories for the linear, second order system of Fig. 7.3-1 with the outer feedback loop gain equal to zero (ω² = 0, ζω ≠ 0) are given by x₂ + 2ζωx₁ = c.
7.2 Show that the trajectories for the linear, second order system of Fig. 7.3-1 with |ζ| = 1 are given by

x₂ + ζωx₁ = C ε^{ζωx₁/(x₂ + ζωx₁)},        ζ = +1 or −1
7.3 Sketch the phase plane trajectories for each of the cases considered in Fig. 7.3-8 for the system of Fig. 7.3-1 where v(t) now is not zero but is U₋₁(t), a unit step function.
7.4 Derive the phase plane trajectories of Section 7.3 from the mode interpretation viewpoint of Chapter 5.
7.5 (a) Determine the location and nature of any singular points in the example of Section 7.4 for the cases: a = 0; a > b > 0.
(b) Sketch the phase plane portraits for each of the preceding cases. Indicate any separatrices.
(c) In which of the cases is the system an oscillator or a flip-flop?
(d) On your sketches of part (b), indicate the value of x required to make the circuit oscillate or flip, as the case may be.
7.6 A one-farad capacitor and a one-henry inductor are connected in series with a tunnel diode negative resistance device characterized by i = −v + v³. Let x₁ be the capacitor voltage and x₂ be the current as shown in Fig. P7.6.

Fig. P7.6

(a) Determine the location and nature of any singular points, and sketch the x₁x₂ phase plane portrait. Are x₁ and x₂ a set of state variables for this system?
(b) Repeat part (a) in a vv̇-plane.

7.7 (a) Determine the location and nature of any singular points, and sketch the phase plane portrait for the frictionless pendulum described by θ̈ + sin θ = 0. Indicate any separatrices.
(b) Repeat part (a) for θ̈ + 0.1θ̇ + sin θ = 0.
7.8 Determine the location and nature of any singular points, and sketch the phase plane portrait for the nonlinear system described by

ÿ − (0.1 − ⅓ẏ²)ẏ + y + y² = 0

Could the system be used as an oscillator?11 (Using the Poincaré-Bendixson theorem, prove that a stable limit cycle exists.)
7.9 A feedback system is composed of a relay amplifier driving a motor. The motor speed is sufficiently low for the back emf to be neglected. The block diagram for the system is shown in Fig. P7.9a. The relay characteristic is given in Fig. P7.9b.

Fig. P7.9

(a) Assuming v(t) = 0, sketch the phase plane trajectories.
(b) If the motion is periodic, what is the period as a function of the maximum velocity during the oscillation?
(c) Is a limit cycle present?
7.10 The relay system of Problem 7.9 is modified by the addition of nonlinear feedback. The block diagram for the modified system is given in Fig. P7.10a. The nonlinear element has the characteristic shown in Fig. P7.10b.

Fig. P7.10

(a) Repeat the questions of Problem 7.9 for the modified system.
(b) What is the effect of the nonlinear feedback on stability?
(c) Discuss the investigation of the stability of this system by the describing function method.
7.11 The system of Fig. P7.11a has a nonlinear error detection means and deadband in the load. The deadband characteristic is shown in Fig. P7.11b.
(a) Determine the location and nature of any singular points in an x₁x₂-plane between the limits |x₁| < 2π + a.
(b) Sketch the phase plane trajectories for this system in the same range, and discuss system stability.
(c) Discuss the possibility and difficulty of analyzing this system by the describing function method.

Fig. P7.11

7.12 For the system with coulomb friction illustrated in Fig. P7.12:
(a) Determine the differential equation characterizing the system.
(b) Sketch the phase portrait, indicating any singular region.
(c) Does a limit cycle exist?
7.13 (a) Sketch the trajectories for the system of Fig. 7.5-5 in an ad-plane.
(b) Derive the successor function corresponding to part (a), and discuss system stability.
7.14 In the system of Fig. P7.14, m(t) can be +1 or −1. The value of m(t) is controlled by the switching logic, so that the transient error and error derivative to a step input (i.e., v(t) = U₋₁(t)) are reduced to zero in the shortest possible time. This corresponds to the least number of changes of sign of m(t) and is the case of optimum switching.

Fig. P7.14

(a) Indicate on an eė-plane the regions where m(t) should be +1 and where m(t) should be −1 for optimum switching.
(b) Does the system of part (a) switch in an optimum fashion for v(t) = t?
(c) Repeat part (b) for v(t) = t²/2.
(d) If the switching logic corresponding to part (a) has a pure time delay of T_d seconds (i.e., m(t) changes sign T_d seconds after it should according to the optimum switching criterion), determine a relationship in terms of T_d between the e when switching should occur, as determined in part (a), and the e when switching actually occurs.
(e) Sketch the switching curves which result with the time delay.
(f) Estimate the stability of the system with the time delay.
7.15 (a) Sketch the trajectories in an xẋ-plane assuming T > 0, α > β > 0 in the system of Fig. P7.15. Where are the singular points?
(b) For what values of α, β, and T is the system (i) stable in the large? (ii) unstable in the large? (iii) asymptotically stable in the large? Prove your answer using the successor function.
7.16 Let H = T + V, where T is the kinetic energy and V is the potential energy, be the Hamiltonian of a system with n degrees of freedom. The system can be described by n generalized coordinates q₁, q₂, . . . , qₙ and n generalized momenta p₁, p₂, . . . , pₙ. The equations of motion are q̇ = ∂H/∂p and ṗ = −∂H/∂q. Let the origin of the pq space be an isolated equilibrium. Use Lyapunov’s method to determine the conditions under which the equilibrium is stable (Lagrange’s theorem).

7.17 Using the Lyapunov function

V(x) = ω²x₁² + x₂²

determine the conditions for which the equilibrium of ẋ₁ = x₂, ẋ₂ = −ω²x₁ − g(x₁, x₂)x₂ is asymptotically stable in the large.

7.18 (a) Show that the system

dx₁/dt = −H(x₁) + x₂,        dx₂/dt = −g(x₁)

where

H(x₁) = ∫₀^{x₁} h(u) du

is equivalent to Lienard’s equation

d²x₁/dt² + h(x₁) dx₁/dt + g(x₁) = 0

(b) Using the V-function

V(x) = x₂²/2 + ∫₀^{x₁} g(u) du

and assuming g(0) = H(0) = 0, determine the conditions for asymptotic stability in the large.
7.19 A simplified representation of a single-axis satellite attitude control system with a gravity gradient effect is shown in Fig. P7.19. Using the function

V(x) = x₂²/2 + 1 − cos 2x₁

determine any requirements on T for the system to be asymptotically stable in the large. Discuss the practicality of the compensation.

Fig. P7.19

7.20 The RLC network of Fig. P7.20 consists of a passive resistance network, and capacitors and inductors, all of which may be nonlinear. Choose as the state variables the charge on each of the capacitors and the flux linkage of each of the inductors. Let the V-function be the stored energy, and determine the conditions for asymptotic stability in the large. How would hysteresis affect your solution?

Fig. P7.20
7.21 (a) Show that

f(x) = ∫₀¹ J(sx)x ds

where J is the Jacobian matrix.
(b) Use this result to show that if, for any positive definite quadratic form V = (x, Qx), Jᵀ(x)Q + QJ(x) is negative definite for all x ≠ 0, then the equilibrium of ẋ = f(x), f(0) = 0 is asymptotically stable in the large.39
(c) Apply this method to Example 7.9-3.
7.22 (a) Investigate the absolute stability of the system in which a zero memory nonlinear controller is in a loop with

(i) G(s) = (2s² + 4s + 1)/(8s³ + 11s + 3)

(ii) G(s) = 1/[(s + α)(s + β)]

(b) Derive the equation corresponding to Eq. 7.10-6, but which applies if some of the λ's are complex with negative real parts. Use this result for the cases

(i) G(s) = (s³ + 6s² + 12s + 8)/[(s + 1)(s² + 2s + 2)]

(ii) G(s) = K(s² + 1.5s + 1.5)/[s(s² + 2s + 2)]

7.23 Using a method analogous to that of Section 7.10, let P = uuᵀ, where u is a vector with components u₁, u₂, . . . , uₙ, develop conditions sufficient for stability in the large of the system of Fig. 7.10-1c. Assume that b₀ = 0, and that the controlled elements are second order with both λ's negative real. Can this P be used to prove asymptotic stability in the large? Why?
7.24 Can an equilibrium be stable without being totally stable? Think of an
undamped pendulum. Can an equilibrium be totally stable without being
stable?
7.25 (a) Assuming v(t) = 0, for what values of K is the equilibrium at x = 0 of the system of Fig. P7.25 stable?
(b) Assuming v(t) ≠ 0, but is bounded with a maximum absolute value of v_m, determine values of K for which x ultimately lies within a finite value of x = 0. How does this finite value vary with v_m and K?

Fig. P7.25

7.26 Discuss the effect of H(x₁) on the speed of response of the system of Problem 7.18.
7.27 Discuss the effect of T on the speed of response of the system of Problem
7.19.
7.28 Use the procedure of Section 7.13 to design a relay controller for the
system

a > 0, b > 0

7.29 Use the procedure of Section 7.13 to design a relay controller for the system

A = [  0    1    0 ]        B = [ 0   0   0 ]
    [  0    0    1 ]            [ 0   0   0 ]
    [ −c   −b   −a ]            [ 0   0   1 ]

where the characteristic values of A are distinct, negative real.


7.30 Extend the procedure of Example 7.13-2 to the case in which m is a vector.
7.31 Use the variable gradient method to determine a Lyapunov function for a system in the form of Fig. 7.10-1b, where

G(s) = (s + α)/[s(s + 1)]

What are the conditions on α and k( ) for which the system is asymptotically stable in the large?
7.32 (a) Use the variable gradient method to determine a Lyapunov function for the system of Fig. 7.10-1b where

G(s) = (s + 1)/[s(s + β)]

(b) What are the conditions on β and k( ) for which the system is asymptotically stable in the large?
(c) Resolve your answer to part (b) with Aizerman’s conjecture (see Example 7.10-1).
7.33 Using the variable gradient method, determine V(x) and W for the system of Example 7.4-1. Assume

α₂₁c = α₁₂c = α₂₁ᵥ = α₁₂ᵥ = 0

In what regions of the x₁x₂-plane are the V(x) curves closed? What can you say about the behavior of the system?
7.34 Investigate the stability of the system of Example 7.15-1, when a and β are also functions of time.
7.35 Develop the method analogous to Section 7.13 for the case in which x and m are discrete signals.
7.36 Investigate the application of Lyapunov’s second method to the first canonic form of Lur’e, when σ, x, and ξ are discrete signals.
7.37 The linear dynamics characterized by ẋ = Ax + Bm are such that |λI − A| = 0 has one root at λ = 0, negative real roots, and some complex roots with negative real parts. The roots are distinct. These linear dynamics are to be controlled by a controller such that the various components of m are given by

mᵢ(t) = Mᵢ sgn (x(k), aᵢ)

for (k + 1)T > t ≥ kT, i.e., mᵢ can change sign at the discrete time instants kT, (k + 1)T, . . . . Between these discrete instants, mᵢ is a constant equal to +Mᵢ or −Mᵢ. Can the aᵢ be determined so that the controlled system is asymptotically stable in the large?
8
Introduction to Optimization Theory

8.1 INTRODUCTION

The Second World War provided a great impetus to the development


of the feedback control systems area. After a somewhat dormant period
in the 1950’s, the area again received a strong stimulus. This was caused
by the interest in industrial automation and, even more, by the advent of
the space age. Modern control system design has become exceedingly
complex because of the desire to control large-scale, inherently nonlinear
processes which sometimes operate in widely changing environments, and
because of extremely stringent specifications on the performance of such
systems. Optimization theory appears to offer the control system engineer
a means of combating the complexities of modern control system design.
It is an excellent example of the usage of linear vector space concepts.
Although much research is presently being performed in the optimization
area, this chapter is limited to attempting to provide the reader with the
basics of optimization theory, and to indicating the nature of some of the
difficulties involved with its application.
The philosophy of optimization theory is to design the “best” system.
This, of course, implies some criterion or performance index for judging
what is “best.” The determination of a suitable performance index is often
a problem in itself. Performance indices are discussed in later sections.
In comparison with more conventional methods for feedback control
system design, the advantages of optimization theory include:

1. The design procedure is more direct, because of the inclusion of all


the important aspects of performance in a single design index.
2. The best the designer can hope to achieve with respect to the perform¬
ance index is apparent. Thus the ultimate performance limitations, and
the extent to which these limitations affect a given design problem, are
indicated.
3. Inconsistent sets of performance specifications are revealed.
4. Prediction is naturally included in the procedure, because the design
index evaluates performance over the future interval of control.
5. The resulting control system is adaptive, if the design index is refor¬
mulated and the controller parameters recomputed on-line.
6. Time-varying processes do not cause any added difficulty, assuming
that a computer is used to determine the optimum.
7. Nonlinear processes can be treated directly, however, at the expense
of increased computational complexity.

The difficulties of optimization theory include:

1. The conversion of prescribed design specifications into a meaningful


mathematical performance index is not a straightforward process, and it
may involve trial and error.
2. Existing algorithms for the computation of the optimum control
signals in nonlinear cases require complex computer programs and, in some
cases, a large amount of computer time.
3. Proven techniques for the design of controllers for large regions of
state space, rather than merely for small regions about nominal trajectories,
are presently unavailable for nonlinear cases.
4. The resulting control system performance is highly sensitive to erro¬
neous assumptions about and/or changes in the values of the parameters
of the controlled elements.

Considerable research is presently being devoted to these limitations.


The subject of system optimization had its birth in the optimum linear
filter theory of Wiener.1 This theory was extended to the time-varying
case by Booton.2 Neither of these is directly applicable to control system
optimization, however, since the limitations of physical components are
not considered. On the basis of Wiener’s optimum filter theory, Newton
considered the limitations of physical components by introducing con¬
straints on functions of signals in the system.3 With reference to Fig.
8.1-1a, Newton’s method can be viewed as determining the transfer func¬
tion of the optimum compensation for the system. As such, it is necessarily
restricted to linear systems. Furthermore, this method neglects the effect
of the configuration of the system on its performance.
A departure from the preceding procedure is indicated with respect to
Fig. 8.1-1b, by seeking the optimum control signals for the controlled
elements. The optimum control law, i.e., the dependence of these optimum
control signals on the state variables of the controlled elements and the
Fig. 8.1-1

desired system behavior, must also be determined in order to realize the


system. The optimum control law indicates the optimum system configura¬
tion.
The latter approach to system optimization is utilized in the modern
procedures developed around the dynamic programing concepts of Bell¬
man, and the extended variational calculus methods of Pontryagin.
Numerous others, some of whom are mentioned later in this chapter, have
also made many important contributions to optimization theory. Notable
among these is Merriam, who has been particularly concerned with mak¬
ing optimization theory of practical value to the control engineer. Much
of the material of this chapter has been taken from his writings and those
of Ellert, one of his associates.

8.2 DESIGN REQUIREMENTS AND PERFORMANCE INDICES

The primary task of the control engineer is to design practical control


systems for physical processes. The application of optimization theory to
this problem, as considered here, consists of three fundamental steps. They
are:

1. Formulation of mathematical models for both the behavior of the


physical process to be controlled and the performance requirements. The
mathematical model of the performance requirements is the performance
index.
2. Computation of the optimum control signals.
3. Synthesis of a controller to generate the optimum control signals.

This section considers various performance indices and their relationship


to the performance requirements.
Control systems must satisfy numerous requirements relating to the

performance of the system and its implementation. For example, system


performance requirements may include:

1. Desired system response.


2. Desired control effort.
3. Limits on the control effort.
4. Limits on the system response, dictated either by the nature of the
system mission or by saturation limits.
5. Desired system response at some future terminal time.
6. Minimization or maximization of some function of a process variable
or time.
7. Disturbances, initial conditions, parameter variations, etc., which
must be tolerated.
8. Damping ratio.
9. Undamped natural frequency.

Requirements 1 through 7 are objective requirements, since they can be


mathematically described for any system. The last two requirements are
subjective, however, because they have a precise meaning for linear, second
order, time-invariant systems only. Nevertheless, they are useful for
approximately characterizing the relative stability and the speed of response of more general feedback systems.
The system implementation requirements may include the specification of

1. Available sensors.
2. Available controller components.
3. System size, weight, cost, and reliability.

Fig. 8.2-1
Implementation requirements are exceedingly difficult to include directly
in any design procedure.
The performance index is a mathematical model of the performance
requirements. It is expressed in terms of the inputs, outputs, and state
signals of the controlled elements. These are indicated in Fig. 8.2-1.
Many performance indices have been proposed in the literature.4-16 A substantial portion of these are special cases of the performance index

∫₀^∞ f(t)g[e(t)] dt

where f(t) is a factor which weights g[e(t)] as a function of time. g[e(t)] is a function of the error e(t). f(t) is usually one of the functions 1, t, t², . . . , or tⁿ, and g[e(t)] is usually e²(t) or |e(t)|. In particular, the integral square

error index

I = ∫₀^∞ e²(t) dt                  (8.2-1)

leads to responses which tend not to be sufficiently damped, because large


errors are counted more heavily than small errors. Thus minimization of
the index requires that large errors be removed rapidly. However, this
performance index is often used because of its analytical tractability.
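As a concrete illustration (not from the text), the sketch below evaluates the integral square error of Eq. 8.2-1 for the unit step response of an assumed first order system with time constant τ = 0.5, for which the analytical value is τ/2.

import numpy as np

tau, dt, T_end = 0.5, 1e-4, 10.0
y, ise = 0.0, 0.0
for _ in range(int(T_end / dt)):
    e = 1.0 - y                 # error for a unit step input v(t) = 1
    ise += e * e * dt           # accumulate the integral of e^2
    y += dt * (1.0 - y) / tau   # y' = (v - y)/tau
print(ise)                      # approximately tau/2 = 0.25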
Performance indices of the form of Eq. 8.2-1 are not suitable when multiple design specifications are encountered, since the error may be only one of these specifications. For this reason, performance indices of more general forms have been proposed. For example, Ellert uses the form16

I = ∫_{t₀}^{t_f} ( Σ_{i=1}^{n} { φ_{ii}(t)[(y_{id}(t) − y_i(t))/l_{y_i}]^{γ_i} + ξ_{ii}(t)[y_{id}(t) − y_i(t)]² }
    + Σ_{i=1}^{M} { ψ_{ii}(t)[(m_{id}(t) − m_i(t))/l_{m_i}]^{μ_i} + λ_{ii}(t)[m_{id}(t) − m_i(t)]² } ) dt        (8.2-2)

where y_{id} and m_{id} are the desired output and control effort, respectively; l_{y_i} and l_{m_i} are related to limits on y_i and m_i, respectively; φ_{ii}(t), ξ_{ii}(t), ψ_{ii}(t), and λ_{ii}(t) are time-dependent weighting factors; and γ_i and μ_i are integers. The performance index considers the system behavior during the future time interval t₀ ≤ t ≤ t_f, where t_f may be a constant, a variable, or infinity.
The weighting factors permit the various terms of the performance index to be emphasized or weighted in time, depending upon the relative importance of these terms. The terms raised to the powers γ_i and μ_i are penalty functions, which tend to maintain output and control signals within prescribed limits. This is accomplished by heavily weighting these signals, if they exceed their limits.
A unique set of weighting factors and penalty functions to satisfy prescribed design specifications generally does not exist. Furthermore, the selection of these quantities is unfortunately not a straightforward matter. However, the lack of uniqueness of the weighting factors and penalty functions does introduce a flexibility which makes their selection simpler. From an engineering viewpoint, an efficient procedure for selecting weighting factors and penalty functions is needed. As discussed in later sections, Ellert has partially answered this need.
The performance indices above, and many of the specialized indices found in the literature, can be put in the form

I = ∫_{t₀}^{t_f} q[y(t), m(t), t] dt                  (8.2-3)

For example, in the flight of a vehicle from one point to another with least
fuel consumption, minimization of Eq. 8.2-3 is desirable, if q(y, m, t) is
chosen as the fuel consumption per unit time. In chemical process control,
one might seek a maximum of Eq. 8.2-3. In the latter case, however,
q[y, m, t] typically would represent the instantaneous yield of the process.
As a final illustration, minimization of the time required for a system to go
from one state to another can be accomplished by minimizing Eq. 8.2-3,
with q[y, m, t] chosen to be a constant. In such a case, constraints would
exist on the maximum velocities and accelerations which can be tolerated.
Many more examples of optimization problems could be listed. How¬
ever, the important aspect of this discussion is that, even though these
problems are different, they are all closely related mathematically by the
objective of finding a maximum or a minimum of Eq. 8.2-3. Problems
of this type can be solved by Pontryagin’s method, or by the dynamic
programming techniques of Bellman.

8.3 NECESSARY CONDITIONS FOR AN EXTREMUM—VARIATIONAL CALCULUS APPROACH

The problem to be considered is one of determining the control signals m(t) which minimize (or maximize) the performance index of Eq. 8.2-3. The controlled elements are described by the equations

ẋ = f(x, m, t)
y = g(x, t)                  (8.3-1)

The elements of f are assumed to be continuous with respect to the elements of x and m, and continuously differentiable with respect to the elements of x. The controlled elements are assumed to be observable and controllable, i.e., all state variables are measurable, and it is possible to excite every state of the controlled elements. The presentation here is further limited to the special case for which there are no restrictions on the amplitudes of the control signals or state variables. A more general presentation is given by Pontryagin et al.17
Before considering minimization (maximization) of a functional, as
Eq. 8.2-3, it is worthwhile to consider the more familiar case of minimiza¬
tion (maximization) of a function. All engineers have encountered prob¬
lems of trying to minimize (maximize) a function of a finite number of
independent variables, say 0(x). Points at which all the first partial deriva¬
tives of the function are zero are known as stationary points. If the function
is a minimum (maximum) at a stationary point, then that point is called an
extremum.

If the variables of the function are not independent but are subject to equality constraints, e.g., w(x) = 0, necessary conditions for an extremum can be determined by Lagrange’s method of multipliers. This method consists in introducing as many new parameters (Lagrange multipliers) p₁, p₂, . . . (which may be regarded as the components of a vector p) as there are constraint equations, forming the function θ_c = θ(x) + (p, w), and determining necessary conditions for an extremum from grad_x θ_c = 0 and grad_p θ_c = 0. Thus these conditions are

∂θ_c/∂x₁ = ∂θ_c/∂x₂ = · · · = 0

∂θ_c/∂pᵢ = wᵢ(x) = 0

Lagrange’s method avoids having to solve the constraint equations for the x's and substituting the results into θ(x). This is accomplished by introducing the above additional restrictions.
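A small worked illustration (assumed problem, not from the text): minimize θ(x) = x₁² + x₂² subject to the single constraint w(x) = x₁ + x₂ − 1 = 0. The sketch below forms θ_c and solves the stationarity conditions symbolically.

import sympy as sp

x1, x2, p1 = sp.symbols('x1 x2 p1', real=True)
theta_c = x1**2 + x2**2 + p1 * (x1 + x2 - 1)   # theta_c = theta(x) + (p, w)

conditions = [sp.diff(theta_c, v) for v in (x1, x2, p1)]
print(sp.solve(conditions, [x1, x2, p1]))      # {x1: 1/2, x2: 1/2, p1: -1}

The multiplier p₁ = −1 appears as a by-product; the constrained minimum is at x₁ = x₂ = ½.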
The calculus of variations is also concerned with the determination of extrema.† Rather than extrema of functions, however, the object of the calculus of variations is to determine extrema of functionals. Section 1.4 indicates that, if x has a unique value corresponding to each value of t lying in some domain, then x(t) is said to be a function of t for that domain; to each value of t, there corresponds a value of x. In essence, a functional is a function of a function, rather than of a variable. For example, f[x(t)] is a functional if, to each function x(t), there corresponds a value of f. The performance index I of Eq. 8.2-3 is also a functional.
If the second of Eqs. 8.3-1 is substituted into Eq. 8.2-3, the result can be written as‡

I = ∫_{t₀}^{t_f} f₀(x, m, t) dt                  (8.3-2)

Then the problem of determining an extremum of Eq. 8.3-2 for the controlled elements of Eqs. 8.3-1 is one of determining the function m(t) which makes I an extremum, subject to the n equality constraints f(x, m, t) − ẋ = 0. The method of Lagrange multipliers is also useful for

† Reference 18 is a particularly readable presentation of the calculus of variations. Reference 19 provides a higher degree of rigor.
‡ In order to exclude degenerate problems, it is assumed that all state variables contribute to the value of the performance index. This may be due to the state variables appearing explicitly in f₀, or through their effect on other state variables which appear in f₀.
+ In order to exclude degenerate problems, it is assumed that all state variables contri¬
bute to the value of the performance index. This may be due to the state variables
appearing explicitly in/0, or through their effect on other state variables which appear
in/0.

minimizing (maximizing) functionals, subject to functional equality constraints, which is the problem of interest.
Thus the functional

I_c = ∫_{t₀}^{t_f} [f₀ + (p, f − ẋ)] dt                  (8.3-3)

is formed. The components of p are Lagrange multipliers. If the optimum values (i.e., those furnishing the extremum of I) of x, m, and p are denoted by x°, m°, and p°, respectively, then perturbations in these variables from their optimum values are indicated by

x = x° + Axₐ
m = m° + Bmₐ                  (8.3-4)
p = p° + Γpₐ

where A, B, and Γ are diagonal matrices with elements αᵢ, βᵢ, and γᵢ, respectively. αᵢ, βᵢ, and γᵢ are parameters which adjust the amount of perturbation that the quantities xₐᵢ, mₐᵢ, and pₐᵢ introduce into xᵢ, mᵢ, and pᵢ, respectively. It is assumed that these perturbations are unrestricted.
From the first of Eqs. 8.3-4, it is apparent that

ẋ = ẋ° + Aẋₐ                  (8.3-5)

If Eqs. 8.3-4 and 8.3-5 are substituted into Eq. 8.3-3, I_c has its optimum value I_c° for α = β = γ = 0, since x, m, and p then have their optimum values x°, m°, and p°, respectively. Thus Eq. 8.3-3 has a stationary point at α = β = γ = 0, and necessary conditions for the optimum are

grad_α I_c |_{α=β=γ=0} = 0
grad_β I_c |_{α=β=γ=0} = 0                  (8.3-6)
grad_γ I_c |_{α=β=γ=0} = 0

Application of Eq. 8.3-6 to Eq. 8.3-3, after substitution of Eqs. 8.3-4 and 8.3-5, yields

∫_{t₀}^{t_f} (Xₐ grad_{x°} H_c° + Ẋₐ grad_{ẋ°} H_c°) dt = 0

∫_{t₀}^{t_f} (Mₐ grad_{m°} H_c°) dt = 0                  (8.3-7)

∫_{t₀}^{t_f} (Pₐ grad_{p°} H_c°) dt = 0

where Xₐ, Mₐ, and Pₐ are diagonal matrices whose elements are the elements of xₐ, mₐ, and pₐ, respectively, and H_c° is the optimum value of the

integrand of Eq. 8.3-3, i.e., H_c° = f₀° + (p°, f° − ẋ°). Integration by parts of the second term in the first of Eqs. 8.3-7 allows that equation to be written as

∫_{t₀}^{t_f} Xₐ [grad_{x°} H_c° − (d/dt)(grad_{ẋ°} H_c°)] dt + Xₐ grad_{ẋ°} H_c° |_{t=t₀}^{t=t_f} = 0

xₐ is an arbitrary perturbation, except at t = t₀, and possibly at t = t_f. At t = t₀, xₐ = 0, so that x°(t₀) = x(t₀) in order for the optimum solution to apply to the problem of interest. For a problem with specified terminal conditions x(t_f), x°(t_f) = x(t_f) and xₐ(t_f) = 0. If the terminal conditions on x are not specified, xₐ(t_f) is arbitrary. Thus the preceding equation requires that

grad_{x°} H_c° − (d/dt)(grad_{ẋ°} H_c°) = 0                  (8.3-8)

and either x°(t_f) = x(t_f) or

grad_{ẋ°} H_c° |_{t=t_f} = 0                  (8.3-9)

The last two of Eqs. 8.3-7 are satisfied if

grad_{m°} H_c° = 0
grad_{p°} H_c° = 0                  (8.3-10)

Equations 8.3-8 and 8.3-10, together with the boundary conditions x°(t₀) = x(t₀), and either x°(t_f) = x(t_f) or Eq. 8.3-9, constitute the first necessary condition for an optimum.
Pontryagin’s equations are usually written in a form analogous to Hamilton’s equations of analytical mechanics. This can be accomplished by defining H, analogous to the Hamiltonian, as

H(x, m, p, t) = (p, f)                  (8.3-11)

The vector p in Eq. 8.3-11 differs from the one of Eq. 8.3-3 in that it has a zeroth component equal to unity. Likewise, f in Eq. 8.3-11 differs from the one of Eq. 8.3-1 in that it has a zeroth component equal to f₀(x, m, t) of Eq. 8.3-2. Thus p and f are now vectors with n + 1 components.
In terms of the optimum H, the first necessary condition for an optimum for the case of unspecified terminal conditions on the state variables can be written as

grad_{x°} H° = −ṗ°
grad_{m°} H° = 0                  (8.3-12)
grad_{p°} H° = ẋ°

subject to the boundary conditions x°(t₀) = x(t₀) and p°(t_f) = 0.† For specified terminal conditions on x°, the latter boundary condition is

† The latter boundary condition is a special case of the so-called transversality condition.
Sec. 8.3 Necessary Conditions for an Extremum 555

replaced by x°(//) = x(//).f The x° equation above is equivalent to Eq.


8.3-2 and the first of Eqs. 8.3-1, and hence is always part of the problem
statement. Equations 8.3-12 are called the Euler or Hamilton equations in
canonic form. Their simultaneous solution yields the control signal
m°(t) which makes I stationary.
In the calculus of variations, a distinction is made between weak and
strong maxima and minima. In addition to the Euler equations, two other
necessary conditions for a weak maximum or minimum must be satisfied.
They are the Legendre condition and the Jacobi condition.18 The Legen¬
dre condition for a weak minimum requires that the matrix (see Eq.
4.11-11) gradmo > <gradm0 H° be positive definite. This is analogous to
the requirement that the second derivative be positive in the minimization
of a function by the usual techniques of calculus. As indicated in the next
section, proper formulation of f0 guarantees satisfaction of the Legendre
condition, if the controlled elements are linear.
The Jacobi condition requires that no conjugate points exist for t0 <
t < tf. A conjugate point is one at which x{a(tj) is restricted, where t0 <
t± < tf4 Hence, at such a point, the perturbation in xft) is restricted.
If a conjugate point existed, the controller parameters could become un¬
bounded. Thus, to ensure bounded controller parameters, the Jacobi
condition must be satisfied. If the controlled elements are linear, proper
formulation of f0 also guarantees satisfaction of the Jacobi condition.
This is indicated in Section 8.4.

Coordinate Optimization Interpretation

Pontryagin’s formulation of the optimization problem restates the prob¬


lem as one of optimization of a coordinate. In essence, a zeroth coordinate
of x is introduced as rt
xo(0 = /oO, m> T) dr
J to

so that x0(t) = /0(x, m, t). Optimization of x0(t) at t = tf is optimization


of the performance index, since

I = x0(tf) = I /0(x, m, 0 dt (8.3-13)


J to
which is Eq. 8.3-2.
For first order controlled elements described by xx = fx(xly mls t) the
optimization problem with a specified terminal condition xf(tj) = x1 (tf)
can be interpreted in the a^a^-plane of Fig. 8.3-1. (The generalization to
t Problems in which some, but not all, of the state variables have specified terminal
conditions are also considered in the literature.17
+ Conjugate points have very interesting geometrical interpretations.18’19 However,
these are beyond the scope intended here.
556 Introduction to Optimization Theory

xi

higher order controlled elements is straightforward, but difficult to picture.)


The desired terminal condition is the line x1 = xxFor various m^t),
corresponding values of / = x0(tf) can be determined from Eq. 8.3-13.
Assuming that minimum / is desired, the optimum control signal
is the one for which x0(tf) = / has the smallest coordinate x0°(tf) = 1°.
If I does not depend upon mx(t), a desirable mj°(t) is an impulse. Then
xx(t) would be transferred from aq(t0) to a;1°(t/) in zero time, and hence with
zero I. However, impulses in m^t) could not be realized physically.
m^(t) must be chosen from a set of admissible control signals, which are
defined to be bounded, and also continuous for all t0 < t < tf, except
possibly at a finite number of t.
In the case of unspecified terminal conditions on x°(t/), all components of
p° are zero at t — tf except for the p0°(tf) component, which is unity. Thus
the problem of minimizing (maximizing) / = x°(tf) can be viewed as
minimizing (maximizing) (p, x) at t = tf. In other words, starting from the
initial conditions x°(t0), m°(t) is to be chosen to move the state of the system
(including the x0 component) as little (much) as possible in the direction of
the vector p. But the first and last of Eqs. 8.3-12 are the same as Hamil¬
ton’s equations of analytical mechanics. H is analogous to the Hamil¬
tonian, or total energy, and p and x are analogous to the momenta and
generalized coordinates, respectively. Since H is the total energy for
moving the state, x, m(t) should be chosen at each instant of time to mini¬
mize (maximize) H. This is indicated by the second of Eqs. 8.3-12.

8.4 LINEAR OPTIMIZATION PROBLEMS

In this section, it is assumed that the controlled elements are described


by
x = A (t)\ + B(r)m (8.4-1)
Sec. 8.4 Linear Optimization Problems 557

and the performance index is given by substituting

/o(x, m, t) = i[((xd - x), £2(xd - x)) + (m, Zm)]

into Eq. 8.3-2. xd is the desired state behavior, and £2 and Z are symmetric
matrices which are possibly time-varying.| The dimensions of £2 are less
than (n x n), unless all components of (xd — x) are included in f0% The
objective is to determine x°, m° and the dependence of m° on x° and xd.
From Eq. 8.3-10,

H° = l[(xd — x°, £2(xd — x0)) + (m°, Zm0)] + (p°, Ax° + Bm°)

From the second of Eqs. 8.3-12, Zm° -f BTp° = 0 since Z is symmetric.


Then
m° = -Z_1BV (8.4-2)

This is an expression for the optimum control signal, but it is in terms of


p°. The control law requires m°(r) in terms of x°(/).
From the first of Eqs. 8.3-12, p° = —ATp° + ££(xd — x°). From Eq.
8.4-1, after substituting Eq. 8.4-2, x° = Ax° — BZ-1BTp°. The last two
equations can be written as

— BZ_1B T' “x°" 0“


T" £2 (8.4-3)
n
-At Lp J Lx J

Equation 8.4-3 represents 2n linear, first order differential equations in the


2n unknowns aq°, x2°, . . . , xn°, p1°,p2°, . . . ,pn°. They are subject to n
boundary conditions at t = t0, i.e., x°(^0) = x(/0), and n boundary condi¬
tions at t = tf, i.e.. either p°(r/) = 0 or x°(/r) = x(^), depending upon the
nature of the problem. Equation 8.4-3, subject to the preceding boundary
conditions, is a two-point boundary value problem. Its solution yields the
optimum control signal m0(t) and the corresponding behavior of the con¬
trolled elements x°(t) for t0 < t < tf.

Conversion of the Two-Point Boundary Value Problem

For the case under consideration, the two-point boundary value problem
can be converted into two one-point boundary value problems. Equation
8.4-3 consists of a set of interrelated linear differential equations for x° and

t The Euler equations together with a positive semidefinite £2 and a positive definite Z
constitute necessary and sufficient conditions for a minimum of the performance index,
for the class of problems considered here. Furthermore, the corresponding linear
optimum control system is stable (asymptotically stable if £2 is positive definite).25
X A similar statement holds with respect to Z in terms of the dimensions of m.
558 Introduction to Optimization Theory

p°. Thus x° and p° must be related by a linear transformation. This trans¬


formation may be expressed by

p° = Kx° - y° (8.4-4)

where K is a square matrix of time-varying gains and v is a time-varying


vector. Substitution of Eq. 8.4-4 into the second of Eqs. 8.4-3 yields

Kx° + Kx° - v° = -S2x° - AtKx° + AV + &xd

Then substituting for x° from the first of Eqs. 8.4-3 and using Eq. 8.4-4
results in

(K + KA -f AtK - KBZ-1BrK + £2)x°


= v° + (AT - BZ_1Br)v° + £lxd

Since this expression must be valid for all possible x, the conditions are

K + KA + AtK - KBZ_1BtK + ft = [0]


(8.4-5)
v° + (AT - KBZ_1Br)v° + £lxd = 0

The first of Eqs. 8.4-5 is a set of first order nonlinear differential equations
of the Riccati type.20 The second of Eqs. 8.4-5 is a set of linear, time-vary¬
ing, first order differential equations.! In the case of unspecified terminal
conditions on x°, p°(^) = 0. Thus the boundary conditions on K and v°
for this case are that each of the elements of K and v° is zero at t — tf, as
indicated by Eq. 8.4-4.
Once K and v° are determined, the control law for the optimum system,
is given by substituting Eq. 8.4-4 into Eq. 8.4-2 to obtain

in0 = — Z_1Bt(Kx° - v°) (8.4-6)

Thus, for this case, the control law is linear, and the controller feedback
gains K are independent of the state of the controlled elements. Further¬
more, since the control law is independent of the initial conditions of the
state variables, the system configuration as defined by Eq. 8.4-6 is optimum
for all initial conditions. Merriam, who first noted this property, refers to
this as the optimum configuration.14 Figure 8.4-1 illustrates this configura¬
tion for the general linear case.
Once m° is determined, the response of the optimum system can be
obtained from
x° = (A - BZ_1BtK)x° + BZ_1Brv° (8.4-7)

which results from substituting Eq. 8.4-6 into Eq. 8.4-1. Thus the two-
point boundary value problem has been converted into two one-point
f These equations are adjoint to the equations of the closed-loop (controlled) system.
Sec. 8.4 Linear Optimization Problems 559

Controlled elements

Fig. 8.4-1

boundary value problems. These are the solution of Eq. 8.4-5 backward in
time from t = tf to t — t0, and subsequently solving Eq. 8.4-7 forward in
time from t — t0 to t — tf.
In the nonlinear case, where the controlled elements are nonlinear and/or
the performance index is nonquadratic, it is not possible to convert the
two point boundary value problem in the above manner. Also, the opti¬
mum control law is not linear. These aspects then generally demand com¬
puter solution of the equations defining the optimum system, as is
considered in later sections.

Example 8.4-1. Determine the optimum controller according to the performance index

r*f
I= i [x1co11x1 + d dt
J to
for the first order controlled elements described by xx(t) = axlxx(t) + blxmx{t). The
system is assumed to be a regulator, so that xxd — 0. xx{tf) is unspecified.
For xd = 0, Eq. 8.4-5 indicates v° = 0. Therefore mx°(t) = — l\\Xbxxkxx(t)xp(t), where
kxx{t) is given by the solution to

kxx + 2kxxaxx ^j^iT T wn = 0? =: 0

from Eqs. 8.4-6 and 8.4-5. In order to determine kxx{t), let r = tf — t and kxx(tf — r) =
kxx°(r). Then

jn p o _ k\\~ + wii5 *11°(0) = 0

Let kxx°(j) = £,xxzlblx2z, which yields


c/Ji^u2
z — 2axxz — - 2 = 0
til
This is a linear differential equation with constant coefficients; it can be solved by
classical or transform methods. The solution is z — cxeXlT + c2ek2T, where cx and c2 are
constants, and Xx — axx + /?, k2 = axx — (3, and

+ 0Jn
560 Introduction to Optimization Theory

Thus
£ii(liCi*PT "t“ @T)
k\\ °(T)
bix\c^T + c2e~^T)
Since ^u°(0) = 0, c2 = — Then
con sinh (It
*11°(T) =
(3 cosh (It — alx sinh fir
Therefore
con sinh P(tf — t)
k „(*) =
P cosh P(tf — t) — <2U sinh P(tf — t)
and m°(0 is given by
COll^H sinh P(tf -r 0
md(t) = —
*i°(0
£n _P cosh P(tf — t) — axx sinh P(tf — t)_

The resultant regulator is shown in Fig. 8.4-2.

If tf is a constant, the terminal time of the performance index becomes nearer as real
time advances, assuming t < tf. In this so-called shrinking interval problem, the optimum
system is time-varying. If tf is a fixed time Tin the future relative to real time, i.e., tf —
t + T, the terminal time of the performance index slides ahead in time as real time
advances. This is called a sliding interval problem, and, if xd, SI, Z, and the linear
controlled elements are time-invariant, the resultant system is stationary. A special case
of these is given by infinite tf. This is the infinite interval problem, f If xd, 52, Z, and the
controlled elements are time-invariant, the resultant system designed according to an
infinite interval performance criterion is stationary. In this example, mP(t) becomes

COji^n
mx\t) V(0
Ui(P ~ On)
corresponding to a stationary system.
For the case in which the controlled elements consist of an integrator without feed¬
back, axl = 0. Then, for the infinite interval case, mx0(t) is

V%
m-P(t) I *i°(0

As (OnlCii is increased, so that the performance index emphasizes the system error
relative to the cost of reducing it, the loop gain increases. Also, the speed of response as

f These names were coined by Merriam.


Sec. 8.4 Linear Optimization Problems 561

indicated by

\A
®1°(0 = xiVo) exp (t - to)

increases. This agrees with one’s intuition based on conventional feedback control
theory.

If the optimum system is stationary, K = [0] and the Riccati equation


given in Eq. 8.4-5 reduces to a set of nonlinear algebraic equations
defining the elements of K. Even in this special case, however, it generally
is not possible to determine K in closed form for controlled elements above
second order. Thus the preceding discussion was presented to indicate
that the control law of Eq. 8.4-6 exists for the linear case, rather than to
provide a general method of determining it. Since the control law is of the
form of Eq. 8.4-6, it is important to choose, as the state variables, variables
which can be measured with available sensors.
In the general case, analytical determination of the optimum system
makes use of direct solution of the two-point boundary value problem,
rather than converting it to two one-point boundary value problems.

The Time-Invariant Case

If A, B, £2, and Z are time-invariant, Eq. 8.4-3 can be solved by means


of Laplace transforms. Assuming t0 = 0, the transform of Eq. 8.4-3 is

I <*>(s)BZ1Br" X°(s)" 4>(s)x°(0)


_i

-1
G
•e*

1—

.-*r(-s)[p0(0) +£lXd(s)]_
i

-P°(s).
<

where 4>(s) = (si - A)-1 and -&T(-s) = (si + A2)-1, and X°(s), P°(s)
and Xd(s) are vectors. 3>(s) and — <!>(—s) are the Laplace transforms of the
state transmission matrices for Eq. 8.4-1 and its adjoint, respectively.
Since
-1
I a12~ (I - a12a21) 1 -(I - a12a21) la12

_«21 1 _ _-(I - (I - a^a^)-1 _


X°(s) and P°(s) can be written as
X°(s) = [I + 4>(s)BZ_1Bt4>77( — s)^]-1
x <£(s){x°(0) + BZ-1Br<f>r(—s)[p°(0) + ftXd(s)]}

P°(s) = [I + <f>r(-s)S2<*>(s)BZ-1Bz’r1 (8'4'8)


x r(—s){Sl<&(s)x°(0) - [p°(0) + flXd(s)]}
Equations 8.4-8 and 8.4-2 can be utilized to determine the optimum control
law and the response of the optimum system. Elowever, the procedure is
less direct than the previous method.
562 Introduction to Optimization Theory

Example 8.4-2. Repeat Example 8.4-1, using the Laplace transform method.
For this case,
1
<*>0) = [si - A]-1 =
S - an
and Xd(y) = 0. Then
-i
(Dub ii 2 1 c°iixi0(Q)
iY(s) = - 1 - pm
£u0 - fluX-s + «n)_ s + an a ii

«>ii®i0(0) pm(s - an)


+
(■s + P)(s - p) 0 + P)(s - P)
where p is defined in Example 8.4-1. Inverse transformation leads to

Pi°(t) = - °hlX^(0) Sinh pt + ^ cosh pt _ aii sinh


P P
The boundary condition on pP(t) is pi°(tf) = 0. Since the problem is linear, this may be
accomplished by adjustment of pPiO). Thus
^lFh^O) sinh Ptf
Pm =
P cosh Ptf — axl sinh ptf
Then pAt) can be written as

afiiV(O) sinh Ptf(P cosh Pt — axl sinh Pt)


Pi\t) = sinh Pt + n 1 n
P P cosh ptf — alx sinh ptf
In a similar fashion,
P cosh P(tf — t) — an sinh P(tf — t)
xy{t) = a?!°(0)
P cosh ptf — an sinh ptf
This expression can be solved for ^i°(0), and the result substituted for ^^(O) in the
equation for pAO- This yields

a>n sinh P(tf — t)


pA 0 = *i°(0
P cosh P(tf — t) — alr sinh P(tf — t)_
Then, from Eq. 8.4-2,
(D11bn sinh P(tf — /)
m-Pit) — — *1°(0
in IP c°sh p(tf — t) — alx sinh P(tf — t)_

which is the result previously obtained by the more direct method. This result is also
illustrated by Fig. 8.4-2.

Example 8.4-3. Determine the optimum system for the controlled elements of Fig.
8.4-3. The performance index is described by

'(On O' 0 0 ■
£1 = z =
0 0 o £22

and tf is infinite, corresponding to an infinite interval problem. Again a regulator


problem is assumed, so that xd(t) = 0. Also, x(tf) is unspecified.

m2 = *2 X2 = *1 xi
/ -> I

Fig. 8.4-3
Sec. 8.4 Linear Optimization Problems 563

Since Z is singular, the first of Eqs. 8.4-8 cannot be used directly. The factor Z-1 in
Eqs. 8.4-8 is due to solving
Zm° + Brp0 = 0 (8.4-9)

for m° and substituting the result into Eq. 8.4-1. In this case, Z is as given above anc

"0 0 “

B =
0 1

Thus the only information contained in Eq. 8.4-9 is m2° = —(p2°fC22). But, by the
problem definition, mi° = 0. Then Eq. 8.4-9 is unchanged if Z is replaced by

1 0 "

0 £22

Therefore Eqs. 8.4-8 can be used if Z-1 is replaced by Z1 1.f Then since

’5-1 s~2'
3>Cs) =
0 j -1
P°(.y) can be determined to be
0 W11
— 5“ —5
0
<0
1

con x°(0) - p°(0)


<M<M

5 1
.S2 —53
P°(5) =
54 + (wn/^22)

where p°(0) is to be adjusted so that p°(oo) = 0. But s4 + (con/^22) = G(s)G(—s), where

co M CO 11
G(s) =s2 + (2)W I -i-11 s +
22, £ 22,

This shows that P°(j) has two right-half-plane poles and two left half-plane poles,
symmetrically located with respect to the origin. In order to have p°(oo) = 0, the
residue in the right half-plane poles must be zero. A partial fraction expansion of either
Pi\s) or P2°(s) reveals that the requirements on p°(0) for zero residue in each of the
right half-plane poles of P0(.s) are

(4cOn3^22)/4 (W11^22)/2
P°(0) = x°(0)
(con^22)/2 (4con£223)/4
Since
s3 J2
1 —5
COli x°(0) + £2r4 P°(0)
-S3 5 — s2
£22 J
X°(j)
^4 “b (^11/^22)

t Note that Zt is positive definite and hence the resulting linear optimum system is
asymptotically stable, since the other requirements previously given for this are also
satisfied.
An obvious alternative to this procedure is to rederive Eq. 8.4-8, but for the case in
which B is a vector.
564 Introduction to Optimization Theory

x°(t), for the above values of p°(0), is x°(t) = 4>(r)x°(0), where

7T
(2)^ sin ( at + - a-1 sin at
<\>(t) = €~at
7T
■2a sin cct ■(2)^ sin ( at — -

and a = (con/4£22)1A. Similarly,


, . / 7T \ X2°(0) / 77
m2°(t) = — £22_1/>20(0 = — 2<x2e~0lt x1°(0)(2)i^ sin I a?-I-sin I cct 4—
a

Substitution of x°(0) = <J>-1(0X°(0 yields


m2°(t) = —2a[a#1°(7) + £2°(01

This is the control law for the controlled elements of Fig. 8.4-3. As expected, it is a
linear function of the state variables.
As a is increased, the performance index emphasizes the error relative to the cost of
reducing it. From <\>(t) or G(s), it can be seen that the effect is to increase the speed of
response and the natural frequency of the system. The damping ratio, however, remains
constant at 0.707. Increased damping would have been obtained if co22 > 0 had been
chosen in the performance index.

The Time-Varying Case21

If any of the elements of A, B, £2, or Z are time-varying, a time-domain


solution of Eq. 8.4-3 is generally necessary. The solution of Eq. 8.4-3 is
given by its state transition matrix, which is defined by

4>(t, t0) = <J>0, t0)


-SI —At
The state transition matrix has 2n rows and In columns and can be parti¬
tioned into four (n x n) submatrices

4*ll(b A) *1*12(A A)
4*(b *o) = (8.4-10)
_4*2l(A fi)) Ct>22(b fi>)_
Since 4>(f0, t0) = I,
4*n(fi)> fi)) = 4*22( A? A) = I
4*12(20? A) = 4*2i(fi)> A) = [o]
In terms of <J>(7, t0), the solution of Eq. 8.4-3 is

x°(0‘ x°(t0) ^l(b A)


= 4*(b *0) + (8.4-11)
Lp°(0. Lp°(A)J _b2(b fo)_
where
biO, t0) 0
= | 4*(b r) dr (8.4-12)
t„t_ Jt0 L£2(t)x (t)_
Sec. 8.4 Linear Optimization Problems 565

Substitution of cf>(7, t) yields

bi(b to) = 4*i2(b T)^(r)xd(r) dr


to
't
b2(b to) = 4*22(b t)£1{t)x(,(t) Jt
to

For unspecified terminal conditions on x°(t), the terminal boundary


conditions are p°(?/) = 0. Thus

P°(0 = 4>2iO/, t0)x°(r0) + <t>22(0, 4>P°Oo) + b2(t„ t0) = 0

Solving for p°(t0),

p°(t0) = 4*22 1(t/, to)[4>2i(t/? t0)x°(r0) + b2(tf, t0)] (8.4-13)

Substitution of t for t0 yields

P°(0 = -4>22_1(t/,0[4>2l(t/,t)x0(0 + b 2(tf9t)\

Then, from Eq. 8.4-2, the definition of b2(//51), and the fact that

4*22 1(t/, t)4*22(t/, t) = 4>22(t, T)


the control law for unspecified terminal conditions on x°(t) is

m\t) = Z_1Br <i>22 X<f’ 0<t>21 Of, t)X°(t)


'tf
+ 4*22(7, t)£2(t)x (t) dr (8.4-14)

The resulting response x°(t) can be found from Eqs. 8.4-11 and 8.4-13.
For specified terminal conditions on x°(t), i.e., x°(tf) = x(tf), Eq. 8.4-11
gives
x°(tf) = 4>11(t/, t0)x°(t0) + 4>i2(7/, t0)p°(t0) + bi(t/? t0)
Then
P°(70) = -4>i2_1(t/, t0)[4*n(t/, t0)x°(t0) - x°(tf) + b1(r/, t0)] (8.4-15)
Substituting t for t0, using Eq. 8.4-2, the definition of b^, t) and the fact
that
4*12 04*i2(7/, t) == 4*12(7, t)
yields for the control law, in the case of specified terminal conditions on
x°(0,

m°(() = Z”IBrJ<}>12-1((/, f)[<t>ii(f/. t)x°(t) - x°(t,)]

+ JTuO. r)Sl(r)xd(T) drj (8.4-16)

The resulting response x°(t) can be found from Eqs. 8.4-11 and 8.4-15.
566 Introduction to Optimization Theory

The reader should note that these control laws are of the form of Eq.
8.4-6, where
K = <j>22 1{tf, 0<4>2l(^/? 0

for unspecified x°(^), and

K = <J>12 1(tf, 04*11.(//> 0

for specified x°(//). Also, the control law requires knowledge of xd in the
future interval of control, i.e., xd(r) for t < r < tf. This is a general
requirement for optimization according to this performance criterion.
Example 8.4-4. Repeat Example 8.4-3 in the time cjpmain.
Let
" 0 -1 0 0 ■
0 0 0
" A -BZ-1Br“
G = = (-l)
-SI -Ar
COn 0 0 0
_ 0 0 1 0
so that 4>(Y, /0) = eG(<-<oh Use of the Cayley-Hamilton technique gives
1_

CLx(t - t0)
8

t0) 1(X2(i to) "


1
o

£22 1(X3(/ £22

W11C22 la3(/ t0) cc0(t - t0) ^22 la2^ ^0) £22 1<xi(t to)
4>(h t0) =
o)iiCCi(t r0) co11oc2(t /0) cc0(t - to) ft>11^22 1(X3(t ~~ to)
_co11<x2{t t0) wna3(/ io) — <*i(t - t0) oc0(t - r0) -

where
a0(O = cosh at cos cut
sinh cut cos at + cosh cut sin cut
ai(0 =
2a
sinh cut sin cut
a2(0
2a^
cosh cut sin cut — sinh cut cos at
a3(0
4od

and a is as defined in Example 8.4-3. Substitution into Eq. 8.4-14 yields, for infinite tf,
m2 °(t) = —2a[ax1°(r) + ^2°(01

the same result obtained in Example 8.4-3.

Example 8.4-5. Determine the optimum system for the controlled elements character¬
ized by
1 1
x1 = — - xx + - mx

The performance index is

/ = \ f ixi + m i2) dt
Jt0
and xxd{t) = 0. The terminal conditions are unspecified.
Sec. 8.4 Linear Optimization Problems 567

From the problem statement,

" A -BZBr_ 1-1 —t~2~


G =

_
>
-1

1
Then the components (fruit, t0) and (fr21(t, t0) of 4>(r, /0) must satisfy

0ll(/> *0) = ~ (frllit, ^0) j (fruit, ^0)


t t
1
<f>2l(L to) — , t2) + — (f>2l{t, t0)

Solving for (fruit, t0) in the second equation and substituting the result into the first
yields

(fruit, to) ; (fruit, to) = 0


t1

This is a form of Euler’s equation, considered in Example 2.8-2. The change of variable
t = ez gives
021 'V, eZ°) - 021V, <*») - 0.l(€«, e20) = 0

where the primes denotes differentiation with respect to 2. The differential equation
for 021(ez, ez°) has the solution
/ V5 + 1 \ /l — V5
02 ii*z, eZ°) = ki(zo) exp -2 + k2iz0) exp I-2
2 / \ 2
so that
(f>2lit,t0) = C1(^V(V5 + 1)/a + C2(r0)?F-V 5)/2

Similarly,
1 - V5' t(V*-i)/« + e2(;0)^ + V 5j r(Vi+i)/2
(fruit, t0) —

Since 011(7O, f0) = 1 and 02i(/o, *o) = 0, cx(r0) and c2(/0) can be determined to be
1
_ ?(l-V5)/2
Clito) —

V5 °
1
C2it0)=--t^V 5)/2
V5
Then
, . V5 - 1\ /t0V(V5-l)/2 /V5+ l\//0\(V/5+l)/2

wu) = lTr f +(tvt f


and
. , . *o /t0\{Vs-i)n *o /To\-(V5+1)/2
^,r o)--7 v_n

Similarly,
r1 /to\-(V5-l)/2 t~1 /^o\(^5+1)/2

^('•,o) = -v!7 +v17

(fruit, t0) —
568 Introduction to Optimization Theory

From Eq. 8.4-14,

mx°(t) =-aq°(0
t ^22(^5 ^/)
Thus
V5
1 - (///,)
mx°(t) = —2 *1 °(0
_(V5 + 1) + (V5 - 1)(////)VI_

In the infinite interval problem, tf is infinite. Then

mAO = - V(0
V5 + 1
In this case, a time-invariant control law is obtained, even though the controlled
are time-varying.

Example 8.4-6. Determine the optimum system for the controlled elements of Example
8.4-5, if the performance index is

/ = £ j wij2 dt
Jt0

and the terminal condition xx°(tf) = 0 is specified.


For this case,
-t-1 -t~2'

0 t -1

4>(r, t0) is the solution to 4>(7, /0) = G<J>(/, t0). The equations can be integrated by
separation of variables to yield
to 1 1
t t to
<\>(t, to) =
t
0
to
From Eq. 8.4-16,

md(t) = - *i °(0
tf - t

The resulting response, as found from Eqs. 8.4-11 and 8.4-15, is

to(tf — t)
*i°(0 = , ., '
t\tf to)

The response does satisfy the terminal condition xx°(tf) = 0. For t > tf, the response is
not zero, however. This is to be expected, since the performance index does not consider
this part of the response.
Although the time-varying feedback gain becomes infinite at t = tf, mx°(t) is always
finite. In fact, from the expressions for mx°(t) and ^(r).

™i°(0 = ~
tf 10
S<?c. 8.5 Selection of Constant Weighting Factors 569

a constant. For infinite tf, corresponding to an infinite interval problem, w1°(r) = 0.


Thus the system is open-loop, and the response is the open-loop initial condition re¬
sponse xff) = Oo/OVOo). This response satisfies all the requirements in the infinite
interval case and obviously has the smallest possible value for the performance index.
The system would be undesirable, however, owing to its poor performance with respect
to unwanted disturbances.

The controlled elements in the examples of this section are rather


simple, and yet considerable effort is required in some cases to determine
the optimum system. This is true in spite of the fact that the systems are
linear. In practical situations systems are nonlinear, and analytical
determination of optimum systems is virtually impossible. For the most
part, optimization theory is practical only when used in conjunction with
computers, as considered later in this chapter.

8.5 SELECTION OF CONSTANT WEIGHTING FACTORS

The examples of the previous section indicate that the control law and
system response are greatly influenced by the weighting factors £2 and Z
chosen in the performance index. Selection of these weighting factors is a
difficult task, since the relationships between the weighting factors and
the optimum system parameters or the system response are generally
very complex. However, Ellert has developed a technique for the selection
of weighting factors in the time-invariant case.16
Consider, as an example, the second order linear controlled elements
described by
a ii *12 "0 0 “
x = X + m
L*21 *22 _0 b22_

The performance index is the one of Section 8.4, with infinite tf and

a>n 0 "0 cr
£2 = Z =
0 (x>22_ _0 i_

Using the method of Eqs. 8.4-5, the optimum control law is found to be

m2°(t) = —b22[k21xf(t) + k22xf(t)] + b22i\\t) (8.5-1)

where the k's are defined by

OJ22 “b 2tf22A:22 “b 2<312^21 — b22k22 — 0


(x> n + 2tf21k21 + 2 cinkn b22k2f — 0 (8.5-2)

*21^22 “b *22^21 "b *11^21 “b *12^11 b22k22k2X = 0


570 Introduction to Optimization Theory

and is defined by

— Vi = co22x2d + a22v2° + a12v±Q — b222k22v2°


— i)2° = (oiyx^ + a21vxQ + ^n^i° — bxl2k21v2Q

Since this is a linear, time-invariant system, the closed-loop transfer


function can be determined to be

Xi\s) _
(8.5-4)
F(s) s2 + z-lOJqS + co02
where
%co0 = b22 k22 cin -*r a22
a)02 = an(a22 b222k22) + ci12{b22k21 a21)

js/ x h 2
ci12o22
V(s) = -7- ^lO)
o)0

and V^s) is the Laplace transform of the system input.


With these definitions, k22 and k21 can be written as

^22 — : O (Z1W0 + ail + a2z)


b222
(8.5-5)
k21 = —— (co0“ + 0 + a12a21 + fln2)
®2\b22

From Eqs. 8.5-2 and 8.5-5,

con = —l—o [wo4 + + u112(3212 + 2)co02 + 4u1132:1co0


*12 *>22
-j- (2{J1^di2Cl21Cl22 -j- 2$-q ^12^21 + al2 a21 + ^li4)] (8.5-6)

M22 = 7 2 [(2! 2)too #11 #22 2^12^21]


^22

These expressions determine con and co22, once values of z1 and co0 have
been selected.
Ellert’s procedure is to choose z1 to provide the desired relative stability
of the system, assuming that none of the system variables exceed their
prescribed limits. co0 is then chosen in accordance with the system band¬
width requirements or any limits on m2(t). The relationship between m2{t)
and co0 is given by substituting Eq. 8.5-5 into Eq. 8.5-1. It is

m2(0 (— *1(0 + 11 xl(t) + ^2(0


l u12 Lu12 J

+ (— + #2iW(0 + (#11 + 022)^2(0] + ^22^i(0 (8.5-7)


xa12 l )
Sec. 8.5 Selection of Constant Weighting Factors 571

Specification of the maximum available value of m2(t), worst case values


of xx(t) and x2(t), and solution of Eq. 8.5-3 permits Eq. 8.5-7 to be solved
for co0.
When zx and co0 have been determined, the weighting factors con and co22
are given by Eqs. 8.5-6. Since the performance index should be convex,
con and oo22 must be non-negative. This requirement, in a sense, tests the
compatibility of the design requirements, assuming that a quadratic
performance index with constant weighting factors is a reasonable choice.16
For controlled elements of higher order, Eq. 8.5-4 becomes

=-— - N(S)-—- (8.5-8)


K(s) s" + Z^COqS71 + • • * + Zj^COq + co0n
where
N(s) = cn0n
N(s) = 21COo_15 + W0n
or
N(s) = Z2OJq~2S2 + 21COo“1S + co0n
for types one, two, or three systems, respectively, i.e., systems with zero
steady-state error to a unit step input, zero steady-state error to a unit
ramp input, etc., respectively. Ellert’s procedure for selection of the
performance index weighting factors can be applied to these higher order
cases, if the z’s can be determined without undue trial and error. Criteria
for selecting the z’s to obtain acceptable responses have been presented
in the literature. In fact, tabulations of numerical values of the z’s, called
standard forms, can be found.6,10,22 Whiteley’s standard forms for the
characteristic equations are given in Table 8.5-1.22 The corresponding step
responses are shown in Fig. 8.5-1. Since many practical control systems

Table 8.5-1

Maximum
System Percent
Type Standard Forms Overshoot

Zero («) s2 -f \Aoo0s + co02 5


position (b) s3 + 2co0s2 -f 2co02s -f co03 8
error
(c) s4 + 2.6co0s3 + 3.4co0252 + 2.6co03s -f- co04 10
Zero (d) s2 + 2.5co0s + co02 10
velocity (<d s3 -+- 5.1co0s2 + 6.3 (o02s + co03 10
error
(/) s4 + 1.2(jd0s3 + \6co02s2 + 12 oj03s + co04 10
(?) s5 + 9 co0s4 + 29a>02^3 + 38co03s2 + 18co045 + co05 10
(h) sG + 1 lco055 + 43co02s4 + 83co0353 + 73 a>04s2 + 25 cofs + co06 10
Zero O’) s3 + 6.1oj0s2 + 6.7 oofs -+- co03 10
acceleration s4 + 1.9co0s3 + 15co02s2 + 7.9co03s -f co04 20
0)
error
(*) s5 + 18co054 + 69co02s3 + 69co03s2 + 18co045 + co05 20
(/) s6 + 36 co0s5 + 251co025’4 + 485a>03^3 + 251a>04.y2 + 36co055 + co0 6 20
572 Introduction to Optimization Theory

r(t)

r(t)

r(t)
Sec. 8.6 Penalty Functions 573

have transfer functions of the form of Eq. 8.5-8, Whiteley’s standard


forms can often be used, in conjunction with Ellert’s procedure, to
determine the performance index weighting factors which satisfy the
subjective design requirements. Objective design requirements, such as
limits on the control signals or state variables, must be approached in a
different fashion. This is considered in the next section.

8.6 PENALTY FUNCTIONS

The design specifications on most control systems require that some of


the variables be constrained between prescribed limits. Such constraints
may be imposed by saturation-type limits in the controlled elements, or
they may be due to the mission requirements associated with the application
of the system. For example, the maximum allowable stagnation tem¬
perature on the nose of a re-entry vehicle is often given as 3500° Fahrenheit.
The temperature constraint, as stated, is a “hard” constraint, in the sense
that it is a value not to be exceeded. Both “hard” and “soft” constraints
are often stated, but in practice most constraints really are soft. For
example, a temperature of 3600° would probably also be acceptable, since
safety factors are usually included in such figures. No design procedure or
subsequent implementation is precise, nor can any design procedure
consider the uncertainties associated with the ultimate operation of the
system. Although hard constraints are conceptually useful from a mathe¬
matical viewpoint, they normally do not physically exist. Furthermore,
hard constraints cause considerable difficulty in obtaining computer
solutions to optimization problems, and in controller realization. For
these reasons, constraints are treated here by means of penalty functions.
One can approach a hard constraint by making the penalty more severe.
A penalty function is a performance index term which increases the
value of the index when the constrained variable approaches its limit.
For example, the second and fourth terms of Eq. 8.2-2 are penalty
function terms. Many other penalty functions have been proposed in the
literature.23-25
Weighting factors, as contained in the first and third terms of Eq. 8.2-2,
are selected to satisfy the subjective design requirements. This was
discussed in the preceding section. Once the weighting factors are selected,
the penalty functions may be selected to satisfy the constraints. If this is
done, the system response satisfies the relative stability and speed of
response specifications when none of the variables are at their limits, and,
furthermore, the variables are properly constrained when they attempt to
exceed these limits. This is the basis for Ellert’s design philosophy.
574 Introduction to Optimization Theory

The introduction of nonquadratic penalty functions into the performance


index makes the optimization problem a nonlinear one, even if the
controlled elements are assumed linear.! For this reason, computers are
generally required to solve practical optimization problems. Computer
techniques for solving optimization problems are considered next.

8.7 BOUNDARY CONDITION ITERATION

As indicated in Sections 8.3 and 8.4, the solution of Eq. 8.3-12 requires
the solution of a two-point boundary value problem. It is not possible
to solve simultaneously the x and p equations forward in time, unless
correct boundary values for p(/0) are known. A similar problem exists if
one attempts to solve the equations simultaneously backward in time.
One possible approach is to assume a set of values for p(f0), indicated by
p1^). The superscript “1” is used to denote the first choice of p(f0).
These n conditions, in conjunction with the n conditions x°(/0) = x(/0)
are sufficient to solve Eq. 8.3-12 forward in time to determine x\t) and p 1(t).
The computed values of p1(^/) or as the case may be, can then be
compared with the correct values specified by the terminal boundary
conditions. If these two sets of values are identical (this would indeed be
fortunate), pl(t) = p°(f) and x\t) = x°(t), and the problem is solved.
Generally, however, they differ. If the x and p equations are linear,
superposition can be used to directly revise p1^) so that the terminal
boundary conditions are satisfied, and hence immediately yield the
optimum solution. In the more general case, a revised choice for p(/0),
namely p2(r0), must be made and the equations solved again. This process
is repeated until the terminal boundary conditions are satisfied, to an
acceptable degree of accuracy. This technique is called boundary condition
iteration.
More direct methods for performing boundary condition iteration are
suggested by Merriam, Neustadt, Scharmack, and Speyer.25-28 For
example, a hill-climbing problem can be formulated so that the optimum
occurs when pn(tf) = p°(A). Thus the computer techniques for hill¬
climbing problems can be utilized.
There are two significant problems associated with boundary condition
iteration methods in the general case. First, as indicated by Eq. 8.4-8 for

t In this case, the first footnote of Section 8.4 still applies, if the terms “positive semi-
definite” and “positive definite” are replaced by “convex and differentiable” and
“strictly convex and differentiable,” respectively.25 Of course, these terms now apply
to the appropriate terms in the integrand f0 of Eq. 8.3-2, rather than to £2 and Z.
Sec. 8.8 Dynamic Programming 575

the linear case, the differential equations to be solved form an unstable


set. Thus the solutions to these equations are extremely sensitive to errors
resulting from the starting conditions chosen and numerical computational
procedures. This sensitivity generally increases as the optimum is
approached because the system speed of response, and hence the speed of
response of the unstable adjoint equations, increases. Second, at each
value of t in the range t0 < t < tf, the second of Eqs. 8.3-12 must also be
solved to obtain numerical values for the control signals. Since this
equation is often a nonlinear algebraic equation, or even a transcendental
equation, its solution may be difficult.
Partially offsetting the disadvantages of boundary condition iteration is
the requirement of less computer memory than other methods to be
considered. Also, the relative simplicity of the computer program may be
important, for example, in the case of airborne real time optimization.

8.8 DYNAMIC PROGRAMMING29

A second method for solving the two-point boundary value problem is


discrete dynamic programming. In this method, the n state variables and
time are quantized, establishing a grid in the n + 1-dimensional xt space.
This is illustrated for first order controlled elements in Fig. 8.8-1. The
abscissa in this figure represents quantized future time, from the initial
time t0 to the terminal time tf. Real time t advances from t0 to tf. The
ordinate represents quantized values of xv Since this is a discrete problem,
only the values of x1 and time at the intersections of the broken lines are
important. These intersections are called nodes, and the solid lines with
arrows denote the permissible changes in xx from one node to another.

to = koT (ko + 1 )T (ko + 2 )T

Fig. 8.8-1
576 Introduction to Optimization Theory

The discrete form of Eq. 8.3-2 is the performance index


JCf

I* = T 2/0[x(/c), m(/c), k] (8.8-1)


k=ko

The value of /0 in the interval from k = kj to k = kf is determined by


x(kj), the state of the controlled elements at k = kj, and by the control
signal m(fc) for kj < k < kf. Thus the value of the performance index
over the interval kj < k < kf is determined by x{kj) and m(k), and also by
kf — kj. Assuming that kf is a constant, this dependence is upon x(^),
m(/c), and kj and is indicated by

R[x(kj), m(/c), kj] = T 2/0[x(/c), m(k), k]


k=kj

where R[x(kj), m(k), kj] denotes the portion of /* due to kj < k < kf.
Defining a policy to be any rule for making decisions which yields an
allowable sequence of decisions, Bellman’s principle of optimality can be
stated as, “An optimal policy has the property that, whatever the initial
state and initial decision are, the remaining decisions must constitute an
optimal policy with regard to the state resulting from the first decision.”
In essence, this is a statement of the intuitively obvious philosophy that,
if /* is to be minimized by a choice of m(&), the portion of /* denoted by R
must be minimized. Furthermore, this principle states that, if R is
minimized by a selection of the sequence m(k), kj < k < kf, the optimum
sequence m°(k) is a function of the states x{k0). The state of the controlled
elements at k = kj determines m°(k), k}< k < kf. This functional
dependence can be indicated by m°(/c) = gfc[x(/c;)]. Thus the minimum
value of R is a function only of the state of the controlled elements at kj9
and kj itself. Then

R°[x(kj), kj] = min 2 T/0[x(/c), m(/c), k] (8.8-2)


mOc) k=kj

The symbolism before the summation indicates that m(k) is to be chosen


with k in the interval kj < k < kf to minimize the summation.
Obviously, if kj = kf, R°[x(kf), kf] = 0. With this boundary condition
as a starting point, Eq. 8.8-2 can be used for a computation backward in
time from k = kf to k = k0, to determine m°(k) and evaluate the minimum
value of the performance index. This computation gives m°(/) and the
corresponding x°(?) from each node in the xt space to the nodes at tf.
Thus the solution of the two-point boundary value problem is determined
for a number of initial conditions, rather than only one. The original
problem has been “embedded” within a number of similar problems.
Unfortunately, in order to compute — i), kf — /] for each node
at a time tf — iT, it is necessary to have ^[x^ — i + 1), kf — i + 1]
Sec. 8.8 Dynamic Programming 577

temporarily stored in the computer for each node at time tf — (i — 1 )T.


The temporary computer memory required is excessive for nominal grid
sizes. In fact, the order of the controlled elements must be less than three
to avoid saturating the thirty-two thousand word memory of an IBM 704.
Several methods have been proposed to alleviate the excessive memory
requirements of dynamic programming.30,31 One approach uses a coarse
grid for an initial solution. This is followed by a second solution using a
finer grid in the region of xt space where the optimum has been located by
the initial solution. Another method approximates the minimum value
of the performance index at a given instant in time by an orthogonal
series. Thus the series coefficients, rather than R°, are stored. These
methods still have difficulties, however.
Before considering other computer techniques for solving optimization
problems, it is interesting to note that Eq. 8.3-12 can be obtained by means
of dynamic programming. Since i?°[x(fc3-), /cj is a constant for any
Eq. 8.8-2 can be written as
kf
min (t/0[x(/c,), m{k0), /cj + T I /„[x(fc), m(k), k] - R°[x(/c,.), kg = 0
m(k) l k = ki +1
This is the same as

R°[x(/cm), kj+1] - R0[x(/c,), kj\


min /o[x(/c,), kj\ + = 0
m (fc) T
Taking the limit as T approaches zero, the continuous form of Bellman’s
equation is determined as

dl°[x°(r), t]'
/0[x°(r), m0(r), r] + = 0
dr
where t0 < r < tf, and 7° is the minimum value of the performance index.
The total time derivative can be written in terms of partial derivatives,
and, using the first of Eqs. 8.3-1, this expression is

/„[x°, m°, r] + d-f + (grad*° 1°, f) = 0


Or
This first order partial differential equation in the dependent variable 7° is
called the Hamilton-Jacobi equation. Defining pk°(r) = dl°ldxk°, Eq.
8.3-10 permits the Hamilton-Jacobi equation to be written as

+ 77(x°, m°, p°, r) = 0


Or
The solution of this equation by the method of characteristics yields
Eq. 8.3-12, as indicated above.25
578 Introduction to Optimization Theory

8.9 CONTROL SIGNAL ITERATION

The reason the two-point boundary value problem under consideration


cannot generally be separated and solved exactly as two one-point boundary
value problems is that the x° and p° equations are coupled by the expression
for m°. If, however, Pontryagin’s equation for m° is not used, and instead
m is determined by iteration, the equations can be uncoupled. This is the
basis of the methods of Kelley and Bryson.32,33 For a number of linear
and nonlinear problems, this control signal iteration procedure converges
to the optimum control variable m° which satisfies Pontryagin’s equations.
This method is most useful for trajectory problems, where an open-loop
type of operation is used.
The first step in the control signal iteration procedure is to guess a
reasonable value for m1^) and integrate the x equations forward in time
from t0 to tf to determine x1^). The values of m1(t) and xJ(t) are stored in
the computer memory. The corresponding value of the modified per¬
formance index I* of Eq. 8.3-3 is also computed and stored.
Next, the p or adjoint equations are solved backward in time from tf to
t0. The stored values of x1^) generally appear in these equations and are
utilized for the backward computation. During the backward com¬
putation, the effects of changes in mJ(0 are considered, to determine a
new control signal m2(7) for the next forward computation. The effects of
these changes can be determined for the (i + l)th iteration by setting
(3m2 = m?+1 — m\ The corresponding change in the state variables is
(5x2 = x2+1 — x2. These changes cause a change in the modified per¬
formance index of dlj = 72+1 — Ic\ Assuming that Ic is to be minimized,
the iterative procedure is converging if dlj is negative. The procedure is
continued until dICL and/or (3nT are negligible.
At this point, an algorithm for computing (5m2 is required. The
derivation presented here follows that of Merriam.25 From Eq. 8.3-3,

rc+1 = y
J to
'[/‘+1 + <p2+i, r+i - x2+1>] dt
Substitution of .,, . .
ft1 =/.*+ 4/7
pi+1 = p‘ + dp'
f«+i = fi +

xm = x‘ + dx‘
yields

Ip = }{Uo + f - X(>] + dfj + {p\ 8V) + (dp/ df*>


*1 to
- <p/ dx‘> - <dp/ dr) + (dp/ f' - x’)} dt
Sec. 8.9 Control Signal Iteration 579

The first term in the integrand gives Ic\ The last term is zero by the first of
Eqs. 8.3-1, applied to the zth iteration. Thus
'if

Ic =i: + \ [Sfo + <p‘, -5f4> + <V> <5f*) - <p\ dr) - (V, <5**>] dt
•l to

(8.9-1)

If /J+1, P*+1> and V+1 are expanded in Taylor series about their values
in the preceding iteration,

dfo =/o?+1 fo = (gradx’/o*, <5x*') + (gradm*^', dm*) + • • •


dp* = Pm ~ P* = P(x') dx* + P(m') (5m' + • • •
df* = f+1 - V = F(x') dx* + F(m*) dm* + • • •

where F(x*) is the Jacobian matrix of Eq. 7.4-4, for f = f* and x = xi.e.,
the zth iteration. F(nT) is the corresponding matrix with x* replaced by nT.
P(xl) and P(nT) are matrices which correspond to F(x2) and F(m*), with f
replaced by p. Substituting these expansions into Eq. 8.9-1 and neglecting
all terms in (5nT and dx1 above first order yields

n+1 = i: + I tf{(s\ dm*) - <pf, dx*)


ho
+ (gradx*/0\ <5x*> + <p*, F(x*) dx*)} dt
where
s* = gradmtf0* + FT(m*)p* = gradm* H* (8.9-2)

From Eqs. 8.3-11 and 8.3-12,

-p4 = gradx^/o^ + F V)P* (8.9-3)


Then I*+1
c
becomes
'if

n+1 = v + {(s\ dm1 - [(p\ dx*) + <p\ dx*)]} dt


to

The bracketed quantity in the integrand is (d/dt) (p*, dx*). Since (5x?(/0)
must be zero, and since either pl(tf) or dxl{tf) must be zero, the result of
integrating the bracketed quantity is zero. Thus

(5nT) dt (8.9-4)

The algorithm for determining m*+1 is based on requiring dlc to be negative,


or possibly zero. The zero value applies only if s* is zero. One possible
choice for <3nT is given by dmz = — |/?/| sgn s/, where the /?/ are yet to
be determined.
Since Eq. 8.9-4 is only an approximation, (5nT must be restricted in
magnitude to ensure the validity of the approximations made in its
580 Introduction to Optimization Theory

derivation. However, (5m* should be as large as possible for rapid con¬


vergence of the iteration procedure. One possible choice of the /?/ is a
constant, = rj. The computer logic can be chosen to adjust rj to
attempt to maintain a satisfactory compromise between the conflicting
requirements of rapid, and yet guaranteed, convergence. Merriam suggests
additional possible choices for the /?/.25
Control signal interation is somewhat of a compromise between the
extremes of boundary condition iteration and discrete dynamic program¬
ming. Control signal iteration uncouples the equations of the two-point
boundary value problem, so that stable differential equations are solved
both in the forward and the backward time directions, if the controlled
elements are stable. Furthermore, the memory requirements are between
those of boundary condition iteration and discrete dynamic programming.
The most serious drawback of control signal iteration is the problem of
selecting the ft/. A number of unsuccessful forward computations are
often made before suitable values are determined.
r

It should also be indicated that the algorithm for changing the control
signal ignores the fact that the shape of the surface on which the optimum
value of the Hamiltonian function H lies changes as x and p change.
This reduces the rate of convergence and contributes to the difficulty in
determining the

8.10 CONTROL LAW ITERATION25

By a clever development, somewhat similar to that of the preceding


section but considerably more involved, Merriam derives algorithms which
include a first order approximation to the changes in the shape of the
surface on which H° lies. This leads to exceedingly rapid convergence of
the iteration procedure, when x* is in a suitably small neighborhood of the
optimum trajectory. Furthermore, Merriam’s method leads directly to a
feedback controller, whereas the control signal iteration method yields an
open-loop type of operation. The feedback controller is a linear approxi¬
mation to the optimum nonlinear controller, and it is optimum for the
neighborhood described above.
Equation 8.9-1 is also used for developing the control law iteration
method. The Taylor series expansions for /*+1, p*+1, and f*+1 are again
substituted. In this case, however, the second order terms in (5nT and Sxl
are included. Merriam shows that the resulting equation analogous to
Eq. 8.9-4 is /*+1 = I* + v\ where

v* = r[h{dm\ T (5m*) + (dm\ si + R*' dx*) + i<K* <5x\ R* dx*)] (8.10-1)


J to
Sec. 8.10 Control Law Iteration 581

sz is as defined in Eq. 8.9-2, and

T = gradmO(gradmi f0* + gradmO(FT(nT)p*


(8.10-2)
Rl = gradmO(gradxi /0* + gratW'XF^x^p*’ + FT(m*)P(x*)

and Kl is defined by T'K* = R\ The elements of K* are kP = — dmi/dx/,


which are feedback gains in the linear approximation of the nonlinear
optimum control law. With these definitions, Eq. 8.9-3 can be written as

-p* = -(RYu* + gradx^/o* - (K*)T gradmif* + [F(x*) - F(nE)K<] V


(8.10-3)
where u* is defined by TV = — s\ Because second variations are con¬
sidered, the additional relationship

-P(x') = gradxi)(gradxi/0* + gradxi><F3'(xi)pi


+ F^xOPCx*) + Pr(x’)F(x*) - (R’^K* (8.10-4)

subject to the boundary condition P(xl)\t=tf — [0], is required.

If (5nT is chosen so that the approximations leading to the expression


for Ilc+1 are valid, and yet viis negative, the iteration procedure converges
to yield m°. Because of the complexity of the expression for v\ however,
the determination of an iteration algorithm is not obvious. Merriam
realized that the most rapid convergence results if 6nT is such that vl has
its most negative value, i.e., vi is minimized. He further realized that, since
vl is a quadratic form, a second two-point boundary value problem can be
avoided if (3mz is chosen to minimize v\ subject to the constraint of
linearized controlled element equations. These equations are

dxl = F(xz) dxl + F(m?) <5nT, (5x2(r0) = 0 (8.10-5)

The minimization of

vl + (g\ F(x*) Sxl + F(m*) dmi - dx1) dt

where g is analogous to p in the first minimization, introduces the linear


differential equation

-gl = (RZ)V + [F(xz) - F(in*)K*] Tg\ g%) = 0 (8.10-6)

The feedback control law obtained from the second minimization is

dm2 = w* — Kl Sxl (8.10-7)

where w4 is defined by
TzwJ = — sz — FT(ml)g (8.10-8)
582 Introduction to Optimization Theory

In order to restrict the magnitude of <5nT, as required by the approxi¬


mations, Merriam suggests the iteration algorithm
(5nT = evs1 — Kl dxl (8.10-9)
where e is a parameter in the range 0 < e < 1.
Merriam’s control law iteration method possesses two significant
advantages with respect to the method of the previous section. As
indicated previously, Merriam’s method has a faster convergence rate in
the neighborhood of the optimum trajectory. Also, the propagation of
truncation errors is reduced because both the forward equation, Eq.
8.10-5, and the backward equations, Eqs.v 8.10-3, 8.10-4, and 8.10-6,
possess the stability properties of linear optimum control systems in the
vicinity of the optimum.

8.11 DESIGN EXAMPLE

In order to illustrate the ideas of the preceding sections, a simplified


design problem is considered. Ellert’s procedure for determining the
weighting factors and penalty functions, and Merriam’s control law
iteration method are utilized.!
The problem considered is the speed control of a rotary shear shown in
Fig. 8.11-1. The system is used to produce material cut to a specified
length. Because the material feed velocity drifts during operation, this
velocity is sensed along with that of the cutting rollers. These signals are
used in the normal operating mode to maintain the length of the cut ma¬
terial within specified tolerances.
The specific problem considered here is not the general control problem,
but the one of transition from one cut length to another by changing the

f The computer results presented were provided by Dr. F. J. Ellert and were obtained
using a computer program developed at the General Electric Research Laboratory
under the direction of Dr. C. W. Merriam, III.
Sec. 8.11 Design Example 583

52

(b)
Fig. 8.11-2

cutting roller speed. In order to avoid damage to the material being fed
to the shear, the speed transition must be smooth. However, the transition
should be rapid to reduce the amount of material of undesired length
produced during the transition, since such material is waste. The desired
transitional speed versus time is shown in Fig. 8.11-2#. For simplicity,
the initial speed is normalized to unity, and the transition is to zero.
The controlled elements are represented in Fig. 8.11-2b. The variable
y1 is the cutting roller speed. The roller drive motor torque is y2. Thus the
equations for the controlled elements are

x = Ax + Bm
(8.11-1)
y = Cx
where
1
1
o
<N

" 0 " '0 0“


A = B = c=
_ —0.25 — 0.2_ _0 10_
584 Introduction to Optimization Theory

Fig. 8.11-3

Fig. 8.11-4
Sec. 8.11 Design Example 585

The objective of the design is a linear controller which causes y^t) to


follow yi(t), subject to the constraints

\m2(t)\ < 0.2 and \y2{t)\ < 0.2

The system saturates outside these ranges. The initial condition response
of the system should be slightly underdamped, with the peak overshoot
not exceeding 5 percent. The ranges of initial conditions are

l-0<yi(*o)< 1.5 or 0.5 < x&o) < 0.75

and 0 < y2(t0) < 0.2.


After the determination of the state equations for the controlled elements,
as given in Eq. 8.11-1, the next step in the design is to formulate the
performance index. As a first attempt, assume
"10
I = i [(xrf — x, £2(xd — x)) -f (m, Zm>] dt (8.11-2)
Jo

where Fig. 8.11-2 defines x^(t) as

xi(i) = ^1 + cos^j, 0 < t < 10


Also, since x2 = xlt

x2(t) — — — sin — , 0 < t < 10


W 40 10

For purposes of determining the weighting factors in the performance


index, assume that tf is infinite. The resultant closed-loop system is
described by Eq. 8.5-4. Since the system is to be slightly underdamped
with a peak overshoot not exceeding 5 percent, z± = 1.4 is chosen from the
first standard form of Table 8.5-1.
In order to specify co0, Eq. 8.5-3 is solved for tq°(0). Substituting the
result into Eq. 8.5-7, along with Eq. 8.5-6 and the worst case values
aq(0) = 0.75, z2(0) = 0.2, and m2{0) = —0.2, computer solution yields
ojq = 2.2812839, con = 0.06786696, and co22 = 0.02075401. Once these
values are determined, the assumption of infinite tf must be removed and
the optimum system determined for tf = 10.
For the case under consideration, Eq. 8.4-5 yields v2°(t), k21(t), and k22(t)
as shown in Fig. 8.11-3. The corresponding system response is given in
Fig. 8.11-4 for the two extremes of initial conditions. Figure 8.11-5 gives
the corresponding curves for y2{l), and the required control signals are
shown in Fig. 8.11-6. The system meets the required specifications
except for the limitation on y2(t), as evidenced by Fig. 8.11-5.
586 Introduction to Optimization Theory

In order to maintain y2(t) within the desired limits, a penalty function is


introduced into the performance index. Thus the integrand of Eq. 8.11-2
is augmented by the addition of
y2 x2(t)
V2 (0
= ro22
0.2 0.2
where y2 is to be adjusted so that the limitation on y2(t) is satisfied. The
solution for the optimum system is to be determined by Merriam’s control
law iteration method.

A
0.16 -

0.08 ■ni2(0, 3'i(0) = 1.0, j2(0) = 0

0 /.I I 1 I t
1 8 10

-0.08 -/^ m2(t), ji(0) = 1.5, y2(0) = 0.2

/
/

- 0.16

-0.24

Fig. 8.11-6
Sec. 8.11 Design Example 587

From Eq. 8.10-2,


_ _ 0 0
TPZ
0 0
and Rz = 10^
0 1 10 dp>
dx^ dx2
Then, from TZKZ = R\
0 0
K3 = dpi
10^ 10
dx-p dx 2 J
Also, from Eq. 8.9-2,
0
S' =
+ 10/V_
Then, from T'V = — si
0
u =
_ — {mj + 10pj)_

With these expressions, and those for gradx* f0\ gradm* f0\ F(xz), F(nE),
and P(xz), the equations for the computer solution can be determined.
The first forward time expression is obtained by rewriting Eq. 8.10-9 in
the form wz+1 = ew2 + v2 — k21lxi+1 — k22x^+1, where generally y* =
m? + K*x*. In this specific case, v2 = mj + k^xp -f k22ix2i. The y
equations are solved backward in time. The forward time state equations
can be written from Eq. 8.11-1 as
V+1 ,*+i
'2

%*+i* = -0.25x[+1 - 0.2^+1 + 10m2+1


xo
'2

The final forward time equation is given by Eq. 8.11-2 as


‘10
ri-t-1 _ 1 i+1\2 ,t+l\2i
[°hi(xi ~ x\+1)2 + o)22(x2d — xl2+1)2 + (m2+1)“] dt

The backward time expressions for p are obtained from Eq. 8.10-3 as

-pi = -0.25pi - a>u(x^ - xi)


y2-l
^2272 Xc
—pi = Pi ~ 0.2p,' — co22(xi — x2’) + sgn xs
0.2 0.2
Equation 8.10-4 gives the backward time expressions for P(x^i) as

$$-\dot{p}_{11}^i = \omega_{11} - 0.50p_{21}^i - 100(p_{21}^i)^2$$

$$-\dot{p}_{12}^i = -\dot{p}_{21}^i = -0.25p_{22}^i + p_{11}^i - 0.2p_{21}^i - 100p_{21}^ip_{22}^i$$

$$-\dot{p}_{22}^i = \omega_{22} + \frac{\gamma_2(\gamma_2-1)\omega_{22}}{(0.2)^2}\left[\frac{x_2^i}{0.2}\right]^{\gamma_2-2} + 2(p_{12}^i - 0.2p_{22}^i) - 100(p_{22}^i)^2$$



where p_jk = ∂p_j^i/∂x_k. The backward time equations for g are given by Eq. 8.10-6 as

$$-\dot{g}_1^i = -10(m_2^i + 10p_2^i)p_{21}^i + (100p_{21}^i - 0.25)g_2^i$$

$$-\dot{g}_2^i = -10(m_2^i + 10p_2^i)p_{22}^i + g_1^i + (100p_{22}^i - 0.2)g_2^i$$

Equation 8.10-8 yields the final backward time expression

$$v_2^i = -[m_2^i + 10(p_2^i + g_2^i)]$$
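The backward-in-time computations can be organized as in the sketch below. It is only an outline, assuming that the stored forward trajectory x^i(t) is available on a uniform grid with the desired trajectory and weighting factors; it integrates the p equations above backward by Euler's method, and the P and g equations would be swept in the same manner.

import numpy as np

def backward_p(t, x1, x2, x1d, x2d, w11, w22, gamma2, dt):
    # Backward Euler sweep for p1, p2 with p(tf) = 0 assumed (free terminal state).
    n = len(t)
    p1 = np.zeros(n)
    p2 = np.zeros(n)
    for k in range(n - 1, 0, -1):
        # p1' and p2' follow from the two backward time expressions above.
        p1dot = 0.25 * p2[k] + w11 * (x1d[k] - x1[k])
        p2dot = -p1[k] + 0.2 * p2[k] + w22 * (x2d[k] - x2[k]) \
                - (gamma2 * w22 / 0.2) * (x2[k] / 0.2) ** (gamma2 - 1) * np.sign(x2[k])
        p1[k - 1] = p1[k] - dt * p1dot
        p2[k - 1] = p2[k] - dt * p2dot
    return p1, p2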

Fig. 8.11-8

Using the results of the optimization corresponding to x1(0) = 0.75 and x2(0) = 0.2 without the penalty function for the initial computation, the iterations yield the results of Figs. 8.11-7 through 8.11-10. In carrying out the computer solution, γ2 was increased until x2(t) was within the prescribed limits. This required γ2 = 16. γ2 was then fixed at this value, and the iterations continued until the optimum was essentially reached.
Fig. 8.11-10

Inspection of Figs. 8.11-7 through 8.11-10 indicates that the corresponding system satisfies all the design specifications. However, the solid curve of Fig. 8.11-7 does have an undesirable dip. This is due to the fact
that the optimum was based on the initial conditions corresponding to the
dashed curve. In design problems such as this, in which the range of
initial conditions is large, a linear controller is limited with respect to the
performance it can provide. In such cases, the desirability of a nonlinear
controller is indicated. Design methods for such controllers are still under
investigation, however.25

8.12 SINGULAR CONTROL SYSTEMS



In Section 8.3 it was stated that m°(t) must be chosen from a set of bounded functions. The possibility of forcing m°(t) to be bounded by use of penalty functions is indicated in Section 8.6. Presently of somewhat more mathematical than practical interest is the design of a system with a hard constraint on m°(t) without employing penalty functions. For example, if the equations describing the optimization problem, including the zeroth coordinate introduced because of the performance index, are of the form

$$\dot{x}_j = f_j(x_1, x_2, \ldots, x_n) + b_j(m_j)^k, \qquad j = 0, 1, \ldots, n, \quad k \text{ odd}$$


then

$$H = \sum_{j=0}^{n} p_j\left[f_j + b_j(m_j)^k\right], \qquad k \text{ odd}$$

H is a minimum (maximum) if the mj are as large as possible in magnitude and opposite (the same) in sign as the pjbj. In such a case, the optimum control system is of the "bang-bang" type, since the mj° switch back and forth between their limits.
For this "bang-bang" system, grad_m° H° = 0 yields

$$kp_j^{\circ}b_j(m_j^{\circ})^{k-1} = 0, \qquad j = 0, 1, \ldots, n, \quad k \text{ odd} \qquad (8.12\text{-}1)$$

if mj is not a constant. If this expression is satisfied, H° is independent of the mj, and hence H° then yields no information about m°. Such a condition is called singular. In spite of the fact that H° is independent of m°, Pontryagin has shown that Eq. 8.3-12 is a necessary condition for an optimum.17
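To make the singular condition concrete, consider the simplest odd power, k = 1, which is also the situation in the linear problems treated next. Then Eq. 8.12-1 reduces to

$$\frac{\partial H}{\partial m_j} = p_j b_j = 0,$$

and since the only term in H that contains mj is pjbjmj, H° carries no information about mj° whenever pjbj vanishes over an interval of time.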

Linear Time Optimal System

If f0 = 1 in Eq. 8.3-2, I = tf − t0. Thus, if this I is minimized subject to the boundary conditions x°(t0) = x(t0) and x°(tf) = x(tf), the resulting system is time optimal in the sense that the states are changed from x°(t0) to x°(tf) in least time. If the controlled elements are linear,


ẋ = Ax + Bm (8.12-2)
Then
H = ⟨p, Ax + Bm⟩ + 1 (8.12-3)

and Eq. 8.3-11 yields Eq. 8.12-2, and the equation

−ṗ° = ATp° (8.12-4)

which is adjoint to Eq. 8.12-2. Since Eq. 8.12-4 is linear and does not
contain x° or m°, the general form of the solutions can be easily determined.
However, the boundary conditions on p° are unknown.
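The general form referred to here is worth recording explicitly; it follows from the state transition matrix of the adjoint system:

$$\mathbf{p}^{\circ}(t) = e^{-A^{T}(t - t_0)}\,\mathbf{p}^{\circ}(t_0),$$

so that each component of p°(t) is a fixed linear combination of the modes e^{−s_j(t−t0)}, where the s_j are the characteristic values of A; only the constants, set by the unknown p°(t0), remain to be determined.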
Since Eq. 8.3-12 also yields

grad_m° H = 0 = BTp° (8.12-5)

except when m° is constant, it must be that BTp° = 0 when m° is not constant, i.e., at the switching instants for the "bang-bang" controller. Between the switching instants, when m° is a constant, H is minimized if ⟨p°, Bm°⟩ is as negative as possible. This is satisfied if mj = −Mj sgn (BTp°)j, where Mj is the maximum value of |mj|. These conditions determine m°(t), if the complete solution to Eq. 8.12-4 is known. These solutions depend on x°(t0) and x°(tf). Thus it is normally easier to solve Eq. 8.12-4 subject to arbitrary boundary conditions, to determine all the possible m°(t) which satisfy Eqs. 8.12-2, 8.12-4, 8.12-5, and the requirement on ⟨p°, Bm°⟩. Then the solution appropriate for the particular boundary conditions is selected from all the possible m°(t).
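The resulting bang-bang rule is easy to state in code. The sketch below is a generic illustration only (A, B, M, and the initial costate are arbitrary placeholder values, not taken from the text): it propagates the adjoint equation by Euler's method and sets each control component to its limit with sign opposite to the corresponding component of BTp°.

import numpy as np

def bang_bang_control(p, B, M):
    # m_j = -M_j * sgn((B^T p)_j); the sign is arbitrary where (B^T p)_j = 0.
    return -M * np.sign(B.T @ p)

# Placeholder data for illustration only.
A = np.array([[0.0, 1.0], [-1.0, -3.0]])
B = np.array([[0.0], [1.0]])
M = np.array([1.0])
p = np.array([0.5, -0.2])          # assumed costate at some instant

dt, steps = 0.01, 500
for _ in range(steps):
    m = bang_bang_control(p, B, M)
    p = p - dt * (A.T @ p)          # Euler step of  -p' = A^T p,  i.e.  p' = -A^T p
print(m, p)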
Example 8.12-1. Determine the controller which, for arbitrary x(t0), causes x(t) to reach the origin in least time. The controlled elements are those of Example 7.13-1, with the restriction |m2(t)| ≤ M2.
The controlled elements are described by Eq. 8.12-2, where

$$A = \begin{bmatrix} 0 & 1 \\ -b & -a \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

For convenience, the transformation w = S⁻¹x is introduced, in which

$$S^{-1} = \begin{bmatrix} -s_1 & 1 \\ -s_2 & 1 \end{bmatrix}$$

where s1 and s2 are the characteristic values

$$s_{1,2} = -\frac{a}{2} \pm \sqrt{\left(\frac{a}{2}\right)^2 - b}$$

and for simplicity it is assumed that (a/2)² − b is positive. Then Eq. 8.12-2 becomes ẇ = Λw + S⁻¹Bm, where

$$\Lambda = \begin{bmatrix} s_2 & 0 \\ 0 & s_1 \end{bmatrix} \quad\text{and}\quad S^{-1}B = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}$$

Applying Eq. 8.12-4 to this system yields

$$\dot{p}_1 = -s_2p_1$$
$$\dot{p}_2 = -s_1p_2 \qquad (8.12\text{-}6)$$

and Eq. 8.12-5 indicates that (S⁻¹B)Tp = 0 at the switching instants. Thus the switching instants are given by

$$p_1(t) + p_2(t) = 0 \qquad (8.12\text{-}7)$$
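A short calculation shows why this condition can be met at most once. From Eq. 8.12-6,

$$p_1(t) + p_2(t) = p_1(0)e^{-s_2 t} + p_2(0)e^{-s_1 t} = 0 \quad\Longrightarrow\quad e^{(s_1 - s_2)t} = -\frac{p_2(0)}{p_1(0)},$$

which, since s1 ≠ s2, has exactly one solution if p1(0) and p2(0) differ in sign and none if they agree in sign; this is the quadrant argument made graphically below.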

Since Eq. 8.12-6 corresponds to Eq. 7.3-11 except for the minus signs, the p1p2 trajectories are the z1z2 trajectories of Fig. 7.3-4a, except that the direction of the arrowheads must be reversed. Superimposing the line p1 + p2 = 0 on the trajectories of Fig. 7.3-4a reveals that Eq. 8.12-7 is satisfied once for the p1p2 trajectories in the second and fourth quadrants, and not at all for the first and third quadrant trajectories. Thus the optimum controller for this system switches once for some initial values of x, and not at all for some other initial conditions. This is a useful piece of information, which later helps to define the optimum controller. It is a special case of the general result that the time optimal control for an nth order linear system with all real characteristic values has no more than n − 1 switching instants.34 Unfortunately, this is the only information provided by Eq. 8.12-5. The optimum controller must be determined from this information and geometrical considerations involving the x1x2 trajectories.
For the case in which (a/2)² − b and a are both positive, the x1x2 trajectories have the form of those for the stable node illustrated in Fig. 7.3-8. They are displaced horizontally, however, since the optimum m2 is not zero, as assumed in Fig. 7.3-8, but

$$m_2 = -M_2\operatorname{sgn}\left[(S^{-1}B)^T\mathbf{p}\right]_2 = -M_2\operatorname{sgn}(p_1 + p_2)$$

This moves the singular point. For p1 + p2 positive, the displacement is M2/b to the left. For p1 + p2 negative, the displacement is M2/b to the right. These cases are illustrated in Fig. 8.12-1.

Fig. 8.12-1

Since the two trajectories labeled "switching curve" are the only trajectories which pass through x = 0, it is apparent that the optimum x(t) must reach x = 0 by one of these two trajectories. If the initial point is on one of these two trajectories, no switches of m° are required. The system follows the trajectory to x = 0.
If the initial conditions do not correspond to one of the switching curves in the segment before the particular curve intersects the origin, then the motion of the system must be to follow one of the trajectories upon which the initial point lies. There are two of these trajectories, corresponding to p1 + p2 positive or negative. If the motion is to reach the origin in one switch, however, the motion must follow the trajectory which intersects a switching curve. Only one of the two trajectories passing through the initial point does this, as can be observed by superimposing Figs. 8.12-1a and 8.12-1b. This dictates the proper sign of p1 + p2 until the motion reaches the switching curve. At this instant, m2°(t) must switch sign, and the motion then follows the switching curve to the origin. This is the only manner in which the origin can be reached in one switch, corresponding to the requirement discussed above. From these considerations, the optimum trajectories are concluded to be those of Fig. 8.12-2, determined by combining the appropriate portions of Figs. 8.12-1a and 8.12-1b. For the upper switching curve, and for all points to the right of the switching curves, the optimum m2(t) is m2°(t) = −M2. For the lower switching curve, and for all points to the left of the switching curves, m2°(t) = M2. If m2°(t) is as specified by these two equations, x(t) reaches the origin in least time.
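The displaced trajectory families of Fig. 8.12-1 are easy to generate numerically. The sketch below is illustrative only (the values of a, b, and M2 are arbitrary choices satisfying (a/2)² − b > 0, a > 0); it integrates ẋ = Ax + Bm with m2 held at −M2 and then at +M2, which shifts the stable node to x1 = −M2/b and x1 = +M2/b respectively.

import numpy as np

a, b, M2 = 3.0, 1.0, 1.0            # arbitrary values with (a/2)**2 - b > 0, a > 0
A = np.array([[0.0, 1.0], [-b, -a]])

def trajectory(x0, m2, tf=10.0, dt=0.01):
    # Integrate x' = A x + [0, m2]^T from x0 by Euler's method.
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(int(tf / dt)):
        x = x + dt * (A @ x + np.array([0.0, m2]))
        path.append(x.copy())
    return np.array(path)

for m2 in (-M2, +M2):
    p = trajectory([2.0, 2.0], m2)
    print(f"m2 = {m2:+.0f}: final state ~ {p[-1]}  (singular point at x1 = {m2/b:+.1f})")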

If the controller corresponding to Fig. 8.12-2 is to be implemented, expressions for the switching curves are required. Since the trajectories are merely the curves corresponding to Eqs. 7.3-12 and 7.3-10, but displaced, the switching curves are described by

$$(-s_2x_1 + x_2) = c(-s_1x_1 + x_2)^{s_1/s_2}$$

if x1 is replaced by x1 + M2/b for m2°(t) = −M2 and by x1 − M2/b for m2°(t) = M2, and c is then determined so that x1 = x2 = 0 is a solution of the equation. The resulting expression, for m2°(t) = −M2, is the upper switching curve of Fig. 8.12-2; the corresponding expression for m2°(t) = M2 describes the lower switching curve. These expressions, and measured values of x1(t) and x2(t), could be utilized by a special purpose "computer" to generate m2°(t).
It is interesting to compare the controller determined in this example with the one determined in Example 7.13-1, since the controlled elements are the same. Since the switching function of Example 7.13-1 is restricted to be a linear combination of the state variables, the switching curve is the straight line x2 = −ax1/(1 + b). This switching line is shown in Fig. 8.12-2, for a = 3^{1/2} + 3^{-1/2}, b = 1. It results in more than one switch to reach the origin, and correspondingly the motion takes a longer time. Typically, the motion would "chatter" along the switching line to reach the origin.

Example 8.12-2. The controlled elements for the single-axis attitude control of a space vehicle are approximated by the representation of Fig. 8.12-3. Determine the optimum controller to remove initial condition errors according to a minimum of the performance index

$$I = \int_{t_0}^{t_f}\left[m_2x_2\operatorname{sgn}(m_2x_2) + c\right]dt, \qquad c = \text{constant}$$

subject to the constraint |m2| ≤ M2. The first term in the performance index is the energy required for control, assuming an ideal controller. Minimization of the energy is often of importance in systems of this type. The second term corresponds to the minimal time criterion of Example 8.12-1. The reason for such a term in the performance index of an attitude system is less obvious. However, such a term, combined with the terminal requirements x(tf) = 0, is a useful approximation to other indices for which the optimum solution may be more difficult to determine.35

Fig. 8.12-3  Controlled elements: torque m2 = ẋ2, rate x2 = ẋ1, angle x1.

Fig. 8.12-4

For this system, H = p1x2 + p2m2 + c + m2x2 sgn (m2x2), and

$$-\dot{p}_1 = 0$$
$$-\dot{p}_2 = p_1 + m_2\operatorname{sgn}(m_2x_2)$$

H is minimized by

$$m_2 = \begin{cases} 0 & \text{if } |x_2| > |p_2| \\ -M_2\operatorname{sgn}(p_2) & \text{if } |x_2| < |p_2| \end{cases}$$

and m2 switches when |x2| = |p2|. This problem differs from Example 8.12-1 in that a region of state space is introduced for which m2 = 0. This is a result of the energy term in the performance index.
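This three-valued law translates directly into code. The fragment below is only an illustration of the rule just derived (M2 is an arbitrary placeholder value; x2 and p2 would come from the trajectory and costate solutions):

def m2_optimal(x2, p2, M2=1.0):
    # m2 = 0 when |x2| > |p2|; otherwise full effort opposing the sign of p2.
    if abs(x2) > abs(p2):
        return 0.0
    return -M2 * (1.0 if p2 > 0 else -1.0 if p2 < 0 else 0.0)

print(m2_optimal(0.5, 0.2), m2_optimal(0.1, 0.3), m2_optimal(0.1, -0.3))
# -> 0.0 -1.0 1.0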
Since dx2/dx1 = m2/x2, the trajectories are given by

$$x_2 = \text{constant}, \qquad m_2 = 0$$

$$\frac{x_2^2}{2} + [M_2\operatorname{sgn}(p_2)]\,x_1 = \text{constant}, \qquad m_2 \ne 0$$

They appear as in Fig. 8.12-4. Since only two of the trajectories pass through the origin,
they must comprise an optimum switching curve, as shown in Fig. 8.12-5a. However,

there must be a second switching curve, since the p2x2 trajectories are vertical lines for m2 = 0, indicating two values of p2 for which |p2| = |x2|. These trajectories are illustrated in Fig. 8.12-5b, and they also show that the change in p2 between the two switches is equal to twice the value of x2 at which the switches take place. This information, combined with another of Pontryagin's equations, determines the other switching curve.
Pontryagin has shown that H is a positive constant or zero along an optimum trajectory, if H does not contain t explicitly.17 Furthermore, if tf is not fixed, as it is not in this case, H° = 0 at t = tf. If this were not true, a further minimization of I could be obtained by increasing tf. Thus H° = 0 for all t0 ≤ t ≤ tf. In particular, in the region where m2° = 0,
$$H^{\circ} = 0 = p_1^{\circ}x_{2s} + c$$

where x2s is the value of x2° when m2° = 0, i.e., when the switches occur. But, if m2° = 0,

$$\frac{dp_2^{\circ}}{dx_1^{\circ}} = \frac{-p_1^{\circ}}{x_{2s}} = \frac{c}{x_{2s}^2} = \text{constant}$$

Fig. 8.12-5

Then the change in x1 between switches is

$$\Delta x_1 = \frac{\Delta p_2^{\circ}}{dp_2^{\circ}/dx_1^{\circ}} = \frac{2x_{2s}}{c/x_{2s}^2} = \frac{2x_{2s}^3}{c}$$

The latter relationship results from the fact that the change in p2° between the two switches is 2x2s. Since the previously determined switching curve is given by

$$\frac{x_2^2}{2} + M_2x_1 = 0, \qquad x_2 > 0$$

$$\frac{x_2^2}{2} - M_2x_1 = 0, \qquad x_2 < 0$$
the other switching curve is

$$\frac{x_2^2}{2} + M_2\left(x_1 + \frac{2x_2^3}{c}\right) = 0, \qquad x_2 > 0, \quad x_1 < 0$$

$$\frac{x_2^2}{2} - M_2\left(x_1 + \frac{2x_2^3}{c}\right) = 0, \qquad x_2 < 0, \quad x_1 > 0$$
The two switching curves and typical trajectories are shown in Fig. 8.12-6.
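A sketch of how the two curves could be tabulated for plotting or for a feedback test is given below. It is not a verified implementation of the example; it simply encodes the curve expressions above (with M2 and c as arbitrary placeholder values), using the fact that the second curve is the first one shifted by −2x2³/c in x1.

import numpy as np

M2, c = 1.0, 4.0       # arbitrary placeholder values

def x1_final_curve(x2):
    # Curve through the origin:  x2**2/2 + M2*sgn(x2)*x1 = 0.
    return -np.sign(x2) * x2**2 / (2.0 * M2)

def x1_second_curve(x2):
    # Second switching curve: the final curve shifted by -2*x2**3/c in x1.
    return x1_final_curve(x2) - 2.0 * x2**3 / c

for x2 in (-1.0, -0.5, 0.5, 1.0):
    print(f"x2 = {x2:+.1f}:  final curve x1 = {x1_final_curve(x2):+.3f},"
          f"  second curve x1 = {x1_second_curve(x2):+.3f}")

Note that as c grows the two curves coalesce, and as c shrinks the second curve hugs the x1 axis, which is the limiting behavior described in the next paragraph.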
It is interesting to note that, as c approaches infinity, the two switching curves coalesce
and the region for which m2° is zero disappears. This is the time optimal controller for
this case, since the energy term in I is negligible compared to the infinite weighting on
minimum tf. (See also Problem 7.14.) As c approaches zero, the latter switching curve
becomes the horizontal axis. The system is turned off. This is obviously the way to
save energy, if one does not care how long it takes to reach the origin.

Fig. 8.12-6

The examples of this section have illustrated the singular nature of


minimal time optimization problems. Minimal time problems have
received extensive coverage in the literature.36-39 However, they do
suffer from the drawback that the controller gain becomes infinite at the
terminal time. This is evident in these examples from the fact that the
switching curves pass through the origin. A similar result was observed
in Example 8.4-6. When precise terminal requirements are imposed on
the state variables, the corresponding optimum controller feedback gains
become infinite at the terminal time. Because of physical limitations, this
cannot be satisfied. Thus an actual feedback system is not able precisely to
satisfy terminal requirements in the presence of disturbances. For this
reason, some workers use penalty functions to approximate terminal
requirements, rather than using precise terminal requirements.
A second difficulty with singular problems is indicated by the fact that
geometrical considerations were utilized to determine the switching curves.
In the case of controlled elements above second order, this can be extremely
difficult. For this reason, the possibility of designing special purpose
computers to solve singular problems has been considered in the
literature.40,41

REFERENCES

1. N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series,


John Wiley and Sons, New York, 1949; also NDRC Report, 1942.
2. R. C. Booton, Jr., “An Optimization Theory for Time-Varying Linear Systems with
Nonstationary Statistical Inputs,” Proc. I.R.E., August 1952, pp. 977-981.
3. G. C. Newton, Jr., L. A. Gould, and J. F. Kaiser, Analytical Design of Linear Feed¬
back Controls, John Wiley and Sons, New York, 1957.
4. F. J. Ellert and C. W. Merriam, III, “Synthesis of Feedback Controls Using Opti¬
mization Theory—An Example,” Trans IEEE AC-8, No. 2, April 1963, pp. 89-103.
5. F. J. Ellert and C. W. Merriam, III, A Longitudinal Guidance System for Aircraft
Landing during Flare-out, Proceedings of the Second International Congress of
IFAC, Butterworth and Co., Ltd., London, England, 1963.
6. D. Graham and R. C. Lathrop, “The Synthesis of ‘Optimum’ Transient Response:
Criteria and Standard Forms,” Trans. AIEE, Vol. 72, pt. 2, November 1953, pp.
278-288.
7. W. C. Schultz, and V. C. Rideout, “The Selection and Use of Servo Performance
Criteria,” Trans. AIEE, Vol. 76, Pt. 2, 1957, pp. 383-388.
8. W. C. Schultz and V. C. Rideout, “Control System Performance Measures: Past,
Present, and Future,” Trans. IRE AC-6, No. 1, February 1961, pp. 22-35.
9. J. E. Gibson et al., “Specification and Data Presentation in Linear Control
Systems—Part Two,” AFMDC-TR-61-5, School of Electrical Engineering, Purdue
University, Lafayette, Indiana, May 1961, pp. 46-115.
10. J. Wolkovitch et al., “Performance Criteria for Linear Constant-Coefficient
Systems with Deterministic Inputs,” Tech. Rpt. No. ASD-TR-61-501, Aeronautical
Systems Division, Wright-Patterson Air Force Base, Ohio, February 1962.
References 599

11. R. Magdaleno and J. Wolkovitch, “Performance Criteria for Linear Constant-


Coefficient Systems with Random Inputs,” Tech. Rpt. No. ASD-TDR-62-470,
Aeronautical Systems Division, Wright-Patterson Air Force Base, Ohio, January
1963.
12. M. A. Aizerman, Lectures on the Theory of Automatic Control (Russian) second
edition, Gostekizdat, 1958, pp. 302-320.
13. Z. V. Rekasius, “A General Performance Index for Analytical Design of Control
Systems,” Trans IRE, AC-6, No. 2, May 1961, pp. 217-222.
14. C. W. Merriam, III, “Synthesis of Adaptive Controls,” Sc.D. Thesis, Massachusetts
Institute of Technology, May 1958.
15. R. E. Kalman and R. W. Koepcke, “Optimal Synthesis of Linear Sampling Control
Systems Using Generalized Performance Indices,” Trans. ASME, Vol. 80, Nov¬
ember 1958, pp. 1820-1826.
16. F. J. Ellert, “Indices for Control System Design Using Optimization Theory,”
Doctoral Thesis, Dept. of Electrical Engineering, Rensselaer Polytechnic Institute,
Troy, N.Y., 1963.
17. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko,
The Mathematical Theory of Optimal Processes, Interscience Division, John Wiley
and Sons, New York, 1962.
18. L. E. Elsgolc, Calculus of Variations, Addison-Wesley Publishing Co. Reading,
Mass., 1962.
19. I. M. Gelfand and S. V. Fomin, The Calculus of Variations, Prentice-Hall, Inc.,
Englewood Cliffs, N.J., 1963.
20. W. W. Seifert and C. W. Steeg, Jr., Control System Engineering, McGraw-Hill
Book Co., New York, 1960, pp. 60-61.
21. G. L. Collina, and P. Dorato, “Application of Pontryagin’s Maximum Principle:
Linear Control Systems,” Research Report No. PIBMRI-1015-62, Polytechnic
Institute of Brooklyn, June 1962.
22. A. L. Whiteley, “Theory of Servo Systems, with Particular Reference to Stabili-
zation,” IEE Proc., Vol. 93, August 1946, pp. 353-372.
23. W. Kipiniak, Dynamic Optimization and Control, a Variational Approach, John
Wiley and Sons, New York, 1961.
24. H. J. Kelley, “Method of Gradients,” Chapter 6 of Optimization Techniques,
edited by G. Leitmann, Academic Press, New York, 1961.
25. C. W. Merriam, III, Optimization Theory and the Design of Feedback Control
Systems, McGraw-Hill Book Co., New York, 1964.
26. L. W. Neustadt, “A Synthesis Method for Optimal Controls,” Proc. Optimum
System Synthesis Conference, Tech. Documentary Rept. No. ASD-TDR-63-119,
Aeronautical Systems Division, Wright-Patterson Air Force Base, Ohio, February
1963, pp. 374-381.
27. D. K. Scharmack, “The Equivalent Minimization Problem and the Newton-
Raphson Optimization Method,” Proc. Optimum System Synthesis Conference,
Tech. Documentary Rept. No. ASD-TDR-63-119, Aeronautical Systems Division,
Wright-Patterson Air Force Base, Ohio, February 1963, pp. 119-158.
28. J. L. Speyer, “Optimization and Control Using Perturbation Theory to Find
Neighboring Optimum Paths,” Symposium on Multivariable System Theory, SIAM
Meeting, Cambridge, Mass., November 1-3, 1962.
29. R. E. Bellman, Dynamic Programming, Princeton University Press, Princeton, N.J.,
1957.
30. R. Bellman and R. Kalaba, “Reduction of Dimensionality, Dynamic Programming,
and Control Processes,” Rand Corp. Rept. R-1964, June 1960.
600 Introduction to Optimization Theory

31. M. Aoki, “Dynamic Programming and Numerical Experimentation as Applied to


Adaptive Control,” Doctoral Thesis, U.C.L.A., November 1959.
32. H. J. Kelley, “Gradient Theory of Optimal Flight Paths,” J. Am. Rocket Soc., Vol.
30, 1960, pp. 947-953.
33. A. E. Bryson, and W. F. Denham, “A Steepest Ascent Method for Solving Optimum
Programming Problems,” J. Appl. Mech., June 1962, pp. 247-257.
34. Pontryagin, Boltyanskii, Gamkrelidze, and Mishchenko, op. cit., p. 120.
35. E. W. Owen, “Optimum Reaction Wheel Attitude Control,” Doctoral Thesis,
Dept. of Electrical Engineering, Rensselaer Polytechnic Institute, Troy, N.Y., 1963.
36. C. A. Desoer, “The Bang-Bang Control Problem Treated by Variational
Techniques,” Information and Control, Vol. 2, December 1959, pp. 333-348.
37. J. Wing and C. A. Desoer, “The Multiple-Input Minimal Time Regulator Problem
(General Theory),” Trans. IEEE, Vol. AC-8, No. 2, April 1963, pp. 125-136.
38. J. P. LaSalle, “The Time Optimal Control Problem,” Contributions to the Theory of
Nonlinear Oscillations, Vol. 5, Princeton University Press, Princeton, N.J., 1959.
39. E. B. Lee, “Mathematical Aspects of the Synthesis of Linear Minimum Response-
Time Controllers,” IRE Trans., Vol. AC-5, September 1960, pp. 283-289.
40. Pontryagin, Boltyanskii, Gamkrelidze, and Mishchenko, op. cit., pp. 172-181.
41. E. G. Gilbert, “Hybrid Computer Solution of Time-Optimal Control Problems,”
Proc. Spring Joint Computer Conference, 1963.

Problems

8.1 Show that optimization according to the performance index

$$I = \int_{t_0}^{t_f} f_0(\mathbf{x}, \mathbf{m}, \tau)\,d\tau$$
is equivalent to optimization according to the criterion of Eq. 8.3-13.


8.2 Derive Eq. 8.4-8 by transforming the time-invariant case of Eq. 8.4-11.
8.3 Determine the optimum system corresponding to Example 8.4-3, if

a =

8.4 Determine the optimum system corresponding to Example 8.4-5, subject to


the terminal condition = 0. Assume that tf is finite. What happens
as tf becomes infinite?
8.5 The controlled elements of a positional control system for a small radar
dish are characterized by the transfer function

$$\frac{Y_1(s)}{M_2(s)} = \frac{K}{s(Ts + 1)}$$

y1(t) is to be controlled in an optimum fashion according to the performance criterion of a minimum value of

$$I = \int_0^{\infty}\left\{[y_{1d}(\tau) - y_1(\tau)]^2 + k_{22}^2m_2^2(\tau)\right\}d\tau$$

(a) The system is to be designed for optimum slewing operation. For this case, y1d(t) is taken as a step signal of amplitude a. Assume K = 0.2, K²k22² = 2, and T = 1/√5. Determine m2°(t) and y1°(t).

(b) Repeat part (a) for the case in which |m2(t)| ≤ M, where

$$M = \frac{8}{1 + \sqrt{5}} - \frac{2}{2 + \sqrt{5}} = 2$$

Compare m2°(t) and y1°(t) with those of part (a). Discuss the effect of
decreasing K on these responses.
(c) For optimum tracking operation, y1d(t) is taken as a unit ramp.
Repeat parts (a) and (b).
8.6 Derive Eqs. 8.5-1, 8.5-2, and 8.5-3.
8.7 Derive Eqs. 8.10-6 and 8.10-7.
8.8 Repeat Example 8.12-1 for (a/2)² − b < 0, a > 0.
8.9 Repeat Example 8.12-1 for (a/2)² − b > 0, a < 0.
8.10 Sketch the curves of Fig. 8.12-6, taking into account the fact that for a space vehicle
$$x_1(t_f) = 2\pi k, \quad k = 0, \pm 1, \pm 2, \ldots, \qquad x_2(t_f) = 0$$
are all acceptable terminal conditions.
8.11 Derive the discrete form of Pontryagin’s equations.
Index

Absolute stability, 513-517 Booton, R. C., Jr., 547


Adjoint operator, 74, 371-372 Boundary condition iteration, 574-575
Adjoint system, continuous, 369-372, 377, Branch cut, 140-141
379-381, 558, 561 Branch point, 140
discrete, 438-441 Bryson, A. E., 578
modified, 388-394, 451
Admissible control signal, 556 Casorati’s determinant, 83
Admissible input, 326-327 Cauchy-Lipschitz theorem, 473
Aizerman, M. A., 517 Cauchy-Riemann conditions, 126
Analytic continuation, 130, 159 Cauchy’s theorem, 128-129
Analytic function, 126-127 Cayley-Hamilton technique, 281-286
Anticipative system, 3 Cayley-Hamilton theorem, 275-276
Antidifference operator, 78 Center, 471, 474, 480
Asymptotic stability, definition, 504 Characteristic equation, continuous sys¬
in the large, 511-513 tems, 54, 146
of optimum systems, 557, 574 discrete systems, 85, 180
theorem, 507-511 of a matrix, 233-235
Autocorrelation matrix, 383-384 Characteristic exponents, 374
Autonomous system, 482 Characteristic function, 155
Characteristic value, 232-241
Barbasin, E. A., 509 Characteristic vector, 232-242
Basis, 219, 242, 429 Circle of convergence, 129
reciprocal, 222, 242, 346, 429 Cofactor, 204-207
Bellman, R. E., 548, 551, 576, 577 Commutativity condition, 363
Bendixson’s negative criterion, 490- Complementary solution, continuous sys¬
491 tems, 53, 55, 67-68, 145
Bessel’s equation, 147 discrete systems, 83-86, 92-93, 180
Bessel’s inequality, 293 Complex convolution, 114, 142-143
Bilinear form, 262-263 Complex variables, 125-134
Bilinear transformation, 181 Conjugate point, 555
Bdcher’s formula, 234 Consistency conditions, 326-328

603

Continuous fixed systems, vector solution, Discriminant, 267


360-362 Distribution function, 15
Continuous signal, 6 Doublet, 18
Continuous time-varying systems, vector Dyadic product, 215
solution, 375-376 Dynamic programming, 575-577
Contour integral, 127-129
Controllability, continuous systems, 349- Eigenfunction, 157
354 Eigenvalue, 232
discrete systems, 431-432 Eigenvector, 232
Control law, 547-548, 558 Elementary function, 8-1 1, 35-37, 61,
Control law iteration, 580-582 153-158
Control signal iteration, 578-580 Ellert, F.M., 548, 550, 569-573, 582
Convolution integral, 25-32, 124-125,376 Equilibrium point, 471-488
table, 25 Equilibrium state, binary sequential net¬
Convolution summation, 37-39, 43, 179— works, 418
180, 442 Essential singularity, 132
table, 38 Euler equations, 555, 557
Cramer’s rule, 223-225 Euler’s equation, 69, 74, 567
Euler transform, 156
Definite forms, 266-270, 504-505 Eventual stability, 529-531
Delay, unit, 407-408 Exponential order, 112
Delta function, 35 Extremum, 551
Delta response, 39-40, 43, 93-94, 177
Desoer, C. A., 326 Factorial polynomial, 77, 79
Determinants, 202-211 Fibonacci numbers, 466
cofactor, 204 Finite equilibrium states, 418
derivative, 210-211 Fixed system, 4
Laplace expansion, 205-207 Floquet theory, 70
minor, 204 Focus, 474-476
order, 202 Forcing function, 48
pivotal condensation, 207-210 Fourier integral, 101, 105-109
product, 210 Fourier series, 101-105
properties, 203-204 Fourier transform, 105-109, 118, 155
Deterministic system, 3, 326 Frequency spectrum, 63, 103-104
Difference equation, 75, 79-96 Function, 14-15
homogeneous, 82-92 Functional, 552
time-varying, 94-96 Function space, 290-295
vector solution, 441-442 basis, 292
Difference operator, 75-79 completeness relation, 293
Differential equation, 47-75 norm,291
arbitrary constants, 63-65 scalar product, 291
first order, 53-54, 376-377 Fundamental matrix, 356
homogeneous, 52-61
homogeneous initial conditions, 65-66 Gamma function, 142
impulse response, 71-75 Geometric series, 443
simultaneous, 49-51 Gilbert, E. G., 353, 355
time-varying, 68-75, 375-379 Gradient, 289
vector solution, 360-362, 375-376 Gramiam, 217-218, 460
Direction cosines, 251 Gram-Schmidt procedure, 221-222, 459-
Discrete signal, 6, 35-37 460

Gravity gradient, 542 Laplace transform, two-sided, 109-111, 155


Green’s function, 73-74 LaSalle, J. P„ 513, 518, 529
Group property, 369 Latent root, 232
Latent vector, 232
Hamiltonian, 541, 554, 556 Laurent’s series, 131-132
Hamilton-Jacobi equation, 577 Least squares, 292, 457-461
Hamilton’s equations, 554-556 Lebesque condition, 291
Hamming code, 304 Lefschetz, S., 513, 518
Hankel transform, 149, 156 Legendre condition, 555
Hermitian form, table, 268 Letov, A. M., 513
Hidden oscillations, 432 Lienard’s equation, 542
Hilbert transform, 156 Limit cycle, 488-498
Hill’s equation, 70 Linear binary sequential networks, 417-
Huffman, D. A., 413 419
Linear codes, 304-305
Impedance, 62, 121 Linear dependence, 52
Impulse function, 13-18 Linear equations, solution, 223-232
Impulse response, 19-20, 23-25, 32, 34, 43, Linear independence, 52, 216-218
71-74, 120-121, 123 Linearization, 509-510
Impulse response matrix, 381-394 Linear system, 3
RLC circuit, 381-383 periodic coefficients, 372-375
simulation, 383-394 Linear vector space, 218-221
Impulse train, 37 dimension, 219
Index, 489-490 projection, 299, 459
Input, 1 Lipschitz function, 473
Integrating factor, 53, 376-377 Lur’e, A. I., 513
Inverse state transition matrix, 369-371 Lyapunov function, 506
Inversion, 203 determination, 524-527
Irreducible polynomial, 419 Lyapunov’s asymptotic stability theorem,
Isocline, 485 507-511
Lyapunov’s second method, 498-501,504-
Jacobian matrix, 482 513
Jacobi condition, 555 discrete systems, 531-534
Jordan canonical form, 255-262, 340-343 nonautonomous systems, 527-529
Lyapunov’s stability theorem, 506-507
Kelley, H. J., 578
Kinariwala, B. K., 365 Maclaurin’s series, 129
Kronecker delta, 207 Mathieu’s equation, 70
Matrix, 193
Lagrange multipliers, 552-553 adjoint, 211
Lagrange’s theorem, 541 calculus, 200-201, 213, 286-290
Lagrange technique, 265 canonical form, 247, 252
Laguerre polynomials, 294 conformability, 197
Laplace transform, 109-125, 134-153 definitions, 194-195,214
convergence, 109-111, 134-135, 138 degeneracy, 216-217, 226
inversion integral, 134-143 diagonalization, 240-241,252-262
one-sided, 110-111 elementary operations, 195-201, 245-
properties, 112-114 246
tables, 111, 114 equality, 196
time-varying, 146-147 equivalent, 246

Matrix, full degeneracy, 237 Operator, 3, 49


function, 276-286 Optimization theory, 546-598
index, 252 advantages, 546-547
infinite series, 272-274 boundary condition iteration, 574-575
inverse, 211-214 control law iteration, 580-582
Jordan canonical form, 255-262, 340- control signal iteration, 578-580
343 design example, 582-590
minor, 204 difficulties, 547
modal, 235-239, 285, 341 dynamic programming, 575-577
normal form, 246, 247-249 linear systems, 556-569
partitioned, 199-200 necessary conditions, 551-555, 557, 574
polynomials, 272 penalty functions, 573-574
positive definite, 266-270 singular control systems, 590-598
powers, 270 time optimal systems, 590-598
rank, 216-217, 226 weighting factor selection, 569-573
signature, 252 Optimum configuration, 558
singular, 216 Ordinary point, 473
symmetric, 239-240 Orthogonal functions, 292
trace, 234 Orthogonal projection, 299, 459
trigonometric identities, 274 Output, 1
Matrizant, 363-365
Maximal period networks, 419 Partial fraction expansion, 116-118
Mean square error, 102, 293, 458 Particular solution, continuous systems,
Mellin transform, 149, 156 53, 55-67, 70, 145
Meromorphic function, 136 discrete systems, 83, 92-93, 96, 180
Merriam, C. W., Ill, 548, 558, 560, 574, Peano-Baker method, 364
578, 580-582, 586 Penalty functions, 550, 573-574
Minor, 204 Performance index, 548-551
Mode, 56, 235, 345-349, 429-431 Periodic function, 101
Mode interpretation, continuous systems, Periodic systems, 372-375
344-349 Phase plane, 470-472
discrete systems, 428-431 Pivotal condensation, 207-210, 298, 300-
Modified adjoint system, 388-394 301
Modified Z transform, 182-185 Poincare-Bendixson theorem, 491-492
Modulo 4 counter, 414-415 Poincare’s index, 489-490
Modulo p networks, 417 Poincare’s successor function, 492-498
Moore, E. F., 413 Pole, 132, 143, 174
Multivariable system, 313 Pole shifting, 516-517
Policy, 576
Neumann series, 364, 366 Pontryagin, L. S., 548, 551, 554, 555, 578,
Neustadt, L. W., 574 590, 596
Newton, G. C., Jr., 547 Practical stability, 518-519
Node, 476-477 Principle of optimality, 576
Nonanticipative system, 3, 48 Probabilistic system, 3
Nondeterministic system, 3 Pulse width modulated systems, 532-534
Normal coordinates, 242
Quadratic form, 263-270, 505
Observability, continuous systems, 350- differentiation, 288-290
354 negative definite, 267-268, 270, 505
discrete systems, 432 negative semidefinite, 267-268, 505

Quadratic form, positive definite, 266-270, State equations, linear RLC networks, 330-
505, 520-521 332
positive semidefinite, 267-268, 505 normal form, 340-344, 426-428
table, 268 partial fraction technique, 336-340,
Quantized signal, 7 419-426
standard form, 335
Ramp function, 12
time-varying differential equations, 397—
Random signal response, 383-385
398
Rath, R. J., 529-530
State of a system, 325-329, 413
Rayleigh equation, 491
State space, 328
Realizable system, 3
State transition matrix, continuous sys¬
Reciprocal basis, 222, 242, 346, 429
tems, 356-360
Relay controllers, 520-524
discrete systems, 432-441
Residue, 131-134
periodic systems, 372-374
Residue theorem, 132-133
properties, 369, 434
Resonance, 56, 67
time-varying continuous systems, 362-
Riccati equation, 558, 561
369
Saddle point, 477-478 Stationary point, 551
Sampling property, 16, 31 Step function, 11-12
Sampling theorem, 7, 166-167 Step response, 19, 32-34
Scharmack, D. K., 574 Stored energy, 16, 67, 121-122
Schwarz inequality, 215, 310 Strong practical stability, 518-519, 521
Semidefinite forms, 267-268, 504 Summation by parts, 444
Separatrix, 488 Superposition integral, 32-33, 376
Shannon, C. E., 7, 167 table, 33
Shaping filter, 384-385, 388 Superposition principle, 3, 23
Shifting operator, 75 Superposition summation, 38-43, 442
Signal, continuous, 6 table, 42
discrete, 6 Sylvester’s theorem, 276-281, 357, 505
quantized, 7 System, anticipative, 3
Simulation diagrams, continuous systems, deterministic, 3
314-322 fixed, 4, 48
discrete systems, 407-410 linear, 3
Singular control systems, 590-598 nonanticipative, 3
Singularity, 132 nondeterministic, 3
Singularity functions, 11-13, 18-19 probabilistic, 3
Singular point, 471-488 realizable, 3
table, 480-481 time-invariant, 4
Singular solution, 471 System function, 61, 93, 104, 122-123,
Spectral function, 9-10, 154, 157 157, 176-177
Spectral representation, 399 time-varying, 149-153, 181
Speyer, J. L., 574
Stable equilibrium, 503 Taylor’s series, 129-131
Stable system, 67, 92, 146, 447-448 Telescopic series, 78, 79
Stability, 501-504 Terminal control system, 380
Stability theorem, 506-507 Time-invariant system, 4
State determined system, 328 Time optimal systems, 590-598
State equations, 328 Total stability, 518
linear continuous systems, 329-344 Trajectory, 328, 470
linear discrete systems, 413-428 Transfer function, see System function

Transfer function matrices, continuous Vectors, basis, 219, 242, 429


systems, 322-325, 354-355 column, 194
discrete systems, 410-413 linear independence, 216
Transformations, 196, 249-262 norm, 215
collineatory, 241, 249-250 orthogonal, 215
congruent, 249, 251, 252 orthogonalization, 221-222
orthogonal, 249-251 orthonormal, 221
similarity, 241, 249-250, 254 outer product, 215
unitary, 249, 251 reciprocal basis, 222, 242, 346, 429
Transient response estimation, 519-520 row, 194
Transmission matrix, 455-456 scalar product, 214-215
Transversality condition, 554 unit, 216
Triangle inequality, 215 Volterra integral equation, 364, 366
Two-point boundary value problem, 557—
561 Weighting factors, 550, 569-573
Weighting function, 26, 30, 33, 294
Undetermined coefficients, difference
Whiteley, A. L., 571, 573
equations, 86-87
Whittaker transform, 156
differential equations, 55-56
Wiener, N., 547
Unit doublet, 18
Wronskian, 52, 60, 66, 83
Unit function response matrix, 448-451
Unit impulse function, 13-18
Zadeh, L. A., 149, 157, 326
Unit ramp function, 12
Zero, 130, 143, 174
Unit step function, 11-13, 107
Zero order hold, 36
Unstable equilibrium, 503
Z transform, 158-173
Unstable system, 67, 92, 146
convergence, 160-162, 172-173
Variable gradient method, 524-527 difference equations, 175-176
Variational calculus, 551-555 inversion, 169-173
Variational equations, 479-488 one-sided, 160-169
Variation of parameters, difference equa¬ properties, 162-169
tions, 88-92 tables, 161, 163, 164
differential equations,57-61,66, 361-362 two-sided, 159-161