A Course in Robust Control Theory: A Convex Approach
Contents

0 Introduction
  0.1 System representations
    0.1.1 Block diagrams
    0.1.2 Nonlinear equations and linear decompositions
  0.2 Robust control problems and uncertainty
    0.2.1 Stabilization
    0.2.2 Disturbances and commands
    0.2.3 Unmodeled dynamics

1 Preliminaries in Finite Dimensional Space
  1.1 Linear spaces and mappings
    1.1.1 Vector spaces
    1.1.2 Subspaces
    1.1.3 Bases, spans, and linear independence
    1.1.4 Mappings and matrix representations
    1.1.5 Change of basis and invariance
  1.2 Subsets and Convexity
    1.2.1 Some basic topology
    1.2.2 Convex sets
  1.3 Matrix Theory
    1.3.1 Eigenvalues and Jordan form
    1.3.2 Self-adjoint, unitary and positive definite matrices
    1.3.3 Singular value decomposition
  1.4 Linear Matrix Inequalities
  1.5 Exercises

2 State Space System Theory
  2.1 The autonomous system
  2.2 Controllability
    2.2.1 Reachability
    2.2.2 Properties of controllability
    2.2.3 Stabilizability and the PBH test
    2.2.4 Controllability from a single input
  2.3 Eigenvalue assignment
    2.3.1 Single input case
    2.3.2 Multi input case
  2.4 Observability
    2.4.1 The unobservable subspace
    2.4.2 Observers
    2.4.3 Observer-Based Controllers
  2.5 Minimal realizations
  2.6 Transfer functions and state space
    2.6.1 Real-rational matrices and state space realizations
    2.6.2 Minimality
  2.7 Exercises

3 Linear Analysis
  3.1 Normed and inner product spaces
    3.1.1 Complete spaces
  3.2 Operators
    3.2.1 Banach algebras
    3.2.2 Some elements of spectral theory
  3.3 Frequency domain spaces: signals
    3.3.1 The space L2 and the Fourier transform
    3.3.2 The spaces H2 and H2⊥ and the Laplace transform
    3.3.3 Summarizing the big picture
  3.4 Frequency domain spaces: operators
    3.4.1 Time invariance and multiplication operators
    3.4.2 Causality with time invariance
    3.4.3 Causality and H∞
  3.5 Exercises

4 Model realizations and reduction
  4.1 Lyapunov equations and inequalities
  4.2 Observability operator and gramian
  4.3 Controllability operator and gramian
  4.4 Balanced realizations
  4.5 Hankel operators
  4.6 Model reduction
    4.6.1 Limitations
    4.6.2 Balanced truncation
    4.6.3 Inner transfer functions
    4.6.4 Bound for the balanced truncation error
  4.7 Generalized gramians and truncations
  4.8 Exercises

5 Stabilizing Controllers
  5.1 System Stability
  5.2 Stabilization
    5.2.1 Static state feedback stabilization via LMIs
    5.2.2 An LMI characterization of the stabilization problem
  5.3 Parametrization of stabilizing controllers
    5.3.1 Coprime factorization
    5.3.2 Controller Parametrization
    5.3.3 Closed-loop maps for the general system
  5.4 Exercises

6 H2 Optimal Control
  6.1 Motivation for H2 control
  6.2 Riccati equation and Hamiltonian matrix
  6.3 Synthesis
  6.4 State feedback H2 synthesis via LMIs
  6.5 Exercises

7 H∞ Synthesis
  7.1 Two important matrix inequalities
    7.1.1 The KYP Lemma
  7.2 Synthesis
  7.3 Controller reconstruction
  7.4 Exercises

8 Uncertain Systems
  8.1 Uncertainty modeling and well-connectedness
  8.2 Arbitrary block-structured uncertainty
    8.2.1 A scaled small-gain test and its sufficiency
    8.2.2 Necessity of the scaled small-gain test
  8.3 The Structured Singular Value
  8.4 Time invariant uncertainty
    8.4.1 Analysis of time invariant uncertainty
    8.4.2 The matrix structured singular value and its upper bound
  8.5 Exercises

9 Feedback Control of Uncertain Systems
  9.1 Stability of feedback loops
    9.1.1 L2-extended and stability guarantees
    9.1.2 Causality and maps on L2-extended
  9.2 Robust stability and performance
    9.2.1 Robust stability under arbitrary structured uncertainty
    9.2.2 Robust stability under LTI uncertainty
    9.2.3 Robust Performance Analysis
  9.3 Robust Controller Synthesis
    9.3.1 Robust synthesis against Δ_a,c
    9.3.2 Robust synthesis against Δ_TI
    9.3.3 D-K iteration: a synthesis heuristic
  9.4 Exercises

10 Further Topics: Analysis
  10.1 Analysis via Integral Quadratic Constraints
    10.1.1 Analysis results
    10.1.2 The search for an appropriate IQC
  10.2 Robust H2 Performance Analysis
    10.2.1 Frequency domain methods and their interpretation
    10.2.2 State-Space Bounds Involving Causality
    10.2.3 Comparisons
    10.2.4 Conclusion

11 Further Topics: Synthesis
  11.1 Linear parameter varying and multidimensional systems
    11.1.1 LPV synthesis
    11.1.2 Realization theory for multidimensional systems
  11.2 A Framework for Time Varying Systems: Synthesis and Analysis
    11.2.1 Block-diagonal operators
    11.2.2 The system function
    11.2.3 Evaluating the ℓ2 induced norm
    11.2.4 LTV synthesis
    11.2.5 Periodic systems and finite dimensional conditions

A Some Basic Measure Theory
  A.1 Sets of zero measure
  A.2 Terminology
  A.3 Comments on norms and Lp spaces

B Proofs of Strict Separation

C μ-Simple Structures
  C.1 The case of (1, 1)
  C.2 The case of (0, 3)

References
Preface
Research in robust control theory has been one of the most active areas of mainstream systems theory since the late 70s. This research activity has been at the confluence of dynamical systems theory, functional analysis, matrix analysis, numerical methods, complexity theory, and engineering applications. The discipline has involved interactions between diverse research groups including pure mathematicians, applied mathematicians, computer scientists and engineers, and during its development there has been a surprisingly close connection between pure theory and tangible engineering application. By now this research effort has produced a rather extensive set of approaches using a wide variety of mathematical techniques, and applications of robust control theory are spreading to areas as diverse as control of fluids, power networks, and the investigation of feedback mechanisms in biology. During the 90s the theory has seen major advances and achieved a new maturity, centered around the notion of convexity. This emphasis is two-fold. On one hand, the methods of convex programming have been introduced to the field and released a wave of computational methods which, interestingly, have impact beyond the study of control theory. Simultaneously a new understanding has developed on the computational complexity implications of uncertainty modeling; in particular it has become clear that one must go beyond the time invariant structure to describe uncertainty in terms amenable to convex robustness analysis.
Our broad goal in this book is to give a graduate-level course on robust control theory that emphasizes these new developments, but at the same time conveys the main principles and ubiquitous tools at the heart of the subject. This course is intended as an introduction to robust control theory, and begins at the level of basic systems theory, but ends having introduced the issues and machinery of current active research. Thus the pedagogical objectives of the book are (1) to introduce a coherent and unified framework for studying robust control theory; (2) to provide students with the control-theoretic background required to read and contribute to the research literature; and (3) to present the main ideas and demonstrate the major results of robust control theory. We therefore hope the book will be of value to mathematical researchers and computer scientists wishing to learn about robust control theory, graduate students planning to do research in the area, and engineering practitioners requiring advanced control techniques. The book is meant to feature convex methods and the viewpoint gained from a general operator theory setting; however, rather than being purist we have endeavored to give a balanced course which fits these themes in with the established landscape of robust control theory. The effect of this intention on the book is that as it progresses these themes are increasingly emphasized, whereas more conventional techniques appear less frequently. The current research literature in robust control theory is vast and so we have not attempted to cover all topics, but have instead selected those that we believe are central and most effectively form a launching point for further study of the field.

The text is written to comprise a two-quarter or two-semester graduate course in applied mathematics or engineering. The material presented has been successfully taught in this capacity during the past few years by the authors at Caltech, University of Waterloo, University of Illinois, and UCLA. For students with background in state space methods a serious approach at a subset of the material can be achieved in one semester. Students are assumed to have familiarity with linear algebra, and otherwise only advanced calculus and basic complex analysis are strictly required.

After an introduction and a preliminary technical chapter, the course begins with a thorough introduction to state space systems theory. It then moves on to cover open-loop systems issues using the newly introduced concept of a norm. Following this the simplest closed-loop synthesis issue is addressed, that of stabilization. Then there are two chapters on synthesis which cover the H2 and H∞ formulations. Next open-loop uncertain system models are introduced; this chapter gives a comprehensive treatment of structured uncertainty using perturbations that are either time invariant or arbitrary. The results on open-loop uncertain systems are then applied to feedback control in the following chapter, where both closed-loop analysis and synthesis are addressed. The final two chapters are devoted to the presentation of four advanced topics in a more descriptive manner. In the preliminary chapter of the book some basic ideas from convex analysis are presented, as is the important concept of a linear matrix inequality (LMI). Linear matrix inequalities are perhaps the major analytical tool used in this text, and combined with the operator theory framework presented later provide a powerful perspective. A more detailed summary of the chapters is given below.
0 Introduction
[Figure 1: block diagrams of a system P mapping u to y, and of its decomposition into the cascade of P1 followed by P2]

v = P1(u)
y = P2(v).
We see that this interconnection takes the two subsystems P1 and P2 to form a system P defined by P(u) = P2(P1(u)). Thus this diagram simply depicts a composition of maps. Notice that the input to P2 is the output of P1.
[Figure 2. System decomposition: the map from u to y realized as a feedback interconnection of a linear system G and a map Q]

In Figure 2 the system G is described by linear differential equations, and Q is a static nonlinear mapping. By
static we mean that the output of Q at any point in time depends only on
the input at that particular time, or equivalently that Q has no memory.
Thus all of the nonlinear behavior of the initial system (1) is captured in
Q and the feedback interconnection.
We will almost exclusively work with the case where the point (0, 0), around which this decomposition is taken, is an equilibrium point of (1). Namely

f(0, 0) = 0.

In this case the functions g and r satisfy g(0, 0) = 0 and r(0, 0) = 0, and therefore Q(0, 0) = 0. Also the linear system described by

ẋ = Ax + Bu
y = Cx + Du

is the linearization of (1) around the equilibrium point. The linear system G is thus an augmented version of the linearization.
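As an illustrative numerical aside, the linearization step amounts to computing the Jacobians of f with respect to x and u at (0, 0); the Python sketch below does this by finite differences for a hypothetical f, chosen only for illustration.

```python
import numpy as np

def f(x, u):
    # A hypothetical nonlinear system with equilibrium f(0, 0) = 0.
    return np.array([x[1], -np.sin(x[0]) - 0.1 * x[1] + u[0]])

def jacobian(fun, z0, eps=1e-6):
    # Forward-difference Jacobian of fun at the point z0.
    f0 = fun(z0)
    J = np.zeros((f0.size, z0.size))
    for k in range(z0.size):
        dz = np.zeros_like(z0)
        dz[k] = eps
        J[:, k] = (fun(z0 + dz) - f0) / eps
    return J

x0, u0 = np.zeros(2), np.zeros(1)
A = jacobian(lambda x: f(x, u0), x0)   # approximately [[0, 1], [-1, -0.1]]
B = jacobian(lambda u: f(x0, u), u0)   # approximately [[0], [1]]
print(A, B, sep="\n")
```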
Higher order dynamics
In the construction just considered we were able to isolate nonlinear system
aspects in the mapping Q, and our motivation for this was so that later
we will be able to replace Q with an alternative description which is more
easily analyzed. For the same reason we will sometimes wish to do this
not only with the nonlinear part of a system, but also with some of its
dynamics. Let us now move on to consider this more complex scenario. We
have the equations
ẋ1 = f1(x1, x2, u)        (6)
ẋ2 = f2(x1, x2, u)
y = h(x1, x2, u).
Following a similar procedure to the one we just carried out on the system
in (1), we can decompose the system described in (6) to arrive at the
equivalent set of equations:
ẋ1 = A1 x1 + B1 u + g1(x1, x2, u)        (7)
ẋ2 = f2(x1, x2, u)
y = C1 x1 + Du + r(x1, x2, u).

This is done by focusing on the equations ẋ1 = f1(x1, x2, u) and y = h(x1, x2, u), and performing the same steps as before, treating both x2 and u as the inputs. The equations in (7) are equivalent to the linear equations

ẋ1 = A1 x1 + B1 u + w1        (8)
y = C1 x1 + Du + w2,

coupled with the nonlinear equations

ẋ2 = f2(x1, x2, u)        (9)
(w1, w2) = (g1(x1, x2, u), r(x1, x2, u)).
Now similar to before we set G to be the linear system

G : (w1, w2, u) ↦ (x1, u, y)

which satisfies the equations in (8). Also define Q to be the system described by (9), where

Q : (x1, u) ↦ (w1, w2).

With these new definitions of G and Q we see that Figure 2 depicts the system described in (6). Furthermore part of the system dynamics and all of the system nonlinearity is isolated in the mapping Q. Notice that the decomposition we performed on (1) is a special case of the current one.
Modeling Q

In each of the two decompositions just considered we split the initial systems, given by (1) or (6), into a G-part and a Q-part. The system G was described by linear differential equations, whereas nonlinearities were confined to Q. This decomposing of systems into a linear low dimensional part, and a potentially nonlinear and high dimensional part, is at the heart of this course. The main approach adopted here to deal with the Q-part will be to replace it with a set Δ of linear maps which capture its behavior. The motivation for doing this is that the resulting analysis can frequently be made tractable.

Formally stated we require the set Δ to have the following property: if q = Q(p), for some input p, then there should exist a mapping Δ in the set such that

q = Δ(p).        (10)

The key idea here is that the elements of the set Δ can be much simpler dynamically than Q. However when combined in a set they are actually able to generate all of the possible input-output pairs (p, q) which satisfy q = Q(p). Therein lies the power of introducing Δ: one complex object can be replaced by a set of simpler ones.
We now discuss how this idea can be used for analysis of the system depicted in Figure 2. Let

S(G, Q) denote the mapping u ↦ y in Figure 2.

Now replace this map with the set of maps

S(G, Δ)

generated by choosing Δ from the set Δ. Then we see that if the input-output behaviors associated with all the mappings S(G, Δ) satisfy a given property, then so must any input-output behavior of S(G, Q). Thus any property which holds over the set is guaranteed to hold for the system S(G, Q). However the converse is not true and so analysis using Δ can in general be conservative. Let us consider this issue.
If a set Δ has the property described in (10), then providing that it has more than one element, it will necessarily generate more input-output pairs than Q. Specifically

{(p, q) : q = Q(p)}  ⊂  {(p, q) : there exists Δ in Δ such that q = Δ(p)}.

Clearly the set on the left defines a function, whereas the set of input-output pairs generated by Δ is in general only a relation. The degree of closeness of these sets determines the level of conservatism introduced by using Δ in place of Q.

We now illustrate how the behavior of Q can be captured by Δ with two simple examples.
Examples:

We begin with the decomposition for (1). For simplicity assume that x, y and u are all scalar valued functions. Now suppose that the functions r and g, which define Q, are known to satisfy the sector or Lipschitz bounds

|w1(t)| ≤ k11 |x(t)| + k12 |u(t)|        (11)
|w2(t)| ≤ k21 |x(t)| + k22 |u(t)|,

for some positive constants kij. It follows that if, for particular signals, (w1, w2) = Q(x, u), then there exist scalar functions of time δ11(t), δ12(t), δ21(t) and δ22(t), each satisfying δij(t) ∈ [−kij, kij], such that

w1(t) = δ11(t) x(t) + δ12(t) u(t)        (12)
w2(t) = δ21(t) x(t) + δ22(t) u(t).

Define the set Δ to consist of all 2 × 2 matrix functions Δ of the form

Δ(t) = [ δ11(t)  δ12(t) ]
       [ δ21(t)  δ22(t) ] ,   where |δij(t)| ≤ kij for each time t ≥ 0.

From the above discussion it is clear the set Δ has the property that given any inputs and outputs satisfying (w1, w2) = Q(x, u), there exists Δ in the set satisfying (12).
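As a concrete scalar illustration of this construction (a hypothetical example chosen for simplicity, with k11 = 1 and k12 = 0), suppose w1(t) = sin(x(t)), which satisfies |w1(t)| ≤ |x(t)|. The sketch below recovers a time-varying gain δ11(t) ∈ [−1, 1] that reproduces w1 exactly along a given trajectory.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 200)
x = 1.0 + 0.5 * np.sin(t)        # some state trajectory (nonzero here, for simplicity)
w1 = np.sin(x)                   # a nonlinearity satisfying |sin(x)| <= |x|

delta11 = w1 / x                 # the time-varying gain delta_11(t)
assert np.all(np.abs(delta11) <= 1.0)      # it lies in the sector [-1, 1]
assert np.allclose(delta11 * x, w1)        # and it reproduces w1 exactly
```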
Let us turn to an analogous construction associated with the decomposition of the system governed by (6), recalling that Q is now dynamic. Assume x1 and u are scalar, and suppose it is known that if (w1, w2) = Q(x1, u) then the following energy inequalities hold:

∫_0^∞ |w1(t)|² dt ≤ k1 ( ∫_0^∞ |x1(t)|² dt + ∫_0^∞ |u(t)|² dt )
∫_0^∞ |w2(t)|² dt ≤ k2 ( ∫_0^∞ |x1(t)|² dt + ∫_0^∞ |u(t)|² dt ),

when the right hand side integrals are finite. Define Δ to consist of all linear mappings Δ : (x1, u) ↦ (w1, w2) which satisfy the above inequalities for all functions x1 and u from a suitably defined class. It is possible to show that if (w1, w2) = Q(x1, u), for some bounded energy functions x1 and u, then there exists a mapping Δ in Δ such that Δ(x1, u) = (w1, w2). In this sense Δ can generate any behavior of Q. As a remark, inequalities such as the above implicitly assume that initial conditions in the state x2 can be neglected; in the language of the next section, the higher order dynamics isolated in this way are required to be stable.
Using a set Δ instead of the mapping Q has another purpose. As already pointed out physical systems will never be exactly represented by models of the form (1) or (6). Thus the introduction of the set Δ affords a way to account for potential system behaviors without explicitly modeling them. For example the inequalities in (11) may be all that is known about some higher order dynamics of a system; note that given these bounds we would not even need to know the order of these dynamics to account for them using Δ. Therefore this provides a way to explicitly incorporate knowledge about the unpredictability of a physical system into a formal model. Thus the introduction of the set Δ serves two related but distinct purposes:

• Δ provides a technique for simplifying a given model;
• Δ can be used to model and account for uncertain dynamics.
In the course we will study analysis and synthesis using these types of
models, particularly when systems are formed by the interconnection of
many such subsystems. We call these types of models uncertain systems.
0.2.1 Stabilization
One of the most basic goals of a feedback control system is stabilization. This means nullifying the effects of the uncertainty surrounding the initial conditions of a system. Before explaining this in more detail we review some basic concepts.

Consider the autonomous system

ẋ = f(x),  with some initial condition x(0).        (13)

We will be concerned with equilibrium points xe of this system, namely points where f(xe) = 0 is satisfied. Without loss of generality in this discussion we shall assume that xe = 0, since this can always be arranged by redefining f appropriately. We say that the equilibrium point zero is stable if for any initial condition x(0) sufficiently near to zero, the time trajectory x(t) remains near to zero. The equilibrium is exponentially stable if it is stable and furthermore the function x(t) tends to zero at an exponential rate when x(0) is chosen sufficiently small. Stability is an important
property because it is unlikely that a physical system is ever exactly at an
equilibrium point. It says that if the initial state of the system is slightly
perturbed away from the equilibrium, the resulting state trajectory will
not diverge. Exponential stability goes further to say that if such initial
deviations are small then the system trajectory will tend quickly back to
the equilibrium point. Thus stable systems are insensitive to uncertainty
about their initial conditions.
We now review a test for exponential stability. Suppose
ẋ = Ax
is the linearization of (13) at the point zero. It is possible to show: the
zero point of (13) is exponentially stable if and only if the linearization
is exponentially stable at zero.1 The linearization is exponentially stable
exactly when all the eigenvalues of the matrix A have negative real part.
Thus exponential stability can be checked directly by calculating A, the
Jacobian matrix of f at zero. Further it can be shown that if any of the
eigenvalues have positive real part, then the equilibrium point zero is not
even stable.
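A minimal numerical sketch of this test, for a hypothetical f chosen only for illustration: form the Jacobian A of f at zero and inspect the real parts of its eigenvalues.

```python
import numpy as np

def f(x):
    # Hypothetical dynamics with an equilibrium at x = 0 (damped pendulum-like).
    return np.array([x[1], -np.sin(x[0]) - 0.5 * x[1]])

# Jacobian of f at zero by finite differences.
eps = 1e-6
A = np.column_stack([(f(eps * e) - f(np.zeros(2))) / eps for e in np.eye(2)])

eigs = np.linalg.eigvals(A)
print(eigs)                     # here both real parts are negative
print(np.all(eigs.real < 0))    # True: the equilibrium is exponentially stable
```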
We now move to the issue of stabilization, which is using a control law to turn an unstable equilibrium point into an exponentially stable one. Below is a controlled nonlinear system

ẋ = f(x, u),        (14)

where the input is the function u. Suppose that (0, 0) is an equilibrium point of this system. Our stability definitions are extended to such controlled systems in the following way: the equilibrium point (0, 0) is defined to be (exponentially) stable if zero is an (exponentially) stable equilibrium point of the autonomous system ẋ = f(x, 0).

Our first task is to investigate conditions under which it is possible to stabilize such an equilibrium point using a special type of control strategy called a state feedback. In this scenario we seek a control feedback law of the form

u(t) = p(x(t)),

where p is a smooth function, such that the closed loop system

ẋ = f(x, p(x))

is exponentially stable around zero. That is, we want to find a function p which maps the state of the system x to a control action u. Let us assume that such a p exists and examine some of its properties. First notice that in order for zero to be an equilibrium point of the closed loop we may assume

p(0) = 0.

Given this the linearization of the closed loop is

ẋ = (A + BF)x,

where A = d1 f(0, 0), B = d2 f(0, 0) and F = dp(0). Thus we see that the closed loop system is exponentially stable if and only if all the eigenvalues of A + BF are strictly in the left half of the complex plane.
¹This is under the assumption that f is sufficiently smooth.

Conversely, notice that if a matrix F exists such that A + BF has the desired stability property, then the state feedback law p(x) = Fx will stabilize the closed loop.
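Numerically, finding such an F is a pole placement problem; the sketch below uses scipy's place_poles on illustrative matrices (not tied to any example in the text). Note that place_poles returns a gain K for the convention A − BK, so F = −K in the A + BF convention used here.

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0],
              [2.0, -1.0]])          # one eigenvalue in the right half plane
B = np.array([[0.0],
              [1.0]])

K = place_poles(A, B, [-2.0, -3.0]).gain_matrix
F = -K                               # so that A + B F has the requested eigenvalues

print(np.linalg.eigvals(A))          # approximately [1, -2]: unstable
print(np.linalg.eigvals(A + B @ F))  # approximately [-2, -3]: stabilized
```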
In the scenario just discussed the state x was directly available to use
in the feedback law. A more general control situation occurs when only
an observation y = h(x, u) is available for feedback, or when a dynamic control law is employed. For these the analysis is more complicated, and we defer its study to subsequent chapters. We now illustrate these concepts
with an example.
Stabilization of a double pendulum
Shown below in Figure 3 is a double pendulum. In the figure two rigid links are connected by a hinge joint, so that they can rotate with respect to each other. The first link is also constrained to rotate about a point which is fixed in space. The control input to this rigid body system is a torque τ, applied as shown in Figure 3.

[Figure 3: double pendulum, with control torque τ, gravity g, and link angles θ1 and θ2]

0.2.2 Disturbances and commands

[Figure: feedback configuration of a control law mapping measurements to actions, together with the schematic of an electric motor with applied voltage v, winding current i, torque τ and disturbance torque τd]
The figure depicts a schematic diagram of an electric motor controlled by excitation: a voltage applied to the motor windings results in a torque applied to the motor shaft. While physical details are not central to our discussion, we will find it useful to write down an elementary dynamical model. The key variables are

v   applied voltage;
i   current in the field windings;
τ   motor torque, and τd the opposing torque from the environment;
θ   angle of the shaft, and ω = θ̇ the angular velocity.

The objective of the control system is for the angular position θ to follow a reference command θr, despite the effect of an unknown resisting torque τd. This so-called servomechanism problem is common in many applications; for instance we could think of moving a robot arm in an uncertain environment.
We begin by writing down a differential equation model for the motor:

v = R i + L di/dt        (15)
τ = κ i                  (16)
J dω/dt = τ − τd − B ω   (17)
Here (15) models the electrical circuit in terms of its resistance R and
inductance L; (16) says the torque is a linear function of the current; and
(17) is the rotational dynamics of the shaft, where J is the moment of inertia and B is mechanical damping. Since the electrical transients of (15) are typically much faster than the mechanical dynamics, it seems reasonable to neglect the former, setting L = 0. Also in what follows we normalize, for simplicity, the remaining constants (R, κ, J, B) to unity in an appropriate system of units.
We now address the problem of controlling our motor to achieve the desired objective. We do this by a control system that measures the output angle θ and acts on the voltage v, following the law

v(t) = K (θr − θ),

where K > 0 is a proportionality constant to be designed. Intuitively, the system applies torque in the adequate direction to counteract the error between the command θr and the actual output angle θ. It is an instructive exercise, left for the reader, to express this system in terms of Figure 4. Here the driving signals are the command θr and the disturbance torque τd, which are unknown at the time we design K.
Given this control law, we can find the equations of the resulting closed loop dynamics, and with the above conventions obtain the following:

d/dt [θ]   [  0    1 ] [θ]   [ 0    0 ] [θr]
     [ω] = [ −K   −1 ] [ω] + [ K   −1 ] [τd]
Thus our resulting dynamics have the form
ẋ = Ax + Bw
encountered in the previous section.
We first discuss system stability. It is straightforward to verify that the eigenvalues of the above A matrix have negative real part whenever K > 0, therefore in the absence of external inputs the system is exponentially stable. In particular, initial conditions will have asymptotically no effect.

Now suppose θr and τd are constant over time; then by solving the differential equation it follows that the states (θ, ω) converge asymptotically to

θ(∞) = θr − τd/K   and   ω(∞) = 0.

Thus the motor achieves an asymptotic position which has an error of τd/K with respect to the command. We make the following remarks:

• Clearly if we make the constant K very large we will have accurate tracking of θr despite the effect of τd. This highlights the central role of feedback in achieving system reliability in the presence of uncertainty about the environment. We will revisit this issue in the following section.
• The success in this case depends strongly on the fact that θr and τd are known to be constant; in other words while their value was unknown, we had some a priori information about their characteristics.
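As a quick numerical check of the steady-state expressions above (an illustrative sketch; the numbers are arbitrary), one can simulate the closed loop with constant θr and τd and compare the final angle with θr − τd/K.

```python
import numpy as np
from scipy.integrate import solve_ivp

K, theta_r, tau_d = 10.0, 1.0, 0.5     # gain, constant command, constant disturbance

def closed_loop(t, z):
    theta, omega = z
    return [omega, -K * theta - omega + K * theta_r - tau_d]

sol = solve_ivp(closed_loop, [0.0, 20.0], [0.0, 0.0], rtol=1e-8, atol=1e-10)
print(sol.y[0, -1])                    # approximately 0.95
print(theta_r - tau_d / K)             # the predicted limit: 0.95
```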
The last of these remarks is in fact general to any control design question. That is, some information about the unknown inputs is required for us to be able to assess system performance. In the above example the signals were specified except for a parameter. More generally the information available is not so strongly specified; for example one may know something about the energy or spectral properties of commands and disturbances. The available information is typically expressed in one of two forms:
(i) We may specify a set D and impose that the disturbance should lie
in D;
(ii) We may give a statistical description of the disturbance signals.
The first alternative typically leads to questions about the worst possible
system behavior caused by any element of D. In the second case, one usually
is concerned with statistically typical behavior. Which is more appropriate
is application dependent, and we will provide methods for both alternatives
in this course.
We are now ready to discuss a more complicated instance of uncertainty
in the next section.
0.2.3 Unmodeled dynamics
At this point the reader may be thinking that this difficulty was due exclusively to careless modeling: we should have worked with the full, third order model. Note however that there are many other dynamical aspects which have been neglected. For instance the bending dynamics of the motor shaft could also be described by additional state equations, and so on. We could go to the level of spatially distributed, infinite dimensional dynamics, and there would still be neglected effects. No matter where one stops in modeling, the reliability of the conclusions depends strongly on the fact that whatever has been neglected will not become crucial later. In the presence of feedback, this assessment is particularly challenging and is really a central design question.
To emphasize this point in another example, consider the modified double pendulum shown in Figure 5. In this new setup a vessel containing fluid has been rigidly attached to the end of the second link. Suppose, following our discussion of the previous two sections, that we wish to stabilize this system about one of its equilibria. The addition of the fluid-vessel significantly complicates the modeling of this system, and transforms our two rigid body system into an infinite dimensional system which is highly intractable, to the point where it is even beyond the scope of accurate computer simulation.
However to balance this system an infinite dimensional model is probably not required, and perhaps a low dimensional one will suffice. An extreme model in this latter category would be one that modeled the fluid-vessel system as a point mass; it may well be that a feedback design which renders our system insensitive to the value of this mass would perform well in the real system. But possibly the oscillations of the fluid inside the vessel may compromise performance. In this case a more refined, but still tractable model could consist of an oscillatory mass-spring type dynamical model.

These modeling issues become even more central if we interconnect many uncertain or complex systems, to form a "system of systems". A very simple example is the coupled system formed when the electric motor above is used to generate the control torque for the fluid-pendulum system.
The main conclusion we make is that there is no such thing as the "correct" model for control. A useful model is one in which the remaining
uncertainty or unpredictability of the system can be adequately compen-
sated by feedback. Thus we have set the stage for this course. The key
players are feedback, stability, performance, uncertainty and interconnec-
tion of systems. The mathematical theory to follow is motivated by the
challenging interplay between these aspects of designed dynamical systems.
1 Preliminaries in Finite Dimensional Space
1.1.2 Subspaces
A subspace of a vector space V is a subset of V which is also a vector space
with respect to the same field and operations; equivalently, it is a subset
which is closed under the operations on V .
Examples:
A vector space can have many subspaces, and the simplest of these is the
zero subspace, denoted by {0}. This is a subspace of any vector space and
contains only the zero element. Excepting the zero subspace and the entire
space, the simplest type of subspace in V is of the form

Sv = {s ∈ V : s = αv, for some α ∈ R},

given v in V. That is, each element in V generates a subspace by multiplying it by all possible scalars. In R^2 or R^3, such subspaces correspond to lines going through the origin.
Going back to our earlier examples of vector spaces we see that the multinomials Pm[n] are subspaces of F(R^m, R), for any n.
Now R^n has many subspaces and an important set is those associated with the natural insertion of R^m into R^n, when m < n. Elements of these subspaces are of the form

[ x ]
[ 0 ]

where x ∈ R^m and 0 ∈ R^{n−m}.
Given two subspaces S1 and S2 we can define the addition

S1 + S2 = {s ∈ V : s = s1 + s2 for some s1 ∈ S1 and s2 ∈ S2},

which is easily verified to be a subspace.
w = Av = A(α1 v1 + ··· + αn vn)
  = α1 Av1 + ··· + αn Avn
  = Σ_{k=1}^{n} αk ( Σ_{j=1}^{m} ajk wj )  =  Σ_{j=1}^{m} ( Σ_{k=1}^{n} ajk αk ) wj ,

and therefore by uniqueness of the coordinates we must have

βj = Σ_{k=1}^{n} ajk αk ,   j = 1, ..., m,

where β1, ..., βm are the coordinates of w in the basis {w1, ..., wm}. To express this relationship in a more convenient form, we can write the set of numbers ajk as the m × n matrix

        [ a11  ···  a1n ]
[A]  =  [  ⋮    ⋱    ⋮  ]
        [ am1  ···  amn ] .

Then via the standard matrix product we have

[ β1 ]   [ a11  ···  a1n ] [ α1 ]
[  ⋮ ] = [  ⋮    ⋱    ⋮  ] [  ⋮ ] .
[ βm ]   [ am1  ···  amn ] [ αn ]
In summary any linear mapping A between vector spaces can be regarded as a matrix [A] mapping F^n to F^m via matrix multiplication.
Notice that the numbers ajk depend intimately on the bases {v1, ..., vn} and {w1, ..., wm}. Frequently we use only one basis for V and one for W and thus there is no need to distinguish between the map A and the basis dependent matrix [A]. We therefore simply write A to denote either the map or the matrix, letting context determine which is meant.
We now give two examples to more clearly illustrate the above discussion.
Examples:
Given matrices B ∈ C^{k×k} and D ∈ C^{l×l} we define the map Φ : C^{k×l} → C^{k×l} by

Φ(X) = BX − XD.

This map is linear, since

Φ(X1 + X2) = B(X1 + X2) − (X1 + X2)D
           = (BX1 − X1 D) + (BX2 − X2 D)
           = Φ(X1) + Φ(X2).

If we now consider the identification between the matrix space C^{k×l} and the product space C^{kl}, then Φ can be thought of as a map from C^{kl} to C^{kl}, and can accordingly be represented by a complex matrix which is kl × kl.

We now do an explicit 2 × 2 example for illustration. Suppose k = l = 2 and that

B = [1 2; 3 4]   and   D = [5 0; 0 0].

Computing Φ on the standard basis E11, E12, E21, E22 of C^{2×2}, and stacking coordinates in the same order, Φ is represented by the matrix

[Φ] = [ −4  0   2  0 ]
      [  0  1   0  2 ]
      [  3  0  −1  0 ]
      [  0  3   0  4 ]

in this basis.
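The matrix above can be cross-checked numerically: with the entries of X stacked row by row, the representation of X ↦ BX − XD is B ⊗ I − I ⊗ Dᵀ. The following sketch (an illustrative aside, using numpy) performs this check.

```python
import numpy as np

B = np.array([[1.0, 2.0], [3.0, 4.0]])
D = np.array([[5.0, 0.0], [0.0, 0.0]])
I = np.eye(2)

# Row-wise stacking: vec(B X - X D) = (B kron I - I kron D^T) vec(X).
Phi = np.kron(B, I) - np.kron(I, D.T)
print(Phi)                             # reproduces the 4 x 4 matrix shown above

X = np.random.randn(2, 2)              # sanity check on a random argument
assert np.allclose(Phi @ X.flatten(), (B @ X - X @ D).flatten())
```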
Another linear operator involves the multinomial functions Pm[n] defined earlier in this section. Given an element a ∈ Pm[k] we can define the mapping Ψ : Pm[n] → Pm[n+k] by function multiplication

(Ψp)(x1, x2, ..., xm) := a(x1, x2, ..., xm) p(x1, x2, ..., xm).

Again Ψ can be regarded as a matrix, which maps R^{d1} → R^{d2}, where d1 and d2 are the dimensions of Pm[n] and Pm[n+k] respectively.
Associated with any linear map A : V → W is its image space, which is defined by

Im A = {w ∈ W : there exists v ∈ V satisfying Av = w}.

This set contains all the elements of W which are the image of some point in V. Clearly if {v1, ..., vn} is a basis for V then

Im A = span{Av1, ..., Avn}

and is thus a subspace. The map A is called surjective when Im A = W. The dimension of the image space is called the rank of the linear mapping A, and the concept is applied as well to the associated matrix [A]. Namely

rank[A] = dim(Im A).

If S is a subspace of V, then the image of S under the mapping A is denoted AS. That is

AS = {w ∈ W : there exists s ∈ S satisfying As = w}.
In particular, this means that AV = Im A.

Another important set related to A is its kernel, or null space, defined by

ker A = {v ∈ V : Av = 0}.

In words, ker A is the set of vectors in V which get mapped by A to the zero element in W, and is easily verified to be a subspace of V.

If we consider the equation Av = w, suppose va and vb are both solutions; then

A(va − vb) = 0.

Namely the difference between any two solutions is in the kernel of A. Thus given any solution va to the equation, all solutions are parametrized by

va + v0,

where v0 is any element in ker A.

In particular, when ker A is the zero subspace, there is at most one solution to the equation Av = w. This means Ava = Avb only when va = vb; a mapping with this property is called injective.

In summary, a solution to the equation Av = w will exist if and only if w ∈ Im A; it will be unique only when ker A is the zero subspace.

The dimensions of the image and kernel of A are linked by the relationship

dim(V) = dim(Im A) + dim(ker A),

proved in the exercises at the end of the chapter.
A mapping is called bijective when it is both injective and surjective, i.e. for every w ∈ W there exists a unique v satisfying Av = w. In this case there is a well defined inverse mapping A^{−1} : W → V, such that

A^{−1}A = I_V,   AA^{−1} = I_W.

In the above, I denotes the identity mapping in each space, i.e. the map that leaves elements unchanged. For instance, I_V : v ↦ v for every v ∈ V.

From the above property on dimensions we see that if there exists a bijective linear mapping between two spaces V and W, then the spaces must have the same dimension. Also, if a mapping A is from V back to itself, namely A : V → V, then one of the two properties (injectivity or surjectivity) suffices to guarantee the other.

We will also use the terms nonsingular or invertible to describe bijective mappings, and apply these terms as well to their associated matrices. Notice that invertibility of the mapping A is equivalent to invertibility of [A] in terms of the standard matrix product; this holds true regardless of the chosen bases.
Examples:
To illustrate these notions let us return to the mappings and dened
above. For the 2 2 numerical example given, maps C 22 back to itself.
It is easily checked that it is invertible by showing either
Im = C 22 ; or equivalently ker = 0:
In contrast is not a map on the same space, instead taking Pm[n] to the
larger space Pm[n+k] . And we see that the dimension of the image of is at
most n, and the dimension of its kernel at least k. Thus assuming k > 0
there are at least some elements w 2 Pm[n+k] for which
v = w
cannot be solved. These are exactly the values of w that are not in Im.
[Figure: commutative diagram for a change of basis, relating a map A on V to its matrix representations on F^n via the transformation T]

[Figure 1.1: the line segment between two points v1 and v2]
Clearly any vector space is convex, as is any subset {v} of a vector space containing only a single element.

We can think of the expression λv1 + (1 − λ)v2 for a point on the line L(v1, v2) as a weighted average. To see this instead write equivalently

v = α1 v1 + α2 v2,

where α1, α2 ∈ [0, 1] and satisfy α1 + α2 = 1. Then we see that if α1 and α2 are both equal to one half we have our usual notion of the average. And if they take other values the weighted average "favors" one of the points. Thus the clear generalization of such an average to n points v1, ..., vn is

v = α1 v1 + ··· + αn vn,

where α1 + ··· + αn = 1 and α1, ..., αn ∈ [0, 1]. A line segment gave us geometrically a point on the line between the two endpoints. The generalization of this to an average of n points yields a point inside the perimeter defined by the points v1, ..., vn. This is illustrated in Figure 1.2.
[Figure 1.2. Convex hull of a finite number of points v1, ..., v6]
Given v1, ..., vn we define the convex hull of these points by

co({v1, ..., vn}) = { v ∈ V : v = Σ_{k=1}^{n} αk vk , with αk ∈ [0, 1] and Σ_{k=1}^{n} αk = 1 }.

Thus this set is all the points inside the perimeter in Figure 1.2. In words the convex hull of the points v1, ..., vn is simply the set comprised of all weighted averages of these points. In particular we have that for two points L(v1, v2) = co({v1, v2}). It is a straightforward exercise to show that if Q is convex, then it necessarily contains any convex hull formed from a collection of its points.
So far we have only defined the convex hull in terms of a finite number of points. We now generalize this to an arbitrary set. Given a set Q, we define its convex hull co(Q) by

co(Q) = {v ∈ V : there exists n and v1, ..., vn ∈ Q such that v ∈ co({v1, ..., vn})}.

So the convex hull of Q is the collection of all possible weighted averages of points in Q. It is straightforward to demonstrate that for any set Q:

• the subset condition Q ⊂ co(Q) is satisfied;
• the convex hull co(Q) is convex;
• the relationship co(Q) = co(co(Q)) holds.

We also have the following result, which relates convexity of a set to its convex hull.

Proposition 1.4. A set Q is convex if and only if co(Q) = Q is satisfied.

Notice that, by definition, the intersection of convex sets is always convex; therefore, given a set Q, there exists a smallest convex set that contains Q; it follows easily that this is precisely co(Q); in other words, if Y is convex and Q ⊂ Y, then necessarily co(Q) is a subset of Y. Pictorially we have Figure 1.3 to visualize Q and its convex hull.
[Figure 1.3. A set Q and its convex hull co(Q)]

[Figure 1.4. A hyperplane]
An important property of a hyperplane, which is clear in the above geometric case, is that it always breaks up the space into two half-spaces: these have the form {v : F(v) ≤ a} and {v : F(v) ≥ a}.

This leads to the notion of separating two sets with a hyperplane. Given two sets Q1 and Q2 in V, we say that the hyperplane defined by (F, a) separates the sets if

(a) F(v1) ≤ a, for all v1 ∈ Q1;
(b) F(v2) ≥ a, for all v2 ∈ Q2.

Geometrically we have the illustration in Figure 1.5 below.
[Figure 1.5. Two sets Q1 and Q2 separated by a hyperplane]
Unitary matrices are the only matrices that leave the length of every vector
unchanged. We are now ready to state the spectral theorem for Hermitian
matrices.
Theorem 1.9. Suppose H is a matrix in H^n. Then there exist a unitary matrix U and a real diagonal matrix Λ such that

H = U Λ U*.

Furthermore, if H is in S^n then U can be chosen to be orthogonal.
Notice that since U* = U^{−1} for a unitary U, the above expression is a similarity transformation. Therefore the theorem says that a self-adjoint matrix can be diagonalized by a unitary similarity transformation. Thus the columns of U are all eigenvectors of H. Since the proof of this result assembles a number of concepts from this chapter we provide it below.

Proof. We will use an induction argument. Clearly the result is true if H is simply a scalar, and it is therefore sufficient to show that if the result holds for matrices in H^{n−1} then it holds for H ∈ H^n. We proceed with the assumption that the decomposition result holds for (n − 1) × (n − 1) Hermitian matrices.

The matrix H has at least one eigenvalue λ1, and λ1 is real since H is Hermitian. Let x1 be an eigenvector associated with this eigenvalue, and without loss of generality we assume it to have length one. Define X to be any unitary matrix with x1 as its first column, namely

X = [x1 x2 ··· xn].

Now consider the product X*HX. Its first column is given by X*Hx1 = λ1 X*x1 = λ1 e1, where e1 is the first element of the canonical basis. Its first row is described by x1*HX, which is equal to λ1 x1*X = λ1 e1*, since x1*H = λ1 x1* because H is self-adjoint. Thus we have

X*HX = [ λ1  0  ]
       [ 0   H2 ] ,

where H2 is a Hermitian matrix in H^{n−1}. By the inductive hypothesis there exists a unitary matrix X2 in C^{(n−1)×(n−1)} such that H2 = X2 Λ2 X2*, where Λ2 is both diagonal and real. We conclude that

H = X [ 1  0  ] [ λ1  0  ] [ 1  0   ] X* .
      [ 0  X2 ] [ 0   Λ2 ] [ 0  X2* ]

The right-hand side gives the desired decomposition.

If H is a real matrix, that is in S^n, then all the matrices in the construction above are also real, proving the latter part of the theorem.

We remark in addition that the (real) eigenvalues of H can be arranged in decreasing order in the diagonal of Λ. This follows directly from the above induction argument: just take λ1 to be the largest eigenvalue.
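Numerically, this decomposition is what numpy.linalg.eigh computes for a Hermitian matrix; a brief illustrative check:

```python
import numpy as np

M = np.random.randn(4, 4) + 1j * np.random.randn(4, 4)
H = (M + M.conj().T) / 2               # a random Hermitian matrix

lam, U = np.linalg.eigh(H)             # real eigenvalues and a unitary U
assert np.allclose(U @ np.diag(lam) @ U.conj().T, H)
assert np.allclose(U.conj().T @ U, np.eye(4))
print(lam)                             # real, returned in increasing order
```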
We now focus on the case where these eigenvalues have a definite sign. Given Q ∈ H^n, we say it is positive definite, denoted Q > 0, if

x*Qx > 0

for all nonzero x ∈ C^n. Similarly Q is positive semidefinite, denoted Q ≥ 0, if the inequality is nonstrict; and negative definite and negative semidefinite are similarly defined. If a matrix is not positive or negative semidefinite, then it is indefinite.
The following properties of positive matrices follow directly from the definition, and are left as exercises:

• If Q > 0 and A ∈ C^{n×n}, then A*QA ≥ 0. If A is invertible, then A*QA > 0.

• If Q1 > 0, Q2 > 0, then α1 Q1 + α2 Q2 > 0 whenever α1 > 0, α2 ≥ 0. In particular, the set of positive definite matrices is a convex cone in H^n, as defined in the previous section.

At this point we may well ask, how can we check whether a matrix is positive definite? The following answer is derived from Theorem 1.9:

If Q ∈ H^n, then Q > 0 if and only if the eigenvalues of Q are all positive.

Notice in particular that a positive definite matrix is always invertible, and its inverse is also positive definite. Also a matrix is positive semidefinite exactly when none of its eigenvalues are negative; in that case the number of strictly positive eigenvalues is equal to the rank of the matrix.
An additional useful property for positive matrices is the existence of a square root. Let Q = UΛU* ≥ 0, in other words the diagonal elements λk of Λ are non-negative. Then we can define Λ^{1/2} to be the diagonal matrix with diagonal elements λk^{1/2}, and

Q^{1/2} := U Λ^{1/2} U*.

Then Q^{1/2} ≥ 0 (also Q^{1/2} > 0 when Q > 0) and it is easily verified that Q^{1/2} Q^{1/2} = Q.
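Both the eigenvalue test and the square root construction take only a few lines numerically; an illustrative sketch:

```python
import numpy as np

M = np.random.randn(3, 3)
Q = M @ M.T + 0.1 * np.eye(3)          # positive definite by construction

lam, U = np.linalg.eigh(Q)
print(np.all(lam > 0))                 # True: Q > 0 by the eigenvalue test

Q_half = U @ np.diag(np.sqrt(lam)) @ U.T
assert np.allclose(Q_half @ Q_half, Q)     # Q^{1/2} Q^{1/2} = Q
assert np.allclose(Q_half, Q_half.T)       # and Q^{1/2} is self-adjoint
```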
Having defined a notion of positivity, our next aim is to generalize the idea of ordering to matrices, namely what does it mean for a matrix to be larger than another matrix? We write

Q > S

for matrices Q, S ∈ H^n to denote that Q − S > 0. We refer to such expressions generally as matrix inequalities. Note that for matrices it may be that neither Q ≥ S nor Q ≤ S holds, i.e. not all matrices are comparable.
We conclude our discussion by establishing a very useful result, known as the Schur complement formula.

Theorem 1.10. Suppose that Q, M, and R are matrices and that M and Q are self-adjoint. Then the following are equivalent:

(a) The matrix inequalities Q > 0 and M − RQ^{−1}R* > 0 both hold;

(b) The matrix inequality  [ M   R ]
                           [ R*  Q ]  > 0  is satisfied.

Proof. The two inequalities listed in (a) are equivalent to the single block inequality

[ M − RQ^{−1}R*   0 ]
[ 0               Q ]  > 0.

Now left and right multiply this inequality by the nonsingular matrix

[ I   RQ^{−1} ]
[ 0   I       ]

and its adjoint, respectively, to get

[ M   R ]   [ I  RQ^{−1} ] [ M − RQ^{−1}R*  0 ] [ I          0 ]
[ R*  Q ] = [ 0  I       ] [ 0              Q ] [ Q^{−1}R*   I ]  > 0.

Therefore inequality (b) holds if and only if (a) holds.

We remark that an identical result holds in the negative definite case, replacing all ">" by "<".
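A numerical sanity check of Theorem 1.10 on randomly generated real data (an illustrative sketch): the block matrix tests positive definite exactly when Q > 0 and the Schur complement M − RQ^{−1}R* does.

```python
import numpy as np

def is_pd(X):
    return np.all(np.linalg.eigvalsh((X + X.T) / 2) > 0)

rng = np.random.default_rng(0)
for _ in range(200):
    M = rng.standard_normal((2, 2)); M = (M + M.T) / 2
    Q = rng.standard_normal((2, 2)); Q = Q @ Q.T + 0.1 * np.eye(2)   # Q > 0
    R = rng.standard_normal((2, 2))

    block = np.block([[M, R], [R.T, Q]])
    cond_a = is_pd(Q) and is_pd(M - R @ np.linalg.inv(Q) @ R.T)
    cond_b = is_pd(block)
    assert cond_a == cond_b
```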
Having assembled some facts about self-adjoint matrices, we move on to our final matrix theory topic.
[Figure: a convex set C containing iterates Xn, Xn+1 and a point Xmin]
1.5 Exercises
1. Find a basis for C^{m×n} as a real vector space. What is the dimension
of this real vector space?
2. The spaces H^n and S^n are both real vector spaces; find bases for each.
How are they related?
3. Determine the dimension of the set of homogeneous multinomials P3[4]. What is the general formula for the dimension of Pm[n]?
4. We consider the mapping Ψ defined in §1.1.4. Let a ∈ P3[1] be a(x1, x2, x3) = x2, and consider Ψ : P3[1] → P3[2], which is defined by

(Ψp)(x1, x2, x3) = a(x1, x2, x3) p(x1, x2, x3).

Choose bases for P3[1] and P3[2], and represent Ψ as the corresponding matrix [Ψ].
5. Suppose A : V → W. Let {Av1, ..., Avr} be a basis for Im A and {u1, ..., uk} be a basis for ker A. Show that {v1, ..., vr, u1, ..., uk} is a basis for V and deduce that dim(V) = dim(ker A) + dim(Im A).
6. Given a mapping A : V → V show that both ker A and Im A are A-invariant.
7. By direct calculation, show that given A ∈ C^{m×n} and B ∈ C^{n×m}, the identity Tr AB = Tr BA holds. Use this to prove that the trace of
any square matrix is the sum of its eigenvalues.
8. Prove that every linear functional F on S^n can be expressed as F(X) = Tr(YX) for some fixed Y ∈ S^n.
9. Suppose A is a Hermitian matrix, and that λ0 and λ1 are two eigenvalues with corresponding eigenvectors x0 and x1. Prove that if λ0 ≠ λ1, then x0 and x1 are orthogonal.
10. Show that if P ≥ 0 and Q ≥ 0, then Tr(PQ) ≥ 0.
11. Suppose A ∈ C^{m×n} has the singular value decomposition UΣV*.

(a) Show that if x ∈ span{v_{k+1}, ..., v_n} then |Ax| ≤ σ_{k+1}|x|.

(b) If A is n × n, let σ1 ≥ σ2 ≥ ··· ≥ σn be the diagonal elements of Σ. Denote σ_min(A) = σn. Show that A is invertible if and only if σ_min(A) > 0, and in that case σ_max(A^{−1}) = 1/σ_min(A).

(c) If A ∈ H^n, then −σ_max(A) I ≤ A ≤ σ_max(A) I.
12. Suppose that γ > 0 and that X ∈ R^{n×m}.

(a) Show that σ_max(X) ≤ γ if and only if X*X ≤ γ²I;

(b) Convert the constraint σ_max(X) ≤ γ to an equivalent LMI condition.
13. The spectral radius of a matrix M ∈ C^{n×n} is defined as

ρ(M) := max{ |λ| such that λ is an eigenvalue of M }.

(a) Show that ρ(M) ≤ σ_max(M) and find both numbers for

M = [ 0  1        ]
    [    0  ⋱     ]
    [       ⋱  1  ]
    [          0  ] ,

the n × n matrix with ones on the superdiagonal and zeros elsewhere.

(b) Show that

ρ(M) ≤ inf_{D invertible} σ_max(DMD^{−1}).

(c) Prove that there is equality in (b). Hint: use the Jordan form.

(d) Deduce that ρ(M) < 1 if and only if the set of LMIs

X > 0,   M*XM − X < 0

is feasible over X ∈ H^n.
14. Consider a real LMI given by

F(X) < Q,

where the linear map F : X → S^n, Q ∈ S^n, and X is a real vector space. Show that any LMI of the standard form given in §1.4 with respect to the Hermitian matrices H^n can be converted to a real LMI of this form of dimension 2n. Hint: first find a condition for a matrix A ∈ H^n to be positive definite, in the form of an LMI on Re(A) and Im(A).
15. In §1.4 two types of LMI problems were introduced, the feasibility
problem and the linear objective problem. Show how the feasibility
problem can be used with iteration to solve the latter.
16. Another type of LMI optimization problem is the so-called generalized eigenvalue minimization problem, which is

minimize:   γ
subject to: F0(X) + γ F1(X) < Q0 + γ Q1 and X ∈ X,

where F0 and F1 are linear mappings from X to S^n, and Q0, Q1 ∈ S^n. Show that the linear objective problem can be reformulated in this format. Further show that this problem can be solved by iteration using the feasibility problem.
17. Let C be a convex set in a real vector space X. A function φ : C → R is said to be convex if it satisfies

φ(λx1 + (1 − λ)x2) ≤ λφ(x1) + (1 − λ)φ(x2)

for every x1, x2 in C and every λ ∈ [0, 1]. The minimization of such a function is called a convex optimization problem. As an important example, the function φ(x) = −log(x) is convex in (0, ∞). Clearly, any linear function is convex.

(a) Prove that for a convex function, every local minimum is a global minimum.

(b) Show a function φ is convex if and only if for any x1, x2 in C, the function f(λ) := φ(λx1 + (1 − λ)x2) is convex in λ ∈ [0, 1].

(c) Prove that φ(X) = −log(det(X)) is convex in the set of positive matrices {X > 0}. Hint: use the identity

λX1 + (1 − λ)X2 = X2^{1/2} ( I + λ X2^{−1/2}(X1 − X2)X2^{−1/2} ) X2^{1/2}

and express det(λX1 + (1 − λ)X2) in terms of the eigenvalues of the Hermitian matrix H := X2^{−1/2}(X1 − X2)X2^{−1/2}.

(d) Deduce that if F : X → H^n is linear, −log(det[Q − F(X)]) is a barrier function for the set C = {X ∈ X : F(X) < Q}.
2 State Space System Theory
We will now begin our study of system theory. This chapter is devoted to
examining one of the building blocks used in the foundation of this course,
the continuous time, state space system. Our goal is to cover the fundamen-
tals of state space systems, and we will consider and answer questions about
their basic structure, controlling and observing them, and representations
of them.
The following two equations define a state space system:

ẋ(t) = Ax(t) + Bu(t),  with x(0) = x0        (2.1)
y(t) = Cx(t) + Du(t),

where u(t), x(t) and y(t) are vector valued functions, and A, B, C and D are matrices. We recall that the derivative ẋ(t) is simply the vector formed from the derivatives of each scalar entry in x(t). The first of the above equations is called the state equation and the other the output equation. The variable t ≥ 0 is time and the function u(t) is referred to as the system
input. The functions x(t) and y(t) are called the state and output of the
system respectively, and depend on the input.
For later reference we define the dimensions of the vectors by

u(t) ∈ R^m,  x(t) ∈ R^n  and  y(t) ∈ R^p.

Thus A is an n × n matrix; the matrix B is n × m; matrices C and D are p × n and p × m respectively. We will restrict ourselves to real matrices during the chapter; however, all the results we prove hold for complex matrices as well. Notice that the system given above is a first order linear differential equation with an initial condition, and therefore has a unique solution.
The state space formulation above is very general because many systems
of higher order linear differential equations can be reformulated in this
way. This generality motivates our study of state space systems. Before
considering this system as a whole we will examine simpler versions of it
to successively build our understanding.
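For readers who want to experiment alongside the theory, systems of the form (2.1) are easy to simulate numerically; a small sketch using scipy, with illustrative matrices:

```python
import numpy as np
from scipy.signal import StateSpace, lsim

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

sys = StateSpace(A, B, C, D)
t = np.linspace(0.0, 10.0, 500)
u = np.ones_like(t)                          # a step input
tout, y, x = lsim(sys, U=u, T=t, X0=[0.0, 0.0])
print(y[-1])                                 # approaches the DC gain -C A^{-1} B = 0.5
```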
2.2.1 Reachability
We begin our study by asking, given a fixed time t, what are the possible values of the state vector x(t)? Or asked another way: given a vector in R^n is it possible to steer x(t) to this value by choosing an appropriate input function u(t)? We will answer this question completely, and will find that it has a surprisingly simple answer. To do this we require three related
concepts.
Set of reachable states
For a fixed time t > 0, let Rt denote the states that are reachable at time t by some input function u. Namely Rt is the set

Rt := { x̄ ∈ R^n : there exists u such that x(t) = x̄ }.

This definition is made respecting the state equation given in (2.5), where the initial condition is zero. It turns out that Rt is a subspace of R^n. This is a simple consequence of the linearity of (2.5), which we leave as an exercise.
Controllable subspace
Next we dene a subspace associated with any state equation. Given the
state equation x_ (t) = Ax(t) + Bu(t) we call the matrix
B AB A2 B An;1 B
62 2. State Space System Theory
the associated controllability matrix. Equivalently we say it is the con-
trollability matrix of the matrix pair (A; B ). Recall that A is an n n
matrix, and notice that the dimension n plays an important role in the
above denition.
Associated with the controllability matrix is the controllability subspace,
denoted CAB , which is dened to be the image of the controllability matrix.
CAB := Im B AB An;1 B :
Thus we see that CAB , like Rt above is a subspace of Rn . When dimension
of CAB is n, or equivalently the controllability matrix has the full rank of n,
we say that (A; B ) is a controllable pair. In the same vein we refer to a state
space system as being controllable when the associated matrix pair (A; B )
is controllable. We shall soon see the motivation for this terminology.
Controllability gramian
Here we dene yet another object associated with the state equation,
a matrix which depends on time. For each t > 0, the time dependent
controllability gramian is dened to be the n n matrix
Z t
Wt := eA BB eA d :
0
Having dened the set of reachable states, the controllability subspace
and the controllability gramian we can now state the main result of this
section. It will take a number of steps to prove.
Theorem 2.2. For each time t > 0 the set equality
Rt = CAB = Im Wt holds:
This theorem says that the set of reachable states is always equal to the
controllability subspace, and is also equal to the image of the controllability
gramian. Since the controllability subspace CAB is independent of time so
is the set of reachable states: if a state can be reached at a particular time,
then it can be reached at any t > 0, no matter how small. According to
the theorem, if (A; B ) is controllable then Rt is equal to the entire state
space Rn ; that is all the states are reachable by appropriate choice of the
input function u.
Let us now move on to proving Theorem 2.2. We will accomplish this by
proving three lemmas, showing sequentially that
Rt is a subset of CAB ;
CAB is a subset of ImWt ;
ImWt is a subset of Rt .
These facts will be proved in Lemmas 2.6, 2.8 and 2.9 respectively.
2.2. Controllability 63
Before our rst result we require a fact about matrices known as the
Cayley-Hamilton theorem. Denote the characteristic polynomial of matrix
A by charA (s). That is
charA (s) := det(sI ; A) =: sn + an;1 sn;1 + a0 ;
where det() denotes the determinant of the argument matrix. Recall that
the eigenvalues of A are the same as the roots of charA (s). We now state
the Cayley-Hamilton theorem.
Theorem 2.3. Given a square matrix A the following matrix equation is
satised
An + an;1 An;1 + an;2 An;2 + + a0 I = 0;
where ak denote the scalar coecients of the characteristic polynomial of
A.
That is the Cayley-Hamilton theorem says a matrix satises its own
characteristic equation. In shorthand notation we write
charA (A) = 0:
We will not prove this result here but instead illustrate the idea behind the
proof, using the example matrices considered above.
Examples:
First consider the case of a diagonalizable matrix A as in (2.3). Then it
follows that
2 3
charA (1 ) 0
charA (A) = T charA ()T ;1 = T 64 ... 7 ;1
5T :
0 charA (n )
Now by denition each of the eigenvalues is a root of charA (s), and so we
see in this case charA (A) = 0.
Next we turn to the case of a Jordan block (2.4). Clearly in this case A
has n identical eigenvalues, and has characteristic polynomial
charA (s) = (s ; )n :
Now it is easy to see that charA (A) = 0 since
charA (A) = (A ; I )n = N n = 0;
where N is the nilpotent matrix dened in the earlier examples.
The general case of the Cayley-Hamilton theorem can be proved using
the ideas from these examples and the Jordan decomposition, and you are
asked to do this in the exercises.
The signicance of the Cayley-Hamilton theorem for our purposes is
that it says the matrix An is a linear combination of the matrix set
64 2. State Space System Theory
fAn;1 ; An;2 ; : : : ; A; I g. Namely
An 2 spanfAn;1 ; An;2 ; : : : ; A; I g:
More generally we have the next proposition.
Proposition 2.4. Suppose k n. Then there exist scalar constants
0 ; : : : ; n;1 satisfying
Ak = 0 I + 1 A + + n;1 An;1
We now return for a moment to the matrix exponential and have the
following result.
Lemma 2.5. There exist scalar functions 0 (t); : : : ; n;1(t) such that
eAt = 0 (t)I + 1 (t)A + + n;1 (t)An;1 ;
for every t 0.
The result says that the time dependent matrix exponential eAt can be
written as a nite sum, where the time dependence is isolated in the scalar
functions k (t). The result is easily proved by observing that, for each t 0,
the matrix exponential has the expansion
2 3
eAt = I + At + (At2!) + (At3!) + :
The result then follows by expanding Ak , for k n, using Proposition 2.4.
We are ready for the rst step in the proof of our main result, which
is to prove that the reachable states are a subset of the image of the
controllability matrix.
Lemma 2.6. The set of reachable states Rt is a subset of the controllability
subspace CAB .
Proof . Fix t > 0 and choose any reachable state 2 Rt. It is sucient to
show that 2 CAB . Since 2 Rt there exists an input function u such that
Z t
= eA(t; )Bu( )d :
0
Now substitute the expansion for the matrix exponential from Lemma 2.5
to get
Z t Z t
= 0 (t ; )Bu( )d + + An;1 n;1 (t ; )Bu( )d :
0 0
Writing this as a product we get
2 Rt 3
0 0 (t ; )u( )d
= B AB An;1 B
6 .. 7
:
4
Rt
. 5
Corollary 2.15. The matrix pair (A; B) is stabilizable if and only if the
condition
rank A ; I B = n holds for all 2 C+ :
This corollary states that one need only check the above rank condition at
the unstable eigenvalues of A, that is those in the closed right half-plane
C + . This is a much simpler task than determining stabilizability from the
denition.
72 2. State Space System Theory
2.2.4 Controllability from a single input
This section examines the special case of a pair (A; B ) where B 2 Rn1 .
That is our state space system has only a single scalar input. This investi-
gation will furnish us with a new system realization which has important
applications.
A matrix of the form
2 3
0 1 0
6
6 ... ... 7
7
6 7
4 0 0 1 5
;a0 ;a1 ;an;1
is called a companion matrix, and has the useful property that its
characteristic polynomial is
sn + an;1 sn;1 + + a0 :
The latter fact is easily veried. We now prove an important theorem which
relates single input systems to companion matrices.
Theorem 2.16. Suppose (A; B) is a controllable pair, and B is a column
vector. Then there exists a similarity transformation T such that
2 3 2 3
0 1 0 0
6 ... ... 7 6.7
TAT ;1 = 66 7
7 and TB = 66 .. 77 :
4 0 0 1 5 405
;a0 ;a1 ;an;1 1
The theorem states that if a single input system is controllable, then it can
be transformed to the special realization above, where A is in companion
form and B is the rst standard basis vector. This is called the controllable
canonical realization, and features the A-matrix in companion form with
the B -matrix the n-th standard basis vector of Rn . To prove this result we
will again use the Cayley-Hamilton theorem.
2.4 Observability
In the last section we studied a special case of the system equations pre-
sented in (2.1), which had an input but no output. We will now in contrast
78 2. State Space System Theory
consider a system with an output but no input, the system is
x_ (t) = Ax(t); with x(0) = x0
y(t) = Cx(t):
This system has no input, and its solution depends entirely on the initial
condition x(0) = x0 . The solution of this equation is clearly
y(t) = CeAt x0 ; for t 0:
We regard the function y(t) as an output and will now focus on the question
of whether we can determine the value of x0 by observing the variable y(t)
over a time interval.
2.4.2 Observers
Frequently in control problems it is of interest to obtain an estimate of the
state x(t) based on past values of the output y. From our investigation in
the last section we learned under what conditions x(0) can be determined
exactly by measuring y. If they are met there are a number of ways to do
this; we will investigate an important one in the exercises. Right now we
focus on determining asymptotic estimates for x(t).
We will pursue this goal using our full state space system
x_ (t) = Ax(t) + Bu(t); x(0) = x0 ;
y(t) = Cx(t) + Du(t):
Our objective is to nd an asymptotic approximation of x(t) given the
input u and the output y, but without knowledge of the initial condition
x(0).
There are a number of ways to address this problem, and the one we
pursue here is using a dynamical system called an observer. The system
equations for an observer are
w_ (t) = Mw(t) + Ny(t) + Pu(t); w(0) = w0
x^(t) = Qw(t) + Ry(t) + Su(t): (2.11)
So the inputs to the observer are u and y, and the output is x^. The key
property that the observer must have is that the error
!1 0
x^(t) ; x(t) t;!
for all initial conditions x(0) and w(0), and all system inputs u.
Theorem 2.23. An observer exists, if and only if, (C; A) is detectable.
Furthermore, in that case one such observer is given by
x^_ (t) = (A + LC )^x(t) ; Ly(t) + (B + LD)u(t) (2.12)
where the matrix L is chosen such that A + LC is Hurwitz.
Notice in the observer equation (2.12) that we have x^ = w, so we have
removed w for simplicity. An observer with this structure is called a full-
order Luenberger observer.
82 2. State Space System Theory
Proof . We rst show the necessity of the detectability condition, by con-
trapositive. We will show that if (C; A) is not detectable, there always exist
initial conditions x(0) and w(0), and an input u(t), where the observer er-
ror does not decay to zero. In particular, we will select u = 0, w(0) = 0,
and x(0) to excite only the unobservable states, as is now explained.
Without loss of generality we will assume the system is already in observ-
ability form. For the general case, a state-transformation must be added to
the following argument; this is left as an exercise. With this assumption,
and u = 0, we have the equations
x_ 1 (t) = A11 x1 (t)
x_ 2 (t) = A21 x1 (t) + A22 x2 (t)
y(t) = C1 x1 (t):
Notice that if (C; A) is not detectable, A22 is not a Hurwitz matrix.
Therefore x2 (0) can be chosen so that the solution to
x_ 2 (t) = A22 x2 (t)
does not tend to zero as t ! 1. Now choosing x1 (0) = 0, it is clear that
x1 (t), and also y(t), will be identically zero for all time, while x2 (t) evolves
according to the autonomous equation and does not tend to zero.
Now turning to a generic observer equation of the form (2.11), we see
that if w(0) = 0, then w(t) and therefore x^(t) are identically zero. This
contradicts the fact that the error x^(t) ; x(t) tends to zero, so we have
shown necessity.
The proof of suciency is constructive. We know that if (C; A) is de-
tectable, L can be chosen to make A + LC a Hurwitz matrix. With this
choice, we construct the observer in (2.12). Substituting the expression for
y into (2.12) we have
x^_ (t) = (A + LC )^x(t) ; L(Cx(t) + Du(t) ) + (B + LD)u(t)
= (A + LC )^x(t) ; LCx(t) + Bu(t):
Now if we subtract the state equation from this, we obtain
x^_ (t) ; x_ (t) = (A + LC )^x(t) ; LCx(t) + Bu(t) ; Ax(t) ; Bu(t)
= (A + LC )(^x(t) ; x(t) )
This means that the error e = x^ ; x satises the autonomous dierential
equation
e_(t) = (A + LC )e(t)
Since A + LC is a Hurwitz matrix, e(t) tends to zero as required.
2.4. Observability 83
2.4.3 Observer-Based Controllers
We will now brie
y exhibit a rst example of feedback control, combin-
ing the ideas of the previous sections. Feedback consists of connecting a
controller of the form
x_ K (t) = AK xK (t) + BK y(t)
u(t) = CK xK (t) + DK y(t)
to the open-loop system (2.1), so that the combined system achieves some
desired properties. A fundamental requirement is the stability of the re-
sulting autonomous system. We have already seen that in the case where
the state is available for feedback (i.e. when y = x), then a static control
law of the form
u(t) = Fx(t)
can be used to achieve desired closed-loop eigenvalues, provided that (A; B )
is controllable. Or if it is at least stabilizable, the closed loop can be made
internally stable.
The question arises as to whether we can achieve similar properties for
the general case, where the output contains only partial information on the
state. The answer, not surprisingly, will combine controllability properties
with observability of the state from the given output. We state the following
result.
Proposition 2.24. Given open-loop system (2.1), the controller
x^_ (t) = (A + LC + BF + LDF )^x(t) ; Ly(t)
u(t) = F x^(t)
is such that the closed loop system has eigenvalues exactly at the eigenvalues
of the matrices A + BF and A + LC .
Notice that given the choice of u, the rst equation can be rewritten as
x^_ (t) = (A + LC )^x(t) ; Ly(t) + (B + LD)u(t)
and has exactly the structure of a Luenberger observer (2.12) for the state
x. Given this observation, the expression for u can be interpreted as be-
ing analogous to the state feedback law considered above, except that the
estimate x^ is used in place of the unavailable state.
Proof . Combining the open loop system with the controller, and elimi-
nating the variables u, y, leads to the combined equations
x_ (t) = A BF x(t)
x^_ (t) ;LC A + LC + BF x^(t)
Now the similarity transformation
T = I0 ;II
84 2. State Space System Theory
(which amounts to replacing x by x ; x^ as a state variable) leads to
A BF ; 1
T ;LC A + LC + BF T = ;LC A + BF A + LC 0
2.7 Exercises
1. Consider the following set of dierential equations
u1 + c u2 = q1 + b q_2 + q1 ;
u2 = q2 + d q1 + e q_2 + f q1 ;
where the dependent functions are q1 and q2 . Convert this to a 2 2
state space system.
2. Suppose A is Hermitian and is Hurwitz. Show that if x(t) = eAt x(0)
then jx(t)j < jx(0)j, for each t > 0. Here j j denotes the Euclidean
length of the vector.
3. Prove that e(N +M )t = eNteMt if N and M commute. Use the
following steps:
(a) Show that eNt and M commute;
94 2. State Space System Theory
(b) By taking derivatives directly show that the time derivative of
Q(t) = eNt eMt satises Q_ (t) = (N + M )eNteMt ;
(c) Since Q(t) satises Q_ (t) = (N + M )Q(t), with Q(0) = I , show
that Q(t) = e(N +M )t .
Notice in particular this demonstrates eN +M = eN eM .
4. Using Jordan decomposition prove Proposition 2.4.
5. In the proof of Lemma 2.8 we use that fact that given two subspaces
V1 and V2 , then V1 V2 if and only if V1? V2? . Prove this fact.
Hint: rst show that for any subspace V = (V ? )? .
6. Prove Proposition 2.7.
7. Using a change of basis transform the pair
2 3 2 3
0 1 0 1 1 1
A = 64 00 02 ;11 00 75 B = 64 11 00 75
6 7 6 7
0 ;1 1 1 0 0
~ ~12 B~1 where (A~11 ; B~1 ) is
to the form A~ = A011 A A~22 and B~ = 0
controllable.
8. Fill in the details of the proof of Theorem 2.12 on controllability form.
9. (a) Consider the discrete time state equation
xk+1 = Axk + Buk ; with x0 = 0:
A state 2 Rn is said to be reachable if there exists a sequence
u0 ; : : : ; uN ;1 such that xN = .
(i) Show that reachable if and only
if it is in the image of the
matrix B AB An;1 B ;
(ii) If is reachable the length N of the sequence uk can be
chosen to be at most n.
(b) Given the uncontrolled discrete time state space system
xk+1 = Axk ; with initial condition x0
yk = Cxk ;
we say it is observable if each x0 gives rise to a unique output
sequence yk .
(i) Show that the above system is observable if and only if
(A ; C ) is controllable;
(ii) Prove that if the system is observable, the initial condition
x0 can be determined from the nite sequence y0 ; : : : ; yn;1 .
10. Provide the details of a proof for Corollary 2.15.
2.7. Exercises 95
11. Using our proofs on controllability and stabilizability as a guide, prove
all parts of Proposition 2.22.
12. Given a pair (C; A), devise a test to determine whether it is possible
to nd a matrix L so that all the eigenvalues of A + LC are equal to
;1; you should not have to explicitly construct L.
13. In this question we derive the famous Kalman decomposition, which
is a generlization of the controllability and oberservability forms we
have studied in this chapter. Suppose we are given a matrix triple
(C; A; B ).
(a) Prove that the subspace intersection CAB \ NCA is A-invariant;
(b) Show that there exists a similarity transformation T so that
2 3 2 3
A1 0 A6 0 B1
T ;1AT = 64 02 03 A4 05 75 ; T ;1B = 64B02 75 ;
A A A A
6 7 6 7
7
0 0 A8 A9 0
and CT = C1 0 C2 0 . Furthermore the following proper-
ties are satised.
the pair (C1 ; A1 ) is observable;
the pair
(A1 ; B1 ) iscontrollable;
the pair A A ; BB1 is controllable;
A 1
2 3
0
2
the pair C1 ; C2 ; A01 AA67 is observable.
3
Linear Analysis
One of the prevailing viewpoints for the study of systems and signals is
that in which a dynamical system is viewed as a mapping between input
and output functions. This concept underlies most of the basic treatments
of signal processing, communications, and control. Although a functional
analytic perspective is implicitly used in these areas, the associated ma-
chinery is not typically used directly for the study of dynamical systems.
However incorporating more machinery from analysis (function spaces, op-
erators) into this conceptual picture, leads to methods of key importance
for the study of systems. In particular, operator norms provide a natural
way to quantify the \size" of a system, a fundamental requirement for a
quantitative theory of system uncertainty and model approximation.
In this chapter we introduce some of the basic concepts from analysis
that are required for the development of robust control theory. This involves
assembling some denitions, providing examples of the central objects and
presenting their important properties. This is the most mathematically
abstract chapter of the course, and is intended to provide us with a solid
and rigorous framework. At the same time, we will not attempt a completely
self contained mathematical treatment, which would constitute a course in
itself. To build intuition, the reader is encouraged to supply examples and
proofs in a number of places. For detailed proofs of the most technical
results presented consult the references listed at the end of the chapter.
During the chapter we will encounter the terminology \for almost ev-
ery" and \essential supremum", which refer to advanced mathematical
concepts that are not part of the core of our course. These clauses are
used to make our statements precise, but can be replaced by \for every"
98 3. Linear Analysis
and \supremum", respectively, without compromising understanding of the
material. If these former terms are unfamiliar, precise denitions and a
short introduction to them are given in Appendix A.
equipped with the L2 norm. To see this consider the sequence of functions
given by
wk (t) := e;t ; for 0 t k,
0; otherwise:
This sequence is clearly in W , and is a Cauchy sequence since kwk ; wl k
e; min(k;l) . The sequence wk is in L2 [0; 1) and it is easy to verify that it
converges in that space to the function
w(t) := e;t ; for 0 t,
0; otherwise:
But w is not in W , and thus W is not complete.
We have introduced the concept of completeness of a normed or inner
product space, and throughout the course we will work with spaces of this
type.
3.2. Operators 103
3.2 Operators
As we have mentioned in the previous examples, normed spaces can be used
to characterize time domain functions, which we informally call signals.
We now examine mappings from one normed space V to another normed
space Z . These are central to the course as they will eventually be used to
represent systems. The focus of interest will be linear, bounded mappings.
Denition 3.4. Suppose V and Z are Banach spaces. A mapping from V
to Z is called a linear, bounded operator if
(a) (Linearity) F (1 v1 + 2 v2 ) = 1 F (v1 ) + 2 F (v2 ) for all v1 ; v2 2 V
and scalars 1 , 2 .
(b) (Boundedness) There exists a scalar 0, such that
kFvkZ kvkV for all v 2 V : (3.1)
The space of all linear, bounded operators mapping V to Z is denoted by
L( V ; Z ) ;
and we usually refer to linear, bounded operators as simply operators. We
dene the induced norm on this space by
kF kV!Z = sup kkFv v k
kZ ;
v2V ; v6=0 V
and it can be veried that it satises the properties of a norm. Notice that
kF kV!Z is the smallest number that satises (3.1). When the spaces in-
volved are obvious we write simply kF k. It is possible to show that L(V ; Z )
is a complete space, and we take this fact for granted. If V = Z we use the
abbreviation L(V ) for L(V ; V ).
We also have a natural notion of composition of operators. If F 2 L(V ; Z )
and G 2 L(Z ; Y ) and the composition GF is dened by
(GF )v := G(Fv); for each v in V .
Clearly, GF is a linear mapping, and it is not dicult to show that
kGF kV!Y kGkZ!Y kF kV!Z ; (3.2)
which implies GF 2 L(V ; Y ). The above submultiplicative property of in-
duced norms has great signicance in robust control theory; we shall have
more to say about it below.
Examples:
As always our simplest example comes from vectors and matrices. Given
a matrix M 2 C mn , it denes a linear operator C n ! C m by matrix
multiplication in the familiar way
z = Mv where v 2 C n :
104 3. Linear Analysis
As in Chapter 1, we will not distinguish between the matrix M and the
linear operator it denes. From the singular value decomposition in Chapter
1, if we put the 2-norm on C m and C n then
kM kCm !C n = (M ):
If we put norms dierent from the 2-norm on C m and C n , then it should be
clear that the induced norm of M may be given by something other than
(M ).
Another example that is crucial in studying systems is the case of con-
volution operators. Suppose f is in the space L11 [0; 1) of scalar functions,
then it denes an operator F : L1 [0; 1) ! L1 [0; 1) by
Z t
(Fu)(t) := f (t ; )u( )d; t 0;
0
for u in L1 [0; 1). To see that F is a bounded mapping on L1[0; 1) notice
that if u 2 L1 [0; 1) and we set y = Fu then the following inequalities are
satised for any t 0.
Z t
jy(t)j = j f (t ; )u( )d j
0
Z t
jf (t ; )j ju( )jd
Z0 1
jf ( )jd kuk1 = kf k1 kuk1:
0
Therefore we see that kyk1 kf k1 kuk1 since t was arbitrary, and more
generally that
kF kL1!L1 kf k1:
It follows that F is a bounded operator. The next proposition says that
the induced norm is exactly given by the 1-norm of f , as one might guess
from our argument above.
Proposition 3.5. With F dened on L1[0; 1) as above we have
kF kL1!L1 = kf k1:
Proof . We have already shown above that kF kL1!L1 kf k1 and so
it remains to demonstrate kF kL1!L1 kf k1. We accomplish this by
showing that, for every > 0, there exists u, with kuk1 = 1 such that
kFuk1 + kf k1:
Choose > 0 and t such that
Z t
jf (t ; )jd + > kf k1:
0
3.2. Operators 105
Now set u( ) = sgn( f (t ; ) ), that is the sign of f (t ; ), and notice that
kuk1 1. Then we get
Z t
(Fu)(t) = f (t ; )sgn(f (t ; )) d
0
Z t
= jf (t ; )j d:
0
The function (Fu)(t) is continuous and therefore we get kFuk1 +
kf k1.
Thus this example shows us how to calculate the L1 induced norm of
a convolution operator exactly in terms of the 1-norm of its convolution
kernel.
We now introduce the adjoint of an operator in Hilbert space.
Denition 3.6. Suppose V and Z are Hilbert spaces, and F 2 L(V , Z ).
The operator F 2 L(Z ; V ) is the adjoint of F if
hz; FviZ = hF z; viV
for all v 2 V and z 2 Z .
We now look at some examples.
Examples:
The simplest example of this is where V = C n and Z = C m , which are
equipped with the usual inner product, and F 2 C mn . Then the adjoint
of the operator F is exactly equal to its conjugate transpose of the matrix
F , which motivates the notation F .
Another example is given by convolution operators: suppose that f is a
scalar function in L1 [0; 1) and the operator Q on L2 [0; 1) is again dened
by
Z t
(Qu)(t) := f (t ; )u( )d; (3.3)
0
for u in L2 [0; 1). Then we leave it to the reader to show that the adjoint
operator Q is given by
Z 1
(Q z )(t) = f ( ; t)z ( ) d; for z in L2 [0; 1),
t
where f denotes the complex conjugate of the function f .
Some basic properties of the adjoint are given in the following statement;
while the existence proof is beyond the scope of this course, the remaining
facts are covered in the exercises at the end of the chapter.
106 3. Linear Analysis
Proposition 3.7. For any F 2 L(V ; Z ), the adjoint exists, is unique and
satises
kF k = kF k = kF F k 21 : (3.4)
An operator F is called self-adjoint if F = F , which generalizes the
notion of a Hermitian matrix. It is easy to show that when F is self-adjoint,
the quadratic form
(v) := hFv; vi
takes only real values. Therefore we have : (V ; V ) ;! R. Such quadratic
forms play an important role later in the course.
As in the matrix case, it is natural to inquire about the sign of such
forms. We will say that a self-adjoint operator is
Positive semidenite (denoted F 0) if hFv; vi 0 for all v 2 V .
Positive denite (denoted F > 0) if, there exists > 0, such that
hFv; vi kvk2, for all v 2 V .
We remark that an operator F satisfying hFv; vi 0, for all nonzero
v 2 V , is not guaranteed to be positive as it is in the matrix case; the
exercises provide examples of this. An important property is that given an
operator in one of these classes, there always exists a square root operator
F 21 of the same class, such that (F 21 )2 = F .
We now introduce another important denition involving the adjoint. An
operator U 2 L(V ; Z ) is called isometric if it satises
U U = I;
The reason for this terminology is that these operators satisfy
hUv1 ; Uv2 i = hU Uv1 ; v2 i = hv1 ; v2 i;
for any v1 ; v2 2 V , i.e. the operator preserves inner products. In particular,
isometries preserve norms and therefore distances: they are \rigid" trans-
formations. A consequence of this is that they satisfy kU k = 1, but the
isometric property is clearly more restrictive.
An isometric operator is called unitary if U = U ;1 , in other words
U U = I and UU = I:
Unitary operators are bijective mappings that preserve the all the structure
of a Hilbert space; if U 2 L(V ; Z ) exists, the spaces V ; Z are isomorphic,
they can be identied from an abstract point of view.
Example:
A matrix U 2 C mn whose columns u1 ; : : : ; un are orthonormal vectors in
C m is an isometry; if in addition m = n, U is unitary.
3.2. Operators 107
Before leaving this section we again emphasize that we use the same
notation \ * " to denote complex conjugate of a scalar, complex conjugate
transpose of a matrix, and adjoint of an operator; a moment's thought will
convince you that the former two are just special cases of the latter and
thus this helps keep our notation economical.
but (Q~ ) = 10 which does not satisfy the hypothesis of the theorem.
Having considered some simple examples | soon we will consider innite
dimensional ones | let usPturn to the proof of the theorem. We start by
explaining the meaning of 1k=0 Q . If this sum is a member of B , this sum
k
3.2. Operators 109
represents the unique element L that satises
n
X
lim kL ;
n!1
Qk k = 0;
k=0
Pn k
that is L = limn!1 k=0 Q . We also have the following technical lemma;
we leave the proof to the reader as it is a good exercise using the basic
properties discussed so far.
Lemma 3.10. Suppose Q and P1k=0 Qk are both in the Banach algebra B.
Then
1
X 1
X
Q Qk = Qk :
k=0 k=1
We are now ready to prove the theorem. The proof of the theorem relies
on the completeness property of the space B, and the submultiplicative
property.
Proof of Theorem 3.9. Our rst task is to demonstrate that P1k=0 Qk
is an element of B. Since by assumption
Pn B is complete it is sucient to
show that the sequence Tn := k=0 Qk is a Cauchy sequence. By the
submultiplicative inequality we have that
kQk k kQkk :
For m > n we see that
m
X m
X m
X
kTm ; Tnk = k Qk k kQk k kQkk ;
k=n+1 k=n+1 k=n+1
where the left hand inequalityP follows by the triangle inequality.
m;n
It is
m
straightforward to show that k=n+1 kQk = kQk
k n +1 1 ;kQ k
1;kQk , since
the former is a geometric series. We conclude that
kTm ; Tn k 1k;QkkQk ;
n+1
L2 (;1?; 1)
?
;!
G
L2 (;1
? ; 1)
?
??y ?
?
y
L^ 2(j R) ;!
M^G
L^ 2(j R)
L2[0?; 1)
?
;!
G
L2 [0? ; 1)
?
??y ?
?
y
H2 ;!
M^G
H2
Examples:
As a nal illustrative example, we turn once more to the case of rational
functions. We dene the sets RL^ 1 and RH1 to consist of rational, matrix
functions that belong respectively to L^ 1 and H1 . It is not hard to see
that a matrix rational function is:
126 3. Linear Analysis
(i) in RL^ 1 if it is proper and has no poles on the imaginary axis.
(ii) in RH1 if it is proper and has no poles on the closed right half plane
C+ :
4
Model realizations and reduction
As this picture shows, these two ellipsoids need not be aligned, and there-
fore we can have a situation as drawn here, where the major and minor
axes of the ellipsoids are nearly opposite. Therefore we reason that the
most intuition would be gained about the system if the controllability and
observability ellipsoids were exactly aligned, and is certainly the natural
setting for this discussion. This raises the question of whether or not it is
possible to arrange such an alignment by means of a state transformation;
answering this is our next goal.
A change of basis to the state space of the above system yields the
familiar transformed realization
A~ = TAT ;1; B~ = TB; and C~ = CT ;1 ;
where T is a similarity transformation. The controllability gramian
associated with this new realization is
1 ~
Z
X~ c = eA B~ B~ eA~ d
Z0 1
= TeA T ;1TBB T (T );1 eA T d
0
= TXcT
142 4. Model realizations and reduction
Similarly the observability gramian in the transformed basis is
Y~o = (T );1 Yo T ;1 :
The above are called congruence transformations, which typically arise
from quadratic forms under a change of basis. The following result concerns
simultaneous diagonalization of positive denite matrices by this kind of
transformation.
Proposition 4.7. Given positive denite matrices X and Y , there exists
a nonsingular matrix T such that
TXT = (T );1 Y T ;1 =
where is a diagonal, positive denite matrix.
Proof . Perform a singular value decomposition of the matrix X 21 Y X 21 to
get
X 12 Y X 21 = U 2 U ;
where U is unitary and is diagonal, positive denite. Therefore we get
; 12 U X 12 Y X 12 U ; 12 = :
Now set T ;1 = X 21 U ; 21 and the above states that (T ;1 ) Y T ;1 = .
Also
TXT = ( 12 U X ; 21 ) X (X ; 12 U 21 ) = :
Applying this proposition to the gramians, the following conclusion can
be drawn:
Corollary 4.8. Suppose (A; B; C ) is a controllable and observable realiza-
tion. Then there exists a state transformation T such that the equivalent
~ B;
realization (A; ~ C~ ) = (TAT ;1; TB; CT ;1) satises
X~c = Y~o =
with > 0 diagonal.
A state space realization such that the controllability and observability
gramians are equal and diagonal, is called a balanced realization. The pre-
vious corollary implies that there always exists a balanced realization for
a transfer function in RH1 ; in fact, starting with any minimal realization
(which will necessarily have A Hurwitz), a balanced one can be obtained
from the above choice of state transformation.
Clearly the controllability and observability ellipsoids for the system are
exactly aligned when a system is balanced. Thus the states which are the
least controllable are also the least observable. Balanced realizations play
a key role in the model reduction studies of the rest of this chapter.
4.5. Hankel operators 143
Before leaving this section, we state a generalization of Proposition
4.7 which covers the general case, where the original realization is not
necessarily controllable or observable.
Proposition 4.9. Given positive semi-denite matrices X and Y there
exists a nonsingular matrix T such that
2 3
1
(a) TXT = 64
6 2 7
0 5,
7
0
2 3
1
6
(b) (T );1 Y T ;1 = 64 0 7
3 5,
7
0
where the matrices k are diagonal and positive denite.
When applied to the gramians Xc and Yo , we nd that if the system is
either uncontrollable or unobservable, then each of the k blocks of Y~o and
X~c have the following interpretation:
1 captures controllable and observable states
2 captures controllable and unobservable states
3 captures observable and uncontrollable states
4 captures unobservable and uncontrollable states.
Under such a transformation the state matrix A~ has the form
2~
A1 0 A~6 0 3
6~ ~ ~4 A~5 77
A~ = 64A02 A03 AA~7 0 5 ;
0 0 A~8 A~9
which is the so-called Kalman decomposition; this can be veried from
the various invariance properties of the controllability and observability
subspaces.
t
Gu
t
;G u
t
Figure 4.3. Results of G and ;G operating on a given u 2 L (;1; 0].
2
c o
Cn
Notice that ;G has at most rank n for a state space system of order n;
namely the dimension of Im(;G ) is at most n.
Our rst result on the Hankel operator tells us how to compute its
L2 (;1; 0] ! L2[0; 1) induced norm. Below max denotes the largest
eigenvalue of the argument matrix.
Proposition 4.10. The norm of ;G is given by
k;G kL2!L2 = (max (Yo Xc ) ) 21 ;
where Yo and Xc are the gramians of the realization (A; B; C ).
Proof . We begin with the following identity established in Chapter 3:
k;Gk2 = (;G ;G ) = ( c o o c ) ;
where () denotes spectral radius. We recall as well that the spectral radius
is invariant under commutation, and obtain
k;Gk2 = ( o o c c ) = (Yo Xc ) :
Now, Yo Xc is a matrix, and has only non-negative eigenvalues; for this
latter point,1 observe
12
that the nonzero eigenvalues of Yo Xc coincide with
those of Xc Yo Xc 0.
2
Therefore (Yo Xc ) = max (Yo Xc), which completes the proof.
We remark that it can also be shown that ;G ;G , a nite rank operator,
has only eigenvalues in its spectrum, namely the eigenvalues of Yo Xc in
addition to the eigenvalue zero. For this reason, in analogy to the matrix
case, the square roots of the eigenvalues of Yo Xc are called the singular
values of ;G or the Hankel singular values of the system G. We order these
and denote them by
1 2 n :
146 4. Model realizations and reduction
Clearly 1 = k;Gk.
Since the operator ;G depends only on the input-output properties of G,
it is clear that the Hankel singular values must be the same for two equiv-
alent state-space realizations. This can be veried directly for realizations
related by a state transformation T . Recall that the gramians transform
via
Y~0 = (T ;1 ) Yo T ;1;
X~c = TXc T ;
and therefore
Y~0 X~c = (T ;1) Yo XcT :
Thus we have Y~0 X~c and Yo Xc are related by similarity and their eigenvalues
coincide.
If we have realizations of dierent order (one of them non-minimal) of a
given G^ , their Hankel singular values can only dier on the value 0, which
appears due to lack of observability or controllability; in fact, the minimal
order can be identied with the number of non-zero Hankel singular values
of any realization.
In the special case of a balanced realization Xc = Y0 = , we see that
the diagonal of is formed precisely by the Hankel singular values.
We now turn to the question of relating the Hankel singular values to the
norm of the original operator G. While these are dierent objects we will
see that important bounds can be established. The rst one is the following.
Proposition 4.11. With the above denitions,
1 = k;G k kGk = kG^ k1:
Proof . Since the projection P+ has induced norm 1, then
k;Gk = kP+ GL2 (;1;0] k kGL2 (;1;0] k kGk:
For the last step, notice that the norm of an operator cannot increase when
restricting it to a subspace. In fact, it is not dicult to show that this step
is an equality.
Thus the largest Hankel singular value provides us with a lower bound
for the H1 norm of the transfer function G^ . The next result provides an
upper bound in terms of the Hankel singular values.
Proposition 4.12. Suppose G^(s) = C (Is;A);1 B and that A is a Hurwitz
matrix. Then
kG^ k1 2(1 + + n ) ;
where the k are the Hankel singular values associated with (A; B; C ).
4.6. Model reduction 147
A proof with elementary tools is given in the exercises at the end of the
chapter. In the next section we will see a slight renement of this bound
using more advanced methods.
The Hankel operator plays an important part in systems theory, as well
as pure operator theory, and is intimately related to the question of ap-
proximation in the kk1 norm. The latter has many connections to robust
control theory; in this chapter we will use it in connection to model re-
duction. For further discussion consult the references at the end of the
chapter.
4.6.1 Limitations
Faced with the approximation problem above it is natural for us to ask
whether there are some fundamental limits to how well we can approximate
a given system with a lower order one. To work towards the answer, as
well as to build intuition, we begin by studying a matrix approximation
problem that is strongly related to the system approximation problem:
given a square matrix N 2 Rnn , how well can it be approximated, in
the maximum singular value norm, by a matrix of rank r? We have the
following result:
4.6. Model reduction 149
Lemma 4.13. Suppose that N 2 Rnn , with singular values 1 2
n . Then for any R 2 Rnn , rank(R) r,
(N ; R) r+1 :
Proof . We start by taking the singular value decomposition of N :
2 3
1
N = U V where = 64 ... 7
5 ;
n
and matrices U; V are both unitary.
Now dene the column vectors vk by
[v1 vn ] = V ;
and consider the r+1-dimensional subspace described by spanfv1 ; : : : ; vr+1 g.
Since the subspace ker(R) has dimension at least n ; r, these two sub-
spaces must intersect non-trivially. Therefore spanfv1 ; : : : ; vr+1 g contains
an element x, with jxj = 1, such that
Rx = 0 :
The vectors vk are orthonormal since V is unitary and so x can be expressed
by
r+1
X r+1
X
x= k vk with 2k = 1;
k=1 k=1
for appropriately chosen scalars k .
We now let the matrix N ; R act on this vector x to get
r+1
X r+1
X
(N ; R)x = Nx = k Nvk = k k uk
k=1 k=1
where we dene uk by [u1 un ] = U . The uk are orthonormal and so we
have
r+1
X
j(N ; R)xj2 = k2 2k :
k=1
To complete the proof notice that since the singular values are ordered we
have
r+1
X r+1
X
k2 2k r2+1 2k = r2+1 ;
k=1 k=1
and so
(N ; R) j(N ; R)xj r+1 :
150 4. Model realizations and reduction
The preceding Lemma shows us that there are fundamental limits to the
matrix approximation problem in the () norm, and that singular values
play a key role in characterizing these limits. What does this have to do with
the system model reduction problem? The key connection is given by the
Hankel operator, since as we saw the dimension of a state-space realization
is the same as the rank of the Hankel operator. Not surprisingly, then, we
can express the fundamental limitations to model reduction in terms of the
Hankel singular values:
Theorem 4.14. Let 1 2 r r+1 n > 0 be the
Hankel singular values associated with the realization (A; B; C ) of G^ . Then
for any G^ r of order r,
kG^ ; G^ r k1 r+1 :
This result says that if we are seeking a reduced order model of state
dimension r, then we cannot make the approximation error smaller than
the (r + 1)-th Hankel singular value of the original system.
Proof . To begin we have that
kG^ ; G^ r k1 k;G;Gr k = k;G ; ;Gr k :
It therefore suces to demonstrate that
k;G ; ;Gr k r+1 :
As we have noted before these Hankel operators satisfy
rank (;G ) n and rank (;Gr ) r ;
which is a fact we will now exploit.
We recall the denitions of the observability and controllability operators
o and c associated with (A; B; C ), and the identity
;G = o c :
Now we dene the maps Po : L2[0; 1) ! Rn and Pc : Rn ! L2 (;1; 0] by
1 1
Po = Y0; 2 o and Pc = c Xc; 2 ;
and verify that kPo k = kPck = 1. We therefore have
k;G ; ;Gr k = kPo k k;G ; ;Gr k kPc k (Po ;G Pc ; P0 ;Gr Pc ) ;
(4.6)
where the submultiplicative inequality is used. We further have that
1 1 1 1
Po ;G Pc = Yo2 o o c c Xo2 = Yo2 Xc2 ;
which has rank equal to n since its singular values are 1 ; : : : ; n which are
all positive. But matrix Po ;Gr Pc has rank at most equal to r since rank
(;Gr ) r. Invoke Lemma 4.13 to see that
(Po ;GPc ; Po ;Gr Pc ) r+1 :
4.6. Model reduction 151
Now apply (4.6) to arrive at the conclusion we seek.
;C1 C1 C2 0
Notice that by Proposition 4.15, this realization has a Hurwitz \A"-matrix.
Our technique of proof will be to construct an \allpass dilation" of E11 (s),
by nding
E 11 (s
E (s) = E (s) E (s)) E 12 (s )
21 22
that contains E11 as a sub-block and is, up to a constant, inner. If this can
be done, the error norm can be bounded by the norm of E (s) which is easy
to compute.
We rst use the similarity transformation
2 3
I I 0
T = 4I ;I 05 ;
0 0 I
to get
2 3
A11 0 A12 =2 B1
E^11 (s) = 64 A021 ;AA1121 ;AA1222=2 B02 75 :
6 7
0 ;2C1 C2 0
Next, we dene the augmentation as
2 3
A11 0 A12 =2 B1 0
6 0 A11 ;A12 =2 0 ;
r+1 1 C1 77
1
6
^
E (s) = 6
6 A 21 ; A 21 A 22 B 2 ;C2 77
4 0 ;2C1 C2 0 2r+1 I 5
;2r+1 B1 1 ; 1 0
;B2 2r+1 I 0
A B :
=: C D
Here we are using the notation
= 01 0 I
r+1
for the balanced gramian of the original system G.
4.6. Model reduction 157
While this construction is as yet unmotivated, the underlying aim is to be
able to apply Lemma 4.16. Indeed, it can be veried by direct substitution
that the observability gramian of the above realization is
2 3
4r2+1 ;1 1 0 0
Yo = 4 0 41 0 5;
0 0 2r+1 I
and that it satises the additional restriction
C D + Yo B = 0:
While somewhat tedious, this verication is based only the Lyapunov
equations satised by the gramian .
Therefore we can apply Lemma 4.16 to conclude that
E (j!) E (j!) = D D = 4r2+1 I for all !;
and therefore we have
kE^11 k1 kE^ k1 = 2r+1 :
We now turn to the general case, where the truncated eigenvalues
r+1 ; : : : ; n are not necessarily all equal. Still, there may be repetitions
among them, so we introduce the notation 1t ; : : : kt to denote the distinct
values of this tail. More precisely, we assume that
1t > 2t > > kt
and fr+1 ; : : : ; n g = f1t ; : : : kt g. Equivalently, the balanced gramian of
the full order system is given by
2 3
1
6
6
... 7
7
2
1
3
6
6 r
7
7 6 1t I 7
= 6
6 r+1
7
7 = 664 ...
7
7
5
6 7
6
4 ... 7
5 kt I
n
where the block dimensions of the last expression correspond to the number
of repetitions of each it .
We are now ready to state the main result:
Theorem 4.18. With the above notation for the Hankel singular values of
G^ , let G^ r be obtained by r-th order balanced truncation. Then the following
inequality is satised:
kG^ ; G^ r k1 2(1t + kt ):
158 4. Model realizations and reduction
Proof . The idea is to successively apply the previous Lemma, truncat-
ing one block of repeated Hankel singular values at every step. By virtue
or Proposition 4.15, in this process the realizations we obtain remain bal-
anced and stable, and the Hankel singular values are successively removed.
Therefore at each step we can apply the previous Lemma and obtain an
error bound of twice the corresponding it . Finally, the triangle inequality
implies that the overall error bound is the sum of these terms.
More explicitly, consider the transfer functions G^ (0) ; : : : ; G^ (k) , where
G k) = G^ , and for each i 2 f1; : : : ; kg, G^ (i;1) is obtained by balanced
^ (
truncation of the repeated Hankel singular value it of G^ (i) . By induction,
each G^ (i) has a stable, balanced realization with gramian
2 3
1
6
6 1t I 7
7
6
4 ... 7
5
:
it I
Therefore Lemma 4.17 applies at each step and gives
kG^ (i) ; G^ (i;1) k1 2it :
Also G^ (0) = G^ r , so we have
k
X
kG^ ; G^ r k = k G^ (i) ; G^ (i;1) k1
i=0
k
X
kG^ (i) ; G^ (i;1) k1 2(1t + kt ):
i=0
As a comment, notice that by specializing the previous result to the
case r = 0 we can bound kG^ k1 by twice the sum of its Hankel singular
values; since repeated singular values are only counted once, this is actually
a tighter version of Proposition 4.12.
We have therefore developed a method by which to predictably reduce
the dynamic size of a state space model, with the H1 norm as the quality
measure on the error.
Example:
We now give a numerical example of the uses of this method. Consider the
7-th order transfer function in RH1 ,
(s + 10)(s ; 5)(s2 + 2s + 5)(s2 ; 0:5s + 5)
G^ (s) = (s + 4)( s2 + 4s + 8)(s2 + 0:2s + 100)(s2 + 5s + 2000) ;
with magnitude Bode plot depicted in the top portion of Figure 4.4.
4.6. Model reduction 159
0
10
−5
10
−1 0 1 2 3 4
10 10 10 10 10 10
0
10
−2
10
−4
10
−6
10
−1 0 1 2 3 4
10 10 10 10 10 10
Figure 4.4. Log-log plots for the example. Top: jG^ (j!)j. Bottom: jG^ (j!)j (dotted),
jG^ (j!)j (full) and jG^ (j!)j (dashed).
4 2
We conclude the section with two remarks:
First, we emphasize again that we are not claiming that the balanced
truncation method is optimal in any sense; the problem of minimizing kG^ ;
G^ r k1 remains computationally dicult.
Second, given the presence of the Hankel operator and its singular values
in the above theory, we may inquire: does balanced truncation minimize the
Hankel norm k;G^ ; ;G^ r k? Once again, the answer is negative. This other
problem, however, can indeed be solved by methods of a similar nature.
For details consult the references at the end of the chapter.
4.8 Exercises
1. Suppose A, X and C satisfy A X + XA + C C = 0. Show that any
two of the following implies the third:
(i) A Hurwitz.
(ii) (C; A) observable.
(iii) X > 0.
2. Prove Proposition 4.4.
4.8. Exercises 163
3. Use the controllability canonical form to prove Proposition 4.6 in the
general case of uncontrollable (A; B ).
4. Controllability gramian vs. controllability matrix. We have seen that
the singular values of the controllability gramian Xc can be used to
determine \how controllable" the states are. In this problem you will
show that the controllability matrix
Mc = [B AB A2 B An;1 B ]
cannot be used for the same purpose, since its singular values are
unrelated to those of Xc . In particular, construct examples (A 2
C 22 ; B 2 C 21 suces) such that:
a) Xc = I , but (Mc ) is arbitrarily small.
b) Mc = I , but (Xc ) is arbitrarily small.
5. Proof of Proposition 4.12.
a) For an integrable matrix function M (t), show that
!
Z b Z b
M (t) dt (M (t)) dt:
a a
You can use
the fact
that the property holds in the scalar case.
A B
b) Let G^ = C 0 . Using a) and the fact that G^ (j!) is the
Fourier transform of CeAt B , derive the inequality
Z 1
kG^ k1 (Ce At2 ) (e At2 B )dt:
0
c) If Xc , Yo are the gramians of (A; B; C ), show that
Z 1
(Ce At2 )2 dt 2Tr(Yo );
0
Z 1
(e At2 B )2 dt 2Tr(Xc ):
0
d) Combine b) and c) for a balanced realization (A; B; C ), to show
that
kG^ k1 2(1 + + n ) :
6. In Proposition 4.15 the strict separation r > r+1 of Hankel singular
values was used to ensure the stability of the truncation (i.e. that
A11 is Hurwitz). Show that indeed this is a necessary requirement, by
constructing an example (e.g. with n = 2, r = 1) where the truncated
matrix is not Hurwitz.
7. Consider the transfer function G^ (s) = s+1
1 .
164 4. Model realizations and reduction
a) Find the (only) Hankel singular value of G^ . Compare the error
bound and actual error when truncating to 0 states (clearly the
truncation is G^ 0 = 0).
b) Show that by allowing a nonzero dr term, the previous error can
be reduced by a factor of 2. Do this by solving explicitly
min
d2R
kG^ ; dk1 :
8. a) Let X and Y be two positive denite matrices in C nn . Show
that the following are equivalent:
(i) X Y ;1 ; (ii) min (XY ) 1; (iii) XI YI 0:
b) Under the conditions of part a), show that: min(XY ) = 1, with
multiplicity k, if and only if XI YI 0 and has rank 2n ; k.
c) Let (A; B; C ) be a state-space realization of G^ (s) with A
Hurwitz. Assume there exist X0 , Y0 in C rr such that
A X00 I 0 + X00 I 0 A + BB < 0
n;r n;r
A Y00 I 0 + Y00 I 0 A + C C < 0
n;r n;r
X0 Ir 0
Ir Y0
Using Theorem 4.20 prove that there exists a r-th order re-
alization G^ r (s) with kG^ ; G^ r k1 < . Is the above problem
convex?
9. Discrete time gramians and truncation. This problem concerns the
discrete time system
xk+1 = Axk + Bwk ;
zk = Cxk :
(a) The discrete Lyapunov equations are
A Lo A ; Lo + C C = 0;
AJc A ; Jc + BB = 0:
Assume that each eigenvalue in the set eig(A) has absolute value
less than one (stable in the discrete sense). Explicitly express the
solutions Jc and Lo in terms of A, B and C ; prove these solutions
are unique.
A realization in said to be balanced if Jc and Lo are diagonal
and equal. Is it always possible to nd a balanced realization?
4.8. Exercises 165
(b) In this part we assume, in contrast to (a), that J > 0 and
L > 0 are generalized gramians that are solutions to the strict
Lyapunov inequalities
A LA ; L + C C < 0;
AJA ; J + BB < 0:
Show that Lo < L and Jc < J . Can generalized gramians be
balanced?
(c) Suppose we are given generalized gramians J = L = 01 I 0 ,
where 1 > 0 is diagonal. Partition the state space accordingly:
A= A 11 A12
A21 A22 ; B= B 1
B2 ;
C = C1 C2 ;
and dene
G^ 11 = AC11 B01 :
1
Show that (A11 ; B1 ; C1 ) is a balanced realization in the sense of
generalized gramians, and that all the eigenvalues of A11 have
absolute value less than one.
10. Prove Proposition 4.19. Use the following steps:
a) Use the balanced Lyapunov inequalities to show that there exits
an augmented system
2 3
A B Ba
G^ a = 4 C D 0 5
Ca 0 0
whose gramians are X and Y .
b) Apply Theorem 4.18 directly to G^ a to give a reduced system
G^ ar . Show that G^ r of Proposition 4.19 is embedded in G^ ar and
must therefore satisfy the proposition.
5
Stabilizing Controllers
We begin here our study of feedback design, which will occupy our atten-
tion in the following three chapters. We will consider systematic design
methods in the sense that objectives are rst specied, without restricting
to a particular technique for achieving them.
The only a priori structure will be a very general feedback arrangement
which is described below. Once introduced, we will focus in this chapter
on a rst necessary specication for any feedback system: that it be stable
in some appropriate sense. In particular we will precisely dene feedback
stability and then proceed to parametrize all controllers that stabilize the
feedback system.
z w
G
u
y
K
z (t) = C1 x(t) + D11 D12 w(t) ;
y(t) C2 D21 D22 u(t)
and K being described by
x_ K (t) = AK xK (t) + BK y(t);
u(t) = CK xK (t) + DK y(t):
Throughout this chapter we have the standing assumption that the matrix
triples (C; A; B ) and (CK ; AK ; BK ) are both stabilizable and detectable.
As shown in the gure, G is naturally partitioned with respect to its two
inputs and two outputs. We therefore partition the transfer function of G
as
2 3
A B1 B2
^ ^
G^ (s) = 4 C1 D11 D12 5 = G^ 11 (s) G^ 12 (s) ;
C2 D21 D22 G21 (s) G22 (s)
so that we can later refer to these constituent transfer functions.
At rst we must determine under what conditions this interconnection of
components makes sense. That is, we need to know when these equations
have a solution for an arbitrary input w. The system of Figure 5.1 is well-
posed if unique solutions exist for x(t), xK (t), y(t) and u(t), for all initial
conditions x(0), xK (0) and all (suciently regular) input functions w(t).
Proposition 5.1. The connection of G and K in Figure 5.1 is well-posed,
if and only if, I ; D22 DK is nonsingular.
Proof . Writing out the state equations of the overall system, we have
x_ (t) = Ax(t) + B1 w(t) + B2 u(t) (5.1)
x_ K (t) = AK xK (t) + BK y(t);
5.1. System Stability 169
and
I ;DK u(t) 0 CK x(t) 0
;D22 I y(t) = C2 0 xK (t) + D21 w(t): (5.2)
Now it is easily seen that the left hand side matrix is invertible if and only
if I ; D22 DK is nonsingular. If this holds, clearly one can substitute u,
y into (5.1) and nd a unique solution to the state equations. Conversely
if this does not hold, from (5.2) we can nd a linear combination of x(t),
xK (t), and w(t) which must be zero, which means that x(0), xK (0), w(0)
cannot be chosen arbitrarily.
Notice in particular that if either D22 = 0 or DK = 0 (strictly proper
G^ 22 or K^ ), then the interconnection in Figure 5.1 is well-posed.
We are now ready to talk about stability.
v2 v1 d1
G22
d2 K
Figure 5.2. Input-output stability
are included in the equations for G22 , it follows immediately that internal
stability of one is equivalent to internal stability of the other.
Lemma 5.4. Given a controller K , Figure 5.1 is internally stable, if and
only if, Figure 5.2 is internally stable.
The next result shows that with this new set of inputs, internal stability
can be characterized by the boundedness of an input-output map.
Lemma 5.5. Suppose that (C2 ; A; B2) is stabilizable and detectable. Then
Figure 5.2 is internally stable if and only if the transfer function of dd1 7!
2
v1 is in RH .
v2 1
Proof . We begin by nding an expression for the transfer function. For
convenience denote
D = I ;DK
;D22 I ;
then routine calculations lead to the following relationship:
v^1 (s) = M^ (s) d^(s) ;
v^2 (s) d^2 (s)
where
^ ; 0 C K ;
M (s) := D C 0 (Is ; A) 0 B D + D + 0 ;I ;
1 1 B 2 0 ; 1 ; 1 0 0
2 K
and A is the closed loop matrix from (5.3). Therefore the \only if" direction
follows immediately, since the poles of this transfer function are a subset
of the eigenvalues of A, which is by assumption Hurwitz; see Proposition
5.3 and Lemma 5.4.
172 5. Stabilizing Controllers
To prove \if": assume that the transfer function has no poles in C+ ,
therefore the same is true of
0 CK (Is ; A);1 B2 0
C2 0 0 BK :
| {z } | {z }
C B
We need to show that A is Hurwitz; it is therefore sucient to show that
A;
(C; B ) is a stabilizable and detectable realization. Let
F = F0 F0 ; D ;1 C0 C0K ;
K 2
where F and FK are chosen so that A + B2 F and AK + BK FK are both
Hurwitz. It is routine to show that
A + B F = A +0B2 F A + 0B F ;
K K K
and thus (A; B ) is stabilizable.
A formally similar argument shows that (C; A) is detectable.
The previous characterization will be used later on to develop a
parametrization of all stabilizing controllers. As for the stabilizability
and detectability assumption on (C2 ; A; B2 ), we will soon see that it is
non-restrictive, i.e. it is necessary for the existence of any stabilizing
controller.
5.2 Stabilization
In the previous section we have discussed the analysis of stability of a
given feedback conguration; we now turn to the question of design of
a stabilizing controller. The following result explains when this can be
achieved.
Proposition 5.6. A necessary and sucient condition for the existence of
an internally stabilizing K for Figure 5.1, is that (C2 ; A; B2 ) is stabilizable
and detectable. In that case, one such controller is given by
K^ (s) = A + B2 F + LC 2 + LD22 F ;L ;
F 0
where F and L are matrices such that A + B2 F and A + LC2 are Hurwitz.
Proof . If the stabilizability or detectability of (C2 ; A; B2) is violated, we
can choose an initial condition which excites the unstable hidden mode. It
is not dicult to show that the state will not converge to zero, regardless
of the controller. Details are left as an exercise. Consequently no internally
stabilizing K exists, which proves necessity.
5.2. Stabilization 173
For the suciency side, it is enough to verify that the given controller
is indeed internally stabilizing. Notice that this is precisely the observer-
based controller encountered in Chapter 2. In particular, well-posedness is
ensured since DK = 0, and it was shown in Chapter 2 that the closed loop
eigenvalues are exactly the eigenvalues of A + B2 F and A + LC2 .
The previous result is already a solution to the stabilization problem,
since it provides a constructive procedure for nding a stabilizer, in this
case with the structure of a state feedback combined with an asymptotic
observer. However there are other aspects to the stabilization question, in
particular:
Can we nd a stabilizer with lower order? The above construction
provides a controller of the order of the plant.
What are all the stabilizing controllers? This is an important ques-
tion since one is normally interested in other performance properties
beyond internal stability.
We will address the rst question using purely LMI techniques, starting
with the state feedback problem, and later considering the general case.
The second problem will then be addressed in Section 5.3.
v2 q v1
N M ;1 d1
p
d2 V ;1 U
Figure 5.3. Coprime factorization
We now proceed to extend the idea in the example to the matrix setting,
and also build up the theory necessary to show the ensuing parametrization
is complete. Let
X~ ;Y~ M^ Y^ = I
;N~ M~ N^ X^
be a doubly coprime factorization of G^ 22 , and
K^ = U^ V^ ;1 = V~ ;1 U~
be coprime factorizations for K^ .
The following gives us a new condition for stability.
Lemma 5.11. Given the above denitions the following are equivalent
(a) The controller K input-output stabilizes G22 in Figure 5.2;
^ ^
(b) M^ U^ is invertible in RH1 ;
N V
~ ~
(c) ;VN~ ;M~U is invertible in RH1 .
Proof . First we demonstrate that condition (b) implies condition (a). This
proof centers around Figure 5.3, that is exactly Figure 5.2 redrawn with
the factorizations for G22 and K . Clearly
M^ ;U^ q^ = d^1 : (5.10)
;N^ V^ p^ d^2
Thus we see
( )
I 0 + 0 U^ M^ ;U^ ;1
v^1 = d^1 :
v^2 0 0 N^ 0 ;N^ V^ d^2
5.3. Parametrization of stabilizing controllers 181
The above transfer function inverse is in RH1 by assumption (b), therefore
we see that (a) holds.
Next we must show that (a) implies (b). To do this we use the Bezout
identity.
X~ M^ ; Y~ N^ = I :
Referring to the gure we obtain
M^ q^ = v^1
N^ q^ = v^2 :
Multiplying the Bezout identity by q^ and substituting we get
q^ = X~ v^1 ; Y~ v^2 :
Now by assumption the transfer functions from d^1 and d^2 to v^1 and v^2 are
in RH1 . Thus by the last equation the transfer function from the inputs
d^1 and d^2 to q^ must be in RH1 .
Similarly we can show that the transfer functions from the inputs to p^
are in RH1 , by instead starting with a Bezout identity X~K V^ + Y~K U^ = I
for the controller. Recalling the relationship in (5.10) we see that (b) must
be satised.
To show that (a) and (c) are equivalent we simply use the left coprime
factorizations for G^ 22 and K^ , and follow the argument above.
We can now prove the main synthesis result of the chapter.
Theorem 5.12. A controller K input-output stabilizes G22 in Figure 5.2,
if and only if, there exists Q^ 2 RH1 such that
K^ = (Y^ ; M^ Q^ )(X^ ; N^ Q^ );1 = (X~ ; Q^ N~ );1 (Y~ ; Q^ M~ ) (5.11)
and the latter two inverses exist as proper rational functions.
Proof . We begin by showing that the latter equality holds for any Q^ 2
RH1 such that the inverses exist. Given such a Q^ we have by the doubly
coprime factorization formula that
I Q^ X~ ;Y~ M^ Y^ I ;Q^ = I ;
0 I ;N~ M~ N^ X^ 0 I
which yields
X~ ; Q^ N~ ;(Y~ ; Q^ M~ ) M^ Y^ ; M^ Q^ = I (5.12)
;N~ M~ N^ X^ ; N^ Q^
182 5. Stabilizing Controllers
Taking this product we get nally
? (X~ ; Q~ N~ )(Y^ ; M^ Q^ ) ; (Y~ ; Q^ M~ )(X^ ; N^ Q^ ) = I :
? ?
Here \?" denotes irrelevant entries, and from the top right entry we see
that the two quotients of the theorem must be equal if the appropriate
inverses exist.
We now turn to showing that this parametrization is indeed stabilizing. So
choose a
Q^ 2 RH1 , where the above inverses exist, and dene
U^ = Y^ ; M^ Q^ V^ = X^ ; N^ Q^
U~ = Y~ ; Q^ M~ V~ = X~ ; Q^ N~ :
From (5.12) we see
V~ ;U~ M^ U^ = I ;
;N~ M~ N^ V^
which implies that U; ^ V^ and U; ~ V~ are right and left coprime factorizations
^
of K respectively. Also it clearly says
M^ U^ ;1 2 RH :
N^ V^ 1
5.4 Exercises
1. Consider the region R = s 2 C : (ss+ ;s s) (ss+;ss ) < 0 in
the complex plane. Here s denotes conjugation, and > 0.
a) Sketch the region R.
b) For A, X in C nn we consider the 2n 2n LMI
AX + XA (AX ; XA ) < 0
(XA ; AX ) AX + XA
that is obtained by formally replacing s by AX , s by XA in
the denition of R. Show that if there exists X > 0 satisfying
this LMI, then the eigenvalues of A are in the region R. It can
be shown that the converse is also true.
5.4. Exercises 185
c) Now we have a system x_ = Ax + Bu, and we wish to design a
state feedback u = Fx such that the closed loop poles fall in R.
Derive an LMI problem to do this.
2. Complete the proof of Proposition 5.10 by showing that the given
transfer functions satisfy equation (5.8). To do this, rst write
realizations of order n for each of
X~ ;Y~ and M^ Y^ ;
;N~ M~ N^ X^
then compose them and eliminate unobservable/uncontrollable states.
Yes, it's a little tedious, but signicantly less than working with
matrices of rational functions.
3. In this problem we consider the system in the gure, where P is a
stable plant, i.e. P^ (s) 2 RH1 .
d
r- - K -? - P z
6;
d2 H
a) Show that the system is input-output stable (i.e. the map from
d to v is in RH1 ) if and only if I ; H is invertible in RH1 .
b) If kk1 kH k1 < 1, show the system is input-output stable.
186 5. Stabilizing Controllers
6. Let P^ (s) be a proper rational matrix.
A right
coprime factorization
^
N ( s )
P^ = N^ M^ ;1 is called normalized if M^ (s) is inner (see Chapter 4).
It is shown in the exercises of the next chapter that every P^ admits
a normalized right coprime factorization (n.r.c.f.).
a) Given a n.r.c.f. P^ = N^ M^ ;1 , we now consider the perturbed
plant
P^ = (N^ + ^ N )(M^ + ^ M );1 ;
^ ^
^ N
where N , M are in RH1 and
^
:
M 1
Show that P^ corresponds to the system inside the dashed box
in the following Figure.
P
N ;M
p1 p2
N q M ;1
d1 d2
6
H2 Optimal Control
z w
G
u
y
K
1
Strictly speaking, this idealization escapes the theory of stationary random
processes; a rigorous treatment is not in the scope of our course.
192 6. H Optimal Control
2
X X ; X X
Now JH is symmetric and therefore the left hand side above must be
symmetric meaning
I J I H = ;H I J I :
X X ; ; X X
From this we have that
(X ; X )H; + H; (X ; X ) = 0 :
Recall that H; is Hurwitz, and therefore this latter Lyapunov equation has
a unique solution; that is X ; X = 0.
We now show property (b): left multiply (6.1) by ;X I to arrive at
;X I H XI = [;X I ] XI H; = 0 :
Now expand the left hand side, using the denition of H and get the Riccati
equation.
Finally we demonstrate that property (c) holds.
Again we use the
relationship (6.1); this time left multiply by I 0 and see that
I 0 H XI = I 0 XI H; = H; :
X2 I 0
to arrive at
X1 JH X1 = X1 J X1 H :
X2 X2 X2 X2 ;
As noted earlier JH is symmetric and so from the right hand side above
we get
(;X1X2 + X2X1 )H; = ;H; (;X1X2 + X2X1 ) :
Moving everything to one side of the equality we get a Lyapunov equation;
since H; is Hurwitz it has the unique solution of zero. That is
;X1X2 + X2 X1 = 0 :
Step B: Show X1 is nonsingular.
We will show that ker X1 = 0. First wedemonstrate
that it is an invariant
subspace of H; . Left multiply (6.2) by I 0 to get this is
AX1 + RX2 = X1 H; : (6.3)
Now for any x 2 ker X1 we have
x X2 (AX1 + RX2)x = x X2X1 H; x = x X1X2 H; x = 0 ;
196 6. H Optimal Control
2
6.3 Synthesis
Our solution to the optimal synthesis problem will be strongly based on
the concept of an inner function, introduced in Chapter 3.5. In fact the
6.3. Synthesis 197
connection between the H2 problem and Riccati equation techniques will
be based on a key Lemma about inner functions.
We recall that a transfer function U^ (s) 2 H1 is called inner when it
satises
U^ (j!)U^ (j!) = I for all ! 2 R:
Multiplication by an inner function denes an isometry on H2 . This was
indicated before; however the property also extends to the matrix space
RH2 which we are considering in the present chapter. To see this, suppose
U^ is inner and that G^ 1 and G^ 2 are in RH2 , then
Z 1
hU^ G^ 1 ; U^ G^ 2 i2 = Tr(U^ (j!)G^ 1 (j!)) U^ (j!)G^ 2 (j!)dw
;1
Z 1
= TrG^ 1 (j!)G^ 2 (j!)d!
;1
=hG^ 1 ; G^ 2 i2
Namely inner products are unchanged when such a function is inserted.
We now relate inner functions with Riccati equations.
Lemma 6.7. Consider the state-space matrices A; B; C; D, where (C; A; B)
is stabilizable and detectable and D [C D] = [0 I ]. For
H = ;CA C ;;BB A ;
dene X = Ric(H ) and F = ;B X . Then
U^ (s) = CA +
+ BF B
DF D (6.4)
is inner.
Proof . Since X is a stabilizing solution, we have that A;BBX = A+BF
is Hurwitz, therefore U^ (s) 2 RH1 . Now from the Riccati equation
A X + XA + C C ; XBB X = 0
and the hypothesis it is veried routinely that
(A + BF ) X + X (A + BF ) + (C + DF ) (C + DF ) = 0:
Therefore X is the observability Gramian corresponding to the realization
(6.4). Also, the Gramian X satises
(C + DF ) D + XB = F + XB = 0:
Therefore we are in a position to apply Lemma 4.16, and conclude that
U^ (s) is inner.
As a nal preliminary, we have the following matrix version of our earlier
vector valued result; this extension is a straightforward exercise.
198 6. H Optimal Control
2
Proposition 6.8. If G^1 and G^2 are matrix valued function in RH2 and
RH2? , respectively, then
hG^ 1 ; G^ 2 iL2 = 0 :
Observe that if Q^ 2 RH2 then the function Q^ 2 RH2? .
We are now in a position to state the main result, which explicitly gives
the optimal solution to the H2 synthesis problem. We spend the remainder
of the section proving the result using a sequence of lemmas. The technique
used exploits the properties of the Riccati equation, and in particular its
stabilizing solution.
Theorem 6.9. Suppose G is a state space system with realization
2 3
A B1 B2
G^ (s) = 4 C1 0 D12 5
C2 D21 0
where
(a) (C1 ; A; B1 ) is a stabilizable and detectable;
(b) (C2 ; A; B2 ) is a stabilizable and detectable;
C1 D12 = 0 I ;
(c) D12
= 0 I .
(d) D21 B1 D21
Then the optimal stabilizing controller to the H2 synthesis problem is given
by
K^ 2 (s) = AF2 ;0L ;
with
L = ;Y C2 ; F = ;B2 X and A2 = A + B2 F + LC2
where X = Ric(H ); Y = Ric(J ) and the Hamiltonian matrices are
H = ;CA C ;;BA2B 2 ; J = ;BA B ;C;2AC2 :
1 1 1 1
Furthermore the performance achieved by K2 is
kG^ c B1 k22 + kF G^ f k22
where
G^ c = A F I ^
CF 0 and Gf = I 0
AL BL
Proof . First we establish that all the objects in the theorem statement are
dened. Use hypothesis (a) and (b), and invoke Corollary 6.6 to see that
H is in the domain of the Riccati operator. From the same result we see
that AF is Hurwitz, and so G^ c 2 RH2 . Also notice that K must stabilize
Gtmp since the closed loop A-matrix is the same as that with G; hence
S (G^ tmp ; K^ ) 2 RH2 .
The closed loop state equations are
x_ (t) = Ax(t) + B1 w(t) + B2 u(t)
z (t) = C1 x(t) + D12 u(t)
y(t) = C2 x(t) + D21 w(t) ;
with u = Ky. Now make the substitution v(t) = u(t) ; Fx(t) to get
x_ (t) = (A + B2 F )x(t) + B1 w(t) + B2 v(t)
z (t) = (C1 + D12 F )x(t) + D12 v(t) :
Recalling the denition of G^ c (s) and converting to transfer functions we
get
z^(s) = G^ c (s)B1 w^(s) + U^ (s)^v (s) ;
where U^ has realization (AF ; B2 ; CF ; D12 ). It is routine to show that the
transfer function from w to v is S (G^ tmp ; K^ ) and so
S (G;^ K^ ) = G^ c B1 + US ^ (G^ tmp ; K ) :
Next let us take the norm of S (G; ^ K^ ) making use of this new expression
^ K^ )k22 = kG^ cB1 k22 + kUS
kS (G; ^ (G^ tmp ; K^ )k22 + 2RefhUS ^ (G^ tmp ; K ); G^ c B^1 ig :
The key to the proof is that:
(i) U^ is inner; this was proved in Lemma 6.7;
(ii) U^ G^ c B1 is in RH2? ; this fact follows by similar state-space
manipulations and is left as an exercise.
The conclusion now follows since kS (G^ tmp ; K^ )k2 = kUS ^ (G^ tmp ; K^ )k2 ,
and
^ (G^ tmp ; K^ ); G^ c B1 i2 = hS (G^ tmp ; K^ ); U^ G^ c B1 )i2 = 0 ;
hUS
where the orthogonality is clear because S (G^ tmp ; K^ ) 2 RH2 as noted above.
The major point of this lemma is that it puts a lower bound on the achiev-
able performance, which is kG^ c B1 k22 . Now K stabilizes G of the corollary
if and only if Gtmp is stabilized, and thus minimizing the closed loop
performance is equivalent to minimizing kS (G^ tmp ; K^ )k2 by choosing K .
6.3. Synthesis 201
An additional remark is that in the special case of state feedback, i.e.
when C2 = I , D21 = 0, this second term can be made zero by the static
control law K = F , as follows from the direct substitution into S (G^ tmp ; K^ ).
Alternatively, revisiting the proof we note that u = Fx gives v = 0, and the
second term of the cost would not appear. Therefore Lemma 6.10 provides
a solution to the H2 problem in the state feedback case.
In the general case, the auxiliary variable v with its associated cost re-
ects the price paid by not having the state available for measurement.
This additional cost can now be optimized as well. Before addressing this,
we state a so-called duality result; its proof is an exercise involving only
the basic properties of matrix transpose and trace
Lemma 6.11. Suppose G has a realization (A; B; C; D); K has a realiza-
tion (AK ; BK ; CK ; DK ), and
> 0. Then the following are equivalent.
(i) K internally stabilizes G and kS (G; ^ K^ )k2 =
;
(ii) K 0 internally stabilizes G0 and kS (G^ 0 ; K^ 0 )k2 =
.
Here G0 has realization (A ; C ; B ; D ) and K 0 has realization (AK ; CK ; BK ; DK )
The previous three results combine to give a further breakdown of the
closed loop transfer function of G and K .
Lemma 6.12. Let Gtmp be as dened in Lemma 6.10. If K internally
stabilizes Gtmp then
kS (G^ tmp ; K^ )k22 = kF G^ f k22 + kS (E;
^ K^ )k22
where
2 3
A ;L B2
E^ (s) = 4 ;F 0 I 5
C2 I 0
Proof . Start by applying Lemma 6.11 with G0 = Gtmp and K 0 = K to
get
kS (G^ tmp ; K^ )k2 = kS (G^ 0tmp ; K^ 0 )k2 :
Now apply Lemma 6.10 with G = G0tmp and K = K 0 to get
kS (G^ 0tmp ; K^ 0 )k22 = kG^ 0f F k22 + kS (E^ 0 ; K^ 0)k22
Finally applying Lemma 6.11 to the right hand side we get
kS (G^ 0tmp ; K^ 0 )k22 = kF G^ f k22 + kS (E;
^ K^ )k22 :
We can now completely prove Theorem 6.9.
202 6. H Optimal Control
2
Proof . Simply write out the state space equations corresponding to the
interconnection shown in Figure 6.1.
x_ (t) = (A + B2 DK )x(t) + B1 w(t)
z (t) = (C1 + D12 DK )x(t):
Now invoking Proposition 6.13 we see that our problem can be solved if
and only if there exist DK and X > 0 satisfying
(A + B2 DK )X + X (A + B2 DK ) + B1 B1 < 0
Tr(C1 + D12 DK )X (C1 + D12 DK ) < 1
Now introducing the change of variables DK X = Y these conditions are
equivalent to (6.5) and (6.6).
While we have transformed the H2 synthesis problem to a set of matrix
inequalities, this is not yet an LMI problem since (6.6) is not convex in X ,
Y . Now we recall our discussion from Chapter 1, where it was mentioned
that often problems which do not appear to be convex can be transformed
into LMI problems by a Schur complement operation. This is indeed the
case here, although it will also be necessary to introduce an additional vari-
able, a so-called slack variable. We explain these techniques while obtaining
our main result.
Theorem 6.15. There exists a feedback gain K^ (s) = DK that internally
stabilizes G and satises
^ K^ )k2 < 1;
kS (G;
if and only if, there exist square matrices X 2 Rnn Z 2 Rqq and a
rectangular matrix Y 2 Rmn such that
DK = Y X ;1;
and the inequalities
A B2 X
+
X Y A + B1 B1 < 0; (6.7)
Y B2
X (C1 X + D12 Y ) > 0; (6.8)
(C1 X + D12 Y ) Z
Tr(Z ) < 1 (6.9)
are satised.
Proof . It suces to show that conditions (6.8) and (6.9) are equivalent to
(6.6) and X > 0.
Suppose (6.6) holds; since the trace is monotonic under matrix in-
equalities, then we can always nd a matrix Z satisfying Tr(Z ) < 1
and
(C1 X + D12 Y )X ;1 (C1 X + D12 Y ) < Z
6.5. Exercises 205
indeed it suces to perturb the matrix on the left by a small positive
matrix. Now from the Schur complement formula of Theorem 1.10, the
above inequality and X > 0 are equivalent to (6.8).
The previous steps can be reversed to obtain the converse implication.
Thus we have reduced the static state feedback H2 synthesis problem to
a set of convex conditions in the three variables X , Y and Z .
The above derivations exhibit a common feature of tackling problems via
LMIs: there is always an element of \art" involved in nding the appro-
priate transformation or change of variables that would render a problem
convex, and success is never guaranteed. In the next chapter we will present
additional tools to aid us in this process. The references contain a more ex-
tensive set of such tools, in particular how to tackle the general (dynamic
output feedback) H2 synthesis problem via LMI techniques.
We are now ready to turn our attention to a new performance criterion;
it will be studied purely from an LMI point of view.
6.5 Exercises
1. Verify that for a multi-input sytsem P , kP^ k22 is the sum of out-
put energies corresponding to impulses (t)ek applied in each input
channel.
2. Prove Corollary 6.6.
3. Normalized coprime factorizations. This exercise complements Ex-
ercise 6, Chapter 5. We recall that a right coprime factorization
P^ = N^ M^ ;1 of a proper P^ (s) is called normalized when M N^ (s)
^ (s)
is inner.
(a) Consider P^ = CA B0 , and assume the realization is minimal.
Let F = ;B X , where
X = Ric ;CA C ;;BB
A :
Show that
2 3
N^ (s) = 4 A +CBF B0
5
M^ (s) F I
denes a normalized right coprime factorization for P^ .
(b) Extend the result for any P^ (not necessarily strictly proper).
206 6. H Optimal Control
2
7
H1 Synthesis
D = 0
21 D21
which are entirely in terms of the state space matrices for G. Then we have
AL = A + BJC BL = B + BJD21
(7.9)
CL = C + D12 JC DL = D11 + D12 JD21
The crucial point here is that the parametrization of the closed loop state
space matrices is ane in the controller matrix J .
Now we are looking for a controller K such that the closed loop is con-
tractive and internally stable. The following form of the KYP lemma will
help us.
216 7. H1 Synthesis
Corollary 7.4. Suppose M^ L(s) = CL(Is ; AL);1BL + DL. Then the
following are equivalent conditions.
(a) The matrix AL is Hurwitz and kM^Lk1 < 1;
(b) There exists a symmetric positive denite matrix XL such that
2 3
AL XL + XLAL XL BL CL
4 BL XL ;I DL 5 < 0 :
CL DL ;I
This result is readily proved from Lemma 7.3 by applying the Schur comple-
ment formula. Notice that the matrix inequality in (b) is ane in XL and
J individually, but it is not jointly ane in both variables. The main task
now is to obtain a characterization where we do have a convex problem.
Now dene the matrices
PXL = B XL 0 D12
Q = C D21 0
and further
2 3
A XL + XLA XLB C
HXL = 4 B XL ;I D11 5 :
C D11 ;I
It follows that the inequality in (b) above is exactly
HXL + Q J PXL + PX L JQ < 0 :
Lemma 7.5. Given the above denitions there exists a controller synthesis
K if and only if there exists a symmetric matrix XL > 0 such that
WPXL HXL WPXL < 0 and WQ HXL WQ < 0 ;
where WPXL and WQ are as dened in Lemma 7.2.
Proof . From the discussion above we see that a controller K exists if and
only if there exists XL > 0 satisfying
HXL + Q J PXL + PX L JQ < 0 :
Now invoke Lemma 7.2.
This lemma says that a controller exists if and only if the two matrix
inequalities can be satised. Each of the inequalities is given in terms of the
state space matrices of G and the variable XL . However we must realize
that since XL appears in both HXL and WPXL , that these are not LMI
conditions. Converting to an LMI formulation is our next goal, and will
7.2. Synthesis 217
require a number of steps. Given a matrix XL > 0 dene the related matrix
2
L;1 + XL;1 A B XL;1 C 3
AX
TXL = 4 B ;I D11 5 ; (7.10)
CXL ; 1 D11 ;I
and the matrix
P = B 0 D12 (7.11)
which only depends on the state space realization of G. The next lemma
converts one of the two matrix inequalities of the lemma, involving HXL ,
to one in terms of TXL .
Lemma 7.6. Suppose XL > 0. Then
WPXL HXL WPXL < 0; if and only if, WP TXL WP < 0:
Proof . Start by observing that
PXL = PS ;
where
2 3
XL 0 0
S = 4 0 I 05 :
0 0 I
Therefore we have
ker PXL = S ;1 ker P :
Then using the denitions of WPXL and WP we can set
WPXL = S ;1 WP :
Finally we have that WPXL HXL WPXL < 0 if and only if
WP (S ;1 ) HXL S ;1 WP < 0
and it is routine to verify (S ;1 ) HXL S ;1 = TXL .
Combining the last two lemmas we see that there exists a controller of
state dimension nK if and only if there exists a symmetric matrix XL > 0
such that
WP TXL WP < 0 and WQ HXL WQ < 0 : (7.12)
The rst of these inequalities is an LMI in the matrix variable XL;1 , where
as the second is an LMI in terms of XL. However the system of both
inequalities is not an LMI. Our intent is to convert these seemingly non-
convex conditions into an LMI condition.
218 7. H1 Synthesis
Recall that XL is a real and symmetric (n + nK ) (n + nK ) matrix; here
n and nK are state dimensions of G and K . Let us now dene the matrices
X and Y which are n n submatrices of XL and XL;1 , by
X X 2 ;
XL =: X X and XL =: Y Y :1 Y Y 2 (7.13)
2 3 2 3
We now show that the two inequality conditions listed in (7.12), only
constrain the submatrices X and Y .
Lemma 7.7. Suppose XL is a positive denite (n + nK ) (n + nK ) matrix
and X and Y are n n matrices dened as in (7.13). Then
WP TXL WP < 0 and WQ HXL WQ < 0 ;
if and only if, the following two matrix inequalities are satised
(a)
2 3
NX 0 4A XB +XXA XB
1 C1 N 0
;I D11 0 I < 0 ;
5 X
0 I 1
C 1 D
11 ;I
(b)
2 3
NY 0 4AY C+YY A Y;CI1 DB1 5 NY 0 < 0 ;
0 I 1 11 0 I
B1 ;I
D11
where NX and NY are full-rank matrices whose images satisfy
ImNX = ker C2 D21
:
ImNY = ker B2 D12
Proof . The proof amounts to writing out the denitions and removing
redundant constraints. Let us show that WP TXL WP < 0 is equivalent to
the LMI in (b).
From the denitions of TXL in (7.10), and A, B and C in (7.8) we get
2 3
AY + Y A AY2 B1 Y C1
0 Y2 C1 77
TXL = 64 Y2BA 0
6
1 0 ;I D 5 11
C1 Y C1 Y2 D11 ;I
Also recalling the denition of P in (7.11), and substituting for B and D12
from (7.8) yields
P = B0 I0 00 D0 :
2 12
7.2. Synthesis 219
Thus the kernel of P is the image of
2 3
V1 0
WP = 64 00 I075
6 7
V2 0
where
V1 = N
V2 Y
spans the kernel of B2 D12 as dened above. Notice that the second
block row of WP is exactly zero, and therefore the second block-row and
block-column of TXL , as explained above, do not enter into the constraint
WP TXL WP < 0. Namely this inequality is
V1 0 AY + Y A B1 Y C1 V1 0
2 3 2 32 3
4 0 I5 4 B1 ;I D11 5 4 0 I 5 < 0 :
V2 0 C1 Y D11 ;I V2 0
By applying the permutation
2 3 2 3
V1 0 I 0 0 N 0
4 0 I 5 = 40 0 I 5 Y
V 0 0 I 0 0 I
2
we arrive at (b).
Using a nearly identical argument, we can readily show that WQ HXL WQ <
0 is equivalent to LMI (a) in the theorem statement.
What we have shown is that a controller synthesis exists if and only if
there exists an (n + nK ) (n + nK ) matrix XL that satises conditions (a)
and (b) of the last lemma. These latter two conditions only involve X and
Y , which are submatrices of XL and XL;1 respectively. Our next result tell
us under what conditions, given arbitrary matrices X and Y , it is possible
to nd a positive denite matrix XL that satises (7.13).
Lemma 7.8. Suppose X and Y are symmetric, positive denite matrices
in Rnn ; and nK is a positive integer. Then there exist matrices X2 ; Y2 2
RnnK and symmetric matrices X3 ; Y3 2 RnK nK , satisfying
X X2 > 0 and X X2 ;1 = Y Y2
0 I 1 11 0 I
B1 ;I
D11
(c)
X I 0;
I Y
where NX and NY are full-rank matrices whose images satisfy
ImNX = ker C2 D21
:
ImNY = ker B2 D12
Proof . Suppose a controller exists, then by Lemma 7.7 a controller exists
if and only if the inequalities
WP TXL WP < 0 and WQ HXL WQ < 0
hold for some symmetric, positive denite matrix XL in R(n+nK )(n+nK ) .
By Lemma 7.7 these LMIs being satised imply that (a) and (b) are met.
Also invoking Lemma 7.8 we see that (c) is satised.
Showing that (a{c) imply the existence of a synthesis is essentially the
reverse process. We choose nK n, in this way the rank condition in
Lemma 7.8 is automatically satised, and thus there exists an XL in
R(n+nK )(n+nK ) which satises (7.13). The proof is now completed by using
XL and (a{b) together with Lemma 7.7.
This theorem gives us exact conditions under which a solution exists
to our H1 synthesis problem. Notice that in the suciency direction we
required that nk n, but clearly it suces to choose nk = n. In other words
a synthesis exists if and only if one exists with state dimension nK = n.
What if we want controllers of order nK less than n? Then clearly from
the above proof we have the following characterization.
Corollary 7.10. A synthesis of order nk exists for the H1 problem, if and
only if there exist symmetric matrices X > 0 and Y > 0 satisfying (a), (b),
222 7. H1 Synthesis
and (c) in Theorem 7.9 plus the additional constraint
rank XI YI n + nK :
Unfortunately this constraint is not convex when nK < n, so this
says that in general the reduced order H1 problem is computationally
much harder than the full order problem. Nevertheless, the above explicit
condition can be exploited in certain situations.
7.4 Exercises
1. Prove Lemma 7.1.
7.4. Exercises 223
2. Generalization
of the KYP Lemma. Let A be a Hurwitz matrix, and
Q S
let = S R be a symmetric matrix with R > 0. We dene
;1 ;1
^(j!) = (j!I ; A) B (j!I ; A) B
I I
Show that the following are equivalent:
(i) ^(j!) > 0 for all ! 2 R.
(ii) The Hamiltonian matrix
;1 S
H = ;(AQ;;BR ;BR;1B
BR;1 S ) ;(A ; BR;1 S )
is in the domain of the Riccati operator.
(iii) The LMI
A X + XA XB + > 0
BX 0
admits a symmetric solution X .
(iv) There exists a quadratic storage function V (x) = x Px such the
dissipation inequality
_V x(t) Q S x(t)
u(t) S R ; I u(t)
is satised over any solutions to the equation x_ = Ax + Bu.
Hint: The method of proof from x7.1.1 can be replicated here.
3. Spectral Factorization. This exercise is a continuation of the previous
one on the KYP Lemma. We take the same denitions for , H , etc.,
and assume the above equivalent conditions are satised. Now set
M^ (s) = A B
R; 12 (S + B X ) R 21 ;
where X = Ric(H ). Show that M^ (s) 2 RH1 , M^ (s);1 2 RH1 , and
the factorization
^(j!) = M^ (j!) M^ (j!)
holds for every ! 2 R.
4. Mixed H2 =H1 control.
(a) We are given a stable system with the inputs partitioned in two
channels w1 , w2 and a common output z :
P^ = P^1 P^2 = CA BD1 B02 :
224 7. H1 Synthesis
Suppose there exists X > 0 satisfying
2 3
A X + XA XB1 C
4 B1 X ;I D 5 < 0; (7.17)
C D ;I
Tr(B2 XB2) <
2 : (7.18)
Show that P satises the specications kP^1 kH1 < 1; kP^2 kH2 <
. Is the converse true?
(b) We now wish to use part (a) for state-feedback synthesis. In
other words, given an open loop system
x_ = A0 x + B1 w1 + B2 w2 + B0 u;
z = C0 x + Dw1 + D0 u;
we want to nd a state feedback u = Fx such that the closed
loop satises (7.17)-(7.18). Substitute the closed loop matrices
into (7.17); does this give an LMI problem for synthesis?
(c) Now modify (7.17) to an LMI in the variable X ;1, and show
how to replace (7.18) by two convex conditions in X ;1 and an
appropriately chosen slack variable Z .
(d) Use part (c) to obtain a convex method for mixed H1 /H2 state
feedback synthesis.
5. As a special case of reduced order H1 synthesis, prove Theorem 4.20
on model reduction.
6. Prove Theorem 5.8 involving stabilization. Hint: Reproduce the steps
of the H1 synthesis proof, using a Lyapunov inequality in place of
the KYP Lemma.
7. Connections to Riccati solutions for the H1 problem. Let
2 3
A B1 B2
^
G(s) = C1 0 D12
4 5
C2 D21 0
satisfy the normalization conditions
C1 D12 = 0 I
D12 and D21 B1 D21
= 0 I :
Notice that these (and D11 = 0) are part of the conditions we imposed
in our solution to the H2 -optimal control in the previous chapter.
7.4. Exercises 225
(a) Show that the H1 synthesis is equivalent to the feasibility of
the LMIs X > 0, Y > 0 and
A X + XA + C1 C1 ; C2 C2 XB1 < 0;
B1 X ;I
AY + Y A + B1 B1 ; B2 B2 Y C1 < 0;
C1 Y
;I
X I 0:
I Y
(b) Now denote Q = Y ;1 , P = X ;1. Convert the above conditions
to the following:
A P + PA + C1 C1 + P (B1 B1 ; B2 B2 )P < 0;
AQ + QA + B1 B1 + Q(C1 C1 ; C2 C2 )Q < 0;
(PQ) 1
These are two Riccati inequalities plus a spectral radius cou-
pling condition. Formally analogous conditions involving the
corresponding Riccati equations can be obtained when the
plant satises some additional technical assumptions. For details
consult the references.
8
Uncertain Systems
In the last three chapters we have developed synthesis techniques for feed-
back systems where the plant model was completely specied, in the sense
that given any input there is a uniquely determined output. Also our plant
models were linear, time invariant and nite dimensional. We now return
our focus to analysis, but move beyond our previous restriction of having
complete system knowledge to the consideration of uncertain systems.
In a narrow sense, uncertainty arises when some aspect of the system
model is not completely known at the time of analysis and design. The
typical example here is the value of a parameter which may vary according
to operating conditions. As discussed in the introduction to this course, we
will use the term uncertainty in a broader sense to include also the result
of deliberate under-modeling, when this occurs to avoid very detailed and
complex models.
To illustrate this latter point, let us brie
y and informally examine some
of the issues involved with modeling a complex system. In the gure we
have a conceptual illustration of a complex input-output system.
As depicted it has an input w and an output z. A complete description of
such a system is given by the set of input-output pairs
P = f(w;
z) : z is the output given input wg:
Here the complex system can be nonlinear or innite dimensional, but is
assumed to be deterministic.
Now suppose we attempt to model this complex system with a mapping
G. Then similar to the complex system, we can describe this model by an
228 8. Uncertain Systems
z Complex system w
input-output set
RG = f(w; z ) : z = Gwg :
Our intuitive notion of a good model is, given an input-output pair (w;
z)
in the set of system behaviors P and the corresponding model pair (w;
z) 2
RG that
z is a good approximation to z :
If however G is much simpler than the system it is intended to model, this
is clearly an unreasonable expectation for all possible inputs w.
z G w
p q
M
z w
Figure 8.1. Uncertain system model
Here M and are bounded operators on L2[0; 1), and w the system input
is in L2[0; 1). The picture represents the map w 7! z which is formally
dened by the equations
q =p
p = M11 M12 q ;
z M21 M22 w
where M is compatibly partitioned with respect to the inputs and outputs.
Our goal will be to understand the possible maps w 7! z when all that is
known about is that it resides in a pre-specied subset of the bounded
linear operators on L2 . We will consider two fundamental types of sets in
this chapter which are specied by spatial structure, norm bounds and their
dynamical characteristics.
Before embarking on this specic investigation let us quickly look at the
algebraic form of maps that can be obtained using the above arrangement.
Observe that if (I ; M11 ) is nonsingular then
w 7! z = M22 + M21 (I ; M11 );1 M12 =: S(M; );
where this expression denes the notation S(M; ), the upper star product.
Recall that we dened the lower star product S (; ) earlier when studying
synthesis.
Examples:
We now present two examples of the most common types of uncertain
models. Suppose that
M= 0 I : I G
8.1. Uncertainty modeling and well-connectedness 231
Then we see
S(M; ) = G + :
That is the sets of operators generated by S(M; ), when 2 is simply
that of an additive perturbation to an operator G.
Similarly let
M = I0 G
G
and get that
S(M; ) = G + G = (I + )G;
which is a multiplicative perturbation to the operator G.
Two examples have been presented above, which show that the setup
of Figure 8.1 can be used to capture two simple sets. It turns out that
this arrangement has a surprisingly rich number of possibilities, which are
prescribed by choosing the form of the operator M and the uncertainty or
perturbation set . One reason for this is that this modeling technique is
ideally suited to handle system interconnection (e.g. cascade or feedback),
as discussed in more detail below. The exercises at the end of the chapter
will also highlight some of these possibilities. First, though, we lay the basis
for a rigorous treatment of these representations by means of the following
denition.
Denition 8.1. Given an operator M and a set , the uncertain system
in Figure 8.1 is robustly well-connected if
I ; M11 is nonsingular for all 2 :
We will also use the terminology (M11 ; ) is robustly well-connected to
refer to this property.
When the uncertain system is robust well-connected, holds we are assured
that the map w 7! z of Figure 8.1 is bounded and given by S(M; ), for
every 2 . At rst sight, we may get the impression that robust well-
connectedness is just a technicality. It turns out however, that it is the
central question for robustness analysis, and many fundamental system
properties reduce to robust well-connectedness; we will, in fact, dedicate
this chapter to answering this question in a variety of cases. Subsequently,
in Chapter 9 these techniques will be applied to some important problems
of robust feedback design.
A simple rst case of evaluation of robust well-connectedness is when the
set is the unit ball in operator space. In this case of so-called unstructured
uncertainty, the analysis can be reduced to a small-gain property. The term
232 8. Uncertain Systems
unstructured uncertainty is used because we are not imposing any structure
on the perturbations considered except that they are contractive.
Theorem 8.2. Let Q be an operator and = f 2 L(L2) : kk 1g.
Then I ; Q is nonsingular for all 2 if and only if
kQk < 1:
Proof . The \if" direction, observe that for any 2 , we have the norm
inequalities
kQk kQkkk < 1:
Then it follows from the small-gain theorem in Chapter 3 that I ; Q is
invertible.
For the \only if" direction, we must show that if kQk 1 then there
exists 2 with I ; M11 singular. From our work in Chapter 3, we
obtain the spectral radius condition
(QQ ) = kQk2 = kQk2 1;
where QQ has only nonnegative spectrum, so I ; QQ is singular with
= kQk2.
Dividing by we see that I ; QQ;1 is singular, and thus we set
= ;1 Q , which is contractive since kk = kQk;1 1.
The preceding result reduces the analysis of whether the system of Figure
8.1 is robust well-connected under unstructured uncertainty, to evaluating
the norm of M11. If for instance M11 is a nite dimensional, causal LTI
operator, a computational evaluation follows from the tools of Chapter 7.
We remark that there is nothing special about perturbations of size one.
If the uncertainty set were specied as f 2 L(L2 ) : kk g, the cor-
responding test would be kM11 k < 1= ; now, since a normalizing constant
can always be included in the description of M , we will assume from now
on that this normalization has already been performed and our uncertainty
balls are of unit size.
Therefore the analysis of well-connectedness is simple for unstructured
uncertainty. There are, however, usually important reasons to impose ad-
ditional structure on the perturbation set , in addition to a norm bound.
We explain two such reasons here, the rst is illustrated by the following
example.
Example:
The gure depicts the cascade interconnection of two uncertain systems
of our standard form; i.e., a composition of the uncertain operators. It is
a routine exercise, left for the reader, to show that the interconnection is
equivalent to a system of the form of Figure 8.1, for an appropriately chosen
8.1. Uncertainty modeling and well-connectedness 233
- 1 - 2
p1 q1 p2 q2
N H
Figure 8.2. Cascade of uncertain systems
M and
p = pp1 ; q = qq1 ; = 01 0 :
2 2 2
Therefore we nd that the uncertainty set for the composite system
will have, by construction, a block diagonal structure.
Thus we see that system interconnection generates structure; while com-
ponents can be modeled by unstructured balls as described in previous
examples, a spatially structured uncertainty set can be used to re
ect the
complexity of their interconnection. We remark that many other forms of
interconnection (e.g. feedback) can also be accommodated in this setting.
We now move to a second source of uncertainty structure; this arises
when in addition to robust well-connectedness, one wishes to study the
norm of the resulting operator set S (M; ). That is assuming (M11 ; ) is
well-connected, does
kS(M; )k < 1
hold for all 2 . We have the following result which states that the latter
contractiveness problem can be recast as a well-connectedness question.
Proposition 8.3. Dene the perturbation set
= = 0u 0p : u 2 ; and p 2 L(L2 ) with kpk 1 :
p
and
I ; M = I ;;MM11u I0
21 u
is nonsingular by assumption. Therefore (I ; M11 u );1 exists for all u 2
.
Now looking at (8.2) for any xed u 2 , and all p satisfying kp k
1, we have that
I ; S(M; u )p is nonsingular :
Once again, Theorem 8.2 implies that we must have kS(M; u )k < 1 for
all u 2 .
Thus we have identied two reasons for the introduction of spatial,
block-diagonal structure to our perturbation set . The rst is that if
a system is formed by the interconnection of subsystems, then this struc-
ture arises immediately due to perturbations which may be present in the
subsystems. Our second motivation is because this will enable us to con-
sider the performance associated with the closed-loop. The next section
is devoted to studying the implications of this structure in the analysis
of well-connectedness. Later on in this chapter we will explore additional
structural constraints which can be imposed on our uncertainty set.
general not time invariant, this structure is sometimes called linear time-
varying (LTV) uncertainty. We have used the more general term arbitrary
block-structured uncertainty since
(i) From a modeling perspective, this uncertainty set is often not mo-
tivated by time variation. As explained above, we could well be
modeling a nonlinearity;
(ii) From a mathematical perspective, the relation Ra is indeed time
invariant; that is if (p; q) 2 Ra , then (S p; S q) 2 Ra , where S is
the shift on L2 [0; 1). While a linear parametrization of Ra involves
time-varying operators, we will see below that the time invariance of
Ra is indeed the central property required for our analysis.
We now proceed to the study of robust well-connectedness over the set
. Throughout we will assume that the nominal system M is an LTI
a
operator. Our objective is to study under what conditions (M11 ; ) is a
robustly well-connected. Recall that by this we mean
I ; M11 is invertible, for all 2 :a
For convenience, we will drop the subindex from M11 in the following
discussion and simply write M .
Notice that the inverse ;;1 of any ; 2 ; automatically has the same
commuting property, and we can write the identity
;(I ; M );;1 = I ; ;M ;;1; for all 2 : a
8.2. Arbitrary block-structured uncertainty 237
Thus I ; M is singular if and only if I ; ;M ;;1 is singular. This means
that if we can nd an element ; of the commutant satisfying
k;M ;;1k < 1;
then we can guarantee that (M; ) is robustly well-connected simply
a
by invoking the small gain condition. This motivates us to describe these
commuting operators.
You will show in the exercises at the end of the chapter, that ; 2 ; if
and only if ; is of the form
2 3
1 I 0 0
6 ... ... 7
; = 66 0. . .
6 7
7;
4 .. . . . . 0 75
0 0
d I
for some constant nonzero scalars
k in C . Clearly since such an operator
; is memoryless it can also be thought of as a matrix.
With this description we recapitulate our previous discussion as follows.
Proposition 8.5. Suppose M is a bounded operator. If there exists ; in
; such that
k;M ;;1k < 1;
then (M; ) is robustly well-connected.
a
system. In order to achieve this goal we will employ some of the ideas and
results from convex analysis introduced in x1.2.2. Our strategy will be to
translate our problem to one in terms of the separation of two specic sets
in Rd . We will be able to show that well-connectedness implies that these
sets are strictly separated, and that the scaled small-gain condition implies
there exists a hyperplane that strictly separates these sets. We will then
obtain our necessity result by showing that the strict separation of the
sets implies the stronger condition that there exists a strictly separating
hyperplane between them.
It will be advantageous to again look at the relation Ra introduced in
the preliminary discussion of this section and dened in (8.3), rather than
its parametrization by . Also we will wish to describe our relation Ra
a
in terms of quadratic inequalities of the form kpk k2 ; kqk k2 0. Out of
convenience we rewrite these as
kEk pk2 ; kEk qk2 0; k = 1 : : : d; (8.7)
where the mk m matrix
Ek = 0 0 I 0 0
selects the k-th block of the corresponding vector. The spatial dimensions
of Ek are such that any element ; = diag(
1 I; : : :
dI ) in the set P ; can
240 8. Uncertain Systems
be written as
; =
1 E1 E1 +
d Ed Ed : (8.8)
Inequalities such as the one in (8.7) are called integral quadratic constraints
(IQCs); the name derives from the fact that they are quadratic in p and
q, and involve the energy integral. We will have more to say about general
IQCs in Chapter 10.
We wish to study the interconnection of Ra to the nominal system M ;
for this purpose, we impose the equation p = Mq on the expression (8.7)
and dene the quadratic form k : L2 ! R by
k (q) := kEk Mqk2 ; kEk qk2 = hq; (M Ek Ek M ; Ek Ek )q i (8.9)
With this new notation, we make the following observation.
Proposition 8.7. If (M; ) is robustly well-connected, then there cannot
a
exist a nonzero q 2 L2 such that the following inequalities hold:
k (q) 0; for each k = 1; : : : ; d: (8.10)
This proposition follows from the reasoning: if such a q existed, we would be
assured that (Mq; q) 2 Ra . Thus by the earlier discussion, see Lemma 8.4,
there would exist a 2 such that
a
Mq = q;
implying that the operator (I ; M ) would be singular. And therefore,
invoking Lemma 3.16, the operator I ; M would also be singular. Thus
we have related robust well-connectedness to the infeasibility of the set of
inequalities in (8.10).
We now wish to relate the quadratic forms k to the scaled small-gain
condition. In fact this connection follows readily from version (ii) in Propo-
sition 8.6 by expressing the elements of P ; as in (8.8), which leads to the
identity
d
X d
X
hq; (M ;M ; ;)qi =
k hq; (M Ek Ek M ; Ek Ek )q i =
k k (q):
k=1 k=1
The following result follows immediately by considering condition (ii) of
Proposition 8.6; the reader can furnish a proof.
Proposition 8.8. The equivalent conditions in Proposition 8.6 hold if and
only if there exist scalars
k > 0 and > 0 such that
1 1 (q) + +
d d (q) ;kqk2; for all q 2 L2: (8.11)
Furthermore, if such a solution exists for (8.11) then there does not exist
a nonzero q in L2 satisfying (8.10).
We have thus related both robust well-connectedness and the scaled
small-gain test, to two dierent conditions involving the quadratic forms k .
8.2. Arbitrary block-structured uncertainty 241
The former condition states the infeasibility of the set of constraints (8.10);
the latter condition is expressed as a single quadratic condition in (8.11)
with multipliers
k , and implies the former. In the eld of optimization a
step of this sort, using multipliers to handle a set of quadratic constraints,
is sometimes called the S-procedure. The S-procedure is termed lossless
when there is no conservatism involved, namely when the conditions are
equivalent. In what follows we will show this is the case here for us.
To demonstrate this we rst express our ndings in geometric terms so
that we can bring convex analysis to bear on the problem. To this end
introduce the following subsets of Rd :
= f(r1 ; : : : ; rd ) 2 Rd : rk 0; for each k = 1; : : : ; dg;
r = f(1 (q); : : : ; d (q)) 2 Rd : q 2 L2 ; with kqk2 = 1g:
Here is the positive orthant in Rd , and r describes the range of the
quadratic forms k as q varies in the unit sphere of L2. For brevity we will
write (q) := (1 (q); : : : ; d (q)) and
= (
1 ; : : :
d ). Therefore with the
standard inner product in Rd , we have
h
; (q)i =
1 1 (q) + +
dd (q):
Now we are ready to interpret our conditions geometrically. From the
discussion and Proposition 8.7 we see that robust well-connectedness im-
plies that the closed set and the set r are disjoint, that is \r = ;. Now
r is a bounded set but it is not necessarily closed. Therefore we cannot
invoke Proposition 1.3 to conclude these sets are strictly separated when
they are disjoint. Nonetheless this turns out to be true.
Proposition 8.9. Suppose that (M; ) is robustly well-connected over
. Then the sets and r are strictly separated, namely
a
r r
(a) (b)
Figure 8.3. (a) D(; r) > 0; (b) , r separated by a hyperplane.
Consequently, if we wish to show that the conditions of Proposition 8.6
are necessary for robust well-connectedness, we need to prove that strict
separation of the sets r and automatically implies the existence of a
strictly separating hyperplane. This leads us to think of convex separation
theory, reviewed in x1.2.2. The closed positive orthant is clearly a convex
set. As for the set r, we have the following result. Its proof will rely strongly
8.2. Arbitrary block-structured uncertainty 243
on the time invariance of the operator M since it implies the forms k also
enjoy such a property.
Lemma 8.11. The closure r is convex.
Proof . We introduce the notation
Tk := M Ek Ek M ; Ek Ek :
For each k = 1; : : : ; d, the operator Tk is self adjoint and time invariant;
also from (8.9) we know that k (q) = hTk q; qi.
Choose two elements y = (q) and y~ = (~q ) from the set r. By denition,
kqk = kq~k = 1. We wish to demonstrate any point on the line segment
that joins them is an element of r . That is for any 2 [0; 1] we have
y + (1 ; )~y 2 r . Given such an let
p p
q := q + 1 ; S q~;
where S is the usual time shift operator.
Our rst goal is to examine the behavior of (q ) as tends to innity.
It is convenient to do this by considering each component k (q ). So for
any given k we compute
k (q ) = hTk q ; q i
p
= hTk q; qi + (1 ; )hTk S q~; S q~i + 2 (1 ; )RehTk q; S q~i:
(8.13)
We rst observe, from the time invariance of Tk , that the second term on
the right hand side satises
hTk S q~; S q~i = hS Tk q~; S q~i = hTk q~; q~i = k (~q ):
We this observation we can rewrite (8.13) as
p
k (q ) = k (q) + (1 ; )k (~q) + 2 (1 ; )RehTk q; S q~i:
Now we let ! 1. Clearly the inner product on the right converges to zero
since the rst element is xed, and the second is being shifted to innity.
Thus we have
!1 k (q ) = k (q) + (1 ; )k (~q ):
lim
Collect all components, for k = 1; : : : ; d, to conclude that
lim (q ) = y + (1 ; )~y:
!1
Thus we have succeeded in obtaining the convex combination y + (1 ;
)~y as a limit of vectors in the range of ; however we have not established
that the q have unit norm, as required in the denition of r; to address
this, note that
lim kq k2 = kqk2 + (1 ; )kq~k2 = 1:
!1
244 8. Uncertain Systems
This follows by the same argument as the above, replacing Tk by the iden-
tity operator. Thus the q have asymptotically unit norm, which implies
that
lim kqq k = y + (1 ; )~y:
!1
Now by denition the elements on the left are in r, for every , so we
conclude that
y + (1 ; )~y 2 r :
Finally, by continuity the same will hold if we choose the original y and y~
from r rather than r; this establishes convexity.
We can now assemble all the results of this section to show that the
scaled small-gain condition is indeed necessary for well-connectedness of
(M; ). This result is a direct consequence of the following.
a
that they are LTI, or static, or memoryless, etc.). We will assume that
P () imposes no norm restrictions, and in fact will further assume that if
satises P (), then so does
for any
> 0. Namely the set
C := f 2 L(L2 ) : the property P () is satised.g
is a cone (the cone generated by ). These assumptions are implicit in
what follows.
246 8. Uncertain Systems
Denition 8.13. The structured singular value of an operator M with
respect to a set which satises the above assumptions, is
(M; ) := inf fkk : 2 C and 1
I ; M is singularg (8.15)
when the inmum is dened. Otherwise (M; ) is dened to be zero.
The inmum above is undened only in situations were there are no per-
turbations that satisfy the singularity condition. This occurs for instance
if M = 0. If perturbations exist which make I ; M singular, the inmum
will be strictly positive and therefore (M; ) will be nite. We remark
that the terminology structured singular value originates with the matrix
case, which was the rst to be considered historically, and is discussed in
the next section.
To start our investigation we have the following result which provides an
upper bound on the value of the structured singular value. This result is a
restatement of the small-gain result.
Proposition 8.14. (M; ) kM k, with equality in the unstructured
case = f 2 L(L2 ) : kk 1g.
Proof . If kk < kM k;1, we know that I ; M is nonsingular by small-
gain. Therefore the inmum in (8.15) is no less than kM k;1, and thus the
rst statement follows by inversion.
In the unstructured uncertainty case, C is the entire space L(L2 ); by
scaling in Theorem 8.2, we can always construct 2 L(L2 ), kk = kM k;1,
and I ; M singular. This achieves the inmum and proves the desired
equality.
So we see that the structured singular value reduces to the norm if is
unstructured; more precisely (M; BL(L2) ) = kM k where BL(L2) denotes
the unit ball of L(L2 ). Furthermore if the uncertainty is arbitrary block-
structured we have the following version of Theorem 8.12.
Proposition 8.15. Let M be an LTI, bounded operator on L2. Then
(M; ) = inf k;M ;;1k
;2;
a
Notice that this set is simply the intersection of the LTI operators on
L2 [0; 1) with the set . Bringing in the Laplace transform from Chap-
a
ter 3 we see that each 2 can be identied with its transfer function
^ 2 H1 , which inherits the corresponding spatial structure.
TI
We now introduce the relation RTI that goes along with the operator
set . It is possible to show given an element q in L2 [0; 1) that
TI
where
RTI =
f(p; q) 2 L2 : ^ k 2 H1 ; k^ k k1 1; j^ k (j!)^pk (j!)j = jq^k (j!)jg:
From this we see that given any (p; q) 2 RTI the relationship
jp^k (j!)j jq^k (j!)j; for almost every !;
must be satised since (^ k (j!) ) 1. As with Ra we have established a
constraint between the sizes of the various components of p and q. However
the description is now much tighter since it is imposed at (almost) every
frequency. In contrast Ra can be written as
Z 1 Z 1
Ra = (p; q) 2 L2 : jp^ (j!)j2 d!
k jq^ (j!)j2 d!;
k k = 1; : : : ; d ;
0 0
which clearly shows that the relation Ra only imposes quadratic constraints
over frequency in an aggregate manner.
As a modeling tool, time invariant uncertainty is targeted at describing
dynamics which are fundamentally linear and time invariant, but which we
do not desire to model in detail in the nominal description M . Thus M
could be a low order approximation to the linearized system, leaving out
high dimensional eects which arise, for instance, from small scale spatially
distributed dynamics. Instead of simply neglecting these dynamics, time
invariant perturbations provide a formalism for including or containing
them in RTI using frequency domain bounds.
p1 q2
G1
?
1 2
q1-
G2
p2 6
Figure 8.4. Example system.
The uncertain system of Figure 8.4 is comprised of two xed linear time
invariant systems G1 , G2 , and two uncertainty blocks 1 , 2 . All blocks are
single-input and -output. We are interested in whether this conguration
is well-connected in the sense of I ; 1 G1 2 G2 being nonsingular. We can
routinely redraw this system in the standard conguration of Figure 8.1,
with
0 G 1
M = M11 = G 0 and = 0 : 1 0
2 2
Clearly I ; 1 G1 2 G2 is singular if and only if I ; M is singular. Suppose
k1 k 1 and k2 k 1. To investigate the well-connectedness of this
system we can invoke the small gain theorem and impose the sucient
condition for robust well-connectedness given by
k1 G1 2 G2 k < 1:
If the uncertainty is time invariant, then 2 and G1 commute and we
can use the submultiplicative inequality to obtain the stronger sucient
condition
kG1 G2 k = kG^ 1 G^ 2 k1 = ess sup jG^1 (j!)G^2 (j!)j < 1 (8.18)
!2R
for robust well-connectedness. If instead 1 and 2 are arbitrary contrac-
tive operators, we are not allowed to commute the operators, so we can
only write the small-gain condition
kG1 kkG2k = kG^ 1 k1 kG^ 2 k1 = (ess sup jG^1 (j!)j)(ess sup jG^2 (j!)j) < 1:
!2R !2R
(8.19)
These conditions are dierent in general: frequently (8.19) can be more
restrictive than (8.18) since values of jG^ 1 (j!)j and jG^ 2 (j!)j at dierent
8.4. Time invariant uncertainty 251
frequencies can give a larger product. For instance
G^1 (s) = s + 1 ; G^2 (s) = s +s 1
will satisfy (8.18) for < 2, but (8.19) only for < 1. An extreme case
would be if G^ 1 (j!) and G^ 2 (j!) had disjoint support.
To interpret these frequency domain conditions, notice that for 2 ,
the transfer functions ^ 1 (j!) and ^ 2 (j!) are contractive at every !, thus
TI
the small gain analysis can be decoupled in frequency, which makes (8.18)
sucient for well-connectedness. When the perturbation class is the a
operators 1 and 2 are still contractive, but they are allowed to \transfer
energy" between the frequencies where G^1 and G^2 achieve their maximum
gain, making well-connectedness harder to achieve.
It turns out that these conditions are also necessary for robust well-
connectedness in each respective case. While this can be shown directly in
the conguration of Figure 8.4, it is illustrative to write it in the standard
form and apply the robustness analysis techniques of this chapter. This is
left as an exercise.
We now proceed with the analysis of uncertain systems over . A TI
standing assumption throughout the rest of this section is that the nominal
operator M is nite dimensional LTI, i.e. it has a transfer function M^ (s) 2
RH1 .
Our main tool will be the structured singular value , dened as in
Denition 8.13 for the present uncertainty set . Following the general
TI
approach outlined in the previous section, the rst task is to identify the
operators ; which commute with the structure . Notice that such an
TI
operator must commute with the delay operator S , for every 0, since
the delay S is itself a member of . Thus we see that such an ; is
TI
necessarily time invariant. If we also take into account the spatial structure,
then it is shown in the exercises that the nonsingular commuting set is
; =
TI
at each frequency. Before we can accomplish this we rst need one more
property of the matrix structured singular value, in addition to the ones
above, namely that it satises a maximum principle over matrix functions
that are analytic in a complex domain.
To obtain this result we need a few preliminary facts regarding polyno-
mials with complex variables. The rst is a continuity property of the roots
of a complex polynomial as a function of its coecients. In plain language
this result says that if all the coecients of two polynomials are sucient
close to each other, then their respective roots must also be near each other.
Lemma 8.19. Let p() = ( ; 1 )( ; 2 ) ( ; n ) be an n-th order
polynomial over C ( 6= 0). If p(k) ( ) is a sequence of n-th order polyno-
mials with coecients converging to those of p( ), then there exists a root
1(k) of p(k) ( ) such that 1(k) ! 1 .
Proof . At each k, we dene the factorization
p(k) ( ) = (k) ( ; 1(k) )( ; 2(k) ) ( ; n(k) )
such that 1(k) is the root closest to 1 (i.e. j1 ; 1(k) j j1 ; i(k) j for every
i = 1 : : : n).
Since the coecients of p(k) ( ) converge to those of p( ), we can evaluate
at = 1 and see that p(k) (1 ) ! p(1 ) = 0 as k ! 1. Also, (k) ! 6= 0.
Therefore
n
Y
(k)
p (1 )
(k ) n
j ; j j ; j = (k ) ;! 0 as k ! 1:
1 1 1 i
i=1
(k)
The next lemma concerns the roots of a polynomial of two complex
variables. It states that a particular type of root must always exist for such
polynomials, one in which the moduli of the arguments are equal.
Lemma 8.20. Let p(; ) be a non-constant polynomial p : C 2 ! C . Dene
= minfmax(j j; j j) : p(; ) = 0g (8.20)
Then there exist ? ; ? such that p( ? ; ? ) = 0 and j ? j = j ? j = .
Proof . We start with ?; ? that achieve the minimum in (8.20). The only
case we need to consider is when > 0 and one of the above numbers has
magnitude less than . Take, for instance, j ? j = , j ? j < .
254 8. Uncertain Systems
By setting = ? and grouping terms in the powers of , we write
N
X
p(; ? ) = an ( ? ) n : (8.21)
n=0
Suppose that the above polynomial in is not identically zero. If we replace
? by (1 ; ) ? , its coecients are perturbed continuously, so we know from
Lemma 8.19 that as ! 0+, the perturbed polynomial will have a root
converging to ? . Thus, for small enough , we have
j j < ; j(1 ; ) ? j < ; p( ; (1 ; ) ? ) = 0:
This contradicts the denition of ; therefore the only alternative is that
the polynomial (8.21) must be identically zero in . But then the value of
can be set arbitrarily, in particular to have magnitude , yielding a root
with the desired property.
We are now in a position to state the maximum principle for the matrix
structured singular value. Just like the maximum modulus theorem for
scalar analytic functions, this theorem asserts that structured singular value
of a function that is analytic in the right half-plane achieves its supremum
on the boundary of the half-plane; we will actually treat the case of a
rational function.
Theorem 8.21. Let M^ (s) be a function in RH1, and s;f be a
perturbation structure in the set of complex matrices. Then
sup (M^ (s); s;f ) = sup (M^ (j!) s;f ):
Re(s)0 !2R
Proof . We rst convert the problem to a maximization over the unit disk,
by means of the change of variables
s = 11 + :
;
This linear fractional transformation maps fj j 1; 6= 1g to fRe(s) 0g,
and the point = 1 to s = 1. Also the boundary j j = 1 of the disk maps
to the imaginary axis.
Noticing that M^ (s) 2 RH1 has no poles in Re(s) 0 and has a nite
limit at s = 1, we conclude that the rational function
^ ^
Q( ) = M 1 ; 1 +
has no poles over the closed unit disk j j 1. With this change, our theorem
reduces to showing that
0 := sup (Q^ ( ); s;f ) = sup (Q^ ( ); s;f ):
jj1 jj=1
8.4. Time invariant uncertainty 255
Notice that by continuity of the matrix structured singular value, the supre-
mum 0 on the left is a maximum, achieved at some 0 , j0 j 1; we must
show it occurs at the disk boundary.
It suces to consider the case 0 > 0. By denition, we know that
1=0 = (0 ), where
0 := arg minf () : 2 s;f and I ; Q^ (0 ) is singular g:
Now consider the equation
det[I ; Q^ ( ) 0 ] = 0
in the two complex variables , . Since Q^ ( ) is rational with no poles in
j j 1, we can eliminate the common denominator and obtain a polynomial
equation
p(; ) = 0
equivalent to the above for j j 1. We claim that
1 = minfmax(j j; j j) : p(; ) = 0g:
In fact, the choice = 0 , = 1 gives p(0 ; 1) = 0 and max(j0 j; 1) = 1: If
we found p(; ) = 0 for some j j < 1, j j < 1, we would have
I ; Q^ ( ) 0 singular, 0 2 s;f ; ( 0 ) < 1=0:
This would give
(Q^ ( ) s;f ) > 0
contradicting the denition of 0 .
Thus we have proved our claim, which puts us in a position to apply
Lemma 8.20 and conclude there exists a root ( ? ; ? ) of p(; ) with j ? j =
j ? j = 1. Consequently,
I ; Q^ ( ? ) ? 0
is singular, with ( ? 0 ) = 1=0 . So we conclude that, for j ? j = 1,
(Q^ ( ? ) s;f ) = 0 :
At this point we have assembled all the results required to prove the
following theorem, which is a major step in our analysis of time invariant
uncertainty. This theorem converts the well-connectedness of (M; ) TI
to a pure matrix test at each point in the frequency domain. It there-
fore has clear computational implication for the analysis of time invariant
uncertainty.
Theorem 8.22. Assume M is a time invariant bounded operator, with its
transfer function in RH1 . The following are equivalent:
256 8. Uncertain Systems
(a) (M; ) is robustly well-connected;
TI
TI
In the rst inequality above, notice that we allow the scaling matrix ;!
to be frequency dependent. If in particular we choose this dependence to
be of the form ;^ (j!) where ; 2 ; , then we obtain the second inequality,
TI
which in fact re-derives the bound of Theorem 8.17 for this generalized
spatial structure. Now the intermediate bound appears to be potentially
sharper; in fact it is not dicult to show both bounds are equal and we
return to this issue in the next chapter.
Another important comment is that verication of the test
sup ; inf ;! M^ (j!);;! 1 < 1
! 2R ! 2;s;f
amounts to a decoupled LMI problem over frequency, which is particularly
attractive for computation.
We still have not addressed, however, the conservatism of this bound. The
rest of the section is devoted to this issue; our approach will be to revisit
the method from convex analysis used in x8.2.2, which was based on the
language of quadratic forms, and see how far the analysis can be extended.
We will nd that most of the methods indeed have a counterpart here, but
the extension will fail at one crucial point; thus the bound will in general be
conservative. However this upper bound is equal to the structured singular
value in a number of cases and the tools we present here can be used to
demonstrate this.
We begin by characterizing the quadratic constraints equivalent to the
relation q = p, 2 s;f . The rst part of the following lemma is matrix
version of Lemma 8.4; the second extends the method to repeated scalar
perturbations.
Lemma 8.24. Let p and q be two complex vectors. Then
(a) There exists a matrix , () 1 such that p = q if and only if
p p ; q q 0:
(b) There exists a matrix I , jj 1, such that Ip = q if and only if
pp ; qq 0:
Proof . Assume p =6 0, otherwise the result is trivial.
8.4. Time invariant uncertainty 259
(a) Clearly if p = q, () 1 then jpj jqj or p p ; q q 0.
Conversely, if jpj jqj we can choose the contractive rank one matrix
= jqp
pj2 :
(b) If p = q, jj 1, then pp ; qq = pp (1 ; jj2 ) 0. Conversely, let
pp ; qq 0; this implies ker p ker q and therefore Im q Im p.
So necessarily there exists a complex scalar which solves
q = p:
Now 0 pp ; qq = pp(1 ; jj2 ) implies jj 1.
Having established a quadratic characterization of the uncertainty
blocks, we now apply it to the structured singular value question.
Suppose I ; M is singular, with 2 s;f . It will be more convenient
to work with I ; M which is also singular (see Lemma 3.16). Let q 2 C m
be nonzero satisfying (I ; M )q = 0, so q = (Mq). Given the structure
of , we can use the previous lemma to write quadratic constraints for the
components of q and Mq. We proceed analogously to x8.2.2. First, we write
these components as Ek q and Ek Mq where
Ek = 0 0 I 0 0 :
Next we dene the quadratic functions
k (q) =Ek Mqq M Ek ; Ek qq Ek ;
k (q) =q M Ek Ek Mq ; q Ek Ek q :
Finally we bring in the sets
rs;f := f(1 (q); : : : ; s (q); s+1 (q); : : : ; s+f (q)) : q 2 C m ; jqj = 1g
s;f := f(R1 ; : : : ; Rs ; rs+1 ; : : : ; rs+f ); Rk = Rk 0; rk 0g:
The latter are subsets of V = H m1 H ms R R, that is a real
vector space with the inner product
s
X sX
+f
hY; Ri = Tr(Yk Rk ) + yk rk :
k=1 k=s+1
The following characterization is the counterpart of Proposition 8.9.
Proposition 8.25. The following are equivalent:
(a) (M; s;f ) < 1;
(b) the sets rs;f and s;f are disjoint.
260 8. Uncertain Systems
Proof . The sets rs;f and s;f intersect if and only if there exists q, with
jqj = 1 satisfying
k (q) 0; k = 1 : : : s;
k (q) 0; k = s + 1; : : : s + f:
Using Lemma 8.24, this happens if and only if there exist contractive k ,
k = 1; : : : ; s, and k , k = s + 1; : : : ; s + f satisfying
k Ek Mq =Ek q; k = 1; : : : ; s
k Ek Mq =Ek q; k = s + 1; : : : ; s + f:
Putting these blocks together, the latter is equivalent to the existence of
2 s;f and q such that () 1, jqj = 1 and
Mq = q
which is equivalent to (I ; M ) being singular for some 2 s;f , ()
1. By denition, this is the negation of (a).
Having characterized the structured singular value in terms of properties
of the set rs;f , we now do the same with the upper bound of Proposition
8.23. Once again, the parallel with x8.3 carries through and we have the
following counterpart of Proposition 8.10.
Proposition 8.26. The following are equivalent:
(a) There exists ; 2 ;s;f satisfying (;M ;;1) < 1;
(b) The convex hull co(rs;f ) is disjoint with s;f ;
(c) There exists a hyperplane in V which strictly separates rs;f and s;f ;
that is, there exists ; 2 V and ; 2 R such that
h;; Y i < h;; Ri for all Y 2 rs;f ; R 2 s;f : (8.22)
Proof . Notice that s;f is convex, so (b) implies (c) by the hyperplane
separation theorem in nite dimensional space. Conversely, if a hyperplane
strictly separates two sets it strictly separates their convex hulls, so (c)
implies (b). It remains to show that these conditions are equivalent to (a).
Starting from (a), we consider the LMI equivalent M ;M ; ; < 0, for
some ; 2 ;s;f . For every vector q of jqj = 1, we have
q (M ;M ; ;)q < 0:
Now we write
s
X sX
+f
;= Ek ;k Ek +
k Ek Ek
k=1 k=s+1
8.4. Time invariant uncertainty 261
that leads to the inequality
s
X
(q M Ek ;k Ek Mq ; q Ek ;k Ek q) +
k=1
sX
+f
(
k q M Ek Ek Mq ;
k q Ek Ek q) < 0
k=s+1
for jqj = 1. Taking a trace, we rewrite the inequality as
s
X
Tr (;k (Ek Mqq M Ek ; Ek qq Ek )) +
k=1
sX
+f
k (q M Ek Ek Mq ; q Ek Ek q) < 0
k=s+1
which we recognize as h;; Y i < 0, with
Y = (1 (q); : : : ; s (q); s+1 (q); : : : ; s+f (q)) and
; = (;1 ; : : : ; ;s ;
s+1 ; : : : ;
s+f )
Also since ;k > 0,
k > 0 we conclude that
h;; Ri 0 for all R 2 s;f
so we have shown (8.22). The converse implication follows in a similar way
and is left as an exercise.
The last two results have characterized the structured singular value and
its upper bound in terms of the sets rs;f and s;f ; if in particular the set
rs;f were convex, we would conclude that the bound is exact. This was
exactly the route we followed in x8.2.2. However in the matrix case the
set rs;f is not convex, except in very special cases. Recalling the proof of
Lemma 8.11, the key feature was the ability to shift in time, which has
no counterpart in the current situation. In fact, only for structures with
a small number of blocks can the bound be guaranteed to be exact. Such
structures are called -simple. We have the following classication.
Theorem 8.27.
(M; s;f ) = inf (;M ;;1 )
;2;s;f
holds for all matrices M if and only if the block speciers s and f satisfy
2s + f 3 :
For the alternatives (s; f ) 2 f(0; 1); (0; 2); (0; 3); (1; 0); (1; 1)g that satisfy
2s + f 3, the proof of suciency can be obtained using the tools intro-
duced in this section. In particular, an intricate study of the set rs;f is
262 8. Uncertain Systems
required in each case; details are provided in Appendix C. Counterexam-
ples exist for all cases with 2s + f > 3. We remark that the equality may
hold if M has special structure.
As an interesting application of the theorem for the case (s; f ) = (1; 1),
we invite the reader to prove the following discrete time version of the KYP
lemma. As a remark, in this case the I block appears due to the frequency
variable z , not to uncertainty; this indicates another application of the
structured singular value methods, which also extends to the consideration
of multi-dimensional systems, as will be discussed in Chapter 11.
Proposition 8.28. Given state space matrices A; B; C and D, the follow-
ing are equivalent.
(a) The eigenvalues of A are in the open unit disc and
sup (C (I ; zA);1 zB + D) < 1 ;
z2D
(b) There exists a symmetric matrix ; > 0 such that
A B ; 0 A B ; ; 0 <0:
C D 0 I C D 0 I
This completes our investigation of time invariant uncertainty and in
fact our study of models for uncertain systems. We now move to the next
chapter where we use our results and framework to investigate stability and
performance of uncertain feedback systems.
8.5 Exercises
1. This example is a follow-up to Exercise 6, Chapter 5 (with a slight
change in notation). As described there, a natural way to perturb a
system is to use coprime factor uncertainty of the form
P^ = (N^ + ^ N )(D^ + ^ D );1 :
Here P^0 = N^ D^ ;1 is a nominal model, expressed as a normalized
coprime factorization, and
^
= ^ N
D
is a perturbation with a given norm bound.
a) Find M such that algebraically we have P = S(M; ). Is your
M always a bounded operator?
b) Repeat the questions of part a) to describe the closed loop map-
ping from d1 , d2 to q in the diagram given in Exc. 6, Chapter
5.
8.5. Exercises 263
2. Consider the partitioned operators
N= N 11 N12 H11 H12
N21 N22 ; H = H21 H22 :
Find the operator M such that Figure 8.1 represents
a) The cascade of Figure 8.2, i.e.
S(M; ) = S(N; 1 )S(H; 2 ) with = diag(1 ; 2 ):
b) The composition S(M; ) = S(H; S(N; )), where we assume
that I ; N22 H11 has a bounded inverse. Draw a diagram to
represent this composition.
c) The inverse S(M; ) = S(N; );1 , where we assume that N22
has a bounded inverse.
3. Commutant sets.
a) Let ; 2 C nn . Prove that if ; = ; for all 2 C nn , then
; =
In ,
2 C .
b) Let ; 2 C nn . Show that if ; = ; for all structured =
diag(1 ; : : : ; d ), then ; = diag(
1 I; : : : ;
d I ), where
i 2 C
and the identity matrices have appropriate dimensions.
c) Let ; 2 C nn . Show that if ; = ; for all 2 s;f , then
; 2 ;s;f .
d) Characterize the commutant of in L(L2 ) (i.e. the set ;
TI TI
of operators that commute with all members of ). TI
- R
p q
Q
z w
9
Feedback Control of Uncertain
Systems
p q
z G w
y u
K
p q
M11 M12
z M21 M22 w
Figure 9.2. Robustness analysis setup
Now this setup loops exactly like the arrangement from Chapter 8. Indeed
our strategy is to study the robust stability of the overall system Figure
9.1 using the approach and results of Chapter 8.
We say the system of Figure 9.1 is robustly stable when it is stable for
every 2 . When robust stability holds we will consider questions of
c
both robust stability and the nominal performance condition kM22k < 1; it
9.1. Stability of feedback loops 273
should be clear, however, that robust performance is a stricter requirement
than the combination of these two.
At this point it is apparent our work in Chapter 8 will pay large divi-
dends provided that we can rigorously connect the equations which satisfy
the interconnection in Figure 9.1 to the simplied setup in Figure 9.2. Al-
though the diagrams appear to make this obvious, we must be careful to
ensure that the diagrams say the same thing as the equations which actu-
ally describe the system. We have the following theorem which provides us
with a guarantee of this.
Theorem 9.1. Suppose that 2 L(L2) is causal, and the assumption
of nominal stability holds. If (I ; M11 );1 exists as a bounded, causal
mapping on L2 , then the feedback system in Figure 9.1 is stable, and the
furthermore the mapping w to z is given by (9.1).
This theorem says that analysis of Figure 9.2 provides us with guarantees
for both stability and performance of the system in Figure 9.1, assuming
that (I ; M11);1 is causal in addition to being in L(L2 ). In the following
section we will prove this theorem and in doing so the motivation behind
the causality constraint will become clear.
To prove Theorem 9.1 rigorously will require some additional prepara-
tion, and this proof is the main topic of the next section. Subsequently,
once we have established Theorem 9.1, we will proceed in x9.2 to applying
the results of Chapter 8 directly to analysis of robust stability and per-
formance of feedback systems. Finally in x9.3 we discuss the synthesis of
feedback controllers for robust performance.
p q
G
z w
Figure 9.3. Uncertain plant
solutions to our closed-loop systems equations over L2e , which is a set con-
taining both the domain and codomain of G and K . If we instead attempted
to prove uniqueness of solutions to the equations governing Figure 9.1 over
just L2 , we could not rule out the existence of solutions in L2e . Namely the
fact that the solutions p, q, u, y and z in Figure 9.1 are in L2 , when w is
in L2, should be a conclusion of our analysis not an a priori assumption.
We have the following result, and remark that if a system satises (i){(iii)
we say it is L2e -stable.
Theorem 9.2. Pertaining to Figure 9.1, suppose
(a) That is a linear mapping on L2e and is bounded on L2;
(b) The state space system G is internally stabilized by the state space
controller K ;
(c) The inverse (I ; M11);1 exists as a mapping on L2e and is bounded
on the normed space L2.
Then
(i) Given any w in L2e and initial states of G and K , there exist unique
solutions in L2e for p, q, u, y and z . Furthermore if w is in L2 , so
are the other ve functions;
(ii) The states of G and K tend asymptotically to zero, from any initial
condition, when w = 0;
(iii) The maps w to p, q, u, y and z are all bounded on L2 , when the initial
states of G and K are zero. Furthermore, if the latter condition holds
the mapping w to z is given by (9.1).
This result diers from Theorem 9.1 in two main respects. One: the inverse
(I ; M11 );1 need not be causal, it simply needs to be dened on all of L2e
and bounded on L2 . Two: here we look for solutions to our loop equations
without assuming up front that they necessarily lie in L2 ; if w is chosen in
276 9. Feedback Control of Uncertain Systems
L2 then we see that the solutions to loop equations necessarily lie in L2 . In
x9.1.2 we will see that Theorem 9.1 is a special case of the current theorem.
Proof . We rst prove (i). Choose w in L2e and an initial state
(x(0); xK (0) ) for the state space systems G and K . Now the equations
governing this system are the state space equations for G and K , and the
operator equation q = p. We initially suppose that solutions for p, q, u,
y and z exist in L2e , and we will show that they are unique. Now by rou-
tine algebraic manipulations we can write the equations that govern the
feedback loops of this system as
q = p
x_ L = AL xL + BL1 q + BL2 w; where xL (0) = (x(0); xK (0) )
p = CL1 xL + DL11 q + DL12 w:
Here AL , BL , CL and DL is the state space representation for the inter-
connection of G and K , and thus we know from hypothesis (b) that AL is
Hurwitz. Clearly this is a state space realization for the operator M . From
this we see that
p = M11q + M12 w + CL0 A~L xL (0);
where A~L : C n ! L2 via the relationship A~L xL (0) = eAL t xL (0). Now
applying assumption (c) above we have that
p = (I ; M11 );1 M12 w + (I ; M11);1 CL0 A~L xL (0): (9.3)
Thus we see that p must be uniquely determined since w and xL (0) are
xed, and then the function q is uniquely specied by p. Since the inputs
q, w, and the initial conditions of the state space systems G and K are all
specied, we know from (b) and the work in Chapter 5 that u, y and z are
unique and in L2e .
To complete our demonstration of (i) we must show the existence of
solutions. To see this simply start by dening p from (9.3), given any w 2
L2e and initial condition xL (0), and essentially reverse the above argument.
Also if w is in L2 then by the boundedness of M , (I ; M11 );1 , and A~L
we see that p must be in L2 . Immediately it follows that q must be in L2
since is bounded, and again by the nominal stability in (b) we have now
that the other functions are also in L2.
We now prove (ii). From above we know that if w = 0 we have
p = (I ; M11 );1 CL0 A~L xL (0);
and so q = p is in L2 since AL is Hurwitz and is bounded. Continuing
under these conditions we have
x_ L = AL xL + BL0 q; with some initial condition xL (0).
It is not dicult to show that xL (t) ! 0 as t ! 1, since q is in L2.
9.1. Stability of feedback loops 277
Finally we need to show that the maps from w to the other functions are
bounded on L2 . From above we already know that the maps from w to p
and q are bounded on L2 . This means that the map from w to the state
xL must be bounded on L2 since AL is Hurwitz. Finally the functions z , u
and y are all simply given by matrices times w, xL and q and so the maps
from w to them must also be bounded on L2.
To end, the mapping w to z is given by (9.1) follows routinely from the
above formula for p, and the state space denition of the operator M .
The theorem just proved gives us a set of sucient conditions for L2e
stability where the mapping is dened on L2e and is bounded. Note that
it is not necessarily causal. We would however like to dene our perturba-
tions from the operators on L2 in keeping with our work in Chapter 8, and
we now address this topic.
2
Such examples are impossible in discrete time if starting at time zero.
280 9. Feedback Control of Uncertain Systems
9.2 Robust stability and performance
We are now ready to confront robustness for the general feedback control
setup of Figure 9.1. In this section we will concentrate on the analysis of
robust stability and performance for a given controller K .
We start with the following denition which can be interpreted in terms
of Figure 9.2. With respect to a causal set of contractive perturbations
c
stability or performance, then the feedback system of Figure 9.1 will also
enjoy the respective property.
Theorem 9.5. Suppose K internally stabilizes G. If the uncertain system
(M11 ; ) has robust stability, then the feedback system described by Fig-
c
erties in the feedback conguration of Figure 9.1. Henceforth in this section
we will focus our eort on the uncertain system (M; ). The remainder
c
(ii) The structured singular value condition sup (M^ 11 (j!); s;f ) < 1 holds.
!2R
Proof . By Theorem 8.22 it is sucient to show that robust well-
connectedness of (M11 ; ) is equivalent to robust stability. Since by
TI
denition the latter condition implies the former we need only prove the
converse.
Thus we need to show that if I ; M11 is nonsingular, where 2 ,
then (I ; M11 );1 is causal. This follows directly from frequency domain
TI
a;c
9.2. Robust stability and performance 283
(b) The uncertain system (M; ) is robustly stable;
c
a;p
TI
(c)
sup (M^ (j!); s;f +1 ) < 1: (9.5)
!2R
The proof follows along similar lines as that of Proposition 9.10 and is left
as an exercise. Notice that here the so-called performance block p can be
taken to be LTI.
This concludes our work on analysis of robust stability and performance
in causal feedback systems. We now turn our attention to the synthesis of
controllers.
S (G; K ).
We will only consider perturbation structures for which we have developed
the necessary analysis tools. In this way our problem turns into the search
9.3. Robust Controller Synthesis 285
for a controller K that satises a precise mathematical condition. The
simplest case is when we have unstructured uncertainty, i.e. when
= f 2 L(L2); causal ; kk 1g :
In this case a controller K will be robustly stabilizing if and only if it
satises
kS (G; K )k1 < 1;
this small-gain property follows, for instance, from our more general result
of Theorem 9.7. So our problem reduces to H1 synthesis; in fact, this
observation is the main motivation behind H1 control as a design method,
completing the discussion which was postponed from Chapter 7.
What this method does not consider is uncertainty structure, which as
we have seen can arise in two ways:
Uncertainty models derived from interconnection of more simple
component structures.
Performance block, added to account for a performance specication
in addition to robust stability.
The remainder of this chapter is dedicated to the robust synthesis problem
under the structured uncertainty classes and which we have
a;c TI
studied in detail.
We begin our discussion with the class . Let K be the set of state
a;c
space controllers K which internally stabilize G. Then robust synthesis is
reduced, via Theorem 9.7, to the optimization problem
inf k; 21 S (G; K );; 12 k;
;2P ;;K 2K
which is H1 synthesis under constant scaling matrices; we have constrained
the latter to the positive set P ;, which is a slight change from Theorem
9.7 but clearly inconsequential.
Robust stabilization is achieved if and only if the above inmum is less
than one, since the set contains only contractive operators. We there-
a;c
fore have a recipe to tackle the problem. The main question is whether the
above optimization can be reduced to a tractable computational method.
Note that we have already considered and answered two restrictions of this
problem:
For xed ; we have H1 synthesis;
For xed K we have robustness analysis over . a;c
286 9. Feedback Control of Uncertain Systems
As seen in previous chapters, both these subproblems can be reduced to
LMI computation, and the new challenge is the joint search over the matrix
variable ; and the system variable K .
The rst result is encouraging. It says that we can restrict our search for
a controller to those which have the same dynamic order as the plant.
Proposition 9.12. Let G be an LTI system of order n. If
inf k; 21 S (G; K );; 12 k < 1; (9.6)
;2P ;;K 2K
then there exits a controller K 2 K of order at most n, such that
k; 21 S (G; K );; 12 k < 1:
The proposition states that if the the robust stabilization problem can be
solved, then it can be solved using a controller K of state dimension no
greater than the plant G. Figure 9.4 illustrates the constructions used in
the proof.
G;
; 21 ;; 12
G
K
Figure 9.4. Scaled synthesis problem
XA XB1 ;; 21 C1 1 ; 12 1 N 0
2 3
NX 0 4A;;X21 +
0 I BX
12 1 12
;I 1 ;; 2 D11 ; 2 5 0X I < 0 ;
; C1 ; D11 ;; 2 ;I
(9.7)
Y C1 ; 21 B ;; 12
2 3
NY 0 4AY; 21+CYYA
12 1 ; 12 5 NY 0
0 I 1 ;I 1 ; D11 ; 0 I < 0 (9.8)
;; 21 B1 ;; 12 D11
;2 ;I
X I 0 (9.9)
I Y
where NX and NY are full-rank matrices whose images satisfy
ImNX = ker C2 D21 ;; 21 ;
ImNY = ker B2 D12 ; 21 :
288 9. Feedback Control of Uncertain Systems
So we see that the robust stabilization problem is solvable if and only if
conditions (9.7), (9.8) and (9.9) in the variables X , Y and ; are satised.
This once again emphasizes the nite dimensionality of the problem.
To get these conditions into a more transparent form, it is useful to
redene the outer multiplication factors so that they are independent of ;.
Dene NX , NY to be full rank matrices whose images satisfy
ImNX = ker C2 D21 ;
:
ImNY = ker B2 D12
Then NX , NY are constant, and clearly we can take
I 0
NX = 0 ; 12 NX ; (9.10)
NY = I0 ;;0 12 NY : (9.11)
Substituting with (9.10) into (9.7) gives
C1 ; 211 N 0
2 3
NX 0 4A XB +XXA XB
1
0 I 121
;; D11 ; 2 5 0X I < 0:
12
; C1 ; D11 ;I
As a nal simplication, we can multiply the last row and column of the
preceding LMI by ; 21 . This gives the condition
2 3
NX 0 4A XB +XXA XB
1 C1 ; N 0
;; D11 ; 0 I < 0:
5 X (9.12)
0 I 1
;C 1 ;D 11 ;;
An analogous procedure combining (9.8) and (9.11) leads to
2 ;1 3
NY 0 4AY C+YY A ;Y;C;11 DB1 ;;;1 5 NY 0 < 0:
0 I 1 11 0 I (9.13)
;;1 B1 ;;1 D11 ;;;1
To summarize, we have now reduced robust stabilization to conditions (9.9),
(9.12) and (9.13). Condition (9.12) is an LMI on the variables X , ;, but
unfortunately we nd that (9.13) is an LMI on Y and ;;1 , not on ;. This
means that the conditions, as written are not jointly convex on the variables
X , Y and ;. Thus we have not been able to reduce robust synthesis to LMI
computation. In particular, examples can be given where the allowable set
of ;-scalings is even disconnected.
Of course at this point we may wonder whether some further manip-
ulations and possibly additional changes of variables may not yield an
equivalent problem which is convex. No such conversion is known, but at
the time of writing no mathematical proof to the contrary is known ei-
9.3. Robust Controller Synthesis 289
ther. However insights gained from computational complexity theory (see
remarks below) seem to indicate that such a conversion is impossible.
Since our approach does not provide a convex answer for the general
stabilization problem, the next question is whether there are special cases
where the conclusion is dierent. Clearly if somehow we could get rid of one
of the two conditions (9.12) and (9.13), an LMI would result. One such case
is the so-called full information control problem where the measurements
are
y = xq :
That is the controller has direct access to the states and all the outputs of
the uncertainty blocks. Here
I
C2 = 0 ; D21 = I0 ;
therefore the kernel of [C2 D21 ] is trivial, so the constraint (9.12) disappears
completely. Consequently the variable X can be eliminated and the robust
synthesis problem reduces to (9.13) and Y > 0, hence an LMI problem in
Y and ;;1 .
A dual situation which is also convex is the full control case, where
(9.13) disappears. A few other instances of simplication in synthesis are
mentioned in the references at the end of the chapter. However these cases
are very special and in general robust design under structured uncertainty
remains a dicult problem. An alternative viewpoint which reinforces
this conclusion, is work on bilinear matrix inequalities (BMIs): these are
feasibility problems of the form
f (X; Y ) < 0;
where f is matrix valued and bilinear in the variables X , Y . Our robust
synthesis problem can indeed be rewritten in this form, as is shown in the
references. This is unfortunately not very helpful, since BMIs do not share
the attractive computational features of LMIs; rather, the general BMI
problem falls in an intractable computational complexity class3.
Given this complexity we are led to consider heuristic methods for
optimization; these will be mentioned after discussing synthesis issues
associated with time invariant uncertainty.
We now turn our attention to the uncertainty class . From the pre-
TI
ceding theory the robust stabilization problem reduces in this case to the
3
To be precise it is provably NP-hard, in the terminology of complexity theory.
290 9. Feedback Control of Uncertain Systems
optimization
;
inf sup S^(G; K )(j!); s;f :
K 2K !2R
This synthesis problem is however even harder than the one studied just
now in x9.3.1, since the function to be optimized is dicult to evaluate. In
other words we are starting with a weaker result from the analysis side, so
the addition of K can only make matters worse. For this reason the above
optimization is rarely attempted, and what usually goes by the name of -
synthesis is the minimization based on the upper bound for the structured
singular value. This is the optimization
inf sup inf ;! S^(G; K )(j!);;! 1 :
K 2K !2R ;! 2;s;f
(9.14)
If the above inmum is made less than one, we have a robustly stabilizing
controller from the analysis theory. One should note, however, that except
for -simple structures, the converse is not true, that is the previous method
might fail even when robust stabilization is achievable.
At this point we take the opportunity to discuss the relationship between
the above problem and the one obtained by using scaling operators from
; , namely
TI
P ; = f; 2 L(L2 ) : ; = ;~ ;~ ; ;~ 2 ; g;
TI TI
and we remarked there that such sets are not always contained in the cor-
responding commutant. Here, in fact, operators in P ; are self-adjoint
TI
over L2 [0; 1), so in general they are not LTI; in other words they are not
represented by a transfer function in H1 , except when they are memoryless.
Now the above factorization provides in eect a representation for mem-
bers of P ; in terms of transfer functions in L1 . To be more precise, the
TI
3. Compare k with k;1 ; if these are approximately equal stop, and set
K = Kk as the nal controller. Otherwise continue to next step.
4. Solve for ;k+1 in the scaled gain problem
This is the D-K iteration algorithm, which is so named because the scaling
variable ; is often denoted by D, and thus the algorithm iterates between
nding solutions for D and K . Starting with the trivial scaling ; = I , the
algorithm begins by performing an H1 synthesis in step (2); later in step
(4) a new scaling ; is found, which can be chosen from either ; or ; TI
depending on which type of uncertainty is involved. Then these scalings
are included for a new H1 synthesis; the algorithm stops when there is no
signicant improvement in the scaled norm.
What are the properties of this algorithm? First the achieved perfor-
mance k at any step forms a nonincreasing sequence up to the tolerance
employed in the inmization steps (an exercise). Also if we are dealing with
the uncertainty set the scaling ;k is a constant matrix, and therefore
a;c
the controller Kk is never required to have order larger than that of G. In
contrast, when dealing with the uncertainty set we must nd ratio-
TI
nal scalings by any of the two methods discussed in x9.3.2; in general the
scaling ;k may need to be of arbitrarily high order, and thus so must Kk .
Now notice that the scaled problem always involves the original generalized
plant G; this means that the ;k are modied, not accumulated, in the pro-
cess. Therefore if we impose a restriction on the order of the scalings to be
t, this automatically forces our algorithm to search only over controllers
up to a certain order.
However there is no guarantee that this process converges to the global
minimum, in fact not even to a local minimum: the iteration can get \stuck"
in values which are minima with respect to each variable separately, but
not for both of them at a time. This kind of diculty is not surprising given
the lack of convexity of the problem. Consequently these methods should
be viewed as ways of searching the controller space to improve an initial
design, rather than global solutions to the robust synthesis problem.
9.4. Exercises 295
9.4 Exercises
1. Complete the discussion in x9.1.2 by providing the details for the
proof of Proposition 9.3. Also show that if S and S ;1 are causal and
in L(L2 ), then their extensions to L2e preserve the inverse.
2. Small gain and nonlinearity. Consider the static function
(
f (x) = 2 x for jxj 1;
1
0 for jxj < 1:
Clearly, jf (x)j 12 x so f has small gain. If I is the identity, is the
function I ; f invertible?
3. Prove Proposition 9.11.
4. Consider the robust synthesis problem under a constant scaling ; and
static state feedback, i.e.
2 3
A B1 B2
G^ (s) = 4 C1 D11 D12 5
I 0 0
and K (s) = F , a static matrix. Show that the search for F and ;
reduces to a bilinear matrix inequality (BMI) condition in the vari-
ables F and (X; ;), where X is a square state space matrix as in the
KYP lemma.
5. Show that the sequence k in the D-K iteration of x9.3.3 is nonin-
creasing. Thus the performance of the controllers Kk generated by
the algorithm will improve, or remain unchanged, as the iteration
proceeds.
6. Basis function approach to uncertainty scalings.
a) Consider the scalar transfer function
^(j!) = 0 + 1 1 +1 j! + 1 1 ;1j! + 2 1 +1 !2 ;
which is real valued for every !. Find xed A0 , B0 , and Q ane
in (0 ; 1 ; 2 ), so that
;1 (j!I ; A0 ));1
^(j!) = (j!I ; A0 )
B0 Q B0 : (9.20)
b) Discuss how to modify the previous construction to describe:
(i) A spatially structured matrix
^ (j!) = 0 + 1 1 +1j! + 1 1 ;1 j! + 2 1 +1 !2 ;
(ii) Terms of higher order in 1+1j! and its conjugate.
296 9. Feedback Control of Uncertain Systems
c) Given a scaling ^ (j!) of this form (9.20), and
M^ (s) = CA D
B ;
B B
d) Explain how to reduce condition (9.19) to an LMI.
7. Comparison of frequency scaling approaches.
a) Suppose that a rational scaling from ; is required to minimize
the gain k;S (G; K );;1 k subject to the constraint of having
TI
10
Further Topics: Analysis
p
= q 2 L2 L2 : kEk pk kEk qk; k = 1; : : : ; d ;
where Ek = 0 0 I 0 0 . Introducing the quadratic forms
p = kE pk2 ; kE qk2;
k q k k
the above can be expressed as
Ra = pq 2 L2 L2 : k pq 0; k = 1; : : : ; d :
Notice that the forms k map L2 L2 ! R, which is slightly dierent from
our k 's from Chapter 8. We can also rewrite these quadratic forms as
p = h p ; p i
k q q k q
with
k = Ek0Ek ;E0E :
k k
Now let us combine the constraints in the k by means of multipliers
k > 0.
We dene
d
p :=
X
k k p = h pq ; pq i
q k=1
q
where the matrix
d
k k = ;0 ;0; ;
X
= (10.1)
k=1
and we recall that
d
X
;=
k Ek Ek = diag(
1 I; : : : ;
d I )
k=1
is an element of P ;, the set of positive scaling matrices from earlier
chapters.
300 10. Further Topics: Analysis
From the above discussion, for any p q related by q = p, 2 we a
have the inequality
p 0: (10.2)
q
Next we turn our attention to the robustness analysis results. It was
shown in the previous chapters that the feasibility of the operator inequality
M ;M ; ; < 0
for ; 2 P ;, is necessary and sucient for robust well-connectedness of
(M; ), and also for robust stability of (M; ) when M is causal.
a a;c
or equivalently that
Mq ;kqk2 for every q 2 L2 : (10.3)
q
Notice that the set
RM := Mq q : q 2 L2 L2 L2
is the relation or graph dened by the operator M .
Consequently we can interpret the above results in geometric terms inside
the space L2 L2. Condition (10.3) species that the relation RM lies in
the strict negative cone dened by the quadratic form , whereas (10.2)
states that the uncertainty relation Ra lies in the non-negative cone. Thus
the two relations are quadratically separated by the form .
The results of Chapters 8 and 9 imply that the robust well-connectedness
of (M; ) (respectively the robust stability of (M; )) is equivalent to
a a;c
the existence of of the form (10.1) such that the resulting form provides
this quadratic separation.
What happens with the analysis over ? While the exact (and dicult
TI
to compute) structured singular value tests are of a dierent nature, its
convex upper bound has indeed the same form.
Let be an LTI, self-adjoint operator on L2 (;1; 1), characterized by
the frequency domain L^ 1 function
^
^ (j!) = ;(j!) ^ 0 ;
0 ;;(j!)
10.1. Analysis via Integral Quadratic Constraints 301
where ;^ (j!) 2 P ;s;f , corresponding to the spatial structure s;f of .
TI
The quadratic form
Z 1
(v) = hv; vi = 21 v^(j!) ^ (j!)^v (j!)d!
;1
is thus dened on L2 (;1; 1), and can in particular be restricted to
L2 [0; 1). In particular we will have
p 0
q
whenever q = p and 2 . Also, the L1 condition
TI
0 q (t) 0 q(t) 0
so satises the IQC dened by
= 0 0 :
If in addition we have the contractiveness condition j(t)j 1, it follows
easily that satises the IQC dened by
= ;0 ;0;
for any matrix ; > 0. Now one can always superimpose two IQCs, so we
nd that a contractive, time-varying parameter gain always satises the
IQC dened by
= ; ;; :
Finally, assume the parameter is real, contractive, and also constant over
time (LTI). Then it follows analogously that the component q = p satises
the IQC dened by
^ j!) = ^;^ (j!) ^^(j!) :
( (j!) ;;(j!)
for any bounded ^ (j!) = ;( ^ j!) and ;^ (j!) > 0.
As a nal remark on the modeling aspect, we have not imposed any
a priori restrictions on the allowable choices of IQC; to be interesting,
however, they must allow for a rich enough set of signals. For instance a
negative denite will only describe the trivial set (p; q) = 0. Furthermore
this example shows that arbitrary IQCs need not respect the restriction
that the variable p is free, which is implicit in any model of the operator
10.1. Analysis via Integral Quadratic Constraints 303
form q = p. While a departure from our traditional operator viewpoint
might indeed have interest, in what follows we will assume that our IQCs
are models for some set of operators.
p q
Mq q
M
The above result provides an important rst step for robustness analysis
with IQCs. It tells us that the mapping (I ; M ) is injective over L2, and
that the inverse mapping dened on Im(I ; M ) has a norm bound of
. However this still falls short of establishing that the inverse is dened
over all of L2 (i.e. that the mapping is surjective), which would be required
to state that (M; ) is robustly well-connected. The following example
illustrates this diculty.
Example:
Suppose M is the constant gain M = 2, and is the LTI system with
transfer function
^ s) = s ; 1 :
( s+1
It is easy to see that is isometric, so kpk = kqk whenever q = p.
Therefore satises, for example, the IQC
p = ;kpk2 + 2kqk2 0:
q
corresponding to
= ;0I 20I :
Also it is clear that
Mq = ;4kqk2 + 2kqk2 ;2kqk2;
q
Applying Proposition 10.3 we see that
kpk k(I ; M )pk
for some constant (it is easily veried here that = 1 suces). However
Q := I ; M has transfer function
;s
Q^ (s) = s3 + 1
and does not have a bounded inverse on L2 [0; 1). In fact it is easy to see
that the operator Q, while injective, is not surjective, since the Laplace
transform of any element in its image must belong to the set
fv^(s) 2 H2 : v^(3) = 0g:
306 10. Further Topics: Analysis
In fact the above subspace of H2 exactly characterizes the image of Q.
The above discussion implies that in addition to separation of the graphs
of M and by an IQC, some additional property is required to prove
the invertibility of I ; M . For the specic IQCs considered in previous
chapters, this stronger suciency result was provided by the small-gain
theorem. We seek an extended argument to establish this for a richer class
of IQCs. The solution discussed below is to assume that the uncertainty set
is closed under linear homotopy to = 0. Before making this precise, we
state a property about the image of operators such as those in the previous
proposition.
Lemma 10.5. Suppose Q 2 L(L2 ) satises
kQpk kpk
for all p 2 L2 . Then ImQ is a closed subspace of L2 .
Proof . Take a sequence Qpn which converges to z 2 L2. Then Qpn is a
Cauchy sequence so kQpn ; Qpmk < holds for suciently large n, m.
Now applying the hypothesis we nd
kpn ; pm k
so pn is also a Cauchy sequence, hence pn ! p 2 L2 . By continuity of Q,
Qpn ! Qp and therefore z = Qp is in the image of Q.
We can now state the main result.
Theorem 10.6. Suppose the set is such that if 2 , then 2
for every 2 [0; 1]. If satises the IQC dened by , and (10.4) holds,
then (M; ) is robustly well-connected, i.e. I ; M is invertible over .
Proof . Fix 2 . Given Proposition 10.3, it suces to show that I ;M
is surjective, because in that case it follows that (I ; M );1 exists and
has norm bounded by .
Let us suppose that Im(I ; M0 ) = L2 for some 0 2 [0; 1]. The key
step is to show that the property is maintained when 0 is replaced by
satisfying
j ; 0 j kM 1k kk : (10.7)
Before proving this step, we note that it suces to establish our result.
Indeed, in that case we can begin at = 0, where Im(I ) = L2 trivially,
and successively increment the value of by the above steps, which are
constant, independent of the current . In a nite number of steps we will
cover the interval 2 [0; 1].
We thus focus on the perturbation argument for a given 0 , and
satisfying (10.7). By contradiction, suppose
Im(I ; M )
10.1. Analysis via Integral Quadratic Constraints 307
is a strict subspace of L2. Since it is closed by Lemma 10.5, then by the
projection theorem it has a non-trivial orthogonal complement. Namely we
can nd a function v 2 L2 , kvk = 1, such that
h(I ; M )p; vi = 0 for all p 2 L2:
Now observe that
(I ; M0 ) = (I ; M ) + M ( ; 0 )
therefore
h(I ; M0 )p; vi = hM ( ; 0 )p; vi for all p 2 L2 : (10.8)
Since I ; M0 is surjective we can nd p0 satisfying
(I ; M0 )p0 = v;
and furthermore such p0 has norm bounded by . Substitution into (10.8)
gives
1 = hv; vi = hM ( ; 0 )p0 ; vi kM k j ; 0 j kk < 1;
which is a contradiction. Therefore I ; M must be surjective as required.
We have thus obtained a fairly general robust well-connectedness test
based on IQCs; the extra condition imposed on the uncertainty set is
quite mild, since one usually wishes to consider the nominal system ( = 0)
as part of the family, and thus it is not too restrictive to impose that the
segment of operators , 2 [0; 1] also keep us within . The reader
can verify that this is true with the IQC models presented in the above
examples.
An important comment here is that we have only shown the suciency
of this test for a given IQC. When studying the uncertainty set in a
I I (10.9)
10.1. Analysis via Integral Quadratic Constraints 309
is convex in . Therefore the search for an IQC is a (possibly innite
dimensional) convex feasibility condition over .
In more practical terms, one usually considers a family of known IQCs
for the relevant uncertainty class , described by some free parameters;
this is potentially a subset of all the valid IQCs. Then one searches over the
set of parameters to nd one that satises (10.9). If successful, the search
is over; if not, it is conceivable that a richer \catalog" of IQCs may be able
to solve the problem.
How is such a parametrization of IQCs obtained? Clearly, one can be
generated from any nite set 1 ; : : : ; d of valid IQCs, by means of the
relevant convex cone. Sometimes, however, as in the example of scalar pa-
rameters discussed above, a matrix parameter such as = ; or ; > 0
is more convenient. In some cases, most notably for LTI uncertainty, one
disposes of an innite family of IQCs characterized in the frequency domain
by
^ (j!) 2 S
at every !. In this case we have the same options that were discussed in
Chapter 9: either keep the parametrization innite dimensional, or restrict
it to a nite dimensional space of transfer functions by writing
;1 (j!I ; A );1
^ (j!) = (j!I ; A ) Q (10.10)
B B
where A and B are xed and impose the structure of the relevant set
S , and Q is a free matrix parameter. The latter form can fairly gener-
ally accommodate a nite parametrization of a family of self-adjoint, LTI
operators.
The two options are distinguished when we address the second basic
question, namely the search for an IQC satisfying (10.9). Assuming M is
LTI, the frequency domain alternative is to impose
M^ (j!) (
^ j!) M^ (j!) ; for all !;
I I
which in practice implies a one-dimensional gridding approach.
For nite parametrizations of the form (10.10), the search for the param-
eter Q reduces to a state-space LMI by application of the KYP Lemma, as
discussed in the exercises of Chapter 9.
So we see that given a family of IQCs that satisfy the uncertainty, the
search for an appropriate one to establish robust stability can be handled
with our standard tools. Thus we have an attractive general methodology
for a variety of robustness analysis problems.
310 10. Further Topics: Analysis
10.2 Robust H2 Performance Analysis
In Chapter 8 we introduced a language for the characterization of system
uncertainty, based on uncertainty balls in the L2-induced norm. This pro-
cedure led naturally to conditions for robust stability analysis based on H1
norms of the nominal system, which in fact constitute the main motivation
for the use of H1 norms in control theory. When treating the problem of
system performance, it was shown in Chapter 9 how a disturbance rejection
specication in the H1 norm could be imposed with the same methods as
robust stability analysis.
Other than this mathematical convenience, however, the motivation for
H1 norms in disturbance rejection problems is often not very strong. An
H1 criterion measures the eect of the worst case adversary in a game
where we know nothing about the spectral content of the disturbance.
As argued in Chapter 6, very commonly one has available information on
the disturbance spectrum, that leads more naturally to an H2 measure of
performance.
In short, H1 norms are largely motivated by robustness, and H2 norms
by disturbance rejection. Is there are suitable compromise for robust per-
formance problems? This has been a long-lasting problem, that goes back
to the question of robustness of LQG regulators studied since the late 1970s
(see the references). We will present two approaches for this \Robust H2 "
problem. While the picture is not as tight as for H1 performance, we will
see that similar tools can be brought to bear for this case.
-
p M11 M12 q
z M21 M22 w
case the conditions will involve the use of constant uncertainty scalings
; 2 P ;, analogously to the situation of previous chapters. In particular,
we will consider the modied problem
Z 1
Jf ;a := inf Tr(Y (!)) d!
2 ; subject to ; 2 P ; and
;1
^ ; 0 ^ ; 0
M (j!) 0 I M (j!) ; 0 Y (!) 0 8 !: (10.13)
We now outline the set-based approach to white noise rejection, that
is directly tailored to the robustness analysis problem. By treating both
noise and uncertainty from a worst-case perspective, exact characteriza-
tions can be obtained. At rst sight, this proposition may seem strange
to readers accustomed to a stochastic treatment of noise; notice, however,
that a stochastic process is merely a model for the generation of signals
314 10. Further Topics: Analysis
with the required statistical spectrum, and other models (e.g. determinis-
tic chaos) are possible. Here we take the standpoint that rather than model
the generating mechanism, we can directly characterize a set of signals of
white spectrum, dened by suitable constraints, and subsequently pursue
worst-case analysis over such set.
One way to do this for scalar L2 signals is to constrain the cumulative
spectrum by dening the set
Z )
B
W;B := w 2 L2 : min ; ; ; jw^(j!)j2 d! + ;
; 2
(10.14)
kwk2
0
B
Figure 10.3. Constraints on the accumulated spectrum
This approach is inspired in statistical tests for white noise that are com-
monly used in the time series analysis literature. The constraints, depicted
in Figure 10.3, impose that signals in W;B have approximately unit spec-
trum (controlled by the accuracy > 0) up to bandwidth B , since the
integrated spectrum must exhibit approximately linear growth in this re-
gion. Notice that this integrated spectrum will have a nite limit as ! 1,
for L2 signals, so we only impose an sublinear upper bound for frequencies
above B .
Having dened such approximate sets, the white noise rejection measure
will be based on the worst-case rejection of signals in W;B in the limit as
! 0, B ! 1:
kS(M; )k2;wn := lim!0
sup kS(M; )wk2 :
B !1 w2W;B
For an LTI system under some regularity assumptions, kk2;wn can be shown
to coincide with the standard H2 norm; see the references for details. The
method can be extended to multivariable noise signals, where the compo-
10.2. Robust H Performance Analysis
2 315
nents are restricted to have low cross-correlation. Notice that the preceding
denition can apply to any bounded operator, even nonlinear.
We are now ready to state a characterization of the frequency domain
test.
Proof . The rst observation is that the eect of the impulse at the i-th
input is to \load" an initial condition x0 = Bw ei in the system, which
subsequently responds autonomously. Here ei denotes the i-th coordinate
vector in Rm . For this reason we rst focus on the problem for xed initial
condition and no input,
J (x0 ) := sup kz k22:
2a;c ;x(0)=x0
We now write the bound
J (x0 ) sup kz k22
p2L2 [0;1);kp^k k22 kq^k k22
x(0)=x0
d !
inf sup kz k2 + X
(kp^ k2 ; kq^ k2 ) : (10.15)
k >0 q2L2 [0;1) 2 k k 2 k 2
k=1
In the rst step the k-th uncertainty block is replaced by the Integral
Quadratic Constraint kp^k k22 kq^k k22 ; this constraint would characterize
the class of contractive (possibly non-causal) operators. However by re-
quiring p 2 L2[0; 1), we are imposing some causality in the problem by
not allowing p to anticipate the impulse. This does not, however, impose
full causality in the map from q to p, hence the inequality.
Secondly, we are bounding the cost by using the Lagrange multipliers
k > 0 to take care of the constraints. It is straightforward to show the
stated inequality. This step is closely related to the \S-procedure" method
explained in Chapter 8 when studying the structured well-connectedness
problem. In that case we showed the procedure was not conservative; in fact,
a slight extension of those results can be used to show there is equality in
the second step of (10.15).
To compute the right hand side of (10.15), observe that for xed
k we
have
Z 1
sup x(t) (Cp ;Cp + Cz Cz )x(t) ; q(t) ;q(t) dt: (10.16)
q2L2 [0;1) 0
This is a linear-quadratic optimization problem, whose solution is closely
related to our work in H2 control and the KYP Lemma. We refer to the
exercises of Chapter 7 for the main ideas behind the following Proposition.
Proposition 10.15. If the H1 norm condition
1
;2 0
M ;; 12
< 1
0 I 0
1
318 10. Further Topics: Analysis
holds, then the optimal value of (10.16) is given by
x0 Xx0 ;
where X is the stabilizing solution of the Algebraic Riccati Equation
A X + XA + Cp ;Cp + Cz Cz + XBq ;;1 Bq X = 0:
Furthermore, this solution X is the minimizing solution of the LMI
A X + XA + Cp ;Cp + Cz Cz XBq 0: (10.17)
Bq X ;;
Notice that the above LMI is the same as the one considered in Problem
10.13. To apply this result, rst notice that the norm condition is
1
; 2 M11 ;; 21
< 1:
;; 21 M
21
1
Given the robust stability assumption, we know by Theorem 8.12 in Chap-
ter 8 that the norm of the top block can be made less than one by
appropriate choice of ;. Now since ; can be scaled up, the norm of the bot-
tom block can be made as small as desired, yielding the required condition,
and thus the feasibility of (10.17).
Now we wish to combine the solution X with the minimization over ;.
Here is where the LMI (10.17) is most advantageous, since it is jointly ane
in ; and X . We have
J (x0 ) inf x0 Xx0 ;
X;;>0 satisfying (10.17)
that is a convex optimization problem.
The nal step is to return to the sum over the impulses applied at the
input channels:
m
X
kS(M; )k22;imp = kS(M; )i k22
i=1
m
X
J (Bw ei ) (10.18)
i=1
Xm
inf ei Bw XBw ei (10.19)
i=1 X;;>0 satisfying (10.17)
Xm
inf ei Bw XBw ei (10.20)
X;;>0 satisfying (10.17) i=1
= inf Tr(Bw XBw )
X;;>0 satisfying (10.17)
= Js;a :
10.2. Robust H Performance Analysis
2 319
In the above chain of inequalities, (10.18) comes from nding the worst-case
for each initial condition (they need not be the same across i), (10.19) is
the previous derivation, and (10.20) results from exchanging the sum with
the inmum.
The previous result focuses on the H2 norm as a measure of transient
performance; we immediately wonder if the same bound applies to the other
notions that were used to motivate the H2 norm, in particular in regard to
the rejection of stationary white noise. It can, in fact, be shown that
sup kS(M; )k22;aov Js;a ;
2a;c
where
Z T
kS(M; )k22;aov := lim sup T1 Ejz (t)j2 dt
T !1 0
is the average output variance of the time-varying system when the input
is stochastic white noise. This bound can be derived from Theorem 10.14,
using the fact that kS(M; )k2;aov equals the average of the impulse re-
sponse energy as the impulse is shifted over time; see the references for
this equivalence. Alternatively, a stochastic argument can be given using
Ito calculus to show directly this bound; the references contain the main
ideas of this method, that also applies to nonlinear uncertainty. Notice,
however, that in these interpretations we can only prove a bound, not an
exact characterization as was done with Jf ;a .
We end the discussion of this method by explaining how the bound can
be rened in the case of LTI uncertainty. Returning to the proof we would
consider in this case frequency depending scalings
k (!), and write
J (x0 ) sup kz k22
q2L2 [0;1);jp^k (j!)j2 jq^k (j!)j2
x(0)=x0
!
d Z 1
(inf sup kz k 2+X
k (!)(jp^k (j!)j2 ; jq^k (j!)j2 )d! :
2
k !)>0 q2L2 [0;1) k=1 ;1
(10.21)
However at this level of generality the restriction q 2 L2 [0; 1) (related
to causality) is not easily handled. The only available methods to impose
this are based on state-space computation, for which the
k (!) must be
constrained a priori to the span of a nite set of rational basis functions.
We will not pursue this here (see the references), but remark that in this
way one generates a family of optimization costs JN s;TI of state dimension
increasing with number of elements in the basis, in the limit approaching
the optimization (10.21); we will have more to say about this below.
320 10. Further Topics: Analysis
10.2.3 Comparisons
We have shown two alternative methods to approach the robust H2 perfor-
mance problem; we end the section with a few remarks on the comparison
between them. For simplicity, we focus on the case of scalar disturbances
w.
We rst discuss perturbations in and state the following relation-
a
ships:
sup kS(M; )k22;imp sup kS(M; )k22;imp
2a;c 2a
lim
!0
sup kS(M; )wk22 (10.22)
B !1 w22W;B
a
= lim
!0
sup kS(M; )wk22 = Jf ;a :
B !1 w22W;B
a;c
The rst inequality is clear; (10.22) follows from the fact that the impulse
(or more exactly, an L2 approximation) is always an element of the \white"
set W;B . The equalities with W;B were stated before.
Notice that the previous inequality does not transparently relate Jf ;a and
Js;a, since we only know that Js;a is an upper bound for the rst quantity.
Nevertheless, we have the following:
Proposition 10.16. Js;a Jf ;a.
Proof . The exercises provide a direct proof based on the state-space ver-
sion of Jf ;a . Here we will give a more insightful argument for the case of
scalar w. Notice that in the case of a scalar impulse, x0 = Bw and the right
hand side of (10.15) is directly Js;a. Rewriting (10.15) in the frequency
domain we have
Z 1 " d #
inf sup 1 X
jz (j!)j +
k (jpk (j!)j ; jqk (j!)j ) d!:
2 2 2
; q2L2 [0;1) 2 ;1
k=1
Now, introducing the slack variable Y (!) to bound R 1the above integrand we
can rewrite this problem as the minimization of ;1 Y (!) d!
2 subject to
d
X
jz (j!)j2 +
k (jpk (j!)j2 ; jqk (j!)j2 ) Y (!)
k=1
for all q^(j!) in the H2 , Fourier image of L2 [0; 1). Now, since w(t) = (t)
we have
p^(j!) = M^ (j!) q^(j!) ;
z^(j!) 1
which translates the previous inequality to
q^(j!) M^ (j!) ; 0 M^ (j!) ; ; 0
q^(j!) 0:
1 0 I 0 Y (!) 1
10.2. Robust H Performance Analysis
2 321
Now we have an expression that closely resembles Problem 10.8; if q^(j!)
were allowed to be any frequency function, the inequality would reduce to
(10.12) and the two problems would be equivalent. However the constraint
q^(j!) 2 H2 that embeds some causality in Problem 10.13 will lead in
general to a smaller supremum.
We remark that examples can be given where the inequality in
Proposition 10.16 is strict.
An interesting consequence of the above proof is that for scalar w, there
is equality in (10.22). In fact if we remove the causality constraint in the
previous derivation, the result is exactly Jf ;a . This means that the impulse
can be taken as the worst-case \white" disturbance when we allow for
non-causal uncertainty.
A few more comments are in order regarding the role of causality. As
remarked before, if we are considering the worst-case white noise problem,
the cost does not change by the causality restriction; what happens is
that the signal achieving this cost ceases to be the impulse. When dealing
directly with the impulse response norm, or with average case stochastic
noise, then causality of does indeed aect the cost, and in this case Js;a
provides a tighter bound.
Finally, we discuss the comparison for the case of LTI uncertainty. Notice
that here we have an unambiguous H2 norm we are trying to compute, for
which both approaches provide bounds.
In this regard, once again we nd that removing the restriction q 2
L2 [0; 1) from (10.21) will lead to the result Jf ;TI , but that there is a gap
between the two. This would mean that if the uncertainty set is , T I;c
the state-space approach could in principle give a tighter bound. Notice,
however, that we do not have a Js;TI bound, only a family JN s;TI obtained
by basis expansions of order N for the frequency varying scalings. This
means that while
inf JN Jf ;TI;
N s;TI
we know nothing about the situation with a given, nite N . This is par-
ticularly relevant since the computational cost of state space LMIs grows
dramatically with the state dimension, in contrast with a more tractable
growth rate for computation based on Problem 10.8.
10.2.4 Conclusion
In summary, we have presented two approaches to the Robust H2 perfor-
mance problem, and discussed their interpretation. The two methods are
not equivalent; one oers a tighter characterization of causal uncertainty,
the other the benet of frequency domain interpretation and computation.
A rare feature of H1 theory has been the complete duality between two
ways of thinking, one based on state-space and linear-quadratic optimiza-
322 10. Further Topics: Analysis
tion, the other on operator theory and the frequency domain. After two
decades of research, the robust H2 problem has found these two faces but
not achieved the same unity: LQG does not quite adapt to the world of
robustness.
11
Further Topics: Synthesis
We have arrived at the nal chapter of this course. As with the preceding
chapter our main objective is to acquire some familiarity with two new
topics; again our treatment will be of a survey nature. The two areas we
will consider are linear parameter varying systems, and linear time varying
(LTV) systems. The previous chapter considered advanced analysis, our
aim in this chapter is synthesis.
Up to this point in the course we have worked entirely with systems and
signals of a real time variable, namely continuous time systems, however
in this chapter we will instead consider discrete time systems. One reason
for this is that the concepts of the chapter are more easily developed and
understood in discrete time. This change also gives us the opportunity to
re
ect on how our results in earlier parts of the course translate to discrete
time.
The basic state space form of a discrete time system is given below.
xk+1 = Axk + Bwk x0 = 0 (11.1)
zk = Cxk + Dwk :
This system is described by a dierence equation with an initial state con-
dition 0 , and these replace the dierential equation and initial condition
we are accustomed to in continuous time. Thus every matrix realization
(A; B; C; D) species both a discrete time system, and a continuous time
system. We will only use discrete time systems in this chapter.
Before starting on the new topics we brie
y digress, and dene the space
of sequences on which the systems discussed in this chapter will act. We use
324 11. Further Topics: Synthesis
`n2 (N ) to denote the space of square summable sequences mapping the non
negative integers N to Rn . This is a Hilbert space with the inner product
1
X
hx; yi`2 = xk yk :
k=0
The space `n2 (N ) is the discrete time analog of the continuous time space
Ln2 [0; 1). We will usually write just `2 when the spatial dimension and
argument are clear.
1 In1 0
...
0 d Ind
A B
z C D w
The new set of state space systems that we introduce are shown in Fig-
ure 11.1. The picture shows the upper star product between two systems.
The upper system is spatially diagonal, and the lower system is memoryless.
11.1. Linear parameter varying and multidimensional systems 325
Each of the blocks in the upper system is written i Ini which means
2 3
i 0
i Ini = 64 . . . 75 2 L(`n2 i );
0 i
for some operator i on `12(N ). In words the operator i Ini is just the spa-
tially diagonal operator formed from ni copies of the bounded operator i .
For reference we call the operators i scalar-operators, because they act on
scalar sequences. Here (A; B; C; D) is simply a set of state space matri-
ces; dene n such that A 2 Rnn and therefore n1 + + nd = n holds.
Referring to Figure 11.1, let us set
2 3
1 In1 0
= 64 ... 7
5 (11.2)
0 d Ind
for convenient reference. Therefore we have the formal equations
xk = (Ax)k + (Bw)k (11.3)
zk = Cxk + Dwk
describing the interconnection. We use (Ax)k to denote the k-th element
of the sequence given by Ax; note that since the state space matrix A
has no dynamics (Ax)k = Axk in this notation. Here w 2 `2 and we
dene this system to be well-posed if I ; A is nonsingular. Thus there
is a unique x 2 `2 which satises the above equations when the system is
well-posed. Now the map w 7! z is given by C (I ; A);1 B + D and
is rational in the operators 1 ; : : : ; d when they commute. However these
operators i do not commute in general, and we therefore call these types
of systems noncommuting multidimensional systems, or NMD systems for
short. NMD systems can be used to model numerous linear situations, and
we now consider some examples.
Examples:
First we show that our standard state space system can be described in
this setup. Let Z denote the shift operator, or delay, on `2 . That is, given
x = (x0 ; x1 ; x2 ; : : : ) 2 `2 we have
Zx = (0; x0 ; x1 ; : : : ): (11.4)
We will not distinguish between shifts that act on `2 of dierent spatial
dimension. Therefore, given a sequence x 2 `n2 we have
2 3 2 32 3 2 (1) 3
Zx(1) Z 0 x(1) x
6 . 7 6 . 76 . 7 6 . 7
Zx = 4 . 5 = 4
. . . 5 4 . 5 ; for x partitioned as 4 .. 5 2 `n2 :
.
Zx(n) 0 Z x(n) x(n)
326 11. Further Topics: Synthesis
Namely Z acts independently on every scalar sequence x(i) comprising the
vector sequence x; so it is spatially diagonal. Now by setting = Z we see
that the above system is exactly of the form in (11.3)
Our next example involves varying parameters. Suppose we have a sys-
tem whose state space realization depends on the real scalar parameters
1k ; : : : ; rk which vary with respect to the discrete time parameter k; the
variation with k may not be known a priori. Let (A~(); B~ (); C~ (); D~ () )
be the realization, where signies the dependence on the i . If the de-
pendence of each of the matrices is rational in the parameters i , then it is
frequently possible to convert this system to the form in (11.3) with 1 = Z
and 2 = 1 ; 3 = 2 ; : : : ; r+1 = r , for some state space realization of
constant matrices (A; B; C; D). See the Chapter 8 exercises.
For our third example let us consider a multidimensional system in two
independent variables k1 and k2 . The state equation for such a system
follows.
xk1 +1; k2 = A11 A12 xk1 ; k2 + B1 w :
xk1 ; k2 +1 A21 A22 xk1 ; k2 B2 k1 ; k2
Now suppose that w 2 `2 (N N ). Then let Z1 and Z2 be the shift operators
on the variables k1 and k2 respectively. By setting
= Z01 Z0
2
this system can immediately be converted to the form in (11.3). Clearly this
construction can be extended to any multidimensional system with inputs
and states in `2 (N N ). Furthermore with some additional technical
considerations, systems with inputs in the space `2 (N N Z Z)
can also be treated; see the references for details.
The examples above provide motivation for the use of this model, and
has a clear analogy to the full block uncertainty introduced in Chapters 8
and 9 We will say more about these ties later.
Dene the set of operators
= f 2 L(`2 ) : is diagonal as in (11.2), and satises kk`2!`2 1g:
This is the set of contractive operators that have the diagonal form in
(11.2). We dene X to be the set of positive symmetric matrices in the
commutant of by
X = fX 2 Sn : X > 0 and X = X , for all 2 g:
Therefore every element of X has the block-diagonal form
2 3
X1 0
X = 64 ... 7
5; (11.5)
0 Xd
11.1. Linear parameter varying and multidimensional systems 327
where each Xi 2 Sni and is positive. We are now ready to state a ma-
jor result pertaining to NMD systems. The following theorem mirrors the
continuous time, full block uncertainty case of Theorem 8.12.
Theorem 11.1. The operators I ; A and C (I ; A);1 B + D are non-
singular and contractive respectively, for all 2 , if and only if there
exists X 2 X such that
A B X 0 A B ; X 0 < 0:
C D 0 I C D 0 I
This theorem provides a necessary and sucient condition for such a NMD
system to be both well-posed and contractive with respect to the xed set
. We remark that this condition remains exact when 1 is xed to be
the shift Z . Thus it is exactly the scalar-operator-times-identity version of
Theorem 8.12 in discrete time; see also the exercises in Chapter 8 for a
starting point for proving this theorem.
The pure LMI test given above is also reminiscent of the KYP lemma,
however X is now structured. The result in Theorem 11.1 aords the oppor-
tunity to develop synthesis methods directly for NMD systems. Synthesis
is the topic of the next subsection, after which we discuss realization theory
for models of this type.
z w
G(1 ; : : : ; d )
y
K (1 ; : : : ; d )
CP D 0 I CP D 0 I (11.9)
Proof . We need to show that both well-posedness and contractiveness are
tantamount to the LMI condition of the proposition. That is I ; L AL
must be nonsingular and CL (I ; LAL );1 L BL + D contractive. Ob-
serve that P ;1 = P since P is a permutation matrix. Thus the two
conditions are equivalent to I ; (P L P )(P ALP ) is nonsingular and
CL P (I ; (P L P )(P AL P ));1 (P L P )P BL + D is contractive. Now
P L P is of the form in (11.8), and so by invoking Theorem 11.1 we see
these two latter conditions hold if and only if (11.9) has a solution.
11.1. Linear parameter varying and multidimensional systems 331
Checking the admissibility of any given controller is equivalent to an LMI
feasibility problem. Our ultimate goal is to obtain synthesis conditions, so
we will need to examine the block structure of XL in (11.9). Any XL in XL
is of the block diagonal form diag(XL1 ; : : : ; XLd) with blocks given by
XLi = XXi X X
2i ;
2i 3i
and the matrices Xi 2 Sni, X2i 2 Rni n i and X3i 2 Sn i. Now consider the
eect of the permutation matrix P in (11.8) on XL . Let XP = PXLP and
we have
XP = XX X 2 ; (11.10)
2 X3
where the constituent matrices are
X = diag(X1 ; : : : ; Xd ); X2 = diag(X21 ; : : : ; X2d); and
X3 = diag(X31 ; : : : ; X3d ). We can pre- and post-multiply the expression
in (11.9) by the matrices diag(P ; I ) and diag(P; I ) respectively, to arrive
at
AL BL XP 0 AL BL ; XP 0 < 0;
C L DL 0 I CL DL 0 I
Now from (11.10) the matrix XP has the same partitioned structure as
the XL-matrix that appears in the H1 synthesis of x7.2; however each
of the matrices X1 , X2 and X3 is block-diagonal. If we apply the Schur
complement formula twice to the above LMI we get the following equivalent
inequality.
2 3
;XP;1 AL BL 0
6 A
L ;XP 0 CL 7
0 ;I DL 5 < 0: (11.11)
6 7
4 B
L
0 CL DL ;I
That is a controller is admissible exactly when a solution can be found to
this inequality. This is the critical form for solving the synthesis problem.
The left hand side of this inequality is ane in the state space matrices
for the controller. We can use Lemma 7.2 to show that the solution to this
controller dependent LMI, implies the existence of solutions to two matrix
inequalities which do not explicitly or implicitly involve the controller. An-
other application of the Schur complement converts these inequalities into
LMIs, and yields the following lemma.
Lemma 11.3. An admissible controller exists, with partition dimensions
n 1 ; : : : ; n d , if and only if there exist symmetric matrices X and Y such
that
332 11. Further Topics: Synthesis
(i)
2 3
NY 0 4AYCAY A; Y C AY
C1 B1 N 0
Y C ; I D Y
0 I 1 1 1 11 5 0 I < 0
B1
D11 ;I
(ii)
2 3
NX 0 4A BXA
; X A XB1 C1 NX 0
1 XA B1 XB1 ; I D11
5 <0
0 I C1 D11 ;I 0 I
(iii) The identities
PXL P = X? ?? and PXL;1P = Y? ??
hold, for some XL 2 XL .
where the operators NY , NX satisfy
Im NY = ker B2 D12 NY NY = I
Im NX = ker C2 D21 NX NX = I:
We have concluded that an admissible controller exists exactly when there
exist solutions X and Y to conditions (i){(iii) in Lemma 11.3. Now (i)
and (ii) are LMIs and are completely specied in terms of the given plant
G. However (iii) is not an LMI condition, but only depends on controller
dimension. Note that X and Y must necessarily be members of the set X
dened in (11.5) if they satisfy (iii).
So the next step is to convert (iii) to an LMI-type condition. We have
the following result.
Lemma 11.4. The block-diagonal matrices X , Y 2 X satisfy condition
(iii) in Lemma 11.3 if and only if, for each 1 k d, the following
inequalities hold
Xi I 0 and rank Xi I n + n : (11.12)
I Yi I Yi i i
Proof . By Lemma 7.8 we see that there exist positive matrices XLk 2
Sni+ni exist such that
XLk = X?i ?? ; and XLk
;1 = Yi ?
? ?
for 1 k d, exactly when the inequalities in (11.12) are satised. Clearly
the block-diagonal matrix XL 2 XL , and PXLP satises the condition (iii)
in Lemma 11.3.
The rst of these conditions is an LMI, however the rank inequality is not.
Note that the rank conditions are trivially met if n i ni , and therefore
11.1. Linear parameter varying and multidimensional systems 333
can be eliminated if the controller order is chosen to be the same as the
plant. This situation was observed in our earlier H1 synthesis, where we
had only one variable.
We can now state the NMD system synthesis result.
Theorem 11.5. An admissible NMD controller exists if and only if, there
exist matrices X , Y 2 X , which satisfy LMI conditions (i) and (ii) in
Lemma 11.3 , and the LMI conditions given in (11.12), for each k =
1; : : : ; d,.
This gives exact convex conditions for the existence of a solution to the
NMD synthesis problem. Notice that if the rank conditions in (11.12) are
also achieved then a controller synthesis exists with dimensions n 1 ; : : : ; n d .
Furthermore, an admissible controller exists if and only if one exists satis-
fying n 1 = n1 ; : : : ; n d = nd. Also if d = 1, namely there is only one i , this
result corresponds exactly to the discrete time H1 synthesis problem; this
is more clearly apparent by consulting Proposition 8.28. When solutions X
and Y are found an explicit controller can be computed by constructing a
scaling XL , and then nding a controller realization which solves the LMI
given above in (11.11); this is similar to the procedure in x7.3.
C D 0 I C D 0 I (11.20)
This is an operator version the matrix KYP lemma. It does not depend on
the structure of A, B , C or D, or the presence of the operator Z .
For comparison, the corresponding standard result for linear time in-
variant discrete time systems is now stated from Proposition 8.28. Given
a system G with transfer function G^ (z ) := C0 (I ; zA0 );1 zB0 + D0 in a
minimal realization, the H1 norm of G is less than 1, if and only if, there
exists a matrix X0 > 0 such that
A0 B0 X0 0 A0 B0 ; X0 0 < 0:
C0 D0 0 I C0 D0 0 I
11.2. A Framework for Time Varying Systems: Synthesis and Analysis 345
Thus the above LTV result looks very similar to the time invariant result
just stated.
In Lemma 11.17 the variable X does not necessarily have a block-diagonal
structure, it is simply self-adjoint and positive denite. Our next goal is
therefore to improve upon this and obtain a formulation in which the vari-
able is block-diagonal. To this end dene the set X to consist of positive
denite self-adjoint operators X of the form
2 3
X0 0
6 X1 7
X = 664 X2
7
7
5
> 0; (11.21)
...
0
where the block structure is the same as that of the operator A. With this
denition we can state the main result of this section.
Theorem 11.18. The following conditions are equivalent
(i) kC (I ; ZA);1 ZB + Dk < 1 and 1 62 spec(ZA);
(ii) There exists X 2 X such that
ZA ZB X 0 ZA ZB ; X 0 < 0:
C D 0 I C D 0 I (11.22)
Formally, the result is the same as that for the linear time invariant case,
but the operators ZA and ZB replace the usual A-matrix and B -matrix,
and X is block-diagonal. We shall see in the sequel that this is a general
property of this formalism, and that this gives a simple way to construct
and to understand the relationship between time invariant and time varying
systems.
C D 0 I C D 0 I
holds. Now we can apply Proposition 11.11 to show that the above is
tantamount to
A B Z XZ 0 A B ; X 0 < 0:
C D 0 I C D 0 I (11.23)
We will now show that this inequality is satised.
Observe that, for each k 0, the following holds 2
Ek C = [0 0 Ck 0 ]:
Now using the fact Ek Ek = I it is routine to verify the important property
A B Ek 0 = Ek 0 A B
C D 0 Ek 0 Ek C D k holds, (11.24)
for each k 0.
Since X by assumption satises (11.20) there exists a > 0 such that
A B Z XZ
0 A B X 0
C D 0 I C D ; 0 I < ;I:
Pre and post multiply this by diag(Ek ; Ek ) and diag(Ek ; Ek ) respectively,
and use (11.24) to get that the matrix inequality
A B Ek 0 Z XZ
0 Ek 0 A B
C D k 0 Ek 0 I 0 Ek C D k
E 0 0 Ek 0
X
; 0 E 0 I 0 Ek < ;I
k
k
2
here we do not distinguish between versions of Ek that dier only in the spatial
dimension of the identity block
11.2. A Framework for Time Varying Systems: Synthesis and Analysis 347
holds, for every k 0. Finally use the denition of X to see that this last
inequality is exactly
A B Z XZ 0 A B ; X 0 < ;I;
C D k 0 I k C D k 0 I k (11.25)
for each k 0. This immediately implies that inequality (11.23) is satised.
The following corollary relates the innite dimensional linear matrix
inequality to the pointwise properties of the system matrices.
Corollary 11.19. The following conditions are equivalent
(i) kC (I ; ZA);1 ZB + Dk < 1 and 1 62 spec(ZA);
(ii) There exists a sequence of matrices Xk > 0, bounded above and below,
such that the inequality
Ak Bk Xk+1 0
Ak Bk ; Xk 0 < 0;
C k Dk 0 I Ck Dk 0 I
holds uniformly.
Proof . The result follows immediately from equation (11.25) in the proof
of Theorem 11.18 using the fact that (Z XZ )k = Xk+1 .
In this section we have developed an analysis condition for evaluating
the induced norm of an LTV system. In this framework the condition looks
formally equivalent to the KYP lemma for LTI systems.
z w
G
C~ D~ 0 I C~ D~ 0 I (11.29)
Thus this result gives an LMI condition to determine the `2 induced norm
of a periodic system of the form in (11.14). Notice that the statement of
this theorem simply involves replacing all the objects in Theorem 11.18
with their \wave" equivalent; for instance A now appears as A~. We have
the following synthesis theorem which mirrors this pattern exactly.
Theorem 11.22. Suppose G has a q-periodic realization. Then an admis-
sible controller of order n n exists if and only if there exist solutions to
the inequalities of Theorem 11.20, where A, B , C , D, X , Y and Z , are
replaced by the block-matrices dened by A~, B~ , C~ , D~ , X~ , Y~ and Z~
Given this correspondence one wonders whether a nite dimensional
system function can be dened. The answer is yes. Dene
G~ (~ ) := C~ (I ; ~ Z~A~);1 ~ Z~B~ + D:
~
11.2. A Framework for Time Varying Systems: Synthesis and Analysis 351
Here the matrix ~ is dened by
2 3
0 0
~ = 64 ... 7
5 ;
q;1
with the k being complex scalars. Analogs of our recent results on the
system function exist for this periodic systems function.
This brings us to the end of our quick look at LTV systems. The frame-
work presented here reduces general time varying problems to solutions in
terms of structured operator inequalities; these inequalities follow from the
standard LTI problems we have studied in earlier chapters. We have explic-
itly illustrated this by applying the new tools to deriving a KYP lemma
for LTV systems, and then provided the corresponding synthesis results
without proof. These general results become bona de LMI problems when
periodic or nite time horizon systems are being considered. In summary
the main feature of the framework is that it makes direct ties to the stan-
dard LMI techniques for time invariant systems, and thus many results and
derivations become formally equivalent.
AppendixA
Some Basic Measure Theory
1
We will restrict ourselves to Lebesgue measure.
A.1. Sets of zero measure 353
sets are the same size. Clearly this question distills down to:
When is the dierence S1 nS2 small?
Said yet another way, when do we say that the size of the dierence between
the sets is insignicant? Thus in order to answer this question we need to
precisely dene what we mean by insignicant in size. We will introduce
the idea of a set having zero size, or in mathematical language having zero
measure.
The rst thing we require is a denition for the size of an open interval
(a; b) in R. We dene the size of this set to be b ; a, which is exactly our
usual notion of length. Generalizing this, suppose we have a collection of n
disjoint intervals
(a1 ; b1 ); (a2 ; b2 ); : : : ; (an ; bn );
and dene the associated set
G = (a1 ; b1 ) [ (a2 ; b2 ) [ : : : [ (an ; bn )
to be their union. Since the intervals are disjoint, our usual intuition about
size would dictate that the size of G should be additively based on these
interval lengths. We therefore accordingly dene the size of this set to be
n
X
(bk ; ak ) = size of G :
k=1
With these denitions in place, consider a subset S which is contained
in such a union:
S [nk=1 (ak ; bk ):
Then if S has some size associated with it we would want this size to satisfy
n
X
size of S (bk ; ak ):
k=1
Notice that this bound will remain true regardless of whether or not these
intervals are disjoint, we therefore proceed assuming that they are not
necessarily disjoint. Let us generalize this idea to a countable number of
intervals. Suppose that
S [1k=1 (ak ; bk ):
Then if we have a size associated with S , we would naturally conclude that
1
X
size of S (bk ; ak ) must hold:
k=1
If the series on the right converges we have an upper bound on the possible
size, or measure, of the set S . Of course if the series diverges then the above
inequality gives us no information about the set S .
354 AppendixA. Some Basic Measure Theory
Having done a little exploration above about measuring sets, we are now
ready to dene a set of zero measure.
Denition A.1. A subset S R has zero measure if, for every > 0,
there exists a countable family of intervals (ak ; bk ) such that the following
conditions hold:
(a) The set S is a subset of the union [1 k=1 (ak ; bk );
P1
(b) The sum k=1 (bk ; ak ) < .
Given our discussion above this denition can be interpreted as follows:
a set S has zero measure if the upper bound on its size can be made as
small as desired. To see how this denition applies we consider two simple
examples.
Examples:
First we consider the simplest nonempty sets in R, those containing one
element; let S = ftg be such a set. For > 0 this set is contained in the
interval (t ; 2 ; t + 2 ), whose length is . Thus directly from the denition
this set has zero measure. More intuitively this construction says that
size S :
Therefore S has zero size, since the above inequality holds for any > 0.
Using the same argument it is not dicult to show that any set comprised
of a nite number of points ft1 ; : : : ; tn g has zero measure.
Let us now turn to the set of natural numbers N = f1; 2; 3; : : : g. This
set contains a countably innite number of points, yet we will now see that
it too has zero measure. Set to be any number satisfying 0 < < 1, and
dene the intervals
k
(1 ;
(ak ; bk ) = k ; 2 ; k + 2) k (1 ; ) ;
for each k 2 N . Since k 2 (ak ; bk ), for each k > 0, we see that
N [1 k=1 (ak ; bk ):
The length of each interval (ak ; bk ) is (1 ; )k , and so
1
X 1
X
(bk ; ak ) = (1 ; )k = ;
k=1 k=1
where we have used the geometric series formula. From our denition above
we conclude that N has zero measure; its size is clearly smaller than any
. We leave as an exercise the extension of this example to show that the
integers Z are also a subset of R that has zero measure. Similarly it is pos-
sible to show that any countable subset of the real line has measure zero.
In particular the set of rational numbers Q is of zero measure; this fact
A.2. Terminology 355
is perhaps surprising at rst glance since the rationals are so densely dis-
tributed on the real line. The examples we have given here of zero measure
sets all have a nite or countable number of elements; not all sets of zero
measure are countable, but constructing them is more involved.
A.2 Terminology
Having introduced the denition of a set of zero measure, we can explain
the meaning of the term \for almost every". Suppose that P (t) is a logical
condition which depends on the real variable t. Then recall that a statement
\For every t 2 R the condition P (t) holds"
means that, for any chosen value t0 2 R, the condition P (t0 ) is true. Then
we dene the following terminology.
Denition A.2. Given a logical condition P (t), which depends on the real
variable t, the expression \For almost every t 2 R the condition P (t) holds",
means that the set
S = ft0 2 R : P (t0 ) is falseg has zero measure.
This denition states that \for almost every" means that the condition P (t)
can fail for some values of t, provided that it only fails on a very small set
of points. Put more precisely, the set S of points where the condition P (t)
is false has zero measure. Notice that this means that \for every" implies
for \for almost every" but the converse is not true; namely the former is
the stronger condition. To further see the implications of this terminology
we consider some examples.
Examples:
Consider the function f (t) = sin2 t. This function does not satisfy f (t) >
0, for all t 2 R, since the positivity condition fails when t is an integer.
Since we know Z is a set of measure zero it follows that
f (t) > 0, for almost all t 2 R
For the purpose of another example consider the function
d(t) = 10;; for t 2 Q;
for t 62 Q :
Then we see that d(t) = 0, for almost all t. Further given any function g(t),
it follows from the properties of d that that (g d)(t) = 0, for almost all t.
So far we have assumed that P (t) is dened on R, however it not un-
common for logical conditions to depend on subsets of the real line, and
356 AppendixA. Some Basic Measure Theory
we therefore extend our above denition. If D is a subset of R then \For
almost every t 2 D the condition P (t) holds" is dened to mean that the
set S = ft0 2 D : P (t0 ) is falseg is of zero measure.
We can now turn to the denition of the essential supremum of a function.
Denition A.3. Suppose that D is a subset of R, and the function f :
R ! R. The essential supremum of the function over D is dened by
ess sup f (t) = inf f 2 R : the function f (t) < , for almost every t 2 D.g
t2D
In other words a function is never greater than its essential supremum,
except on a set of measure zero, and the essential supremum is the smallest
number that has this property. Thus we immediately see that the essential
supremum of a function can never be greater than the supremum. The
basic property which makes the essential supremum useful is that it ignores
values of the function that are only approached on a set of zero measure.
Again we look at some concrete examples to make this denition clear.
Examples:
Dene the function h : [0; 1) ! R by
h(t) = e;t ; for t > 0;
2; for t = 0:
Then according the denitions of supremum and essential supremum we
have
sup h(t) = 2 and ess sup h(t) = 1:
t2[0; 1) t2[0; 1)
The distinguishing property here is that the supremum of 2 is only ap-
proached (in fact achieved) at one point, namely t = 0, whereas the function
is otherwise less than one, so the essential supremum can be no greater than
one. However for any value of < 1, the set of points for which h(t) is
never of zero measure for it always contains an interval. Thus the essential
supremum is indeed one.
Recall the function d(t) just dened above. It satises
sup d(t) = 1 and ess sup d(t) = 0:
t2R t2R
To see this simply realize that d(t) is only near the value one on a set of
zero measure, namely the rational numbers; otherwise it is always equal to
zero. In fact, given any function g(t) we have that ess sup(g d)(t) = 0.
Finally we leave as an exercise the verication of the fact that if f (t) is a
continuous function, then its supremum is equal to its essential supremum.
A.3. Comments on norms and Lp spaces 357
A.3 Comments on norms and Lp spaces
To end this appendix we discuss how sets of measure zero play a role in
dening the elements in an Lp (;1; 1) space. We begin by focusing on
L1 and an example.
Example:
Let f (t) be the function that is zero at every time, and then clearly kf k1 =
0. Also dene the function g by
g(t) = 10;; tt 262 ZZ:
From the above discussion of the essential supremum we know that kgk1 =
0. Thus we have that f and g are functions in L1 which both have norm
zero. In fact it is clear that we can dene many dierent functions with
zero innity norm.
This example seems to indicate that k k1 is not a norm, since it violates
the requirement that only one element can have zero norm. What is needed
to reconcile this dichotomy is a reinterpretation of what we mean by an
element of L1 :
Functions that dier only on a set of measure zero are
considered to represent the same element.
Thus in our example above f and g both represent the zero element in L1 .
Furthermore if h and w are L1 functions, and satisfy kh ; wk1 = 0, then
they represent the same element. Thus strictly speaking the elements of
L1 are not functions but instead sets of functions, where each set contains
functions that are equivalent.
We now generalize to Lp spaces, for 1 p < 1. Recall that the norm is
dened by
Z 1 p1
khkp = p
jh(t)jp dt :
;1
It is a fact that if h is zero everywhere except on a set of zero measure,
then
khkp = 0:
That is function values on a measure zero set do not contribute to the
integral.2 Thus we see that the two example functions f and g given above
have zero norm in every Lp space, and as a result we cannot rigorously
2
This is based on Lebesgue integration theory.
358 AppendixA. Some Basic Measure Theory
regard them as distinct. So just as in L1 we regard any functions that
dier only on a set of measure zero as representing the same element in Lp .
In doing this all the mappings k kp indeed dene norms.
To conclude we emphasize that for our purposes in this course the dis-
tinction between functions and elements of an Lp space is not crucial, and
elements of Lp spaces can be viewed as functions without compromising
understanding.
This is page 359
Printer: Opaque this
AppendixB
Proofs of Strict Separation
This appendix presents two technical proofs which were omitted in Chap-
ters 8 and 9, part of the argument to establish necessity of scaled small-gain
conditions for robustness. Specically we will prove two propositions which
concerned the strict separation of the sets r and in Rd . These were
dened as
= f(r1 ; : : : ; rd ) 2 Rd : rk 0; for each k = 1; : : : ; dg;
r = f(1 (q); : : : ; d (q)) 2 Rd : q 2 L2 satisfying kqk2 = 1g:
We recall that k (q) = kEk Mqk2 ; kEk qk2 , M is the nominal LTI
system under consideration, and the projection matrices Ek break up
signals in components conformably with the uncertainty structure =
diag(1 ; : : : ; d ):
In what follows, L2 [a; b] denotes the subspace of functions in L2 [0; 1)
with support in the interval [a; b], and P[a;b] : L2[0; 1) ! L2 [a; b] is the
natural projection. We now state the rst pending result.
Proposition B.1 (Proposition 8.9, Chapter 8). Suppose that (M; ) a
is robustly well-connected. Then the sets and r are strictly separated, i.e.
D(; r) := r2inf;y2r jr ; yj > 0:
We will give a proof by contrapositive, based on the following key lemma.
Lemma B.2. Suppose D(r; ) = 0. Given any > 0 and any t0 0 the
following conditions can be satised:
360 AppendixB. Proofs of Strict Separation
1. There exists a closed interval [t0 ; t1 ], and two functions p; q 2
L2 [t0 ; t1 ], with kqk = 1, such that
kEk pk kEk qk; for each k = 1; : : : ; d: (B.1)
> k(I ; P[t0 ;t1 ] )Mqk
2 (B.2)
p
d = kp ; P[t0 ;t1 ] Mqk (B.3)
2. With the above choice of [t0 ; t1 ] and q, there exists an operator =
diag(1 ; : : : ; d ) in L(L2 [t0 ; t1 ]) \ , such that kk 1 and
p
a
;
k I ; P[t0 ;t1 ] M qk d: (B.4)
Proof . Fix > 0 and t0 0. By hypothesis, there exists q 2 L2, kqk = 1,
satisfying k (q) > ;2 for each k = 1; : : : ; d. This amounts to
2 + kEk Mqk2 > kEk qk2 ; for each k = 1; : : : ; d:
Now clearly if the support of q is truncated to a suciently long interval,
and q is rescaled to have unit norm, the above inequality will still be satis-
ed by continuity of the norm. Also since Mq 2 L2 , by possibly enlarging
this truncation interval we can obtain [t0 ; t1 ] satisfying (B.2), and also
2 + kEk P[t0 ;t1 ] Mqk2 > kEk qk2 ; for each k = 1; : : : ; d:
Next choose 2 L2[t0 ; t1 ] such that Ek has norm and is orthogonal to
Ek P[t0 ;t1 ] Mq, for each k = 1; : : : ; d. Then dene
p = P[t0 ;t1 ] Mq + :
p
Now kk = d so (B.3) follows, and also
kEk pk2 = 2 + kEk P[t0 ;t1 ]Mqk2 > kEk qk2 ; for every k = 1; : : : ; d;
which proves (B.1) and completes Part 1.
For Part 2, we start from (B.1) and invoke Lemma 8.4, Chapter 8 (notice
that it holds in any L2 space), to construct a contractive, block diagonal
satisfying p = q. Then
; ;
I ; P[t0 ;t1 ]M q = p ; P[t0 ;t1 ] Mq
so (B.4) follows from (B.3).
Proof . (Proposition B.1) The argument is by contrapositive: we assume
that D(r; ) = 0, the objective is to construct a perturbation 2 a
such that I ; M is singular.
Fix any positive sequence n ! 0 as n tends to 1. For each n, we
construct q(n) and (n) as in Lemma B.2. Since their supports can be
shifted arbitrarily, we choose them to be of the form [tn ; tn+1 ], with t0 =
0, so that these intervals form a complete partition of [0; 1). Now we
can combine the (n) 2 L(L2 [tn ; tn+1 ]) \ to construct a single 2
a
AppendixB. Proofs of Strict Separation 361
L(L2 [0; 1)), dened by
1
X
= (n) P[tn ;tn+1] : (B.5)
n=1
Descriptively, this operator breaks up a signal u into its components in the
time partition [tn ; tn+1 ], applies (n) to each \piece" P[tn ;tn+1] u, and puts
the resulting pieces back together. It is easy to see that kk 1, since all
the (n) are contractive. Furthermore inherits the block-diagonal spatial
structure so 2 . a
Now apply to the signal Mq(n) for a xed n. We can write
Mq(n) = P[tn ;tn+1] + (I ; P[tn ;tn+1 ] ) Mq(n)
= (n) P[tn ;tn+1] Mq(n) + (I ; P[tn ;tn+1 ])Mq(n) ;
Applying the triangle inequality this leads to
k (I ; M ) q(n) k k I ; (n) P[tn ;tn+1] M q(n) k + k(I ; P[tn ;tn+1 ] )Mq(n) k
p
n d + 2n ;
where we have used (B.4) and (B.2) respectively. Now we let n ! 1
to see that the right hand side tends to zero, and thus so does the left
hand side. Therefore I ; M cannot have a bounded inverse since for
each n we know by denition that kq(n)k = 1. This contradicts robust
well-connectedness.
We turn now to our second result which states that if we restrict ourselves
to the causal operators in , our rst result still holds.
a
q
q~
p
p~
0 h nh nh + h
AppendixC
-Simple Structures
x; y 2 E . Taking
x = Q 21 1
0
we have qx qy Q; 21 x = qx , and also
x x = 10 Q 10 = qx qx = 1:
An analogous construction holds for y.
E is an ellipsoid. To see this, we rst parametrize the generating 's
by
= r re1j' ;
2
where r1 0, r2 0, r12 + r22 = 1, and ' 2 [0; 2). Notice that we
have made the rst component real and positive; this restriction does
not change the set E since a complex factor of unit magnitude applied
to does not aect the value of the quadratic forms H~ k .
We can also parametrize the valid r1 , r2 and write
= sin(cos( 2 ) ; 2 [0; ]; ' 2 [0; 2):
)ej'
2
372 AppendixC. -Simple Structures
Now setting
H~ k = ak bk
bk ck
we have
H~ k = ak cos2 2 + ck sin2 2 + 2 sin 2 cos 2 Re(bk ei' ):
Employing some trigonometric identities and some further manipu-
lations, the latter is rewritten as
2 3
a k + c k
a k ; c k
cos()
H~ k = 2 + 2 Re(bk ) ; Im(bk ) 4sin() cos(')5 :
sin() sin(')
Collecting the components for k = 1; 2; 3 we arrive at the formula
2 3
cos()
v = v0 + T 4sin() cos(')5 ;
sin() sin(')
3 3
where v0 2 R and T 2 R are xed, and and ' vary respec-
3
tively over [0; ] and [0; 2). Now we recognize that the above vector
varies precisely over the unit sphere in R3 (the parametrization corre-
sponds to the standard spherical coordinates). Thus E is an ellipsoid
as claimed.
The above Lemma does not imply that the set r0;3 is convex; indeed
such an ellipsoid can have \holes" in it. However it is geometrically clear
that if the segment between two points intersects the positive orthant 0;3 ,
the same happens with any ellipsoid going through these two points; this
is the direction we will follow to establish that co(r0;3 ) \ 0;3 nonempty
implies r0;3 \ 0;3 nonempty.
However the diculty is that not all points in co(r0;3 ) lie in a segment
between two points in r0;3 : convex combinations of more than two points
are in general required. The question of how many points are actually
required is answered by a classical result from convex analysis; see the
references for a proof.
Lemma C.8 (Caratheodory). Let K V , where V is a d dimensional
real vector space. Every point in co(K) is a convex combination of at most
d + 1 points in K.
We will require the following minor renement of the above statement.
Corollary C.9. If K V is compact, then every point in the boundary of
co(K) is a convex combination of at most d points in K.
C.2. The case of ; 0 3 373
Proof . The Caratheodory result implies that for every V 2 co(K), there
exists a nite convex hull of the form
( )
Xd+1 d+1
X
cofv1 ; : : : ; vd+1 g = k vk : k 0; k = 1
k=1 k=1
with vertices vk 2 K, which contains v.
If the vk are in a lower dimensional hyperplane, then d points will suf-
ce to generate v by invoking the same result. Otherwise, every point in
cofv1 ; : : : ; vd+1 g which is generated by k > 0 for every k will be interior
to cofv1 ; : : : ; vd+1 g co(K). Therefore for points v in the boundary of
co(K), one of the k 's must be 0 and a convex combination of d points will
suce.
Equipped with these tools, we are now ready to tackle the main result.
Theorem C.10. If co(r0;3 ) \ 0;3 is nonempty, then r0;3 \ 0;3 is
nonempty.
Proof . By hypothesis there exists a point v 2 co(r0;3) \ 0;3; since
co(r0;3 ) is compact, such a point can be chosen from its boundary. Since
we are in the space R3 , Corollary C.9 implies that there exist three points
x, y, z in r0;3 such that
v = x + y +
z 2 0;3 ;
with , ,
non-negative and + +
= 1. Geometrically, the triangle
S (x; y; z ) intersects the positive orthant at some point v.
Claim: v lies in a segment between two points in r0;3.
This is obvious if x,y, z are aligned or if any of , ,
is 0. We thus focus
on the remaining case, where the triangle S (x; y; z ) is non-degenerate and
v is interior to it, as illustrated in Figure C.1.
x
y v
w z
u
References
[1] V.M. Adamjan, D.Z. Arov, and M.G. Krein. Innite block hankel ma-
trices and related extension problems. American Mathematical Society
Translations, 111:133{156, 1978.
[2] U.M. Al-Saggaf and G.F. Franklin. An error bound for a discrete re-
duced order model of a linear multivariable system. IEEE Transactions
on Automatic Control, 32:815{819, 1987.
[3] B. D. O. Anderson and J. B. Moore. Optimal Control: Linear Quadratic
Methods. Prentice Hall, 1990.
[4] J.A. Ball, I. Gohberg, and M.A. Kaashoek. Nevanlinna-pick interpolation
for time-varying input-output maps: the discrete case. Operator Theory:
Advances and Applications, Birkauser, 56:1{51, 1992.
[5] B. Bamieh and M. Daleh. On robust stability with structured time-invariant
perturbation. Systems and Control Letters, 21:103{108, 1993.
[6] T. Basar and P. Bernhard. H1 -Optimal Control and Related Mini-Max
Design Problems: A Dynamic Game Approach. Birkhauser, 1991.
[7] C.L. Beck, J.C. Doyle, and K. Glover. Model reduction of multi-dimensional
and uncertain systems. IEEE Transactions on Automatic Control, 41:1466{
1477, 1996.
[8] V. Belevitch. Classical Network Theory. Holden-Day, 1968.
[9] H. Bercovici, C. Foias, and A. Tannenbaum. Structured interpolation
theory. Operator Theory Advances and Applications, 47:195{220, 1990.
[10] D.S. Bernstein and W. H. Haddad. LQG control with an H1 performance
bound: A Ricatti equation approach. IEEE Transactions on Automatic
Control, 34:293{305, 1989.
376 References
[11] S. Bochner and K. Chandrasekharan. Fourier Transforms. Princeton
University Press, 1949.
[12] B. Bollobas. Linear Analysis. Cambridge University Press, 1990.
[13] S.P. Boyd, El Ghoui, E. Feron, and V. Balakrishnan. Linear Matrix In-
equalities in System and Control Theory. Society for Industrial and Applied
Mathematics, 1994.
[14] R. Braatz, P. Young, J. C. Doyle, and M. Morari. Computational
complexity of calculation. IEEE Transactions on Automatic Control,
39:1000{1002, 1994.
[15] C.A.Desoer and M. Vidyasagar. Feedback Systems: Input-Output Proper-
ties. Academic Press, 1975.
[16] B. R. Copeland and M. G. Safonov. A generalized eigenproblem solution
for singular H and H1 problems. In Control and Dynamic Systems; editor
2
[37] C. Foias and A.E. Frazho. The Commutant Lifting Approach to Interpola-
tion Problems. Birkhauser, 1990.
[38] C. Foias, H. Ozbay, and A. Tannenbaum. Robust Control of Innite
Dimensional Systems. Springer-Verlag, 1996.
[39] A. Fradkov and V. A. Yakubovich. The S-procedure and duality theo-
rems for nonconvex problems of quadratic programming. Vestnik Leningrad
University, 31:81{87, 1973. In Russian.
[40] B.A. Francis. Notes on introductory state space systems. Personal commu-
nication.
[41] B.A. Francis. A Course in H1 Control Theory. Springer-Verlag, 1987.
[42] P. Gahinet and P. Apkarian. A Linear Matrix Inequality approach to H1
control. International Journal of Robust and Nonlinear Control, 4:421{448,
1994.
[43] J.B. Garnett. Bounded Analytic Functions. Academic Press, 1981.
[44] T. Georgiou and M. Smith. Optimal robustness in the gap metric. IEEE
Transactions on Automatic Control, 35:673{686, 1990.
[45] E. Gilbert. Controllability and observability in multivariable control
systems. SIAM Journal of Control, 1:128{151, 1963.
[46] K. Glover. All optimal hankel-norm approximations of linear multivari-
able systems and their l1 error bounds. IEEE Transactions on Automatic
Control, 39:1115{1193, 1984.
[47] K. Glover. A tutorial on model reduction. In From Data to Model; editor
J.C. Willems. Springer-Verlag, 1989.
[48] K. Glover and D. McFarlane. Robust stabilization of normalized coprime
factor plant descriptions with H1 -bounded uncertainty. IEEE Transactions
on Automatic Control, 34:821{830, 1989.
378 References
[49] G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins
University Press, 1996.
[50] M. Green and David J.N. Limebeer. Linear Robust Control. Prentice Hall,
1995.
[51] W.H. Greub. Linear Algebra. Springer, 1981.
[52] J. Guckenheimer and P. Holmes. Nonlinear oscillations, dynamical systems
and bifurcations of vector elds. Springer, 1986.
[53] P.R. Halmos. Mesure Theory. Springer, 1974.
[54] P.R. Halmos. A Hilbert Space Problem Book. Springer-Verlag, 1982.
[55] M.L.J. Hautus. Controllability and observability conditions of linear
autonomous systems. In Proc. Kon. Ned. Akad. Wetensch. Ser. A., 1969.
[56] D. Hinrichsen and A.J. Pritchard. An improved error estimate for reduced
order models of discrete time systems. IEEE Transactions on Automatic
Control, 35:317{320, 1990.
[57] K. Homan. Banach Spaces of Analytic Functions. Prentice-Hall, 1962.
[58] R.A. Horn and C.R. Johnson. Matrix Analysis. Cambridge University Press,
1991.
[59] R.A. Horn and C.R. Johnson. Topics in Matrix Analysis. Cambridge
University Press, 1995.
[60] P.A. Iglesias. An entropy formula for time-varying discrete-time control
systems. SIAM Journal of Control and Optimization, 34:1691{1706, 1996.
[61] T. Iwasaki. Robust performance analysis for systems with norm-bounded
time-varying structured uncertainty. In Proc. American Control Confer-
ence, 1994.
[62] T. Kailath. Linear Systems. Prentice-Hall, 1980.
[63] R.E. Kalman. A new approach to linear ltering and prediction theory.
ASME Transactions, Series D: Journal of Basic Engineering, 82:35{45,
1960.
[64] R.E. Kalman. On the general theory of control systems. In Proc. IFAC
World Congress, 1960.
[65] R.E. Kalman. Mathematical descriptions of linear systems. SIAM Journal
of Control, 1:152{192, 1963.
[66] R.E. Kalman and R.S. Bucy. New results in linear ltering and predic-
tion theory. ASME Transactions, Series D: Journal of Basic Engineering,
83:95{108, 1960.
[67] D. Kavranoglu and M. Bettayeb. Characterization of the solution to the
optimal H1 model reduction problem. Systems and Control Letters, 20:99{
107, 1993.
[68] C. Kenig and P. Tomas. P. maximal operators dened by fourier multipliers.
Studia Math., 68:79{83, 1980.
[69] H.K. Khalil. Nonlinear Systems. Prentice-Hall, 1996.
[70] M. Khammash and J. B. Pearson. Performance robustness of discrete-
time systems with structured uncertainty. IEEE Transactions on Automatic
Control, 36:398{412, 1991.
References 379
[71] P. Khargonekar and M. Rotea. Mixed H /H1 control: A convex opti-
2