Lecture Notes
Control Systems Theory and Design
Herbert Werner
Copyright © 2017 Herbert Werner ([email protected])
Technische Universität Hamburg-Harburg
Contents

Introduction

4 Observer-Based Control
4.1 State Estimate Feedback
4.2 Reference Tracking
4.3 Closed-Loop Transfer Function
4.4 Symmetric Root Locus Design

5 Multivariable Systems
5.1 Transfer Function Models
5.2 State Space Models
5.3 The Gilbert Realization
5.4 Controllability and Observability
5.5 The Smith-McMillan Form
5.6 Multivariable Feedback Systems and Closed-Loop Stability
5.7 A Multivariable Controller Form
5.8 Pole Placement
5.9 Optimal Control of MIMO Systems

Bibliography
Introduction
Classical techniques for analysis and design of control systems, which employ the concepts
of frequency response and root locus, have been used successfully for more than five
decades in a vast variety of industrial applications. These methods are based on transfer
function models of the plant to be controlled, and allow an efficient design of controllers
for single-input single-output systems. However, when it comes to more complex systems
with several input and output variables, classical techniques based on transfer functions
tend to become tedious and soon reach their limits. For such applications, the so-called
modern control techniques developed in the 1960s and 70s turned out to be more suitable.
They are based on state space models of the plant rather than on transfer function models.
One major advantage of state space models is that multi-input multi-output systems can
be treated in much the same way as single-input single-output systems. This course
provides an introduction to the modern state space approach to control and its theoretical
foundations.
State space models are introduced in Chapter 1; to facilitate the understanding of the
basic concepts, the discussion is initially limited to single-input single-output systems.
The main idea - introduced in Chapters 1 through 4 - is to break the controller design
problem up into two subproblems: (i) the problem of designing a static controller that
feeds back internal variables - the state variables of the plant - and (ii) the problem of
obtaining a good estimate of these internal state variables. It is shown that these two
problems are closely related; they are called dual problems. Associated with state feedback
control and state estimation (the latter is usually referred to as state observation) are the
concepts of controllability and observability. In Chapter 5 these concepts are extended
to multivariable systems, and it will be seen that this extension is in most aspects rather
straightforward, even though the details are more complex.
A further important topic, discussed in Chapter 6, is the digital implementation of con-
trollers. It is shown that most of the methods and concepts developed for continuous-time
systems have their direct counterparts in a discrete-time framework. This is true for trans-
fer function models as well as for state space models.
The third major issue taken up in this course is how to obtain a model of the plant to be
controlled. In practice, a model is often obtained from experimental data; this approach
is known as system identification. In Chapter 7 the basic ideas of system identification
are introduced; the identification of both transfer function models and state space models is
discussed. In Chapter 8 the issue of reducing the dynamic order of a model is addressed -
this can be important in applications where complex plant models lead to a large number
of state variables.
Most chapters are followed by a number of exercise problems. The exercises play an
important role in this course. To encourage student participation and active learning,
derivations of theoretical results are sometimes left as exercises. A second objective is to
familiarize students with the state-of-the-art software tools for modern controller design.
For this reason, a number of analysis and design problems are provided that are to be
solved using MATLAB and Simulink. MATLAB code and Simulink models for these
problems can be downloaded from the web page of this course. A complete design exercise
that takes up all topics of this course - identification of a model, controller and observer
design and digital implementation - is presented in Chapter 9.
This course assumes familiarity with elementary linear algebra, and with engineering
applications of some basic concepts of probability theory and stochastic processes, such
as white noise. A brief tutorial introduction to each of these fields is provided in the
Appendix. The Appendix also provides worked solutions to all exercises. Students are
encouraged to actively solve the exercise problems before checking the solutions.
Some exercises that are more demanding and point to more advanced concepts are marked
with an asterisk.
The exercise problems and worked solutions for this course were prepared by Martyn
Durrant MEng.
Chapter 1

State Space Models
In this chapter we will discuss linear state space models for single-input single-output
systems. The relationship between state space models and transfer function models is
explored, and basic concepts are introduced.
[Figure: spring-mass-damper system with spring constant k and damping constant b]
The model (1.3), (1.4) describes the same dynamic properties as the transfer function
(1.2). It is a special case of a state space model
ẋ(t) = Ax(t) + Bu(t) (1.5)
y(t) = Cx(t) + Du(t) (1.6)
In general, a system modelled in this form can have m inputs and l outputs, which are
then collected into an input vector u(t) ∈ IRm and an output vector y(t) ∈ IRl . The vector
x(t) ∈ IRn is called state vector, A ∈ IRn×n is the system matrix, B ∈ IRn×m is the input
matrix, C ∈ IRl×n is the output matrix and D ∈ IRl×m is the feedthrough matrix. Equation
(1.5) is referred to as the state equation and (1.6) as the output equation.
The spring-mass-damper system considered above has only one input and one output.
Such systems are called single-input single-output (SISO) systems, as opposed to multi-
input multi-output (MIMO) systems. For a SISO system the input matrix B degenerates
into a column vector b, the output matrix C into a row vector c and the feedthrough
matrix D into a scalar d. Initially, we will limit the discussion of state space models to
SISO systems of the form
ẋ(t) = Ax(t) + bu(t) (1.7)
y(t) = cx(t) + du(t) (1.8)
In the above example the direct feedthrough term d is zero - this is generally true for
physically realizable systems, as will be discussed later.
1.1 From Transfer Function to State Space Model

The state space model (1.3), (1.4) for the second order system in the above example was
constructed by introducing a single new variable. We will now show a general method for
constructing a state space model from a given transfer function. Consider the nth order
linear system governed by the differential equation
d^n y(t)/dt^n + an−1 d^{n−1}y(t)/dt^{n−1} + . . . + a1 dy(t)/dt + a0 y(t) =
        bm d^m u(t)/dt^m + bm−1 d^{m−1}u(t)/dt^{m−1} + . . . + b1 du(t)/dt + b0 u(t)        (1.9)
where we assume for simplicity that the system is strictly proper, i.e. n > m (the case of
bi-proper systems is discussed at the end of this section). A transfer function model of
this system is
Y (s) = G(s)U (s) (1.10)
where
G(s) = (bm s^m + bm−1 s^{m−1} + . . . + b1 s + b0) / (s^n + an−1 s^{n−1} + . . . + a1 s + a0) = b(s)/a(s)        (1.11)
The transfer function model of this system is shown in Fig. 1.2. In order to find a state
space model for this system, we will first construct a simulation model by using integrator
blocks. For this purpose, we split the model in Fig. 1.2 into two blocks as shown in Fig.
1.3, and let v(t) denote the fictitious output of the filter 1/a(s).
[Figure 1.3: u(t) → 1/a(s) → v(t) → b(s) → y(t)]
From Fig. 1.3 the input and output signals can be expressed in terms of the new variable
as U (s) = a(s)V (s) and Y (s) = b(s)V (s), or in time domain
u(t) = d^n v(t)/dt^n + an−1 d^{n−1}v(t)/dt^{n−1} + . . . + a1 dv(t)/dt + a0 v(t)        (1.12)
and
y(t) = bm d^m v(t)/dt^m + bm−1 d^{m−1}v(t)/dt^{m−1} + . . . + b1 dv(t)/dt + b0 v(t)        (1.13)
From (1.12) the signal v(t) can be generated by using a chain of integrators: assume first
that dn v(t)/dtn is somehow known, then integrating n times, introducing feedback loops
as shown in the lower half of Fig. 1.4 (for n = 3 and m = 2) and adding the input signal
u(t) yields the required signal dn v(t)/dtn . The output signal y(t) can then be constructed
as a linear combination of v(t) and its derivatives according to (1.13), as shown in the
upper half of Fig. 1.4.
An implicit assumption was made in the above construction: namely that the initial values
at t = 0 in (1.12) and (1.13) are zero. In the model in Fig. 1.4 this corresponds to the
assumption that the integrator outputs are zero initially. When transfer function models
are used, this assumption is usually made, whereas state space models allow non-zero
initial conditions to be taken into account.
State Variables
Before we derive a state space model from the simulation model in Fig. 1.4, we introduce
the concept of state variables. For a given system, a collection of internal variables
x1 (t), x2 (t), . . . , xn (t)
[Figure 1.4: simulation model of (1.9) for n = 3, m = 2. The input u minus the feedback
terms a2 v̈, a1 v̇, a0 v drives a chain of three integrators with outputs x3 = v̈, x2 = v̇,
x1 = v; the output y is formed as b2 v̈ + b1 v̇ + b0 v.]
is referred to as state variables if they completely determine the state of the system at
time t. By this we mean that if the values of the state variables at some time, say t0 ,
are known, then for a given input signal u(t) where t ≥ t0 all future values of the state
variables can be uniquely determined. Obviously, for a given system the choice of state
variables is not unique.
We now return to the system represented by the simulation model in Fig. 1.4. Here the
dynamic elements are the integrators, and the state of the system is uniquely determined
by the values of the integrator outputs at a given time. Therefore, we choose the integrator
outputs as state variables of the system, i.e. we define
x1(t) = v(t),   x2(t) = dv(t)/dt,   . . . ,   xn(t) = d^{n−1}v(t)/dt^{n−1}
The chain of integrators and the feedback loops determine the relationship between the
state variables and their derivatives. The system dynamics can now be described by a set
of first order differential equations
ẋ1 = x2
ẋ2 = x3
..
.
ẋn−1 = xn
ẋn = u − a0 x1 − a1 x2 − . . . − an−1 xn
The last equation is obtained at the input summing junction in Fig. 1.4. The first order
differential equations can be rewritten in vector form: introduce the state vector
x(t) = [x1 (t) x2 (t) . . . xn (t)]T
then we have
⎡ ẋ1 ⎤   ⎡  0    1    0  · · ·   0   ⎤ ⎡ x1 ⎤   ⎡ 0 ⎤
⎢ ẋ2 ⎥   ⎢  0    0    1  · · ·   0   ⎥ ⎢ x2 ⎥   ⎢ 0 ⎥
⎢  ⋮ ⎥ = ⎢  ⋮             ⋱      ⋮   ⎥ ⎢  ⋮ ⎥ + ⎢ ⋮ ⎥ u(t)        (1.14)
⎢ẋn−1⎥   ⎢  0    0    0  · · ·   1   ⎥ ⎢xn−1⎥   ⎢ 0 ⎥
⎣ ẋn ⎦   ⎣ −a0  −a1  −a2 · · · −an−1 ⎦ ⎣ xn ⎦   ⎣ 1 ⎦
This equation has the form of (1.7) and is a state equation of the system (1.9). The
state equation describes the dynamic properties of the system. If the system is physically
realizable, i.e. if n > m, then from (1.13) and Fig. 1.4, the output signal is a linear
combination of the state variables and can be expressed in terms of the state vector as
y(t) = [b0  b1  . . .  bn−1 ] x(t)        (1.15)
This equation has the form of (1.8) and is the output equation of the system (1.9) associ-
ated with the state equation (1.14). The system matrices (vectors) A, b and c of this state
space model contain the same information about the dynamic properties of the system as
the transfer function model (1.11). As mentioned above, the choice of state variables for
a given system is not unique, and other choices lead to different state space models for
the same system (1.9). For reasons that will be discussed later, the particular form (1.14)
and (1.15) of a state space model is called controller canonical form. A second canonical
form, referred to as observer canonical form, is considered in Exercise 1.7.
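As a small illustration (the course exercises use MATLAB, e.g. tf2ss; the NumPy sketch and function names below are my own), the controller canonical form (1.14), (1.15) can be assembled directly from the coefficients of a(s) and b(s) and checked against G(s) = c(sI − A)⁻¹b:

```python
import numpy as np

def controller_canonical(a, b):
    """Controller canonical form (1.14), (1.15) for strictly proper G(s) = b(s)/a(s).
    a: [a0, ..., a_{n-1}] (monic denominator, leading coefficient 1 omitted),
    b: [b0, ..., bm] numerator coefficients, m < n."""
    n = len(a)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)           # ones on the superdiagonal
    A[-1, :] = -np.asarray(a)            # last row: -a0 ... -a_{n-1}
    bvec = np.zeros((n, 1))
    bvec[-1, 0] = 1.0
    c = np.zeros((1, n))
    c[0, :len(b)] = b                    # output taps b0 ... bm
    return A, bvec, c

def eval_tf(A, b, c, s):
    """Evaluate G(s) = c (sI - A)^(-1) b at one complex frequency s."""
    return (c @ np.linalg.solve(s * np.eye(A.shape[0]) - A, b))[0, 0]

# Example: G(s) = (2s + 3) / (s^3 + 4s^2 + 5s + 6)
A, b, c = controller_canonical([6., 5., 4.], [3., 2.])
s = 1.0 + 2.0j
print(np.isclose(eval_tf(A, b, c, s), (2*s + 3) / (s**3 + 4*s**2 + 5*s + 6)))  # True
```

The example transfer function here is made up; any strictly proper coefficient lists work the same way.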
Note that for bi-proper systems, i.e. systems where n = m, the state equation will still be
as shown in (1.14), but the output equation will be different from (1.15): There will be a
feedthrough term du(t) in (1.15); if the example in Figure 1.4 were bi-proper there would be
a direct path with gain b3 from d3 v/dt3 to y. Moreover, the elements of the measurement
vector c will not comprise the numerator coefficients of the bi-proper transfer function,
but of the strictly proper remainder obtained after polynomial division. This is illustrated
in Exercise 1.8.
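The polynomial division step can be sketched numerically. With a made-up bi-proper example (not the one from Exercise 1.8), np.polydiv returns the feedthrough d as the quotient and the strictly proper remainder:

```python
import numpy as np

# b(s)/a(s) = d + r(s)/a(s) with n = m.
# Made-up example: (2s^2 + 9s + 11) / (s^2 + 3s + 2),
# coefficients given in descending powers of s.
b = np.array([2., 9., 11.])
a = np.array([1., 3., 2.])

q, r = np.polydiv(b, a)   # quotient = feedthrough d, remainder = r(s)
print(q)                  # [2.]      ->  d = 2
print(r)                  # [3. 7.]   ->  r(s) = 3s + 7
```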
1.2 From State Space Model to Transfer Function

We have seen how a particular state space model of a system can be constructed when its
transfer function is given. We now consider the case where a state space model is given
and we wish to find its transfer function. Thus, consider a system described by the state
space model
ẋ(t) = Ax(t) + bu(t)
y(t) = cx(t) + du(t)
and assume again that the initial conditions are zero, i.e. x(0) = 0. Taking Laplace
transforms, we have
sX(s) = AX(s) + bU (s)
and solving for X(s) yields
X(s) = (sI − A)−1 bU (s) (1.16)
where I denotes the identity matrix. Substituting the above in the Laplace transform of
the output equation leads to

Y(s) = [c(sI − A)⁻¹b + d] U(s),   i.e.   G(s) = c(sI − A)⁻¹b + d        (1.17)
Further insight into the relationship between transfer function and state space model can
be gained by noting that
(sI − A)⁻¹ = adj(sI − A) / det(sI − A)
where adj (M ) denotes the adjugate of a matrix M . The determinant of sI − A is a
polynomial of degree n in s. On the other hand, the adjugate of sI − A is an n × n
matrix whose entries are polynomials in s of degree less than n. Substituting in (1.17)
and assuming n > m (i.e. d = 0) gives
G(s) = c adj(sI − A) b / det(sI − A)
The adjugate is multiplied by the row vector c from the left and by the column vector b
from the right, resulting in a single polynomial of degree less than n. This polynomial is
the numerator polynomial of the transfer function, whereas det(sI −A) is the denominator
polynomial. The characteristic equation of the system is therefore
det(sI − A) = 0
Note that the characteristic equation is the same when there is a direct feedthrough term
d ≠ 0. The values of s that satisfy this equation are the eigenvalues of the system matrix
A. This leads to the important observation that the poles of the transfer function - which
determine the dynamic properties of the system - can be found in a state space model as
eigenvalues of the system matrix A.
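A quick numerical check of this observation (a NumPy sketch of my own, not part of the notes): the eigenvalues of the companion matrix in (1.14) coincide with the roots of the denominator polynomial a(s).

```python
import numpy as np

# Companion matrix of a(s) = s^3 + 4s^2 + 5s + 6, as in (1.14)
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [-6., -5., -4.]])
eigs = np.sort_complex(np.linalg.eigvals(A))
poles = np.sort_complex(np.roots([1., 4., 5., 6.]))  # roots of det(sI - A)
print(np.allclose(eigs, poles))  # True
```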
A block diagram representing the state space model (1.7), (1.8) is shown in Fig. 1.5.
Note that no graphical distinction is made between scalar signals and vector signals. The
integrator block represents n integrators in parallel. When the relationship between state
space models and transfer functions was discussed, we assumed x(0) = 0. From now on
we will include the possibility of non-zero initial values of the state variables in the state
space model. In Fig. 1.5 this is done by adding the initial values to the integrator outputs.
[Figure 1.5: block diagram of the state space model (1.7), (1.8). The input u enters
through b, the integrator block (initialized with x(0)) produces x, the output y is formed
through c, and x is fed back through A.]
The dynamic properties of this system are determined by the eigenvalues of the system
matrix A. Suppose we wish to improve the dynamic behaviour - e.g. when the system is
unstable or has poorly damped eigenvalues. If all state variables are measured, we can
use them for feedback by taking the input as

u(t) = f x(t) + uv(t)        (1.18)

where f = [f1 f2 . . . fn ] is a gain vector, and uv(t) is a new external input. This type of
feedback is called state feedback, the resulting closed-loop system is shown in Fig. 1.6.
Substituting (1.18) in the state equation yields

ẋ(t) = (A + bf) x(t) + b uv(t)
Comparing this with the original state equation shows that as a result of state feedback
the system matrix A is replaced by A + bf , and the control input u(t) is replaced by uv (t).
The eigenvalues of the closed-loop system are the eigenvalues of A + bf , and the freedom
in choosing the state feedback gain vector f can be used to move the eigenvalues of the
original system to desired locations; this will be discussed later (see also Exercise 1.6).
1.4 Non-Uniqueness of State Space Models and Similarity Transformations

The state variables we chose for the spring-mass-damper system have a physical meaning
(they represent position and velocity). This is often the case when a state space model is
derived from a physical description of the plant. In general, however, state variables need
[Figure 1.6: closed-loop system with state feedback u = f x + uv]
not have any physical significance. The signals that represent the interaction of a system
with its environment are the input signal u(t) and the output signal y(t). The elements
of the state vector x(t) on the other hand are internal variables which are chosen in a way
that is convenient for modelling internal dynamics; they may or may not exist as physical
quantities. The solution x(t) of the forced vector differential equation ẋ(t) = Ax(t)+bu(t)
with initial condition x(0) = x0 can be thought of as a trajectory in an n-dimensional
state space, which starts at point x0 . The non-uniqueness of the choice of state variables
reflects the freedom of choosing a coordinate basis for the state space. Given a state space
model of a system, one can generate a different state space model of the same system by
applying a coordinate transformation.
To illustrate this point, consider a system modelled as

ẋ(t) = Ax(t) + bu(t),    y(t) = cx(t) + du(t)
A different state space model describing the same system can be generated as follows. Let
T be any non-singular n × n matrix, and consider a new state vector x̃(t) defined by
x(t) = T x̃(t)
Substituting x(t) = T x̃(t) into the state equation gives

T dx̃(t)/dt = AT x̃(t) + bu(t)

or

dx̃(t)/dt = T⁻¹AT x̃(t) + T⁻¹b u(t)
and
y(t) = cT x̃(t) + du(t)
1.4. Non-Uniqueness of State Space Models and Similarity Transformations 9
Comparing this with the original model shows that the model (A, b, c, d) has been replaced
by a model (Ã, b̃, c̃, d), where
à = T −1 AT, b̃ = T −1 b, c̃ = cT
Note that the feedthrough term d is not affected by the transformation, because it is not
related to the state variables.
In matrix theory, matrices A and à related by
à = T −1 AT
where T is nonsingular, are said to be similar, and the above state variable transformation
is referred to as a similarity transformation. Similarity transformations do not change the
eigenvalues: we have

det(sI − Ã) = det(T⁻¹(sI − A)T) = det(T⁻¹) det(sI − A) det(T) = det(sI − A)
In fact, it is straightforward to check that the models (A, b, c) and (Ã, b̃, c̃) have the same
transfer function (see Exercise 1.9).
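This invariance is easy to check numerically; the following NumPy sketch (my own, with an arbitrary non-singular T) compares eigenvalues and the value of c(sI − A)⁻¹b before and after the transformation:

```python
import numpy as np

A = np.array([[0., 1.], [-2., -3.]])    # eigenvalues -1, -2
b = np.array([[0.], [1.]])
c = np.array([[1., 0.]])

T = np.array([[1., 2.], [3., 4.]])      # any non-singular matrix (det = -2)
Ti = np.linalg.inv(T)
At, bt, ct = Ti @ A @ T, Ti @ b, c @ T  # transformed model (A~, b~, c~)

s = 0.5 + 1.0j
G  = (c  @ np.linalg.solve(s * np.eye(2) - A,  b))[0, 0]
Gt = (ct @ np.linalg.solve(s * np.eye(2) - At, bt))[0, 0]
print(np.isclose(G, Gt))                                  # True
print(np.allclose(np.sort(np.linalg.eigvals(A)),
                  np.sort(np.linalg.eigvals(At))))        # True
```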
To see that T represents a change of coordinate basis, write x = T x̃ as
x = [t1  t2  . . .  tn ] [x̃1  x̃2  . . .  x̃n ]ᵀ = t1 x̃1 + t2 x̃2 + . . . + tn x̃n
where ti is the ith column of T . Thus, x̃i is the coordinate of x in the direction of the
basis vector ti . In the original coordinate basis, the vector x is expressed as
x = e 1 x1 + e 2 x2 + . . . + e n xn
where ei is a vector with zeros everywhere except for the ith element which is 1.
We have thus seen that for a given transfer function model, by choosing a nonsingular
transformation matrix T we can find infinitely many different but equivalent state space
models. As a consequence, it is not meaningful to refer to the state variables of a system,
but only to the state variables of a particular state space model of that system. Conversely,
all state space models that are related by a similarity transformation represent the same
system. For this reason, a state space model is also referred to as a particular state space
realization of a transfer function model.
Diagonal Form
To illustrate the idea of a similarity transformation, assume that we are given a 3rd order
state space model (A, b, c) and that the eigenvalues of A are distinct and real. Suppose we
wish to bring this state space model into a form where the new system matrix is diagonal.
We need a transformation such that
à = T −1 AT = Λ
where Λ = diag (λ1 , λ2 , λ3 ). It is clear that the λi are the eigenvalues of A. To find the
required transformation matrix T , note that AT = T Ã or
A [t1  t2  t3 ] = [t1  t2  t3 ] diag(λ1 , λ2 , λ3 )

or, equivalently,
A ti = λi ti ,    i = 1, 2, 3
Therefore, the columns of T are the (right) eigenvectors of A. Since we assumed that
the eigenvalues of A are distinct, the three eigenvectors are linearly independent and T is
non-singular as required.
State space models with a diagonal system matrix are referred to as modal canonical
form, because the poles of a system transfer function (the eigenvalues appearing along
the diagonal) are sometimes called the normal modes or simply the modes of the system.
1.5 Solutions of State Equations and Matrix Exponentials

Consider first the scalar state equation

ẋ(t) = a x(t) + b u(t),    x(0) = x0

where x(t) is a single state variable and a is a real number. The solution to this problem
can be found by using the integrating factor e^{−at}: we have
d/dt (e^{−at} x) = e^{−at} (ẋ − ax) = e^{−at} b u
Integration from 0 to t and multiplication by eat yields
x(t) = e^{at} x0 + ∫₀ᵗ e^{a(t−τ)} b u(τ) dτ
Crucial for finding the solution in this scalar case is the property
d/dt e^{at} = a e^{at}
of the exponential function. To solve the state equation when x ∈ IRn , we would need
something like
d/dt e^{At} = A e^{At}
for a n × n matrix A. This leads to the definition of the matrix exponential
e^{At} = I + At + (1/2!) A²t² + (1/3!) A³t³ + . . .        (1.19)
It can be shown that this power series converges. Differentiating shows that
d/dt e^{At} = A + A²t + (1/2!) A³t² + (1/3!) A⁴t³ + . . .
            = A (I + At + (1/2!) A²t² + (1/3!) A³t³ + . . .)
            = A e^{At}
as required. Incidentally, this also shows that AeAt = eAt A because A can be taken as a
right factor in the second equation above.
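The defining series (1.19) also suggests a direct (if numerically naive) way of computing e^{At}; for a diagonalizable A it can be compared with the closed form e^{At} = T e^{Λt} T⁻¹. A NumPy sketch of my own:

```python
import numpy as np

def expm_series(A, t, terms=30):
    """Truncated power series I + At + (At)^2/2! + ... from (1.19)."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ (A * t) / k      # term now holds (At)^k / k!
        out = out + term
    return out

A = np.array([[0., 1.], [-2., -3.]])   # eigenvalues -1, -2
t = 0.7
lam, T = np.linalg.eig(A)
expm_closed = T @ np.diag(np.exp(lam * t)) @ np.linalg.inv(T)
print(np.allclose(expm_series(A, t), expm_closed))  # True
```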
The solution of the state equation can thus be written as

x(t) = e^{At} x0 + ∫₀ᵗ e^{A(t−τ)} b u(τ) dτ        (1.20)

where the matrix function Φ(t) = e^{At} is called the transition matrix of the system.
The first term on the right hand side is called the zero-input response; it represents the
part of the system response that is due to the initial state x0. The second term is called
the zero-initial-state response; it is the part of the response that is due to the external
input u(t).
The frequency domain expression (1.16) of the solution was derived by assuming a zero
initial state. When the initial state is non-zero, it is easy to verify that the Laplace
transform of x(t) is
X(s) = Φ(s)x0 + Φ(s)bU (s) (1.21)
where
Φ(s) = L[Φ(t)σ(t)] = (sI − A)−1
is the Laplace transform of the transition matrix. The second equation follows directly
from comparing the expression obtained for X(s) with the time-domain solution (1.20).
The definition of Φ(t) implies that this matrix function is invertible, with Φ(t)⁻¹ = e^{−At} = Φ(−t), and that it commutes with A:

A Φ(t) = Φ(t) A
Stability
From the study of transfer function models we know that a system is stable if all its poles
are in the left half plane. For a transfer function model, stability means that after a
system has been excited by an external input, the transient response will die out and the
output will settle to the equilibrium value once the external stimulus has been removed.
Since we have seen that poles of a transfer function become eigenvalues of the system
matrix A of a state space model, we might ask whether stability of a state space model
is equivalent to all eigenvalues of A being in the left half plane. In contrast to transfer
function models, the output of a state space model is determined not only by the input
but also by the initial value of the state vector. The notion of stability used for transfer
functions is meaningful for the part of the system dynamics represented by the zero-
initial-state response, but we also need to define stability with respect to the zero-input
response, i.e. when an initial state vector x(0) ≠ 0 is driving the system state.

Definition 1.1 An unforced system ẋ(t) = Ax(t) is said to be stable if for all
x(0) = x0 , x0 ∈ IRn , we have x(t) → 0 as t → ∞.
The zero-input response of the system is

x(t) = e^{At} x0
or in frequency domain
X(s) = (sI − A)−1 x0
Assuming that the eigenvalues are distinct, a partial fraction expansion reveals that

x(t) = φ1 e^{λ1 t} + φ2 e^{λ2 t} + . . . + φn e^{λn t}

where λi are the eigenvalues of A, and φi are column vectors that depend on the residue at
λi. It is clear that x(t) → 0 if and only if all eigenvalues have negative real parts. Thus, we
find that stability with respect to both zero-initial state response and zero-input response
requires that all eigenvalues of A are in the left half plane.
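In code, this stability test is a one-liner on the eigenvalues of A (NumPy sketch of my own):

```python
import numpy as np

def is_stable(A):
    """True if all eigenvalues of A have negative real parts."""
    return bool(np.all(np.linalg.eigvals(A).real < 0))

print(is_stable(np.array([[0., 1.], [-2., -3.]])))  # True  (eigenvalues -1, -2)
print(is_stable(np.array([[0., 1.], [2., 1.]])))    # False (eigenvalue at +2)
```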
We conclude this chapter with an important result on the representation of the transition
matrix. It is based on the Cayley-Hamilton Theorem, which is derived (for the case of
distinct eigenvalues) in Exercise 1.2.

Theorem 1.1 Let a(s) = det(sI − A) = s^n + an−1 s^{n−1} + . . . + a1 s + a0 be the
characteristic polynomial of A. Then

A^n + an−1 A^{n−1} + . . . + a1 A + a0 I = 0

In the last equation the 0 on the right hand side stands for the n × n zero matrix. This
theorem states that every square matrix satisfies its characteristic equation.
With the help of the Cayley-Hamilton Theorem we can express the matrix exponential
eAt - defined as an infinite power series - as a polynomial of degree n − 1. Recall the
definition
e^{At} = I + At + (1/2!) A²t² + (1/3!) A³t³ + . . . + (1/n!) Aⁿtⁿ + . . .        (1.22)
From Theorem 1.1 we have
A^n = −an−1 A^{n−1} − . . . − a1 A − a0 I
and we can substitute the right hand side for A^n in (1.22). By repeating this, we can in
fact reduce all terms with powers of A of order n or higher to terms of order at most
n − 1. Alternatively, we can achieve
the same result by polynomial division. Let a(s) = det(sI − A) be the characteristic
polynomial of A with degree n, and let p(s) be any polynomial of degree k > n. We can
divide p(s) by a(s) and obtain
p(s)/a(s) = q(s) + r(s)/a(s)
where the remainder r(s) has degree less than n. Thus, p(s) can be written as

p(s) = q(s) a(s) + r(s)    and therefore    p(A) = q(A) a(A) + r(A) = r(A)

where the second equation follows from a(A) = 0. Since the polynomial p was arbitrary,
this shows that we can reduce any matrix polynomial in A to a degree less than n. This
is also true for the infinite polynomial in (1.22), and we have the following result.
Theorem 1.2
The transition matrix of an nth-order state space model can be written in the form

e^{At} = α0(t) I + α1(t) A + . . . + αn−1(t) A^{n−1}        (1.23)

Note that the polynomial coefficients are time-varying, because the coefficients of the
infinite polynomial (1.22) are time-dependent. One way of computing the functions αi (t)
for a given state space model is suggested in Exercise 1.4.
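For distinct eigenvalues, the αi(t) can be computed as in Problem 1.4: each eigenvalue satisfies e^{λi t} = α0(t) + α1(t)λi + . . . + αn−1(t)λi^{n−1}, which is a Vandermonde system of linear equations. A NumPy sketch of my own:

```python
import numpy as np

A = np.array([[0., 1.], [-2., -3.]])   # eigenvalues -1, -2 (distinct)
t = 0.5
lam = np.linalg.eigvals(A)

V = np.vander(lam, increasing=True)    # rows [1, lambda_i], one per eigenvalue
alpha = np.linalg.solve(V, np.exp(lam * t))

expm_poly = alpha[0] * np.eye(2) + alpha[1] * A    # Theorem 1.2 for n = 2

# reference value via diagonalization e^{At} = T e^{Lambda t} T^{-1}
w, T = np.linalg.eig(A)
expm_ref = T @ np.diag(np.exp(w * t)) @ np.linalg.inv(T)
print(np.allclose(expm_poly, expm_ref))  # True
```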
Exercises
Problem 1.1
[Figure: series RLC circuit driven by a voltage source vs, with loop current i and voltages
vl, vr, vc across the inductor L, resistor R and capacitor C]
a) Show that the voltage vr over the resistance R can be described by the state space
representation
ẋ = Ax + bu, y = cx
where
x = [i  vc ]ᵀ,    u = vs,    y = vr

A = [−R/L  −1/L; 1/C  0],    b = [1/L; 0],    c = [R  0]

Use the physical relationships vs = vc + vl + vr, vl = L di/dt, i = C dvc/dt.
b) Show that the circuit can also be described by the following differential equation:
LC d²vc/dt² + RC dvc/dt + vc = vs
c) Write this equation in state space form with state variables x1 = vc and x2 = v̇c .
Problem 1.2
Problem 1.3
ẋ = Ax
where the distinct, real eigenvalues of A are λ1 and λ2 with corresponding eigenvectors t1
and t2 .
Show that for the initial state

x(0) = k t1

we have

X(s) = k/(s − λ1) · t1
c) For

A = [−1  1; −2  −4]
Problem 1.4
a) Use the fact that the eigenvalues of a square matrix A are solutions of the character-
istic equation to show that e^{λt} for each eigenvalue λ = λ1, λ2, . . . , λn can be expressed
as

e^{λt} = α0(t) + α1(t) λ + . . . + αn−1(t) λ^{n−1}
b) For the case of distinct eigenvalues, show that the functions αi (t), i = 0, . . . , n − 1
in (1.23), are the same as the ones in part a.
Hint: Use the fact that if A = T ΛT −1 then eAt = T eΛt T −1 .
Problem 1.5
a) Use the results from Problem 1.4 to calculate the functions α0 (t), α1 (t) and eAt .
b) Calculate the state and output responses, x(t) and y(t), with initial values
x(0) = [2  1]ᵀ

c) Calculate the output response with initial values from (b) and a step input
u(t) = 2σ(t)
Problem 1.6
a) Convert the 2nd order model of the spring mass system (1.1) into the controller
canonical form. Use the values
m = 1, k = 1, b = 0.1
c) Calculate the gain vector f = [f1 f2 ] for a state feedback controller u = f x, which will
bring the system with an initial state of x(0) = [1 0]T back to a steady state with a
settling time (1%) ts ≤ 5 and with damping ratio ζ ≥ 0.7.
d) Calculate f , using MATLAB, for the settling time ts ≤ 5 and damping ratio ζ ≥ 0.7.
Hint: Consider the characteristic polynomial of the closed loop system.
Problem 1.7
Y(s) = (b2 s² + b1 s + b0)/(s³ + a2 s² + a1 s + a0) U(s)
[Figure: block diagram with a chain of three integrators, states x1, x2, x3, input taps
b0, b1, b2, and output y]
Problem 1.8
Determine the controller and observer canonical forms for the system with transfer func-
tion
H(s) = (4s³ + 25s² + 45s + 34)/(s³ + 6s² + 10s + 8)
Problem 1.9
Show that two state space models (A1 , b1 , c1 ) and (A2 , b2 , c2 ) represent the same transfer
function if a matrix T exists (det T ≠ 0) such that
T −1 A1 T = A2 , T −1 b1 = b2 , c1 T = c2
Problem 1.10
This exercise is a short revision of the concept of linearisation of physical model equations.
Consider the water tank in Figure 1.9.
[Figure 1.9: water tank with inflow fin, water height h, and outflow fout through a valve
with position u]
a) Describe the relationship between water height h, inflow fin and valve position u by
a differential equation. Why is this not linear?
b) Determine the valve position u0 , that for a constant water height h0 and inflow fin0
keeps the level in the tank constant.
c) Show that for small deviations δh, δu of h and u, respectively, from a steady state
(h0 ,u0 ) the following linear approximation can be used to describe the water level
dynamics, where kt = kv √(ρg):

δḣ = −(fin0 / (2 At h0)) δh − (kt √h0 / At) δu
Hint: Use the Taylor-series expansion of f (h + δ) for small δ
d) Write down the transfer function of the system at the linearised operating point
around the steady state (h0 ,u0 ). Identify the steady state process gain and the time
constant of the linearised system.
e) For the linearised system in (c) with input u and output h, determine a state space
model of the form
ẋ = Ax + bu
y = Cx + du
[Figure 1.10: The Mini Segway]
[Figure 1.11: Model of the Mini Segway with its degrees of freedom]
Consider the self-balancing robot in Figure 1.10. The relation between the input voltage
u(t) and the degrees of freedom s(t), α(t) is described by the set of nonlinear differential
equations
(mp + 2mw + 2Jw/r²) s̈ + mp l cos(α) α̈ + (2 kt kb/(R r²)) ṡ − (kt kb/(R r)) α̇ − mp l sin(α) α̇² = (kt/(R r)) u

(Jp + mp l²) α̈ + mp l cos(α) s̈ − (kt kb/(R r)) ṡ + (kt kb/R) α̇ − mp g l sin(α) = −(kt/R) u
        (1.24)
To derive these equations, the Lagrange Formulation and the DC Motor Equation are
used. (1.24) can be rewritten as a nonlinear state space model
ẋ = f (x, u) (1.25)
where the states are defined as x = [s  α  ṡ  α̇]ᵀ. Equation (1.25) can be linearized
around the operating point (x, u), where x(t) = x + δx(t) and u(t) = u + δu(t). Finally,
this will lead to a linear state space model
δẋ = A δx + B δu
δy = C δx + D δu
        (1.26)
a) Derive the function f (x, u) for the nonlinear state space model.
c) Derive the matrices A and B for the linear state space model.
d) Determine the matrices C and D. Assume that all outputs can be measured, there-
fore y = x.
Hint: If you have problems solving a), go through the following steps
Test your model in a closed loop configuration with a state feedback controller to check
the validity of the linear model. Assume the following initial conditions
x(0) = [0  α0  0  0]ᵀ

initial angle: α0 = 5°
state feedback gain: K = [26.4575  82.3959  56.2528  12.0057]
a) Build two block diagrams in Simulink, one for the linear model and one for the
nonlinear model. Each block diagram should look like Figure 1.12.
– open the Simulink file Simulation LQR.slx and use the blocks on the left
– define all necessary variables in the workspace
– add a saturation block for the nonlinear model
– finally, run simulation LQR script.m
b) Compare the outputs of the linear and nonlinear models. What do you observe? Is the
input voltage inside the limits?
c) Now increase the initial angle first to 10◦ and then to 15◦ . Compare the results with
the previous task. Is the linear model still a good approximation?
Figure 1.12: Block diagram of the model with state feedback controller
Familiarize yourself with the Mini Segway; for this purpose read and follow the file Tutorial.pdf.
[Figure: disturbance signal d(t), stepping to the level d0 at time tstart]
b) Build the same disturbance signal d(t) again in experiment LQR disturbance.slx.
Compare simulation with experiment by running comparison LQR disturbance script.m.
Does the experiment match the simulation?
Chapter 2

Controllability and Pole Placement
This chapter will introduce the first of two important properties of linear systems that
determine whether or not given control objectives can be achieved. A system is said to be
controllable if it is possible to find a control input that takes the system from any initial
state to any final state in any given time interval. In this chapter, necessary and sufficient
conditions for controllability are derived. It is also shown that the closed-loop poles can
be placed at arbitrary locations by state feedback if and only if the system is controllable.
We start with a definition of controllability. Consider a system with state space realization

ẋ(t) = Ax(t) + bu(t),   y(t) = cx(t)        (2.1)
Definition 2.1
The system (2.1) is said to be controllable if for any initial state x(0) = x0 , time tf > 0
and final state xf there exists a control input u(t), 0 ≤ t ≤ tf , such that the solution of
(2.1) satisfies x(tf ) = xf . Otherwise, the system is said to be uncontrollable.
Since this definition involves only the state equation, controllability is a property of the
data pair (A, b).
Example 2.1
and it is clear that there exists no input u(t) that will bring the system to the final state
xf .
Returning to the system (2.1), controllability requires that there exists an input u(t) on
the interval 0 ≤ t ≤ tf such that
xf = e^{A tf} x0 + ∫₀^{tf} e^{A(tf − τ)} b u(τ) dτ        (2.2)
To establish conditions for the existence of such a control input, we will first present a
particular choice of input that satisfies (2.2) under the assumption that a certain matrix
is invertible. We will then show that if (2.1) is controllable, this assumption is always
true.
Thus, consider the input
u(t) = −bᵀ e^{Aᵀ(tf − t)} Wc⁻¹(tf) (e^{A tf} x0 − xf)        (2.3)
That this input takes the system from x0 to xf can be easily verified by substituting (2.3)
into (2.2); this leads to

x(tf) = e^{A tf} x0 − (∫₀^{tf} e^{A(tf−τ)} b bᵀ e^{Aᵀ(tf−τ)} dτ) Wc⁻¹(tf) (e^{A tf} x0 − xf) = xf

To see that the left factor in the second term is Wc(tf), observe that (using a change of
variables) ∫₀^{tf} f(tf − τ) dτ = ∫₀^{tf} f(λ) dλ.
In the above derivation the assumption that Wc(tf) is invertible is necessary for the input
(2.3) to exist. The matrix function

Wc(t) = ∫₀^{t} e^{Aτ} b bᵀ e^{Aᵀτ} dτ        (2.4)

plays an important role in system theory; it is called the controllability Gramian of the system (2.1).
The matrix Wc(t) is positive semidefinite for any t > 0, because for any column vector
q ∈ IRⁿ we have

qᵀ (∫₀^{t} e^{Aτ} b bᵀ e^{Aᵀτ} dτ) q = ∫₀^{t} (qᵀ e^{Aτ} b)² dτ ≥ 0
This also shows that if Wc (t) is positive definite for some t > 0, it is positive definite for
all t > 0. We will use the notation M > 0 and M ≥ 0 to indicate that a matrix M is
positive definite or positive semidefinite, respectively. Since the positive semidefinite matrix
Wc(t) has full rank, and is hence invertible, if and only if Wc(t) > 0, a necessary condition
for the input u(t) in (2.3) to exist is that the controllability Gramian is positive definite.
The following Theorem gives a necessary and sufficient condition for controllability.
Theorem 2.1
The system (2.1) is controllable if and only if the controllability Gramian Wc (t) in (2.4)
is positive definite for any t > 0.
Proof
That Wc (t) > 0 for any t > 0 implies controllability has already been shown, it follows
from the existence of u(t) in (2.3). To prove the Theorem, it remains to show that con-
trollability also implies Wc (t) > 0 for any t > 0. Thus, assume that (A, b) is controllable
but that there exists a time tf > 0 such that the controllability Gramian is not invertible,
i.e. rank Wc (tf ) < n. Then there exists a column vector q 6= 0 such that
qᵀ (∫₀^{tf} e^{Aτ} b bᵀ e^{Aᵀτ} dτ) q = ∫₀^{tf} (qᵀ e^{Aτ} b)² dτ = 0

which implies

qᵀ e^{Aτ} b = 0,   0 ≤ τ ≤ tf

Now let the initial state be x0 = e^{−A tf} q, and let u(t) be an input that takes the system
to x(tf) = 0; such an input exists by controllability. Premultiplying (2.2) by qᵀ and using
qᵀ e^{A(tf−τ)} b = 0 then gives

qᵀ e^{A tf} x0 = 0

But qᵀ e^{A tf} x0 = qᵀq > 0, a contradiction; therefore Wc(t) must be positive definite for any t > 0.
Note that in order to show that (2.1) is controllable, it is sufficient to show that Wc (t) > 0
for some t > 0.
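The constructions above can be checked numerically. The following sketch (Python with NumPy/SciPy; the double-integrator plant, the horizon tf and the target state xf are illustrative choices, not taken from the notes) approximates the Gramian (2.4) by quadrature, forms the steering input (2.3), and simulates the state equation:

```python
import numpy as np
from scipy.integrate import solve_ivp, trapezoid
from scipy.linalg import expm

# illustrative data: a double integrator, steered from x0 to xf over [0, tf]
A = np.array([[0.0, 1.0], [0.0, 0.0]])
b = np.array([[0.0], [1.0]])
x0 = np.array([1.0, 0.0])
xf = np.array([0.0, 1.0])
tf = 1.0

# controllability Gramian (2.4) by trapezoidal quadrature
taus = np.linspace(0.0, tf, 2001)
phis = np.stack([expm(A * t) @ b for t in taus])        # e^{A tau} b
Wc = trapezoid(phis @ phis.transpose(0, 2, 1), taus, axis=0)

# steering input (2.3)
w = np.linalg.solve(Wc, expm(A * tf) @ x0 - xf)
u = lambda t: (-b.T @ expm(A.T * (tf - t)) @ w).item()

# simulating x_dot = A x + b u(t) drives the state onto xf at t = tf
sol = solve_ivp(lambda t, x: A @ x + b.ravel() * u(t), (0.0, tf), x0,
                rtol=1e-9, atol=1e-12)
```

The final state sol.y[:, -1] agrees with xf up to quadrature and integration tolerances, which illustrates that positive definiteness of Wc(tf) is all that the construction requires.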
We will now see that we can check whether a given system is controllable without com-
puting the controllability Gramian (2.4). An equivalent condition for controllability is
provided by the following theorem.
2.2 The Controllability Matrix
Theorem 2.2
The controllability Gramian Wc(t) is positive definite for all t > 0 if and only if the
controllability matrix

C(A, b) = [b  Ab  A²b  ...  A^{n−1}b]        (2.5)

has full rank n.
Proof
First assume that Wc(t) > 0 for all t > 0 but rank C(A, b) < n. Then there exists a vector
q ≠ 0 such that

qᵀ A^i b = 0,   i = 0, 1, . . . , n − 1

By the Cayley-Hamilton Theorem this extends to all powers i ≥ 0, and expanding the
matrix exponential as a power series then gives

qᵀ e^{At} b = 0

for all t > 0, or equivalently qᵀ Wc(t) = 0 for all t > 0. This contradicts the assumption,
therefore rank C(A, b) = n.
Conversely, assume that rank C(A, b) = n but Wc(tf) is singular for some tf. Then there
exists a vector q ≠ 0 such that

qᵀ e^{At} b = 0,   0 ≤ t ≤ tf

Setting t = 0 gives qᵀ b = 0, and differentiating repeatedly with respect to t and evaluating
at t = 0 gives

qᵀ A^i b = 0,   i = 1, 2, . . .
This implies
q T [b Ab A2 b . . . An−1 b] = 0
which contradicts the assumption that C(A, b) has full rank. Therefore, Wc (t) must be
non-singular for any t > 0. This completes the proof.
Corollary 2.1
The system (2.1) is controllable if and only if the controllability matrix C(A, b) has full
rank.
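Corollary 2.1 is straightforward to apply numerically. A minimal sketch in Python (the diagonal system is an illustrative example; its third state cannot be influenced because the last entry of b is zero):

```python
import numpy as np

def ctrb(A, b):
    """Controllability matrix C(A, b) = [b, Ab, ..., A^{n-1}b] as in (2.5)."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ b for k in range(n)])

A = np.diag([-1.0, -2.0, -3.0])
b = np.array([[1.0], [1.0], [0.0]])
rank = np.linalg.matrix_rank(ctrb(A, b))   # 2 < n = 3: uncontrollable
```

The same rank test with a nonzero third entry of b would return the full rank n = 3.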
It was shown in Chapter 1 that state feedback can be used to change the poles of a system
(which turned out to be eigenvalues of the system matrix), and that the location of the
closed-loop eigenvalues depends on the choice of the feedback gain vector f . An important
question is whether for any given choice of pole locations - constrained of course by the
fact that complex eigenvalues of a real matrix are symmetric about the real axis - there
exists a feedback gain vector f that achieves the desired closed-loop poles. We will now
show that this is indeed the case if the system is controllable.
Consider again the system (2.1) with x(0) = 0. Assume the control input is taken to be
u(t) = f x(t) + uv (t)
leading to the closed-loop system
ẋ(t) = (A + bf )x(t) + buv (t)
Let
a(s) = det(sI − A) = sn + an−1 sn−1 + . . . + a0
denote the open-loop characteristic polynomial, and
ā(s) = det(sI − A − bf ) = sn + ān−1 sn−1 + . . . + ā0 (2.6)
the closed-loop characteristic polynomial. The question is whether for any choice of
closed-loop eigenvalues - equivalently for any polynomial ā(s) - it is possible to find a gain
vector f that satisfies (2.6).
To answer this question, we investigate the closed-loop transfer function from uv to y.
The open-loop transfer function from u to y is
G(s) = b(s)/a(s) = c(sI − A)⁻¹ b
From Fig. 2.1 - where we introduced a new signal z(t) - we find that
Z(s)/U(s) = f(sI − A)⁻¹ b = m(s)/a(s)        (2.7)
where m(s) denotes the numerator polynomial of the transfer function from u to z.
Now closing the loop, we have
Z(s)/Uv(s) = [m(s)/a(s)] / [1 − m(s)/a(s)] = m(s)/(a(s) − m(s))
and the closed-loop transfer function is
Y(s)/Uv(s) = [Y(s)/U(s)] · [U(s)/Z(s)] · [Z(s)/Uv(s)] = b(s)/(a(s) − m(s))
[Figure 2.1: block diagram of the state feedback loop with input uv and the internal signal z(t) = f x(t)]
Theorem 2.3
Let Ta denote the Toeplitz matrix on the right hand side, and note that the second factor
on the right hand side is the controllability matrix C(A, b). Since Ta is invertible, we can
solve for the desired gain vector f if and only if C(A, b) has full rank. In this case
Theorem 2.4
The eigenvalues of the system (2.1) can be placed at arbitrary locations by state feedback
if and only if (A, b) is controllable.
If the system is controllable, equation (2.9) - which is known as the Bass-Gura formula - can
be used to compute the state feedback gain required to assign the desired eigenvalues.
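As a sketch of how such a computation might look, the Python code below places the closed-loop eigenvalues by transforming to controller canonical form, which is the mechanism behind the Bass-Gura formula; it is an illustration, not a reproduction of equation (2.9), and the double-integrator example is an assumption:

```python
import numpy as np

def place_bass_gura(A, b, poles):
    """Pole placement in the spirit of the Bass-Gura formula: returns f
    such that eig(A + b f) are the desired poles, via transformation to
    controller canonical form. Assumes (A, b) controllable, single input."""
    n = A.shape[0]
    a = np.poly(A)          # open-loop char. polynomial [1, a_{n-1}, ..., a_0]
    abar = np.poly(poles)   # desired char. polynomial
    # companion (controller canonical form) matrices for a(s)
    Ac = np.zeros((n, n))
    Ac[:-1, 1:] = np.eye(n - 1)
    Ac[-1, :] = -a[1:][::-1]
    bc = np.zeros((n, 1)); bc[-1, 0] = 1.0
    ctrb = lambda A_, b_: np.hstack([np.linalg.matrix_power(A_, k) @ b_
                                     for k in range(n)])
    T = ctrb(A, b) @ np.linalg.inv(ctrb(Ac, bc))   # x = T x_bar
    fc = (a[1:] - abar[1:])[::-1]                  # gain in canonical coordinates
    return (fc @ np.linalg.inv(T)).reshape(1, -1)

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (illustrative)
b = np.array([[0.0], [1.0]])
f = place_bass_gura(A, b, [-1.0, -2.0])
# eig(A + b f) = {-1, -2}
```

In the canonical coordinates the gain simply shifts the coefficients of a(s) to those of ā(s), which is exactly the mechanism used in the derivation above.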
We derived two equivalent tests for the controllability of a system - checking the rank
either of the controllability Gramian or of the controllability matrix. Now we will address
the question of what can be said about a system if it fails these rank tests. It is often
helpful for gaining insight if a state space model is in diagonal form. Consider a model
with state equation
⎡ẋ1⎤   ⎡λ1  0   0 ⎤ ⎡x1⎤   ⎡b1⎤
⎢ẋ2⎥ = ⎢0   λ2  0 ⎥ ⎢x2⎥ + ⎢b2⎥ u
⎣ẋ3⎦   ⎣0   0   λ3⎦ ⎣x3⎦   ⎣0 ⎦
In this diagonal form the state variables are decoupled, and because the last element in
b is zero, there is no way to influence the solution x3 (t) via the control input u(t). The
system is clearly uncontrollable in the sense of Definition 2.1, because we cannot take x3
to any desired value at a given time. On the other hand, it may well be possible to take
x1 and x2 to any desired value - this is indeed the case if λ1 6= λ2 and b1 , b2 are non-zero.
This example illustrates that when a system is uncontrollable, it is often of interest to
identify a controllable subsystem. The following Theorem suggests a way of doing this.
Theorem 2.5
Consider the state space model (2.1), and assume that rank C(A, b) = r < n. Then there
exists a similarity transformation
x = Tc x̄,   x̄ = ⎡x̄c⎤
                 ⎣x̄c̄⎦

such that

⎡x̄˙c⎤   ⎡Āc  Ā12⎤ ⎡x̄c⎤   ⎡b̄c⎤                        ⎡x̄c⎤
⎣x̄˙c̄⎦ = ⎣0    Āc̄ ⎦ ⎣x̄c̄⎦ + ⎣0 ⎦ u,   y = [c̄c  c̄c̄] ⎣x̄c̄⎦        (2.10)
with Āc ∈ IRr×r and (Āc, b̄c) controllable. Moreover, the transfer function of the system is

c(sI − A)⁻¹ b = c̄c (sI − Āc)⁻¹ b̄c

i.e. it is determined entirely by the controllable subsystem.
Proof
Let (Ā, b̄, c̄) denote the transformed model (2.10); then b̄ = Tc⁻¹ b and Ā^i b̄ = (Tc⁻¹ A Tc)^i Tc⁻¹ b = Tc⁻¹ A^i b.
Thus C(Ā, b̄) = Tc⁻¹ C(A, b). This shows that rank C(Ā, b̄) = rank C(A, b) = r. The
controllability matrix C(Ā, b̄) has the form

C(Ā, b̄) = ⎡b̄c   Āc b̄c   ...   Āc^{n−1} b̄c⎤
           ⎣0     0       ...   0           ⎦
The first r columns are linearly independent; to see this, note that for each k ≥ r, Ākc is
a linear combination of Āc^i, 0 ≤ i < r, by the Cayley-Hamilton Theorem. Therefore, choose
the transformation matrix as

Tc = [b  Ab  ...  A^{r−1}b  q_{r+1}  ...  q_n]

where the first r columns are the linearly independent columns of C(A, b), and q_{r+1}, ..., q_n
are any n − r linearly independent vectors such that Tc is nonsingular. To verify that this
This means that A^s b can be expressed as a linear combination of terms with lower powers
of A. Multiplying this relation by A shows that A^{s+1} b can be expressed as a linear
combination of terms with powers of A up to s; and since A^s b is itself a linear combination
of lower powers, A^{s+1} b can in fact be expressed with powers of A less than s. Continuing
in the same way, all columns A^{s+2} b, ..., A^{n−1} b can be expressed as linear combinations
of columns with powers of A less than s as well. But then rank C(A, b) = s < r, which
contradicts our assumption. Similarly, we have

b = Tc b̄ = [b  Ab  ...] [1  0  ...  0]ᵀ

i.e. b̄ = [1  0  ...  0]ᵀ.
The last statement of the Theorem is easily verified by computing the transfer function
of the model (Ā, b̄, c̄) in (2.10). This completes the proof.
The transformation matrix Tc used in this proof is not the best choice from a numerical
point of view. A numerically reliable way of constructing a transformation matrix Tc is
to use QR factorization: if C = QR is a QR factorization of C, then Tc = Q.
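The QR-based construction can be illustrated numerically. In the sketch below (Python; the rotated block-triangular pair is a constructed example, not taken from the notes), transforming with the orthogonal factor Q recovers the block structure of (2.10):

```python
import numpy as np

def ctrb(A, b):
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ b for k in range(n)])

# construct an uncontrollable pair by rotating a model already in the
# form (2.10): the mode at -2 is uncontrollable (illustrative numbers)
th = 0.3
V = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
Abar = np.array([[-1.0, 1.0], [0.0, -2.0]])
bbar = np.array([[1.0], [0.0]])
A, b = V @ Abar @ V.T, V @ bbar

# QR factorization of the controllability matrix gives Tc = Q
Q, R = np.linalg.qr(ctrb(A, b))
At, bt = Q.T @ A @ Q, Q.T @ b
# At is block upper triangular (At[1, 0] is numerically zero) and
# bt has the form [b_c; 0], as in (2.10)
```

Because Q is orthogonal, this transformation is numerically well conditioned, which is the point of preferring it over the raw columns of C(A, b).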
The fact that C(Ā, b̄) = Tc−1 C(A, b) was used above to show that the subsystem with r
state variables is controllable. An important observation is that this equation holds for
any transformation T . In particular, if the state space model (2.1) is controllable and
T is a transformation to (Ã, b̃, c̃), then we have C(Ã, b̃) = T −1 C(A, b) and therefore rank
C(Ã, b̃) = rank C(A, b) = n. This proves the following:
Theorem 2.6
If T is a similarity transformation taking (A, b) to (Ã, b̃) = (T⁻¹AT, T⁻¹b), then (Ã, b̃) is
controllable if and only if (A, b) is controllable.
This result shows that controllability is not a property of a particular state space realiza-
tion, but a property of a system which is independent of the coordinate basis.
Example 2.2
and rank C = 1. To bring this system in the form of (2.10), we construct the transforma-
tion matrix Tc = [t1 t2 ] by taking t1 = b and choosing t2 orthogonal to t1 . Thus
Tc = ⎡1   1  ⎤    and    Tc⁻¹ = ⎡0.2   0.4⎤
     ⎣2  −0.5⎦                  ⎣0.8  −0.4⎦
Controllable Subspace
If a system is not controllable, one might ask which parts of the state space can be reached
by the state vector, starting from the origin. Assuming x(0) = 0, we have
x(t) = ∫₀^t e^{A(t−τ)} b u(τ) dτ
     = ∫₀^t (α0(t−τ) I + α1(t−τ) A + ... + α_{n−1}(t−τ) A^{n−1}) b u(τ) dτ
where Theorem 1.2 has been used in the last equation. The expression inside the integral
can be rearranged as
x(t) = [b  Ab  ...  A^{n−1}b] ∫₀^t ⎡α0(t − τ)    ⎤
                                   ⎢    ⋮        ⎥ u(τ) dτ
                                   ⎣α_{n−1}(t − τ)⎦
which can also be written as
x(t) = [b  Ab  ...  A^{n−1}b] ⎡β0(t)    ⎤
                              ⎢   ⋮     ⎥
                              ⎣β_{n−1}(t)⎦

where

βi(t) = ∫₀^t αi(t − τ) u(τ) dτ
Observing that the matrix on the right hand side is the controllability matrix C, we
conclude that the state vector x(t) can only take values that are linear combinations of
the columns of C, i.e.
x(t) ∈ R(C)
Here R(·) denotes the column space of a matrix. Thus, the part of the state space that
is reachable from the origin is precisely the column space of the controllability matrix.
This space is called the controllable subspace. For a controllable system, the controllable
subspace is the entire state space.
Returning to Example 2.2, the controllable subspace is spanned by the vector [1 2]T , it
is a line through the origin with slope 2.
Stabilizability
The state variables of the uncontrollable subsystem (Āc̄ , 0, c̄c̄ ) cannot be influenced through
the system input, but they can have an effect on the system output through c̄c̄ . It is clear
that if Āc̄ has eigenvalues in the right half plane, then there is no way to stabilize the
system by state feedback. This motivates the following definition.
Definition 2.2
The system with state space realization (2.1) is said to be stabilizable if there exists a
state feedback law u(t) = f x(t) such that the resulting system is stable.
If (2.1) is uncontrollable and (2.10) is another realization of the same system, then the
system is stabilizable if and only if Āc̄ has no eigenvalues in the right half plane.
The decomposition of a state space model into a controllable and an uncontrollable sub-
system is shown in the form of a block diagram in Fig. 2.2.
[Figure 2.2: decomposition into the controllable subsystem (Āc, b̄c, c̄c) and the uncontrollable subsystem (Āc̄, 0, c̄c̄), coupled through Ā12]
Theorem 2.7 The system (2.1) is controllable if and only if the matrix

[sI − A   b]        (2.11)

has full row rank for all s ∈ C.
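Theorem 2.7 suggests a simple numerical test: since sI − A is invertible whenever s is not an eigenvalue of A, the rank of [sI − A  b] can only drop at eigenvalues of A, so it suffices to check those. A sketch in Python (the diagonal system is an illustrative example):

```python
import numpy as np

def pbh_controllable(A, b, tol=1e-9):
    """PBH test: (A, b) is controllable iff rank [sI - A, b] = n for all s.
    The rank can only drop at eigenvalues of A, so testing those suffices."""
    n = A.shape[0]
    for s in np.linalg.eigvals(A):
        M = np.hstack([s * np.eye(n) - A, b])
        if np.linalg.matrix_rank(M, tol) < n:
            return False
    return True

A = np.diag([-1.0, -2.0])
ok1 = pbh_controllable(A, np.array([[1.0], [1.0]]))   # controllable
ok2 = pbh_controllable(A, np.array([[1.0], [0.0]]))   # mode at -2 fails the test
```

This is also the content of Problem 2.5 below.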
Exercises
Problem 2.1
d) Calculate the Laplace transform of the response x(t) to initial values [x10 x20 ]T with
the input u(t) = 0. Describe the relationship between initial values required to
ensure that the system eventually reaches equilibrium at [0 0]T . Compare it to the
answer of part (c) and explain whether there can be any relationship between them
or not.
f) Does the transfer function fully describe the dynamic behaviour of the system? If
not, why not?
Problem 2.2
This problem shows how the limit of the controllability Gramian as t → ∞ can be
represented as the solution of an algebraic matrix equation.
For the stable system
ẋ(t) = Ax(t) + bu(t)
The limiting value of the controllability Gramian is

Wc = lim_{t→∞} Wc(t)
c) Show that as t → ∞ the controllability Gramian Wc (t) satisfies the algebraic equa-
tion
AWc + Wc AT + bbT = 0
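A sketch of part c) in Python, using SciPy's Lyapunov solver (the stable second-order system is an illustrative choice, not part of the problem statement):

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov
from scipy.integrate import trapezoid

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # stable: eigenvalues -1, -2
b = np.array([[0.0], [1.0]])

# limiting Gramian from the algebraic equation A Wc + Wc A^T + b b^T = 0
# (solve_continuous_lyapunov solves A X + X A^T = Q, so Q = -b b^T)
Wc = solve_continuous_lyapunov(A, -b @ b.T)

# cross-check against the integral definition (2.4), truncated at t = 20
taus = np.linspace(0.0, 20.0, 4001)
phis = np.stack([expm(A * t) @ b for t in taus])
Wc_int = trapezoid(phis @ phis.transpose(0, 2, 1), taus, axis=0)
# Wc is positive definite and matches Wc_int to quadrature accuracy
```

For this example the exact limiting Gramian is diag(1/12, 1/6), which the solver reproduces.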
Problem 2.3
Problem 2.4
Problem 2.5
qᵀA = λqᵀ   and   qᵀb = 0
b) Explain why the system is uncontrollable if such a q exists. Then show that the
converse is also true.
c) Show that

rank [sI − A   b] < n

for some s ∈ C if and only if such a q exists.

d) Use the above results to show that the system is controllable if and only if [sI − A  b]
has full row rank for all s ∈ C (PBH test).
Problem 2.6
Consider the system in Figure 2.3. The equations of motion of this system are
M v̇ = −mgθ1 − mgθ2 + u
m(v̇ + li θ̈i ) = mgθi , i = 1, 2
where v(t) is the speed of the cart and u(t) is a force applied to the cart. Note that these
equations represent the system dynamics only in a small neighborhood of the equilibrium.
[Figure 2.3: two pendulums of mass m and lengths l1, l2 with angles θ1, θ2, mounted on a cart of mass M driven by the force u]
a) Determine a state space model of this system with state variables θ1 , θ2 , θ̇1 and θ̇2 .
b) What happens to the last 2 rows of A and b if the pendulums have the same length?
c) Assume now that the pendulums have the same length. Show that the controllability
matrix of the system can be written in the form:
C = ⎡0   b1  0   b̃1⎤
    ⎢0   b1  0   b̃1⎥
    ⎢b1  0   b̃1  0 ⎥
    ⎣b1  0   b̃1  0 ⎦
Show then that the controllable subspace is defined by θ1 = θ2 and θ̇1 = θ̇2 .
Problem 2.7
a) For the system in Problem 2.6, generate a state space model in Matlab with the
following parameter values: g = 10, M = 10, m = 1.0, l1 = 1.0, l2 = 1.5.
b) Use Matlab to design a state feedback controller with the following properties
Problem 2.8
In this problem it is demonstrated for a 3rd order system, that a similarity transformation
matrix T exists that takes a state space realization into controller canonical form if and
only if (A, b) is controllable.
exists, where ti are the columns of T , and that the resulting controller form is
(Ac , bc , cc ). Use the structure of the controller form to show that t3 = b.
t1 = A²b + a2 Ab + a1 b
t2 = Ab + a2 b
where a0 , a1 and a2 are the coefficients of the characteristic polynomial of A.
c) Show that
T = C ⎡a1  a2  1⎤
      ⎢a2  1   0⎥
      ⎣1   0   0⎦
where C is the controllability matrix of (A, b).
Problem 2.9
Find a state space model for the system in Figure 2.4. Calculate its controllability matrix
and its transfer function.
[Figure 2.4: block diagram with two integrators with outputs x1, x2, feedback gains −a0 and −a1, input u, and output y formed from x1 and x2 through the gains g1 and g2]
Problem 2.10

Show that if the system (2.1) is controllable, then the closed-loop system obtained with
the state feedback law u = f x + uv,

ẋ = (A + bf)x + buv

is still controllable.
Hint: Use the fact that only controllable systems can have state space models in controller
canonical form.
Chapter 3

Observability and State Estimation
The discussion of state feedback in the previous chapter assumed that all state variables
are available for feedback. This is an unrealistic assumption and in practice rarely the
case. A large number of sensors would be required, while it is known from classical control
theory that efficient control loops can be designed that make use of a much smaller number
of measured feedback signals (often just one). However, the idea of state feedback can still
be used even if not all state variables are measured: the measurements of state variables
can be replaced by estimates of their values. In this chapter we discuss the concept of
state estimation via observers. Moreover, after discussing controllability we introduce the
second of two important properties of a linear system: a system is called observable if the
values of its state variables can be uniquely determined from its input and output signals.
It turns out that the problem of designing a state estimator has the same mathematical
structure as the problem of designing a state feedback controller; for this reason they are
called dual problems.
Since the plant models we encounter are usually strictly proper, we will limit the discussion
to strictly proper systems

ẋ(t) = Ax(t) + bu(t),   y(t) = cx(t)        (3.1)
Assume it is desired to change the eigenvalues of the system to improve its dynamic
properties, but only the input signal u(t) and the output signal y(t) are available for
feedback. The idea of state feedback discussed in the previous chapter can still be used
if we can obtain an estimate of the state vector x(t). One possible approach is indicated
in Fig. 3.1. As part of the controller, we could simulate the system using the same state
space model as in (3.1), and apply the same input u(t) to the simulation model that is
applied to the actual plant. Provided that the initial values of the state variables (the
integrator outputs) are known, and that the simulation model is an exact replication of
the actual system, the estimated state vector x̂(t) will track the true state vector exactly.
This estimated state vector could then be used to implement a state feedback controller.
[Figure 3.1: open-loop observer: a copy of the plant model, driven by the same input u as the plant, produces the estimates x̂ and ŷ]
Unfortunately, the initial state values are in general not known, and for this reason the
scheme in Fig. 3.1 is impractical. The estimated state vector x̂(t) is governed by
x̂˙(t) = Ax̂(t) + bu(t),   x̂(0) = x̂0
and subtracting this from (3.1) shows that the dynamics of the estimation error x̃(t) =
x(t) − x̂(t) are determined by
x̃˙(t) = Ax̃(t),   x̃(0) = x(0) − x̂(0)
In general x̃(0) 6= 0, and it will depend on the eigenvalues (λ1 , . . . , λn ) of A how fast (or
if at all) the error will go to zero: a partial fraction expansion of X̃(s) = (sI − A)−1 x̃(0)
shows that
x̃(t) = φ1 eλ1 t + . . . + φn eλn t
where φi is a column vector that depends on the residue at λi . If A has eigenvalues close
to the imaginary axis, error decay will be slow, and if the plant is unstable, the error will
become infinite.
[Figure 3.2: observer: a copy of the plant model whose integrators are corrected by feeding the output estimation error back through the gain l]
x̂˙(t) = Ax̂(t) + lc(x̂(t) − x(t)) + bu(t)        (3.2)

Subtracting (3.2) from the state equation of (3.1) gives

ẋ(t) − x̂˙(t) = A(x(t) − x̂(t)) + lc(x(t) − x̂(t))

or

x̃˙(t) = (A + lc) x̃(t),   x̃(0) = x(0) − x̂(0)        (3.3)
The error dynamics are now determined by the eigenvalues of A + lc. A and c are given
plant data, but the gain vector l can be chosen freely to obtain desired eigenvalues. The
situation closely resembles pole placement via state feedback, where a gain (row) vector
f is chosen to place the eigenvalues of A + bf in desired locations of the complex plane.
The only difference is that f is a right factor in the product bf , whereas l is a left factor
in the product lc. However, observing that the eigenvalues do not change if a matrix
is transposed, we can equivalently consider the problem of choosing a row vector lT to
obtain desired eigenvalues of (A + lc)T = AT + cT lT . This problem has exactly the same
form as the problem of pole placement by state feedback if we make the replacements

A → Aᵀ,   b → cᵀ,   f → lᵀ

From the results of Chapter 2, the eigenvalues of Aᵀ + cᵀlᵀ can then be placed arbitrarily
if and only if the controllability matrix C(Aᵀ, cᵀ)
has full rank. We will call the transpose of this matrix the observability matrix of the
model (3.1)
O(c, A) = Cᵀ(Aᵀ, cᵀ) = ⎡c       ⎤
                       ⎢cA      ⎥        (3.4)
                       ⎢  ⋮     ⎥
                       ⎣cA^{n−1}⎦
The Theorem below follows now simply from the fact that the pole placement problem
and the state estimation problem have the same mathematical form.
Theorem 3.1
The eigenvalues of the state estimator (3.2) for the system (3.1) can be placed at arbitrary
locations by a suitable choice of the estimator gain vector l if and only if the observability
matrix O(c, A) has full rank.
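A sketch of this duality in Python: the observer gain is obtained by pole placement for the pair (Aᵀ, cᵀ) using SciPy's place_poles; the second-order plant and the estimator poles are illustrative assumptions:

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0], [2.0, -1.0]])   # unstable plant (illustrative)
c = np.array([[1.0, 0.0]])

# place_poles returns K with eig(A^T - c^T K) = poles; since
# (A + l c)^T = A^T + c^T l^T, choosing l = -K^T gives eig(A + l c) = poles
poles = [-4.0, -5.0]
K = place_poles(A.T, c.T, poles).gain_matrix
l = -K.T
# eig(A + l @ c) = {-4, -5}
```

The same routine, applied directly to (A, b), computes state feedback gains; the sign conventions differ from the notes (which use A + bf and A + lc rather than A − BK), hence the minus sign.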
Pole placement and state estimation are said to be dual problems. For each result about
state feedback in the last chapter, we will obtain an equivalent result about state estima-
tion by invoking the duality between both problems, i.e. by making suitable replacements.
Comparing both problems, we recall that rank C(A, b) = n is a necessary and sufficient
condition for a system to be controllable, and that controllability is an important prop-
erty of a system that determines whether the poles can be shifted via state feedback to
arbitrary locations. This raises the question of what the property rank O(c, A) = n tells
us about a system. The following definition will help to clarify this.
Definition 3.1
The system with state space model (3.1) is said to be observable if for any tf > 0 the
initial state x(0) can be uniquely determined from the time history of the input u(t) and
the output y(t) in the time interval 0 ≤ t ≤ tf . Otherwise, the system is said to be
unobservable.
Note that - given the time history of u(t) - knowledge of x(0) is all that is needed to
uniquely determine the state vector x(t) by solving the state equation.
Theorem 3.2
The system (3.1) is observable if and only if the observability matrix O(c, A) has full rank.
Proof
First we show that rank O(c, A) = n implies observability. For 0 ≤ t ≤ tf we have
y(t) = c e^{At} x(0) + ∫₀^t c e^{A(t−τ)} b u(τ) dτ
Since u(t) is known, the second term on the right hand side can be computed and sub-
tracted on both sides of the equation. We can therefore replace the above by an equivalent
system with a modified output and assume without loss of generality that u(t) = 0, t > 0.
We thus consider the zero-input response

y(t) = c e^{At} x(0)
At t = 0 we have y(0) = cx(0), and taking derivatives we obtain ẏ(0) = cAx(0), ÿ(0) =
cA²x(0) etc., and thus

⎡y(0)        ⎤   ⎡c       ⎤
⎢ẏ(0)        ⎥   ⎢cA      ⎥
⎢ÿ(0)        ⎥ = ⎢cA²     ⎥ x(0)        (3.5)
⎢    ⋮       ⎥   ⎢  ⋮     ⎥
⎣y^{(n−1)}(0)⎦   ⎣cA^{n−1}⎦
This equation can be uniquely solved for x(0) if the observability matrix has full rank.
This proves the first claim.
To show that observability implies rank O(c, A) = n, assume that the system is observable
but rank O(c, A) < n. Then there exists a column vector x0 6= 0 such that O(c, A)x0 = 0,
this implies cAi x0 = 0 for i = 0, 1, . . . , n−1. Now assume that x(0) = x0 , then by Theorem
1.2 we have y(t) = ceAt x0 = 0. Therefore, the initial state cannot be determined from the
output y(t) = 0 and the system is unobservable by Definition 3.1, which contradicts the
assumption. This completes the proof.
Having established the non-singularity of the observability matrix as a necessary and suf-
ficient condition for observability, we can now derive the following results from Theorems
2.1 and 2.2 simply by making the replacements A → AT and b → cT .
Theorem 3.3
To explore the analogy between controllability and observability further, we will now
establish a decomposition result for unobservable systems. Consider the example
⎡ẋ1⎤   ⎡λ1  0   0 ⎤ ⎡x1⎤   ⎡b1⎤                      ⎡x1⎤
⎢ẋ2⎥ = ⎢0   λ2  0 ⎥ ⎢x2⎥ + ⎢b2⎥ u,   y = [c1  c2  0] ⎢x2⎥
⎣ẋ3⎦   ⎣0   0   λ3⎦ ⎣x3⎦   ⎣b3⎦                      ⎣x3⎦
This system is not observable because the output y(t) is completely independent of the
state variable x3 (t). On the other hand, the initial values of the state variables x1 (t) and
x2 (t) can be determined from the input and output if λ1 6= λ2 and c1 , c2 are non-zero.
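The observability matrix (3.4) makes this concrete. A minimal Python sketch with illustrative numbers for the diagonal example above:

```python
import numpy as np

def obsv(c, A):
    """Observability matrix O(c, A) = [c; cA; ...; cA^{n-1}] as in (3.4)."""
    n = A.shape[0]
    return np.vstack([c @ np.linalg.matrix_power(A, k) for k in range(n)])

A = np.diag([-1.0, -2.0, -3.0])
c = np.array([[1.0, 1.0, 0.0]])    # x3 does not appear in the output
rank = np.linalg.matrix_rank(obsv(c, A))   # 2 < n = 3: unobservable
```

By duality, obsv(c, A) is exactly the transpose of the controllability matrix of (Aᵀ, cᵀ).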
Again, the question is how to identify observable subsystems in general. The answer is
given by the following dual version of Theorem 2.5.
Theorem 3.4
Consider the state space model (3.1) and assume that rank O(c, A) = r < n. Then there
exists a similarity transformation
x = To x̄,   x̄ = ⎡x̄o⎤
                 ⎣x̄ō⎦

such that

⎡x̄˙o⎤   ⎡Āo   0 ⎤ ⎡x̄o⎤   ⎡b̄o⎤                    ⎡x̄o⎤
⎣x̄˙ō⎦ = ⎣Ā21  Āō⎦ ⎣x̄ō⎦ + ⎣b̄ō⎦ u,   y = [c̄o  0] ⎣x̄ō⎦        (3.7)
with Āo ∈ IRr×r and (c̄o, Āo) observable. Moreover, the transfer function of the system is

c(sI − A)⁻¹ b = c̄o (sI − Āo)⁻¹ b̄o

i.e. it is determined entirely by the observable subsystem.
Theorem 3.5
and rank O = 1. To bring this system in the form of (3.7), we construct the transformation
matrix To⁻¹ by taking c as first row and choosing the second row orthogonal to c. Thus

To⁻¹ = ⎡1   1⎤    and    To = ⎡0.5   0.5⎤
       ⎣1  −1⎦                ⎣0.5  −0.5⎦
Observable Subspace
In the previous chapter we introduced the concept of a controllable subspace. As one
would expect, there is a dual concept associated with observability. We will illustrate it
with the above example. From (3.5) we have
⎡y(0)⎤
⎣ẏ(0)⎦ = O(c, A) x0
Assuming that

x0 = [1  1]ᵀ,   and u(t) = 0
we have x1(t) = x2(t) = e^{−t} and y(t) = x1(t) + x2(t) = 2e^{−t}. Pretending we do not know
the value of x0, we can try to use the above to find an estimate x̂0 of the initial state vector
from the observed output; for this purpose we write

Y = O x̂0        (3.8)

This equation has a solution x̂0 if and only if

Y ∈ R(O)
i.e. if Y is in the range space of the observability matrix. Of course, we know that in
general a solution must exist, because there must have been some initial value x0 of the
state vector at time t = 0 that generated the output data in Y (and for this example we
know the solution anyway). More insight is obtained by considering the null space of the
observability matrix
N (O) = {x : Ox = 0}
Detectability
We now define the dual concept to stabilizability.
Definition 3.2
The system with state space realization (3.1) is said to be detectable if there exists a gain
vector l such that all eigenvalues of A + lc are in the left half plane.
If (3.1) is unobservable and (3.7) is another realization of the same system, then the
system is detectable if and only if Āō has no eigenvalues in the right half plane. The
decomposition of a state space model into an observable and an unobservable subsystem
is shown in Fig. 3.3.
[Figure 3.3: decomposition into the observable subsystem (Āo, b̄o, c̄o) and the unobservable subsystem, coupled through Ā21]
Of the four subsystems, it is only the controllable and observable subsystem (Āco , b̄co , c̄co )
that determines the input-output behaviour of the system. A block diagram interpretation
of this decomposition is shown in Fig. 3.4; the dashed lines indicate interaction between
subsystems due to the non-zero off-diagonal blocks in Ā.
The transfer function of the realization (A, b, c) is of order n, whereas that of (Āco, b̄co, c̄co)
is of order r < n, where r is the dimension of x̄co. The fact that they are equal shows that
a realization may contain more state variables than are needed to represent its input-output
behaviour. This motivates the following definition.
Definition 3.3
A realization (A, b, c) is said to be a minimal realization if it has the smallest order, i.e.
the smallest number of state variables, among all realizations having the same transfer
function c(sI − A)−1 b.
From this definition and the above discussion of the Kalman decomposition, we conclude
the following.
Theorem 3.6
A realization (A, b, c) is minimal if and only if (A, b) is controllable and (c, A) is observable.
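Theorem 3.6 can be illustrated with a transfer function containing a pole-zero cancellation. In the Python sketch below, the controller canonical realization of the illustrative transfer function G(s) = (s + 1)/((s + 1)(s + 2)) is controllable but not observable, hence not minimal:

```python
import numpy as np

def ctrb(A, b):
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ b for k in range(n)])

def obsv(c, A):
    return ctrb(A.T, c.T).T   # duality, as in (3.4)

# controller canonical form of G(s) = (s + 1) / (s^2 + 3s + 2)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
b = np.array([[0.0], [1.0]])
c = np.array([[1.0, 1.0]])

rc = np.linalg.matrix_rank(ctrb(A, b))   # 2: controllable
ro = np.linalg.matrix_rank(obsv(c, A))   # 1: the cancelled mode at s = -1
                                         # is unobservable, so (A, b, c)
                                         # is not minimal
```

This is also the situation examined in Problem 3.4 below.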
Exercises
Problem 3.1
Problem 3.2
Consider the system in Figure 3.5. Here m is the mass of the pendulum, M is the mass
of the trolley, d the position of the trolley, θ the pendulum angle, F the coefficient of
friction, and L the distance from the center of gravity of the pendulum to the point of
connection to the trolley. With the state variables
xᵀ = [d   ḋ   d + Lθ   ḋ + Lθ̇]
the system dynamics - linearized about the upright pendulum position - can be described
by the state space model
ẋ = Ax + bu, y = cx
where
A = ⎡0     1     0    0⎤        ⎡0  ⎤
    ⎢0    −F/M   0    0⎥,   b = ⎢1/M⎥
    ⎢0     0     0    1⎥        ⎢0  ⎥
    ⎣−g/L  0    g/L   0⎦        ⎣0  ⎦
Note that this linearized model represents the system dynamics only in a small neighbor-
hood of the equilibrium.
a) If the angle θ is available as a measured signal, what is the output vector c of the
state space model?
[Figure 3.5: pendulum of mass m and length L mounted on a trolley of mass M; trolley position d, pendulum angle θ, input force u]
e) Show that the system is observable when the measured variable is d + Lθ instead
of θ.
⋆ Problem 3.3
Consider two minimal realisations of SISO systems S1 = (A1 ,b1 ,c1 ) and S2 = (A2 ,b2 ,c2 )
that represent the same transfer function, with similarity transforms Tc1 and Tc2 that
transform S1 and S2 respectively to the controller canonical form Sc .
b) Prove that if two minimal realisations represent the same transfer function, then a
matrix T exists which can transform S1 to S2 .
Problem 3.4
a) What does this pole-zero cancellation tell us about a state space realization of this
transfer function?
b) Construct the controller and the observer canonical forms corresponding to this
transfer function and discuss the controllability and observability of these two state
space models.
Chapter 4
Observer-Based Control
In this chapter we will combine the ideas of state feedback and state estimation to con-
struct a controller that uses only the measured plant output as feedback signal. We
investigate the dynamic properties of the closed-loop system, and we will see that the dy-
namics of state feedback control and of state estimation can be designed independently.
Tracking of reference inputs will also be discussed; this will lead us to a discussion of the
zeros of a state space model and how they are determined by the observer configuration.
Fig. 4.1 shows a state feedback loop where feedback of the state vector (see Fig. 1.6) has
been replaced by feedback of an estimated state vector. From the figure, it can be seen
that the dynamic behaviour of the closed-loop system is governed by the plant equation
ẋ = Ax + bf x̂ + buv , x(0) = x0
The state variables of the closed-loop system are the state variables of the plant and the
observer. Introducing the closed-loop state vector [xT x̂T ]T , the closed-loop state and
output equations can be written as
⎡ẋ ⎤   ⎡A     bf         ⎤ ⎡x⎤   ⎡b⎤        ⎡x(0)⎤   ⎡x0⎤
⎣x̂˙⎦ = ⎣−lc   A + bf + lc⎦ ⎣x̂⎦ + ⎣b⎦ uv,   ⎣x̂(0)⎦ = ⎣x̂0⎦

y = [c  0] ⎡x⎤
           ⎣x̂⎦
[Figure 4.1: observer-based state feedback: the observer estimate x̂ is fed back through the gain f, and u = f x̂ + uv drives the plant]
More insight into the structure of this system is gained by applying to this state space
model of the closed-loop system the similarity transformation
⎡x⎤       ⎡x⎤        ⎡I   0⎤
⎣x̂⎦ = T ⎣x̃⎦,   T = ⎣I  −I⎦
Under this transformation, the observer state vector x̂ is replaced by the estimation error
x̃, and the closed-loop state and output equations become
[ẋ; x̃˙] = [A + bf  −bf; 0  A + lc] [x; x̃] + [b; 0] uv ,    [x(0); x̃(0)] = [x0; x̃0]    (4.1)

y = [c 0] [x; x̃]
The block triangular form of the system matrix reveals that the eigenvalues of the closed-
loop system are the eigenvalues of (A + bf ) together with the eigenvalues of (A + lc). But
these are precisely the eigenvalues assigned by state feedback and the observer eigenvalues,
respectively. An important conclusion is that state feedback and state observer can be
designed independently; this fact is referred to as the separation principle.
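The separation principle is easy to verify numerically: the eigenvalues of the closed-loop matrix in the coordinates [xᵀ x̂ᵀ]ᵀ must be those of A + bf together with those of A + lc. A minimal sketch in Python (the system matrices and the gains f and l are assumed example values, not taken from the text):

```python
import numpy as np

# Assumed example data: any plant (A, b, c) with stabilizing gains f and l would do
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
b = np.array([[0.0], [1.0]])
c = np.array([[1.0, 0.0]])
f = np.array([[-4.0, -2.0]])      # state feedback gain, u = f x
l = np.array([[-14.0], [-35.0]])  # observer gain

# Closed-loop system matrix in the coordinates [x; xhat], as in the text
Acl = np.block([[A, b @ f], [-l @ c, A + b @ f + l @ c]])

eig_cl = np.sort_complex(np.linalg.eigvals(Acl))
eig_sep = np.sort_complex(np.concatenate([np.linalg.eigvals(A + b @ f),
                                          np.linalg.eigvals(A + l @ c)]))
print(np.allclose(eig_cl, eig_sep))  # True: the eigenvalue sets coincide
```

The similarity transformation used above leaves the eigenvalues unchanged, which is why the check can be carried out in the original coordinates.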
A second observation is that the state space model (4.1) has the same form as the control-
lable/uncontrollable decomposition of Theorem 2.5. The estimation error x̃ is uncontrol-
lable from the input uv , whereas the system state x is controllable if (A, b) is controllable,
because - as shown in Exercise 2.10 - the system (A + bf, b) is controllable if (A, b) is con-
trollable. The estimation error is observable: a non-zero initial error will have an effect
on the output. The closed-loop transfer function from uv to y is

Gcl(s) = c(sI − A − bf)^{-1} b
which is the same transfer function as that achieved with direct state feedback. This is of
course a consequence of the fact that the observer is not controllable from the input: if
the initial estimation error is zero, the estimated state vector used for feedback is identical
with the actual plant state vector.
4.2 Reference Tracking

When the idea of state feedback was introduced in Chapter 1, the objective was to modify
the dynamic properties of a system by moving its eigenvalues to desired locations. We
will now discuss how observer-based state feedback can be used to design a control system
for tracking a reference input. The question is how to introduce the reference input into
the loop. One possibility is shown in Fig. 4.2; this loop has the structure of a standard
control loop with unity feedback.
Figure 4.2: Tracking loop with the reference input applied at the control error e
An alternative way of introducing the reference input into the loop is shown in Fig. 4.3.
The difference between both configurations can be seen by considering a step function
r(t) = σ(t) as reference input. In both configurations we assume that the plant output
will not change instantaneously in response to a changing plant input. In Fig. 4.2, the
reference step will then result in a step change in the control error e, which in turn excites
the observer dynamics but not the plant, and leads to an estimation error instantaneously
after the reference step is applied. In contrast, in Fig. 4.3 plant and observer are excited
in exactly the same way; no estimation error is introduced by the reference step. We
would therefore expect that the configuration in Fig. 4.3 is superior to that in Fig. 4.2.
Figure 4.3: Tracking loop with the reference input applied to both plant and observer
y(t) = (k1 e^{p1 t} + k2 e^{p2 t} + k0 e^{s0 t}) σ(t) = ytrans(t) + yext(t)

where ytrans(t) = (k1 e^{p1 t} + k2 e^{p2 t}) σ(t) is the transient component of the response - determined by the system poles - and yext(t) = k0 e^{s0 t} σ(t) is the component of the response
that reflects the direct effect of the external input signal u(t).
A zero of the system (4.2) is a value of s for which G(s) = 0; in the above case there
are zeros at s = z1 and s = z2. In the time domain, another way of looking at these zeros
is to assume that s0 in the exponent of the above input signal is equal to one of the
zeros, i.e. u(t) = u0 e^{zi t} σ(t) where i = 1 or i = 2. In (4.3) the factor s − s0 = s − zi in
the denominator will then cancel, thus yext(t) = 0, t ≥ 0 and y(t) = ytrans(t) - only the
transient component will appear in the output.
Now assume that
ẋ = Ax + bu, x(0) = x0
y = cx + du (4.4)
is a state space realization of this system. Note that a non-zero feedthrough term du is
included in the output equation, reflecting the assumption that G(s) in (4.2) has an equal
number of poles and zeros; to consider direct feedthrough will turn out to be useful when
we discuss the closed-loop zeros of the control loops in Fig. 4.2 and Fig. 4.3. Applying
again the input u(t) = u0 e^{s0 t} σ(t) will give a response with Laplace transform

Y(s) = c Φ(s) x0 + (c Φ(s) b + d) u0/(s − s0)    (4.5)
where Φ(s) = (sI − A)^{-1}. The second term on the right hand side - the zero-initial-state
response - is precisely the response in (4.3), which can be decomposed into ytrans (t) and
yext (t). If s0 = zi , i = 1, 2, then with the above input we have yext (t) = 0, and the
output signal y(t) consists of the zero-input response and the transient component of the
zero-initial-state response. Compared with the transfer function response (4.3), there is
an additional term - the zero-input response cΦ(s)x0 - present in the output, and we could
ask the question whether it is possible that ytrans and cΦ(s)x0 cancel each other. In other
words: can the response to u(t) = u0 e^{zi t} σ(t) be made zero for all t ≥ 0 by a suitable
choice of x0 ?
The answer to this question is yes: if the initial state is

x0 = Φ(zi) b u0

the output (4.5) will vanish completely. To see this, substitute the above in (4.5) to get

Y(s) = c (Φ(s)Φ(zi) + Φ(s) 1/(s − zi)) b u0 + d u0 1/(s − zi)

Using the identity Φ(s)Φ(zi) = (Φ(zi) − Φ(s))/(s − zi), this reduces to Y(s) = (cΦ(zi)b + d) u0/(s − zi) = G(zi) u0/(s − zi), which is zero because zi is a zero of G(s).
Definition 4.1
The complex number z is a zero of the model (4.4) if and only if there exists an initial
state x0 such that the response to the input u(t) = u0 e^{zt} σ(t) is zero for all t ≥ 0.
From (4.7) we see that if z is a zero, and the input u(t) = u0 e^{zt} is applied for t ≥ 0, the
state vector will be x(t) = x0 e^{zt}, and for the output we have y(t) = 0. Substituting these
in (4.4) yields

ẋ = z x0 e^{zt} = A x0 e^{zt} + b u0 e^{zt}

and

y = c x0 e^{zt} + d u0 e^{zt} = 0

After dividing by e^{zt} ≠ 0, these two conditions can be written as

[zI − A  −b; c  d] [x0; u0] = 0

Therefore, z is a zero if and only if the matrix on the left hand side is singular.
Theorem 4.1
The complex number z is a zero of the system (4.4) if and only if

det [zI − A  −b; c  d] = 0
A value of s that satisfies the condition of this theorem is called an invariant zero of
the system, whereas a value of s such that G(s) = 0 is called a transmission zero. This
distinction will be taken up again in Definitions 5.5 and 5.6. One can show that if a
state space realization is minimal, invariant and transmission zeros are the same, but if
a realization has uncontrollable or unobservable modes, then there will be invariant zeros
that do not appear as transmission zeros.
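The singularity condition above can be evaluated numerically as a generalized eigenvalue problem: det [zI − A  −b; c  d] = det(zE − F) with E = [I 0; 0 0] and F = [A b; −c −d]. A sketch in Python (the plant data of Problem 4.3 is used here as an assumed example):

```python
import numpy as np
from scipy.linalg import eigvals

# Invariant zeros as generalized eigenvalues: det([zI - A, -b; c, d]) = det(z*E - F)
# with E = [[I, 0], [0, 0]] and F = [[A, b], [-c, -d]].
A = np.array([[-2.0, 1.0], [0.0, -3.0]])   # example plant (the one of Problem 4.3)
b = np.array([[1.0], [1.0]])
c = np.array([[1.0, 3.0]])
d = np.array([[0.0]])

n = A.shape[0]
F = np.block([[A, b], [-c, -d]])
E = np.block([[np.eye(n), np.zeros((n, 1))],
              [np.zeros((1, n + 1))]])

w = eigvals(F, E)
zeros = w[np.isfinite(w)]   # generalized eigenvalues at infinity are discarded
print(zeros)
```

For this plant det [zI − A  −b; c  d] = 4z + 10, so the single invariant (and transmission) zero lies at s = −2.5.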
4.3 Closed-Loop Transfer Function

Returning to the reference tracking problem, we can now use Theorem 4.1 to study the
closed-loop zeros of the configurations in Fig. 4.2 and Fig. 4.3. We will do this however in
a more general framework. The controller is determined by the state feedback gain vector
f and the observer with gain vector l. Referring to Fig. 4.1, a state space model of the
controller with input y and output u is
x̂˙ = (A + bf + lc)x̂ − ly
u = f x̂
The most general way of introducing a reference input signal r into the control loop is to
add r to both the state equation and the output equation of the controller; this leads to
x̂˙ = (A + bf + lc)x̂ − ly + wr
u = f x̂ + vr (4.8)
Fig. 4.4 shows the resulting control loop. Here w and v are a gain vector and a constant,
respectively, that can be chosen by the designer. The configurations in Fig. 4.2 and Fig.
4.3 can now be obtained as special cases, see Exercise 4.3 and the discussion below.
Figure 4.4: General configuration for introducing the reference input through the gains w and v
It is clear that the choice of w and v has no influence on the closed-loop eigenvalues, but
after the discussion above we expect it to have an effect on the closed-loop zeros. To
study this effect, we first observe that the control loop can be redrawn as in Fig. 4.5, and
that any zero of the controller, i.e. any zero from r to u will also be a zero from r to y
unless it is cancelled by a pole of the plant. The closed-loop zeros are therefore - if no
pole-zero cancellation occurs - the controller zeros together with the plant zeros.
We can now use Theorem 4.1 to find the zeros of the controller. With the state space
model (4.8) of the controller, and assuming y = 0 (because we are interested in the
dynamic behaviour from r to u), we obtain as condition for s to be a zero of the controller
- and therefore of the closed loop
det [sI − A − bf − lc  −w; f  v] = 0
Figure 4.5: The control loop redrawn as a series connection of controller and plant
The roots of the determinant polynomial are not changed if we perform row or column
operations on the matrix, so we can divide the last column by v and - after multiplying
from the right by f - subtract it from the first block column. This yields
det [sI − A − bf − lc + (1/v) wf  −(1/v) w; 0  1] = 0
or
γ(s) = det(sI − A − bf − lc + (1/v) wf) = 0    (4.9)
One can then show that the closed-loop transfer function from r to y is

Gcl(s) = Y(s)/R(s) = K γ(s) b(s) / (ā(s) āe(s))    (4.10)
where γ(s) is the polynomial in (4.9), K a constant gain, ā(s) = det(sI − A − bf ) the
characteristic polynomial of the system under direct state feedback and āe (s) = det(sI −
A − lc) the observer characteristic polynomial. This explains why the control loop in
Fig. 4.3 is superior to that in Fig. 4.2: in Fig. 4.3 we have w = vb, thus γ(s) = āe (s)
and the observer dynamics are cancelled in the closed-loop transfer function. In fact the
closed-loop system in Fig. 4.3 is equivalent to that in Fig. 4.1 if we take uv = vr, and the pole-
zero cancellation in (4.10) is equivalent to the fact observed earlier that the estimation
error in Fig. 4.1 is uncontrollable from the input uv. Note that the static gain from r to
y in Fig. 4.3 is −v c(A + bf)^{-1} b; to achieve a zero steady-state tracking error one should
therefore choose

v = −1 / (c(A + bf)^{-1} b).
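These formulas can be checked numerically. The sketch below (Python with scipy; the plant and pole locations are those of Problem 4.3, used here as an assumed example) builds the loop of Fig. 4.3, i.e. w = vb, and confirms that the static gain from r to y is one:

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[-2.0, 1.0], [0.0, -3.0]])
b = np.array([[1.0], [1.0]])
c = np.array([[1.0, 3.0]])

# place_poles returns K with eig(A - bK) at the desired poles; the text writes
# u = f x with eig(A + bf), hence f = -K. The observer gain follows by duality.
f = -place_poles(A, b, [-3 + 3j, -3 - 3j]).gain_matrix
l = -place_poles(A.T, c.T, [-10 + 10j, -10 - 10j]).gain_matrix.T

v = -1.0 / (c @ np.linalg.solve(A + b @ f, b))[0, 0]

# Closed loop of Fig. 4.3 (w = v*b), state [x; xhat], input r, output y
Acl = np.block([[A, b @ f], [-l @ c, A + b @ f + l @ c]])
Bcl = np.vstack([v * b, v * b])
Ccl = np.hstack([c, np.zeros((1, 2))])

dc_gain = (-Ccl @ np.linalg.solve(Acl, Bcl))[0, 0]
print(round(dc_gain, 6))  # 1.0 - zero steady-state tracking error
```

With w = vb we have γ(s) = āe(s), so the observer poles at −10 ± 10j cancel in (4.10) and do not appear in the transfer function from r to y.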
4.4 Symmetric Root Locus Design

In this and the previous two chapters we established that when a system is controllable
and observable, we can design an observer-based state feedback controller and place the
plant and observer eigenvalues at will by choice of the gain vectors f and l. This is a
powerful result that opens the door to efficient design techniques. However, when faced
directly with the question of where to place the plant and observer eigenvalues for a
given control problem, it often turns out to be surprisingly difficult to achieve a good
design by choosing plant and observer poles in a trial and error fashion. Of course, we
can approximate the closed-loop behaviour by a dominant pole pair and use guidelines
from classical control to translate design specifications like damping ratio or rise time into
required pole locations. However, for higher order systems that still leaves the question
of what to do about the remaining poles. Moreover, one will find that in general it is
not a good idea to choose closed-loop pole locations without considering the open-loop
poles and zeros. We have seen from equation (2.9) for example that the feedback gains
(and thus the required control effort) will increase if the closed-loop poles are moved far
away from their open-loop locations. But physical actuator limitations will always impose
a constraint on the controller design, and require a trade-off between performance and
control effort.
In fact, the real power of state space design methods reveals itself when the design is
based on a systematic procedure, which usually involves the search for a controller that is
optimal - i.e. the best that can be achieved - in the sense of a given performance measure.
While a rigorous treatment of optimal controller design methods is beyond the scope
of this course, we will conclude this chapter with a brief presentation of a widely used
method for choosing pole locations, which for single-input single-output systems can be
used in a way similar to root locus design. The main result is presented here without
proof - this problem and its solution are discussed in detail in the lecture course "Optimal
and Robust Control".
Consider the system

ẋ = Ax + bu ,    x(0) = x0
y = cx (4.11)
where (A, b) is stabilizable and (c, A) is detectable. Assume that the desired setpoint for
this system is y = 0, but that at time t = 0 some external disturbance has driven the
system away from its equilibrium to a state x(0) = x0 6= 0 and thus y 6= 0. We wish to
find a control input u(t) that brings the system back to its equilibrium at x = 0 as quickly
as possible, but with "reasonable" control effort. One way of assessing the capability of a
given control input to achieve this is the quadratic cost function

V = ∫₀^∞ (y²(t) + ρ u²(t)) dt    (4.12)
The first term under the integral represents the control error, and the second term the
control effort. The parameter ρ - a positive constant - is a tuning parameter that allows the
designer to adjust the balance between the weights attached to control error and control
effort in this cost function. The optimal controller in the sense of this cost function is
the controller that minimizes V . It is clear that if ρ is chosen to be large, the optimal
controller will use less control effort (control being ”expensive”) and thus achieve a slower
response than an optimal controller with a smaller value of ρ in the cost function.
It turns out that the optimal control law takes the form of state feedback control

uopt(t) = f x(t)

where f is the optimal state feedback gain. The following Theorem - presented here
without proof - provides a characterization of the optimal controller in terms of the optimal
closed-loop eigenvalues.
Let G(s) = b(s)/a(s) denote the transfer function of the system (4.11), i.e. G(s) =
c(sI − A)^{-1} b. Then one can show the following
Theorem 4.2
The optimal state feedback controller uopt (t) = f x(t) that minimizes the cost function
(4.12) for the system (4.11) places the eigenvalues of A + bf at the stable roots (the roots
in the left half plane) of the polynomial
p(s) = a(−s)a(s) + (1/ρ) b(−s)b(s)    (4.13)
Before we discuss how to use this Theorem for optimal controller design, we will have
a closer look at the properties of the polynomial p(s). If x(t) ∈ IR^n, the open loop
characteristic polynomial a(s) has degree n, and p(s) will have degree 2n. Thus p(s) has
2n roots. It is easy to see that if s0 is a zero of p(s), then −s0 is also a zero of p(s).
This means that in the complex plane, the roots of p(s) are symmetric with respect to
the imaginary axis - half of the roots are stable and the other half unstable. The n stable
roots are the optimal eigenvalues, and Theorem 4.2 provides a direct way of computing
these from the open-loop model. Knowing the optimal eigenvalues, we can compute the
optimal state feedback gain f by solving a pole placement problem.
We would face a difficulty, however, if it turns out that p(s) has roots on the imaginary
axis, because then the roots could not be divided into n stable and n unstable ones. That
this situation does not arise is guaranteed by the following Theorem, the proof of which
is left to Exercise 4.9.
Theorem 4.3
The polynomial p(s) has no roots on the imaginary axis if (A, b) is stabilizable and (c, A)
is detectable.
When designing an optimal controller, the tuning parameter ρ in the cost function plays
an important role. To display the effect of ρ on the closed-loop eigenvalues graphically,
divide the equation p(s) = 0 by a(−s)a(s) to obtain
1 + (1/ρ) G(−s)G(s) = 0    (4.14)
The optimal eigenvalues are the values of s in the left half plane that satisfy this equation.
Now recall that the effect of the controller gain K on the closed-loop eigenvalues of a
control system with loop transfer function L(s) is displayed as the root locus of the
characteristic equation 1 + KL(s) = 0. Comparing this with (4.14), we see that if we
make the replacements
L(s) → G(−s)G(s) ,    K → 1/ρ
we can use standard root locus techniques to display the optimal eigenvalue locations for
values of ρ in the range 0 < ρ < ∞. This optimal root locus is symmetric with respect
to the imaginary axis; the root locus branches in the left half plane represent the optimal
eigenvalues.
In summary, an optimal state feedback controller for the system (4.11) with cost function
(4.12) can be designed as follows.
• Use standard root locus tools to plot the root locus of 1 + (1/ρ) G(−s)G(s) = 0
• Choose a set of optimal closed-loop eigenvalues from the root locus plot (this is a
choice of ρ and thus a decision on the trade-off between performance and control
effort)
• Compute the optimal state feedback gain by solving a pole placement problem.
This design technique is known as symmetric root locus technique, and the resulting
optimal state feedback controller is called a linear quadratic regulator (LQR).
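The three steps can be sketched in code. In the example below (Python; the plant is an assumed double integrator G(s) = 1/s², and the optimal gain is obtained from the algebraic Riccati equation instead of pole placement), the stable roots of (4.13) coincide with the closed-loop eigenvalues under the optimal gain:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator: G(s) = 1/s^2, i.e. a(s) = s^2, b(s) = 1
A = np.array([[0.0, 1.0], [0.0, 0.0]])
b = np.array([[0.0], [1.0]])
c = np.array([[1.0, 0.0]])
rho = 1.0

# p(s) = a(-s)a(s) + (1/rho) b(-s)b(s) = s^4 + 1/rho
p = np.array([1.0, 0.0, 0.0, 0.0, 1.0 / rho])
stable_roots = np.sort_complex([r for r in np.roots(p) if r.real < 0])

# LQR for the cost integral of y^2 + rho*u^2, i.e. Q = c^T c, R = rho
P = solve_continuous_are(A, b, c.T @ c, np.array([[rho]]))
f = -(b.T @ P) / rho                     # optimal gain, u = f x
eig_cl = np.sort_complex(np.linalg.eigvals(A + b @ f))

print(np.allclose(stable_roots, eig_cl, atol=1e-6))  # True
```

For ρ = 1 both computations give the familiar pole pair −√2/2 ± j√2/2 of the optimally damped double integrator.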
Consider now the stochastic plant model

ẋ = Ax + bu + bn nx
y = cx + ny (4.15)
where nx and ny are white noise processes representing process and measurement noise,
respectively. Both noise processes are assumed to be wide-sense stationary, zero-mean,
Gaussian distributed and uncorrelated, with spectral densities Sx and Sy , respectively. A
review of stochastic processes is provided in the appendix.
The input vector bn describes how the process noise affects the individual state variables.
The observer equation for this stochastic plant model is x̂˙ = Ax̂ + bu + l(cx̂ − y), which gives the estimation error dynamics

x̃˙ = (A + lc)x̃ + bn nx + l ny
This equation again illustrates the two conflicting objectives of state estimation. To re-
duce the estimation error, the observer should be fast enough to track the state movements
induced by the process noise nx ; this requires large gains in the vector l. But because l
multiplies ny , large gains in turn lead to amplification of measurement noise. An optimal
balance - given the spectral densities of process and measurement noise - can again be ob-
tained by using a symmetric root locus technique. Let Gn (s) denote the transfer function
from process noise to measured output, i.e. Gn(s) = c(sI − A)^{-1} bn, and let q = Sx/Sy
denote the ratio of noise spectral densities. Then one can show the following.
Theorem 4.4
The observer gain l that minimizes the steady-state estimation error variance E[x̃ᵀ x̃]
places the eigenvalues of A + lc at the stable solutions (the solutions in the left half plane)
of
1 + qGn (−s)Gn (s) = 0 (4.17)
The term E[x̃ᵀ x̃] is a measure of the "size" of the estimation error. An observer designed
according to Theorem 4.4 to minimize the estimation error is also known as a Kalman
filter. An output feedback controller obtained by combining an LQR state feedback gain
with a Kalman filter is referred to as a linear quadratic Gaussian (LQG) controller. Some
insight into the nature of optimal state feedback and estimation in the above sense can
be gained by considering the limiting cases ρ → ∞, ρ → 0 and q → ∞, q → 0; this is
explored in Exercises 4.7 and 4.8.
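The dual statement can be checked in the same way (a Python sketch; the same assumed double integrator as before, with process noise entering at the plant input, and the Kalman gain computed from the filter Riccati equation):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator with process noise at the input: Gn(s) = 1/s^2
A = np.array([[0.0, 1.0], [0.0, 0.0]])
bn = np.array([[0.0], [1.0]])
c = np.array([[1.0, 0.0]])
q = 1.0   # ratio of spectral densities Sx/Sy

# Stable roots of 1 + q*Gn(-s)Gn(s) = 0, i.e. of a(-s)a(s) + q*n(-s)n(s) = s^4 + q
roots = np.roots([1.0, 0.0, 0.0, 0.0, q])
stable_roots = np.sort_complex([r for r in roots if r.real < 0])

# Kalman gain from the dual (filter) Riccati equation
P = solve_continuous_are(A.T, c.T, q * (bn @ bn.T), np.array([[1.0]]))
l = -(P @ c.T)            # sign convention of the text: error dynamics A + lc
eig_obs = np.sort_complex(np.linalg.eigvals(A + l @ c))

print(np.allclose(stable_roots, eig_obs, atol=1e-6))  # True
```

The duality A ↔ Aᵀ, b ↔ cᵀ, c ↔ bnᵀ maps this computation exactly onto the state feedback case of the previous section.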
Exercises
Problem 4.1
Consider the system

ẋ = Ax + bu
y = cx
where

A = [0 1; 1 0] ,    b = [0; 1] ,    c = [−1 0]
a) Find the state feedback gain vector f to place the closed-loop poles at −0.5 ± 0.5j.
b) Find observer gain vectors l to place the observer poles at
i) s = −10 ± 10j
ii) s = −1 ± 1j
d) Use SIMULINK to simulate the closed loop system with and without white noise of
power 0.001. Explain the results in terms of the frequency response.
e) Find observer poles to achieve a variance of the control input u < 0.3 while keeping
the 5% settling time < 7.0 sec.
Problem 4.2
b) Calculate the input and initial conditions that produce zero output: y(t) = 0, t ≥ 0.
Hint: Use complex conjugates to find real input signals associated with complex
zeros.
Problem 4.3
Consider the two degree of freedom (2DOF) system in Figure 4.6, where x̂ is the state
estimate and r is the setpoint. The details of the observer are shown in Figure 4.7.
A state space realization of the system G(s) is
A = [−2 1; 0 −3] ,    b = [1; 1] ,    c = [1 3]
Figure 4.6: Two-degree-of-freedom control loop with prefilter gain v
a) Design the controller and the observer so that the closed-loop poles are at −3 ± 3j
and the poles of the observer are at −10 ± 10j.
b) Construct a state space model of the closed-loop system, in Figure 4.6, from r to
y. What can be said about the controllability of the state estimation error from the
input r?
c) Determine analytically the closed-loop transfer function from r to y.
d) Consider the closed-loop system as shown in Figure 4.8, where instead of l(ŷ − y)
we have l(ŷ − y + r), but the controller and the observer gain matrices are the same
as found above.
Figure 4.7: Realization of the observer
Figure 4.8: Control loop with the reference input applied at the control error e = r − y
What is the relationship between the poles of the new closed-loop system from r to
y and the designed closed-loop and observer pole positions?
Hint: Use a similarity transformation to make the [2, 1] block of the closed loop
system matrix zero.
What are the zeros of the closed-loop system? Why do the open-loop zeros appear
in the closed-loop system?
e) Use Matlab to compare the step response of the observer states for the observer
configurations in Figures 4.6 and 4.8. Use f and l designed in part (a). What is the
effect of the open-loop zeros on the response?
f) Consider now the configuration where the controller realization is

x̂˙ = (A + bf + lc)x̂ − ly + wr
u = f x̂ + vr
(Block diagrams: the control loop with the reference input entering through the gains v and w, and the corresponding observer realization with the additional input wr)
g) How should v and w be chosen if e = r −y is used by the observer (as in Figure 4.8)?
Problem 4.4
Consider the system

ẋ = Ax + bu ,    y = cx
a) How are the poles and zeros of this system related to the numerator and denominator
of its transfer function?
b) Under what conditions could such a system become unobservable under state feed-
back?
Hint: Consider the transfer function of a system with state feedback.
Problem 4.5
Assume that the controller, the observer and the gain v are the same as in Problem 4.3
parts (a) and (b).
Figure 4.11: Control loop with prefilter gain v and loop break point BP
a) Consider Figure 4.11 with the control loop broken at point ‘BP’. Calculate the
following open loop transfer functions analytically, and use the command linmod to
confirm your answers.
b) With the loop closed, write the transfer function from r to y as a function of Gyd (s),
Gur (s) and Gyu (s). What happens to the zeros of Gur (s) in closed loop?
c) What is the steady-state value of the output y when the disturbance d in Figure 4.11 has the constant value d0 and r = 0?
d) Assume that the system matrix of the plant is replaced by (1 + ε)A (representing
model uncertainty). With r(t) = σ(t) and ε = 0.1, use Matlab to calculate
lim_{t→∞} (y − r).
e) Add integral action to the control loop as shown below, and find gains such that the closed-loop poles are at

s = −5, s = −3 + 3j, s = −3 − 3j

Hint: The integrator adds an extra state variable xI to the plant, so it is possible to
assign three pole locations with the composite gain vector f̄ = [f fI].
f) With the controller from part (e), how large is the error lim_{t→∞} (y − r)?
(Block diagram: the control loop extended with an integrator - the error r − y is integrated to give the state xI, which acts on the plant input through the gain fI, in addition to the state estimate feedback f x̂)
Problem 4.6
The first state variable is the position of the pendulum and the second state variable is
its speed. In the performance index
J = ∫₀^∞ [z²(t) + ρ u²(t)] dt
a) Plot the symmetric root locus for this performance index using Matlab.
b) Use Matlab to design an optimal state feedback controller and a constant gain
prefilter v (similar to that in Figure 4.11) such that
i) lim_{t→∞} (y − r) = 0
ii) following a unit step input r(t) = σ(t), the 95% rise time is less than 2.5 and
the maximum input magnitude |u| is less than 3.0.
c) Consider the system with process and measurement noise

ẋ = Ax + bu + bn nx
y = cx + ny
Design a Kalman filter (using Matlab) to minimize E[x̃T x̃], where x̃ is the estimation
error.
d) Simulate the response of the closed-loop system to a unit step input in r(t). Use the
block band_limited_white_noise in Simulink with sample time=0.01 to simulate
the white noise signals.
e) Consider now the system with the deterministic disturbance d instead of the stochas-
tic noises nx and ny
ẋ = Ax + bu + bn d
y = cx
In a Kalman filter design, tune q such that, with a pulse disturbance d(t) = σ(t) −
σ(t − 1), the maximum magnitude of y during the transient is 1.0 (with ny = 0).
Describe the connection between the trade-off required here and the trade-off in
part (c).
Problem 4.7
p(s) = ac(s) ac(−s) = a(−s)a(s) + (1/ρ) b(s)b(−s)
where b(s)/a(s) is the open-loop transfer function and bc (s)/ac (s) is the closed-loop trans-
fer function under optimal state feedback.
Consider the system

G(s) = (s + 1)/(s² + 1)
i) Show that of the four roots of p(s) only two remain finite as ρ → 0. Determine the value of the finite root of the closed-loop characteristic polynomial under
optimal state feedback control as ρ → 0.
ii) Show that the ‘expanding’ stable root is at the position
lim_{ρ→0} s = −1/√ρ
iii) Give an explanation for the behaviour as ρ → 0 when the symmetric root
locus design method is used for optimal controller and Kalman filter design (in
a Kalman filter design this is equivalent to q → ∞).
Problem 4.8
a) Show that as ρ → 0, p(s) = ac(s) ac(−s) has 2m finite roots. What are their locations?
b) Show that for large values of s the roots of the polynomial p(s) tend towards the
roots of
(−1)^n s^{2n} + (1/ρ)(−1)^m bm² s^{2m} = 0
Hint: For large s, you can assume that a(s) ≈ s^n and b(s) ≈ bm s^m
Problem 4.9
Consider a system with transfer function

G(s) = b(s)/a(s)
a) Assume that the symmetric root locus polynomial p(s) has a root on the imaginary
axis. Show that this implies that
|a(jω)|² + (1/ρ)|b(jω)|² = 0
b) Show that this can only be true if a(s) and b(s) can be factored as

a(s) = (s² + ω²) ã(s) ,    b(s) = (s² + ω²) b̃(s)
c) Use (b) to show that p(s) has no roots on the imaginary axis if the system is
stabilizable and detectable.
Hint: Show that there is a contradiction between the symmetric root locus having
roots on the imaginary axis and the plant being stabilizable and detectable.
Problem 4.10
Read and understand the MATLAB script Task_2_Simulation_LQR_Design.m and the Simulink
file Task_2_Simulation_LQR.slx. The MATLAB files simulate and plot the closed-loop
response of the linear and nonlinear models of the Mini Segway.
a) Design an LQR state feedback controller to stabilize the system given the vectors
of initial conditions x0,1 = [0; 5π/180; 0; 0]T and x0,2 = [0; 9π/180; 0; 0]T .
b) Set the parameter sinewave in the m-file to 1 (this will enable a position reference
sinusoidal input). Re-tune the controller so that the mean square error between the reference input and the position state,

eRMS = (1/n) Σ_{i=1}^{n} (ri − xi)² ,

is less than 0.004.
Problem 4.11
Run the Matlab script Experiment_parameters.m and open the Simulink experiment file
Experiment_LQR.slx.
a) Implement the controller designed in 4.10b. Run the experiment and extract the
states from the Simulink experiment model. Compare the simulation with the
experiment. (Hint: export the data as a structure with time, as a vector
[states; Control input; reference], and name the data "expOut" so that you can use
the provided code.)
Chapter 5
Multivariable Systems
The previous chapters introduced the basic concepts of state space methods for the anal-
ysis and design of control systems; however, the discussion was limited to single-input
single-output systems. A major advantage of state space methods is the fact that multi-
input multi-output systems can be handled within the same framework. In this chapter
we will extend the concepts and results introduced so far to MIMO systems.
Figure 5.1: Turbogenerator plant - a gas turbine (GT) driving a synchronous generator with symmetric load; inputs u1 (servo valve) and u2 (excitation current), outputs y1 and y2 (tachogenerator and generator measurements)
5.1 Transfer Function Models

Figure 5.2: Series connection of two systems: u1 → G1(s) → y1 = u2 → G2(s) → y2

Figure 5.2 shows a series connection: the output y1 of the first system is the input u2 of the second. If the number of outputs of G1 equals the number of inputs of G2, the equivalent system is the product G2(s)G1(s); note that for multivariable systems the order of the factors matters.
Figure 5.3 shows a parallel connection. If both systems have the same number of inputs
and outputs, this connection is possible and an equivalent system is G1 (s) + G2 (s).
The feedback loop shown in Figure 5.4 requires that the dimensions of y1 and u2 as well
as the dimensions of y2 and u1 , respectively, are the same. Solving
y1 = G1 (r + G2 y1 )
for y1 yields
y1 = (I − G1 G2 )−1 G1 r (5.1)
Figure 5.3: Parallel connection of two systems
Note that (5.1) and the alternative expression

y1 = G1 (I − G2 G1)^{-1} r    (5.2)

are equivalent expressions for the closed-loop system in Figure 5.4.
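The equivalence of (5.1) and (5.2) rests on the push-through identity (I − G1G2)^{-1}G1 = G1(I − G2G1)^{-1}, which can be checked numerically at a fixed frequency (a Python sketch with random matrices standing in for the frequency response values):

```python
import numpy as np

rng = np.random.default_rng(0)
G1 = rng.standard_normal((2, 3))   # G1(s0) evaluated at some frequency s0: l x m
G2 = rng.standard_normal((3, 2))   # G2(s0): m x l

lhs = np.linalg.solve(np.eye(2) - G1 @ G2, G1)    # (I - G1 G2)^{-1} G1
rhs = G1 @ np.linalg.inv(np.eye(3) - G2 @ G1)     # G1 (I - G2 G1)^{-1}
print(np.allclose(lhs, rhs))  # True
```

Note that the two identity matrices have different sizes when G1 is not square - the identity moves the inverse from the output side to the input side of the loop.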
Figure 5.4: Feedback connection of two systems
5.2 State Space Models

The state space model

ẋ = Ax + Bu ,    y = Cx + Du

was introduced in Chapter 1, where B and D have as many columns as there are plant
inputs, and the number of rows of C and D equals the number of plant outputs. For
strictly proper systems we have D = 0.
An experimentally identified state space model of a turbogenerator plant as described
above is introduced and explored in Exercise 5.3. We will now discuss the conversion
between transfer function models and state space models of multivariable systems. Given
a state space model, it is straightforward to check - just as in the SISO case discussed in
Chapter 1 - that the associated transfer function is

G(s) = C(sI − A)^{-1} B + D    (5.3)
Obtaining a state space model from a given transfer function matrix is less straightforward;
this will be illustrated by an example.
Example 5.1
Consider a plant with two inputs and two outputs that can be described by the transfer
function matrix

G(s) = [1/(s + 1)    2/(s + 1); −1/((s + 1)(s + 2))    1/(s + 2)]    (5.4)
As a first attempt to construct a state space realization of this model, we combine the
state space models (Aij , bij , cij ) of the four SISO transfer functions Gij (s) to obtain the
MIMO model. Controller forms of the four sub-models can be written down by inspection
as
The state vector of the MIMO model contains five state variables - the state variables
of each sub-model; the order of the MIMO model equals the sum of the orders of the
sub-models. One can check that substituting the matrices (A, B, C) of this model in (5.3)
yields indeed the transfer function matrix (5.4).
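The construction can be verified numerically. In the Python sketch below, the four controller-form sub-models are written out explicitly (standard one- and two-state controller forms; the notation may differ from that used in (5.5)), combined into a five-state model, and the resulting transfer function matrix is compared with (5.4) at a test frequency:

```python
import numpy as np
from scipy.linalg import block_diag

# Controller forms of the four SISO sub-models of (5.4)
A11, b11, c11 = np.array([[-1.0]]), np.array([[1.0]]), np.array([[1.0]])   # 1/(s+1)
A12, b12, c12 = np.array([[-1.0]]), np.array([[1.0]]), np.array([[2.0]])   # 2/(s+1)
A21 = np.array([[-3.0, -2.0], [1.0, 0.0]])                                 # -1/((s+1)(s+2))
b21, c21 = np.array([[1.0], [0.0]]), np.array([[0.0, -1.0]])
A22, b22, c22 = np.array([[-2.0]]), np.array([[1.0]]), np.array([[1.0]])   # 1/(s+2)

A = block_diag(A11, A12, A21, A22)                 # 5 states in total
B = np.zeros((5, 2)); C = np.zeros((2, 5))
B[0:1, 0:1] = b11; B[1:2, 1:2] = b12               # u1 -> G11, u2 -> G12
B[2:4, 0:1] = b21; B[4:5, 1:2] = b22               # u1 -> G21, u2 -> G22
C[0:1, 0:1] = c11; C[0:1, 1:2] = c12               # y1 = G11 u1 + G12 u2
C[1:2, 2:4] = c21; C[1:2, 4:5] = c22               # y2 = G21 u1 + G22 u2

s = 1.0j                                           # test frequency
G_ss = C @ np.linalg.solve(s * np.eye(5) - A, B)
G_tf = np.array([[1/(s+1), 2/(s+1)], [-1/((s+1)*(s+2)), 1/(s+2)]])
print(np.allclose(G_ss, G_tf))  # True
```

Any valid realization gives the same transfer function matrix, so the comparison does not depend on the particular canonical form chosen for the sub-models.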
5.3 The Gilbert Realization

We will now present a different way of finding a state space realization of a transfer function model, known as the Gilbert realization. This realization is only possible if all eigenvalues
of the model are distinct, i.e. if no sub-model has repeated eigenvalues. The idea is as
follows. First, a transfer function matrix G(s) can be rewritten as
G(s) = (1/d(s)) N(s)
where the least common multiple d(s) of all denominator polynomials has been pulled
out as a common factor, leaving a “numerator” polynomial matrix N (s). Assuming deg
d(s) = r, a partial fraction expansion is
G(s) = N1/(s − λ1) + N2/(s − λ2) + . . . + Nr/(s − λr)
where the residue matrices Ni can be computed elementwise. Defining ρi = rank Ni, each
residue matrix can be factored as

Ni = Ci Bi ,    Ci ∈ IR^{l×ρi} ,    Bi ∈ IR^{ρi×m}

and the state space model with A = diag(λ1 I_{ρ1}, . . . , λr I_{ρr}), B = [B1; . . . ; Br] and
C = [C1 . . . Cr] then realizes G(s). This realization of G(s) has ρ = Σ_{i=1}^{r} ρi state variables.
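For the transfer function (5.4), the construction can be sketched as follows (Python; the residue matrices N1, N2 are entered from the elementwise partial fraction expansion, and each factorization Ni = Ci Bi is obtained from a singular value decomposition):

```python
import numpy as np
from scipy.linalg import block_diag

# Residues of (5.4): G(s) = N1/(s+1) + N2/(s+2), computed elementwise
N = {-1.0: np.array([[1.0, 2.0], [-1.0, 0.0]]),
     -2.0: np.array([[0.0, 0.0], [1.0, 1.0]])}

As, Bs, Cs = [], [], []
for lam, Ni in N.items():
    U, svals, Vt = np.linalg.svd(Ni)
    rho = int(np.sum(svals > 1e-10))           # rho_i = rank N_i
    Ci = U[:, :rho] * np.sqrt(svals[:rho])      # N_i = C_i B_i via the SVD
    Bi = np.sqrt(svals[:rho])[:, None] * Vt[:rho, :]
    As.append(lam * np.eye(rho)); Cs.append(Ci); Bs.append(Bi)

A = block_diag(*As); B = np.vstack(Bs); C = np.hstack(Cs)
print(A.shape[0])   # 3 state variables: rank N1 = 2, rank N2 = 1

s = 1.0j
G_tf = np.array([[1/(s+1), 2/(s+1)], [-1/((s+1)*(s+2)), 1/(s+2)]])
print(np.allclose(C @ np.linalg.solve(s*np.eye(3) - A, B), G_tf))  # True
```
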
This realization of G(s) in Example 5.1 has three state variables, two less than the realiza-
tion in (5.5). However, it is straightforward to check that substituting the matrices of this
model in (5.3) yields the same transfer function. From the discussion in Chapter 3, we
would therefore expect that at least two state variables in (5.5) are either uncontrollable
or unobservable. This is indeed the case, but before we can establish this, we need to
extend the concepts of controllability and observability to MIMO systems.
5.4 Controllability and Observability

We will now extend the results of Chapters 2 and 3 to MIMO systems. Consider a system
with state space realization

ẋ = Ax + Bu ,    y = Cx + Du    (5.7)

where x(t) ∈ IR^n, u(t) ∈ IR^m, y(t) ∈ IR^l, and the matrices (A, B, C, D) are of compatible
dimensions.
Definition 5.1
The system (5.7) is said to be controllable if for any initial state x(0) = x0 , time tf > 0
and final state xf there exists a control input u(t), 0 ≤ t ≤ tf , such that the solution of
(5.7) satisfies x(tf ) = xf . Otherwise, the system is said to be uncontrollable.
The following can be shown in a way similar to the proofs of the corresponding SISO
results in Chapter 2.
Theorem 5.1
The system (5.7) is controllable if and only if the controllability matrix

C(A, B) = [B AB . . . A^{n−1} B]

has full row rank n.
Notice that in contrast to SISO systems, where the controllability matrix C(A, b) is an n × n
square matrix, the controllability matrix C(A, B) of a MIMO system is n × mn, i.e. it has
more columns than rows. But just as for SISO systems, rank C(A, B) = n (full row rank)
ensures that the controllable subspace - the column space of C(A, B) - is the whole state
space. It is possible that the partial controllability matrix
Cr(A, B) = [B AB . . . A^{r−1} B]
where r < n, has rank n. In this case the smallest integer νc for which rank Cνc (A, B) = n
is called the controllability index of the system (A, B).
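The rank condition and the controllability index can be computed directly (a Python sketch; the matrices are an assumed example with two inputs, chosen so that νc < n):

```python
import numpy as np

def ctrb(A, B, r):
    """Partial controllability matrix C_r(A, B) = [B, AB, ..., A^{r-1} B]."""
    blocks, AkB = [], B
    for _ in range(r):
        blocks.append(AkB)
        AkB = A @ AkB
    return np.hstack(blocks)

A = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # two inputs
n = A.shape[0]

print(np.linalg.matrix_rank(ctrb(A, B, n)))   # 3: (A, B) is controllable
# Controllability index: smallest r with rank C_r(A, B) = n
nu_c = next(r for r in range(1, n + 1)
            if np.linalg.matrix_rank(ctrb(A, B, r)) == n)
print(nu_c)  # 2: full row rank is reached already with [B, AB]
```

With a single input the index can never be smaller than n; the extra columns contributed by additional inputs are what allows νc < n here.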
Definition 5.2
The system (5.7) is said to be observable if for any tf > 0 the initial state x(0) can be
uniquely determined from the time history of the input u(t) and the output y(t) in the
time interval 0 ≤ t ≤ tf . Otherwise, the system is said to be unobservable.
Theorem 5.2
The proof of Theorem 3.2 (the SISO version of the statement (iii) ⇔ (i) in Theorem 5.2)
was based on the fact that the equation
Y = O(c, A)x(0)
where Y is a column vector containing y(0), ẏ(0) etc., can be solved uniquely for x(0)
if the observability matrix O(c, A) has full rank. For MIMO systems the observability
matrix is ln × n, i.e. it has more rows than columns. However, multiplying the above from
the left by Oᵀ leads to

x(0) = (Oᵀ O)^{-1} Oᵀ Y
which shows that rank O(C, A) = n is also in the MIMO case a sufficient condition for
the existence of a unique solution x(0), because rank O = n implies rank Oᵀ O = n.
The partial observability matrix Or (C, A) and the observability index νo are defined in the
same way as the corresponding controllability concepts.
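The pseudo-inverse formula can be illustrated numerically. A sketch in Python/NumPy with hypothetical matrices, where the stacked vector Y is formed directly as O x(0), standing in for the measured output derivatives with u = 0:

```python
import numpy as np

def obsv(C, A):
    """Observability matrix [C; CA; ...; CA^(n-1)], size ln x n."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

# Hypothetical 3-state, 2-output system
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [-6., -11., -6.]])
C = np.array([[1., 0., 0.],
              [0., 0., 1.]])

O = obsv(C, A)                    # 6 x 3: more rows than columns
assert np.linalg.matrix_rank(O) == 3

x0 = np.array([1., -2., 0.5])
Y = O @ x0                        # stacked y(0), ydot(0), ... for u = 0

# Left pseudo-inverse recovers the initial state uniquely
x0_hat = np.linalg.solve(O.T @ O, O.T @ Y)
assert np.allclose(x0_hat, x0)
```

Because rank O = n implies rank OᵀO = n, the solve step always succeeds for an observable system.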
The following can be shown in a way similar to the proofs of Theorems 2.5 and 3.4.
Theorem 5.3
Consider the system (5.7) and assume that rank C(A, B) < n and rank O(C, A) < n. There
exists a similarity transformation that takes the system into the form
\[
\begin{bmatrix} \dot{\bar{x}}_{co} \\ \dot{\bar{x}}_{c\bar{o}} \\ \dot{\bar{x}}_{\bar{c}o} \\ \dot{\bar{x}}_{\bar{c}\bar{o}} \end{bmatrix}
=
\begin{bmatrix}
\bar{A}_{co} & 0 & \bar{A}_{13} & 0 \\
\bar{A}_{21} & \bar{A}_{c\bar{o}} & \bar{A}_{23} & \bar{A}_{24} \\
0 & 0 & \bar{A}_{\bar{c}o} & 0 \\
0 & 0 & \bar{A}_{43} & \bar{A}_{\bar{c}\bar{o}}
\end{bmatrix}
\begin{bmatrix} \bar{x}_{co} \\ \bar{x}_{c\bar{o}} \\ \bar{x}_{\bar{c}o} \\ \bar{x}_{\bar{c}\bar{o}} \end{bmatrix}
+
\begin{bmatrix} \bar{B}_{co} \\ \bar{B}_{c\bar{o}} \\ 0 \\ 0 \end{bmatrix} u
\]
\[
y = \begin{bmatrix} \bar{C}_{co} & 0 & \bar{C}_{\bar{c}o} & 0 \end{bmatrix}
\begin{bmatrix} \bar{x}_{co} \\ \bar{x}_{c\bar{o}} \\ \bar{x}_{\bar{c}o} \\ \bar{x}_{\bar{c}\bar{o}} \end{bmatrix}
+ Du \qquad (5.8)
\]
where the subsystem (Āco , B̄co , C̄co , D) is controllable and observable. The transfer function
from u to y is
\[
G(s) = \bar{C}_{co}\left( sI - \bar{A}_{co} \right)^{-1} \bar{B}_{co} + D
\]
The concepts of stabilizability and detectability are defined for MIMO systems in the
same way as for SISO systems. Likewise, the following are straightforward extensions of
Definition 3.3 and Theorem 3.6.
Definition 5.3
Theorem 5.4
Returning to Example 5.1 and its Gilbert realization, it turns out that (5.3) is a minimal
realization, because - as shown in Exercise 5.9 - a Gilbert realization is always controllable
and observable.
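That uncontrollable (or unobservable) modes drop out of the transfer function can be checked numerically. A sketch in Python/NumPy with a hypothetical two-state diagonal realization whose second state is disconnected from the input:

```python
import numpy as np

# Hypothetical example: the second state is uncontrollable (B has a zero
# there), so it cannot appear in the transfer function
A = np.diag([-1., -5.])
B = np.array([[1.],
              [0.]])
C = np.array([[1., 1.]])
D = np.array([[0.]])

def tf(s):
    """Evaluate C (sI - A)^(-1) B + D at a complex frequency s."""
    return C @ np.linalg.inv(s * np.eye(2) - A) @ B + D

# The full realization and its controllable-observable part (just the
# first state) give the same transfer function 1/(s+1)
for s in [0.5, 1.0 + 2.0j, -0.3 + 1.0j]:
    assert np.allclose(tf(s), 1.0 / (s + 1.0))
```

The eigenvalue at −5 is an invariant zero candidate of the realization but never shows up as a pole of G(s), in line with Definition 5.4.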
A linear system is stable if all its poles are strictly inside the left half of the complex
plane. So far we have not yet extended the concept of poles and zeros to multivariable
systems. As Example 5.1 indicated, it may not be trivial to determine what the poles
of a multivariable system are. Clearly, in that example we can say by inspection of the
transfer function matrix that the poles are located at s = −1 and s = −2. What is
however not immediately clear is the multiplicity of the poles at these locations. Exactly
how many poles the system has at each of these locations plays for example a role when
a multivariable version of the Nyquist stability criterion is used to assess the closed-loop
stability of a control system. At first glance, it is not even clear how many poles the
system has in total - to answer that question we need to know the degree of its minimal
realization.
We will define poles of a multivariable system in terms of a minimal state space realization.
Let (A, B, C, D) be a realization of G(s), and recall that
\[
G(s) = \frac{C\,\mathrm{adj}(sI - A)\,B}{\det(sI - A)} + D
\]
The numerator C adj(sI − A)B is a polynomial matrix, and it is clear that every pole
of G(s) must be a zero of det (sI − A) (i.e. an eigenvalue of A). However, not every
zero of det (sI − A) needs to be a pole (because there may be cancellations by zeros in
the numerator that occur in all subsystems). If this happens, the realization is either
uncontrollable or unobservable, which motivates the following definition.
Definition 5.4
Let (A, B, C, D) be a minimal realization of a system with transfer function matrix G(s).
The eigenvalues of A are called the poles of G(s).
In Section 4.2 we introduced the distinction between invariant zeros and transmission
zeros.
Definition 5.5
Definition 5.6
G(s) has a transmission zero at zi if there exists a vector u0 ≠ 0 such that G(zi )u0 = 0.
Recall from Section 4.2 that if a state space realization is minimal, invariant and trans-
mission zeros are the same, but if a realization has uncontrollable or unobservable modes,
then there will be invariant zeros that do not appear as transmission zeros.
Multivariable poles and zeros are usually defined in terms of the Smith-McMillan form of
a system which is discussed in the next section. One can prove that the poles according
to Definition 5.4 and the transmission zeros according to Definition 5.6 are the same as
those defined in terms of the Smith-McMillan form.
We will now introduce an alternative definition of multivariable poles and zeros. Consider
again the system of Example 5.1. The transfer function is
\[
G(s) = \begin{bmatrix} \dfrac{1}{s+1} & \dfrac{2}{s+1} \\[1ex] \dfrac{-1}{(s+1)(s+2)} & \dfrac{1}{s+2} \end{bmatrix}
= \frac{1}{(s+1)(s+2)} \begin{bmatrix} s+2 & 2(s+2) \\ -1 & s+1 \end{bmatrix}
= \frac{1}{d(s)} N(s)
\]
where d(s) is the least common multiple of all denominator polynomials. Obviously, the
poles of this system should be defined such that the poles of the individual elements of
G(s) are included; what is less obvious is with what multiplicity these poles - at s = −1
and s = −2 in the above example - should be counted as poles of the MIMO system. Even
less obvious are the zeros of this system. The multivariable poles and zeros are usually
defined in terms of a diagonal canonical form to which every transfer function matrix can
be reduced, known as the Smith-McMillan form.
Theorem 5.5
Every l × m polynomial matrix N(s) can be transformed by elementary row and column
operations into a diagonal matrix
\[
\Lambda(s) = \mathrm{diag}\left( \beta_1(s), \; \beta_2(s), \; \ldots, \; \beta_r(s), \; 0, \; \ldots, \; 0 \right)
\]
where the βi (s) are unique monic polynomials (i.e. the highest power of s has coefficient
1) such that βi (s) | βi+1 (s) (βi (s) divides βi+1 (s) without remainder) for i = 1, . . . , r − 1,
and r is the rank of N (s) for almost all s.
The matrix Λ(s) is known as the Smith form of N (s). To prove the Theorem, we outline
how the Smith form can be constructed. By interchange of columns and rows, bring the
element of N (s) with least degree to the (1,1) position. Use elementary row operations
to make all entries in the first column below the (1,1) entry zero. Use elementary column
operations to make all entries in the first row except the (1,1) entry zero. These column
operations may bring back non-zero elements to the first column. In this case repeat the
above steps until all elements of row 1 and column 1 are zero except the (1,1) entry. This
yields a matrix of the form
\[
\begin{bmatrix} \beta_1(s) & 0 \\ 0 & N_1(s) \end{bmatrix}
\]
where β1 (s) divides every element of N1 (s). Repeat the whole procedure on N1 (s). Pro-
ceeding in this way leads to the Smith form Λ(s) of N (s).
We will say that N (s) is similar to Λ(s) and use the notation N (s) ∼ Λ(s). Now returning
to the factorization
\[
G(s) = \frac{1}{d(s)} N(s)
\]
of a transfer function matrix, it follows that
\[
G(s) \;\sim\; \frac{1}{d(s)}\Lambda(s) \;=\;
\begin{bmatrix}
\dfrac{\beta_1(s)}{\alpha_1(s)} & & & \\
& \ddots & & \\
& & \dfrac{\beta_r(s)}{\alpha_r(s)} & \\
& & & 0
\end{bmatrix}
\qquad (5.10)
\]
where βi (s) | βi+1 (s) and αi+1 (s) | αi (s). This form of a transfer function matrix is called
the Smith-McMillan form.
Definition 5.7
Consider a transfer function matrix G(s) and its Smith-McMillan form as in (5.10).
Introduce the polynomials
\[
\beta(s) = \beta_1(s)\beta_2(s)\cdots\beta_r(s), \qquad \alpha(s) = \alpha_1(s)\alpha_2(s)\cdots\alpha_r(s)
\]
The roots of the polynomials β(s) and α(s) are called the zeros and poles of G(s), respec-
tively. The degree of α(s) is called the McMillan degree of G(s).
It is clear from the above definition that the poles of G(s) include all poles of its individual
entries. Moreover, the Smith-McMillan form determines the multiplicity of each pole. The
McMillan degree is equal to the number of poles. It can be shown that the poles of G(s) in
the sense of Definition 5.7 are the eigenvalues of the state matrix of a minimal realization.
This also implies that the McMillan degree of G(s) is the order of its minimal realization.
Returning to the system of Example 5.1, we find that (see Exercise 5.6)
\[
G(s) = \begin{bmatrix} \dfrac{1}{s+1} & \dfrac{2}{s+1} \\[1ex] \dfrac{-1}{(s+1)(s+2)} & \dfrac{1}{s+2} \end{bmatrix}
\;\sim\;
\begin{bmatrix} \dfrac{-1}{(s+1)(s+2)} & 0 \\[1ex] 0 & \dfrac{s+3}{s+1} \end{bmatrix}
\]
Thus, the poles are -1, -1 and -2. These are exactly the eigenvalues of the Gilbert re-
alization, and the order of the Gilbert realization is equal to the McMillan degree. The
system also has a zero at s = −3. This zero is not obvious from the transfer function
matrix G(s), but we have
\[
G(-3) = \begin{bmatrix} -\tfrac{1}{2} & -1 \\[0.5ex] -\tfrac{1}{2} & -1 \end{bmatrix}
\]
so -3 is a value of s where G(s) loses its full rank. The significance of such multivariable
zeros is illustrated in Exercises 5.11 and 5.12, where it is shown that a multivariable
right-half-plane zero has an effect similar to a right-half-plane zero of a SISO system.
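The rank loss at the transmission zero is easy to verify numerically by evaluating the transfer function matrix of Example 5.1 (a sketch in Python/NumPy):

```python
import numpy as np

def G(s):
    # Transfer function matrix of Example 5.1
    return np.array([[1 / (s + 1), 2 / (s + 1)],
                     [-1 / ((s + 1) * (s + 2)), 1 / (s + 2)]])

# At an ordinary point G(s) has full rank ...
assert np.linalg.matrix_rank(G(1.0)) == 2

# ... but at the transmission zero s = -3 the rank drops to 1
Gz = G(-3.0)
assert np.linalg.matrix_rank(Gz) == 1

# and there is an input direction u0 != 0 with G(-3) u0 = 0 (Definition 5.6)
u0 = np.array([2.0, -1.0])
assert np.allclose(Gz @ u0, 0.0)
```

An input applied in the direction u0 at the complex frequency s = −3 is blocked by the plant, which is exactly the transmission-blocking property of a multivariable zero.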
In this section we introduce a multivariable feedback structure and discuss its stability.
Definition 5.8 A feedback loop is well-posed if all closed-loop transfer functions are
proper.
One can show that if either G1 (s) or G2 (s) is strictly proper, the feedback loop in Figure
5.5 is well-posed. From now on we will assume that the feedback loops we consider are
well-posed.
Figure 5.5: Multivariable feedback loop with subsystems G1 (s) and G2 (s), external inputs u1 , u2 and internal signals e1 , e2 .
Internal Stability
Assume we want to check the stability of the closed-loop transfer function from u1 to
the output of G1 in Figure 5.5. In the case of single-input single-output systems, the
closed-loop transfer function from u1 to the output of G1 is
\[
\frac{G_1(s)}{1 - G_1(s)G_2(s)}
\]
To check stability we could investigate the zeros of the denominator 1 − G1 G2 - this is in
fact what we do when we apply the Nyquist stability test or root locus techniques. Again,
a simple example illustrates that we need to be careful.
Example 5.2
Let
\[
G_1(s) = \frac{1}{s-1}, \qquad G_2(s) = \frac{s-1}{s+2}
\]
We can check that
\[
1 - G_1(s)G_2(s) = \frac{s+1}{s+2}
\]
which has no zeros in the right half plane. However, the closed-loop transfer function is
\[
\frac{G_1(s)}{1 - G_1(s)G_2(s)} = \frac{s+2}{(s+1)(s-1)}
\]
which is unstable. The problem here is that an unstable pole-zero cancellation prevents
the unstable closed-loop pole from being detected. One might ask why this issue has not
been raised when the Nyquist stability test for SISO systems was introduced. The answer
is that when using classical design techniques for SISO systems, one can assume that the
designer has explicit control over pole and zero locations and will usually avoid unstable
pole-zero cancellations. For MIMO systems, automated design techniques are often used
to find a controller, and in some cases internal stability must be checked.
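A quick numerical check of Example 5.2, sketched in Python with NumPy's polynomial helpers (coefficient arrays, highest power first): keeping the characteristic polynomial d1 d2 − n1 n2 unreduced preserves the cancelled factor and exposes the unstable closed-loop pole.

```python
import numpy as np

# Example 5.2: G1(s) = 1/(s-1), G2(s) = (s-1)/(s+2)
n1, d1 = np.array([1.]), np.array([1., -1.])        # 1 / (s - 1)
n2, d2 = np.array([1., -1.]), np.array([1., 2.])    # (s - 1) / (s + 2)

# 1 - G1 G2 = (d1 d2 - n1 n2) / (d1 d2); its unreduced numerator is the
# closed-loop characteristic polynomial
char_poly = np.polysub(np.polymul(d1, d2), np.polymul(n1, n2))  # s^2 - 1
cl_poles = np.roots(char_poly)

# The reduced expression (s+1)/(s+2) hides the factor (s-1); the unreduced
# polynomial still contains the unstable closed-loop pole at s = +1
assert np.isclose(np.max(cl_poles.real), 1.0)

# Closed-loop transfer function from u1: n1 d2 / char_poly = (s+2)/((s+1)(s-1))
num = np.polymul(n1, d2)
assert np.allclose(num, [1., 2.])
```

This mirrors the discussion above: inspecting only the cancelled form of 1 − G1 G2 would wrongly suggest stability.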
We now turn to MIMO systems and allow u1 , u2 , e1 and e2 to be vector signals. Again
we want to check whether the feedback loop in Figure 5.5 is internally stable. Define the
transfer functions Hij (s), i = 1, 2 by
\[
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
\]
Definition 5.9
The feedback system in Figure 5.5 is called internally stable if and only if the transfer
function matrix
\[
\begin{bmatrix} H_{11}(s) & H_{12}(s) \\ H_{21}(s) & H_{22}(s) \end{bmatrix}
\]
is stable.
This definition makes sure that no unstable pole-zero cancellation prevents an unstable
closed-loop pole from being detected. Note that internal stability requires that each of
the four transfer matrices Hij (s) is stable.
One can show that the feedback loop is internally stable if and only if
Now introduce
φ(s) = det(I − G1 (s)G2 (s))
and note that (ii) is equivalent to φ(s) having all its zeros in the open left half plane. With
these results we can now formulate a MIMO version of the Nyquist stability criterion. Let
n1 be the number of unstable poles of G1 (s) and n2 the number of unstable poles of G2 (s).
Observing that φ(s) has all zeros in the open left half plane if and only if the Nyquist plot
of φ(jω) encircles the origin n1 + n2 times counter-clockwise, we conclude the following.
Theorem 5.6 The feedback loop in Figure 5.5 is internally stable if and only if
(ii) the Nyquist plot of φ(jω) encircles the origin n1 + n2 times counter-clockwise.
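A rough numerical illustration of this encirclement count (a hypothetical example, not from the notes): for the stable pair G1 (s) = I/(s + 1), G2 (s) = I/(s + 2) we have n1 = n2 = 0, so internal stability requires zero encirclements of the origin by φ(jω). The count can be read off the unwrapped phase along the imaginary axis; the semicircle at infinity contributes nothing here because φ(jω) → 1:

```python
import numpy as np

# Hypothetical stable pair: G1(s) = I/(s+1), G2(s) = I/(s+2) (2x2 blocks),
# so n1 = n2 = 0 and stability requires zero encirclements of the origin
w = np.linspace(-500., 500., 400001)
s = 1j * w
phi = (1. - 1. / ((s + 1.) * (s + 2.)))**2   # phi(s) = det(I - G1(s)G2(s))

# Net encirclements of the origin: change of the unwrapped argument of
# phi(jw) over the contour, divided by 2*pi
theta = np.unwrap(np.angle(phi))
encirclements = round((theta[-1] - theta[0]) / (2 * np.pi))

assert encirclements == 0   # matches n1 + n2 = 0: internally stable
```

The conclusion is consistent with a direct check: the zeros of φ(s) are the roots of ((s + 1)(s + 2) − 1)² = (s² + 3s + 1)², all in the open left half plane.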
We will now extend the idea used in Section 1.1 for the construction of the controller form
state space realization to multivariable systems. The resulting multivariable controller
form will be the basis for a discussion of multivariable pole placement in the following
section.
In the SISO case, the starting point for the construction of a state space model from a
transfer function was the factorization of the transfer function model shown in Fig. 1.3.
For an l × m MIMO transfer function model G(s), we now introduce the factorization
\[
G(s) = N(s)D^{-1}(s)
\]
where N (s) is an l × m and D(s) an m × m polynomial matrix. Such a representation is called
a (right) matrix fraction description (MFD) of G(s) (a left matrix fraction description is
G(s) = D−1 (s)N (s)). An MFD of G(s) is not unique; one choice is to start with the
factorization G(s) = N (s)/d(s) that was used for the Gilbert realization, and to define
D(s) = d(s)I. Other MFDs can be generated by multiplying G(s) from the right by a
polynomial matrix D(s) which is chosen such that all denominator polynomials in G(s)
are cancelled, leading to G(s)D(s) = N (s) where N (s) is a polynomial matrix.
We will now illustrate the construction of a multivariable controller form with an example.
Example 5.3
To follow the development in Section 1.1, we introduce the representation shown in Fig.
5.6 (the MIMO equivalent of Fig. 1.3), where the m-dimensional signal vector v(t) is
introduced as the output of the multivariable filter D−1 (s). In the SISO case, the idea was
to express the input and output signals in terms of v(t), to assume that the highest derivative
of this signal (determined by the highest power of s in a(s)) is somehow available and
then to generate it by using a chain of integrators. To do something similar for a MIMO
system, we first need the equivalent of the highest derivative of v(t). Now v(t) is a signal
vector, and we should consider the highest powers of s in D(s). We will do this by writing
D(s) as
D(s) = Dh S(s) + Dl Ψ(s) (5.15)
Figure 5.6: u(t) → D−1 (s) → v(t) → N (s) → y(t)

where
\[
S(s) = \begin{bmatrix} s^2 & 0 \\ 0 & s^3 \end{bmatrix}
\]
is a diagonal matrix that has the highest power of s in each column of D(s) as element in
the corresponding position, Ψ(s) contains the lower powers of s, and Dh and Dl are the
coefficient matrices for the highest and the lower powers of s, respectively. For the MFD
in (5.14) this decomposition is
\[
D(s) = \begin{bmatrix} 0 & -(s+1)^2(s+2) \\ (s+2)^2 & s+2 \end{bmatrix}
= \begin{bmatrix} 0 & -s^3 - 4s^2 - 5s - 2 \\ s^2 + 4s + 4 & s+2 \end{bmatrix}
\]
\[
= \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} s^2 & 0 \\ 0 & s^3 \end{bmatrix}
+ \begin{bmatrix} 0 & 0 & -4 & -5 & -2 \\ 4 & 4 & 0 & 1 & 2 \end{bmatrix}
\begin{bmatrix} s & 0 \\ 1 & 0 \\ 0 & s^2 \\ 0 & s \\ 0 & 1 \end{bmatrix}
= D_h S(s) + D_l \Psi(s)
\]
The next step is to generate v(t) by repeatedly integrating the highest derivatives of each
element of this signal vector. We have
\[
U(s) = D(s)V(s) = D_h S(s)V(s) + D_l \Psi(s)V(s) \qquad (5.16)
\]
thus
\[
S(s)V(s) = -D_h^{-1} D_l \Psi(s)V(s) + D_h^{-1} U(s) \qquad (5.17)
\]
Equation (5.16) is the MIMO equivalent of (1.12) in the SISO case, where the plant model
was normalized such that the characteristic polynomial a(s) is monic. The assumption
that Dh is invertible corresponds in the scalar case to the assumption that a(s) is indeed
a polynomial of degree n (i.e. a_n ≠ 0). Because
\[
S(s)V(s) = \begin{bmatrix} s^2 & 0 \\ 0 & s^3 \end{bmatrix}
\begin{bmatrix} V_1(s) \\ V_2(s) \end{bmatrix}
\;\;\to\;\;
\begin{bmatrix} \ddot{v}_1(t) \\ \dddot{v}_2(t) \end{bmatrix}
\]
and
\[
\Psi(s)V(s) = \begin{bmatrix} s & 0 \\ 1 & 0 \\ 0 & s^2 \\ 0 & s \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} V_1(s) \\ V_2(s) \end{bmatrix}
\;\;\to\;\;
\begin{bmatrix} \dot{v}_1(t) \\ v_1(t) \\ \ddot{v}_2(t) \\ \dot{v}_2(t) \\ v_2(t) \end{bmatrix}
\]
the vector S(s)V (s) represents the highest derivatives of each element of v(t), whereas
Ψ(s)V (s) collects all lower derivatives. We can now proceed in the same way as in the
SISO case: we assume temporarily that the highest derivative of each element of v(t)
is available, and integrate repeatedly to obtain all lower derivatives. Instead of a single
integrator chain, we need however m integrator chains. The outputs of the integrators
are the entries of Ψ(s)V (s), which can be used for feedback through the gain matrix
Dh−1 Dl and added to Dh−1 U (s) to generate the vector S(s)V (s), according to (5.17). Like
in (1.13) for SISO systems, the output vector Y (s) can be obtained as a weighted sum of
the elements of Ψ(s)V (s): from Fig. 5.6 we have
\[
Y(s) = N(s)V(s) = N_l \Psi(s)V(s)
\]
where Nl is the coefficient matrix of N (s). This leads to the model in Fig. 5.7, which is
a multivariable version of Fig. 1.4. A controllable state space realization of this model is
developed in Exercise 5.13.
Figure 5.7: Multivariable controller form: u is scaled by Dh−1 and combined with the feedback −Dh−1 Dl Ψv to generate the highest derivatives Sv (here v̈1 and the third derivative of v2 ); two integrator chains (two integrators for v1 , three for v2 ) produce the entries of Ψv, which are weighted by Nl to form the output y.
For reasons that will become clear when pole placement by state feedback around this
realization is discussed in the next section, we will call this realization a multivariable con-
troller form. In the scalar case, we called the equivalent realization a canonical controller
form, reflecting the fact that for a SISO system, the procedure described in Section 1.1
leads to a unique realization. This is not the case for a MIMO system: the model in Fig.
5.7 depends on the choice of MFD (in the given example on (5.14)), which is not unique.
Because different MFDs will lead to different controller forms of the same system, this
multivariable controller form is not called a canonical realization.
We will now develop a multivariable version of the pole placement technique that was
used in Exercise 1.6. In the previous section we saw that a multivariable controller form
can be constructed from an MFD G(s) = N (s)D−1 (s)
where D(s) is chosen such that its highest coefficient matrix Dh in (5.15) has full rank.
Referring to the model in Fig. 5.7, it is clear that the state variables of this system are
completely determined by the signal vector Ψ(s)V (s). We can therefore represent state
feedback through a gain matrix F as shown in Fig. 5.8. From Fig. 5.8, the closed-loop
transfer function is
\[
G_F(s) = \frac{Y(s)}{U_v(s)} = N(s)D_F^{-1}(s)
\]
We have
U (s) = D(s)V (s) = Uv (s) + F Ψ(s)V (s)
thus
Uv (s) = (D(s) − F Ψ(s))V (s) = DF (s)V (s)
which shows that
DF (s) = D(s) − F Ψ(s) (5.18)
and after substituting from (5.15)
\[
D_F(s) = D_h S(s) + (D_l - F)\Psi(s) \qquad (5.19)
\]
Figure 5.8: State feedback around the multivariable controller form: the external input uv is added to the state feedback F Ψ(s)V (s) to form u, which drives D−1 (s) and N (s) as in Fig. 5.6; the loop from uv to v realizes DF−1 (s).
This should be compared with the observation made in Exercise 1.6, that for a SISO
model under state feedback around a controller form realization (Ac , bc , cc ) we have
\[
\det(sI - A_c - b_c f) = s^n + (a_{n-1} - f_{n-1})s^{n-1} + \ldots + (a_0 - f_0)
\]
where
\[
\det(sI - A) = s^n + a_{n-1}s^{n-1} + \ldots + a_0
\]
is the open-loop characteristic polynomial. All coefficients (except for the highest power of
s) of the closed-loop polynomial can be assigned arbitrary values by the appropriate choice
of the state feedback gains fi , therefore the closed-loop eigenvalues can be arbitrarily
chosen. From (5.19) we see that the same is true for MIMO systems: by choosing the
gain matrix F we can arbitrarily change all coefficients in D(s) except for the highest
powers in each column. We would therefore expect that by choice of F we can obtain any
desired closed-loop polynomial. This is indeed the case, and we will derive a procedure
for computing a gain matrix F that achieves this. We also observe that - just as we found
for SISO systems - state feedback does not alter the numerator polynomial matrix N (s).
Assume now that a state space realization (Ac , Bc ) of the multivariable controller form
is given, and that state feedback u(t) = F x(t) is to be used to assign a set of distinct
closed-loop eigenvalues λ1 , λ2 , . . . , λn . These closed-loop eigenvalues are the solutions of
det DF (s) = 0
We will also need the closed-loop eigenvectors, i.e. the vectors hi , i = 1, . . . , n that satisfy
(Ac + Bc F )hi = λi hi
One can show that these eigenvectors have the form
hi = Ψ(λi )pi
where pi is any vector in the nullspace of DF (λi ), i.e. a vector such that
DF (λi )pi = 0
Defining gi = D(λi )pi , the condition DF (λi )pi = (D(λi ) − F Ψ(λi ))pi = 0 gives
F hi = gi , i = 1, . . . , n
or
F [h1 . . . hn ] = [g1 . . . gn ]
Because the eigenvalues are assumed distinct, the eigenvectors are linearly independent
and we can solve for the state feedback gain
\[
F = [g_1 \; \ldots \; g_n][h_1 \; \ldots \; h_n]^{-1} \qquad (5.20)
\]
If the desired eigenvalues are not distinct, one can use generalized eigenvectors to obtain
a unique solution F to the pole placement problem.
The discussion leading to (5.20) shows that not only the closed-loop eigenvalues, but -
within limits - also the closed-loop eigenvectors can be assigned by state feedback. The
procedure can be summarized as follows.
and such that eigenvectors associated with a complex conjugate eigenvalue pair are
also a complex conjugate pair. With this choice, there exist vectors pi such that
hi = Ψ(λi )pi , i = 1, . . . , n.
4. Compute the state feedback gain matrix F from (5.20). This gain matrix satisfies
(Ac + Bc F )hi = λi hi , i = 1, . . . , n
Step 2 of this procedure indicates that a given set of closed-loop eigenvalues does not
uniquely determine the state feedback gain, but that we can choose the closed-loop eigen-
vectors as well - subject to the constraint (5.21). Recall that for SISO systems equation
(2.9) shows that the solution to the pole placement problem is unique. For MIMO systems,
the additional degree of freedom can be exploited to find a solution that improves certain
performance measures. One way of choosing closed-loop eigenvalues and eigenvectors is
briefly outlined next.
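One way to see the extra degree of freedom concretely is the state-space analogue of this procedure (a sketch, not the MFD construction used above): for each desired eigenvalue λi, pick an input direction gi, solve (λi I − A)hi = B gi so that A hi + B gi = λi hi, and recover F from F hi = gi as in (5.20). In Python/NumPy, with hypothetical system matrices:

```python
import numpy as np

# Hypothetical 3-state, 2-input system (not the turbogenerator model)
A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 2., 3.]])
B = np.array([[0., 1.],
              [1., 0.],
              [0., 1.]])

# Distinct desired closed-loop eigenvalues, with freely chosen input
# directions g_i -- this is the extra MIMO degree of freedom
desired = [-1.0, -2.0, -3.0]
g_dirs = [np.array([1., 0.]), np.array([0., 1.]), np.array([1., 1.])]

# For each lambda_i solve (lambda_i I - A) h_i = B g_i, so that
# A h_i + B g_i = lambda_i h_i
H = np.column_stack([np.linalg.solve(lam * np.eye(3) - A, B @ g)
                     for lam, g in zip(desired, g_dirs)])
G = np.column_stack(g_dirs)

F = G @ np.linalg.inv(H)        # F h_i = g_i, the analogue of (5.20)

cl = np.sort(np.linalg.eigvals(A + B @ F).real)
assert np.allclose(cl, [-3., -2., -1.])
```

Changing the directions g_i changes F (and the closed-loop eigenvectors) without moving the assigned eigenvalues, which is exactly the freedom that optimal designs exploit.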
We will now extend the symmetric root locus design introduced in Section 4.4 to MIMO
systems. Consider a system with state space representation
\[
\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t), \qquad x(0) = x_0
\]
where (A, B) is stabilizable and (C, A) is detectable. As in the SISO case, assume that
x0 6= 0 and that we wish to find a control input u(t) that brings the system quickly back
to x = 0. A MIMO version of the cost function (4.12) is
\[
V = \int_0^\infty \left( y^T(t)y(t) + u^T(t)Ru(t) \right) dt, \qquad R > 0
\]
where R ∈ IR^{m×m} is a positive definite matrix that places a cost on the control effort. A
more general form of a cost function is
\[
V = \int_0^\infty \left( x^T(t)Qx(t) + u^T(t)Ru(t) \right) dt, \qquad R > 0, \; Q \geq 0 \qquad (5.22)
\]
where Q ∈ IR^{n×n} is a positive semidefinite weighting matrix on the control error. A
common choice of weighting matrices is
\[
Q = C^T C, \qquad R = \begin{bmatrix} \rho_1 & & 0 \\ & \ddots & \\ 0 & & \rho_m \end{bmatrix}
\]
where Q is fixed and the ρi can be used as tuning parameters as discussed below.
Next, define the matrix
\[
H = \begin{bmatrix} A & -BR^{-1}B^T \\ -Q & -A^T \end{bmatrix} \qquad (5.23)
\]
This 2n×2n matrix plays an important role in optimal control. By applying the similarity
transformation
\[
T = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix}
\]
one can show that H is similar to −H T , which implies that if λ is an eigenvalue of H then
so is −λ. A matrix with this property is called a Hamiltonian matrix; its eigenvalues are
symmetric about the imaginary axis. For a SISO system and the choices Q = cT c and
R = ρ, one can verify that the characteristic polynomial of H is exactly the polynomial
p(s) of the symmetric root locus design, that was introduced in Section 4.4 and provides
the optimal closed-loop eigenvalues. For MIMO systems, it turns out that the Hamiltonian
matrix (5.23) provides not only the optimal eigenvalues, but also the optimal eigenvectors.
Partition the eigenvectors of H as
\[
H \begin{bmatrix} h_i \\ g_i \end{bmatrix} = \lambda_i \begin{bmatrix} h_i \\ g_i \end{bmatrix}, \qquad i = 1, \ldots, 2n \qquad (5.24)
\]
where hi contains the top n entries of the 2n-dimensional eigenvectors. Assume that H
has no eigenvalues on the imaginary axis (as in the SISO case, it can be shown that
eigenvalues on the imaginary axis represent unobservable or uncontrollable modes, which
are excluded by the assumption of stabilizability and detectability). Then the optimal
control law that minimizes the cost V in (5.22) takes the form of state feedback u(t) = F x(t),
and we have the following result, which is given here without proof.
Theorem 5.7
The optimal state feedback control law uopt (t) = F x(t) that minimizes the cost function
(5.22) places the eigenvalues of (A + BF ) at the stable eigenvalues of H. Moreover, the
eigenvectors of (A + BF ) are the partitions hi of the eigenvectors of H that are associated
with the stable eigenvalues.
Theorem 5.7 can be combined with the formula (5.20) to solve an optimal pole placement
problem: compute the stable eigenvalues and eigenvectors of H, and use (5.20) to compute
the optimal state feedback gain F . With the choice of weighting functions Q = C T C and
R = diag (ρ1 , . . . , ρm ), the tuning parameters ρi can be used to trade control performance
against control effort in each input channel independently. This is illustrated in Exercise
5.10 for the turbogenerator design problem.
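Theorem 5.7 translates directly into a few lines of numerical linear algebra. A sketch in Python/NumPy for a hypothetical double-integrator example (not from the notes); the Riccati solution P = gs hs−1 built from the stable eigenvectors and the gain F = −R−1 BT P are the standard expressions for the Hamiltonian in (5.23):

```python
import numpy as np

# Hypothetical double-integrator example
A = np.array([[0., 1.],
              [0., 0.]])
B = np.array([[0.],
              [1.]])
Q = np.array([[1., 0.],
              [0., 0.]])        # Q = C^T C with C = [1 0]
R = np.array([[1.]])

Rinv = np.linalg.inv(R)
# Hamiltonian matrix (5.23)
H = np.block([[A, -B @ Rinv @ B.T],
              [-Q, -A.T]])

lam, V = np.linalg.eig(H)
stable = lam.real < 0           # n of the 2n eigenvalues are stable
Vs = V[:, stable]
hs, gs = Vs[:2, :], Vs[2:, :]   # partition as in (5.24)

# Riccati solution from the stable eigenvectors, then the feedback gain
P = np.real(gs @ np.linalg.inv(hs))
F = -Rinv @ B.T @ P

cl = np.sort_complex(np.linalg.eigvals(A + B @ F))
assert np.allclose(cl, np.sort_complex(lam[stable]))
```

For this example F works out to [−1, −√2], and the closed-loop eigenvalues sit on the unit circle in a Butterworth pattern, as the symmetric root locus predicts.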
An optimal observer gain, i.e. one that minimizes the noise power in the state estimation
error, is obtained by solving the dual version of the optimal pole placement problem, with
the replacements
\[
A, B \to A^T, C^T \qquad \text{and} \qquad Q, R \to Q_e, R_e
\]
In practice, the noise covariance matrices are often not known explicitly, but are used as
tuning parameters. Common choices are Qe = BB T and Re = diag(r1 , . . . , rl ), where the
values of ri can be used to tune the speed of estimation for each output channel. This is
illustrated in Exercise 5.15 for the turbogenerator design problem.
Exercises
Problem 5.1
b) Write a Matlab script to determine the eigenvalues of G(s) as s takes values along
the positive and negative imaginary axis, and to draw the characteristic loci.
c) Use the plot to determine the maximum value of the constant k such that with the
controller kI in negative feedback the system remains stable.
Hint: The commands frd and frdata are useful Matlab commands for working with
multivariable frequency responses.
Problem 5.2
S = (I + GK)−1
This function is called the sensitivity function. Show that S is also the transfer
function from do to y.
c) What is the transfer function from di to ug ? (This function is sometimes called the
input sensitivity function SI .) Show that GSI = SG. Note that SI = S for SISO
systems.
Figure 5.9: Feedback loop with controller K(s) and plant G(s): reference r, error e, controller output u, input disturbance di (ug = u + di ), output disturbance do and output y; negative feedback.
Problem 5.3
Consider the state space model of an experimental turbogenerator with the following
inputs and outputs:
Figure 5.10 indicates the inputs (u1 , u2 ), the outputs (y1 , y2 ) and the disturbances (d1 ,
d2 ) associated with the plant, G(s).
Figure 5.10: Turbogenerator plant G(s) with inputs u1 , u2 (entering the plant as ug1 , ug2 ), disturbances d1 , d2 and outputs y1 , y2 .
An observer-based state feedback controller is to be designed for this system such that
the following operating requirements are satisfied.
Matrices A, B, C and D of the plant model can be obtained from the Matlab file cs5_tgen.m.
The same model will also be used in Problems 5.10 and 5.15.
a) Draw the block diagram of the closed-loop system. Explain the structure of the
closed-loop system. Show the dimensions of the matrices and the signals in the
system.
b) Create the closed loop system in Simulink. Tune the pole locations of the controller
using pole placement to achieve the following setpoint tracking specifications.
i) The 5% settling time for output y1 should be less than 3.0 and the maximum
change in y2 should be less than 0.15 for step changes to setpoint r1 .
ii) The 5% settling time for output y2 should be less than 2.0 and the maximum
change in y1 should be less than 0.15 for step changes to setpoint r2 .
Hints:
c) Modify the observer pole locations with pole placement so that the maximum
changes in y1 and y2 are both less than 1.0 in response to a unit step disturbance
d1 (t) = σ(t).
Problem 5.4
where k11 = 4.5, k22 = 3.2 and τI = 0.1 in the configuration shown in Figure 5.9.
a) Use Matlab to show that the resulting closed loop system is stable.
b) Use the Matlab command sigma to plot the singular values of the sensitivity function
S and the complementary sensitivity function T . Explain the low and high frequency
behaviour in terms of the controller.
c) Calculate the input and output direction corresponding to each singular value of
S at zero frequency. Explain the significance of these directions in terms of the
controller.
Problem 5.5
The plant
\[
G(s) = \frac{1}{(s+2)(s+1)} \begin{bmatrix} 1 & 1 \\ 1 + 2s & 2 \end{bmatrix}
\]
is subject to an input disturbance d, and a controller is to be designed for tracking a
reference r, see Figure 5.11.
Figure 5.11: Control loop for Problem 5.5 with reference r, error e, input u, input disturbance d and plant G(s); negative feedback.
The maximum allowable magnitude of error and input in each channel are given by the
elements of the vectors
\[
e_{max} = \begin{bmatrix} 0.2 \\ 0.5 \end{bmatrix}, \qquad u_{max} = \begin{bmatrix} 1.0 \\ 2.0 \end{bmatrix}
\]
respectively, i.e.
|e1 (t)| < 0.2, |e2 (t)| < 0.5, |u1 (t)| < 1.0, |u2 (t)| < 2.0 ∀t > 0
The input disturbance vector di has its maximum allowable magnitudes at dmax and the
setpoint vector r has its maximum at rmax
\[
d_{max} = \begin{bmatrix} 0.1 \\ 0.1 \end{bmatrix}, \qquad r_{max} = \begin{bmatrix} 4 \\ 0.4 \end{bmatrix}
\]
The plant model is to be scaled so that with scaled variables ū, ȳ and d¯ the plant dynamics
can be described by
ȳ = Ḡū + Ḡd d¯
ē = Rr̃ − ȳ
a) Calculate a diagonal scaling matrix for the error e so that the scaled error vector ē
at its maximum is
\[
\bar{e}_{max} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
b) Use diagonal scalings of the true inputs u and outputs y to determine Ḡ so that the
scaled output and input vectors ȳ and ū at their maximum magnitudes are
\[
\bar{y}_{max} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \bar{u}_{max} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
Interpret these scalings in terms of the importance assigned to the scaled input and
output variables.
Hint: y should have the same scaling as e in part (a).
c) Determine a diagonal scaling of the true disturbance di to determine Ḡd so that the
scaled disturbance d¯ at its maximum magnitude is
\[
\bar{d}_{max} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
d) Calculate a diagonal scaling matrix R so that the scaled setpoint vector r̃ at its
maximum magnitude is
\[
\tilde{r}_{max} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
Hint: Note that Rr̃ should be in the same scaled units as ē and ȳ.
e) Consider the scaled plant in closed loop with scaled setpoint changes r̃. What
constraint on the corresponding sensitivity function S ensures that the error does
not violate the constraints during the expected variation of r?
Problem 5.6
b) Use row and column exchange to bring the non-zero elements of G̃(s) with lowest
order into the position (1, 1).
c) Use elementary row and column operations to bring the positions (1, 2) and (2, 1)
to zero.
d) Divide the elements of the matrix obtained in (c) by d(s). (The resulting matrix is
the Smith-McMillan form of the system.) What are the system poles and zeros?
Problem 5.7
\[
C = \begin{bmatrix} 1 & 2 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 1 \end{bmatrix}, \qquad
D = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
Use the Matlab command minreal to compute the Kalman canonical form of this system.
Identify the uncontrollable and unobservable states from this form.
Hint: Read the online help of the command minreal.
Problem 5.8
Construct by hand the Gilbert realization of the system with transfer function
\[
G(s) = \begin{bmatrix} \dfrac{-s}{s^2 + 3s + 2} & \dfrac{1}{s^2 + 3s + 2} \\[1ex] \dfrac{-s}{s^2 + 3s + 2} & \dfrac{-1}{s + 2} \end{bmatrix}
\]
Problem 5.9
The controllability matrix C of the Gilbert realization of a system with m inputs and
outputs, n state variables and a denominator with r distinct roots can be factorised as
\[
C = BV \quad \text{where} \quad
V = \begin{bmatrix} I_m & \lambda_1 I_m & \cdots & \lambda_1^{n-1} I_m \\ \vdots & \vdots & & \vdots \\ I_m & \lambda_r I_m & \cdots & \lambda_r^{n-1} I_m \end{bmatrix}
\]
For the special case n = 2, m = 2 and r = 2 (that is, no root is repeated) show this and
find the matrix B. Hint: Show first that the state space model B matrix can be written
as
\[
B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} = \begin{bmatrix} B_1 & 0 \\ 0 & B_2 \end{bmatrix} \begin{bmatrix} I \\ I \end{bmatrix}
\]
Use the fact that V has full rank to show that the Gilbert realization of a system is
controllable and, by duality, also observable.
Problem 5.10
For the turbogenerator plant from Problem 5.3 and with the controller structure in Figure
5.12, the cost function
\[
J = \int_0^\infty \left( x^T Qx + u^T Ru \right) dt
\]
is to be minimized, where
Figure 5.12: Closed-loop structure for Problem 5.10: the error r − y is integrated (1/s) and fed through the gain FI to give u; the plant is realized by B, an integrator 1/s, C and the state feedback matrix A, with disturbance d acting on the plant.
c) Use the Matlab command lqr and the Simulink model of the turbogenerator from
Problem 5.3 to find values of k2 and k3 with k1 = 1 such that in closed loop the
following conditions are satisfied:
d) Compare the LQR approach used here to the pole placement approach used in
Problem 5.3 in terms of achieved performance and time used in tuning the controller.
⋆ Problem 5.11
Figure 5.13: Feedback loop with reference r, error e, controller K(s), input u, plant G(s) and output y.
a) Use Matlab to determine the poles pi and zeros zi of the plant G(s).
b) Show that for each zero zi of G(s) there is a vector yzi ≠ 0 such that
\[
y_{z_i}^T G(z_i) = 0
\]
Calculate the vector yzi for each of the zeros zi of G(s). (The vectors yzi are called
the output directions of the zeros.)
c) Use the result from (b) to explain why internal stability requires that for all right
half plane zeros zi of G(s),
\[
y_{z_i}^T G(z_i)K(z_i) = 0
\]
d) Show that (I + K(s)G(s))−1 must not have any poles in the right half plane for the
closed loop to be internally stable. Use this fact and the result from (b) to show
that for all zeros zi in the right half plane we must have
\[
y_{z_i}^T G(z_i)K(z_i)\left( I + G(z_i)K(z_i) \right)^{-1} = 0
\]
that is
\[
y_{z_i}^T T(z_i) = 0
\]
where
\[
T(s) = G(s)K(s)\left( I + G(s)K(s) \right)^{-1}
\]
(Recall from Problem 5.2 that T (s) is the transfer function from r to y.) Show that
internal stability imposes the following constraints on the relationship between the
elements of T at s = zi for any right half plane zero zi
where α1 and α2 are constants depending on zi . What are their values for G(s)?
⋆ Problem 5.12
Here the constraints from Problem 5.11.e imposed by internal stability in the presence of
right half plane zeros and their effect on achievable closed-loop performance are further
investigated.
a) What constraints on the elements of T (s) for any 2 input 2 output system are
imposed by internal stability if in steady state the outputs must be equal to the
setpoints, that is as t → ∞, r → y?
Hint: Consider the conditions on T (s) as t → ∞, that is as s → 0
b) What constraints are imposed on the elements of T (s) for any 2 input 2 output
plant if it is desired that the two input-output channels of the system are completely
decoupled in closed loop, i.e. there is no movement in y2 following a change in r1
and no movement in y1 following a change in r2 ?
Hint: This is a constraint only on the off-diagonal elements.
where α1 and α2 are constants, must hold on the elements of T (s) if G(s) is a 2 × 2
transfer function with a right half plane zero at zi .
Show that the complementary sensitivity function
\[
T_1(s) = \begin{bmatrix} \dfrac{1}{1 + 0.01s} & 0 \\[1ex] 0 & \dfrac{1}{1 + 0.01s} \end{bmatrix}
\]
violates this constraint and the design constraints from parts (a) and (b), whereas
\[
T_2(s) = \begin{bmatrix} \dfrac{z_i - s}{z_i + s} & 0 \\[1ex] 0 & \dfrac{z_i - s}{z_i + s} \end{bmatrix}
\]
does not violate these constraints. Sketch the step responses. While fulfilling the
above constraints, is it possible to find a T (s) with a better step response? Explain
how these constraints lead to a fundamental limitation imposed by right half plane
zeros.
⋆ Problem 5.13
(Figure 5.14: block diagram realization of G(s) - the input u passes through Dh^{-1} and G0(s) to Nl and the output y, with a feedback path through Dh^{-1} Dl; internal signals w1 and w2)
where

D(s) = Dh S(s) + Dl ψ(s),   S(s) = [ s^2  0 ; 0  s^3 ],   ψ(s) = [ s  0 ; 1  0 ; 0  s^2 ; 0  s ; 0  1 ]
The signal vector v is defined as
u = D(s)v
The system G(s) can then be represented as in Figure 5.14, where w1 = S(s)v and
w2 = ψ(s)v.
a) What is the transfer matrix G0 (s) in Figure 5.14 as a function of the matrices
introduced here?
c) What is the matrix Nl in Figure 5.14 for the given system G(s)?
Hint: What is y as a function of v and of ψ(s)v ?
⋆ Problem 5.14
For the following, use the state space realization of G(s) from Problem 5.13.
a) Use Matlab to calculate a state feedback gain matrix F that places the closed-loop
poles at
−2, −3 + 2j, −3 − 2j, −5 + 3j, −5 − 3j
hi = ψ(λi )pi
where
DF (s) = D(s) − F ψ(s)
d) Check that
DF (λi )pi = 0
holds for all closed-loop eigenvalues.
Problem 5.15
(Block diagram for Problem 5.15: the reference r drives an integrator 1/s with state xI and gain FI; the plant G(s) is controlled by state feedback F acting on the state estimate of an observer built from B, C, an integrator 1/s and the gain L; n is measurement noise; the loop can be broken at points p1 and p2)
b) Tune the filter by changing the gains r1 and r2 , so that the following conditions are
fulfilled
i) Conditions (i) to (iii) from Problem 5.10 part (c) are satisfied.
ii) Following a step disturbance d1 (t) = σ(t) or d2 (t) = σ(t) the maximum changes
in y1 and y2 should both be less than 0.6.
iii) The spectral density of the noise in u1 and u2 is less than 0.05 when mea-
surement noise is present at each of the outputs of the plant (parameters as
defined for the Simulink block band-limited white noise) with noise power
= 0.0001 and sample time = 0.01
Hint: You should modify the Simulink model of the controller to include the Kalman filter. The closed-loop performance is not sensitive to small changes in Re - consider values of r1 between 10^{-2} and 10^4 and variations in r1/r2 between 10^{-4} and 10^4.
c) Use Matlab to plot the singular value plots of the closed-loop system, and of the open-loop system when the loop is broken at points p1 and p2.
Describe the relationship between the open-loop and closed-loop frequency responses,
and between the closed-loop frequency response and the time responses.
Problem 5.16
c) What are the gain and phase margins for the closed-loop system from w1 to z1 , and
what are the margins from w2 to z2 ?
(Figure for Problem 5.16: the feedback loop with plant G(s) and outputs y1, y2, drawn twice - once with the loop broken at the first plant input (signals z1, w1) and once with the loop broken at the second plant input (signals z2, w2))
d) The plant inputs are now perturbed by a constant matrix G̃ such that
ũ = G̃u, y = G(s)ũ
where
ũ1 = (1 + ǫ1 )u1 , ũ2 = (1 + ǫ2 )u2
and ǫ1 , ǫ2 are constants.
Determine an open-loop state space model Ã, B̃, C̃ with these errors, and the cor-
responding closed-loop model Ãcl and B̃cl .
e) Write down the characteristic polynomial of Ãcl(ǫ). With ǫ2 = 0, how large can ǫ1 become before the system becomes unstable? With ǫ1 = 0, how large can ǫ2 become before the system becomes unstable?
f) With ǫ2 = −ǫ1 = −ǫ how large can ǫ become, before the system becomes unstable?
g) What does this suggest about the value of SISO gain margins in determining the
robustness of MIMO systems?
Chapter 6
Digital Control
All controllers discussed so far were modelled by differential equations, transfer function
(matrices) or by linear state space models involving a first order vector differential equa-
tion. These representations have in common that the input and output signals of such
controllers are defined on the whole time axis - we will call such signals continuous-time
signals. These assumptions imply that controllers are implemented using analog electronic
devices. Today, most controllers are implemented digitally - the control law is realized
as code on a microprocessor. Microprocessors cannot handle continuous-time signals di-
rectly, they can only process sequences of numbers. A signal consisting of a sequence of
numbers is called a discrete-time signal. In this chapter we will develop tools for analyzing
and designing digital controllers, and we will see that many results on continuous-time
controller design can be translated into a discrete-time framework.
(Figure 6.1: control loop with a digital controller - A/D converter with sampling period T, control algorithm, D/A converter and hold element, and plant G(s))
The block diagram in Fig. 6.1 shows the structure of a control loop with a digital controller.
The plant - represented by the transfer function G(s) - has continuous-time input and
output signals. An analog-to-digital (A/D) converter takes samples of the measured
feedback signal y(t) repeatedly every T seconds, where T is the sampling period, and
converts the sampled values into binary numbers. These are subtracted from samples
of the reference input (here assumed also binary) to generate a sequence of sampled
control errors. A control algorithm in the form of difference equations takes the sampled
control error as input and generates a sequence of values, which is then converted into
a continuous-time control signal u(t) by a digital-to-analog (D/A) converter and a hold
element. We assume that the operation of the A/D and D/A converters and of the
sample and hold devices are synchronized. The components inside the dashed box in Fig.
6.1 together form a digital controller.
The operation of a digital controller involves two types of discretization: a discretization of
time, and a discretization of signal values due to the conversion into binary numbers with
finite word length. In this chapter we will study the consequences of the discretization of
time. We will assume that the resolution of the AD converter is sufficiently high so that
quantization effects can be neglected. The effect of finite word length at fast sampling
rates is however explored in Exercise 6.12.
(Figure 6.2: signal conversion inside the digital controller - sampled signals y(kT), u(kT) and continuous-time signals y(t), u(t))
The conversion between continuous-time and discrete-time signals inside the digital con-
troller is shown again in Fig. 6.2. The signals entering and leaving the controller are
continuous-time signals, they represent plant input u(t) and output y(t). The control
algorithm itself operates on discrete-time signals, i.e. sequences of numbers u(kT ) and
y(kT ), which are the values of the continuous-time plant input and output sampled at
time instants t = kT, where k = 0, 1, . . . The block labelled “D/A hold” converts the
sequence of control inputs u(kT ) into a continuous-time signal u(t) that can be applied
to the plant input. Usually a zero-order hold is used for this purpose; a zero-order hold
generates a continuous-time signal according to
u(t) = u(kT ), kT ≤ t < (k + 1)T. (6.1)
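A zero-order hold according to (6.1) can be sketched in a few lines of Python (the sample values below are made up for illustration):

```python
def zoh(samples, T, t):
    """Zero-order hold (6.1): u(t) = u(kT) for kT <= t < (k+1)T."""
    k = min(int(t // T), len(samples) - 1)   # hold last value past the end
    return samples[k]

u_k = [0.0, 1.0, 0.5, 0.75]   # an example sequence u(kT)
T = 0.1
print(zoh(u_k, T, 0.05))   # 0.0  (0 <= t < T)
print(zoh(u_k, T, 0.19))   # 1.0  (T <= t < 2T)
```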
The block labelled “difference equations” in Fig. 6.2 represents the dynamic behaviour
of the controller. The input to this block is the sequence of numbers y(kT ), and the
output is the sequence of numbers u(kT ), where k = 0, 1, . . . The factor T in the time
argument reflects the fact that these sequences are generated by sampling continuous-
time signals every T seconds. The processor that generates the control action is only
sequentially processing incoming numbers - the sampling period itself has no effect on the
sequence of numbers that is produced as output. (We assume however that the sampling
period is longer than the computation time required for generating the next output value).
When studying discrete-time systems, we will use a simplified notation and write y(k) and
u(k) instead of y(kT ) and u(kT ), but we will return to the latter notation when we are
interested in the interaction between discrete-time signals and continuous-time systems.
Formally we let the integer variable k run from −∞ to +∞, but - similar to our treatment
of continuous-time systems - we will assume that
x(k) = 0, k<0
i.e. all signals that we consider are zero for k < 0. Important discrete-time signals which we will use frequently in this chapter are the discrete-time unit impulse

δ(k) = 1 if k = 0, 0 otherwise    (6.2)

the unit step

σ(k) = 1 if k ≥ 0, 0 otherwise    (6.3)

and the exponential function

x(k) = e^{-ak}, k ≥ 0    (6.4)

Note that using the step function we can also write x(k) = e^{-ak} σ(k).
Just as the dynamic behaviour of a continuous-time system can be represented by differential equations, the dynamics of a discrete-time system can be described - as the block label in Fig. 6.2 suggests - by difference equations. A linear difference equation has the form

y(k) + a1 y(k-1) + . . . + an y(k-n) = b0 u(k) + b1 u(k-1) + . . . + bn u(k-n)    (6.5)

If b0 ≠ 0 the output y(k) at time k depends on the input u(k) at the same time - the system responds instantaneously. Physically realizable systems cannot do this, and we will therefore usually assume b0 = 0.
Example 6.1
The use of difference equations for describing discrete-time dynamic behaviour is now
illustrated with an example. Consider the system

y(k) = -0.5 y(k-1) + u(k-1)    (6.6)

and the input signal u(k) = σ(k). Because input and output are zero for k < 0 we have y(0) = 0. The solution for k > 0 is obtained by evaluating the recursion step by step:
k 0 1 2 3 4 5 ...
u(k) 1 1 1 1 1 1 ...
y(k) 0 1 0.5 0.75 0.625 0.6875 . . .
The output is oscillating and appears to converge towards a value between 0.625 and
0.6875. If the difference equation (6.6) is changed to

y(k) = 0.5 y(k-1) + u(k-1)

we have to solve this recursion instead, yielding
k 0 1 2 3 4 5 ...
u(k) 1 1 1 1 1 1 ...
y(k) 0 1 1.5 1.75 1.875 1.9375 . . .
Now the solution is monotonically increasing and appears to converge to a value around
2. We will return to this example after introducing the z-transform.
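The two recursions are easy to reproduce in code; a minimal Python sketch of y(k) = a·y(k − 1) + u(k − 1) for a unit step input:

```python
def simulate(a, steps):
    """Solve y(k) = a*y(k-1) + u(k-1) recursively for u(k) = sigma(k)."""
    y = [0.0]                        # y(0) = 0 since u(k) = 0 for k < 0
    for _ in range(steps):
        y.append(a * y[-1] + 1.0)    # u(k-1) = 1 for k >= 1
    return y

print(simulate(-0.5, 5))   # [0.0, 1.0, 0.5, 0.75, 0.625, 0.6875]
print(simulate(0.5, 5))    # [0.0, 1.0, 1.5, 1.75, 1.875, 1.9375]
```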
The z-Transform
When dealing with continuous-time systems, we found it convenient to transform linear
differential equations into algebraic equations in frequency domain; this allows us to
represent linear systems by transfer function models. The tool for doing this is the
Laplace transform, and a similar tool - the z-transform - is available for transforming
difference equations into algebraic frequency domain equations. Recall that the Laplace
transform of a continuous-time signal x(t) which is zero for negative time is defined as
L[x(t)] = X(s) = ∫_0^∞ x(t) e^{-st} dt
This transformation maps functions of time into functions of the complex variable s. Its
usefulness comes from the fact that (assuming x(0) = 0)
L[ẋ(t)] = sX(s)
which means that taking derivatives in time domain reduces to multiplication by the
complex variable s in frequency domain. For a discrete-time signal x(k) which is zero for
k < 0, we now define its z-transform as
Z[x(k)] = X(z) = Σ_{k=0}^∞ x(k) z^{-k}    (6.8)
This transformation maps functions of the discrete time variable k into functions of the
complex variable z. Equation (6.8) defines the z-transform as an infinite power series in
z −1 . Just as the Laplace integral converges for all s such that Re(s) > c if the limit
exists, the power series in (6.8) converges for all z such that |z| > r if the limit
lim_{k→∞} r^{-k} |x(k)|
exists. Because the signals we usually encounter in control systems do not grow faster
than exponentially, both Laplace transform and z-transform converge, so that the region
of convergence is usually not an issue of concern.
That the z-transform is as useful as the Laplace transform is due to the fact that
Z[x(k-1)] = Σ_{k=0}^∞ x(k-1) z^{-k} = Σ_{l=-1}^∞ x(l) z^{-l-1} = z^{-1} Σ_{l=0}^∞ x(l) z^{-l}
where a change of variables l = k - 1 has been used together with the assumption that x(k) = 0 for k < 0. Comparing the last expression with the definition of the z-transform, we find

Z[x(k-1)] = z^{-1} X(z)

Applying this property to the difference equation (6.5) with b0 = 0 gives

(1 + a1 z^{-1} + . . . + an z^{-n}) Y(z) = (b1 z^{-1} + . . . + bn z^{-n}) U(z)
and thus the pulse transfer function

G(z) = (b1 z^{-1} + . . . + bn z^{-n}) / (1 + a1 z^{-1} + . . . + an z^{-n}) = (b1 z^{n-1} + . . . + bn) / (z^n + a1 z^{n-1} + . . . + an)    (6.12)
This shows that the z-transform can be used in a way similar to the Laplace transform to
obtain a transfer function model of a discrete-time system. Before we study the dynamic
behaviour of discrete-time systems, we first compute the z-transform of the three discrete-
time signals introduced before: unit impulse, unit step and exponential function.
Unit Impulse
The discrete-time unit impulse was defined in (6.2), and from the definition we have
immediately
Z[δ(k)] = Σ_{k=0}^∞ δ(k) z^{-k} = 1    (6.13)
Unit Step
From (6.3) we obtain
Z[σ(k)] = Σ_{k=0}^∞ z^{-k} = 1/(1 - z^{-1}) = z/(z - 1)    (6.14)
Exponential Function
The z-transform of signals described by (6.4) can be computed as
Z[x(k)] = Σ_{k=0}^∞ e^{-ak} z^{-k} = Σ_{k=0}^∞ (e^{-a} z^{-1})^k = 1/(1 - e^{-a} z^{-1}) = z/(z - e^{-a})    (6.15)
A further useful result is the final value theorem

lim_{k→∞} x(k) = lim_{z→1} (z - 1) X(z)    (6.16)

This result can be used to compute the static gain of a system with transfer function G(z). The step response is

Y(z) = G(z) z/(z - 1)

and using (6.16) yields

y(∞) = lim_{z→1} (z - 1) G(z) z/(z - 1) = G(1)    (6.17)
provided the limit exists. Thus, the static gain of a discrete-time system is obtained by
evaluating its transfer function at z = 1.
Example 6.2
The first system in Example 6.1 has the pulse transfer function

G(z) = z^{-1} / (1 + 0.5 z^{-1}) = 1 / (z + 0.5)

and a static gain G(1) = 2/3, which is indeed between 0.625 and 0.6875. The second system has

G(z) = 1 / (z - 0.5)

and a static gain G(1) = 2, confirming the time domain solution. The use of pulse transfer functions is further explored in Exercise 6.1.
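A quick numeric check of both static gains, evaluating each pulse transfer function at z = 1 as in (6.17):

```python
# Static gains of the two systems from Example 6.1
def G_osc(z):  return 1.0 / (z + 0.5)   # y(k) = -0.5 y(k-1) + u(k-1)
def G_mono(z): return 1.0 / (z - 0.5)   # y(k) =  0.5 y(k-1) + u(k-1)

print(G_osc(1.0))    # 0.666..., between the observed 0.625 and 0.6875
print(G_mono(1.0))   # 2.0, the limit of the monotonic response
```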
X(s)           x(t)                  x(kT)                   X(z)
-              -                     δ(k)                    1
1/s            σ(t)                  σ(k)                    1/(1 - z^{-1})
1/(s+a)        e^{-at} σ(t)          e^{-akT} σ(k)           1/(1 - e^{-aT} z^{-1})
1/s^2          t σ(t)                kT σ(k)                 T z^{-1}/(1 - z^{-1})^2
a/(s(s+a))     (1 - e^{-at}) σ(t)    (1 - e^{-akT}) σ(k)     (1 - e^{-aT}) z^{-1}/((1 - z^{-1})(1 - e^{-aT} z^{-1}))
1/(s+a)^2      t e^{-at} σ(t)        kT e^{-akT} σ(k)        T e^{-aT} z^{-1}/(1 - e^{-aT} z^{-1})^2
s/(s+a)^2      (1 - at) e^{-at} σ(t) (1 - akT) e^{-akT} σ(k) (1 - (1 + aT) e^{-aT} z^{-1})/(1 - e^{-aT} z^{-1})^2
b/(s^2+b^2)    sin bt σ(t)           sin bkT σ(k)            z^{-1} sin bT/(1 - 2 z^{-1} cos bT + z^{-2})
s/(s^2+b^2)    cos bt σ(t)           cos bkT σ(k)            (1 - z^{-1} cos bT)/(1 - 2 z^{-1} cos bT + z^{-2})
-              -                     a^k σ(k)                1/(1 - a z^{-1})
Thus G(z) is the z-transform of the impulse response y(k) = g(k), and we have
G(z) = Σ_{k=0}^∞ g(k) z^{-k}
In Exercise 6.9 it is shown that for an arbitrary input signal u(k) the output is given by
y(k) = Σ_{l=0}^k g(l) u(k-l)    (6.18)
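The convolution sum (6.18) can be checked against Example 6.1: the impulse response of y(k) = 0.5 y(k − 1) + u(k − 1) is g(0) = 0 and g(k) = 0.5^{k−1} for k ≥ 1, and convolving it with a unit step reproduces the step response. A Python sketch with these values:

```python
def output(g, u, k):
    """Convolution sum (6.18): y(k) = sum_{l=0}^{k} g(l) u(k-l)."""
    return sum(g(l) * u(k - l) for l in range(k + 1))

g = lambda k: 0.0 if k == 0 else 0.5 ** (k - 1)   # impulse response
u = lambda k: 1.0 if k >= 0 else 0.0              # unit step

print([output(g, u, k) for k in range(6)])
# [0.0, 1.0, 1.5, 1.75, 1.875, 1.9375] -- the step response of Example 6.1
```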
(Figure 6.4: decomposition of G(z) - u(k) → 1/a(z) → v(k) → b(z) → y(k))
Using the fictitious signal v(k), it is possible to construct the discrete-time version of
the controller canonical form. In contrast to continuous-time systems, here it is not a
chain of integrators but a chain of time delay blocks that forms the core of the simulation
model. This reflects the fact that we are dealing with difference equations rather than
differential equations. Accordingly, the state equation is a first order vector difference
equation instead of a vector differential equation. A state space model of a discrete-time
MIMO system has the form

x(k+1) = Φ x(k) + Γ u(k)
y(k) = C x(k) + D u(k)    (6.19)

where Φ and Γ denote the discrete-time system matrix and input matrix, and C and D the
measurement and feedthrough matrix, respectively. Note that for a SISO system D = 0 if
b0 = 0 in (6.5) or (6.12); for a MIMO system, the b0 coefficients of all entries of the transfer function
matrix must be zero. A way of constructing the discrete-time controller canonical form
of a SISO model from the representation in Fig. 6.4 is shown in Exercise 6.4.
Since the state equation is a difference equation, its solution can be constructed recursively.
Assume that x(0) and an input sequence u(k), k = 0, 1, 2, . . . for the model (6.19) are given. Starting at time k = 1 we have

x(1) = Φ x(0) + Γ u(0)
x(2) = Φ x(1) + Γ u(1) = Φ^2 x(0) + Φ Γ u(0) + Γ u(1)

and in general

x(k) = Φ^k x(0) + Σ_{l=0}^{k-1} Φ^{k-l-1} Γ u(l)    (6.20)

The first term on the right hand side of the last equation represents the unforced (zero-
input) response, whereas the second term describes the forced (zero-initial-state) response.
Thus, the relationship between state space model and transfer function of a discrete-time
system is
G(z) = C(zI − Φ)−1 Γ + D (6.21)
Note that this relationship is the same as that for continuous-time systems. In particular,
the poles of the pulse transfer function are eigenvalues of the system matrix Φ.
Definition 6.1 An unforced system x(k+1) = Φ x(k) is said to be stable if for all x(0) = x0, x0 ∈ IR^n, we have x(k) → 0 as k → ∞.

The solution of the state equation is

x(k) = Φ^k x(0)
Assuming that it is possible to diagonalize Φ, the elements of the response x(k) are linear combinations of terms λi^k, where λi are the eigenvalues of Φ. Since stability requires that
all solutions go to zero as k goes to infinity, we need
|λi | < 1, i = 1, . . . , n
We thus have
Theorem 6.1 A discrete-time system x(k + 1) = Φx(k) is stable if and only if all
eigenvalues of Φ are strictly inside the unit disc.
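For a 2 × 2 example (made-up numbers), the condition of Theorem 6.1 can be checked by computing the eigenvalues from trace and determinant:

```python
import cmath

# Stability check (Theorem 6.1): all eigenvalues of Phi strictly inside
# the unit disc.  For a 2x2 matrix the eigenvalues are the roots of
# z^2 - trace*z + det = 0.
def eig2(Phi):
    (a, b), (c, d) = Phi
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

Phi = [[0.5, 1.0], [0.0, -0.8]]          # example system matrix
lams = eig2(Phi)
print([abs(l) for l in lams])            # ~[0.5, 0.8]
print(all(abs(l) < 1 for l in lams))     # True -> stable
```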
From (6.21) and the relationship between pulse transfer function poles and eigenvalues of
Φ, we expect that a pulse transfer function is stable if its poles are strictly inside the unit
disc. This is indeed confirmed in the next section where stability of sampled-data systems
is discussed.
Definition 6.2
The discrete-time system (6.19) is said to be controllable if for any initial state x(0) = x0 ,
kf > 0 and final state xf there exists a control sequence u(k), 0 ≤ k ≤ kf , such that the
solution of (6.19) satisfies x(kf ) = xf . Otherwise, the system is said to be uncontrollable.
Definition 6.3
The discrete-time system with state space model (6.19) is said to be observable if there
exists a kf > 0 such that the initial state x(0) can be uniquely determined from the input
sequence u(k) and the output sequence y(k) in the interval 0 ≤ k ≤ kf . Otherwise, the
system is said to be unobservable.
Whereas the controllability and observability Gramians take different forms for discrete-
time systems, the controllability and observability matrices have the same form as for
continuous-time systems.
To show this for the controllability matrix, consider the solution of (6.19) given by (6.20)
at k = n:

x(n) = Φ^n x(0) + Σ_{l=0}^{n-1} Φ^{n-l-1} Γ u(l)
l=0
= Φ^n x(0) + Φ^{n-1} Γ u(0) + . . . + Γ u(n-1)
= Φ^n x(0) + Cd U
where
Cd = [Γ  ΦΓ  . . .  Φ^{n-1} Γ],    U = [u^T(n-1)  u^T(n-2)  . . .  u^T(0)]^T
From
x(n) - Φ^n x(0) = Cd U
we have
Theorem 6.2 There exists a sequence {u(0), u(1), . . . , u(n − 1)} that takes the system
(6.19) from any initial state to any desired state in no more than n steps if and only if
Cd has rank n.
Note that the part of the state space that can be reached from the origin (the controllable
subspace) is spanned by the columns of the controllability matrix.
To derive an equivalent result for observability of discrete-time systems, we consider with-
out loss of generality instead of (6.19) the unforced system
x(k + 1) = Φx(k)
y(k) = Cx(k)
because - as in the continuous-time case - the effect of a known input sequence can be
eliminated from the model. Assume that y(0), y(1), . . . , y(n − 1) are known. We then
have
y(0) = Cx(0)
y(1) = Cx(1) = CΦx(0)
..
.
y(n − 1) = CΦn−1 x(0)
or in vector notation
Y = Od x(0)
where

Od = [C; CΦ; . . . ; CΦ^{n-1}]

is the discrete-time observability matrix. We then have
Theorem 6.3 The discrete-time system (6.19) is observable if and only if the observ-
ability matrix Od has rank n.
Stabilizability and detectability of discrete-time systems can be defined in the same way
as for continuous-time systems, also the Kalman decomposition and the concept of a
minimal realization.
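For n = 2 the matrices Cd and Od can be written out by hand; the Python sketch below uses the zero-order-hold discretization of a double integrator with T = 1 as an assumed example:

```python
# Controllability and observability matrices for n = 2:
# Cd = [Gamma  Phi*Gamma],  Od = [C; C*Phi].
# Example: ZOH discretization of a double integrator with T = 1.
Phi = [[1.0, 1.0], [0.0, 1.0]]
Gam = [0.5, 1.0]                 # [T^2/2, T]
C   = [1.0, 0.0]

PhiGam = [Phi[0][0] * Gam[0] + Phi[0][1] * Gam[1],
          Phi[1][0] * Gam[0] + Phi[1][1] * Gam[1]]
Cd = [[Gam[0], PhiGam[0]], [Gam[1], PhiGam[1]]]

CPhi = [C[0] * Phi[0][0] + C[1] * Phi[1][0],
        C[0] * Phi[0][1] + C[1] * Phi[1][1]]
Od = [C, CPhi]

det = lambda M: M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(det(Cd))        # -1.0 -> rank 2, controllable
print(det(Od))        #  1.0 -> rank 2, observable
```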
In (6.8) we defined the z-transform of a discrete-time signal x(k) without making any
reference to the possibility that this sequence may have been obtained by sampling a
continuous-time signal x(t) at time instants t = kT, k = 0, 1, 2, . . . Therefore the notion
of a sampling period T played no role when we introduced the pulse transfer function. We
now return to the continuous-time origin of the discretized signals, and we will explore
the relationship between the Laplace transform of a continuous-time signal x(t) and its
sampled version x(kT ). For this purpose, we will now have a closer look at the sampling
process. A useful mathematical model for the process of sampling a signal is given by
the impulse sampler shown in Fig. 6.5. In this model the sampling process is viewed as
modulating a train of delta impulses δ(t), δ(t − T ), δ(t − 2T ) etc., i.e. the sampled signal
(Figure 6.5: impulse sampler - the continuous-time signal x(t) and its impulse-train representation x*(t))
is represented by a train of delta impulses where the impulse weight at time kT is equal
to the value of the sampled signal x(kT ) at this sampling instant.
The output x∗ (t) of the impulse sampler is then given by
x*(t) = Σ_{k=0}^∞ x(kT) δ(t - kT)    (6.22)
Taking the Laplace transform of (6.22) gives

X*(s) = Σ_{k=0}^∞ x(kT) e^{-kTs}    (6.23)

Comparing this with the definition (6.8) of the z-transform, we find that these two transforms are equivalent if we define the complex variable z as

z = e^{Ts}    (6.24)
Example 6.3
To illustrate the above relationship between the complex variables of the Laplace and z-transform, respectively, consider the exponential function

x(t) = K e^{-at} σ(t)

which would be a component of the step response of a continuous-time system with a pole at s = -a. The z-transform of its sampled version is

Z[x(kT)] = K / (1 - e^{-aT} z^{-1})
Note that we now include the sampling period T . Again, this signal would be a component
of the step response of a discrete-time system with a pole at z = e−aT . Comparing the pole
locations of the continuous-time and the discrete-time system shows that the continuous-
time pole in the s-plane is mapped into the z-plane according to (6.24).
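The mapping z = e^{sT} is easily tabulated numerically (example poles and sampling period assumed):

```python
import cmath

# Mapping of continuous-time poles into the z-plane under sampling:
# z = e^{s*T}, so |z| = e^{Re(s)*T} and stable poles (Re(s) < 0)
# land strictly inside the unit circle.
T = 0.1
for s in [-1.0, -5.0, complex(-3.0, 2.0)]:   # example s-plane poles
    z = cmath.exp(s * T)
    print(s, '->', z, '|z| =', abs(z))
```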
(Figure 6.6: z-plane grid of curves of constant damping ratio ζ and constant natural frequency ωn T)
The poles of a sampled second order system with damping ratio ζ and natural frequency ωn are the roots of

z^2 + a1 z + a0

where

a1 = -2 e^{-ζωn T} cos(√(1 - ζ^2) ωn T)

and

a0 = e^{-2ζωn T}
Figure 6.6 shows the curves of constant ζ and ωn T for the sampled system. This figure
relates the damping ratio and the natural frequency of a discrete-time second order system
to its pole locations. See also Exercise 6.3.
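A numeric spot check (example values for ζ, ωn and T assumed): the mapped pole z = e^{(−ζωn + jωd)T}, with ωd the damped natural frequency, must be a root of z² + a1 z + a0:

```python
import math, cmath

# Coefficients of z^2 + a1*z + a0 for the sampled second-order system
# (example values for damping ratio, natural frequency and sampling time)
zeta, wn, T = 0.5, 2.0, 0.1
wd = wn * math.sqrt(1 - zeta**2)                 # damped frequency
a1 = -2 * math.exp(-zeta * wn * T) * math.cos(wd * T)
a0 = math.exp(-2 * zeta * wn * T)

# The mapped pole z = exp((-zeta*wn + j*wd)*T) must be a root:
z = cmath.exp(complex(-zeta * wn, wd) * T)
print(abs(z**2 + a1 * z + a0))              # ~0
print(abs(z), math.exp(-zeta * wn * T))     # |z| = e^{-zeta*wn*T}
```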
We are now in a position to address the problem of designing a digital controller for the
control loop shown in Fig. 6.1. The controller is a discrete-time system, while the plant
inputs and outputs are continuous-time signals. There are two ways of dealing with this
hybrid nature of the control loop:
1. One can first design a continuous-time controller C(s), using standard design techniques for continuous-time systems, and then try to find a pulse transfer function D(z) that approximates the dynamic behaviour of the controller C(s).

2. One can first compute a discrete-time model of the plant, and then design a discrete-time controller directly for the discretized plant.
(Figure 6.7: control loop in which the continuous-time controller C(s) is replaced by a sampler with period T, the discrete-time controller D(z) and a zero-order hold, in feedback with the plant G(s))
Note that the input e(t) and the output u(t) of the controller are continuous-time signals.
Assume that a continuous-time controller C(s) has been designed that meets the design
specifications. The task is then to find a pulse transfer function D(z) such that the dashed
block - which is a continuous-time system due to the presence of the sampler and the hold
- approximates C(s) sufficiently well so that the design specifications are also met with a
digital controller. There are several ways of constructing discrete-time approximations of
continuous-time transfer functions; one of them - known as Tustin approximation - will
be discussed next.
We begin with the discrete-time approximation of the basic building block of continuous-
time systems, an integrator. Thus, consider a pure integral controller
u(t) = ∫_0^t e(τ) dτ

At time t = kT we have

u(kT) = ∫_0^{kT} e(τ) dτ = ∫_0^{kT-T} e(τ) dτ + ∫_{kT-T}^{kT} e(τ) dτ

or

u(kT) = u(kT-T) + ∫_{kT-T}^{kT} e(τ) dτ
The second term on the right hand side represents the area under e(t) marked in Fig. 6.8.
For the Tustin method, e(t) is approximated between the last two samples by a straight
line. This yields
u(kT) = u(kT-T) + (T/2) (e(kT-T) + e(kT))
(Figure 6.8: trapezoidal approximation of the area under e(t) between kT - T and kT)
or in z-domain

U(z) = z^{-1} U(z) + (T/2) (z^{-1} E(z) + E(z))

Thus we find the pulse transfer function from e to u as

U(z)/E(z) = (T/2) · (1 + z^{-1})/(1 - z^{-1}) = 1 / ( (2/T) · (1 - z^{-1})/(1 + z^{-1}) )
Similarly one can check that for a controller with continuous-time transfer function
C(s) = K/(s + a)

the same approximation as above leads to a discrete-time controller

D(z) = K / ( (2/T) · (1 - z^{-1})/(1 + z^{-1}) + a )
More generally, making the substitution
s = (2/T) · (1 - z^{-1})/(1 + z^{-1}) = (2/T) · (z - 1)/(z + 1)    (6.25)
in every term in the controller transfer function that contains s yields a pulse transfer
function D(z) that is based on the above trapezoidal approximation of an integral.
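For C(s) = K/(s + a) the substitution (6.25) can be carried out by hand, giving D(z) = KT(z + 1)/((2 + aT)z − (2 − aT)); the Python sketch below verifies this identity numerically (example values assumed):

```python
# Tustin discretization of C(s) = K/(s + a).  Substituting (6.25) by hand
# gives D(z) = K*T*(z + 1) / ((2 + a*T)*z - (2 - a*T)).
# K, a and T are example values, not taken from the text.
K, a, T = 3.0, 2.0, 0.1

def C(s):
    return K / (s + a)

def D(z):
    return K * T * (z + 1) / ((2 + a * T) * z - (2 - a * T))

def tustin_s(z):
    return (2 / T) * (z - 1) / (z + 1)   # the substitution (6.25)

z = 0.7 + 0.2j
print(abs(D(z) - C(tustin_s(z))))        # ~0: D is C under the substitution
print(D(1.0), C(0.0))                    # static gains agree (both K/a = 1.5)
```

Note that z = 1 corresponds to s = 0 under (6.25), so the Tustin approximation preserves the static gain exactly.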
If a discrete-time controller D(z) is implemented in the control system in Fig. 6.7, the
value of the control signal u(t) is held constant by the zero order hold block connected to
the controller output until the next value is available, so that the continuous-time control
input consists of steps as shown in Fig. 6.2. Due to the fact that physical systems have a
limited bandwidth, the effect of this control input on the plant will be the same as that of
a low-pass filtered version of this step-shaped signal. This is illustrated in Fig. 6.9, where
it can be seen that the hold operation introduces a time-delay of approximately T /2.
Thus, while the idea was to approximate the behaviour of the continuous-time controller
C(s) by a discrete-time controller, the discretized controller D(z) is actually emulating
the effect of a controller
C̃(s) ≈ C(s) e^{-sT/2} ≈ C(s) · (2/T)/(s + 2/T)
(Figure 6.9: step-shaped control signal u(t) produced by the zero-order hold from the sequence u(kT), and its low-pass filtered version, which is delayed by approximately T/2)
Taking this time delay into account when designing the continuous-time controller gives a
reasonable prediction of the effect of the zero-order hold when the sampling rate is slower
than 20ωb . To keep this effect small, the sampling frequency ωs = 2π/T should be much
higher than the system bandwidth ωb - experience suggests that the sampling frequency
should be at least 20 ∼ 30ωb .
(Figure 6.10: discretized control loop - the zero-order hold, plant G(s) and sampler with period T inside the dashed box are represented by the pulse transfer function H(z), in feedback with D(z))

In the second approach, the controller is designed directly in discrete time. For this purpose, the sampler has been moved “around the loop” to the plant output. The
task is now to find a discrete-time transfer function H(z) that describes the behaviour of
the system inside the dashed box, i.e. the plant with hold and sampler. We will see that
- in contrast to the Tustin approximation - an exact discretization is possible in this case,
in the sense that the discretized model describes the behaviour of the continuous-time
plant exactly in the sampling instants.
For the derivation of the discretized model we will use a state space realization of the plant: assume that

ẋ(t) = A x(t) + B u(t),    y(t) = C x(t)

is a realization of G(s). To describe the evolution of the state vector x(t) from time kT
to kT + T , we recall that the solution of the state equation at time t, starting at t0 < t,
is given by

x(t) = e^{A(t-t0)} x(t0) + ∫_{t0}^t e^{A(t-τ)} B u(τ) dτ
Now we can exploit the fact that the shape of the input signal between two sampling
instants is known exactly: due to the presence of the zero-order hold we have

u(τ) = u(kT),    kT ≤ τ < kT + T

Setting t0 = kT and t = kT + T in the solution above and defining
Φ = e^{AT},    Γ = ∫_0^T e^{At} B dt    (6.27)
this becomes
x(kT + T ) = Φx(kT ) + Γu(kT ) (6.28)
This is the discrete-time state equation (6.19) that was introduced earlier, now obtained
by sampling a continuous-time system at sampling instants kT . Given a continuous-time
model, the discrete-time matrices Φ and Γ can be computed from (6.27). Note that the
discrete-time model describes the continuous-time system exactly in the sampling instants
kT . The reason for this is that due to the presence of the zero-order hold the shape of
the input signal between sampling instants is known exactly - which is not the case for
the Tustin approximation. No time delay is produced in this approach - the discrete-
time model describes the continuous-time system exactly in the sampling instants even
when the sampling frequency is low. There are however other considerations that demand
sampling to be sufficiently fast, this will be discussed in the next section.
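Equation (6.27) can be evaluated with the standard augmented-matrix trick exp([A B; 0 0]·T) = [Φ Γ; 0 I]. The sketch below (Python with NumPy, double-integrator example assumed) exploits the fact that this particular augmented matrix is nilpotent, so a short Taylor sum is exact; for a general A one would use scipy.linalg.expm:

```python
import numpy as np

# Augmented-matrix evaluation of (6.27):
#   expm([[A, B], [0, 0]] * T) = [[Phi, Gamma], [0, I]].
# A, B model a double integrator (assumed example); its augmented matrix
# is nilpotent, so the truncated Taylor series below is exact.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
T = 0.5

M = np.block([[A, B], [np.zeros((1, 3))]]) * T
E = np.eye(3) + M + M @ M / 2 + M @ M @ M / 6    # exact here (M^3 = 0)
Phi, Gamma = E[:2, :2], E[:2, 2:]
print(Phi)      # [[1, T], [0, 1]]
print(Gamma)    # [[T^2/2], [T]] = [[0.125], [0.5]]
```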
Having obtained a discrete-time plant model, one can design a controller directly by
using discrete-time versions of continuous-time techniques such as discrete-time root locus
design, pole placement or discrete-time optimal state feedback and state estimation.
If the system is controllable, the state feedback gain matrix F can be chosen to obtain
any desired closed-loop poles. Moreover, one can compute the state feedback gain that
minimizes the quadratic cost function
Vd = Σ_{k=0}^∞ ( x^T(k) Q x(k) + u^T(k) R u(k) )    (6.30)
by using a discrete-time version of the method discussed in Section 5.9. The dual problem
of finding an optimal state estimator can be solved with the same tools.
Examples of discrete-time controller design are provided in Exercises 6.5, 6.6 and 6.8.
Deadbeat Control
While discrete-time versions exist for all continuous-time design methods, there is a
discrete-time control scheme for which no continuous-time equivalent exists. This scheme
is called deadbeat control because it brings a system from any initial state to the origin in
at most n sampling periods. Given a controllable system (6.28), the idea is to use state
feedback to assign all closed-loop poles to the origin. Assuming that the model (6.28) is
in controller canonical form, the closed loop matrix then has the form

            [ 0  1  0  . . .  0 ]
            [ 0  0  1  . . .  0 ]
Φ + ΓF  =   [ .  .  .         . ]
            [ 0  0  0  . . .  1 ]
            [ 0  0  0  . . .  0 ]

This matrix is nilpotent, so that

(Φ + ΓF)^n = 0

and from

x(n) = (Φ + ΓF)^n x(0)
it is clear that any initial state is taken to the origin in no more than n steps.
It should however be mentioned that deadbeat control is mainly of theoretical interest:
the only design parameter is the sampling time T , and when a short sampling time is
chosen the control signal that would theoretically result in a settling time of n sampling
periods is typically very large. In practice, the actuators would be driven into saturation,
resulting in a longer settling time.
Deadbeat control is explored in Exercise 6.7.
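A minimal numeric illustration of deadbeat control (Python, double-integrator example with T = 1; the gain F = [−1 −1.5] was computed by hand to place both closed-loop poles at z = 0):

```python
import numpy as np

# Deadbeat state feedback for the sampled double integrator (T = 1):
# F places both eigenvalues of Phi + Gamma*F at z = 0, so the closed-loop
# matrix is nilpotent and any state reaches the origin in n = 2 steps.
Phi = np.array([[1.0, 1.0], [0.0, 1.0]])
Gamma = np.array([[0.5], [1.0]])
F = np.array([[-1.0, -1.5]])

Acl = Phi + Gamma @ F
print(np.allclose(Acl @ Acl, 0))     # True: (Phi + Gamma F)^2 = 0

x = np.array([[3.0], [-2.0]])        # arbitrary initial state
for _ in range(2):
    x = Acl @ x
print(x.ravel())                     # [0. 0.] after n = 2 steps
```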
Zeros of Sampled-Data Systems

We have seen earlier in this section that when input and output signals of a continuous-time system are sampled at t = kT, k = 0, 1, 2, . . .
using the Tustin approximation or exact (zero-order hold) discretization, a pole at s = pi
is mapped into a pole at z = epi T of the resulting discrete-time system. Unfortunately,
it is not possible to give an equally simple formula for the mapping of zeros. However,
results are available for the limiting case where the sampling interval goes to zero; these
can also be used to approximate the mapping of zeros when sampling is sufficiently fast.
Consider a continuous-time system with transfer function
G(s) = K0 · (s - z1)(s - z2) . . . (s - zm) / ( (s - p1)(s - p2) . . . (s - pn) ),    m < n    (6.31)
This system has n poles and m zeros and therefore a pole excess of n − m. Assume we
want to find a pulse transfer function
H(z) = (b0 + b1 z^{-1} + . . . + bn z^{-n}) / (1 + a1 z^{-1} + . . . + an z^{-n}) = (b0 z^n + b1 z^{n-1} + . . . + bn) / (z^n + a1 z^{n-1} + . . . + an)    (6.32)
that describes the behaviour of G(s) sampled with sampling time T . It is straightforward
to check that when using the Tustin approximation, i.e. making the substitution (6.25),
in the resulting pulse transfer function we will always have b0 6= 0. Thus the sampled
system has n zeros, n − m more than the underlying continuous-time system. One can
also verify that when we use the discrete-time state space model

x(k+1) = Φ x(k) + Γ u(k),    y(k) = C x(k) + D u(k)

where Φ and Γ are computed from (6.27), the discrete-time system will have n zeros if
D 6= 0 and n − 1 zeros if D = 0.
This means that the sampling process can introduce additional zeros; these are referred
to as sampling zeros. A useful fact, stated here without proof, is the following. Assume
that the continuous-time system (6.31) is sampled with sampling time T , and that exact
(zero-order hold) discretization results in the discrete-time model (6.32). Then one can
show that when the sampling time approaches zero, m zeros of the discrete-time model
will approach ezi T , where zi , i = 1, 2, . . . , m are the continuous-time zeros. The remaining
n − m − 1 sampling zeros will approach the zeros of the system shown in Fig. 6.11. This
issue is explored further in Exercise 6.11.
(Figure 6.11: integrator chain 1/s^{n-m} with zero-order hold and sampler with period T - the sampling zeros approach the zeros of this sampled system)
Example 6.4
H′(z) = K1 z^{-3} / (1 - a z^{-1}) = K1 / ( z^2 (z - a) ) = z^{-2} H(z)
and using (6.18) with the discrete-time impulse response h(kT) we obtain

y(kT) = Σ_{l=0}^∞ h(lT) Im e^{jω(k-l)T}
      = Im ( e^{jωkT} Σ_{l=0}^∞ h(lT) e^{-jωlT} )
      = Im ( H(e^{jωT}) e^{jωkT} )
      = Im ( |H(e^{jωT})| e^{jφ} e^{jωkT} )

where

φ(e^{jωT}) = arg H(e^{jωT})

is the phase angle of H(e^{jωT}). Taking the imaginary part in the last equation yields

y(kT) = |H(e^{jωT})| sin(ωkT + φ)
The response to a sampled sinusoidal input is therefore again a sampled sinusoidal signal
with the same frequency. Amplitude and phase shift of the response are determined by
the magnitude and phase, respectively, of the pulse transfer function evaluated at the
given frequency. The difference between a continuous-time and a discrete-time frequency
response is that the continuous-time frequency response is obtained by evaluating G(s)
at s = jω, whereas the pulse transfer function H(z) is evaluated at z = ejωT . Moreover,
the fact that
ejωT = ej(ωT +2π)
implies that the discrete-time frequency response is periodic. Travelling from -jπ/T to +jπ/T along the jω axis in the s plane corresponds to travelling anticlockwise once around the unit circle, from angle -π to +π. Further excursions along the jω axis merely retrace
6.4. Frequency Response of Sampled-Data Systems 139
this same path. For a discrete-time frequency response it is therefore sufficient to consider
the frequency range
π π
− ≤ω≤
T T
Aliasing
To explore the periodicity of the discrete-time frequency response further, we consider again the Laplace transform of a sampled signal. Recall the representation of the sampling process in (6.22) as modulation of a train of delta impulses

x*(t) = x(t) Σ_{k=−∞}^{∞} δ(t − kT) = Σ_{k=−∞}^{∞} x(kT) δ(t − kT)

Note that because x(t) = 0 for t < 0, starting the summation at −∞ does not change (6.22). The pulse train is a periodic signal and can be developed into a Fourier series

Σ_{k=−∞}^{∞} δ(t − kT) = Σ_{l=−∞}^{∞} α_l e^{jlω_s t},    α_l = 1/T

From

L{ x(t) e^{jlω_s t} } = ∫_0^∞ x(t) e^{−(s−jlω_s)t} dt = X(s − jlω_s)

we obtain the Laplace transform of the sampled signal as

X*(s) = (1/T) Σ_{l=−∞}^{∞} X(s − jlω_s)
This is the superposition on the jω-axis of the transform of the original continuous-time
signal and its replications shifted by lω_s, l = ±1, ±2, . . . If the signal x(t) contains no frequency components at ω > ω_b and if ω_b < ω_s/2, then the replicated spectra do not overlap, and the original signal could theoretically be recovered by ideal low-pass filtering. Fig. 6.12 illustrates this important result, which is in fact a statement of Shannon's sampling theorem: a signal x(t) can only be exactly reconstructed from its sampled version x*(t) if ω_s > 2ω_b, i.e. if the sampling frequency is more than twice the signal bandwidth. The maximum signal bandwidth

ω_N = ω_s/2 = π/T

for which a reconstruction is possible, is called the Nyquist frequency. If the signal contains
frequency components above the Nyquist frequency, the sampling process will bring high frequency components back into the range below the Nyquist frequency and thus distort the original signal. This effect is known as aliasing. A common countermeasure, called anti-aliasing filtering, is to low-pass filter the signal before sampling, with a filter bandwidth below the Nyquist frequency but well above the signal bandwidth.

Fig. 6.12: (a) amplitude spectrum |X(jω)| of a continuous-time signal, band-limited to ω_b; (b) amplitude spectrum |X*(jω)| of the sampled signal for ω_s > 2ω_b (non-overlapping replicas); (c) amplitude spectrum of the sampled signal for ω_s < 2ω_b (overlapping replicas)
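The folding effect is easy to see numerically. A minimal sketch (the frequency values here are chosen purely for illustration): a sinusoid above the Nyquist frequency produces exactly the same samples as its low-frequency alias.

```python
import numpy as np

T = 0.1                      # sampling time  ->  Nyquist frequency pi/T = 10*pi rad/s
w = 70.0                     # signal frequency, well above the Nyquist frequency
ws = 2 * np.pi / T           # sampling frequency
# The alias appears at w - n*ws for the integer n that folds it into (-pi/T, pi/T]
n_fold = round(w / ws)
w_alias = w - n_fold * ws

k = np.arange(50)
x_true = np.sin(w * k * T)         # samples of the fast sinusoid
x_alias = np.sin(w_alias * k * T)  # samples of its low-frequency alias
print(np.max(np.abs(x_true - x_alias)))   # identical sample sequences
```

Since sin(ωkT − 2πn_fold k) = sin(ωkT), the two sample sequences coincide exactly; no sampled-data system can distinguish them.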
The frequency response of a sampled-data system including a zero-order hold is discussed
in Exercises 6.14 and 6.15.
Exercises
Problem 6.1
For the discrete-time system with pulse transfer function

G(z) = (2z^2 − 6z) / (2z^2 − 6z + 4)
use polynomial division to determine the first 3 values of the impulse response of
the system.
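The division can be cross-checked numerically; a sketch in Python rather than Matlab, using scipy.signal.dimpulse:

```python
import numpy as np
from scipy.signal import dimpulse

num = [2, -6, 0]          # numerator 2z^2 - 6z
den = [2, -6, 4]          # denominator 2z^2 - 6z + 4
t, y = dimpulse((num, den, 1), n=5)   # discrete impulse response, 5 samples
g = np.squeeze(y)
print(g[:3])              # first three samples: 1, 0, -2
```

The same values follow from the difference equation 2y(k) − 6y(k−1) + 4y(k−2) = 2u(k) − 6u(k−1) with u(k) = δ(k).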
Problem 6.2
a) Show that
Σ_{k=0}^{∞} x(k) z^{−k} − Σ_{k=0}^{∞} x(k − 1) z^{−k} = X(z) − z^{−1} X(z)
b) Use the result from (a) to prove the Final Value Theorem for discrete-time signals
Problem 6.3
a) For a given sampling time T, sketch in the z-plane the regions corresponding to the following regions in the s-plane (Re(s) and Im(s) denote the real and imaginary parts of a complex point s):

ii) |Im(s)| < 0.5 π/T,  |Im(s)| < 1.0 π/T  and  |Im(s)| < 1.5 π/T
b) For a second order system, sketch lines in the z-plane corresponding to constant
damping ratios ζ = 0, 0.5 and 1.
c) For a second order system, sketch lines in the z-plane corresponding to constant natural frequencies ωn = π/(2T) and ωn = π/T.
Problem 6.4
Construct a state space model in controller canonical form, similar to the controller form for continuous-time systems. Sketch the corresponding block diagram.
Problem 6.5
G(s) = b/(s + a),    a > 0

This plant is to be controlled by a controller K as shown in Figure 6.13.

Fig. 6.13: unity feedback loop - the error r − y is fed to the controller K, which drives the plant G with output y
c) The plant is now to be controlled with the discrete-time proportional feedback con-
troller Kd (z) = Kpd and sampling time T = 0.2. Using the discrete-time plant
transfer function Gd (z) from (b), sketch the discrete-time root locus with a = 2 and
b = 1. At what value of Kpd does the system become unstable?
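The critical gain in part (c) can be cross-checked analytically. With a zero-order hold, G(s) = b/(s + a) discretizes to Gd(z) = (b/a)(1 − e^{−aT})/(z − e^{−aT}), and with proportional feedback the closed-loop pole leaves the unit circle at z = −1. A sketch with the values of the exercise:

```python
import numpy as np

a, b, T = 2.0, 1.0, 0.2
p = np.exp(-a * T)                  # discrete pole of the ZOH-discretized plant
g0 = (b / a) * (1 - p)              # numerator gain of Gd(z) = g0 / (z - p)
# Closed-loop pole: z = p - Kpd*g0; instability when the pole crosses z = -1
K_crit = (1 + p) / g0
print(K_crit)
```

This matches what the discrete-time root locus shows: the single branch runs along the real axis and exits the unit circle through −1, whereas the continuous-time locus of b/(s + a) remains stable for all Kp.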
d) Using the continuous-time and discrete-time root loci, compare the closed-loop be-
haviour as Kp → ∞ and Kpd → ∞ for the system with continuous-time and discrete-
time controllers, respectively. Describe qualitatively why this behaviour occurs.
Hint: Generate the discrete-time controller using the command c2d with the option
’tustin’.
Problem 6.6
G(s) = 1/s^2
This system is to be controlled as shown in Figure 6.13.
b) Sketch the root loci for the continuous-time plant and for the discrete-time plant,
each with proportional controllers. Can the plant be stabilized using either a
continuous-time or discrete-time proportional controller?
Kd(z) = Kpd (1 + Td (1 − z^{−1}))
d) Use rltool to find Kpd2 and α so that a settling time < 2.0 and a damping ratio
ζ ≈ 0.7 are achieved. Use a sampling time T = 0.2 s.
Hint: You can use the design constraints settling time and damping ratio in rltool.
e) Use rltool again to calculate Kp and α, so that the settling time < 4.5 and ζ ≈ 0.707.
This time use a sampling period of T = 0.5 s. Is it possible to achieve a settling
time < 2.0 with a discrete-time PD controller with T = 0.5 s?
Problem 6.7
a) Use Matlab to design a state feedback dead-beat controller Kdb for the discretized double integrator Gd(z) from Problem 6.6. You can either use the function acker, if available in your Matlab distribution, or place, setting the desired pole locations to values close to 0 (place requires distinct pole locations, e.g. [0 1e-5]).
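The same design can be sketched numerically in Python (scipy.signal.place_poles in place of Matlab's place; T = 0.2 is assumed here, as in Problem 6.6):

```python
import numpy as np
from scipy.signal import place_poles

T = 0.2
Phi = np.array([[1.0, T], [0.0, 1.0]])       # ZOH-discretized double integrator
Gam = np.array([[T**2 / 2], [T]])
# place_poles needs distinct poles, so use values close to 0 as suggested
K = place_poles(Phi, Gam, [0.0, 1e-5]).gain_matrix
Acl = Phi - Gam @ K
print(K)                  # analytic dead-beat gain is [1/T^2, 3/(2T)] = [25, 7.5]
print(np.linalg.eigvals(Acl))
```

With both closed-loop poles at the origin, (Φ − ΓK)^2 = 0, so any initial state is driven to zero in two sampling periods - the defining property of a dead-beat design.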
b) Use Matlab and Simulink to construct the closed loop system consisting of the plant G(s), the discrete state feedback controller Kdb, a sampler and a zero order hold unit, and a disturbance at the input of G(s). Plot the step response of the output of G(s) to this disturbance.
Problem 6.8
This problem refers to the state space model of the turbogenerator and the continuous-time controller and observer designed in Problem 5.15. You can use the script file cs5_tgen_LQG.m to generate the controller.
b) Simulate the response of the closed-loop system with both discretized controllers
(for Ts = 0.02, 0.5) to a disturbance d1 (t) = σ(t). Compare your response with the
one obtained in Problem 5.15.
c) Discretise the augmented plant (that is, including the integrators) for Ts = 0.02 and
Ts = 0.5. For each of these discretized systems design an LQG controller with the
same performance weights as in 5.15.
Hint: Use the command dlqr to design a discrete LQR controller and Kalman filter.
d) Compare the performance achieved with the discrete time controllers of parts a) and c) for the disturbance d1(t) = σ(t).
e) Simulate the response of the controller from Problem 5.15 to the disturbance d1(t) = σ(t), but with a delay of 0.25 s between the controller output and the plant input. Compare your answer with the response obtained in part b).
Hint: Use either a Padé approximation if you are performing the simulation in Matlab, or a transport delay block if you are using Simulink.
Problem 6.9
Show that the response y(k) to an input signal u(k) with z-transform
U(z) = Σ_{k=0}^{∞} u(k) z^{−k}

is given by

y(k) = Σ_{l=0}^{k} g(l) u(k − l)
Problem 6.10
a) Compute the impulse response of the system for the first 4 sampling instants.
b) Compute the system response to the input signal u = {5, 0, −1, 2}. Assume zero initial conditions.
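The convolution sum from Problem 6.9 is easy to evaluate numerically. Since the system of Problem 6.10 is not reproduced here, the sketch below uses a hypothetical impulse response g = {1, 0.5, 0.25, 0.125} purely for illustration:

```python
import numpy as np

g = np.array([1.0, 0.5, 0.25, 0.125])   # hypothetical impulse response (not from the exercise)
u = np.array([5.0, 0.0, -1.0, 2.0])     # input sequence as in Problem 6.10

# y(k) = sum_{l=0}^{k} g(l) u(k-l): a discrete convolution, truncated to len(u) samples
y = np.convolve(g, u)[: len(u)]
# Cross-check against the sum written out explicitly
y_direct = [sum(g[l] * u[k - l] for l in range(k + 1)) for k in range(len(u))]
print(y)
```

Both computations give the same sequence; np.convolve simply evaluates the convolution sum for all shifts at once.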
Problem 6.11
a) What zeros would you therefore expect when the continuous-time systems

(i) G(s) = (s + 1)/(s^2 + 4s + 1)

(ii) G(s) = 1/(s^3 + s^2 + s)

are sampled at high sampling rates with a zero order hold?
Hint: Calculate a state space model of the exact discretization of 1/s^{n−m}.
b) Check the results from (a) with the Matlab function c2d. For which sampling
times is the sampled system (ii) minimum phase? For which sampling times is it
non-minimum phase?
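Part (b) can also be reproduced outside Matlab. This sketch (an assumption of mine, not the exercise's intended solution) discretizes system (ii) with scipy.signal.cont2discrete and compares the sampling zeros with their known small-T limits −2 ± √3, the zeros of the exactly discretized 1/s^3:

```python
import numpy as np
from scipy.signal import cont2discrete

num, den = [1.0], [1.0, 1.0, 1.0, 0.0]        # G(s) = 1/(s^3 + s^2 + s), n - m = 3
T = 0.01                                      # fast sampling
numd, dend, _ = cont2discrete((num, den), T, method='zoh')
c = numd[0].copy()
c[np.abs(c) < 1e-12] = 0.0                    # drop numerical noise in the leading coefficient
zeros = np.sort(np.roots(c))
print(zeros)          # close to the roots of z^2 + 4z + 1, i.e. -2 -/+ sqrt(3)
```

One sampling zero lies outside the unit circle, so for small sampling times the discretized system (ii) is non-minimum phase even though G(s) has no right half-plane zeros.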
Problem 6.12
Practical representations of digital signals have finite word-length. Explain why this can
lead to problems if high sampling rates are used.
Hint: What happens to the locations of the poles and to the system matrix of a discrete-
time state space model if the sampling time approaches zero?
Problem 6.13
b) Compute the frequency ω3 of a sinusoidal input signal u(t) = sin(ω3 t) for which the output y(t) of G(s) has an amplitude 3 dB below the steady-state gain.
Problem 6.14
Figure 6.14 shows a delta sampler with a zero-order hold (zoh) unit.

Fig. 6.14: the input u(t) is sampled with period T by a delta sampler to give u*(t), which drives a zoh unit with output y(t)
b) Calculate the Laplace transform Y (s) of the output signal as a function of U (s).
Gzoh(s) = Y(s)/U*(s) = (1/s)(1 − e^{−Ts})
d) Show using (c) that the frequency response of the zoh unit is

Gzoh(jω) = T e^{−jωT/2} sin(ωT/2) / (ωT/2)

Sketch the Bode diagram of the sample and hold unit for T = 1 s.
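The zoh frequency response in (d) - a sinc-shaped magnitude with a half-sample delay - can be verified numerically against the direct evaluation of (1 − e^{−Ts})/s on the jω axis; a sketch:

```python
import numpy as np

T = 1.0
w = np.linspace(0.1, 3 * np.pi / T, 200)       # avoid w = 0 (removable singularity)

G_direct = (1 - np.exp(-1j * w * T)) / (1j * w)            # (1 - e^{-jwT}) / (jw)
G_closed = T * np.exp(-1j * w * T / 2) * np.sin(w * T / 2) / (w * T / 2)

print(np.max(np.abs(G_direct - G_closed)))     # the two expressions agree
```

The factor e^{−jωT/2} shows that the hold contributes a phase lag equivalent to a delay of half a sampling period, which is the usual rule of thumb for the destabilizing effect of sampling.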
Problem 6.15
G(s) = 1/(s + 1)

(Block diagram: a delta sampler with period T produces u*(t) from u(t); a zoh unit and the plant G(s) follow, with output y(t).)
b) Simulate the response to the input signal in (a) for the frequencies ω = 2 rad/s,
ω = 20 rad/s and ω = 60 rad/s when the sampling period is T = 0.05 s. Interpret
the simulation with the help of the results in (a) and Problem 6.14.d. For the
simulation, use the Simulink model cs6_simzoh.mdl
The gyroscope sensor, which is used to measure the angular rate α̇, has a low pass filter with a cut-off frequency of 20 Hz (DLPF mode=4).
a) Open the Simulink file Experiment LQR.slx, go into the gyroscope mask and set
DLPF mode=0. This sets the low pass filter’s cut-off frequency to 250Hz. What
should we expect when experimenting using this setting?
b) Experiment with caution using this setting to track the same reference trajectory presented in the LQR tracking problem (Problem 4.11). Compare the results of the experiments with the two gyroscope filter settings (0 and 4): is the tracking performance comparable? How could we solve this problem without changing the gyroscope filter setting?
Read and understand the MATLAB script lqrd comparison.m and Simulink files
Task 5 Simulation LQR Contin.slx and Task 5 Simulation LQR Discrete.slx. The
MATLAB files simulate and plot the closed-loop response of the continuous time and
discrete time linear model.
a) Sample the continuous time controller implemented in Problem 4.10 and compare the discrete time and continuous time responses. (Hint: use the MATLAB command lqrd.)
i) Check the documentation of the MATLAB commands dlqr and lqrd. What
is the difference between these commands?
ii) Is it possible to design a discrete time observer by dualism using lqrd?
A discrete time observer is to be used to output the filtered states by combining the state estimates from the model and the measured states from the sensors. The gyroscope's low pass filter setting should be set to zero (DLPF mode=0), since the states are now filtered using an observer. Read and understand the MATLAB script Task 6 Simulation LQG Design.m and Simulink file Task 6 Simulation LQG Discrete.slx. The MATLAB files simulate and plot the closed-loop response of the linear model with observer state feedback.
a) Discretise the continuous model in the MATLAB file using zero order hold.
b) Implement the discrete time observer based state feedback in the Simulink file and tune the observer and controller gains using the dlqr command. Compare the simulation with and without the observer for tracking a sinusoidal position reference.
(Hint: make sure that the observer poles are at least 3 times faster than the controller poles, except for the fast controller pole.)
Run the MATLAB script Experiment parameters.m and open the experiment simulink
file Experiment LQG.slx.
a) Run the experiment and extract the states from the Simulink experimental model (Hint: export the data as a structure with time, as a vector [states; Control input; reference], and name the data "expOut" so that you can use the implemented code), and compare the simulation and the experiment for sine wave tracking with:
System Identification
All methods for designing and analyzing control systems that have been introduced in
this course are model based, i.e. it is assumed that a dynamic plant model in the form
of a transfer function or state space realization is available. In practice, obtaining such
a model can take up a significant part of the time required for solving a given control
problem. In Exercises 1.10, 1.1 or 2.6, state space models of physical systems were derived
from the knowledge of the underlying physical laws. However, such physical modelling
can become difficult if the plant dynamics are complex and not well understood. An
alternative is to obtain a plant model experimentally by measuring the response of the
plant to suitable test signals; this approach is known as system identification. Because
no physical insight into the plant behaviour is utilized, this method is also referred to as
black box modelling. This chapter gives an introduction to the basic concepts of system
identification.
Transfer functions and state space models are called parametric models, because the com-
plete information about the dynamic behaviour of the system is contained in a fixed
number of parameters - e.g. the coefficients of the transfer function polynomials or the
entries of the matrices of a state space model. Nonparametric models on the other hand
are representations of plant dynamics that cannot be expressed by a finite number of
parameters, such as the shape of the frequency response or the impulse response of the
system. In this chapter we will consider the experimental identification of parametric
plant models. Because the input-output data used for system identification are usually
sampled-data sequences, the identified models are discrete-time models. After introducing
the concept of a linear regression model in Section 7.1, we will discuss the identification
of transfer function models for SISO systems in Section 7.2. Sections 7.3 and 7.4 then
present two recently developed techniques for identifying SISO and MIMO state space
models.
Assume that we are observing a process - characterized by a quantity y(t) - at time in-
stants t = 0, T, 2T, . . . where T is a given sampling time. Using the shorthand notation
introduced earlier and suppressing the sampling time, this observation yields a data se-
quence y(k), k = 0, 1, 2, . . . Assume further that the process variable y(k) at time instant
kT depends linearly on the values of other variables m1 (k), m2 (k), . . . , mn (k) which are
known and available at the same time. A linear process model is then

y(k) = m1(k) p1 + m2(k) p2 + . . . + mn(k) pn + e(k) = mT(k) p + e(k)

where e(k) denotes a modelling error, and where two column vectors - the vector of regression variables m(k) and the parameter vector p - have been introduced. Given a set of measured data y(l) and m(l), l = 0, 1, . . . , k, we can now pose the least squares estimation problem: find the parameter vector p that best fits the data, in the sense that the sum of the squared errors
V(p) = Σ_{l=0}^{k} e^2(l)    (7.3)
is minimized.
Example 7.1
is applied to the unknown system and to the model G(z), yielding the actual system
response y(k) and the response ŷ(k) predicted by the model, respectively. The modelling
error is
e(k) = y(k) − ŷ(k)
This problem can be expressed in the form of a linear regression model by writing the difference equation of the system as

y(k) = −a y(k − 1) + b u(k − 1) + e(k)
     = [ −y(k − 1)  u(k − 1) ] [ a ; b ] + e(k)
     = mT(k) p + e(k) = ŷ(k) + e(k)
Fig. 7.1: the input u(k) is applied both to the real system, with output y(k), and to the model G(z), with output ŷ(k); the modelling error is e(k) = y(k) − ŷ(k)
MT M p = MT Y    (7.6)

This equation is called the normal equation associated with the given estimation problem. If M has full column rank, the matrix MT M ∈ IRn×n is non-singular, and we can compute

p = (MT M)−1 MT Y (7.7)
However, the parameter vector p obtained from (7.7) will satisfy (7.5) only if the system
is indeed exactly governed by a linear difference equation of the assumed order, and if
there are no measurement errors. In real life problems, neither condition will be met, so
that p will not satisfy (7.5) but only (7.4) with E 6= 0. The best we can then achieve is
to find the parameter vector p that is associated with the “smallest modelling error” -
in other words the closest approximation we can get with this model in the presence of
measurement errors. The following result is derived in Exercise 7.1.
Theorem 7.1
The sum of square errors V (p) (7.3) is minimized if the parameter vector satisfies the
normal equation (7.6).
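Theorem 7.1 can be tried out numerically. A minimal sketch with simulated data for a first-order model as in Example 7.1; the parameter values a = −0.8, b = 0.5 are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = -0.8, 0.5                      # true parameters (hypothetical example)
N = 200
u = rng.standard_normal(N)            # white-noise test input
y = np.zeros(N)
for k in range(1, N):
    y[k] = -a * y[k - 1] + b * u[k - 1]

# Regression rows: y(k) = [-y(k-1)  u(k-1)] [a b]^T, for k = 1..N-1
M = np.column_stack([-y[:-1], u[:-1]])
Y = y[1:]
p = np.linalg.solve(M.T @ M, M.T @ Y)   # normal equation (7.6)/(7.7)
print(p)                                # recovers [a, b] in the noise-free case
```

With noise-free data the estimate is exact up to rounding; with measurement noise added to y, the same code returns the least squares approximation described above.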
E = Y − M p

or, written out column by column,

[ e(0) . . . e(k) ]T = [ y(0) . . . y(k) ]T − [ m1(0) . . . m1(k) ]T p1 − . . . − [ mn(0) . . . mn(k) ]T pn

Introducing the column vectors

ϕi = [ mi(0) . . . mi(k) ]T,    i = 1, . . . , n

we thus have

E = Y − ϕ1 p1 − . . . − ϕn pn
If the true system can be accurately described by the assumed linear model and if there are
no measurement errors, then Y would be in the space spanned by the vectors ϕ1 . . . ϕn .
In real life, unmodelled features of the system and measurement errors will in general
lead to a vector Y that is outside the data space. The estimation problem can then be
interpreted as searching for the linear combination of the vectors ϕ1 . . . ϕn that comes closest to the vector Y, i.e. that minimizes the squared error ET E.

Fig. 7.2: the vector Y, the plane spanned by ϕ1 and ϕ2, the projection Ŷ of Y onto this plane, and the orthogonal error E

This is illustrated in Fig. 7.2 for the special case n = 2: what we are looking for is the projection Ŷ of the vector Y onto the space (a plane if n = 2) spanned by the measurement vectors ϕi, and
Ŷ is the vector closest to Y if the error E is orthogonal to this space (plane). But E is
orthogonal to this space if it is orthogonal to each of the measurement vectors, i.e. if it
satisfies
ϕiT E = 0,    i = 1, . . . , n
This can be written in a more compact form as M T E = 0 or
M T (Y − M p) = 0
We will now apply the idea of least squares estimation to identify a transfer function model
of a system from measured input and output data. Thus, assume that data sequences
{u(0), . . . , u(k)} and {y(0), . . . , y(k)} are available and that we want to find the pulse
transfer function that gives the best fit between input and output data. In order to apply
the above estimation technique, we need to fix the number of parameters, in this case the
order of numerator and denominator polynomial of the estimated transfer function. We
will initially assume for simplicity that the system to be identified can be modelled by the difference equation

y(k) = b1 u(k − 1) + b2 u(k − 2) + . . . + bn u(k − n) + e(k)

which means there is no autoregressive component in the output (the ai's are assumed to be zero) - we will later remove this assumption. The difference equation can be written
in regressor form as

ŷ(k) = [ u(k − 1)  u(k − 2)  . . .  u(k − n) ] [ b1 . . . bn ]T = mT(k) p
where we used the shorthand notation ui for u(i). For a solution (7.7) to the estimation problem to exist, the n × n matrix MT M needs to be invertible. This requirement places a
condition on the input sequence {u(0), . . . , u(k)}. For example, it is obvious that with
a constant input sequence {1, . . . , 1} the rank of M T M will be one and a solution for a
model with more than one estimated parameter will in general not exist. To explore this
further, we will use the empirical autocorrelation of the data sequence {u(k)}, defined as
cuu(l) = lim_{k→∞} (1/k) Σ_{i=1}^{k} u(i) u(i − l)

we find that

lim_{k→∞} (1/k) MT M = Cuu(n)
Thus, for sufficiently long data sequences (when the end effects can be neglected and we
can consider all sums as taken from 1 to k) we may interpret the matrix M T M as a scaled
version of the empirical covariance Cuu (n) of the input signal.
Persistent Excitation
The condition that the matrix M T M must have full rank is called an excitation condition
- the input signal must be sufficiently rich to excite all dynamic modes of the system.
We have seen that for long data sequences we can consider the matrix Cuu (n) instead of
M T M . The following definition provides a measure for the richness of an input signal.
Definition 7.1
A signal u(k) is said to be persistently exciting of order n if the matrix Cuu (n) is positive
definite.
The next result is useful for checking whether a signal is persistently exciting of a given
order.
Theorem 7.2
zu(l) = u(l + 1)
It is straightforward to prove the above Theorem by observing that the sum on the left
hand side of the inequality can be rewritten as
lim_{k→∞} (1/k) Σ_{l=0}^{k} (a(z) u(l))^2 = aT Cuu(n) a
For step functions, sine waves and white noise, this is discussed in Exercises 7.2, 7.3 and
7.4. White noise is therefore commonly used as test input when a linear model is to be
identified experimentally.
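Definition 7.1 can be checked numerically by forming Cuu(n) as a Toeplitz matrix of empirical autocorrelation values and inspecting its eigenvalues. A sketch comparing a step input with white noise (finite-sample approximations of the limits above):

```python
import numpy as np
from scipy.linalg import toeplitz

def cov_matrix(u, n):
    """Empirical covariance C_uu(n) with entries c_uu(l) = (1/k) sum u(i) u(i-l)."""
    k = len(u)
    c = [np.dot(u[l:], u[: k - l]) / k for l in range(n)]
    return toeplitz(c)

rng = np.random.default_rng(0)
k, n = 5000, 3
step = np.ones(k)                       # step input
noise = rng.standard_normal(k)          # white-noise input

eig_step = np.linalg.eigvalsh(cov_matrix(step, n))
eig_noise = np.linalg.eigvalsh(cov_matrix(noise, n))
print(eig_step)     # essentially a rank-one matrix: only one eigenvalue away from zero
print(eig_noise)    # all eigenvalues close to 1: persistently exciting of order 3
```

The step signal gives a covariance matrix of (nearly) rank one, consistent with its order of persistent excitation being 1, while white noise yields a well-conditioned positive definite matrix for any n.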
ARX Models
The model (7.8) was based on the assumption that the present output does not depend on past outputs, i.e. there is no autoregressive component in the output. We now remove this assumption and consider the model

y(k) = −a1 y(k − 1) − . . . − an y(k − n) + b1 u(k − 1) + . . . + bn u(k − n) + e(k)

which corresponds to the difference equation model introduced for discrete-time systems in the previous chapter. Such a model is called an ARX model, where ARX stands for AutoRegressive with eXogenous input. The results discussed in this section can be extended to ARX models by using the empirical cross-covariance function
cuy(l) = lim_{k→∞} (1/k) Σ_{i=1}^{k} u(i) y(i − l)
where the matrices Cyy and Cuy are defined in the same way as Cuu .
The least-squares estimation of an ARX model from measured data is illustrated in Ex-
ercise 7.5.
Subspace Identification of State Space Models
The discussion in the previous section was limited to SISO transfer function models.
The approach presented there can be extended to cover MIMO models in ARX form, but
working with multivariable systems is usually more convenient in a state space framework.
In this and the following section we present a recently developed approach to estimating
SISO and MIMO state space models.
To introduce the idea, we begin with a SISO state space model

x(k + 1) = Φ x(k) + Γ u(k),    y(k) = c x(k) + d u(k)

Note that we use the same symbol Γ for discrete-time SISO and MIMO models. In a SISO model it represents a column vector. Now assume that x(0) = 0, and consider the impulse response of the above model, i.e. the response to the input u(k) = δ(k). Observing that for k > 0 we have x(k) = Φ^{k−1} Γ, we find that the impulse response g(k) is given by

g(k) = 0 for k < 0,    g(k) = d for k = 0,    g(k) = c Φ^{k−1} Γ for k > 0
The values {d, cΓ, cΦΓ, . . .} of the impulse response sequence are called the Markov
parameters of the system.
Turning now to multivariable systems, we first need to clarify what we mean by the impulse response of a MIMO model

x(k + 1) = Φ x(k) + Γ u(k),    y(k) = C x(k) + D u(k)    (7.10)

We can apply a unit impulse to one input channel at a time and observe the resulting response at each output channel
uδi(k) = [ 0 . . . 0 δ(k) 0 . . . 0 ]T    →    yδi(k) = [ g1i(k) . . . gli(k) ]T
Here δ(k) is placed in the ith entry of the input vector, while all other inputs are zero.
An entry gji (k) in the output vector represents the response at output channel j to a unit
impulse applied at input channel i. The complete information about the impulse responses
from each input to each output can then be represented by the impulse response matrix
g(k) = [ g11(k) . . . g1m(k) ; . . . ; gl1(k) . . . glm(k) ]
Introducing the notation
Γ = [Γ1 Γ2 . . . Γm ], D = [d1 d2 . . . dm ]
where Γi and di denote the ith column of the matrices Γ and D, respectively, we find that
with input uδi (k) we have xδi (k) = Φk−1 Γi for k > 0, and at k = 0 we have yδi (k) = di .
Combining the responses to impulses at all input channels, we obtain
g(k) = [ yδ1(k) . . . yδm(k) ] = 0 for k < 0,    D for k = 0,    CΦ^{k−1} Γ for k > 0    (7.11)
The impulse response describes input-output properties of a system, and we would expect
it to be independent of a particular coordinate basis that has been chosen for a given
state space model. This seems to contradict the fact that the impulse response in (7.11)
is given in terms of the matrices (Φ, Γ, C) of a state space model, which clearly depend
on the choice of coordinate basis. However, it is easily checked that applying a similarity
transformation T - which yields a realization (T −1 ΦT, T −1 Γ, CT ) - will not change the
impulse response.
Assume the number of samples is sufficiently large so that k > n, where n is the order of the state space model, and form the block Hankel matrix of measured Markov parameters

Hk = [ g(1)  g(2)  . . .  g(k) ; g(2)  g(3)  . . .  g(k+1) ; . . . ; g(k)  g(k+1)  . . .  g(2k−1) ]

Note that at this point we know nothing about the system apart from its impulse response. In particular, we do not know the order n of the system. Important in this context is the rank of the matrix Hk. To investigate this, we first observe that we can factor Hk as
Hk = [ C ; CΦ ; . . . ; CΦ^{k−1} ] [ Γ  ΦΓ  . . .  Φ^{k−1}Γ ] = Ok Ck
Here Ok and Ck are the extended observability and controllability matrices, respectively,
of the model (7.10), where the term “extended” is added because the number of samples
k is greater than the expected order n of the system. Assuming that we are interested in
estimating a model (7.10) that represents a minimal realization of a system, i.e. if (Φ, Γ)
is controllable and (C, Φ) is observable, then we have
rank Ok = rank Ck = n
which implies
rank Hk = n (7.13)
Thus, we can obtain the order from the measured data by computing the rank of Hk. To construct a state space model, we factor Hk as

Hk = M L,    M ∈ IR^{lk×n},    L ∈ IR^{n×mk}
This can be done using singular value decomposition as explained below. Note that this
factorization is not unique. Finally, we define the matrices M and L to be the extended
observability and controllability matrices
Ok = M, Ck = L
The first l rows of M therefore represent the measurement matrix C, and the first m
columns of L form the input matrix Γ. To find the state matrix Φ, define
Ōk = [ CΦ ; CΦ^2 ; . . . ; CΦ^k ] = Ok Φ
Note that we can generate Ōk from measured data by factorizing the larger Hankel matrix
Hk+1 and removing the first l rows from Ok+1. Multiplying the above from the left by OkT we obtain

OkT Ok Φ = OkT Ōk

which can be solved for Φ because Ok has full column rank.
The above describes a procedure for constructing the matrices Φ, Γ and C from measured
impulse response data. At this point a question arises: we know that a state space model
of a given system is not unique but depends on the coordinate basis chosen for the state
space. One could therefore ask where this choice was made in the above construction.
The answer is that the factorization Hk = M L is not unique, in fact if M and L are
factors of rank n and if T is an arbitrary nonsingular n × n matrix, then it is easy to see
that M T and T −1 L are also rank n factors of Hk . With this latter choice we obtain
Õk = Ok T,    C̃k = T^{−1} Ck
From Chapters 2 and 3 we know however that these are the observability and controllabil-
ity matrices, respectively, of the model obtained by applying the similarity transformation
T to (Φ, Γ, C). This shows that a choice of coordinate basis is made implicitly when Hk
is factored into M L.
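The whole construction - Hankel matrix, factorization via the SVD, extraction of C, Γ and Φ - can be sketched in a few lines (often called the Ho-Kalman algorithm; the system matrices below are hypothetical example values, not from the notes):

```python
import numpy as np

# Hypothetical true system used to generate "measured" Markov parameters
Phi = np.array([[0.9, 0.2], [0.0, 0.7]])
Gam = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])

k = 5
# Markov parameters g(i) = C Phi^{i-1} Gam, i = 1, ..., 2k
g = [C @ np.linalg.matrix_power(Phi, i - 1) @ Gam for i in range(1, 2 * k + 1)]

# Block Hankel matrix H_k and its one-step-shifted version
H = np.block([[g[i + j] for j in range(k)] for i in range(k)])
Hbar = np.block([[g[i + j + 1] for j in range(k)] for i in range(k)])

# Factor H = Ok Ck via the SVD; the numerical rank gives the model order
Q, s, Vt = np.linalg.svd(H)
n = int(np.sum(s > 1e-8 * s[0]))
Ok = Q[:, :n] * np.sqrt(s[:n])
Ck = np.sqrt(s[:n])[:, None] * Vt[:n, :]

C_hat = Ok[:1, :]          # first (block) row of Ok
Gam_hat = Ck[:, :1]        # first (block) column of Ck
# Hbar = Ok Phi Ck, so Phi follows from the pseudoinverses of the factors
Phi_hat = np.linalg.pinv(Ok) @ Hbar @ np.linalg.pinv(Ck)

# The realization is unique only up to similarity; the impulse responses must match
g_hat = [C_hat @ np.linalg.matrix_power(Phi_hat, i - 1) @ Gam_hat for i in range(1, 2 * k)]
print(n)
```

The recovered matrices differ from (Φ, Γ, C) by an (implicit) similarity transformation, but the Markov parameters - the input-output behaviour - coincide.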
Figure: typical singular values σi of Hk plotted over the index i = 1, 2, . . . - a pronounced drop indicates the numerical rank
If σr+1 is much smaller than σr , we say that the numerical rank of Hk is r. Another way
of looking at this is to write (7.14) as

Hk = Q Σ VT = [ q1 q2 . . . qkl ] Σ [ v1 v2 . . . vkm ]T

with Σ ∈ IR^{kl×km} carrying the singular values σ1, . . . , σp on its diagonal and zeros elsewhere,
where qi and vi represent the ith column of Q and V , respectively. Expanding the right
hand side column by column, we obtain
Hk = Σ_{i=1}^{p} σi qi viT = Σ_{i=1}^{r} σi qi viT + Σ_{i=r+1}^{p} σi qi viT = Qs Σs VsT + Qn Σn VnT
where Qs ∈ IRkl×r and Vs ∈ IRkm×r are the matrices formed by the first r columns of Q
and V , respectively. The matrices Qn ∈ IRkl×(kl−r) and Vn ∈ IRkm×(km−r) are similarly
formed by the remaining columns. If the singular values σr+1 , . . . , σp are much smaller
than σr , the last term on the right hand side can be neglected and we have
Hk ≈ Qs Σs VsT
or
Hk ≈ [ q1 q2 . . . qr ] diag(σ1, . . . , σr) [ v1 v2 . . . vr ]T = Qs Σs^{1/2} Σs^{1/2} VsT
where r is the numerical rank of Hk . Now taking r as the estimated model order n̂, we can
define the extended observability and controllability matrices Or ∈ IRkl×r and Cr ∈ IRkm×r
as
Or = Qs Σs^{1/2},    Cr = Σs^{1/2} VsT
A state space model (Φ, Γ, C) of order n̂ can then be obtained as in the case of ideal
measurements.
The identification of a state space model from the impulse response is illustrated in Ex-
ercise 7.6.
Direct Subspace Identification

The method outlined in the previous section assumes that the measured impulse response
is available. In practice it is usually better to use more general data, obtained for example
by applying a white noise input signal. We will now present a technique for identifying
state space models without using the measured impulse response, referred to as direct
subspace identification.
Consider again the model (7.10). Beginning at time k, the output at successive time
instants is given by
y(k) = Cx(k) + Du(k)
y(k + 1) = CΦx(k) + CΓu(k) + Du(k + 1)
y(k + 2) = CΦ2 x(k) + CΦΓu(k) + CΓu(k + 1) + Du(k + 2)
..
.
Yk = Oα x(k) + Ψα Uk    (7.15)

where Yk = [ y(k)T y(k+1)T . . . y(k+α−1)T ]T and Uk = [ u(k)T u(k+1)T . . . u(k+α−1)T ]T stack α successive output and input samples, and

Oα = [ C ; CΦ ; CΦ^2 ; . . . ; CΦ^{α−1} ]

Ψα = [ D  0  0  . . .  0 ; CΓ  D  0  . . .  0 ; CΦΓ  CΓ  D  . . .  0 ; . . . ; CΦ^{α−2}Γ  CΦ^{α−3}Γ  . . .  CΓ  D ]
Assume that a sufficient number of measurements has been collected so that we can form
the input and output data matrices
Y = [ Y1 Y2 . . . YN ],    U = [ U1 U2 . . . UN ]

Collecting the corresponding (unknown) state vectors in X = [ x(1) x(2) . . . x(N) ], equation (7.15) becomes

Y = Oα X + Ψα U    (7.16)
In this equation only U and Y are known; note that U ∈ IRmα×N . We assume that the
number (N + α − 1) of measurements - which is required to fill the above matrices - is
large enough such that α can be chosen greater than the expected model order, and N
such that N > mα. To identify a state space model, we need to estimate the matrices
Oα (from which C and Φ can be extracted) and Ψα (from which we get D and Γ).
The idea is to eliminate the input term Ψα U by projecting the data onto the nullspace of U,

N(U) = { q : U q = 0 }

using the projection matrix Π = I − UT (UUT)−1 U, which satisfies

UΠ = U − UUT (UUT)−1 U = 0

Note that Π is constructed from measured data only. Here we assumed that (UUT) is invertible; a condition for this is that the input is persistently exciting of order mα.
Multiplying equation (7.16) from the right by Π then yields
YΠ = (Oα X + Ψα U)Π = Oα XΠ
The left hand side is known (because it is constructed from measured data), thus the
product Oα XΠ is known. Observing that Oα ∈ IRlα×n and XΠ ∈ IRn×N , we can obtain
an estimate of the extended observability matrix by determining the numerical rank n̂
of the matrix YΠ and by factoring it into a left factor with n̂ columns and full column
rank, and a right factor with n̂ rows. This can be done by computing the singular value
decomposition
YΠ = Qs Σs VsT + Qn Σn VnT ≈ (Qs Σs^{1/2})(Σs^{1/2} VsT)

and by taking

Oα = Qs Σs^{1/2}
Here again the order of the system is estimated by inspection of the singular values - this
time of the data matrix YΠ. From Oα the matrices C and Φ can be obtained as described
in the previous section.
Multiplying equation (7.16) from the right by U −R = UT (UUT)−1, the right inverse of U, and from the left by QnT (which satisfies QnT Oα = 0) yields

QnT Y U −R = QnT Ψα

The left hand side of this equation and Qn are known, so that Ψα is the only unknown term. The matrices Γ and D can then be obtained by solving a linear system of equations; details are omitted.
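The projection step can be sketched end to end: simulate a small system (hypothetical values), stack the data matrices, project the outputs onto the nullspace of U, and read the model order off the singular values of YΠ:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 2nd-order SISO system (l = m = 1)
Phi = np.array([[0.8, 0.3], [0.0, 0.6]])
Gam = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.5]])
D = np.array([[0.2]])

alpha, N = 4, 60
Tlen = N + alpha - 1
u = rng.standard_normal(Tlen)
x = np.zeros(2)
y = np.zeros(Tlen)
for k in range(Tlen):
    y[k] = (C @ x + D.ravel() * u[k])[0]
    x = Phi @ x + Gam.ravel() * u[k]

# Stacked data matrices: column j holds alpha successive samples starting at j
U = np.column_stack([u[j : j + alpha] for j in range(N)])
Y = np.column_stack([y[j : j + alpha] for j in range(N)])

# Projection onto the nullspace of U: Pi = I - U^T (U U^T)^{-1} U, so U Pi = 0
Pi = np.eye(N) - U.T @ np.linalg.solve(U @ U.T, U)
s = np.linalg.svd(Y @ Pi, compute_uv=False)
n_hat = int(np.sum(s > 1e-8 * s[0]))
print(n_hat)     # numerical rank of Y Pi equals the system order 2
```

With noise-free data the rank decision is unambiguous; with noisy data the threshold on the singular values has to be chosen by inspection, just as in the Hankel-matrix approach.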
Exercises
Problem 7.1
introduced in (7.3). Show that V (p) is minimized by the parameter vector p = p̂ where
p̂ = (M T M )−1 M T Y
Problem 7.2
(z − 1)σ(k) = 1 at k = −1
and
(z − 1)σ(k) = 0 at k ≥ 0
b) If for a given signal u(k) there is at least one polynomial a(z) of order n such that

lim_{k→∞} (1/k) Σ_{l=0}^{k} (a(z) u(l))^2 = 0

what does this indicate about the order of persistent excitation of u(k)?
c) Use a(z) = z − 1 and the results from (a) and (b) to find the greatest possible order of persistent excitation of a step function.
d) Calculate the empirical covariance matrix Cuu (1) for the step function. Use this
matrix to show that the order of persistent excitation of a step function is 1.
Problem 7.3
show that
(z^2 − 2z cos(ωT) + 1) u(k) = 0
b) Find the greatest possible order of persistent excitation of the signal u(k)?
Hint: Write the elements of Cuu (2) in terms of the autocorrelation function.
i) T = 2π/ω
ii) T ≠ 2π/ω
Problem 7.4
Use the empirical covariance matrices Cuu (1), Cuu (2) . . . Cuu (n) to show that the order of
persistent excitation of sampled white noise is arbitrarily high.
Hint: Use the fact that the correlation between white noise samples at times T and kT + T is 0 (for k ≠ 0), and that lim_{k→∞} (1/k) Σ_{i=0}^{k} u_i^2 = S0, where S0 is the spectral density of the white noise.
Problem 7.5
Download cs7_LSsysdat.mat. The MAT file contains sampled input and output signals
of a SISO system, where the input signal is a step (input u1, output y1), a sinusoid (input
u2, output y2) or white noise (input u3, output y3). A pulse transfer function is to be
determined that approximates the behaviour of the system from which the measurements
were taken.
a) From N samples of the inputs and N samples of the output create the measurement
matrix M for a system of order n. What is the dimension of the matrix M produced
from these data?
b) Repeat (a) for
i) the sinusoid
ii) the white noise signal
c) From output data generated from white noise, calculate a least squares estimate of
the model parameters.
d) i) Estimate the models of order 2, 3 and 4 using the white noise input signal.
ii) Validate the model for the step and sinusoidal input signals.
iii) What is the order of the system?
e) Explain the results from (d) with the help of Problems 7.2 and 7.3.b.
Problem 7.6
Download the Matlab script cs7_mkdata.m. This script generates the impulse responses
g(k) and gn(k) of a system with 2 inputs and 2 outputs. The sequence g(k) is noise free,
whereas gn(k) is corrupted by measurement noise.
a) Generate from the sequence g(k) the block Hankel matrix of the impulse response.
Estimate upper and lower limits for the order n of the system and determine, by factorization of the block Hankel matrix, linear state space models of the system for different values of the order n. Compare the impulse responses of the estimated
models with the output sequence g(k).
Hint: You can use the function mkhankel.m to generate the Hankel matrix.
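A Python sketch of the factorization idea, with numpy standing in for mkhankel.m and a made-up second order test system: the numerical rank of the block Hankel matrix of impulse response samples C A^k B reveals the system order.

```python
import numpy as np

def block_hankel(g, rows, cols):
    # Block Hankel matrix built from impulse response samples g[0], g[1], ...
    # g has shape (N, p, m): N samples of a p-output, m-input response.
    # (A stand-in for what mkhankel.m presumably does.)
    return np.block([[g[i + j] for j in range(cols)] for i in range(rows)])

# Made-up second order test system; impulse response samples C A^k B
A = np.array([[0.5, 0.1], [0.0, -0.3]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])
g = np.array([C @ np.linalg.matrix_power(A, k) @ B for k in range(10)])

H = block_hankel(g, 4, 4)
s = np.linalg.svd(H, compute_uv=False)

# Numerical rank of H = order of the underlying system
order = int(np.sum(s > 1e-8 * s[0]))
```

Factoring the rank-n Hankel matrix into extended observability and controllability matrices then yields a state space realization (the Ho-Kalman idea).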
Problem 7.7
This exercise uses the Matlab Identification toolbox GUI ident to identify a state space
model from sets of data with two inputs and two outputs. The data is in the file
cs9_identGUI.mat.
Two data sets are contained in the file, iodata1 and iodata2. They are in the Matlab format iddata, which can be directly imported into ident.
a) Import the data set iodata1 and generate direct subspace identified models of
different orders using the command n4sid.
b) Validate the models generated against the data set iodata2. What is the model order that most effectively describes the plant behaviour?
Chapter 8
Model Order Reduction
Modern state space design techniques - like those discussed in this course - produce
controllers of the same dynamic order as the generalized plant. Thus, when the system
to be controlled has a high dynamic order, the controller may be too complex to be
acceptable for implementation. In such cases, the plant model should be approximated
by a simplified model. Alternatively, a high-order controller can be approximated by a
low-order controller. This chapter gives a brief introduction to the topic of model order
reduction.
Consider a stable system with transfer function G(s) and state space realization
ẋ = Ax + Bu, y = Cx + Du
with n-dimensional state vector x. If the number of states n is very large, one could try to find a model of lower order that behaves "similarly" to the original system. For example, if some of the state variables do not have much effect on the system behaviour, one might consider removing these states from the model. Thus, we need to know which states are "important" for the model and which ones are not. The controllability Gramian and the observability Gramian turn out to be helpful for answering this question.
Controllability Gramian
Recall the definition of the controllability Gramian
Wc = ∫_0^∞ e^{At} B B^T e^{A^T t} dt
To interpret the Gramian, we consider all input signals with energy less than or equal to 1 (assume that u(τ) = 0 for τ > t). Compute the singular value decomposition of the controllability Gramian
Wc = V ΣV T
where
Σ = diag(σ1, σ2, . . . , σn) and V = [v1 v2 . . . vn]
The singular values are ordered such that σ1 ≥ σ2 ≥ . . . ≥ σn as usual. The set Sc of
points in the state space reachable with input energy 1 is then a hyper-ellipsoid centered
at the origin, with semi-axes in the directions of the columns vi of V , and length given by
the square root of the corresponding singular value. This is illustrated in Figure 8.1 for a
second order system. Note that v1 is the direction that is most easily controlled, whereas
small singular values indicate directions which are difficult to control.
Figure 8.1: The set Sc of states reachable with input energy 1: an ellipsoid with semi-axes σ1^{1/2} v1 and σ2^{1/2} v2
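The reachable ellipsoid can be computed numerically. A small numpy sketch with a made-up stable system (in MATLAB one would use gram or lyap): for stable A, the integral defining Wc is the solution of the Lyapunov equation A Wc + Wc A^T + B B^T = 0, solved here via Kronecker products.

```python
import numpy as np

# Made-up stable example system
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
n = A.shape[0]

# Solve the Lyapunov equation A Wc + Wc A^T + B B^T = 0 as a linear system:
# with row-major vec, vec(AX) = (A kron I) vec(X), vec(X A^T) = (I kron A) vec(X)
I = np.eye(n)
K = np.kron(A, I) + np.kron(I, A)
Wc = np.linalg.solve(K, -(B @ B.T).flatten()).reshape(n, n)

# SVD of the symmetric Gramian: semi-axes of Sc are sqrt(sigma_i) * v_i
V, sigma, _ = np.linalg.svd(Wc)
```

The columns of V give the easy and hard directions to control, with gains sqrt(sigma_i).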
Observability Gramian
A similar interpretation can be given for the observability Gramian
Wo = ∫_0^∞ e^{A^T t} C^T C e^{At} dt
Define the set So as the set of all points in the state space which - when taken as initial
conditions x(0) - lead to a zero-input response y(t) that satisfies
∫_0^∞ y^T(t) y(t) dt ≤ 1
The singular value decomposition of the observability Gramian

Wo = V Σ V^T

determines the set So - it is a hyper-ellipsoid centered at the origin with semi-axes given by vi/√σi. This set is illustrated for a second order system in Figure 8.2. Note that the axes are long in directions with small singular values, indicating directions that have little effect on the output. These directions are difficult to observe.
Figure 8.2: The set So of initial states that produce output energy at most 1: an ellipsoid with semi-axes v1/σ1^{1/2} and v2/σ2^{1/2}
Balanced Realization
The question posed at the beginning of the chapter was: which state variables are im-
portant for the system and which ones are not? The singular value decomposition of the
Gramians tells us which states show only a weak response to a control input (the ones
associated with small singular values of Wc ), and which ones have only weak influence on
the observed output (the ones where the singular values of Wo are small). Now it would
be unwise to remove a state variable from the model only because it shows little response
to control inputs - the same state variable may have a strong effect on the output. The
reverse may be true for states with small singular values of Wo . To find out which states
have little influence both in terms of controllability and observability, we will use a special
state space realization of the plant model which is known as balanced realization.
We should keep in mind that controllability and observability are state space concepts.
The problem considered here can of course also be expressed in terms of input-output behaviour, i.e. transfer function models: the notions of near-uncontrollability and near-unobservability then take the form of near pole-zero cancellations. The state space framework, however, lends itself better to a numerical treatment.
Initially we assumed a state space realization of G(s) with system matrices A, B, C and D. Applying a similarity transformation T yields a different realization (T^{-1}AT, T^{-1}B, CT, D) for the same plant. The eigenvalues and the input/output behaviour of both state space
models are the same, because they are realizations of the same transfer function. The
controllability and observability Gramians however are different: it is straightforward to
check that if Wc and Wo are the Gramians of the original model, then T −1 Wc T −T and
T T Wo T are the Gramians of the transformed model, respectively.
A balanced realization of G(s) has the property that its controllability and observability
Gramians are equal and diagonal
Wc = Wo = diag(σ1, . . . , σn)
Using Cholesky factorization and singular value decomposition, it is always possible to
find a similarity transformation T that brings a given state space model into this form -
in MATLAB one can use the function balreal() for this task. The diagonal entries of the
Gramian are called the Hankel singular values of the system. In the coordinate basis of
the state space associated with the balanced realization, a small Hankel singular value
indicates that a state has little influence both in terms of controllability and observability.
Therefore, this realization is well suited for model reduction by removing "unimportant"
state variables. We will discuss two different ways of doing this.
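The Cholesky-plus-SVD construction can be sketched in a few lines of numpy (standing in for MATLAB's balreal; the example system is made up): factor Wc = L L^T, diagonalize L^T Wo L, and scale the columns.

```python
import numpy as np

def lyap_gramian(A, M):
    # Solve A W + W A^T + M = 0 via Kronecker products (small systems only)
    n = A.shape[0]
    K = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(K, -M.flatten()).reshape(n, n)

def balance(A, B, C):
    # Balancing transformation via Cholesky + SVD (a sketch of what
    # balreal computes; assumes a stable, minimal realization)
    Wc = lyap_gramian(A, B @ B.T)        # controllability Gramian
    Wo = lyap_gramian(A.T, C.T @ C)      # observability Gramian
    L = np.linalg.cholesky(Wc)           # Wc = L L^T
    U, s2, _ = np.linalg.svd(L.T @ Wo @ L)
    hsv = np.sqrt(s2)                    # Hankel singular values
    T = L @ U / np.sqrt(hsv)             # columns scaled by hsv^{-1/2}
    Ti = np.linalg.inv(T)
    return Ti @ A @ T, Ti @ B, C @ T, hsv

# Made-up stable, minimal example
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])
Ab, Bb, Cb, hsv = balance(A, B, C)
```

In the balanced coordinates both Gramians equal diag(hsv), so trailing states with small Hankel singular values are the natural candidates for removal.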
Let (A, B, C, D) be a balanced realization of G(s) with n state variables. Assume the
inspection of the Hankel singular values indicates that only r states are significant and
that the last n − r Hankel singular values are small enough to be neglected. Partition the
state space model as
A = [A11 A12; A21 A22],   B = [B1; B2],   C = [C1 C2]    (8.1)
Balanced Truncation
The subsystem (A11 , B1 , C1 , D) of the partitioned model (8.1) contains the states with
significantly large Hankel singular values. One approach to model order reduction is to
use this system with r state variables as an approximation of the full order model G(s).
This approach is known as balanced truncation, because the parts of the model associated
with insignificant states are simply ignored.
An important property of the resulting reduced order model Gtr(s) = C1(sI − A11)^{-1}B1 + D is that it satisfies

Gtr(j∞) = G(j∞) = D
The direct feedthrough terms of full order and truncated model are the same. This
indicates that both models will exhibit similar high-frequency behaviour.
Balanced Residualization
An alternative approach is not to ignore the insignificant states, but to assume that they
are constant and take them into account in the reduced order model. This method is
known as balanced residualization. In the partitioned model
[ẋ1; ẋ2] = [A11 A12; A21 A22] [x1; x2] + [B1; B2] u
y = [C1 C2] [x1; x2] + D u
setting ẋ2 = 0 and solving for x2 yields a reduced order model with

Ares = A11 − A12 A22^{-1} A21,   Bres = B1 − A12 A22^{-1} B2
Cres = C1 − C2 A22^{-1} A21,   Dres = D − C2 A22^{-1} B2
The feedthrough term is different from that of the full order model, indicating that the
high-frequency behaviour will not be the same. In fact the reduced model was arrived at
by assuming that derivatives of some state variables are zero. This is true in steady state,
so we would expect similar behaviour of full order and reduced model at low frequencies.
One can indeed verify that the steady state gains of both models are the same, i.e. Gres(0) = G(0).
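A small numpy check of these properties, with a made-up third-order model partitioned as in (8.1) (Ares and Bres are the standard residualization formulas obtained by setting ẋ2 = 0): truncation leaves the feedthrough D unchanged, while residualization reproduces the steady state gain G(0) = D − C A^{-1} B exactly.

```python
import numpy as np

# Made-up stable third-order model, r = 2 states kept
A = np.array([[-1.0, 0.2, 0.1],
              [0.3, -2.0, 0.1],
              [0.1, 0.1, -50.0]])
B = np.array([[1.0], [0.5], [0.1]])
C = np.array([[1.0, 0.5, 0.1]])
D = np.array([[0.0]])
r = 2
A11, A12, A21, A22 = A[:r, :r], A[:r, r:], A[r:, :r], A[r:, r:]
B1, B2 = B[:r], B[r:]
C1, C2 = C[:, :r], C[:, r:]

# Residualization: set x2_dot = 0 and eliminate x2 = -A22^{-1}(A21 x1 + B2 u)
A22i = np.linalg.inv(A22)
Ares = A11 - A12 @ A22i @ A21
Bres = B1 - A12 @ A22i @ B2
Cres = C1 - C2 @ A22i @ A21
Dres = D - C2 @ A22i @ B2

dc_full = D - C @ np.linalg.inv(A) @ B            # G(0) of the full model
dc_res = Dres - Cres @ np.linalg.inv(Ares) @ Bres  # G(0) of the reduced model
```

The equality of the two steady state gains is exact, not an approximation; it follows from the Schur complement structure of A^{-1}.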
Example 8.1
Figure 8.3: Comparison of balanced truncation (a) and balanced residualization (b)
The methods discussed above apply to stable systems. One way of reducing an unstable high order system is to split it up into a stable and an antistable part, i.e.

G(s) = Gs(s) + Gu(s)

where Gs(s) is stable and Gu(s) has all poles in the right half plane. The stable part of the system can then be reduced by balanced truncation or residualization to Ĝs(s), and a reduced model of the plant that retains all unstable modes is

Ĝ(s) = Ĝs(s) + Gu(s)
Exercises
Problem 8.1
a) Write Matlab functions based on the theory in this chapter to perform model order reduction by both truncation and residualisation. You may use the Matlab command balreal to get a balanced realisation.
Chapter 9

This chapter uses a realistic example of the design of a controller for a chemical process
to demonstrate how the various tools described in this course should be applied in practice.
This is an important aspect of the study of control systems - without such study it can
be difficult to see how the different identification, control design and discretization tools
fit together and can be applied to practically meaningful problems.
Figure 9.1 shows a schematic diagram of a process evaporator. In the evaporation vessel
certain conditions on temperature and pressure must be fulfilled by adjustment of liquid,
gas and heating fluid flows so that the required rate of evaporation is maintained.
In this chapter a linear model of the process will be identified, where the plant is repre-
sented by a non-linear Simulink model. Based on the linear model a controller is then
designed and applied to the non-linear plant model.
Download the file cs9_evaporator.zip. It contains several files concerning the model
and the design of the controller for it. These files are explained in the next section. The
control objectives and the controller design procedure to be followed are then described.
evapmod.mdl is a nonlinear Simulink model of the evaporator. The model has three
inputs:
u1 = Liquid outflow
u2 = Gas outflow
u3 = Heating fluid flow
Figure 9.1: Schematic diagram of the evaporator: inflow, liquid outflow u1, gas outflow u2, heating fluid inflow u3, level y1, pressure y2 and temperature y3
There are three outputs that are measured and are to be controlled:
y1 = Reactor liquid level
y2 = Reactor pressure
y3 = Reactor temperature
as well as an unmeasured disturbance – Reactor inflow.
The time unit of the model is the minute.
The file plantdata.mat contains the steady state values for input u0, state vector x0evap,
output y0 and a vector of process parameters pvec used in obtaining the test data. The
file also contains two sets of measured input and output signals for the plant in open
loop with a sampling time of 0.05 minutes which should be used for system identification:
utest1, ytest1 and utest2, ytest2.
1. After step changes in setpoints or reactor inflow the steady state error must be zero.
2. Following a step disturbance of size 0.4 in the evaporator inflow the following con-
ditions must be satisfied:
(b) The 5% settling times must not be greater than the following:
level: 50 min, pressure: 10 min, temperature: 10 min
(c) The inputs should always remain within ±1.0.
3. The standard deviation of the noise on the individual inputs should be less than
0.01.
Use the system identification toolbox (command ident) to estimate a linear model from
the data sequences utest1 and ytest1 with a sampling time of 0.05 minutes. Validate
this model against the data in the matrices utest2 and ytest2.
Hints for using the Identification Toolbox:
• First you must import the signals: Import data → Time domain data:
Input: utest1
Output: ytest1
Starting time: 1
Samp. interv.: 0.05
Repeat for the second data set.
• Remove the means from all signals using the Preprocess drop-down list → Remove
means. The new set of signals should be used as Working data and Validation data.
• Estimate models of 3rd, 4th and 5th order using N4SID (subspace identification).
For this purpose choose Linear parametric models from the Estimate drop-down
box. Select State-space as Structure and repeat the identification for the different
orders.
• Validate the identified models using the second data set. Use the Model Views
check-boxes in the lower-right corner of the GUI.
When you have found a good model, call this model mod1 (by right-clicking it) and export
this model into the workspace.
The file contdesign.m is a Matlab script that performs the following tasks:
• Display of setpoint step response, comparison of linear and non-linear models for a
particular controller.
Re-running the script without closing the figures allows comparing different designs. A set of weighting matrices for testing is offered in the file tuningparset.m.
3. Scale the inputs and outputs to do the controller design. Remember that the scaling
needs to be accounted for when the designed controllers are applied to the plant.
4. To simulate the non-linear model, the initial states of the observer and integrator blocks need to be calculated so that the simulation can begin in steady state.
5. The script simandplot.m simulates and plots the closed-loop response of the non-
linear system and the designed controller to different step signals.
6. Run the script contdesign.m to design a controller and see the closed-loop simula-
tion. Repeat for different controller settings. Tune the controller and the observer
to achieve the control objectives.
7. Discretize the continuous-time controller and simulate the closed-loop response with the discrete-time controller and a sampling time of 3 seconds (0.05 minutes) in Simulink. Compare your results with the solution provided in discretise.m.
Appendix A
Consider a vector space X. A norm ‖x‖ is a function mapping a vector x into a real number that satisfies the following four properties for any x, y ∈ X
1) ‖x‖ ≥ 0
2) ‖x‖ = 0 ⇔ x = 0
3) ‖αx‖ = |α| ‖x‖ for any scalar α
4) ‖x + y‖ ≤ ‖x‖ + ‖y‖
Consider an m × n matrix A as a linear mapping from C^n into C^m. For y = Ax, one can compare the vector norms of x and y, and associate a "gain" with A as the ratio of these vector norms. This ratio depends on x, and an important property of the matrix A is the maximum value of ‖y‖/‖x‖ over all x ∈ C^n (the "maximum gain" of A). This positive real number is defined to be the norm of the matrix A; since it depends also on the choice of vector norm, it is called an induced norm.
The matrix-2-norm induced by the vector-2-norm is defined as

‖A‖2 = max_{x≠0} ‖Ax‖2 / ‖x‖2    (A.1)
Again, we will drop the subscript and write ‖A‖ for the matrix-2-norm. It is straightforward to verify that the matrix-2-norm - and indeed all induced matrix-p-norms - satisfies the four properties of a norm.
To find the value of ‖A‖, we take squares on both sides of (A.1) to get

‖A‖² = max_{x≠0} ‖Ax‖²/‖x‖² = max_{x≠0} (x^H A^H A x)/(x^H x) = max_{x≠0} (x^H M x)/(x^H x)

where M = A^H A.
The matrix M has the following properties:
1) M is positive semi-definite (x^H M x ≥ 0 ∀x ∈ C^n): introducing y = Ax, we have
x^H M x = x^H A^H A x = y^H y ≥ 0
Note that this implies that x^H M x is real even if x is complex.
2) The eigenvalues of M are real. This can be shown as follows. Let λ be an eigenvalue and v be an eigenvector of M, and consider
M v = λv
Multiplying with v H from the left yields v H M v = λv H v. We established already that the
left hand side of this equation is real, and on the right hand side v H v is also real. Thus,
λ must be real.
3) Two eigenvectors of M belonging to different eigenvalues are orthogonal. To show this, consider
M v1 = λ1 v1,   M v2 = λ2 v2,   λ1 ≠ λ2
We have
(λ1 v1)^H v2 = (M v1)^H v2 = v1^H M v2 = v1^H λ2 v2
thus λ1 v1^H v2 = λ2 v1^H v2, and from the assumption λ1 ≠ λ2 it then follows that v1^H v2 = 0.
A consequence of property (3) is that if all eigenvectors vi of M are normalized such that ‖vi‖ = 1, i = 1, . . . , n, the eigenvector matrix V is unitary, i.e. V^H V = I, or V^{-1} = V^H.
(Strictly speaking, we have shown this only for matrices with distinct eigenvalues. It can
be shown however that even a matrix with repeated eigenvalues has a full set of orthogonal
eigenvectors.)
Note that properties (2) and (3) are true for any Hermitian matrix even when it is not
positive semidefinite.
We now return to finding the value of ‖A‖ by solving

max_{x≠0} (x^H A^H A x)/(x^H x)
With the diagonalisation A^H A = V ΛV^H this becomes

max_{x≠0} (x^H V Λ V^H x)/(x^H x)
and introducing y = V^H x and thus x = V y (using orthonormality of V ), we obtain

max_{y≠0} (y^H Λ y)/(y^H y) = max_{y≠0} (λ1 |y1|² + . . . + λn |yn|²)/(|y1|² + . . . + |yn|²)
where λ1 , . . . , λn are the eigenvalues of AH A. Assume that the eigenvalues are ordered
such that λ1 ≥ λ2 ≥ . . . ≥ λn . Then it is easy to see that the maximum value of the
above expression is λ1 , which is achieved if we choose y = [1 0 . . . 0]T , and the minimum
value is λn , achieved by choosing y = [0 . . . 0 1]T .
Because the above expression is the square of the matrix-2-norm of A, we have thus
established that
‖A‖ = max_{x≠0} ‖Ax‖/‖x‖ = √(λmax(A^H A))
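This characterization is easy to check numerically with a random complex matrix (a sketch; np.linalg.norm(A, 2) returns the induced 2-norm):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))

# ||A|| from the largest eigenvalue of the Hermitian matrix A^H A
lam = np.linalg.eigvalsh(A.conj().T @ A)
norm_from_eig = np.sqrt(lam.max())

# ... agrees with the induced 2-norm and the largest singular value
norm_induced = np.linalg.norm(A, 2)
sigma_max = np.linalg.svd(A, compute_uv=False)[0]
```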
A.3 The Singular Value Decomposition

In the last section we used the fact that any Hermitian matrix M can be factored into
M = V ΛV H
where V is the eigenvector matrix of M and unitary, and Λ is the diagonal eigenvalue
matrix of M . The same factorization is obviously not possible for non-Hermitian or even
non-square matrices. A similar factorization is however possible in these cases, if we do
not insist on the same matrix V on both sides, but allow different unitary matrices U and
V as left and right factors.
Theorem A.1 For every complex m × n matrix A there exist unitary matrices U and V such that

A = U Σ V^H    (A.2)

where Σ is real and diagonal with non-negative entries.
The matrix Σ has the same size as A. For example, if A is a 3 × 2 or 2 × 3 matrix, then
Σ = [σ1 0; 0 σ2; 0 0]   or   Σ = [σ1 0 0; 0 σ2 0]
respectively, where σ1,2 ≥ 0. The diagonal entries σi are called the singular values of A.
From (A.2) we obtain AV = U Σ and thus
Avi = σi ui , i = 1, . . . , n
where vi and ui are the columns of V and U , respectively. Compare this with M vi = λi vi
- an eigenvector vi is transformed into λi vi , whereas A transforms vi into σi ui . From (A.2)
we also have
A A^H = U Σ V^H V Σ^T U^H = U Σ Σ^T U^H    (A.3)
and
A^H A = V Σ^T U^H U Σ V^H = V Σ^T Σ V^H    (A.4)
Equation (A.3) shows that U is the eigenvector matrix of AAH , and (A.4) shows that V is
the eigenvector matrix of AH A. The eigenvalue matrices are ΣΣT and ΣT Σ, respectively.
Again, if A is 3 × 2 then
Σ Σ^T = [σ1² 0 0; 0 σ2² 0; 0 0 0],   Σ^T Σ = [σ1² 0; 0 σ2²]
This shows that the singular values of A are the square roots of the eigenvalues of AAH
and AH A.
Proof
To prove Theorem A.1, we show how to construct U, V and Σ that satisfy (A.2) for a
given matrix A. We start with the diagonalisation of AH A: we established already that
there exists a unitary matrix V such that
A^H A = V Λ V^H

With vi denoting the columns of V, we then have

‖Avi‖² = vi^H A^H A vi = λi    (A.5)
This implies that λi ≥ 0. Assume that the eigenvalues λ1 , . . . , λr are positive and the
remaining n − r eigenvalues λi and vectors Avi are zero. Note that r ≤ min(n, m). Define
σi = √λi,   ui = (1/σi) A vi,   i = 1, . . . , r
It follows from (A.5) that ‖ui‖ = 1. The vectors u1, . . . , ur are moreover mutually orthogonal and can be completed to an orthonormal basis u1, . . . , um of C^m; define U = [u1 . . . um]. We then have

U^H A V = Σ

Because the (i, j) entry ui^H A vj = σj ui^H uj is zero if i ≠ j and equal to σj if i = j, the above shows that the entries of U^H AV are all zero except for the first r entries on the main diagonal, which are the singular values of A. This completes the proof.
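The constructive proof translates directly into code. A sketch for a real 3 × 2 matrix of full column rank (so all σi are positive and the completion of U to a square unitary matrix can be skipped):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 2))      # real, full column rank almost surely

# Step 1: diagonalize A^T A = V Lambda V^T (eigh sorts ascending, so flip)
lam, V = np.linalg.eigh(A.T @ A)
lam, V = lam[::-1], V[:, ::-1]

# Step 2: sigma_i = sqrt(lambda_i), u_i = A v_i / sigma_i
sigma = np.sqrt(lam)
U = (A @ V) / sigma                  # divides column i by sigma_i

# U^T A V is diagonal with the singular values on the diagonal
S = U.T @ A @ V
```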
Exercises
Problem A.1
Problem A.2
The left nullspace is the nullspace of AT . It contains all vectors y such that AT y = 0.
b) Find a basis for the nullspace of A, the column space of A, the row space of A and
the left nullspace of A.
c) Use the matrix command svd in Matlab to calculate bases for these subspaces.
Verify that the results given by Matlab are equivalent to your results from (b).
Appendix B
Probability and Stochastic Processes

This appendix briefly reviews, in a tutorial fashion, some fundamental concepts of probability theory, stochastic processes, and systems with random inputs.
B.1 Probability and Random Variables

Consider the random experiment of tossing two coins, with H denoting head and T tail. Assuming that all outcomes are equally likely (i.e. that the coins are unbiased) and mutually exclusive, we conclude that

P(HH) = P(HT) = P(TH) = P(TT) = 1/4
More generally, if there are N possible, equally likely and mutually exclusive outcomes of
a random experiment, and if NA denotes the number of outcomes that correspond to a
given event A, then the probability of that event is
P(A) = NA/N
There is however a fundamental flaw in the above reasoning: using the words “equally
likely” amounts to saying that outcomes are equally probable - in other words, we are
using probability to define probability.
We define the probability space S associated with this experiment as the set of all possible
outcomes, i.e.
S = {HH, HT, TH, TT}
An event can then be viewed as a subset of S. Here we have, for example, the events A = {HT, TH} and B = {HH, TT} shown in Figure B.1. More events can be associated with different subsets of S. There are two events that are of particular significance. Since at least one outcome must be obtained
on each trial, the space S corresponds to the certain event. Similarly, the empty set
∅ corresponds to the impossible event. An event consisting of only one element of S is
called an elementary event, whereas events consisting of more than one elements are called
composite events.
Figure B.1: The probability space S with events A = {HT, TH} and B = {HH, TT}
The axiomatic approach defines the probability of an event as a number that satisfies
certain conditions (axioms). Let A and B denote two possible events. Also, let (A ∪ B)
denote the event “A or B or both”, and (A ∩ B) the event “both A and B”. The probability
P of an event, say A, is a number associated with that event that satisfies the following
three conditions.
P (A) ≥ 0 ∀A ∈ S (B.1)
P (S) = 1 (B.2)
A∩B=∅ ⇒ P (A ∪ B) = P (A) + P (B) ∀A, B ∈ S (B.3)
From these axioms, the whole body of probability theory can be derived. For example,
the probability P (A ∪ B) that A or B or both occur, is given by (B.3) for the case that
A and B are mutually exclusive. For events A and B that are not mutually exclusive, we
can use the above axioms to show that

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

The proof is left as an exercise. The probability P(A ∩ B) that A and B occur simultaneously is called the joint probability of A and B; for this we also use the notation P(A, B).
Note that the axiomatic approach does not give us the numerical value of a probability
P (A), this must be obtained by other means.
Conditional Probability
The conditional probability is the probability of one event A, given that another event B has occurred; it is denoted P(A|B). This concept can be introduced intuitively using
the relative-frequency approach. Consider for example an experiment where resistors are
picked at random from a bin that contains resistors with different resistance values and
power ratings, as shown in Table B.1.
         1Ω     10Ω    Totals
2W       50     100    150
5W       10     200    210
Totals   60     300    360
We could ask: what is the probability of picking a 1Ω resistor, when it is already known
that the chosen resistor is 5W? Since there are 210 5W resistors, and 10 of these are 1Ω,
from a relative-frequency point of view the conditional probability is
P(1Ω|5W) = 10/210 ≈ 0.048
Independence
An important concept in probability theory is that of statistical independence. Two events A and B are said to be independent if and only if

P(A ∩ B) = P(A)P(B)

As a consequence, we have from (B.5) that P(A|B) = P(A) if A and B are independent.
Random Variables
Returning to the resistor bin example, we could ask whether a resistor labelled 1 Ω actually
has a resistance value of exactly 1 Ω. In reality, the value of resistance can be expected to
be close to 1 Ω, but will probably differ from that value by a small amount. We may then
consider the resistance of a 1-Ω resistor as a quantity whose exact value is uncertain, but
about which some statistical information is available. Such a quantity is called a random
variable. Depending on the set of values that a random variable can take, we distinguish
between continuous and discrete random variables. In the resistor example, even if we know that it has a resistance value between 0.9 and 1.1 Ω, there is still an infinite number of possible values in this range. On the other hand, if we consider the experiment of
throwing a die, there are only six possible values as outcomes. If the number of possible
values is finite, the random variable is discrete (as is the value showing on a die), otherwise
it is continuous (like the value of resistance).
Figures B.2.a and B.2.b show the probability distribution functions of a discrete (throw-
ing a die) and a continuous (resistance value) random variable, respectively. Note that
the probability distribution of a discrete random variable is discontinuous, and that the
magnitude of a jump of FX (x) at say x0 is equal to the probability that X = x0 .
Figure B.2: Probability distribution and density, respectively, of throwing a die (a,c) and
measuring a resistance (b,d)
The probability of the event that X takes a value in the interval [x1 , x2 ] can be expressed
both in terms of the probability distribution function and the probability density function,
we have

P(x1 < X ≤ x2) = FX(x2) − FX(x1) = ∫_{x1}^{x2} fX(x) dx

For an infinitesimal interval we can also write

P(x − dx < X ≤ x) = fX(x) dx
Figures B.2.c and B.2.d show the probability density functions of the random variables associated with throwing a die and selecting a resistor. Since the probability distribution function of a discrete random variable is discontinuous, the derivative does, strictly speaking, not exist. However, a reasonable way of handling this difficulty is to represent the derivative at a point of discontinuity by a delta function of area equal to the magnitude of the jump.
Joint Probability
Random experiments may involve two or more random variables. We can define the joint probability distribution function of two random variables X and Y as

FXY(x, y) = P(X ≤ x, Y ≤ y)

and the joint probability density function fXY(x, y) via

P(x − dx < X ≤ x, y − dy < Y ≤ y) = fXY(x, y) dx dy
If we integrate over the entire sample space, we obtain
FXY(∞, ∞) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} fXY(x, y) dx dy = 1
In the same way as the concept of marginal probability was introduced, we define the probability distribution function of X irrespective of the value Y takes as the marginal probability distribution function

FX(x) = FXY(x, ∞)
From (B.9) and (B.11), it follows that
FX(x) = ∫_{−∞}^{∞} ∫_{−∞}^{x} fXY(x′, y) dx′ dy
and since

fX(x) = dFX(x)/dx

we obtain

fX(x) = ∫_{−∞}^{∞} fXY(x, y) dy
A similar result can be shown for fY (y). We say that the random variables X and Y are
independent if and only if
FXY (x, y) = FX (x)FY (y)
which also implies
fXY (x, y) = fX (x)fY (y)
For a discrete random variable X that takes values xi with probabilities P(X = xi), the expectation is defined as E[X] = Σi xi P(X = xi), where E denotes the expectation operator. Thus, the expectation of X is the average of the possible values weighted with their associated probability. This quantity is also referred to as mean value or first moment of the discrete random variable X. Similarly,
for a continuous random variable X the expectation is defined as
E[X] = ∫_{−∞}^{∞} x fX(x) dx    (B.13)
We will use the notation X̄ = E[X] for the expectation of X. The expectation operator
can be applied to functions of random variables as well. Let n be a positive integer, then
E[X n ]
is called the nth moment of X. Of particular importance are the first and second moment.
The first moment was already introduced as the expectation. The second moment
E[X 2 ]
is called the mean-square value of X. Subtracting X̄ before taking powers yields the nth
central moment
E[(X − X̄)n ]
Whereas the first central moment is zero, the second central moment

σX² = E[(X − X̄)²]

is called the variance of X; its square root σX is the standard deviation.
The correlation coefficient of X and Y is defined as ρXY = E[(X − X̄)(Y − Ȳ)]/(σX σY). Two random variables X and Y are said to be uncorrelated if ρXY = 0. Note that while statistically independent random variables are always uncorrelated, the converse is not necessarily true, unless their probability distribution is Gaussian.
Figure B.3: Gaussian probability density f(x) and distribution function F(x)
A Gaussian random variable X has the probability density function

f(x) = (1/(√(2π) σX)) exp(−(x − X̄)²/(2σX²)),   −∞ < x < ∞    (B.16)
where X̄ and σX² are the mean value and the variance, respectively, of X. The Gaussian
distribution function cannot be expressed in closed form; both Gaussian density and
distribution function are shown in Figure B.3.
If Z = X + Y is the sum of two independent random variables X and Y, its probability density function is the convolution of the individual densities:

fZ(z) = fX(x) ∗ fY(y) = ∫_{−∞}^{∞} fX(z − y) fY(y) dy    (B.17)
In Exercise B.4, this result is used to illustrate the central limit theorem.
An important fact concerning Gaussian random variables is that any linear combination
of Gaussian random variables, independent or not, is also Gaussian.
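Equation (B.17) also underlies the central limit theorem mentioned above: convolving the uniform density with itself a few times already produces a bell-shaped density. A short numerical sketch (grid and step size are arbitrary choices):

```python
import numpy as np

dx = 0.01
x = np.arange(0.0, 1.0, dx)
f = np.ones_like(x)               # uniform density on [0, 1)

# Density of the sum of four independent uniform random variables:
# repeated discrete approximation of the convolution (B.17)
f_sum = f
for _ in range(3):
    f_sum = np.convolve(f_sum, f) * dx

z = np.arange(len(f_sum)) * dx    # grid for the summed variable
area = f_sum.sum() * dx           # stays 1: f_sum is still a density
mode = z[np.argmax(f_sum)]        # near the mean of the sum (about 2)
```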
Figure B.4: Output signals of five random generators, observed simultaneously over time intervals 1 to 5
Let X be a random variable that takes on the value of the generator output during a
given time interval. Using the relative-frequency approach, we might then estimate the
probability P (X = +1) as the number of times where X = 1, divided by the number N
of observed time intervals, where N is a large number. In this case, the output values are
observed sequentially in time. An alternative - and more useful - way of interpreting the
probability P (X = +1) is to assume that we have N identical random generators, and
observe their outputs simultaneously in time. The relative frequency is then the number
of generators with output +1 at a given time, divided by the number N of generators.
This is illustrated in Figure B.4. An important advantage of the latter approach is that
it is able to account for changes of the statistical properties of the generators over time
(e.g. aging of the random generators).
In the same way as in Figure B.4, we can imagine performing any random experiment
many times simultaneously. Another example is shown in Figure B.5: a random variable
is used to represent the voltage at the terminal of a noise generator. We can in fact
define two random variables X(t1 ) and X(t2 ) to represent the voltage at time t1 and t2 ,
respectively. The outcome of a particular experiment is then a waveform x(t). The entire
collection of possible waveforms {x(t)} is called an ensemble, and a particular outcome
x(t) is called a sample function of the ensemble. The underlying random experiment is
called a stochastic process. The difference between a random variable and a stochastic
process is that for a random variable an outcome in the probability space is mapped into
a number, whereas for a stochastic process it is mapped into a function of time.
Figure B.5: Sample functions of M noise generators, observed at times t1 and t2
Ergodic Processes
It is possible that almost every member x(t) of the ensemble of outcomes of a given sta-
tionary stochastic process X(t) has the same statistical properties as the whole ensemble.
In this case, it is possible to determine the statistical properties by examining only one
sample function. A process having this property is said to be ergodic. For an ergodic
process, time and ensemble averages are interchangeable: we can determine mean values
and moments by taking time averages as well as ensemble averages. For example, we have
for the nth moment
E[Xⁿ(t)] = ∫_{−∞}^{∞} xⁿ(t) f_X(x(t)) dx(t) = lim_{T→∞} (1/2T) ∫_{−T}^{T} xⁿ(t) dt
The first integral represents the average over the ensemble of outcomes at a given time t,
whereas the second integral represents the average over time. We will use the notation
⟨x(t)⟩ = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) dt
to denote the time average of a sample function x(t). The above can therefore be written
as
E[Xⁿ(t)] = ⟨xⁿ(t)⟩
202 B. Probability and Stochastic Processes
We also have
σ_X² = E[(X(t) − X̄)²] = ⟨[X(t) − ⟨X(t)⟩]²⟩
Since a time average cannot be a function of time, it is clear from the above that an
ergodic process must be stationary - all non-stationary processes are non-ergodic. On
the other hand, stationarity does not necessarily imply ergodicity - it is possible for a
stationary process to be non-ergodic. Even though it is generally difficult to give physical
reasons for this, it is customary to assume ergodicity in practical applications unless there
are compelling physical reasons for not doing so.
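The interchangeability of time and ensemble averages can be checked numerically. The sketch below (Python with NumPy/SciPy is assumed here purely for illustration; the scripts accompanying these notes use Matlab) estimates the second moment of a stationary first-order autoregressive process once as a time average over a single long sample function, and once as an ensemble average over many sample functions at a fixed time instant:

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
a, sigma = 0.9, 1.0                      # x[k] = a*x[k-1] + sigma*e[k]
true_moment = sigma**2 / (1 - a**2)      # stationary E[X^2] of this process

# Time average <x^2> from ONE long sample function (start-up transient discarded)
e = rng.standard_normal(300_000)
x = lfilter([sigma], [1.0, -a], e)[50_000:]
time_avg = np.mean(x**2)

# Ensemble average E[X^2] at ONE fixed time instant, over many sample functions
E = rng.standard_normal((20_000, 500))
X = lfilter([sigma], [1.0, -a], E, axis=1)
ens_avg = np.mean(X[:, -1]**2)           # last column = fixed time instant

# For an ergodic process both estimates agree with the true second moment
```

The process and its parameters are arbitrary choices; any stable filter driven by white noise would serve equally well.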
The first term on the right hand side is called the autocorrelation function
R_X(τ) = E[X(t)X(t + τ)] = ⟨x(t)x(t + τ)⟩
where x(t) is any sample function of X(t). Note that the time average on the right hand
side is identical with the definition of the time autocorrelation function of a deterministic
power signal.
It is straightforward to show that the autocorrelation function RX (τ ) of an ergodic process
has the following properties, see Exercise B.5.
1. |RX (τ )| ≤ RX (0) ∀τ .
2. RX (−τ ) = RX (τ ) ∀τ .
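Both properties can be observed on a sample estimate of the autocorrelation function; a small numerical sketch (Python assumed, data arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
# A correlated sample function: white noise smoothed by a moving average
x = np.convolve(rng.standard_normal(10_000), np.ones(5) / 5, mode="valid")
x -= x.mean()

N = len(x)
lags = np.arange(-50, 51)
# Biased estimate R(k) = (1/N) sum_t x(t) x(t+k)
R = np.array([np.dot(x[: N - abs(k)], x[abs(k):]) / N for k in lags])
R0 = R[lags == 0][0]

sym_ok = np.allclose(R, R[::-1])            # property 2: R(-tau) = R(tau)
bound_ok = np.all(np.abs(R) <= R0 + 1e-12)  # property 1: |R(tau)| <= R(0)
```

The biased estimator used here satisfies both properties exactly (the bound follows from the Cauchy-Schwarz inequality applied to the truncated sums).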
Crosscorrelation Function
Consider two random processes X(t) and Y (t) which are jointly stationary in the wide
sense. For fixed t and τ , X(t) and Y (t + τ ) are two random variables, and we can define
the crosscorrelation functions
RXY (τ ) = E[X(t)Y (t + τ )]
and
RY X (τ ) = E[Y (t)X(t + τ )]
A crosscorrelation function is not an even function of τ , however RXY (τ ) = RY X (−τ ) ∀τ .
There is not necessarily a maximum at τ = 0, however one can show that
The time average on the right hand side in the above equations is again identical with
the definition of the time crosscorrelation function of a deterministic power signal.
• The total power (the mean-square value) E[X²(t)] = σ_X² + X̄² is the sum of ac power and dc power.
Figure B.6 shows a typical example of an autocorrelation function. We can infer the
following from the plot:
• The ac power is A − B.
(Figure B.6: a typical autocorrelation function R_X(τ), with peak value R_X(0) = A and asymptotic value B as |τ| → ∞.)
A more widely used way of representing statistical properties of the process samples is
the covariance matrix, which - following the definition in (B.14) - is defined as
Λ_X = E[(X − X̄)(X − X̄)^T] = R_X − X̄X̄^T     (B.21)
where ρi denotes the correlation coefficient defined in (B.15) between samples taken at
time instants separated by i∆.
Another situation encountered in practical applications arises when the relationship between a number of different stochastic processes - say X₁(t), X₂(t), …, X_N(t) - is considered. The collection of processes can be represented by a single vector process X(t) = [X₁(t) X₂(t) … X_N(t)]^T.
Assuming that the vector process is wide-sense stationary, its correlation matrix is
R_X(τ) = E[X(t) X^T(t + τ)] =
[ R₁(τ)     R₁₂(τ)   …   R₁N(τ) ]
[ R₂₁(τ)    R₂(τ)          ⋮    ]
[   ⋮                ⋱          ]
[ R_N1(τ)      …         R_N(τ) ]     (B.23)
where
Ri (τ ) = E[Xi (t)Xi (t + τ )] and Rij (τ ) = E[Xi (t)Xj (t + τ )]
Spectral Density
When dealing with deterministic signals, the transformation from time to frequency do-
main via Fourier and Laplace transform is known to greatly simplify the analysis of linear
systems. Reasons for this simplification are that differential equations and convolution
in time domain are replaced by algebraic equations and multiplication, respectively, in
frequency domain. We will see that similar simplifications are possible when dealing with
stochastic processes. Earlier it was shown that the autocorrelation function of an ergodic
process - defined as a statistical average - is equal to the definition of the time autocor-
relation function of a deterministic signal - computed as a time average. An important
frequency domain concept for deterministic signals is the spectral density, defined as the
Fourier transform of the autocorrelation function. We will see that this concept can be
extended to stochastic processes.
We first recall some concepts and definitions related to a deterministic signal x(t). Let

E = ∫_{−∞}^{∞} x²(t) dt

be the signal energy of x(t). We call a deterministic signal x(t) an energy signal if

0 < E < ∞
The Fourier transform X(ω) = ∫_{−∞}^{∞} x(t) e^{−jωt} dt of an energy signal has the physical significance that its magnitude represents the amplitude density as a
function of frequency. Moreover, Parseval’s theorem states that the signal energy can be
expressed in terms of the Fourier transform as
∫_{−∞}^{∞} x²(t) dt = (1/2π) ∫_{−∞}^{∞} |X(ω)|² dω
This motivates the definition of the energy spectral density

Ψ_X(ω) = |X(ω)|²

which indicates how the energy of the signal x(t) is distributed over frequency.
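Parseval's theorem can be checked numerically on a sampled energy signal (a Python/NumPy sketch for illustration; the DFT grid spacing dω = 2π/(n·dt) follows from the sampling parameters):

```python
import numpy as np

n, dt = 4096, 0.01
t = np.arange(n) * dt
x = np.exp(-t) * np.sin(10 * t)       # a decaying (energy) signal

E_time = np.sum(x**2) * dt            # approximates the integral of x²(t)

X = np.fft.fft(x) * dt                # approximates X(ω) on the DFT grid
dw = 2 * np.pi / (n * dt)
E_freq = np.sum(np.abs(X)**2) * dw / (2 * np.pi)

# E_time and E_freq agree (the discrete Parseval identity holds exactly)
```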
When dealing with power signals, we recall that - strictly speaking - the Fourier transform
of a power signal does not exist, but that for certain power signals a generalized Fourier
transform may be defined. For example, it is possible to define the generalized Fourier
transform of a periodic signal by using delta functions to represent the amplitude densities
at discrete frequencies. Similarly, if a power signal can be expressed as the sum of an
energy signal and a dc component, the dc component can be represented by a delta
function in the generalized Fourier transform.
For a general deterministic power signal x(t), the power spectral density SX (ω) is defined
as the Fourier transform of its autocorrelation function RX (τ ), i.e.
S_X(ω) = F[R_X(τ)] = ∫_{−∞}^{∞} R_X(τ) e^{−jωτ} dτ
The discussion earlier in this section showed that the autocorrelation itself may not be
square integrable, so that its Fourier transform may not exist. However, if that is the
case, the autocorrelation function has either a periodic component or can be decomposed
into a square integrable component and a dc component (see e.g. Figure B.6), so that a
generalized Fourier transform containing delta functions can be used instead.
i) taking the expectation, i.e. the average over the ensemble of sample functions, and
ii) letting T → ∞.
Wiener-Khinchine Theorem
Note that (B.25) is one way of defining the spectral density of a stochastic process in
terms of the Fourier transform of truncated sample functions. For deterministic power
signals, the spectral density is defined as the Fourier transform of the autocorrelation
function. It turns out that for an ergodic process, the same relationship can be derived
from the definition (B.25). Thus, if SX (ω) is the power spectral density of the ergodic
process X(t) as defined in (B.25), we can show that
S_X(ω) = F[R_X(τ)] = ∫_{−∞}^{∞} R_X(τ) e^{−jωτ} dτ     (B.26)
Taking the ensemble average, and changing the order of averaging and integration, we
obtain
E[|F[x_T(t)]|²] = ∫_{−T}^{T} ∫_{−T}^{T} E[x(t)x(τ)] e^{jω(t−τ)} dt dτ
             = ∫_{−T}^{T} ∫_{−T}^{T} R_X(t − τ) e^{jω(t−τ)} dt dτ
Thus, the mean-square value (the total power) of a process is proportional to the area of
the power spectral density.
A frequently encountered class of power spectral densities is characterized by being ra-
tional functions of ω. Since the functions are even in ω, only even powers are involved;
thus
S_X(ω) = S₀ (ω^{2m} + b_{2m−2}ω^{2m−2} + … + b₂ω² + b₀) / (ω^{2n} + a_{2n−2}ω^{2n−2} + … + a₂ω² + a₀)
Note that a finite mean-square value (total power) of the process requires m < n. If a
process has a dc or periodic component, there will be delta impulses δ(ω ± ω0 ) present in
the power spectral density, where ω0 is the frequency of the periodic component, or zero
for a dc component.
Spectral densities have been expressed so far as functions of the angular frequency ω.
When analyzing linear systems, it is often convenient to use the complex variable s instead.
This can be done by replacing ω by s/j, or equivalently ω² by −s², yielding S_X(s/j) instead of S_X(ω).
Since this notation is somewhat clumsy, we will simply write SX (s), keeping in mind that
this and SX (ω) are not the same function of their respective arguments. Note however
that for rational spectral densities, where only even powers are involved, the two are
equivalent.
The cross spectral densities are defined as S_XY(ω) = F[R_XY(τ)] and S_YX(ω) = F[R_YX(τ)], where R_XY(τ) and R_YX(τ) are the cross-correlation functions of the two processes.
In contrast to the power spectral density, a cross spectral density need not be real, positive or an even function of ω. One can however show that S_XY(ω) = S_YX(−ω), which follows from R_XY(τ) = R_YX(−τ).
White Noise
When studying linear systems with deterministic inputs, certain input signals play a
prominent role in analysis, the most important one being the unit delta impulse δ(t).
When dealing with stochastic inputs, a role similar to that of the delta impulse is played
by a particular stochastic process referred to as white noise. A stochastic process is called
white noise if its spectral density is constant over all frequencies, i.e.
SX (ω) = S0
Since the power spectral density is the Fourier transform of the autocorrelation function,
it is clear that the autocorrelation function of a white noise process is a delta function
RX (τ ) = S0 δ(τ ) (B.27)
and therefore
S_X(ω) = ∫_{−∞}^{∞} R_X(τ) e^{−jωτ} dτ = S₀ ∫_{−∞}^{∞} δ(τ) e^{−jωτ} dτ = S₀
A related idealization is bandlimited white noise, with spectral density S_X(ω) = S₀ for |ω| ≤ 2πW and S_X(ω) = 0 otherwise, where W is called the bandwidth of the process. Power spectral density and autocorrelation function of this process are shown in Figure B.7. Even though the total power is
finite (it is 2W S0 ), this process is also not realizable because physical processes cannot
have flat spectral density functions. It can however be approached arbitrarily close, and
its usefulness comes from the fact that when used as input to a system with a bandwidth
much smaller than W , the response will be very close to that obtained with unlimited
white noise as input.
Figure B.7: Power spectral density and autocorrelation function of bandlimited white
noise
If these processes are wide-sense stationary, mutually uncorrelated and have zero mean,
Gaussian Processes
The importance of the Gaussian distribution was already discussed earlier. It turns out
that Gaussian processes are not only realistic models of many real-life processes, they are
also very useful when analyzing linear systems. Whereas the former is due to the central
limit theorem, the latter is due to the fact that if the input signal to a linear system is a
Gaussian process, the output will also be Gaussian. Thus, when all inputs are Gaussian,
all signals in the system under consideration will have that property. Gaussian processes
therefore play a role similar to that of sinusoidal signals for steady state analysis.
If the input is a random signal that can be modelled as a stochastic process U (t), we write
Y(t) = ∫₀^∞ U(t − τ) g(τ) dτ     (B.29)
This equation defines the output as a stochastic process Y (t), whose sample functions
y(t) are generated by the sample functions u(t) of the input process according to (B.29).
In the rest of this section we will assume that the stochastic processes are stationary or
wide-sense stationary.
Observing that the integral of the impulse response is the static gain, we conclude that
Ȳ = G(0)Ū
where G(s) denotes the transfer function of the system. Thus, the mean value of the
output is equal to the mean value of the input times the static gain. In particular, if the
input has zero mean the output will also have zero mean.
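A quick discrete-time check of Ȳ = G(0)Ū (a Python sketch; the filter and its dc gain are illustrative choices, not taken from the notes):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(5)
b, a = [0.5], [1.0, -0.5]               # a stable first-order filter G
dc_gain = np.sum(b) / np.sum(a)         # static gain G at zero frequency: 1.0

u = 3.0 + rng.standard_normal(200_000)  # stochastic input with mean 3
y = lfilter(b, a, u)
y_mean = np.mean(y[1000:])              # discard the start-up transient

# y_mean ≈ dc_gain * 3.0: the output mean is the input mean times G(0)
```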
Since the expectation on the right hand side is the autocorrelation function of the input,
we have

E[Y²(t)] = ∫₀^∞ ∫₀^∞ R_U(τ₂ − τ₁) g(τ₁) g(τ₂) dτ₁ dτ₂     (B.30)
For the special case of a white noise input process with spectral density S0 , substituting
(B.27) in the above yields

E[Y²(t)] = S₀ ∫₀^∞ g²(τ) dτ
provided the integral on the right hand side exists.
RU (τ ) = S0 δ(τ )
and we obtain

R_Y(τ) = S₀ ∫₀^∞ g(λ) g(λ + τ) dλ
R_UY(τ) = E[U(t)Y(t + τ)]
        = E[ U(t) ∫₀^∞ U(t + τ − λ) g(λ) dλ ]
        = ∫₀^∞ E[U(t)U(t + τ − λ)] g(λ) dλ

and thus

R_UY(τ) = ∫₀^∞ R_U(τ − λ) g(λ) dλ     (B.32)

Similarly we obtain

R_YU(τ) = ∫₀^∞ R_U(τ + λ) g(λ) dλ

If the input is a white noise process with R_U(τ) = S_U δ(τ), then (B.32) simplifies to

R_UY(τ) = S_U g(τ) for τ ≥ 0, and R_UY(τ) = 0 otherwise.
This result provides a way of determining the impulse response of a linear system experi-
mentally by applying white noise at the input and computing the crosscorrelation between
input and output.
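This identification idea is easy to try in discrete time (a Python sketch; the first-order test system is an arbitrary stand-in for the unknown plant): applying unit-variance white noise and estimating the input-output crosscorrelation recovers the impulse response.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(3)
n = 400_000
u = rng.standard_normal(n)                   # white noise input, S_U = 1
b, a = [0.2], [1.0, -0.8]                    # "unknown" system for the test
y = lfilter(b, a, u)

g = lfilter(b, a, np.r_[1.0, np.zeros(19)])  # true impulse response, 20 taps
# R_uy(k) = E[u(t) y(t+k)] estimated by a time average
Ruy = np.array([np.dot(u[: n - k], y[k:]) / n for k in range(20)])

# For white-noise input, R_uy(k) ≈ S_U · g(k)
```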
S_Y(ω) = F[R_Y(τ)]
       = ∫_{−∞}^{∞} [ ∫₀^∞ ∫₀^∞ R_U(λ₂ − λ₁ − τ) g(λ₁) g(λ₂) dλ₁ dλ₂ ] e^{−jωτ} dτ
where G(ω) is the Fourier transform of g(λ). Since we assumed that the system under
consideration is stable and has no poles on the imaginary axis, we can replace the Fourier
transform G(ω) by the Laplace transform G(s) to obtain
Note however that SY (s) and SU (s) are not the same functions of their arguments as
SY (ω) and SU (ω) unless the spectral densities are rational.
The relationship (B.33) shows that the term G(s)G(−s) plays the same role in relating
stochastic input and output processes, as does the transfer function G(s) in relating
deterministic input and output signals. Note however that we assumed that the input
process is stationary (or wide-sense stationary). For non-stationary processes, the above
results do not apply in general.
Similarly, the cross spectral densities satisfy

S_UY(s) = G(s)S_U(s)   and   S_YU(s) = G(−s)S_U(s)
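The relation S_Y(ω) = |G(ω)|²S_U(ω) behind (B.33) can be verified numerically: a Welch estimate of the output spectral density of filtered white noise matches |G|² times the (one-sided) input density. A Python sketch with illustrative filter coefficients:

```python
import numpy as np
from scipy.signal import lfilter, welch, freqz

rng = np.random.default_rng(4)
u = rng.standard_normal(1_000_000)        # white noise, variance 1
b, a = [1.0, 0.5], [1.0, -0.9]            # a stable filter G (illustrative)
y = lfilter(b, a, u)

f, Syy = welch(y, fs=1.0, nperseg=4096)   # one-sided PSD estimate of the output
_, G = freqz(b, a, worN=2 * np.pi * f)    # frequency response on the same grid
Syy_theory = np.abs(G)**2 * 2.0           # one-sided S_U = 2 for unit variance

ratio = Syy[10:-10] / Syy_theory[10:-10]
# ratio ≈ 1 across the band: the estimate matches |G|² S_U
```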
Exercises
Problem B.1
Show that

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Problem B.2
Show that
σ_X² = E[X²] − X̄²
Problem B.3
Let the random variable X have the uniform probability density function
f_X(x) = 1/(b − a) for a ≤ x ≤ b, and f_X(x) = 0 otherwise
i) a = 1 and b = 2
ii) a = 0 and b = 2.
Problem B.4
Z = X1 + X2 + X3 + X4
for i = 1, 2, 3, 4. Use (B.17) and Matlab to plot fZ (z). Plot also the Gaussian probability
density function with zero mean and σ 2 = 1/3, and compare both functions.
Problem B.5
Show that the autocorrelation function (B.18) has the four properties listed in Section
B.2.
Problem B.6
Solutions to Exercises
C.1 Chapter 1
v_l = L di/dt = LC d²v_c/dt²

v_r = iR = RC dv_c/dt

v_s = v_c + v_r + v_l

v_s = v_c + RC dv_c/dt + LC d²v_c/dt²
c) According to the differential equation which is derived in part (b), by defining state
variables x1 = vc , x2 = dvc /dt and also vs as the input, we have
ẋ₁ = x₂
ẋ₂ = (1/LC)(−RC x₂ − x₁ + u)

Then a state space model can be formed as

A = [0  1; −1/(LC)  −R/L],   B = [0; 1/(LC)]

v_r = RC v̇_c = RC x₂

C = [0  RC]
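The model can be sanity-checked numerically: the eigenvalues of A must equal the roots of the characteristic polynomial LC s² + RC s + 1 of the differential equation. A Python sketch with arbitrary component values:

```python
import numpy as np

R, L, C = 1.0, 0.5, 0.1                  # illustrative component values
A = np.array([[0.0, 1.0],
              [-1.0 / (L * C), -R / L]])

eigs = np.sort_complex(np.linalg.eigvals(A))
roots = np.sort_complex(np.roots([L * C, R * C, 1.0]))
# eigs == roots: both are the poles of the circuit
```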
λⁿ + a_{n−1}λ^{n−1} + … + a₀ = 0

Λ = diag(λ₁, λ₂, …, λₙ)
so
Λⁿ + a_{n−1}Λ^{n−1} + … + a₀I = 0
This is the matrix characteristic equation.
A = TΛT⁻¹
A² = TΛT⁻¹TΛT⁻¹ = TΛ²T⁻¹
⋮
A^m = TΛ^mT⁻¹

TΛⁿT⁻¹ + a_{n−1}TΛ^{n−1}T⁻¹ + … + a₀TT⁻¹ = 0

Aⁿ + a_{n−1}A^{n−1} + … + a₀I = 0
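The theorem is easy to verify numerically for a concrete matrix (a Python sketch; np.poly returns the characteristic polynomial coefficients, highest power first):

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, -4.0]])   # any square matrix will do
coeffs = np.poly(A)                         # here: s² + 3s + 2, i.e. [1, 3, 2]
n = A.shape[0]

# Evaluate A^n + a_{n-1} A^{n-1} + ... + a_0 I
M = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
# M is (numerically) the zero matrix, as Cayley-Hamilton asserts
```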
a) By eigenvalue decomposition

A = TΛT⁻¹,   T = [t₁  t₂],   Λ = [λ₁  0; 0  λ₂]

so

(sI − A)X(s) = x(0),   T(sI − Λ)T⁻¹X(s) = x(0)

X(s) = T(sI − Λ)⁻¹T⁻¹x(0) = T [1/(s−λ₁)  0; 0  1/(s−λ₂)] T⁻¹ x(0)
It is worth observing that we cannot investigate the dynamics of a system with its transfer function when it is driven only by non-zero initial conditions, while this possibility is provided by the state space model.
c) From the result of part (b) for any initial condition of the form x(0) = k1 t1 + k2 t2 ,
the solution x in frequency domain is
X(s) = k₁/(s − λ₁) t₁ + k₂/(s − λ₂) t₂
Similarly (λt)n+1 is
All further terms can be replaced by expressions where the highest power of λt is
n − 1, so
e^{λt} = α₀(t) + α₁(t)λ + … + α_{n−1}(t)λ^{n−1}

The αᵢ are functions of t because each coefficient of (λt)^k in the expressions above involves t.
e^{At} = TIT⁻¹ + TΛtT⁻¹ + (1/2!)TΛ²t²T⁻¹ + …
      = T(I + Λt + (1/2!)Λ²t² + …)T⁻¹
      = Te^{Λt}T⁻¹
By comparing the above expression with what we have already proved about e^{At} = Te^{Λt}T⁻¹, we can conclude

e^{Λt} = diag(e^{λ₁t}, …, e^{λₙt}) = α₀(t)I + α₁(t) diag(λ₁, …, λₙ) + … + α_{n−1}(t) diag(λ₁^{n−1}, …, λₙ^{n−1})
The above equation is a matrix equation and can be expressed by n separate equations.
Since we started from equation (1.23) to obtain the above equation, it is now clear that the functions αᵢ(t), i = 1, …, n − 1, are identical in both equations.
When the eigenvalues are distinct, the rows of the square matrix in the above
equation are linearly independent and this matrix would be invertible and then it
The eigenvalues of A for this system are λ₁ = −3 and λ₂ = −2, so α₀ and α₁ can be found as solutions to this equation:

[α₀; α₁] = [1  −3; 1  −2]⁻¹ [e^{−3t}; e^{−2t}] = [−2  3; −1  1] [e^{−3t}; e^{−2t}]
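The resulting functions α₀(t) = −2e^{−3t} + 3e^{−2t} and α₁(t) = −e^{−3t} + e^{−2t} can be checked against the matrix exponential. The sketch below uses an illustrative companion matrix with the same eigenvalues −3 and −2 (the problem's actual A matrix may differ; the identity holds for any A with these eigenvalues):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-6.0, -5.0]])   # eigenvalues -3 and -2 (illustrative)

ok = True
for t in (0.0, 0.3, 1.0):
    a0 = -2 * np.exp(-3 * t) + 3 * np.exp(-2 * t)
    a1 = -np.exp(-3 * t) + np.exp(-2 * t)
    # e^{At} = alpha0(t) I + alpha1(t) A
    ok = ok and np.allclose(expm(A * t), a0 * np.eye(2) + a1 * A)
```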
The output response to the given input, with y₀(t) from above, is computed below.
The matrix eA(t−τ ) is partitioned as
e^{A(t−τ)} = [Φ₁₁  Φ₁₂; Φ₂₁  Φ₂₂]
y(t) = y₀(t) + ∫₀ᵗ c e^{A(t−τ)} b u(τ) dτ
     = y₀(t) + 2 ∫₀ᵗ [1  1] [Φ₁₁  Φ₁₂; Φ₂₁  Φ₂₂] [1; 0] dτ
     = y₀(t) + 2 ∫₀ᵗ (Φ₁₁ + Φ₂₁) dτ
     = y₀(t) + 2 ∫₀ᵗ (α₀(t−τ) − 6α₁(t−τ) − 6α₁(t−τ)) dτ
     = y₀(t) + 2 ∫₀ᵗ (10e^{−3(t−τ)} − 9e^{−2(t−τ)}) dτ

y(t) = y₀(t) − (20/3)(e^{−3t} − 1) + (18/2)(e^{−2t} − 1)
∫₀ᵗ e^{g(t−τ)} dτ = (1/g)(e^{gt} − 1)
Y/U = (1/m) / (s² + (b/m)s + k/m) = b(s)/a(s)
As has been shown in Section 1.1, we can construct a state space model of a system
in controller canonical form from its transfer function by comparing (1.11) with
(1.14) and (1.15). In this way there are two first order differential equations and an
output equation
ẋ₁ = x₂
ẋ₂ = ẍ₁ = −(k/m)x₁ − (b/m)x₂ + u
y = (1/m)x₁
which are equivalent to the following state space model

[ẋ₁; ẋ₂] = [0  1; −k/m  −b/m] [x₁; x₂] + [0; 1] u

y = [1/m  0] [x₁; x₂]
Second way: We can choose the states as x1 = y and x2 = ẏ which have physical
significance and are, respectively, the displacement and velocity of the mass. In this
case it is straightforward to write a state space model of the system as

[ẋ₁; ẋ₂] = [0  1; −k/m  −b/m] [x₁; x₂] + [0; 1/m] u

y = [1  0] [x₁; x₂]
By comparing this state space model with (1.14) and (1.15), we realise that this model is not in the controller canonical form. However, if we apply the similarity transformation T = (1/m)I, we can transform this model to the controller canonical form.
Thus,

(sI − A)⁻¹ = [s  −1; 1  s+0.1]⁻¹ = 1/(s² + 0.1s + 1) [s+0.1  1; −1  s]

and

c(sI − A)⁻¹b = 1/(s² + 0.1s + 1)
c) In open loop
ẋ = Ax + bu y = cx + du
In closed loop
u = fx
ẋ = (A + bf )x
The closed-loop system (A + bf, b, c) is again in controller canonical form (why?), therefore we can directly observe that the coefficients of the closed-loop characteristic polynomial are
ā0 = 1.0 − f1
ā1 = 0.1 − f2
s² + 2ζωₙs + ωₙ² = 0

and finally

f₁ = 1.0 − (1.314)² = −0.73
f₂ = 0.1 − 2 × 0.7 × 1.314 = −1.74
Note that the fact that the closed loop system was again in controller canonical form,
simplified the calculation of state feedback coefficients; this is the main reason for con-
structing the model in this canonical form.
Matlab: see cs1_springmass.m
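With the rounded gains, the closed-loop eigenvalues can be verified directly (a Python sketch mirroring what cs1_springmass.m does in Matlab):

```python
import numpy as np

# Controller canonical model of the spring-mass system (a0 = 1.0, a1 = 0.1)
A = np.array([[0.0, 1.0], [-1.0, -0.1]])
b = np.array([[0.0], [1.0]])
f = np.array([[-0.73, -1.74]])             # rounded state feedback gains

zeta, wn = 0.7, 1.314
desired = np.sort_complex(np.roots([1.0, 2 * zeta * wn, wn**2]))
achieved = np.sort_complex(np.linalg.eigvals(A + b @ f))
# achieved ≈ desired (up to the rounding of the gains)
```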
ẋ1 = −a0 x3 + b0 u
ẋ2 = −a1 x3 + b1 u + x1
ẋ3 = −a2 x3 + b2 u + x2
y = x3
b) The transfer function can be directly calculated in the following way. We take
Laplace Transforms of both sides of the first order differential equations and the
output equation to obtain
sX1 = −a0 X3 + b0 U
sX2 = −a1 X3 + b1 U + X1
sX3 = −a2 X3 + b2 U + X2
Y = X3
Now the standard procedure is to solve the first three equations to find X₁, X₂ and X₃ based on the given model parameters and then substitute them in the output equation, which is equivalent to using the formula G(s) = c(sI − A)⁻¹b. Here, as the output equals X₃, we can simply solve only the first two equations. Thus, we rearrange the first two equations in matrix form

[s  0; −1  s] [X₁; X₂] = [−a₀X₃ + b₀U; −a₁X₃ + b₁U]

Then we have

[X₁; X₂] = (1/s²) [s  0; 1  s] [−a₀X₃ + b₀U; −a₁X₃ + b₁U]

which results in

X₂ = (1/s²) (−(a₁s + a₀)X₃ + (b₁s + b₀)U)
By substituting this in the third equation to find X₃ based on U and then substituting X₃ in the output equation, we finally obtain the transfer function as

Y(s)/U(s) = (b₂s² + b₁s + b₀)/(s³ + a₂s² + a₁s + a₀)
Another way: it is also possible to first construct the governing differential equation of the system and then calculate the transfer function by taking Laplace transforms of both sides of the equation.
ẏ = ẋ₃ = −a₂y + b₂u + x₂
ÿ = −a₂ẏ + b₂u̇ + ẋ₂ = −a₂ẏ + b₂u̇ − a₁y + b₁u + x₁
y⁽³⁾ = −a₂ÿ + b₂ü − a₁ẏ + b₁u̇ + ẋ₁ = −a₂ÿ + b₂ü − a₁ẏ + b₁u̇ − a₀y + b₀u

Y(s³ + a₂s² + a₁s + a₀) = U(b₂s² + b₁s + b₀)
H(s) = 4 + (s² + 5s + 2)/(s³ + 6s² + 10s + 8) = 4 + b̃(s)/a(s)
Then by comparing this with the equation G(s) = c(sI − A)−1 b + d (1.17), we realize that
the first term is a ‘feedthrough’ term, corresponding to d = 4.
The controller and observer forms are calculated from the coefficients of a(s) and b̃(s);
note that a(s) is the denominator of H(s), whereas b̃(s) is not the numerator of H(s) but
of the strictly proper remainder after polynomial division.
Controller canonical form

A = [0  1  0; 0  0  1; −a₀  −a₁  −a₂] = [0  1  0; 0  0  1; −8  −10  −6],   b = [0; 0; 1]
c = [2  5  1],   d = 4

Observer canonical form

A = [0  0  −8; 1  0  −10; 0  1  −6],   b = [2; 5; 1]
c = [0  0  1],   d = 4
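The controller canonical realization can be checked in a few lines (a Python sketch using scipy.signal.ss2tf; the notes otherwise use Matlab's equivalent):

```python
import numpy as np
from scipy.signal import ss2tf

A = np.array([[0, 1, 0], [0, 0, 1], [-8, -10, -6]], dtype=float)
b = np.array([[0.0], [0.0], [1.0]])
c = np.array([[2.0, 5.0, 1.0]])
d = np.array([[4.0]])

num, den = ss2tf(A, b, c, d)
# H(s) = (4s³ + 25s² + 45s + 34)/(s³ + 6s² + 10s + 8)
#      = 4 + (s² + 5s + 2)/a(s), matching the polynomial division above
```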
Alternatively, the controller and observer canonical forms can be obtained without polynomial division by using equation (1.13) and following the derivation in time domain discussed in Chapter 1. This is illustrated here for the case of the controller canonical form:
For a bi-proper system, equation (1.13) changes to
y(t) = bₙ dⁿv(t)/dtⁿ + b_{n−1} d^{n−1}v(t)/dt^{n−1} + … + b₁ dv(t)/dt + b₀v(t)
and consequently, we have an additional term in equation (1.15):

y(t) = [b₀  b₁  …  b_{n−1}] [x₁; x₂; …; xₙ] + bₙẋₙ
Then, because I = T⁻¹T,

G₂(s) = c₁T[T⁻¹sT − T⁻¹A₁T]⁻¹T⁻¹b₁
b) With variables at steady state indicated by u0 , h0 , fin0 , the steady state is defined
by ḣ0 = 0, so
u₀ = f_in0 / (k_t √h₀)
c) For small changes in h,u
(∂/∂h) f_nl(h, u) = −(k_t/A_t) · u/(2√h)
(∂/∂u) f_nl(h, u) = −(k_t/A_t) √h

so at h₀, u₀

δḣ = −(k_t u₀)/(2A_t √h₀) δh − (k_t/A_t)√h₀ δu = −f_in0/(2A_t h₀) δh − (k_t/A_t)√h₀ δu
s̈ = f1 (x, u)
α̈ = f2 (x, u)
m_eq := m_p + 2m_w + 2J_w/r²
J_eq := J_p + m_p l²
∆ := m_eq J_eq − (m_p l cos α)² = det(M(s, α))

f_q(s, α) = [ (k_t/(Rr)) u − (k_t k_b/(Rr²)) ṡ + (k_t k_b/(Rr)) α̇ + m_p l sin(α) α̇² ;
              −(k_t/R) u + (k_t k_b/(Rr)) ṡ − (k_t k_b/R) α̇ + m_p g l sin(α) ]
sin(α) = 0.
solution I: linearizing (1.24) first using small angle approximation, then deriving
a linear state space model
solution II: directly calculating the jacobians from the Taylor series
solution I
simplifying the following nonlinear terms with the small angle approximation:

cos(α) ≈ 1,   sin(α) ≈ α,   α̇² ≈ 0
(m_p + 2m_w + 2J_w/r²) s̈ + m_p l α̈ + (k_t k_b/(Rr²)) ṡ − (k_t k_b/(Rr)) α̇ = (k_t/(Rr)) u
(J_p + m_p l²) α̈ + m_p l s̈ − (k_t k_b/(Rr)) ṡ + (k_t k_b/R) α̇ − m_p g l α = −(k_t/R) u

These linearized equations can be written in matrix notation

M₀ [s̈; α̈] + D₀ [ṡ; α̇] + K₀ [s; α] = F₀ u

with

K₀ = [0  0; 0  −m_p g l],   F₀ = [k_t/(Rr); −k_t/R]
a₁₁ = −(k_t k_b/(Rr)) (1/∆₀) (J_eq/r + m_p l)
a₁₂ = (k_t k_b/(Rr)) (1/∆₀) (J_eq + m_p l r)
a₂₁ = (k_t k_b/(Rr)) (1/∆₀) (m_eq + m_p l/r)
a₂₂ = −(k_t k_b/(Rr)) (1/∆₀) (m_eq r + m_p l)
a₃₁ = −(m_p g l/∆₀) m_p l
a₃₂ = (m_p g l/∆₀) m_eq
b₁ = (k_t/(Rr)) (1/∆₀) (J_eq + m_p l r)
b₂ = −(k_t/(Rr)) (1/∆₀) (m_eq r + m_p l)
solution II
directly calculating the jacobians from the Taylor series
A = [ 0        0        1        0;
      0        0        0        1;
      ∂f₁/∂s   ∂f₁/∂α   ∂f₁/∂ṡ   ∂f₁/∂α̇;
      ∂f₂/∂s   ∂f₂/∂α   ∂f₂/∂ṡ   ∂f₂/∂α̇ ] |_{x,0}

B = [0; 0; ∂f₁/∂u; ∂f₂/∂u] |_{x,0}
derivative w.r.t. s:

∂f₁/∂s = ∂f₂/∂s = 0

since f₁ and f₂ are independent of s.

derivative w.r.t. α:

For ∂f₁/∂α and ∂f₂/∂α the partial derivatives simplify because ∆′|_{x,0} = 0.
5◦ : stable and good match between linear and nonlinear simulation. The maxi-
mum voltage is inside the limitation (umax < Vs ).
(Plots: comparison of the linear and nonlinear simulation responses for the 5° case; the two curves coincide.)
c) 10◦ : still stable, but difference between linear and nonlinear simulation. Since the
maximum voltage is not inside the limitation anymore, the saturation block
has an influence on the input voltage.
15°: the difference becomes more noticeable. The nonlinear states become unstable, and the nonlinear input is completely in saturation.
(Plots: comparison of the linear and nonlinear simulation responses for the 10° and 15° cases.)
There is a good match between experiment and simulation until t = 3.5 s, but then the difference in the position grows over time. In the experiment, the input voltage is corrupted by noise.
Figure C.6: comparison of simulation and experiment (d0 = 7.2 V , tstart = 3 s, ∆t = 0.1 s)
C.2 Chapter 2
This has rank 1, so the system is not controllable. The system is transformed to
the controller form using
T = [−2  0.5; 1  1]

which gives the transformed system A and b matrices

A = [−1  0.5; 0  1],   b = [1; 0]
c) The controllable subspace is the line x2 = −0.5x1 which is also the range of the
controllability matrix. The uncontrollable subspace is perpendicular to this.
x₁ᵢ(s) = x₁₀/(s + 1)
x₂ᵢ(s) = x₁₀/((s − 1)(s + 1)) + x₂₀/(s − 1)

Yᵢ(s) = x₁₀/((s − 1)(s + 1)) + x₂₀/(s − 1) = −x₁₀/(2(s + 1)) + (0.5x₁₀ + x₂₀)/(s − 1)
The solution has a stable and an unstable part (eigenvalues −1, 1). There is an unbounded, unstable solution when 0.5x₁₀ + x₂₀ ≠ 0. Otherwise the solution is bounded and indeed eventually reaches the origin. Comparing this to the answer of part (c), 0.5x₁₀ + x₂₀ ≠ 0 also means that the initial state is not in the controllable subspace. But this is of course just a coincidence and in general there is no relationship between the concepts of controllability of a system and its free response!
In the phase portrait of a two-dimensional linear system, there would be some
straight-line trajectories which are the lines spanned by eigenvectors of A. If we
start on one of these lines, we would stay on it forever and the solution is a simple
exponential growth or decay along it. If the line is spanned by an eigenvector which
corresponds to a stable eigenvalue, the solution would be an exponential decay and
if it is spanned by an eigenvector which corresponds to an unstable eigenvalue, the
solution would be an exponential growth. Here A has two eigenvalues -1 and 1
and their corresponding eigenvectors are [−2 1]T and [0 1]T . So if the initial values
satisfy the equation 0.5x10 + x20 = 0, it means that the solution stays on the line
spanned by [−2 1]T which corresponds to the stable eigenvalue -1 and therefore the
solution would be an exponential decay towards the origin. Such a solution has
nothing to do with the vector b.
If the system is not controllable, the controllability matrix does not have full rank.
In order for the 2 × 2 controllability matrix not to have full rank, its columns
should be linearly dependent or Ab = λb which means that the vector b should
be an eigenvector of A. Thus, if we have either [−2 1]T or [0 1]T as the vector
b, the system would be uncontrollable. If we have a system with b = [0 1]T then
the line spanned by this vector is the controllable subspace and the line spanned
by [−2 1]T is still the trajectory which exhibits exponential decay and unlike the
previous system, they are not the same lines.
f) The transfer function is not a complete description of the behaviour: it assumes the initial conditions are zero, and has a pole-zero cancellation.
Simulation: cs2_unstab.m
so

(d/dt) e^{At}bb^Te^{A^Tt} = Ae^{At}bb^Te^{A^Tt} + e^{At}bb^Te^{A^Tt}A^T
b) From the definition,

AW_c + W_cA^T = A ∫₀ᵗ e^{Aτ}bb^Te^{A^Tτ} dτ + ∫₀ᵗ e^{Aτ}bb^Te^{A^Tτ} dτ A^T
             = ∫₀ᵗ {Ae^{Aτ}bb^Te^{A^Tτ} + e^{Aτ}bb^Te^{A^Tτ}A^T} dτ
             = ∫₀ᵗ (d/dτ) e^{Aτ}bb^Te^{A^Tτ} dτ

Since e^{Aτ} → 0 as τ → ∞ for stable A:

AW_c + W_cA^T → −bb^T as t → ∞
adj(sI − A) = Is^{n−1} + (A + a_{n−1}I)s^{n−2} + … + (A^{n−1} + a_{n−1}A^{n−2} + … + a₁I)

X⁻¹ = adj(X)/det(X)

adj(sI − A) = "Resolvent" = (sI − A)⁻¹ det(sI − A)

Let

RHS = Is^{n−1} + (A + a_{n−1}I)s^{n−2} + … + (A^{n−1} + a_{n−1}A^{n−2} + … + a₁I)

Then

RHS·(sI − A) = [Is^{n−1} + (A + a_{n−1}I)s^{n−2} + … + (A^{n−1} + a_{n−1}A^{n−2} + … + a₁I)](sI − A)

and

RHS·(sI − A) = Isⁿ + (A + a_{n−1}I − A)s^{n−1} + (a_{n−2}I + a_{n−1}A + A² − a_{n−1}A − A²)s^{n−2} + … − (a₁A + … + a_{n−1}A^{n−1} + Aⁿ)

RHS·(sI − A) = I(sⁿ + a_{n−1}s^{n−1} + a_{n−2}s^{n−2} + … + a₁s) − (a₁A + … + a_{n−1}A^{n−1} + Aⁿ)

From the Cayley-Hamilton Theorem, Aⁿ + a_{n−1}A^{n−1} + … + a₁A + a₀I = 0, thus

a₁A + … + a_{n−1}A^{n−1} + Aⁿ = −a₀I

and hence

RHS·(sI − A) = I(sⁿ + a_{n−1}s^{n−1} + … + a₁s + a₀) = I det(sI − A)

so RHS = adj(sI − A).
(s − 1)(s + 4) + 6 = s² + 3s + 2

so

p = [ā₁ − a₁   ā₀ − a₀]
p = [2.8 − 3   4 − 2] = [−0.2   2]
T_a = [1  3; 0  1],   T_a⁻¹ = [1  −3; 0  1]

The controllability matrix C is

[b  Ab] = [3  1; 1  5]

C⁻¹ = (1/14) [5  −1; −1  3]

So the solution is

f = −p T_a⁻¹ C⁻¹
f = −[−0.2  2] [1  −3; 0  1] (1/14) [5  −1; −1  3] = [0.26  −0.57]
For the script for the step response see cs2_BassGura.m.
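The result can also be reproduced numerically. The A and b below are inferred from C = [b Ab] and the factorization (s − 1)(s + 4) + 6, and are therefore an assumption about the problem data; with them, the Bass-Gura gain reproduces the value above and places the poles exactly (a Python sketch paralleling cs2_BassGura.m):

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, -4.0]])    # inferred: det(sI-A) = s² + 3s + 2
b = np.array([[3.0], [1.0]])

Ta = np.array([[1.0, 3.0], [0.0, 1.0]])
Cm = np.hstack([b, A @ b])                  # controllability matrix [b  Ab]
p = np.array([[-0.2, 2.0]])                 # [ā1 - a1   ā0 - a0]

f = -p @ np.linalg.inv(Ta) @ np.linalg.inv(Cm)       # ≈ [0.26  -0.57]
cl = np.sort_complex(np.linalg.eigvals(A + b @ f))
target = np.sort_complex(np.roots([1.0, 2.8, 4.0]))  # desired s² + 2.8s + 4
# cl == target: the closed-loop poles match the desired polynomial
```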
a) If q^TA = λq^T and q^Tb = 0 then

q^TC = q^T[b  Ab  A²b  …  A^{n−1}b]

q^Tb = 0
q^TAb = λq^Tb = 0

so

q^TA^mb = λ^mq^Tb = 0

q^TC = 0
Now consider that for matrix Āc̄ there exists a left eigenvector xc̄ such that
q^T(sI − A) = 0 and q^Tb = 0

q^TA = sq^T and q^Tb = 0

That is, such a vector q can exist if and only if it is a left eigenvector of A with q^Tb = 0.
d) From parts (a) and (b), it follows that a system is uncontrollable iff some left eigenvector of A belongs to the left null space of b, i.e. a q exists such that q^TA = λq^T and q^Tb = 0. Given such a q, we know from part (c) that there exists some s ∈ ℂ such that [sI − A  b] does not have full rank. This implies the following: there exists an s ∈ ℂ such that [sI − A  b] does not have full rank, if and only if the system is uncontrollable. Note that such an s must be an eigenvalue of A, because if s is not an eigenvalue of A, (sI − A) always has full rank, by definition of eigenvalues (and therefore [sI − A  b] always has full (row) rank if s is not an eigenvalue of A).

From the above, we conclude that the system is controllable iff for all s ∈ ℂ, [sI − A  b] has full rank.
x 1 = θ1 , x 2 = θ2 , x3 = θ̇1 , x4 = θ̇2
we have

A = [0  0  1  0; 0  0  0  1; a₁  a₂  0  0; a₃  a₄  0  0],   b = [0; 0; −1/(M l₁); −1/(M l₂)]

a₁ = (M + m)g/(M l₁),   a₂ = mg/(M l₁)
a₃ = mg/(M l₂),   a₄ = (M + m)g/(M l₂)
b) If l₁ = l₂ = l, these coefficients become a₁ = a₄ = (M + m)g/(M l) and a₂ = a₃ = mg/(M l).
c) This then causes the last two rows of the controllability matrix to be identical:
C = [b  Ab  A²b  A³b]

With b₁ := −1/(M l) and b̃₁ := (a₁ + a₂)b₁,

Ab = [b₁; b₁; 0; 0],   A²b = [0; 0; (a₁ + a₂)b₁; (a₂ + a₁)b₁] = [0; 0; b̃₁; b̃₁],   A³b = [b̃₁; b̃₁; 0; 0]

C = [0  b₁  0  b̃₁; 0  b₁  0  b̃₁; b₁  0  b̃₁  0; b₁  0  b̃₁  0]
C is singular: the controllable subspace is defined by x1 = x2 , x3 = x4 , that is with
θ1 = θ2 and θ̇1 = θ̇2
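The rank deficiency is easy to confirm numerically with arbitrary parameter values (a Python sketch, l₁ = l₂ = l as in part (b)):

```python
import numpy as np

M, m, g, l = 2.0, 0.5, 9.81, 1.0           # illustrative values, l1 = l2 = l
a1, a2 = (M + m) * g / (M * l), m * g / (M * l)
a3, a4 = a2, a1                             # equal lengths: a3 = a2, a4 = a1

A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [a1, a2, 0, 0],
              [a3, a4, 0, 0]])
b = np.array([0, 0, -1 / (M * l), -1 / (M * l)])

C = np.column_stack([np.linalg.matrix_power(A, k) @ b for k in range(4)])
rank = np.linalg.matrix_rank(C)             # 2, not 4: the system is uncontrollable
```

The controllable subspace is two-dimensional: the two identical pendulums can only be moved together.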
A_c = T⁻¹AT,   b_c = T⁻¹b,   c_c = cT

b) To calculate t₁, t₂, t₃:

AT = TA_c

A[t₁  t₂  t₃] = TA_c = [t₁  t₂  t₃] [0  1  0; 0  0  1; −a₀  −a₁  −a₂]

At₃ = t₂ − a₂t₃
At₂ = t₁ − a₁t₃
At₁ = −a₀t₃

t₃ = b
t₂ = Ab + a₂b
t₁ = At₂ + a₁b = A²b + a₂Ab + a₁b

t₁ = [b  Ab  A²b] [a₁; a₂; 1]
t₂ = [b  Ab  A²b] [a₂; 1; 0]
t₃ = [b  Ab  A²b] [1; 0; 0]

T = C [a₁  a₂  1; a₂  1  0; 1  0  0]
d) The second matrix on the right hand side is always invertible. So T is invertible,
and therefore is an allowable transformation matrix, if and only if C is invertible.
ẋ1 = u − a0 x2
ẋ2 = x1 − a1 x2
y = g 1 x1 + g 2 x2
A = [0  −a₀; 1  −a₁],   b = [1; 0],   c = [g₁  g₂]

C = [b  Ab] = I

The transfer function is c(sI − A)⁻¹b:

(sI − A) = [s  a₀; −1  s + a₁],   (sI − A)⁻¹ = 1/(s² + a₁s + a₀) [s + a₁  −a₀; 1  s]

so

c(sI − A)⁻¹b = 1/(s² + a₁s + a₀) [g₁  g₂] [s + a₁  −a₀; 1  s] [1; 0] = (g₁(s + a₁) + g₂)/(s² + a₁s + a₀)
ẋ = Ax + bu,   y = cx,   u = fx + u_v
If the system is controllable, there exists a similarity transformation x = Tco xco such that
the matrices Aco , bco are in the controllability form
C.3 Chapter 3
a) The states and the matrices A and b of the state space model are

xᵀ = [d  ḋ  d + Lθ  ḋ + Lθ̇]

A = [0 1 0 0; 0 −F/M 0 0; 0 0 0 1; −g/L 0 g/L 0],  b = [0; 1/M; 0; 0]

The angle θ is the output of the system, so

c = [−1/L  0  1/L  0]
The characteristic polynomial factors as s(s + F/M)(s² − g/L) = 0, so the eigenvalues are {0, −F/M, ±√(g/L)}.
A condition for controllability is that there does not exist a vector q ≠ 0 such that qᵀA = λqᵀ (equivalently Aᵀq = λq) and qᵀb = 0. Dually, the system is unobservable if there exists a p ≠ 0 with Ap = λp and cp = 0.
d) With the new state variables, from the original state space model the following
relations follow:
d̈ = −(F/M)ḋ + (1/M)u

d̈ + Lθ̈ = −(g/L)d + (g/L)(d + Lθ)

Lθ̈ = gθ − d̈

So with the new state vector [ḋ  Lθ  Lθ̇  d]ᵀ:

ẋ1 = d̈ = −(F/M)x1 + (1/M)u
ẋ2 = Lθ̇ = x3
ẋ3 = Lθ̈ = gθ + (F/M)ḋ − (1/M)u = (F/M)x1 + (g/L)x2 − (1/M)u
ẋ4 = ḋ = x1
b̄ᵀ = [1/M  0  −1/M  0]

c̄ = (1/L)[0  1  0  0] = [c̄ₒ  0]
L
The matrices are in the form of the Kalman decomposition and Āō - which is zero
- represents the unobservable state d. Physically, this means that it is not possible
to establish how far the cart has moved by measuring the angle alone.
rank O = 4
This can also be shown, for example, using the PBH observability test. The system
is simulated in cs3_pendel.m
Problem 3.3 (Necessity of existence of T for two state space realizations with identical
transfer functions)
a) The transformation from S1 to the controller form Sc is T1c and the transformation from Sc to S2 is T2c⁻¹. Hence the transformation from S1 to S2 is T2c⁻¹T1c.
a) The term (s + 1) in the numerator and denominator cancel each other; this cancellation corresponds to an uncontrollable mode, an unobservable mode, or a mode that is both uncontrollable and unobservable.
b) The controller and observer canonical forms are calculated from the coefficients of
numerator and denominator of the transfer function:
Controller canonical form:

A = [0 1; −a0 −a1] = [0 1; −2 −3],  b = [0; 1],  c = [1  1]

Observer canonical form:

A = [0 −2; 1 −3],  b = [1; 1],  c = [0  1]
Two state space models corresponding to the given transfer function are constructed;
the controller canonical form which is always controllable and the observer canonical
form which is always observable. We also know that controllability and observability
are properties of a linear system and are invariant under similarity transformation.
On the other hand, the pole-zero cancellation represents an uncontrollable and/or unobservable mode. Hence we predict that the above controller canonical model is not observable and the above observer canonical model is not controllable. We also
predict there does not exist a similarity transformation to convert one of these mod-
els into the other. These two state space realizations belong to different systems,
although they have an identical transfer function.
C.4 Chapter 4
a) The model is in controllable canonical form. Thus, the gain is f = [−1.5  −1].
c) In order to obtain the closed-loop Bode plot we should first construct the closed-loop system, which is given as

[ẋ; x̂̇] = [A  bf; −lc  A + bf + lc][x; x̂] + [b  0; 0  −l][d; n]

y = [c  0][x; x̂]

The Bode plot can then be generated from the above closed-loop model.
d) The faster observer poles lead to faster response of the plant, but at the expense
of increased sensitivity to high frequency noise: this is evident in the time and
frequency domain responses.
a) The system is in controllable canonical form, so the poles are the solutions of

s³ + 4s² + 5s + 2 = 0

and the zeros are the solutions of

c3 s² + c2 s + c1 = s² + 4s + 5 = 0
x0 = (zI − A)−1 b
To cope with the complex signals, we make use of the principle of superposition to find real signals for the initial condition and control input. As the system is linear, if we take a linear combination of the initial conditions x0 and x̄0 and, at the same time, the same combination of their corresponding control inputs, then the output of the system and the solution for x are the same combination of the individual results. Thus, such combinations again produce zero output.
We form two sets of such combinations. In the first set, the real initial condition
and the corresponding real input are
x01 = (x0 + x̄0)/2 = [0.5  −1  1.5]ᵀ,  u1 = u0(e^(z1 t) + e^(z̄1 t))/2 = e^(−2t) cos(t)

and in the second set, the real initial condition and the corresponding real input are

x02 = (x0 − x̄0)/(2i) = [0  0.5  −2]ᵀ,  u2 = u0(e^(z1 t) − e^(z̄1 t))/(2i) = e^(−2t) sin(t)
b) The error states are not controllable, so the system (A + bf, b, c) defines the closed-loop transfer function Gcl(s) = c(sI − A − bf)⁻¹bv:

Gcl(s) = 4v(s + 2.5)/(s² + 6s + 18)

v = −1/(c(A + bf)⁻¹b) = 1.8
we get
Acl = [A + bf  −bf; 0  A + lc],  bcl = [0; −l],  ccl = [c  0]
which, unlike the first position, does not split into controllable and uncontrollable
parts.
The poles are the poles of the whole closed loop system, minus any cancellations
that may happen: The ’possible’ poles are the eigenvalues of A + bf and A + lc
which have already been designed: s = −3 ± 3i, −10 ± 10i.
One way of calculating the zeros is directly from the transfer function
Another way of calculating the zeros is shown in Fig.C.7, which highlights the
interaction between the components of the system.
[Figure C.7: block diagram of the compensator C(s) (integrator with gains l and f) in feedback with the plant G(s), driven by r and producing u and y]
The equations of the compensator C(s) and the plant G(s) are
It therefore follows that the closed loop system zeros are the zeros of G(s) plus the
zeros of C(s), minus any cancellations.
C(s) = 1490.5(s + 2.55)/((s + 86.95)(s − 65.95))

G(s) = 4(s + 2.5)/((s + 2)(s + 3))

In this case there are no cancellations, so the zeros are s = −2.55 and s = −2.5.
Note that although the closed-loop system is stable, the compensator itself is unstable, which is undesirable in practice.
The overshoot in (b) is larger due to the zeros in C(s) (two zeros in the closed-loop system). In part (a), the zeros cancel the poles of the observer.
f)
ẋ = Ax + bu = Ax + bf x̂ + bvr
x̂˙ = (A + bf + lc)x̂ − ly + wr
x̃ = x − x̂
x̃˙ = (A + lc)x̃ + bvr − wr
a) The numerator contains the zeros of the system; as the system is minimal none of
these have been cancelled with poles of the system.
and from part (a) of this problem, by designing the controller and the observer, we
have
f = [−5  4],  l = [−238.5; 74.5]
The transfer function Gyd (s) is clearly just that of the system G(s) and it can be
calculated directly from its state space model. Therefore we calculate
i) Gyd(s) = G(s) = 4(s + 2.5)/((s + 2)(s + 3))
We use equation (4.8)- compare with problem 4.3 part (f)- together with w = bv to
obtain
Aur = A + bf + lc
bur = bv
cur = f
dur = v
to calculate
ii) Gur(s) = 1.8(s² + 20s + 200)/((s + 86.95)(s − 65.95))
x̂˙ = (A + bf + lc)x̂ − ly
u = f x̂
Auy = A + bf + lc
buy = −l
cuy = f
to obtain
iii) Guy(s) = −1490.5(s + 2.55)/((s + 86.95)(s − 65.95))
About the zeros of Gur : In problem 4.3 part (a) we placed the eigenvalues of A+lc to
−10 ± 10j and we know, in this combination of the controller and the observer, the
eigenvalues of A+lc are uncontrollable from the control input. Thus, we expect them
to be canceled out with some zeros in the closed loop transfer function. Considering
the equation

Gcl(s) = G(s)Gur(s)/(1 − G(s)Guy(s))

we see that such zeros cannot be zeros of G(s): the denominator would then be 1 ≠ 0 and no cancellation would occur. Thus they must be zeros of Gur, i.e. zeros of s² + 20s + 200, as indeed they are.
Y(s)/D(s) = G(s)/(1 − G(s)Guy(s))

As t → ∞ (s → 0):

Y(0)/D(0) = G(0)/(1 − G(0)Guy(0)) = −15.93
d) From the simulation in cs4_spfollow.m, limt→∞ (y − r) = 0.032.
e) see cs4_spfollow_int.mdl
f) Before adding an integrator in parts (c) and (d), we had a non-zero steady state
error. We investigate the steady state tracking error after augmentation by forming
the closed-loop transfer functions. The closed-loop transfer function of the aug-
mented system from r to y is
Gcl−aug(s) = Gcl(s)(fI/s)/(1 + Gcl(s)(fI/s)) = fI Gcl(s)/(s + fI Gcl(s))

which goes to 1 as s goes to zero. This simple calculation shows that even in the presence of modeling uncertainty, the steady-state error is always zero after adding an integrator.
We check disturbance rejection in steady state by forming the closed-loop transfer
function of the augmented system from d to y
Gdcl−aug(s) = Gcl(s)/(1 + Gcl(s)(fI/s)) = s Gcl(s)/(s + fI Gcl(s))
Problem 4.6 (Optimal controller and observer design with symmetric root locus)
a) First it should be clear that cz = [2  1]. The transfer function equivalent of the state space representation (A, b, cz) can be calculated by hand or using the Matlab command tf on the system form of the plant:
G(s) = −(s + 2)/((s + 1)(s − 1))

Gss(s) = G(s)G(−s) = (s − 2)(−s − 2)/((s² − 1)(s² − 1))
c) The transfer function for use in getting the symmetric root locus for the observer is
Gn(s) = (0.1s + 1)/(s² − 1)

Gnn(s) = Gn(s)Gn(−s) = (0.1s + 1)(−0.1s + 1)/((s² − 1)(s² − 1))
The root locus for 1 + qGnn is in cs4_6symrl.m
a) In the limit
ac (s)ac (−s) = a(s)a(−s)
for stable poles the feedback gain is small, as little 'effort' is used, and there is little movement of the poles. Unstable poles are moved to their mirror-image positions in the left half-plane.
Interpretation for the controller: stable poles need not be moved if energy is to be minimised; the controller gain K is small. For minimal energy, unstable poles are moved to their mirror-image locations in the LHP.
Interpretation for the Kalman filter: with a lot of noise on the output and relatively little state noise (q → 0), the states are best estimated without using the output at all (l → 0).
b) i)

a(s) = s² + 1,  a(−s) = s² + 1
b(s) = s + 1,  b(−s) = −s + 1

so the total number of roots is 4; as ρ → 0, two roots tend towards the roots of (−s² + 1), so the closed-loop root of the system is at −1 (i.e. the stable root).

ii) For 'large' values of s and correspondingly small values of ρ:

s⁴ + (2s² + 1) + (1/ρ)(−s² + 1) ≈ s⁴ + 2s² − (1/ρ)s² = s⁴ + (2 − 1/ρ)s² ≈ s⁴ − (1/ρ)s²

The two 'small' roots (the zero solutions of this approximate equation) are those found in part (i). The other two roots are the roots of

s² = 1/ρ,  s = ±1/√ρ
ac(s)ac(−s) = a(s)a(−s) + (1/ρ)b(s)b(−s)
b)

a(−s)a(s) ≈ (−1)ⁿ s^(2n),  b(−s)b(s) ≈ (−1)ᵐ bₘ² s^(2m)

so

(−1)ⁿ s^(2n) + (1/ρ)(−1)ᵐ bₘ² s^(2m) ≈ 0

c) From (b),

s^(2(n−m)) = (−1)^(m−1−n) (1/ρ) bₘ²
b) Having, from part (a),

|a(jω)|² + (1/ρ)|b(jω)|² = 0

implies a(jω) = 0 and b(jω) = 0. Thus a(s) and b(s) must both have roots at jω.
a) i & ii) Shown in Figure C.8 is the response to the initial condition [0, 9π/180, 0, 0]ᵀ. By setting Q = diag(1000, 1000, 10, 1) and R = 0.1 the desired performance can be achieved.
iii) On increasing the initial perturbation of the inverted pendulum angle, it is
found that the error between the linear and the nonlinear response increases.
This is due to the fact that the system was linearised about the vertical position
and so moving away from this position reduces the accuracy of our linearised
model.
[Figure C.8: responses of the linear and nonlinear models to the initial perturbation over 0–2 s]
a) Shown in figure C.9 is the tracking of a sine reference input with frequency 0.4 rad/s
and amplitude 0.2 m. It is shown that after fine tuning of the controller a good
experimental performance could be achieved.
[Figure C.9: sine reference (amplitude 0.2 m) together with the nonlinear simulation and the experimental response over 0–30 s]
C.5 Chapter 5
Problem 5.1 (characteristic loci) There are two characteristic loci as the dimension
of the system is 2 × 2.
The larger eigenvalue locus crosses the negative real axis at about -0.058, so the maximum
value of k is approximately 17.
a) Ger (s):
e = r − GKe
Ger (s) = S = (I + GK)−1
Gydo (s):
y = do − GKy
(I + GK)y = do
Gydo = (I + GK)−1
b) Gyr(s):

y = GKr − GKy

y = (I + GK)⁻¹GKr = [G⁻¹(I + GK)]⁻¹Kr = (G⁻¹ + K)⁻¹Kr = [(I + KG)G⁻¹]⁻¹Kr = G(I + KG)⁻¹Kr

Alternatively,

y = (I + GK)⁻¹GKr = [(GK)⁻¹(I + GK)]⁻¹r = [(GK)⁻¹ + I]⁻¹r = [(I + GK)(GK)⁻¹]⁻¹r = GK(I + GK)⁻¹r
c)
ug = di − KGug
SI = (I + KG)−1
SG = (I + GK)⁻¹G = [G⁻¹(I + GK)]⁻¹ = [(I + KG)G⁻¹]⁻¹ = G(I + KG)⁻¹ = G SI
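The push-through identity SG = G·SI can be verified numerically. Frequency by frequency, G(jω) and K(jω) are just complex matrices, so a check on constant matrices (values below are illustrative only) is representative:

```python
import numpy as np

# Verify (I + GK)^{-1} G == G (I + KG)^{-1} for compatible square matrices.
G = np.array([[1.0, 2.0], [0.5, -1.0]])
K = np.array([[2.0, 0.0], [1.0, 3.0]])
I = np.eye(2)

S = np.linalg.inv(I + G @ K)        # output sensitivity
SI = np.linalg.inv(I + K @ G)       # input sensitivity
print(np.allclose(S @ G, G @ SI))   # True
```

The identity follows from G(I + KG) = (I + GK)G; multiplying by the two inverses from the appropriate sides gives the result, which is exactly the manipulation used in the derivation above.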
[Block diagram: integral action (gain FI and integrator 1/s producing xI from the reference error) and state feedback F, with an observer (blocks B, 1/s, C, L) around the plant G(s); the disturbance d enters at the plant input]
b) For this design it is only necessary to consider the poles of the controller, as the response to setpoint changes is not affected by the observer. The 6 pole positions are −1 ± 0.5j, −2 ± 1.0j, −3, −3.
The settling time for r1 is 2.45, for r2 it is 1.75; the maximum of y2 for a change in r1 is 0.11, and the maximum of y1 for a change in r2 is 0.06.
See simulation cs5_tgen_Place.m and Simulink model cs5_tgensfmod.mdl, when
the switch is in “state feedback” position
c) For this part the observer dynamics must be considered because the observer dy-
namics are controllable from d.
4 observer poles that fulfill this are: −10 ± 10j, −15 ± 15j
The integral action enforces the first row to be zero. The rank of the matrix is
clearly one, so one of the singular values is zero. This singular value corresponds to
a direction defined by the null space of this matrix.
The maximum singular value at low frequencies corresponds to a direction where
the steady state error of the second output is at its maximum.
At high frequencies both singular values tend to 1 because at high frequencies the
feedback path has very low gain.
T at low frequencies: T tends to I − S0, and both singular values are close to 1, as there is reasonable setpoint following of both setpoints at low frequencies.
T at high frequencies tends to zero, as the closed-loop plant is strictly proper.
c) At low frequencies the input directions are the columns of the V matrix of the singular value decomposition of S0. They correspond to the directions of the input setpoint vector for which the error in channel 2 is maximised and for which it is zero, respectively.
The output directions are the vectors of the U matrix of the SVD of S0. The direction corresponding to the maximum singular value is [0; 1], i.e. it is completely in the direction of setpoint or output 2. The direction corresponding to the minimum singular value is the direction of setpoint or output 1.
a) Scaling of error: the scale in each channel is the reciprocal of the maximum error, so that the scaled output has magnitude 1:

ēmax = Se emax

(1/√2)[1 0; 0 1] = Se [0.2 0; 0 0.5]

Se = (1/√2)[5 0; 0 2]

y = Gu
ȳ = Se G u = Se G Su ū = Ḡ ū

Scaling of inputs: the scale in each channel is equal to the maximum input in that channel multiplied by √2:

Su ūmax = umax

Su (1/√2)[1 0; 0 1] = [1 0; 0 2]

Su = √2 [1 0; 0 2]

In the scaled model, 1.0 represents the maximum value of the magnitude of the signal vector: in a 2 × 2 system, if any signal has the value 1/√2, the system is at the limit of acceptable performance.
c) Scaling of disturbance:

Sd d̄max = dmax

Sd (1/√2)[1 0; 0 1] = [0.1 0; 0 0.1]

Sd = √2 [0.1 0; 0 0.1]
d) Scaling of setpoint: r̃ is normalised setpoint. First define the scaling between the
normalised setpoint r̃ and the true setpoint r
rmax = Sr r̃max

Sr = √2 [4 0; 0 0.4]
The scaling between r and r̄ is the same as that between e and ē:

r̄ = Se r = Se Sr r̃ = R r̃,  R = [20 0; 0 0.8]

With ē = S r̄, the requirement is

σ̄(SR) < 1

which is guaranteed if

σ̄(S)σ̄(R) < 1

that is

σ̄(S) < σmin(R⁻¹)
a)
a)

G(s) = (1/((s + 1)(s + 2)))[s + 2  2(s + 2); −1  s + 1]

Use elementary operations to bring zeros into positions (1, 2) and (2, 1):

row2 → row2 + (s + 2)·row1

then

column2 → column2 + (s + 1)·column1

G2(s) = [−1/((s + 1)(s + 2))  0; 0  (s + 3)/(s + 1)]
Ak = [−1.1667  0.2041  −0.3727  0  −0.4564;
      0  −1.0000  0  0  0;
      −0.3727  0.4564  −1.8333  0  −1.0206;
      0  −1.1180  0  −2.0000  2.5000;
      0  0  0  0  −1.0000]

Bk = [0.1491  −0.5402;
      −1.0954  0;
      0.3333  1.2416;
      −0.8165  0.4082;
      0  0]

Ck = [−1.8257  −0.9129  0.8165  0  −0.4082;
      0.4472  0.3651  1.0000  0  −0.8165]
The states can be (c, o), (c, no), (nc, o) or (nc, no):

Ak = [A_c,o  0  M1  0;  M2  A_c,no  M3  M4;  0  0  A_nc,o  0;  0  0  M5  A_nc,no]

Bk = [B_c,o;  B_c,no;  0;  0],  Ck = [C_c,o  0  C_nc,o  0]

There are 3 (c, o) states, 1 (c, no) state and 1 (nc, o) state.
d(s) = s2 + 3s + 2
Solutions of d(λ) = 0 are λ1 = −1, λ2 = −2
G(s) = R1/(s − λ1) + R2/(s − λ2)

R1 = [1 1; 1 0],  R2 = [−2 −1; −2 −1]

R1 = C1B1 = [1 0; 0 1][1 1; 1 0],  R2 = C2B2 = [1; 1][−2  −1]

A = [−1 0 0; 0 −1 0; 0 0 −2],  B = [B1; B2] = [1 1; 1 0; −2 −1],  C = [C1  C2] = [1 0 1; 0 1 1]
With r = 2:

C = [B  AB]

B = [B1; B2] = [B1 0; 0 B2][I; I],  A = [λ1 I  0; 0  λ2 I]

AB = [λ1 B1; λ2 B2] = [B1 0; 0 B2][λ1 I; λ2 I]

so

C = [B1 0; 0 B2][I  λ1 I; I  λ2 I] = [B1 0; 0 B2] V

V has full rank (since λ1 ≠ λ2), so the system is controllable if and only if [B1 0; 0 B2] has full row rank. B1 and B2 have full rank, as these are conditions of the Gilbert realisation; because of the block-diagonal structure, [B1 0; 0 B2] then also has full row rank. Hence C has full rank. A similar argument can be used to show observability.
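The Gilbert realisation of the worked example can be checked numerically: build (A, B, C) from the residues R1, R2, verify that C(sI − A)⁻¹B reproduces R1/(s + 1) + R2/(s + 2), and confirm that the controllability matrix has full rank:

```python
import numpy as np

# Gilbert realisation from the example: R1 = C1 B1 (rank 2 at lambda1 = -1),
# R2 = C2 B2 (rank 1 at lambda2 = -2).
B1 = np.array([[1.0, 1.0], [1.0, 0.0]])
C1 = np.eye(2)
B2 = np.array([[-2.0, -1.0]])
C2 = np.array([[1.0], [1.0]])

A = np.diag([-1.0, -1.0, -2.0])
B = np.vstack([B1, B2])
C = np.hstack([C1, C2])

def G(s):
    return C @ np.linalg.inv(s * np.eye(3) - A) @ B

def G_ref(s):
    return (C1 @ B1) / (s + 1) + (C2 @ B2) / (s + 2)

s = 0.7 + 0.3j                        # arbitrary test point
print(np.allclose(G(s), G_ref(s)))    # True

ctrb = np.hstack([B, A @ B, A @ A @ B])
print(np.linalg.matrix_rank(ctrb))    # 3: the realisation is controllable
```

The rank-3 result confirms the block-diagonal argument above: B1 and B2 have full row rank, so the realisation is controllable (a minimal realisation of a 2×2 transfer function with McMillan degree 3).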
a) Let x = [xp; xI]. Then y = Cxp and

k1(y1² + y2²) = k1 [y1  y2][y1; y2] = k1 xpᵀCᵀCxp

k3(u1² + u2²) = k3 [u1  u2][u1; u2] = k3 uᵀu,  so R = k3 I2
on the state errors would be increased, which means, the weight on the integrated
errors and the inputs would be decreased. If the decreasing effect on the weight
on the integrated error is stronger, then the response would be slower. To be sure,
that the response will become faster, the weight on the input should be decreased,
in which case all effects would lead to a faster response.
Only ratios between weights affect the resulting controller. So any one of the three
constants could be fixed to one without loss of generality.
a) From the Smith-McMillan form (or the Matlab commands zero and pole: see
cs5_rhand_zeros.m)
Poles: −2, −1, −2, −1
Zero: 0.75

G(zi)u_zi = 0

G(zi) does not have full rank, so neither does Gᵀ(zi). So there must exist a y_zi such that

Gᵀ(zi)y_zi = 0,  i.e.  y_ziᵀG(zi) = 0

y_zi = k·[−0.8944; 0.4472]
The singular value decomposition can be used to calculate yzi . See appendix A and
the MATLAB file cs5_rhand_zeros.m.
and this can only be true if K(zi ) = ∞, which is exactly when zi is a pole of K(s).
But K(s) cannot have unstable poles which the corresponding zeros of G(s) could
cancel.
e) Let y_zi = [y_zi1  y_zi2]ᵀ:

[y_zi1  y_zi2][T11  T12; T21  T22] = 0

y_zi1 T11(zi) + y_zi2 T21(zi) = 0
y_zi1 T12(zi) + y_zi2 T22(zi) = 0

α1 = −0.08944,  α2 = 0.4472
b) For no interaction:
T12 (s) = 0, T21 (s) = 0
By inspection T1(s) does not fulfil this constraint but T2(s) does.
Valid transfer functions always have an inverse response: the open-loop right-half-plane zeros also appear in the closed loop if the decoupling and steady-state conditions are to be fulfilled. Similar limitations will always apply to any complementary sensitivity function.
a) w1 = S(s)v, w2 = ψ(s)v
b)

ψ(s) = [s 0; 1 0; 0 s²; 0 s; 0 1]

S(s) = [s² 0; 0 s³],  S⁻¹(s) = [1/s² 0; 0 1/s³]

G0(s) = ψ(s)S⁻¹(s) = [1/s 0; 1/s² 0; 0 1/s; 0 1/s²; 0 1/s³]
c) One solution:
y = N (s)v
y = Nl ψ(s)v
N (s) = Nl ψ(s)
N(s) = [s 0; −s s²],  ψ(s) = [s 0; 1 0; 0 s²; 0 s; 0 1]

[s 0; −s s²] = [n11 n12 n13 n14 n15; n21 n22 n23 n24 n25] ψ(s)

with

Nl = [1 0 0 0 0; −1 0 1 0 0]

Nl ψ(s) = [1 0 0 0 0; −1 0 1 0 0][s 0; 1 0; 0 s²; 0 s; 0 1] = [s 0; −s s²]
d) From G0(s):

ẏ1 = u1,  ÿ2 = u1,  ẏ3 = u2,  ÿ4 = u2,  y5''' = u2

so

ẋ1 = u1,  ẋ2 = x1,  ẋ3 = u2,  ẋ4 = x3,  ẋ5 = x4

and

y1 = x1, y2 = x2, y3 = x3, y4 = x4, y5 = x5

A = [0 0 0 0 0; 1 0 0 0 0; 0 0 0 0 0; 0 0 1 0 0; 0 0 0 1 0],  B = [1 0; 0 0; 0 1; 0 0; 0 0]

C = I (5 × 5)
e)
A = A0 − B0 Dh−1 Dl
B = B0 Dh−1
C = Nl
c)
hi = ψ(λi )pi
[y1; y2; y3; y4; y5] = [m11 m12; m21 m22; m31 m32; m41 m42; m51 m52][x1; x2]

so [x1; x2] = Mᴸy, where Mᴸ is the left inverse of the matrix above.
Matlab: p(:,ind)=pinv(psi)*h(:,ind);
[p1ᵀ; p2ᵀ; p3ᵀ; p4ᵀ; p5ᵀ] = [−0.0537 + 0.0985i  0.0102 + 0.0191i;
                             −0.0537 − 0.0985i  0.0102 − 0.0191i;
                             0.0592 − 0.1399i  0.0210 + 0.0280i;
                             0.0592 + 0.1399i  0.0210 − 0.0280i;
                             0.0400  0.2173]
c) Construct the required transfer functions (e.g., with linmod) and use the Matlab
function sigma to obtain the singular value plots.
Note in particular:
The closed loop frequency response from r to y has a gain of 0dB at low frequencies,
corresponding to high open loop gain.
The closed loop frequency response to noise is greater than 1 at low frequencies, but
this is probably not a problem as noise is usually high frequency.
a) Calculation of G(s):
b) with the inputs to G(s) being w1 and w2 (in either closed loop or open loop),
w2 = −y2
y2 = (g21/(1 + g22)) w1

z1 = −y1 = −g11 w1 + g12 y2 = (−g11 + g12 g21/(1 + g22)) w1
c) The transfer functions with one loop closed are just integrators: gain margin ∞, phase margin 90°. We would therefore expect good robustness in the individual channels.
B̃ = [1 + ǫ1  0; 0  1 + ǫ2]

Ãcl = A − B̃C = [0 a; −a 0] − [1 + ǫ1  0; 0  1 + ǫ2][1 a; −a 1] = [−(1 + ǫ1)  −aǫ1; aǫ2  −(1 + ǫ2)]

The characteristic polynomial is

s² + (2 + ǫ1 + ǫ2)s + 1 + ǫ1 + ǫ2 + (a² + 1)ǫ1ǫ2
g) The interpretation of this is that robustness is very good for errors with 'direction' ǫ1 = ǫ2 or with ǫ1 = 0 or ǫ2 = 0, but is not good for the direction ǫ1 = −ǫ2.
The conclusion of this is: simple SISO open loop measures of robustness are not a
good guide to MIMO robustness.
C.6 Chapter 6
a)

G(z) = (2z² − 6z)/(2z² − 6z + 4) = (2 − 6z⁻¹)/(2 − 6z⁻¹ + 4z⁻²)

Long division gives

G(z) = 1·z⁰ + 0·z⁻¹ − 2z⁻² − 6z⁻³ − 14z⁻⁴ − ...
b)

G(z) = 1/(1 − 2z⁻¹)

with input

u(k) = 2e⁻ᵏσ(k),  U(z) = 2z/(z − e⁻¹)

G(z)U(z) = 2z/((1 − 2z⁻¹)(z − e⁻¹)) = 2z²/((z − 2)(z − e⁻¹)) = 2 + (4.736z − 1.472)/((z − 2)(z − 0.368))

= 2 + a1/(z − 2) + a2/(z − 0.368),  a1 = 4.902,  a2 = −0.166

From Z[σ(k)αᵏ] = 1/(1 − αz⁻¹) it follows that Z[σ(k − 1)αᵏ⁻¹] = 1/(z − α), so

y(0) = 2
y(k) = 4.902·2^(k−1) − 0.166·0.368^(k−1),  k ≥ 1
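The inverse transform can be cross-checked by simulating the difference equation y(k) = 2y(k−1) + u(k) implied by G(z) and comparing with the closed-form expression (rounded constants, so a small tolerance is used):

```python
import math

# G(z) = 1/(1 - 2 z^{-1}) is the recursion y(k) = 2 y(k-1) + u(k).
# Simulate with u(k) = 2 e^{-k} and compare with the closed form
# y(0) = 2, y(k) = 4.902 * 2^{k-1} - 0.166 * 0.368^{k-1} for k >= 1.
y = 0.0
sim = []
for k in range(6):
    y = 2.0 * y + 2.0 * math.exp(-k)
    sim.append(y)

closed = [2.0] + [4.902 * 2 ** (k - 1) - 0.166 * 0.368 ** (k - 1)
                  for k in range(1, 6)]
print(all(abs(s - c) < 0.02 * max(1.0, abs(c))
          for s, c in zip(sim, closed)))   # True
```

Both sequences grow like 2ᵏ, as expected from the unstable pole of G(z) at z = 2.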
Proof:

Z{x(k)} = X(z) = Σ_{k=0}^{∞} x(k)z⁻ᵏ

Z{x(k − 1)} = z⁻¹X(z) = Σ_{k=0}^{∞} x(k − 1)z⁻ᵏ

Σ_{k=0}^{∞} x(k)z⁻ᵏ − Σ_{k=0}^{∞} x(k − 1)z⁻ᵏ = X(z) − z⁻¹X(z)

lim_{z→1} [Σ_{k=0}^{∞} x(k)z⁻ᵏ − Σ_{k=0}^{∞} x(k − 1)z⁻ᵏ] = lim_{z→1} [X(z) − z⁻¹X(z)]

With x(k) = 0 for k < 0, and when the system is stable, the left-hand side becomes

lim_{z→1} Σ_{k=0}^{∞} [x(k) − x(k − 1)]z⁻ᵏ = [x(0) − x(−1)] + [x(1) − x(0)] + [x(2) − x(1)] + ... = lim_{k→∞} x(k)

so

lim_{k→∞} x(k) = lim_{z→1} [X(z) − z⁻¹X(z)] = lim_{z→1} (z − 1)X(z)
As ω goes from 0 to 2π/T a circle is described, with radius e⁻⁰·¹. Larger magnitudes of σ correspond to smaller radii.
ii) Lines of constant imaginary part ω of the continuous poles: ω = 0.5π/T, 1.0π/T and 1.5π/T. The line in the z-plane is z = e^(sT). With ω = 0.5π/T:
b) Second-order system with damping ratio ζ and natural frequency ωn; lines for ζ = 0, 0.5, 1 as ωn: 0 → π/T.

s = −ζωn ± jωn√(1 − ζ²)

In general, with k = √(1 − ζ²),

z = e^(σT + jωT) = e^(−ζωnT)(cos(kωnT) + j sin(kωnT))

When ωn: 0 → π/T, the starting radius is 1.0 and the final radius is e^(−πζ).
c) For the second-order system, lines for ωn = π/(2T) and ωn = π/T as ζ: 0 → 1.

With ωn = π/(2T):

z = e^(σT + jωT) = e^(−ζπ/2)(cos(kπ/2) + j sin(kπ/2))

with ζ = 0: angle = 90°, radius = 1.0
with ζ = 1: angle = 0°, radius = e^(−π/2)

With ωn = π/T:

with ζ = 0: angle = 180°, radius = 1.0
with ζ = 1: angle = 0°, radius = e^(−π)
These can easily be drawn using the Matlab command rltool with grid.
(z³ + a1z² + a2z + a3)X1 = U
Y = b1z²X1 + b2zX1 + b3X1

with

X2 = zX1
X3 = zX2
zX3 = z³X1 = −a1X3 − a2X2 − a3X1 + U
Y = b1X3 + b2X2 + b3X1

zX = ΦX + ΓU
Y = CX + DU

Φ = [0 1 0; 0 0 1; −a3 −a2 −a1],  Γ = [0; 0; 1]

C = [b3  b2  b1],  D = 0
[Block diagram: u drives a chain of three delays z⁻¹ with states x3, x2, x1; the output y is formed from the states with weights b1, b2, b3]
ẋ = −ax + bu,
y=x
zx = Φx + Γu, y=x
where

Φ = e^(−aT),  Γ = ∫₀ᵀ e^(−at)b dt = [e^(−at)/(−a)]₀ᵀ b = ((1 − e^(−aT))/a) b

Then

G(z) = C(zI − Φ)⁻¹Γ = Γ/(z − Φ) = (b/a)·(1 − e^(−aT))/(z − e^(−aT))
c) The discrete root locus has a pole at z = e−aT = 0.67
1 + K(z)G(z) = 0
1 + Kpd · 0.16/(z − 0.67) = 0
z − 0.67 + 0.16Kpd = 0
The root locus is on the real axis, so the system becomes unstable when it hits the
unit circle at z = −1: Kpd = 10.44.
d) The continuous proportional controller is always stable: the closed loop pole tends
to s = −∞. As shown in (c), this is not true for the discrete controller.
ẋ = Ax + bu,  y = cx

where

A = [0 1; 0 0],  b = [0; 1],  c = [1  0].

Let the discrete-time state space representation be given as

zx = Φx + Γu,  y = cx

where

Φ = e^(AT),  Γ = ∫₀ᵀ e^(At)b dt

Then we obtain

Φ = [1 T; 0 1],  Γ = [T²/2; T]
c)

C2(z) = Kpd(1 + Td(1 − z⁻¹)) = Kpd((1 + Td)z − Td)/z = Kpd(1 + Td)(z − Td/(1 + Td))/z

α = Td/(1 + Td),  1/(1 − α) = 1 + Td
d) See cs6_discrete_PD.m. To assist the design in rltool, right-click over the root locus, choose Properties, Options and select Show grid. One solution is Kpd = 12.4, alpha = 0.83.
step 1) generate state space representation of complete controller. This can be done
using linmod()
step 2) use c2d() to get discrete time approximation.
– for controller discretised with T = 0.02 the controlled response is almost iden-
tical.
– for controller discretised with T = 0.5 the controlled response is notably dif-
ferent because of large sampling time.
c) See cs6_TG_discrete_K.m.
d) The responses of the closed-loop systems are notably different from those achieved by approximating the continuous-time controllers with the Tustin approximation. This is because the same weighting matrices Q, R, Qe and Re are used in both cases, which is not the correct approach if one wants discrete-time controllers that achieve the same performance as the continuous-time controllers. It can be shown that to achieve the same performance, one needs to modify the weighting matrices; this even requires a modification of the cost function J by including cross terms of the form x(kT)Su(kT) in its discrete counterpart.
e) The response of continuous time model with time delay of 0.25 s resembles that of
the discretised controller with sampling time 0.5 s. This is particularly clear in the
first several samples after the step disturbance is applied. This shows that the effect
of Tustin approximation with sampling time T is similar to the effect of adding a
time delay T /2 to the original system.
a) The impulse response can be easily computed by the so-called Markov parameters:
g(0) = D = 0
g(1) = CΓ = 1
g(2) = CΦΓ = 4.2
g(3) = CΦ2 Γ = 0.84
b) The response to a particular input can be computed using the discrete convolution formula y(k) = Σ_{l=0}^{k} g(l)u(k − l):
y(0) = g(0)u(0) = 0
y(1) = g(0)u(1) + g(1)u(0) = 0 + 5 = 5
y(2) = g(0)u(2) + g(1)u(1) + g(2)u(0) = 0 + 0 + 21 = 21
y(3) = g(0)u(3) + g(1)u(2) + g(2)u(1) + g(3)u(0) = 0 − 1 + 0 + 4.2 = 3.2
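The hand computation above can be reproduced in a few lines. The Markov parameters g are those of part (a); the input samples are inferred from the sums written out above (u = [5, 0, −1, 0]):

```python
# Discrete convolution y(k) = sum_{l=0}^{k} g(l) u(k-l).
g = [0.0, 1.0, 4.2, 0.84]    # Markov parameters from part (a)
u = [5.0, 0.0, -1.0, 0.0]    # input inferred from the sums in part (b)

y = [sum(g[l] * u[k - l] for l in range(k + 1)) for k in range(4)]
print(y)   # [0.0, 5.0, 21.0, 3.2]
```

Note g(0) = D = 0, so the output cannot respond instantaneously to the input: y(k) never depends on u(k) itself.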
a) i)

G(s) = (s + 1)/(s² + 4s + 1)

n = 2, m = 1, n − m − 1 = 0

For this system there is one continuous zero, s = −1. The discretisation results in only one zero, and it approaches e⁻ᵀ.
ii)

G(s) = 1/(s³ + s² + s)

n = 3, m = 0, n − m − 1 = 2

The discretisation of this system results in two zeros: they approach the zeros of the exact discretisation of 1/s³, so we need the exact discretisation of 1/s³. A state space representation of 1/s³ (using that the third derivative of the output equals u) is

ẋ1 = x2,  ẋ2 = x3,  ẋ3 = u,  y = x1

A = [0 1 0; 0 0 1; 0 0 0],  B = [0; 0; 1],  C = [1  0  0]
Discretisation:

A² = [0 0 1; 0 0 0; 0 0 0],  A³ = 0 (3 × 3)

Φ = e^(AT) = I + AT + A²T²/2 + ... = I + AT + A²T²/2 = [1 T T²/2; 0 1 T; 0 0 1]

Ψ = I + AT/2 + A²T²/3! = [1 T/2 T²/6; 0 1 T/2; 0 0 1]

Γ = ΨTB = [T³/6; T²/2; T]
Let z be a zero of the discretisation; then

[zI − Φ  −Γ; C  D][x; u] = 0

Since D = 0, we have Cx = 0:

[1 0 0][x1; x2; x3] = 0  ⇒  x1 = 0,  x = [0; x2; x3]

Solving the remaining equations shows that the zeros are the roots of z² + 4z + 1 = 0. Therefore the zeros of the discretisation of G(s) approach the values −3.73 and −0.268.
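The limiting zeros can be confirmed numerically: with Φ and Γ as derived above, the discrete transfer function of 1/s³ is T³(z² + 4z + 1)/(6(z − 1)³), so it should vanish exactly at the roots of z² + 4z + 1 = 0 (T below is illustrative):

```python
import numpy as np

# Zeros of the exact ZOH discretisation of 1/s^3, using
# Phi = [[1,T,T^2/2],[0,1,T],[0,0,1]], Gamma = [T^3/6, T^2/2, T]^T, C = [1,0,0].
T = 0.01
Phi = np.array([[1.0, T, T ** 2 / 2],
                [0.0, 1.0, T],
                [0.0, 0.0, 1.0]])
Gamma = np.array([[T ** 3 / 6], [T ** 2 / 2], [T]])
C = np.array([[1.0, 0.0, 0.0]])

def Gd(z):
    return (C @ np.linalg.inv(z * np.eye(3) - Phi) @ Gamma)[0, 0]

zeros = np.roots([1.0, 4.0, 1.0])    # roots of z^2 + 4z + 1 = -2 +/- sqrt(3)
for z in zeros:
    assert abs(Gd(z)) < 1e-9         # the state space model vanishes there

print(np.sort(zeros))                # approx [-3.732, -0.268]
```

The two zeros −2 ± √3 are the classical limiting 'sampling zeros' of a relative-degree-3 system; one of them lies outside the unit circle, which is why the discretisation is non-minimum phase for small T.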
b) The requirement for a minimum phase system is |zi| < 1. If T is increased, the zeros move closer to the unit disk; when T ≥ 3.3 the system becomes minimum phase. See cs6_DTzeros.m.
a) The time constant τ = 5 sec, the bandwidth is ωb = 0.2 rad/s and the gain is 1.
c) To avoid aliasing, the sampling frequency should be ωs > 2ωb. If we choose ωs = 10ωb = 2 rad/s, then the sampling time is Ts = 2π/ωs = 3.14 sec. Therefore sampling times smaller than 3.14 sec are suitable.
The pulses

σ(t) − σ(t − T),  σ(t − T) − σ(t − 2T),  σ(t − 2T) − σ(t − 3T)

are shown in Figure C.12; y(t) is shown in Figure C.13.
Mathematically, y(t) is

y(t) = u(0)[σ(t) − σ(t − T)] + u(T)[σ(t − T) − σ(t − 2T)] + u(2T)[σ(t − 2T) − σ(t − 3T)] + ...

= Σ_{k=0}^{∞} u(kT)[σ(t − kT) − σ(t − (k + 1)T)]

L{σ(t − kT)} = e^(−kTs)/s

L{σ(t − (k + 1)T)} = e^(−(k+1)Ts)/s

L{σ(t − kT) − σ(t − (k + 1)T)} = (e^(−kTs) − e^(−(k+1)Ts))/s
[Figure: u(t) together with the zero order hold output y(t), a staircase through the samples u(0), u(T), u(2T), ...]
So it follows that

Y(s) = ((1 − e^(−Ts))/s) U*(s)

That is, the transfer function of the zero order hold is

Gzoh = Y(s)/U*(s) = (1 − e^(−Ts))/s

Gzoh(jω) = (1 − e^(−Tjω))/(jω) = T ((e^(Tjω/2) − e^(−Tjω/2))/(2j)) e^(−0.5Tjω)/(ωT/2) = T (sin(ωT/2)/(ωT/2)) e^(−0.5Tjω)
Bode diagram:

Gzoh(jω) = E(jω)F(jω),  E(jω) = T sin(ωT/2)/(ωT/2),  F(jω) = e^(−0.5Tjω)

|E(0)| = T,  E(2nπ/T) = 0,  |F(jω)| = 1 ∀ω

∠E(jω) = 0 for 0 < ω < 2π/T, 4π/T < ω < 6π/T, ...
∠E(jω) = 180° (= −180°) for 2π/T < ω < 4π/T, 6π/T < ω < 8π/T, ...
∠F(jω) = −0.5Tω

At every ω = 2nπ/T the sign of E(jω) changes, so Gzoh(jω) experiences a further phase shift of ±180° at these values.
Matlab: cs6_S_and_H.m
After the zero order hold zoh and G(s) (which act as filters) only the first elements
are large.
With the input ω = 60rad/s
a) Setting the low pass filter's cutoff frequency to a larger value will result in noisy measurements, which in turn will introduce noise into the states. Depending on the controller, the system might also become unstable, as the controller is a gain that amplifies the noise.
b) An observer is a possible solution: it outputs filtered states by combining the state estimates from the model with the measured states from the sensors.
i) When using the lqrd command no re-tuning is needed, since the command takes the continuous-time A, B, Q and R matrices, which are already tuned (cf. Problem 4.10) in the continuous-time design, and generates the discretised controller.
Thus the discrete-time response to initial conditions or position tracking will be identical to the continuous-time one. The input arguments of the dlqr command, on the other hand, are the discretised A, B, Q and R, where tuning is required to achieve the desired performance.
ii) lqrd discretises the A, B, Q and R matrices in the controller case; in the observer case, however, the input arguments are Aᵀ and Cᵀ. Thus lqrd would discretise the Cᵀ matrix, which has no meaning, so it can be concluded that duality fails in this specific problem. As a result, observer gains cannot be designed using lqrd.
Problem 6.18 (Mini Segway: Simulation discrete time observed-based state feedback
controller)
[Figure C.14: position reference tracking with and without the observer]
As shown in figure C.14 the observer filters the position and angular states such that the
noise content in the position tracking is reduced.
Problem 6.19 (Mini Segway: Experiment discrete time observed-based state feedback
controller)
a) Figure C.15 shows the tracking of a sine reference input with frequency 0.4 rad/s and amplitude 0.2 m. After fine tuning of the controller and the observer, a good experimental performance could be achieved. From the red and black curves it can be noticed that the observer provides better performance at the peaks of the sine wave, where the change in direction of the Minseg robot is much smoother.
[Figure C.15: sine reference tracking with observer (Obsv) and without observer (NoObsv)]
C.7 Chapter 7
V(p) = (Y − Mp)ᵀ(Y − Mp) = YᵀY − pᵀMᵀY − YᵀMp + pᵀMᵀMp

So

dV(p)/dp = −MᵀY − MᵀY + 2MᵀMp = 0

0 = −MᵀY + MᵀMp

p = (MᵀM)⁻¹MᵀY
Another way of finding the p which minimizes V (p) is by using the approach of completion
of squares. Let
V(p) = (Y − Mp)ᵀ(Y − Mp) = YᵀY − pᵀMᵀY − YᵀMp + pᵀMᵀMp

or

V(p) − YᵀY = pᵀMᵀMp − pᵀMᵀY − YᵀMp

V(p) − YᵀY + YᵀM(MᵀM)⁻¹MᵀY = pᵀMᵀMp − pᵀMᵀY − YᵀMp + YᵀM(MᵀM)⁻¹MᵀY

= (p − (MᵀM)⁻¹MᵀY)ᵀMᵀM(p − (MᵀM)⁻¹MᵀY)
Thus

V(p) = (p − (MᵀM)⁻¹MᵀY)ᵀMᵀM(p − (MᵀM)⁻¹MᵀY) + YᵀY − YᵀM(MᵀM)⁻¹MᵀY

which shows that V(p) is minimised when the first term on the right-hand side is zero. Then
p − (M T M )−1 M T Y = 0
or
p = (M T M )−1 M T Y
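The closed-form minimiser can be checked numerically. A minimal Python/NumPy sketch (the data matrix M and parameter vector p_true are made up for illustration), comparing the normal-equations estimate with a standard least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: Y = M p_true + noise (M and p_true are made up here).
M = rng.standard_normal((100, 3))
p_true = np.array([1.0, -2.0, 0.5])
Y = M @ p_true + 0.01 * rng.standard_normal(100)

# Normal-equations solution p = (M^T M)^{-1} M^T Y from the derivation above.
p_ne = np.linalg.solve(M.T @ M, M.T @ Y)

# Numerically preferable in practice: a QR/SVD-based solver, which
# returns the same minimiser of V(p).
p_ls, *_ = np.linalg.lstsq(M, Y, rcond=None)

print(np.allclose(p_ne, p_ls))
```

Forming M^T M explicitly squares the condition number, which is why library solvers work on M directly; for well-conditioned problems both routes agree.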
a)
(z − 1)u(kT) = u(kT + T) − u(kT)
For the unit step function, u(kT + T) − u(kT) is nonzero (equal to 1) only at k = −1.
b) The PE order of u is less than n + 1 if a polynomial a(z) of degree n can be found
such that

lim_{k→∞} (1/k) Σ_{l=0}^{k} (a(z)u(l))^2 = 0
c) With the polynomial a(z) = z − 1 (order n = 1) and u(l) a step function, it follows
from a) that

a(z)u(l) = 0,  l = 0, 1, . . . , ∞

so

lim_{k→∞} (1/k) Σ_{l=0}^{k} (a(z)u(l))^2 = 0
d) Next, the exact order is found by analyzing the autocorrelation matrix Cuu(1) = cuu(0).
Since this is a nonzero scalar, it has rank 1, so the PE order is 1.
a)
Since

sin A + sin B = 2 sin((A + B)/2) cos((A − B)/2)

the polynomial a(z) = z^2 − 2 cos(ωT) z + 1 (of degree 2) satisfies

a(z)u(l) = 0,  ∀l = 0, 1, . . . , ∞

Then

lim_{k→∞} (1/k) Σ_{l=0}^{k} (a(z)u(l))^2 = 0
At times 0 and T:

Ru(0) = lim_{N→∞} (1/N) Σ_{k=0}^{N} u(kT) u(kT)

Ru(T) = lim_{N→∞} (1/N) Σ_{k=0}^{N} u(kT) u((k + 1)T)
d) At T = 2π/ω:

Cuu(2) = [ (1/2) cos 0    (1/2) cos ωT ]   [ 0.5  0.5 ]
         [ (1/2) cos ωT   (1/2) cos 0  ] = [ 0.5  0.5 ]

rank Cuu(2) = 1
So the PE order is 1. All the samples are at the same position in the sine wave, so the
signal looks like a step.
At ω ≠ 2π/T, the samples are at different positions in the sine wave, so the signal
carries more information: the PE order is 2.
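This rank test is easy to reproduce numerically. A minimal Python/NumPy sketch (a stand-in for the Matlab scripts used elsewhere in these solutions; the choices ω = 1, the alternative period T = 1 and the phase offset are illustrative):

```python
import numpy as np

def cuu(u, n):
    """Empirical n x n autocorrelation matrix Cuu(n) of a sequence u."""
    N = len(u) - n
    U = np.column_stack([u[i:i + N] for i in range(n)])
    return (U.T @ U) / N

omega = 1.0
phi = np.pi / 4     # phase offset, so the slowly sampled signal is nonzero
k = np.arange(20000)

# Sampling with T = 2*pi/omega: every sample hits the same point of the
# sine wave, so the sequence is constant (step-like), Cuu(2) has rank 1.
u_slow = np.sin(omega * k * (2 * np.pi / omega) + phi)
# Sampling with some other period, e.g. T = 1: the samples sweep through
# the sine wave and Cuu(2) has rank 2.
u_fast = np.sin(omega * k * 1.0 + phi)

# A modest tolerance separates the numerically-zero singular value.
print(np.linalg.matrix_rank(cuu(u_slow, 2), tol=1e-6))  # 1
print(np.linalg.matrix_rank(cuu(u_fast, 2), tol=1e-6))  # 2
```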
or

[ y_0     ]   [ −y_{−1}    −y_{−2}    u_{−1}    u_{−2}   ] [ a_1 ]   [ e_2     ]
[ y_1     ]   [ −y_0       −y_{−1}    u_0       u_{−1}   ] [ a_2 ]   [ e_3     ]
[  ...    ] = [  ...        ...        ...       ...     ] [ b_1 ] + [  ...    ]
[ y_{N−1} ]   [ −y_{N−2}   −y_{N−3}   u_{N−2}   u_{N−3}  ] [ b_2 ]   [ e_{N−1} ]
However, the values of the input sequence u(−1), u(−2), . . . , u(−n) and of the output
sequence y(−1), y(−2), . . . , y(−n) are not available in the measurement data. The first
n rows of M would therefore be zero; to make M^T M full rank, these rows are
eliminated. This results in
    [ −y_1       −y_0       u_1       u_0      ]
M = [ −y_2       −y_1       u_2       u_1      ]
    [  ...        ...        ...       ...     ]
    [ −y_{N−2}   −y_{N−3}   u_{N−2}   u_{N−3}  ]
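The construction of M and the resulting estimate can be sketched numerically. The following Python/NumPy fragment (a stand-in for the Matlab scripts referenced in these solutions; the coefficient values are illustrative) simulates a noise-free 2nd order ARX model and recovers its parameters by least squares:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative, stable 2nd-order ARX model:
# y(k) = -a1*y(k-1) - a2*y(k-2) + b1*u(k-1) + b2*u(k-2)
a1, a2, b1, b2 = -1.5, 0.7, 1.0, 0.5
N = 500
u = rng.standard_normal(N)   # white noise: persistently exciting input
y = np.zeros(N)
for k in range(2, N):
    y[k] = -a1 * y[k - 1] - a2 * y[k - 2] + b1 * u[k - 1] + b2 * u[k - 2]

# Regressor matrix M as above: rows start at k = 2 so that no unavailable
# samples y(-1), y(-2), u(-1), u(-2) are needed.
M = np.column_stack([-y[1:N - 1], -y[0:N - 2], u[1:N - 1], u[0:N - 2]])
Y = y[2:N]

# Least-squares estimate p = (M^T M)^{-1} M^T Y
p = np.linalg.solve(M.T @ M, M.T @ Y)
print(p)  # recovers [a1, a2, b1, b2] in the noise-free case
```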
b) See cs7_LSrankM.m.
For the sinusoid: rank = 4 (the singular values confirm this).
For the white noise: M^T M has full rank for any model order, since white noise is
persistently exciting of arbitrary order.
As the sequence becomes longer, the matrix M^T M approaches a scaled version of
the empirical covariance matrix; thus for a long sequence the rank of M^T M can
be expected to correspond to the PE order.
c) See Matlab solution in cs7_LSparest.m. The 3rd and 4th order models generated are
identical: a pole and a zero cancel in the 4th order model.
d) Exact validation is achieved with these models: the model order is clearly 3.
e) Inconsistent results are obtained when attempting to generate models from a sinusoidal
or step input. A true inverse is only possible when rank M^T M = 2n: with a PE order
of 2 it is only possible to accurately estimate a system of order 1 (which has 2
parameters).
b) The cut-off is not so clear for the noisy signal. Since the singular values of
Hn after the 4th are relatively small, one can choose a 4th or 5th order model.
Because the difference between the 3rd and 4th singular values is also large, one
can also try identifying a 3rd order model of the system.
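The order-selection idea can be illustrated numerically: for a noise-free impulse response the singular values of the Hankel matrix drop sharply at the true order. A Python/SciPy sketch (the 2nd order impulse response below is an illustrative example, not the exercise data):

```python
import numpy as np
from scipy.linalg import hankel

# Impulse response of an illustrative 2nd-order system:
# h(k) = 2*(0.9)^k - (0.5)^k, k = 1, 2, ...
k = np.arange(1, 41)
h = 2 * 0.9 ** k - 0.5 ** k

# Hankel matrix of the Markov parameters; for noise-free data its rank
# equals the system order.
Hn = hankel(h[:20], h[19:39])
s = np.linalg.svd(Hn, compute_uv=False)
print(s[:4])  # sharp drop after the 2nd singular value -> choose order 2
```

With measurement noise added to h, the trailing singular values are lifted off zero and the cut-off becomes a judgment call, which is exactly the situation described above.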
2. Import the data: Import data → Data object → Object: iodata1. Repeat for the
second data set.
4. Drag and drop the first signal set to Working data. Remove the means from all
signals using the Preprocess drop-down list → Remove means. Repeat for the second
signal set. One of the new sets of signals should be used as Working data and the
other as Validation data.
5. Estimate models of 2nd, 3rd, 4th and 5th order using N4SID (subspace identification).
For this purpose, choose Linear parametric models from the Estimate drop-down box.
Select State-space as Structure and repeat the identification for the different
orders.
6. Validate the identified models using the second data set. Use the Model Views
check-boxes in the lower-right corner of the GUI.
The model used to create the data was 4th order with noise. The identified models of
orders 2-4 all reproduce the original data very accurately.
C.8 Chapter 8
b) The graphs are generated in the script cs8_HVDC_reduce.m. You can compare your
results to the results obtained using the Matlab function balred.
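The quantities computed by balred can be sketched from first principles: solve the two Lyapunov equations for the Gramians and use the Hankel singular values to choose the reduced order. A Python/SciPy sketch (the 3rd order model is illustrative, not the HVDC model from the exercise):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Illustrative stable 3rd-order state-space model (not the HVDC model):
# one fast, weakly coupled mode at -50 that should be truncatable.
A = np.array([[-1.0,  0.0,   0.0],
              [ 0.0, -2.0,   0.0],
              [ 0.0,  0.0, -50.0]])
B = np.array([[1.0], [1.0], [0.1]])
C = np.array([[1.0, 1.0, 0.1]])

# Controllability and observability Gramians:
# A Wc + Wc A^T + B B^T = 0  and  A^T Wo + Wo A + C^T C = 0
Wc = solve_continuous_lyapunov(A, -B @ B.T)
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)

# Hankel singular values: square roots of the eigenvalues of Wc*Wo.
# Small values mark states that balanced truncation can discard.
hsv = np.sqrt(np.sort(np.linalg.eigvals(Wc @ Wo).real)[::-1])
print(hsv)  # third value is orders of magnitude below the first
```

balred performs the balancing transformation and truncation on top of exactly these Gramian computations.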