
Closed-loop Identification Revisited –

Updated Version

Urban Forssell and Lennart Ljung


Department of Electrical Engineering
Linköping University, S-581 83 Linköping, Sweden
WWW: http://www.control.isy.liu.se
Email: [email protected], [email protected]
1 April 1998


Report no.: LiTH-ISY-R-2021


Submitted to Automatica
Technical reports from the Automatic Control group in Linköping are available
by anonymous ftp at the address ftp.control.isy.liu.se. This report is
contained in the compressed postscript file 2021.ps.Z.
Closed-loop Identification Revisited –

Updated Version *

Urban Forssell and Lennart Ljung


Division of Automatic Control, Department of Electrical Engineering, Linköping
University, S-581 83 Linköping, Sweden. URL: http://www.control.isy.liu.se/.

Abstract
Identification of systems operating in closed loop has long been of prime interest
in industrial applications. The problem offers many possibilities, and also some
fallacies, and a wide variety of approaches have been suggested, many quite recently.
The purpose of the current contribution is to place most of these approaches in a
coherent framework, thereby showing their connections and displaying similarities and
differences in the asymptotic properties of the resulting estimates. The common
framework is created by the basic prediction error method, and it is shown that most
of the common methods correspond to different parameterizations of the dynamics
and noise models. The so called indirect methods, e.g., are indeed "direct" methods
employing noise models that contain the regulator. The asymptotic properties of
the estimates then follow from the general theory and take different forms as they
are translated to the particular parameterizations. In the course of the analysis we
also suggest a projection approach to closed-loop identification with the advantage
of allowing approximation of the open loop dynamics in a given, and user-chosen,
frequency domain norm, even in the case of an unknown, non-linear regulator.

Key words: System identification; Closed-loop identification; Prediction error methods

* This paper was not presented at any IFAC meeting. Corresponding author U.
Forssell. Tel. +46-13-282226. Fax +46-13-282622. E-mail [email protected].

Preprint submitted to Elsevier Preprint 1 April 1998


[Figure 1: block diagram of the closed-loop system. The set point and the fed-back output enter the controller; an extra input is added to the controller output to form the plant input; noise is added at the plant output.]

Fig. 1. A closed-loop system


1 Introduction

1.1 Motivation and Previous Work

System identification is a well established field with a number of approaches that
can broadly be classified into the prediction error family, e.g., [22], the
subspace approaches, e.g., [31], and the non-parametric correlation and spectral
analysis methods, e.g., [5]. Of special interest is the situation when the data
to be used has been collected under closed-loop operation, as in Fig. 1.
The fundamental problem with closed-loop data is the correlation between
the unmeasurable noise and the input. It is clear that whenever the feedback
controller is not identically zero, the input and the noise will be correlated.
This is the reason why several methods that work in open loop fail when
applied to closed-loop data. This is for example true for the subspace approach
and the non-parametric methods, unless special measures are taken. Despite
these problems, performing identification experiments under output feedback
(i.e. in closed loop) may be necessary due to safety or economic reasons, or if
the system contains inherent feedback mechanisms. Closed-loop experiments
may also be advantageous in certain situations:
In [13] the problem of optimal experiment design is studied. It is shown
that if the model is to be used for minimum variance control design, the
identification experiment should be performed in closed loop with the optimal
minimum variance controller in the loop. In general it can be seen that
optimal experiment design with variance constraints on the output leads to
closed-loop solutions.
In "identification for control" the objective is to achieve a model that is
suited for robust control design (see, e.g., [7, 19, 33]). Thus one has to tailor
the experiment and preprocessing of data so that the model is reliable in
regions where the design process does not tolerate significant uncertainties.
The use of closed-loop experiments has been a prominent feature in these
approaches.

Historically, there has been a substantial interest in both special identification
techniques for closed-loop data, and in analysis of existing methods when
applied to such data. One of the earliest results was given by Akaike [1] who
analyzed the effect of feedback loops in the system on correlation and spectral
analysis. In the seventies there was a very active interest in questions concerning
closed-loop identification, as summarized in the survey paper [15]. See also
[3]. Up to this point much of the attention had been directed towards
identifiability and accuracy problems. With the increasing interest in "identification for
control", the focus has shifted to the ability to shape the bias distribution so
that control-relevant model approximations of the system are obtained. The
surveys [12] and [29] cover most of the results along this line of research.
1.2 Scope and Outline

It is the purpose of the present paper to "revisit" the area of closed-loop
identification, to put some of the new results and methods into perspective,
and to give a status report of what can be done and what cannot. In the course
of this exposé, some new results will also be generated.
We will exclusively deal with methods derived in the prediction error framework
and most of the results will be given for the multi-input multi-output
(MIMO) case. The leading idea in the paper will be to provide a unified
framework for many closed-loop methods by treating them as different
parameterizations of the prediction error method:
There is only one method. The different approaches are
obtained by different parameterizations of the dynamics
and noise models.
Despite this we will often use the terminology "method" to distinguish between
the different approaches and parameterizations. This has also been standard
in the literature.
The organization of the paper is as follows. Next, in Section 2 we characterize
the kinds of assumptions that can be made about the nature of the feedback.
This leads to a classification of closed-loop identification methods into, so
called, direct, indirect, and joint input-output methods. As we will show, these
approaches can be viewed as variants of the prediction error method with the
models parameterized in different ways. A consequence of this is that we may
use all results for the statistical properties of the prediction error estimates
known from the literature. In Section 3 the assumptions we will make regarding
the data generating mechanism are formalized. This section also introduces
some of the notation that will be used in the paper.

Section 4 contains a brief review of the standard prediction error method as
well as the basic statements on the asymptotic statistical properties of this
method:
Convergence and bias distribution of the limit transfer function estimate.
Asymptotic variance of the transfer function estimates (as the model orders
increase).
Asymptotic variance and distribution of the parameter estimates.
The application of these basic results to the direct, indirect, and joint input-
output approaches will be presented in some detail in Sections 5–8. All proofs
will be given in the Appendix. The paper ends with a summarizing discussion
in Section 9.

2 Approaches to Closed-loop Identification

2.1 A Classification of Approaches

In the literature several different types of closed-loop identification methods
have been suggested. In general one may distinguish between methods that
(a) Assume no knowledge about the nature of the feedback mechanism, and
do not use the reference signal r(t) even if known.
(b) Assume the feedback to be known and typically of the form

u(t) = r(t) - K(q)y(t)   (1)

where u(t) is the input, y(t) the output, r(t) an external reference signal,
and K(q) a linear time-invariant regulator. The symbol q denotes the
usual shift operator, q^{-1}y(t) = y(t-1), etc.
(c) Assume the regulator to be unknown, but of a certain structure (like (1)).
If the regulator indeed has the form (1), there is no major difference between
(a), (b) and (c): The noise-free relation (1) can be exactly determined based on
a fairly short data record, and then r(t) carries no further information about
the system, if u(t) is measured. The problem in industrial practice is rather
that no regulator has this simple, linear form: Various delimiters, anti-windup
functions and other non-linearities will make the input deviate from (1), even
if the regulator parameters (e.g. PID coefficients) are known. This strongly
disfavors the second approach.
In this paper we will use a classification of the different methods that is similar
to the one in [15]. See also [26]. The basis for the classification is the different

kinds of possible assumptions on the feedback listed above. The closed-loop
identification methods correspondingly fall into the following main groups:
(1) The Direct Approach: Ignore the feedback and identify the open-loop
system using measurements of the input u(t) and the output y(t).
(2) The Indirect Approach: Identify some closed-loop transfer function and
determine the open-loop parameters using the knowledge of the controller.
(3) The Joint Input-Output Approach: Regard the input u(t) and the output
y(t) jointly as the output from a system driven by the reference signal
r(t) and noise. Use some method to determine the open-loop parameters
from an estimate of this system.
These categories are basically the same as those in [15]; the only difference
is that in the joint input-output approach we allow the joint system to have
a measurable input r(t) in addition to the unmeasurable noise e(t). For the
indirect approach it can be noted that most methods studied in the literature
assume a linear regulator, but the same ideas can also be applied if non-linear
and/or time-varying controllers are used. The price is, of course, that the
estimation problems then become much more involved.
In the closed-loop identification literature it has been common to classify the
methods primarily based on how the final estimates are computed (e.g.
directly or indirectly using multi-step estimation schemes), and then the main
groupings have been into "direct" and "indirect" methods. This should not,
however, be confused with the classification (1)-(3) which is based on the
assumptions made on the feedback.

3 Technical Assumptions and Notation

The basis of all identification is the data set

Z^N = {u(1), y(1), ..., u(N), y(N)}   (2)

consisting of measured input-output signals u(t) and y(t), t = 1, ..., N. We
will make the following assumptions regarding how this data set was generated.

Assumption 1 The true system S is linear with p outputs and m inputs and
given by

y(t) = G0(q)u(t) + v(t),  v(t) = H0(q)e(t)   (3)

where {e(t)} (p × 1) is a zero-mean white noise process with covariance matrix
Λ0, and bounded moments of order 4 + δ, some δ > 0, and H0(q) is an inversely
stable, monic filter.

For some of the analytic treatment we shall assume that the input {u(t)} is
generated as

u(t) = r(t) - K(q)y(t)   (4)

where K(q) is a linear regulator of appropriate dimensions and where the
reference signal {r(t)} is independent of {v(t)}.
This assumption of a linear feedback law is rather restrictive and in general we
shall only assume that the input u(t) satisfies the following milder condition
(cf. [20], condition S3):

Assumption 2 The input u(t) is given by

u(t) = k(t, y^t, u^{t-1}, r(t))   (5)

where y^t = [y(1), ..., y(t)], etc., and where the reference signal {r(t)} is
a given quasi-stationary signal, independent of {v(t)}, and k is a given
deterministic function such that the closed-loop system (3) and (5) is exponentially
stable, which we define as follows: For each t, s, t ≥ s, there exist random
variables y_s(t), u_s(t), independent of r^s and v^s but not independent of r^t and
v^t, such that

E ‖y(t) - y_s(t)‖^4 < C λ^{t-s}   (6)
E ‖u(t) - u_s(t)‖^4 < C λ^{t-s}   (7)

for some C < ∞, λ < 1. In addition, k is such that either G0(q) or k contains
a delay.

Here we have used the notation

Ē f(t) = lim_{N→∞} (1/N) Σ_{t=1}^{N} E f(t)   (8)

The concept of quasi-stationarity is defined in, e.g., [22].

If the feedback is indeed linear and given by (4) then Assumption 2 means
that the closed-loop system is asymptotically stable.
Let us now introduce some further notation for the linear feedback case. By
combining the equations (3) and (4) we have that the closed-loop system is

y(t) = S0(q)G0(q)r(t) + S0(q)v(t)   (9)

where S0(q) is the sensitivity function,

S0(q) = (I + G0(q)K(q))^{-1}   (10)

This is also called the output sensitivity function. With

Gc0(q) = S0(q)G0(q)  and  Hc0(q) = S0(q)H0(q)   (11)

we can rewrite (9) as

y(t) = Gc0(q)r(t) + vc(t),  vc(t) = Hc0(q)e(t)   (12)

In closed loop the input can be written as

u(t) = S0^i(q)r(t) - S0^i(q)K(q)v(t)   (13)
     = S0^i(q)r(t) - K(q)S0(q)v(t)   (14)

The input sensitivity function S0^i(q) is defined as

S0^i(q) = (I + K(q)G0(q))^{-1}   (15)

The spectrum of the input is (cf. (14))

Φ_u = S0^i Φ_r (S0^i)* + K S0 Φ_v S0* K*   (16)

where Φ_r is the spectrum of the reference signal and Φ_v = H0 Λ0 H0* the noise
spectrum. Superscript * denotes complex conjugate transpose. Here we have
suppressed the arguments ω and e^{iω}, which also will be done in the sequel
whenever there is no risk of confusion. Similarly, we will also frequently
suppress the arguments t and q for notational convenience. We shall denote the
two terms in (16)

Φ_u^r = S0^i Φ_r (S0^i)*   (17)

and

Φ_u^e = K S0 Φ_v S0* K* = S0^i K Φ_v K* (S0^i)*   (18)

The cross spectrum between u and e is

Φ_ue = -K S0 H0 Λ0 = -S0^i K H0 Λ0   (19)

The cross spectrum between e and u will be denoted Φ_eu, Φ_eu = Φ_ue*.
Occasionally we shall also consider the case where the regulator is linear as in
(4) but contains an unknown additive disturbance d:

u(t) = r(t) - K(q)y(t) + d(t)   (20)

The disturbance d could for instance be due to imperfect knowledge of the
true regulator: Suppose that the true regulator is given by

K_true(q) = K(q) + ΔK(q)   (21)

for some (unknown) function ΔK. In this case the signal d = -ΔK y. Let Φ_rd
(Φ_dr) denote the cross spectrum between r and d (d and r), whenever it exists.

4 Prediction Error Identification

In this section we shall review some basic results on prediction error methods
that will be used in the sequel. See Appendix A and [22] for more details.

4.1 The Method

We will work with a model structure M of the form

y(t) = G(q, θ)u(t) + H(q, θ)e(t)   (22)

G will be called the dynamics model and H the noise model. We will assume
that either G (and the true system G0) or the regulator k contains a delay
and that H is monic. The parameter vector θ ranges over a set D_M which is
assumed compact and connected. The one-step-ahead predictor for the model
structure (22) is [22]

ŷ(t|θ) = H^{-1}(q, θ)G(q, θ)u(t) + (I - H^{-1}(q, θ))y(t)   (23)

The prediction errors are

ε(t, θ) = y(t) - ŷ(t|θ) = H^{-1}(q, θ)(y(t) - G(q, θ)u(t))   (24)

Given the model (23) and measured data Z^N we determine the prediction
error estimate through

θ̂_N = arg min_{θ ∈ D_M} V_N(θ, Z^N)   (25)

V_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} ε_F^T(t, θ) Λ^{-1} ε_F(t, θ)   (26)

ε_F(t, θ) = L(q, θ)ε(t, θ)   (27)

Here Λ is a symmetric, positive definite weighting matrix and L a (possibly
parameter-dependent) monic prefilter that can be used to enhance certain
frequency regions. It is easy to see that

ε_F(t, θ) = L(q, θ)H^{-1}(q, θ)(y(t) - G(q, θ)u(t))   (28)

Thus the effect of the prefilter L can be included in the noise model and
L(q, θ) = 1 can be assumed without loss of generality. This will be done in the
sequel.
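To make the criterion (25)-(27) concrete, here is a minimal Python sketch of a SISO prediction error fit with an output error structure (H(q, θ) = 1, L = 1, Λ = 1); the two-parameter dynamics model and the use of scipy's optimizer are illustrative assumptions, not the paper's prescription.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.signal import lfilter

def predict(theta, u):
    # ŷ(t|θ) per (23) with H = 1: ŷ = G(q, θ)u, G(q, θ) = b1 q^-1 + b2 q^-2
    b1, b2 = theta
    return lfilter([0.0, b1, b2], [1.0], u)

def pem_estimate(u, y, theta0=(0.0, 0.0)):
    # θ̂_N = arg min V_N(θ, Z^N) with V_N = (1/N) Σ ε²(t, θ), cf. (24)-(26)
    res = least_squares(lambda th: y - predict(th, u), theta0)
    return res.x

# theta_hat = pem_estimate(u, y)   # with (u, y) taken from a data set Z^N
```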
We say that the true system is contained in the model set if, for some θ0 ∈ D_M,

G(q, θ0) = G0(q),  H(q, θ0) = H0(q)   (29)

This will also be written S ∈ M. The case when the true noise properties
cannot be correctly described within the model set but where there exists a
θ0 ∈ D_M such that

G(q, θ0) = G0(q)   (30)

will be denoted G0 ∈ G.

4.2 Convergence

Define the average criterion V̄(θ) as

V̄(θ) = Ē ε^T(t, θ) Λ^{-1} ε(t, θ)   (31)

Then we have the following result (see, e.g., [20, 22]):

θ̂_N → D_c = arg min_{θ ∈ D_M} V̄(θ)  with probability (w. p.) 1 as N → ∞   (32)

In case the input-output data can be described by (3) we have the following
characterization of D_c (G_θ is short for G(q, θ), etc.; matrix rows are separated
by semicolons):

D_c = arg min_{θ ∈ D_M} ∫_{-π}^{π} tr { Λ^{-1} H_θ^{-1} [ G0 - G_θ   H0 - H_θ ] [ Φ_u  Φ_ue ; Φ_eu  Λ0 ] [ (G0 - G_θ)* ; (H0 - H_θ)* ] H_θ^{-*} } dω   (33)

This is shown in Appendix A.1. Note that the result holds regardless of the
nature of the regulator, as long as Assumptions 1 and 2 hold and the signals
involved are quasi-stationary.
From (33) several conclusions regarding the consistency of the method can be
drawn. First of all, suppose that the parameterization of G and H is sufficiently
flexible so that S ∈ M. If this holds then the method will in general give
consistent estimates of G0 and H0 if the experiment is informative [22], which
means that the matrix

Φ_0 = [ Φ_u  Φ_ue ; Φ_eu  Λ0 ]   (34)

is positive definite for all frequencies. (Note that it will always be positive
semi-definite since it is a spectral matrix.) Suppose for the moment that the
regulator is linear and given by (4). Then we can factorize the matrix in (34)
as

Φ_0 = [ I  Φ_ue Λ0^{-1} ; 0  I ] [ Φ_u^r  0 ; 0  Λ0 ] [ I  0 ; Λ0^{-1} Φ_eu  I ]   (35)

The left and right factors in (35) always have full rank, hence the condition
becomes that Φ_u^r is positive definite for all frequencies (Λ0 is assumed positive
definite). This is true if and only if Φ_r is positive definite for all frequencies
(which is the same as to say that the reference signal is persistently
exciting [22]). In the last step we used the fact that the analytical function S0^i (cf.
(17)) can be zero at at most finitely many points. The conclusion is that for
linear feedback we should use a persistently exciting, external reference signal,
otherwise the experiment may not be informative.
The general condition is that there should not be a linear, time-invariant, and
noise-free relationship between u and y. With an external reference signal this
is automatically satisfied, but informative closed-loop experiments can also be
guaranteed if we switch between different linear regulators or use a non-linear
regulator. For a more detailed discussion on this see, e.g., [15] and [22].

4.3 Asymptotic Variance of Black Box Transfer Function Estimates

Consider the model (23). Introduce

T(q, θ) = vec [ G(q, θ)  H(q, θ) ]   (36)

(The vec-operator stacks the columns of its argument on top of each other in
a vector. A more formal definition is given in Appendix A.2.) Suppose that
the vector θ can be decomposed so that

θ = [ θ_1^T  θ_2^T  ...  θ_n^T ]^T,  dim θ_k = s,  dim θ = n · s   (37)

We shall call n the order of the model (23) and we allow n to tend to infinity
as N tends to infinity. Suppose also that T in (36) has the following shift
structure:

∂T(q, θ)/∂θ_k = q^{-k+1} ∂T(q, θ)/∂θ_1   (38)

It should be noted that most polynomial-type model structures, including the
ones studied in this paper, satisfy this shift structure. Thus (38) is a rather
weak assumption.
More background material, including further technical assumptions and
additional notation, can be found in Appendix A.2. For brevity we here go
directly to the main result (⊗ denotes the Kronecker product):

Cov vec T̂_N(e^{iω}) ≈ (n/N) [ Φ_u(ω)  Φ_ue(ω) ; Φ_eu(ω)  Λ0 ]^{-T} ⊗ Φ_v(ω)   (39)

The covariance matrix is thus proportional to the model order n divided by
the number of data N. This holds asymptotically as both n and N tend to
infinity. In open loop we have Φ_ue = 0 and

Cov vec Ĝ_N(e^{iω}) ≈ (n/N) Φ_u(ω)^{-T} ⊗ Φ_v(ω)   (40)
Cov vec Ĥ_N(e^{iω}) ≈ (n/N) Λ0^{-1} ⊗ Φ_v(ω)   (41)

Notice that (40) for the dynamics model holds also in case the noise model is
fixed (e.g. H(q, θ) = I).

4.4 Asymptotic Distribution of the Parameter Vector Estimates

If S ∈ M then θ̂_N → θ0 as N → ∞ under reasonable conditions (e.g., Φ_0 > 0,
see [22]). Then, if Λ = Λ0,

√N (θ̂_N - θ0) ∈ AsN(0, P_θ)   (42a)
P_θ = [ Ē ψ(t, θ0) Λ0^{-1} ψ^T(t, θ0) ]^{-1}   (42b)

where ψ is the negative gradient of the prediction errors ε with respect to θ.
In this paper we will restrict to the SISO case when discussing the asymptotic
distribution of the parameter vector estimates, for notational convenience. For
ease of reference we have in Appendix A.3 stated a variant of (42) as a theorem.
5 Closed-loop Identification in the Prediction Error Framework

5.1 The Direct Approach

The direct approach amounts to applying a prediction error method directly
to input-output data, ignoring possible feedback. In general one works with
models of the form (cf. (23))

ŷ(t|θ) = H^{-1}(q, θ)G(q, θ)u(t) + (I - H^{-1}(q, θ))y(t)   (43)

The direct method can thus be formulated as in (25)-(27). This coincides with
the standard (open-loop) prediction error method [22, 26]. Since this method
is well known we will not go into any further details here. Instead we turn to
the indirect approach.

5.2 The Indirect Approach

5.2.1 General

Consider the linear feedback set-up (4). If the regulator K is known and r is
measurable, we can use the indirect identification approach. It consists of two
steps:
(1) Identify the closed-loop system from the reference signal r to the output
y.
(2) Determine the open-loop system parameters from the closed-loop model
obtained in step 1, using the knowledge of the regulator.
Instead of identifying the closed-loop system in the first step one can identify
any closed-loop transfer function, for instance the sensitivity function. Here
we will concentrate on methods in which the closed-loop system is identified.
The model structure is

y(t) = G_c(q, θ)r(t) + H*(q)e(t)   (44)

Here G_c(q, θ) is a model of the closed-loop system. We have also included a
fixed noise model H*, which is standard in the indirect method. Often H*(q) =
1 is used, but we can also use H* as a fixed prefilter to emphasize certain
frequency ranges. The corresponding one-step-ahead predictor is

ŷ(t|θ) = H*^{-1}(q)G_c(q, θ)r(t) + (I - H*^{-1}(q))y(t)   (45)

Note that estimating θ in (45) is an "open-loop" problem since the noise
and the reference signal are uncorrelated. This implies that we may use any
identification method that works in open loop to find this estimate of the
closed-loop system. For instance, we can use output error models with fixed
noise models (prefilters) and still guarantee consistency (cf. Corollary 4 below).
Consider the closed-loop system (cf. (12))

y(t) = Gc0(q)r(t) + vc(t)   (46)

Suppose that we in the first step have obtained an estimate Ĝ_N^c(q) =
G_c(q, θ̂_N) of Gc0(q). In the second step we then have to solve the equation

Ĝ_N^c(q) = (I + Ĝ_N(q)K(q))^{-1} Ĝ_N(q)   (47)

using the knowledge of the regulator. The exact solution is

Ĝ_N(q) = Ĝ_N^c(q)(I - Ĝ_N^c(q)K(q))^{-1}   (48)

Unfortunately this gives a high-order estimate Ĝ_N in general: typically the
order of Ĝ_N will be equal to the sum of the orders of Ĝ_N^c and K. If we attempt
to solve (47) with the additional constraint that Ĝ_N should be of a certain
(low) order we end up with an over-determined system of equations which can
be solved in many ways, for instance in a weighted least-squares sense. For
methods, like the prediction error method, that allow arbitrary
parameterizations G_c(q, θ) it is natural to let the parameters θ relate to properties of the
open-loop system G, so that in the first step we should parameterize G_c(q, θ)
as

G_c(q, θ) = (I + G(q, θ)K(q))^{-1} G(q, θ)   (49)

This was apparently first suggested as an exercise in [22]. This parameterization
has also been analyzed in [8].
The choice (49) will of course have the effect that the second step in the
indirect method becomes superfluous, since we directly estimate the open-loop
parameters. The choice of parameterization may thus be important for
numerical and algebraic issues, but it does not affect the statistical properties
of the estimated transfer function:
As long as the parameterization describes the same set
of G, the resulting transfer function Ĝ will be the same,
regardless of the parameterization.
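In the SISO case the exact second-step solution (48) reads Ĝ = Ĝ_c(1 - Ĝ_c K)^{-1} and can be carried out directly on polynomial fractions. The following Python sketch (with transfer functions stored as (num, den) coefficient pairs in powers of q^{-1}, an illustrative convention of ours) shows the mechanics and why the order inflates:

```python
import numpy as np

def indirect_second_step(Gc, K):
    """SISO version of (48): G = Gc (1 - Gc K)^(-1).

    With Gc = Bc/Ac and K = X/Y this gives G = (Bc Y) / (Ac Y - Bc X),
    so the order of G is generically ord(Gc) + ord(K)."""
    Bc, Ac = Gc
    X, Y = K
    num = np.polymul(Bc, Y)
    den = np.polysub(np.polymul(Ac, Y), np.polymul(Bc, X))
    return num, den

# Example: Gc = 0.5 q^-1 / (1 - 0.3 q^-1) and K = 0.25:
# G_num, G_den = indirect_second_step(([0.0, 0.5], [1.0, -0.3]), ([0.25], [1.0]))
```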

5.2.2 The Dual-Youla Parameterization

A nice and interesting idea is to use the so called dual-Youla parameterization
that parameterizes all systems that are stabilized by a certain regulator K
(see, e.g., [32]). To present the idea, the concept of coprime factorizations of
transfer functions is required: A pair of stable transfer functions N, D ∈ RH∞
is a right coprime factorization (rcf) of G if G = N D^{-1} and there exist stable
transfer functions X, Y ∈ RH∞ such that XN + YD = I. The dual-Youla
parameterization now works as follows. Let G_nom with rcf (N, D) be any system
that is stabilized by K with rcf (X, Y). Then, as R ranges over all stable
transfer functions, the set

{ G : G(q, θ) = (N(q) + Y(q)R(q, θ))(D(q) - X(q)R(q, θ))^{-1} }   (50)

describes all systems that are stabilized by K. The unique value of R that
corresponds to the true plant G0 is given by

R0(q) = Y^{-1}(q)(I + G0(q)K(q))^{-1}(G0(q) - G_nom(q))D(q)   (51)

This idea can now be used for identification (see, e.g., [16], [17], and [6]): Given
an estimate R̂_N of R0 we can compute an estimate of G0 as

Ĝ_N(q) = (N(q) + Y(q)R̂_N(q))(D(q) - X(q)R̂_N(q))^{-1}   (52)

Using the dual-Youla parameterization we can write

G_c(q, θ) = (N(q) + Y(q)R(q, θ))(D(q) + X(q)Y^{-1}(q)N(q))^{-1}   (53)
          ≜ (N(q) + Y(q)R(q, θ))M(q)   (54)

With this parameterization the identification problem

y(t) = G_c(q, θ)r(t) + vc(t)   (55)

becomes

z(t) = R(q, θ)x(t) + ṽc(t)   (56)

where

z(t) = Y^{-1}(q)(y(t) - N(q)M(q)r(t))   (57)
x(t) = M(q)r(t)   (58)
ṽc(t) = Y^{-1}(q)vc(t)   (59)

Thus the dual-Youla method is a special parameterization of the general
indirect method. This means, especially, that the statistical properties of the
resulting estimates for the indirect method remain unaffected for the dual-Youla
method. The main advantage of this method is of course that the obtained
estimate Ĝ_N is guaranteed to be stabilized by K, which clearly is a nice
feature. A drawback is that this method typically will give high-order estimates:
typically the order will be equal to the sum of the orders of G_nom and R̂_N.
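Since (56)-(59) only involve filtering measured signals, the dual-Youla R-estimation step reduces to an open-loop problem on constructed data. A minimal Python sketch under our own conventions (SISO, transfer functions as (num, den) pairs in q^{-1}; a stable realization of Y^{-1} is assumed available):

```python
from scipy.signal import lfilter

def tf_apply(tf, x):
    # apply a SISO transfer function given as (num, den) in powers of q^-1
    num, den = tf
    return lfilter(num, den, x)

def dual_youla_signals(y, r, N_tf, M_tf, Yinv_tf):
    """Construct z and x of (56)-(59): x = M r, z = Y^-1 (y - N M r)."""
    x = tf_apply(M_tf, r)                          # (58)
    z = tf_apply(Yinv_tf, y - tf_apply(N_tf, x))   # (57), using N M r = N x
    return z, x

# Any open-loop method can now be used to fit z(t) = R(q, θ)x(t) + noise, cf. (56).
```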

In this paper we will use (49) as the generic indirect method. Before turning to
the joint input-output approach, let us pause and study an interesting variant
of the parameterization idea used in (49) which will provide useful insights
into the connection between the direct and indirect methods.

5.3 A Formal Connection Between Direct and Indirect Methods

The noise model H in a linear dynamics model structure has often turned
out to be a key to the interpretation of different "methods". The distinction
between the models/"methods" ARX, ARMAX, output error, Box-Jenkins, etc.,
is entirely explained by the choice of the noise model. Also the practically
important feature of prefiltering is equivalent to changing the noise model. Even
the choice between minimizing one- or k-step prediction errors can be seen as
a noise model issue. See, e.g., [22] for all this.
Therefore it should not come as a surprise that also the distinction between
the fundamental approaches of direct and indirect identification can be seen
as a choice of noise model.
The idea is to parameterize G as G(q, θ) and H as

H(q, θ) = (I + G(q, θ)K(q))H1(q, θ)   (60)

We thus link the noise model to the dynamics model. There is nothing strange
with that: So do ARX and ARMAX models. Although this parameterization
is perfectly valid, it must still be pointed out that the choice (60) is a highly
specialized one using the knowledge of K. Also note that this particular
parameterization scales H1 with the inverse model sensitivity function. (Similar
parameterizations have been suggested in, e.g., [4, 9, 18].)
Now, the predictor for

y(t) = G(q, θ)u(t) + H(q, θ)e(t)   (61)

is

ŷ(t|θ) = H^{-1}(q, θ)G(q, θ)u(t) + (I - H^{-1}(q, θ))y(t)   (62)

Using u = r - Ky and inserting (60) we get

ŷ(t|θ) = H1^{-1}(q, θ)(I + G(q, θ)K(q))^{-1}G(q, θ)r(t) + (I - H1^{-1}(q, θ))y(t)   (63)

But this is exactly the predictor also for the closed-loop model structure

y(t) = (I + G(q, θ)K(q))^{-1}G(q, θ)r(t) + H1(q, θ)e(t)   (64)

and hence the two approaches are equivalent. We formulate this result as a
lemma:

Lemma 3 Suppose that the input is generated as in (4), that both u and
r are measurable, and that the linear regulator K is known. Then, applying a
prediction error method to (61) with H parameterized as in (60), or to (64),
gives identical estimates θ̂_N. This holds regardless of the parameterization of
G and H1.

Among other things, this shows that we can use any theory developed for the
direct approach (allowing for feedback) to evaluate properties of the indirect
approach, and vice versa. It can also be noted that the particular choice of
noise model (60) is the answer to the question of how H should be parameterized
in the direct method in order to avoid bias in the G-estimate in the case of
closed-loop data, even if the true noise characteristics are not correctly modeled.
This is shown in [23].

5.4 The Joint Input-Output Approach

The third main approach to closed-loop identification is the so called joint
input-output approach. The basic assumption in this approach is that the input
is generated using a regulator of a certain form, e.g., (4). Exact knowledge
of the regulator parameters is not required, an advantage over the indirect
method where this is a necessity.
Suppose that the regulator is linear and of the form (20). The output y and
input u then obey

[ y(t) ; u(t) ] = [ Gc0(q) ; S0^i(q) ] r(t) + [ S0(q)H0(q)  G0(q)S0^i(q) ; -K(q)S0(q)H0(q)  S0^i(q) ] [ e(t) ; d(t) ]   (65)
               ≜ 𝒢0(q) r(t) + ℋ0(q) [ e(t) ; d(t) ]   (66)

Consider the parameterized model structure

[ y(t) ; u(t) ] = 𝒢(q, θ) r(t) + ℋ(q, θ) [ e(t) ; d(t) ]   (67)
               = [ G_yr(q, θ) ; G_ur(q, θ) ] r(t) + [ H_ye(q, θ)  H_yd(q, θ) ; H_ue(q, θ)  H_ud(q, θ) ] [ e(t) ; d(t) ]   (68)
where the parameterizations of the indicated transfer functions, for the time
being, are not further specified. Different parameterizations will lead to different
methods, as we shall see. Previously we have used a slightly different notation,
e.g., G_yr(q, θ) = G_c(q, θ). This will also be done in the sequel but for the
moment we will use the "generic" model structure (68) in order not to obscure
the presentation with too many parameterization details.
The basic idea in the joint input-output approach is to compute estimates
of the open-loop system using estimates of the different transfer functions in
(68). We can for instance use

Ĝ_N^{yu}(q) = Ĝ_N^{yr}(q)(Ĝ_N^{ur}(q))^{-1}   (69)

The rationale behind this choice is the relation G0 = Gc0(S0^i)^{-1} (cf. (65)).
We may also include a prefilter, F_r, for r in the model (68), so that instead
of using r directly, x = F_r r is used. The open-loop estimate would then be
computed as

Ĝ_N^{yu}(q) = Ĝ_N^{yx}(q)(Ĝ_N^{ux}(q))^{-1}   (70)

where Ĝ_N^{yx}(q) and Ĝ_N^{ux}(q) are the estimated transfer functions from x to y and
u, respectively. In the ideal case the use of the prefilter F_r will not affect the
results since G0 = Gc0 F_r^{-1}(S0^i F_r^{-1})^{-1} regardless of F_r, but in practice, with
noisy data, the filter F_r can be used to improve the quality of the estimates.
This idea really goes back to Akaike [1] who showed that spectral analysis
of closed-loop data should be performed as follows: Compute the spectral
estimates (SISO)

Ĝ_N^{yx} = Φ̂_yx^N / Φ̂_x^N  and  Ĝ_N^{ux} = Φ̂_ux^N / Φ̂_x^N   (71)

where the signal x is correlated with y and u but uncorrelated with the noise
e (a standard choice is x = r). The open-loop system may now be estimated
as

Ĝ_N^{yu} = Ĝ_N^{yx} / Ĝ_N^{ux} = Φ̂_yx^N / Φ̂_ux^N   (72)
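A minimal Python sketch of the cross-spectral construction (71)-(72); the Welch-type estimator from scipy is our choice here, the paper does not prescribe a particular spectral estimator:

```python
from scipy.signal import csd

def akaike_spectral_estimate(y, u, x, nperseg=256):
    """Closed-loop spectral analysis per (71)-(72):
    G_yu = Phi_yx / Phi_ux, with x correlated with y and u but not with e."""
    f, Pxy = csd(x, y, nperseg=nperseg)   # cross spectrum between x and y
    f, Pxu = csd(x, u, nperseg=nperseg)   # cross spectrum between x and u
    return f, Pxy / Pxu                   # frequency grid and open-loop estimate

# f, Gyu = akaike_spectral_estimate(y, u, r)   # the standard choice x = r
```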
In [30] a parametric variant of this idea was presented. This will be briefly
discussed in Section 5.4.1 below. A problem when using parametric methods
is that the resulting open-loop estimate will typically be of high order: from
(70) it follows that the order of Ĝ_N will generically be equal to the sum of the
orders of the factors Ĝ^{yx}(q) and Ĝ^{ux}(q). This problem is similar to the one
we are faced with in the indirect method, where we noted that solving for the
open-loop estimate in (47) typically gives high-order estimates. However, in
the joint input-output method (70) this can be circumvented, at least in the
SISO case, by parameterizing the factors G_yx(q, θ) and G_ux(q, θ) in a common-
denominator form. The final estimate will then be the ratio of the numerator
polynomials in the original models.
Another way of avoiding this problem is to consider parameterizations of the
form

G_yx(q, θ) = G_uy(q, ρ)G_ux(q, γ)   (73a)
G_ux(q, θ) = G_ux(q, γ)   (73b)

This way we will have control over the order of the final estimate through
the factor G_uy(q, ρ). If we disregard the correlation between the noise sources
affecting y and u we may first estimate the γ-parameters using u and r and
then estimate the ρ-parameters using y and r, keeping the γ-parameters fixed to
their estimated values. Such ideas will be studied in Sections 5.4.2-5.4.3 below.
We note that also ℋ0(q) contains all the necessary information about the open-
loop system, so that we can compute consistent estimates of G0 even when no
reference signal is used (r = 0). As an example we have that Ĝ_N^{yu} = Ĥ_yd^N (Ĥ_ud^N)^{-1}
is a consistent estimate of G0. Such methods were studied in [15]. See also [3]
and [26].

5.4.1 The Coprime Factor Identification Scheme

Consider the method (70). Recall that this method gives consistent estimates
of G0 regardless of the prefilter F_r (F_r is assumed stable). Can this freedom
in the choice of prefilter F_r be utilized to give a better finite sample behavior?
In [30] it is suggested to choose F_r so as to make Ĝ^{yx}(q) and Ĝ^{ux}(q) normalized
coprime. The main advantage with normalized coprime factors is that they
form a decomposition of the open-loop estimate Ĝ_N in minimal order, stable
factors. There is a problem, though, and that is that the proper prefilter F_r
that would make Ĝ^{yx}(q) and Ĝ^{ux}(q) normalized coprime is not known a priori.
To cope with this problem, an iterative procedure is proposed in [30] in which
the prefilter F_r^{(i)} at step i is updated using the current models Ĝ^{(i)}_{yx}(q) and
Ĝ^{(i)}_{ux}(q), giving F_r^{(i+1)}, and so on. The hope is, of course, that the iterations
lead to normalized coprime factors Ĝ^{yx}(q) and Ĝ^{ux}(q).

5.4.2 The Two-stage Method

The next joint input-output method we will study is the two-stage method [28].
It is usually presented using the following two steps (cf. (73)):
(1) Identify the input sensitivity function S0^i using, e.g., an output error model

û(t|γ) = S^i(q, γ)r(t)   (74)

(2) Construct the signal û = Ŝ_N^i r and identify the open-loop system using
the output error model

ŷ(t|ρ) = G(q, ρ)û(t) = G(q, ρ)Ŝ_N^i(q)r(t)   (75)

possibly using a fixed prefilter.
Note that in the first step a high-order model of S0^i can be used since we in
the second step can control the open-loop model order independently. Hence
it should be possible to obtain very good estimates of the true sensitivity
function in the first step, especially if the noise level is low. Ideally Ŝ_N^i → S0^i
as N → ∞ and û will be the noise-free part of the input signal. Thus in
the ideal case, the second step will be an "open-loop" problem so that an
output error model with fixed noise model (prefilter) can be used without
losing consistency. See, e.g., Corollary 4 below. This result requires that the
disturbance term d in (66) is uncorrelated with r.
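A minimal Python sketch of the two-stage method under illustrative choices of ours (a causal FIR model for S^i in step 1 and the two-tap model of (75) in step 2, both fitted by linear least squares):

```python
import numpy as np

def fir_regressor(x, lags):
    # column l holds x(t - l), with zeros outside the measured record
    N = len(x)
    cols = []
    for l in lags:
        col = np.zeros(N)
        if l >= 0:
            col[l:] = x[:N - l] if l > 0 else x
        else:
            col[:l] = x[-l:]
        cols.append(col)
    return np.column_stack(cols)

def two_stage(r, u, y, n_fir=11):
    # Step 1 (74): û(t) = Σ_{k=0}^{n_fir-1} s_k r(t-k), fitted to u
    Phi1 = fir_regressor(r, range(n_fir))
    s = np.linalg.lstsq(Phi1, u, rcond=None)[0]
    u_hat = Phi1 @ s                       # simulated noise-free input
    # Step 2 (75): y(t) ≈ b1 û(t-1) + b2 û(t-2)
    Phi2 = fir_regressor(u_hat, [1, 2])
    return np.linalg.lstsq(Phi2, y, rcond=None)[0]   # (b1, b2)
```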

5.4.3 The Projection Method

We will now present another method for closed-loop identification that is
inspired by Akaike's idea (71)-(72), which may be interpreted as a way to
correlate out the noise using the reference signal as instrumental variable. In form
it will be similar to the two-stage method but the motivation for the methods
will be quite different. Moreover, as we shall see, the feedback need not be
linear for this method to give consistent estimates. The method will be referred
to as the projection method [10, 11].
This method uses the same two steps as the two-stage method. The only
difference to the two-stage method is that in the first step one should use a
doubly infinite, non-causal FIR filter instead. The model can be written

û(t|γ) = S^i(q, γ)r(t) = Σ_{k=-M}^{M} s_k^i r(t-k),  M → ∞,  M = o(N)   (76)

This may be viewed as a "projection" of the input u onto the reference signal
r and will result in a partitioning of the input u into two asymptotically
uncorrelated parts:

u(t) = û(t) + ũ(t)   (77)

where

û(t) = Ŝ_N^i(q)r(t)   (78)
ũ(t) = u(t) - û(t)   (79)

We say asymptotically uncorrelated because û will always depend on e, since
u does and S^i(q, γ) is estimated using u. However, as this is a second order
effect it will be neglected.
The advantage over the two-stage method is that the projection method gives
consistent estimates of the open-loop system regardless of the feedback, even
with a fixed prefilter (cf. Corollary 4 below). A consequence of this is that
with the projection method we can use a fixed prefilter to shape the bias
distribution of the G-estimate at will, just as in the open-loop case with output
error models.
Further comments on the projection method (a code sketch follows this list):
Here we chose to perform the projection using a non-causal FIR filter but
this step may also be performed non-parametrically as in Akaike's cross-
spectral method (71)-(72).
In practice M can be chosen rather small. Good results are often obtained
even with very modest values of M. This is clearly illustrated in Example
5 below.
Finally, it would also be possible to project both the input u and the output
y onto r in the first step. This is in fact what is done in (71)-(72).
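A minimal Python sketch of the projection variant: identical to the two-stage sketch above (it reuses fir_regressor) except that the first-step lag window is the non-causal k = -M, ..., M of (76); M = 5 is an illustrative choice, matching the order used in Example 5 below:

```python
import numpy as np

def projection_method(r, u, y, M=5):
    # Step 1, cf. (76): û(t) = Σ_{k=-M}^{M} s_k r(t-k), projection of u onto r
    Phi1 = fir_regressor(r, range(-M, M + 1))
    s = np.linalg.lstsq(Phi1, u, rcond=None)[0]
    u_hat = Phi1 @ s                  # û of (78); ũ = u - û of (79) is dropped
    # Step 2: identical to the two-stage method, with û in place of u
    Phi2 = fir_regressor(u_hat, [1, 2])
    return np.linalg.lstsq(Phi2, y, rcond=None)[0]
```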

5.5 Unifying Framework for All Joint Input-Output Methods

Consider the joint system (66) and assume, for the moment, that d is white
noise with covariance matrix Λ_d, independent of e. The maximum likelihood
estimates of G0 and H0 are computed as

min_{θ ∈ D_M} (1/N) Σ_{t=1}^{N} [ y(t) - ŷ(t|θ) ; u(t) - û(t|θ) ]^T [ Λ0  0 ; 0  Λ_d ]^{-1} [ y(t) - ŷ(t|θ) ; u(t) - û(t|θ) ]   (80)

where

[ ŷ(t|θ) ; û(t|θ) ] = ℋ^{-1}(q, θ)𝒢(q, θ)r(t) + (I - ℋ^{-1}(q, θ)) [ y(t) ; u(t) ]   (81)

The parameterizations of 𝒢 and ℋ can be arbitrary. Consider the system (66).
This system was obtained using the assumption that the noise e affects the
open-loop system only and the disturbance d affects the regulator only. The
natural way to parameterize ℋ in order to reflect these assumptions in the
model is

ℋ(q, θ) = [ (I + G(q, θ)K(q, θ))^{-1}H(q, θ)   G(q, θ)(I + K(q, θ)G(q, θ))^{-1} ; -K(q, θ)(I + G(q, θ)K(q, θ))^{-1}H(q, θ)   (I + K(q, θ)G(q, θ))^{-1} ]   (82)

The inverse of ℋ is

ℋ^{-1}(q, θ) = [ H^{-1}(q, θ)   -H^{-1}(q, θ)G(q, θ) ; K(q, θ)   I ]   (83)

Thus, with 𝒢 parameterized as

𝒢(q, θ) = [ G(q, θ)(I + K(q, θ)G(q, θ))^{-1} ; (I + K(q, θ)G(q, θ))^{-1} ]   (84)

we get

[ ŷ(t|θ) ; û(t|θ) ] = [ 0 ; I ] r(t) + [ I - H^{-1}(q, θ)   H^{-1}(q, θ)G(q, θ) ; -K(q, θ)   0 ] [ y(t) ; u(t) ]   (85)

or

ŷ(t|θ) = (I - H^{-1}(q, θ))y(t) + H^{-1}(q, θ)G(q, θ)u(t)   (86)
û(t|θ) = r(t) - K(q, θ)y(t)   (87)

The predictor (86) is the same as for the direct method (cf. (23)), while (87) is
the natural predictor for estimating the regulator K. The maximum likelihood
estimate becomes

min_{θ ∈ D_M} { (1/N) Σ_{t=1}^{N} (y(t) - ŷ(t|θ))^T Λ0^{-1} (y(t) - ŷ(t|θ)) + (1/N) Σ_{t=1}^{N} (u(t) - û(t|θ))^T Λ_d^{-1} (u(t) - û(t|θ)) }   (88)

We may thus view the joint input-output method as a combination of a direct
identification of the open-loop system and a direct identification of the
regulator. Note that this holds even in the case where r(t) = 0. If the
parameterization of the regulator K is independent of that of the system,
the two terms in (88) can be minimized separately, which decouples the two
identification problems.

Let us return to (66). The natural output-error predictor for the joint system
is

[ ŷ(t|θ) ; û(t|θ) ] = [ G(q, θ) ; I ] (I + K(q, θ)G(q, θ))^{-1} r(t)   (89)

According to standard open-loop prediction error theory this will give
consistent estimates of G0 and K independently of Λ0 and Λ_d, as long as the
parameterization of G and K is sufficiently flexible. See Corollary 4 below.
With S^i(q, θ) = (I + K(q, θ)G(q, θ))^{-1} the model (89) can be written

[ ŷ(t|θ) ; û(t|θ) ] = [ G(q, θ) ; I ] S^i(q, θ) r(t)   (90)

Consistency can be guaranteed if the parameterization of S^i(q, θ) contains the
true input sensitivity function S0^i(q) (and similarly that G(q, θ) = G0(q) for
some θ ∈ D_M). See, e.g., Corollary 4 below. If

G(q, θ) = G(q, ρ),  S^i(q, θ) = S^i(q, γ),  θ = [ ρ^T  γ^T ]^T   (91)

the maximum likelihood estimate becomes

min_{θ ∈ D_M} { (1/N) Σ_{t=1}^{N} (y(t) - G(q, ρ)S^i(q, γ)r(t))^T Λ0^{-1} (y(t) - G(q, ρ)S^i(q, γ)r(t)) + (1/N) Σ_{t=1}^{N} (u(t) - S^i(q, γ)r(t))^T Λ_d^{-1} (u(t) - S^i(q, γ)r(t)) }   (92)

Now, if Λ0 = λ0 I and Λ_d = λ_d I, λ_d → 0, then the maximum likelihood
estimate will be identical to the one obtained with the two-stage or projection
methods. This is true because for small λ_d the γ-parameters will minimize

(1/N) Σ_{t=1}^{N} (u(t) - S^i(q, γ)r(t))^T (u(t) - S^i(q, γ)r(t))   (93)

regardless of the ρ-parameters, which then will minimize

(1/N) Σ_{t=1}^{N} (y(t) - G(q, ρ)S^i(q, γ)r(t))^T (y(t) - G(q, ρ)S^i(q, γ)r(t))   (94)

The weighting matrices Λ0 = λ0 I and Λ_d = λ_d I may be included in
the noise models (prefilters). Thus the two-stage method (and the projection
method) may be viewed as special cases of the general joint input-output
approach corresponding to special choices of the noise models. In particular,
this means that any result that holds for the joint input-output approach
without constraints on the noise models holds for the two-stage and projection
methods as well. We will for instance use this fact in Corollary 6 below.

6 Convergence Results for the Closed-loop Identification Methods

Let us now apply the result of Theorem A.1 to the special case of closed-loop
identification. In the following we will suppress the arguments ω, e^{iω}, and θ.
Thus we write G0 as short for G0(e^{iω}) and G_θ as short for G(e^{iω}, θ), etc. The
subscript θ is included to emphasize the parameter dependence.

Corollary 4 Consider the situation in Theorem A.1. Then, for
(1) the direct approach, with a model structure

y(t) = G(q, θ)u(t) + H(q, θ)e(t)   (95)

where G(q, θ) is such that either G0(q) and G(q, θ) or the controller k in
(5) contains a delay, we have that

θ̂_N → D_c = arg min_{θ ∈ D_M} ∫_{-π}^{π} tr { Λ^{-1} [ H_θ^{-1} ( (G0 + B_θ - G_θ) Φ_u (G0 + B_θ - G_θ)* + (H0 - H_θ)(Λ0 - Φ_eu Φ_u^{-1} Φ_ue)(H0 - H_θ)* ) H_θ^{-*} ] } dω  w. p. 1 as N → ∞   (96)

where

B_θ = (H0 - H_θ) Φ_eu Φ_u^{-1}   (97)

(Φ_eu is the cross spectrum between e and u.)
(2) the indirect approach, if the model structure is

y(t) = G_c(q, θ)r(t) + H*(q)e(t)   (98)

and the input is given by (20), we have that

θ̂_N → D_c = arg min_{θ ∈ D_M} ∫_{-π}^{π} tr [ Λ^{-1} H*^{-1} (Gc0 D - G_cθ) Φ_r (Gc0 D - G_cθ)* (H*)^{-*} ] dω  w. p. 1 as N → ∞   (99)

where

D = (I + Φ_dr Φ_r^{-1})   (100)

(Φ_dr is the cross spectrum between d and r.)
(3) the joint input-output approach,
(a) if the model structure is

[ y(t) ; u(t) ] = [ G_c(q, θ) ; S^i(q, θ) ] r(t) + [ H1(q)  0 ; 0  H2(q) ] [ e(t) ; d(t) ]   (101)

and the input is given by (20), we have that

θ̂_N → D_c = arg min_{θ ∈ D_M} ∫_{-π}^{π} tr { Λ^{-1} [ H1^{-1} (Gc0 D - G_cθ) Φ_r (Gc0 D - G_cθ)* H1^{-*} + H2^{-1} (S0^i D - S^i_θ) Φ_r (S0^i D - S^i_θ)* H2^{-*} ] } dω  w. p. 1 as N → ∞   (102)

where D is given by (100).
(b) if the model structure is

y(t) = G(q, θ)û(t) + H*(q)e(t) = G(q, θ)Ŝ_N^i(q)r(t) + H*(q)e(t)   (103)

and the input is given by (20), we have that

θ̂_N → D_c = arg min_{θ ∈ D_M} ∫_{-π}^{π} tr [ Λ^{-1} H*^{-1} (Gc0 D - G_θ Ŝ^i) Φ_r (Gc0 D - G_θ Ŝ^i)* (H*)^{-*} ] dω  w. p. 1 as N → ∞   (104)

where D is given by (100).
The proof is given in Appendix B.1.
Remarks:
(1) Let us first discuss the result for the direct approach, expression (96). If
the parameterization of the model G and the noise model H is flexible
enough so that for some θ0 ∈ D_M, G(q, θ0) = G0(q) and H(q, θ0) = H0(q)
(i.e. S ∈ M), then V̄(θ0) = 0, so D_c = {θ0} (provided this is a unique
minimum) under reasonable conditions. See, e.g., the discussion following
Eq. (33) and Theorem 8.3 in [22].
(2) If the system operates in open loop, so that Φ_ue = 0, then the bias term
B_θ = 0 regardless of the noise model H, and the limit model will be the
best possible approximation of G0 (in a frequency weighted norm that
depends on H and Φ_u). A consequence of this is that in open loop we
can use fixed noise models and still get consistent estimates of G0
provided that G(q, θ0) = G0(q) for some θ0 ∈ D_M and certain identifiability
conditions hold. See, e.g., Theorem 8.4 in [22].

(3) A straightforward upper bound on σ_max(B_θ) (the maximal singular value of
B_θ) is

σ_max(B_θ) ≤ σ_max(H0 - H_θ) σ_max(Φ_eu) / σ_min(Φ_u)   (105)

(σ_min(Φ_u) denotes the minimal singular value of Φ_u.) If the feedback is
linear and given by (4) we get the following alternative upper bound:

σ_max(B_θ) ≤ σ_max(H0 - H_θ) √(σ_max(Λ0)/σ_min(Φ_u)) √(σ_max(Φ_u^e)/σ_min(Φ_u))   (106)

It follows that the bias inclination in the G-estimate will be small in
frequency ranges where either (or all) of the following holds:
The noise model is good (σ_max(H0 - H_θ) is small).
The feedback noise contribution to the input spectrum (σ_max(Φ_u^e)/σ_min(Φ_u))
is small.
The signal to noise ratio is high (σ_max(Λ0)/σ_min(Φ_u) is small).
In particular, if a reasonably flexible, independently parameterized noise
model is used, the bias inclination of the G-estimate can be small.
(4) Consider now the indirect approach and especially expression (99). If the
disturbance signal d in (20) is uncorrelated with r (Φ_dr = 0) then it
is possible to obtain consistent estimates of G0 even with a fixed noise
model. Suppose that G_c is parameterized as in (49). Then

Gc0 - G_cθ = Gc0 - S_θ G_θ = S0 (G0 - G_θ) S^i_θ   (107)

Thus with under-modeling the resulting G-estimate will try to minimize
the mismatch between G0 and G_θ and at the same time try to minimize
the model sensitivity function S^i_θ. There will thus be a "bias-pull" towards
transfer functions that give a small sensitivity for the given regulator, but
unlike (96) it is not easy to quantify this bias component.
(5) If d is correlated with r (Φ_dr ≠ 0) then the multiplicative factor D ≠ I
(cf. (100)) and the G-estimate will try to minimize

Gc0 D - G_θ (I + K G_θ)^{-1}   (108)

which can lead to an arbitrarily bad open-loop estimate depending on D.
(6) Let us now turn to the joint input-output approach. From expression
(102) we conclude that, if the parameterizations of G_c and S^i are
sufficiently flexible, then Ĝ_N = Ĝ_N^c (Ŝ_N^i)^{-1} → Gc0 D (S0^i D)^{-1} = G0 regardless
of D. Thus for linear feedback this approach can give consistent estimates,
even if the regulator is unknown.
(7) Finally let us comment on expression (104). This holds for both the two-
stage and projection methods. Here we see that any disturbance d that is
correlated with r will deteriorate the G-estimate in the two-stage method.
The error is quantified by the multiplicative factor D in (104). With
the projection method, on the other hand, D = I since Φ_dr = 0 (by
construction). Hence with this method it is possible to obtain consistent
estimates of G0 with fixed noise models (prefilters), regardless of the
feedback. The difference in the applicability of the two-stage method and
the projection method will be illustrated in Example 5 below.
The results in Corollary 4 will now be illustrated by means of a small simulation
study.

Example 5 Consider the closed-loop system in Fig. 2, where the plant input is
u = r - f(y).

[Figure 2: block diagram with reference r, plant G0(q), additive output noise v, and output y fed back through the non-linear element f(·).]

Fig. 2. Closed-loop system with non-linear feedback.

Here

G0(q) = q^{-1} - 0.8 q^{-2}   (109)

The feedback controller is

f(x) = 0.25x + 4 if x > 0;  f(x) = 0.25x - 4 if x < 0   (110)

Thus, apart from a scaling, the controller consists of a Coulomb friction
element.
In order to identify G, a simulation was made with no noise present, i.e. with
v ≡ 0, and using a white noise reference signal. 5000 data samples were
collected. Thus the only errors in the models should be bias errors.
Next, the system was identified using the direct, indirect, two-stage, and
projection methods. For the direct method the model was

G(q, θ) = b1 q^{-1} + b2 q^{-2}   (111)

For the indirect method we used (cf. (49))

G_c(q, θ) = (b1 q^{-1} + b2 q^{-2}) / (1 + 0.25(b1 q^{-1} + b2 q^{-2}))   (112)

The following models were employed in the first step of the two-stage method:

(1) An 11th order FIR model:

S^i(q, γ) = b0 + b1 q^{-1} + ... + b10 q^{-10}   (113)

(2) An 11th order ARX model:

S^i(q, γ) = (b0 + b1 q^{-1} + ... + b5 q^{-5}) / (1 + a1 q^{-1} + ... + a5 q^{-5})   (114)

In the first step of the projection method the model was

S^i(q, γ) = Σ_{k=-5}^{5} s_k q^{-k}   (115)

That is, an 11th order non-causal FIR model. In the second step of the two-
stage and projection methods the model was chosen as

G(q, ρ) = b1 q^{-1} + b2 q^{-2}   (116)

The identification results are summarized in Table 1.

Table 1. Summary of identification results.

                    b1        b2
True system         1.0000   -0.8000
Direct              1.0000   -0.8000
Indirect           -1.8060   -4.8793
Two-stage (FIR)     0.9595   -0.8258
Two-stage (ARX)     0.8857   -0.6906
Projection          0.9985   -0.7965

Here we can note that the direct method retrieves the true system correctly
while the indirect method gives completely useless results. This is not surprising
since we did not take the non-linear element f into account when applying the
indirect method. The two variants of the two-stage method give biased results,
but with the projection method the estimate is almost perfect. Bode plots of
the resulting model errors are shown in Fig. 3 together with a plot of G0. Here
we chose to use 11th order models in the first step of the two-stage and
projection methods; other choices give similar results as long as the model order
is not too small (larger than 5, say). Note especially that the projection method
in this example works fine with a small number of parameters in the non-causal
FIR model. For linear systems the two-stage method applied with a high-order
FIR model in the first step usually gives very good results. However, as this
example shows, the non-causal terms in the FIR model in the projection
method can not be excluded in general.
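A sketch of how this experiment can be reproduced in Python with the estimators outlined earlier (pem_estimate for the direct method and projection_method for the projection method); the simulation follows (109)-(110) with v ≡ 0, and we set f(0) = 0, a case (110) leaves open:

```python
import numpy as np

def simulate_example5(N=5000, seed=1):
    # y(t) = u(t-1) - 0.8 u(t-2) per (109), u(t) = r(t) - f(y(t)) per (110)
    rng = np.random.default_rng(seed)
    r = rng.standard_normal(N)
    y = np.zeros(N)
    u = np.zeros(N)
    f = lambda x: 0.25 * x + 4.0 if x > 0 else (0.25 * x - 4.0 if x < 0 else 0.0)
    for t in range(N):
        y[t] = (u[t-1] if t >= 1 else 0.0) - 0.8 * (u[t-2] if t >= 2 else 0.0)
        u[t] = r[t] - f(y[t])
    return r, u, y

# r, u, y = simulate_example5()
# pem_estimate(u, y)            # direct method: recovers (1.0, -0.8) exactly
# projection_method(r, u, y)    # close to (1.0, -0.8), cf. Table 1
```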

[Figure 3: magnitude Bode plots (dB) of the true system and of the model errors for the indirect, two-stage (FIR), two-stage (ARX), and projection methods.]

Fig. 3. Bode plot of the true system and the errors in the estimated models.
7 Asymptotic Variance of Black Box Transfer Function Estimates in the Case of Linear Feedback

Consider now the case where the system to be identified operates in closed
loop with a linear regulator given by (4) controlling the system. Then, for the
direct, the indirect, and the joint input-output approach we have that

Cov vec Ĝ_N(e^{iω}) ≈ (n/N) (Φ_u^r(ω))^{-T} ⊗ Φ_v(ω)   (117)

as the model order n as well as the number of data N tends to infinity. This
follows from

Corollary 6 Suppose that the input is given by (4). Then

√N vec [ Ĝ_N(e^{iω}, n) - G_n(e^{iω}) ] ∈ AsN(0, P(ω, n, ξ))  as N → ∞ for fixed n, ξ   (118)

where

lim_{ξ→0} lim_{n→∞} (1/n) P(ω, n, ξ) = (Φ_u^r(ω))^{-T} ⊗ Φ_v(ω)   (119)

holds for
(1) the direct approach, under the conditions in Theorem A.2.
(2) the indirect approach, under the conditions in Corollary A.3.
(3) the joint input-output approach, under the conditions in Corollary A.3.
The proof is given in Appendix B.2. The variable ξ is explained in Appendix
A.2.
Remarks:
(1) The difference between (117) (or (119)) and the corresponding open-loop
expression (40) (or (A.24)) is thus that Φ_u^r (i.e., that part of the input
spectrum that originates from the reference signal r) replaces Φ_u.
(2) The expression (117), which also is the asymptotic Cramér-Rao lower
limit, tells us precisely "the value of information" of closed-loop
experiments. It is the noise-to-signal ratio (where "signal" is what derives from
the injected reference) that determines how well the open-loop transfer
function can be estimated. From this perspective, that part of the input
that originates from the feedback noise has no information value when
estimating G. Since this property is, so to say, inherent in the problem, it
should come as no surprise that all approaches give the same asymptotic
variance.
(3) The results in (A.2), (40), and (117) are independent of the
parameterization, and an important assumption is that the model order tends to
infinity. It is then quite natural that the differences between the methods
disappear, since they may be viewed as different parameterizations of the
same prediction error method. When we fix the model order the results
will be different, though. This will be illustrated in the next section. Let
us also mention that if we in the direct method use a fixed noise model,
equal to the true one, H(q, θ) = H*(q) = H0(q), then as the model order
tends to infinity we have that

Cov vec Ĝ_N(e^{iω}, n) ≈ (n/N) (Φ_u(ω))^{-T} ⊗ Φ_v(ω)   (120)

just as in the open-loop case. This is true since the covariance matrix is
determined entirely by the second order properties (the spectrum) of the
input, and it is immaterial whether this spectrum is a result of open-loop
or closed-loop operation.

8 Asymptotic Distribution of Parameter Vector Estimates in the Case of Closed-loop Data

8.1 The Direct and Indirect Approaches

The following corollary to Theorem A.4 in Appendix A.3 states what the
asymptotic distribution of the estimate θ̂_N will be for the direct and
indirect approaches in their standard formulations. The result will be given
for the SISO case only. In the following G_0' will be used as short for
(d/dθ)G(e^{iω}, θ)|_{θ=θ0}, etc.

Corollary 7 Assume that the conditions in Theorem A.4 hold and, in
addition, that the data set Z^∞ is informative enough with respect to the chosen
model structure. Then for
(1) the direct approach, with a model structure

y(t) = G(q, ρ)u(t) + H(q, γ)e(t),  θ = [ ρ^T  γ^T ]^T   (121)

satisfying S ∈ M and that is globally identifiable at θ0, we have that

√N (θ̂_N - θ0) ∈ AsN(0, P_D)   (122a)

where

P_D = λ0 (R^r + Δ)^{-1}   (122b)

R^r = (1/2π) ∫_{-π}^{π} (Φ_u^r / |H0|^2) G_0' (G_0')* dω   (122c)

Δ = R^e - R^e_{ργ} (R^e_{γγ})^{-1} (R^e_{ργ})*   (122d)

R^e = (1/2π) ∫_{-π}^{π} λ0 |K|^2 |S0|^2 G_0' (G_0')* dω   (122e)

R^e_{ργ} = -(1/2π) ∫_{-π}^{π} (λ0 / |H0|^2) K S0 H0 G_0' (H_0')* dω   (122f)

R^e_{γγ} = (1/2π) ∫_{-π}^{π} (λ0 / |H0|^2) H_0' (H_0')* dω   (122g)

(2) the indirect approach,
(a) with a model structure

y(t) = (G(q, ρ) / (1 + G(q, ρ)K(q))) r(t) + H_c(q, γ)e(t),  θ = [ ρ^T  γ^T ]^T   (123)

satisfying S ∈ M and that is globally identifiable at θ0, we have that

√N (θ̂_N - θ0) ∈ AsN(0, P_I)   (124a)

where

P_I = λ0 (R^r)^{-1}   (124b)

with R^r given by (122c).
(b) with a model structure

y(t) = (G(q, θ) / (1 + G(q, θ)K(q))) r(t) + H*(q)e(t)   (125)

satisfying G0 ∈ G and that is globally identifiable at θ0, we have that

√N (θ̂_N - θ0) ∈ AsN(0, P_I*)   (126a)

where

P_I* = R^{-1} Q R^{-1}   (126b)

R = (1/2π) ∫_{-π}^{π} (|S0|^2 Φ_u^r / |H*|^2) G_0' (G_0')* dω   (126c)

Q = (1/2π) ∫_{-π}^{π} (|S0|^4 Φ_v Φ_u^r / |H*|^4) G_0' (G_0')* dω   (126d)

The proof is given in Appendix B.3.

A number of comments can be made regarding the Δ-term in (122). We collect
these in

Lemma 8 Consider the situation in Corollary 7. For all choices of noise
model H(q, γ) such that S ∈ M we have that

0 ≤ Δ ≤ R^e   (127)

Furthermore,

Δ → 0   (128)

as the order of the noise model tends to infinity, and

Δ = R^e   (129)

if the noise model is fixed and equal to the true one, that is if H(q, γ) =
H*(q) = H0(q).

The proof is given in Appendix B.4.
Further remarks:
(1) An important observation regarding the result (122) is that the term Δ
is entirely due to the noise part of the input spectrum, and since Δ ≥ 0
this contribution has a positive effect on the accuracy, contrary to what
one might have guessed. We conclude that in the direct method the noise
in the loop is utilized in reducing the variance. For the indirect methods
this contribution is zero.
(2) From (122) it is also clear that | from the accuracy point of view | the
worst-case experimental conditions with respect to the reference signal is
when there is no external reference signal present, that is when !r = 0.
In this case
Cov ^N N0 ";1 (130)
Thus " characterizes the lower limit of achievable accuracy for the direct
method. Now, if " is non-singular, (122) says that we can consistently es-
timate the system parameters even though no reference signal is present.
The exact conditions for this to happen are given in 24] for some com-
mon special cases. However, even if " is singular it will have a benecial
eect on the variance of the estimates, according to (122). Only when
" = 0 there is no positive eect from the noise source on the accuracy
of the estimates. According to Lemma 8 this would be the case when the
order of the noise model tends to innity. At the other extreme we have
that a xed (and correct) noise model will make " as \large" as it can
be.
This puts the nger on the value of information in the noise source e
for estimating the dynamics:
It is the knowledge/assumption of a constrained noise
model that improves the estimate of G.
This also explains the dierence between (117) (which assumes the noise
model order to tend to innity) and (120) (which assumes a xed and
correct noise model).
(3) For the indirect approach, note that for all $H_*$,

$$P_{\mathrm{I}} \le P_{\mathrm{I}}' \qquad (131)$$

with equality for $H_* = H_{c0} = S_0H_0$ in (125). By comparing (124) with
(122) we also see that the covariance matrix for the indirect method will
always be larger than the one for the direct. The difference is the term
$\Delta$ that is missing in (124). We thus have the following ranking:

$$P_{\mathrm{D}} \le P_{\mathrm{I}} \le P_{\mathrm{I}}' \qquad (132)$$
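To make the ranking (132) concrete, the following sketch (assuming numpy; the first-order plant, regulator gain, and noise model are hypothetical choices, not from the paper) evaluates the integrals (122c)-(122g) numerically for a one-parameter dynamics model and a one-parameter noise model, and compares $P_D$ with $P_I$:

```python
import numpy as np

# Hypothetical SISO example: G(q,rho) = rho*q^-1/(1 + a0*q^-1) with only the
# gain rho unknown, H(q,eta) = 1 + eta*q^-1, constant regulator K, and a
# white reference signal with spectrum Phi_r.
a0, c0, K, lam0, Phi_r = -0.7, 0.5, 0.5, 1.0, 1.0

w = np.linspace(-np.pi, np.pi, 4001)
dw = w[1] - w[0]
z = np.exp(1j * w)
G0 = 1.0 / z / (1 + a0 / z)        # true dynamics (rho_0 = 1)
H0 = 1 + c0 / z                    # true noise filter
dG = 1.0 / z / (1 + a0 / z)        # G0' = dG/drho
dH = 1.0 / z                       # H0' = dH/deta
S0 = 1 / (1 + G0 * K)              # sensitivity function
Phi_ur = np.abs(S0) ** 2 * Phi_r   # reference part of the input spectrum

def integral(f):                   # (1/2pi) * integral over [-pi, pi]
    return (f * dw).sum().real / (2 * np.pi)

Rr    = integral(Phi_ur / np.abs(H0) ** 2 * dG * np.conj(dG))               # (122c)
Re_rr = lam0 * integral(np.abs(K * S0) ** 2 * dG * np.conj(dG))             # (122e)
Re_re = -lam0 * integral(K * S0 * H0 * dG * np.conj(dH) / np.abs(H0) ** 2)  # (122f)
Re_ee = lam0 * integral(dH * np.conj(dH) / np.abs(H0) ** 2)                 # (122g)
Delta = Re_rr - Re_re ** 2 / Re_ee                                          # (122d)

PD = lam0 / (Rr + Delta)   # direct method, (122b)
PI = lam0 / Rr             # indirect method, (124b)
print(f"P_D = {PD:.4f} <= P_I = {PI:.4f}")   # illustrates the ranking (132)
```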
8.2 Further Results for the Indirect Approach

In Corollary 7 we considered two variants of the indirect approach, and it is
important to realize that the conclusions drawn in the previous section do not
hold for all possible variants and parameterizations of the indirect approach.
In this section we will study two other variants where first the closed-loop
system is identified and then the open-loop system is estimated by solving
an over-determined system of equations, relating the open-loop parameters to
the closed-loop parameters, in the least-squares sense using the knowledge of
the controller (cf. the discussion in Section 5.2). We will then see that it is
possible to get the same level of accuracy with the indirect method as with
the direct, even with finite model orders. This was first shown in [25] and in
this section we will review and extend the results there. See also [15] and [27].
8.2.1 Indirect Identification Using ARMAX Models

Let us now outline the idea. Suppose that the regulator is linear and of the
form (4), i.e.

$$u(t) = r(t) - K(q)y(t) \qquad (133)$$

Factorize $K$ as

$$K(q) = \frac{X(q)}{Y(q)} \qquad (134)$$

where the polynomials $X$ and $Y$ are coprime. Let the closed-loop and open-loop model structures be

$$A_c(q)y(t) = B_c(q)r(t) + C_c(q)e(t) \qquad (135)$$

and

$$A(q)y(t) = B(q)u(t) + C(q)e(t) \qquad (136)$$

respectively. From

$$G_c(q) = \frac{G(q)}{1 + G(q)K(q)} \quad\text{and}\quad H_c(q) = \frac{H(q)}{1 + G(q)K(q)} \qquad (137)$$

(cf. (9)) it follows that we may solve for $A$, $B$ and $C$ in

$$\begin{cases} A_c(q) = A(q)Y(q) + B(q)X(q)\\ B_c(q) = B(q)Y(q)\\ C_c(q) = C(q)Y(q) \end{cases} \qquad (138)$$
Let $\theta$ and $\zeta$ denote the closed-loop and open-loop parameter vectors, respectively. Then, for given $\theta$, (138) is a system of linear equations in $\zeta$ from which
$\zeta$ can be estimated. If the Markov estimate is used, i.e. if (138) is solved
in the least-squares sense, using the estimated covariances of $\hat\theta_N$ as weights,
Corollary 9 below states that this indirect method is consistent under reasonable assumptions and gives the same accuracy as the direct method with an
ARMAX model structure.
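As an illustration of (138) and the subsequent least-squares step, the following sketch (assuming numpy; the polynomial orders and coefficient values are hypothetical) forms the closed-loop polynomials and then recovers the open-loop parameters by ordinary least squares; the Markov estimate would additionally weight the equations with the inverse covariance of $\hat\theta_N$:

```python
import numpy as np

# Hypothetical open-loop ARMAX system and regulator K = X/Y
A = np.array([1.0, -0.9])        # A(q) = 1 - 0.9 q^-1
B = np.array([0.0, 0.5])         # B(q) = 0.5 q^-1
C = np.array([1.0, 0.3])         # C(q) = 1 + 0.3 q^-1
X = np.array([0.2])              # X(q) = 0.2
Y = np.array([1.0])              # Y(q) = 1

# Closed-loop polynomials according to (138)
Ac = np.polyadd(np.polymul(A, Y), np.polymul(B, X))   # A Y + B X
Bc = np.polymul(B, Y)
Cc = np.polymul(C, Y)

# Recovering the open-loop coefficients: stack (138) as Gamma * zeta = theta,
# with zeta = [a1, b1, c1] (the leading 1's are known).  With Y = 1 and
# X = x0 the equations read  ac1 = a1 + x0*b1,  bc1 = b1,  cc1 = c1.
Gamma = np.array([[1.0, X[0], 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
theta = np.array([Ac[1], Bc[1], Cc[1]])
zeta = np.linalg.lstsq(Gamma, theta, rcond=None)[0]
print(zeta)   # recovers [-0.9, 0.5, 0.3]
```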
Corollary 9 Consider the situation in Theorem A.4 and let the model structure $\mathcal{M}$ be given by (135). Assume that the model structure $\mathcal{M}$ is identifiable
at $\theta_0$ and that the data set $Z^\infty = \{y(1), r(1), y(2), r(2), \dots\}$ is informative
enough with respect to $\mathcal{M}$. Assume also that $\mathcal{S} \in \mathcal{M}$ holds for the model structure (136). Then, if the open-loop parameters $\zeta$ are estimated from (138) using
the Markov estimate, we have that

$$\sqrt{N}(\hat\zeta_N - \zeta_0) \in \mathrm{AsN}(0, P_{\mathrm{AX}}) \qquad (139a)$$

where

$$P_{\mathrm{AX}} = \lambda_0\left[\bar E\,\psi(t,\zeta_0)\psi^T(t,\zeta_0)\right]^{-1} \qquad (139b)$$
$$\psi(t,\zeta) = -\frac{d}{d\zeta}\left[\frac{A(q)}{C(q)}\left(y(t) - \frac{B(q)}{A(q)}u(t)\right)\right] \qquad (139c)$$
The proof is given in Appendix B.5.

Through this theorem we have established yet another link between the direct
and indirect approaches. As will become more apparent below, the fact that
the dynamics model and the noise model share the same poles is crucial for
obtaining the same accuracy as with the direct approach.
8.2.2 Indirect Identification Using Box-Jenkins Models

Suppose now that we in the first step of the indirect method use the following
model structure with an independently parameterized noise model:

$$y(t) = G_c(q,\rho)r(t) + H_c(q,\eta)e(t) = \frac{B_c(q)}{F_c(q)}\,r(t) + \frac{C_c(q)}{D_c(q)}\,e(t) \qquad (140)$$

This is also known as a Box-Jenkins model structure. Let the open-loop model
be

$$y(t) = G(q,\rho)u(t) + H(q,\eta)e(t) = \frac{B(q)}{F(q)}\,u(t) + \frac{C(q)}{D(q)}\,e(t) \qquad (141)$$

To find the indirect estimate of $G_0$ we may solve for $B$ and $F$ in (cf. (138))

$$\begin{cases} B_c(q) = B(q)Y(q)\\ F_c(q) = F(q)Y(q) + B(q)X(q) \end{cases} \qquad (142)$$
The final estimate will also this time be consistent (under mild conditions)
but not of optimal accuracy. Instead we get
Corollary 10 Consider the situation in Theorem A.4 and let the model structure $\mathcal{M}$ be given by (140). Assume that the model structure $\mathcal{M}$ is identifiable
at $\theta_0$ and that the data set $Z^\infty = \{y(1), r(1), y(2), r(2), \dots\}$ is informative
enough with respect to $\mathcal{M}$. Assume also that $G_0 \in \mathcal{G}$ holds for the model structure (141). Then, if the open-loop parameters are estimated from (142) using
the Markov estimate, we have that

$$\sqrt{N}(\hat\rho_N - \rho_0) \in \mathrm{AsN}(0, P_{\mathrm{BJ}}) \qquad (143a)$$

where

$$P_{\mathrm{BJ}} = \lambda_0\left[\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\Phi_u^r}{|H_0|^2}\,G_0'(G_0')^*\,d\omega\right]^{-1} \qquad (143b)$$

The proof is given in Appendix B.6.
Thus

$$P_{\mathrm{BJ}} = P_{\mathrm{I}} \ (\ge P_{\mathrm{D}}) \qquad (144)$$

and once again we conclude that with an independently parameterized noise
model the indirect method gives worse accuracy than the direct method. The
difference is quantified by the term $\Delta$ (cf. (122)) which is missing in (143).
8.3 Variance Results for the Projection Method

In this section some results on the efficiency of the projection method will be
presented. We will consider the model

$$\hat y(t|\theta) = H_*^{-1}(q)G(q,\theta)\hat u(t) + (1 - H_*^{-1}(q))y(t) \qquad (145)$$

Note that the true system (3) can be rewritten as

$$y(t) = G_0(q)\hat u(t) + w(t) \qquad (146)$$

where

$$w(t) = v(t) + G_0(q)(u(t) - \hat u(t)) \qquad (147)$$

The key feature of the projection method is now that $\hat u$ will be uncorrelated
with $w$. Estimating the model (145) is therefore an open-loop problem and all
standard open-loop results hold. As an example we have
Corollary 11 Consider the estimate $\hat\theta_N$ determined by (25)-(27). Assume
that the model structure (145) is uniformly stable and that Assumptions 1 and
2 hold. Suppose that $\hat u$ and $w$ given by (147) are uncorrelated. Suppose also
that for a unique value $\theta^*$ interior to $D_{\mathcal{M}}$ we have

$$\hat\theta_N \to \theta^* \ \text{w. p. 1 as } N \to \infty \qquad (148)$$
$$R = \bar V''(\theta^*) > 0 \qquad (149)$$
$$\sqrt N\,E\left[\frac{1}{N}\sum_{t=1}^{N}\psi(t,\theta^*)\varepsilon(t,\theta^*) - \bar E\,\psi(t,\theta^*)\varepsilon(t,\theta^*)\right] \to 0 \ \text{as } N \to \infty \qquad (150)$$

Then

$$\sqrt N(\hat\theta_N - \theta^*) \in \mathrm{AsN}(0, P) \qquad (151a)$$
$$P = R^{-1}QR^{-1} \qquad (151b)$$
$$R = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\Phi_{\hat u}}{|H_*|^2}\,G_0'(G_0')^*\,d\omega \qquad (151c)$$
$$Q = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\Phi_w\Phi_{\hat u}}{|H_*|^4}\,G_0'(G_0')^*\,d\omega \qquad (151d)$$
Remarks:

(1) Suppose that $\Phi_w = \lambda|H^0|^2$ for some constant $\lambda$ and some monic filter $H^0$.
For all $H_*$,

$$P \ge \lambda R^{-1} \triangleq P_{\mathrm{opt}} \qquad (152)$$

Equality holds if $H_* = H^0$.

(2) The result (151) is equivalent to the open-loop result, except that $\Phi_u$ is
replaced by $\Phi_{\hat u}$ and $\Phi_v$ by $\Phi_w$. Typically

$$\Phi_{\hat u} < \Phi_u \quad\text{and}\quad \Phi_w > \Phi_v \qquad (153)$$

so that the signal-to-noise ratio is worse than for open-loop identification
($\Phi_{\hat u}/\Phi_w < \Phi_u/\Phi_v$), hence the variance will be larger.
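A minimal sketch of the projection step itself, assuming numpy and a hypothetical first-order loop with a saturating (hence nonlinear) regulator: $\hat u$ is formed by regressing $u$ on current and past reference values, a causal FIR approximation of the projection onto the reference signal, after which $(\hat u, y)$ can be treated as open-loop data in a second prediction error step (omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 20000, 30                      # data length, FIR projection order
r = rng.standard_normal(N)            # external reference (white)
e = 0.3 * rng.standard_normal(N)      # process noise
y = np.zeros(N)
u = np.zeros(N)
for t in range(1, N):
    # hypothetical first-order loop with a saturating regulator
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1] + e[t]
    u[t] = np.clip(r[t] - y[t], -1.0, 1.0)   # nonlinear feedback

# Projection step: regress u(t) on r(t), r(t-1), ..., r(t-M+1)
Phi = np.column_stack([np.concatenate([np.zeros(k), r[:N - k]])
                       for k in range(M)])
g = np.linalg.lstsq(Phi, u, rcond=None)[0]
u_hat = Phi @ g                       # noise-free part of the input

# u is correlated with the noise through the feedback; u_hat, being a
# function of r only, should be (nearly) uncorrelated with it.
print("corr(u, e)     =", np.corrcoef(u, e)[0, 1])
print("corr(u_hat, e) =", np.corrcoef(u_hat, e)[0, 1])
```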
9 Summarizing Discussion

We have attempted to give a status report on identification of closed-loop
systems. Several methods have been studied and we have shown that most of
the common methods can be viewed as special parameterizations of the general
prediction error method. This allows simultaneous analysis of the statistical
properties of the methods using the general prediction error theory. The main
results can be summarized as follows.

The direct method

- gives consistency and optimal accuracy, regardless of the feedback, if the
  noise and dynamics models contain a true description of the system;
- requires a correct noise model (possibly estimated) to avoid bias in the
  dynamics transfer function for closed-loop data (Corollary 4). For this reason,
  user-chosen weightings for custom-made bias distribution cannot be used.

The indirect method

- requires perfect knowledge of the regulator but can be applied with fixed
  noise models and still guarantee consistency (Corollary 4).

The projection method

- allows the true system to be approximated in an arbitrary, user-chosen
  frequency domain norm regardless of the feedback (Corollary 4);
- typically gives worse accuracy than the direct method (Corollary 11).

Further comments:

(1) The asymptotic variance expressions for increasing model orders are the
same for all methods (Corollary 6).

(2) For the finite model order case, the direct method meets the Cramér-Rao
bound and is thus optimal. The indirect method generally gives worse
accuracy. The improved accuracy for the direct method can be traced
to the fact that a constrained noise model (that includes the true noise
description) allows some leverage from the noise part of the input to
estimate the dynamics (Corollary 7 and Lemma 8).

The optimal statistical properties, the simplicity, and the general applicability of the direct method imply that it should be seen as the first choice
of method for closed-loop identification. When this method fails, no other
method will succeed. In the literature several other methods, e.g., the indirect
method, have been advocated mainly due to their ability to shape the bias
distribution by use of fixed noise models (prefilters). However, it is important
to realize that the total error in the resulting estimate will be due to both
bias and variance errors, and since these methods typically give sub-optimal
accuracy, this "advantage" may be questionable.

If a low order model is sought that should approximate the system dynamics
in a pre-specified frequency norm, there are two ways to proceed:

- First estimate a higher order model, with small bias, using the direct approach, and then reduce this model to lower order using the proper frequency
  weighting.
- Use the projection method. This has the advantage that a higher order
  model need not be estimated, but has the drawback that it may have worse
  accuracy.
Acknowledgement

The authors wish to thank all friends and colleagues, including the anonymous
reviewers, who have given helpful comments and suggestions for improvements
of the manuscript during the course of this work.
A Theoretical Results for the Prediction Error Method

A.1 Complement to Section 4.2

Here we will prove (33). Let us formalize the result in a theorem:
Theorem A.1 Let $\hat\theta_N$ be defined by (25) and (27) where $L(q,\theta) = I$. Suppose
that the model structure (22) is uniformly stable and that Assumptions 1 and
2 hold. Suppose also that all signals involved are quasistationary. Then, w. p. 1 as $N \to \infty$,

$$\hat\theta_N \to D_c = \arg\min_{\theta\in D_{\mathcal{M}}}\int_{-\pi}^{\pi}\mathrm{tr}\left\{\Lambda^{-1}H_\theta^{-1}\begin{bmatrix}(G_0 - G_\theta) & (H_0 - H_\theta)\end{bmatrix}\begin{bmatrix}\Phi_u & \Phi_{ue}\\ \Phi_{eu} & \Lambda_0\end{bmatrix}\begin{bmatrix}(G_0 - G_\theta) & (H_0 - H_\theta)\end{bmatrix}^* H_\theta^{-*}\right\}d\omega \qquad (A.1)$$
PROOF. From (32) we have that $\hat\theta_N$ will tend to the minimizing argument
of $\bar V(\theta)$, w. p. 1 as $N \to \infty$. Using Parseval's relationship we get

$$\bar V(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\mathrm{tr}\left[\Lambda^{-1}\Phi_\varepsilon(\omega,\theta)\right]d\omega \qquad (A.2)$$

Here $\Phi_\varepsilon$ is the spectrum of the prediction errors $\varepsilon(t,\theta) = H^{-1}(q,\theta)(y(t) - G(q,\theta)u(t))$. Write $H_\theta$ as short for $H(e^{i\omega},\theta)$, etc. Then, using (3), we get

$$\varepsilon = H_\theta^{-1}(y - G_\theta u) = H_\theta^{-1}\left[(G_0 - G_\theta)u + (H_0 - H_\theta)e\right] + e \qquad (A.3)$$

The last term is independent of the rest since $(G_0(q) - G(q,\theta))u(t)$ depends
only on $e(s)$ for $s < t$ (by assumption) and $H_0(q) - H(q,\theta)$ contains a delay
(since both $H_0$ and $H_\theta$ are monic). The spectrum of $\varepsilon$ thus becomes

$$\Phi_\varepsilon = H_\theta^{-1}\begin{bmatrix}(G_0 - G_\theta) & (H_0 - H_\theta)\end{bmatrix}\begin{bmatrix}\Phi_u & \Phi_{ue}\\ \Phi_{eu} & \Lambda_0\end{bmatrix}\begin{bmatrix}(G_0 - G_\theta) & (H_0 - H_\theta)\end{bmatrix}^* H_\theta^{-*} + \Lambda_0 \qquad (A.4)$$

The result now follows since the last term is independent of $\theta$. $\Box$
A.2 Complement to Section 4.3

In this appendix we will give the exact statements of the results (39)-(41),
including the necessary technical assumptions. The main results are given in
Theorem A.2 and Corollary A.3 below. These are used in Corollary 6, Section 7.

First of all, the vec-operator introduced in (36) is defined as follows ($A_{ij}$ is the $(i,j)$
element of the $(n \times m)$ matrix $A$):

$$\mathrm{vec}\,A = \begin{bmatrix}A_{11} & A_{21} & \dots & A_{n1} & A_{12} & A_{22} & \dots & A_{nm}\end{bmatrix}^T \qquad (A.5)$$
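In code terms, (A.5) is column-major stacking of the matrix; a two-line check, assuming numpy:

```python
import numpy as np

A = np.array([[1, 4],
              [2, 5],
              [3, 6]])          # a 3 x 2 matrix
print(A.flatten('F'))           # vec A = [1 2 3 4 5 6], columns stacked as in (A.5)
```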
The model structure considered is

$$\varepsilon(t,\theta) = y(t) - \hat y(t|\theta) = H^{-1}(q,\theta)(y(t) - G(q,\theta)u(t)) \qquad (A.6)$$

Let

$$\theta^*(n) = \arg\min_{\theta\in D_{\mathcal{M}}}\bar E\,\varepsilon^T(t,\theta)\varepsilon(t,\theta) \qquad (A.7)$$

(If the minimum is not unique, let $\theta^*(n)$ denote any minimizing element.) Now
define the estimate $\hat\theta_N(n,\delta)$ by

$$\hat\theta_N(n,\delta) = \arg\min_{\theta\in D_{\mathcal{M}}}V_N(\theta, n, \delta, Z^N) \qquad (A.8)$$
$$V_N(\theta, n, \delta, Z^N) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon^T(t,\theta)\varepsilon(t,\theta) + \delta\|\theta - \theta^*(n)\|^2 \qquad (A.9)$$

Here $\delta$ is a regularization parameter helping us to select a unique minimizing
element in (A.8) in case $\delta = 0$ leads to non-unique minima. Let

$$T_n(e^{i\omega}) = T(e^{i\omega},\theta^*(n)) \qquad (A.10)$$
$$\hat T_N(e^{i\omega}, n, \delta) = T(e^{i\omega},\hat\theta_N(n,\delta)) \qquad (A.11)$$
$$T_0(e^{i\omega}) = \mathrm{vec}\begin{bmatrix}G_0(e^{i\omega}) & H_0(e^{i\omega})\end{bmatrix} \qquad (A.12)$$

It will be assumed that

$$\lim_{n\to\infty}n^2\,\bar E\left[(\varepsilon(t,\theta^*(n)) - e(t))^T(\varepsilon(t,\theta^*(n)) - e(t))\right] = 0 \qquad (A.13)$$

which implies that $T_n(e^{i\omega})$ tends to $T_0(e^{i\omega})$ as $n$ tends to infinity. Define $Z(q,\theta)$
through (cf. (38))

$$\frac{\partial}{\partial\theta_k}T(q,\theta) = q^{-k+1}\frac{\partial}{\partial\theta_1}T(q,\theta) \triangleq q^{-k+1}Z(q,\theta) \qquad (A.14)$$

Let the matrix function $Z(e^{i\omega},\theta)$ be denoted by $Z_0(e^{i\omega})$ when evaluated for
$T_0(e^{i\omega})$. It will also be assumed that

$$Z_0(e^{i\omega})Z_0^T(e^{i\omega}) \qquad (A.15)$$

is invertible and that

$$\lim_{N\to\infty}\frac{1}{\sqrt N}\sum_{t=1}^{N}\bar E\left[\frac{d}{d\theta}\varepsilon^T(t,\theta^*(n))\varepsilon(t,\theta^*(n))\right] = 0 \quad (n \text{ fixed}) \qquad (A.16)$$

Finally it will be assumed that

$$\bar\sigma(\Phi_u(\omega)) \le C, \quad \underline\sigma(\Phi_u(\omega)) \ge \delta > 0 \quad \forall\omega \qquad (A.17)$$
$$\frac{1}{\sqrt{n(N)}}\sum_{\tau=-n(N)}^{n(N)}|\tau|\,\bar\sigma(R_u(\tau)) \to 0 \quad \text{as } N \to \infty,\ n(N) \to \infty \qquad (A.18)$$

Here $\bar\sigma(\cdot)$ and $\underline\sigma(\cdot)$ denote the largest and smallest singular values, and

$$R_u(\tau) = \bar E\,u(t)u^T(t-\tau) \qquad (A.19)$$
The main variance result is now

Theorem A.2 Consider the estimate $\hat T_N(e^{i\omega}, n, \delta)$ under the assumptions
(38) and (A.6)-(A.18). Suppose also that Assumptions 1 and 2 hold. Then

$$\sqrt N\,\mathrm{vec}\left[\hat T_N(e^{i\omega}, n, \delta) - T_n(e^{i\omega})\right] \in \mathrm{AsN}(0, P(\omega, n, \delta)) \quad \text{as } N \to \infty \text{ for fixed } n, \delta \qquad (A.20)$$

where

$$\lim_{\delta\to0}\lim_{n\to\infty}\frac{1}{n}P(\omega, n, \delta) = \begin{bmatrix}\Phi_u(\omega) & \Phi_{ue}(\omega)\\ \Phi_{eu}(\omega) & \Lambda_0\end{bmatrix}^{-T}\otimes\,\Phi_v(\omega) \qquad (A.21)$$

(Here $\otimes$ denotes the Kronecker product.)

This was proven for the SISO case in [21]. The result for the MIMO case,
cited here, was established in [34]. The expression (39) is an intuitive, but not
formally correct, interpretation of (A.20), (A.21). In open loop we have

Corollary A.3 Consider the same situation as in Theorem A.2 but assume
that $H(q,\theta) = H_*(q)$ is independent of $\theta$. Assume that $\Phi_{ue} = 0$ and that (A.13) is
relaxed to

$$n^2\left\|G_n(e^{i\omega}) - G_0(e^{i\omega})\right\| \to 0 \quad \text{as } n \to \infty \qquad (A.22)$$

Then

$$\sqrt N\,\mathrm{vec}\left[\hat G_N(e^{i\omega}, n, \delta) - G_n(e^{i\omega})\right] \in \mathrm{AsN}(0, P(\omega, n, \delta)) \quad \text{as } N \to \infty \text{ for fixed } n, \delta \qquad (A.23)$$

where

$$\lim_{\delta\to0}\lim_{n\to\infty}\frac{1}{n}P(\omega, n, \delta) = (\Phi_u(\omega))^{-T}\otimes\Phi_v(\omega) \qquad (A.24)$$
A.3 Complement to Section 4.4

We will now turn to the question of the asymptotic distribution of the parameter
vector estimates. For notational convenience the discussion will be limited to
the SISO case; let $\lambda_0$ denote the variance of the noise $e$ in (3). The following result is a variant of Theorem 9.1 in [22]:

Theorem A.4 Consider the estimate $\hat\theta_N$ determined by (25)-(27) with
$L(q,\theta) = 1$. Assume that the model structure is linear and uniformly stable
and that Assumptions 1 and 2 hold. Suppose that the regulator is given by
(4) and that for a unique value $\theta^*$ interior to $D_{\mathcal{M}}$ we have

$$\hat\theta_N \to \theta^* \ \text{w. p. 1 as } N \to \infty \qquad (A.25)$$
$$R = \bar V''(\theta^*) > 0 \qquad (A.26)$$
$$\sqrt N\,E\left[\frac{1}{N}\sum_{t=1}^{N}\psi(t,\theta^*)\varepsilon(t,\theta^*) - \bar E\,\psi(t,\theta^*)\varepsilon(t,\theta^*)\right] \to 0 \ \text{as } N \to \infty \qquad (A.27)$$

Then

$$\sqrt N(\hat\theta_N - \theta^*) \in \mathrm{AsN}(0, P_\theta) \qquad (A.28a)$$
$$P_\theta = R^{-1}QR^{-1} \qquad (A.28b)$$
$$Q = \lim_{N\to\infty}N\,\bar E\left\{[V_N'(\theta^*, Z^N)][V_N'(\theta^*, Z^N)]^T\right\} \qquad (A.28c)$$

Remark: Frequency domain expressions for $R$ and $Q$ can easily be derived
using Parseval's relationship (cf. [22], Section 9.4).
B Additional Proofs

B.1 Proof of Corollary 4

From Theorem A.1 we know that

$$D_c = \arg\min_{\theta\in D_{\mathcal{M}}}\int_{-\pi}^{\pi}\mathrm{tr}\left[\Lambda^{-1}\Phi_\varepsilon(\omega,\theta)\right]d\omega \qquad (B.1)$$
where $\Phi_\varepsilon$ is the spectrum of the prediction errors $\varepsilon$. Consider first the direct
approach. The spectrum of the prediction errors is given by (A.4). Note that
(cf. (35))

$$\begin{bmatrix}\Phi_u & \Phi_{ue}\\ \Phi_{eu} & \Lambda_0\end{bmatrix} = \begin{bmatrix}I & 0\\ \Phi_{eu}\Phi_u^{-1} & I\end{bmatrix}\begin{bmatrix}\Phi_u & 0\\ 0 & \Lambda_0 - \Phi_{eu}\Phi_u^{-1}\Phi_{ue}\end{bmatrix}\begin{bmatrix}I & \Phi_u^{-1}\Phi_{ue}\\ 0 & I\end{bmatrix} \qquad (B.2)$$

Thus with $B_\theta$ as in (97) we get

$$\Phi_\varepsilon = H_\theta^{-1}\begin{bmatrix}(G_0 - G_\theta + B_\theta) & (H_0 - H_\theta)\end{bmatrix}\begin{bmatrix}\Phi_u & 0\\ 0 & \Lambda_0 - \Phi_{eu}\Phi_u^{-1}\Phi_{ue}\end{bmatrix}\begin{bmatrix}(G_0 - G_\theta + B_\theta) & (H_0 - H_\theta)\end{bmatrix}^* H_\theta^{-*} + \Lambda_0 \qquad (B.3)$$

Equation (96) now follows since the last term is independent of $\theta$. For the
indirect approach we get, using (20),

$$\varepsilon = H_*^{-1}(y - G_{c\theta}r) = H_*^{-1}\left[(G_{c0} - G_{c\theta})r + S_0v + G_{c0}d\right] \qquad (B.4)$$

Thus

$$\Phi_\varepsilon = H_*^{-1}\left[(G_{c0} - G_{c\theta})\Phi_r(G_{c0} - G_{c\theta})^* + (G_{c0} - G_{c\theta})\Phi_{rd}G_{c0}^* + G_{c0}\Phi_{dr}(G_{c0} - G_{c\theta})^* + \theta\text{-indep. terms}\right]H_*^{-*} \qquad (B.5)$$

or

$$\Phi_\varepsilon = H_*^{-1}\left[(G_{c0} - G_{c\theta} + G_{c0}\Phi_{dr}\Phi_r^{-1})\Phi_r(G_{c0} - G_{c\theta} + G_{c0}\Phi_{dr}\Phi_r^{-1})^* + \theta\text{-indep. terms}\right]H_*^{-*} \qquad (B.6)$$

With $D$ as in (100), we thus have

$$\Phi_\varepsilon = H_*^{-1}\left[(G_{c0}D - G_{c\theta})\Phi_r(G_{c0}D - G_{c\theta})^* + \theta\text{-indep. terms}\right]H_*^{-*} \qquad (B.7)$$

from which (99) follows. The proofs of (102) and (104) are analogous. $\Box$
B.2 Proof of Corollary 6

For the direct method, note that the result in Theorem A.2 applies also to
the closed-loop case. The covariance of $\mathrm{vec}[\hat G_N(e^{i\omega}, n)]$ is given by the top left
block of $P$ given by (A.21). Note that

$$\begin{bmatrix}\Phi_u & \Phi_{ue}\\ \Phi_{eu} & \Lambda_0\end{bmatrix}^{-1} = \begin{bmatrix}(\Phi_u - \Phi_{ue}\Lambda_0^{-1}\Phi_{eu})^{-1} & \times\\ \times & \times\end{bmatrix} \quad (\times = \text{don't care}) \qquad (B.8)$$

From (16) and (19) we get

$$\Phi_u - \Phi_{ue}\Lambda_0^{-1}\Phi_{eu} = S_0^i\Phi_r(S_0^i)^* = \Phi_u^r \qquad (B.9)$$

which proves the result for the direct approach.
Let us now turn to the proof for the indirect approach. The model is given by
(45). According to Corollary A.3 the random variable

$$\sqrt N\,\mathrm{vec}\left[\hat G_{cN}(e^{i\omega}, n, \delta) - G_{cn}(e^{i\omega})\right] \qquad (B.10)$$

will have an asymptotic normal distribution with covariance matrix $P_c(\omega, n, \delta)$
satisfying

$$\lim_{\delta\to0}\lim_{n\to\infty}\frac{1}{n}P_c(\omega, n, \delta) = (\Phi_r(\omega))^{-T}\otimes\Phi_{v_c}(\omega) \qquad (B.11)$$

where $\Phi_{v_c} = S_0\Phi_vS_0^*$. This result holds regardless of the noise model $H_*$ used
in (45). From the Taylor expansion

$$\mathrm{vec}\left[\hat G_N(e^{i\omega}, n, \delta) - G_n(e^{i\omega})\right] = \left(\frac{d\,\mathrm{vec}[G]}{d\,\mathrm{vec}[G_c]}(e^{i\omega},\theta^*(n))\right)^T\mathrm{vec}\left[\hat G_{cN}(e^{i\omega}, n, \delta) - G_{cn}(e^{i\omega})\right] + O(\|\mathrm{vec}[\hat G_{cN}(e^{i\omega}, n, \delta) - G_{cn}(e^{i\omega})]\|^2) \qquad (B.12)$$

we see that the covariance matrix of

$$\sqrt N\,\mathrm{vec}\left[\hat G_N(e^{i\omega}, n, \delta) - G_n(e^{i\omega})\right] \qquad (B.13)$$

will be

$$P(\omega, n, \delta) = \left(\frac{d\,\mathrm{vec}[G]}{d\,\mathrm{vec}[G_c]}(e^{i\omega},\theta^*(n))\right)^T P_c(\omega, n, \delta)\left(\frac{d\,\mathrm{vec}[G]}{d\,\mathrm{vec}[G_c]}(e^{-i\omega},\theta^*(n))\right) \qquad (B.14)$$
Note that

$$G_c = SG = (I + GK)^{-1}G = G - GKG + GKGKG - \dots \qquad (B.15)$$

Repeated use of the formula [14]

$$\frac{d\,\mathrm{vec}[AXB]}{d\,\mathrm{vec}[X]} = B \otimes A^T \qquad (B.16)$$

and the chain rule now gives

$$\frac{d\,\mathrm{vec}[G_c]}{d\,\mathrm{vec}[G]} = I - (KG \otimes I) - (I \otimes K^TG^T) + (KG \otimes K^TG^T) + (KGKG \otimes I) + (I \otimes K^TG^TK^TG^T) - \dots \qquad (B.17)$$

As can be readily verified, with $S^i = (I + KG)^{-1}$ and $S = (I + GK)^{-1}$, we
thus have that

$$\frac{d\,\mathrm{vec}[G_c]}{d\,\mathrm{vec}[G]} = S^i \otimes S^T \qquad (B.18)$$
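A quick numerical check of (B.18) under the convention (B.16), assuming numpy (the matrix sizes and the constant "transfer matrices" are arbitrary stand-ins): a small perturbation $dG$ should propagate to $G_c$ as $\mathrm{vec}[dG_c] \approx (S^i \otimes S^T)^T\,\mathrm{vec}[dG]$.

```python
import numpy as np

rng = np.random.default_rng(1)
p, m = 2, 3                                  # G is p x m, K is m x p
G = 0.3 * rng.standard_normal((p, m))
K = 0.3 * rng.standard_normal((m, p))

def Gc(G):                                   # closed-loop transfer matrix
    return np.linalg.solve(np.eye(p) + G @ K, G)

S  = np.linalg.inv(np.eye(p) + G @ K)        # (I + GK)^{-1}
Si = np.linalg.inv(np.eye(m) + K @ G)        # (I + KG)^{-1}

dG = 1e-6 * rng.standard_normal((p, m))
num = (Gc(G + dG) - Gc(G)).flatten('F')      # numerical perturbation of vec[Gc]
J = np.kron(Si, S.T)                         # (B.18), with the convention (B.16)
pred = J.T @ dG.flatten('F')                 # d vec[Gc] = J^T d vec[G]
print(np.allclose(num, pred, atol=1e-10))    # True
```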
Consequently

$$\lim_{n\to\infty}\frac{1}{n}P(\omega, n, \delta) = \left[(S_0^i)^{-T}\otimes S_0^{-1}\right]\left[(\Phi_r)^{-T}\otimes\Phi_{v_c}\right]\left[(S_0^i)^{-T}\otimes S_0^{-1}\right]^* \qquad (B.19)$$
$$= \left[(S_0^i)^{-T}(\Phi_r)^{-T}((S_0^i)^*)^{-T}\right]\otimes\left[S_0^{-1}\Phi_{v_c}S_0^{-*}\right] \qquad (B.20)$$
$$= \left[S_0^i\Phi_r(S_0^i)^*\right]^{-T}\otimes\left[S_0^{-1}S_0\Phi_vS_0^*S_0^{-*}\right] \qquad (B.21)$$
$$= (\Phi_u^r)^{-T}\otimes\Phi_v \qquad (B.22)$$

which concludes the proof for the indirect approach.
For the joint input-output approach the model is (in case an output error
model is used):

$$\begin{bmatrix}\hat y(t,\theta)\\ \hat u(t,\theta)\end{bmatrix} = \begin{bmatrix}G_c(q,\theta)\\ S^i(q,\theta)\end{bmatrix}r(t) \qquad (B.23)$$

The open-loop estimate is computed as $\hat G_N = \hat G_{cN}(\hat S_N^i)^{-1}$. Write $Q$ as short
for $[G_c^T\ (S^i)^T]^T$. According to Corollary A.3 the random variable

$$\sqrt N\,\mathrm{vec}\left[\hat Q_N(e^{i\omega}, n, \delta) - Q_n(e^{i\omega})\right] \qquad (B.24)$$

will have an asymptotic normal distribution with covariance matrix $P_Q(\omega, n, \delta)$
satisfying

$$\lim_{\delta\to0}\lim_{n\to\infty}\frac{1}{n}P_Q(\omega, n, \delta) = (\Phi_r(\omega))^{-T}\otimes\begin{bmatrix}I\\ -K(e^{i\omega})\end{bmatrix}\Phi_{v_c}(\omega)\begin{bmatrix}I\\ -K(e^{i\omega})\end{bmatrix}^* \qquad (B.25)$$

where $\Phi_{v_c} = S_0\Phi_vS_0^*$. From the Taylor expansion

$$\mathrm{vec}\left[\hat G_N(e^{i\omega}, n, \delta) - G_n(e^{i\omega})\right] = \left(\frac{d\,\mathrm{vec}[G]}{d\,\mathrm{vec}[Q]}(e^{i\omega},\theta^*(n))\right)^T\mathrm{vec}\left[\hat Q_N(e^{i\omega}, n, \delta) - Q_n(e^{i\omega})\right] + O(\|\mathrm{vec}[\hat Q_N(e^{i\omega}, n, \delta) - Q_n(e^{i\omega})]\|^2) \qquad (B.26)$$

we see that the covariance matrix of

$$\sqrt N\,\mathrm{vec}\left[\hat G_N(e^{i\omega}, n, \delta) - G_n(e^{i\omega})\right] \qquad (B.27)$$

will be

$$P(\omega, n, \delta) = \left(\frac{d\,\mathrm{vec}[G]}{d\,\mathrm{vec}[Q]}(e^{i\omega},\theta^*(n))\right)^T P_Q(\omega, n, \delta)\left(\frac{d\,\mathrm{vec}[G]}{d\,\mathrm{vec}[Q]}(e^{-i\omega},\theta^*(n))\right) \qquad (B.28)$$

By use of the formula (B.16) and the chain rule it can be shown that

$$\frac{d\,\mathrm{vec}[G]}{d\,\mathrm{vec}[Q]} = (S^i)^{-1}\otimes\begin{bmatrix}I & 0\end{bmatrix}^T + (S^i)^{-1}\otimes\begin{bmatrix}0 & -G_c(S^i)^{-1}\end{bmatrix}^T \qquad (B.29)$$
$$= (S^i)^{-1}\otimes\begin{bmatrix}I & -G_c(S^i)^{-1}\end{bmatrix}^T \qquad (B.30)$$

Note that $G_c(S^i)^{-1} \to G_0$ as $n \to \infty$. Thus

$$\lim_{n\to\infty}\frac{1}{n}P(\omega, n, \delta) = \left[(S_0^i)^{-T}\otimes\begin{bmatrix}I & -G_0\end{bmatrix}\right]\left((\Phi_r)^{-T}\otimes\begin{bmatrix}I\\ -K\end{bmatrix}\Phi_{v_c}\begin{bmatrix}I\\ -K\end{bmatrix}^*\right)\left[(S_0^i)^{-T}\otimes\begin{bmatrix}I & -G_0\end{bmatrix}\right]^* \qquad (B.31)$$
$$= \left[(S_0^i)^{-T}(\Phi_r)^{-T}((S_0^i)^*)^{-T}\right]\otimes\left[S_0^{-1}\Phi_{v_c}S_0^{-*}\right] \qquad (B.32)$$
$$= (\Phi_u^r)^{-T}\otimes\Phi_v \qquad (B.33)$$

which concludes the proof for the joint input-output approach, and also
the proof of Corollary 6. $\Box$
B.3 Proof of Corollary 7

Consider first the direct approach. Under the conditions in the corollary we
have that $\hat\theta_N \to \theta_0$ w. p. 1 as $N \to \infty$. See, e.g., Theorem 8.3 in [22]. It follows
from Theorem A.4 that

$$\sqrt N(\hat\theta_N - \theta_0) \in \mathrm{AsN}(0, P_\theta) \qquad (B.34)$$

where

$$P_\theta = \lambda_0\left[\bar E\,\psi(t,\theta_0)\psi^T(t,\theta_0)\right]^{-1} \qquad (B.35)$$
$$\triangleq \lambda_0R^{-1} \qquad (B.36)$$

Introduce $\chi_0 = [u\ e]^T$. The spectrum of $\chi_0$ is

$$\Phi_{\chi_0} = \begin{bmatrix}\Phi_u & \Phi_{ue}\\ \Phi_{eu} & \lambda_0\end{bmatrix} \qquad (B.37)$$

Since $\Phi_{ue} = -KS_0H_0\lambda_0$, we may also write this as

$$\Phi_{\chi_0} = \Phi_{\chi_0}^r + \Phi_{\chi_0}^e \qquad (B.38)$$

where

$$\Phi_{\chi_0}^r = \begin{bmatrix}\Phi_u^r & 0\\ 0 & 0\end{bmatrix} \quad\text{and}\quad \Phi_{\chi_0}^e = \lambda_0\begin{bmatrix}KS_0H_0\\ -1\end{bmatrix}\begin{bmatrix}KS_0H_0\\ -1\end{bmatrix}^* \qquad (B.39)$$

Using Parseval's relation, $R$ can be written as (see, e.g., [22])

$$R = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1}{|H_0|^2}\,T_0'\Phi_{\chi_0}(T_0')^*\,d\omega \qquad (B.40)$$

where $T = [G\ H]$ and where $T_0' = \frac{d}{d\theta}T|_{\theta=\theta_0}$. From (B.38) it follows that

$$T_0'\Phi_{\chi_0}(T_0')^* = T_0'\Phi_{\chi_0}^r(T_0')^* + T_0'\Phi_{\chi_0}^e(T_0')^* \qquad (B.41)$$

We may thus write

$$R = R^r + R^e \qquad (B.42)$$

where, due to the chosen parameterization,

$$R^r = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1}{|H_0|^2}\,T_0'\Phi_{\chi_0}^r(T_0')^*\,d\omega = \begin{bmatrix}R_r & 0\\ 0 & 0\end{bmatrix} \qquad (B.43)$$
and

$$R^e = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1}{|H_0|^2}\,T_0'\Phi_{\chi_0}^e(T_0')^*\,d\omega = \begin{bmatrix}R_{\rho\rho}^e & R_{\rho\eta}^e\\ R_{\eta\rho}^e & R_{\eta\eta}^e\end{bmatrix} \qquad (B.44)$$

Straightforward calculations will also show that

$$R_r = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\Phi_u^r}{|H_0|^2}\,G_0'(G_0')^*\,d\omega \qquad (B.45)$$

and

$$R^e = \frac{\lambda_0}{2\pi}\int_{-\pi}^{\pi}\frac{1}{|H_0|^2}\begin{bmatrix}KS_0H_0G_0'\\ -H_0'\end{bmatrix}\begin{bmatrix}KS_0H_0G_0'\\ -H_0'\end{bmatrix}^*\,d\omega \qquad (B.46)$$
$P_\rho$, the covariance matrix for $\hat\rho_N$, can be found as the top left block of $P_\theta$.
Combining (B.35), (B.36), (B.43), and (B.44) we thus have that

$$P_\rho = \lambda_0(R_r + \Delta)^{-1} \qquad (B.47)$$
$$\Delta = R_{\rho\rho}^e - R_{\rho\eta}^e(R_{\eta\eta}^e)^{-1}R_{\eta\rho}^e \qquad (B.48)$$

where $R_r$ is given by (B.45) and expressions for $R_{\rho\rho}^e$, $R_{\rho\eta}^e$, and $R_{\eta\eta}^e$ can be
extracted from (B.46). This finishes the proof of (122). Let us now turn to
the proof of (124). The covariance matrix $P_\theta$ is given by (B.35). Applying
Parseval's formula to $R = \bar E\,\psi(t,\theta_0)\psi^T(t,\theta_0)$ gives

$$R = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1}{|H_{c0}|^2}\begin{bmatrix}G_{c0}' & H_{c0}'\end{bmatrix}\begin{bmatrix}\Phi_r & 0\\ 0 & \lambda_0\end{bmatrix}\begin{bmatrix}G_{c0}' & H_{c0}'\end{bmatrix}^*\,d\omega \qquad (B.49)$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1}{|H_{c0}|^2}\begin{bmatrix}\Phi_rG_{c0}'(G_{c0}')^* & 0\\ 0 & \lambda_0H_{c0}'(H_{c0}')^*\end{bmatrix}\,d\omega \qquad (B.50)$$

Hence

$$P_\rho = \lambda_0\left[\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\Phi_r}{|H_{c0}|^2}\,G_{c0}'(G_{c0}')^*\,d\omega\right]^{-1} \qquad (B.51)$$

The theorem now follows after noting that $G_{c0}' = S_0^2G_0'$ and $H_{c0} = S_0H_0$. The
proof of (126) is similar. $\Box$
B.4 Proof of Lemma 8

Consider $\Delta$ defined by (122d). Note that $\Delta$ is the Schur complement of $R_{\eta\eta}^e$ in
the matrix $R^e$ given by (B.44). Since $R^e \ge 0$ we thus have $\Delta \ge 0$. This proves
the lower bound in (127). Next, define

$$\psi_e(t,\theta_0) = T_0'(q)\begin{bmatrix}K(q)S_0(q)H_0(q)\\ -1\end{bmatrix}H_0^{-1}(q)e(t) \qquad (B.52)$$

Then we can alternatively write $R^e$ as

$$R^e = \bar E\,\psi_e(t,\theta_0)\psi_e^T(t,\theta_0) \qquad (B.53)$$

Now introduce the notation

$$w(t) = H_0^{-1}(q)e(t) \qquad (B.54)$$
$$\bar G_0(q) = K(q)S_0(q)H_0(q)G_0'(q) \qquad (B.55)$$

so that

$$\psi_e(t,\theta_0) = \begin{bmatrix}\bar G_0(q)w(t)\\ -H_0'(q)w(t)\end{bmatrix} \qquad (B.56)$$

Here the numbers of rows in $\bar G_0$ and $H_0'$ are consistent with the partitioning
(B.43), (B.44). From well-known least-squares projections (see, e.g., [2]), we
now recognize $\Delta$ as the error covariance matrix when estimating $\bar G_0w$ from
$H_0'w$.

Suppose now that the order of the noise model tends to infinity. Without loss
of generality we can assume that

$$H(q,\eta) = \lim_{M\to\infty}\left(1 + \sum_{k=1}^{M}h_kq^{-k}\right) \qquad (B.57)$$

Then knowing $H_0'w$ is equivalent to knowing all past $w$. Then $\bar G_0w$ can be
determined exactly from $H_0'w$, and $\Delta = 0$. This proves (128).

At the other extreme, a fixed (and correct) noise model will make $\Delta = R_{\rho\rho}^e = \bar E\,\bar G_0w(\bar G_0w)^T$, which is the largest value $\Delta$ may have. This proves (129) and
the upper bound in (127). $\Box$
B.5 Proof of Corollary 9

Let

$$A(q) = 1 + a_1q^{-1} + \dots + a_{n_a}q^{-n_a}$$
$$B(q) = b_1q^{-1} + \dots + b_{n_b}q^{-n_b} \qquad (B.58)$$
$$C(q) = 1 + c_1q^{-1} + \dots + c_{n_c}q^{-n_c}$$

and similarly for the closed-loop polynomials. Define $\zeta$ as

$$\zeta = \left[a_1, \dots, a_{n_a}, b_1, \dots, b_{n_b}, c_1, \dots, c_{n_c}\right]^T \qquad (B.59)$$

and $\theta$ equivalently. Furthermore, let the regulator polynomials be

$$X(q) = x_0 + x_1q^{-1} + \dots + x_{n_x}q^{-n_x}$$
$$Y(q) = 1 + y_1q^{-1} + \dots + y_{n_y}q^{-n_y} \qquad (B.60)$$

With these definitions the system of equations (138) can be written as

$$\Gamma\zeta = \tilde\theta \qquad (B.61)$$

where

$$\Gamma = \begin{bmatrix}\Gamma_Y & \Gamma_X & 0\\ 0 & \Gamma_Y & 0\\ 0 & 0 & \Gamma_Y\end{bmatrix}, \quad \Gamma_Y = \begin{bmatrix}1 & &\\ y_1 & 1 &\\ \vdots & \ddots & \ddots\end{bmatrix}, \quad \Gamma_X = \begin{bmatrix}x_0 & &\\ x_1 & x_0 &\\ \vdots & \ddots & \ddots\end{bmatrix} \qquad (B.62)$$

($\Gamma_Y$ and $\Gamma_X$ are lower triangular Toeplitz matrices of appropriate dimensions,
built from the coefficients of $Y$ and $X$), while

$$\tilde\theta = \left[a_{c1} - y_1, a_{c2} - y_2, \dots, b_{c1}, b_{c2}, \dots, c_{c1} - y_1, c_{c2} - y_2, \dots\right]^T \qquad (B.63)$$

Now, if the closed-loop parameters $\theta$ are estimated from a set of $N$ data we
have from Theorem A.4 that

$$\sqrt N(\hat\theta_N - \theta_0) \in \mathrm{AsN}(0, P_\theta) \qquad (B.64)$$
$$P_\theta = \lambda_0\left[\bar E\,\psi_c(t,\theta_0)\psi_c^T(t,\theta_0)\right]^{-1} \qquad (B.65)$$
$$\psi_c(t,\theta) = -\frac{d}{d\theta}\left[\frac{A_c(q)}{C_c(q)}\left(y(t) - \frac{B_c(q)}{A_c(q)}r(t)\right)\right] \qquad (B.66)$$
Consider equation (B.61). The Markov estimate of $\zeta$ is

$$\hat\zeta_N = \left[\Gamma^T(\mathrm{Cov}\,\hat\theta_N)^{-1}\Gamma\right]^{-1}\Gamma^T(\mathrm{Cov}\,\hat\theta_N)^{-1}\hat{\tilde\theta}_N \qquad (B.67)$$

where $\hat{\tilde\theta}_N$ is formed from $\hat\theta_N$ as in (B.63).
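In code, (B.67) is a single weighted least-squares solve; a minimal sketch, assuming numpy (Gamma, theta_tilde, and cov_theta are placeholders to be supplied by the first identification step):

```python
import numpy as np

def markov_estimate(Gamma, theta_tilde, cov_theta):
    """Weighted least squares (B.67): solve Gamma * zeta = theta_tilde with
    weight W = inv(cov_theta), i.e. the Markov (BLUE) estimate."""
    W = np.linalg.inv(cov_theta)
    lhs = Gamma.T @ W @ Gamma
    rhs = Gamma.T @ W @ theta_tilde
    return np.linalg.solve(lhs, rhs)
```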
from which it follows that

$$\sqrt N(\hat\zeta_N - \zeta_0) \in \mathrm{AsN}(0, P_\zeta) \qquad (B.68)$$
$$P_\zeta = N\left[\Gamma^T(\mathrm{Cov}\,\hat\theta_N)^{-1}\Gamma\right]^{-1} \qquad (B.69)$$

We also have that

$$\mathrm{Cov}\,\hat\theta_N = \frac{1}{N}P_\theta = \frac{\lambda_0}{N}\left[\bar E\,\psi_c(t,\theta_0)\psi_c^T(t,\theta_0)\right]^{-1} \qquad (B.70)$$

Hence

$$P_\zeta = \lambda_0\left[\bar E\,\Gamma^T\psi_c(t,\theta_0)\psi_c^T(t,\theta_0)\Gamma\right]^{-1} \qquad (B.71)$$

The gradient vector $\psi_c(t,\theta_0)$ is

$$\psi_c(t,\theta_0) = \frac{1}{C_{c0}(q)}\left[-y(t-1), \dots, -y(t-n_{a_c}),\ r(t-1), \dots, r(t-n_{b_c}),\ e(t-1), \dots, e(t-n_{c_c})\right]^T \qquad (B.72)$$

and since

$$\frac{Y(q)}{C_{c0}(q)} = \frac{1}{C_0(q)} \quad\text{and}\quad \frac{X(q)y(t) - Y(q)r(t)}{C_{c0}(q)} = -\frac{1}{C_0(q)}u(t) \qquad (B.73)$$

we have

$$\Gamma^T\psi_c(t,\theta_0) = \frac{1}{C_0(q)}\left[-y(t-1), \dots, -y(t-n_a),\ u(t-1), \dots, u(t-n_b),\ e(t-1), \dots, e(t-n_c)\right]^T \qquad (B.74)$$

It follows that

$$\Gamma^T\psi_c(t,\theta_0) = -\frac{d}{d\zeta}\left[\frac{A(q)}{C(q)}\left(y(t) - \frac{B(q)}{A(q)}u(t)\right)\right] \qquad (B.75)$$

where the right-hand side should be evaluated at $\zeta = \zeta_0$. But the expression on
the right is equal to the negative gradient of the open-loop prediction errors,
i.e. $\psi(t,\zeta)$, evaluated at $\zeta = \zeta_0$. Thus $\Gamma^T\psi_c(t,\theta_0) = \psi(t,\zeta_0)$ and

$$P_\zeta = \lambda_0\left[\bar E\,\psi(t,\zeta_0)\psi^T(t,\zeta_0)\right]^{-1} \qquad (B.76)$$

which together with (B.68) proves the corollary. $\Box$
B.6 Proof of Corollary 10

Let

$$B(q) = b_1q^{-1} + \dots + b_{n_b}q^{-n_b}$$
$$F(q) = 1 + f_1q^{-1} + \dots + f_{n_f}q^{-n_f} \qquad (B.77)$$
$$C(q) = 1 + c_1q^{-1} + \dots + c_{n_c}q^{-n_c}$$
$$D(q) = 1 + d_1q^{-1} + \dots + d_{n_d}q^{-n_d}$$

and similarly for the closed-loop polynomials, and let $X$, $Y$, $\Gamma_X$, and $\Gamma_Y$ be as
in the proof of Corollary 9. Then (142) can be written

$$\tilde\Gamma\rho = \tilde\rho \qquad (B.78)$$

where

$$\tilde\Gamma = \begin{bmatrix}\Gamma_Y & 0\\ \Gamma_X & \Gamma_Y\end{bmatrix}, \quad \tilde\rho = \left[b_{c1}, b_{c2}, \dots, f_{c1} - y_1, f_{c2} - y_2, \dots\right]^T \qquad (B.79)$$

Using the same arguments as in the proof of Corollary 9 we get

$$\sqrt N(\hat\rho_N - \rho_0) \in \mathrm{AsN}(0, P_\rho) \qquad (B.80)$$
$$P_\rho = N\left[\tilde\Gamma^T(\mathrm{Cov}\,\hat\rho_{cN})^{-1}\tilde\Gamma\right]^{-1} \qquad (B.81)$$

We also have that

$$\mathrm{Cov}\,\hat\rho_{cN} = \frac{\lambda_0}{N}\left[\bar E\,\psi_c(t,\rho_0)\psi_c^T(t,\rho_0)\right]^{-1} \qquad (B.82)$$

where $\psi_c$ is the negative gradient of the closed-loop prediction errors,

$$\varepsilon_c(t,\rho) = \frac{1}{H_c(q,\eta)}\left(y(t) - G_c(q,\rho)r(t)\right) \qquad (B.83)$$
$$= \frac{1}{H_c(q,\eta)}\left(y(t) - \frac{B_c(q)}{F_c(q)}r(t)\right) \qquad (B.84)$$

taken with respect to $\rho$, giving

$$\psi_c(t,\rho_0) = \frac{1}{H_{c0}(q)F_{c0}(q)}\left[r(t-1), \dots, r(t-n_{b_c}),\ -\frac{B_{c0}(q)}{F_{c0}(q)}\left(r(t-1), \dots, r(t-n_{f_c})\right)\right]^T \qquad (B.85)$$

Returning to $P_\rho$ we thus have that

$$P_\rho = \lambda_0\left[\bar E\,\tilde\Gamma^T\psi_c(t,\rho_0)\psi_c^T(t,\rho_0)\tilde\Gamma\right]^{-1} \qquad (B.86)$$

However, since

$$H_{c0} = S_0H_0, \quad \frac{F_{c0}Y - B_{c0}X}{F_{c0}^2} = \frac{S_0^2}{F_0}, \quad \frac{B_{c0}Y}{F_{c0}^2} = \frac{S_0^2}{F_0}\,\frac{B_0}{F_0} \qquad (B.87)$$

we get (after some calculations)

$$\tilde\Gamma^T\psi_c(t,\rho_0) = \frac{S_0(q)}{H_0(q)F_0(q)}\left[r(t-1), \dots, r(t-n_b),\ -\frac{B_0(q)}{F_0(q)}\left(r(t-1), \dots, r(t-n_f)\right)\right]^T \qquad (B.88)$$

so

$$\tilde\Gamma^T\psi_c(t,\rho_0) = -\frac{S_0(q)}{H_0(q)}\,\frac{d}{d\rho}\left(y(t) - \frac{B(q)}{F(q)}r(t)\right) \qquad (B.89)$$
$$= \frac{S_0(q)}{H_0(q)}\,\frac{d}{d\rho}\,G(q,\rho)r(t) \qquad (B.90)$$

where the derivatives on the right should be evaluated at $\rho = \rho_0$. Using
Parseval's formula we thus have that

$$P_\rho = \lambda_0\left[\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\Phi_u^r}{|H_0|^2}\,G_0'(G_0')^*\,d\omega\right]^{-1} \qquad (B.91)$$

and the corollary follows. $\Box$
References

[1] H. Akaike. Some problems in the application of the cross-spectral method. In B. Harris, editor, Spectral Analysis of Time Series, pages 81-107. John Wiley & Sons, 1967.

[2] B. D. O. Anderson and J. B. Moore. Optimal Filtering. Information and System Sciences Series. Prentice-Hall, 1979.

[3] B. D. O. Anderson and M. Gevers. Identifiability of linear stochastic systems operating under linear feedback. Automatica, 18(2):195-213, 1982.

[4] K. J. Åström. Matching criteria for control and identification. In Proceedings of the 2nd European Control Conference, pages 248-251, Groningen, The Netherlands, 1993.

[5] D. R. Brillinger. Time Series: Data Analysis and Theory. Holden-Day, 1981.

[6] R. de Callafon and P. Van den Hof. Multivariable closed-loop identification: From indirect identification to Dual-Youla parametrization. In Proceedings of the 35th Conference on Decision and Control, pages 1397-1402, Kobe, Japan, 1996.

[7] R. de Callafon, P. Van den Hof, and M. Steinbuch. Control relevant identification of a compact disc pick-up mechanism. In Proceedings of the 32nd Conference on Decision and Control, volume 3, pages 2050-2055, San Antonio, TX, 1993.

[8] E. T. van Donkelaar and P. M. J. Van den Hof. Analysis of closed-loop identification with a tailor-made parametrization. In Proceedings of the 4th European Control Conference, volume 4, Brussels, Belgium, 1997.

[9] B. Egardt. On the role of noise models for approximate closed loop identification. In Proceedings of the European Control Conference, Brussels, Belgium, 1997.

[10] U. Forssell. Properties and Usage of Closed-loop Identification Methods. Licentiate thesis LIU-TEK-LIC-1997:42, Department of Electrical Engineering, Linköping University, Linköping, Sweden, September 1997.

[11] U. Forssell and L. Ljung. A projection method for closed-loop identification. Technical Report LiTH-ISY-R-1984, Department of Electrical Engineering, Linköping University, Linköping, Sweden, 1997.

[12] M. Gevers. Towards a joint design of identification and control. In H. L. Trentelman and J. C. Willems, editors, Essays on Control: Perspectives in the Theory and its Applications, pages 111-151. Birkhäuser, 1993.

[13] M. Gevers and L. Ljung. Optimal experiment design with respect to the intended model application. Automatica, 22:543-554, 1986.

[14] A. Graham. Kronecker Products and Matrix Calculus: with Applications. Ellis Horwood Limited, 1981.

[15] I. Gustavsson, L. Ljung, and T. Söderström. Identification of processes in closed loop: Identifiability and accuracy aspects. Automatica, 13:59-75, 1977.

[16] F. R. Hansen. A fractional representation approach to closed-loop system identification and experiment design. PhD thesis, Stanford University, Stanford, CA, USA, 1989.

[17] F. R. Hansen, G. F. Franklin, and R. Kosut. Closed-loop identification via the fractional representation: Experiment design. In Proceedings of the American Control Conference, pages 1422-1427, Pittsburgh, PA, 1989.

[18] I. D. Landau and K. Boumaïza. An output error recursive algorithm for identification in closed loop. In Proceedings of the 13th IFAC World Congress, volume I, pages 215-220, San Francisco, CA, 1996.

[19] W. S. Lee, B. D. O. Anderson, I. M. Y. Mareels, and R. L. Kosut. On some key issues in the windsurfer approach to adaptive robust control. Automatica, 31(11):1619-1636, 1995.

[20] L. Ljung. Convergence analysis of parametric identification methods. IEEE Transactions on Automatic Control, 23(5):770-783, 1978.

[21] L. Ljung. Asymptotic variance expressions for identified black-box transfer function models. IEEE Transactions on Automatic Control, 30(9):834-844, 1985.

[22] L. Ljung. System Identification: Theory for the User. Prentice-Hall, 1987.

[23] L. Ljung and U. Forssell. An alternative motivation for the indirect approach to closed-loop identification. Accepted for publication in IEEE Transactions on Automatic Control, 1998.

[24] T. Söderström, I. Gustavsson, and L. Ljung. Identifiability conditions for linear systems operating in closed loop. International Journal of Control, 21(2):234-255, 1975.

[25] T. Söderström, L. Ljung, and I. Gustavsson. On the accuracy of identification and the design of identification experiments. Technical Report 7428, Department of Automatic Control, Lund Institute of Technology, Lund, Sweden, 1974.

[26] T. Söderström and P. Stoica. System Identification. Prentice-Hall International, 1989.

[27] T. Söderström, P. Stoica, and B. Friedlander. An indirect prediction error method for system identification. Automatica, 27:183-188, 1991.

[28] P. M. J. Van den Hof and R. J. P. Schrama. An indirect method for transfer function estimation from closed loop data. Automatica, 29(6):1523-1527, 1993.

[29] P. M. J. Van den Hof and R. J. P. Schrama. Identification and control: Closed-loop issues. Automatica, 31(12):1751-1770, 1995.

[30] P. M. J. Van den Hof, R. J. P. Schrama, R. A. de Callafon, and O. H. Bosgra. Identification of normalized coprime factors from closed-loop experimental data. European Journal of Control, 1(1):62-74, 1995.

[31] P. Van Overschee and B. De Moor. Subspace Identification for Linear Systems. Kluwer, 1996.

[32] M. Vidyasagar. Control System Synthesis: A Factorization Approach. MIT Press, 1985.

[33] Z. Zang, R. R. Bitmead, and M. Gevers. Iterative weighted least-squares identification and weighted LQG control design. Automatica, 31(11):1577-1594, 1995.

[34] Y.-C. Zhu. Black-box identification of MIMO transfer functions: Asymptotic properties of prediction error models. International Journal of Adaptive Control and Signal Processing, 3:357-373, 1989.