
Canonical Correlations Between Input and Output Processes of Linear Stochastic Models

Katrien De Cock and Bart De Moor¹
K.U.Leuven, Dept. of Electrical Engineering (ESAT-SCD)
Kasteelpark Arenberg 10, B-3001 Leuven (Heverlee), Belgium
Tel. +32/(0)16.32.17.09, Fax +32/(0)16.32.19.70
http://www.esat.kuleuven.ac.be/sista-cosic-docarch
[email protected], [email protected]
Abstract

In this paper, we obtain expressions for the principal angles between the row spaces of input and output data block Hankel matrices of a linear stochastic model in terms of the model parameters. The canonical correlations of the corresponding processes are equal to the limiting values of the cosines of the principal angles. From these parametric expressions, the relations between the different sets of canonical correlations can be easily deduced.
1 Introduction

Canonical correlation analysis (CCA) is a well-developed tool in statistical analysis that is used for measuring the linear relationship between two sets of random variables. It was developed by H. Hotelling [10]. Although a wide variety of applications exists in econometrics, biometrics, chemometrics, statistics, meteorology, etc., the technique has only recently been introduced in the communities of signal processing, system theory and identification, and neural networks [4, 14, 20]. In a classic paper by Gelfand and Yaglom [9], CCA is extended to stochastic processes and related to the notion of mutual information, a concept from information theory that is closely related to CCA and that was introduced by Shannon [18] in 1948. A slightly different interpretation in terms of channel capacity and information rate is given in [17]. Another area where CCA is applied is stochastic realization and identification of dynamical models [1, 3, 5, 11, 12, 15, 16, 21, 22]. The order of the model and a state sequence can be derived from the canonical correlations and the canonical variates of the past and the future output data.
¹ Katrien De Cock is a research assistant at the K.U.Leuven. Dr. Bart De Moor is a full professor at the K.U.Leuven. Our research is supported by grants from several funding agencies and sources: Research Council KUL: Concerted Research Action GOA-Mefisto 666 (Mathematical Engineering), IDO (IOTA Oncology, Genetic networks), several PhD, postdoc & fellow grants; Flemish Government: Fund for Scientific Research Flanders (several PhD and postdoc grants, projects G.0256.97 (subspace), G.0115.01 (bio-i and microarrays), G.0240.99 (multilinear algebra), G.0197.02 (power islands), G.0407.02 (support vector machines), research communities ICCoS, ANMMM), AWI (Bil. Int. Collaboration Hungary/Poland), IWT (Soft4s (softsensors), STWW-Genprom (gene promotor prediction), GBOU-McKnow (knowledge management algorithms), Eureka-Impact (MPC-control), Eureka-FLiTE (flutter modeling), several PhD grants); Belgian Federal Government: DWTC (IUAP IV-02 (1996-2001) and IUAP V-22 (2002-2006): Dynamical Systems and Control: Computation, Identification & Modelling), Program Sustainable Development PODO-II (CP/40: Sustainability effects of Traffic Management Systems); Direct contract research: Verhaert, Electrabel, Elia, Data4s, IPCOS.
In this paper we will work with the geometric interpretation of canonical correlation analysis, as is usually done in the subspace identification literature, see e.g. [22]. The canonical correlations and the canonical variates are respectively equal to the cosines of the principal angles between and the principal vectors in two linear subspaces. These subspaces are the row spaces of block Hankel matrices obtained by stacking the measured input and output sequences. In this way it is straightforward and computationally efficient to compute an approximation of the canonical correlations between two measured processes, of which, in practice, only a finite amount of data is available. Meanwhile, we are able to give expressions for the real canonical correlations, viz. the asymptotic values for infinite data.

The paper is organized as follows. In Section 2 we describe the models we will work with. The principal angles between two subspaces are defined in Section 3. In Section 4 we discuss the principal angles and canonical correlations between the past and future input and output spaces, respectively processes, of a linear stochastic model.
2 Model class

We describe in Section 2.1 the state space representation of the model class that we will work with throughout the paper. We also give the assumptions on the different stochastic processes involved. We define the controllability and observability matrices and Gramians of the model and the inverse model in Section 2.2. In Section 2.3 we introduce the past and future input and output block Hankel matrices.
2.1 State space representation

The forward innovation representation of a given stationary stochastic process {y(k)}_{k∈Z} with m components, i.e. y(k) ∈ R^m for all k, is the following:

    x(k+1) = A x(k) + K u(k),
    y(k)   = C x(k) + u(k).                                            (2.1)

The process {x(k)}_{k∈Z} ⊂ R^n is the state process associated to this model, where n is the model order, and A ∈ R^{n×n}, C ∈ R^{m×n} are the system and output matrices, respectively. The matrix K ∈ R^{n×m} is the Kalman gain. We will denote the model (2.1) by the triple (A, K, C). Its Markov parameters are denoted by the matrices H(k), k ≥ 0:

    H(0) = I_m,
    H(k) = C A^{k−1} K   for k > 0.                                    (2.2)
The model has the following properties. The input process {u(k)}_{k∈Z}, i.e. the innovation process of the stochastic process {y(k)}_{k∈Z}, is an m-component zero-mean, stationary, white stochastic process with full rank covariance matrix S_u ∈ R^{m×m}. Its autocovariance function R_u(τ) = E{u(k+τ) u(k)^T} is thus equal to R_u(τ) = S_u δ(τ), where δ(·) is the Kronecker delta: δ(0) = 1 and δ(τ) = 0 for τ ≠ 0. The state process {x(k)}_{k∈Z} is a zero-mean, stationary and ergodic stochastic process with covariance matrix E{x(k) x(k)^T} = Σ ∈ R^{n×n}, which satisfies the Lyapunov equation

    Σ = A Σ A^T + K S_u K^T.                                           (2.3)

Furthermore, the state x(k) is independent of the present and all future inputs. Consequently, E{u(k+τ)^T x(k)} = 0 for τ ≥ 0 and E{u(k+τ)^T y(k)} = 0 for τ > 0. The system in (2.1) is stable and strictly minimum phase. This means that all the poles and zeros of the model are less than one in modulus. The inverse model is then also stable and minimum phase. Its state space description is readily derived from (2.1):

    x(k+1) = (A − KC) x(k) + K y(k),
    u(k)   = −C x(k) + y(k).

The state space matrices of the inverse model are denoted by (A_z, K_z, C_z):

    (A_z, K_z, C_z) = (A − KC, K, −C).                                 (2.4)

The Markov parameters of the inverse model are denoted by H_z(k) and they are equal to

    H_z(0) = I_m,
    H_z(k) = −C (A − KC)^{k−1} K   for k > 0.                          (2.5)
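As an illustration of this inverse relation (a sketch added here, not part of the original text), the following Matlab lines simulate the forward innovation model (2.1) for a hypothetical example system and then filter the output through the inverse model; with both state recursions started in zero, the innovation sequence is recovered exactly. The numerical values of (A, K, C) are arbitrary choices, made only so that both A and A − KC have all eigenvalues inside the unit circle.

    % Minimal sketch (assumed example system, not from the paper)
    A = [0.5 0.1; 0 0.3];  K = [1; 0.5];  C = [1 0.2];   % A and A-KC are stable
    N = 200;
    u = randn(1,N);                        % white innovation process (S_u = 1)
    x = zeros(2,1);  y = zeros(1,N);
    for k = 1:N                            % forward innovation model (2.1)
        y(k) = C*x + u(k);
        x = A*x + K*u(k);
    end
    xz = zeros(2,1);  uhat = zeros(1,N);
    for k = 1:N                            % inverse model (A-KC, K, -C)
        uhat(k) = -C*xz + y(k);
        xz = (A - K*C)*xz + K*y(k);
    end
    disp(norm(u - uhat))                   % ~0: the innovations are recovered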
2.2 The controllability and observability matrices and Gramians

The controllability matrix C_i of the forward innovation model (2.1) is defined as

    C_i = [ K   AK   A²K   ···   A^{i−1}K ],

and its observability matrix Γ_i is

    Γ_i = [ C ; CA ; CA² ; … ; CA^{i−1} ].                             (2.6)

The controllability Gramian P of the forward innovation model (2.1) is defined as the solution of the controllability Lyapunov equation

    P = A P A^T + K K^T,                                               (2.7)

while the observability Gramian Q follows from the observability Lyapunov equation

    Q = A^T Q A + C^T C.                                               (2.8)

Since the model is stable and minimal, the matrices P and Q are the unique and positive definite solutions of the respective equations. The explicit solution for P is of the form

    P = ∑_{k=0}^{∞} A^k K K^T (A^k)^T = C_∞ C_∞^T,

where C_∞ is the infinite controllability matrix. Similarly, the observability Gramian Q can be obtained as

    Q = ∑_{k=0}^{∞} (A^k)^T C^T C A^k = Γ_∞^T Γ_∞,

where Γ_∞ is the infinite observability matrix of the model. We will also need the observability matrix of the inverse model, denoted by Γ_i^z:

    Γ_i^z = [ −C ; −C(A−KC) ; −C(A−KC)² ; … ; −C(A−KC)^{i−1} ],

where the subscript i in Γ_i^z denotes the number of block rows. The observability Gramian of the inverse model is denoted by Q_z and it is equal to

    Q_z = (Γ_∞^z)^T Γ_∞^z.                                             (2.9)

It is the solution of the observability Lyapunov equation for the inverse model

    Q_z = (A − KC)^T Q_z (A − KC) + C^T C.                             (2.10)
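The Gramians above are cheap to compute numerically. The following Matlab sketch (added here, not part of the original text, and using the same hypothetical example system as before) solves the Lyapunov equations (2.7), (2.8) and (2.10) by vectorization, so no toolbox function is needed, and checks P against a truncated version of the infinite sum.

    % Minimal sketch: Gramians of the example model via vectorized Lyapunov solves
    A = [0.5 0.1; 0 0.3];  K = [1; 0.5];  C = [1 0.2];
    n = size(A,1);
    dlyapv = @(F,W) reshape((eye(n^2) - kron(F,F)) \ W(:), n, n);  % solves X = F*X*F' + W
    P  = dlyapv(A, K*K');                  % controllability Gramian, (2.7)
    Q  = dlyapv(A', C'*C);                 % observability Gramian, (2.8)
    Qz = dlyapv((A-K*C)', C'*C);           % observability Gramian of the inverse model, (2.10)
    % check: P equals the (truncated) infinite sum of A^k K K' (A^k)'
    Ps = zeros(n);  Ak = eye(n);
    for k = 0:200
        Ps = Ps + Ak*(K*K')*Ak';  Ak = Ak*A;
    end
    disp(norm(P - Ps))                     % small (truncation error only)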
2.3 Data block Hankel matrices

We define the input and output block Hankel matrices U and Y. These matrices play an important role in the computation of the canonical correlations. The output block Hankel matrix Y is defined as

    Y = (1/√j) [ y(0)     y(1)     y(2)     ···   y(j−1)
                 y(1)     y(2)     y(3)     ···   y(j)
                 ⋮        ⋮        ⋮              ⋮
                 y(i−1)   y(i)     y(i+1)   ···   y(i+j−2)
                 y(i)     y(i+1)   y(i+2)   ···   y(i+j−1)
                 y(i+1)   y(i+2)   y(i+3)   ···   y(i+j)
                 ⋮        ⋮        ⋮              ⋮
                 y(2i−1)  y(2i)    y(2i+1)  ···   y(2i+j−2) ]          (2.11)

      = Y_{0|2i−1} = [ Y_{0|i−1} ; Y_{i|2i−1} ] = [ Y_p ; Y_f ] ∈ R^{2mi×j},   (2.12)

where

- The number of columns j is typically equal to K − 2i + 1, where K is the total number of data samples, which implies that all given data samples are used. For statistical reasons we will assume that j, K → ∞ throughout this paper.
- The subscripts of Y_{0|2i−1}, Y_{0|i−1}, Y_{i|2i−1} denote the subscripts of the first and last element of the first column in the block Hankel matrix. The subscript p stands for "past" and the subscript f for "future".
- The input block Hankel matrices U_{0|2i−1}, U_p, U_f are defined in a similar way.
We will also need the state sequence matrix, which is defined as

    X_i = (1/√j) [ x(i)   x(i+1)   ···   x(i+j−1) ],                   (2.13)

where the subscript i denotes the subscript of the first element of the state sequence. Analogously to the past inputs and outputs, we denote the past state sequence by X_p and the future state sequence by X_f: X_p = X_0 ∈ R^{n×j} and X_f = X_i ∈ R^{n×j}. The state space equations (2.1) can now be formulated in terms of data block Hankel matrices as follows:

    X_f = A^i X_p + Δ_i U_p,                                           (2.14)
    Y_p = Γ_i X_p + H_i U_p,                                           (2.15)
    Y_f = Γ_i X_f + H_i U_f,                                           (2.16)

where Δ_i ∈ R^{n×mi} is the reversed controllability matrix:

    Δ_i = [ A^{i−1}K   A^{i−2}K   ···   AK   K ],
the matrix Γ_i ∈ R^{mi×n} is the observability matrix of the model (see (2.6)) and the matrix H_i ∈ R^{mi×mi} is a block lower triangular and block Toeplitz matrix with the Markov parameters of the model (the impulse response sequence) as its elements:

    H_i = [ I_m         0           0           ···   0
            CK          I_m         0           ···   0
            CAK         CK          I_m         ···   0
            ⋮           ⋮           ⋮           ⋱     ⋮
            CA^{i−2}K   CA^{i−3}K   CA^{i−4}K   ···   I_m ].           (2.17)

From (2.15) or (2.16) it immediately follows that the observability matrix of the inverse model (defined in Section 2.2) is equal to

    Γ_i^z = −H_i^{−1} Γ_i.                                             (2.18)
Note that the input covariance matrix lim_{j→∞} U_p U_p^T = lim_{j→∞} U_f U_f^T, which will be denoted by Q_i^u, is a block diagonal matrix with diagonal blocks all equal to S_u. By using the state sequence matrices, we can write the state covariance matrix as Σ = lim_{j→∞} X_p X_p^T = lim_{j→∞} X_f X_f^T. The fact that the states are uncorrelated with the present and future inputs and that the output is uncorrelated with the future inputs translates to

    lim_{j→∞} X_p U^T = 0,   lim_{j→∞} X_f U_f^T = 0,   and   lim_{j→∞} Y_p U_f^T = 0,   (2.19)

respectively.
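The block Hankel matrices of this section are easy to form from measured sequences. The following helper (a sketch added here, not part of the original text; the name blkhank and its interface are ours) builds a block Hankel matrix with i block rows and j columns from an m × N data sequence, including the 1/√j scaling of (2.11). It is assumed to be saved as blkhank.m.

    function H = blkhank(w, i, j)
    % w : m x N data sequence; H : (m*i) x j block Hankel matrix, scaled by 1/sqrt(j)
    m = size(w,1);
    H = zeros(m*i, j);
    for r = 1:i
        H((r-1)*m+1:r*m, :) = w(:, r:r+j-1);   % block row r starts at sample r
    end
    H = H / sqrt(j);

    % Example of the past/future split of (2.12), for sequences u and y of length N:
    %   i = 10;  j = N - 2*i + 1;
    %   Yp = blkhank(y, i, j);               Up = blkhank(u, i, j);
    %   Yf = blkhank(y(:, i+1:end), i, j);   Uf = blkhank(u(:, i+1:end), i, j);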
3 Principal angles between and principal directions in subspaces

The concept of principal angles between subspaces of linear vector spaces is due to Jordan [13] in the 19th century. In the area of systems and control, the principal angles between and the principal directions in two subspaces are used in subspace identification methods [22] and also in model updating [7] and damage location [8]. In the latter two applications, one starts from a finite element model and measurements of a certain mechanical structure and one tries to find the subset of parameters of the model that should be adapted to explain the measurements, which is done by computing the principal angles between a certain measurement space and the parameterized space. In that way, damage to the structure can be located. The subspace-based fault detection algorithm of Basseville et al. [2], on the other hand, is based on linear dynamical models, the type of models that we deal with. Changes in the eigenmodes of the observed system are determined by monitoring the difference between the column spaces of the observability matrix of the nominal linear dynamical model and the observability matrix of the model that can be identified from the measurements. The difference between the column spaces can be quantified by the principal angles between the subspaces.

The principal angles between and principal directions in two subspaces S_1 and S_2 are defined as follows.
Definition 3.1. The principal angles between and principal directions in two subspaces
Let S_1 and S_2 be subspaces of dimension p and q, respectively, where p ≤ q. Then the p principal angles between S_1 and S_2, denoted by θ_1, …, θ_p, and the corresponding principal directions u_i ∈ S_1 and v_i ∈ S_2 are recursively defined as

    cos θ_1 = max_{u∈S_1} max_{v∈S_2} |u^T v| = u_1^T v_1,
    cos θ_k = max_{u∈S_1} max_{v∈S_2} |u^T v| = u_k^T v_k   (k = 2, …, p),

subject to ‖u‖ = ‖v‖ = 1, and for k > 1: u^T u_i = 0 and v^T v_i = 0, where i = 1, …, k−1.
Let A ∈ R^{p×n} be of rank r_a and B ∈ R^{q×n} of rank r_b, where r_a ≤ r_b. Then the ordered set of r_a principal angles between the row spaces of A and B is denoted by

    (θ_1, θ_2, …, θ_{r_a}) = [A ∠ B].

Assume that the matrices A and B are of full row rank and that p ≤ q. Then the squared cosines of the principal angles between their row spaces can be computed as the eigenvalues of (AA^T)^{−1} A B^T (BB^T)^{−1} B A^T:

    cos² [A ∠ B] = λ( (AA^T)^{−1} A B^T (BB^T)^{−1} B A^T ),           (3.20)

where λ(·) denotes the set of eigenvalues.
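For small matrices, (3.20) can be evaluated directly. The following lines are a sketch added here (not part of the original text) with hypothetical example data, to make the formula concrete before the large-scale algorithm below is introduced.

    % Direct evaluation of (3.20) for small full-row-rank A (p x n) and B (q x n), p <= q
    A = [1 0 0 1; 0 1 1 0];                % hypothetical example data
    B = [1 1 0 0; 0 0 1 -1; 1 0 1 0];
    M = (A*A') \ (A*B') / (B*B') * (B*A');             % (AA')^{-1} A B' (BB')^{-1} B A'
    cos2 = sort(real(eig(M)), 'descend')   % squared cosines of the principal angles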
Since we will have to compute the principal angles between subspaces of R^n, where n is a large number, e.g. 10 000, it is useful to have an efficient algorithm. We present here an algorithm that is based on the LQ decomposition, which is first defined.

Definition 3.2. The LQ factorization of a matrix
The LQ factorization of a real m × n matrix A is given by

    A = L Q^T,

where Q ∈ R^{n×n} is orthogonal and L ∈ R^{m×n} is lower triangular.

Note that the LQ decomposition of a matrix A boils down to the QR decomposition of A^T, which is the numerical version of the Gram-Schmidt orthogonalization (see e.g. [19]).
It can be shown (see e.g. [6]) that the principal angles between two full row rank matrices A ∈ R^{p×n} and B ∈ R^{q×n}, where p ≤ q and p + q ≤ n, can be computed as follows.

1. Compute the triangular part of the LQ factorization of the matrix [A ; B]. The triangular part is denoted by

       [ L_11   0
         L_21   L_22 ] ∈ R^{(p+q)×(p+q)},

   where L_11 ∈ R^{p×p}, L_21 ∈ R^{q×p} and L_22 ∈ R^{q×q}.

2. Compute the triangular part of the LQ factorization of [L_21  L_22]:

       [ L_21   L_22 ] = [ S   0 ] T.

   The resulting lower triangular matrix S ∈ R^{q×q} is non-singular.

3. The cosines of the principal angles between row(A) and row(B) are the singular values of the matrix S^{−1} L_21.

The above computational scheme leads to a very simple Matlab program, which is given in Table 1.
function cosines = cosines_lq(A,B)
% Principal angle cosines between row(A) and row(B);
% A (p x n) and B (q x n) of full row rank, p <= q, p+q <= n.
p = size(A,1);
q = size(B,1);
% Step 1: triangular part of the LQ factorization of [A;B],
% obtained as the transposed R factor of a QR factorization of [A;B]'
R = triu(qr([A;B]'));
L = R(1:p+q,1:p+q)';
L21 = L(p+1:p+q,1:p);
L22 = L(p+1:p+q,p+1:p+q);
% Step 2: triangular part of the LQ factorization of [L21 L22]
R = triu(qr([L21 L22]'));
S = R(1:q,1:q)';
% Step 3: singular values of S^(-1)*L21
cosines = svd(S\L21);

Table 1: The Matlab program cosines_lq.m for the computation of the principal angles between the row spaces of the matrices A and B.
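As a quick sanity check (a usage sketch added here, not part of the original text; it assumes cosines_lq.m from Table 1 is on the path), the cosines returned by cosines_lq can be compared with the textbook computation via orthonormal bases of the two row spaces.

    % Hypothetical usage check of cosines_lq
    rng(0);                                % reproducible random test matrices
    p = 3;  q = 5;  n = 200;
    A = randn(p,n);  B = randn(q,n);
    c1 = cosines_lq(A,B);
    Qa = orth(A');  Qb = orth(B');         % orthonormal bases of row(A) and row(B)
    c2 = svd(Qa'*Qb);                      % reference: cosines of the principal angles
    disp(norm(c1 - c2))                    % of the order of machine precision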
4 Principal angles and canonical correlations of input and output

In this section we compute the principal angles between the past and future input and output spaces and the canonical correlations of the corresponding processes. In Section 4.1 we first describe the future and the past of a stochastic process. We show how the canonical correlations of the processes will be computed and indicate how they are related. In Section 4.2 we derive expressions for the principal angles between different combinations of past and future input and output spaces, i.e. the row spaces of the mi × j data block Hankel matrices, where we assume j → ∞. The expressions are in terms of the system matrices (A, K, C) and the input covariance matrix S_u. As we will see, the principal angles converge for i → ∞. The cosines of the limiting principal angles are the canonical correlations of the corresponding processes. In Section 4.4 the relations between the different canonical correlations are derived.
4.1 Introduction

4.1.1 Past and future input and output processes of a linear model

Let {u(k)}_{k∈Z} and {y(k)}_{k∈Z} denote the input and output process of a linear stochastic model in forward innovation form. We assume that the processes are zero-mean, stationary and ergodic. The past output process is defined as

    y_p = {y(k) : k < 0},                                              (4.21)

and the future output process is

    y_f = {y(k) : k ≥ 0}.                                              (4.22)

Analogous definitions hold for the past and the future input process, u_p and u_f, respectively.
4.1.2 The canonical correlations of the past and future input and output processes

The canonical correlations of the past and future input and output processes are defined as the canonical correlations of the corresponding random variables U_{−1|−∞}, U_{0|∞}, Y_{−1|−∞} and Y_{0|∞}, where

    Y_{−1|−∞} = [ y(−1) ; y(−2) ; … ]   and   Y_{0|∞} = [ y(0) ; y(1) ; … ],

and analogously for U_{−1|−∞} and U_{0|∞}. For example, the canonical correlations of the past and future of the output process are equal to

    cc(y_p, y_f) = cc(Y_{−1|−∞}, Y_{0|∞}).                             (4.23)

Due to the stationarity and ergodicity of the processes, the canonical correlations are equal to the cosines of the principal angles between the row spaces of the doubly infinite block Hankel matrices (see (2.11) and (2.12)):

    cc(y_p, y_f) = lim_{j→∞} cos [ Y_{−1|−∞} ∠ Y_{0|∞} ].

We can already treat three trivial cases:
1. Due to the independence of the past and future input processes, the canonical correlations of u_p and u_f are all equal to 0:

       cc(u_p, u_f) = 0, 0, …

2. The future input is also independent of the past output process. Consequently, their canonical correlations are all equal to 0:

       cc(y_p, u_f) = 0, 0, …

3. The output at a certain time step k is a linear combination of the present input and all past inputs:

       y(k) = ∑_{i=1}^{∞} C A^{i−1} K u(k−i) + u(k) = ∑_{i=0}^{∞} H(i) u(k−i),

   where H(i) is the i-th Markov parameter of the linear model (see (2.2)). Consequently, all random variables in Y_{−1|−∞} can be obtained as linear combinations of the random variables in U_{−1|−∞}. Otherwise formulated, the row space of Y_{−1|−∞} is contained in the row space of U_{−1|−∞}:

       row(Y_{−1|−∞}) ⊂ row(U_{−1|−∞}).                                (4.24)

   The canonical correlations of the past input and past output processes are consequently all equal to 1:

       cc(u_p, y_p) = 1, 1, …                                          (4.25)

   Moreover, by applying the same reasoning to the inverse model, we obtain u(k) as a linear combination of the present and past outputs:

       u(k) = −∑_{i=1}^{∞} C (A−KC)^{i−1} K y(k−i) + y(k) = ∑_{i=0}^{∞} H_z(i) y(k−i),

   where H_z(i) are the Markov parameters of the inverse model (see (2.5)). This leads to

       row(U_{−1|−∞}) ⊂ row(Y_{−1|−∞}).                                (4.26)

   It follows from (4.24) and (4.26) that the past input and the past output process span the same space:

       row(Y_{−1|−∞}) = row(U_{−1|−∞}).                                (4.27)
The canonical correlations of the other combinations of past and future input and output processes can be obtained as the following limits:

    cc(u_f, y_f) = lim_{i→∞} cc(U_{0|i−1}, Y_{0|i−1}),                 (4.28a)
    cc(u_p, y_f) = lim_{i→∞} cc(U_{−i|−1}, Y_{0|i−1}),                 (4.28b)
    cc(y_p, y_f) = lim_{i→∞} cc(Y_{−i|−1}, Y_{0|i−1}),                 (4.28c)
where

    Y_{−i|−1} = [ y(−i) ; y(−i+1) ; … ; y(−1) ]   and   Y_{0|i−1} = [ y(0) ; y(1) ; … ; y(i−1) ],

and analogously for the input random variables. The parameter i describes how far we go back into the past (k = −i) and forward into the future (k = i−1), where the present is at k = 0. Since the processes are stationary, we can as well take the present at time instant k = i, the past from k = 0 to k = i−1 and the future from k = i to k = 2i−1. This is only a convention that allows us to estimate the canonical correlations from measured data sequences. The canonical correlations of the past and future output processes, e.g., can then be computed as

    cc(y_p, y_f) = lim_{i→∞} cc(Y_{0|i−1}, Y_{i|2i−1}).

Due to the stationarity and ergodicity of the processes, the canonical correlations of Y_{0|i−1} and Y_{i|2i−1} can be obtained as the cosines of the principal angles between the row spaces of the mi × j data block Hankel matrices Y_{0|i−1} and Y_{i|2i−1}, provided j → ∞. These block Hankel matrices are equal to the past and future output block Hankel matrices Y_p and Y_f, which are defined in (2.12). Consequently, we can compute the canonical correlations of the combinations of past and future processes in (4.28a)-(4.28c) as follows:
    cc(u_f, y_f) = lim_{i→∞} lim_{j→∞} cos [U_f ∠ Y_f],                (4.29a)
    cc(u_p, y_f) = lim_{i→∞} lim_{j→∞} cos [U_p ∠ Y_f],                (4.29b)
    cc(y_p, y_f) = lim_{i→∞} lim_{j→∞} cos [Y_p ∠ Y_f].                (4.29c)

This explains why we denote the first i block rows of the output block Hankel matrix by Y_p and the following i block rows by Y_f (see (2.12)), and similarly for U_p and U_f.
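To make this convention concrete, the following sketch (added here, not part of the original text; it reuses the hypothetical blkhank helper of Section 2.3 and the cosines_lq program of Table 1, as well as the example model used in the earlier sketches) simulates an output sequence, forms Y_p = Y_{0|i−1} and Y_f = Y_{i|2i−1}, and estimates cc(y_p, y_f) from the principal angles. For moderate i and large j the estimates are close to the parametric values given in Section 4.3.

    % Sketch: sample estimate of cc(y_p, y_f) from a simulated output sequence
    A = [0.5 0.1; 0 0.3];  K = [1; 0.5];  C = [1 0.2];   % hypothetical example model, S_u = 1
    N = 20000;  u = randn(1,N);  x = zeros(2,1);  y = zeros(1,N);
    for k = 1:N, y(k) = C*x + u(k);  x = A*x + K*u(k);  end
    i = 10;  j = N - 2*i + 1;
    Yp = blkhank(y, i, j);                 % past output block Hankel matrix
    Yf = blkhank(y(:, i+1:end), i, j);     % future output block Hankel matrix
    cc_hat = cosines_lq(Yp, Yf);           % estimated canonical correlations of y_p and y_f
    cc_hat(1:3)                            % only the first n = 2 values are clearly nonzero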
4.1.3 Overview of the relations between the canonical correlations

From Equation (4.27) we can already deduce that the canonical correlations of the past input and future output are equal to the canonical correlations of the past and future output:

    cc(u_p, y_f) = cc(y_p, y_f).

The parametric expressions that we derive in Section 4.2 will also reveal that the canonical correlations of the future input and future output are related to the canonical correlations of the past and future output in the following way:

    cc²(u_f, y_f) = 1 − cc²(y_p, y_f).

The corresponding principal angles are complementary:

    [U_f ∠ Y_f] = π/2 − [Y_p ∠ Y_f]   for i, j → ∞.

An overview of the relations of the canonical correlations of the different combinations of processes is given in Table 2. The canonical correlations of the past and the future output are denoted by ρ_k in this table.
           u_p      u_f             y_p      y_f
    u_p    1        0               1        ρ_k
    u_f    0        1               0        √(1 − ρ_k²)
    y_p    1        0               1        ρ_k
    y_f    ρ_k      √(1 − ρ_k²)     ρ_k      1

Table 2: Overview of the relations between the different sets of canonical correlations.
4.2 The principal angles between the input and output spaces

Based on the state space equations (2.14)-(2.16), the properties in (2.19) and (3.20), the following expressions are derived for the principal angles between the past and future input and output spaces. From these expressions, the canonical correlations of the corresponding processes are deduced in Section 4.3. We only give the results. The computations can be found in [6].

The principal angles between row(U_f) and row(Y_f)

The squared cosines of the largest n principal angles between row(U_f) and row(Y_f) for j → ∞ and finite i are the eigenvalues of (I_n + Σ G_i^z)^{−1}, where G_i^z is equal to

    G_i^z = (Γ_i^z)^T (Q_i^u)^{−1} Γ_i^z = ∑_{k=0}^{i−1} ((A−KC)^k)^T C^T S_u^{−1} C (A−KC)^k.     (4.30)

The other mi − n principal angles are equal to zero:

    lim_{j→∞} cos² [U_f ∠ Y_f] = { λ( (I_n + Σ G_i^z)^{−1} ), 1, …, 1 },                           (4.31)

with mi − n trailing ones.
Remark 4.1. G_i^z as the solution of a Lyapunov equation
If the state space matrices (A, K, C) and the input covariance matrix S_u are known, then the matrix G_i^z can be computed by making the sum in (4.30). However, G_i^z is also the solution of the following Lyapunov equation:

    G_i^z = (A−KC)^T G_i^z (A−KC) + C^T S_u^{−1} C − ((A−KC)^i)^T C^T S_u^{−1} C (A−KC)^i.         (4.32)
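A small numerical illustration of Remark 4.1 (a sketch added here, not part of the original text, with the same hypothetical example system as before): G_i^z is computed once by the finite sum (4.30) and once from the Lyapunov-type equation (4.32), solved by vectorization.

    % Sketch: G_i^z from the sum (4.30) and from the Lyapunov equation (4.32)
    A = [0.5 0.1; 0 0.3];  K = [1; 0.5];  C = [1 0.2];  Su = 1;  n = 2;  i = 15;
    Az = A - K*C;
    G1 = zeros(n);  F = eye(n);
    for k = 0:i-1
        G1 = G1 + F' * (C'/Su*C) * F;      % adds ((A-KC)^k)' C' Su^{-1} C (A-KC)^k
        F  = Az*F;                         % F = (A-KC)^(k+1) for the next term
    end
    W  = C'/Su*C - (Az^i)'*(C'/Su*C)*(Az^i);            % right-hand side constant of (4.32)
    G2 = reshape((eye(n^2) - kron(Az',Az')) \ W(:), n, n);
    disp(norm(G1 - G2))                    % ~0: both computations agree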
The principal angles between row(U_p) and row(Y_f)

The squared cosines of the smallest n principal angles between row(U_p) and row(Y_f) for j → ∞ and finite i are the eigenvalues of D_i ((G_i^z)^{−1} + Σ)^{−1}, where

    D_i = Δ_i Q_i^u Δ_i^T = ∑_{k=0}^{i−1} A^k K S_u K^T (A^k)^T.

The other mi − n principal angles are equal to π/2:

    lim_{j→∞} cos² [U_p ∠ Y_f] = { λ( D_i ((G_i^z)^{−1} + Σ)^{−1} ), 0, …, 0 },                    (4.33)

with mi − n trailing zeros.
The principal angles between row(Y_p) and row(Y_f)

The squared cosines of the smallest n principal angles between row(Y_p) and row(Y_f) for j → ∞ can be computed as the eigenvalues of

    [ −A^i Σ R_i^T + D_i − R_i T_i R_i^T + A^i Σ G_i^z ( Σ (A^i)^T + T_i R_i^T )
        + ( −R_i − A^i Σ G_i^z T_i G_i^z + R_i T_i G_i^z ) Σ (A^i)^T ] ( (G_i^z)^{−1} + Σ )^{−1},  (4.34)

where

    T_i = (Σ^{−1} + G_i^z)^{−1},
    R_i = Δ_i Γ_i^z = −∑_{k=0}^{i−1} A^{i−1−k} K C (A−KC)^k.

The other mi − n angles are equal to π/2.
4.3 The canonical correlations of the input and output processes

The canonical correlations of u_f and y_f

The smallest n canonical correlations of u_f and y_f are the square roots of the eigenvalues of (I_n + Σ G_z)^{−1}, where Σ is the state covariance matrix, which can be found by solving the Lyapunov equation Σ = A Σ A^T + K S_u K^T, and G_z = lim_{i→∞} G_i^z is the solution of the Lyapunov equation

    G_z = (A−KC)^T G_z (A−KC) + C^T S_u^{−1} C.                        (4.35)

The other canonical correlations are equal to 1:

    cc²(u_f, y_f) = { λ( (I_n + Σ G_z)^{−1} ), 1, 1, 1, … }.

The canonical correlations of y_p and y_f / u_p and y_f

The largest n canonical correlations of the past and the future output (and also of the past input and future output) are the square roots of the eigenvalues of Σ (G_z^{−1} + Σ)^{−1}. The other canonical correlations are equal to 0:

    cc²(y_p, y_f) = { λ( Σ (G_z^{−1} + Σ)^{−1} ), 0, 0, 0, … }.
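These two parametric expressions are easy to evaluate. The following sketch (added here, not part of the original text, again for the hypothetical example system with S_u = 1) computes both sets of canonical correlations and numerically checks the complementarity relation discussed in Section 4.4.

    % Sketch: parametric canonical correlations of the example model (S_u = 1)
    A = [0.5 0.1; 0 0.3];  K = [1; 0.5];  C = [1 0.2];  Su = 1;  n = 2;
    lyapv = @(F,W) reshape((eye(n^2) - kron(F,F)) \ W(:), n, n);      % solves X = F*X*F' + W
    Sigma = lyapv(A, K*Su*K');                       % state covariance, (2.3)
    Gz    = lyapv((A-K*C)', C'/Su*C);                % limit of G_i^z, (4.35)
    ccf2 = sort(real(eig(inv(eye(n) + Sigma*Gz))));           % squared cc's of u_f and y_f
    ccp2 = sort(real(eig(Sigma/(inv(Gz) + Sigma))),'descend');% squared cc's of y_p and y_f
    disp([sqrt(ccf2) sqrt(ccp2)])          % the two sets of canonical correlations
    disp(ccf2 + ccp2)                      % complementarity (Section 4.4): all ones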
4.4 Relation of the canonical correlations between the different processes

The canonical correlations of the different pairs of processes (or the principal angles between the pairs of subspaces) are closely related, as we have already indicated in Table 2. Here, we show that the canonical correlations of the future input and output are complementary³ to the canonical correlations of the past and future output (or past input and future output). The relation is straightforwardly proven by means of the matrices given in Section 4.3.

³ Two canonical correlations ρ_1 and ρ_2 are complementary if ρ_1² = 1 − ρ_2². For the corresponding principal angles θ_1 and θ_2 it holds that θ_1 = π/2 − θ_2.

Property 4.1. Complementarity of cc(u_f, y_f) and cc(y_p, y_f)
The canonical correlations of u_f and y_f are complementary to the canonical correlations of y_p and y_f (u_p and y_f).
Proof.
The smallest n squared canonical correlations of u_f and y_f are the eigenvalues of (I_n + Σ G_z)^{−1} = I_n − Σ (G_z^{−1} + Σ)^{−1}, and the other canonical correlations are equal to 1. The eigenvalues of I_n − Σ (G_z^{−1} + Σ)^{−1} are equal to one minus the eigenvalues of Σ (G_z^{−1} + Σ)^{−1}. Since the eigenvalues of Σ (G_z^{−1} + Σ)^{−1} are the largest n squared canonical correlations of y_p and y_f (u_p and y_f) and the other canonical correlations are equal to 0, we have proven that the canonical correlations of u_f and y_f are complementary to the canonical correlations of y_p and y_f (u_p and y_f).
Remark 4.2. Simplifications for single-input single-output (SISO) models
For SISO models, the expressions for the canonical correlations can be simplified. By comparing (4.35) with (2.10), we see that for SISO models the matrix G_z is equal to (1/σ²) Q_z, where σ² is the variance of the input process and Q_z is the observability Gramian of the inverse model. Similarly, a comparison of (2.3) and (2.7) shows that the state covariance matrix of a SISO model is equal to σ² P, where P is the controllability Gramian of the model. We consequently obtain the following expressions for the canonical correlations of the input and output processes of a SISO model:

    cc²(u_f, y_f) = { λ( (I_n + Q_z P)^{−1} ), 1, 1, 1, … },
    cc²(y_p, y_f) = { λ( P (Q_z^{−1} + P)^{−1} ), 0, 0, 0, … }.
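The two identities used in this remark are easy to verify numerically (a sketch added here, not part of the original text, for the SISO example system with an assumed input variance σ² = 4):

    % Sketch: check G_z = Q_z / sigma^2 and Sigma = sigma^2 * P for a SISO example
    A = [0.5 0.1; 0 0.3];  K = [1; 0.5];  C = [1 0.2];  s2 = 4;  n = 2;
    lyapv = @(F,W) reshape((eye(n^2) - kron(F,F)) \ W(:), n, n);
    P     = lyapv(A, K*K');                % controllability Gramian, (2.7)
    Qz    = lyapv((A-K*C)', C'*C);         % observability Gramian of the inverse model, (2.10)
    Sigma = lyapv(A, K*s2*K');             % state covariance, (2.3) with S_u = sigma^2
    Gz    = lyapv((A-K*C)', C'*C/s2);      % (4.35) with S_u = sigma^2
    disp([norm(Gz - Qz/s2), norm(Sigma - s2*P)])   % both ~0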
5 Conclusions

In this paper we have given expressions for the canonical correlations of the different past and future input and output processes of a linear stochastic model, in terms of the model parameters.
References

[1] H. Akaike, "Stochastic theory of minimal realization", IEEE Transactions on Automatic Control 19, 667-674, 1974.
[2] M. Basseville, M. Abdelghani, and A. Benveniste, "Subspace-based fault detection algorithms for vibration monitoring", Automatica 36, 101-109, 2000.
[3] D. Bauer, "Order estimation for subspace methods", Automatica 37, 1561-1573, 2001.
[4] M. Borga, Learning Multidimensional Signal Processing, PhD thesis, Linköping University, Linköping, Sweden, 1998. Available on http://people.imt.liu.se/~magnus.
[5] P. E. Caines, Linear Stochastic Systems, Wiley, New York, 1988.
[6] K. De Cock, Principal Angles in System Theory, Information Theory and Signal Processing, PhD thesis, Faculty of Applied Sciences, K.U.Leuven, Leuven, Belgium, 2002. Available on ftp.esat.kuleuven.ac.be/pub/sista/decock/reports/ as file phd.ps.gz.
[7] M. I. Friswell, J. E. Mottershead, and H. Ahmadian, "Combining subset selection and parameter constraints in model updating", Journal of Vibration and Acoustics 120, 854-859, 1998.
[8] M. I. Friswell, J. E. T. Penny, and S. D. Garvey, "Parameter subset selection in damage location", Inverse Problems in Engineering 5, 189-215, 1997. Available on http://www.swan.ac.uk/mecheng/staff/mfriswell/PDF Files/ as file J33.pdf.
[9] I. M. Gelfand and A. M. Yaglom, "Calculation of the amount of information about a random function contained in another such function", American Mathematical Society Translations, Series 2, 12, 199-236, 1959.
[10] H. Hotelling, "Relations between two sets of variates", Biometrika 28, 321-372, 1936.
[11] N. P. Jewell and P. Bloomfield, "Canonical correlations of past and future for time series: definitions and theory", The Annals of Statistics 11, no. 3, 837-847, 1983.
[12] N. P. Jewell, P. Bloomfield, and F. C. Bartmann, "Canonical correlations of past and future for time series: bounds and computation", The Annals of Statistics 11, no. 3, 848-855, 1983.
[13] C. Jordan, "Essai sur la géométrie à n dimensions", Bulletin de la Société Mathématique 3, 103-174, 1875.
[14] J. Kay, "Feature discovery under contextual supervision using mutual information", in Proceedings of the 1992 International Joint Conference on Neural Networks, volume 4, pages 79-84, Baltimore, 1992.
[15] W. E. Larimore, "Statistical optimality and canonical variate analysis system identification", Signal Processing 52, no. 2, 131-144, July 1996.
[16] M. Pavon, "Canonical correlations of past inputs and future outputs for linear stochastic systems", Systems & Control Letters 4, no. 4, 209-215, June 1984.
[17] L. L. Scharf and C. T. Mullis, "Canonical coordinates and the geometry of inference, rate, and capacity", IEEE Transactions on Signal Processing 48, no. 3, 824-831, March 2000. Available on http://schof.Colorado.EDU/~scharf/scharf.html.
[18] C. E. Shannon, "A mathematical theory of communication", Bell System Technical Journal 27, 379-423 and 623-656, July and October 1948. Available on http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html.
[19] L. N. Trefethen and D. Bau, Numerical Linear Algebra, SIAM, Philadelphia, 1997.
[20] T. Van Gestel, J. A. K. Suykens, J. De Brabanter, B. De Moor, and J. Vandewalle, "Kernel canonical correlation analysis and least squares support vector machines", in Proceedings of the International Conference on Artificial Neural Networks (ICANN 2001), Vienna, Austria, August 2001. Available on ftp://ftp.esat.kuleuven.ac.be/pub/SISTA/suykens/reports as file lssvm 01 24.ps.gz.
[21] P. Van Overschee and B. De Moor, "Subspace algorithms for the stochastic identification problem", Automatica 29, 649-660, 1993. Available on ftp://ftp.esat.kuleuven.ac.be/pub/SISTA/vanoverschee/reports as file stoch auto2.ps.Z.
[22] P. Van Overschee and B. De Moor, Subspace Identification for Linear Systems: Theory - Implementation - Applications, Kluwer Academic Publishers, Boston, 1996.