Estimation II
Ian Reid
Hilary Term, 2001
We ended the first part of this course deriving the Discrete-Time Kalman Filter as a recursive Bayes
estimator. In this lecture we will go into the filter in more detail, and provide a new derivation for
the Kalman filter, this time based on the idea of Linear Minimum Variance (LMV) estimation of
discrete-time systems.
1.1 Background
The problem we are seeking to solve is the continual estimation of a set of parameters whose values change over time. Updating is achieved by combining a set of observations or measurements $z(t)$ which contain information about the signal of interest $x(t)$. The role of the estimator is to provide an estimate $\hat{x}(t+\tau)$ at some time $t+\tau$. If $\tau > 0$ we have a prediction filter, if $\tau < 0$ a smoothing filter, and if $\tau = 0$ the operation is simply called filtering.
Recall that an estimator is said to be unbiased if the expectation of its output is the expectation of the quantity being estimated, $E[\hat{x}] = E[x]$.
Also recall that a minimum variance unbiased estimator (MVUE) is an estimator which is unbiased and minimises the mean square error:
$\hat{x} = \arg\min_{\hat{x}} E[\,||\hat{x} - x||^2 \mid z\,] = E[x \mid z]$
Notation
The following notation will be used.
$z_k$          observation (measurement) made at time $k$
$Z^k$          the set of all observations up to and including time $k$
$x_k$          state of the system at time $k$
$\hat{x}_{k|i}$    estimate of the state at time $k$ given the observations up to time $i$
$\tilde{x}_{k|k}$   estimation error, $\hat{x}_{k|k} - x_k$ (tilde notation)
$P_k$          covariance matrix
$F_k$          state transition matrix
$G_k$          input (control) transition matrix
$H_k$          observation matrix
$w_k$          process noise
$v_k$          measurement noise
$Q_k$          process noise covariance matrix
$R_k$          measurement noise covariance matrix
$K_k$          Kalman gain matrix
$\nu_k$        innovation at time $k$
$S_k$          innovation covariance at time $k$
1.2 System and observation model
We assume that the system can be modelled by the linear, discrete-time state transition equation
$x_{k+1} = F_k x_k + G_k u_k + w_k$  (1)
where xk is the state at time k , uk is an input control vector, wk is additive system or process noise,
Gk is the input transition matrix and Fk is the state transition matrix.
We further assume that the observations of the state are made through a measurement system which
can be represented by a linear equation of the form,
$z_k = H_k x_k + v_k$  (2)
where zk is the observation or measurement made at time k , xk is the state at time k , Hk is the
observation matrix and vk is additive measurement noise.
(Block diagram of the system model: the input $u$ enters through $G$, the process noise $w$ is added, and the state is fed back through a unit delay and the transition matrix.)
1.3 Assumptions
We make the following assumptions:
The process and measurement noise random processes $w_k$ and $v_k$ are uncorrelated, zero-mean white-noise processes with known covariance matrices. Then,
$E[w_k w_l^T] = \begin{cases} Q_k & k = l \\ 0 & \text{otherwise} \end{cases}$  (3)
$E[v_k v_l^T] = \begin{cases} R_k & k = l \\ 0 & \text{otherwise} \end{cases}$  (4)
$E[w_k v_l^T] = 0 \quad \text{for all } k, l$  (5)
The initial system state, $x_0$, is a random vector that is uncorrelated to both the system and measurement noise processes.
The initial system state has a known mean and covariance matrix
$\hat{x}_{0|0} = E[x_0] \quad \text{and} \quad P_{0|0} = E[(\hat{x}_{0|0} - x_0)(\hat{x}_{0|0} - x_0)^T]$  (6)
Given the above assumptions the task is to determine, given a set of observations $z_1, \ldots, z_{k+1}$, the estimation filter that at the $(k+1)$th instance in time generates an optimal estimate of the state $x_{k+1}$, which we denote by $\hat{x}_{k+1}$, that minimises the expectation of the squared-error loss function,
$E[\,||x_{k+1} - \hat{x}_{k+1}||^2\,] = E[(x_{k+1} - \hat{x}_{k+1})^T (x_{k+1} - \hat{x}_{k+1})]$  (7)
1.4 Derivation
Consider the estimation of the state $\hat{x}_{k+1}$ based on the observations up to time $k$, $z_1, \ldots, z_k$, namely $\hat{x}_{k+1|Z^k}$. This is called a one-step-ahead prediction or simply a prediction. Now, the solution is the conditional mean of the state at time $k+1$ given the observations up to time $k$,
$\hat{x}_{k+1|k} = E[x_{k+1} \mid Z^k]$  (8)
Expanding using the state equation (1),
$\hat{x}_{k+1|k} = E[x_{k+1} \mid Z^k]$
$\qquad = E[F_k x_k + G_k u_k + w_k \mid Z^k]$
$\qquad = F_k E[x_k \mid Z^k] + G_k u_k + E[w_k \mid Z^k]$
$\qquad = F_k \hat{x}_{k|k} + G_k u_k$  (9)
where we have used the fact that the process noise has zero mean value and uk is known precisely.
The estimate variance $P_{k+1|k}$ is the mean squared error in the estimate $\hat{x}_{k+1|k}$. Taking expectations conditioned on the observations up to time $k$ gives the prediction covariance
$P_{k+1|k} = E[\tilde{x}_{k+1|k}\tilde{x}_{k+1|k}^T \mid Z^k] = F_k P_{k|k} F_k^T + Q_k$  (10)
Now assume that the new estimate is a linear weighted sum of the prediction and the new observation, so that it can be written in the form
$\hat{x}_{k+1|k+1} = K'_{k+1}\hat{x}_{k+1|k} + K_{k+1} z_{k+1}$  (11)
where $K'_{k+1}$ and $K_{k+1}$ are weighting or gain matrices (of different sizes). Our problem is now reduced to finding the $K_{k+1}$ and $K'_{k+1}$ that minimise the conditional mean squared estimation error, where of course the estimation error is given by:
$\tilde{x}_{k+1|k+1} = \hat{x}_{k+1|k+1} - x_{k+1}$  (12)
For our filter to be unbiased, we require that $E[\hat{x}_{k+1|k+1}] = E[x_{k+1}]$. Let's assume (and argue by induction) that $\hat{x}_{k|k}$ is an unbiased estimate. Then combining equations (11) and (2) and taking expectations yields
$E[\hat{x}_{k+1|k+1}] = E[K'_{k+1}\hat{x}_{k+1|k} + K_{k+1}H_{k+1}x_{k+1} + K_{k+1}v_{k+1}]$
$\qquad = K'_{k+1}E[\hat{x}_{k+1|k}] + K_{k+1}H_{k+1}E[x_{k+1}] + K_{k+1}E[v_{k+1}]$  (13)
Note that the last term on the right hand side of the equation is zero, and further note that the prediction is unbiased:
$E[\hat{x}_{k+1|k}] = E[F_k\hat{x}_{k|k} + G_k u_k]$
$\qquad = F_k E[\hat{x}_{k|k}] + G_k u_k$
$\qquad = E[x_{k+1}]$  (14)
Hence, substituting into Equation 13,
$E[\hat{x}_{k+1|k+1}] = (K'_{k+1} + K_{k+1}H_{k+1})E[x_{k+1}]$
so that unbiasedness requires
$K'_{k+1} = I - K_{k+1}H_{k+1}$  (15)
Substituting this back into Equation 11, the unbiased estimator has the form
$\hat{x}_{k+1|k+1} = (I - K_{k+1}H_{k+1})\hat{x}_{k+1|k} + K_{k+1}z_{k+1} = \hat{x}_{k+1|k} + K_{k+1}[z_{k+1} - H_{k+1}\hat{x}_{k+1|k}]$  (16)
The variance of the updated estimate is
$P_{k+1|k+1} = E[\tilde{x}_{k+1|k+1}\tilde{x}_{k+1|k+1}^T \mid Z^{k+1}]$
and with
$E[v_{k+1}v_{k+1}^T] = R_{k+1}, \qquad E[\tilde{x}_{k+1|k}\tilde{x}_{k+1|k}^T] = P_{k+1|k}, \qquad E[\tilde{x}_{k+1|k}v_{k+1}^T] = 0$
we obtain
$P_{k+1|k+1} = (I - K_{k+1}H_{k+1})P_{k+1|k}(I - K_{k+1}H_{k+1})^T + K_{k+1}R_{k+1}K_{k+1}^T$  (17)
Thus the covariance of the updated estimate is expressed in terms of the prediction covariance $P_{k+1|k}$, the observation noise $R_{k+1}$ and the Kalman gain matrix $K_{k+1}$.
The gain $K_{k+1}$ is now chosen to minimise the conditional mean squared estimation error, which equals the trace of the updated covariance matrix:
$L = E[\tilde{x}_{k+1|k+1}^T\tilde{x}_{k+1|k+1}] = \mathrm{trace}(P_{k+1|k+1})$  (18)
We use the result that, for a matrix $A$ and a symmetric matrix $B$,
$\frac{\partial}{\partial A}\,\mathrm{trace}(ABA^T) = 2AB$
which follows from writing $\mathrm{trace}(ABA^T) = \sum_i a_i^T B a_i$, where the $a_i$ are the columns of $A^T$.
Combining equations (17) and (18) and differentiating with respect to the gain matrix (using the relation above) and setting equal to zero yields
$\frac{\partial L}{\partial K_{k+1}} = -2(I - K_{k+1}H_{k+1})P_{k+1|k}H_{k+1}^T + 2K_{k+1}R_{k+1} = 0$
Re-arranging gives the Kalman gain
$K_{k+1} = P_{k+1|k}H_{k+1}^T\left[H_{k+1}P_{k+1|k}H_{k+1}^T + R_{k+1}\right]^{-1}$  (19)
Together with Equation 16 this defines the optimal linear mean-squared error estimator.
In summary, the Kalman filter consists of two stages applied recursively.
Prediction: also known as the time update. This projects the state estimate and its variance forward from one time step to the next:
$\hat{x}_{k+1|k} = F_k\hat{x}_{k|k} + G_k u_k$  (20)
$P_{k+1|k} = F_k P_{k|k} F_k^T + Q_k$  (21)
Update: also known as the measurement update. This updates the state and variance using a combination of the predicted state and the observation $z_{k+1}$:
$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}[z_{k+1} - H_{k+1}\hat{x}_{k+1|k}]$  (22)
$P_{k+1|k+1} = (I - K_{k+1}H_{k+1})P_{k+1|k}$  (23)
where the Kalman gain matrix is given by
$K_{k+1} = P_{k+1|k}H_{k+1}^T\left[H_{k+1}P_{k+1|k}H_{k+1}^T + R_{k+1}\right]^{-1}$  (24)
(Block diagram of the Kalman filter loop: the measurement is compared with the measurement prediction to form the innovation (whitening); this is blended with the state prediction to give the corrected state estimate, which is fed back through the unit delay, $F$ and $G$ to form the next state prediction.)
Together with the initial conditions on the estimate and its error covariance matrix (equation 6)
this defines the discrete-time sequential, recursive algorithm for determining the linear minimum
variance estimate known as the Kalman filter.
The predicted observation is
$\hat{z}_{k+1|k} = E[z_{k+1} \mid Z^k] = E[H_{k+1}x_{k+1} + v_{k+1} \mid Z^k] = H_{k+1}\hat{x}_{k+1|k}$  (25)
and the difference between the observation $z_{k+1}$ and its prediction is called the innovation or residual:
$\nu_{k+1} = z_{k+1} - H_{k+1}\hat{x}_{k+1|k}$  (26)
The innovation or residual is an important measure of how well an estimator is performing. For
example it can be used to validate a measurement prior to it being included as a member of the
observation sequence (more on this later).
The process of transforming $z_{k+1}$ into $\nu_{k+1}$ is sometimes said to be achieved through the Kalman whitening filter. This is because the innovations form an uncorrelated, orthogonal white-noise sequence which is statistically equivalent to the observations $Z^{k+1}$. This is important because, whereas $z_{k+1}$ is in general statistically correlated, the innovation $\nu_{k+1}$ is uncorrelated and so effectively provides new information, or innovation.
The innovation has zero mean, since
$E[\nu_{k+1} \mid Z^k] = E[z_{k+1} - H_{k+1}\hat{x}_{k+1|k} \mid Z^k] = H_{k+1}\hat{x}_{k+1|k} - H_{k+1}\hat{x}_{k+1|k} = 0$  (27)
and its covariance, $S_{k+1}$, is given by
$S_{k+1} = E[\nu_{k+1}\nu_{k+1}^T] = E[(z_{k+1} - H_{k+1}\hat{x}_{k+1|k})(z_{k+1} - H_{k+1}\hat{x}_{k+1|k})^T] = R_{k+1} + H_{k+1}P_{k+1|k}H_{k+1}^T$  (28)
Using Equations 26 and 28 we can re-write the Kalman updates in terms of the innovation and its variance as follows:
$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}\nu_{k+1}$  (29)
$P_{k+1|k+1} = P_{k+1|k} - K_{k+1}S_{k+1}K_{k+1}^T$  (30)
where
$K_{k+1} = P_{k+1|k}H_{k+1}^T S_{k+1}^{-1}$  (31)
and, equivalently for the covariance,
$P_{k+1|k+1} = (I - K_{k+1}H_{k+1})P_{k+1|k}$  (32)
1.10 Example
Consider a vehicle tracking problem where a vehicle is constrained to move in a straight line with a constant velocity. Let $p(t)$ and $\dot{p}(t)$ represent the vehicle position and velocity. We assume that observations of position can be made where the measurement noise is $v(t)$. Since the vehicle is moving at constant speed, $\ddot{p}(t) = 0$.
Thus the system and observation equations are of the form
$\dot{x}(t) = Ax(t) + w(t),$
$z(t) = Hx(t) + v(t),$  (33)
where $x(t) = [p(t)\;\; \dot{p}(t)]^T$ and
$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad H = \begin{bmatrix} 1 & 0 \end{bmatrix}.$
Suppose we have sampled observations of the system at discrete time intervals $\Delta T$; then the discrete equivalent is given by (see later this lecture for a derivation)
$x_{k+1} = F_k x_k + w_k$  (34)
where
$F_k = e^{A\Delta T} = \begin{bmatrix} 1 & \Delta T \\ 0 & 1 \end{bmatrix}$  (35)
and
$z_k = Hx_k + v_k$  (36)
Kalman filter: Assume that the initial conditions on the estimate and its error covariance at $t = 0$, $x(0)$, are
$\hat{x}_{0|0} = E(x_0) = \begin{bmatrix} 0 \\ 10 \end{bmatrix}, \qquad P_{0|0} = \begin{bmatrix} 10 & 0 \\ 0 & 10 \end{bmatrix}$  (37)
Assume also that
$Q_k = E(w_k w_k^T) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad R_k = E(v_k^2) = 1$  (38)
The Kalman filter involves sequential application of the recursive equations as given above for $k = 1, 2, \ldots$, computing at each step the prediction, the innovation and its covariance, the Kalman gain, and the new state and covariance.
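The scripts in this example call three helper routines, predict, innovation and innovation_update, whose listings are not included in the notes. The fragment below is a minimal sketch of what they might look like, written directly from Equations 20-21 and 26-31; the argument order is simply the one assumed by the calling scripts, and the control term $G_k u_k$ is omitted since this example has no control input. In Matlab each function would normally live in its own .m file.
function [xpred, Ppred] = predict(x, P, F, Q)
% time update (Equations 20 and 21), no control input
xpred = F*x;
Ppred = F*P*F' + Q;

function [nu, S] = innovation(xpred, Ppred, z, H, R)
% innovation and innovation covariance (Equations 26 and 28)
nu = z - H*xpred;
S  = R + H*Ppred*H';

function [xnew, Pnew] = innovation_update(xpred, Ppred, nu, S, H)
% measurement update in innovation form (Equations 29-31)
K = Ppred*H'*inv(S);      % Kalman gain
xnew = xpred + K*nu;
Pnew = Ppred - K*S*K';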
%%% Matlab script to simulate data and process using Kalman filter
delT = 1;
F = [ 1 delT
      0 1 ];
H = [ 1 0 ];
x = [ 0
      10 ];
P = [ 10 0
      0 10 ];
Q = [ 1 1
      1 1 ];
R = [ 1 ];
z = [ 2.5 1 4 2.5 5.5 ];
for i=1:5
  [xpred, Ppred] = predict(x, P, F, Q);
  [nu, S] = innovation(xpred, Ppred, z(i), H, R);
  [x, P] = innovation_update(xpred, Ppred, nu, S, H);
end
Results: The plots in Figure 3a-c illustrate the result of running the Kalman filter using $\Delta t = 1$. Some interesting observations can be made as follows.
1. Both $K_{k+1}$ and $P_{k+1|k+1}$ tend to constant (steady-state) values as $k \to \infty$.
2. The estimates $\hat{x}_{k+1|k+1}$ tend to follow the measurement values quite closely. Indeed, since $K_{k+1}$ is a weighting function acting on the measurement, it is clear that this effect is more prominent when $K_{k+1}$ is high.
3. From Equation 19, $K_{k+1}$ decreases with $R_{k+1}$ and increases with $Q_{k+1}$. Thus the convergence properties are dependent on the relative magnitudes of the process and measurement
noise. Figure 4a illustrates this effect clearly. Here we have re-run the Kalman filter but decreased the elements of Qk+1 by a factor of 10 and 100 (Rk+1 was kept at the original value).
It is clear from the figure that the net effect is that the estimates follow the measurements less
closely. Similar effects can be observed if the relative magnitude of Rk+1 is increased (Figure
4b). How do you explain what is observed in the latter case?
This example illustrates the fact that the performance of a Kalman filter is dependent on initialisation
conditions. In the next lecture we examine this observation in more detail when we consider the
performance of the discrete-time Kalman filter.
Figure 3: Example 1: (a) measurement and estimated state trajectories; (b) Kalman gain; (c) diagonal elements of error covariance matrix. Note how in the latter diagrams the curves tend to constant
values as k gets large.
Figure 4: Example 1: Effect of changing the process and measurement noise covariance matrices
on the estimated position trajectory. (a) decreasing Qk+1 by a factor of 10 and 100; (b) increasing
Rk+1 by a factor of 100.
1.11.1 Constant-velocity particle
Consider a particle moving at constant velocity. Its ideal motion is described by $\ddot{x}(t) = 0$. In the real world, the velocity will undergo some perturbation $w(t)$ which we will assume to be randomly distributed. Therefore the real motion equation is given by
$\ddot{x}(t) = w(t)$
where
$E[w(t)] = 0, \qquad E[w(t)w(\tau)] = q(t)\,\delta(t - \tau)$
Recall from your 2nd year maths that this latter expectation is the auto-correlation function and that
an autocorrelation of this form corresponds to a constant power spectral density, more commonly
referred to as white noise.
The state vector is
$x = \begin{bmatrix} x \\ \dot{x} \end{bmatrix}$  (39)
and the continuous state-space system equation is
$\frac{dx(t)}{dt} = Ax(t) + Bw(t)$  (40)
where
$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$  (41)
We obtain the discrete time equivalent of Equation 40 for a sample interval of $\Delta t$ by integrating it up over one sampling period. Taking Laplace transforms we obtain
$sX(s) - x(0) = AX(s) + BW(s)$
$X(s) = (sI - A)^{-1}x(0) + (sI - A)^{-1}BW(s)$
and hence, taking inverse transforms,
$x(t) = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}x(0) + \int_0^t \begin{bmatrix} 1 & t - \tau \\ 0 & 1 \end{bmatrix}Bw(\tau)\,d\tau$
Note that the integration on the right hand side is a consequence of the convolution rule for Laplace transforms.
Shifting this forward in time to match our initial conditions we have
$x(t_k + \Delta t) = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}x(t_k) + \int_0^{\Delta t}\begin{bmatrix} \Delta t - \tau \\ 1 \end{bmatrix}w(t_k + \tau)\,d\tau$
or
$x_{k+1} = F_k x_k + w_k$  (42)
The covariance of the discrete process noise $w_k$ is therefore
$E[w_k w_k^T] = E\left[\left(\int_0^{\Delta t}\begin{bmatrix} \Delta t - u \\ 1 \end{bmatrix}w(t_k + u)\,du\right)\left(\int_0^{\Delta t}\begin{bmatrix} \Delta t - v \\ 1 \end{bmatrix}w(t_k + v)\,dv\right)^T\right]$
$\quad = \int_0^{\Delta t}\int_0^{\Delta t}\begin{bmatrix} \Delta t - u \\ 1 \end{bmatrix}\begin{bmatrix} \Delta t - v & 1 \end{bmatrix}E[w(t_k + u)w(t_k + v)^T]\,du\,dv$
$\quad = \int_0^{\Delta t}\begin{bmatrix} (\Delta t - u)^2 & (\Delta t - u) \\ (\Delta t - u) & 1 \end{bmatrix}q(t_k + u)\,du$
$\quad = q\begin{bmatrix} \frac{1}{3}\Delta t^3 & \frac{1}{2}\Delta t^2 \\ \frac{1}{2}\Delta t^2 & \Delta t \end{bmatrix}$
since (by our initial assumptions) $q$ is constant. Hence
$Q_k = E[w_k w_k^T] = q\begin{bmatrix} \frac{1}{3}\Delta t^3 & \frac{1}{2}\Delta t^2 \\ \frac{1}{2}\Delta t^2 & \Delta t \end{bmatrix}$  (43)
In summary:
$F_k = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad Q_k = q\begin{bmatrix} \frac{1}{3}\Delta t^3 & \frac{1}{2}\Delta t^2 \\ \frac{1}{2}\Delta t^2 & \Delta t \end{bmatrix}$  (44)
Note that we have derived the process noise covariance using a continuous-time white noise assumption. If, on the other hand, we assume a piecewise constant white noise model, then the target undergoes a constant acceleration $w_k$, with the accelerations independent from period to period, and the process noise is slightly different. Bar-Shalom (p86) discusses this (and derives the piecewise version); you should familiarise yourselves with both, but always keep in mind that both are approximations. A numerical comparison of the two forms is sketched below.
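As a small numerical illustration of Equation 44, the fragment below builds $F_k$ and $Q_k$ for the constant-velocity model, together with the piecewise-constant white acceleration alternative just mentioned; the values of q and delT are arbitrary choices for illustration only.
q = 0.01;                            % continuous-time noise density (illustrative value)
delT = 1;                            % sample interval (illustrative value)
F = [ 1 delT
      0 1 ];
Q = q * [ delT^3/3  delT^2/2         % continuous white-noise model (Equation 44)
          delT^2/2  delT ];
Qpw = q * [ delT^4/4  delT^3/2       % piecewise-constant white acceleration model (Bar-Shalom)
            delT^3/2  delT^2 ];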
1.11.2 Constant-acceleration particle
The analysis is similar to the previous case. In this case the particle moves at constant acceleration. The ideal particle motion is described by $\frac{d^3 x(t)}{dt^3} = 0$. In the real world, the acceleration will not be perfectly constant, thus,
$\frac{d^3 x(t)}{dt^3} = w(t),$
where as before,
$E[w(t)] = 0, \qquad E[w(t)w(\tau)] = q(t)\,\delta(t - \tau)$
The state vector is
$x = \begin{bmatrix} x \\ \dot{x} \\ \ddot{x} \end{bmatrix}$  (45)
and the continuous state-space system equation is
$\frac{dx(t)}{dt} = Ax(t) + Bw(t),$  (46)
where
$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$  (47)
The discrete-time equivalent for a sample interval of $\Delta t$ is given by
$x_{k+1} = F_k x_k + w_k$  (48)
Once again taking Laplace transforms (etc.) and using the assumption that $q$ is constant, we can determine the state transition matrix $F_k$ and the process noise covariance matrix $Q_k$:
$F_k = e^{A\Delta t} = \begin{bmatrix} 1 & \Delta t & \Delta t^2/2 \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \end{bmatrix}$
and
$Q_k = E[w_k w_k^T] = q\begin{bmatrix} \Delta t^5/20 & \Delta t^4/8 & \Delta t^3/6 \\ \Delta t^4/8 & \Delta t^3/3 & \Delta t^2/2 \\ \Delta t^3/6 & \Delta t^2/2 & \Delta t \end{bmatrix}$  (49)
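The corresponding fragment for the constant-acceleration model of Equation 49, again with illustrative values for q and delT:
q = 0.01; delT = 1;                  % illustrative values
F = [ 1 delT delT^2/2
      0 1    delT
      0 0    1 ];
Q = q * [ delT^5/20 delT^4/8 delT^3/6
          delT^4/8  delT^3/3 delT^2/2
          delT^3/6  delT^2/2 delT ];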
In this lecture we consider how to evaluate the performance of a Kalman filter. We will focus on
understanding the following problems
1. how a Kalman filter of a perfectly modelled system with perfectly estimated noise behaves.
2. the effect that changes in the values of the noise sources have on the overall performance of a
Kalman filter.
3. how to recognise if the filter assumptions are met in practice. This is particularly important in practical situations since many real systems are not well represented by a linear model, and the measurement and system noise are often non-Gaussian.
In the case of (3), here, we will only consider how to detect a problem with a Kalman filter. We consider how to modify the Kalman filter to accommodate nonlinear process and measurement models
and non-Gaussian noise in the final lecture.
2.1 Example
We will illustrate Kalman filter performance using as an example a variant of the constant velocity
model we considered in section 1.10, in which we use the Kalman filter to track a vehicle moving in
a straight line.
Figure 5: Input to Example 2: (a) true position; (b) true velocity; (c) observations.
System model
We assume that sampled observations are acquired at discrete time intervals $\Delta T$ and that the state and output equations are of the form
$x_{k+1} = \begin{bmatrix} 1 & \Delta T \\ 0 & 1 \end{bmatrix}x_k + w_k$  (50)
$z_k = \begin{bmatrix} 1 & 0 \end{bmatrix}x_k + v_k.$  (51)
Further, we assume that the process and observation noise are given by
$Q_k = \begin{bmatrix} \Delta T^3/3 & \Delta T^2/2 \\ \Delta T^2/2 & \Delta T \end{bmatrix}q^2, \qquad R_k = r^2.$  (52)
In this example $q = 0.01$, $r = 0.1$ and the initial state is $x_0 = [0\;\; 0]^T$.
2.2.1 Steady state
Combining the covariance prediction and update equations, i.e. using
$P_{k+1|k} = F_k P_{k|k}F_k^T + Q_k$
and
$P_{k|k} = P_{k|k-1} - P_{k|k-1}H_k^T\left[H_k P_{k|k-1}H_k^T + R_k\right]^{-1}H_k P_{k|k-1},$
we have
$P_{k+1|k} = F_k\left[P_{k|k-1} - P_{k|k-1}H_k^T\left[H_k P_{k|k-1}H_k^T + R_k\right]^{-1}H_k P_{k|k-1}\right]F_k^T + Q_k$  (53)
Equation (53) is known as the discrete-time matrix Riccati equation. It turns out that if the system is time-invariant (i.e. $F$, $G$ and $H$ are constant), and the measurement and process noise are stationary ($Q$ and $R$ constant), then as $k \to \infty$ the solution to equation (53) converges to a positive definite matrix $\bar{P}$ provided that the system model is completely observable and completely controllable (for precise definitions of these see Jacobs). The corresponding gain matrix $\bar{K} = \bar{P}H^T S^{-1}$ will also be constant, and is called the steady-state gain.
The importance of this result is that in some applications you can assume that the Kalman filter works under steady-state conditions. In this case you fix the value of $\bar{P}$, and hence $\bar{K}$, from the start and initial conditions do not need to be specified. Since $\bar{K}$ is now fixed, considerable computational savings can be made since the gain does not have to be recomputed at each time step. A sketch of how the steady-state values can be found by iteration is given below.
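In practice the steady-state covariance and gain are usually found simply by iterating the covariance equations until they stop changing. The fragment below is a sketch of this for the constant-velocity example (F, H, Q and R as defined earlier); the starting value of P and the convergence tolerance are arbitrary choices.
P = 10*Q;                            % any positive definite starting value
for k = 1:1000
  Ppred = F*P*F' + Q;                % covariance prediction (Equation 21)
  S = H*Ppred*H' + R;                % innovation covariance
  K = Ppred*H'*inv(S);               % gain
  Pnew = Ppred - K*S*K';             % covariance update (Equation 30)
  if norm(Pnew - P) < 1e-9, break; end
  P = Pnew;
end
Kss = K;                             % steady-state gain
Pss = P;                             % steady-state (updated) covariance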
2.2.2 Initialisation
Recall that part of the requirements for a Kalman filter is specification of initial conditions. Therefore, when considering implementation of a Kalman filter an important concern is how to set (guess!)
Figure 6: Position track: (a) predicted and estimated positions and observations; (b) initial track (zoom of (a)); (c) steady-state track (zoom of (a)).
Figure 7: predicted and estimated velocity.
Figure 8: (a) Position error covariance; (b) Velocity error covariance.
values for $\hat{x}_{0|0}$ and $P_{0|0}$ as they are not known. The obvious question to ask then, is does it matter how good (or bad) your guess is?
One possibility is to initialise the state vector from the first two measurements:
$\hat{x}_{0|0} = \begin{bmatrix} z_1 \\ (z_1 - z_0)/\Delta t \end{bmatrix}$
and a simple way to initialise the state covariance matrix is to set it to be a multiple $R$ of the process noise matrix
$P_{0|0} = R\,Q_k$
(here $R = 10$).
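A sketch of this initialisation in Matlab, using the first two observations in the vector z (delT and Q as defined earlier; the factor of 10 is the multiple used in the text):
xhat0 = [ z(2)
          (z(2) - z(1))/delT ];      % position and velocity from the first two measurements
P0 = 10*Q;                           % initial covariance as a multiple of the process noise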
Figure 9 illustrates the effect that changing initialisation parameters has on long term Kalman filter performance. Note that regardless of the initial values both $\hat{x}$ and $P$ tend to constant values in a few iterations.
More formally, it can be shown that provided that the system is observable and controllable the error due to poor initialisation tends to zero as $k \to \infty$. Finally note that although good initialisation
is desirable for a linear Kalman filter it is not essential (the estimator merely takes longer to settle
down). However, good initialisation is critical in the implementation of Kalman filters for nonlinear
system models (see final lecture).
2.2.3 Checking consistency
Since in practice we cannot measure performance with respect to the state error measures (since we don't know the true state values), how do we check that the filter is performing correctly? The answer is that we can define filter performance measures in terms of the innovation.
We know that if the filter is working correctly then $\nu_k$ is zero mean and white with covariance $S_k$ (see previous lecture). So we can verify that the filter is consistent by applying the following two procedures:
1. check that the innovation is consistent with its covariance by verifying that the magnitude of the innovation is bounded by $\pm 2\sqrt{S_k}$;
Figure 9: (a) Effect of using different position initialisation values; (b) Effect of changing the multiple $R$ used to initialise $P_{0|0}$.
2. verify that the innovations are unbiased and white. This can be done using a $\chi^2$ test at the 95% confidence level on the normalised innovations squared:
$q_{k+1}(i) = \nu_{k+1}^T(i)\,S_{k+1}^{-1}(i)\,\nu_{k+1}(i)$  (54)
for a sequence of $i$ trials of a Kalman filter. If the filter assumptions are met then the $q_{k+1}(i)$ are each $\chi^2$ in $m$ degrees of freedom, where $m = 1$ in our case (the dimension of the measurement vector), so that
$E[q_{k+1}] = m$
This provides the test for unbiasedness. To estimate the mean we need to have $N$ independent samples of $q_{k+1}(i)$, $i = 1, \ldots, N$. The mean of this sequence,
$\bar{q}_{k+1} = \frac{1}{N}\sum_{i=1}^{N} q_{k+1}(i)$  (55)
can be used as a test statistic since $N\bar{q}_{k+1}$ is $\chi^2$ on $Nm$ degrees of freedom.
In our case, however, we can exploit the fact that the innovations are ergodic to estimate the sample mean from the time average for a long sequence (i.e. the moving average) rather than an ensemble average. Thus we can estimate the mean as
$\bar{q} = \frac{1}{N}\sum_{k=1}^{N} q_k$  (56)
from a single run of a Kalman filter. Figure 11 shows the normalised innovation and the moving average of the innovation. The latter tends to 1.0 as $k$ gets large. To test unbiasedness we need to verify that $\bar{q}$ lies in the confidence interval $[r_1, r_2]$ defined by the hypothesis $H_0$ that $N\bar{q}$ is $\chi^2_{Nm}$ distributed with probability $1 - \alpha$. Thus we need to find $r_1, r_2$ such that
$P(N\bar{q} \in [r_1, r_2] \mid H_0) = 1 - \alpha.$  (57)
For the example we are considering, $N = 100$, $\bar{q} = 1.11$, and let $\alpha = 0.05$ (i.e. define the two-sided 95% confidence region). Using statistical tables we find that
$[r_1, r_2] = [\chi^2_{100}(0.025), \chi^2_{100}(0.975)] = [74.22, 129.6]$
so $N\bar{q} = 111$ lies inside the interval and the hypothesis of unbiasedness is accepted.
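If the Matlab Statistics Toolbox is available, the interval can be computed with chi2inv rather than read from tables. A sketch follows; q here is the vector of normalised innovations squared produced by the processing script later in this section.
N = 100; m = 1; alpha = 0.05;
r1 = chi2inv(alpha/2,   N*m);        % lower limit of the two-sided 95% region
r2 = chi2inv(1-alpha/2, N*m);        % upper limit
qbar = mean(q(2:N));                 % time-averaged normalised innovation squared
unbiased = (N*qbar >= r1) & (N*qbar <= r2);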
Finally, to verify that the innovations are white we need to check that
$E[\nu_i^T\nu_j] = S_i\,\delta_{ij}$  (58)
We can test this by checking that everywhere except where $i = j$, the statistic defined by Equation (58) is zero within allowable statistical error. Again, we can exploit ergodicity to redefine the test statistic as a time-averaged correlation
$r(\tau) = \frac{1}{N}\sum_{k=0}^{N-1}\nu_k^T\nu_{k+\tau}$  (59)
The autocorrelation is normalised by $r(0)$; for a consistent filter the normalised values for $\tau \neq 0$ should be zero to within the 95% confidence bounds.
The Matlab script used to generate the simulated input data is given below (the leading lines of the listing were lost; N and the noise levels shown are assumed values consistent with the text).
%%% Matlab script to generate simulated data
N = 100;                             % number of samples (assumed)
delT = 1;
F = [ 1 delT
      0 1 ];
H = [ 1 0 ];
sigmaQ = sqrt(0.01);                 % process noise level, q = 0.01 (assumed)
sigmaR = sqrt(0.1);                  % measurement noise level, r = 0.1 (assumed)
Q = sigmaQ^2 * [ delT^3/3 delT^2/2
                 delT^2/2 delT ];
R = sigmaR^2;
w = zeros(2,N);
x = zeros(2,N);
v = zeros(1,N);
z = zeros(1,N);
for i=2:N
  w(:,i) = gennormal([0;0], Q);      % sample process noise
  x(:,i) = F*x(:,i-1) + w(:,i);
  v(:,i) = gennormal([0], R);        % sample measurement noise
  z(:,i) = H * x(:,i) + v(:,i);
end
plot(x(1,:));
The Matlab code to process the sequence and generate the various graphs is given below.
%%% Matlab script to assess Kalman filter performance
%%% The script assumes the existence of a vector z of
%%% noise corrupted observations
N = length(z);
Qfactor = 1;            % factor applied to the true process noise level
Rfactor = 10;           % factor applied to the true measurement noise level
delT = 1;
F = [ 1 delT
      0 1 ];
H = [ 1 0 ];            % measurement matrix
sigmaQ = Qfactor*sqrt(0.01);
sigmaR = Rfactor*sqrt(0.1);
Q = sigmaQ^2 * [ 1/3 1/2
                 1/2 1 ];
P = 10*Q;
R = sigmaR^2 * [ 1 ];
xhat = zeros(2,N);      % state estimate
nu = zeros(1,N);        % innovation
S = zeros(1,N);         % innovation (co)variance
q = zeros(1,N);         % normalised innovation squared
for i=2:N
  [xpred, Ppred] = predict(xhat(:,i-1), P, F, Q);
  [nu(:,i), S(:,i)] = innovation(xpred, Ppred, z(i), H, R);
  [xhat(:,i), P] = innovation_update(xpred, Ppred, nu(:,i), S(:,i), H);
  q(:,i) = nu(:,i)'*inv(S(:,i))*nu(:,i);
end
sumQ = sum(q)           % chi-square test statistic
r = xcorr(nu);          % autocorrelation of the innovation sequence
plot(xhat(1,:));
pause;
plot(nu);
% plot innovation and 2 sigma confidence interval
hold on;
plot(2*sqrt(S), 'r');
plot(-2*sqrt(S), 'r');
hold off;
pause;
plot(q);
pause;
plot(r(N:2*N-1)/r(N));  % normalised autocorrelation
Under-estimating q: Refer to Figure 13. This illustrates the performance tests for the case when the process noise is under-estimated by a factor of 10.
The normalised innovations squared are larger than expected and the sample mean falls outside the confidence bound defined by the $\chi^2$ test (for my trial the value came to 492.34/100 which is
clearly above the 95% confidence region [74.22/100,129.6/100] computed above). This tells us that
the combined process and measurement noise levels are too low, i.e. too little weight is placed on
current measurements in the update process.
The autocorrelation sequence shows time correlations.
Figure 13: Effect of underestimating q by a factor of 10. (a) state estimates; (b) innovation sequence; (c) normalised innovations squared; (d) normalised autocorrelation of the innovation sequence.
Over-estimating q: Refer to Figure 14. This illustrates the performance tests for the case when the process noise is over-estimated by a factor of 10. The innovations are well within the required bounds.
The normalised innovations squared are smaller than expected and the sum (32.81, or equivalently the average) falls below the confidence bound defined by the $\chi^2$ test. This tells us that the combined process and measurement noise levels are too high.
The autocorrelation sequence shows no obvious time correlations.
Under-estimating r: Refer to Figure 15. This illustrates the performance tests for the case when the measurement noise is under-estimated by a factor of 10.
The normalised innovations squared are larger than expected and the sample mean (3280/100) falls outside the confidence bound [0.74, 1.3] defined by the $\chi^2$ test. This tells us that the combined process and measurement noise levels are too low.
Figure 14: Effect of overestimating q by a factor of 10. (a) state estimates; (b) innovation sequence; (c) normalised innovations squared; (d) normalised autocorrelation of the innovation sequence.
The autocorrelation sequence shows no obvious time correlations.
Over-estimating r: Refer to Figure 16. This illustrates the performance tests for the case when the measurement noise is over-estimated by a factor of 10. The innovations are well below the $\pm 2\sigma$ bounds.
The normalised innovations squared are smaller than expected and the sample mean (4.95/100) falls outside the confidence bound defined by the $\chi^2$ test. This tells us that the combined process and measurement noise levels are too high.
The autocorrelation sequence shows time correlations.
General observations:
1. If the ratio of process to measurement noise is too low the innovation sequence becomes
correlated.
Figure 15: Effect of underestimating r by a factor of 10. (a) state estimates; (b) innovation sequence; (c) normalised innovations squared; (d) normalised autocorrelation of the innovation sequence.
2. The absolute values of the process and measurement noise can be set by adjusting their values so that the $\chi^2$ innovation test is satisfied.
3. In the example shown here, tuning is much more sensitive to changes in the measurement
noise rather than the process noise. In this example, this is because measurement noise affects
position, process noise only affects velocity (refer to the continuous system model in Lecture
1).
2.3.2 Detecting process modelling errors
We now consider what happens if we try to apply an estimator to measurement data that doesn't fit the model: the so-called mis-matched filter problem.
Specifically, we consider the case of using a constant-velocity Kalman filter to track a particle which
has a true motion trajectory defined by a constant-acceleration model. Thus, the true motion is
Figure 16: Effect of overestimating r by a factor of 10. (a) state estimates; (b) innovation sequence; (c) normalised innovations squared; (d) normalised autocorrelation of the innovation sequence.
described by the transition equation
$x_{k+1} = Fx_k + w_k$  (60)
where the state transition matrix is
$F = \begin{bmatrix} 1 & T & T^2/2 \\ 0 & 1 & T \\ 0 & 0 & 1 \end{bmatrix}$  (61)
with
$Q = E[w_k w_k^T] = \begin{bmatrix} T^5/20 & T^4/8 & T^3/6 \\ T^4/8 & T^3/3 & T^2/2 \\ T^3/6 & T^2/2 & T \end{bmatrix}q^2$  (62)
Figure 17 shows the result of applying the constant-velocity filter to the constant-acceleration model, where the filter noise parameters were $q = 0.01$ and $r = 0.1$.
Observe that the innovation behaves like a first order Gauss-Markov process (recall this implies that in continuous-time $dx/dt + A(t)x = w$, where $w$ is white noise). The normalised squared values
Figure 17: Performance tests for an unmatched filter. (a) state estimates; (b) innovation sequence;
(c) normalised innovations squared; (d) normalised autocorrelation of the innovation sequence.
show a substantial drift in the mean and are not stationary. The autocorrelation reduces exponentially in time, again typical of a first-order Gauss-Markov process.
Boosting Q to reduce effects of modelling errors: one obvious thing to try in order to reduce the
effects of modelling errors is to boost the magnitude of the process noise Q artificially to take into
account unmodelled errors. Recall that this should boost the value of the Kalman gain and hence let
the estimate follow the measurements more closely. The result of doing this where the process noise
was increased by a factor of 10 is shown in Figure 18. Some improvement is seen but this has not
totally compensated for the process model error.
So far we have considered the discrete-time formulation of the Kalman filter. This is the version
which finds the most wide-spread application in practice. The Kalman filter estimation approach
can also be derived for continuous-time. This is what we look at in this section. It is interesting to
Figure 18: Performance tests for an unmatched filter with process noise boosted to compensate.
(a) state estimates; (b) innovation sequence; (c) normalised innovations squared; (d) normalised
autocorrelation of the innovation sequence.
study the continuous-time Kalman filter for two principal reasons;
1. to understand the asymptotic behaviour of discrete-time Kalman filters and,
2. to provide insight into the relationship between Kalman filtering and Wiener filtering.
We consider both of these factors in this section. In reading what follows you may find it useful to refer back to the equivalent results that we have previously derived for the discrete-time case in Sections 1 and 2.
We assume that the system can be described by the continuous-time state-space model
$\frac{dx(t)}{dt} = F(t)x(t) + w(t)$  (63)
$z(t) = H(t)x(t) + v(t)$  (64)
where $w(t)$ and $v(t)$ are zero-mean white noise processes with covariances $Q(t)$ and $R(t)$.
Recall from the derivation of the discrete-time estimator that we also need to specify initial conditions for the state and its error covariance. Thus we assume that the initial conditions
$E[x(0)] = \hat{x}(0), \qquad E[(x(0) - \hat{x}(0))(x(0) - \hat{x}(0))^T] = P(0)$  (65)
are given.
The state estimate is propagated by
$\frac{d\hat{x}(t)}{dt} = F(t)\hat{x}(t) + K(t)\left[z(t) - H(t)\hat{x}(t)\right]$  (66)
and the error covariance by
$\frac{dP(t)}{dt} = F(t)P(t) + P(t)F^T(t) + Q(t) - K(t)R(t)K^T(t)$  (67)
which is known as the matrix Riccati equation. This matrix differential equation has been studied
extensively and an analytic solution exists for the constant parameter case.
Here the Kalman gain matrix is defined by
$K(t) = P(t)H^T(t)R^{-1}(t)$  (68)
In summary Equations 66, 67 and 68 together with the initial conditions specified by Equation 65
describe the continuous-time Kalman filter algorithm. You should compare these equations with
the equivalent results for the discrete-time case.
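Except in simple cases the matrix Riccati equation (67) has to be integrated numerically. The fragment below is a minimal sketch using a forward-Euler step; F, H, Q, R and the initial covariance P0 are assumed to be constant matrices of compatible size, and the step size and final time are arbitrary choices.
dt = 1e-3;                           % integration step (illustrative)
T = 10;                              % final time (illustrative)
P = P0;
for t = 0:dt:T
  K = P*H'*inv(R);                   % gain (Equation 68)
  Pdot = F*P + P*F' + Q - K*R*K';    % Riccati equation (Equation 67)
  P = P + dt*Pdot;                   % Euler step
end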
3.3 Example
Consider the problem of estimating the value of a constant signal $x(t)$ given measurements corrupted by Gaussian white noise which is zero-mean and has constant spectral density $r$. Derive (1) the continuous-time Kalman filter; and (2) the discrete-time Kalman filter assuming a sampling time interval of $\Delta T$.
Continuous-time solution: The state and measurement equations are
$\dot{x}(t) = 0$  (69)
$z(t) = x(t) + v(t), \qquad v \sim N(0, r)$  (70)
The scalar Riccati equation (Equation 67) governs error covariance propagation and is given by
$\dot{p} = fp + pf + q - krk$  (71)
where $k = phr^{-1} = p/r$ (since $h = 1$). Substituting $f = 0$ and $q = 0$ gives
$\dot{p} = -\frac{p^2}{r}$
Separating the variables and integrating,
$\int_{p_0}^{p}\frac{dp}{p^2} = -\int_0^t \frac{dt}{r}$
so that
$p(t) = p_0\left(1 + (p_0/r)\,t\right)^{-1}$  (72)
The gain is then
$k(t) = \frac{p(t)}{r} = \frac{p_0/r}{1 + (p_0/r)\,t}$
and the state estimation by
$\frac{d\hat{x}(t)}{dt} = k(t)\left[z(t) - \hat{x}(t)\right]$
Discrete-time solution: Let us consider what the result would have been if, rather than analyse the continuous-time measurements, we had sampled the signals at instants in time $t = k\Delta T$, $k = 0, 1, 2, \ldots$ (satisfying the Nyquist criterion of course). The discrete-time model is
$x(k+1) = x(k)$  (73)
$z(k) = x(k) + v(k), \qquad v(k) \sim N(0, r)$  (74)
We have $F(k) = H(k) = 1$, $Q(k) = 0$ and $R(k) = r$.
The predicted state and error covariance are given by (see Section 1)
$\hat{x}(k+1|k) = \hat{x}(k|k) \quad \text{and} \quad P(k+1|k) = P(k|k)$  (75)
Using this result, the update equation for the error covariance gives
$P(k+1|k+1) = P(k|k)\,\frac{1}{1 + P(k|k)/r}.$  (76)
If our initial error covariance was $P_0$ then it follows from Equation 76 that, at time $k+1$,
$P(k+1|k+1) = P_0\,\frac{1}{1 + kP_0/r}$  (77)
The state update is
$\hat{x}(k+1|k+1) = \hat{x}(k|k) + K(k+1)\left[z(k+1) - \hat{x}(k|k)\right]$  (78)
where
$K(k+1) = \frac{P_0/r}{1 + kP_0/r}$
Compare this with the result for the continuous-time case. Note again that as $k \to \infty$ the gain tends to a constant value.
One final point: unfortunately only simple types of continuous-time problems such as the example
given above can be solved analytically using the covariance equations. For more complicated problems, numerical methods are required. This is the main reason why the continuous-time Kalman
filter has not found wide-spread use as an estimation method.
3.4 Relationship to the Wiener filter
Consider the problem of recovering a signal $x(t)$ from a noisy measurement
$z(t) = x(t) + n(t)$  (79)
Here $n(t)$ is additive noise. If (1) we know the power spectral densities of the signal and noise, $S_X(s)$ and $S_N(s)$ respectively, and (2) the observations have been acquired for a sufficient length of time so that the spectrum of $z(t)$ reflects both the signal and the noise, then the goal is to find the optimal filter response $h(t)$ whose output $\hat{x}(t)$ recovers the underlying signal.
Wiener filter solution: The Wiener filter solution to this problem is to find the transfer function corresponding to $h(t)$ using frequency response methods. It can be shown (see for example Brown chapter 4) that if the signal and noise are uncorrelated then the Wiener optimal filter takes on the form
$H^W_{opt}(s) = \frac{S_X(s)}{S_X(s) + S_N(s)}$  (80)
where $S_X(s)$ and $S_N(s)$ are the power spectral densities of the signal and noise. However, Equation 80 defines a non-causal filter, meaning that the output depends on future values of the input $z(t)$ as well as the past, or
$h(\tau) \neq 0 \quad \text{for some } \tau < 0$
For real-time operation this filter is not physically realisable (however it can be worth studying for off-line processing applications). To generate a causal filter we need to define the filter such that it depends only on past and current values of the input $z(t)$, thus,
$h(\tau) = 0 \quad \text{for all } \tau < 0$
Consider the denominator of Equation 80. It can be separated into a part containing poles and zeros in the left hand plane, $[S_X + S_N]^+$, and a part containing only poles and zeros in the right hand plane, $[S_X + S_N]^-$. This process is called spectral factorisation.
Let
$H_1(s) = \frac{1}{[S_X + S_N]^+}, \qquad H_2(s) = \frac{S_X}{[S_X + S_N]^-}$  (81)
Then we can consider $H_{opt}$ as being the cascade of two filters, $H_1(s)$ (causal) and $H_2(s)$ (non-causal), as illustrated in Figure 19. (Figure 19: the noisy observation passes through the whitening filter $H_1$, producing an intermediate signal $v(t)$ which is then filtered by $H_2$.) Let $v(t)$ be the intermediate signal as shown. Then the power spectral density of $v(t)$, $S_V(s)$, is given by
$S_V(s) = |H_1|^2\,(S_N + S_X) = 1$  (82)
In other words it is white noise and hence uncorrelated (independent of time). To make the whole filter causal we can ignore the negative tail of the impulse response corresponding to $H_2(s)$. We do this by taking partial fractions of $H_2(s)$ and discarding the terms with poles in the right hand plane. We denote this by $\{H_2(s)\}_+$. Hence the causal Wiener optimal filter is given by
$H^W_{opt}(s) = H_1(s)\,\{H_2(s)\}_+ = \frac{1}{[S_N(s) + S_X(s)]^+}\left\{\frac{S_X(s)}{[S_N(s) + S_X(s)]^-}\right\}_+$  (83)
Example: Suppose that the signal and noise have spectral densities
$S_X(s) = \frac{1}{-s^2 + 1}, \qquad S_N(s) = 1$
Then,
$S_N(s) + S_X(s) = \frac{-s^2 + 2}{-s^2 + 1}$  (84)
$\qquad = \left[\frac{s + \sqrt{2}}{s + 1}\right]\left[\frac{-s + \sqrt{2}}{-s + 1}\right]$  (85)
It follows that
$H_1(s) = \frac{s + 1}{s + \sqrt{2}}$  (86)
$H_2(s)$ is given by
$H_2(s) = \frac{1}{-s^2 + 1}\,\frac{-s + 1}{-s + \sqrt{2}} = \frac{1}{(s + 1)(-s + \sqrt{2})}$  (87)
Taking partial fractions,
$H_2(s) = \frac{\sqrt{2} - 1}{s + 1} + \frac{\sqrt{2} - 1}{-s + \sqrt{2}}$
Hence
$\{H_2(s)\}_+ = \frac{\sqrt{2} - 1}{s + 1}$  (88)
and
$H^W_{opt}(s) = H_1(s)\,\{H_2(s)\}_+ = \frac{\sqrt{2} - 1}{s + \sqrt{2}}$  (89)
$h^W_{opt}(t) = (\sqrt{2} - 1)\,e^{-\sqrt{2}\,t}, \qquad t \geq 0$  (90)
The Kalman filter equivalent: We can solve the same LMV problem by using an alternative approach based on state-space techniques that leads to a Kalman filter solution.
Specifically, we can re-write Equation 79 in terms of a state-space model and run the continuous-time Kalman filter in the steady state. In the steady state $\dot{P}(t) = 0$, so the Riccati equation (67) gives
$0 = FP_\infty + P_\infty F^T + Q - K_\infty R K_\infty^T = FP_\infty + P_\infty F^T + Q - P_\infty H^T R^{-1}HP_\infty,$  (91)
and the estimate is generated by
$\frac{d\hat{x}(t)}{dt} = F\hat{x}(t) + K_\infty\left[z(t) - H\hat{x}(t)\right]$  (92)
where $K_\infty = P_\infty H^T R^{-1}$. Rearranging,
$\frac{d\hat{x}(t)}{dt} - (F - K_\infty H)\hat{x}(t) = K_\infty z(t)$
Taking the Laplace transform and ignoring initial conditions (since we are in the steady state) we have
$(sI - F + K_\infty H)\hat{X}(s) = K_\infty Z(s)$
or
$\frac{\hat{X}(s)}{Z(s)} = H^K_{opt}(s) = K_\infty\left[sI - F + K_\infty H\right]^{-1}$  (93)
This last equation defines a transfer function between the state estimate and the measurement which, when multiplied by $H$, gives the Wiener optimal filter,
$H^W_{opt}(s) = H\,H^K_{opt}(s).$  (94)
(94)
Note that the causal Wiener filter and the continuous-time Kalman filter are equivalent under the
assumptions of time-invariance and one can be determined from the other. The key difference is that
one approach is based on state-space models and the other on frequency domain concepts (auto and
cross correlations).
Example: Let us re-work the example we considered before using the Kalman filter approach.
The spectral density of $x(t)$ is decomposed as
$S_X(s) = \frac{1}{-s^2 + 1} = \frac{1}{s + 1}\,\frac{1}{-s + 1}$
Then this implies that the signal can be modelled by the state-space system
$\dot{x}(t) = -x(t) + w(t)$  (95)
$z(t) = x(t) + v(t)$  (96)
with $F = -1$, $Q = 1$, $H = 1$, $R = 1$.
The steady-state Riccati equation (91) gives
$-2P_\infty - P_\infty^2 + 1 = 0$
or $P_\infty = \sqrt{2} - 1$. Hence $K_\infty = P_\infty H^T R^{-1} = \sqrt{2} - 1$.
The optimal transfer function between the state and measurement is therefore
$H^K_{opt}(s) = \frac{\sqrt{2} - 1}{s + \sqrt{2}}$
giving
$H^W_{opt}(s) = H\,H^K_{opt}(s) = \frac{\sqrt{2} - 1}{s + \sqrt{2}}$
as before.
This section deals with some variants of the discrete Kalman filter which prove useful when some of
the assumptions of the conventional Kalman filter break down. Recall that three of the key problem
areas for a Kalman filter are,
1. Initialisation: we assume that the initial state vector and its error covariance matrix are known.
2. Modelling: we assume that we have an accurate linear model of the process and measurement
system.
3. Noise: we assume that the process and sensor noise processes are Gaussian.
In this section we look at how to deal with each of these problems.
We begin by considering the information filter which is a variant on the conventional Kalman filter
which gives more accurate results when there is no information about the initial state.
Next we consider how to cope with modelling error. In most practical cases the linear equations describing the system and observation models (Equations 14 and 15) are not a good approximation to reality. Although, as we have seen, it is possible to detect that our assumptions about modelling and noise are invalid, it is clear that what we need to do is extend the estimation approach to accommodate nonlinear models. Recall that for the Kalman filter algorithm, the estimate is the conditional mean of the state given all the measurements up to time $k+1$. We showed in Section 1 that under the assumption that the process was linear and the noise processes white, this led to a linear, recursive solution of the form
$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}\left[z_{k+1} - H_{k+1}\hat{x}_{k+1|k}\right].$  (97)
We show that there exists an equivalent version of Equation 97 that can be used in the case of a
nonlinear system model. We do this by linearising the (non-linear) state and observation matrices
about the estimated trajectory. This leads to the so-called Extended Kalman Filter (EKF) which is
the best linear estimator with respect to the minimum-mean-square error.
Finally we take a brief look at validating measurements where due to non-Gaussian sensor noise
some measurements could be confused with background clutter or outliers.
4.1 The Information Filter
The information filter is a variant of the conventional Kalman filter that is preferable when:
1. the measurement dimension is large compared to that of the process noise; or,
2. the initial system state is unknown.
As with the conventional algorithm, the information filter is a recursive linear estimator that repeatedly estimates the state of a variable called the information vector and its error covariance matrix
called the information matrix. This idea should be familiar from the first part of the course where
we discussed recursive least squares.
The information filter propagates the information state vector, defined as
$\hat{y}_{k+1|k+1} = P^{-1}_{k+1|k+1}\hat{x}_{k+1|k+1}$  (98)
It is then straightforward to prove that its covariance is the inverse of the error covariance matrix, $P^{-1}_{k+1|k+1}$, or information matrix.
Update equations: Recall that the conventional Kalman filter update equations for the error covariance matrix and gain are given by
$P_{k+1|k+1} = (I - K_{k+1}H_{k+1})P_{k+1|k}$  (99)
$K_{k+1} = P_{k+1|k}H_{k+1}^T\left[H_{k+1}P_{k+1|k}H_{k+1}^T + R_{k+1}\right]^{-1}$  (100)
The corresponding update for the information matrix is
$P^{-1}_{k+1|k+1} = P^{-1}_{k+1|k} + H_{k+1}^T R_{k+1}^{-1}H_{k+1}$  (101)
Proof: substitute Equation 100 into Equation 99 and apply the matrix inversion lemma.
The gain can be written in terms of the updated covariance as
$K_{k+1} = P_{k+1|k+1}H_{k+1}^T R_{k+1}^{-1}$  (102)
Proof:
$K_{k+1} = P_{k+1|k}H^T\left[HP_{k+1|k}H^T + R\right]^{-1}$
$\quad = P_{k+1|k+1}P^{-1}_{k+1|k+1}P_{k+1|k}H^T R^{-1}R\left[HP_{k+1|k}H^T + R\right]^{-1}$
$\quad = P_{k+1|k+1}P^{-1}_{k+1|k+1}P_{k+1|k}H^T R^{-1}\left[HP_{k+1|k}H^T R^{-1} + I\right]^{-1}$
$\quad = P_{k+1|k+1}\left[P^{-1}_{k+1|k} + H^T R^{-1}H\right]P_{k+1|k}H^T R^{-1}\left[HP_{k+1|k}H^T R^{-1} + I\right]^{-1}$
$\quad = P_{k+1|k+1}\left[I + H^T R^{-1}HP_{k+1|k}\right]H^T R^{-1}\left[HP_{k+1|k}H^T R^{-1} + I\right]^{-1}$
$\quad = P_{k+1|k+1}H^T R^{-1}\left[I + HP_{k+1|k}H^T R^{-1}\right]\left[HP_{k+1|k}H^T R^{-1} + I\right]^{-1}$
$\quad = P_{k+1|k+1}H^T R^{-1}$
Substituting Equations 101 and 102 into the state update (Equation 16) gives the update for the estimate in terms of information:
$\hat{x}_{k+1|k+1} = P_{k+1|k+1}\left[P^{-1}_{k+1|k}\hat{x}_{k+1|k} + H_{k+1}^T R_{k+1}^{-1}z_{k+1}\right]$  (103)
Note that this is exactly the form we derived from Bayes' Rule in the first part of the course: it is an information weighted sum of prediction and measurement.
The update can be written succinctly as
$\hat{y}_{k+1|k+1} = \hat{y}_{k+1|k} + H_{k+1}^T R_{k+1}^{-1}z_{k+1}$
Prediction equations: Recall that the state prediction is defined by
$\hat{x}_{k+1|k} = F\hat{x}_{k|k} + Gu$
$P_{k+1|k} = FP_{k|k}F^T + Q$
It follows that the corresponding prediction equations for the information filter are
$\hat{y}_{k+1|k} = P^{-1}_{k+1|k}FP_{k|k}\hat{y}_{k|k} + P^{-1}_{k+1|k}Gu$  (104)
$P^{-1}_{k+1|k} = \left[FP_{k|k}F^T + Q\right]^{-1}$  (105)
In summary, the information filter algorithm is as follows.
Prediction:
$\hat{y}_{k+1|k} = P^{-1}_{k+1|k}FP_{k|k}\hat{y}_{k|k} + P^{-1}_{k+1|k}Gu$
$P^{-1}_{k+1|k} = \left[FP_{k|k}F^T + Q\right]^{-1}$
Update:
$P^{-1}_{k+1|k+1} = P^{-1}_{k+1|k} + H_{k+1}^T R_{k+1}^{-1}H_{k+1}$
$\hat{y}_{k+1|k+1} = \hat{y}_{k+1|k} + H_{k+1}^T R_{k+1}^{-1}z_{k+1}$
A minimal sketch of one cycle of these equations is given below.
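The fragment below is a sketch of one prediction-update cycle written directly from the equations above; yhat denotes the information vector, Pinv the information matrix, and the function name, argument order and inclusion of a control term are illustrative assumptions only.
function [ynew, Pinv_new] = information_filter_step(yhat, Pinv, z, u, F, G, H, Q, R)
% prediction (Equations 104 and 105)
P = inv(Pinv);                       % covariance of the previous estimate
Ppred = F*P*F' + Q;
Pinv_pred = inv(Ppred);
ypred = Pinv_pred*F*P*yhat + Pinv_pred*G*u;
% update (Equation 101 and the succinct information update)
Pinv_new = Pinv_pred + H'*inv(R)*H;
ynew = ypred + H'*inv(R)*z;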
Comments: As noted at the beginning of this section certain problems are better solved using the information filter rather than the conventional Kalman filter.
In the case that there is no information about the initial state, the magnitude of $P_{0|0}$ should be set very large in the conventional Kalman filter. This may lead to significant loss in accuracy. The information filter does not have this problem as it uses the inverse of the initial state covariance error matrix.
When the dimension of the measurement vector $m$ is significantly larger than that of the process noise $p$, the information filter is computationally more efficient than the conventional Kalman filter. This is because one of the computationally expensive steps in either case is a matrix inversion. In the case of the conventional Kalman filter, the inversion of the $m \times m$ matrix $(H_{k+1}P_{k+1|k}H_{k+1}^T + R_{k+1})$ is required. In the case of the inverse covariance filter we compute the inverse of the $p \times p$ matrix defined by Equation 105.
Finally, note that given the output from one filter it is easy to find the equivalent output of the other filter using Equation 98 and the information matrix (inverse covariance).
4.2 The Extended Kalman Filter
We now consider the case in which the system and observation models are nonlinear:
$x_{k+1} = f(x_k, u_k, k) + w_k$  (106)
$z_k = h(x_k, k) + v_k$  (107)
where $f(\cdot, \cdot, k)$ is a nonlinear state transition function and $h(\cdot, \cdot, k)$ is the nonlinear observation function.
We assume that the process and measurement noise are Gaussian, uncorrelated and zero-mean, and have no cross-correlation. Thus
$E[w_k] = 0, \qquad E[v_k] = 0$
$E[w_i w_j^T] = \delta_{ij}Q_i, \qquad E[v_i v_j^T] = \delta_{ij}R_i, \qquad E[w_i v_j^T] = 0$
4.2.1 Prediction
As in the linear case, we assume that we have at time $k$ an estimate and covariance
$\hat{x}_{k|k} = E[x_k \mid Z^k], \qquad P_{k|k}$
To generate the prediction we expand Equation 106 in a Taylor series about the prediction $\hat{x}_{k|k}$ up to the first-order terms:
$x_{k+1} = f(\hat{x}_{k|k}, u_k, k) + \nabla f_x\left[x_k - \hat{x}_{k|k}\right] + O\!\left(\left[x_k - \hat{x}_{k|k}\right]^2\right) + w_k$  (108)
where the Jacobian $\nabla f_x = \partial f/\partial x$ is evaluated at $\hat{x}_{k|k}$. Truncating to first order and taking the expectation conditioned on $Z^k$ gives the state prediction
$\hat{x}_{k+1|k} = E[x_{k+1} \mid Z^k] \approx f(\hat{x}_{k|k}, u_k, k)$  (109)
The state covariance can be found as follows. First the prediction error is given by
$\tilde{x}_{k+1|k} = x_{k+1} - \hat{x}_{k+1|k}$
$\quad = f(\hat{x}_{k|k}, u_k, k) + \nabla f_x\left[x_k - \hat{x}_{k|k}\right] + O\!\left(\left[x_k - \hat{x}_{k|k}\right]^2\right) + w_k - f(\hat{x}_{k|k}, u_k, k)$
$\quad \approx \nabla f_x\left[x_k - \hat{x}_{k|k}\right] + w_k$
$\quad = \nabla f_x\,\tilde{x}_{k|k} + w_k$  (110)
The prediction covariance is then found by taking the expectation of the product of the prediction error with its transpose:
$P_{k+1|k} = E\left[\tilde{x}_{k+1|k}\tilde{x}_{k+1|k}^T \mid Z^k\right] = \nabla f_x P_{k|k}\nabla f_x^T + Q_k$  (111)
f
Note that the prediction covariance has the same form as its linear equivalent with the Jacobian
playing the role of transition matrix Fk .
Similarly, expanding Equation 107 about the predicted state,
$z_{k+1} = h(\hat{x}_{k+1|k}) + \nabla h_x\left[x_{k+1} - \hat{x}_{k+1|k}\right] + O\!\left(\left[x_{k+1} - \hat{x}_{k+1|k}\right]^2\right) + v_{k+1}$
Truncating to first order and taking expectations yields the predicted observation
$\hat{z}_{k+1|k} \approx h(\hat{x}_{k+1|k})$  (112)
The innovation is then
$\nu_{k+1} = z_{k+1} - h(\hat{x}_{k+1|k})$  (113)
and the innovation covariance is
$S_{k+1} = E\left[\nu_{k+1}\nu_{k+1}^T\right]$
$\quad = E\left[(z_{k+1} - h(\hat{x}_{k+1|k}))(z_{k+1} - h(\hat{x}_{k+1|k}))^T\right]$
$\quad = \nabla h_x P_{k+1|k}\nabla h_x^T + R_{k+1}$  (114)
where the Jacobian $\nabla h_x = \partial h/\partial x$ is evaluated at $\hat{x}_{k+1|k}$.
By analogy with the linear case, the gain is
$K_{k+1} = P_{k+1|k}\nabla h_x^T S_{k+1}^{-1}$  (115)
and the state and covariance updates are
$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}\left[z_{k+1} - h(\hat{x}_{k+1|k})\right]$  (116)
$P_{k+1|k+1} = P_{k+1|k} - K_{k+1}S_{k+1}K_{k+1}^T$  (117)
In summary, the Extended Kalman Filter equations are as follows.
Prediction:
$\hat{x}_{k+1|k} = f(\hat{x}_{k|k}, u_k, k)$  (118)
$P_{k+1|k} = \nabla f_x P_{k|k}\nabla f_x^T + Q_k$  (119)
Update:
$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}\left[z_{k+1} - h(\hat{x}_{k+1|k})\right]$  (120)
$P_{k+1|k+1} = P_{k+1|k} - K_{k+1}S_{k+1}K_{k+1}^T$  (121)
where
$K_{k+1} = P_{k+1|k}\nabla h_x^T S_{k+1}^{-1}$  (122)
and
$S_{k+1} = \nabla h_x P_{k+1|k}\nabla h_x^T + R_{k+1}$  (123)
A number of comments about the EKF are in order:
1. The Jacobians $\nabla f_x$ and $\nabla h_x$ are functions of both the state and the timestep; they are not constant.
2. Stability: Since we are dealing with perturbation models of the state and observation matrices
about the predicted trajectory, it is important that predictions are close enough to the true states
otherwise the filter will be poorly matched and possibly diverge.
3. Initialisation: Unlike in the linear case, special care has to be taken when initialising the
Extended Kalman filter.
4. Computational cost: The Extended Kalman filter is computationally significantly more expensive than its linear counterpart. This limited its early use in applications. However, today
real-time computing implementations of the EKF can be achieved using moderate computing
resources.
4.4 Implementation
Implementation issues are similar to those of the linear Kalman filter and you can test the performance of the filter using all the techniques introduced in Section 2.
In particular, special care has to be taken to check whether the system and noise process modelling assumptions are met. There are obviously some errors introduced by using a linearised model.
A further important point to note is that the state covariance matrix is only an approximation to the mean square error and not a true covariance. Recall that $P_{k+1|k+1}$ determines the weight given to new measurements in the updating procedure. Thus, if $P_{k+1|k+1}$ is erroneous and becomes small, the measurements have little effect on the estimation and it is quite possible that the EKF will diverge. A sketch of a generic EKF cycle is given below.
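The fragment below sketches one EKF cycle following Equations 118-123. The function handles f and h and the Jacobian routines Fjac and Hjac are assumptions used purely for illustration; in a real implementation they encode the particular process and observation models.
function [xnew, Pnew] = ekf_step(x, P, u, z, f, Fjac, h, Hjac, Q, R)
% prediction (Equations 118 and 119)
xpred = f(x, u);
Fx = Fjac(x, u);                     % Jacobian of f evaluated at the current estimate
Ppred = Fx*P*Fx' + Q;
% update (Equations 120-123)
Hx = Hjac(xpred);                    % Jacobian of h evaluated at the predicted state
S = Hx*Ppred*Hx' + R;
K = Ppred*Hx'*inv(S);
xnew = xpred + K*(z - h(xpred));
Pnew = Ppred - K*S*K';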
Example: Consider estimating the location of a mobile vehicle whose state comprises its position and orientation, $x_k = [x_k\;\; y_k\;\; \phi_k]^T$ (see the figure below), and which is driven by a control input $u_k$.
The motion of the vehicle can be described by a nonlinear state transition equation of the form
$\begin{bmatrix} x_{k+1} \\ y_{k+1} \\ \phi_{k+1} \end{bmatrix} = f(x_k, u_k) + q_k$  (124)
Here, $\Delta t$ is the time interval between time steps, $B$ is the wheel baseline and $q_k$ is the noise vector which combines errors in modelling the process and control.
We assume that measurements of range (depth) and bearing are made to a set of beacons at fixed locations $B_i = [X_i\;\; Y_i]^T$, $i = 1, \ldots, N$. The nonlinear observation model for a beacon at $[X\;\; Y]^T$ is therefore
$z_k = \begin{bmatrix} r_k \\ \theta_k \end{bmatrix} = \begin{bmatrix} \sqrt{(X - x_k)^2 + (Y - y_k)^2} \\ \tan^{-1}\dfrac{Y - y_k}{X - x_k} - \phi_k \end{bmatrix} + \mathbf{r}_k$  (125)
where $\mathbf{r}_k$ is the measurement noise.
(Figure: observation geometry, showing the vehicle state $x(k) = (x, y, \phi)^T$ relative to the origin, a beacon $B_i$, and the range-bearing measurement $z(k) = (r, \theta)$.)
Prediction: the state prediction is obtained by substituting the current estimate into the state transition function (Equation 124 with the noise set to its mean, zero):
$\begin{bmatrix} \hat{x}_{k+1|k} \\ \hat{y}_{k+1|k} \\ \hat{\phi}_{k+1|k} \end{bmatrix} = f(\hat{x}_{k|k}, u_k)$  (126)
and the prediction covariance is
$P_{k+1|k} = \nabla f_x P_{k|k}\nabla f_x^T + Q_k$  (127)
where
$\nabla f_x = \left.\frac{\partial f}{\partial x}\right|_{\hat{x}_{k|k}}$  (128)
is the Jacobian of the state transition function evaluated at the current estimate.
Update:
The equations for the updating of the state and its covariance are:
$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}\left[z_k - h(\hat{x}_{k+1|k})\right]$
$P_{k+1|k+1} = P_{k+1|k} - K_{k+1}S_{k+1}K_{k+1}^T$  (129)
where
$K_{k+1} = P_{k+1|k}\nabla h_x^T S_{k+1}^{-1}$  (130)
and
$S_{k+1} = \nabla h_x P_{k+1|k}\nabla h_x^T + R_{k+1}$  (131)
The Jacobian of the observation model is
$\nabla h_x = \begin{bmatrix} \dfrac{\hat{x}_{k+1|k} - X}{d} & \dfrac{\hat{y}_{k+1|k} - Y}{d} & 0 \\[1ex] -\dfrac{\hat{y}_{k+1|k} - Y}{d^2} & \dfrac{\hat{x}_{k+1|k} - X}{d^2} & -1 \end{bmatrix}$  (132)
Here $d = \sqrt{(X - \hat{x}_{k+1|k})^2 + (Y - \hat{y}_{k+1|k})^2}$.
Measurement validation: a measurement $z_{k+1}$ is accepted only if it falls inside the validation gate region
$R(g) = \left\{ z : (z - \hat{z}_{k+1|k})^T S_{k+1}^{-1}(z - \hat{z}_{k+1|k}) \leq g^2 \right\}$  (133)
In the context of the Kalman Filter algorithm, Equation 133 constrains the region of space where we
look for a measurement. We assume that the correct measurement will be detected in this region. It
is possible, however, that more than one measurement will be in the valid region. The problem of
distinguishing between the correct measurement and measurements arising from background clutter
and other targets (false alarms) is called data association. This topic falls outside of the scope of
the course (see, for example, Bar-Shalom and Fortmann).
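A sketch of the gate test of Equation 133 for a single candidate measurement; zpred and S are the predicted observation and innovation covariance, and the threshold g is an assumed design parameter (typically chosen from the $\chi^2$ distribution for the innovation dimension).
nu = z - zpred;                      % innovation for the candidate measurement
accept = (nu'*inv(S)*nu <= g^2);     % accept only if it lies inside the validation gate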