Multiple-model estimation with variable structure

This paper discusses the limitations of existing multiple-model (MM) estimation algorithms that utilize a fixed structure of models, highlighting that both too few and too many models can negatively impact performance. It introduces the concept of variable structure MM estimation, proposing new algorithms that adapt the model set to improve estimation accuracy while managing computational burdens. Theoretical results and practical examples demonstrate the advantages of this approach over traditional fixed structure methods.


478 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 41, NO. 4, APRIL 1996

Multiple-Model Estimation with Variable Structure


Xiao-Rong Li, Senior Member, IEEE, and Yaakov Bar-Shalom, Fellow, IEEE

Abstract: Existing multiple-model (MM) estimation algorithms have a fixed structure, i.e., they use a fixed set of models. An important fact that has been overlooked for a long time is how the performance of these algorithms depends on the set of models used. Limitations of the fixed structure algorithms are addressed first. In particular, it is shown theoretically that the use of too many models is performance-wise as bad as that of too few models, apart from the increase in computation. This paper then presents theoretical results pertaining to the two ways of overcoming these limitations: select/construct a better set of models and/or use a variable set of models. This is in contrast to the existing efforts of developing better implementable fixed structure estimators. Both the optimal MM estimator and practical suboptimal algorithms with variable structure are presented. A graph-theoretic formulation of multiple-model estimation is also given which leads to a systematic treatment of model-set adaptation and opens up new avenues for the study and design of the MM estimation algorithms. The new approach is illustrated in an example of a nonstationary noise identification problem.

I. INTRODUCTION

MULTIPLE-MODEL (MM) estimation is a powerful approach to adaptive estimation. It is particularly good for problems involving structural as well as parametric changes. In this approach, a set of models is selected/designed to represent (or cover) the possible system behavior patterns (called system modes), and the overall estimate is obtained by a certain combination of the estimates from the filters running in parallel based on the individual models that match the system modes. This approach was initiated in [26]. The early work did not consider jumps in system modes and led to the nonswitching MM algorithms. In the more recent and more realistic switching MM algorithms, first proposed in [1], the jumping of system modes is modeled by switching between models.

The MM approach is best described in terms of stochastic hybrid systems. A hybrid system is one that can be suitably described in a hybrid space $\mathbb{R}^{n_x} \times S$, the Cartesian product of the continuous-valued base state space $\mathbb{R}^{n_x}$ and a discrete set $S$, the collection of the system modes (modal states) which characterize the behavior patterns of the system. A stochastic hybrid system thus distinguishes itself from conventional systems in its imbedded random jump process which governs the (random) sudden transition of its operational modes (system behavior patterns). Many real-world problems can be successfully formulated in terms of such systems. Typical examples can be found in systems subject to failures/repairs, piecewise linearization of nonlinear systems, target tracking, reconfigurable systems, etc. [17], [18], [10]. Increasing attention has been given recently to hybrid systems due to their wide applicability.

To the authors' knowledge, all of the existing MM estimators developed prior to [21], except [14] and [29], use a fixed set of models, i.e., a fixed structure, at each time. When these algorithms are applied to solve real-world problems, it is often the case that the use of only a small number of models is not good enough. Such situations exist in, for example, failure detection and isolation, where many parts can fail or deteriorate. The use of more models increases the computational burden considerably. More importantly, it does not necessarily improve the performance. In fact, the performance will deteriorate if too many models are used due to the excessive "competition" from the "unnecessary" (excess) models. Thus, one may face a dilemma: more models have to be used to improve the accuracy, but the use of too many models can degrade the performance, not to mention the increase in computational burden.

To find a way out of this dilemma, this paper introduces the concept of variable structure MM estimation and proposes several variable structure algorithms, in contrast to the existing efforts of developing better implementable fixed structure estimators.

The paper is organized as follows. After defining in Section II the hybrid estimation problem to be considered, Section III briefly describes the fixed structure algorithms and the associated problems. In Section IV, theoretical results pertaining to the effects of the choice of model set are obtained. Section V deals with the theoretical aspects of the variable structure estimators. The MM estimation algorithms are formulated in a new graph-theoretic framework in Section VI for better design and study of both fixed and variable structure algorithms. Then a number of variable structure frameworks for MM estimation are presented in Section VII. Numerical examples of a nonstationary noise identification problem given in Section VIII illustrate the superiority of the proposed variable structure algorithms to the existing fixed structure algorithms. The conclusions are summarized in the last section.
Manuscript received July 3, 1992; revised May 19, 1995. Recommended by Associate Editor, P. J. Ramadge. This work was supported by ONR/BMDO under Grant N00014-91-J-1950 and by the NSF under Grant ECS-9496319.
X.-R. Li is with the Department of Electrical Engineering, University of New Orleans, New Orleans, LA 70148 USA.
Y. Bar-Shalom is with the Department of Electrical and Systems Engineering, University of Connecticut, Storrs, CT 06268 USA.
Publisher Item Identifier S 0018-9286(96)02828-0.

0018-9286/96$05.00 © 1996 IEEE

II. PROBLEM STATEMENT OF THE HYBRID ESTIMATION

Consider a stochastic hybrid system


LI AND BAR-SHALOM: MULTIPLE-MODEL ESTIMATION 419

with (possibly state-dependent) Markovian transition of the system mode

$$P\{m_j(k+1) \mid m_i(k), x(k)\} = \phi[k, m_i, m_j, x(k)] \quad \forall m_i, m_j \in S \tag{2}$$

and the mode-dependent measurements of the base state

$$z(k) = h[k, x(k), m(k)] + w[k, m(k), x(k)] \tag{3}$$

where $x$ is the base state vector; $z$ is the noisy measurement vector; $m(k)$ is the modal state (system mode index) at time $k$, which denotes the mode in effect during the sampling period $k$; $P\{\cdot\}$ denotes probability; the event that mode $m_j$ is in effect at time $k$ is denoted as

$$m_j(k) \triangleq \{m(k) = m_j\} \tag{4}$$

$S$ is the set of all modal states at all times; and $v[k, m(k), x(k)]$ and $w[k, m(k), x(k)]$ are the state-dependent (mode-dependent, in particular) process and measurement noise sequences, respectively. It is implied by (3) that the base state measurements are noisy and mode-dependent, and the mode information is imbedded in the measurement sequence. In other words, the system mode sequence is an indirectly observed (and state-dependent) Markov chain¹. The hybrid state $(x, m)$ is sometimes denoted by $\xi$ in this paper.

¹ It is known as the hidden (state-dependent) Markov chain in the speech recognition literature [30].

The problem of hybrid estimation is to estimate the base state and the modal state based on the sequence of the noisy (mode-dependent) measurements.

III. EXISTING MULTIPLE-MODEL ESTIMATION AND ITS LIMITATIONS

An estimation algorithm that uses the same set of models at all times is referred to as a fixed structure or fixed model-set MM algorithm. To the authors' knowledge, all of the MM algorithms prior to [21] belong to this class with [14] and [29] being the only exceptions. Several papers appeared after [21], such as [24]. In the fixed model-set MM approach, a set of models must be determined in advance. Denoting by $M$ this fixed set of models assumed in the algorithm, system (1)-(3) is approximated by one that consists of a set of $N$ conventional models as

$$x(k+1) = f_j[k, x(k)] + g_j[k, x(k), v_j[k, x(k)]] \quad \forall m_j \in M \tag{5}$$

$$z(k) = h_j[k, x(k)] + w_j[k, x(k)] \quad \forall m_j \in M \tag{6}$$

and jumps between the system modes are modeled by switching from one model to another governed by a Markov law similar to (2). Here, $N = |M|$ is the cardinality of $M$, i.e., the number of models used, and the subscript $j$ denotes quantities pertaining to model $m_j$. For example

$$h_j[k, x(k)] \triangleq h[k, x(k), m_j(k)], \qquad w_j[k, x(k)] \triangleq w[k, m_j(k), x(k)]. \tag{7}$$

It is thus clear that the MM approach fits well into problems that can be characterized by structural as well as parametric changes. The existing MM algorithms have a fixed structure in the sense that the set $M$ used in (5) and (6) is time-invariant, even though the models themselves may be time-varying.

Let $z^k = \{z(0), z(1), \ldots, z(k)\}$ be the measurement sequence through time $k$ (or more rigorously, the corresponding $\sigma$-algebra), where $z(0)$ denotes the initial information.

The state estimate and its associated covariance matrix are calculated in a fixed model-set MM algorithm using the minimum mean-square error (MMSE) criterion as follows:

$$\hat{x}(k|k) = \sum_j \hat{x}_j(k|k)\, P\{H_j(k) \mid z^k\} \tag{8}$$

$$P(k|k) = \sum_j \left\{ P_j(k|k) + [\hat{x}(k|k) - \hat{x}_j(k|k)]\,[\hat{x}(k|k) - \hat{x}_j(k|k)]' \right\} P\{H_j(k) \mid z^k\} \tag{9}$$

where $\hat{x}_j(k|k)$ is the optimal estimate at time $k$ under the hypothesis $H_j(k) = \{$mode history $j$ through time $k$ is the true one$\}$², and $P_j(k|k)$ is the associated covariance. As such, there are $N$ assumed possible modes at each time (since any model in $M$ can be in effect at any time) and thus $N^k$ hypotheses (possible mode histories) at time $k$. Therefore, the full-hypothesis-tree (FHT) estimator in this setting has to use $N^k$ different permutations³ of $N$ models at time $k$, and both summations in (8) and (9) are over all hypotheses. This exponential increase in computation and memory renders the implementation of the FHT algorithms infeasible in real time. This is true even for the simplest stochastic hybrid systems, the so-called (Markovian) "jump-linear" systems [27]. In view of this, existing efforts have been focusing on developing more cost-effective real-time implementable versions of this FHT estimator using certain hypothesis management techniques such as merging "similar" hypotheses and/or pruning "unlikely" hypotheses to keep the number of the remaining hypotheses within a certain limit. As such, the summations in (8) and (9) are over the remaining hypotheses. Different suboptimal MM algorithms are distinct from each other in the criteria and techniques of hypothesis management. The above discussion also holds for algorithms based on other optimality criteria, such as the MAP (maximum a posteriori) estimators.

² A mode history through time $k$ is defined as a sequence of modes from the initial time up to and including time $k$, that is, $\{m(i)\}_{i=0}^{k}$.

³ That is, each of the $N$ recursive filters at time $k$ has to run $N^{k-1}$ times, each time with a different previous estimate.

The problems associated with the fixed structure MM algorithms are closely related to an important fact hardly mentioned in the literature and largely ignored in the MM estimation theory: the performance of an MM estimator depends to a large extent on the model set $M$ used.

If $S$, the set of all (true) system modes, is known and not too large, it is natural to choose $M$ to exactly match this set⁴.

⁴ In this paper, a "mode" refers to the system behavior pattern and a "model" refers to the (possibly reduced order or simplified) representation (or description) on which the estimator is based. Such a distinction is convenient and necessary where the mismatch of the model and mode is a major concern. In other cases, they may be used interchangeably.
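The combination rule in (8) and (9) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `combine_estimates`, the list-based vector and matrix types, and the numbers in the example are all assumptions made here. It takes the hypothesis-conditioned estimates, their covariances, and the hypothesis probabilities as given, and returns the probability-weighted mean and the mixture covariance.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def combine_estimates(
    estimates: Sequence[Vector],
    covariances: Sequence[Matrix],
    probabilities: Sequence[float],
) -> Tuple[Vector, Matrix]:
    """Probability-weighted mean and mixture covariance, as in (8) and (9)."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("hypothesis probabilities must sum to one")
    n = len(estimates[0])
    # (8): the overall estimate is the weighted sum of the conditional estimates.
    mean = [sum(p * x[i] for x, p in zip(estimates, probabilities)) for i in range(n)]
    # (9): each term contributes its own covariance plus a spread-of-means term.
    cov = [[0.0] * n for _ in range(n)]
    for x, P, p in zip(estimates, covariances, probabilities):
        d = [mean[i] - x[i] for i in range(n)]
        for i in range(n):
            for j in range(n):
                cov[i][j] += p * (P[i][j] + d[i] * d[j])
    return mean, cov


# Two hypothetical hypotheses with a scalar state (illustrative numbers only).
mean, cov = combine_estimates([[0.0], [2.0]], [[[1.0]], [[1.0]]], [0.5, 0.5])
```

In this example the combined covariance exceeds either conditional covariance because the two conditional estimates disagree; that disagreement is what the outer-product term in (9) accounts for.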

However, since $S$ is usually not known exactly or is too large, a set of models that can "cover" in some sense the possible system modes at any time should be selected or constructed; this is the major task in model design for MM estimation. To have reliable results, at least one of the models in $M$ must be "close"⁵ to the system mode in effect at any time. The existing MM algorithms, which use a fixed set of models, usually perform reasonably well for problems that can be handled with a small set of good models. However, in many practical situations, especially with high-dimensional systems, this requirement is not satisfied. The use of more models in an MM algorithm will increase the computational burden considerably. What is even worse is that, to many people's surprise, increasing the number of models in a fixed structure MM algorithm does not guarantee better performance; rather, it may yield poorer results. This could be true even if an FHT estimator is used. This is a dilemma: additional models should be used to improve the accuracy, but the use of too many models can degrade the performance, let alone the increase in computation. There are two possible ways out of this dilemma: 1) to design a better set of models and 2) to use a variable set of models. These are the two main themes of this paper.

⁵ The optimal estimation procedure for the nonswitching model situation will converge to the model in $M$ which is closest to the true mode in an information distance measure [5], [6], [2].

Note that the sequence of the hybrid states $\xi(k) = [x(k), m(k)]$ is Markov, whereas the base state alone is not. It seems better to develop a recursive estimator for $\xi$ directly to eliminate the explosion in computational requirements of the above hypothesis-tree-based approach. This would be really nice if it were true. For simplicity, however, consider a jump-linear system with the linear-Gaussian assumption of the Kalman filter (for each system mode). Suppose that $p[x(k-1) \mid m_j(k-1), z^{k-1}]$ is Gaussian. Then

$$p[x(k-1), m_j(k-1) \mid z^{k-1}] = p[x(k-1) \mid m_j(k-1), z^{k-1}]\, P\{m_j(k-1) \mid z^{k-1}\} \tag{10}$$

is a scaled Gaussian probability density function (pdf). Although $p[x(k), m_j(k) \mid x(k-1), m_i(k-1), z^k]$ is again Gaussian, it requires the perfect knowledge of $x(k-1)$ and $m(k-1)$ which is not available to the estimator. In fact

$$p[x(k), m_j(k) \mid x(k-1), z^k] = \sum_i p[x(k), m_j(k) \mid x(k-1), m_i(k-1), z^k]\, P\{m_i(k-1) \mid x(k-1), z^k\} \tag{11}$$

is no longer a Gaussian pdf (it is a Gaussian mixture of $N = |M|$ terms) and thus

$$p[x(k), m_j(k) \mid z^k] = \int p[x(k), m_j(k) \mid x(k-1), z^k]\, p[x(k-1) \mid z^k]\, dx(k-1). \tag{12}$$

At time $k+1$, the number of terms in the Gaussian mixture becomes $N^2$ because neither $m(k-1)$ nor $m(k)$ is known perfectly. Consequently, the best Gaussian approximation for $p[x(k), m(k) \mid z^k]$ at each time in this "Markovian approach" is equivalent to the Generalized Pseudo-Bayesian algorithm of first order (GPB1) [9]. If a Gaussian approximation for each $p[x(k), m_j(k) \mid z^k]$ is used, the result is equivalent to the GPB2 algorithm.

IV. DESIGN OF THE MODEL SET

Although the importance of the model set on the performance has been hardly mentioned in the literature, it is evident in practice since the primary difficulty in applying the MM estimation theory is to design an appropriate set of models. A major concern is thus the selection of the model set. This can provide a guideline for estimator design. Unfortunately, very limited theoretical results on this important issue are available⁶.

⁶ A recent paper [31] presents a design strategy for MM adaptive estimation with an uncertain parameter that may vary over a continuous region.

The major reason for the unsatisfactory performance of the existing fixed structure algorithms with a large model set is that many models in this set are so different from the system mode in effect at a particular time that not only is the computation for the filters based on these models a waste, but also the excessive "competition" from the "unnecessary" models degrades performance. To show this degradation theoretically and to obtain better insight into the MM algorithms, the effects of using different model sets are investigated in this section.

For simplicity, a static problem is considered with a measurement of the state based on (3) for a single time index that is omitted here.

Notations: Denote by $\hat{x}_A$ a multiple-model MMSE estimate using an arbitrary model set $A$, that is

$$\hat{x}_A = \sum_{m_j \in A} \hat{x}_j\, P\{m_j \mid z, A\}. \tag{13}$$

Similarly, denote by $\hat{x}_S$ the optimal MMSE estimate, where the subscript $S$ stands for the (optimal) set of models, that is, the set that exactly matches the true system mode set $S$. Since the expectation of a random variable can be used to define a norm, the norm squared $\|\cdot\|^2$ will be used as a shorthand notation for the conditional expectation, where the conditioning, if omitted, will be clear from the context. For example

[Equation (14) is not legible in the source.]

The advantage of this notation is that it implies a measure of "distance" in random setting. Note, however, that in the case where no random variable is involved, $\|\cdot\|$ reduces to the conventional two-norm.

The distance between an estimate and the optimal estimate will serve here as a measure of the quality of the estimate. The following proposition shows that the performance of an MM estimator will deteriorate in the case of using extra models as well as the effects of missing models.
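As a rough numerical illustration of using the distance to the optimal estimate as a quality measure, the sketch below forms the probability-weighted MM estimate for several model sets in a static, scalar setting. Everything in it is assumed for illustration: the mode labels, the unnormalized weights (standing in for likelihood times prior), the mode-conditioned estimates, and the helper name `mm_estimate` do not come from the paper.

```python
from typing import Iterable

# Hypothetical unnormalized weights (likelihood times prior) and
# mode-conditioned scalar estimates; illustrative values only.
WEIGHT = {"m1": 0.6, "m2": 0.4, "m3": 0.5, "m4": 0.5}
XHAT = {"m1": 1.0, "m2": 3.0, "m3": -4.0, "m4": -6.0}


def mm_estimate(model_set: Iterable[str]) -> float:
    """Weighted sum of mode-conditioned estimates over the given model set."""
    models = list(model_set)
    total = sum(WEIGHT[m] for m in models)  # normalization depends on the set
    return sum(WEIGHT[m] / total * XHAT[m] for m in models)


TRUE_SET = ["m1", "m2"]  # assumed to match the true system mode set
x_opt = mm_estimate(TRUE_SET)

# Distance between each candidate estimate and the optimal one.
distance = {
    "exact": abs(x_opt - mm_estimate(["m1", "m2"])),
    "missing m2": abs(x_opt - mm_estimate(["m1"])),
    "extra m3": abs(x_opt - mm_estimate(["m1", "m2", "m3"])),
    "extra m3, m4": abs(x_opt - mm_estimate(["m1", "m2", "m3", "m4"])),
}
```

With these particular numbers, both dropping a true mode and adding unneeded models move the estimate away from the optimal one. Other choices of extra models could partly cancel each other, so the sketch shows a tendency in one example rather than a general result.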

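Proposition 4.1: Suppose that the optimal model set $S$ and an arbitrary model set $A$ are given. Denote by $J$ the set of common models of $S$ and $A$

$$J = S \cap A \quad \text{(the set of common models)}. \tag{15}$$

Let $L$ be the mismatched models between $S$ and $A$ (i.e., those not common to them), that is, the set of missing models (i.e., those in $S$ but not in $A$) and extra models (i.e., those in $A$ but not in $S$), given by

$$L = (S \cup A) - (S \cap A) = (S - J) \cup (A - J) \quad \text{(symmetric difference)}. \tag{16}$$

Also let

$$\alpha_j = P\{m_j \mid z, A\} \quad \forall m_j \in A \quad \text{(mode probability)} \tag{17}$$

$$\beta_i = P\{m_i \mid z, S\} \quad \forall m_i \in S \quad \text{(mode probability)} \tag{18}$$

$$\hat{x}_j = E[x \mid z, m_j] \quad \text{(mode-conditioned estimate)}. \tag{19}$$

Then the distance between the optimal estimate $\hat{x}_S$ and the estimate $\hat{x}_A$ based on model set $A$ is proportional to the distance between $\hat{x}_S$ and the "equivalent" estimate $\hat{x}_L$ based on the set $L$ of the mismatched models

$$\|\hat{x}_S - \hat{x}_A\|^2 = (1 - b)^2\, \|\hat{x}_S - \hat{x}_L\|^2 \tag{20}$$

where $\hat{x}_L$ is the equivalent estimate (referred to set $A$) based on the mutually mismatched models of $S$ and $A$, given by⁷

$$\hat{x}_L = \frac{1}{1 - b} \left[ \sum_{m_j \in (A - J)} \hat{x}_j \alpha_j - b \sum_{m_i \in (S - J)} \hat{x}_i \beta_i \right] \tag{21}$$

and $b \geq 0$ is a constant, given by⁸

$$\alpha_j = b\, \beta_j \quad \forall m_j \in J. \tag{22}$$

⁷ The first summation is the weighted sum of the estimates based on the uncalled-for (extra) models, and the second one is of the estimates based on the missing models as calculated using $S$; it is converted to $A$ by scaling by $b$.

⁸ $b = 1$ if and only if $\hat{x}_A \equiv \hat{x}_S$.

Remark: Property (22) always holds true because, for any common model of sets $A$ and $S$, its probability is, by Bayes' formula, equal to the model likelihood times the prior divided by the normalization constant [9]. Since the model likelihood and the prior do not depend on the other models used, the only difference between $\alpha_j$ and $\beta_j$ for $m_j \in (S \cap A)$ is due to the different normalization constants used, which depend on the collective effect of the other models used in the set.

Proof: Since

$$\begin{aligned}
\hat{x}_A &= \sum_{m_j \in J} \hat{x}_j \alpha_j + \sum_{m_j \in (A - J)} \hat{x}_j \alpha_j \\
&= b \left( \hat{x}_S - \sum_{m_i \in (S - J)} \hat{x}_i \beta_i \right) + \sum_{m_j \in (A - J)} \hat{x}_j \alpha_j \\
&= b\, \hat{x}_S + (1 - b)\, \hat{x}_L
\end{aligned} \tag{23}$$

and (20) follows from

$$\hat{x}_S - \hat{x}_A = (1 - b)\, [\hat{x}_S - \hat{x}_L]. \tag{24}$$

Q.E.D.

Corollary 4.1: If either $A$ or $S$ is included in the other, then the equivalent estimate using the mismatched models becomes

$$\hat{x}_L = \begin{cases}
\dfrac{1}{1 - b} \displaystyle\sum_{m_j \in (A - S)} \hat{x}_j \alpha_j & \text{if } A \supseteq S \text{ (extra models case)} \\[2ex]
\dfrac{1}{1 - 1/b} \displaystyle\sum_{m_i \in (S - A)} \hat{x}_i \beta_i & \text{if } A \subset S \text{ (missing models case)}
\end{cases} \tag{25}$$

where $b \geq 1$ if $A \subset S$ and $b \leq 1$ if $A \supset S$.

Note that the sum of the probabilities of the mismatched models is $1 - b$ if $A \supseteq S$ or $1 - 1/b$ if $A \subset S$.

Proposition 4.1 and Corollary 4.1 indicate that the use of too many models is as bad as the use of too few models. They also show that the degradation of the estimation is proportional to both the mismatched model probabilities and the distance between the optimal estimate and the equivalent estimate using the mismatched models. An important question thus arises naturally: When do we add (or delete) a certain group of models, denoted as set $C$ below, to improve the performance? The following theorem provides an answer.

Theorem 4.1: Consider two model sets $A$ and $B$. Assume that $A$ is a subset of $B$. Let $C = B - A$. As in (22), the mode probabilities as calculated based on $B$ and on $A$ have the relation

$$P\{m_j \mid z, B\} = b\, P\{m_j \mid z, A\} \quad \forall m_j \in A \subset B \tag{26}$$

where $0 < b \leq 1$ is a constant⁹. Define the ratio $r$ and the angle $\theta$ between the two vectors $(\hat{x}_S - \hat{x}_A)$ and $(\hat{x}_S - \hat{x}_C)$

$$r \triangleq \frac{\|\hat{x}_S - \hat{x}_C\|}{\|\hat{x}_S - \hat{x}_A\|} \tag{27}$$

$$\cos\theta \triangleq \frac{(\hat{x}_S - \hat{x}_A)'(\hat{x}_S - \hat{x}_C)}{\|\hat{x}_S - \hat{x}_A\|\, \|\hat{x}_S - \hat{x}_C\|} \tag{28}$$

where $\hat{x}_S$ is the optimal estimate, $\hat{x}_A$ was defined by (13), and $\hat{x}_C$ is the estimate based on the set of additional models given by

$$\hat{x}_C = \frac{1}{1 - b} \sum_{m_j \in C} \hat{x}_j\, P\{m_j \mid z, B\}. \tag{29}$$

Then $\|\hat{x}_S - \hat{x}_B\|^2 \leq \|\hat{x}_S - \hat{x}_A\|^2$ if and only if

$$2 b r \cos\theta + (1 - b)\, r^2 \leq 1 + b \tag{30}$$

and $\|\hat{x}_S - \hat{x}_B\|^2 = \|\hat{x}_S - \hat{x}_A\|^2$ if and only if the equality in (30) holds.

⁹ $b = 1$ iff $P\{m_j \mid z, B\} = 0$, $\forall m_j \in C$, which is equivalent to $\hat{x}_B = \hat{x}_A$.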
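The decomposition behind Proposition 4.1 can be checked numerically in a scalar, static setting. The sketch below is only an illustration under stated assumptions: the mode labels, weights, and estimates are invented, and the expression used for the equivalent estimate of the mismatched models is the form implied by footnote 7 and by (25), namely the extra-model sum minus $b$ times the missing-model sum, divided by $1 - b$. The example deliberately has one missing and one extra model so that $b \neq 1$.

```python
from typing import Dict


def posterior(weights: Dict[str, float]) -> Dict[str, float]:
    """Normalize likelihood-times-prior weights into mode probabilities."""
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def mm_estimate(probs: Dict[str, float], xhat: Dict[str, float]) -> float:
    """Probability-weighted sum of the mode-conditioned estimates."""
    return sum(probs[m] * xhat[m] for m in probs)


# Hypothetical weights and mode-conditioned scalar estimates (not from the paper).
weight = {"m1": 0.5, "m2": 0.3, "m3": 0.2, "m4": 0.4}
xhat = {"m1": 1.0, "m2": 2.0, "m3": 4.0, "m4": -3.0}

S = ["m1", "m2", "m3"]        # true mode set
A = ["m1", "m2", "m4"]        # model set in use: misses m3, adds m4
J = [m for m in A if m in S]  # common models

beta = posterior({m: weight[m] for m in S})
alpha = posterior({m: weight[m] for m in A})
b = alpha[J[0]] / beta[J[0]]  # the same ratio holds for every common model

x_S = mm_estimate(beta, xhat)
x_A = mm_estimate(alpha, xhat)
extra = sum(alpha[m] * xhat[m] for m in A if m not in S)
missing = sum(beta[m] * xhat[m] for m in S if m not in A)
x_L = (extra - b * missing) / (1.0 - b)  # assumed form; requires b != 1

lhs = (x_S - x_A) ** 2
rhs = (1.0 - b) ** 2 * (x_S - x_L) ** 2
```

The ratio `b` is the same for every common model because only the normalization constant differs between the two sets, which is the point of the Remark following the proposition.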

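A small sketch can make the criterion of Theorem 4.1 concrete. It evaluates the bracketed factor on the right-hand side of (32) and compares the outcome with a direct distance computation in the plane, with the optimal estimate at the origin and the estimate based on $A$ at $(1, 0)$. The function names, the helper `polar`, and the test points are assumptions chosen for illustration, and points lying exactly on the boundary circle are avoided to keep floating-point comparisons unambiguous.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def criterion(b: float, r: float, cos_theta: float) -> bool:
    """True when the bracket on the right-hand side of (32) is not positive."""
    return 2.0 * b * r * cos_theta + (1.0 - b) * r * r - (1.0 + b) <= 0.0


def direct_check(b: float, x_c: Point) -> bool:
    """Compare distances directly with x_S at the origin and x_A at (1, 0)."""
    x_a = (1.0, 0.0)
    # From (31) with x_S = 0: x_B = b * x_A + (1 - b) * x_C.
    x_b = (b * x_a[0] + (1.0 - b) * x_c[0], b * x_a[1] + (1.0 - b) * x_c[1])
    return math.hypot(*x_b) <= math.hypot(*x_a)


def polar(r: float, theta_deg: float) -> Point:
    """Convert polar coordinates (radius, angle in degrees) to Cartesian."""
    t = math.radians(theta_deg)
    return (r * math.cos(t), r * math.sin(t))


b = 0.5  # one of the values of b used in Fig. 1
cases = [(0.5, 0.0), (2.5, 180.0), (2.0, 0.0), (4.0, 180.0), (1.5, 90.0)]
results = [
    (criterion(b, r, math.cos(math.radians(th))), direct_check(b, polar(r, th)))
    for r, th in cases
]
```

For $b = 0.5$ the admissible region is the disc of radius 2 centered at $(-1, 0)$, so the first, second, and fifth test points fall inside it and the other two fall outside; the two checks agree on every case.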
Proof: Similarly to (23), it follows from (26) that

$$\begin{aligned}
\hat{x}_S - \hat{x}_B &= \hat{x}_S - \sum_{m_j \in A} \hat{x}_j\, P\{m_j \mid z, B\} - \sum_{m_j \in C} \hat{x}_j\, P\{m_j \mid z, B\} \\
&= b\, \hat{x}_S + (1 - b)\, \hat{x}_S - b \sum_{m_j \in A} \hat{x}_j\, P\{m_j \mid z, A\} - (1 - b)\, \hat{x}_C \\
&= b\, (\hat{x}_S - \hat{x}_A) + (1 - b)\, (\hat{x}_S - \hat{x}_C).
\end{aligned} \tag{31}$$

Hence

$$\begin{aligned}
\|\hat{x}_S - \hat{x}_B\|^2 - \|\hat{x}_S - \hat{x}_A\|^2 &= (b^2 - 1)\, \|\hat{x}_S - \hat{x}_A\|^2 + 2b(1 - b)\, (\hat{x}_S - \hat{x}_A)'(\hat{x}_S - \hat{x}_C) + (1 - b)^2\, \|\hat{x}_S - \hat{x}_C\|^2 \\
&= [2 b r \cos\theta + (1 - b)\, r^2 - (1 + b)]\, (1 - b)\, \|\hat{x}_S - \hat{x}_A\|^2.
\end{aligned} \tag{32}$$

It is then easy to check that the right-hand side (RHS) of (32) is not positive if and only if (30) holds. Q.E.D.

It can be shown that the region described by (30) is a circle of radius $1/(1 - b)$ centered at $(b/(1 - b), 180°)$ on the plane determined by the two vectors $(\hat{x}_S - \hat{x}_A)$ and $(\hat{x}_S - \hat{x}_C)$.

Theorem 4.1 provides a guideline for model set design. Specifically, it provides a criterion for deciding when the addition or deletion of certain models is beneficial.

The geometric interpretation of Theorem 4.1 is interesting: given $\hat{x}_S$, $\hat{x}_A$, and model set $A$, consider adding a set $C$ of new models to set $A$ ($C$ and $A$ are thus disjoint). The estimate $\hat{x}_C$ can be obtained by (29). Note that $\hat{x}_C$ depends on $b$. Place $\hat{x}_S$ at the origin and $\hat{x}_A$ at $(1, 0)$, meaning that the space has a unit length of $\|\hat{x}_S - \hat{x}_A\|$. Vary $\hat{x}_C$ (i.e., vary $\cos\theta$ and $r$) such that the equality in (30) holds. If $\hat{x}_C$ is confined to the plane determined by the two vectors $(\hat{x}_S - \hat{x}_A)$ and $(\hat{x}_S - \hat{x}_C)$,¹⁰ then the circular loci of $\hat{x}_C$ are shown in Fig. 1 with values of $b$ fixed at 0.5, 0.6, and 0.7, respectively. Clearly, the circular loci become spherical surfaces without the above-mentioned confinement. For a given set $C$, Theorem 4.1 then states that if and only if $\hat{x}_C$ falls inside its corresponding circle (ball), using a model set $B = A \cup C$ (i.e., adding models in $C$ to set $A$) is superior to using $A$ alone in the sense that $\hat{x}_B$ is closer to $\hat{x}_S$ than $\hat{x}_A$ is. As such, Theorem 4.1 is somewhat similar to the unit-circle stability criterion for a discrete-time linear system. In general, for higher dimensional cases, Theorem 4.1 states that the estimate $\hat{x}_A$ can be improved by adding models in $C$ to set $A$ if and only if $\hat{x}_C$ falls inside the corresponding ball depending on the $b$ value of (26), which itself depends on the number of the models in $C$ and the quality of the estimates based on them.

¹⁰ Or for simplicity, consider that $\hat{x}_S$, $\hat{x}_A$, and $\hat{x}_C$ are two dimensional.

Fig. 1. Illustration of Theorem 4.1. If and only if the estimate based on model set $C$ falls inside the corresponding ball will the use of the set $C$ in addition to set $A$ improve the estimation accuracy.

Fig. 1 is somewhat surprising at first glance in that the improvement ball is larger if $A$ and $B$ match each other better (in the sense of a larger probability of the common modes being the true ones), meaning that it is easier to use the additional models to improve the performance. This can be explained as follows. Since $\hat{x}_S$ is at the origin and $\hat{x}_A$ is located at $(r, \theta) = (1, 0)$, $\hat{x}_B$ will be better than $\hat{x}_A$ if $\hat{x}_B$ is in the unit ball. In the case that $A$ and $B$ match each other well, $\hat{x}_B$ will be away from $\hat{x}_S$ in approximately the same direction as $\hat{x}_A$ is, and thus will leave more room in the opposite direction for $\hat{x}_C$ using additional models with a smaller probability $1 - b$ to balance the offset of $\hat{x}_A$ and thus improve the estimation accuracy.

Theorem 4.1 requires knowledge of the optimal estimate $\hat{x}_S$. It would be much more useful if this assumption could be relaxed somehow. Still, the significance of Theorem 4.1 is not as limited as it may appear. An analogy is the tracking in clutter problem for which the Kalman filter is an invaluable tool even though it "unrealistically" assumes the availability of correct measurements. In fact, Theorem 4.1 provides not only insight into the MM estimation but also a theoretical guideline for the model set adaptation required in a variable structure MM estimator. It is also very useful, for instance, in the design and performance evaluation of the MM algorithms and for a comparison between two MM algorithms. Specifically, the (generally time-varying) parameters $r$, $\cos\theta$, and $b$ can be obtained (perhaps by Monte Carlo simulation) for the particular scenario of interest. Theorem 4.1 can be applied to determine at what time one estimator is superior to the other one. It is also possible to use Theorem 4.1 to obtain a new estimator based on two estimators that use disjoint model sets.

It is thus clear that the FHT estimator described in the previous section is optimal if and only if the model set used at any time exactly matches (with probability one) the set of possible system modes at that time, which is often not the case in reality. In view of this, it is inappropriate to say that an FHT estimator (based on an arbitrary model set) is optimal, since it does not provide the performance limit for all practical MM algorithms. Using a better model set (obtained generally in real time), it is possible that a real-time implementable estimator can provide better results. In addition, a fixed structure FHT estimator is clearly not optimal if the set of possible system modes is time-varying, which is in general the case, as shown later.

V. VARIABLE STRUCTURE MULTIPLE-MODEL APPROACH

It is well known that the most powerful (MP) test is the best test for simple hypothesis testing problems in the Neyman-Pearson framework under the assumption of a fixed sample size. For problems of a sequential nature, however,

a
the sequential probability ratio test (SPRT) is superior to the Notations: Let S = {SI, SZ,. . . , S N } be the family of all
MP test in the sense of more efficient use of samples-this distinct state-dependent system mode sets. As such, the set of
is the well-known optimality of the SPRT. The major reason system modes S ( k ) at any time k is a member of S. As before,
for this is the following. Using a fixed sample size, the MP let S be the set of all possible system modes, i.e., the union
test does not utilize intermediate information in the sense that of S ( t ) , t = 0, l , . . .. Let S k (or m k )be a sequence of the
it does not make any decision until the required sample size system mode sets (or system modes) through k.
is obtained (and at that time, it has to make a decision even In the sequel, S k and mk are also used to denote the model
if sufficient information is not available), whereas the SPRT (-set) sequence that exactly matches the system mode(-set)
allows a variable sample size and makes a decision when and sequence, respectively.
only when sufficient information is obtained. In view of the state dependency of the system mode sets, it
Drawbacks exist in the fixed structure MM algorithms is meaningless to consider the following sequence:
similar to those in the MP test. The model set M , like the
sample size of the MP test, has to be determined beforehand . . . , S(t - I), Sm3( t ) ., . . with mj $! S ( t - 1) (35)
based only on the initial information about the possible system
modes. Actually, the real-time measurements carry valuable because Sm3( t )will never be a true mode set" at k if S ( t - 1)
information concerning the system mode being currently in is the true one at k - 1. In other words, a true mode sequence
effect; they also provide useful information about the mode set, will never be a member sequence of (35). Thus, we introduce
that is, which modes are "reasonable candidates" to consider the following definitions.
for being in effect. In addition, the set of possible system Dejinitions: A (finite) sequence'* of mode (model) sets
modes depends on the previous state of the system and is
thus time-varying. It is therefore reasonable to consider using
Sk e {S(O),S(l),...,S(k)} is said to be admissible if
S ( t ) is a state-dependent mode (model) set w.r.t. one or
variable structures. That is, $M$ in (5) and (6) is replaced by a variable set of models to be determined from all (off-line and on-line) information available, in particular, the measurement sequence. In this sense, the variable structure makes it possible to fuse the prior knowledge and the posterior information about the system modes.

Just as the SPRT is superior to the MP test, an MM estimator with a variable structure may be superior to existing fixed structure schemes because it uses models more efficiently.

In this section, the theoretical basis for MM estimation with a variable structure is established. Variable structure estimation is clearly a generalization of fixed structure estimation in the sense that the former includes the latter as a special case.

Variable structure MM estimation is based on the following important facts that have long been overlooked: 1) the FHT estimator is not optimal if the model set used does not exactly match the true system mode set at any time, as shown before; 2) the measurement sequence carries valuable information concerning the state and the system mode set in effect; and 3) the set of possible system modes at any time depends, in general, on the previous state of the system. This state dependency of the mode set arises from the fact that a particular system mode can, in general, jump only to certain system modes, namely those that correspond to nonzero transition probabilities.

The state-dependent system mode set at time $k+1$ with respect to (w.r.t.) the previous hybrid state $\xi(k) = (x(k), m(k))$ is defined as

$$S_{\xi(k)}(k+1) = \{m(k+1) : P\{m(k+1) \mid \xi(k)\} > 0\} \tag{33}$$

where $P\{m(k+1) \mid \xi(k)\}$ was defined by (2). The mode-dependent system mode set w.r.t. mode $m(k)$ is defined as

$$S_{m(k)}(k+1) = \{m(k+1) : P\{m(k+1) \mid m(k), x(k)\} > 0 \text{ for some } x(k)\}. \tag{34}$$

Note that $S_{\xi(k)}(k+1)$ is a subset of $S_{m(k)}(k+1)$.

more members of the previous mode (model) set $S(t-1)$ for all $1 \le t \le k$. Similarly, a mode (model) sequence $m^k \triangleq \{m(0), m(1), \ldots, m(k)\}$ is said to be admissible if, for each $1 \le t \le k$, there is an $x(t-1)$ such that $P\{m(t) \mid m(t-1), x(t-1)\} \ne 0$. Loosely speaking, a mode(-set) sequence is admissible if and only if it is a possible true mode(-set) sequence. Clearly, the admissibility defined here concerns the feasibility of the evolution of the system mode (set). More about admissibility will be given in Section VI.

11 A mode set (sequence) is said to be a true one if it includes the true mode (sequence) in it.

12 The term "sequence" instead of the more rigorous term "ordered k-tuple" is used in this paper to keep the terminology as simple as possible (but not simpler).

Optimal MM Estimator: The optimal MM estimator in the MMSE sense with a variable structure is given by

$$\hat{x}(k|k) = E\big[E[x(k) \mid S^k, z^k] \mid z^k\big] = \sum_{S^k} \hat{x}_{S^k}(k|k)\, P\{S^k \mid z^k\} \tag{36}$$

$$P(k|k) = \sum_{S^k} \Big[P_{S^k}(k|k) + [\hat{x}(k|k) - \hat{x}_{S^k}(k|k)][\hat{x}(k|k) - \hat{x}_{S^k}(k|k)]'\Big] P\{S^k \mid z^k\} \tag{37}$$

where $z^k$ is the measurement sequence through time $k$, and $\hat{x}_{S^k}(k|k)$ and $P_{S^k}(k|k)$ are, respectively, the optimal estimate and its covariance at time $k$ assuming that the true mode-set sequence is $S^k$. They can be obtained in a similar way as in (8) and (9) but with a variable model set. Specifically,

$$\hat{x}_{S^k}(k|k) \triangleq E\big[E[x(k) \mid m^k, S^k, z^k] \mid S^k, z^k\big] = \sum_{m^k \in S^k} \hat{x}_{m^k}(k|k)\, P\{m^k \mid S^k, z^k\} \tag{38}$$

$$P_{S^k}(k|k) = \sum_{m^k \in S^k} \Big[P_{m^k}(k|k) + [\hat{x}_{S^k}(k|k) - \hat{x}_{m^k}(k|k)][\hat{x}_{S^k}(k|k) - \hat{x}_{m^k}(k|k)]'\Big] P\{m^k \mid S^k, z^k\} \tag{39}$$
where $\hat{x}_{m^k}(k|k)$ is the optimal estimate at time $k$ assuming that the true mode sequence is $m^k$, and $P_{m^k}(k|k)$ is the associated covariance.

Proof: Equations (36) and (38) follow straightforwardly from the smoothing property of the expectation operation [9]. The mixture covariance matrices (37) and (39) can be proven similarly as in [9], even though the proof given there is in a somewhat different setting. Q.E.D.

Remarks: The summations in (36) and (37) are over all the admissible mode-set sequences rather than all possible mode sequences. This is a key difference between the variable structure and the fixed structure. To appreciate this better, suppose that the system mode set depends only on the modal state, that is, the system mode sequence is a Markov chain rather than a base-state-dependent Markov chain. Then the state (mode) dependency of the system mode set at any time is a (many-to-one) point-to-set mapping $G : S \to \mathcal{S}$ which assigns to each element of $S$ some member of $\mathcal{S}$. The summations in (36) and (37) are over the range (not even the codomain) of the mapping $G^k : S \times \cdots \times S \to \mathcal{S} \times \cdots \times \mathcal{S}$ (both of $k$ folds), whereas those in (8) and (9) are over the domain of $G^k$. This is conceptually analogous to the difference between the Lebesgue integral and the Riemann integral. Since $S_j$, $j = 1, 2, \ldots, N$, form a partition of $\mathcal{S}$ and each $S_j$ is a mode set, it follows immediately that the optimal variable structure estimator has a two-level hierarchical structure: multiple model-sets at the higher level for the elements of $\mathcal{S}$ (as a partition of $\mathcal{S}$), and multiple models at the lower level for the elements of each $S_j$, as is clear from (36)-(39). Also, it is not difficult to see that a partition of the domain implicitly requires the use of a fixed structure (since $S$ is a constant set), whereas a partition of the range does not (because $S_j$, $j = 1, 2, \ldots, N$, are all distinct mode sets). Clearly, the variable structure estimator reduces to the fixed structure estimator if $\mathcal{S}$ is a singleton, i.e., $\mathcal{S} = \{S\}$, which is usually not the case (see Section VI).

Further Discussions of the Optimal Estimator: The mode sequence probability $P\{m^k \mid S^k, z^k\}$ in (38) and (39) can be obtained similarly as in a fixed structure estimator. Specifically, one has the following recursion:

$$P\{m^k \mid S^k, z^k\} = \frac{1}{c}\, p[z(k) \mid m^k, z^{k-1}]\, P\{m^k \mid S^k, z^{k-1}\} = \frac{1}{c}\, p[z(k) \mid m^k, z^{k-1}]\, P\{m(k) \mid m^{k-1}, S(k), z^{k-1}\}\, P\{m^{k-1} \mid S^{k-1}, z^{k-1}\} \tag{40}$$

where $c = p[z(k) \mid S^k, z^{k-1}]$ is a normalization constant and $p[\cdot]$ denotes the pdf. In view of the fact that only admissible mode sequences are considered, the last term above follows from

$$P\{m^{k-1} \mid S^{k-1}, z^{k-1}\} = P\{m^{k-1} \mid S^k, z^{k-1}\} \tag{41}$$

since the knowledge of $S(k)$ carries only feasibility information (and nothing else) about $m(k-1)$. The first term on the RHS of (40) is the likelihood function of the system mode sequence, which can be obtained similarly as in a fixed structure estimator. The second term is the mode transition probability.

The recursion for the probability of the mode-set sequence in (36) and (37) can be obtained by Bayes' formula as

$$P\{S^k \mid z^k\} = P\{S(k), S^{k-1} \mid z(k), z^{k-1}\} = \frac{1}{c}\, p[S(k), z(k) \mid S^{k-1}, z^{k-1}]\, P\{S^{k-1} \mid z^{k-1}\} = \frac{1}{c}\, p[z(k) \mid S^k, z^{k-1}]\, P\{S(k) \mid S^{k-1}, z^{k-1}\}\, P\{S^{k-1} \mid z^{k-1}\} \tag{42}$$

where $c = p[z(k) \mid z^{k-1}]$ is a normalization constant.

The first term on the RHS of (42) reflects the information about the system mode-set sequence gained from the current measurement (i.e., the likelihood function of the mode-set sequence), which is given by

$$p[z(k) \mid S^k, z^{k-1}] = \sum_{m^k \in S^k} p[z(k) \mid m^k, z^{k-1}]\, P\{m^k \mid S^k, z^{k-1}\} \tag{43}$$

where the second term is the predicted mode (sequence) probability, given implicitly in the derivation of (40) (i.e., mode sequence probability times mode transition probability), and the first term is the likelihood function of the mode sequence.

The mode-set transition probability, i.e., the second term on the RHS of (42), can be obtained as follows. Define the mapping $G$ as the (hybrid) state dependency of the system mode set, similarly as shown before for the mode dependency. Let $\Xi_j$ be the inverse image of $S_j$ under $G$, that is, the subset consisting of the elements of the hybrid state space that are mapped by $G$ onto $S_j$, given by

$$\Xi_j = X_j \times M_j = G^{-1}(\{S_j\}) = \{\xi : G(\xi) = S_j\} = \{(x, m) \in R^{n_x} \times S : G(\xi) = S_j\} \tag{44}$$

where $X_j$ and $M_j$ are the projections of $\Xi_j$ in the base-state subspace and the system mode set $S$, respectively. Then

$$P\{S_j(k) \mid S^{k-1}, z^{k-1}\} = \int_{\xi \in \Xi_j} dP\{\xi \mid S^{k-1}, z^{k-1}\} \tag{45}$$

where $S_j(k) = \{S(k) = S_j\}$; $P\{\xi \mid S^{k-1}, z^{k-1}\}$ is a generalized cumulative distribution function because $\xi$ has both continuous and discrete components, and the integration is over the region in probability space such that $\xi \in \Xi_j$. In the case that the mode sequence is a Markov chain rather than a state-dependent Markov chain, (45) becomes

$$P\{S_j(k) \mid S^{k-1}, z^{k-1}\} = \sum_{m(k-1) \in M_j} P\{m(k-1) \mid S^{k-1}, z^{k-1}\} \tag{46}$$

or

$$P\{S_j(k) \mid S^{k-1}, z^{k-1}\} = \sum_{m(k-1) \in M_j} \sum_{m^{k-2}} P\{m(k-1), m^{k-2} \mid S^{k-1}, z^{k-1}\} \tag{47}$$

where the summand in (47) is the mode sequence probability, given by (40). Note that $M_j$ may contain more than one mode, as $M_2$ does in the following simple illustrative example.
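As a minimal sketch of (44) and (46) for the mode-dependent (Markov chain) case, the Python fragment below builds the mode-dependent mode sets for the four-mode configuration of (48)-(51), groups the modes into the inverse images $M_j$, sums mode probabilities over each $M_j$, and checks admissibility of a mode sequence. The link list is inferred from (49)-(51) rather than read off Fig. 2, and the function names and the probability values are illustrative assumptions, not part of the paper.

```python
# Undirected links of the four-mode example, inferred from (49)-(51);
# every mode is also assumed able to jump to itself.
LINKS = {(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)}
MODES = (1, 2, 3, 4)


def mode_set_from(m):
    """Mode-dependent mode set S_m: modes reachable from m in one step."""
    return frozenset(n for n in MODES
                     if n == m or (m, n) in LINKS or (n, m) in LINKS)


def inverse_images():
    """Map each distinct mode set S_j to M_j = {m : S_m = S_j}, cf. (44)."""
    images = {}
    for m in MODES:
        images.setdefault(mode_set_from(m), set()).add(m)
    return images


def mode_set_transition_probs(prev_mode_probs):
    """Eq. (46): sum P{m(k-1) | ...} over the modes m(k-1) in M_j."""
    return {s: sum(prev_mode_probs.get(m, 0.0) for m in ms)
            for s, ms in inverse_images().items()}


def is_admissible(mode_sequence):
    """True if every jump lands in the mode set of its predecessor."""
    return all(b in mode_set_from(a)
               for a, b in zip(mode_sequence, mode_sequence[1:]))


# Hypothetical mode probabilities at time k-1 (illustrative numbers only).
probs = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}
transition = mode_set_transition_probs(probs)
```

Because modes 2 and 3 share the same mode set, their probabilities are pooled into a single mode-set transition probability, which is the situation the remark about $M_2$ refers to.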

Fig. 2. An illustrative example of four modes.

Example: Fig. 2 shows a set of four system modes. A connection between two modes indicates that either mode may jump to the other one. It is also assumed that each mode may jump to itself. Then

$$S = \{1, 2, 3, 4\}, \qquad \mathcal{S} = \{S_1, S_2, S_3\} \tag{48}$$

$$S_1 = \{1, 2, 3\}, \qquad M_1 = \{1\} \tag{49}$$

$$S_2 = \{1, 2, 3, 4\}, \qquad M_2 = \{2, 3\} \tag{50}$$

$$S_3 = \{2, 3, 4\}, \qquad M_3 = \{4\}. \tag{51}$$

An example of an admissible sequence of mode sets through $k = 5$ is

$$S^5 = \{\{1, 2, 3, 4\}, \{2, 3, 4\}, \{1, 2, 3, 4\}, \{1, 2, 3\}, \{1, 2, 3\}, \{1, 2, 3, 4\}\} \tag{52}$$

since, for example, $S(1) = \{2, 3, 4\}$ is the mode set conditioned on mode 4 having been in effect at $k = 0$, and mode 4 is a member of $S(0) = \{1, 2, 3, 4\}$. The admissible mode sequences through $k = 5$ that are in $S^5$ of (52) include the following two:

$$\{4, 3, 1, 1, 2, 3\} \qquad \{4, 2, 1, 1, 3, 2\}. \tag{53}$$

Although not in $S^5$ of (52), the following mode sequences are admissible because they belong to some other admissible mode-set sequence(s):

$$\{3, 2, 4, 2, 2, 1\} \qquad \{4, 2, 3, 4, 2, 2\} \tag{54}$$

which can be verified easily as two "directed walks," defined later, of the directed graph defined by Fig. 2. However, the following two mode sequences:

$$\{3, 2, 4, 1, 2, 1\} \qquad \{1, 4, 3, 4, 2, 2\} \tag{55}$$

are not admissible because they involve jumps between modes 1 and 4, which cannot occur since there is no direct link between the two modes. Similarly, $\{\{2, 3, 4\}, \{1, 2, 3\}, \{1, 2, 3, 4\}, \ldots\}$ is not an admissible sequence of system mode sets because $\{1, 2, 3\}$ is the system mode set w.r.t. mode 1, which is, however, not in the previous mode set.

Variable Structure Versus Fixed Structure: In reality, the set of possible system modes is often either unknown or infinite, since mathematical models are only simplified descriptions of real-world problems. As is clear from Section IV, however, an FHT estimator described in Section III is optimal only if the model set used precisely matches the set of possible system modes at each time. In addition, the accuracy of the FHT estimator depends on how well the model set used matches the true mode set at each time. In the case of an unknown system mode set, therefore, no realistic estimator is optimal even if FHT algorithms are implemented. In the case of an infinite system mode set, the accuracy of an FHT algorithm and its real-time implementable versions depends on the level of quantization of the system mode set, which is limited by the computational burden. In both cases, a fixed structure MM estimator may yield poor results because the models in the predetermined set can be very different from the (possibly time-varying) true mode.

Since the measurement sequence carries valuable information concerning the mode-set sequence in effect, after receiving the measurements in real time, it is possible to make a better guess about the system mode set or to obtain a locally refined quantization of the system mode set. Consequently, having a built-in learning mechanism for the system mode set, a variable structure estimator can provide better results than those of a fixed structure estimator of comparable computational requirements (footnote 13). The advantage of such use of the measurements to determine the state trajectory (footnote 14) has been manifested by the superiority of the extended Kalman filter to the linearized Kalman filter [16], [28]. The situation here is also analogous to that of adaptive control versus nonadaptive control.

13 A recent paper [24] compares a fixed structure interacting multiple-model (IMM) algorithm with a variable structure IMM for a maneuvering target tracking problem and concludes that the variable structure IMM using five models has about the same performance as a fixed structure IMM with 13 models.

14 The system mode sequence, or the modal state trajectory, in MM estimation is an essential description of the scenario of the problem and functions similarly to the state trajectory in conventional estimation [22], [19].

With a variable structure, it is not necessary to specify the set of all models beforehand. Also, a better balance between performance and computation may be obtained.

Some Practical Aspects of the Variable Structure: Assuming $\xi$ is the hybrid state at $k$, it was shown in the previous section that $S_\xi$ is the best model set to be used at $k+1$ in MM estimation. If, however, either $\xi^1$ or $\xi^2$ is the hybrid state at $k$, the estimate at $k+1$ can be obtained by probabilistically weighting the sum of the two estimates using $S_{\xi^1}$ and $S_{\xi^2}$, respectively. In this case, one can also obtain the optimal estimate using the union (merging) of $S_{\xi^1}$ and $S_{\xi^2}$. It is thus more general to consider $S(k)$ as a union of the state-dependent system mode sets to allow the merging of the mode sets. The above variable structure estimator can be extended to this case without difficulty by simply allowing $\{S_1, S_2, \ldots, S_N\}$ to be a partition of $\mathcal{S}$ that consists of unions of the state-dependent system mode sets.

The optimal variable structure estimator is computationally infeasible. In view of this, hypothesis management techniques similar to those for the fixed structure algorithms can be used, such as pruning the "unlikely" and/or merging the "similar" mode-set sequences. Pruning of the "unlikely" mode-set sequences is both conceptually and practically simple: it can be done by checking the corresponding mode-set probabilities. Merging of the "similar" mode-set sequences is also not complex: the merged mode-set sequence is simply the sequence of
the unions of the corresponding mode sets, and it leads to no loss of information. As such, by combining merging with pruning, one may obtain a good balance between performance and computation. We now explore some special features of the variable structure estimators.

One of the major advantages of the variable structure estimator is that the set of all system modes need not be specified in advance. Starting with an initial mode set and then proceeding with the state-dependent sets of system modes, the set of all system modes is, in general, time-varying. In this process, guided by the information in the measurement sequence, most "unlikely" mode-set sequences can be dropped to keep the computation within a desirable limit. More details will be given in Section VII.

Equation (38) can be rewritten as

$$\hat{x}_{S^k}(k|k) = E\big[E[x(k) \mid m(k-1), S^k, z^k] \mid S^k, z^k\big] = \sum_{m(k-1)} E[x(k) \mid m(k-1), S^{k-2}, S(k), z^k]\, P\{m(k-1) \mid S^{k-1}, S(k), z^k\}. \tag{56}$$

If the inverse image of the state-dependent system mode set $S(k)$ is a singleton, i.e., $\{m(k-1)\} = G^{-1}(\{S(k)\})$, meaning that there is a one-to-one correspondence between $m(k-1)$ and $S(k)$, then this equation can be simplified as

$$\hat{x}_{S^k}(k|k) = E[x(k) \mid m(k-1), S^{k-2}, S(k), z^k]. \tag{57}$$

Note that this estimator has the same structure as that of the first-order GPB1 algorithm (without merging) [9] in the sense that all of the modes in the state-dependent mode set $S(k)$ jumped from the same previous mode $m(k-1)$. In the case that the inverse image of $\{S(k)\}$ contains more than one mode, (56) indicates that $\hat{x}_{S^k}(k|k)$ is a probabilistically weighted sum of the corresponding GPB1 estimates. It should be emphasized that such use of the GPB1 algorithm is still optimal, even though the GPB1 algorithm itself is suboptimal, since the approximate step of merging in the GPB1 is not used here.

In the case that $S(k)$ is the union of several state-dependent mode sets, (38) can be rewritten as

$$\hat{x}_{S^k}(k|k) = E\Big[E\big[E[x(k) \mid m(k-1), m(k), S^{k-2}, z^k] \mid m(k), S^{k-1}, z^k\big] \mid S^k, z^k\Big] = \sum_{m(k)} \sum_{m(k-1)} E[x(k) \mid m(k-1), m(k), S^{k-2}, z^k]\, P\{m(k-1) \mid m(k), S^{k-1}, z^k\}\, P\{m(k) \mid S^k, z^k\}. \tag{58}$$

Note that the inner summation is an MM estimate with a memory depth of one step. Such an estimate can be obtained precisely with a GPB2 algorithm (without merging). Of course, the interacting multiple-model (IMM) estimator [12] with $P\{m(k-1) \mid m(k), S^{k-1}, z^k\}$ functioning as the mixing probability can also be used here to save computation. That is, the inner expectation $E[x(k) \mid m(k), S^{k-1}, z^k]$ can be obtained from an IMM algorithm.

Fig. 3. A mode transition graph (the rows of the figure are labeled k = 0, k = 1, and k = 2).

Initialization of new filters in the variable structure should be done based on the state dependency of the system mode set. In general, the initial estimate for the new filter matched to mode $m_j$ at time $k$ can be obtained, similarly to the interaction step in the IMM algorithm, as

$$\hat{x}_j^0(k-1|k-1) = E[x(k-1) \mid m_j(k), S^{k-1}, z^{k-1}] = E\big[E[x(k-1) \mid m(k-1), m_j(k), S^{k-2}, z^{k-1}] \mid m_j(k), S^{k-1}, z^{k-1}\big] = \sum_{m_i \in E_j} E[x(k-1) \mid m_i(k-1), m_j(k), S^{k-2}, z^{k-1}]\, P\{m_i(k-1) \mid m_j(k), S^{k-1}, z^{k-1}\} \tag{59}$$

where $E_j = S(k-1) \cap T_j$ is the intersection of the mode set at $k-1$ and the adjacent set to $m_j$ (see Section VI), that is, $E_j$ is the set of modes that were in effect at $k-1$ and are able to jump to $m_j$; and $E[x(k-1) \mid m_i(k-1), m_j(k), S^{k-2}, z^{k-1}]$ can be obtained by a conventional estimation technique. In other words, if mode $m_j$ can only be reached from mode $m_i$, then $\hat{x}_i(k-1|k-1)$ and $P_i(k-1|k-1)$ should clearly be used as the initial estimate and covariance for the new filter based on $m_j$ at time $k$. If $m_j$ can be reached from several modes in effect at $k-1$, then the probabilistically weighted sum of the estimates at $k-1$ from the filters matched to these modes should be used as the initial estimate for the new filter matched to $m_j$ at $k$. This also demonstrates the importance of the concept of the state (mode) dependency.

Clearly, in the case that $M_j$, $j = 1, 2, \ldots, N$, are all singletons, there is a one-to-one correspondence between the set of admissible mode-set sequences and that of admissible mode sequences.

With the concept of admissibility, it is not difficult to observe that the summations in (38) and (39) and in (8) and (9) over all mode sequences are the same as those over only the admissible mode sequences. Consequently, the number of hypotheses to be processed at time $k$ can be reduced to the number of admissible mode sequences (see Section VI).

Example (Variable Structure Versus Fixed Structure): An illustration of the difference between the variable structure and fixed structure estimators via a simple example is now in order. Consider a two-step seven-mode example. Assume that the inverse images of system mode sets are all singletons, as shown in Fig. 3. An arrow from one mode to another indicates a possible mode transition. Suppose it is known that, at time $k = 1$, the system is running in either mode 1 or mode 2 with equal probability.
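The initialization rule of (59) can be sketched in a few lines of Python. This is a hedged illustration, not the paper's implementation: it assumes state-independent mode transitions, so that the mixing weight of mode $m_i$ for the new mode $m_j$ is proportional to the transition probability times the posterior probability of $m_i$ at $k-1$, as in the standard IMM interaction step. The function name, the dictionary layout, and all numbers are hypothetical.

```python
def mixed_initial_condition(j, prev_modes, trans, probs, means, covs):
    """Initial estimate and covariance for a filter newly matched to mode j.

    prev_modes: modes in S(k-1); trans[i][j]: transition probability i -> j;
    probs[i], means[i], covs[i]: probability, estimate (list) and covariance
    (list of lists) of the filter matched to mode i at time k-1.
    """
    # E_j = S(k-1) intersected with T_j: modes in effect at k-1 that can jump to j.
    e_j = [i for i in prev_modes if trans.get(i, {}).get(j, 0.0) > 0.0]
    raw = [trans[i][j] * probs[i] for i in e_j]
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("mode %r cannot be reached from S(k-1)" % (j,))
    weights = [r / total for r in raw]  # assumed IMM-style mixing probabilities
    n = len(means[e_j[0]])
    mean = [sum(w * means[i][a] for w, i in zip(weights, e_j))
            for a in range(n)]
    # Mixture covariance: weighted covariances plus spread-of-the-means term.
    cov = [[sum(w * (covs[i][a][b]
                     + (means[i][a] - mean[a]) * (means[i][b] - mean[b]))
                for w, i in zip(weights, e_j))
            for b in range(n)]
           for a in range(n)]
    return e_j, weights, mean, cov


# Hypothetical two-mode set at k-1; mode 5 is reachable from both modes.
trans = {1: {3: 0.4, 4: 0.3, 5: 0.3}, 2: {5: 0.2, 6: 0.4, 7: 0.4}}
probs = {1: 0.5, 2: 0.5}
means = {1: [0.0, 1.0], 2: [2.0, 1.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
covs = {1: identity, 2: identity}

single = mixed_initial_condition(3, [1, 2], trans, probs, means, covs)
mixed = mixed_initial_condition(5, [1, 2], trans, probs, means, covs)
```

When the new mode has a single possible predecessor, the weights collapse to one and the predecessor's estimate and covariance are passed through unchanged, which matches the "in other words" remark following (59).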

With the variable structure estimator, the estimate at $k = 2$ will be the probabilistically weighted sum of the estimates from filters matched to modes 3-7, respectively. As explained before, at $k = 2$, a GPB1 algorithm can be applied to modes 3-5 with $\hat{x}_1(1|1)$ as the initial estimate, where $\hat{x}_1(1|1)$ denotes the estimate of $x(1)$ based on $m_1$ at $k = 1$, and likewise to modes 6 and 7 (with $\hat{x}_2(1|1)$ as the initial estimate). Note that such an application of the GPB1 does not lead to a loss of optimality of the estimator.

In a fixed structure GPB1 estimator, however, the mode-matched filters at $k = 2$ use the weighted sum of $\hat{x}_1(1|1)$ and $\hat{x}_2(1|1)$ as the initial estimate for that cycle (footnote 15). This merging makes the overall estimate at $k = 2$ nonoptimal. It should also be noted that this estimator uses seven filters at $k = 2$ as opposed to five filters (or a group of three filters plus a group of two filters) with the variable structure estimator.

15 If the designer correctly chooses the initial mode probability vector to be $[0.5, 0.5, 0, 0, 0, 0, 0]'$ instead of, say, $[1/7, 1/7, 1/7, 1/7, 1/7, 1/7, 1/7]'$.

If, in the above example, $m_2$ may jump to $m_5$, then the initial estimate for the filter based on $m_5$ in the variable structure estimator should be the probabilistically weighted sum of $\hat{x}_1(1|1)$ and $\hat{x}_2(1|1)$.

A Suboptimal Variable Structure MM Estimator: A simple yet good suboptimal variable structure approach is the one based on the best mode-set sequence (BMSS) (possibly with a proper merging of mode-set sequences), which can be viewed as an application of the well-known Viterbi technique to the management of the mode-set sequences.

Let $M(k)$ and $M^k$ be the model set and model-set sequence, respectively, at time $k$ in a BMSS algorithm. A large class of recursive algorithms that use a variable model set consists of the following five key steps at each time $k$.

RAMS Approach (Recursive Adaptive Model-Set Approach):
1) Model set adaptation: determine the model set $M(k)$ based on $\{M^{k-1}, z^k\}$.
2) Initialization of model-based filters: obtain the "initial" conditions for each filter based on a model in $M(k)$.
3) Mode-matched estimation: for each model in $M(k)$, obtain estimates under the assumption that this model matches the system mode in effect exactly.
4) Mode sequence probability calculation: compute $P\{M^{k-1}, m(k) \mid z^k\}$ for all $m(k)$ in $M(k)$.
5) Estimate fusion: obtain the overall estimate and its associated covariance.

Step 1) is unique to variable structure algorithms. Its theoretical basis is the merging/pruning criterion and (42)-(47). More details will be given in Section VII. Step 2) has been discussed. Steps 3)-5) are similar to those in a fixed mode-set algorithm.

VI. GRAPH-THEORETIC FORMULATION OF MULTIPLE-MODEL ALGORITHMS

Definition 6.1: A digraph (or directed graph) $D$ is an ordered pair of disjoint sets $(V, E)$ such that $E$ is a subset of the set of ordered pairs of elements of $V$ (footnote 16). The elements of $V$ and $E$ are called the vertices and the (directed) edges of $D$, respectively. A stochastic digraph is a digraph in which each edge is assigned a probabilistic weight such that the weights of the edges leaving each vertex sum to unity. Associated with a stochastic digraph, there is an adjacency matrix $A$ whose $(i, j)$th element $a_{ij}$ is the weight of the edge from vertex $v_i$ to vertex $v_j$. The adjacent sets of vertices from vertex $v_i$ and to vertex $v_j$ are, respectively, defined in terms of $a_{ij}$ as

$$F_i = \{v_j : a_{ij} \ne 0\}, \qquad T_j = \{v_i : a_{ij} \ne 0\}. \tag{60}$$

16 Figs. 2-6 are all examples of digraphs.

It is assumed in this paper that every vertex of the digraph under consideration has a self-loop (i.e., is adjacent to itself).

Theorem 6.1: The set of models (or modes) used in the MM algorithms, together with the model switching law of (2), can be represented by a stochastic digraph without parallel edges (i.e., edges that are incident with the same ordered pair of models).

Proof: Represent a model by a vertex with the same index. Add an edge $e(v_i, v_j)$ from vertex $v_i$ to vertex $v_j$ if, according to (2), mode $m_i$ may jump to mode $m_j$. Assign the transition probability from $m_i$ to $m_j$ to be the weight of $e(v_i, v_j)$. Obviously, the digraph constructed in this process is a stochastic digraph without parallel edges. Q.E.D.

The terms "mode," "model," and "vertex" will be used interchangeably.

Definition 6.2: A directed walk of $D$ is a sequence of vertices $v_0, v_1, \ldots, v_k$ such that there is an edge in $D$ from $v_{i-1}$ to $v_i$ for $i = 1, 2, \ldots, k$. A digraph is said to be strongly connected if there exists a directed walk between any two vertices.

With the help of graph theory, four essential ingredients of a multiple-model algorithm can be identified: a set of single-model based algorithms (e.g., Kalman filters), each matched to a particular mode; a fusion rule that provides the overall results based on these individual mode-matched (or mode-sequence-matched) algorithms; an initialization rule to obtain the initial conditions for each recursive filter at each time; and an evolution mechanism of the underlying digraph at each time that defines the graph-theoretic relationship between modes. We shall call this underlying digraph of an MM algorithm its (supporting) digraph. It should be emphasized that there may be many different supporting digraphs associated with a mode set, and thus it is better to use digraphs rather than mode sets to describe the MM algorithms.

This graph-theoretic formulation opens up new avenues for the study and application of MM estimation. We provide below a sampling of useful results without proof.
• A Markov chain is ergodic if and only if its associated digraph is a strongly connected stochastic digraph.
• The state-dependent mode set w.r.t. $m_j$ is the adjacent set of modes from $m_j$.
• The set $S$ of the system modes is not state dependent if and only if its associated digraph is complete symmetric (i.e., every mode can be reached directly from any other one). It is thus clear that the state-dependent system mode sets are usually not the same as their union $S$.
• A mode sequence consisting of the members of $S$ is admissible if and only if it corresponds to a directed walk of the digraph associated with $S$.
• The number of admissible mode sequences of $S$ at time $k$ is given by

$$N_S(k) = \sum_{i,j} a_{ij}^{(k)} \tag{61}$$

where $a_{ij}^{(k)}$ is the $(i, j)$th entry of $A^k$, the $k$th power of the adjacency matrix $A$ of the digraph associated with $S$ (with $A$ taken here as the 0-1 adjacency matrix, i.e., each nonzero weight replaced by one). This follows from a theorem in graph theory [32] which states that the number of directed walks of length $k$ from $v_i$ to $v_j$ is equal to $a_{ij}^{(k)}$.

Certain properties of MM algorithms are related to their supporting digraphs, and therefore MM algorithms can be classified into several categories in terms of their supporting digraphs.

Definition 6.3: A fixed digraph (or fixed structure) MM algorithm is one whose supporting digraphs must be, respectively, identical (or isomorphic; see footnote 17) at all times; otherwise, it is said to be a variable digraph (or variable structure) algorithm. A fixed model-set MM algorithm is one in which all supporting digraphs at different times must have the same set of models. An MM algorithm is said to be switchable if its supporting digraphs need not consist of only isolated vertices. If all of the supporting digraphs are strongly connected, then the algorithm is said to be strongly switchable.

17 Two digraphs are isomorphic if there is a one-to-one correspondence between their vertex sets that preserves adjacency.

Remarks: 1) A fixed digraph algorithm must have a fixed structure, but a fixed structure algorithm may have, at different times, supporting digraphs with different assignments of nonzero weights. In other words, a fixed structure algorithm allows adaptive (or time-varying) mode transition probabilities, that is, a nonhomogeneous Markov chain model for the system mode sequence. 2) A fixed structure algorithm has to use a fixed set of models, whereas a fixed model-set algorithm need not have a fixed structure since both the zero and nonzero weights can be reassigned for its supporting digraph at different times. 3) Almost all effective practical algorithms are strongly switchable. In fact, the supporting digraphs of practical MM algorithms are usually symmetric (or bidirectional) in the sense that $e(v_i, v_j)$ exists if and only if $e(v_j, v_i)$ exists, with only a few exceptions [11].

This graph-theoretic formulation of the MM algorithms provides a rigorous framework, which not only makes available many well-developed techniques and results of graph theory, but also provides a systematic way of handling the real-time mode-set evolution (adaptation) required for the variable structure MM algorithms.

VII. ADAPTATION OF MODEL SET (SUPPORTING DIGRAPH)

Definitions: Given $V' \subset V(D)$, where $V(D)$ denotes the vertex set of $D$, then $D' = (V', E')$ is said to be the subdigraph of $D$ induced by $V'$, denoted by $D[V']$, if $E'$ contains all edges of $D$ with both end vertices in $V'$. Normalization of a weighted digraph is the procedure of scaling all weights in the digraph to obtain a stochastic digraph. In the sequel, let $D$ always be the total digraph obtained by normalizing the union of the supporting digraphs at all times of the MM algorithm under consideration.

A. Active Digraph (AD) Algorithm

One way of obtaining the variable digraph can be referred to as the active digraph (or model-set) algorithm. The basic idea is to use a subdigraph of the total digraph as the "active" digraph at each time. This was inspired by the active set method in constrained nonlinear programming [25]. One cycle of the AD algorithm is as follows.

AD Algorithm:
1) Obtain the union of the system mode sets $Y = \bigcup_{m \in D(k-1)} S_m(k)$, where $D(k-1)$ is the active digraph at time $k-1$ and $S_m(k)$ is the mode-dependent system mode set w.r.t. $m$, defined by (34).
2) Evaluate the probability of each mode in $Y$.
3) Form the active mode set $Y'$ as the subset of $Y$ consisting of no more than $K$ modes that have the largest probabilities, where $K$ depends on the maximum allowable computational burden.
4) Obtain $D(k)$ by normalizing $D[Y']$, the subdigraph of $D$ induced by $Y'$.
5) Execute steps 2)-5) of the RAMS approach of Section V using digraph $D(k)$.

Step 2) above can be done based on the following:

$$P\{m(k) \mid m(k-1), D[Y], z^k\} = \frac{1}{c}\, p[z(k) \mid m(k), m(k-1), z^{k-1}]\, P\{m(k) \mid m(k-1), D[Y], z^{k-1}\} \qquad \forall\, m(k) \in D[Y] \tag{62}$$

where $c = p[z(k) \mid m(k-1), D[Y], z^{k-1}]$ is a normalization constant; $p[z(k) \mid m(k), m(k-1), z^{k-1}]$ is the likelihood function; and $P\{m(k) \mid m(k-1), D[Y], z^{k-1}\}$ is the predicted mode probability, which can be obtained from $P\{m(k-1) \mid z^{k-1}\}$ and $D[Y]$. Note that $D[Y]$ is used for notational simplicity to denote the mode set $Y$ along with the mode transitions governed by a (state-dependent) Markov chain.

The above AD algorithm can be simplified as follows. All modes of a digraph can be classified into three categories: unlikely (or insignificant), significant, and principal. Consequently, a reasonable set of rules for mode-set (digraph) evolution is: 1) discard the unlikely modes, 2) keep the significant modes, and 3) activate the modes strongly adjacent to the principal modes. Such an adjacent set of modes w.r.t. mode $m_i$ is defined by

$$A_i = F_i \cap T_i = \{m_j \in V(D) : a_{ij} \ne 0 \text{ and } a_{ji} \ne 0\}. \tag{63}$$

Let $U(k-1)$ be a maximum set of the unlikely modes in $D(k-1)$ such that the strong connectedness of $D(k-1)$ is retained after the removal of $U$ and the incident edges. Then steps 2) and 3) of the AD algorithm can be replaced by the following two steps:
2') Identify each mode in $D(k-1)$ to be unlikely, significant, or principal.
3') Form $Y' = [V(D(k-1)) - U(k-1)] \cup A$, where $A = \bigcup_{m_i \in D(k-1)} A_i$ is the union of the strong adjacent sets of the principal modes, which could be empty.

Theorem 7.1: The supporting digraph $D(k)$ of the simplified AD algorithm at any time is that of an ergodic Markov chain if the initial digraph $D(0)$ is strongly connected.

Proof: From the results given in Section VI, it suffices to show that the strong connectedness is retained in one cycle. Since normalization and removal of $U$ do not change the strong connectedness, the theorem clearly holds in the case of no principal mode. If there is a principal mode, say mode $m_i$, this mode and any other mode in $A_i$ are mutually reachable. Hence, any pair of modes, say $(m_n, m_m)$, is mutually reachable since both $(m_n, m_i)$ and $(m_i, m_m)$ are mutually reachable. Q.E.D.

The AD algorithm and the above theorem have wide applicability since the supporting digraphs of almost all practical MM algorithms are bidirectional and thus strongly connected. If the strong connectedness is not a concern, then $U$ can be simply replaced by the set of the unlikely modes at a particular time and $A$ by the union of the adjacent sets from the principal modes.

The mode probabilities carry the most information about which mode is in effect. They are naturally a measure of the relative effectiveness of modes. As such, a mode can be considered to be unlikely, principal, or significant if its probability is, respectively, below a threshold $t_1$, above a threshold $t_2$ ($t_2 > t_1$), or in between.

B. Digraph Switching (DS) Algorithm

Another way of adapting the supporting digraph is to switch between a number of predetermined digraphs according to a certain rule. Each of these digraphs is a graph-theoretic representation of a certain group of closely related system patterns. The mode sets of these digraphs are not necessarily (in fact, usually not) disjoint since some modes might belong to more than one group. It is desirable that the predetermined group of distinct digraphs $\mathcal{D} = \{D_1, D_2, \ldots, D_L\}$ is a (strong) cover of the total digraph $D$ in the sense that
• every $D_l$ in $\mathcal{D}$ is a (strongly connected) stochastic subdigraph of $D$;
• $V(D) \subset \bigcup_{l=1}^{L} V(D_l)$, i.e., the mode set of $D$ is covered by the mode sets of $D_l$, $l = 1, 2, \ldots, L$. This may have to be relaxed if $V(D)$ is a very large set.

In the digraph switching approach, a (strong) cover is set up first. This is closely related to the so-called set covering problem, for which a solution can be obtained by solving an integer linear programming problem [13], [4]. In the context of MM algorithms, however, such a cover can be obtained in most cases from the physical meaning of the modes. The resulting family of digraphs then acts as the range of the time-varying supporting digraph of the MM algorithm: $D(k) \in \mathcal{D} = \{D_1, D_2, \ldots, D_L\}$. Each digraph in this family contains a number of modes in $D$ that are "close" to each other in, say, an information distance measure [5]. For many practical problems, the meaning of closeness is clear, as shown in the following example.

Example: An IMM estimator that uses a fixed set of 12 modes at any time is used for tracking a maneuvering target in planar motion that may have an acceleration in the range $[-40\ \mathrm{m/s^2}, 40\ \mathrm{m/s^2}]$ in each dimension [3]. The modes are characterized by the expected acceleration vector $\gamma$:

$$\begin{array}{ll}
m_1: \gamma = [0\ \ 0]' & m_2: \gamma = [0\ \ 20]' \\
m_3: \gamma = [20\ \ 0]' & m_4: \gamma = [-20\ \ 0]' \\
m_5: \gamma = [0\ \ {-20}]' & m_6: \gamma = [20\ \ 20]' \\
m_7: \gamma = [-20\ \ 20]' & m_8: \gamma = [20\ \ {-20}]' \\
m_9: \gamma = [-20\ \ {-20}]' & m_{10}: \gamma = [40\ \ 0]' \\
m_{11}: \gamma = [0\ \ 40]' & m_{12}: \gamma = [0\ \ {-40}]'.
\end{array} \tag{64}$$

Fig. 4. A supporting digraph for the model set (64).

A supporting digraph for these modes is shown in Fig. 4 with self-loops omitted, where an additional mode, $m_{13}: \gamma = [-40\ \ 0]'$, has been added so that an acceleration range of 0-40 m/s$^2$ in all directions is covered as declared in [3] (footnote 18). In fact, the omission of $m_{13}$ was not discovered until Fig. 4 was plotted, which is an example of the advantages of the graph-theoretic formulation. A strong cover $\{D_1, D_2, D_3, D_4, D_5\}$ can then be set up with the following mode sets:

$$\begin{aligned}
V(D_1) &= \{m_1, m_2, m_3, m_4, m_5\} \\
V(D_2) &= \{m_1, m_2, m_6, m_7, m_{11}\} \\
V(D_3) &= \{m_1, m_3, m_6, m_8, m_{10}\} \\
V(D_4) &= \{m_1, m_4, m_7, m_9, m_{13}\} \\
V(D_5) &= \{m_1, m_5, m_8, m_9, m_{12}\}.
\end{aligned} \tag{65}$$

18 Unless $m_{13}$ was left out intentionally to stay away from the undesirable total number of modes.

Two member digraphs, $D_1$ and $D_2$, are illustrated in Fig. 5. Switching between all pairs of digraphs, except $(D_2, D_5)$ and $(D_3, D_4)$, is allowed.

In the context of maneuvering target tracking, $D_1$ being in effect means that very likely the target is not maneuvering, whereas $D_2$, $D_3$, $D_4$, and $D_5$ represent different maneuvers.

Based on the above design of the strong cover [18], [24], it was shown [24] that a DSIMM with five models at each time has about the same performance as a fixed structure IMM using all 13 models of Fig. 4.

Two kinds of switching logics can be used here. A typical hard switching logic is to switch if the mode probability vector

is in a specified switching region, defined by a certain combination of thresholds. A simple switching logic is, assuming that $D_2$ was in effect at time $k-1$

where $t_1$ and $t_2$ are design parameters, equal to, say, 0.5 and 0.6, respectively. A more sophisticated switching logic based on (42)-(47) can also be used.

Fig. 5. Two representative members (digraph $D_1$ and digraph $D_2$) of the strong cover of Fig. 4.

Assuming that the supporting digraph sequence is a Markov chain, it is also possible to obtain (design) the transition probability matrix of this chain and to apply the fixed structure MM algorithm to it. This can be referred to as the soft switching of digraphs (just like the soft model switching in a decision-free MM algorithm). In this scheme, the fixed structure MM approach is applied in two levels: model (lower) level and digraph (higher) level.

The initialization of new filters can be done as follows based on the principle described in Section V. After switching, say, from $D_1$ to $D_2$ at time $k$, $\hat{x}_2(k-1|k-1)$ and $P_2(k-1|k-1)$ should be used as the initial conditions for filters based on $m_6$, $m_7$, and $m_{11}$. However, after switching, say, from $D_2$ to $D_4$ at time $k$, the probabilistically weighted sum of $\hat{x}_1(k-1|k-1)$ and $\hat{x}_7(k-1|k-1)$ and the corresponding covariance should be used as the initial conditions for the filter based on $m_4$; at time $k$, the filters based on $m_9$ and $m_{13}$ should have no contributions to the overall estimate because their probabilities should both be zero; and at time $k+1$, $\hat{x}_4(k|k)$ and its covariance should be used as the initial conditions for the filters based on $m_9$ and $m_{13}$.

C. Adaptive Grid (AG) Algorithm

A third means of obtaining the supporting digraphs is to make adaptive the grid of the parameters that characterize the possible modes. This algorithm follows a similar idea to that of the adaptive MMPDA filter [14] or the moving-bank MM estimators [29], [15]. In this scheme, a coarse grid is set up initially, and then the grid is adjusted recursively according to an adaptation scheme based possibly on the current estimate, mode probabilities, and measurement residual. This approach is particularly advantageous in the case where the set of possible system modes is large.

For example, a simple AG MM algorithm for the example to be considered in the next section may be

$$
D(0) = \{2, 20\}, \qquad D(k) = \{Q_1, Q_2, Q_3\} \tag{67}
$$

with

$$
Q_1 = \max\left[1, \tfrac{1}{2}\hat{Q}(k-1)\right], \qquad Q_2 = \hat{Q}(k-1), \qquad Q_3 = \min\left[30, 2\hat{Q}(k-1)\right] \tag{68}
$$

where $\hat{Q}$ is the estimated value of $Q$.

VIII. AN EXAMPLE OF NONSTATIONARY NOISE IDENTIFICATION

This section compares, via an example of nonstationary noise identification, the new adaptive digraph approach and the fixed digraph approach. The IMM algorithm is chosen as the MM algorithm on which the comparison is based, because it is one of the most cost-effective hybrid estimation schemes [7], [3], [23].

The following second-order linear kinematic system is used in our example:

$$
x(k+1) = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} x(k) + \begin{bmatrix} \tfrac{1}{2}T^2 \\ T \end{bmatrix} v(k) \tag{69}
$$

$$
z(k) = [1 \;\; 0]\, x(k) + w(k) \tag{70}
$$

with the sampling period $T = 1$, where $v(k)$ and $w(k)$ are zero-mean white with variances $Q(k)$ and $R(k)$, respectively.

Consider the problem of estimating time-varying variance $Q$ of a scalar process noise sequence. If the only information about the noise variance to be estimated is its upper and lower bounds, an MM algorithm with only two models can be adopted [20]. If the difference between the upper and lower bounds is large, however, this approach with a fixed digraph may lead to unsatisfactory results no matter how many models are used. In this case, the new adaptive digraph approach can improve the accuracy significantly. Assume that the only a priori knowledge is that $Q$ can vary suddenly in the interval $[1, 30]$. Denoting by IMMn a fixed structure IMM algorithm with $n$ models, and by DSIMM a digraph switching IMM algorithm, several designs are listed in Table I with the model switching probabilities defined in Fig. 6 (self-loops are omitted). In Table I, the numbers denote the values of $Q$.

TABLE I
MODEL SETS AND MODELS USED IN DIFFERENT DESIGNS
design | model sets used | models used | # of models in each set

The first model set in all of the four variable structure designs is the initial digraph specially designed to have a coarse coverage of the mode space.
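The grid adaptation in (67)-(68) can be illustrated with a minimal sketch. The function name `adapt_grid` and the sample estimates fed to it are hypothetical, and the factor of one-half in $Q_1$ is read from the garbled source by symmetry with the factor of 2 in $Q_3$; the bounds 1 and 30 come from the interval assumed for $Q$ in this example.

```python
def adapt_grid(q_hat, q_min=1.0, q_max=30.0):
    """Return the three-point grid {Q1, Q2, Q3} of (68).

    q_hat is the previous variance estimate Q_hat(k-1); the grid is
    centered on it and clipped to the a priori interval [q_min, q_max].
    """
    q1 = max(q_min, 0.5 * q_hat)   # lower neighbor, not below the lower bound
    q2 = q_hat                     # current estimate
    q3 = min(q_max, 2.0 * q_hat)   # upper neighbor, not above the upper bound
    return [q1, q2, q3]


# Coarse initial grid D(0) of (67).
initial_grid = [2.0, 20.0]

if __name__ == "__main__":
    # Hypothetical estimates, chosen only to exercise both clipping branches.
    for q_hat in (1.5, 8.0, 25.0):
        print(q_hat, adapt_grid(q_hat))
```

The grid narrows near either bound because the clipped end point coincides with the bound, which keeps the models inside the region where $Q$ is known to lie.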
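As a rough illustration of the test system (69)-(70), the sketch below generates measurements for a piecewise-constant process noise variance. It is not the simulation code used in the paper: the function `simulate`, the measurement noise variance `r = 1.0`, the zero initial state, and the jump schedule are all assumptions made for the example.

```python
import random


def simulate(q_schedule, r=1.0, T=1.0, seed=0):
    """Simulate (69)-(70) and return the scalar measurements.

    q_schedule: sequence of process noise variances Q(k), one per step.
    r: measurement noise variance R (assumed constant here).
    The state is [position, velocity], started at zero (an assumption).
    """
    rng = random.Random(seed)
    pos, vel = 0.0, 0.0
    measurements = []
    for q in q_schedule:
        v = rng.gauss(0.0, q ** 0.5)           # process noise v(k)
        # x(k+1) = [[1, T], [0, 1]] x(k) + [T^2/2, T]' v(k)
        pos, vel = pos + T * vel + 0.5 * T * T * v, vel + T * v
        w = rng.gauss(0.0, r ** 0.5)           # measurement noise w(k+1)
        measurements.append(pos + w)           # z = [1 0] x + w
    return measurements


if __name__ == "__main__":
    # Hypothetical schedule: Q jumps from 1 to 30, both inside [1, 30].
    schedule = [1.0] * 20 + [30.0] * 20
    z = simulate(schedule)
    print(len(z), z[:3])
```

A sudden jump of this kind is the situation in which a fixed two-model design with widely separated variances is expected to respond poorly.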

Fig. 6. Two- and three-model digraphs used in the designs of Table I.

In the results that follow the standard-deviation version was used:

$$
\hat{Q}(k) = \hat{\sigma}(k)^2 \tag{71}
$$

where

$$
\hat{\sigma}(k) = \sum_{m_i \in M(k)} \sigma_i P\{m_i(k) \mid z^k\} \tag{72}
$$

in which $\sigma$ and $Q$ denote the standard deviation and the variance, respectively. Use of this standard-deviation version was found to be superior to the normally used variance version [20]

$$
\hat{Q}(k) = \sum_{m_i \in M(k)} Q_i P\{m_i(k) \mid z^k\}. \tag{73}
$$

Fig. 7 shows the results via 100 Monte Carlo simulations from IMM2, IMM3, and DSIMM1. Three arbitrarily chosen sequences of the true process noise variances were used, as denoted by the solid lines in the figure. Note that the set of possible modes is infinite here, and thus there is no optimal MM algorithm. The following simple switching logic was used in DSIMM1:^19

$$
\begin{aligned}
D_1 &\to D_2 && \text{if } P\{m_1\} > 0.6 \\
D_1 &\to D_3 && \text{if } P\{m_2\} > 0.6 \\
D_2 &\to D_3 && \text{if } P\{m_2\} > 0.7 \\
D_3 &\to D_2 && \text{if } P\{m_1\} > 0.7.
\end{aligned}
\tag{74}
$$

^19 $D_i$ denotes the digraph associated with the $i$th model set.

The following observations can be made from Fig. 7:

• The IMM3 is not better than the IMM2. Specifically, the IMM3 is too "conservative" to any variation in $Q$. This is a general behavior when too many models are used. The IMM algorithm will become even more conservative if more models are used. This is an example where the use of more models in a fixed digraph MM algorithm does not improve the performance.
• The DSIMM1, which consists of exactly the same models as in the IMM3, provides significantly better results than those from the IMM2 and IMM3. The DSIMM1 is also computationally cheaper than the IMM3, since all of its digraphs consist of only two models and the switching logic used is simple. Better results are obtained if a more sophisticated adaptation scheme is used. The improvement of the DSIMM algorithm over the fixed structure IMM algorithm is even more significant if the difference between the upper and lower bounds is larger.
• The digraph switching algorithm is not sensitive to the choices of models. This is illustrated in Fig. 8 for DSIMM1, DSIMM2, DSIMM3, and DSIMM4, where DSIMM2 and DSIMM4 used the same logic (74) as DSIMM1. For DSIMM3, the following simple logic was used:

$$
\begin{aligned}
D_1 &\to D_2 && \text{if } P\{m_1\} > 0.6 \\
D_1 &\to D_4 && \text{if } P\{m_2\} > 0.6 \\
D_2 &\to D_3 && \text{if } P\{m_2\} > 0.7 \\
D_3 &\to D_2 && \text{if } P\{m_1\} > 0.7 \\
D_3 &\to D_4 && \text{if } P\{m_2\} > 0.7 \\
D_4 &\to D_3 && \text{if } P\{m_1\} > 0.7.
\end{aligned}
\tag{75}
$$

Note that in this example, DSIMM3, the adaptive digraph IMM using four digraphs, is not significantly better than the DSIMM using three digraphs. The major reason is that the noise is not "wild" enough: The lower and upper bounds of its standard deviation are approximately 1 and 5.5, respectively. A choice of $Q = 1, 10, 30$ corresponds roughly to $\sigma = 1$, 3.2, and 5.5. Hence, the model discernibility (separation) will not be good if more quantization levels are used because it is too difficult for the algorithm to tell which mode is in effect from a small number of measurements. A DSIMM using four digraphs indeed gives better results than those from a DSIMM using three digraphs if the difference between the lower and upper bounds is larger. Also, if the quantity to be estimated is the mean, rather than the variance, a well-designed DSIMM estimator with four digraphs is better than that with three digraphs [20].

IX. CONCLUSIONS

This paper makes contributions in the following four aspects of MM estimation: 1) the introduction of the variable structure and the construction of the optimal estimator with variable structure, 2) the presentation of practical variable structure schemes, 3) the formulation of the MM approach in terms of graph-theoretic notions, and 4) the theoretical results on the choice of model set.

Existing fixed structure MM algorithms, which use a fixed set of models, usually perform reasonably well for problems that can be handled with a small set of models. Many practical

problems, however, involve more than just a few modes. The computational burden increases considerably as the number of models increases. More importantly, the use of more models does not necessarily improve the performance; instead, as shown theoretically in this paper, it may degrade the performance because of the excessive "competition" from the "unnecessary" models.

[Plot, panel (a); horizontal axis: time.]

Fig. 8. Variance estimation by DSIMM with different designs (horizontal axis: time; curves labeled "estimated with G2," "estimated with G3," and "estimated with G4").

The MM algorithms can be formulated in a graph-theoretic framework. This opens up a new area in MM estimation. Three variable structure adaptation schemes have been proposed: active digraph, digraph switching, and AG. The active digraph algorithm is quite general and powerful. The simple and efficient digraph switching adaptation based on mode probabilities works well. This is illustrated via an example of nonstationary noise identification which demonstrates its superiority to the fixed structure MM approach. The AG scheme has its unique advantages for the case in which the set of possible system modes is very large.

The optimal variable structure MM estimator has been presented. It has a two-level structure: the multiple model-set sequences at the higher level and multiple model sequences at the lower level. The variable structure MM estimation is based on partitioning the range of the mapping defined by the state dependency of system mode set, whereas the fixed structure MM estimation is based on partitioning the domain of the mapping. Although not feasible, this optimal estimator provides a theoretical basis for the adaptation of the model set (digraph). It also suggests the use of variable structures as a direction for improving performance. This is in contrast to the existing efforts of developing better implementable fixed structure algorithms.

The theoretical results for the choice of model set provide a framework for handling this important but overlooked problem. This work is the first step towards the establishment of guidelines for the design of the model set for MM estimation. The methodology used here is also applicable to multiple-model control.

REFERENCES

[1] G. A. Ackerson and K. S. Fu, "On state estimation in switching environments," IEEE Trans. Automat. Contr., vol. AC-15, pp. 10-17, Jan. 1970.
[2] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall, 1979.
[3] A. Averbuch, S. Itzikowitz, and T. Kapon, "Radar target tracking - Viterbi versus IMM," IEEE Trans. Aerosp. Electron. Syst., vol. 27, pp. 550-563, May 1991.
[4] E. Balas and M. W. Padberg, "Set partitioning: A survey," SIAM Rev., vol. 18, pp. 710-761, Oct. 1976.
[5] Y. Baram and N. R. Sandell, Jr., "An information theoretic approach to dynamical systems modeling and identification," IEEE Trans. Automat. Contr., vol. AC-23, pp. 61-66, Feb. 1978.
[6] Y. Baram and N. R. Sandell, Jr., "Consistent estimation on finite parameter sets with application to linear systems identification," IEEE Trans. Automat. Contr., vol. AC-23, pp. 451-454, June 1978.

[7] Y. Bar-Shalom, Ed., Multitarget-Multisensor Tracking: Applications and Advances, vol. II. Norwood, MA: Artech House, 1992.
[8] Y. Bar-Shalom, K. C. Chang, and H. A. P. Blom, "Tracking a maneuvering target using input estimation versus the interacting multiple model algorithm," IEEE Trans. Aerosp. Electron. Syst., vol. 25, pp. 296-300, Apr. 1989.
[9] Y. Bar-Shalom and X. R. Li, Estimation and Tracking: Principles, Techniques, and Software. Boston, MA: Artech House, 1993.
[10] Y. Bar-Shalom and X. R. Li, Multitarget-Multisensor Tracking: Principles and Techniques. Storrs, CT: YBS, 1995.
[11] W. D. Blair, G. A. Watson, and T. R. Rice, "Tracking maneuvering targets with an interacting multiple model filter containing exponentially correlated acceleration models," in Proc. Southeastern Symp. Syst. Theory, Columbia, SC, Mar. 1991.
[12] H. A. P. Blom and Y. Bar-Shalom, "The interacting multiple model algorithm for systems with Markovian switching coefficients," IEEE Trans. Automat. Contr., vol. 33, pp. 780-783, Aug. 1988.
[13] N. Christofides, Graph Theory: An Algorithmic Approach. London: Academic, 1975.
[14] M. Gauvrit, "Bayesian adaptive filter for tracking with measurements of uncertain origin," Automatica, vol. 20, pp. 217-224, Mar. 1984.
[15] J. A. Gustafson and P. S. Maybeck, "Flexible spacestructure control via moving-bank multiple model algorithms," IEEE Trans. Aerosp. Electron. Syst., vol. 30, pp. 750-757, July 1994.
[16] A. H. Jazwinski, Stochastic Processes and Filtering Theory. New York: Academic, 1970.
[17] X. R. Li, "Hybrid estimation techniques," in Control and Dynamic Systems: Advances in Theory and Applications, vol. 76, C. T. Leondes, Ed. New York: Academic, 1996, pp. 1-76.
[18] X. R. Li, "Hybrid state estimation and performance prediction with applications to air traffic control and detection threshold optimization," Ph.D. dissertation, Univ. Connecticut, Storrs, 1992.
[19] X. R. Li and Y. Bar-Shalom, "A hybrid conditional averaging technique for performance prediction of algorithms with continuous and discrete uncertainties," in Proc. 1994 Amer. Contr. Conf., Baltimore, MD, June 1994, pp. 1530-1534.
[20] X. R. Li and Y. Bar-Shalom, "A recursive multiple model approach to noise identification," IEEE Trans. Aerosp. Electron. Syst., vol. 30, pp. 671-684, July 1994.
[21] X. R. Li and Y. Bar-Shalom, "Mode-set adaptation in multiple-model estimators for hybrid systems," in Proc. 1992 Amer. Contr. Conf., Chicago, IL, June 1992, pp. 1794-1799.
[22] X. R. Li and Y. Bar-Shalom, "Performance prediction of hybrid algorithms," in Control and Dynamic Systems: Advances in Theory and Applications, vol. 72, C. T. Leondes, Ed. New York: Academic, 1995, pp. 99-151.
[23] X. R. Li and Y. Bar-Shalom, "Performance prediction of the interacting multiple model algorithm," IEEE Trans. Aerosp. Electron. Syst., vol. 29, pp. 755-771, July 1993.
[24] H. Lin and D. P. Atherton, "An investigation of the SFIMM algorithm for tracking maneuvering targets," in Proc. 32nd IEEE Conf. Decision Contr., San Antonio, TX, Dec. 1993, pp. 930-935.
[25] D. G. Luenberger, Linear and Nonlinear Programming, 2nd ed. Reading, MA: Addison-Wesley, 1984.
[26] D. T. Magill, "Optimal adaptive estimation of sampled stochastic processes," IEEE Trans. Automat. Contr., vol. AC-10, pp. 434-439, 1965.
[27] M. Mariton, Jump Linear Control Systems in Automatic Control. New York: Marcel Dekker, 1990.
[28] P. S. Maybeck, Stochastic Models, Estimation and Controls, vols. II, III. New York: Academic, 1982.
[29] P. S. Maybeck and K. P. Hentz, "Investigation of moving-bank multiple model adaptive algorithms," AIAA J. Guidance, Contr., Dynamics, vol. 10, pp. 90-96, Jan.-Feb. 1987.
[30] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, pp. 257-286, Feb. 1989.
[31] S. N. Sheldon and P. S. Maybeck, "An optimizing design strategy for multiple model adaptive estimation and control," IEEE Trans. Automat. Contr., vol. 38, pp. 651-654, Apr. 1993.
[32] K. Thulasiraman and M. N. S. Swamy, Graph Theory: Theory and Algorithms. New York: Wiley, 1992.

Xiao-Rong Li (M'92-SM'95) received the B.S. and M.S. degrees from Zhejiang University, Hangzhou, Zhejiang, P.R.C., in 1982 and 1984, respectively, and the M.S. and Ph.D. degrees from the University of Connecticut, Storrs, in 1990 and 1992, respectively, all in electrical engineering.

Since 1994, he has been with the Department of Electrical Engineering, University of New Orleans, New Orleans, LA, where he is an Assistant Professor. During 1986-1987, he did research on power transmission at the University of Calgary, Alberta, Canada. He was an Assistant Professor in the Department of Electrical Engineering, University of Hartford, West Hartford, CT, from 1992 to 1994. He is a consultant to several companies. His current research interests include stochastic systems, statistical signal processing, and electric power.

Dr. Li has published over 15 refereed journal articles, three book chapters, and coauthored (with Y. Bar-Shalom) two books: Estimation and Tracking: Principles, Techniques, and Software (Boston, MA: Artech House, 1993) and Multitarget-Multisensor Tracking: Principles and Techniques (Storrs, CT: YBS, 1995). He has also won several outstanding paper awards. He is an Editor for Tracking and Navigation of the IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS.

Yaakov Bar-Shalom (S'63-M'66-SM'80-F'84) was born on May 11, 1941. He received the B.S. and M.S. degrees from the Technion, Israel Institute of Technology, Haifa, in 1963 and 1967, and the Ph.D. degree from Princeton University, Princeton, NJ, in 1970, all in electrical engineering.

From 1970 to 1976, he was with Systems Control, Inc., Palo Alto, CA. Currently, he is a Professor of Electrical and Systems Engineering at the University of Connecticut, Storrs. His research interests are in estimation theory and stochastic adaptive control, and he has published over 180 papers in these areas. He coauthored the monograph, Tracking and Data Association (New York: Academic, 1988); the graduate text, Estimation and Tracking: Principles, Techniques and Software (Boston: Artech House, 1993); the text, Multitarget-Multisensor Tracking: Principles and Techniques (YBS, 1995); and edited the books, Multitarget-Multisensor Tracking: Applications and Advances (Boston: Artech House, vol. I, 1990; vol. II, 1992). He has been consulting for numerous companies and originated the series of multitarget-multisensor tracking short courses offered via UCLA Extension, at government laboratories, private companies, and overseas. He has also developed the commercially available interactive software packages MULTIDAT(TM) for automatic track formation and tracking of maneuvering or splitting targets in clutter, PASSDAT(TM) for data association from multiple passive sensors, BEARDAT(TM) for target localization from bearing and frequency measurements in clutter, IMDAT(TM) for image segmentation and target centroid tracking, and FUSEDAT(TM) for fusion of possibly heterogeneous multisensor data for tracking.

Dr. Bar-Shalom served as Associate Editor of the IEEE TRANSACTIONS ON AUTOMATIC CONTROL, and from 1978 to 1981 as Associate Editor of Automatica. He was Program Chairman of the 1982 American Control Conference, General Chairman of the 1985 ACC, and Co-chairman of the 1989 IEEE International Conference on Control and Applications. During 1983-1987, he served as Chairman of the Conference Activities Board of the IEEE Control Systems Society, and during 1987-1989, he was a member of the Board of Governors of the IEEE Control Systems Society. In 1987, he received the IEEE Control Systems Society Distinguished Member Award. Currently, he is a Distinguished Lecturer of the IEEE Aerospace and Electronic Systems Society. He was elected a Fellow of the IEEE for "contributions to the theory of stochastic systems and of multitarget tracking."
