
Global Journal of Science Frontier Research: F
Mathematics and Decision Sciences
Volume 17 Issue 3 Version 1.0 Year 2017
Type: Double Blind Peer Reviewed International Research Journal
Publisher: Global Journals Inc. (USA)
Online ISSN: 2249-4626 & Print ISSN: 0975-5896

Deriving Kalman Filter - An Easy Algorithm

By Amaresh Das & Faisal Alkhateeb
Southern University at New Orleans, United States

Abstract- The Kalman filter may be easily understood by econometricians and forecasters if it is cast as a problem in Bayesian inference and if, along the way, some well-known results in multivariate statistics are employed. The aim is to motivate readers by providing an exposition of the key notions of this predictive tool and by laying out its derivation in a few easy steps. The paper does not deal with the many other ad hoc techniques used in adaptive Kalman filtering.

Keywords: Bayes's theorem, state-space forecasting.
GJSFR-F Classification: MSC 2010: 11Y16

© 2017. Amaresh Das & Faisal Alkhateeb. This is a research/review paper, distributed under the terms of the Creative Commons Attribution-Noncommercial 3.0 Unported License (http://creativecommons.org/licenses/by-nc/3.0/), permitting all non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Deriving Kalman Filter - An Easy Algorithm

Amaresh Das α & Faisal Alkhateeb σ

I. Introduction

At each iteration, the Kalman filter seeks the most likely cause of the measurement Y_t, given the prediction made by a flawed estimate of the linear dynamics.¹ What is important here is not only that we have the measurement and the prediction, but knowledge of how each is flawed.² In the Kalman case, this knowledge is given by the covariance matrices (which, in the Gaussian case, essentially fully describe the distributions of the measurement and the prediction). The main principle of forecasting is to find the model that will produce the best forecasts, not the best fit to the historical data; the model that explains the historical data best may not be the best predictive model.³ The power of the Kalman filter comes from its ability not only to perform this estimation once (a simple Bayesian task) but to combine both estimates with knowledge of their distributions into a distribution for the updated estimate, thus iteratively calculating the best estimate of the state at each iteration.⁴

Let Y_t, Y_{t-1}, ..., Y_1 denote the observed values (which may be either scalars or vectors) of a variable of interest at times t, t-1, ..., 1.

¹ The famous work by [8] was an extension of Wiener's classical work; it focused attention upon a class of linear minimum-error-variance sequential estimation algorithms.

² In the Kalman case, this knowledge is given by the covariance matrices (essentially fully describing the distributions of the measurement and the prediction in the Gaussian case). While many derivations of the Kalman filter are available, utilizing the orthogonality principle or finding iterative updates to the Best Linear Unbiased Estimator (BLUE), we derive the Kalman filter here using a Bayesian approach, where 'best' is interpreted in the Maximum A-Posteriori (MAP) sense. This forecasting algorithm [5] is a very flexible method that is particularly suitable for nonstationary time series. [7] used the method to forecast demand in the alcoholic drinks industry over a period that included record demand followed by a drought and the imposition of a new excise duty.

³ The future may not be described by the same probability distribution as the past. Perhaps neither the past nor the future is a sample from any probability distribution; the time series could be nothing more than a non-recurrent historical record. The model may also involve too many parameters: over-fitted models can account for noise or other features in the data that are unlikely to extend into the future, and the error involved in fitting a large number of parameters may be damaging to forecast accuracy even when the model is correctly specified.

⁴ It will be convenient for the reader to remember the keywords used in the text: filtering, when we estimate the current value given past and current observations; smoothing, when we estimate past values given present and past measurements; and prediction, when we estimate a probable future value given present and past measurements.

Author α: Amaresh Das, Professor, College of Business, Southern University at New Orleans & Department of Mathematics, University of New Orleans. e-mail: [email protected]
Author σ: Faisal Alkhateeb, Assistant Professor, College of Business, Southern University at New Orleans.


We assume that Y_t depends on an unobservable quantity φ_t, the state of nature, which may be either a scalar or a vector whose dimension is independent of the dimension of Y_t. The relationship between Y_t and φ_t is linear and is specified by the observation equation

Y_t = ϖ_t φ_t + υ_t    (1)

where ϖ_t is a known quantity. The observation error υ_t is assumed to be normally distributed with mean zero and a known variance V_t, denoted υ_t ~ N(0, V_t).
The essential difference between the Kalman filter and the conventional linear model representation is that in the former, the state of nature, analogous to the regression coefficients of the latter, is not assumed to be constant but may change with time. This dynamic feature is incorporated via the system equation

φ_t = Ψ_t φ_{t-1} + ζ_t    (2)

Ψ_t being a known quantity, and the system equation error ζ_t ~ N(0, W_t) with W_t known. Since there are physical systems for which the state of nature φ_t changes over time according to a relationship prescribed by engineering or scientific principles, the ability to include knowledge of the system behavior in the statistical model is an apparent source of the Kalman filter's attractiveness. Note that the relationships (1) and (2) specified through ϖ_t and Ψ_t may or may not change with time, as is also true of the variances V_t and W_t; we have subscripted them here for the sake of generality. In addition to the usual linear model assumptions regarding the error terms, we also postulate that υ_t is serially independent; the extension to the case of dependence is straightforward.
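As a concrete illustration (added here, not part of the original text), the following minimal Python sketch simulates the scalar version of the model defined by equations (1) and (2); all parameter values and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

T = 100        # number of time steps (illustrative)
varpi = 1.0    # observation coefficient, held constant over time
Psi = 0.95     # system coefficient, held constant over time
V = 0.5        # observation error variance: upsilon_t ~ N(0, V)
W = 0.1        # system error variance:      zeta_t    ~ N(0, W)

phi = np.zeros(T)   # unobservable state of nature
Y = np.zeros(T)     # observed series

phi_prev = 0.0
for t in range(T):
    phi[t] = Psi * phi_prev + rng.normal(0.0, np.sqrt(W))   # system equation (2)
    Y[t] = varpi * phi[t] + rng.normal(0.0, np.sqrt(V))     # observation equation (1)
    phi_prev = phi[t]
```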

II. Extension of the Concept


To look at how the Kalman filter model might be employed in practice, we consider a situation in the context of statistical quality control. Here the observation Y_t is a simple (approximately normal) transform of the number of defectives observed in a sample obtained at time t, while φ_{1t} and φ_{2t} represent, respectively, the true defective index of the process and the drift of the index. We have as the observation equation
Y_t = φ_{1t} + υ_t    (3)

and as the system equations

φ_{1,t} = φ_{1,t-1} + φ_{2,t-1} + ζ_{1,t}
φ_{2,t} = φ_{2,t-1} + ζ_{2,t}

In vector notation, the system of equations becomes φ_t = ψ φ_{t-1} + π_t, where

φ_t = (φ_{1t}, φ_{2t})′,   π_t = (ζ_{1t}, ζ_{2t})′,   and   ψ = | 1  1 |
                                                               | 0  1 |

does not change with time.


If we examine the differences Y_t − Y_{t-1} for this model, we observe that, under the assumption of constant variances V_t = V and W_t = W, the autocorrelation structure of the differences is identical to that of an ARIMA(0, 1, 1) process in the sense of [1]. Although such a correspondence is sometimes easily discernible, we should not, in general, consider the two approaches to be equivalent, because of the discrepancies in the philosophies and methodologies involved.
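For concreteness, the quality-control model can be written down directly; the sketch below (an illustration added here, not from the paper) builds the constant transition matrix ψ and the observation vector, and applies one step of the system equation. Variable names and noise scales are assumptions.

```python
import numpy as np

# Local linear trend of equation (3) and the system equations:
# the level phi_1 picks up its previous value plus the drift phi_2.
psi = np.array([[1.0, 1.0],
                [0.0, 1.0]])      # transition matrix, constant over time
varpi = np.array([[1.0, 0.0]])    # Y_t = phi_{1,t} + upsilon_t: only the level is observed

# One step of the system equation, phi_t = psi @ phi_{t-1} + pi_t:
rng = np.random.default_rng(1)
phi = np.array([[0.0], [0.1]])              # illustrative starting level and drift
pi_t = rng.normal(0.0, 0.1, size=(2, 1))    # pi_t = (zeta_{1t}, zeta_{2t})'
phi = psi @ phi + pi_t
```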
III. Recursive Procedure of the Filter

The Kalman filter refers to a recursive procedure for inference about the state of nature φ_t. The key notion here is that, given the data Y_t = (Y_t, ..., Y_1), the inference about φ_t can be carried out through a direct application of Bayes's theorem,

Prob(State of Nature | Data) ∝ Prob(Data | State of Nature) × Prob(State of Nature)

which can be written as

P(φ_t | Y_t) ∝ P(Y_t | φ_t, Y_{t-1}) × P(φ_t | Y_{t-1})    (4)

where P(A | B) denotes the probability of occurrence of event A given that (or conditional on) event B has occurred. Note that the expression on the left side of (4) denotes the posterior distribution of φ at time t, whereas the first and second expressions on the right side denote the likelihood and the prior distribution of φ, respectively.
The recursive procedure can best be explained if we focus attention on the time point t-1 and the observed data until then, Y_{t-1} = (Y_{t-1}, Y_{t-2}, ..., Y_1). In what follows, we use matrix manipulations that allow Y and/or φ to be vectors, without explicitly noting them as such.

At time t-1, our state of knowledge about φ_{t-1} is embodied in the following probability statement for φ_{t-1}:

(φ_{t-1} | Y_{t-1}) ~ N(φ̂_{t-1}, Σ_{t-1})    (5)

where φ̂_{t-1} and Σ_{t-1} are the expectation and variance of φ_{t-1} | Y_{t-1}. In effect, (5) represents the posterior distribution of φ_{t-1}; its evaluation will become clear in the subsequent text.

It is helpful to remark here that the recursive procedure is started off at time 0 by choosing φ̂_0 and Σ_0 to be our best guesses about the mean and the variance of φ_0, respectively.
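In code, this start-up step amounts to choosing two arrays; the values below are illustrative assumptions expressing a vague prior for the two-dimensional state of the quality-control example.

```python
import numpy as np

# Best guesses for the mean and variance of phi_0 (large Sigma_0 = weak prior).
phi_hat = np.zeros((2, 1))   # phi_hat_0
Sigma = 100.0 * np.eye(2)    # Sigma_0
```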
We now look forward to time t, but in two stages:⁵
1. prior to observing Y_t, and
2. after observing Y_t.

⁵ Kalman filters are ideal for systems that are continuously changing. They have the advantage of being light on memory (they need not keep any history other than the previous state), and they are very fast, making them well suited to real-time problems and embedded systems. For a Monte Carlo sampling method for Bayesian filters, see [3]. Sequential Bayesian filtering is the extension of Bayesian estimation to the case where the observed value changes in time; it is a method to estimate the true value of an observed variable that evolves in time. See [11].


Stage 1

Prior to observing Y_t, our best choice for φ_t is governed by the system equation (2), Ψ_t φ_{t-1} + ζ_t, and is given by Ψ_t φ̂_{t-1}. Since φ_{t-1} is described by (5), our state of knowledge about φ_t is embodied in the probability statement

(φ_t | Y_{t-1}) ~ N(Ψ_t φ̂_{t-1}, Θ_t),   where Θ_t = Ψ_t Σ_{t-1} Ψ_t′ + W_t    (6)

This is our prior distribution.

In obtaining (6), which represents our prior for φ_t in the next cycle of (4), we use the well-known result that for any constant matrix C,

X ~ N(μ, Σ)  implies  CX ~ N(Cμ, CΣC′)

where C′ denotes the transpose of C.
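A minimal sketch of Stage 1, assuming the notation above: the posterior of φ_{t-1} is pushed through the system equation using the linear-transformation result just quoted. The function name and array shapes are illustrative.

```python
import numpy as np

def predict(phi_hat, Sigma, Psi, W):
    """Stage 1, equation (6): prior mean and variance of phi_t given Y_{t-1}."""
    prior_mean = Psi @ phi_hat          # E(phi_t | Y_{t-1}) = Psi_t phi_hat_{t-1}
    Theta = Psi @ Sigma @ Psi.T + W     # Theta_t = Psi_t Sigma_{t-1} Psi_t' + W_t
    return prior_mean, Theta
```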
Stage 2

On observing Y_t, our goal is to compute the posterior of φ_t using (4). However, to do this we need to know the likelihood P(Y_t | φ_t, Y_{t-1}), the determination of which is undertaken via the following argument.

Let e_t denote the error in predicting Y_t from the vantage point of time t-1; thus

e_t = Y_t − Ŷ_t = Y_t − ϖ_t Ψ_t φ̂_{t-1}    (7)

Since ϖ_t, Ψ_t and φ̂_{t-1} are all known, observing Y_t is equivalent to observing e_t. Thus (4) can be written as

P(φ_t | Y_t, Y_{t-1}) = P(φ_t | e_t, Y_{t-1}) ∝ P(e_t | φ_t, Y_{t-1}) × P(φ_t | Y_{t-1})    (8)

with P(e_t | φ_t, Y_{t-1}) being the likelihood.⁶

Using the fact that Y_t = ϖ_t φ_t + υ_t, (7) can be written as e_t = ϖ_t (φ_t − Ψ_t φ̂_{t-1}) + υ_t, so that E(e_t | φ_t, Y_{t-1}) = ϖ_t (φ_t − Ψ_t φ̂_{t-1}). Since υ_t ~ N(0, V_t), it follows that the likelihood is described by

(e_t | φ_t, Y_{t-1}) ~ N(ϖ_t (φ_t − Ψ_t φ̂_{t-1}), V_t)    (9)

We can now use Bayes's theorem (8) to obtain

P(φ_t | Y_t, Y_{t-1}) = P(e_t | φ_t, Y_{t-1}) × P(φ_t | Y_{t-1}) / ∫ P(e_t | φ_t, Y_{t-1}) P(φ_t | Y_{t-1}) dφ_t    (10)

and this best describes our state of knowledge about φ_t at time t. Once P(φ_t | Y_t, Y_{t-1}) is computed, we can go back to (5) for the next cycle of the recursive procedure.
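In the Gaussian case, the integration in (10) can be carried out in closed form, yielding the familiar Kalman update. The text above does not spell this closed form out, so the sketch below states it under the notation of equations (6) through (9); function and variable names are illustrative.

```python
import numpy as np

def update(prior_mean, Theta, y, varpi, V):
    """Stage 2, equation (10): posterior mean and variance of phi_t given Y_t."""
    e = y - varpi @ prior_mean                 # prediction error, equation (7)
    S = varpi @ Theta @ varpi.T + V            # variance of e_t
    K = Theta @ varpi.T @ np.linalg.inv(S)     # the Kalman gain
    phi_hat = prior_mean + K @ e               # posterior mean of phi_t
    Sigma = Theta - K @ varpi @ Theta          # posterior variance of phi_t
    return phi_hat, Sigma
```

One cycle of the recursion is then predict() followed by update(), after which (5) holds again with t advanced by one.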

⁶ The opportunity exists to proclaim an inherent equivalence of least-squares estimation and Kalman filter theory; see [4]. See also [2].


Therefore, the Kalman filter can be a very effective forecasting tool, useful in a wide variety of situations. [9] developed a complete numerical procedure, called 'state-space forecasting', for predicting future values of a multivariate stationary process Y_t given past values. The procedure involves two main stages; a sketch of the corresponding computation follows the list.
a. First, fit a canonical state-space model to the given observations, using Akaike's canonical correlations technique to determine the dimension of the state vector and to provide estimates of the non-zero elements of the matrix ϖ. A multivariate AR model is also fitted to the observations (using AIC to determine the order) to provide estimates of Σ_t and the impulse response matrices.
b. Having fitted and estimated ϖ, Ψ and Σ, the forecasts are computed recursively using Kalman's algorithm.⁷ Practical applications are given in the paper by Mehra [9].
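A hedged sketch of stage (b), assuming stage (a) has already produced estimates of ϖ, Ψ, V and W (the estimation itself is not shown), and reusing the predict() and update() functions defined earlier; all names are illustrative.

```python
import numpy as np

def filter_and_forecast(Y, varpi, Psi, V, W, phi_hat0, Sigma0, horizon=3):
    """Run the Kalman recursion over the data, then forecast ahead."""
    phi_hat, Sigma = phi_hat0, Sigma0
    for y in Y:                                # one predict/update cycle per observation
        prior_mean, Theta = predict(phi_hat, Sigma, Psi, W)
        phi_hat, Sigma = update(prior_mean, Theta, np.atleast_2d(y), varpi, V)
    forecasts = []
    for _ in range(horizon):                   # iterate the system equation forward
        phi_hat, Sigma = predict(phi_hat, Sigma, Psi, W)
        forecasts.append(varpi @ phi_hat)      # point forecast of Y
    return forecasts
```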

IV. Conclusion

This note presents a mathematical theory of Kalman filtering. The filtering technique is discussed as a problem in Bayesian inference and derived in a series of elementary steps, enabling the optimality of the process to be understood. The style of the note is informal and the mathematics elementary but rigorous, making it accessible to all those with a minimal knowledge of linear algebra and systems theory. Many other topics related to Kalman filtering (for example, wavelets) are ignored, although we occasionally refer to them in the text.

⁷ In addition to the Kalman filtering algorithms, there are other time-domain algorithms available in the literature. Perhaps the most exciting ones are the so-called wavelet algorithms. Wavelets were first introduced by [6]. Wavelets are based on translation, W(x) → W(x + 1), and above all on dilation, W(x) → W(2x). The basic dilation relation is the two-scale difference equation

Φ(x) = Σ_k c_k Φ(2x − k)

We look for a solution normalized by ∫ Φ dx = 1. The first requirement on the coefficients c_k comes from multiplying by 2 and integrating:

2 ∫ Φ(x) dx = Σ_k c_k ∫ Φ(2x − k) d(2x − k)

which yields Σ_k c_k = 2. Uniqueness of Φ is then ensured by the normalization ∫ Φ dx = 1.
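As a quick numeric illustration (added here, assuming the Haar case): the box function Φ(x) = 1 on [0, 1) satisfies the two-scale equation with c_0 = c_1 = 1, so Σ c_k = 2 as required.

```python
import numpy as np

def box(x):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return ((0.0 <= x) & (x < 1.0)).astype(float)

x = np.linspace(-0.5, 1.5, 2001)
lhs = box(x)
rhs = box(2 * x) + box(2 * x - 1)   # c_0 = c_1 = 1, so the c_k sum to 2
assert np.allclose(lhs, rhs)        # Phi(x) = Phi(2x) + Phi(2x - 1) holds pointwise
```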


References Références Referencias

1. Box, G. E. P. and Jenkins, G. M. (1970) Time Series Analysis, Forecasting and Control, San Francisco: Holden-Day.
2. Chui, C. K. and Chen, G. (1990) Kalman Filtering with Real-Time Applications, Second Edition, Springer-Verlag.
3. Doucet, A., Godsill, S. and Andrieu, C. (2000) "On Sequential Monte Carlo Sampling Methods for Bayesian Filtering", Statistics and Computing, 10, 197-208.
4. Duncan, D. B. and Horn, S. D. (1972) "Linear Dynamic Recursive Estimation from the Viewpoint of Regression Analysis", Journal of the American Statistical Association, 67, 815-821.
5. Harrison, P. J. and Stevens, C. F. (1971) "A Bayesian Approach to Short-Term Forecasting", Operational Research Quarterly, 22, 341-362.
6. Grossmann, A. and Morlet, J. (1984) "Decomposition of Hardy Functions into Square Integrable Wavelets of Constant Shape", SIAM Journal on Mathematical Analysis, 15, 723-736.
7. Johnston, F. R. and Harrison, P. J. (1980) "An Application of Forecasting in the Alcoholic Drinks Industry", Journal of the Operational Research Society, 31, 699-709.
8. Kalman, R. E. and Bucy, R. S. (1961) "New Results in Linear Filtering and Prediction Theory", Transactions of the ASME, Journal of Basic Engineering, 83, 95-108.
9. Mehra, R. K. (1979) "Kalman Filters and their Applications in Forecasting", TIMS Studies in Management Sciences, ed. M. K. Starr, Amsterdam: North-Holland, 37, 207-213.
10. Meinhold, R. J. and Singpurwalla, N. D. (1983) "Understanding the Kalman Filter", The American Statistician, 37(2), 123-127.
11. Särkkä, S. (2013) Bayesian Filtering and Smoothing, Cambridge University Press.