SE 513: System Identification

Topic 07
ARX-ARMAX-Other Models

Md Shafiullah, Ph.D.
Lecture Outline
 Random Processes
 Shift Operators
 AutoRegressive with eXtra input (ARX) model
 AutoRegressive Moving Average with eXtra input
(ARMAX) model
 Other models

2 Md Shafiullah, Ph.D.
Random Processes
 A stochastic process is a mathematical description of random events that
occur one after another. It is possible to order these events according to the
time at which they occur.
 A stochastic process, also known as a random process, is a collection of
random variables indexed by time. A random process is conceptually an
extension of a random variable.

Examples: daily weather data, exchange rate changes, systems information, and medical information such as ECG, EEG, pressure, and temperature.
3 Md Shafiullah, Ph.D.
Random Processes
 In mathematics, the moments of a function are certain quantitative measures related
to the shape of the function's graph (e.g., random process).
 If the function represents mass density, then the zeroth moment is the total mass,
the first moment (normalized by total mass) is the center of mass, and the second
moment is the moment of inertia.
 If the function is a probability distribution or a random process, then the first
moment is the expected value (mean), the second central moment is the variance
(positive square root of the variance is the standard deviation), the third
standardized moment is the skewness, and the fourth standardized moment is the
kurtosis.
 Higher order and mixed moments are also evaluated for systems with higher
degrees of freedom.

In Physics, it is different. For instance, the moment of force is the torque!

4 Link Md Shafiullah, Ph.D.


Random Processes

5 Link Md Shafiullah, Ph.D.


Random Processes

6 Link Md Shafiullah, Ph.D.


Random Processes
 According to the characteristics of t and the random variable X(t) at time t, the
random processes are classified as:
 Continuous random process: X(t) and t both are continuous. For instance, X(t) represents
the minimum temperature at a place in the interval (0, t).
 Continuous random sequence: X(t) is continuous and t is discrete. For instance, Xn
represents the temperature at the end of nth hour of a day in the interval (1, 24).
 Discrete random process: X(t) is discrete and t is continuous. For instance, X(t) represents
the number of telephone calls received in the interval (0, t).
 Discrete random sequence: X(t) and t both are discrete. For instance, Xn represents the
outcome of the nth toss of a fair die, then, {Xn: n≥1} is a discrete random sequence.
 They can also be classified as:
 Deterministic random process: if all future values can be predicted from the past
observations, X(t)=Acos(wt+ θ) which can be described in terms of A, w, and θ.
 Non-deterministic random process: if the future values cannot be predicted from any
function and cannot be described in terms of a finite number of parameters.

7 Link Md Shafiullah, Ph.D.


Random Processes
 Consider a random process $X(t) = \{X_t : t \in J\}$, where $J \subseteq \mathbb{R}$.

 The first moment, the expected value (mean), is defined as:
$m(t) = E[X(t)] = \int_{-\infty}^{\infty} x\, f(x,t)\, dx$

 The second central moment, the variance, is defined as:
$v(t) = \mathrm{Var}[X(t)] = V[X(t)] = E[X^2(t)] - \{E[X(t)]\}^2$

 The auto-correlation function is defined as:
$R(t_1, t_2) = E[X(t_1)X(t_2)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, f(x_1, x_2; t_1, t_2)\, dx_1\, dx_2$

8 Link Md Shafiullah, Ph.D.


Random Processes
 The auto-correlation function is defined as:
$R(t_1, t_2) = E[X(t_1)X(t_2)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, f(x_1, x_2; t_1, t_2)\, dx_1\, dx_2$

or, with the time difference $\tau = t_2 - t_1$:
$R_{XX}(t, t+\tau) = E[X(t)X(t+\tau)]$

 The auto-covariance function is defined as:
$C_{XX}(t_1, t_2) = E[\{X(t_1) - m(t_1)\}\{X(t_2) - m(t_2)\}]$

$C_{XX}(t_1, t_2) = R_{XX}(t_1, t_2) - E[X(t_1)]E[X(t_2)] = R_{XX}(t_1, t_2) - m(t_1)m(t_2)$

9 Link Md Shafiullah, Ph.D.


Random Processes
 The auto-covariance function is defined as:
$C_{XX}(t_1, t_2) = R_{XX}(t_1, t_2) - E[X(t_1)]E[X(t_2)]$

If $t_1 = t_2 = t$:
$C_{XX}(t, t) = E[X^2(t)] - \{E[X(t)]\}^2 = \mathrm{Var}[X(t)]$

 The correlation coefficient is defined as:
$\rho_{XX}(t_1, t_2) = \dfrac{C_{XX}(t_1, t_2)}{\sqrt{C_{XX}(t_1, t_1)\, C_{XX}(t_2, t_2)}}$

Note: $\rho_{XX}(t, t) = 1$

10 Link Md Shafiullah, Ph.D.


Random Processes
 The cross-covariance of two random processes is defined as:
$C_{XY}(t_1, t_2) = R_{XY}(t_1, t_2) - E[X(t_1)]E[Y(t_2)]$

If $t_2 - t_1 = \tau$: $C_{XY}(t, t+\tau) = R_{XY}(t, t+\tau) - E[X(t)]\,E[Y(t+\tau)]$

 Their cross-correlation is defined as:
$R_{XY}(t_1, t_2) = E[X(t_1)Y(t_2)]$

$R_{XY}(t, t+\tau) = E[X(t)Y(t+\tau)]$

 The cross-correlation coefficient is defined as:
$\rho_{XY}(t_1, t_2) = \dfrac{C_{XY}(t_1, t_2)}{\sqrt{C_{XX}(t_1, t_1)\, C_{YY}(t_2, t_2)}}$

11 Link Md Shafiullah, Ph.D.


Random Processes
 Strict sense stationary (SSS) process
 Stationarity refers to time invariance of some, or all, of the statistics of a random
process, such as mean, autocorrelation, nth-order distribution
 We define two types of stationarity: strict sense (SSS) and wide sense (WSS).
 A random process X(t) (or Xn) is said to be SSS if all its finite order distributions
are time invariant.
 So for an SSS process, the first-order distribution is independent of time, and the
second-order distribution—the distribution of any two samples X(t1) and X(t2)
—depends only on τ = t2 −t1

12 Link1, Link2 Md Shafiullah, Ph.D.


Random Processes
 Strict sense stationary (SSS) process
 If a random process is stationary to all orders, then it is said to be a strict sense stationary process.
 The mean and variance of a first-order stationary process are constants:

$E[X(t)] = \text{constant}, \qquad V[X(t)] = \text{constant}$

 A second-order stationary process is also a first-order stationary process, but the reverse is not always true.

13 Link Md Shafiullah, Ph.D.


Random Processes
 Strict sense stationary (SSS) process
 Evaluate the mean of the random process $X(t) = A\sin(\omega t + \varphi)$, where A and ω are constants and φ is a random variable uniformly distributed in (0, 2π).
 The mean of the process:
$E[X(t)] = \int_{-\infty}^{\infty} X(t)\, f(\varphi)\, d\varphi = \int_{0}^{2\pi} A\sin(\omega t + \varphi)\cdot\frac{1}{2\pi}\, d\varphi = \frac{A}{2\pi}\int_{0}^{2\pi} \sin(\omega t + \varphi)\, d\varphi$

$E[X(t)] = 0 = \text{constant}$

From the definition of the uniform distribution: $f(\varphi) = \frac{1}{2\pi - 0} = \frac{1}{2\pi}$

14 Link Md Shafiullah, Ph.D.


Random Processes
 Strict sense stationary (SSS) process
 Find the first and second moments of $X(t) = \cos(\omega_0 t + \theta)$, where θ is a random variable uniformly distributed in (−π, π).
 The mean (first moment) of the process:
$E[X(t)] = \int_{-\infty}^{\infty} X(t)\, f(\theta)\, d\theta = \int_{-\pi}^{\pi} \cos(\omega_0 t + \theta)\cdot\frac{1}{2\pi}\, d\theta = 0 = \text{constant}$

From the definition of the uniform distribution: $f(\theta) = \frac{1}{\pi - (-\pi)} = \frac{1}{2\pi}$
 The second (central) moment of the process:
$v(t) = \mathrm{Var}[X(t)] = V[X(t)] = E[X^2(t)] - \{E[X(t)]\}^2$

15 Link Md Shafiullah, Ph.D.


Random Processes
 Strict sense stationary (SSS) process
 Find the first and second moments of $X(t) = \cos(\omega_0 t + \theta)$, where θ is a random variable uniformly distributed in (−π, π).
 The second (central) moment of the process:
$V[X(t)] = E[X^2(t)] - \{E[X(t)]\}^2 = E[X^2(t)] - 0^2 = E[\cos^2(\omega_0 t + \theta)]$

$V[X(t)] = E\left[\frac{1 + \cos\{2(\omega_0 t + \theta)\}}{2}\right] = \frac{1}{2} + \frac{1}{2}E[\cos(2\omega_0 t + 2\theta)]$

Since $E[\cos(2\omega_0 t + 2\theta)] = \frac{1}{2\pi}\int_{-\pi}^{\pi}\cos(2\omega_0 t + 2\theta)\, d\theta = 0$:

$V[X(t)] = \frac{1}{2} = \text{constant}$


16 Link Md Shafiullah, Ph.D.
Random Processes
 Strict sense stationary (SSS) process
 Calculation of the mean value for tossing a fair die:
$E[X(t)] = \sum_{n=1}^{6} n P_n = 1\cdot\tfrac{1}{6} + 2\cdot\tfrac{1}{6} + 3\cdot\tfrac{1}{6} + 4\cdot\tfrac{1}{6} + 5\cdot\tfrac{1}{6} + 6\cdot\tfrac{1}{6} = 3.5$

 If the random process $X(t)$ takes the value −1 with probability 1/3 and the value +1 with probability 2/3, find whether $X(t)$ is a stationary process or not.
Mean:
$E[X(t)] = \sum_{n} n P_n = (-1)\cdot\tfrac{1}{3} + (+1)\cdot\tfrac{2}{3} = \tfrac{1}{3} = \text{constant}$
Variance:
$V[X(t)] = E[X^2(t)] - \{E[X(t)]\}^2 = \sum_{n} n^2 P_n - \left(\tfrac{1}{3}\right)^2 = 1 - \tfrac{1}{9} = \tfrac{8}{9} = \text{constant}$

Therefore, $X(t)$ is an SSS process.
17 Link Md Shafiullah, Ph.D.


Random Processes
 Strict sense stationary (SSS) process
 Consider the random process $X(t) = \cos(t + \varphi)$, where φ is a random variable with density function $f(\varphi) = 1/\pi$, $-\pi/2 < \varphi < \pi/2$. Check whether the process is stationary or not.

 The first moment of the process:
$E[X(t)] = \int_{-\infty}^{\infty} X(t)\, f(\varphi)\, d\varphi = \int_{-\pi/2}^{\pi/2} \cos(t + \varphi)\cdot\frac{1}{\pi}\, d\varphi = \frac{2}{\pi}\cos t \neq \text{constant}$

 Therefore, $X(t)$ is not an SSS process.

18 Link Md Shafiullah, Ph.D.


Random Processes
 Strict sense stationary (SSS) process
 Check whether the random process $X(t) = A\cos(\omega_0 t + \theta)$ is stationary, where A and $\omega_0$ are constants and θ is a random variable uniformly distributed in (0, π).
 The first moment of the process:
$E[X(t)] = \int_{-\infty}^{\infty} X(t)\, f(\theta)\, d\theta = \int_{0}^{\pi} A\cos(\omega_0 t + \theta)\cdot\frac{1}{\pi}\, d\theta = -\frac{2A}{\pi}\sin(\omega_0 t) \neq \text{constant}$

 Therefore, $X(t)$ is not an SSS process.

19 Link Md Shafiullah, Ph.D.


Random Processes
 Wide-Sense Stationary (WSS) Process
 A random process $X(t)$ is said to be WSS if it satisfies the following conditions:
 $E[X(t)] = \text{constant}$
 $R(t_1, t_2) = E[X(t_1)X(t_2)]$ is a function of the time difference only: $R(t_1, t_2) = R(t_2 - t_1) = R(\tau)$, with $\tau = t_2 - t_1$
 The auto-covariance function is finite (technical condition): $C_{XX}(t_1, t_2) < \infty$

 In general, a process is stationary to order N if, for N random variables of the process considered at times $t_1, t_2, \ldots, t_N$, their Nth-order joint density function is invariant to a shift of the time origin:

$f_X(x_1, x_2, \ldots, x_N;\, t_1, t_2, \ldots, t_N) = f_X(x_1, x_2, \ldots, x_N;\, t_1+\tau, t_2+\tau, \ldots, t_N+\tau) \quad \forall\, t \text{ and } \tau$

20 Link Md Shafiullah, Ph.D.


Random Processes
 Wide-Sense Stationary (WSS) Process
 Any SSS process of order two is WSS. However, not every WSS process is SSS of order 2.
 Usually, in system identification, we work with WSS processes.
 Example: a random telegraph signal.
 WSS processes are also known as weak-sense or covariance stationary processes.

 Two random processes, $X(t)$ and $Y(t)$, are called jointly WSS if the following conditions are satisfied:
o $X(t)$ is WSS
o $Y(t)$ is WSS
o $R_{XY}(t_1, t_2) = E[X(t_1)Y(t_2)] = R_{XY}(t, t+\tau) = R_{XY}(\tau) = R_{XY}(t_2 - t_1)$

21 Link Md Shafiullah, Ph.D.


Random Processes
 Wide-Sense Stationary (WSS) Process
 A random process is described by $X(t) = A\sin(t) + B\cos(t)$, where A and B are independent random variables with zero mean and equal variances (or equal SD). Show that the process is stationary of second order.
 The mean of the process:
$E[X(t)] = E[A\sin(t) + B\cos(t)] = E[A]\sin(t) + E[B]\cos(t)$

Since $\sin(t)$ and $\cos(t)$ are deterministic signals, they factor out of the expectations, and $E[A] = E[B] = 0$, so:

$E[X(t)] = \sin(t)\cdot 0 + \cos(t)\cdot 0 = 0 = \text{constant}$

22 Link Md Shafiullah, Ph.D.


Random Processes
 Wide-Sense Stationary (WSS) Process
 A random process is described by $X(t) = A\sin(t) + B\cos(t)$, where A and B are independent random variables with zero mean and equal variances (or equal SD). Show that the process is stationary of second order.
 The second moment of the process:
$E[X^2(t)] = E[\{A\sin(t) + B\cos(t)\}^2] = E[A^2\sin^2 t + B^2\cos^2 t + 2AB\sin t\cos t]$

$E[X^2(t)] = E[A^2]\sin^2 t + E[B^2]\cos^2 t + 2E[A]E[B]\sin t\cos t$

$E[X^2(t)] = \sigma^2(\sin^2 t + \cos^2 t) + 0 = \sigma^2 = \text{constant}$

since $E[A^2] = E[B^2] = \sigma^2$ and $E[A] = E[B] = 0$.

 Therefore, $X(t)$ is stationary of order two.

23 Link Md Shafiullah, Ph.D.




Random Processes
 Wide-Sense Stationary (WSS) Process
 Prove that the random process $X(t) = A\cos(\omega t + \varphi)$ is WSS, where A and ω are constants and φ is a random variable uniformly distributed in (0, 2π).
 The mean of the process:
$E[X(t)] = \int_{-\infty}^{\infty} X(t)\, f(\varphi)\, d\varphi = \int_{0}^{2\pi} A\cos(\omega t + \varphi)\cdot\frac{1}{2\pi}\, d\varphi = \frac{A}{2\pi}\int_{0}^{2\pi}\cos(\omega t + \varphi)\, d\varphi$

$E[X(t)] = 0 = \text{constant}$

From the definition of the uniform distribution: $f(\varphi) = \frac{1}{2\pi - 0} = \frac{1}{2\pi}$

25 Link Md Shafiullah, Ph.D.


Random Processes
 Wide-Sense Stationary (WSS) Process
 Prove that the random process $X(t) = A\cos(\omega t + \varphi)$ is WSS, where A and ω are constants and φ is a random variable uniformly distributed in (0, 2π).
 The auto-correlation function:
$R(t_1, t_2) = E[X(t_1)X(t_2)] = E[A\cos(\omega t_1 + \varphi)\cdot A\cos(\omega t_2 + \varphi)] = E[A^2\cos(\omega t_1 + \varphi)\cos(\omega t_2 + \varphi)]$

$R(t_1, t_2) = \frac{A^2}{2}E\left[\cos\{\omega(t_1 + t_2) + 2\varphi\} + \cos\{\omega(t_1 - t_2)\}\right]$

$R(t_1, t_2) = \frac{A^2}{2}\cdot 0 + \frac{A^2}{2}\cos\{\omega(t_1 - t_2)\} = \text{a function of } t_1 - t_2 \text{ only}$

Hence, $X(t)$ is a WSS process.
26 Link Md Shafiullah, Ph.D.
Random Processes
 Wide-Sense Stationary (WSS) Process
 For the random process $X(t) = Y\sin(\omega t)$, where Y is a uniform random variable on the interval (−1, +1), check whether the process is WSS or not.
 The mean of the process:
$E[X(t)] = \int_{-\infty}^{\infty} X(t)\, f(y)\, dy = \int_{-1}^{1} y\sin(\omega t)\cdot\frac{1}{2}\, dy = \sin(\omega t)\int_{-1}^{1}\frac{y}{2}\, dy = 0 = \text{constant}$

From the definition of the uniform distribution: $f(y) = \frac{1}{1 - (-1)} = \frac{1}{2}$

27 Link Md Shafiullah, Ph.D.


Random Processes
 Wide-Sense Stationary (WSS) Process
 For the random process $X(t) = Y\sin(\omega t)$, where Y is a uniform random variable on the interval (−1, +1), check whether the process is WSS or not.
 The auto-correlation function:
$R(t_1, t_2) = E[X(t_1)X(t_2)] = E[Y\sin(\omega t_1)\cdot Y\sin(\omega t_2)]$

$R(t_1, t_2) = \sin(\omega t_1)\sin(\omega t_2)\int_{-1}^{1} y^2\cdot\frac{1}{2}\, dy = \frac{1}{3}\sin(\omega t_1)\sin(\omega t_2)$

From the definition of the uniform distribution: $f(y) = \frac{1}{1-(-1)} = \frac{1}{2}$
28 Link Md Shafiullah, Ph.D.
Random Processes
 Wide-Sense Stationary (WSS) Process
 For the random process $X(t) = Y\sin(\omega t)$, where Y is a uniform random variable on the interval (−1, +1), check whether the process is WSS or not.
 The auto-correlation function:
$R(t_1, t_2) = \frac{1}{3}\sin(\omega t_1)\sin(\omega t_2) = \frac{1}{6}\left[\cos\{\omega(t_1 - t_2)\} - \cos\{\omega(t_1 + t_2)\}\right]$

$R(t_1, t_2)$ is not a function of $t_1 - t_2$ alone.

Hence, $X(t)$ is not a WSS process.

29 Link Md Shafiullah, Ph.D.


Random Processes
 Other processes:
 Ergodic Processes
 Markov Process
 Poisson Process – Continuous-time Markov Chain
 Discrete Parameter Markov Process [Markov Chain]

30 Link Md Shafiullah, Ph.D.


Shift Operators
 The input-output relationship of the following system (a shift operator) in the Z-domain can be written as:
$Y(z) = z^{-1}U(z)$

[Block diagram: $U(z) \rightarrow z^{-1} \rightarrow Y(z)$]

 In the same spirit, the time-domain shift operation can be written as:
$y(k) = q^{-1}u(k)$

[Block diagram: $u(k) \rightarrow q^{-1} \rightarrow y(k)$]
31 Md Shafiullah, Ph.D.
Shift Operators
 The forward shift operator, q:
$q\, u(k) = u(k+1)$

 The backward shift operator, q⁻¹:
$q^{-1} u(k) = u(k-1)$

 Digital signal filtering becomes easier with the shift operator:
▪ 2-tap moving-average, Finite Impulse Response (FIR) filtering:
$y(k) = w_1 u(k) + w_2 u(k-1)$

32 Md Shafiullah, Ph.D.
Shift Operators
 Two-tap moving-average, Finite Impulse Response (FIR) filtering:
$y(k) = w_1 u(k) + w_2 u(k-1)$

$y(k) = \frac{1}{2}u(k) + \frac{1}{2}u(k-1), \quad \text{when } w_1 = w_2 = \frac{1}{2}$

$y(k) = \frac{1}{2}\left(1 + q^{-1}\right)u(k)$

[Block diagram: $u(k) \rightarrow \frac{1}{2}(1 + q^{-1}) \rightarrow y(k)$]
33 Md Shafiullah, Ph.D.
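The operator algebra above can be checked numerically with MATLAB's filter function. A minimal sketch, assuming an arbitrary example input u (not from the slides):

% 2-tap moving-average FIR filter: y(k) = 0.5*u(k) + 0.5*u(k-1)
u = randn(1, 10);                % example input sequence
y = filter([0.5 0.5], 1, u);     % numerator 0.5*(1 + q^-1), denominator 1
% spot check: y(3) equals 0.5*u(3) + 0.5*u(2)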
Shift Operators
 First-order autoregressive process:
$y(k) = y(k-1) + e(k)$

$(1 - q^{-1})\, y(k) = e(k)$

$y(k) = \frac{1}{1 - q^{-1}}\, e(k)$

[Block diagram: $e(k) \rightarrow \frac{1}{1 - q^{-1}} \rightarrow y(k)$]
34 Md Shafiullah, Ph.D.
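The same relation can be simulated directly with filter. A minimal sketch, assuming an arbitrary white-noise sequence e (my own example, not from the slides):

% First-order autoregressive process y(k) = y(k-1) + e(k), i.e. y = 1/(1 - q^-1) e
e = randn(1, 1000);              % white-noise input
y = filter(1, [1 -1], e);        % denominator polynomial 1 - q^-1
% y is the running sum of e; cumsum(e) gives the same sequence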
Shift Operators
 First-order Infinite Impulse Response (IIR) filtering:
$y(k) = y(k-1) + u(k) + u(k-1)$

$(1 - q^{-1})\, y(k) = (1 + q^{-1})\, u(k)$

$y(k) = \frac{1 + q^{-1}}{1 - q^{-1}}\, u(k)$

[Block diagram: $u(k) \rightarrow \frac{1 + q^{-1}}{1 - q^{-1}} \rightarrow y(k)$]

35 Md Shafiullah, Ph.D.
Shift Operators
 Using shift operators, the general input-output relation can be expressed as:
$y(k) = G(q^{-1})\, u(k) + H(q^{-1})\, e(k)$

$y(k)$: output
$u(k)$: input
$e(k)$: noise
$G(q^{-1})$: the transfer operator [the counterpart of the transfer function in the Laplace and Z-domains]

▪ $G(q^{-1})$ is parametrically the same as the transfer function after replacing z with q.
▪ The literature uses both $G(q^{-1})$ and $G(q)$.

36 Md Shafiullah, Ph.D.
Shift Operators

$y(k) = G(q^{-1})\, u(k) + H(q^{-1})\, e(k)$

[Block diagram: $u(k) \rightarrow G(q^{-1}) \rightarrow \Sigma \rightarrow y(k)$, with $e(k) \rightarrow H(q^{-1})$ feeding the summing junction $\Sigma$]

37 Md Shafiullah, Ph.D.
Shift Operators
 Most common plants can be modelled as:
$G(q^{-1}) = \frac{q^{-1}B^{*}(q^{-1})}{A(q^{-1})}$

$A(q^{-1}) = 1 + a_1 q^{-1} + a_2 q^{-2} + \cdots + a_{n_a} q^{-n_a}$
$B^{*}(q^{-1}) = b_0 + b_1 q^{-1} + b_2 q^{-2} + \cdots + b_{n_b} q^{-n_b}$

▪ Monic denominator: the zeroth coefficient of the denominator is normalized to 1.
▪ Intrinsic delay: at least one step of delay exists in the plant after zero-order-hold discretization.
▪ Model irreducibility and co-primeness of polynomials: we will assume that $A(q^{-1})$ and $B^{*}(q^{-1})$ have no common factors, i.e., the model is irreducible.
38
Md Shafiullah, Ph.D.
AutoRegressive with eXtra input (ARX) model
 ARX model:
$y(k) + a_1 y(k-1) + a_2 y(k-2) + a_3 y(k-3) + \cdots + a_{n_a} y(k-n_a) = b_1 u(k-1) + b_2 u(k-2) + b_3 u(k-3) + \cdots + b_{n_b} u(k-n_b) + e(k)$

$A(q^{-1})\, y(k) = B(q^{-1})\, u(k) + e(k)$
[Autoregressive (AR) part: $A(q^{-1})y(k)$; eXtra (X) input part: $B(q^{-1})u(k)$]

Here, $e(k) \sim (0, \lambda^2)$: zero mean and variance $\lambda^2$; it is a WSS signal.

$y(k) = \frac{B(q^{-1})}{A(q^{-1})}\, u(k) + \frac{1}{A(q^{-1})}\, e(k)$

$G(q^{-1}) = \frac{B(q^{-1})}{A(q^{-1})} \quad \text{and} \quad H(q^{-1}) = \frac{1}{A(q^{-1})}$

39 Md Shafiullah, Ph.D.
AutoRegressive with eXtra input (ARX) model
 The error, e(k), enters the difference equation directly; thus, the model is also known as the equation error model:
$A(q^{-1})\, y(k) = B(q^{-1})\, u(k) + e(k)$
[Autoregressive (AR): $A(q^{-1})y(k)$; eXtra (X) input: $B(q^{-1})u(k)$]

 Adding $y(k)$ to and subtracting $A(q^{-1})y(k)$ from both sides:
$y(k) = [1 - A(q^{-1})]\, y(k) + B(q^{-1})\, u(k) + e(k)$

$[1 - A(q^{-1})]\, y(k)$ depends on $y(k-1), y(k-2), \ldots$, but not on $y(k)$, since
$1 - A(q^{-1}) = -(a_1 q^{-1} + a_2 q^{-2} + \cdots + a_{n_a} q^{-n_a})$
40 Md Shafiullah, Ph.D.
AutoRegressive with eXtra input (ARX) model
 Therefore, a one-step-ahead predictor can be expressed as:
$\hat{y}(k \mid k-1) = [1 - A(q^{-1})]\, y(k) + B(q^{-1})\, u(k)$
[Autoregressive (AR) and eXtra (X) input parts; the error term is not predictable and is left out.]

 Writing this as a least-squares regression:
$\hat{y}(k \mid k-1) = \varphi(k)^{T}\theta$

$\varphi(k) = [-y(k-1), -y(k-2), \ldots, -y(k-n_a);\ u(k-1), u(k-2), \ldots, u(k-n_b)]$
$\theta = [a_1, a_2, a_3, \ldots, a_{n_a},\ b_1, b_2, b_3, \ldots, b_{n_b}]$

(the parameter vector is estimated with the pseudo-inverse, as shown on the next slide)

 Finally, the prediction error:
$e(k) = y(k) - \hat{y}(k \mid k-1) = y(k) - \varphi(k)^{T}\theta$
41 Md Shafiullah, Ph.D.
AutoRegressive with eXtra input (ARX) model
 Example: suppose we have 10 data points from an I/O system
Inputs: $[u(1), u(2), u(3), \ldots, u(10)]$
Outputs: $[y(1), y(2), y(3), \ldots, y(10)]$

 Estimate an ARX(1,1) model:
$\hat{y}(k \mid k-1) = -a\, y(k-1) + b\, u(k-1)$

$\hat{\theta} = (\varphi^{T}\varphi)^{-1}\varphi^{T}Y$ ➔ the pseudo-inverse gives the minimum-error (least-squares) estimate

$\varphi = \begin{bmatrix} -y(1) & u(1) \\ -y(2) & u(2) \\ \vdots & \vdots \\ -y(9) & u(9) \end{bmatrix}, \qquad Y = \begin{bmatrix} y(2) \\ y(3) \\ \vdots \\ y(10) \end{bmatrix}$
42 Md Shafiullah, Ph.D.
AutoRegressive with eXtra input (ARX) model
Experiment sample | Input, u(k) | Output, y(k)
1  | 0.5  | 0.17
2  | 1.0  | 0.23
3  | 1.3  | 0.76
4  | 1.7  | 0.91
5  | 2.1  | 1.53
6  | 1.6  | 1.14
7  | 0.50 | 0.83
8  | 0.80 | 0.95
9  | 1.4  | 1.2
10 | 2.0  | 1.50

$\hat{\theta} = (\varphi^{T}\varphi)^{-1}\varphi^{T}Y = [-0.6839,\ 0.2920]^{T}$

$\hat{y}(11 \mid 10) = -a\, y(10) + b\, u(10) = 0.6839 \times 1.5 + 0.2920 \times 2.0 = 1.61$

Note: y(11) does not depend on u(11) for this specific model, but in other model structures it can be included. A MATLAB sketch of this estimation follows.
43 Md Shafiullah, Ph.D.
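The least-squares estimate above can be reproduced in a few lines of MATLAB. A minimal sketch using the table data (variable names are my own; the printed values should match the slide's estimate up to rounding):

% ARX(1,1) least-squares estimation from the 10 recorded samples
u = [0.5 1.0 1.3 1.7 2.1 1.6 0.50 0.80 1.4 2.0];
y = [0.17 0.23 0.76 0.91 1.53 1.14 0.83 0.95 1.2 1.50];
Phi = [-y(1:9)' u(1:9)'];          % regressor rows: [-y(k-1)  u(k-1)]
Y   = y(2:10)';                    % targets y(k), k = 2..10
theta = (Phi'*Phi) \ (Phi'*Y);     % pseudo-inverse solution, ~[-0.684; 0.292]
a = theta(1);  b = theta(2);
y_pred_11 = -a*y(10) + b*u(10)     % one-step-ahead prediction, ~1.61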
AutoRegressive Moving Average with eXtra
input (ARMAX) model
 ARMAX model:
$y(k) + a_1 y(k-1) + a_2 y(k-2) + a_3 y(k-3) + \cdots + a_{n_a} y(k-n_a) = b_1 u(k-1) + b_2 u(k-2) + b_3 u(k-3) + \cdots + b_{n_b} u(k-n_b) + e(k) + c_1 e(k-1) + c_2 e(k-2) + \cdots + c_{n_c} e(k-n_c)$

$A(q^{-1})\, y(k) = B(q^{-1})\, u(k) + C(q^{-1})\, e(k)$
[Autoregressive (AR), eXtra (X) input, and Moving Average (MA) parts]

$A(q^{-1}) = 1 + a_1 q^{-1} + a_2 q^{-2} + \cdots + a_{n_a} q^{-n_a}$
$B(q^{-1}) = b_1 q^{-1} + b_2 q^{-2} + \cdots + b_{n_b} q^{-n_b}$
$C(q^{-1}) = 1 + c_1 q^{-1} + c_2 q^{-2} + \cdots + c_{n_c} q^{-n_c}$
44 Md Shafiullah, Ph.D.
AutoRegressive Moving Average with eXtra
input (ARMAX) model
 ARMAX model:
$A(q^{-1})\, y(k) = B(q^{-1})\, u(k) + C(q^{-1})\, e(k)$
[Autoregressive (AR), eXtra (X) input, and Moving Average (MA) parts]

$y(k) = \frac{B(q^{-1})}{A(q^{-1})}\, u(k) + \frac{C(q^{-1})}{A(q^{-1})}\, e(k)$

45 Md Shafiullah, Ph.D.
Other models
 ARARX model:
$A(q^{-1})\, y(k) = B(q^{-1})\, u(k) + \frac{1}{D(q^{-1})}\, e(k)$
[Autoregressive (AR), eXtra (X) input, and an autoregressive (AR) noise term]

 ARARMAX model:
$A(q^{-1})\, y(k) = B(q^{-1})\, u(k) + \frac{C(q^{-1})}{D(q^{-1})}\, e(k)$
[Autoregressive (AR), eXtra (X) input, and an ARMA noise term]

46
Investigate the last terms: how they are AR and ARMA! Md Shafiullah, Ph.D.
Other models
 Box-Jenkins model:
$y(k) = \frac{B(q^{-1})}{A(q^{-1})}\, u(k) + \frac{C(q^{-1})}{D(q^{-1})}\, e(k)$

 Output Error model:
$y(k) = \frac{B(q^{-1})}{F(q^{-1})}\, u(k) + e(k)$

47 Md Shafiullah, Ph.D.
ARX Model
[Block diagram: $u(k) \rightarrow \frac{B(q^{-1})}{A(q^{-1})} \rightarrow \Sigma \rightarrow y(k)$, with $e(k) \rightarrow \frac{1}{A(q^{-1})}$ feeding the summing junction $\Sigma$]

48 Md Shafiullah, Ph.D.
ARMAX Model
[Block diagram: $u(k) \rightarrow \frac{B(q^{-1})}{A(q^{-1})} \rightarrow \Sigma \rightarrow y(k)$, with $e(k) \rightarrow \frac{C(q^{-1})}{A(q^{-1})}$ feeding the summing junction $\Sigma$]

49 Md Shafiullah, Ph.D.
Box-Jenkins Model
[Block diagram: $u(k) \rightarrow \frac{B(q^{-1})}{A(q^{-1})} \rightarrow \Sigma \rightarrow y(k)$, with $e(k) \rightarrow \frac{C(q^{-1})}{D(q^{-1})}$ feeding the summing junction $\Sigma$]

50 Md Shafiullah, Ph.D.
Output Error Model

[Block diagram: $u(k) \rightarrow \frac{B(q^{-1})}{F(q^{-1})} \rightarrow x(k) \rightarrow \Sigma \rightarrow y(k)$, with $e(k)$ added directly at the summing junction $\Sigma$]

51 Md Shafiullah, Ph.D.
Model Order Selection
 ARMAX model:
$y(k) + a_1 y(k-1) + \cdots + a_{n_a} y(k-n_a) = b_1 u(k-1) + \cdots + b_{n_b} u(k-n_b) + e(k) + c_1 e(k-1) + \cdots + c_{n_c} e(k-n_c)$

$A(q^{-1}) = 1 + a_1 q^{-1} + a_2 q^{-2} + \cdots + a_{n_a} q^{-n_a}$
$B(q^{-1}) = b_1 q^{-1} + b_2 q^{-2} + \cdots + b_{n_b} q^{-n_b}$
$C(q^{-1}) = 1 + c_1 q^{-1} + c_2 q^{-2} + \cdots + c_{n_c} q^{-n_c}$
$\theta = [a_1, a_2, \ldots, a_{n_a},\ b_1, b_2, \ldots, b_{n_b},\ c_1, c_2, \ldots, c_{n_c}]$

The number of parameters d is fixed by the chosen structure (LSE can be used for model parameter estimation):
$d = n_a$, with $n_b = n_c = 0$, for the AR model
$d = n_a + n_b$, with $n_c = 0$, for the ARX model
$d = n_a + n_c$, with $n_b = 0$, for the ARMA model
$d = n_a + n_b + n_c$ for the ARMAX model
52 Md Shafiullah, Ph.D.
Model Order Selection
 For simplicity, we consider equally balanced models:
$d = n_a$, with $n_b = n_c = 0$, for the AR model
$d = n_a + n_b$, with $n_c = 0$, for the ARX model, where $n_a = n_b$
$d = n_a + n_c$, with $n_b = 0$, for the ARMA model, where $n_a = n_c$
$d = n_a + n_b + n_c$ for the ARMAX model, where $n_a = n_b = n_c$
 Naïve approach
▪ Calculate the error for d = 1 to d_max
▪ Choose the value of d for which the error is minimum
53 Md Shafiullah, Ph.D.
Model Order Selection
d = m = model order
 d = 1: not enough degrees of freedom to represent the data. Underfitting: the model is too simple.
 d = 2: the data are well described by this model choice. Correct model order.
 d > 11: too many degrees of freedom; the model fits this particular data set perfectly but generalizes poorly. Overfitting: the model is too complex!
54 Md Shafiullah, Ph.D.
Model Order Selection
 Model fitting vs. model complexity:

• Estimate the model
• Plot the loss function (MSE) at different model orders
• The minimized loss is a decreasing function of the model order, and it drops quickly as the model picks up the relevant features
• As p (the model order) increases further, the model tends to overfit the data
• In practice, we look for the “knee” in the curve

What should be the order for this figure if it is an AR model? AR(5)


55 Link Md Shafiullah, Ph.D.
White Noise, a Novel, by
Don DeLillo, in 1985!

Model Order Selection


 Whiteness test:
▪ By nature, the calculated error should be white noise. Check whether the error is white noise or not.
▪ Choose the first value of d for which the whiteness test passes.
▪ The error or residual is defined as:
$\varepsilon(t) = y(t) - \hat{y}(t)$
▪ The correlation of $\varepsilon(t)$ is defined as:
$r_{\varepsilon}(\tau) = E[\varepsilon(t+\tau)\,\varepsilon(t)]$

▪ According to the test, $\varepsilon(t)$ should be a zero-mean process.
▪ If $\varepsilon(t)$ is a zero-mean white process:
▪ the correlation function $r_{\varepsilon}(\tau) = 0$ for any non-zero $\tau$
▪ the correlation function $r_{\varepsilon}(0) = \sigma^2 = \mathrm{var}(\varepsilon(t))$ at $\tau = 0$

56 Link Md Shafiullah, Ph.D.


Model Order Selection
 Whiteness test:
▪ Generation of a white-noise sequence and calculation of its properties in MATLAB.
MATLAB command: N = 200000; t = 0:N-1; e = randn(N,1);

[Figure: white-noise time series and its histogram]

• First moment (mean): $E[\varepsilon(t)] \approx 0$
• Autocorrelation: $R_{\varepsilon\varepsilon}(\tau) = 1.0$ at $\tau = 0$
57 Link Md Shafiullah, Ph.D.
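The quoted properties can be verified numerically. A minimal sketch continuing the slide's command (xcorr is from the Signal Processing Toolbox; the 'coeff' normalization is my choice):

% Generate white noise and check the whiteness properties
N = 200000; e = randn(N,1);
mean(e)                              % first moment, ~0
[Ree, lags] = xcorr(e, 50, 'coeff'); % normalized autocorrelation up to lag 50
Ree(lags == 0)                       % 1.0 at tau = 0
max(abs(Ree(lags ~= 0)))             % close to 0 for every tau ~= 0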
Model Order Selection
 Whiteness test:
▪ Generation of a noise sequence and calculation of its properties in MATLAB.

The autocorrelation of a continuous white-noise signal has a strong peak (a Dirac delta function) at $\tau = 0$ and is 0 for all $\tau \neq 0$.

[Figure: correlation function of the generated noise]
58 Link Md Shafiullah, Ph.D.
Model Order Selection
 Cross-correlation test:
▪ The cross-correlation between the input(s) and the error should be zero, meaning that there is nothing left to be extracted by playing with the model order.
▪ The cross-correlation between the error $\varepsilon(t)$ and the input $u(t)$ is defined as:
$r_{\varepsilon u}(\tau) = E[\varepsilon(t+\tau)\, u(t)]$
▪ The prediction errors ε(t) should be independent of the input u(t) for $\tau \geq 0$ (current and future errors are independent of current inputs).
▪ Ideally, the prediction errors ε(t) are independent of the input u(t) for any $\tau$ (all errors are independent of all inputs).
 A few references also compute the cross-correlations of (a) input squared and residuals, (b) input squared and residuals squared, and (c) residuals and (input × residuals).
59 Link1, Link2, Link3 Md Shafiullah, Ph.D.
Model Order Selection
 Adding zero-mean white noise with a variance of 4
MATLAB Command: a=5; b=4; t=1:0.05:20; x=a*cos(2*pi*t/10);
xn=x+sqrt(b)*randn(size(x));

60 Md Shafiullah, Ph.D.
Model Order Selection
 Adding white noise at an SNR of 20 dB to a sawtooth signal

MATLAB command: t = (0:0.1:60)'; x = sawtooth(t); SNR = 20;
y = awgn(x, SNR);

61 Md Shafiullah, Ph.D.
Model Order Selection
 Validation: the model is validated on a fresh set of data.

 Model complexity criteria: calculate the values of the following indices (which combine model quality and complexity) and decide on the model order (a MATLAB sketch follows below):
▪ Akaike information criterion (AIC)
▪ Corrected Akaike information criterion (AICc)
▪ Bayesian information criterion (BIC)
▪ Final prediction error (FPE)
▪ Minimum description length (MDL)
62 Link Md Shafiullah, Ph.D.
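A common workflow is to sweep the model order and compare these indices. A minimal sketch, assuming the System Identification Toolbox is available and that data is an iddata object already built from measured u and y (balanced orders na = nb, as on the earlier slides):

% Sweep equally balanced ARX orders and compare information criteria
orders = 1:10;
for n = orders
    model    = arx(data, [n n 1]);  % na = nb = n, one-sample delay
    aic_v(n) = aic(model);          % Akaike information criterion
    fpe_v(n) = fpe(model);          % final prediction error
end
[~, best] = min(aic_v);
fprintf('Suggested order (AIC): %d\n', best);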
Model Order Selection

63 Link Md Shafiullah, Ph.D.


L=512

MATLAB Example ARX


 Consider an ARX system with
$B(q) = q^{-1} + 0.3q^{-2}$
$A(q) = 1 + 0.3q^{-1} - 0.2q^{-2} - 0.35q^{-3}$
Simulate the system with a random input $u(t)$ of length $L = 512$ that is uncorrelated with the added white noise $e(t)$ of variance $\sigma^2 = 3$.
Obtain the estimates of the coefficients of B(q) and A(q) using the function “arx”. Use the dimensions that are compatible with the given B(q) and A(q), i.e., $n_a = 3$, $n_b = 2$, and $n_k = 1$.

64 Link Md Shafiullah, Ph.D.


L=512

MATLAB Example ARX


 ARX structure in MATLAB:
$y(t) + a_1 y(t-1) + \cdots + a_{n_a} y(t-n_a) = b_1 u(t-n_k) + \cdots + b_{n_b} u(t-n_b-n_k+1) + e(t)$

$y(t)$: output at time t
$n_a$: number of poles [AR terms]
$n_b$: number of zeros [input terms]
$n_k$: number of input samples that occur before the input affects the output, also called the dead time of the system
$e(t)$: white-noise disturbance value

65 Link Md Shafiullah, Ph.D.


L=512

MATLAB Example ARX System vs. model output

MATLAB command:
% Computing the output by filtering u and e through G and H: model output
y = filter(b, a, u) + filter(1, a, e);

% Estimating the system (ARX model) parameters using the arx function
data  = iddata(y', u');
theta = arx(data, [3 2 1]);

% Predicting the outputs from the estimated model
y_hat = filter(theta.B, theta.A, u);
pred_error = y - y_hat;
66 Link Md Shafiullah, Ph.D.
L=512

MATLAB Example ARX Predicted error

MATLAB command:
% Computing the output by filtering u and e through G and H: model output
y = filter(b, a, u) + filter(1, a, e);

% Estimating the system (ARX model) parameters using the arx function
data  = iddata(y', u');
theta = arx(data, [3 2 1]);

% Predicting the outputs from the estimated model
y_hat = filter(theta.B, theta.A, u);
pred_error = y - y_hat;
67 Link Md Shafiullah, Ph.D.
L=512

MATLAB Example ARX

Autocorrelation of the prediction error process


68 Link Md Shafiullah, Ph.D.
L=512

MATLAB Example ARX

Signal           | Mean   | Variance
Input            | 0.0139 | 1.0446
Noise            | 0.0132 | 2.9775
Output           | 0.0243 | 8.1661
Predicted output | 0.0111 | 1.9549
Predicted error  | 0.0132 | 5.8100

a      = [1.0000  0.3000  -0.2000  0.3500],  b      = [0.0000  1.0000  0.3000]
a_pred = [1.0000  0.2239  -0.2214  0.3921],  b_pred = [0.0000  1.0428  0.1502]

69 Link Md Shafiullah, Ph.D.


L=51200

MATLAB Example ARX

Why is the prediction still not accurate even with a huge amount of data?

System vs. model output


70 Link Md Shafiullah, Ph.D.
L=51200

MATLAB Example ARX

Autocorrelation of the prediction error process


71 Link Md Shafiullah, Ph.D.
L=51200

MATLAB Example ARX

Signal           | Mean   | Variance
Input            | 0.0006 | 1.0088
Noise            | 0.0013 | 2.9953
Output           | 0.0015 | 7.9373
Predicted output | 0.0006 | 1.5364
Predicted error  | 0.0009 | 6.3761

a      = [1.0000  0.3000  -0.2000  0.3500],  b      = [0.0000  1.0000  0.3000]
a_pred = [1.0000  0.2959  -0.1957  0.3486],  b_pred = [0.0000  1.0040  0.3152]

72 Link Md Shafiullah, Ph.D.


L=51200

MATLAB Example ARX


Increasing the input power relative to the white noise

MATLAB command:
% Generating a random input
L = 51200;            % length of input
u = 2*randn(1, L);    % input

% Generating white noise with variance 0.5
b = 0.5;                       % noise variance
e = sqrt(b)*randn(1, L);       % targeted white noise

[Figure: system vs. model output]
73 Link Md Shafiullah, Ph.D.
L=51200

MATLAB Example ARX

Signal           | Mean    | Variance
Input            | -0.0089 | 3.9904
Noise            | 0.0007  | 0.4975
Output           | -0.0076 | 7.4710
Predicted output | -0.0080 | 6.3204
Predicted error  | 0.0005  | 1.1098

a      = [1.0000  0.3000  -0.2000  0.3500],  b      = [0.0000  1.0000  0.3000]
a_pred = [1.0000  0.3041  -0.1999  0.3484],  b_pred = [0.0000  1.0009  0.3029]

74 Link Md Shafiullah, Ph.D.


Canonical form of a stochastic process
 The following block-diagram representation only exists if:
 $C(z)$ and $A(z)$ are polynomials of the same degree (or equivalently, the relative degree of the transfer function is zero): the numbers of poles and zeros are equal (proper transfer function)
 $C(z)$ and $A(z)$ are co-prime (they share no root)
 $C(z)$ and $A(z)$ have all roots inside the unit circle (in the z-domain): asymptotically stable filter
 If the filter $W(z)$ satisfies the above conditions, the process $y(t)$ is said to be in canonical form, where $e(t) \sim \omega_{n}(0, \lambda^2)$

[Block diagram: $e(t) \rightarrow W(z) = \frac{C(z)}{A(z)} \rightarrow y(t)$; equivalently $W(q^{-1}) = \frac{C(q^{-1})}{A(q^{-1})}$]
75 Link Md Shafiullah, Ph.D.


Canonical form of a stochastic process
 Suppose we want to predict $y(t)$ at $r \in \mathbb{N}$ steps ahead, i.e., $y(t+r)$, using the data known at time t. Then we have the following items:
 a set of data up to time t
 previous predictions: $\hat{y}(t+r-1 \mid t-1)$, $\hat{y}(t+r-2 \mid t-2)$, $\hat{y}(t+r-3 \mid t-3)$, etc.
 the process (transfer function): $W(z) = \frac{C(z)}{A(z)}$
 The r-step predictor is said to be optimal if:
 the prediction error has zero mean (it passes the whiteness test), $E[\varepsilon(t)] = 0$
 $\hat{y}(t+r \mid t)$ and $\varepsilon(t)$ are uncorrelated, $E[\hat{y}(t+r \mid t)\,\varepsilon(t)] = 0$; if that is not the case, there is still some information that could be used to define a better predictor
 the variance of the error, $\mathrm{Var}[\varepsilon(t)] = E[\varepsilon(t)^2]$, is the minimum achievable

76 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 If the process $y(t)$ is in canonical form, we can define the inverse of $W(z)$ as $\widetilde{W}(z) = W(z)^{-1}$:
$Y(z) = W(z)E(z) = \frac{C(z)}{A(z)}E(z)$

$E(z) = \widetilde{W}(z)Y(z) = \frac{A(z)}{C(z)}Y(z)$

[Block diagrams: $e(t) \rightarrow W(z) = \frac{C(z)}{A(z)} \rightarrow y(t)$ and $y(t) \rightarrow \widetilde{W}(z) = \frac{A(z)}{C(z)} \rightarrow e(t)$]
77 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 Recall: $y(t)$ can be written in the time domain using the convolution formula:
$y(t) = \sum_{j=-\infty}^{t} w(t-j)\, e(j) = \sum_{i=0}^{\infty} w(i)\, e(t-i)$

where $w(t)$ is the impulse response of the system driven by $e(t)$.
 We can now do the same for $e(t)$:
$e(t) = \sum_{i=0}^{\infty} \widetilde{w}(i)\, y(t-i)$

[Block diagrams: $e(t) \rightarrow W(z) = \frac{C(z)}{A(z)} \rightarrow y(t)$ and $y(t) \rightarrow \widetilde{W}(z) = \frac{A(z)}{C(z)} \rightarrow e(t)$]
78 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 To predict $y(t)$ at r steps ahead:
$y(t+r) = \sum_{i=0}^{\infty} w(i)\, e(t+r-i)$

$y(t+r) = w(0)e(t+r) + w(1)e(t+r-1) + w(2)e(t+r-2) + \cdots + w(r-1)e(t+1) + w(r)e(t) + w(r+1)e(t-1) + \cdots$

 The first part of the above equation (terms up to $w(r-1)e(t+1)$) is not computable, since $e(t+1), e(t+2), \ldots, e(t+r)$ cannot be obtained: y is not known at those time instants.
 However, the remaining portion (from $w(r)e(t)$ onward) can be obtained.

[Block diagrams: $e(t) \rightarrow W(z) = \frac{C(z)}{A(z)} \rightarrow y(t)$ and $y(t) \rightarrow \widetilde{W}(z) = \frac{A(z)}{C(z)} \rightarrow e(t)$]
79 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 Therefore, the r-step-ahead prediction can be defined from
$y(t+r) = w(0)e(t+r) + w(1)e(t+r-1) + \cdots + w(r-1)e(t+1) + w(r)e(t) + w(r+1)e(t-1) + \cdots$

$y(t+r) = \varepsilon(t+r) + \hat{y}(t+r \mid t)$

$\hat{y}(t+r \mid t) = y(t+r) - \varepsilon(t+r)$

$\varepsilon(t+r) = y(t+r) - \hat{y}(t+r \mid t)$

[Block diagrams: $e(t) \rightarrow W(z) = \frac{C(z)}{A(z)} \rightarrow y(t)$ and $y(t) \rightarrow \widetilde{W}(z) = \frac{A(z)}{C(z)} \rightarrow e(t)$]
80 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 Recall: the z-transform of the impulse response $w(t)$ is the transfer function $W(z)$:
$W(z) = \frac{C(z)}{A(z)} = \sum_{t=0}^{\infty} w(t)z^{-t} = w(0) + w(1)z^{-1} + w(2)z^{-2} + \cdots + w(r-1)z^{-r+1} + w(r)z^{-r} + w(r+1)z^{-r-1} + \cdots$

 The first group of terms (up to $w(r-1)z^{-r+1}$) contains the coefficients that appear in the error equation, while the remaining terms (from $w(r)z^{-r}$ onward) contain the coefficients that appear in the predictor equation.
 Therefore, performing the long division of the transfer function $W(z)$ is useful.

81 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 Computing the r-step long division of the transfer function $W(z)$:
$W(z) = \frac{C(z)}{A(z)} = Q_r(z) + \frac{R_r(z)}{A(z)} = Q_r(z) + z^{-r}\frac{R(z)}{A(z)}$

 From the previous slide:
$W(z) = w(0) + w(1)z^{-1} + w(2)z^{-2} + \cdots + w(r-1)z^{-r+1} + w(r)z^{-r} + w(r+1)z^{-r-1} + \cdots$

$W(z) = w(0) + w(1)z^{-1} + w(2)z^{-2} + \cdots + w(r-1)z^{-r+1} + z^{-r}\{w(r) + w(r+1)z^{-1} + \cdots\}$

82 Md Shafiullah, Ph.D.
[Block diagram: $e(t) \rightarrow W(z) = \frac{C(z)}{A(z)} \rightarrow y(t)$]

Canonical form of a stochastic process


 From the long division:
$W(z) = Q_r(z) + z^{-r}\frac{R(z)}{A(z)}$

 From the z-transform expansion:
$W(z) = w(0) + w(1)z^{-1} + \cdots + w(r-1)z^{-r+1} + z^{-r}\{w(r) + w(r+1)z^{-1} + \cdots\} = Q_r(z) + z^{-r}\frac{R(z)}{A(z)}$

 From the predictor equation:
$y(t+r) = \varepsilon(t+r) + \hat{y}(t+r \mid t)$

 Therefore,
$\hat{y}(t+r \mid t) = z^{-r}\frac{R(z)}{A(z)}\, e(t+r) = \frac{R(z)}{A(z)}\, e(t)$

$\varepsilon(t+r) = Q_r(z)\, e(t+r)$
83 Md Shafiullah, Ph.D.


Canonical form of a stochastic process
 The block diagram of the predictor is:

[Block diagram: $y(t) \rightarrow \widetilde{W}(z) = \frac{A(z)}{C(z)} \rightarrow e(t) \sim \omega_{n}(0, \lambda^2) \rightarrow \frac{R(z)}{A(z)} \rightarrow \hat{y}(t+r \mid t)$]

 Therefore,
$\hat{y}(t+r \mid t) = \frac{R(z)}{A(z)}\, e(t) = \frac{R(z)}{A(z)}\cdot\frac{A(z)}{C(z)}\, y(t) = \frac{R(z)}{C(z)}\, y(t)$

$\hat{y}(t+r \mid t) = \frac{R(z)}{C(z)}\, y(t)$

84 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 Calculation of the variance of the prediction error:
$\mathrm{Var}[\varepsilon(t+r)] = \mathrm{Var}[w(0)e(t+r) + w(1)e(t+r-1) + \cdots + w(r-1)e(t+1)]$

$= E\left[\{w(0)e(t+r) + w(1)e(t+r-1) + \cdots + w(r-1)e(t+1)\}^2\right]$

$= E\left[\{w(0)e(t+r)\}^2 + \{w(1)e(t+r-1)\}^2 + \cdots + \{w(r-1)e(t+1)\}^2 + 2w(0)w(1)e(t+r)e(t+r-1) + \cdots\right]$

$= E\left[\{w(0)e(t+r)\}^2 + \{w(1)e(t+r-1)\}^2 + \cdots + \{w(r-1)e(t+1)\}^2\right]$

since $E[e(t+r)e(t+r-1)] = 0$ (the cross terms vanish because e is white).

85 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 Calculation of the variance of the prediction error:
$\mathrm{Var}[\varepsilon(t+r)] = E\left[\{w(0)e(t+r)\}^2 + \{w(1)e(t+r-1)\}^2 + \cdots + \{w(r-1)e(t+1)\}^2\right]$

$= w(0)^2 E[e(t+r)^2] + w(1)^2 E[e(t+r-1)^2] + \cdots + w(r-1)^2 E[e(t+1)^2]$

$= 1\cdot\lambda^2 + w(1)^2\lambda^2 + \cdots + w(r-1)^2\lambda^2 = \lambda^2\{1 + w(1)^2 + \cdots + w(r-1)^2\}$

since $\mathrm{Var}[e(t)] = \lambda^2$ and $w(0)^2 = 1$ because $A(z)$ and $C(z)$ are monic.

 If $r = 1$, then $\mathrm{Var}[\varepsilon(t+r)] = \lambda^2$ ➔ the variance of the error or white noise
 The variance increases as $r$ increases
 If $r \to \infty$, $\mathrm{Var}[\varepsilon(t+r)] = \mathrm{Var}[y(t)]$ ➔ prove it
86 Md Shafiullah, Ph.D.
Canonical form of a stochastic process
 The r-step long division of the transfer function $W(z)$:
$W(z) = \frac{C(z)}{A(z)} = Q_r(z) + \frac{R_r(z)}{A(z)} = Q_r(z) + z^{-r}\frac{R(z)}{A(z)}$

 Example: compute the 2-step long division of the following transfer function:
$W(z) = \frac{C(z)}{A(z)} = \frac{1 + \frac{1}{3}z^{-1}}{1 + \frac{1}{2}z^{-1}} = Q_2(z) + z^{-2}\frac{R(z)}{A(z)} = Q_2(z) + \frac{R_2(z)}{A(z)}$

 Calculate $Q_2(z)$ and $R_2(z)$:
$Q_2(z) = 1 - \frac{1}{6}z^{-1}, \qquad R_2(z) = \frac{1}{12}z^{-2} = z^{-2}R(z)$
87 Md Shafiullah, Ph.D.
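The quotient and remainder above can also be obtained numerically. A minimal sketch of the r-step division in MATLAB (the route via filter and conv is my own, not from the slides):

% r-step long division of W(z) = C(z)/A(z) = Q_r(z) + z^{-r} R(z)/A(z)
C = [1 1/3];  A = [1 1/2];  r = 2;
Qr = filter(C, A, [1 zeros(1, r-1)]);   % first r impulse-response coefficients -> Q2 = 1 - (1/6)z^-1
p   = conv(Qr, A);                      % Q_r(z)*A(z)
len = max(numel(C), numel(p));
rem = [C zeros(1, len-numel(C))] - [p zeros(1, len-numel(p))];   % = z^{-r} R(z)
R   = rem(r+1:end)                      % -> 1/12, i.e. R(z) = 1/12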
Optimal Predictor for MA processes
 The following MA process is WSS:
$y(t) = c_0 e(t) + c_1 e(t-1) + \cdots + c_n e(t-n) = \sum_{i=0}^{n} c_i e(t-i); \quad e(t) \sim \omega_{n}(\mu, \lambda^2)$

 To predict one step ahead, we first make the process canonical by imposing the following conditions:
 $c_0 = 1$, so that $C(z) = 1 + c_1 z^{-1} + \cdots + c_n z^{-n}$
 the white noise is made zero-mean: $e(t) \sim \omega_{n}(0, \lambda^2)$
 The MA process now becomes:
$y(t) = e(t) + c_1 e(t-1) + \cdots + c_n e(t-n); \quad e(t) \sim \omega_{n}(0, \lambda^2)$

$y(t) = \varepsilon(t) + \hat{y}(t \mid t-1)$
 It can also be expressed as:
$y(t+1) = \varepsilon(t+1) + \hat{y}(t+1 \mid t)$
88 Md Shafiullah, Ph.D.
MA Process: Example
 Evaluate the 2-step-ahead predictor of the following MA(2) process:
$y(t) = e(t) - \frac{1}{12}e(t-1) - \frac{1}{12}e(t-2); \quad e(t) \sim \omega_{n}(0, 1)$

 To predict two steps ahead, we need to check the following aspects first:
 Mean of the process: $E[y(t)] = 0$, since $e(t)$ has zero mean.
 Transfer function of the process:
$W(z) = \frac{C(z)}{A(z)} = \frac{1 - \frac{1}{12}z^{-1} - \frac{1}{12}z^{-2}}{1}$

Since the two polynomials $C(z)$ and $A(z)$ are co-prime, monic, and of the same order, the process is in canonical form.
 $C(z)$ and $A(z)$ have their roots $\left[\frac{1}{3}, -\frac{1}{4}\right]$ inside the unit circle: asymptotically stable filter.
89 Link Md Shafiullah, Ph.D.
MA Process: Example
 Recall the r-step-ahead prediction definition:
$y(t+r) = w(0)e(t+r) + w(1)e(t+r-1) + \cdots + w(r)e(t) + w(r+1)e(t-1) + \cdots$

For an MA(n) process, with $c_0 = w(0) = 1$ in canonical form:
$y(t+r) = e(t+r) + c_1 e(t+r-1) + c_2 e(t+r-2) + \cdots + c_{r-1}e(t+1) + c_r e(t) + c_{r+1}e(t-1) + \cdots + c_n e(t+r-n)$

Equivalently, shifting back to time t:
$y(t) = e(t) + c_1 e(t-1) + c_2 e(t-2) + \cdots + c_{r-1}e(t-r+1) + c_r e(t-r) + c_{r+1}e(t-r-1) + \cdots + c_n e(t-n)$

 Now, for the 2-step-ahead predictor of the MA(2) process:
$y(t) = e(t) + c_1 e(t-1) + c_2 e(t-2)$
90 Md Shafiullah, Ph.D.
MA Process: Example
 Now, the 2-step-ahead predictor of the MA(2) process:
$y(t) = e(t) + c_1 e(t-1) + c_2 e(t-2)$

$y(t) = \varepsilon(t) + \hat{y}(t \mid t-2)$
 Given process:
$y(t) = e(t) - \frac{1}{12}e(t-1) - \frac{1}{12}e(t-2)$
 Therefore,
$\hat{y}(t \mid t-2) = -\frac{1}{12}e(t-2)$

$\varepsilon(t) = e(t) - \frac{1}{12}e(t-1)$

91 Md Shafiullah, Ph.D.
MA Process: Example
$\hat{y}(t \mid t-2) = -\frac{1}{12}e(t-2)$
 We need a predictor that uses only output data. To do so, we compute the whitening filter:
$y(t) = W(z)\, e(t)$

$e(t) = \widetilde{W}(z)\, y(t)$

$e(t-2) = \widetilde{W}(z)\, y(t-2)$

 Therefore,
$\hat{y}(t \mid t-2) = -\frac{1}{12}e(t-2) = -\frac{1}{12}\widetilde{W}(z)\, y(t-2)$
92 Md Shafiullah, Ph.D.
MA Process: Example
 Therefore,
$\hat{y}(t \mid t-2) = -\frac{1}{12}e(t-2) = -\frac{1}{12}\widetilde{W}(z)\, y(t-2)$

$\hat{y}(t \mid t-2) = -\frac{1}{12}\cdot\frac{1}{1 - \frac{1}{12}z^{-1} - \frac{1}{12}z^{-2}}\, y(t-2)$

$\left(1 - \frac{1}{12}z^{-1} - \frac{1}{12}z^{-2}\right)\hat{y}(t \mid t-2) = -\frac{1}{12}y(t-2)$

$\hat{y}(t \mid t-2) - \frac{1}{12}\hat{y}(t-1 \mid t-3) - \frac{1}{12}\hat{y}(t-2 \mid t-4) = -\frac{1}{12}y(t-2)$

$\hat{y}(t \mid t-2) = \frac{1}{12}\hat{y}(t-1 \mid t-3) + \frac{1}{12}\hat{y}(t-2 \mid t-4) - \frac{1}{12}y(t-2)$
93 Md Shafiullah, Ph.D.
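A quick simulation can sanity-check this recursion and the error variance derived on the next slide. A minimal sketch of my own (not from the slides), implementing both the process and the predictor with filter:

% Simulate the MA(2) process and run the 2-step predictor
N = 200000;  e = randn(1, N);                  % e(t) ~ WN(0,1)
y = filter([1 -1/12 -1/12], 1, e);             % y(t) = e(t) - e(t-1)/12 - e(t-2)/12
% yhat(t|t-2) = -(1/12) z^-2 / C(z) applied to y(t)
yhat = filter([0 0 -1/12], [1 -1/12 -1/12], y);
var(y(10:end) - yhat(10:end))                  % ~145/144 = 1.0069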
MA Process: Example
 Variance of the prediction error:
$\mathrm{Var}[\varepsilon(t)] = \mathrm{Var}\left[e(t) - \frac{1}{12}e(t-1)\right] = E\left[\left(e(t) - \frac{1}{12}e(t-1)\right)^2\right]$

$= E\left[e(t)^2 + \frac{1}{144}e(t-1)^2 - \frac{1}{6}e(t)e(t-1)\right]$

$= E[e(t)^2] + \frac{1}{144}E[e(t-1)^2] - \frac{1}{6}E[e(t)e(t-1)]$

$= 1 + \frac{1}{144}\times 1 - \frac{1}{6}\times 0 = \frac{145}{144}$

94 Md Shafiullah, Ph.D.
AR Process: Example
 Evaluate the 2-step-ahead predictor of the following AR(1) process:

$y(t) = -\frac{1}{2}y(t-1) + e(t); \quad e(t) \sim \omega_{n}(14, 108)$

$\hat{y}(t \mid t-2) = ?$

95 Link Md Shafiullah, Ph.D.


ARMA Process: Example
 Evaluate the 1-step-ahead predictor of the following ARX process:

$y(t) = \frac{1 + \frac{1}{5}z^{-1}}{1 - \frac{1}{2}z^{-1}}\, u(t-1) + \frac{1}{1 - \frac{1}{2}z^{-1}}\, e(t); \quad e(t) \sim \omega_{n}(0, 2)$

$\hat{y}(t \mid t-1) = ?$

96 Link Md Shafiullah, Ph.D.


ARMA Process: Example
 Evaluate the 1-step-ahead predictor of the following ARMA(1,1) process:

$y(t) = -\frac{1}{3}y(t-1) + e(t) + \frac{1}{2}e(t-1); \quad e(t) \sim \omega_{n}(0, 2)$

$\hat{y}(t \mid t-1) = ?$

97 Link Md Shafiullah, Ph.D.


ARMA Process: Example
 Evaluate the 1-step- and 2-step-ahead predictors of the following ARMAX process:

$y(t) = \frac{1 + \frac{1}{5}z^{-1}}{1 - \frac{1}{2}z^{-1}}\, u(t-1) + \frac{1 + \frac{1}{4}z^{-2}}{1 - \frac{1}{2}z^{-1}}\, e(t); \quad e(t) \sim \omega_{n}(0, 2)$

$\hat{y}(t \mid t-1) = ?$

$\hat{y}(t \mid t-2) = ?$

98 Link Md Shafiullah, Ph.D.


End of Topic 07

Content Courtesy
 Dr. Mujahed Mohammad Al-Dhaifallah
 Dr. Fouad M. Al-Sunni
 Mr. Mohamed Mohamed Ahmed
 Text and Reference Books
 Online Materials
Appendix

100 Md Shafiullah, Ph.D.


n-Step ahead prediction
 The one-step-ahead prediction is given by:
$\hat{y}(k \mid k-1) = H^{-1}(q^{-1})G(q^{-1})\, u(k) + \left[1 - H^{-1}(q^{-1})\right] y(k)$

 The n-step-ahead prediction is given by:
$\hat{y}(k \mid k-n) = W_n(q^{-1})G(q^{-1})\, u(k) + \left[1 - W_n(q^{-1})\right] y(k)$

where $W_n(q^{-1}) = \hat{H}(q^{-1})H^{-1}(q^{-1})$ and $\hat{H}(q^{-1})$ consists of the first n terms of $H(q^{-1})$.

101 Md Shafiullah, Ph.D.


n-Step ahead prediction

Example: Consider the following system, where $w(t)$ is white noise with a variance of 1:
$y(k) - y(k-1) + 0.09\, y(k-2) = u(k-1) + 2w(k)$

We first express the output as
$y(k) = \underbrace{\frac{q^{-1}}{(1 - 0.1q^{-1})(1 - 0.9q^{-1})}}_{G(q)}\, u(k) + \underbrace{\frac{1}{(1 - 0.1q^{-1})(1 - 0.9q^{-1})}}_{H(q)}\, e(k)$

where $e(k) = 2w(k)$ is a white-noise process of variance $2^2 = 4$.

We then form the one-step-ahead predictor as
$\hat{y}(k \mid k-1) = H^{-1}(q)G(q)\, u(k) + \left[1 - H^{-1}(q)\right] y(k) = q^{-1}u(k) + \left(q^{-1} - 0.09q^{-2}\right) y(k) = u(k-1) + y(k-1) - 0.09\, y(k-2)$
102 Md Shafiullah, Ph.D.
n-Step ahead prediction

For the two-step-ahead prediction we decompose H as
$H(q) = \frac{1}{(1 - 0.1q^{-1})(1 - 0.9q^{-1})} = (1 + 0.1q^{-1} + 0.01q^{-2} + \cdots)(1 + 0.9q^{-1} + 0.81q^{-2} + \cdots) = \underbrace{1 + q^{-1}}_{\hat{H}(q)} + \underbrace{0.91q^{-2} + \cdots}_{\tilde{H}_2(q)}$

$W_2(q) = \hat{H}(q^{-1})H^{-1}(q^{-1}) = (1 + q^{-1})(1 - q^{-1} + 0.09q^{-2})$

$\hat{y}(k \mid k-2) = W_2(q)G(q)\, u(k) + \left[1 - W_2(q)\right] y(k)$
$= (1 + q^{-1})q^{-1}u(k) + \left(0.91q^{-2} - 0.09q^{-3}\right) y(k)$
$= u(k-1) + u(k-2) + 0.91\, y(k-2) - 0.09\, y(k-3)$

(A quick simulation check of both predictors is sketched below.)
103 27/11/2024 Md Shafiullah, Ph.D.
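As a sanity check on these expressions, the system can be simulated and the empirical prediction-error variances compared with theory: 4 for the one-step predictor (the error is e(k)) and about 8 for the two-step predictor (the error is e(k) + e(k-1)). A minimal sketch of my own, not part of the original slides:

% Simulate y(k) - y(k-1) + 0.09 y(k-2) = u(k-1) + 2 w(k) and test both predictors
N = 50000;  u = randn(1, N);  e = 2*randn(1, N);   % e(k) = 2 w(k), variance 4
y = zeros(1, N);
for k = 3:N
    y(k) = y(k-1) - 0.09*y(k-2) + u(k-1) + e(k);
end
y1 = zeros(1, N);  y2 = zeros(1, N);
for k = 4:N
    y1(k) = u(k-1) + y(k-1) - 0.09*y(k-2);                 % 1-step predictor
    y2(k) = u(k-1) + u(k-2) + 0.91*y(k-2) - 0.09*y(k-3);   % 2-step predictor
end
var(y(4:N) - y1(4:N))    % ~4
var(y(4:N) - y2(4:N))    % ~8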


https://cal.unibg.it/wp-content/uploads/DSI/slide/Lecture-16-Predictors.pdf

https://arch.readthedocs.io/en/latest/univariate/univariate_forecasting_with_exogenous_variables.html

https://www.mathworks.com/help/ident/ref/arx.html

https://www.mathworks.com/help/ident/ref/predict.html

https://busoniu.net/teaching/sysid2017/

104 27/11/2024 Md Shafiullah, Ph.D.
