Modeling and Analysis of Time Series Data: Chapter 1: Introduction
Edward L. Ionides
Outline
1 Overview
2 Example: Winter in Michigan
Course files on Github
Rmarkdown and knitr
Some basic investigation using R
3 A first look at an autoregressive-moving average (ARMA) model
4 Fitting an ARMA model in R
5 Model diagnostics
6 Model misspecification and non-reproducibility
7 A first look at a state-space model
Overview
Discuss some basic motivations for the topic of time series analysis.
Introduce some fundamental concepts for time series analysis:
stationarity, autocorrelation, autoregressive models, moving average
models, autoregressive-moving average (ARMA) models, state-space
models. These will be covered in more detail later.
Introduce some of the computational tools we will be using.
Time series data are, simply, data collected at many different times.
This is a common type of data! Observations at similar time points
are often more similar than more distant observations.
This immediately forces us to think beyond the independent,
identically distributed assumptions fundamental to much basic
statistical theory and practice.
Time series dependence is an introduction to more complicated
dependence structures: space, space/time, networks
(social/economic/communication), ...
Example: Winter in Michigan Course files on Github
y <- read.table(file="ann_arbor_weather.csv",header=TRUE)
Example: Winter in Michigan Rmarkdown and knitr
The notes combine source code with text, to generate statistical analysis
that is
Reproducible
Easily modified or extended
These two properties are useful for developing your own statistical research
projects. Also, they are useful for teaching and learning statistical
methodology, since they make it easy for you to replicate and adapt
analysis presented in class.
Many of you will already know Rmarkdown (Rmd format) and/or
Jupyter notebooks.
knitr (Rnw format) is similar, and is also supported by RStudio. The
notes are in Rnw, since it is superior for combining with LaTeX to
produce pdf articles.
Rmd naturally produces html.
Example: Winter in Michigan Some basic investigation using R
str(y)
i.i.d. data
sufficiently close to normal for a central limit theorem to hold
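The two fragments above read as the assumptions behind a standard error for a sample mean. As a minimal sketch of that calculation, the following R code computes a mean, its i.i.d. standard error and an approximate 95% interval. The data are simulated stand-ins (the vector `low` and its parameters are hypothetical, not the Ann Arbor series), and treating the simple estimate as a sample mean is an assumption here.

```r
# Simulated stand-in for a yearly temperature series (NOT the Ann Arbor data).
set.seed(1)
low <- rnorm(100, mean = -3, sd = 7)
low[5] <- NA                            # mimic a missing year

n   <- sum(!is.na(low))                 # number of observed values
mu1 <- mean(low, na.rm = TRUE)          # sample mean
se1 <- sd(low, na.rm = TRUE) / sqrt(n)  # standard error, valid for i.i.d. data

# Approximate 95% interval, relying on a central limit theorem
ci <- mu1 + c(-1.96, 1.96) * se1
print(round(c(mean = mu1, se = se1, lower = ci[1], upper = ci[2]), 2))
```

If the observations are dependent, `se1` can be badly calibrated, which is what motivates the time series models that follow.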
The first rule of data analysis is to plot the data in as many ways as
you can think of. For time series, we usually start with a time plot.
plot(Low~Year,data=y,type="l")
A first look at an autoregressive-moving average (ARMA) model
ARMA models
A note on notation
Fitting an ARMA model in R
Maximum likelihood
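Coefficients:
         ar1    ma1  intercept
      -0.596  0.630     -2.858
s.e.   0.594  0.573      0.683

sigma^2 estimated as 55.5:  log likelihood = -424.92,  aic = 857.85

Annotations: ar1, ma1 and intercept are the estimates $\hat\alpha$, $\hat\beta$ and $\hat\mu$; sigma^2 is $\hat\sigma^2$; the standard errors (for example, 0.683 for the intercept) come from the observed Fisher information.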
We will write the ARMA(1,1) estimate of $\mu$ as $\hat\mu_2$, and its standard error
as $\mathrm{SE}_2$.
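A minimal sketch of how estimates and standard errors of this kind can be pulled out of a fitted object, assuming the fit comes from `arima()` in the stats package. The series `x` is simulated with arbitrary parameter values, so the numbers will not match the output shown above.

```r
set.seed(2)
# Simulated ARMA(1,1) series; the parameter values are arbitrary illustrations.
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 200) - 3

fit <- arima(x, order = c(1, 0, 1))  # Gaussian ARMA(1,1) with a mean term

est <- coef(fit)                     # named vector: ar1, ma1, intercept
se  <- sqrt(diag(fit$var.coef))      # the values printed in the "s.e." row

mu2 <- est["intercept"]              # estimate of the mean
se2 <- se["intercept"]               # its standard error

print(round(rbind(estimate = est, s.e. = se), 3))
print(c(sigma2 = fit$sigma2, loglik = fit$loglik, aic = fit$aic))
```

Note that the coefficient labelled `intercept` by `arima()` is the estimated mean of the series.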
Investigating R objects
names(arma11)
Model diagnostics
In this case, the two estimates, $\hat\mu_1 = -2.86$ and $\hat\mu_2 = -2.86$, and
their standard errors, $\mathrm{SE}_1 = 0.67$ and $\mathrm{SE}_2 = 0.68$, are close.
For data up to 2015, $\hat\mu_1^{2015} = -2.83$ and $\hat\mu_2^{2015} = -2.85$, with
standard errors $\mathrm{SE}_1^{2015} = 0.68$ and $\mathrm{SE}_2^{2015} = 0.83$.
In this case, the standard error for the simpler model is
$100(1 - \mathrm{SE}_1^{2015}/\mathrm{SE}_2^{2015}) = 17.5\%$ smaller.
Exactly how the ARMA(1,1) model is fitted and the standard errors
computed will be covered later.
Question 1.3. When standard errors for two methods differ, which is more
trustworthy? Or are they both equally valid for their distinct estimators?
(i) check assumptions
(ii) do a simulation study
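One way option (ii) could be carried out is sketched below. Data are simulated from a known ARMA(1,1) model, both estimators are applied to each simulated series, and the average reported standard error is compared with the actual spread of the estimates across replications. The parameter values, the sample size and the helper `sim_one` are all hypothetical choices for illustration.

```r
set.seed(3)
n_rep <- 200   # number of simulated data sets
n     <- 100   # length of each simulated series

sim_one <- function() {
  out <- c(mu1 = NA_real_, se1 = NA_real_, mu2 = NA_real_, se2 = NA_real_)
  x <- as.numeric(arima.sim(model = list(ar = 0.6, ma = 0.2), n = n))
  out["mu1"] <- mean(x)             # simple estimator: sample mean
  out["se1"] <- sd(x) / sqrt(n)     # its i.i.d. standard error
  # Guard against occasional optimizer failures; those replications stay NA.
  fit <- tryCatch(arima(x, order = c(1, 0, 1), method = "ML"),
                  error = function(e) NULL)
  if (!is.null(fit)) {
    out["mu2"] <- coef(fit)[["intercept"]]
    out["se2"] <- sqrt(fit$var.coef["intercept", "intercept"])
  }
  out
}

res <- t(replicate(n_rep, sim_one()))   # one row per replication

# Compare the average reported SE with the empirical SD of each estimator
summary_table <- rbind(
  simple = c(mean_reported_se = mean(res[, "se1"], na.rm = TRUE),
             empirical_sd     = sd(res[, "mu1"], na.rm = TRUE)),
  arma11 = c(mean_reported_se = mean(res[, "se2"], na.rm = TRUE),
             empirical_sd     = sd(res[, "mu2"], na.rm = TRUE))
)
print(round(summary_table, 3))
```

A standard error is trustworthy in this sense when its reported value is close to the empirical standard deviation of the estimator under a plausible data-generating model.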
plot(arma11$resid)
We see slow variation in the residuals, over a decadal time scale. However,
the residuals $r_{1:N}$ are close to uncorrelated. We see this by plotting their
pairwise sample correlations at a range of lags. This is the sample
autocorrelation function, or sample ACF, written for each lag $h$ as
$$\hat\rho_h = \frac{\frac{1}{N}\sum_{n=1}^{N-h} r_n \, r_{n+h}}{\frac{1}{N}\sum_{n=1}^{N} r_n^2}. \tag{6}$$
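The sample ACF formula can be checked numerically. The sketch below implements it directly and compares the result with `acf()` from the stats package, called with `demean = FALSE` because the formula does not subtract a sample mean. The function name `sample_acf` and the simulated vector `r` are illustrative assumptions, not part of the course code.

```r
# Direct implementation of the sample ACF at lags 1, ..., max_lag
sample_acf <- function(r, max_lag) {
  N <- length(r)
  denom <- sum(r^2) / N                       # (1/N) * sum of r_n^2
  sapply(seq_len(max_lag), function(h) {
    numer <- sum(r[1:(N - h)] * r[(1 + h):N]) / N  # (1/N) * sum of r_n r_{n+h}
    numer / denom
  })
}

set.seed(4)
r <- rnorm(50)                                # stand-in for model residuals

rho_hat <- sample_acf(r, max_lag = 5)

# acf() returns lag 0 first, so drop that entry before comparing
rho_builtin <- as.numeric(
  acf(r, lag.max = 5, demean = FALSE, plot = FALSE)$acf
)[-1]

print(all.equal(rho_hat, rho_builtin))
```

By default `acf()` subtracts the sample mean first, which makes little difference for residuals whose mean is already close to zero.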
acf(arma11$resid,na.action=na.pass)
A first look at a state-space model
$\{\epsilon_n\}$ is iid $N(0,1)$, $\{\nu_n\}$ is iid $N(0,\sigma_\nu^2)$, $\{\omega_n\}$ is iid $N(0,\sigma_\omega^2)$.
$H_n$ is unobserved volatility at time $t_n$. We only observe the return,
modeled by $Y_n$.
$H_n$ has auto-regressive behavior and dependence on $Y_{n-1}$ and a
slowly varying process $G_n$.
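The model equations are not reproduced in this section, so the following simulation is only a hypothetical sketch of one model matching the description above. It assumes that $G_n$ is a random walk driven by $\nu_n$, that $H_n$ is an AR(1) process driven by $\omega_n$ with an extra term depending on $Y_{n-1}$ and $\tanh(G_n)$, and that $Y_n = \exp(H_n/2)\,\epsilon_n$. The functional forms and every parameter value (`phi`, `mu_h`, `beta`, `sigma_nu`, `sigma_omega`) are assumptions made for illustration.

```r
set.seed(5)
N <- 500

# Hypothetical parameter values, chosen only for illustration
phi         <- 0.95   # autoregressive coefficient of the log volatility H
mu_h        <- -9     # long-run level of H
beta        <- 0.1    # strength of the dependence on the previous return
sigma_nu    <- 0.01   # sd of nu_n, the innovations of the slow process G
sigma_omega <- 0.2    # sd of omega_n, the innovations of H

G <- H <- Y <- numeric(N)
G[1] <- 0
H[1] <- mu_h
Y[1] <- exp(H[1] / 2) * rnorm(1)

for (n in 2:N) {
  # Slowly varying latent process (random walk)
  G[n] <- G[n - 1] + rnorm(1, 0, sigma_nu)
  # Latent log volatility: AR(1) plus a term involving Y[n-1] and G[n]
  H[n] <- mu_h * (1 - phi) + phi * H[n - 1] +
    beta * tanh(G[n]) * Y[n - 1] * exp(-H[n - 1] / 2) +
    rnorm(1, 0, sigma_omega)
  # Observed return: only Y is available to the analyst
  Y[n] <- exp(H[n] / 2) * rnorm(1)
}

plot(Y, type = "l", xlab = "n", ylab = "Simulated return")
```

The point of the sketch is structural: `G` and `H` are latent, and inference must be based on `Y` alone, which is what distinguishes a state-space model from the ARMA models fitted earlier.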