Probability & Statistics - Review 1
Arun K. Tangirala
Trend-plus-random processes
Random process
Framework
1. Univariate / bivariate
2. Linear random process
3. Stationarity and non-stationarity (of certain types)
4. Discrete-time
5. Time- and frequency-domain analysis
Notation
- Random variable: UPPERCASE, e.g., X; outcomes: lowercase, e.g., x.
- Probability distribution and density functions: F(x) and f(x), respectively.
- Scalars: lowercase x, θ, etc.
- Vectors: lowercase bold-faced, e.g., x, v, θ, etc.
- Matrices: UPPERCASE bold-faced A, X.
- Expectation operator: E(.)
- Discrete-time random signal and process: v[k] (or {v[k]}) (scalar-valued)
- White noise: e[k]
- Backward / forward shift operators: q^(-1) and q, s.t. q^(-1) v[k] = v[k-1].
- Angular and cyclic frequencies: ω and f, respectively.
- ...
Arun K. Tangirala Applied Time-Series Analysis 8
Random Variable
Definition
A random variable (RV) is one whose value set contains at least two elements, i.e., it
draws one value from at least two possibilities. The space of possible values is known as
the outcome space or sample space.
Formal definition
Outcomes of random phenomena can be qualitative or quantitative. In order to have a unified mathematical treatment, RVs are defined to be quantitative.
- In the study of RVs, the time (or space) dimension does not come into the picture; they are analysed only in the outcome space.
- When the set of possibilities contains a single element, the randomness vanishes, giving rise to a deterministic variable.
- Two classes of random variables exist:
  Discrete-valued RV: discrete set of possibilities (e.g., roll of a die)
  Continuous-valued RV: continuous set of possibilities (e.g., ambient temperature)
The tag of randomness is given to any variable or signal that is not accurately predictable, i.e., the outcome of the associated event cannot be predicted with zero error.
In reality, there is no reason to believe that the true process behaves in a "random" manner. It is merely our inability to predict its course, owing to a lack of sufficient understanding or knowledge, that renders a process random.
Probability Distribution
The natural recourse to dealing with uncertainties is to list all possible outcomes and assign a chance to each of those outcomes.
The specification of the outcomes and the associated probabilities through what is known
as probability distribution completely characterizes the random variable.
F(x) = Pr(X ≤ x)        (1)

f(x) = dF(x)/dx        (2)
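The relation f(x) = dF(x)/dx is easy to check numerically. A minimal sketch in Python (the deck's examples use R, but the standard-library `statistics.NormalDist` exposes the same cdf/pdf pair; the evaluation point x = 0.7 is an arbitrary illustration):

```python
from statistics import NormalDist

# Standard Gaussian N(0, 1)
d = NormalDist(mu=0.0, sigma=1.0)

x, h = 0.7, 1e-5
# Central-difference approximation of dF(x)/dx
deriv = (d.cdf(x + h) - d.cdf(x - h)) / (2 * h)
# deriv should agree closely with the density f(x) = d.pdf(x)
```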
[Figure: three example distributions, shown as cumulative distribution functions F(x) (top row) and the corresponding density / mass functions f(x), p(X = x) (bottom row), illustrating the type of distribution for a random phenomenon.]
Density Functions

1. Gaussian density function: f(x) = (1 / (√(2π) σ)) exp(−(x − µ)² / (2σ²))

2. Uniform density function: f(x) = 1 / (b − a), a ≤ x ≤ b

3. Chi-square density: f_n(x) = (1 / (2^(n/2) Γ(n/2))) x^(n/2 − 1) e^(−x/2)
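These three densities are straightforward to code and sanity-check; a sketch in Python using only the standard library (the default parameter values are illustrative, and the crude midpoint rule simply verifies that each density integrates to roughly 1):

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    # f(x) = 1/(sqrt(2*pi)*sigma) * exp(-(x - mu)^2 / (2*sigma^2))
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def uniform_pdf(x, a=0.0, b=1.0):
    # f(x) = 1/(b - a) on [a, b], zero elsewhere
    return 1.0 / (b - a) if a <= x <= b else 0.0

def chisq_pdf(x, n=4):
    # f_n(x) = x^(n/2 - 1) e^(-x/2) / (2^(n/2) Gamma(n/2)), for x > 0
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))

def integrate(f, lo, hi, steps=100_000):
    # Midpoint rule, good enough to confirm total probability ~ 1
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h

area_gauss = integrate(gaussian_pdf, -10, 10)
area_chisq = integrate(chisq_pdf, 1e-9, 60)
area_unif = integrate(uniform_pdf, -1, 2)
```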
Commands in R
Every distribution that R handles has four associated functions (probability, quantile, density and random variate), which share the same root name prefixed by p, q, d and r, respectively.

A few relevant functions:
Commands Distribution
rnorm, pnorm, qnorm, dnorm Gaussian
rt, pt, qt, dt Student’s-t
rchisq, pchisq, qchisq, dchisq Chi-square
runif, punif, qunif, dunif Uniform distribution
rbinom, pbinom, qbinom, dbinom Binomial
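As a quick illustration of the p/q/d/r pattern, shown here in Python since the standard-library `statistics.NormalDist` offers the same four operations for the Gaussian (in R one would call pnorm, qnorm, dnorm and rnorm directly):

```python
import random
from statistics import NormalDist

d = NormalDist(mu=0.0, sigma=1.0)

p = d.cdf(1.96)        # probability,   like pnorm(1.96) in R
q = d.inv_cdf(0.975)   # quantile,      like qnorm(0.975)
dens = d.pdf(0.0)      # density,       like dnorm(0)
random.seed(0)
r = [random.gauss(0.0, 1.0) for _ in range(5)]  # random variates, like rnorm(5)
```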
Sample usage

[Figure: "Histogram of x", a density-scaled histogram over roughly 5 to 35.]
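A hedged sketch of what such a sample-usage step looks like (in Python here; in R one would typically call hist with prob = TRUE on output of rnorm): draw samples, bin them on the density scale, and compare the bin heights against the true density. The parameter values below are illustrative only.

```python
import random
from statistics import NormalDist

random.seed(1)
mu, sigma, n = 20.0, 5.0, 50_000
x = [random.gauss(mu, sigma) for _ in range(n)]

# Histogram as a density estimate: count per bin / (n * bin width)
lo, hi, nbins = 5.0, 35.0, 30
width = (hi - lo) / nbins
counts = [0] * nbins
for v in x:
    if lo <= v < hi:
        counts[int((v - lo) / width)] += 1
density = [c / (n * width) for c in counts]

# The bin containing the mean should sit near the true peak f(mu)
peak_est = density[int((mu - lo) / width)]
peak_true = NormalDist(mu, sigma).pdf(mu)
```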
Practical Aspects
The p.d.f. of a RV allows us to compute the probability of X taking on values in an
infinitesimal interval, i.e., Pr(x ≤ X ≤ x + dx) ≈ f(x) dx
Note: Just as the way the density encountered in mechanics cannot be interpreted as mass of
the body at a point, the probability density should never be interpreted as the probability at a
point. In fact, for continuous-valued RVs, Pr(X = x) = 0
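The infinitesimal-interval approximation is easy to verify numerically; a minimal sketch, assuming a standard Gaussian purely for illustration:

```python
from statistics import NormalDist

d = NormalDist()          # standard Gaussian N(0, 1)
x, dx = 1.0, 1e-4

# Exact probability of the interval, via the distribution function...
p_exact = d.cdf(x + dx) - d.cdf(x)
# ...versus the density approximation f(x) dx
p_approx = d.pdf(x) * dx
```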
In practice, knowing the p.d.f. theoretically is seldom possible. One has to conduct
experiments and then try to fit a known p.d.f. that best explains the behaviour of the
RV.
The useful statistical properties, namely the mean and variance, are in fact the first- and second-order (central) moments of the p.d.f. f(x) (similar to moments of inertia).
The nth moment of a p.d.f. is defined as

M_n(X) = ∫_{−∞}^{∞} x^n f(x) dx        (3)
It turns out that for linear processes, prediction of random signals and estimation of model parameters require only knowledge of the mean, variance and covariance (to be introduced shortly), i.e., it is sufficient to know the first- and second-order moments of the p.d.f.
The mean is defined as the first moment of the p.d.f. (analogous to the center of mass).
It is also the expected value (outcome) of the RV.
Mean
The mean of a RV, also the expectation of the RV, is defined as
E(X) = µ_X = ∫_{−∞}^{∞} x f(x) dx        (4)
Remarks

- The integration in (4) is across the outcome space and NOT across any time space.
- Applying the expectation operator E to a random variable produces its "average" or expected value.
- Prediction perspective:

  The mean is the best prediction of the random variable in the minimum mean square error sense, i.e., µ_X = arg min_c E((X − c)²) over all constants c.
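That the mean minimizes the mean square error can be illustrated with a small grid search; a sketch using a fair die (chosen here for convenience, since its expectations are exact):

```python
# Fair die: outcomes 1..6, each with probability 1/6
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6
mean = sum(x * p for x in outcomes)          # 3.5

def mse(c):
    # E[(X - c)^2] for a candidate prediction c
    return sum((x - c) ** 2 * p for x in outcomes)

# Search candidate predictions on a fine grid over [1, 6]
grid = [i / 100 for i in range(100, 601)]
best = min(grid, key=mse)   # minimizer should coincide with the mean
```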
Expectation Operator

- For any constant, E(c) = c.
- The expectation of a function of X is given by

  E(g(X)) = ∫_{−∞}^{∞} g(x) f(x) dx        (5)

- It is a linear operator:

  E(Σ_{i=1}^{k} c_i g_i(X)) = Σ_{i=1}^{k} c_i E(g_i(X))        (6)
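The linearity property (6) can be checked exactly on a discrete RV; a sketch with a fair die and two functions g1, g2 (both chosen arbitrarily for illustration):

```python
# Fair die, exact expectations
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6

def E(g):
    # E(g(X)) = sum over outcomes of g(x) * p(x)
    return sum(g(x) * p for x in outcomes)

g1 = lambda x: x ** 2
g2 = lambda x: 1 / x
c1, c2 = 2.0, -3.0

# Left side: expectation of the linear combination
lhs = E(lambda x: c1 * g1(x) + c2 * g2(x))
# Right side: linear combination of the expectations
rhs = c1 * E(g1) + c2 * E(g2)
```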
Example

Problem: Find the expectation of a random variable y[k] = sin(ωk + φ), where φ is uniformly distributed in [−π, π].

Solution: E(y[k]) = E(sin(ωk + φ)) = (1/2π) ∫_{−π}^{π} sin(ωk + φ) dφ
                  = (1/2π) ( −cos(ωk + φ) |_{−π}^{π} )
                  = (1/2π) ( cos(ωk − π) − cos(ωk + π) ) = 0
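A quick Monte Carlo check of this result (a sketch; the values of ω and k below are arbitrary illustrations):

```python
import math
import random

random.seed(0)
omega, k, n = 0.3, 7, 200_000

# phi ~ Uniform[-pi, pi]; average sin(omega*k + phi) over many draws
samples = [math.sin(omega * k + random.uniform(-math.pi, math.pi))
           for _ in range(n)]
mc_mean = sum(samples) / n   # should be close to 0
```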
Variance / Variability

Variance

The variance of a random variable, denoted by σ²_X, is the average spread of outcomes around its mean,

σ²_X = E((X − µ_X)²) = ∫_{−∞}^{∞} (x − µ_X)² f(x) dx        (7)

Points to note

- As (7) suggests, σ²_X is the second central moment of f(x). Further,

  σ²_X = E(X²) − µ²_X        (8)

  E(X + c) = µ_X + c,   var(X + c) = var(X) = σ²_X        (9)
- Affine transformation: for y = a₁X₁ + a₂X₂ + · · · + aₙXₙ, with mutually uncorrelated Xᵢ having means µᵢ and variances σᵢ²,

  µ_y = a₁µ₁ + a₂µ₂ + · · · + aₙµₙ
  σ²_y = a₁²σ₁² + a₂²σ₂² + · · · + aₙ²σₙ²
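A sketch verifying the affine rules exactly for two independent fair dice (the coefficients a1, a2 are chosen arbitrarily):

```python
from itertools import product

outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6
a1, a2 = 2.0, -3.0

# Per-die mean and variance (identical for both dice)
mu = sum(x * p for x in outcomes)
var = sum((x - mu) ** 2 * p for x in outcomes)

# Exact mean/variance of y = a1*X1 + a2*X2 over the joint pmf
joint = [(a1 * x1 + a2 * x2, p * p) for x1, x2 in product(outcomes, outcomes)]
mu_y = sum(y * w for y, w in joint)
var_y = sum((y - mu_y) ** 2 * w for y, w in joint)
```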
Central Limit Theorem: for i.i.d. RVs Xᵢ with mean µ and variance σ², consider the sum

Y_N = Σ_{i=1}^{N} Xᵢ,   N = 1, 2, · · ·

Then, as N → ∞,

(Y_N − Nµ) / (σ√N) → N(0, 1)

One of the popular applications of the CLT is in deriving the distribution of the sample mean.
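A minimal Monte Carlo sketch of the CLT: standardize sums of uniform RVs and check a familiar Gaussian probability (the sample sizes and tolerance below are illustrative):

```python
import math
import random

random.seed(0)
N, reps = 30, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and sd of Uniform(0, 1)

def standardized_sum():
    # (Y_N - N*mu) / (sigma * sqrt(N)) for Y_N a sum of N uniforms
    y = sum(random.random() for _ in range(N))
    return (y - N * mu) / (sigma * math.sqrt(N))

z = [standardized_sum() for _ in range(reps)]
# Fraction within one standard deviation; ~0.683 for N(0, 1)
frac = sum(1 for v in z if abs(v) <= 1) / reps
```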