
STAT7005 Multivariate Methods

Chapter 2: Multivariate Normal and Related Distributions

Tony Fung

University of Hong Kong


Department of Statistics and Actuarial Science

HKU



Contents

1 Multivariate normal distribution

2 Estimating µ and Σ

3 Wishart distribution

4 Assessing normality assumption

5 Transformations to near normality



1. Multivariate normal distribution

Multivariate normal distribution

A random vector x is said to have a multivariate normal


distribution (multinormal distribution) if every linear combination
of its components has a univariate normal distribution.
Suppose a = (a1, a2)′ and x = (x1, x2)′. The multinormality of x
requires that a′x = a1x1 + a2x2 is univariate normal for all a1 and a2.





1. Multivariate normal distribution

Multivariate normal distribution (cont.)


The multivariate normal distribution has many convenient properties, listed below.
1 (Linearity) If x is multinormal, then for any constant vector a,

a′x ∼ N(a′µ, a′Σa).

2 (Moment generating function) The mgf of a multinormal random vector x with mean vector µ and covariance matrix Σ is given by

Mx(t) = exp(t′µ + ½ t′Σt).

Thus, a multinormal distribution is completely identified by its mean µ and covariance Σ. We use the notation x ∼ Np(µ, Σ).



1. Multivariate normal distribution

Multivariate normal distribution (cont.)


3 (Additivity) Suppose x ∼ Np(µ1, Σ1) and y ∼ Np(µ2, Σ2). If x and y are independent, then

x + y ∼ Np(µ1 + µ2, Σ1 + Σ2).

4 (Multiple linear combinations) If x ∼ Np(µ, Σ), then for any constant m × p matrix A and constant m × 1 vector d,

Ax + d ∼ Nm(Aµ + d, AΣA′).

5 (Standardization) Let Σ be positive definite (i.e. non-singular, or invertible). Then x ∼ Np(µ, Σ) iff there exists a non-singular matrix B and z ∼ Np(0, I) such that

x = µ + Bz.

In this case, Σ = BB′.
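
As an illustration of property 5 (not part of the original notes), one can simulate draws from Np(µ, Σ) by taking B to be the Cholesky factor of Σ. A minimal sketch, assuming NumPy is available and using purely illustrative values of µ and Σ:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative mean vector and positive definite covariance matrix.
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 0.5]])

# Property 5: x = mu + B z with z ~ N_p(0, I) and Sigma = B B'.
# The (lower-triangular) Cholesky factor is one valid choice of B.
B = np.linalg.cholesky(Sigma)
z = rng.standard_normal((10_000, 3))   # rows are independent N_p(0, I) draws
x = mu + z @ B.T                       # rows are N_p(mu, Sigma) draws

print(x.mean(axis=0))                  # should be close to mu
print(np.cov(x, rowvar=False))         # should be close to Sigma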



1. Multivariate normal distribution

Multivariate normal distribution (cont.)


6 (Density) The pdf of x ∼ Np(µ, Σ), where Σ is positive definite, is given by

f(x) = (2π)^(-p/2) |Σ|^(-1/2) exp{-½ (x − µ)′Σ^(-1)(x − µ)}.

7 (Partition) Let x = (x1′, x2′)′, µ = (µ1′, µ2′)′, and partition Σ conformably into blocks Σ11, Σ12, Σ21, Σ22, where x1 consists of the first q components and x2 consists of the last (p − q) components.
  1 x1 and x2 are independent iff Cov(x1, x2) = Σ12 = 0.
  2 x1 ∼ Nq(µ1, Σ11) and x2 ∼ Np−q(µ2, Σ22).
  3 (x1 − Σ12 Σ22^(-1) x2) is independent of x2 and is distributed as Nq(µ1 − Σ12 Σ22^(-1) µ2, Σ11 − Σ12 Σ22^(-1) Σ21).
  4 Given x2, x1 ∼ Nq(µ1 + Σ12 Σ22^(-1)(x2 − µ2), Σ11 − Σ12 Σ22^(-1) Σ21).
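
The conditional-distribution formulas in property 7 translate directly into a few lines of linear algebra. A minimal sketch (illustrative numbers only, not from the notes; NumPy assumed) computing the conditional mean and covariance of x1 given an observed x2:

import numpy as np

# Partition a 3-dimensional normal into x1 (first q = 2 components) and x2 (last component).
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.8],
                  [0.5, 0.8, 2.0]])
q = 2
mu1, mu2 = mu[:q], mu[q:]
S11, S12 = Sigma[:q, :q], Sigma[:q, q:]
S21, S22 = Sigma[q:, :q], Sigma[q:, q:]

x2_obs = np.array([3.5])   # hypothetical observed value of x2

# Property 7.4: x1 | x2 ~ N_q(mu1 + S12 S22^{-1} (x2 - mu2), S11 - S12 S22^{-1} S21)
cond_mean = mu1 + S12 @ np.linalg.solve(S22, x2_obs - mu2)
cond_cov = S11 - S12 @ np.linalg.solve(S22, S21)
print(cond_mean)
print(cond_cov)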



1. Multivariate normal distribution

Multivariate normal distribution (cont.)


8 (Quadratic form) Suppose E(x) = µ and Var(x) = Σ, and let A be a p × p symmetric matrix. Then

E(x′Ax) = µ′Aµ + tr(AΣ).

9 (“Squaring” a standard normal) Suppose x ∼ Np(µ, Σ) with Σ positive definite. Then

(x − µ)′Σ^(-1)(x − µ) ∼ χ²(p).

10 Let x ∼ Np(µ, Σ) with Σ positive definite. Then, for any m × p matrix A and n × p matrix B,
  1 Ax is independent of Bx iff AΣB′ = 0.
  2 x′Ax (A symmetric p × p) is independent of Bx iff BΣA = 0.
  3 x′Ax and x′Bx (A and B both symmetric p × p) are independent iff AΣB = 0.
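
Property 8 is easy to check by simulation. A small sketch (illustrative values, not from the notes; NumPy assumed) comparing the theoretical value of E(x′Ax) with a Monte Carlo estimate:

import numpy as np

rng = np.random.default_rng(1)

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 0.3],
              [0.3, 2.0]])           # symmetric

# Property 8: E(x'Ax) = mu'A mu + tr(A Sigma)
theoretical = mu @ A @ mu + np.trace(A @ Sigma)

x = rng.multivariate_normal(mu, Sigma, size=200_000)
empirical = np.mean(np.einsum("ij,jk,ik->i", x, A, x))   # x_i' A x_i for each row, then averaged

print(theoretical, empirical)        # the two numbers should be close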



2. Estimating µ and Σ

Maximum likelihood estimation

Suppose x1, . . . , xn are i.i.d. Np(µ, Σ), where Σ is positive definite. The sample mean vector x̄ and the sample covariance matrix S (as previously defined) are unbiased estimators of µ and Σ. By the Law of Large Numbers, these sample quantities converge in probability to µ and Σ, i.e.,

x̄ →p µ;   S →p Σ.

The likelihood function for the random sample is

L(µ, Σ) = (2π)^(-np/2) |Σ|^(-n/2) exp{-½ ∑_{i=1}^{n} (xi − µ)′Σ^(-1)(xi − µ)}.

It can be shown that the MLEs of µ and Σ are

µ̂ = x̄;   Σ̂ = W/n = (n − 1)S/n.
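
In code, the MLEs take one line each once the data are arranged as an n × p matrix. A minimal sketch (NumPy assumed; the data here are simulated purely for illustration):

import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 3
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)   # n x p data matrix

x_bar = X.mean(axis=0)               # MLE of mu
S = np.cov(X, rowvar=False)          # unbiased sample covariance (divisor n - 1)
Sigma_hat = (n - 1) / n * S          # MLE of Sigma (divisor n)

print(x_bar)
print(Sigma_hat)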
2. Estimating µ and Σ

Maximum likelihood estimation (cont.)

Properties of the MLEs:

1 x̄ and S are sufficient statistics for Np (µ, Σ). That means x̄ and S
contain all information needed to make inference on µ and Σ.
2 x̄ ∼ Np(µ, (1/n)Σ) and (n − 1)S ∼ Wp(n − 1, Σ), a central Wishart
distribution (the multivariate analogue of the chi-squared distribution), which is defined
in the next section.
3 µ̂ is unbiased but Σ̂ is biased (like the univariate case). However, S is
unbiased for Σ.
4 The MLEs possess an invariance property: if MLE of θ is θ̂, then MLE
of φ = h(θ) is φ̂ = h(θ̂), provided that h(·) is a one-to-one function.



3. Wishart distribution

Definition

The Wishart distribution is a distribution on matrices, and can be thought of as a multivariate generalization of the chi-squared distribution.
Definition: Suppose xi (i = 1, . . . , k) are independent Np(µi, Σ), and let X be the k × p matrix with rows xi′. Define the symmetric p × p matrix

V = ∑_{i=1}^{k} xi xi′ = X′X.

Then V is said to follow a p-dimensional Wishart distribution, denoted by Wp(k, Σ, Ψ), where:
I k is the degrees of freedom;
I Σ is the scaling matrix;
I Ψ = ∑_{i=1}^{k} µi µi′ is the (p × p symmetric) noncentrality matrix.
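
The definition can be mirrored directly in code: stack the xi as rows of X and form X′X. For the central case, SciPy also provides the Wishart distribution itself. A minimal sketch (NumPy/SciPy assumed; Σ and k are illustrative choices, not from the notes):

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

p, k = 3, 10
Sigma = np.array([[2.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.5]])

# Central case: x_i ~ N_p(0, Sigma), so V = X'X ~ W_p(k, Sigma).
X = rng.multivariate_normal(np.zeros(p), Sigma, size=k)   # k x p matrix with rows x_i'
V = X.T @ X

# scipy.stats.wishart covers the central Wishart (density evaluation and sampling).
W = stats.wishart(df=k, scale=Sigma)
print(W.logpdf(V))                 # log-density of the realized V
print(W.rvs(random_state=rng))     # an independent W_p(k, Sigma) draw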



3. Wishart distribution

Definition (cont.)

When Ψ = 0, we call it a central Wishart distribution, denoted simply by Wp(k, Σ).
Note that when p = 1, the pdf of the central Wishart distribution reduces to that of the central chi-squared distribution.
The pdf of the central Wishart distribution is given by

f(V) = c(p, k) |Σ|^(-k/2) |V|^((k−p−1)/2) exp{-½ tr(Σ^(-1)V)},

where c(p, k) = [2^(kp/2) π^(p(p−1)/4) Γ(k/2) Γ((k−1)/2) · · · Γ((k−p+1)/2)]^(-1).



3. Wishart distribution

Properties

Some key properties of the Wishart distribution are listed below:


1 (Additivity) If V1 ∼ Wp(k1, Σ, Ψ1) and V2 ∼ Wp(k2, Σ, Ψ2) are independent, then V1 + V2 ∼ Wp(k1 + k2, Σ, Ψ1 + Ψ2).
2 If V ∼ Wp(k, Σ, Ψ), then AVA′ ∼ Wq(k, AΣA′, AΨA′) for any given q × p matrix A.
3 (Getting chi-squared)
  1 If V ∼ Wp(k, Σ, Ψ) and a is a constant vector, then a′Va/a′Σa ∼ χ²(k, a′Ψa), a non-central chi-squared r.v. In particular, for the ith diagonal element of V, vii/σii ∼ χ²(k, Ψii).
  2 If y is any random vector independent of V ∼ Wp(k, Σ), then y′Vy/y′Σy ∼ χ²(k) and is independent of y.
  3 If y is any random vector independent of V ∼ Wp(k, Σ) with k > p, then the ratio y′Σ^(-1)y/y′V^(-1)y ∼ χ²(k − p + 1) and is independent of y.



3. Wishart distribution

Properties (cont.)

4 (Normal random sample) Let x1 , . . . , xn be a random sample from


Np (µ, Σ).
1 x̄ ∼ Np(µ, (1/n)Σ).
2 (n − 1)S (= W, the CSSP matrix) ∼ Wp (n − 1, Σ).
3 x̄ and S are independent.

Several other properties are given in the lecture notes. The Wishart
distribution will be used for the construction of other random variables in
the next few chapters.



4. Assessing normality assumption

Univariate checks

A number of the statistical analyses we will introduce in the following chapters depend on the assumption that the observations are (multivariate) normally distributed.
How do we check normality when there are multiple variables?
A simple method is to check each variable for univariate normality (necessary for multinormality but not sufficient):

1 Q-Q plot (quantile against quantile plot) for normal distribution:


I Sample quantiles are plotted against the theoretical quantiles of a
standard normal distribution.
I A straight line indicates univariate normality.
I Non-linearity may indicate a need to transform the variable.
I One may use a P-P plot as well (sample cdf vs theoretical cdf).



4. Assessing normality assumption

Univariate checks (cont.)

[Figure: two normal Q-Q plots, labelled "OK" and "May need transformation"]


2 Shapiro-Wilk W test — a statistic for checking normality (together
with p-value) conveniently obtained from statistical software.
3 Kolmogorov-Smirnov-Lilliefors (KSL) test for large samples —
comparing empirical and fitted normal cdf.
4 Check for (almost) zero skewness and excess kurtosis.
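
As a sketch of how these univariate checks might be run in practice (SciPy and Matplotlib assumed; the sample below is deliberately non-normal and purely illustrative):

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
x = rng.lognormal(mean=0.0, sigma=0.7, size=200)   # right-skewed, so not normal

# Normal Q-Q plot: sample quantiles against standard normal quantiles.
stats.probplot(x, dist="norm", plot=plt)
plt.title("Normal Q-Q plot")
plt.show()

# Shapiro-Wilk W test: a small p-value suggests a departure from normality.
W, p_value = stats.shapiro(x)
print(f"Shapiro-Wilk W = {W:.3f}, p-value = {p_value:.4f}")

# A Lilliefors-type test (KS test with estimated mean and variance) is available in
# statsmodels (statsmodels.stats.diagnostic.lilliefors); skewness and excess kurtosis
# can be checked with stats.skew(x) and stats.kurtosis(x).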



4. Assessing normality assumption

Multivariate checks

When n − p is large enough, we make use of property (9) of Section 2.1: if x ∼ Np(µ, Σ), then

(x − µ)′Σ^(-1)(x − µ) ∼ χ²(p).

Check whether the squared generalized distance (Mahalanobis


distance) as defined below follows a chi-squared distribution by a Q-Q
plot (necessary and sufficient condition for very large sample size).
Mahalanobis distance is similar to Euclidean distance, but takes into
account the ellipsoidal contour of the covariance matrix.



4. Assessing normality assumption

Multivariate checks (cont.)

Steps to calculate distance and produce Q-Q plot:

1 Define the squared generalized distance as d²_i = (xi − x̄)′S^(-1)(xi − x̄), i = 1, . . . , n.
2 Order d²_1, d²_2, . . . , d²_n as d²_(1) ≤ d²_(2) ≤ · · · ≤ d²_(n).
3 Plot χ²_(i) vs d²_(i), where χ²_(i) is the 100(i − ½)/n percentile of the χ²(p) distribution.
4 A (roughly) straight line indicates multivariate normality (see the sketch of these steps below).
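
A minimal sketch of these steps (NumPy, SciPy and Matplotlib assumed; the data are simulated purely for illustration):

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

def chisq_qq_plot(X):
    # Chi-squared Q-Q plot of squared Mahalanobis distances for an n x p data matrix X.
    n, p = X.shape
    x_bar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    diff = X - x_bar
    # Step 1: d_i^2 = (x_i - x_bar)' S^{-1} (x_i - x_bar)
    d2 = np.sum(diff @ np.linalg.inv(S) * diff, axis=1)
    # Step 2: order the squared distances
    d2_sorted = np.sort(d2)
    # Step 3: chi-squared(p) percentiles at probabilities (i - 1/2)/n
    probs = (np.arange(1, n + 1) - 0.5) / n
    chi2_quantiles = stats.chi2.ppf(probs, df=p)
    # Step 4: a roughly straight line indicates multivariate normality
    plt.scatter(chi2_quantiles, d2_sorted)
    plt.xlabel("chi-squared(p) quantiles")
    plt.ylabel("ordered squared Mahalanobis distances")
    plt.show()

rng = np.random.default_rng(5)
X = rng.multivariate_normal(np.zeros(3), np.eye(3), size=100)   # multinormal example data
chisq_qq_plot(X)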



4. Assessing normality assumption

Multivariate checks (cont.)

[Figure: two chi-squared Q-Q plots, labelled "OK" and "May need transformation"]

We may check the principal components (PCs) of the data — each


PC is a linear combination of the variables; if the full data set is
multivariate normal, the PCs must be univariate normal. Hence this is
only a necessary condition (unless n is large enough).
More about principal component analysis (PCA) in Chapter 8.



5. Transformations to near normality

Common transformations

The standard procedure is to transform each variable to near


normality.
This does not guarantee that the transformed data follow a multivariate normal
distribution; one may need to check multinormality again afterwards.
Common univariate transformations:
I For right-skewed data (e.g., log-normal), log(x) can work well.
I For count data (e.g., Poisson), we may use √x.
I For a binomial proportion, we may use the arcsine transformation arcsin(√x), or the logit transformation log(x/(1 − x)).
More systematic approach?



5. Transformations to near normality

Box-Cox transformation

The Box-Cox transformation (applied in a univariate fashion) has a parameter λ. The transformed value of xi is denoted x_i^[λ], where

x_i^[λ] = (x_i^λ − 1)/λ   for λ ≠ 0,  i = 1, . . . , n;
x_i^[λ] = log x_i         for λ = 0,  i = 1, . . . , n.

Essentially, this performs a power transformation.



5. Transformations to near normality

Box-Cox transformation (cont.)


“Convenient” choice of λ:
Power, λ Transformation
3 cubic
2 square
1 no transform
0.5 square-root
1/3 cubic-root
0 log
−1/3 inverse of cubic-root
−0.5 inverse of square-root
−1 inverse
−2 inverse of square
−3 inverse of cubic



5. Transformations to near normality

Box-Cox transformation (cont.)

Since the x^[λ]'s are on different scales for different λ, we instead consider the following "standardized" Box-Cox transformation, so that we can compare the quality of the transformation across different λ's:

x_i^[λ] = (x_i^λ − 1) / (λ [GM(x)]^(λ−1))   if λ ≠ 0,  i = 1, . . . , n;
x_i^[λ] = GM(x) log x_i                     if λ = 0,  i = 1, . . . , n,

where GM(x) = (x_1 x_2 · · · x_n)^(1/n) is the geometric mean of the observations x_1, x_2, . . . , x_n. We choose the value of λ that minimizes the sum of squares of residuals (equivalently, the sample variance) after transformation.
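
A minimal sketch of this procedure for one variable (NumPy/SciPy assumed; the data are simulated and purely illustrative). It grid-searches λ by minimizing the sample variance of the standardized transform, and compares the result with scipy.stats.boxcox, which chooses λ by maximizing the profile log-likelihood instead:

import numpy as np
from scipy import stats

def boxcox_standardized(x, lam):
    # Standardized Box-Cox transform of a positive sample x for a given lambda.
    gm = np.exp(np.mean(np.log(x)))              # geometric mean GM(x)
    if lam == 0:
        return gm * np.log(x)
    return (x**lam - 1.0) / (lam * gm**(lam - 1.0))

rng = np.random.default_rng(6)
x = rng.lognormal(mean=1.0, sigma=0.5, size=200)   # positive, right-skewed sample

# Pick the lambda whose standardized transform has the smallest sample variance.
grid = np.linspace(-2.0, 2.0, 81)
variances = [np.var(boxcox_standardized(x, lam), ddof=1) for lam in grid]
lam_best = grid[int(np.argmin(variances))]
print("lambda from grid search:", lam_best)

# scipy.stats.boxcox maximizes the profile log-likelihood over lambda.
_, lam_mle = stats.boxcox(x)
print("lambda from scipy.stats.boxcox:", lam_mle)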



5. Transformations to near normality

Box-Cox transformation (cont.)

[Figure: the data before transformation and after transformation with λ = 2]


This procedure is repeated for other variables (with potentially
different λ’s). After all variables have been transformed, we can
calculate the squared Mahalanobis distances and construct the
chi-squared Q-Q plot to check if the transformed data set is closer to
multivariate normal.

