
Multivariate Statistics

Sudipta Das

Assistant Professor,
Department of Data Science,
Ramakrishna Mission Vivekananda University, Kolkata
Outline I

1 Principal Component Analysis

Introduction I

A principal component analysis (PCA) is concerned with explaining the variance-covariance structure of a set of variables through a few linear combinations of these variables.

Objectives
Data reduction
Data interpretation

Introduction II

By PCA we select k principal components from a set of p (≥ k) initial variables such that as much of the total system variability as possible is retained.

PCA
Data set of size (n × p) ⟹ Data set of size (n × k)

Note
To retain all of the total system variability, we need to retain all p principal components. A minimal code sketch of this reduction follows.
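Below is a minimal numpy sketch of the (n × p) ⟹ (n × k) reduction, assuming a data matrix with observations in rows; the function name reduce_dim and the simulated data are purely illustrative, not part of the slides.

```python
import numpy as np

def reduce_dim(X, k):
    """Project an (n x p) data matrix onto its first k principal components."""
    Xc = X - X.mean(axis=0)                # center each variable
    S = np.cov(Xc, rowvar=False)           # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder so lambda_1 >= ... >= lambda_p
    E_k = eigvecs[:, order[:k]]            # p x k matrix of leading eigenvectors
    return Xc @ E_k                        # n x k matrix of component scores

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))              # n = 100 observations, p = 5 variables
print(reduce_dim(X, 2).shape)              # (100, 2)
```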

Population Principal Components I

Principal components are particular linear combinations of the p random features/variables $X_1, X_2, \ldots, X_p$.
These linear combinations represent a new coordinate system, obtained by rotating the original system that has $X_1, X_2, \ldots, X_p$ as its coordinate axes.
The new axes point in the directions of maximum variability and provide a simpler, more parsimonious description of the covariance structure.

Note:
Principal components depend solely on the covariance matrix $\Sigma$ of $X_1, X_2, \ldots, X_p$.
Their development does not require a multivariate normality assumption.
However, standard results on inference can be used if the sample is assumed to come from a normal population.

Population Principal Components II

Formal definition
First principal component: the linear combination $a_1'X$ that maximizes $\mathrm{Var}(a_1'X)$ subject to $a_1'a_1 = 1$.
Second principal component: the linear combination $a_2'X$ that maximizes $\mathrm{Var}(a_2'X)$ subject to $a_2'a_2 = 1$ and $\mathrm{Cov}(a_2'X, a_1'X) = 0$.
$\cdots\cdots$
$i$th principal component: the linear combination $a_i'X$ that maximizes $\mathrm{Var}(a_i'X)$ subject to $a_i'a_i = 1$ and $\mathrm{Cov}(a_i'X, a_k'X) = 0$ for all $k < i$.

Population Principal Components III

Result: Let $\Sigma$ be the covariance matrix associated with the random vector $X = [X_1, X_2, \ldots, X_p]'$. Let $\Sigma$ have the eigenvalue-eigenvector pairs $(\lambda_1, e_1), (\lambda_2, e_2), \ldots, (\lambda_p, e_p)$, where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$. Then the $i$th principal component is given by

$$Y_i = e_i'X = e_{i1}X_1 + e_{i2}X_2 + \cdots + e_{ip}X_p, \quad \text{for } i = 1, \ldots, p$$

With these choices,

$$\mathrm{Var}(Y_i) = e_i'\Sigma e_i = \lambda_i, \quad \text{for } i = 1, \ldots, p$$

$$\mathrm{Cov}(Y_i, Y_k) = e_i'\Sigma e_k = 0, \quad \text{for } i \neq k$$

Note: If some $\lambda_i$ are equal, then the choices of the corresponding coefficient vectors $e_i$, and hence $Y_i$, are not unique.
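This result is straightforward to check numerically. A minimal sketch follows; the covariance matrix Sigma below is an arbitrary illustrative choice, not taken from the slides.

```python
import numpy as np

# An arbitrary covariance matrix, chosen purely for illustration.
Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]      # sort so lambda_1 >= lambda_2 >= lambda_3
lam = eigvals[order]
E = eigvecs[:, order]                  # columns are e_1, e_2, e_3

# Var(Y_i) = e_i' Sigma e_i = lambda_i and Cov(Y_i, Y_k) = 0 for i != k,
# i.e. E' Sigma E should equal diag(lambda_1, ..., lambda_p).
print(np.allclose(E.T @ Sigma @ E, np.diag(lam)))   # True
```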

Population Principal Components IV

Sketch of proof:
To get the first principal component, we need

$$\max_a \mathrm{Var}(a'X) \ \text{ s.t. } a'a = 1 \;\Rightarrow\; \max_a \frac{a'\Sigma a}{a'a}$$

Thus (Lemma: Maximization of Quadratic Forms for Points on the Unit Sphere),

$$\max_a \frac{a'\Sigma a}{a'a} = \lambda_1,$$

and the maximum is attained at $a = e_1$.

Hence, $Y_1 = e_1'X$ and $\mathrm{Var}(Y_1) = e_1'\Sigma e_1 = \lambda_1$.
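The lemma can also be illustrated numerically: sampling many random directions, no Rayleigh quotient exceeds $\lambda_1$. This is a sketch using the same illustrative $\Sigma$ as above, not a proof.

```python
import numpy as np

Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
eigvals, eigvecs = np.linalg.eigh(Sigma)
lam1, e1 = eigvals[-1], eigvecs[:, -1]   # largest eigenvalue and its eigenvector

# Rayleigh quotient a' Sigma a / a'a for many random directions a.
rng = np.random.default_rng(1)
A = rng.normal(size=(10000, 3))
quot = np.einsum('ij,jk,ik->i', A, Sigma, A) / np.einsum('ij,ij->i', A, A)

print(quot.max() <= lam1 + 1e-9)         # True: no direction beats lambda_1
print(np.isclose(e1 @ Sigma @ e1, lam1)) # True: the maximum is attained at e_1
```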

Population Principal Components V

Sketch of proof (contd.):
To get the $i$th principal component, we need

$$\max_a \mathrm{Var}(a'X) \ \text{ s.t. } a'a = 1 \text{ and } \mathrm{Cov}(a'X, a_k'X) = 0 \text{ for all } k < i$$

$$\Rightarrow\; \max_a \frac{a'\Sigma a}{a'a} \ \text{ s.t. } \mathrm{Cov}(a'X, e_k'X) = 0 \text{ for all } k < i$$

$$\Rightarrow\; \max_{a \perp e_1, \ldots, e_{i-1}} \frac{a'\Sigma a}{a'a}, \quad [\text{ since } a'\Sigma e_k = a'\lambda_k e_k = 0 \Rightarrow a \perp e_k ]$$

Thus (Lemma: Maximization of Quadratic Forms for Points on the Unit Sphere),

$$\max_{a \perp e_1, \ldots, e_{i-1}} \frac{a'\Sigma a}{a'a} = \lambda_i,$$

and the maximum is attained at $a = e_i$.

Hence, $Y_i = e_i'X$ and $\mathrm{Var}(Y_i) = e_i'\Sigma e_i = \lambda_i$. Also,

$$\mathrm{Cov}(Y_i, Y_k) = \mathrm{Cov}(e_i'X, e_k'X) = e_i'\Sigma e_k = 0.$$

Population Principal Components VI

Result: Let the random vector $X = [X_1, X_2, \ldots, X_p]'$ have covariance matrix $\Sigma$, with the eigenvalue-eigenvector pairs $(\lambda_1, e_1), (\lambda_2, e_2), \ldots, (\lambda_p, e_p)$, where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$. Let $Y_1 = e_1'X, Y_2 = e_2'X, \ldots, Y_p = e_p'X$ be the principal components. Then

$$\sum_{i=1}^{p} \mathrm{Var}(X_i) = \sigma_{11} + \sigma_{22} + \cdots + \sigma_{pp} = \mathrm{tr}(\Sigma) = \lambda_1 + \lambda_2 + \cdots + \lambda_p = \sum_{i=1}^{p} \mathrm{Var}(Y_i).$$
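The identity $\mathrm{tr}(\Sigma) = \sum_i \lambda_i$ is immediate to verify numerically (same illustrative $\Sigma$ as before):

```python
import numpy as np

Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

# tr(Sigma) = sigma_11 + ... + sigma_pp = lambda_1 + ... + lambda_p
print(np.isclose(np.trace(Sigma), np.linalg.eigvalsh(Sigma).sum()))   # True
```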

Population Principal Components VII

Proportion of total population variance explained by the $k$th principal component:

$$\frac{\lambda_k}{\lambda_1 + \cdots + \lambda_k + \cdots + \lambda_p}$$

Proportion of total population variance explained by the first $k$ principal components:

$$\frac{\lambda_1 + \cdots + \lambda_k}{\lambda_1 + \cdots + \lambda_k + \cdots + \lambda_p}$$
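A small sketch computing both proportions, for a hypothetical set of eigenvalues:

```python
import numpy as np

lam = np.array([3.8, 1.9, 0.9, 0.4])   # hypothetical eigenvalues, lambda_1 >= ... >= lambda_4
prop = lam / lam.sum()                 # proportion explained by each component
cumprop = np.cumsum(prop)              # proportion explained by the first k components
print(prop.round(3))                   # [0.543 0.271 0.129 0.057]
print(cumprop.round(3))                # [0.543 0.814 0.943 1.   ]
```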

Population Principal Components VIII

Result: If $Y_1 = e_1'X, Y_2 = e_2'X, \ldots, Y_p = e_p'X$ are the principal components obtained from the covariance matrix $\Sigma$, then

$$\rho_{Y_i, X_k} = \frac{e_{ik}\sqrt{\lambda_i}}{\sqrt{\sigma_{kk}}}, \quad \text{for } i, k = 1, 2, \ldots, p$$

are the correlation coefficients between the components $Y_i$ and the variables $X_k$. Here $(\lambda_1, e_1), (\lambda_2, e_2), \ldots, (\lambda_p, e_p)$ are the eigenvalue-eigenvector pairs for $\Sigma$.

The magnitude of $e_{ik}$ measures the importance of the $k$th variable ($X_k$) to the $i$th principal component ($Y_i$).
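A sketch computing the full matrix of these correlations; $\Sigma$ is again the illustrative matrix used earlier.

```python
import numpy as np

Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]
lam, E = eigvals[order], eigvecs[:, order]   # E[:, i] is e_i, so E[k, i] = e_ik

# rho_{Y_i, X_k} = e_ik * sqrt(lambda_i) / sqrt(sigma_kk); rows index i, columns index k.
rho = (E * np.sqrt(lam)).T / np.sqrt(np.diag(Sigma))
print(rho.round(3))
```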

Population Principal Components IX

Sketch of proof: Write $X_k = a_k'X$, where $a_k = [0\;0\;\ldots\;1\;\ldots\;0]'$ has a 1 in the $k$th position. Then

$$\rho_{Y_i, X_k} = \mathrm{Cor}(Y_i, X_k) = \frac{\mathrm{Cov}(e_i'X,\, a_k'X)}{\sqrt{\mathrm{Var}(Y_i)\,\mathrm{Var}(X_k)}} = \frac{a_k'\Sigma e_i}{\sqrt{\lambda_i\,\sigma_{kk}}} = \frac{\lambda_i\, e_{ik}}{\sqrt{\lambda_i\,\sigma_{kk}}} = \frac{\sqrt{\lambda_i}\, e_{ik}}{\sqrt{\sigma_{kk}}}$$

Principal Components on Standardized Variables I

Given the vector $X$, the standardized vector can be obtained as

$$Z = V^{-1/2}(X - \mu),$$

recalling that $V$ is the diagonal matrix of variances,

$$V = \begin{pmatrix} \sigma_{11} & \ldots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \ldots & \sigma_{pp} \end{pmatrix}$$

Note:
$E(Z) = 0 = [0 \ldots 0]'$

$$\mathrm{Cov}(Z) = \rho = \begin{pmatrix} 1 & \rho_{12} & \ldots & \rho_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{1p} & \rho_{2p} & \ldots & 1 \end{pmatrix},$$

the correlation matrix of $X$.
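A minimal sketch of the sample analogue, assuming a generic data matrix: standardizing each column yields data whose covariance matrix is the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.5, 0.5]])   # correlated toy data

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # z-scores: estimated V^{-1/2}(X - mu)

# Cov(Z) equals the correlation matrix of X (1s on the diagonal).
print(np.allclose(np.cov(Z, rowvar=False), np.corrcoef(X, rowvar=False)))   # True
```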

Principal Components on Standardized Variables II

Result: The $i$th principal component of the standardized variables $Z = [Z_1\, Z_2 \ldots Z_p]'$ with $\mathrm{Cov}(Z) = \rho$ is given by

$$Y_i = e_i'Z, \quad \text{for } i = 1, 2, \ldots, p.$$

Moreover,

$$\sum_{i=1}^{p} \mathrm{Var}(Y_i) = \sum_{i=1}^{p} \mathrm{Var}(Z_i) = p.$$

In this case, $(\lambda_1, e_1), (\lambda_2, e_2), \ldots, (\lambda_p, e_p)$ are the eigenvalue-eigenvector pairs for $\rho$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$.

Principal Components on Standardized Variables III

Proportion of total population variance explained by the $k$th principal component:

$$\frac{\lambda_k}{p}$$

Proportion of total population variance explained by the first $k$ principal components:

$$\frac{\lambda_1 + \cdots + \lambda_k}{p}$$

Summarizing Sample Variations by Principal Components I

Result: Let $X$ be the observation on the variables $X_1, X_2, \ldots, X_p$ with the corresponding sample covariance matrix $S_{p \times p}$. Then the $i$th sample principal component is given by

$$\hat{Y}_i = \hat{e}_i'X = \hat{e}_{i1}X_1 + \cdots + \hat{e}_{ip}X_p, \quad \text{for } i = 1, 2, \ldots, p,$$

where $(\hat{\lambda}_1, \hat{e}_1), (\hat{\lambda}_2, \hat{e}_2), \ldots, (\hat{\lambda}_p, \hat{e}_p)$ are the eigenvalue-eigenvector pairs for $S$, with $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge \hat{\lambda}_p \ge 0$. Also,

$$\mathrm{Var}(\hat{Y}_i) = \hat{\lambda}_i, \quad \text{for } i = 1, 2, \ldots, p$$

and

$$\mathrm{Cov}(\hat{Y}_i, \hat{Y}_k) = 0, \quad \text{for } i \neq k.$$

In addition,

$$\text{Total Sample Variance} = \sum_{i=1}^{p} s_{ii} = \sum_{i=1}^{p} \hat{\lambda}_i$$
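A sketch of the sample computation, with simulated observations standing in for real data:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.multivariate_normal(mean=[0.0, 0.0, 0.0],
                            cov=[[4, 2, 0], [2, 3, 1], [0, 1, 2]],
                            size=200)          # n = 200 observations, p = 3

S = np.cov(X, rowvar=False)                    # sample covariance matrix S
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
lam_hat, E_hat = eigvals[order], eigvecs[:, order]

Y_hat = (X - X.mean(axis=0)) @ E_hat           # sample principal component scores

# Sample variance of the ith score equals lambda_hat_i, and
# total sample variance sum(s_ii) equals sum(lambda_hat_i).
print(np.allclose(Y_hat.var(axis=0, ddof=1), lam_hat))   # True
print(np.isclose(np.trace(S), lam_hat.sum()))            # True
```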

Summarizing Sample Variations by Principal Components II

Result: Let $Z$ be the observation on the standardized variables $Z_i = \frac{X_i - \bar{X}_i}{\sqrt{s_{ii}}}$, $i = 1, \ldots, p$, with the corresponding sample covariance matrix $R_{p \times p}$. Then the $i$th sample principal component is given by

$$\hat{Y}_i = \hat{e}_i'Z = \hat{e}_{i1}Z_1 + \cdots + \hat{e}_{ip}Z_p, \quad \text{for } i = 1, 2, \ldots, p,$$

where $(\hat{\lambda}_1, \hat{e}_1), (\hat{\lambda}_2, \hat{e}_2), \ldots, (\hat{\lambda}_p, \hat{e}_p)$ are the eigenvalue-eigenvector pairs for $R$, with $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge \hat{\lambda}_p \ge 0$. Also,

$$\mathrm{Var}(\hat{Y}_i) = \hat{\lambda}_i, \quad \text{for } i = 1, 2, \ldots, p$$

and

$$\mathrm{Cov}(\hat{Y}_i, \hat{Y}_k) = 0, \quad \text{for } i \neq k.$$

In addition,

$$\text{Total Sample Variance} = \sum_{i=1}^{p} \hat{\lambda}_i = p$$

Summarizing Sample Variations by Principal Components III

How many principal components should be retained?

There is no definite answer. Subjectively, we decide based on
the relative sizes of the eigenvalues and the amount of sample variation explained;
subject-matter interpretation of the components is also important.

Visual aid: Scree Plot
A plot of $\hat{\lambda}_i$ vs $i$; see the sketch below.
To determine the appropriate number of components, we look for an elbow (bend) in the scree plot.
The number of components is taken to be the point at which the remaining eigenvalues are relatively small and all about the same size.
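A minimal matplotlib sketch of a scree plot; the eigenvalues are hypothetical:

```python
import numpy as np
import matplotlib.pyplot as plt

lam_hat = np.array([3.8, 1.9, 0.9, 0.4, 0.3, 0.2])   # hypothetical eigenvalues

plt.plot(np.arange(1, len(lam_hat) + 1), lam_hat, 'o-')
plt.xlabel('component number i')
plt.ylabel('eigenvalue')
plt.title('Scree plot')
plt.show()
# The bend after i = 2 here would suggest retaining the first two components.
```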
