
Data Mining course by K K Singh is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Dimensionality Reduction using PCA (Principal Component Analysis)

K K SINGH, DEPT OF CSE, RGUKT NUZVID
Outline
 Introduction
 Mathematical background
 PCA
 Applications
Suppose there are 1000 students in the 20 classrooms of the department.

We want to call a meeting to discuss the students' performance and build some inference/model.
Should we call all 1000 students, or only a set of student representatives (CRs)?
In the same way, if a data set has 1000 variables/dimensions, only a set of 'p' representative
variables (called principal components in PCA) may be enough to build the model.
Introduction
 Principal component analysis (PCA) is a statistical procedure that uses an orthogonal
transformation to convert a set of observations of possibly correlated variables into a set of
values of linearly uncorrelated variables called principal components.
 PCA can supply the user with a lower-dimensional picture: a projection of the object as
viewed from its most informative viewpoint.
 Look at the different 2-D views of a 3-D cuboid. Which one is the most informative?
(The fourth one?)
 The transformation is defined in such a way that the first principal component has the
largest possible variance.
 Source: https://en.wikipedia.org/wiki/Principal_component_analysis
Mathematical Background
 Suppose we have one-dimensional data X.
 Variance: the average squared distance from the mean of the data set to its points.
 Var(X) = Σ(xᵢ - x̄)² / (n - 1)
 Covariance: always measured between two dimensions.
 Cov(X, Y) = Σ(xᵢ - x̄)(yᵢ - ȳ) / (n - 1)
 Cov(X, X) = Var(X)
 If X and Y are independent, they are uncorrelated: Cov(X, Y) = 0.
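
As a quick numerical check of these two formulas, here is a minimal sketch (the arrays x and y below are hypothetical example data, not taken from the slides) that computes the sample variance and covariance directly from the definitions and compares them with NumPy's built-ins:

    import numpy as np

    x = np.array([2.0, 4.0, 6.0, 8.0])   # hypothetical dimension X
    y = np.array([1.0, 3.0, 2.0, 5.0])   # hypothetical dimension Y
    n = len(x)

    var_x  = np.sum((x - x.mean()) ** 2) / (n - 1)                # Var(X)
    cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)    # Cov(X, Y)

    print(var_x, np.var(x, ddof=1))       # both give the sample variance
    print(cov_xy, np.cov(x, y)[0, 1])     # both give the sample covariance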
Mathematical Background
 Covariance matrix (for two dimensions X and Y):

        | Cov(X, X)  Cov(X, Y) |     | Var(X)     Cov(X, Y) |
    C = |                      |  =  |                      |
        | Cov(X, Y)  Cov(Y, Y) |     | Cov(X, Y)  Var(Y)    |

 C is a square, symmetric matrix.
 The diagonal entries are the variances of each dimension; the off-diagonal entries are the
covariances between measurement types.
 Large values on the diagonal correspond to interesting dimensions, whereas large values off
the diagonal correspond to high correlations (redundancy).
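
A minimal sketch (again on hypothetical x and y) showing that np.cov builds exactly this matrix: the diagonal holds the variances, the off-diagonal holds the covariance, and the result is symmetric:

    import numpy as np

    x = np.array([2.0, 4.0, 6.0, 8.0])    # hypothetical dimension X
    y = np.array([1.0, 3.0, 2.0, 5.0])    # hypothetical dimension Y

    C = np.cov(x, y)                      # 2 x 2 covariance matrix
    print(C)
    print(np.allclose(C, C.T))            # True: C is symmetric
    print(C[0, 0], np.var(x, ddof=1))     # diagonal entry equals Var(X)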
Mathematical Background 6
0 1

A V = λ V => (A – λI ) V =0
A=
-2 -3
0 1 λ 0
 |A-λI | = -
-2 -3 0 λ +1
+1
V1 = k1 , V2 = k2
 -λ 1 = λ + 3λ +2 = 0
2
-2
-2 -3 -λ -1

 => λ1= -1, λ2 = -2


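The same result can be checked numerically; a small sketch using np.linalg.eig on the matrix A above:

    import numpy as np

    A = np.array([[ 0.0,  1.0],
                  [-2.0, -3.0]])

    evals, evecs = np.linalg.eig(A)
    print(evals)    # the eigenvalues -1 and -2 (order not guaranteed)
    print(evecs)    # columns proportional to [1, -1] and [1, -2]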
Mathematical Background
Properties of eigenvectors
 Eigenvectors can only be found for square matrices.
 Not every square matrix has (real) eigenvectors.
 A symmetric matrix S (n × n) satisfies two properties:
  I. It has exactly n eigenvectors.
  II. All the eigenvectors are orthogonal (perpendicular).
 Any vector is an eigenvector of the identity matrix.
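
These properties can be illustrated on a small symmetric matrix (the matrix S below is a hypothetical example); np.linalg.eigh is the routine intended for symmetric matrices:

    import numpy as np

    S = np.array([[2.0, 1.0],
                  [1.0, 2.0]])            # a small symmetric matrix

    evals, evecs = np.linalg.eigh(S)      # n real eigenvalues, n eigenvectors
    print(evals)                          # [1., 3.]
    print(evecs.T @ evecs)                # ~identity: the eigenvectors are orthogonal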
Example

Step 1: Get some data.

    Sl No   Size (KLOC)   Cyclomatic Complexity (CC)   No. of Defects (D)
      1          5                    55                       4
      2          8                    75                       8
      3          6                    50                       5
      4          9                    85                      10
      5          1                    12                       3
      6          2                    24                       4
      7          4                    30                       3
      8          6                    70                       7
      9          8                    85                       6

    from scipy import linalg as LA
    import numpy as np
    import matplotlib.pyplot as plt

    x1 = [5, 8, 6, 9, 1, 2, 4, 6, 8]              # size (KLOC)
    x2 = [55, 75, 50, 85, 12, 24, 30, 70, 85]     # cyclomatic complexity (CC)
    X = np.stack((x1, x2), axis=-1)
    plt.scatter(x1, x2, color="r")                # plot the raw data

Step 2: Subtract the mean.

    n, m = X.shape
    M = np.mean(X, axis=0)
    X = X - M                                     # mean-centre the data

Step 3: Calculate the covariance matrix C.

    cov = np.dot(X.T, X) / (n - 1)
    # cov ≈ [[ 7.53,  71.75],
    #        [71.75, 734.50]]

Step 4: Calculate the eigenvectors and eigenvalues of the covariance matrix.

    evals, evecs = LA.eigh(cov)
    idx = np.argsort(evals)[::-1]                 # sort by decreasing eigenvalue
    evecs = evecs[:, idx]
    evals = evals[idx]

Step 5: Project onto the p eigenvectors that correspond to the p highest eigenvalues.

    PC = np.dot(X, evecs)
    plt.scatter(PC[:, 0], PC[:, 1], color="b")
    plt.ylim(-2, 10)

Step 6: Get the data back.

    XX = np.dot(PC, evecs.T)                      # adding M back recovers the original units
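
To see the dimensionality reduction itself, one can keep only the first principal component (the one with the highest eigenvalue) and reconstruct from it alone. A sketch continuing from the variables above (X mean-centred, M its mean, evecs sorted by decreasing eigenvalue):

    k = 1                                          # keep only the top-k components
    PC_k = np.dot(X, evecs[:, :k])                 # n x k component scores
    X_approx = np.dot(PC_k, evecs[:, :k].T) + M    # approximate reconstruction in original units
    print(np.round(X_approx, 1))                   # close to the original data, from 1 dimension instead of 2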
 Why do we use PCA or SVD?
 A powerful tool for analyzing data and finding patterns.
 Used for data compression and feature selection.
 One can reduce the number of dimensions without much loss of information.
 PCA can be done either by eigenvalue decomposition of the data covariance matrix or by
singular value decomposition (SVD) of the data matrix, usually after a normalization step on the
initial data (mean centering); a sketch comparing the two routes follows below.
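
A minimal sketch of that equivalence, on hypothetical mean-centred data, showing that the eigendecomposition route and the SVD route give the same variances and the same principal directions (up to sign):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))          # hypothetical data: 100 samples x 3 variables
    X = X - X.mean(axis=0)                 # the normalization step (mean centering)

    # Route 1: eigendecomposition of the covariance matrix
    cov = X.T @ X / (X.shape[0] - 1)
    evals, evecs = np.linalg.eigh(cov)
    evals, evecs = evals[::-1], evecs[:, ::-1]     # sort by decreasing eigenvalue

    # Route 2: SVD of the (mean-centred) data matrix itself
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    print(np.allclose(evals, s ** 2 / (X.shape[0] - 1)))   # same variances along the components
    print(np.allclose(np.abs(evecs), np.abs(Vt.T)))        # same directions, up to sign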
 Some other terminology about PCA:
 Component scores (or factor scores): the new variables (PCs) are constructed as weighted
averages of the original variables (X). These new variables are called the principal
components, latent variables, or factors. Their values on a specific row are referred to as
the factor scores, the component scores, or simply the scores.
 Loadings: the weight by which each normalized original variable should be multiplied to
get the transformed variable.
 Loadings = Eigenvectors ⋅ √(Eigenvalues)
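
A minimal sketch of the loadings formula on the example data above (x1, x2); note that it standardizes the variables first, since loadings refer to the normalized original variables:

    import numpy as np

    x1 = np.array([5, 8, 6, 9, 1, 2, 4, 6, 8], dtype=float)
    x2 = np.array([55, 75, 50, 85, 12, 24, 30, 70, 85], dtype=float)
    Z = np.column_stack([(x1 - x1.mean()) / x1.std(ddof=1),
                         (x2 - x2.mean()) / x2.std(ddof=1)])   # normalized variables

    evals, evecs = np.linalg.eigh(np.cov(Z.T))    # eigendecomposition of the correlation matrix
    loadings = evecs * np.sqrt(evals)             # Loadings = Eigenvectors · sqrt(Eigenvalues)
    print(loadings)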
PCA Applications
 Dimensionality reduction
 Image compression
 Feature selection
 PageRank and HITS algorithms
 LSI: Latent Semantic Indexing (document term-to-concept similarity matrix)
Assignments

 Can we compute the eigenvectors of a square matrix A where det(A) = 0?
 Explain why XᵀX / (n - 1) = Cov(X), where X is the mean-centered data matrix.
 Suppose the principal components P (n × m) and the eigenvectors V (m × m) are given, and
the first k eigenvectors capture 99% of the variance in the data set. How can the original
data set be reconstructed using only the first k columns of V and PC?

Thank you for listening

Questions or Thoughts?
