
PRINCIPAL COMPONENT ANALYSIS (PCA)
OUTLINE

What is PCA?

Dimensionality Reduction

Why PCA?

Important Terminologies

How Does PCA Work?

Applications of PCA

Advantages and Disadvantages


INTRODUCTION
Principal Component Analysis, commonly referred to as
PCA, is a powerful mathematical technique used in data
analysis and statistics. At its core, PCA is designed to
simplify complex datasets by transforming them into a more
manageable form while retaining the most critical
information.

• Reducing the dimensionality of a dataset

• Increasing interpretability while minimizing information loss


Dimensionality Reduction

Dimensionality reduction refers to the techniques that reduce the number of input variables in a dataset.

Why DR?
• Fewer dimensions for a given dataset means less computation or training time.
• Redundancy is removed by dropping similar entries from the dataset.
• Data compression (reduced storage space).
• It helps to find the most significant features and skip the rest.
• Leads to better human interpretation.
WHY PCA?

• Dimensionality Reduction
• Noise Reduction
• Visualization
• Feature Engineering
• Overfitting Problem
• Data Compression
• Machine Learning Processing
IMPORTANT TERMINOLOGIES

• Variance
• Covariance
• Eigenvalues
• Eigenvectors
• Principal Component
IMPORTANT TERMINOLOGIES (VARIANCE)

• Variance is the average of the squared differences between each value and the mean.
• Variance (σ²) = (Sum of the squared differences from the mean) / (Total number of values)
• In mathematical notation: σ² = Σ(x - μ)² / n
Here:
• μ is the mean of the feature
• Mean (μ) = (Sum of all values) / (Total number of values)
• The variance is a measure of how much the data scatter around the mean.
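As a quick check, here is a minimal Python sketch (using NumPy, an assumed dependency) that computes the mean and population variance exactly as the formulas above define them:

```python
import numpy as np

x = np.array([2, 3, 5, 7, 10], dtype=float)  # the X variable used in the worked example below

mu = x.sum() / len(x)                  # mean: sum of values / number of values
var = ((x - mu) ** 2).sum() / len(x)   # population variance: σ² = Σ(x - μ)² / n

print(mu, var)    # 5.4 8.24
print(np.var(x))  # NumPy's built-in population variance agrees: 8.24
```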


COMPUTE EIGENVALUES/EIGENVECTORS

Let A be a square N×N matrix and x a non-zero vector for which:

Ax = λx

for some scalar value λ. Then:

λ = an eigenvalue of matrix A
x = an eigenvector of matrix A

Eigenvalues are found by solving the characteristic equation:
det(A - λI) = 0 [returns n eigenvalues]
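A minimal NumPy sketch (assumed code, not from the slides) that solves Ax = λx for a small symmetric matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is the right routine for symmetric matrices; it returns eigenvalues
# in ascending order and the matching unit-length eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(A)

print(eigenvalues)  # [1. 3.]

# Verify the defining property A x = λ x for each eigenpair.
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)
```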
HOW DOES PCA WORK?

• Step 1: Standardize the data.
• Step 2: Calculate the covariance matrix.
• Step 3: Compute the eigenvectors and eigenvalues.
• Step 4: Select the principal components.
• Step 5: Project data onto the new basis.
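Before working through these steps by hand, here is a minimal end-to-end sketch of the same pipeline using scikit-learn (an assumed dependency; the slides themselves do not name a library):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Toy dataset: 5 samples, 2 features (the same X and Y used in the worked example below).
data = np.array([[2, 4], [3, 5], [5, 7], [7, 8], [10, 11]], dtype=float)

# Step 1: standardize each feature to mean 0 and unit variance.
# (StandardScaler divides by the population standard deviation, i.e. n,
# while the worked example below divides by n - 1; the principal
# directions are unaffected by this choice.)
z = StandardScaler().fit_transform(data)

# Steps 2-5: PCA computes the covariance structure, its eigenvectors and
# eigenvalues, ranks the components, and projects the data onto them.
pca = PCA(n_components=2)
scores = pca.fit_transform(z)

print(pca.explained_variance_ratio_)  # share of total variance per component
print(scores)                         # the data expressed in the new basis
```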
Step-By-Step Explanation of PCA (Principal Component Analysis)

Step 1: Standardization
The main aim of this step is to standardize the range of the attributes so that each of them lies within
similar boundaries:

• Z = (x - μ) / σ

• μ is the mean of the feature

• σ is the standard deviation of the feature

• σ = √[ Σ(x - μ)² / N ]
STANDARDIZATION
Dataset:
Consider a small dataset with two variables, X and Y, represented by the following data points:
• X: [2, 3, 5, 7, 10]
• Y: [4, 5, 7, 8, 11]

For variable X:
• Mean (μX) = (2 + 3 + 5 + 7 + 10) / 5 = 5.4
• Standard Deviation (σX) = √[Σ(Xi - μX)² / (n - 1)] = √[(11.56 + 5.76 + 0.16 + 2.56 + 21.16) / 4] = √10.3 ≈ 3.21

For variable Y:
• Mean (μY) = (4 + 5 + 7 + 8 + 11) / 5 = 7
• Standard Deviation (σY) = √[Σ(Yi - μY)² / (n - 1)] = √[(9 + 4 + 0 + 1 + 16) / 4] = √7.5 ≈ 2.74

(Note that the sample convention with n - 1 is used here, whereas the formula above divides by N.)

• Standardized X: [-1.06, -0.75, -0.12, 0.50, 1.43]
• Standardized Y: [-1.10, -0.73, 0.00, 0.37, 1.46]
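The same numbers can be reproduced with a short NumPy sketch (assumed code, not from the slides):

```python
import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)

def standardize(v):
    # z = (x - μ) / σ, using the sample standard deviation (n - 1 in the denominator)
    return (v - v.mean()) / v.std(ddof=1)

print(np.round(standardize(X), 2))  # [-1.06 -0.75 -0.12  0.5   1.43]
print(np.round(standardize(Y), 2))  # [-1.1  -0.73  0.    0.37  1.46]
```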
Covariance Matrix Computation
The covariance matrix expresses the correlation between any two or more attributes in
a multidimensional dataset.

• Variance is denoted by Var

• Covariance is denoted by Cov
COVARIANCE MATRIX COMPUTATION

For two variables, the covariance matrix has the form:

Cov(X, X) Cov(X, Y)
Cov(Y, X) Cov(Y, Y)

• Using the formula for covariance on the standardized data:

Cov(X, X) = Σ(Standardized X * Standardized X) / (n - 1) = (1.12 + 0.56 + 0.02 + 0.25 + 2.05) / 4 = 1.00
Cov(X, Y) = Σ(Standardized X * Standardized Y) / (n - 1) = (1.16 + 0.55 + 0.00 + 0.18 + 2.09) / 4 ≈ 0.995
Cov(Y, X) = Cov(X, Y) ≈ 0.995
Cov(Y, Y) = Σ(Standardized Y * Standardized Y) / (n - 1) = (1.20 + 0.53 + 0.00 + 0.13 + 2.13) / 4 ≈ 1.00

• Covariance Matrix:

1.000 0.995
0.995 1.000

(The diagonal entries equal 1 because each standardized variable has unit sample variance; the off-diagonal entry is the correlation between X and Y.)
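A quick NumPy check of this matrix (assumed code; np.cov divides by n - 1 by default, matching the convention above):

```python
import numpy as np

zX = np.array([-1.06, -0.75, -0.12, 0.50, 1.43])  # standardized X from above
zY = np.array([-1.10, -0.73, 0.00, 0.37, 1.46])   # standardized Y from above

# np.cov treats each argument as one variable's observations.
C = np.cov(zX, zY)
print(np.round(C, 3))
# [[0.999 0.997]
#  [0.997 1.003]]  -- small deviations from 1.000/0.995 come from the
#                     z-scores having been rounded to two decimals
```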
Important Terminologies (Covariance)

• Covariance describes the relationship between a pair of random variables, where a change in one variable is accompanied by a change in the other.
• It can take any value from -infinity to +infinity; a negative value represents a negative relationship, whereas a positive value represents a positive relationship.
• It is used for the linear relationship between variables.
• It gives the direction of the relationship between variables.
IMPORTANT TERMINOLOGIES
(COVARIANCE)

The formula for the covariance (Cov) between two random variables X and Y,
each with N data points, is as follows:
Cov(X, Y) = (1/N) * Σ (from i=1 to N) [(Xi - X̄) * (Yi - Ȳ)]
Where:
• Cov(X, Y) is the covariance between X and Y.
• N is the number of data points.
• Xi and Yi represent individual data points, and X̄ and Ȳ are the means of X and Y.
(The sample version divides by N - 1 instead of N, as in the worked example above.)
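A direct translation of this formula into Python (assumed code; population version with 1/N):

```python
import numpy as np

def covariance(x, y):
    # Cov(X, Y) = (1/N) * Σ (x_i - mean(x)) * (y_i - mean(y))
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return ((x - x.mean()) * (y - y.mean())).sum() / len(x)

X = [2, 3, 5, 7, 10]
Y = [4, 5, 7, 8, 11]
print(covariance(X, Y))  # 7.0 -- the raw (unstandardized) covariance of X and Y
```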
COMPUTE EIGENVALUES AND EIGENVECTORS OF THE
COVARIANCE MATRIX TO IDENTIFY PRINCIPAL COMPONENTS

Solving det(C - λI) = 0 for the covariance matrix above yields two eigenvalues and their corresponding eigenvectors:

• Eigenvalue 1 (λ1) ≈ 1.995
• Eigenvector 1 (v1) = [0.707, 0.707]
• Eigenvalue 2 (λ2) ≈ 0.005
• Eigenvector 2 (v2) = [-0.707, 0.707]

Nearly all of the variance lies along the first component, which is expected because X and Y are almost perfectly correlated.
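This decomposition can be checked with NumPy (assumed code):

```python
import numpy as np

C = np.array([[1.000, 0.995],
              [0.995, 1.000]])

# eigh returns eigenvalues in ascending order for symmetric matrices.
eigenvalues, eigenvectors = np.linalg.eigh(C)

print(eigenvalues)   # [0.005 1.995]
print(eigenvectors)  # columns are unit eigenvectors: ±[0.707, -0.707] and ±[0.707, 0.707]
```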
SELECT THE PRINCIPAL COMPONENTS

1. The first principal component is the direction of greatest variability (variance) in the data.
2. The second is the next orthogonal (uncorrelated) direction of greatest variability.
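Components are ranked by eigenvalue; the explained variance ratio shows how much variance each one retains (a small assumed sketch):

```python
import numpy as np

eigenvalues = np.array([1.995, 0.005])
ratio = eigenvalues / eigenvalues.sum()
print(ratio)  # [0.9975 0.0025] -> keeping only PC1 preserves ~99.75% of the variance
```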
PROJECT DATA ONTO THE PRINCIPAL COMPONENTS

• To transform the data into the new principal component space, we take the dot product of each standardized data point with the eigenvectors:
• PC1 score = 0.707 · (Standardized X) + 0.707 · (Standardized Y)
• PC2 score = -0.707 · (Standardized X) + 0.707 · (Standardized Y)
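Putting the worked example together, a minimal NumPy sketch (assumed code) that projects the standardized data onto both components:

```python
import numpy as np

Z = np.column_stack([
    [-1.06, -0.75, -0.12, 0.50, 1.43],  # standardized X
    [-1.10, -0.73, 0.00, 0.37, 1.46],   # standardized Y
])

V = np.array([[0.707, -0.707],          # columns: v1 and v2
              [0.707,  0.707]])

scores = Z @ V                          # each row: (PC1 score, PC2 score)
print(np.round(scores, 2))
# PC1 scores are large while PC2 scores stay near zero: the data are
# essentially one-dimensional along the first principal component.
```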
APPLICATIONS OF PCA

• Netflix Movie Recommendations
• Fitness Trackers
• Car Shopping
• Real Estate
• Grocery Shopping
• Manufacturing and Quality Control
• Sports Analytics
• Smart Cities
• Renewable Energy
Advantages of PCA

• Prevents Overfitting
• Speeds Up Other Machine Learning Algorithms
• Improves Visualization
• Dimensionality Reduction
• Noise Reduction
LIMITATIONS OF PCA

• Linearity Assumption
• Loss of Interpretability
• Loss of Information
• Sensitivity to Scaling
• Orthogonal Components
