Steps Involved in The PCA: Dataset Matrix
Dataset matrix
First, we standardize the dataset. To do that, we calculate the mean and standard
deviation of each feature.
Standardization formula: z = (x - μ) / σ, where μ is the mean of the feature and σ is its standard deviation.
Standardized Dataset
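The standardization step can be sketched in plain Python. The raw values for f1 and f2 below are an assumption: they are not given in the text, but were chosen because they reproduce the standardized values that appear in the covariance calculation. The helper name `standardize` is hypothetical, and the sample standard deviation (n - 1 in the denominator) is assumed.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(x - mu) / sigma for x in values]

# Assumed raw feature columns (illustrative, not taken from the article).
f1 = [1, 5, 1, 5, 8]
f2 = [2, 5, 4, 3, 1]

z1 = standardize(f1)
z2 = standardize(f2)
print([round(z, 2) for z in z1])
print([round(z, 6) for z in z2])
```

After this step each column has mean 0 and sample standard deviation 1, up to floating-point rounding.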
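Because the dataset is standardized, each feature now has a mean of 0 and a
standard deviation of 1. With the means at 0, the covariance of f1 and f2 is the sum of
the products of their standardized values divided by the number of samples (5):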
cov(f1, f2) =
((-1.0 - 0) * (-0.632456 - 0) +
(0.33 - 0) * (1.264911 - 0) +
(-1.0 - 0) * (0.632456 - 0) +
(0.33 - 0) * (0.000000 - 0) +
(1.33 - 0) * (-1.264911 - 0)) / 5
cov(f1, f2) = -0.25298
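The same calculation can be checked with a few lines of Python. This is a minimal sketch: the function name `covariance_zero_mean` is hypothetical, the inputs are the rounded standardized values from the calculation above, and the divisor is the number of samples (5), as in the worked example.

```python
def covariance_zero_mean(xs, ys):
    """Covariance of two features whose means are taken to be 0, dividing by n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n

# Standardized values of f1 and f2, as used in the worked calculation.
f1 = [-1.0, 0.33, -1.0, 0.33, 1.33]
f2 = [-0.632456, 1.264911, 0.632456, 0.000000, -1.264911]

cov_f1_f2 = covariance_zero_mean(f1, f2)
print(round(cov_f1_f2, 5))
```

Dividing by n - 1 instead of n would give the sample covariance, which is a different value; the sketch follows the divisor used in the text.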
In a similar way, we can calculate the other covariances, which results in the
covariance matrix below.
Let A be a square matrix (in our case, the covariance matrix), ν a non-zero vector, and λ a
scalar such that Aν = λν. Then λ is called an eigenvalue of A, and ν is the eigenvector associated with it.
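Rearranging the above equation,
Aν - λν = 0, that is, (A - λI)ν = 0
where I is the identity matrix. Since ν is a non-zero vector, this equation can hold
only if the matrix A - λI is singular, that is, if
det(A - λI) = 0
Solving this characteristic equation for λ gives the eigenvalues of A.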
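To make the characteristic equation concrete, here is a small Python sketch for a 2 × 2 symmetric matrix. The matrix is a hypothetical example, not the covariance matrix of this dataset, and the helper name `eigenvalues_2x2` is made up for illustration. For a 2 × 2 matrix, det(A - λI) = 0 expands to λ² - trace(A)·λ + det(A) = 0, which the quadratic formula solves.

```python
import math

def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix with real eigenvalues, via the characteristic equation."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(trace * trace - 4 * det)  # real for symmetric matrices
    return (trace + disc) / 2, (trace - disc) / 2

A = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical symmetric matrix
lam_big, lam_small = eigenvalues_2x2(A)

# The first row of (A - lam*I)v = 0 reads (a00 - lam)*v1 + a01*v2 = 0,
# so v = (a01, lam - a00) up to scale; normalize it to unit length.
v = [A[0][1], lam_big - A[0][0]]
norm = math.hypot(v[0], v[1])
v = [x / norm for x in v]

# Check A v = lam v component by component.
Av = [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]
print(lam_big, lam_small, v, Av)
```

For the 4 × 4 case the characteristic polynomial has degree 4, so in practice a library routine such as `numpy.linalg.eigh` (for symmetric matrices) is the usual choice rather than solving it by hand.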
Eigenvectors:
For λ = 2.51579324, solving (A - λI)ν = 0 using Cramer's rule gives the following
components of the vector ν:
v1 = 0.16195986
v2 = -0.52404813
v3 = -0.58589647
v4 = -0.59654663
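A quick sanity check on these values: eigenvectors are only defined up to scale, and the components listed here appear to be normalized to unit length. A short Python sketch (variable names are illustrative) confirms this.

```python
import math

# Components of the eigenvector for lambda = 2.51579324, as listed above.
v = [0.16195986, -0.52404813, -0.58589647, -0.59654663]

# Euclidean length: square root of the sum of squared components.
length = math.sqrt(sum(x * x for x in v))
print(round(length, 6))
```

The printed length is very close to 1, so the vector can be used directly as a principal direction without rescaling.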
Following the same approach, we can calculate the eigenvectors for the other
eigenvalues. We can then form a matrix from the eigenvectors.
Eigenvectors (4 × 4 matrix)
Data Transformation