CS306 Data Analysis and Visualization Winter, 2019: Lab. 7 MNIST Dataset For Dimensionality Reduction Using PCA

This document provides instructions for a lab exercise using the MNIST dataset to perform dimensionality reduction via PCA. Students will: 1) Prepare MNIST training data with 784 pixel features for each image and remove class labels. 2) Perform PCA and identify the number of principal components needed to preserve 95%, 90%, 80%, and 75% of data variability. 3) Encode and decode test images using the identified principal components. 4) Compute and report the RMSE reconstruction error for each digit class with different levels of dimensionality reduction in a 10x4 table.

Uploaded by Dhruvesh Asnani

CS306 Data Analysis and Visualization Winter, 2019

Lab. 7 MNIST dataset for Dimensionality Reduction using PCA

1. In this practical, we explore the MNIST dataset. You will surely enjoy this practical. The dataset contains 28x28 grayscale images of handwritten digits from 0 to 9. The MNIST folder contains two folders; for this practical we will first use the train folder.

1. In this first step we will prepare the data for PCA. Read each PNG file as a sample with 784 (28x28) features, where each pixel value is one feature. The data file may contain the class label, but the label is not part of the feature vector.

2. Carry out PCA on the prepared data file and find the principal components (PCs), and how many of them are required, to preserve 95%, 90%, 80% and 75% of the variability of the data.

3. Now encode 10 test images of each class with the identified PCs. To do so, compute the coefficient of each PC for every image; these coefficients form the encoded data.

4. Also decode the encoded feature vectors and reconstruct each one as a 28x28 image.

5. Compute and report the RMSE for each class at each level of compression / dimensionality reduction. The final result is a 10x4 table of RMS errors, with one row per class and one column per number of PCs used to encode the data.
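
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a reference solution: it uses NumPy's SVD to obtain the principal components, and it runs on synthetic random data standing in for the real images, because the layout of the train and test folders is not specified here. The helper names (fit_pca, components_for, encode, decode, rmse) and the array names are hypothetical.

```python
import numpy as np

def fit_pca(X):
    """Return the feature mean, principal axes (as rows), and explained-variance ratios."""
    mean = X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal axes,
    # and the squared singular values are proportional to the variances.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    var = s ** 2
    return mean, Vt, var / var.sum()

def components_for(ratio, target):
    """Smallest number of PCs whose cumulative explained variance reaches target."""
    k = int(np.searchsorted(np.cumsum(ratio), target)) + 1
    return min(k, len(ratio))

def encode(X, mean, Vt, k):
    """Coefficients of each sample on the first k PCs."""
    return (X - mean) @ Vt[:k].T

def decode(Z, mean, Vt):
    """Reconstruct 784-dimensional samples from their PC coefficients."""
    return Z @ Vt[:Z.shape[1]] + mean

def rmse(a, b):
    """Root-mean-square error over all pixels of all given samples."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic stand-in for the real data (an assumption): 784 features with
# low-rank structure plus noise. Replace with the flattened PNG images.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(20, 784))
X_train = rng.normal(size=(500, 20)) @ mixing + 0.1 * rng.normal(size=(500, 784))
X_test = rng.normal(size=(100, 20)) @ mixing + 0.1 * rng.normal(size=(100, 784))
y_test = np.repeat(np.arange(10), 10)  # 10 test samples per class; labels are not features

mean, Vt, ratio = fit_pca(X_train)
targets = [0.95, 0.90, 0.80, 0.75]
ks = [components_for(ratio, t) for t in targets]

# 10x4 table: one row per class, one column per variance level.
table = np.zeros((10, len(targets)))
for col, k in enumerate(ks):
    X_hat = decode(encode(X_test, mean, Vt, k), mean, Vt)
    for digit in range(10):
        rows = y_test == digit
        table[digit, col] = rmse(X_test[rows], X_hat[rows])

print("PCs per variance level:", dict(zip(targets, ks)))
print(np.round(table, 3))
```

With real data, each decoded row can be turned back into an image with reshape(28, 28). The RMSE values are in the same units as the pixel values, so they depend on whether the pixels are kept in the 0-255 range or rescaled.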

You may use Python/R for this exercise.

(Submit your solutions by email to [email protected] before the next lab.)
