MNIST
3. target Key:
• This key stores the labels or target values associated with
each data point. These are like separate label cards
accompanying the data cards. In the handwritten digit
example, the target card would tell you the actual digit (0-
9) represented by the image in the corresponding data
card.
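The pairing between data and target can be sketched with a toy stand-in. The dictionary below is hypothetical: it is not a real Scikit-learn object, and its pixel values and labels are invented for illustration. It only mimics the "data" and "target" keys to show that entry i of the target belongs to row i of the data.

```python
# Toy stand-in for a dataset object (hypothetical values, not real MNIST).
toy_dataset = {
    "data": [
        [0, 255, 0, 0],    # "image" 0: four made-up pixel intensities
        [255, 255, 0, 0],  # "image" 1
        [0, 0, 255, 255],  # "image" 2
    ],
    # One label per row of "data", stored in the same order.
    "target": ["1", "7", "3"],
}

X, y = toy_dataset["data"], toy_dataset["target"]

# Row i of the data and entry i of the target describe the same sample,
# so the two sequences must have the same length.
assert len(X) == len(y)

for features, label in zip(X, y):
    print(f"pixels={features} -> digit {label}")
```

The labels are written as strings here only as an illustrative choice; the point is the one-to-one ordering between the two keys.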
Example with MNIST Dataset:
• The MNIST dataset is a popular example used in image classification tasks.
It contains 70,000 images of handwritten digits. Let's see how Scikit-learn
provides access to this data (the dataset is downloaded from OpenML on first use):
• Python
– from sklearn.datasets import fetch_openml
– mnist = fetch_openml('mnist_784', as_frame=False) # Load MNIST as NumPy arrays
– X, y = mnist["data"], mnist["target"] # Separate data and target arrays
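• Here, X holds the image data (features) and y holds the corresponding digit
labels (targets). The labels are returned as strings ('0' to '9'), so convert
them to integers if numeric labels are needed.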
Understanding the Data Size:
• We can use the shape attribute to understand the
dimensions of the data and target arrays:
• Python
– print(X.shape) # Output: (70000, 784)
– print(y.shape) # Output: (70000,)
• The first number in X.shape (70,000) represents the total
number of images (data points).
• The second number (784) represents the number of features
for each image. Since each image in MNIST is 28x28 pixels,
784 is the total number of pixels (28 * 28 = 784).
• y.shape has a single number (70,000) because there is exactly
one label per image.
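The relationship between the 784 features and the 28x28 pixel grid can be sketched in plain Python. This is a minimal illustration, assuming the pixels are stored row by row (row-major order). The helper names feature_index, pixel_position, and unflatten are hypothetical and not part of Scikit-learn, and flat_image is a stand-in rather than a real MNIST row.

```python
IMAGE_SIDE = 28
N_FEATURES = IMAGE_SIDE * IMAGE_SIDE  # 28 * 28 = 784


def feature_index(row, col):
    """Map a (row, col) pixel position to its index in the flat feature vector."""
    return row * IMAGE_SIDE + col


def pixel_position(index):
    """Map a flat feature index back to its (row, col) pixel position."""
    return divmod(index, IMAGE_SIDE)


def unflatten(flat):
    """Cut a flat list of 784 values into 28 rows of 28 values each."""
    if len(flat) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} values, got {len(flat)}")
    return [
        flat[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE]
        for r in range(IMAGE_SIDE)
    ]


# Stand-in for one row of X: the values 0..783 instead of real pixel intensities.
flat_image = list(range(N_FEATURES))
grid = unflatten(flat_image)

print(len(grid), len(grid[0]))   # 28 28
print(feature_index(27, 27))     # 783 (the last pixel)
print(pixel_position(100))       # (3, 16)
```

With NumPy arrays the same regrouping is usually done with a reshape to (28, 28); the list version above just makes the index arithmetic explicit.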
Visualizing a Single Image: