
CIFAR-100 Dataset

Last Updated : 23 Jul, 2025

The CIFAR-100 dataset is widely used in computer vision as a benchmark for developing and testing machine learning models. This article describes the structure of the dataset and shows how to load it in TensorFlow.

What is the CIFAR-100 Dataset?

Named after the Canadian Institute for Advanced Research (CIFAR), the CIFAR-100 dataset consists of 60,000 color images in 100 classes, with 600 images per class. It is split into 50,000 training images and 10,000 test images. Each image is a 32x32 color image, and this low resolution makes classification substantially harder.
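The counts above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the 500/100 per-class train/test split is an assumption taken from the dataset's official description rather than from this article, and the constant names are made up for the example.

```python
# Basic CIFAR-100 bookkeeping (plain arithmetic, no download needed).
NUM_CLASSES = 100
IMAGES_PER_CLASS = 600
TRAIN_PER_CLASS = 500  # assumption: per-class split from the official description
TEST_PER_CLASS = 100   # assumption: per-class split from the official description

total_images = NUM_CLASSES * IMAGES_PER_CLASS
train_images = NUM_CLASSES * TRAIN_PER_CLASS
test_images = NUM_CLASSES * TEST_PER_CLASS

# Each image is 32 x 32 pixels with 3 color channels.
values_per_image = 32 * 32 * 3

print(total_images, train_images, test_images)
print(values_per_image)
```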

Classes and Superclasses

Unlike its simpler counterpart CIFAR-10, which has 10 classes, CIFAR-100 has 100 fine classes grouped into 20 superclasses. Each superclass contains five semantically related classes, and every image carries both a fine label (its class) and a coarse label (its superclass). For instance, the "Aquatic mammals" superclass includes beaver, dolphin, otter, seal, and whale.

Here's a glimpse into some of the superclasses and their corresponding classes:

  • Insects: bee, beetle, butterfly, caterpillar, cockroach
  • Large carnivores: bear, leopard, lion, tiger, wolf
  • Household furniture: bed, chair, couch, table, wardrobe
  • Vehicles 1: bicycle, bus, motorcycle, pickup truck, train

This hierarchical structure with superclasses and classes allows for more nuanced tasks in machine learning, including fine-grained classification, superclass classification, and hierarchical classification tasks.
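The grouping described above can be sketched as a small lookup table. The dictionary below covers only the five superclasses named in this section, not the full hierarchy, and the names `SUPERCLASSES`, `FINE_TO_SUPER`, and `to_superclass` are hypothetical helpers for illustration. In Keras, calling `tf.keras.datasets.cifar100.load_data(label_mode="coarse")` returns the superclass ids directly, so a hand-built table like this is mainly useful for understanding the structure.

```python
# Superclass -> fine-class table for the superclasses named in this section.
# Partial and illustrative only: the real dataset has 20 superclasses.
SUPERCLASSES = {
    "aquatic_mammals": ["beaver", "dolphin", "otter", "seal", "whale"],
    "insects": ["bee", "beetle", "butterfly", "caterpillar", "cockroach"],
    "large_carnivores": ["bear", "leopard", "lion", "tiger", "wolf"],
    "household_furniture": ["bed", "chair", "couch", "table", "wardrobe"],
    "vehicles_1": ["bicycle", "bus", "motorcycle", "pickup_truck", "train"],
}

# Invert the table so a fine label can be looked up directly.
FINE_TO_SUPER = {
    fine: superclass
    for superclass, fine_classes in SUPERCLASSES.items()
    for fine in fine_classes
}


def to_superclass(fine_label):
    """Return the superclass of a fine label, or None if the partial table lacks it."""
    return FINE_TO_SUPER.get(fine_label)


print(to_superclass("dolphin"))
print(to_superclass("wardrobe"))
print(to_superclass("apple"))  # not covered by this partial table
```

Collapsing fine predictions to superclasses this way is one simple route to superclass classification: a model trained on fine labels can be scored at the coarse level without retraining.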

Role of the CIFAR-100 Dataset in Computer Vision

The CIFAR-100 dataset was created as an extension of the CIFAR-10 dataset, which contains the same number of total images but fewer classes (10 classes instead of 100). It was developed to provide a more challenging dataset that could help advance the development of more sophisticated image recognition technologies. The CIFAR datasets were created by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton and have been widely used in academic and research settings since their introduction.

In computer vision, the CIFAR-100 dataset plays a central role in developing and evaluating machine learning models. Its difficulty comes from the large number of classes, the small number of training images per class, and the low image resolution, which together make it a demanding test of an algorithm. It is commonly used in benchmarking studies that compare architectures and learning techniques, such as convolutional neural networks (CNNs), on a fixed, controlled set of data.
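Benchmark comparisons of this kind usually report classification accuracy. As a rough sketch, and not a metric prescribed by this article, the hypothetical helper `top_k_accuracy` below computes the fraction of samples whose true label is among the k highest-scoring classes. The scores are made-up toy values, not real model output. For reference, with 100 balanced classes, random guessing would reach about 1% top-1 accuracy, compared with about 10% on CIFAR-10.

```python
import numpy as np


def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: array of shape (n_samples, n_classes)
    labels: integer array of shape (n_samples,) or (n_samples, 1)
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels).reshape(-1)
    # Indices of the k largest scores in each row (order within the k is irrelevant).
    top_k = np.argpartition(scores, -k, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy example with 3 samples and 4 classes (made-up scores).
scores = np.array([
    [0.1, 0.6, 0.2, 0.1],  # highest: class 1, second: class 2
    [0.5, 0.1, 0.3, 0.1],  # highest: class 0, second: class 2
    [0.3, 0.1, 0.1, 0.5],  # highest: class 3, second: class 0
])
labels = np.array([1, 2, 1])

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
```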

How to Load CIFAR-100 Dataset in TensorFlow

The example below loads CIFAR-100 with TensorFlow and unpacks it into training and test sets. It then defines a `plot_images` function that displays images in a grid, lists the 100 class names in label-index order, and calls the function to show a 4x4 grid of training images with their class labels.

Python
import tensorflow as tf
import matplotlib.pyplot as plt

# Load CIFAR-100 and unpack it into training and test sets.
# label_mode='fine' (the default) returns the 100 fine-class labels.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar100.load_data(label_mode='fine')

# Define a function to plot a grid of images
def plot_images(images, labels, class_names, grid_size=(4, 4)):
    plt.figure(figsize=(8, 8))
    for i in range(grid_size[0] * grid_size[1]):
        plt.subplot(grid_size[0], grid_size[1], i + 1)
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        plt.imshow(images[i])
        plt.xlabel(class_names[labels[i][0]])
    plt.show()

# Get class names for CIFAR-100
class_names = [
    'apple', 'aquarium_fish', 'baby', 'bear', 'beaver', 'bed', 'bee', 'beetle', 'bicycle', 'bottle',
    'bowl', 'boy', 'bridge', 'bus', 'butterfly', 'camel', 'can', 'castle', 'caterpillar', 'cattle',
    'chair', 'chimpanzee', 'clock', 'cloud', 'cockroach', 'couch', 'crab', 'crocodile', 'cup', 'dinosaur',
    'dolphin', 'elephant', 'flatfish', 'forest', 'fox', 'girl', 'hamster', 'house', 'kangaroo', 'keyboard',
    'lamp', 'lawn_mower', 'leopard', 'lion', 'lizard', 'lobster', 'man', 'maple_tree', 'motorcycle', 'mountain',
    'mouse', 'mushroom', 'oak_tree', 'orange', 'orchid', 'otter', 'palm_tree', 'pear', 'pickup_truck', 'pine_tree',
    'plain', 'plate', 'poppy', 'porcupine', 'possum', 'rabbit', 'raccoon', 'ray', 'road', 'rocket',
    'rose', 'sea', 'seal', 'shark', 'shrew', 'skunk', 'skyscraper', 'snail', 'snake', 'spider',
    'squirrel', 'streetcar', 'sunflower', 'sweet_pepper', 'table', 'tank', 'telephone', 'television', 'tiger', 'tractor',
    'train', 'trout', 'tulip', 'turtle', 'wardrobe', 'whale', 'willow_tree', 'wolf', 'woman', 'worm'
]

# Plot a 4x4 grid of images from the training set
plot_images(x_train, y_train, class_names)

Output:

[Figure: CIFAR-100 dataset loaded using TensorFlow]
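Once the arrays are loaded, a common next step before training is to scale the pixel values and one-hot encode the labels. The sketch below is one possible way to do this, not a required part of the loading process. `preprocess` is a hypothetical helper, and the random stand-in arrays only mimic the shapes and dtypes that `load_data()` returns (uint8 images of shape (n, 32, 32, 3) and integer labels of shape (n, 1)), so the example runs without downloading anything.

```python
import numpy as np

NUM_CLASSES = 100


def preprocess(images, labels, num_classes=NUM_CLASSES):
    """Scale uint8 pixels to [0, 1] and one-hot encode labels of shape (n, 1)."""
    x = images.astype("float32") / 255.0
    y = np.eye(num_classes, dtype="float32")[labels.reshape(-1)]
    return x, y


# Stand-in arrays (assumption: same shapes and dtypes as the real dataset).
# Replace them with x_train and y_train when the real data is available.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
fake_labels = rng.integers(0, NUM_CLASSES, size=(8, 1))

x, y = preprocess(fake_images, fake_labels)
print(x.shape, x.dtype, y.shape)
```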

CIFAR-10 vs CIFAR-100

| Feature | CIFAR-10 | CIFAR-100 |
| --- | --- | --- |
| Number of Classes | 10 | 100 |
| Class Labels | Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck | Apple, Aquarium Fish, Baby, Bear, Beaver, Bed, Bee, Beetle, Bicycle, Bottle, etc. (total 100 classes) |
| Number of Images | 60,000 (50,000 training + 10,000 test) | 60,000 (50,000 training + 10,000 test) |
| Image Dimensions | 32x32 pixels | 32x32 pixels |
| Color Channels | 3 (RGB) | 3 (RGB) |
| Data Format | 32x32x3 numpy arrays | 32x32x3 numpy arrays |
| Train/Test Split | 50,000 training images / 10,000 test images | 50,000 training images / 10,000 test images |
| Per-Class Samples | 6,000 images per class | 600 images per class |
| Dataset Size | ~163 MB | ~163 MB |
| Dataset Creator | Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton | Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton |
| Year of Release | 2009 | 2009 |
| Applications | Image classification, object recognition, machine learning benchmarks | Fine-grained image classification, object recognition, machine learning benchmarks |
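Several rows of this comparison (number of images, image dimensions, number of classes, per-class samples) can be read straight from the loaded arrays. The sketch below uses a hypothetical `summarize` helper on a small synthetic stand-in so that it runs without a download. The same function should work on the arrays returned by `tf.keras.datasets.cifar10.load_data()` or `tf.keras.datasets.cifar100.load_data()`, assuming labels are non-negative integers.

```python
import numpy as np


def summarize(images, labels):
    """Return basic statistics for an image-classification dataset."""
    labels = np.asarray(labels).reshape(-1)
    counts = np.bincount(labels)  # number of samples for each label id
    return {
        "num_images": int(images.shape[0]),
        "image_shape": tuple(images.shape[1:]),
        "num_classes": int(len(counts)),
        "min_per_class": int(counts.min()),
        "max_per_class": int(counts.max()),
    }


# Small balanced stand-in: 100 classes with 5 images each (hypothetical data).
labels = np.repeat(np.arange(100), 5).reshape(-1, 1)
images = np.zeros((labels.shape[0], 32, 32, 3), dtype=np.uint8)

stats = summarize(images, labels)
print(stats)
```

Equal values for `min_per_class` and `max_per_class` indicate a balanced dataset, which is what the per-class figures in the comparison describe.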

Applications of CIFAR-100 Dataset

The CIFAR-100 dataset is primarily used in machine learning and computer vision research for object recognition and classification tasks. It serves as a benchmark dataset to develop and test algorithms that can recognize and classify objects within an image. Applications include:

  • Educational Purposes: CIFAR-100 is often used in academic settings for teaching machine learning concepts and computer vision techniques.
  • Research and Development: Researchers use CIFAR-100 to develop new machine learning models or improve existing ones, particularly those related to image recognition and classification.
  • Benchmarking Tool: The dataset provides a standard for comparing the performance of different algorithms, helping in the identification of the most effective techniques for specific types of visual recognition tasks.
