CIFAR 100 Dataset

Last Updated : 23 Jul, 2025

CIFAR-100 is a widely used computer vision dataset that serves as a standard benchmark for developing and testing machine learning models. This article describes the structure of the dataset and shows how to load it with TensorFlow.

Table of Contents

What is the CIFAR-100 Dataset?
Classes and Superclasses
Role of the CIFAR-100 Dataset in Computer Vision
How to Load CIFAR-100 Dataset in TensorFlow
CIFAR-10 vs CIFAR-100
Applications of CIFAR-100 Dataset

What is the CIFAR-100 Dataset?

Named after the Canadian Institute for Advanced Research (CIFAR), the CIFAR-100 dataset consists of 60,000 color images in 100 classes, with 600 images per class. The dataset is split into 50,000 training images and 10,000 test images, which gives 500 training and 100 test images per class. Each image is a 32x32 color image, and this low resolution makes classification substantially harder.
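The size figures above can be checked with some quick arithmetic. The following is a minimal sketch that uses only the numbers in this section plus the 500/100 per-class train/test split; the constant and variable names are illustrative and not part of any library.

```python
# Sanity-check the CIFAR-100 size figures with plain arithmetic.
NUM_CLASSES = 100
IMAGES_PER_CLASS = 600
TRAIN_PER_CLASS = 500  # per-class training images
TEST_PER_CLASS = 100   # per-class test images

total_images = NUM_CLASSES * IMAGES_PER_CLASS
train_images = NUM_CLASSES * TRAIN_PER_CLASS
test_images = NUM_CLASSES * TEST_PER_CLASS

# Each image is 32x32 pixels with 3 color channels, one byte per value.
bytes_per_image = 32 * 32 * 3

print(total_images, train_images, test_images)  # 60000 50000 10000
print(bytes_per_image)                          # 3072
```

At 3,072 bytes per image, a single sample carries very little visual detail, which is part of what makes the 100-way classification task hard.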

Classes and Superclasses

Unlike its simpler counterpart, CIFAR-10, which contains 10 classes of images, CIFAR-100 is structured around 100 fine classes. These classes are grouped into 20 superclasses. Each superclass encompasses five classes that are semantically related. For instance, the "Aquatic mammals" superclass includes classes like beaver, dolphin, otter, seal, and whale.

Here's a glimpse into some of the superclasses and their corresponding classes:

Insects: bee, beetle, butterfly, caterpillar, cockroach
Large carnivores: bear, leopard, lion, tiger, wolf
Household furniture: bed, chair, couch, table, wardrobe
Vehicles 1: bicycle, bus, motorcycle, pickup truck, train

This hierarchical structure with superclasses and classes allows for more nuanced tasks in machine learning, including fine-grained classification, superclass classification, and hierarchical classification tasks.
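As a rough illustration of the two-level hierarchy, the sketch below builds a lookup from fine class to superclass using only the five superclasses named in this section. It is a partial table (5 of the 20 superclasses), and `SUPERCLASSES`, `FINE_TO_SUPER`, and `superclass_of` are hypothetical names chosen for this example.

```python
# Partial superclass table, limited to the examples given in this article.
SUPERCLASSES = {
    "aquatic mammals": ["beaver", "dolphin", "otter", "seal", "whale"],
    "insects": ["bee", "beetle", "butterfly", "caterpillar", "cockroach"],
    "large carnivores": ["bear", "leopard", "lion", "tiger", "wolf"],
    "household furniture": ["bed", "chair", "couch", "table", "wardrobe"],
    "vehicles 1": ["bicycle", "bus", "motorcycle", "pickup truck", "train"],
}

# Invert the table so each fine class points to its superclass.
FINE_TO_SUPER = {
    fine: superclass
    for superclass, fines in SUPERCLASSES.items()
    for fine in fines
}

def superclass_of(fine_class):
    """Return the superclass name, or None if the class is not in this partial table."""
    return FINE_TO_SUPER.get(fine_class)

print(superclass_of("otter"))  # aquatic mammals
print(superclass_of("tiger"))  # large carnivores
```

When loading through Keras, passing `label_mode="coarse"` to `tf.keras.datasets.cifar100.load_data` returns the 20 superclass labels instead of the 100 fine labels, so a hand-built table like this is only needed when both levels are required at once.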

Role of the CIFAR-100 Dataset in Computer Vision

The CIFAR-100 dataset is a companion to the CIFAR-10 dataset, which contains the same total number of images but only 10 classes instead of 100. With ten times as many classes and one tenth as many images per class, it provides a more challenging benchmark for image recognition methods. The CIFAR datasets were created by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton and have been widely used in academic and research settings since their introduction.

In computer vision, the CIFAR-100 dataset plays an important role in the development and evaluation of machine learning models. Its difficulty comes from the large number of classes, the small number of training images per class, and the low image resolution, which together make it a demanding test of algorithms. It is commonly used in benchmarking studies that compare the performance of various architectures and learning techniques, such as convolutional neural networks (CNNs), on a fixed, controlled set of data.
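Benchmark comparisons of this kind typically report classification accuracy on the test set. The sketch below shows one way to compute top-k accuracy with NumPy; the function name `top_k_accuracy` and the hand-made scores are assumptions for illustration, standing in for the output of a real model.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    # Indices of the k largest scores in each row (order within the k does not matter).
    top_k = np.argpartition(scores, -k, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Hypothetical scores for 4 samples over 100 classes (stand-in for model output).
rng = np.random.default_rng(0)
scores = rng.random((4, 100))      # background scores in [0, 1)
labels = np.array([3, 17, 42, 99])

scores[0, 3] = 2.0    # sample 0: true class has the highest score
scores[1, 17] = 2.0   # sample 1: true class has the highest score
scores[2, 42] = 2.0   # sample 2: true class is second highest...
scores[2, 7] = 3.0    # ...behind a wrong class
scores[3, 99] = -1.0  # sample 3: true class has the lowest score

print(top_k_accuracy(scores, labels, k=1))  # 0.5
print(top_k_accuracy(scores, labels, k=5))  # 0.75
```

With 100 classes, top-1 accuracy of a random guess is about 1%, so even modest scores reflect real learning.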

How to Load CIFAR-100 Dataset in TensorFlow

We will load the CIFAR-100 dataset using TensorFlow and plot a 4x4 grid of sample images with their class labels.

The code below loads the dataset and unpacks it into training and test sets, defines a function `plot_images` that displays images in a grid, and lists the CIFAR-100 class names in label-index order. Finally, it calls the function to show the first 16 training images with their class labels.

Python

import tensorflow as tf
import matplotlib.pyplot as plt

# Load CIFAR-100 with fine-grained labels (100 classes); "fine" is the default label mode
cifar100_train, cifar100_test = tf.keras.datasets.cifar100.load_data(label_mode="fine")

# Unpack the dataset: images have shape (N, 32, 32, 3), labels have shape (N, 1)
(x_train, y_train) = cifar100_train
(x_test, y_test) = cifar100_test

# Define a function to plot a grid of images
def plot_images(images, labels, class_names, grid_size=(4, 4)):
    plt.figure(figsize=(8, 8))
    for i in range(grid_size[0] * grid_size[1]):
        plt.subplot(grid_size[0], grid_size[1], i + 1)
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        plt.imshow(images[i])
        plt.xlabel(class_names[labels[i][0]])
    plt.show()

# Get class names for CIFAR-100
class_names = [
    'apple', 'aquarium_fish', 'baby', 'bear', 'beaver', 'bed', 'bee', 'beetle', 'bicycle', 'bottle',
    'bowl', 'boy', 'bridge', 'bus', 'butterfly', 'camel', 'can', 'castle', 'caterpillar', 'cattle',
    'chair', 'chimpanzee', 'clock', 'cloud', 'cockroach', 'couch', 'crab', 'crocodile', 'cup', 'dinosaur',
    'dolphin', 'elephant', 'flatfish', 'forest', 'fox', 'girl', 'hamster', 'house', 'kangaroo', 'keyboard',
    'lamp', 'lawn_mower', 'leopard', 'lion', 'lizard', 'lobster', 'man', 'maple_tree', 'motorcycle', 'mountain',
    'mouse', 'mushroom', 'oak_tree', 'orange', 'orchid', 'otter', 'palm_tree', 'pear', 'pickup_truck', 'pine_tree',
    'plain', 'plate', 'poppy', 'porcupine', 'possum', 'rabbit', 'raccoon', 'ray', 'road', 'rocket',
    'rose', 'sea', 'seal', 'shark', 'shrew', 'skunk', 'skyscraper', 'snail', 'snake', 'spider',
    'squirrel', 'streetcar', 'sunflower', 'sweet_pepper', 'table', 'tank', 'telephone', 'television', 'tiger', 'tractor',
    'train', 'trout', 'tulip', 'turtle', 'wardrobe', 'whale', 'willow_tree', 'wolf', 'woman', 'worm'
]

# Plot a 4x4 grid of images from the training set
plot_images(x_train, y_train, class_names)

Output:

[Figure: 4x4 grid of CIFAR-100 training images with their class labels, loaded using TensorFlow]
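Before training a model, the loaded arrays are usually rescaled and the labels encoded. The sketch below shows one common approach (scaling pixels to [0, 1] and one-hot encoding the labels) on synthetic arrays that mimic the shapes of `x_train` and `y_train`; the `preprocess` function and the fake data are assumptions for illustration, not part of TensorFlow.

```python
import numpy as np

NUM_CLASSES = 100

def preprocess(images, labels, num_classes=NUM_CLASSES):
    """Scale uint8 pixels to [0, 1] and one-hot encode integer labels."""
    x = images.astype("float32") / 255.0
    # Labels arrive with shape (N, 1); flatten to (N,) before indexing the identity matrix.
    y = np.eye(num_classes, dtype="float32")[labels.reshape(-1)]
    return x, y

# Synthetic stand-in with the same shapes as x_train / y_train (8 samples only).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
fake_labels = rng.integers(0, NUM_CLASSES, size=(8, 1))

x, y = preprocess(fake_images, fake_labels)
print(x.shape, x.dtype)  # (8, 32, 32, 3) float32
print(y.shape)           # (8, 100)
```

To apply this to the real data, call `preprocess(x_train, y_train)` after the loading code above. One-hot labels suit a categorical cross-entropy loss; with a sparse categorical loss the integer labels can be used directly.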

CIFAR-10 vs CIFAR-100

Feature	CIFAR-10	CIFAR-100
Number of Classes	10	100
Class Labels	Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck	Apple, Aquarium Fish, Baby, Bear, Beaver, Bed, Bee, Beetle, Bicycle, Bottle, etc. (total 100 classes)
Number of Images	60,000 (50,000 training + 10,000 test)	60,000 (50,000 training + 10,000 test)
Image Dimensions	32x32 pixels	32x32 pixels
Color Channels	3 (RGB)	3 (RGB)
Data Format	32x32x3 numpy arrays	32x32x3 numpy arrays
Train/Test Split	50,000 training images / 10,000 test images	50,000 training images / 10,000 test images
Per-Class Samples	6,000 images per class (5,000 training + 1,000 test)	600 images per class (500 training + 100 test)
Dataset Size	~163 MB (Python version)	~161 MB (Python version)
Dataset Creator	Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton	Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton
Year of Release	2009	2009
Applications	Image classification, object recognition, machine learning benchmarks	Fine-grained image classification, object recognition, machine learning benchmarks

Applications of CIFAR-100 Dataset

The CIFAR-100 dataset is primarily used in machine learning and computer vision research for object recognition and classification tasks. It serves as a benchmark dataset to develop and test algorithms that can recognize and classify objects within an image. Applications include:

Educational Purposes: CIFAR-100 is often used in academic settings for teaching machine learning concepts and computer vision techniques.
Research and Development: Researchers use CIFAR-100 to develop new machine learning models or improve existing ones, particularly those related to image recognition and classification.
Benchmarking Tool: The dataset provides a standard for comparing the performance of different algorithms, helping in the identification of the most effective techniques for specific types of visual recognition tasks.

deepakp7eq
