Handwritten Digit Recognition with OpenCV
Last Updated: 10 May, 2024
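Handwritten digit recognition is the ability of a computer to automatically identify digits (0-9) written by hand. This article builds a handwritten digit recognizer: a neural network is trained with Keras, and OpenCV is used to load and preprocess the image to be classified.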
Implementation of Handwritten Digit Recognition System
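To implement handwritten digit recognition, we will use the MNIST dataset and train a Convolutional Neural Network (CNN) with Keras. OpenCV is used to load and preprocess the image we want to classify.
Install OpenCV and Keras with the commands below (Keras also needs a backend such as TensorFlow, which can be installed with pip install tensorflow if it is not already present):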
pip install opencv-python
pip install keras
Step 1: Import Necessary Libraries
We import OpenCV, NumPy, and Keras. Keras provides the MNIST dataset and the layers used to define the neural network model; OpenCV is used later to load and preprocess the test image.
import cv2
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.utils import to_categorical
Step 2: Loading MNIST Dataset
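We load the MNIST dataset with mnist.load_data(). It returns NumPy arrays that are already split into 60,000 training images and 10,000 test images, each a 28x28 grayscale image with an integer label from 0 to 9.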
# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
Step 3: Preprocessing the Dataset
After loading the dataset, we preprocess the images by normalizing their pixel values to the range 0 to 1. Each image is then reshaped to include a channel dimension, which convolutional layers require. Finally, the labels are one-hot encoded to convert them into categorical format.
# Preprocess the images
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255
# Reshape the images and add a channel dimension
train_images = np.expand_dims(train_images, axis=-1)
test_images = np.expand_dims(test_images, axis=-1)
# One-hot encode the labels
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
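To make the effect of these three preprocessing steps concrete, the following sketch applies them to a small made-up batch using NumPy only. The random images, the labels, and the one_hot helper are illustrative assumptions; one_hot merely mimics what one-hot encoding produces and is not the Keras implementation.

```python
import numpy as np

# Hypothetical stand-in for a batch of two 28x28 grayscale images (uint8, 0-255)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)
labels = np.array([3, 7])  # made-up integer class labels

# Scale pixel values from [0, 255] to [0, 1]
scaled = images.astype('float32') / 255

# Append a channel axis: (2, 28, 28) -> (2, 28, 28, 1)
scaled = np.expand_dims(scaled, axis=-1)


def one_hot(y, num_classes=10):
    """Illustrative one-hot encoding: row i has a 1 at column y[i]."""
    return np.eye(num_classes, dtype='float32')[y]


encoded = one_hot(labels)

print(scaled.shape)  # (2, 28, 28, 1)
print(encoded[0])    # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```

The one-hot rows have the same length as the model's 10-unit output layer, which is what lets the loss compare predictions and labels element by element.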
Step 4: Build the Model
Now we define the CNN model using the Sequential API. The model has two convolutional layers, each followed by a max pooling layer. A Flatten layer converts the resulting feature maps into a 1D array, and two dense layers perform the classification; the final dense layer has 10 softmax units, one per digit.
# Build the CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
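It can help to trace how the image shrinks as it passes through these layers. The sketch below does the arithmetic by hand, assuming the Keras defaults used above (no padding and stride 1 for the convolutions, non-overlapping 2x2 pooling). The helper names conv_out and pool_out are made up for illustration.

```python
def conv_out(size, kernel):
    """Spatial output size of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_out(size, pool):
    """Spatial output size of non-overlapping max pooling."""
    return size // pool


size = 28
size = conv_out(size, 3)  # 26 after the first Conv2D
size = pool_out(size, 2)  # 13 after the first MaxPooling2D
size = conv_out(size, 3)  # 11 after the second Conv2D
size = pool_out(size, 2)  # 5 after the second MaxPooling2D
flat = size * size * 64   # length of the Flatten output

# Trainable parameters per layer: weights + biases
params = {
    "conv1": 3 * 3 * 1 * 32 + 32,
    "conv2": 3 * 3 * 32 * 64 + 64,
    "dense1": flat * 64 + 64,
    "dense2": 64 * 10 + 10,
}
total = sum(params.values())

print(flat, total)  # 1600 121930
```

Most of the parameters sit in the first dense layer, which is why the pooling layers matter: they reduce the feature maps before flattening.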
Step 5: Compile the Model
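We now compile the model with the Adam optimizer, categorical cross-entropy as the loss function (it matches the one-hot encoded labels), and accuracy as the reported metric.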
# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
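As a rough illustration of what the categorical cross-entropy loss measures, the sketch below computes it by hand with NumPy for two made-up predictions over three classes (three instead of ten only to keep the numbers readable). The function is a simplified stand-in written for this example, not the Keras implementation.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: -sum(y_true * log(y_pred)) over the class axis."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)


# One-hot labels and made-up softmax outputs for two samples
y_true = np.array([[0, 0, 1],
                   [0, 1, 0]], dtype='float32')
y_pred = np.array([[0.1, 0.1, 0.8],   # confident and correct
                   [0.6, 0.3, 0.1]],  # wrong class favored
                  dtype='float32')

losses = categorical_crossentropy(y_true, y_pred)
print(losses)  # roughly [0.223 1.204]
```

Because the labels are one-hot, only the probability assigned to the true class contributes, so the loss is small when that probability is close to 1 and grows as it approaches 0.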
Step 6: Model Training
The compiled model is trained on the training dataset for 5 epochs with a batch size of 64, and its performance is validated on the test dataset after each epoch.
# Train the model
model.fit(train_images, train_labels, epochs=5, batch_size=64, validation_data=(test_images, test_labels))
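The batch size determines how many weight updates happen per epoch, which is also the step count Keras shows in its progress bar. A minimal calculation, assuming the standard 60,000-image MNIST training set:

```python
import math

num_train = 60000  # size of the MNIST training set
batch_size = 64
epochs = 5

# The last batch of an epoch may be smaller than batch_size
steps_per_epoch = math.ceil(num_train / batch_size)
last_batch = num_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 938 32 4690
```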
Step 7: Loading Image of a Digit and Preprocessing the Image
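Once the model is trained, we load an image of a handwritten digit and check whether the predicted output is correct. For the prediction, we read the image as grayscale, resize it to 28x28, invert the colors, normalize the pixel values, and reshape the array to the (1, 28, 28, 1) input shape expected by the neural network. The inversion is needed because MNIST digits are white on a black background, so this step assumes the input image shows a dark digit on a light background.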
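# Read the image as grayscale; cv2.imread returns None if the file cannot be read
image = cv2.imread('digit.png', cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError("Could not read 'digit.png'")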
# Resize the image to 28x28
image = cv2.resize(image, (28, 28))
# Invert the colors
image = cv2.bitwise_not(image)
# Normalize the image
image = image.astype('float32') / 255
# Reshape the image
image = np.expand_dims(image, axis=0)
image = np.expand_dims(image, axis=-1)
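The sketch below replays the inversion, normalization, and reshaping on a synthetic image, so the effect of each step can be inspected without an image file. It uses NumPy only: for 8-bit images, np.invert gives the same result as cv2.bitwise_not, namely 255 minus each pixel value. The "scan" with a single dark stroke is made up for illustration.

```python
import numpy as np

# Made-up 28x28 "scan": white paper (255) with a dark vertical stroke (0)
image = np.full((28, 28), 255, dtype=np.uint8)
image[4:24, 13:15] = 0

# Bitwise NOT on uint8 maps 0 -> 255 and 255 -> 0
inverted = np.invert(image)

# Normalize, then add batch and channel axes: (28, 28) -> (1, 28, 28, 1)
batch = inverted.astype('float32') / 255
batch = np.expand_dims(batch, axis=0)
batch = np.expand_dims(batch, axis=-1)

print(batch.shape)          # (1, 28, 28, 1)
print(batch[0, 10, 13, 0])  # 1.0 (stroke is now bright)
print(batch[0, 0, 0, 0])    # 0.0 (background is now dark)
```

After these steps the stroke is bright on a dark background, matching the MNIST training images.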
Step 8: Prediction
model.predict(image) returns an array of 10 class probabilities for the input image. Applying np.argmax to this output gives the index of the highest probability, which is the predicted digit.
# Predict the digit
prediction = np.argmax(model.predict(image))
print("Predicted Digit:", prediction)
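To show what np.argmax extracts, here is a small sketch on a made-up probability vector standing in for the output of model.predict on a single image. The numbers are invented for illustration and are not real model output.

```python
import numpy as np

# Made-up softmax output for one image: shape (1, 10), rows sum to 1
probs = np.array([[0.01, 0.03, 0.01, 0.05, 0.01,
                   0.85, 0.01, 0.01, 0.01, 0.01]])

# Index of the largest probability is the predicted digit
digit = int(np.argmax(probs, axis=-1)[0])
confidence = float(probs[0, digit])

# The three most likely digits, most likely first
top3 = np.argsort(probs[0])[::-1][:3]

print(digit, confidence)  # 5 0.85
print(top3)               # [5 3 1]
```

Calling np.argmax without an axis works for a single image because it searches the flattened array; when predicting several images at once, passing axis=-1 returns one digit per image.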
Complete Code to Recognize Handwritten Digit
Python
import cv2
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.utils import to_categorical
# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Preprocess the images
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255
# Reshape the images
train_images = np.expand_dims(train_images, axis=-1)
test_images = np.expand_dims(test_images, axis=-1)
# One-hot encode the labels
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
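# Build the model
# Note: this listing uses a simple fully connected network, not the CNN from Step 4.
# To train the CNN instead, import Conv2D and MaxPooling2D and use the Step 4 layers.
model = Sequential([
    Flatten(input_shape=(28, 28, 1)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])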
# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Train the model
model.fit(train_images, train_labels, epochs=5, batch_size=64, validation_data=(test_images, test_labels))
Output:
Epoch 1/5
938/938 [==============================] - 8s 7ms/step - loss: 0.2969 - accuracy: 0.9172 - val_loss: 0.1569 - val_accuracy: 0.9547
Epoch 2/5
938/938 [==============================] - 10s 11ms/step - loss: 0.1352 - accuracy: 0.9602 - val_loss: 0.1116 - val_accuracy: 0.9673
Epoch 3/5
938/938 [==============================] - 10s 10ms/step - loss: 0.0935 - accuracy: 0.9725 - val_loss: 0.1068 - val_accuracy: 0.9653
Epoch 4/5
938/938 [==============================] - 6s 6ms/step - loss: 0.0712 - accuracy: 0.9788 - val_loss: 0.0809 - val_accuracy: 0.9755
Epoch 5/5
938/938 [==============================] - 4s 4ms/step - loss: 0.0557 - accuracy: 0.9835 - val_loss: 0.0793 - val_accuracy: 0.9750
<keras.src.callbacks.History at 0x7b5b44b0e170>
Python
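# Read the image as grayscale; cv2.imread returns None if the file cannot be read
image = cv2.imread('digit.png', cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError("Could not read 'digit.png'")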
# Resize the image to 28x28
image = cv2.resize(image, (28, 28))
# Invert the colors
image = cv2.bitwise_not(image)
# Normalize the image
image = image.astype('float32') / 255
# Reshape the image
image = np.expand_dims(image, axis=0)
image = np.expand_dims(image, axis=-1)
# Predict the digit
prediction = np.argmax(model.predict(image))
print("Predicted Digit:", prediction)
Output:
Predicted Digit: 5
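Conclusion
In this article, we trained a neural network on the MNIST dataset with Keras and used OpenCV to load and preprocess a digit image so that the trained model could predict it.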