
Project Group

Uploaded by

Ajay Jaadi
Copyright
© All Rights Reserved

A REAL TIME RESEARCH PROJECT/

SOCIETAL RELATED PROJECT

REPORT ON
HANDWRITTEN DIGIT RECOGNITION

Submitted by
S. Sai Kiran [22M91A05B8]
S. Vishnu Varshan [22M91A05B9]
T. Venkat Sai [22M91A05C0]
T. Tharun [22M91A05C1]
T. Anjali [22M91A05C2]

BATCH-23

DEPARTMENT
OF
COMPUTER SCIENCE AND ENGINEERING
AURORA’S SCIENTIFIC AND TECHNOLOGICAL INSTITUTE, GHATKESAR-501301
Approved by AICTE, affiliated to JNTUH Hyderabad, Telangana [ 2023-2024]

CERTIFICATE
Certified that the Real Time Research Project / Societal Related Project work entitled “Handwritten
Digit Recognition” is a bona fide work carried out in the II-II semester by S. Sai Kiran
[22M91A05B8], S. Vishnu Varshan [22M91A05B9], T. Venkat Sai [22M91A05C0], T. Tharun
[22M91A05C1] and T. Anjali [22M91A05C2] in partial fulfilment of the requirements for the award of
Bachelor of Technology in Computer Science and Engineering from Jawaharlal Nehru Technological
University Hyderabad.

Coordinator: R. Kavya
Head of the Department: Dr. M. Sridhar

ACKNOWLEDGEMENT

The completion of this Minor Project gives us an opportunity to convey our gratitude to all those
who helped us complete it successfully.
First, we gratefully acknowledge our deep sense of gratitude to the Almighty for the spiritual guidance
and blessings that enabled us to complete the Minor Project. We thank our parents for their
unconditional support throughout our lives.
Our sincere thanks go to the management of Aurora’s Scientific and Technological Institute for
providing the opportunity to carry out the Minor Project in the institution.
We owe our respectful thanks to Dr. R. Mahesh Prabhu (Principal) of Aurora’s Scientific and
Technological Institute for providing all necessary facilities and words of encouragement for the
completion of this Minor Project.
We gratefully acknowledge Dr. M. Sridhar (Head of the Department of Computer Science and
Engineering) for his encouragement and advice during this Minor Project.
Our sincere thanks go to R. Kavya (Real Time Research Project / Societal Related Project
Coordinator) for continuous support throughout this real-time research project.
We would like to thank all the faculty members of the Department of Computer Science and
Engineering and the non-technical staff, who rendered valuable help in making this Minor Project
successful.

S. Sai Kiran [22M91A05B8]


S. Vishnu Varshan [22M91A05B9]
T. Venkat Sai [22M91A05C0]
T. Tharun [22M91A05C1]
T. Anjali [22M91A05C2]
INDEX

ABSTRACT

1. INTRODUCTION

2. METHODOLOGY

3. REQUIREMENTS

4. STEPS TO DESIGN PROJECT

5. ARCHITECTURE

6. ALGORITHM

7. SOURCE CODE

8. CONCLUSION

9. FUTURE ENHANCEMENT

10. BIBLIOGRAPHY
FIGURE INDEX

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6
ABSTRACT

Handwritten digit recognition has become one of the best-known problems in machine learning and
computer vision. Many machine learning techniques have been employed to solve it. This paper
focuses on neural network (NN) approaches. The three most popular NN approaches are the deep
neural network (DNN), the deep belief network (DBN) and the convolutional neural network (CNN).
In this paper, the three approaches are compared and evaluated on several factors, such as accuracy
and performance. Recognition accuracy is not the only criterion in the evaluation, however; other
criteria, such as execution time, are also of interest. Random and standard datasets of handwritten
digits were used for the experiments. The results show that, among the three NN approaches, DNN is
the most accurate, with an accuracy rate of 98.08%. The execution time of DNN, however, is
comparable to that of the other two algorithms. On the other hand, each algorithm has an error rate
of 1-2% because of the similarity in digit shapes, especially for the digit pairs (1,7), (3,5), (3,8),
(8,5) and (6,9).

1. INTRODUCTION

We use the MNIST dataset to implement handwritten digit recognition. To do this, we use a special
type of deep neural network called a convolutional neural network. In the end, we also build a
graphical user interface (GUI) in which you can draw a digit directly and have it recognized straight
away.

Handwritten digit recognition is the ability of a machine to recognize human handwritten digits. It is
not an easy task for the machine because handwritten digits are not perfect: they vary from person to
person, and the same digit can be written in many different styles.

2. METHODOLOGY
We implemented a neural network with one hidden layer of 100 activation units (excluding bias
units). The data is loaded from a .mat file, from which the features (X) and labels (y) are extracted.
The features are then divided by 255 to rescale them into the range [0, 1], which avoids overflow
during computation. The data is split into 60,000 training and 10,000 testing examples. Feedforward
is performed on the training set to compute the hypothesis, and backpropagation is then used to
propagate the output error back through the layers so that the weights can be adjusted to reduce it.
The regularization parameter lambda is set to 0.1 to counter overfitting. The optimizer is run for
70 iterations to find the best-fit model.
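
The feedforward pass and regularized cost described above can be sketched in NumPy as follows. This is a minimal illustration, not the project's actual implementation: the sigmoid activation, the cross-entropy cost, the function names and the random stand-in data are all assumptions, since the report only fixes the hidden-layer size (100), the input scaling and lambda = 0.1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, theta1, theta2):
    """Feedforward pass. X: (m, n_in); theta1: (n_hidden, n_in + 1);
    theta2: (n_out, n_hidden + 1). Column 0 of each theta multiplies the bias unit."""
    m = X.shape[0]
    a1 = np.hstack([np.ones((m, 1)), X])        # add bias unit to the input
    a2 = sigmoid(a1 @ theta1.T)                 # hidden-layer activations
    a2 = np.hstack([np.ones((m, 1)), a2])       # add bias unit to the hidden layer
    return sigmoid(a2 @ theta2.T)               # hypothesis, one column per class

def regularized_cost(X, y_onehot, theta1, theta2, lam=0.1):
    """Cross-entropy cost (assumed) plus an L2 penalty that skips the bias weights."""
    m = X.shape[0]
    h = forward(X, theta1, theta2)
    eps = 1e-12                                 # guards against log(0)
    data_term = -np.sum(y_onehot * np.log(h + eps)
                        + (1 - y_onehot) * np.log(1 - h + eps)) / m
    reg_term = lam / (2 * m) * (np.sum(theta1[:, 1:] ** 2) + np.sum(theta2[:, 1:] ** 2))
    return data_term + reg_term

# Random stand-in data: 5 "images" of 784 pixels, rescaled to [0, 1]
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(5, 784)).astype(np.float64) / 255.0
y = np.eye(10)[rng.integers(0, 10, size=5)]
theta1 = rng.normal(scale=0.05, size=(100, 785))   # 100 hidden units
theta2 = rng.normal(scale=0.05, size=(10, 101))    # 10 output classes

print(forward(X, theta1, theta2).shape)
print(regularized_cost(X, y, theta1, theta2, lam=0.1))
```

Backpropagation would compute the gradient of this cost with respect to theta1 and theta2, which the optimizer then uses in each of its iterations.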

Figure:1

3. REQUIREMENTS

SOFTWARE REQUIREMENTS:

• Operating System: Windows 10 / 11.
• Coding Language: Python 3.8.
• Web Framework: Flask.
• Frontend: HTML, CSS, JavaScript.

HARDWARE REQUIREMENTS:

• System: Intel Pentium or Core i3 processor.
• Hard Disk: 500 GB.
• Monitor: 15" LED.
• Input Devices: Keyboard, Mouse.
• RAM: 4 GB.

Basic knowledge of deep learning with the Keras library, the Tkinter library for GUI building, and
Python programming is required.

Commands to install libraries:

pip install numpy
pip install tensorflow
pip install keras
pip install pillow
pip install pywin32

(pywin32 provides the win32gui module imported by GUI.py.)

4. STEPS TO DESIGN PROJECT

4.1 Import libraries and datasets

4.2 Data processing

4.3 Create the model

4.4 Train the model

4.5 Evaluate the model

4.6 Create GUI to predict digits

4.1. Import libraries and datasets:

At the start of the project, we import all the modules needed to train our model. The Keras library
already ships with several datasets, MNIST among them, so we can import the dataset and start
working on it right away. We call the mnist.load_data() function to get the training data with its
labels and the testing data with its labels.

4.2. Data processing:

The model cannot take the image data directly, so we perform some basic operations to prepare the
data for our neural network. The training data has shape (60000, 28, 28). The CNN model needs one
more dimension for the channel, so we reshape the array to (60000, 28, 28, 1).
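
The reshape can be tried without downloading MNIST. The sketch below uses a small random array as a stand-in for the training images (an assumption for illustration only) and also applies the scaling and one-hot encoding steps that appear in the listing in Section 7.

```python
import numpy as np

# Stand-in for the MNIST training images: 6 random 28x28 grayscale images
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)
labels = np.array([3, 1, 4, 1, 5, 9])

# Add the trailing channel dimension and scale pixel values to [0, 1]
x = x.reshape(x.shape[0], 28, 28, 1).astype("float32") / 255.0

# One-hot encode the labels: row i has a single 1 in column labels[i]
one_hot = np.eye(10, dtype="float32")[labels]

print(x.shape, one_hot.shape)
```

With the real dataset the first dimension would be 60000 instead of 6; the other dimensions are unchanged.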

4.3. Create the model:

It is now time to create the CNN model for this Python-based data science project. Convolutional
layers and pooling layers are the two wheels of a CNN model. CNNs succeed at image classification
because they are well suited to grid-structured data. We use the Adadelta optimizer to compile the
model.
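
To make the two building blocks concrete, here is a small NumPy sketch of a single-channel convolution (without padding) followed by 2x2 max pooling. It is only an illustration under stated assumptions: the hand-written edge filter and the synthetic image are made up, whereas a real CNN learns its filters during training and applies many of them per layer.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (stride 1)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0)

def max_pool2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h = feature_map.shape[0] // 2 * 2
    w = feature_map.shape[1] // 2 * 2
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Synthetic 28x28 image: dark left half, bright right half
image = np.zeros((28, 28))
image[:, 14:] = 1.0

# Illustrative 3x3 filter that responds to dark-to-bright vertical edges
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

features = relu(conv2d_valid(image, kernel))
pooled = max_pool2x2(features)
print(features.shape, pooled.shape)
```

The feature map is non-zero only around the edge in the middle of the image, and pooling halves each spatial dimension while keeping that response.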

4.4. Train the model:

To start training the model, we simply call the model.fit() function of Keras. It takes the training
data, validation data, number of epochs, and batch size as parameters.

Training the model takes some time. After training completes successfully, we save the weights and
the model definition in the ‘mnist.h5’ file.
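
As a rough guide to how much work training involves, the number of weight updates follows from the dataset size, batch size and epoch count. The helper below is a hypothetical illustration; the values 128 and 10 are the batch size and epoch count used in the listing in Section 7.

```python
import math

def training_schedule(num_samples, batch_size, epochs):
    """Return (updates per epoch, total updates), counting the final partial batch."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

steps, total = training_schedule(num_samples=60000, batch_size=128, epochs=10)
print(steps, total)  # 469 4690
```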

4.5. Evaluate the model:

To evaluate how accurately our model works, we use the 10,000 test images in the dataset. The
testing data is not used when training the model, so it is new data for the model. Around 99%
accuracy is achieved on this well-balanced MNIST dataset.
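
Accuracy here means the fraction of test images whose predicted class matches the true label. A minimal sketch of that computation, using made-up probabilities for a three-class toy problem rather than real model output:

```python
import numpy as np

def accuracy(probabilities, one_hot_labels):
    """Fraction of rows where the most probable class equals the true class."""
    predicted = np.argmax(probabilities, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))

# Made-up predictions for four samples and three classes
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.2, 0.5, 0.3]])
labels = np.eye(3)[[1, 0, 2, 2]]

print(accuracy(probs, labels))  # 0.75
```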

4.6. Create GUI to predict digits:

To build an interactive window, we created a new file, GUI.py. In this window you can draw a digit
on a canvas and identify it by clicking a button. The Tkinter library is part of the Python standard
library. Our predict_digit() function takes the picture as input and runs the trained model on it to
predict the digit.

To build the GUI for our app, we created the App class. On the canvas you draw a digit by capturing
mouse events, and on a button click we call the predict_digit() function and show the result.
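
The result shown in the window combines the most probable digit with its probability as a percentage. The helper below is a hypothetical sketch of that formatting step, with a made-up probability vector standing in for the model output.

```python
import numpy as np

def format_prediction(probabilities):
    """Turn a vector of 10 class probabilities into a label such as '2, 75%'."""
    digit = int(np.argmax(probabilities))
    confidence = float(np.max(probabilities))
    return f"{digit}, {int(confidence * 100)}%"

# Made-up model output for one drawing
probs = np.array([0.0, 0.05, 0.75, 0.05, 0.05, 0.1, 0.0, 0.0, 0.0, 0.0])
print(format_prediction(probs))  # 2, 75%
```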

5. ARCHITECTURE

Figure:2

1. Input
The raw input image is fed into the network.
2. Convolutional Layer
This layer applies convolutional filters to the input image to extract features such as edges, textures,
and patterns.
3. Pooling Layer
After convolution, a pooling layer reduces the spatial dimensions of the feature maps, typically using
max pooling, to down-sample the data and reduce computational load.
4. Additional Convolution and Pooling Layers
The process of applying convolutional and pooling layers can be repeated multiple times to extract
higher-level features and further reduce the spatial dimensions.
5. Fully Connected Layers
The output from the final pooling layer is flattened and passed through fully connected layers to
combine the features and learn complex representations.
6. Output
The final fully connected layer produces the output, such as class probabilities in classification tasks.
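
The effect of each stage on the data size can be checked with simple arithmetic. The sketch below applies the layer sizes from the listing in Section 7 (a 5x5 convolution with 32 filters, a 3x3 convolution with 64 filters, 2x2 pooling after each, then dense layers of 128, 64 and 10 units); the helper names are assumptions introduced for illustration.

```python
def conv_out(size, kernel):
    """Output width of a convolution without padding, stride 1."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output width of non-overlapping pooling."""
    return size // pool

def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

size = 28
size = conv_out(size, 5)    # 24
size = pool_out(size, 2)    # 12
size = conv_out(size, 3)    # 10
size = pool_out(size, 2)    # 5
flat = size * size * 64     # values passed to the first dense layer

total = (conv_params(5, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 128) + dense_params(128, 64) + dense_params(64, 10))
print(flat, total)  # 1600 233162
```

Most of the trainable parameters sit in the first fully connected layer, which is typical for small CNNs of this shape.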

6. ALGORITHM

Figure:3

1. Input Handwritten Digit
A handwritten digit is provided as an image input.
2. Pre-Processing
The image is enhanced and standardized through:
• Normalization: Scaling pixel values.
• Binarization: Converting to black and white.
• Noise Reduction: Removing noise.
• Resizing: Adjusting to a standard size (e.g., 28x28 pixels).
3. Feature Extraction
Key features (edges, textures, shapes) are extracted from the pre-processed image for better
classification.
4. Training
A machine learning model, typically a Convolutional Neural Network (CNN), is trained on labelled
datasets to learn the association between features and digit labels.
5. Testing
The trained model is evaluated on a separate dataset to assess its accuracy and generalization ability.
6. Output Generation
The system predicts the digit from the input image and may provide a confidence score for the
prediction.
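
The pre-processing step can be sketched in NumPy as follows. This is an illustration under assumptions, not the project's pipeline: the input is taken to be a square grayscale canvas with dark strokes on a light background, the threshold of 128 is arbitrary, resizing is done by block averaging, and noise reduction is left out. The colour inversion is included because MNIST digits are light strokes on a dark background.

```python
import numpy as np

def preprocess(canvas, threshold=128, out_size=28):
    """canvas: square 2-D uint8 array, dark strokes on a light background.
    Returns an (out_size, out_size) float32 array with values in [0, 1]."""
    inverted = 255 - canvas.astype(np.int32)             # light strokes on dark
    binary = np.where(inverted >= threshold, 255, 0)     # binarization
    block = canvas.shape[0] // out_size                  # pixels per output cell
    side = block * out_size
    cropped = binary[:side, :side]
    # Resize by averaging each block x block cell
    resized = cropped.reshape(out_size, block, out_size, block).mean(axis=(1, 3))
    return (resized / 255.0).astype("float32")           # normalization

# Synthetic 280x280 white canvas with a black vertical stroke, like a "1"
canvas = np.full((280, 280), 255, dtype=np.uint8)
canvas[40:240, 130:150] = 0

digit = preprocess(canvas)
print(digit.shape, digit.min(), digit.max())
```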

7. SOURCE CODE

FILE NAME: “traindigit.py”


from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
import keras.utils

# Training hyperparameters
batch_size = 128
num_classes = 10
epochs = 10

# Load the data, already split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)

# Add the channel dimension expected by Conv2D: (samples, 28, 28, 1)
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)

# Convert class vectors to binary class matrices (one-hot encoding)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Two convolution + pooling stages followed by fully connected layers
model = Sequential()
model.add(Conv2D(32, kernel_size=(5, 5), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

# Adadelta as chosen in Section 4.3. Depending on the Keras version, its default
# learning rate may be small, in which case more epochs may be needed to converge.
model.compile(loss='categorical_crossentropy',
              optimizer='Adadelta',
              metrics=['accuracy'])

# The test set doubles as validation data here
hist = model.fit(x_train, y_train,
                 batch_size=batch_size,
                 epochs=epochs,
                 verbose=1,
                 validation_data=(x_test, y_test))

print("The model has successfully trained")

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

# Save the architecture and weights for use by GUI.py
model.save('mnist.h5')
print("Saving the model as mnist.h5")

RESULT:

Figure:4

FILE NAME: “GUI.py”

from keras.models import load_model
import tkinter as tk
import win32gui  # Windows only, provided by the pywin32 package
from PIL import ImageGrab
import numpy as np

model = load_model('mnist.h5')

def predict_digit(img):
    # Resize image to 28x28 pixels
    img = img.resize((28, 28))
    # Convert RGB to grayscale
    img = img.convert('L')
    img = np.array(img)
    # The canvas has dark strokes on a white background, whereas MNIST digits
    # are light strokes on a dark background, so invert the pixel values
    img = 255 - img
    # Reshape to the model's input shape and normalize to [0, 1]
    img = img.reshape(1, 28, 28, 1)
    img = img / 255.0
    # Predict the class probabilities
    res = model.predict(img)[0]
    return np.argmax(res), max(res)

class App(tk.Tk):
    def __init__(self):
        tk.Tk.__init__(self)
        self.x = self.y = 0

        # Creating elements
        self.canvas = tk.Canvas(self, width=300, height=300, bg="white", cursor="cross")
        self.label = tk.Label(self, text="Draw..", font=("Helvetica", 48))
        self.classify_btn = tk.Button(self, text="Recognize",
                                      command=self.classify_handwriting)
        self.button_clear = tk.Button(self, text="Clear", command=self.clear_all)

        # Grid structure
        self.canvas.grid(row=0, column=0, pady=2, sticky=tk.W)
        self.label.grid(row=0, column=1, pady=2, padx=2)
        self.classify_btn.grid(row=1, column=1, pady=2, padx=2)
        self.button_clear.grid(row=1, column=0, pady=2)

        # Bindings: draw while the left mouse button is held down
        self.canvas.bind("<B1-Motion>", self.draw_lines)

    def clear_all(self):
        self.canvas.delete("all")

    def classify_handwriting(self):
        HWND = self.canvas.winfo_id()  # get the handle of the canvas
        rect = win32gui.GetWindowRect(HWND)  # get the screen coordinates of the canvas
        a, b, c, d = rect
        # Trim 4 pixels on each side to leave out the canvas border
        rect = (a + 4, b + 4, c - 4, d - 4)
        im = ImageGrab.grab(rect)

        digit, acc = predict_digit(im)
        self.label.configure(text=str(digit) + ', ' + str(int(acc * 100)) + '%')

    def draw_lines(self, event):
        # Draw a filled circle of radius r at the current mouse position
        self.x = event.x
        self.y = event.y
        r = 8
        self.canvas.create_oval(self.x - r, self.y - r, self.x + r, self.y + r,
                                fill='black')

app = App()
app.mainloop()

RESULT:

Figure:5

Figure:6

8. CONCLUSION
In this research, we implemented three models for handwritten digit recognition on the MNIST
dataset, based on deep learning and machine learning algorithms. We compared them on their
characteristics to determine which is the most accurate. The support vector machine is one of the
basic classifiers, which is why it is faster than most algorithms, and in this case it gives the highest
training accuracy. Because of its simplicity, however, it cannot classify complex and ambiguous
images as accurately as the MLP and CNN algorithms. We found that CNN gave the most accurate
results for handwritten digit recognition, which leads us to conclude that CNN is well suited to
prediction problems that take image data as input. Next, by comparing the execution time of the
algorithms, we concluded that increasing the number of epochs without changing the configuration of
the algorithm is of no use, because each model has its limits: after a certain number of epochs the
model starts overfitting the dataset and gives biased predictions.
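
One common way to act on this observation is early stopping: watch the validation loss after each epoch and stop once it has not improved for a few epochs. The sketch below is a hypothetical illustration with made-up loss values, not a result from this project.

```python
def best_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss, stopping the
    scan once the loss has failed to improve for `patience` epochs in a row."""
    best_index = 0
    best_loss = float("inf")
    waited = 0
    for index, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_index, waited = loss, index, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_index + 1

# Made-up validation losses: improvement stalls after epoch 4
history = [0.30, 0.12, 0.08, 0.07, 0.09, 0.11, 0.06]
print(best_epoch(history))  # 4
```

A larger patience trades extra training time for the chance of catching a late improvement, such as the one at epoch 7 in this made-up history.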

9. FUTURE ENHANCEMENT

This project can be extended in many directions within machine learning and artificial intelligence.
One can imagine software that recognises the text in a picture and shows it to others, for example a
shop name detector. The project could also be extended to cover all the character sets in the world.
It does not cover the full English alphabet because that would require far more training and testing
data than the present neural network model can handle. Another possibility is a car with AI-based
sensors that reads direction information along the roadside, so that the user only has to give the
destination. All of these enhancements are applications of texture analysis that bring together
advanced image processing, neural network models for training and advanced AI concepts, and they
can be developed further. This project was built entirely with free and openly available resources and
packages, which is also one of its limitations. Funding is important because not all machine learning
libraries and advanced packages are available for free. In addition, many of the platforms on which
developers work, such as Watson Studio or AWS, are mainly paid platforms that host a large number
of ML projects.

10. BIBLIOGRAPHY

[1] Paul C. K. Kwok, "Non-recursive Thinning Algorithms using Chain Codes," Department of
Computer Science, The University of Calgary, Calgary, Canada T2N 1N4.

[2] Louisa Lam and Ching Y. Suen, "A dynamic shape preserving thinning algorithm," Centre for
Pattern Recognition and Machine Intelligence and Department of Computer Science, Concordia
University, 1455 de Maisonneuve Blvd. W., Montréal, Québec H3G 1M8, Canada.

[3] Jimei Yang (Adobe Research), Brian Price (Adobe Research), Scott Cohen (Adobe Research),
Honglak Lee (University of Michigan, Ann Arbor) and Ming-Hsuan Yang (UC Merced), "Object
Contour Detection with a Fully Convolutional Encoder-Decoder Network."

[4] Michael Randolph Maire, "Contour Detection and Image Segmentation," B.S. (California Institute
of Technology) 2003.

[5] Vassili Kovalev, J. Chen, "Three-Dimensional Nonlinear Invisible Boundary Detection," IEEE
Transactions on Image Processing.

[6] Mohit Jain, Minesh Mathew et al., "Unconstrained OCR for Urdu using Deep CNN-RNN Hybrid
Networks."

[7] Michael Nielsen, "Neural Networks and Deep Learning."

[8] Peter Roelants, "How to implement a Neural Network Intermezzo 2" (2016).
