Project Group
REPORT ON
HANDWRITTEN DIGIT RECOGNITION
Submitted by
S. Sai Kiran [22M91A05B8]
S. Vishnu Varshan [22M91A05B9]
T. Venkat Sai [22M91A05C0]
T. Tharun [22M91A05C1]
T. Anjali [22M91A05C2]
BATCH-23
DEPARTMENT
OF
COMPUTER SCIENCE AND ENGINEERING
AURORA’S SCIENTIFIC AND TECHNOLOGICAL INSTITUTE, GHATKESAR-501301
Approved by AICTE, affiliated to JNTUH Hyderabad, Telangana [2023-2024]
CERTIFICATE
Certified that the Real Time Research Project / Societal Related Project work entitled “Handwritten
Digit Recognition” is a bonafide work carried out in the II-II semester by “S. Sai Kiran
[22M91A05B8], S. Vishnu Varshan [22M91A05B9], T. Venkat Sai [22M91A05C0], T. Tharun
[22M91A05C1], T. Anjali [22M91A05C2]” in partial fulfilment of the requirements for the award of
Bachelor of Technology in Computer Science and Engineering from Jawaharlal Nehru Technological
University Hyderabad.
ACKNOWLEDGEMENT
The completion of this Minor Project gives me an opportunity to convey my gratitude to all those
who helped me complete it successfully.
First, I gratefully acknowledge my deep sense of gratitude to the Almighty for the spiritual guidance
and blessings that helped me complete the Minor Project. I thank my parents for their unconditional
support in improving myself throughout my life.
My sincere thanks to the management of Aurora’s Scientific and Technological Institute for providing
this opportunity to carry out the Minor Project in the institution.
I owe my respectful thanks to Dr. R. Mahesh Prabhu, Principal of Aurora’s Scientific and
Technological Institute, for providing all necessary facilities and words of encouragement for the
completion of this Minor Project.
I gratefully acknowledge Dr. M. Sridhar, Head of the Department of Computer Science and
Engineering, for his encouragement and advice during this Minor Project.
My sincere thanks to R. Kavya, Real Time Research Project / Societal Related Project Coordinator,
for the continuous support provided during this real time research project.
I would like to express my thanks to all the faculty members of the Department of Computer Science
and Engineering and the non-technical staff, who have rendered valuable help in making this Minor
Project successful.
CONTENTS
ABSTRACT
1. INTRODUCTION
2. METHODOLOGY
3. REQUIREMENTS
4. STEPS TO DESIGN PROJECT
5. ARCHITECTURE
6. ALGORITHM
7. SOURCE CODE
8. CONCLUSION
9. FUTURE ENHANCEMENT
10. BIBLIOGRAPHY
FIGURE INDEX
FIGURE 1
FIGURE 2
FIGURE 3
FIGURE 4
FIGURE 5
FIGURE 6
ABSTRACT
Handwritten digit recognition has become one of the best-known problems in machine learning and
computer vision, and many machine learning techniques have been employed to solve it. This report
focuses on Neural Network (NN) approaches. The three most popular NN approaches are the deep
neural network (DNN), the deep belief network (DBN) and the convolutional neural network (CNN).
In this report, the three NN approaches are compared and evaluated in terms of several factors such as
accuracy and performance. Recognition accuracy, however, is not the only criterion in the evaluation
process; other criteria, such as execution time, are also considered. Random and standard datasets of
handwritten digits have been used for conducting the experiments. The results show that, among the
three NN approaches, DNN is the most accurate algorithm, with an accuracy rate of 98.08%. However,
the execution time of DNN is comparable with that of the other two algorithms. On the other hand,
each algorithm has an error rate of 1-2% because of the similarity in digit shapes, especially between
the digit pairs (1,7), (3,5), (3,8), (8,5) and (6,9).
1. INTRODUCTION
We use the MNIST dataset for the implementation of handwritten digit recognition. To implement
this, we use a special type of deep neural network called a convolutional neural network. In the end,
we also build a graphical user interface (GUI) in which you can draw a digit and recognize it straight
away.
Handwritten digit recognition is the process of giving a machine the ability to recognize human
handwritten digits. It is not an easy task for the machine because handwritten digits are not perfect:
they vary from person to person, and it is difficult to identify the many different writing styles.
2. METHODOLOGY
We have implemented a neural network with one hidden layer of 100 activation units (excluding bias
units). The data is loaded from a .mat file, and the features (X) and labels (y) are extracted. The
features are then divided by 255 to rescale them into the range [0, 1] and avoid overflow during
computation. The data is split into 60,000 training and 10,000 testing examples. Feedforward is
performed on the training set to calculate the hypothesis, and backpropagation is then carried out to
reduce the error between the layers. The regularization parameter lambda is set to 0.1 to address the
problem of overfitting. The optimizer is run for 70 iterations to find the best-fit model.
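As an illustration, the following minimal sketch mirrors these steps using scikit-learn's MLPClassifier as a stand-in for the hand-written feedforward/backpropagation routine; the file name 'mnist_data.mat' and the variable keys 'X' and 'y' are assumptions, not taken from the report.

# A minimal sketch of the methodology above; MLPClassifier stands in for the
# hand-written feedforward/backpropagation code described in this section.
import numpy as np
from scipy.io import loadmat
from sklearn.neural_network import MLPClassifier

data = loadmat('mnist_data.mat')           # hypothetical .mat file name
X = data['X'].astype('float64') / 255.0    # rescale features into [0, 1]
y = data['y'].ravel()

# 60,000 training and 10,000 testing examples
X_train, y_train = X[:60000], y[:60000]
X_test, y_test = X[60000:], y[60000:]

# One hidden layer with 100 units, an L2 penalty analogous to lambda = 0.1,
# and 70 optimizer iterations
clf = MLPClassifier(hidden_layer_sizes=(100,), activation='logistic',
                    solver='lbfgs', alpha=0.1, max_iter=70)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))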
Figure:1
3. REQUIREMENTS
SOFTWARE REQUIREMENTS:
HARDWARE REQUIREMENTS:
Basic knowledge of deep learning with the Keras library, the Tkinter library for GUI building, and
Python programming is required.
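A quick way to verify that the required software is in place is to import the libraries used later in the source code; the package list below is inferred from that code, not stated explicitly in the report.

# A minimal sketch to check that the libraries used in this report are available;
# missing packages can be installed with pip (e.g. "pip install keras pillow pywin32").
import tkinter              # GUI toolkit, ships with the Python standard library
import numpy                # array handling
import keras                # deep-learning library used for the CNN
from PIL import ImageGrab   # used in GUI.py to capture the canvas
import win32gui             # used in GUI.py for the canvas window handle (Windows only)

print("All required libraries are available.")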
4. STEPS TO DESIGN PROJECT
The training of the model takes some time. After the model is trained successfully, we can save the
weights and the model definition in the ‘mnist.h5’ file.
To build an interactive window, we have created a new file, GUI.py. In this file you can draw digits
on a canvas, and by clicking a button you can identify the digit. The Tkinter library is part of the
Python standard library. Our predict_digit() method takes the picture as input and then uses the trained
model to predict the digit.
After that, to build the GUI for our app, we have created the App class. On the GUI canvas you can
draw a digit by capturing the mouse events, and with a button click we call the predict_digit() function
and show the results.
5. ARCHITECTURE
Figure:2
1. Input
The raw input image is fed into the network.
2. Convolutional Layer
This layer applies convolutional filters to the input image to extract features such as edges, textures,
and patterns.
3. Pooling Layer
After convolution, a pooling layer reduces the spatial dimensions of the feature maps, typically using
max pooling, to down-sample the data and reduce computational load.
4. Additional Convolution and Pooling Layers
The process of applying convolutional and pooling layers can be repeated multiple times to extract
higher-level features and further reduce the spatial dimensions.
5. Fully Connected Layers
The output from the final pooling layer is flattened and passed through fully connected layers to
combine the features and learn complex representations.
6. Output
The final fully connected layer produces the output, such as class probabilities in classification tasks.
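As a concrete illustration of this layer sequence, the sketch below builds the same kind of stack in Keras and prints the shape of the data after each stage; the filter counts (32 and 64) follow the source-code listing in Section 7, and the commented shapes assume a 28x28 grayscale input.

# A sketch of the architecture described above (filter counts taken from Section 7)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1)),  # -> 24x24x32
    MaxPooling2D((2, 2)),                                            # -> 12x12x32
    Conv2D(64, (3, 3), activation='relu'),                           # -> 10x10x64
    MaxPooling2D((2, 2)),                                            # -> 5x5x64
    Flatten(),                                                       # -> 1600 features
    Dense(128, activation='relu'),
    Dense(10, activation='softmax'),                                 # class probabilities
])
model.summary()   # prints the output shape of every layer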
6. ALGORITHM
Figure:3
7. SOURCE CODE
# Imports and MNIST loading reconstructed so that the listing is self-contained
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D

# Load the MNIST dataset and reshape the images to (samples, 28, 28, 1)
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)

print(x_train.shape, y_train.shape)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# Convert labels to one-hot vectors (required by categorical_crossentropy)
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
batch_size = 128
num_classes = 10
epochs = 10
model = Sequential()
model.add(Conv2D(32, kernel_size=(5, 5), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='Adadelta',
              metrics=['accuracy'])
hist = model.fit(x_train, y_train,
                 batch_size=batch_size,
                 epochs=epochs,
                 verbose=1,
                 validation_data=(x_test, y_test))
model.save('mnist.h5')
print("Saving the model as mnist.h5")
RESULT:
Figure:4
FILE NAME: “GUI.py”
# Imports reconstructed so that GUI.py is self-contained
# (win32gui and ImageGrab require Windows)
from keras.models import load_model
from tkinter import *
import tkinter as tk
import win32gui
from PIL import ImageGrab
import numpy as np

model = load_model('mnist.h5')

def predict_digit(img):
    # Resize image to 28x28 pixels
    img = img.resize((28, 28))
    # Convert rgb to grayscale
    img = img.convert('L')
    img = np.array(img)
    # Reshaping to support our model input and normalizing
    img = img.reshape(1, 28, 28, 1)
    img = img / 255.0
    # Predicting the class
    res = model.predict([img])[0]
    return np.argmax(res), max(res)

class App(tk.Tk):
    def __init__(self):
        tk.Tk.__init__(self)
        self.x = self.y = 0
        # Creating elements
        self.canvas = tk.Canvas(self, width=300, height=300, bg="white", cursor="cross")
        self.label = tk.Label(self, text="Draw..", font=("Helvetica", 48))
        self.classify_btn = tk.Button(self, text="Recognize",
                                      command=self.classify_handwriting)
        self.button_clear = tk.Button(self, text="Clear", command=self.clear_all)
        # Grid structure
        self.canvas.grid(row=0, column=0, pady=2, sticky=W)
        self.label.grid(row=0, column=1, pady=2, padx=2)
        self.classify_btn.grid(row=1, column=1, pady=2, padx=2)
        self.button_clear.grid(row=1, column=0, pady=2)
        # Bindings
        self.canvas.bind("<B1-Motion>", self.draw_lines)

    def clear_all(self):
        self.canvas.delete("all")

    def draw_lines(self, event):
        # Draw a small filled circle at the cursor position
        # (handler reconstructed; it was missing from the original listing)
        self.x = event.x
        self.y = event.y
        r = 8
        self.canvas.create_oval(self.x - r, self.y - r, self.x + r, self.y + r,
                                fill='black')

    def classify_handwriting(self):
        HWND = self.canvas.winfo_id()        # get the handle of the canvas
        rect = win32gui.GetWindowRect(HWND)  # get the coordinates of the canvas
        a, b, c, d = rect
        rect = (a + 4, b + 4, c - 4, d - 4)
        im = ImageGrab.grab(rect)
        # Predict the digit and show the result on the label
        # (these two lines were missing from the original listing)
        digit, acc = predict_digit(im)
        self.label.configure(text=str(digit) + ', ' + str(int(acc * 100)) + '%')

app = App()
mainloop()
RESULT:
Figure:5
Figure:6
8. CONCLUSION
In this project, we have implemented three models for handwritten digit recognition using the MNIST
dataset, based on deep learning and machine learning algorithms, and compared them on their
characteristics to identify the most accurate model among them. The support vector machine is one of
the most basic classifiers, which is why it is faster than most algorithms and, in this case, gives the
maximum training accuracy rate; however, due to its simplicity, it cannot classify complex and
ambiguous images as accurately as the MLP and CNN algorithms. We have found that CNN gave the
most accurate results for handwritten digit recognition, which leads us to conclude that CNN is best
suited for prediction problems that take image data as input. Next, by comparing the execution time
of the algorithms, we have concluded that increasing the number of epochs without changing the
configuration of the algorithm is not useful because of the limitations of a given model; we have
noticed that after a certain number of epochs the model starts overfitting the dataset and gives us
biased predictions.
9. FUTURE ENHANCEMENT
This project can be enhanced further within the broad field of machine learning and artificial
intelligence. One can imagine software that recognizes the text in a picture and presents it to others,
for example a shop-name detector, or the project can be extended to cover all of the character sets in
the world. This project has not attempted the full English alphabet, because the much larger number
of training and testing samples required would be beyond the capacity of this neural network model.
One can also think of an AI-based car sensor that follows directions along the roadside, where the
user only gives the destination. All of these enhancements are applications of texture analysis, in
which advanced image processing, neural network models for training and advanced AI concepts
come together, and they can be modelled further. As this project was built entirely with free and
openly available resources and packages, this is also a limitation of the project. Funding is important
because not all machine learning libraries and advanced packages are available for free; most of the
visualization platforms on which developers work, such as Watson Studio or AWS, are mainly paid
platforms where many ML projects are carried out.
10. BIBLIOGRAPHY
[1] Paul C. K. Mwok, "Non-recursive Thinning Algorithms Using Chain Codes," Department of
Computer Science, University of Calgary, Calgary, Canada.
[2] Louisa Lam and Ching Y. Suen, "A Dynamic Shape Preserving Thinning Algorithm," Centre for
Pattern Recognition and Machine Intelligence and Department of Computer Science, Concordia
University, Montréal, Québec, Canada.
[3] Jimei Yang, Brian Price, Scott Cohen, Honglak Lee and Ming-Hsuan Yang, "Object Contour
Detection with a Fully Convolutional Encoder-Decoder Network," Adobe Research, University of
Michigan and UC Merced.
[4] Michael Randolph Maire, "Contour Detection and Image Segmentation," B.S., California Institute
of Technology, 2003.
[5] Vassili Kovalev and J. Chen, "Three-Dimensional Nonlinear Invisible Boundary Detection," IEEE
Transactions on Image Processing.
[6] Mohit Jain, Minesh Mathew et al., "Unconstrained OCR for Urdu Using Deep CNN-RNN Hybrid
Networks."