A Comparative Study On Handwriting Digit Recognition Using Neural Networks
Abstract—The handwritten digit recognition problem has become one of the most famous problems in machine learning and computer vision applications. Many machine learning techniques have been employed to solve it. This paper focuses on Neural Network (NN) approaches. The three most famous NN approaches are the deep neural network (DNN), the deep belief network (DBN) and the convolutional neural network (CNN). In this paper, the three NN approaches are compared and evaluated in terms of several factors, such as accuracy and performance. Recognition accuracy and performance, however, are not the only criteria in the evaluation process; other interesting criteria, such as execution time, are also considered. Random and standard datasets of handwritten digits have been used for conducting the experiments. The results show that among the three NN approaches, DNN is the most accurate algorithm, with an accuracy rate of 98.08%. However, the execution time of DNN is comparable with that of the other two algorithms. On the other hand, each algorithm has an error rate of 1–2% because of the similarity in digit shapes, especially between the digits (1,7), (3,5), (3,8), (8,5) and (6,9).

I. INTRODUCTION
Most current classification and regression machine learning methods are shallow learning algorithms [4]. It is difficult for them to represent complex functions effectively, and their generalization ability is limited for complex classification problems [5, 6]. Deep learning is a multilayer neural network learning algorithm which has emerged in recent years. Applications of deep learning to various problems have been the subject of a number of recent studies, ranging from image classification and speech recognition to audio classification [5, 7-9]. It has brought a new wave to machine learning, making artificial intelligence and human-computer interaction advance with big strides. Deep learning algorithms are highly efficient in image recognition tasks such as MNIST digit recognition [10].
In this paper, we apply deep learning algorithms to handwritten digit recognition and explore the three mainstream algorithms of deep learning: the Convolutional Neural Network (CNN), the Deep Belief Network (DBN) and the Deep Neural Network (DNN) [4].
Fig. 3. The structure of DNN [7].

D. Neural Network Toolbox in MATLAB (simulation)
Neural Network Toolbox provides algorithms, functions, and apps to create, train, visualize, and simulate neural networks [16, 17]. The toolbox includes convolutional neural network and autoencoder deep learning algorithms for image classification and feature learning tasks, accessed through the MATLAB programming language [17, 18].
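As a brief illustration, the sketch below shows one way the toolbox's feature-learning functionality can be used. This is a minimal sketch under our own assumptions: the training matrix Xtrain and the choice of 100 hidden units are illustrative, not taken from the paper.

    % Toolbox sketch: unsupervised feature learning with an autoencoder.
    % Assumed: Xtrain is a matrix whose columns are training samples.
    autoenc  = trainAutoencoder(Xtrain, 100);  % autoencoder with 100 hidden units
    features = encode(autoenc, Xtrain);        % extract the learned features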
IV. METHODOLOGY
As shown in Fig. 4, our proposed system can be divided into five main steps: preprocessing, segmentation, feature extraction, training, and classification & recognition. The stages are:

A. Preprocessing
At this stage, because all images in the database are clean and without noise, no noise reduction technique is required. In a real system, however, we would need to remove noise from the images: optical noise can be present in any document, and in handwritten documents especially, the character shapes may not always be unique. Hence, preprocessing is mandatory. We first apply an erosion with a 3 x 3 structuring element, which eliminates one-bit errors and gives a smooth edge. Then the characters are dilated with a 2 x 2 structuring element.
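A minimal MATLAB sketch of this step is given below. The input file name digits.png and the initial binarization are our assumptions, since erosion and dilation are normally applied to binary images.

    % Preprocessing sketch: 3x3 erosion followed by 2x2 dilation.
    I = imbinarize(rgb2gray(imread('digits.png')));  % read and binarize (assumed RGB input)
    I = imerode(I, strel('square', 3));              % remove one-bit noise, smooth the edges
    I = imdilate(I, strel('square', 2));             % restore the character thickness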
B. Segmentation
After the preprocessing step, an image of a sequence of digits is decomposed into sub-images of individual digits. The preprocessed input image is segmented into isolated digits by assigning a number to each digit using a labeling process. This labeling provides information about the number of digits in the image. Each individual digit is uniformly resized to 100 x 70 pixels for the classification and recognition stage [16].
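The labeling and resizing described above can be sketched with MATLAB's connected-component functions. The binary image I is assumed to come from the preprocessing step.

    % Segmentation sketch: label each digit, crop it, and resize it to 100 x 70.
    [L, n] = bwlabel(I);                     % n = number of digits in the image
    props  = regionprops(L, 'BoundingBox');
    digits = cell(1, n);
    for k = 1:n
        sub       = imcrop(I, props(k).BoundingBox);  % isolate the k-th digit
        digits{k} = imresize(sub, [100 70]);          % uniform size for later stages
    end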
C. Feature Extraction
After the segmentation step, each segmented image is given as input to the feature extraction module. The statistical features of the histogram, the mean and the standard deviation, are extracted from the images.
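A sketch of this computation follows, using the digits cell array from the segmentation sketch. Representing each digit by only these two statistics is a simplification; the paper's full feature vector is zone-based, as described in the classification stage.

    % Feature extraction sketch: histogram mean and standard deviation per digit.
    features = zeros(2, numel(digits));
    for k = 1:numel(digits)
        g = im2uint8(digits{k});        % convert back to intensity values
        features(1, k) = mean2(g);      % mean of the intensity distribution
        features(2, k) = std2(g);       % standard deviation of the distribution
    end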
D. Training
After the feature extraction step, each of the proposed algorithms (CNN, DBN, DNN) is trained separately with the training images.
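As one example, the CNN branch can be trained in MATLAB as sketched below. The layer sizes and training options are illustrative assumptions; the paper does not list its exact CNN architecture.

    % Training sketch for the CNN branch.
    % Assumed: XTrain is a 100x70x1xN image array, YTrain a categorical label vector.
    layers = [ ...
        imageInputLayer([100 70 1])
        convolution2dLayer(5, 20)          % 20 filters of size 5x5
        reluLayer
        maxPooling2dLayer(2, 'Stride', 2)
        fullyConnectedLayer(10)            % ten digit classes
        softmaxLayer
        classificationLayer];
    opts = trainingOptions('sgdm', 'MaxEpochs', 10);
    net  = trainNetwork(XTrain, YTrain, layers, opts);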
E. Classification & Recognition
After the training step, the classification & recognition stage is the decision-making part of the recognition system, and it uses the features extracted in the previous stage. A feed-forward back-propagation neural network having two hidden layers with an architecture of 54-100-100-38 is used to perform the classification. The hidden layers use the log-sigmoid activation function, and the output layer is a competitive layer, as one of the digits is to be identified. The feature vector is denoted as X = (f1, f2, ..., fd), where f denotes a feature and d is the number of zones into which each digit is divided. The number of input neurons is determined by the length d of the feature vector, and the total number of digits n determines the number of neurons in the output layer. The number of neurons in the hidden layers is obtained by trial and error [16]. The most compact network is chosen, as shown in Fig. 4. Handwritten digits are recognized using the three algorithms, each of which recognizes the image in its own way. After the training process, the recognized digits are compared by an expert to assess the accuracy of the results; the precision, the performance cost, and the execution time are also compared.
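A sketch of such a network in MATLAB is given below. Selecting the strongest output with max stands in for the competitive output layer, and the feature matrix X and target matrix T are assumed to come from the previous stages.

    % Classification sketch: feed-forward back-propagation network, two hidden layers.
    net = feedforwardnet([100 100], 'traingd');  % gradient-descent back-propagation
    net.layers{1}.transferFcn = 'logsig';        % log-sigmoid hidden activations
    net.layers{2}.transferFcn = 'logsig';
    net = train(net, X, T);                      % X: d-by-N features, T: n-by-N targets
    scores = net(Xtest);                         % simulate the network on test features
    [~, digit] = max(scores);                    % competitive decision: strongest output wins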
Fig. 4. A block diagram of the proposed model.

V. EXPERIMENTAL
A. Dataset
1) Standard Dataset
The primary dataset used in training the classifier is the MNIST dataset published by Yann LeCun of the Courant Institute at New York University. The dataset contains a labeled training set of 60,000 samples and a labeled test set of 10,000 samples [21]. The handwritten data samples come from approximately 250 different writers, and completely different writers were sampled for the test set; that is, there is no intersection between the writers of the test set and the training set [10, 21, 22].
The MNIST dataset is available as binary files stored in the IDX file format, and a visualization of what the numbers look like can be seen in Fig. 5. A much bigger visualization of the dataset is also available in the appendix. The preprocessing done was to read the data into an image matrix and perform a basic normalization procedure for each image, whereby each pixel value was divided by the maximum pixel value of that sample image [10, 21, 22].
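This per-image normalization can be sketched as follows, assuming the MNIST images have already been read from the IDX files into a 28 x 28 x N array named images.

    % Normalization sketch: divide each image by its own maximum pixel value.
    X = double(images);                 % images: 28x28xN array from the IDX files
    for k = 1:size(X, 3)
        X(:,:,k) = X(:,:,k) ./ max(max(X(:,:,k)));
    end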
Fig. 5. A sample of the MNIST dataset [10, 21, 22].

2) Random Dataset
This random dataset contains 85 different digits made by the authors and collected from different resources. Fig. 6 shows a sample of the random dataset.

B. Experimental Environment
• Windows 10 operating system as the test platform.
• CPU: Intel Core i7-6500U, with dual cores running at 2.4 GHz.
• RAM: 16 GB.
• GPU: Nvidia GTX 970, with 1664 CUDA cores and 4 GB of GDDR5 memory.
C. Experiment Evaluation Factors
The three algorithms are evaluated, in accordance with the proposed model, according to the following factors (a measurement sketch is given after the list):
• Accuracy
• Performance
• Execution Time
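A sketch of how accuracy and execution time can be measured is shown below. The predictDigits wrapper around the trained model is hypothetical, since the paper does not show its measurement code.

    % Evaluation sketch: accuracy and execution time for one algorithm.
    tic;
    pred = predictDigits(net, Xtest);   % hypothetical wrapper around the trained model
    execTimeMs = toc * 1000;            % execution time in milliseconds
    accuracy = 100 * sum(pred(:) == Ytest(:)) / numel(Ytest);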
VI. RESULTS
A. Accuracy
The accuracy results obtained on the standard dataset are shown in Fig. 7.
B. Random Dataset
To prove the efficiency and validity of the results obtained with the standard dataset, a random dataset is used. Table I shows the recognition results of some experiments conducted on the random dataset. The results show significant convergence in the accuracy of the DNN algorithm compared to the previous results shown in Fig. 7. This also affects the performance ratio and the execution time, because confusion happens during recognition due to the similarity between two digits, such as (3) and (8). This is due to the algorithms not having been trained on such forms of the digits (7), (3) and (5).
TABLE I. SOME EXPERIMENT RESULTS FOR THE RANDOM DATASET.

Digit  Algorithm  Accuracy  Performance  Execution Time
1      CNN        98.45%    95.11%       37 ms
       DBN        97.01%    93.38%       45 ms
       DNN        98.60%    97.13%       31 ms
7      CNN        97.68%    97.61%       36 ms
       DBN        97.04%    97.04%       51 ms
       DNN        98.78%    97.78%       34 ms
3      CNN        93.98%    91.32%       71 ms
       DBN        90.61%    90.01%       77 ms
       DNN        95.43%    92.65%       65 ms
5      CNN        95.99%    95.98%       44 ms
       DBN        95.78%    93.04%       77 ms
       DNN        96.41%    95.63%       49 ms
8      CNN        96.36%    96.35%       79 ms
       DBN        96.10%    94.43%       73 ms
       DNN        97.33%    96.47%       68 ms
6      CNN        98.41%    97.91%       57 ms
       DBN        98.79%    97.87%       51 ms
       DNN        99.48%    98.10%       53 ms
9      CNN        97.93%    97.65%       67 ms
       DBN        97.42%    96.32%       62 ms
       DNN        98.29%    97.92%       59 ms
VII. CONCLUSION
In this paper, we compared three neural-network-based recognition algorithms to determine the best algorithm in terms of several factors, such as accuracy and performance. Other criteria, such as execution time, have also been taken into consideration. Random and standard datasets of handwritten digits have been used to evaluate the algorithms. The results showed that DNN is the best algorithm in terms of accuracy and performance. The CNN and DNN algorithms are almost equal in terms of accuracy; the DNN algorithm, however, was better than CNN and DBN in terms of execution time. Even when digits are recognized correctly overall, a margin of error remains due to similarities between digits such as (1,7), (3,5), (3,8), (8,5) and (6,9).

VIII. FUTURE WORK
Future efforts can study the optimization of deep learning and apply it to more complex image recognition problems. It would also be interesting to look at building a real-time classifier and a related application (mobile and/or desktop) that takes in user input, immediately performs recognition, and converts the input to a digit.

REFERENCES
[1] Ting, R., S. Chun-lin, and D. Jian, Handwritten character recognition using principal component analysis. MINI-MICRO Systems, 2005. 26(2): p. 289-292.
[2] Walid, R. and A. Lasfar. Handwritten digit recognition using sparse deep architectures. in Intelligent Systems: Theories and Applications (SITA-14), 2014 9th International Conference on. 2014. IEEE.
[3] Li, Z., et al. Handwritten digit recognition via active belief decision trees. in Control Conference (CCC), 2016 35th Chinese. 2016. IEEE.
[4] Schmidhuber, J., Deep learning in neural networks: An overview. Neural Networks, 2015. 61: p. 85-117.
[5] LeCun, Y., Y. Bengio, and G. Hinton, Deep learning. Nature, 2015. 521(7553): p. 436-444.
[6] Hinton, G.E. and R.R. Salakhutdinov, Reducing the dimensionality of data with neural networks. Science, 2006. 313(5786): p. 504-507.
[7] Yu, K., et al., Deep learning: yesterday, today, and tomorrow. Journal of Computer Research and Development, 2013. 50(9): p. 1799-1804.
[8] Sun, Z.-J., et al., Overview of deep learning. Jisuanji Yingyong Yanjiu, 2012. 29(8): p. 2806-2810.
[9] Bengio, Y., Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2009. 2(1): p. 1-127.
[10] LeCun, Y., C. Cortes, and C.J. Burges, The MNIST database of handwritten digits. 1998.
[11] Bouchain, D., Character recognition using convolutional neural networks. Institute for Neural Information Processing, 2006/2007.
[12] Hinton, G.E., S. Osindero, and Y.-W. Teh, A fast learning algorithm for deep belief nets. Neural Computation, 2006. 18(7): p. 1527-1554.
[13] Wu, M. and L. Chen. Image recognition based on deep learning. in Chinese Automation Congress (CAC), 2015. 2015. IEEE.
[14] Fischer, A. and C. Igel. An introduction to restricted Boltzmann machines. in Iberoamerican Congress on Pattern Recognition. 2012. Springer.
[15] Ciregan, D., U. Meier, and J. Schmidhuber. Multi-column deep neural networks for image classification. in Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. 2012. IEEE.
[16] Lapuschkin, S., et al., The LRP toolbox for artificial neural networks. Journal of Machine Learning Research, 2016. 17(114): p. 1-5.
[17] Neural Network Toolbox - MATLAB & Simulink. Available from: https://www.mathworks.com/products/neural-network.html.
[18] Beale, M.H., M.T. Hagan, and H.B. Demuth, Neural Network Toolbox user's guide, R2012a. The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-2098, www.mathworks.com, 2012.
[19] Kaensar, C. A comparative study on handwriting digit recognition classifier using neural network, support vector machine and k-nearest neighbor. in The 9th International Conference on Computing and Information Technology (IC2IT2013). 2013. Springer.
[20] Saabni, R. Recognizing handwritten single digits and digit strings using deep architecture of neural networks. in Artificial Intelligence and Pattern Recognition (AIPR), International Conference on. 2016. IEEE.
[21] LeCun, Y., C. Cortes, and C.J. Burges, MNIST handwritten digit database. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2010.
[22] MNIST handwritten digit database, Yann LeCun, Corinna Cortes and Chris Burges. Available from: http://yann.lecun.com/exdb/mnist/index.html.