
HANDWRITING CHARACTER RECOGNITION USING CONVOLUTIONAL NEURAL NETWORK

I Kadek Supadma #1, I Ketut Gede Darma Putra #2
# Department of Information Technology, Udayana University, Jimbaran, Bali, 80361, Indonesia
E-mail: [email protected]; [email protected]

Abstract— Handwriting character recognition is a system created to recognize character patterns from handwriting using the CNN (Convolutional Neural Network) method. This system aims to apply the Deep Learning approach that is currently popular, for example in its application to face recognition. In this handwriting recognition system there are two important processes. The first is the process of registering images of handwriting and carrying out the training process on those images. The second is the recognition stage. In the recognition phase, the system uses the image model trained at the registration stage to recognize new character patterns. This study discusses how computers recognize digital image patterns in the form of handwritten character recognition using the Convolutional Neural Network method. The best image pre-processing in this test is the grayscale method using the OpenCV library. The dataset used is taken from the NIST database. From the experiments carried out, the system was able to recognize characters with an accuracy of 95.69%.

Keywords— character recognition; handwriting; convolutional neural network; image processing; dataset; FAR; FRR; recognition accuracy.

I. INTRODUCTION

Handwriting recognition is one of the things that is quite difficult to do but very important[1]. Handwriting recognition is a form of pattern recognition, and research in this field has developed over a long period of time. This research was conducted because more and more handwriting is used in everyday life, such as the identification of important documents, proof of authorization in the banking world, and so on. The problems of handwriting recognition include the recognition of characters (letters)[2], the recognition of numbers, gesture recognition, signature recognition, and so forth. Solving the handwriting problem is closely linked to pattern recognition, which aims to generate and choose a pattern that can be used for identification.

Deep learning is one of the methods used for character and handwriting recognition[3]. In addition, deep learning can also be used to identify objects, such as chairs, tables, watches, cars, motorbikes[4] and other objects. Previous studies used several other methods for handwritten character recognition, but the development of Deep Learning methods is making them increasingly used for recognition, especially for the very popular face recognition[5]. Deep learning has also been widely used in image classification. Besides object recognition, deep learning has also been developed to predict diseases, such as Parkinson's disease[6]. Deep Learning has several methods, such as ANN (Artificial Neural Network), CNN (Convolutional Neural Network) and others.

Chandra Kusuma Dewa, Amanda Lailatul Fadhilah and Afiahayati researched Javanese character recognition using the convolutional neural network method. From these experiments, the system did not fall below an accuracy of 90%[7].

Deepa M., Deepa R., Meena R. and Nandhini R. examined Tamil handwritten text recognition using Convolutional Neural Networks[8]. Their results show that the proposed system yields good recognition rates which are comparable to those of feature extraction-based schemes for handwritten character recognition. Thus, this research was conducted using the CNN (Convolutional Neural Network) method for handwriting character recognition on the NIST dataset, and the results were quite good, with a good level of accuracy.

II. MATERIAL AND METHODS

A. Dataset
The sample data used as the dataset in this experiment are samples of different characters from the NIST database. The NIST database is one of the databases most often used in research on the recognition of lower-case characters; it is even set as a standard for learning computer vision[9].

Fig. 1 Sample characters from the NIST database

Fig. 1 shows sample character images taken from the NIST database online. The sample data are sized 128 x 128 x 3, so they need processing before being trained with the CNN.

B. Pre-processing
Image processing in this study uses the OpenCV library[10]. In previous studies, the sample character letters were filtered using grayscale conversion and binary image conversion[9]. However, this time the grayscale, Sobel edge detection[11], Canny edge detection[12] and Prewitt methods are applied, and we then test the accuracy of the four methods.

Fig. 2 Example images of the pre-processing results

Fig. 2 shows images that have been pre-processed. There are four types of pre-processed datasets, namely grayscale, Sobel edge detection, Canny edge detection and Prewitt. Each of the four datasets will be trained, and the training accuracy and testing accuracy will be measured to find out which dataset gives the highest accuracy.
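A minimal sketch of how these four pre-processing variants could be produced with OpenCV is given below. The paper does not publish its code, so the helper name preprocess_variants, the Prewitt kernels, the Canny thresholds and the target size are assumptions made only for illustration.

import cv2
import numpy as np

def preprocess_variants(path, size=(28, 28)):
    """Return the four pre-processing variants discussed above (hypothetical helper)."""
    img = cv2.imread(path)                          # 128 x 128 x 3 NIST sample
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # grayscale conversion
    gray = cv2.resize(gray, size)                   # resize to a smaller CNN input (assumed size)

    # Sobel edge detection: combine horizontal and vertical gradients
    sx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    sy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    sobel = cv2.convertScaleAbs(cv2.magnitude(sx, sy))

    # Canny edge detection (thresholds are illustrative)
    canny = cv2.Canny(gray, 100, 200)

    # Prewitt edge detection via explicit kernels (OpenCV has no built-in Prewitt)
    kx = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=np.float32)
    ky = np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]], dtype=np.float32)
    px = cv2.filter2D(gray, cv2.CV_64F, kx)
    py = cv2.filter2D(gray, cv2.CV_64F, ky)
    prewitt = cv2.convertScaleAbs(cv2.magnitude(px, py))

    return {"grayscale": gray, "sobel": sobel, "canny": canny, "prewitt": prewitt}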
C. CNN Architecture
Convolutional Neural Network (CNN) is a development of the Multilayer Perceptron (MLP) designed to process two-dimensional data. CNN belongs to the class of Deep Neural Networks because of its high network depth, and it is widely applied to image data. In the case of image classification, MLP is not suitable because it does not store the spatial information of the image data and considers each pixel an independent feature, which produces poor results. Automatic handwriting recognition algorithms today are getting better at recognizing handwritten characters[2], and CNNs are among the most suitable architectures for this task. Lately, the latest CNN work focuses on computer vision problems such as the recognition of 3D objects, natural images and traffic signs, image denoising[13] and image segmentation. The convolutional architecture also appears to be useful when an unsupervised learning algorithm is applied to image data.

Fig. 3 Convolutional neural network architecture used in the handwriting character recognition system

Fig. 3 displays the architecture used in the system's recognition of handwritten characters, which constitutes the basic CNN architecture. The convolution layers use a 3x3 kernel size and the pooling layers use a 2x2 kernel size.

D. Proposed Method
In this study several methods are applied in using Convolutional Neural Networks, which are one of the dynamic methods for classifying objects in an image. The method proposed in this experiment starts with data processing. Pre-processing of the dataset uses cropping and the grayscale method with the OpenCV library. The dataset augmentation settings use the combination zoom_range = 0.1, rescale = 1./255, rotation_range = 10, shear_range = 0.1 and validation_split = 0.1. The model used in this experiment is the basic CNN architecture: the convolution layers use 3x3 kernel sizes, the pooling layers are of 2x2 size, followed by a Dense layer of 128 units with ReLU activation and dropout (0.5).

In this experiment, there are two common processes applied to the CNN: the training process and the recognition process. The training process registers the images and conducts training to obtain image features. The recognition process matches the features of a new image with the training results stored in the database. A general description of the two processes is shown in Figure 4 and Figure 5.

Fig. 4 Overview of the training process

Fig. 4 shows the general training process for the images used as the dataset. The registration phase begins by entering a handwritten character image. Pre-processing is then carried out, namely resizing the image to a smaller size and converting its colors to grayscale. After pre-processing is complete, the process proceeds to feature extraction. In this process, the image is convolved to obtain image features. After the convolution process, the feature maps obtained are passed to the pooling layer to reduce the dimensions of each feature map. The final step of feature extraction is the flatten process, which converts the feature maps to a 1-dimensional format. After feature extraction is complete, the features in 1-dimensional format are stored in a database.
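The augmentation settings listed in Section II.D match the parameter names of the Keras ImageDataGenerator API, so a minimal sketch under that assumption could look as follows. The dataset directory layout, generator names and target size are illustrative and not taken from the paper.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings from Section II.D; paths and target size are assumptions.
datagen = ImageDataGenerator(
    rescale=1. / 255,
    zoom_range=0.1,
    rotation_range=10,
    shear_range=0.1,
    validation_split=0.1,
)

train_gen = datagen.flow_from_directory(
    "dataset/characters",        # hypothetical path: one sub-folder per character class
    target_size=(28, 28),        # assumed input size after resizing
    color_mode="grayscale",
    class_mode="categorical",
    subset="training",
)

val_gen = datagen.flow_from_directory(
    "dataset/characters",
    target_size=(28, 28),
    color_mode="grayscale",
    class_mode="categorical",
    subset="validation",
)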

Fig. 5 General description of the recognition process


Fig. 5 shows the general process of recognizing the characters of an image. The recognition stage starts by inserting an image to be tested. After the image is entered, it first goes through the pre-processing step. Pre-processing at the recognition stage is the same as at the registration stage: the image is converted to grayscale. It then proceeds to the feature extraction process. As in the registration stage, the image is first convolved to extract its features, then pooled to reduce the dimensions of each feature map obtained, and finally flattened to convert the feature maps to a 1-dimensional format. After feature extraction ends, the classification process follows: the feature maps obtained are matched against the existing database and weighed against the accuracy of the existing model, so that the final result of the recognition is either correct or not.

III. RESULT AND DISCUSSION

A. Discussion and Testing
The tests are carried out to find out some parameters of the CNN method. The first test checks and compares the results of the dataset with the Sobel, Prewitt, Canny and grayscale filters. The second tests the dataset with sample counts of 5, 10, 15, 20 and 25, the purpose being to find out what the results will be when the number of samples is small or large. The third examines the number of epochs used and how they affect the accuracy on the training data. The fourth tests the validation split values used. The fifth tests the threshold together with the FAR and FRR values. In the testing process, a total of 130 new images are used, with 5 images per class.

B. Test Result
From the tests carried out, the results are as follows. The first test evaluates the dataset with different pre-processing. Four different filters are used, and the goal is to find which filter method gives higher testing accuracy; the dataset with the highest-accuracy filter is then used for further testing. In this first test, 20 samples per class were used for the training data and 5 samples for the test data, with 50 epochs. The test results can be seen in Table 1.

TABLE I
RESULTS OF TESTING PRE-PROCESSED DATASETS

Dataset filter | Testing accuracy
Sobel         | 89.23%
Canny         | 70.76%
Grayscale     | 95.38%
Prewitt       | 76.92%

The second test examines the number of epochs used. The number of epochs tested ranges from 10 to 100. Because the dataset with the grayscale filter had the highest accuracy in the first test, the second test uses the dataset with the grayscale filter. The test results are shown in Table 2.

TABLE II
EPOCH TEST RESULTS

Epoch | Validation loss | Validation accuracy | Testing accuracy
10    | 0.0479          | 0.9846              | 81.53%
20    | 0.0410          | 0.9858              | 91.53%
30    | 0.0580          | 0.9860              | 93.07%
40    | 0.0511          | 0.9865              | 92.30%
50    | 0.0326          | 0.9893              | 97.59%
60    | 0.0153          | 0.9945              | 97.69%
70    | 0.0348          | 0.9908              | 96.92%
80    | 0.0472          | 0.9888              | 94.61%
90    | 0.0406          | 0.9905              | 95.38%
100   | 0.0496          | 0.9883              | 93.07%

Table 2 shows the results of testing the number of epochs used, where the epoch count with the highest testing accuracy is 60 epochs, with 97.69% testing accuracy. This result is therefore used as a parameter for further testing.

Fig. 6 Graph of validation loss movement

Fig. 6 displays a graph of the movement of the validation loss from epoch 10 to epoch 100. From the graph, the validation loss on the training data is seen to decrease. The loss on the test data likewise tends to decline, although it is higher than on the training data. This means that the behaviour of the training and testing process is normal.

Fig. 7 Graph of accuracy movement

Fig. 7 displays the movement of the validation accuracy on the training data and the test data, where the accuracy on the training data shows a continuously rising movement. The accuracy on the test data moves likewise, but the training accuracy remains higher than the test accuracy.

The third test examines the validation split value used. This test uses the parameters from the previous test results, with 20 training samples per class, the grayscale filter and 60 epochs. The test results are shown in Table 3.

TABLE III
TEST RESULTS FOR VALIDATION SPLIT VALUES

Validation split value | Validation loss | Validation accuracy | Testing accuracy
0.1 | 0.0153 | 0.9945 | 97.69%
0.2 | 0.0431 | 0.9893 | 93.84%
0.3 | 0.0574 | 0.9885 | 93.84%
0.4 | 0.0751 | 0.9838 | 90.76%
0.5 | 0.0772 | 0.9818 | 90.0%
0.6 | 0.1378 | 0.9776 | 79.23%
0.7 | 0.1584 | 0.9735 | 73.03%
0.8 | 0.2493 | 0.9632 | 54.61%
0.9 | 0.3661 | 0.9553 | 54.61%

Table 3 shows the results of testing the validation split value. The values used range from 0.1 to 0.9. The results show that a validation split of 0.1 gets the highest testing accuracy among the other values, with an accuracy of 97.69%. It likewise gets the highest training accuracy value, 0.9945, and the validation loss generated at this split value is the lowest, at 0.0153. The highest testing accuracy value is therefore used as the next test parameter. The movement of the values in the table can be seen in the graph shown in Figure 8.

Fig. 8 Graph of the accuracy movement for the validation split

Fig. 8 displays a graph of the movement resulting from testing the validation split value against the accuracy generated. The graph shows that the validation split of 0.1 has the highest accuracy, and that the accuracy keeps decreasing down to the lowest value at validation splits of 0.8 and 0.9, at 54.61%, which is the lowest accuracy of all.

TABLE IV
RESULTS OF TESTING THE THRESHOLD VALUE

Threshold value | Testing accuracy | Total data | Correct | Wrong
0.0 | 95.38% | 130 | 124 | 6
0.1 | 96.15% | 130 | 125 | 5
0.2 | 96.15% | 130 | 125 | 5
0.3 | 96.15% | 130 | 125 | 5
0.4 | 96.15% | 130 | 125 | 5
0.5 | 96.15% | 130 | 125 | 5
0.6 | 96.15% | 130 | 125 | 5
0.7 | 96.15% | 130 | 125 | 5
0.8 | 96.15% | 130 | 125 | 5
0.9 | 96.15% | 130 | 125 | 5
1.0 | 3.84%  | 130 | 5   | 125

Table 4 shows the results of testing the threshold value, which affects the testing accuracy. Of all the threshold values tested in the range from 0 to 1, those that provide high accuracy are found at threshold values of 0.1 - 0.9, with 125 correct out of the 130 test images. For the threshold value of 0, the number of correct results is 124 and 6 are wrong. In contrast, for the threshold value of 1, the results are reversed, with 5 correct and 125 wrong. The graph of this movement can be seen in Figure 9.

Fig. 9 Graph of testing accuracy against threshold values

Fig. 9 displays the movement graph resulting from testing the threshold value against the accuracy on the test data. Each test uses 5 test images from each class, with a total of 26 classes. At the starting point, with a threshold value of 0, the accuracy on the test data is 95.38%. From the second point up to the tenth point, with threshold values of 0.1-0.9, the accuracy is the same at 96.15%. For the last point, with a threshold value of 1, the accuracy is very small at 3.84%.

TABLE V
RESULTS OF TESTING THE NUMBER OF SAMPLES

Dataset samples | Test data | Training accuracy | Testing accuracy
5  | 5 | 0.9777 | 85.38%
10 | 5 | 0.9914 | 90.0%
15 | 5 | 0.9868 | 93.07%
20 | 5 | 0.9893 | 95.38%
25 | 5 | 0.9893 | 97.69%

Table 5 displays the test results for the number of dataset samples used. This test uses 5 categories of sample quantities, namely 5, 10, 15, 20 and 25. The movement of the accuracy can be seen in Figure 10.

Fig. 11 The CNN model used

Fig. 11 displays the layer model of the convolutional neural network used in this experiment. The input image enters the first convolution, and the result of the first convolution goes into the second convolution. After the second convolution comes max-pooling. The result of max-pooling is convolved again in the third and fourth convolutions, followed by max-pooling again. Next comes the flatten step, and the process proceeds to a dense layer with dropout for classification, up to the last dense layer that produces the output.
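The layer stack described for Fig. 11 (two convolutions, max-pooling, two more convolutions, max-pooling, flatten, a 128-unit dense layer with ReLU and dropout 0.5, and a final dense output) can be written in Keras roughly as below. This is only a sketch: the filter counts, input size and optimizer are not given in the paper and are chosen for illustration; the output layer uses 26 classes as in the testing setup.

from tensorflow.keras import layers, models

# Sketch of the Fig. 11 layer stack; filter counts, input size and optimizer are assumptions.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(26, activation="softmax"),   # one output per character class
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Training with the generators sketched in Section II.D, e.g. for the 60-epoch setting:
# model.fit(train_gen, validation_data=val_gen, epochs=60)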
Fig. 10 Graph of accuracy for the number of dataset samples

Fig. 10 displays a graph of the movement of the accuracy on the test data with different numbers of dataset samples, from the fewest (5 samples) to the most (25 samples).

From the tests done, the best parameters that can be used to obtain the best accuracy with the convolutional neural network are obtained.

TABLE VI
THE BEST PARAMETERS FROM THE EXPERIMENTAL RESULTS

Best parameter         | Value
Pre-processing         | Grayscale
Number of epochs       | 60 epochs
Validation split value | 0.1
Threshold              | 0.5
Number of samples      | 25

Table 6 shows the best parameters obtained: the best pre-processing uses grayscale images, the best number of epochs in this experiment is 60, the best validation split value is 0.1, and the best threshold value used in this experiment is 0.5.

The experimental results for the threshold values shown in Table 4 also give rise to a False Acceptance Rate (FAR) and a False Rejection Rate (FRR) at each threshold value used. FAR and FRR are displayed in the graph in Figure 12.

Fig. 12 Graph of FAR and FRR values

Fig. 12 displays the False Acceptance Rate (FAR) and False Rejection Rate (FRR) values for each threshold value applied. The values are the FAR and FRR from the dataset with the grayscale filter and with 5 test images per class.
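For reference, the threshold, FAR and FRR reported above can be related to the model's softmax output as in the sketch below. This is not the authors' code: the way the threshold is applied (rejecting a prediction whose highest class probability falls below the threshold) and the FAR/FRR definitions used here are assumptions made for illustration.

import numpy as np

def far_frr(probs, labels, threshold):
    """Compute FAR and FRR for one threshold value (illustrative definitions).

    probs:  (n_samples, n_classes) softmax outputs of the CNN
    labels: (n_samples,) true class indices
    """
    confidence = probs.max(axis=1)          # highest class probability per image
    predicted = probs.argmax(axis=1)
    accepted = confidence >= threshold      # predictions kept at this threshold

    # FAR: share of accepted predictions that are actually wrong
    far = np.mean(predicted[accepted] != labels[accepted]) if accepted.any() else 0.0
    # FRR: share of correct predictions that are rejected by the threshold
    correct = predicted == labels
    frr = np.mean(~accepted[correct]) if correct.any() else 0.0
    return far, frr

# Example sweep over the thresholds tested in Table 4:
# for t in np.arange(0.0, 1.01, 0.1):
#     print(t, far_frr(model.predict(test_images), test_labels, t))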
IV. CONCLUSIONS
From all the experiments conducted using the Convolutional Neural Network method, a number of things can be summarized as follows. Good accuracy results from a Convolutional Neural Network are not determined solely by the number of epochs used. Looking at the results of testing the validation split value, it can be said that the greater the validation split value, the lower the testing accuracy obtained. Another conclusion is that the greater the threshold value applied, the higher the False Rejection Rate (FRR) of the system and the smaller the False Acceptance Rate (FAR). Conversely, the smaller the threshold value applied, the higher the False Acceptance Rate (FAR) of the system and the smaller the False Rejection Rate (FRR).

The number of samples in the dataset also has a large effect on the testing accuracy. This can be seen from the results of testing the number of samples in the dataset shown in Table 5. However, convolutional neural networks cannot provide good results using only one parameter. No less important when using a CNN is applying the right pre-processing method to the input image. For this reason, it is necessary to test several pre-processing methods for the input images before conducting image training with the convolutional neural network.
FUTURE WORKS
The method proposed in this experiment uses the basic CNN architecture, with four convolution processes and two max-pooling steps. Future research can apply other models to obtain the best accuracy. In addition, this study only uses a maximum of 25 data samples for each class, so the variation in the data is relatively small. In the future, it is expected that more data samples will be used, so that there is more variation in the dataset. This study also does not test the augmentation values of the convolutional neural network and only uses one CNN architecture model. Besides the small number of samples, the amount of test data used is also small. The more data samples are used as the dataset, the more feature models the CNN has available, making it possible to recognize more new character models.

IMAGE NOTE
The dataset used in Figure 1 can be downloaded via the following link:
https://catalog.data.gov/dataset/nist-handprinted-forms-and-characters-nist-special-database-19

ACKNOWLEDGMENT
This research project was created by students of the Information Technology Department of Udayana University, Bali, Indonesia, in order to complete the Machine Learning course.

REFERENCES
[1] M. R. Sazal, S. K. Biswas, F. Amin, and K. Murase, “Bangla Handwritten Character Recognition Using Deep Belief Network,” pp. 1–5, 2013.
[2] D. C. Cires, U. Meier, and L. M. Gambardella, “Convolutional Neural Network Committees For Handwritten Character Classification,” vol. 10, pp. 1135–1139, 2011.
[3] B. Balci, D. Saadati, and D. Shiferaw, “Handwritten Text Recognition using Deep Learning,” Processing, pp. 1–18, 2017.
[4] S. Hayat, S. Kun, Z. Tengtao, Y. Yu, T. Tu, and Y. Du, “A Deep Learning Framework Using Convolutional Neural Network for Multi-class Object Recognition,” 2018 IEEE 3rd Int. Conf. Image, Vis. Comput., pp. 194–198, 2018.
[5] S. Albawi and T. A. Mohammed, “Understanding of a Convolutional Neural Network,” 2017.
[6] A. K. Tiwari, “Machine Learning Based Approaches for Prediction of Parkinson’s Disease,” Mach. Learn. Appl. An Int. J., vol. 3, no. 2, pp. 33–39, 2016.
[7] C. K. Dewa and A. L. Fadhilah, “Convolutional Neural Networks for Handwritten Javanese Character Recognition,” vol. 12, no. 1, pp. 83–94, 2018.
[8] M. Deepa, R. Deepa, R. Meena, and R. Nandhini, “Tamil Handwritten Text Recognition using Convolutional Neural Networks,” vol. 9, no. 3, pp. 20986–20988, 2019.
[9] G. Cohen, S. Afshar, and J. Tapson, “EMNIST: an extension of MNIST to handwritten characters,” arXiv, 2017.
[10] K. Mistry and A. Saluja, “An Introduction to OpenCV using Python with Ubuntu,” vol. 1, no. 2, pp. 65–68, 2016.
[11] S. Gupta and S. G. Mazumdar, “Sobel Edge Detection Algorithm,” vol. 2, no. 2, pp. 1578–1583, 2013.
[12] H. Sarojadevi, “An Approach to Improvise Canny Edge Detection using Morphological Filters,” vol. 116, no. 9, pp. 38–42, 2015.
[13] O. Sheremet and K. Sheremet, “Convolutional Neural Networks for Image Denoising in Infocommunication Systems,” 2018 Int. Sci. Conf. Probl. Infocommunications. Sci. Technol. (PIC S&T), pp. 429–432, 2018.
