
MATEC Web of Conferences 392, 01128 (2024) https://doi.org/10.1051/matecconf/202439201128
ICMED 2024

OCR for Devanagari script


Mitali Singh1, Pranava Gayathri1* and Prabhakar Kandukuri1
1Department of AI&ML, CBIT, Hyderabad, Telangana, India.

Abstract. The introduction of Optical Character Recognition (OCR)
technology revolutionized text digitization, allowing physical documents to
be converted into editable and searchable digital representations. This paper
examines the unique challenges and advancements in OCR designed
specifically for the Devanagari script. Devanagari OCR helps to preserve
cultural heritage by digitizing historic manuscripts and religious writings
and making them more accessible. Furthermore, it has practical uses in
administrative activities, data entry, and educational content digitization.

1 Background
Devanagari is a complex script used for languages such as Hindi, Marathi, and Sanskrit.
It includes intricate ligatures, conjunct characters, and a variety of fonts. It consists of 36
consonants, 10 vowels, and 10 digits. OCR struggles to recognize these features accurately
because of the script's complexity; the main challenge is recognizing conjunct characters.
Research and development continue to address the evolving challenges of Devanagari OCR,
striving for improved accuracy and efficiency.
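
To illustrate why conjunct characters are hard to recognize, the following minimal Python sketch shows that a conjunct is drawn as a single glyph but stored as a sequence of code points (consonant, virama, consonant). The helper `is_conjunct` is a hypothetical name introduced here for illustration only and uses a deliberately simplified rule; it is not part of the method described in this paper.

```python
import unicodedata

# The conjunct "क्ष" is rendered as one glyph but is encoded as three
# code points: consonant KA + VIRAMA (halant) + consonant SSA.
conjunct = "\u0915\u094D\u0937"

for ch in conjunct:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

VIRAMA = "\u094D"

def is_conjunct(cluster: str) -> bool:
    """Hypothetical helper: True if a virama is followed by another
    character, i.e. it joins two consonants (simplified rule)."""
    return VIRAMA in cluster[:-1]
```

A recognizer that only knows the basic consonants, vowels and digits sees such a glyph as an unknown shape, which is one reason conjuncts usually need separate classes or a segmentation step.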

2 Objectives
1) The primary objective is to accurately recognize and classify individual Devanagari
script characters, i.e., consonants and numerals.
2) The second objective is to achieve a high classification accuracy so that Devanagari
characters are correctly identified in applications such as OCR and text analysis.

3 Methods
The reviewed papers have an average accuracy of 94.2%. Even with this high average accuracy,
one typical shortcoming is the inability to distinguish between similar-looking letters. A few
of the papers considered are tabulated below.

* Corresponding author: [email protected]

© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative
Commons Attribution License 4.0 (https://creativecommons.org/licenses/by/4.0/).

Table 1. Reviewed papers and their reported accuracies

Ref. No. | Paper | Methodology | Accuracy (%) | Limitations
-------- | ----- | ----------- | ------------ | -----------
[1] | Fuzzy Model Based Recognition of Handwritten Hindi Characters | Feature Extraction: Box Method; Classification: Fuzzy algorithm | 91.3 | Multiple classifiers are used for 3 different categories
[2] | Comparative study of Devanagari handwritten character recognition using different features and classifiers | Classification: Mirror Image Learning | 95 | Increased character set for training
[3] | Gradient Local Auto-Correlation for handwritten Devanagari character recognition | GLAC (Gradient Local Auto-Correlation) | 93.21 & 95 | Difficulties in recognising similar words
[5] | Handwritten Hindi character recognition using K-means clustering and SVM | Feature Extraction: K-means Clustering; Classification: SVM | 95.8 | Inconsistency in some characters recognition accuracy
[6] | Recognition of handwritten Devanagari characters using linear discriminant analysis | Feature Extraction: PCA and LDA; Classification: SVM | 94.2 | LDA implementation is complex and not so accurate for multiclass classification
[7] | Hindi handwritten character recognition using multiple classifiers | Feature Extraction: HOG; Classification: Quadratic SVM | 95.9 | Misclassification of characters like भ, म
[8] | Handwritten (Marathi) compound character recognition | 1. Wavelet transform; 2. Modified wavelet features | 96.6 | Inconsistency in some characters recognition accuracy

This summarizes our literature survey. Our objective is to increase the classification
accuracy using deep learning methods such as pretrained models and transfer learning.
Fine-tuning a pretrained model on the target dataset or domain can significantly improve
recognition accuracy.
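
The idea behind transfer learning can be sketched in a few lines of NumPy: a feature extractor is kept frozen and only a new classification head is trained on the target data. This is an illustrative toy under stated assumptions, not the model used in this paper: `W_backbone` is a fixed random matrix standing in for real pretrained weights, the data are synthetic, and all names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: stand-in for a pretrained backbone. Real weights would come
# from a network trained on a large image corpus; here they are random
# but, as in transfer learning, they are never updated.
W_backbone = rng.normal(size=(64, 16)) / 8.0   # 64 pixels -> 16 features

def extract_features(x):
    """Frozen feature extractor: ReLU of a fixed linear map."""
    return np.maximum(x @ W_backbone, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(x, y, n_classes, lr=0.1, epochs=200):
    """Train only the new classification head with gradient descent."""
    feats = extract_features(x)
    W_head = np.zeros((feats.shape[1], n_classes))
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        probs = softmax(feats @ W_head)
        grad = feats.T @ (probs - onehot) / len(x)  # cross-entropy gradient
        W_head -= lr * grad
    return W_head

# Synthetic "target dataset": two classes of flattened 8x8 images.
x = np.vstack([rng.normal(0.0, 1.0, (50, 64)), rng.normal(1.0, 1.0, (50, 64))])
y = np.array([0] * 50 + [1] * 50)

backbone_before = W_backbone.copy()
W_head = train_head(x, y, n_classes=2)
pred = softmax(extract_features(x) @ W_head).argmax(axis=1)
accuracy = (pred == y).mean()
```

Full fine-tuning would additionally update some or all backbone layers, usually with a smaller learning rate; freezing the backbone is the cheapest variant and a common starting point when the target dataset is small.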


4 Results and Conclusion

In conclusion, the findings of this study demonstrate the effectiveness of the Quadratic
Support Vector Machine (SVM) as a classification algorithm and the Histogram of Oriented
Gradients (HOG) as a feature extraction method for Devanagari character recognition. With
a classification accuracy of 95.9%, the results underscore the robustness and suitability of
this combination for the task. To improve on this model, we propose a CNN technique that
achieves 96% accuracy.
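
The two components named above can be sketched as follows. This is a simplified illustration, not the implementation evaluated in the cited work: the function names are hypothetical, the histogram omits the block normalization and bin interpolation of full HOG, and the kernel is the standard degree-2 polynomial kernel that a quadratic SVM is assumed to use.

```python
import numpy as np

def orientation_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient orientations for one image cell,
    weighted by gradient magnitude (the core step of HOG)."""
    gy, gx = np.gradient(cell.astype(float))        # d/drow, d/dcol
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned: [0, 180)
    bins = np.minimum((angle / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def quadratic_kernel(a, b, c=1.0):
    """Degree-2 polynomial kernel: K(a, b) = (a . b + c)^2."""
    return (a @ b.T + c) ** 2

# A cell containing a vertical edge: intensity changes only along x,
# so all gradient energy falls into the first orientation bin.
cell = np.tile(np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float), (8, 1))
hist = orientation_histogram(cell)
```

In a full pipeline the histograms of all cells in a character image are concatenated into one feature vector, and the SVM compares pairs of such vectors through the kernel.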

References
1. M. Hanmandlu, O. V. R. Murthy and V. K. Madasu, "Fuzzy Model Based Recognition
of Handwritten Hindi Characters," 9th Biennial Conference of the Australian Pattern
Recognition Society on Digital Image Computing Techniques and Applications (DICTA
2007), Glenelg, SA, Australia, 2007, pp. 454-461.
2. U. Pal, T. Wakabayashi, and F. Kimura, "Comparative study of Devanagari handwritten
character recognition using different features and classifiers," in Proc. 10th Conf.
Document Anal. Recognit., 2009, pp. 1111-1115.
3. M. Jangid and S. Srivastava, "Gradient Local Auto-Correlation for handwritten
Devanagari character recognition," 2014 International Conference on High Performance
Computing and Applications (ICHPCA), Bhubaneswar, India, 2014, pp. 1-5.
4. A. Indian and K. Bhatia, "A survey of offline handwritten Hindi character recognition,"
2017 3rd International Conference on Advances in Computing, Communication &
Automation (ICACCA) (Fall), Dehradun, India, 2017, pp. 1-6.
5. A. Gaur and S. Yadav, "Handwritten Hindi character recognition using K-means
clustering and SVM," in Fourth International Symposium on Emerging Trends and
Technologies in Libraries and Information Services, IEEE Press, 2015, pp. 65–70.
6. S. Shitole and S. Jadhav, "Recognition of handwritten Devanagari characters
using linear discriminant analysis," in Second International Conference on Inventive
Systems and Control, IEEE Press, 2018, pp. 100–103.
7. M. Yadav and R. Purwar, "Hindi handwritten character recognition using
multiple classifiers," in Seventh International Conference on Cloud Computing, Data
Science & Engineering – Confluence, IEEE Press, 2017, pp. 149–154.
8. M. S. Bhandare and A. S. Kakade, "Handwritten (Marathi) compound character
recognition," in International Conference on Innovations in Information, Embedded and
Communication Systems, IEEE Press, 2015, pp. 1
