
Project Report

On

Indian Sign Language Recognition

Submitted in partial fulfillment of the requirement for the award of the degree of

MASTER OF COMPUTER APPLICATION

MCA

Session 2024-25
in

School of Computer Applications and Technology

By

Sakshi Chauhan (23SCSE2030494)
Kanak Rajput (23SCSE2030496)
Aaradhya Rathi (23SCSE2030531)

Under the guidance of

Ms. Nancy Agarwal (Professor)

SCHOOL OF COMPUTER APPLICATION AND TECHNOLOGY

GALGOTIAS UNIVERSITY, GREATER NOIDA



CANDIDATE’S DECLARATION

I/We hereby certify that the work presented in the project entitled “Indian Sign Language Recognition”, in partial fulfillment of the requirements for the award of the MCA (Master of Computer Application) degree, submitted in the School of Computer Application and Technology of Galgotias University, Greater Noida, is an original work carried out during the period of August 2023 to January 2024 under the supervision of Ms. Nancy Agarwal, Department of Computer Science and Engineering / School of Computer Application and Technology, Galgotias University, Greater Noida. The matter presented in this project has not been submitted by me/us for the award of any other degree at this or any other institution.

Sakshi Chauhan (23SCSE2030494)
Kanak Rajput (23SCSE2030496)

Aaradhya Rathi (23SCSE2030531)

This is to certify that the above statement made by the candidates is correct to the best
of my knowledge.

Ms. Nancy Agarwal(Professor)

GALGOTIAS UNIVERSITY

School of Computer Application & Technology

Date:

CERTIFICATE
This is to certify that the project report entitled “Indian Sign Language Recognition”, submitted by Sakshi Chauhan, Kanak Rajput and Aaradhya Rathi, has been carried out under the guidance of Ms. Nancy Agarwal, SCAT, Galgotias University.
The project report is approved in partial fulfillment of the practical examination requirement for the 3rd semester (MCA) in the School of Computer Application and Technology, Galgotias University.

SIGNATURE

ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of any task would be incomplete without mentioning the people who made it possible, whose constant encouragement and guidance have been a source of inspiration throughout the course of the project.

We express our sincere indebtedness towards Prof. Alok Singh Chauhan, School of Computer Application & Technology, Galgotias University, for his invaluable guidance, suggestions and supervision throughout the work. Without his kind patronage and guidance, the project would not have taken shape. We would also like to express our gratitude and sincere regards for his kind approval of the project and for his timely counselling and advice.

Date:

ABSTRACT

A major shortcoming in our society is the social barrier between differently abled members of society and abled folks. One of the most important aspects of human beings, regarded as social animals, is communication. Communication is also a major obstacle faced by people with hearing and vocal disabilities. This inability to communicate leads to frequent problems and hinders the daily activities of a person with hearing and vocal disabilities.

The underlying reason for this disparity is that abled folks don’t learn and aren’t taught sign language, which is the main means of communication for a person with hearing and vocal disabilities. Thus, abled folks are incapable of having a normally fluent conversation with these sections of society. Consequently, in a verbal exchange between hearing- or speech-impaired individuals and an abled person, the ease of communication, and consequently the comfort level, is hampered.

So, in our project, we have proposed a cost-efficient solution to overcome this communication barrier. This solution can be easily used by everyone and can also, with some modifications, be made to work on most platforms which have a camera module. Our approach uses the integrated camera module to capture real-time hand gestures based on hand key points or landmarks, and the algorithm, using machine learning techniques, displays the alphabet that the gesture represents.

TABLE OF CONTENTS

1. INTRODUCTION
1.1 Introduction
1.2 Problem Statement
1.3 Existing Solutions
1.4 Proposed Solution
1.5 Methodology
1.6 Organization

2. LITERATURE REVIEW
2.1 Sign Language Recognition System to aid Deaf-dumb People Using PCA
2.2 Hand Gesture Recognition for Deaf People Interfacing
2.3 An American Sign Language Detection System using HSV Color Model and Edge Detection
2.4 Real Time Hand Gesture Recognition Using Different Algorithms Based on American Sign Language

3. SYSTEM DEVELOPMENT
3.1 Dataset
3.2 Algorithms
3.3 Application Screenshots
3.4 ML Pipeline

4. PERFORMANCE ANALYSIS
4.1 Performance Analysis
4.2 Constraints

5. CONCLUSIONS
5.1 Future Scope
5.2 Applications

Chapter 1 INTRODUCTION

1.1 Introduction
In the past few years, huge advancements have been made in the fields of science and technology.
Not only this, technology has got much cheaper and its availability has widened as it is now
available to the common man. So, it is vital to no longer overlook the duty of our generation to
make use of this accessibility to technology to contribute to the progress and improvement of
society at large.
Human beings have, since the beginning of time, been described as social animals. As a social
being, one of the principal aspects of our life is communication. Social interaction or simply
communication has always been regarded as one of the major aspects of living a happy life. For
an individual to live a normal lifestyle, communication is necessary and is required for almost all
of our daily tasks. But there is a not so blessed segment of society which faces hearing and vocal
disabilities. A hearing-impaired individual is one who either can’t hear at all or is able to hear
sounds which are above a certain frequency, or what we’d generally call ‘can only hear when spoken to loudly’. An individual who is unable to speak, for whatever reason, is considered a mute or silent person.
In extensive research conducted across diverse domains, it was determined that impairments such as hearing impairment, vocal impairment or the inability to express oneself cause a loss of opportunities for such people when compared to abled people. Not only this, but such impairments also hinder the day-to-day activities of an individual, such as normal conversations.
According to MoSPI, Govt. of India [1], in 2002 about 30.62 lakh of the then population were
suffering from hearing disorder and 21.55 lakh of the then population were suffering from speech
disorder.
Another 2001 Census [2] states that around 21 million Indian citizens (which constituted 12.6
million males and 9.3 million females approximately), that is, about 2.1 per cent of the then
population of India, were facing certain disabilities. People with speech disability accounted for 7.5 per cent, while those with hearing disability accounted for 5.8 per cent, of these 21 million people in total.
These statistics also show evidence of the problems and discrimination faced by these people. Additionally, they provide us with a wealth of facts about specific kinds of disabilities, the number of people affected by these disabilities and the barriers they face in their lives. One of the foremost barriers a disabled individual faces is the inability to talk with an ordinary person.
So with our knowledge of technology, we hope to help such people through our project so that they
are able to communicate normally with others.

1.2 Problem Statement


A major shortcoming in our society is the social barrier between differently abled members of society and abled folks. One of the most important aspects of human beings, regarded as social animals, is communication. Communication is also a major obstacle faced by people with hearing and vocal disabilities. This inability to communicate leads to frequent problems and hinders the daily activities of a person with hearing and vocal disabilities.
The underlying reason for this disparity is that abled folks don’t learn and aren’t taught sign language, which is the main means of communication for a person with hearing and vocal disabilities. Thus, abled folks are incapable of having a normally fluent conversation with these sections of society. Consequently, in a verbal exchange between hearing- or speech-impaired individuals and an abled person, the ease of communication, and consequently the comfort level, is hampered.
So, in our project, we have proposed a cost-efficient solution to overcome this communication barrier. This solution can be easily used by everyone and can also, with some modifications, be made to work on most platforms which have a camera module. Our approach uses the integrated camera module to capture real-time hand gestures based on hand key points or landmarks, and the algorithm, using machine learning techniques, displays the alphabet that the gesture represents.

1.3 Existing Solutions:
1.3.1 Applications in Apple Store and Google Play Store:

Prior to us, various organizations and individual developers have all attempted to solve this
problem faced by the people with hearing and vocal disabilities using various different techniques.
Some of the attempts are as follows:

● Audio to text conversion programs,
● Applications to interpret sign language, and
● Various standard American Sign Language guides.

But none of these approaches was able to completely solve the problem, as these applications are not accessible to all. Each of these techniques also had shortcomings and was not totally foolproof, such as taking text input from the user, which is tedious for long sentences, and then generating audio as output. Some others display the corresponding sign for an entered alphabet. Not only this, but these applications are constrained to the English language only, which is not readable by everyone.
Some of the applications available in the Google Play Store are Virtual Voice, Note Speak Listen for Deaf, Sign Language Interpreter, Sign Short Message Service, etc.

1.3.2 Gesture Gloves:


Gesture gloves were made to convert gestures into commands by changing the gestures into signals. These gloves use flex sensors: a flex sensor measures the bending or deflection of a finger and maps it to the corresponding signal. The drawback of these gloves is that they are costly and not everyone can afford them. Additional hardware such as an LCD display, a buzzer, etc. is also required to convert these signals into the corresponding gestures.

Figure 1.2: Gesture gloves

1.4 Proposed Solution


In this project, a cost-efficient solution is proposed to overcome the communication barrier. This solution is not limited to recognizing American Sign Language hand gestures; it can also be modified for various other purposes. Our solution is easy to use, as it employs your device’s camera (be it the web camera of a laptop or the camera of a smartphone) and, by applying a few Machine Learning and Computer Vision algorithms, recognizes which hand gesture is being shown and displays it in textual form that is legible to any individual who knows the English alphabet.

Figure 1.3: Our proposed application screenshot

The solution provided here is cheap, easily available and easy for a common person to use. All one needs to do is run the program and make the gesture in front of the camera; the algorithms will do their work in the backend and convert those gestures into readable English alphabets.

1.5 Methodology
Any machine learning based application can be summed up as having at least three phases: data collection and preprocessing, training, and visualization.

Our program also follows these steps in order. At first, the data is collected and a base dataset is prepared. This dataset is then divided into training data and testing data, which in our case is a multi-class classification dataset as we have to predict 26 gestures. To generate our dataset, we’ve
collected hand keypoints from images for each gesture using the laptop’s web camera. Features
are then selected and extracted from the training data. The next step is to decide which machine
learning models to use. Since ours is a multi-class classification problem, the models used were

K-Nearest Neighbours, Support Vector Machine and Decision Tree. These models are then trained
on the training set. Then they are made to make predictions on the test set based on which their
performance is evaluated and changes are made to the parameters so as to squeeze out the best
results from these models.
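As an illustration, the following is a minimal sketch of this pipeline using scikit-learn; the file name, column layout and hyperparameter values here are assumptions rather than the exact ones used in the project.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Load the collected hand-keypoint dataset: first column = gesture label (a-z),
# remaining columns = normalized x/y coordinates of the 21 hand landmarks.
df = pd.read_csv("keypoints.csv", header=None)
X, y = df.iloc[:, 1:], df.iloc[:, 0]

# Split into training and testing data for the 26-class problem.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The three classifiers used in this work; the hyperparameters are illustrative.
models = {
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
    "Decision Tree": DecisionTreeClassifier(max_depth=10),
}

# Train each model, evaluate it on the test set, and tune parameters
# (k, kernel, tree depth, ...) based on the reported accuracy.
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "test accuracy:", model.score(X_test, y_test))
```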

Figure 1.4: Machine Learning pipeline

1.5.1 OpenCV
OpenCV is short for Open Source Computer Vision Library, a library aimed at building real-time computer vision applications. OpenCV is written in C and C++ and is cross-platform, that is, it works on all machines regardless of the operating system installed. It is available as a library for languages such as Java, Python, C++, etc. As it is an open source project, it is freely available at http://sourceforge.net/projects/opencvlibrary/.

The creation of OpenCV is credited to Gary Bradski of Intel. It was started in 1999 with one mission in mind: to encourage research in the field of computer vision and to make computer vision freely available, even for commercial applications. Being an open source project, OpenCV also has its documentation available on the web at http://opencv.willowgarage.com/documentation/index.html.

A digital image is an array of discrete values, or a matrix of light intensities, captured by a device such as a camera and organized into a two-dimensional matrix of pixels, where each pixel is represented by a number generally ranging from 0 to 255 (since it is 8-bit). Such images may be stored in picture formats like JPG and GIF [8]. OpenCV’s older C API uses its own data structure, IplImage, to represent an image; this data structure has various accessible fields such as the image width, height, depth and number of channels.
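In the Python bindings, an image loaded with OpenCV is simply a NumPy array of such 8-bit values, which makes the pixel matrix easy to inspect; a minimal sketch (the file name is just a placeholder):

```python
import cv2

# Read an image from disk; OpenCV returns it as a NumPy array in BGR channel order.
img = cv2.imread("hand.jpg")   # placeholder file name

print(img.shape)    # (height, width, 3) for a colour image
print(img.dtype)    # uint8, so every pixel value lies in the range 0-255
print(img[0, 0])    # B, G, R intensities of the top-left pixel
```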

1.5.2 MediaPipe Hands

MediaPipe Hands employs machine learning to infer 21 three-dimensional landmarks of a hand from just a single frame. It is therefore regarded as a high-fidelity and fairly accurate hand and finger detection and tracking solution compared to current state-of-the-art models, which generally rely on high-performing machines. MediaPipe is available on various platforms, even on the web and smartphones. MediaPipe Hands is also capable of inferring the landmarks of both hands simultaneously.

1.5.3 k-Nearest Neighbours

k-Nearest Neighbours, or simply k-NN, is a machine learning model that can be used for both classification and regression problems. The k-NN algorithm is based on the similarities between the dataset and the sample to be predicted, and it puts the sample into the category that is closest among the categories present in the dataset. It stores the entire dataset and classifies the sample data based on similarity.

Figure 1.5: k-NN example (green dot is the sample that is to be classified)

1.5.4 Support Vector Machines

Support Vector Machines, or simply SVM, fall under the category of supervised machine learning algorithms. They are generally used for classification problems. In this model, each data item is plotted as a point in an n-dimensional space (where n denotes the number of features in the training dataset), and classification is then performed by identifying the hyperplane(s) that separate the classes with the maximal margin.

Figure 1.6: SVM example (H1 does not separate the classes; H2 does, but only with a small margin; H3 separates the classes with the maximum margin on either side)

Chapter 2 LITERATURE REVIEW

2.1 Sign Language Recognition System to aid Deaf-dumb People Using PCA [8]

2.1.1 Author: Shreyashi Narayan Sawant

2.1.2 Publication: International Journal of Computer Science & Engineering Technology, 2014

Summary:

1. 26 gestures from Indian Sign Language, using MATLAB with WebCam live capture.

2. 260 images in total, 10 of each sign. Captured with white background so as to avoid
illumination effects.
3. Image segmentation using Otsu’s method (an automatic thresholding technique), noise removal and contour smoothing.
4. Centroid is calculated (using image moment) to separate the hand part from the arm.

5. Skin detection using HSV color model.

6. Feature extraction using PCA (eigenvector matrix).

7. Finally, the subject gesture is normalized with respect to the average gesture and projected onto the gesture space using the eigenvector matrix. The Euclidean distance between this projection and all known projections is then computed, and the minimum of these distances is selected for recognition.

2.2 Hand Gesture Recognition for Deaf People Interfacing [9]

2.2.1 Author: Isaac Garcia Incertis, Jaime Gomez Garcia-Bermejo, Eduardo Zalama Casanova

2.2.2 Publication: International Journal of Computer Science & Engineering Technology, 2014
Summary:

1. The work deals with the static gesture case corresponding to the alphabet letters of Spanish Sign Language (LSE).
2. On captured images, hand regions and corresponding contours are extracted through color
segmentation.
3. Contours are sampled at regular arc-length intervals.

4. The resulting points are compared to those of a target gesture in the dictionary.

5. The comparison is performed on the basis of four distance criteria: the L0, L1, L2 and L∞ norms.
6. A positive identification is assumed for the closest model.

2.3 An American Sign Language Detection System using HSV Color Model and
Edge Detection [10]

2.3.1 Authors: A. Sharmila Konwar, B. Sagarika Borah, Dr. T. Tuithung

2.3.2 Publication: International Conference on Communication and Signal Processing, April 3-5,
2014, India

Summary:

1. The aim is to build a user-friendly human-computer interface in which the computer can understand human language.
2. The system is described in two phases: a training phase and a testing phase.
3. In the first phase, a database is created by capturing images with a web camera or any other camera, followed by preprocessing, feature extraction and training.

4. In the second phase, image acquisition, preprocessing, feature extraction and classification are carried out on the test images.

5. Canny edge detection algorithm is used to detect hand gestures.
6. PCA and ANN were used for the feature extraction and recognition part respectively.

7. The project was able to detect five alphabets successfully.

8. Due to the geometric variation and uneven background and lighting conditions, some
images were not detected successfully.

2.4 Real Time Hand Gesture Recognition Using Different Algorithms Based on
American Sign Language [11]

2.4.1 Authors: Md. Mohiminul Islam, Sarah Siddiqua and Jawata Afnan

2.4.2 Publication: IEEE

Summary:

1. Different algorithms are used for feature detection, such as the K-convex hull for fingertip detection, pixel segmentation, eccentricity and elongatedness of the object.
2. Apart from the K-convex hull algorithm, many other algorithms were also used to obtain better accuracy.
3. Images were taken using a mobile phone, and an ANN was used on that dataset.

4. The ANN was trained on 1850 sample images.

5. The model was trained in a real-time environment.

6. The proposed model was able to detect ASL alphabets and numbers with an accuracy of 94.32%.

7. Further, the model was improved to detect hand movement for word recognition.

Chapter 3 SYSTEM DEVELOPMENT

3.1 Dataset
The first step in any machine learning problem is to gather the data. The data can either be taken
from some open source datasets from websites such as Kaggle or you can prepare your own
dataset. In our case, we created our own dataset from scratch. For the data gathering process, we
took x- and y-coordinates of 21 hand keypoints using the MediaPipe and OpenCV libraries. For
each gesture, the following x and y keypoints were collected:

● Wrist

● Thumb

● Index finger

● Middle finger

● Ring finger

● Pinky finger

Below are some sample items from the dataset for each gesture: [12]

Figure 3.1: American Sign Language Gestures

Table 3.1: Keypoints for gestures

3.2 Algorithms
This project uses several algorithms which are commonly used in the field of computer vision and
machine learning, namely, colour segmentation, labelling, feature extraction, and convolutional
neural networks for recognizing the gesture in real time.

3.2.1 Capture Live Video


The first step was to capture live video from the device webcam. For this purpose, the OpenCV library for Python was used. Frames from a live video can be captured in the following way: [2]
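A minimal sketch of such a capture loop (device index 0 is assumed to be the default webcam):

```python
import cv2

# Open the default webcam (device index 0 is assumed).
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, frame = cap.read()            # grab one frame of the live video
    if not success:
        break

    cv2.imshow("Live video", frame)        # show the frame in a window
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press 'q' to stop capturing
        break

cap.release()
cv2.destroyAllWindows()
```

Each frame obtained this way is then passed to the hand-keypoint extraction step described in the following sections.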

3.2.2 Creating the Training Data


For creating the training data, we took x- and y-coordinates of 21 hand keypoints using the
MediaPipe and OpenCV libraries. For each gesture, the following x and y keypoints were
collected: wrist (WRIST), thumb (THUMB_CMC, THUMB_MCP, THUMB_IP, THUMB_TIP),
index finger (INDEX_FINGER_MCP, INDEX_FINGER_PIP, INDEX_FINGER_DIP,
INDEX_FINGER_TIP), middle finger (MIDDLE_FINGER_MCP, MIDDLE_FINGER_PIP,
MIDDLE_FINGER_DIP, MIDDLE_FINGER_TIP), ring finger (RING_FINGER_MCP,
RING_FINGER_PIP, RING_FINGER_DIP, RING_FINGER_TIP) and pinky (PINKY_MCP,
PINKY_PIP, PINKY_DIP, PINKY_TIP). These values were then stored in keypoints.csv which
was later loaded into a pandas dataframe and used for training models from scikit-learn. While capturing the keypoints, the gesture was automatically rotated and slightly varied so as to obtain better data for a robust model.
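A minimal sketch of how each captured sample could be appended to keypoints.csv; the helper name and the label-first column order are assumptions.

```python
import csv

def save_keypoints(label, keypoints, path="keypoints.csv"):
    """Append one training sample: the gesture label followed by the
    flattened x/y coordinates of the 21 hand landmarks."""
    row = [label]
    for x, y in keypoints:              # keypoints: list of 21 (x, y) pairs
        row.extend([x, y])
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(row)

# Example: save_keypoints('a', keypoints) is called once per captured frame,
# so repeating the capture across frames and gestures builds up the dataset.
```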

Table 3.2: Sample values from the dataset

3.2.3 Feature Extraction (Hand keypoints)


For our dataset, we needed to get the relative x- and y-coordinates of hand keypoints. The keypoints
here are - wrist (WRIST), thumb (THUMB_CMC, THUMB_MCP, THUMB_IP, THUMB_TIP),
index finger (INDEX_FINGER_MCP, INDEX_FINGER_PIP, INDEX_FINGER_DIP,
INDEX_FINGER_TIP), middle finger (MIDDLE_FINGER_MCP, MIDDLE_FINGER_PIP,
MIDDLE_FINGER_DIP, MIDDLE_FINGER_TIP), ring finger (RING_FINGER_MCP,
RING_FINGER_PIP, RING_FINGER_DIP, RING_FINGER_TIP) and pinky (PINKY_MCP,

PINKY_PIP, PINKY_DIP, PINKY_TIP). [3]

Each recorded sample is a vector of normalized coordinate values between 0 and 1 (for example 0.3055..., 0.4160..., 0.3932..., and so on), with one x and one y value for each of the 21 landmarks.

Figure 3.2: Hand Landmarks

The above figure shows the 21 hand landmarks or keypoints as they are considered in our project. Each of the points represents a keypoint, which is a key factor in deciding the gesture.

A sample keypoint vector for the gesture ‘a’ likewise consists of 42 such normalized values (0.3820..., 0.5541..., 0.5886..., and so on).

Figure 3.3: Keypoints over hand for the gesture ‘a’

The following statement can be used to draw and connect the keypoints on the hand image:
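A sketch of that drawing call, assuming the usual MediaPipe module aliases (here, frame is the captured image and hand_landmarks is one detected hand from the extraction snippet shown next):

```python
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

# Draw the 21 keypoints on the frame and connect them using the standard hand topology.
mp_drawing.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)
```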

Finding the keypoints using MediaPipe is quite simple by using the code snippet as follows:
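A minimal sketch of this keypoint extraction with MediaPipe Hands; the confidence threshold used here is an illustrative value.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)
success, frame = cap.read()              # one frame from the webcam
cap.release()

with mp_hands.Hands(static_image_mode=False, max_num_hands=1,
                    min_detection_confidence=0.5) as hands:
    # MediaPipe expects RGB input, while OpenCV captures frames in BGR order.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

keypoints = []
if results.multi_hand_landmarks:
    hand_landmarks = results.multi_hand_landmarks[0]
    for lm in hand_landmarks.landmark:   # 21 landmarks, WRIST .. PINKY_TIP
        keypoints.append((lm.x, lm.y))   # normalized x- and y-coordinates
```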

3.2.4 k-Nearest Neighbours
k-Nearest Neighbours, or simply k-NN, is a machine learning model that can be used for both classification and regression problems. The k-NN algorithm is based on the similarities between the dataset and the sample to be predicted, and it puts the sample into the category that is closest among the categories present in the dataset. It stores the entire dataset and classifies the sample data based on similarity.

Figure 3.4: k-Nearest Neighbours example

3.2.5 Support Vector Machines

Support Vector Machines, or simply SVM, fall under the category of supervised machine learning algorithms. They are generally used for classification problems. In this model, each data item is plotted as a point in an n-dimensional space (where n denotes the number of features in the training dataset), and classification is then performed by identifying the hyperplane(s) that separate the classes with the maximal margin.

Figure 3.5: Support Vector Machine example

3.2.6 Decision Tree

A decision tree is a tree-like model that supports decision making. It consists of a tree of decisions and their possible outcomes. A decision tree model reaches conclusions about an item by following a sequence of decisions down the tree.

3.3 Code and Application Screenshots

Figure 3.6: Application working on gesture ‘a’

Figure 3.7: Application working on gesture ‘s’

Figure 3.8: Application working on gesture ‘v’

Figure 3.9: Application working on gesture ‘m’

3.4 ML Pipeline

Figure 3.10: Model Pipeline


Chapter 4 PERFORMANCE ANALYSIS

4.1 Performance Analysis
The results show that the gesture recognition model is quite robust and precise for static images. However, the same cannot be said for video streams. Prediction on video streams is greatly affected by the illumination of the surroundings. Simply put, the models proved to be susceptible to noise (here, noise refers to objects in the background which have a texture or colour similar to the hand) in the live video stream. If the hand is kept steady for some time, the outputs are quite accurate, but slight hand movements affect the gesture recognition process, resulting in inaccurate predictions.

Metrics are used to measure and assess the performance of machine learning models. Some of the most common metrics are accuracy, precision, recall and F1-score. The performance metrics for the models used are shown below, after a short sketch of how such metrics can be computed.
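A minimal sketch of how these metrics can be obtained with scikit-learn, shown here for the k-NN model; the variable names follow the earlier sketches and are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load keypoint features (X) and gesture labels (y) from the collected dataset.
df = pd.read_csv("keypoints.csv", header=None)
X, y = df.iloc[:, 1:], df.iloc[:, 0]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
# Per-class precision, recall and F1-score, plus macro/weighted averages.
print(classification_report(y_test, y_pred))
```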

4.1.1 k-Nearest Neighbours


The metrics for the k-nearest neighbours model had the following values:
Accuracy: 41.84%
Precision: 0.35
Recall: 0.41
F1-score: 0.35

Table 4.1: k-NN evaluation report

Figure 4.1: k-NN evaluation report

4.1.2 Support Vector Machine


The metrics for the support vector machine model had the following values:
Accuracy: 30.08%
Precision: 0.27
Recall: 0.30
F1-score: 0.25

Table 4.2: SVM evaluation report

Figure 4.2: SVM evaluation report

4.1.3 Decision Tree


The metrics for the decision tree model had the following values:
Accuracy: 41.84%
Precision: 0.35
Recall: 0.41
F1-score: 0.35

Table 4.3: Decision tree evaluation report

Figure 4.3: Decision tree evaluation report

4.2 Constraints
1. It is advised to use a dark background for best results.

2. Currently, only a limited number of gestures are recognized.

3. Sudden hand movements or inaccurately formed gestures result in inaccurate predictions.

Chapter 5 CONCLUSIONS

5.1 Future Scope
The future work that can improve the proposed system is listed as follows:

1. Understanding and applying a better algorithm to account for background noise and better
separation of the foreground from the background.
2. Achieving better performance by using a customized Neural Network.

3. Comparing the results when Transfer Learning is used instead of basic Neural Networks.

4. Using just computer vision techniques such as convex hull, contour centroid, fingertip
detection and distances to achieve faster results.

5.2 Applications
1. Sign Language Communication
Our project can be used to help in communication with sign language and remove the barrier between people with speech disabilities and others.

2. Gesture Control
It can be used to control various other products and software applications using such
gestures.
An example would be controlling your music, skipping tracks, changing the volume or even playing games using gesture control.

REFERENCES

[1] Census India, Disabled Population. https://censusindia.gov.in/census_and_you/disabled_population.aspx
[2] OpenCV Documentation. https://docs.opencv.org/master/
[3] MediaPipe Hands. https://google.github.io/mediapipe/solutions/hands.html
[4] Model Card, MediaPipe Hands. https://drive.google.com/file/d/1yiPfkhb4hSbXJZaSq9vDmhz24XVZmxpL/preview
[5] Altman, Naomi S. (1992). An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression.
[6] Cortes, Corinna; Vapnik, Vladimir N. (1995). Support-Vector Networks.
[7] Shalev-Shwartz, Shai; Ben-David, Shai (2014). Decision Trees.
[8] Sawant, S. (2014). Sign Language Recognition System to aid Deaf-dumb People Using PCA.
[9] Incertis, I. G. (2006). Hand Gesture Recognition for Deaf People Interfacing.
[10] Konwar, A. S. (2014). An American Sign Language Detection System using HSV Color Model and Edge Detection.
[11] Islam, M. M. (2017). Real Time Hand Gesture Recognition Using Different Algorithms Based on American Sign Language.
[12] Patil, Purushottam; Patil, Neel; Tewar, Chandru; Bhoi, Rohit; Yadav, Deepak (2019). Deaf Communicator.
[13] Machine Learning Pipeline. https://algorithmia.com/blog/ml-pipeline
