1.1 Introduction:-
American Sign Language (ASL) is a predominant sign language. Since the only disability deaf and mute (D&M) people have is communication related and they cannot use spoken languages, the only way for them to communicate is through sign language. Communication is the process of exchanging thoughts and messages in various ways such as speech, signals, behaviour and visuals. D&M people use their hands to express different gestures in order to convey their ideas to other people. Gestures are nonverbally exchanged messages, and these gestures are understood with vision. This nonverbal communication of deaf and mute people is called sign language.
In our project we focus on producing a model which can recognise fingerspelling-based hand gestures in order to form a complete word by combining each gesture. The gestures we aim to train are given in the image below.
1.3 Project Modules
1.3.1 Data Acquisition
1.3.2 Data Pre-processing and Feature Extraction
1.3.3 Gesture Classification
1.3.4 Text and Speech Translation
Chapter 2 : Literature Review
2.1 Motivation:-
In a world driven by communication, the ability to express oneself is not just a
convenience—it's a basic human right. Yet, millions of individuals who are hearing or
speech impaired often struggle to be heard, to be understood, and to participate fully in
daily life.
This project was born out of a deep desire to bridge the gap between the deaf-mute
community and the hearing world. Sign language is their voice—but not everyone
understands it. My goal is to translate their signs into text and speech, giving them a
powerful tool to communicate with ease and confidence.
The idea isn't just to build a software application—it's to create a bridge. A bridge where
gestures become words, where silence turns into voice, and where inclusion replaces
isolation.
With the help of real-time gesture recognition using MediaPipe and deep learning, this
project takes a step towards true accessibility. Every sign translated is a message made
louder. Every sentence spoken from a gesture is one more step toward equality.
Through this innovation, I hope not only to use technology for good, but also to send a
message:
Technology should empower everyone—not just those who can speak or hear.
2.2 Problem Statement:-
In the context of Indian Sign Language (ISL), most existing systems are either static, slow, or work only for single-word translation without capturing the flow of natural sentences. Therefore, there is a critical need for a real-time, efficient and accurate system that can recognize hand gestures and convert them into both text and speech, enabling seamless communication for the hearing and speech impaired.
This project aims to solve this problem by leveraging MediaPipe for hand tracking,
combined with Deep Learning (LSTM) for recognizing sentence-level gestures, and finally
converting the recognized text into audible speech using Text-to-Speech (TTS) technology.
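As a rough illustration of the sequence-recognition stage described above, the sketch below builds a small LSTM classifier over sequences of MediaPipe hand keypoints. The sequence length (30 frames), the 63-value keypoint vector (21 landmarks × x, y, z) and the gesture labels are assumptions for illustration, not the project's final configuration.

```python
# Hypothetical sketch: sentence-level gesture recognition with an LSTM over
# sequences of hand keypoints. Shapes and class names are illustrative only.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

SEQ_LEN = 30        # frames per gesture clip (assumed)
N_FEATURES = 63     # 21 hand landmarks x (x, y, z) from MediaPipe
CLASSES = ["hello", "thanks", "yes", "no"]   # placeholder gesture labels

model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(SEQ_LEN, N_FEATURES)),
    LSTM(64),
    Dense(32, activation="relu"),
    Dense(len(CLASSES), activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# A dummy forward pass just to show the expected input/output shapes.
dummy_batch = np.zeros((1, SEQ_LEN, N_FEATURES), dtype="float32")
probs = model.predict(dummy_batch)
print(probs.shape)   # (1, 4): one probability per gesture class
```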
2.3 Aim
• Hearing-impaired individuals face communication barriers with non-signing people.
• Sign language is not widely understood by the general population.
• These barriers create challenges in workplaces, education, healthcare, and public spaces.
2.4 Objectives:-
More than 70 million deaf people around the world use sign languages to communicate. Sign language allows them to learn, work, access services, and be included in their communities. It is hard to make everybody learn sign language, yet people with disabilities should be able to enjoy their rights on an equal basis with others.
So, the aim is to develop a user-friendly human-computer interface (HCI) where the computer understands American Sign Language. This project will help deaf and mute people by making their life easier.
The objective is to create computer software and train a model using a CNN which takes an image of an American Sign Language hand gesture, shows the corresponding sign in text format, and converts it into audio format.
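As a small illustration of the text-to-audio step mentioned above, the snippet below uses the pyttsx3 library, one common offline TTS option; the report does not mandate a specific engine, so treat this as an assumed choice.

```python
# Minimal sketch of the text-to-speech step using pyttsx3 (an offline TTS
# engine). The recognized text here is a hard-coded placeholder.
import pyttsx3

def speak(text: str) -> None:
    engine = pyttsx3.init()   # initialise the TTS engine
    engine.say(text)          # queue the recognized text
    engine.runAndWait()       # block until speech is finished

if __name__ == "__main__":
    speak("A")        # e.g. a single recognized ASL letter
    speak("HELLO")    # or a word built from several letters
```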
2.5 Literature Review:-
Mukesh Patel School of Technology and Management Engineering, JVPD Scheme, Bhaktivedanta Marg, Vile Parle (W), Mumbai-400 056 (2013)
In this proposed system, they intend to recognize some very basic elements of sign language and to translate them to text. Firstly, the video is captured frame by frame; the captured video is processed and the appropriate image is extracted. This retrieved image is further processed using BLOB analysis and sent to the statistical database, where the captured image is compared with the ones saved in the database, and the matched image is used to determine the performed alphabet sign. They implement only American Sign Language fingerspellings and construct words and sentences with them. With the proposed method, they found that the probability of obtaining the desired output is around 93%, which is sufficient to make it suitable for use on a larger scale for the intended purpose.
figure 2.3 Translate the sign language
Sign Language to Text and Speech Translation in Real Time Using Convolutional Neural Network, by Ankit Ojha, Dept. of ISE, JSSATE, Bangalore, India.
They create a desktop application that uses a computer's webcam to capture a person signing American Sign Language (ASL) gestures and translate them into corresponding text and speech in real time. The translated sign language gesture is obtained as text, which is further converted into audio. In this manner they implement a fingerspelling sign language translator. To enable the detection of gestures, they make use of a convolutional neural network (CNN). A CNN is highly efficient in tackling computer vision problems and is capable of detecting the desired features with a high degree of accuracy upon sufficient training. The modules are image acquisition, hand region segmentation, hand detection and tracking, hand posture recognition, and display as text/speech. A fingerspelling sign language translator is obtained which has an accuracy of 95%.
Communication with hearing-impaired (deaf/mute) people is a great challenge in our society today; this can be attributed to the fact that their means of communication (sign language or hand gestures at a local level) requires an interpreter at every instance. This work converts ASL hand gestures into text as well as speech using unsupervised feature learning, to eliminate the communication barrier with the hearing impaired and to provide a teaching aid for sign language.
Sample images of different ASL signs were collected with a Kinect sensor using the image acquisition toolbox in MATLAB. About five hundred (500) data samples (with five to ten (5-10) samples per sign) were collected as training data. The reason for this is to make the algorithm robust for images of the same database in order to reduce the rate of misclassification. The combination of FAST and SURF with a KNN of 10 also showed that unsupervised learning classification could determine the best-matched feature from the existing database. In turn, the best match was converted to text as well as speech. The introduced system achieved 92% accuracy with supervised feature learning and 78% with unsupervised feature learning.
Another proposal optimizes the performance overhead through identification of 17 characters and 6 symbols based on image contours and convexity measurement of standard American Sign Language, without using complex algorithms or specialized hardware devices. Accuracy measurement was done through simulation, which shows how their proposal provides more accuracy with minimum complexity in comparison to other state-of-the-art works. The average accuracy is 86% overall.
Chapter 3 : Proposed System
3.1 Comparison Table
The table below presents a comparative study of various research efforts undertaken by different authors in
the field of sign language recognition. Each author has proposed a unique algorithm to tackle the challenge
of gesture recognition and translation, evaluated by the accuracy of their respective systems:
1. Mahesh Kumar (2018) implemented the Linear Discriminant Analysis (LDA) technique for
gesture classification. While simple and computationally efficient, the model achieved an accuracy
of 80%, which is moderate compared to more advanced deep learning methods.
2. Krishna Modi (2013) applied Blob Analysis, a classical image processing method. Despite its
simplicity, it delivered an impressive 93% accuracy, showcasing that traditional methods can still
be effective when well-optimized.
3. Bikash K. Yadav (2020) and Ayush Pandey (2020) both employed Convolutional Neural Networks (CNN), a deep learning-based approach known for its excellent performance in image-related tasks. Their models reached 95.8% and 95% accuracy respectively, indicating CNN's strong ability to extract complex features and learn from gesture images.
4. Victorial Adebimpe Akano (2018) used the K-Nearest Neighbors (KNN) algorithm, a classic
machine learning method. With an accuracy of 92%, the method proved to be reliable for
classification tasks when applied with proper feature extraction.
5. Rakesh Kumar (2021) opted for contour measurement, focusing on shape-based gesture analysis.
While effective, this method achieved 86% accuracy, slightly lower than deep learning methods but
still significant considering its interpretability and lower resource demand.
- This app is very user-friendly; users only require knowledge of American Sign Language.
- The system is operationally feasible, as it is very easy for end users to operate. It only needs basic familiarity with a Windows application.
- It must have a graphical user interface that assists users who are not from an IT background.
Features:
2. Flexibility.
Back-end selection: We have used Python as our back-end language, which has one of the widest library collections. Technical feasibility is frequently the most difficult area encountered at this stage; our app fits well within technical feasibility.
The system being developed must be justified by cost and benefit, so that effort is concentrated on a project which will give the best return at the earliest. One of the factors which affect the development of a new system is the cost it would require. Since the system is developed as part of project work, there is no manual cost to spend on the proposed system. Also, all the resources are already available, which indicates that the system is economically feasible to develop.
3.4 Timeline Chart
figure 3.0
3.5 Detailed Module description
3.5.1 Data Acquisition
The different approaches to acquiring data about the hand gesture are as follows:
Glove-based approach: electromechanical devices are used to provide the exact hand configuration and position. Different glove-based approaches can be used to extract this information, but they are expensive and not user friendly.
Vision-based approach: the computer webcam is the input device for observing information about the hands and/or fingers. Vision-based methods require only a camera, thus realizing a natural interaction between humans and computers without any extra devices, thereby reducing cost. The main challenge of vision-based hand detection is coping with the large variability of the human hand's appearance due to the huge number of possible hand movements, different skin-colour possibilities, and variations in viewpoint, scale, and the speed of the camera capturing the scene.
figure 3.2
In this method there are many loopholes: your hand must be in front of a clean, plain background and in proper lighting conditions, and only then will this method give accurate results. But in the real world we do not get a good background everywhere, and we do not get good lighting conditions either.
So, to overcome this situation, we tried different approaches and arrived at an interesting solution: first we detect the hand in the frame using MediaPipe and get the hand landmarks of the hand present in that image, and then we draw and connect those landmarks on a plain white image, as sketched below.
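A minimal sketch of this landmark-on-white-image idea is shown below. It assumes the standard `mediapipe` Hands solution and OpenCV; the frame source, canvas size and output path are illustrative choices, not the project's exact script.

```python
# Sketch: detect hand landmarks with MediaPipe and redraw the skeleton on a
# plain white image, so the background and lighting of the original frame
# no longer matter. Frame source and sizes are assumptions.
import cv2
import numpy as np
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)                       # webcam as input
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
    ok, frame = cap.read()
    if ok:
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        result = hands.process(rgb)             # run hand detection
        white = np.full((400, 400, 3), 255, dtype=np.uint8)  # blank canvas
        if result.multi_hand_landmarks:
            for hand_landmarks in result.multi_hand_landmarks:
                # draw the 21 landmarks and their connections on the canvas
                mp_draw.draw_landmarks(white, hand_landmarks,
                                       mp_hands.HAND_CONNECTIONS)
            cv2.imwrite("skeleton.png", white)  # save the skeleton image
cap.release()
```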
figure 3.3 in this image we collected the sign language skeleton for "B"
Mediapipe Landmark System:
figure 3.7 the landmark points obtained for "B"
figure 3.8 how the image is displayed with its landmarks
figure 3.9 the landmark points obtained for "A"
figure 3.10 how the image is displayed with its landmarks
-By doing this we tackle the problem of background and lighting conditions, because the MediaPipe library gives us landmark points on almost any background and in most lighting conditions.
-We have collected 180 skeleton images for each alphabet from A to Z.
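For context, a collection loop along the lines of the hypothetical sketch below could organise such skeleton images into one folder per letter; the directory layout and file naming are assumptions, not the exact script used for this dataset.

```python
# Hypothetical sketch of organising collected skeleton images into one
# directory per alphabet (A-Z). Paths and file naming are assumptions.
import os
import cv2

DATA_DIR = "dataset"                 # assumed root folder
LETTERS = [chr(c) for c in range(ord("A"), ord("Z") + 1)]

# create one folder per letter
for letter in LETTERS:
    os.makedirs(os.path.join(DATA_DIR, letter), exist_ok=True)

def save_sample(skeleton_image, letter: str, index: int) -> str:
    """Save one white-background skeleton image for the given letter."""
    path = os.path.join(DATA_DIR, letter, f"{letter}_{index}.png")
    cv2.imwrite(path, skeleton_image)
    return path
```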
3.5.3 Gesture Classification
Unlike regular Neural Networks, in the layers of CNN, the neurons are arranged in 3 dimensions: width,
height, depth.
The neurons in a layer will only be connected to a small region of the layer (window size) before it, instead of
all of the neurons in a fully-connected manner.
Moreover, the final output layer has dimensions equal to the number of classes, because by the end of the CNN architecture we reduce the full image into a single vector of class scores.
figure 3.11
1. Convolutional Layer:
In the convolutional layer we take a small window (typically of size 5×5) that extends through the depth of the input matrix. The layer consists of learnable filters of this window size. During every iteration we slide the window by the stride size (typically 1) and compute the dot product of the filter entries and the input values at that position.
As we continue this process we create a 2-dimensional activation map that gives the response of that filter at every spatial position. That is, the network learns filters that activate when they see some type of visual feature such as an edge of some orientation or a blotch of some colour.
2. Pooling Layer:
We use a pooling layer to decrease the size of the activation matrix and ultimately reduce the number of learnable parameters.
a. Max Pooling: in max pooling we take a window (for example of size 2×2) and keep only the maximum of its 4 values. We slide this window across the matrix and continue the process, so we finally get an activation matrix half of its original size.
b. Average Pooling: in average pooling we take the average of all values in a window.
3. Fully Connected Layer:
In a convolutional layer neurons are connected only to a local region, while in a fully connected layer we connect all the inputs to every neuron.
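To make the convolution / pooling / fully-connected description above concrete, here is a minimal Keras sketch of such a CNN. The image size and filter counts are illustrative assumptions; the eight-class output reflects the grouping of the 26 letters into 8 classes described later in this section.

```python
# Minimal Keras sketch of a CNN with the layer types described above:
# convolution, max pooling and fully connected layers. Image size and
# filter counts are illustrative assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

IMG_SIZE = 128        # assumed input size (grayscale skeleton image)
NUM_CLASSES = 8       # gesture classes (26 letters grouped into 8 classes)

model = Sequential([
    Conv2D(32, (5, 5), activation="relu",
           input_shape=(IMG_SIZE, IMG_SIZE, 1)),   # 5x5 learnable filters
    MaxPooling2D((2, 2)),                          # halve spatial size
    Conv2D(64, (5, 5), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),                                     # image -> feature vector
    Dense(128, activation="relu"),                 # fully connected layer
    Dense(NUM_CLASSES, activation="softmax"),      # class probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```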
figure 3.13 The preprocessed 180 images per alphabet are fed to the Keras CNN model.
Because we got poor accuracy with 26 different classes, we divided the 26 alphabets into 8 classes, where each class contains visually similar alphabets:
figure 3.13c [g,h]
figure 3.13f [a,e,m,n,s,t]
The CNN outputs a probability for each class, and the label with the highest probability is treated as the predicted label. So when the model classifies [a,e,m,n,s,t] as one single class, we use mathematical operations on the hand landmarks to further classify it into a single alphabet: a, e, m, n, s, or t (illustrated in the sketch below).
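The exact landmark rules are not listed in the report; the sketch below only illustrates the general idea of a distance-based check on MediaPipe landmarks. The landmark indices follow MediaPipe's hand model (4 = thumb tip, 8 = index fingertip), while the comparison and threshold are made-up placeholders.

```python
# Hypothetical illustration of a landmark-based rule used to split a merged
# CNN class (e.g. [a, e, m, n, s, t]) into a single letter. The specific
# comparisons and threshold are placeholders, not the project's real rules.
import math

def distance(p1, p2) -> float:
    """Euclidean distance between two MediaPipe landmarks (x, y only)."""
    return math.hypot(p1.x - p2.x, p1.y - p2.y)

def refine_aemnst(landmarks) -> str:
    """Example rule: compare thumb tip (4) with index fingertip (8)."""
    thumb_tip, index_tip = landmarks[4], landmarks[8]
    if distance(thumb_tip, index_tip) < 0.05:   # placeholder threshold
        return "a"
    return "e"   # placeholder fallback; a real rule set covers m, n, s, t
```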
figure 3.6.1
The given diagram represents the conceptual architecture or flow of processes in a sign language
recognition system that converts hand gestures into text and speech. This system is designed to bridge the
communication gap between the speech/hearing-impaired and the general population.
figure 3.6.2
This diagram provides a simple overview of how a Sign Language to Text Converter System works. It
visually represents the communication flow between the user and the system.
DFD-Level 1
figure 3.6.3
• The user performs hand gestures representing specific alphabets, words, or phrases using Indian
Sign Language (ISL) or any other sign language standard.
• The system captures the hand gestures via a camera or sensor module.
• Using techniques such as MediaPipe, CNN models, or keypoint detection, the system identifies the
gesture.
• Each gesture is mapped to its corresponding character based on a pre-trained recognition model.
• This step may also include text-to-speech conversion for vocal output.
3.6.4 Sequence diagram
figure 3.6.4
This sequence diagram illustrates the step-by-step process of sign language recognition, starting from video
capture to final output generation. It explains how different components in the system interact to convert
hand gestures into corresponding text using a machine learning model.
Mode | Function
Deaf/Mute → Hearing | Sign → Text/Speech
Hearing → Deaf/Mute | Text/Speech → Sign (animation)
• Multilingual Support:-
Allow users to choose other languages (Marathi, Tamil, Telugu, Bengali, etc.).
How it works:
Convert signs to English (default).
Use the Google Translate API or any language model to translate the English text into the target language.
Use text-to-speech (TTS) in the selected language, as sketched below.
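A minimal sketch of this multilingual step is given below, using deep-translator and gTTS as one possible stack; the report only calls for a translation API plus TTS, so these library choices and the output file name are assumptions.

```python
# Sketch of the multilingual step: translate the recognized English text and
# speak it in the chosen language. deep-translator and gTTS are one possible
# stack, assumed here for illustration.
from deep_translator import GoogleTranslator
from gtts import gTTS

def speak_in_language(english_text: str, lang_code: str) -> None:
    # Translate the recognized English sentence into the target language.
    translated = GoogleTranslator(source="en", target=lang_code).translate(english_text)
    # Synthesize speech in that language and save it as an audio file.
    gTTS(text=translated, lang=lang_code).save("output.mp3")

speak_in_language("Hello, how are you?", "hi")   # Hindi
speak_in_language("Hello, how are you?", "ta")   # Tamil
```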
Example: the sign sequence for "Hello, how are you?" can be spoken as "नमस्ते, आप कैसे हैं?" (Hindi) or "வணக்கம், எப்படி இருக்கிறீர்கள்?" (Tamil).
Tech Stack:
Chapter 5 : Implementation and Testing
Here are some snapshots where the user shows hand gestures against different backgrounds and in different lighting conditions, and the system gives the corresponding prediction.
figure 5.1 the sign "A" is correctly predicted
figure 5.2 the sign "W" is correctly predicted
Here the hand gesture for the sign "W" is shown against a different background, and our model still predicts the correct letter.
figure 5.3 the sign "B" is correctly predicted
figure 5.4 the sign "D" is correctly predicted
After implementing the CNN algorithm, we built a GUI using Python Tkinter and added word suggestions to make the process smoother for the user (a minimal sketch follows).
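The sketch below shows the kind of Tkinter layout described: a label for the current sentence plus clickable word suggestions. The widget layout, the example text "DEE" and the suggestion list are assumptions, not the project's exact GUI.

```python
# Minimal Tkinter sketch of the described GUI: a label for the current
# sentence plus clickable word suggestions. Layout and suggestion source
# are assumptions.
import tkinter as tk

root = tk.Tk()
root.title("Sign Language to Text/Speech")

sentence = tk.StringVar(value="DEE")           # text built so far (example)
tk.Label(root, textvariable=sentence, font=("Arial", 24)).pack(pady=10)

def accept_suggestion(word: str) -> None:
    """Replace the partly-typed word with the chosen suggestion."""
    sentence.set(word)

for suggestion in ["DEER", "DEEP", "DEED"]:    # placeholder suggestions
    tk.Button(root, text=suggestion,
              command=lambda w=suggestion: accept_suggestion(w)).pack(side="left", padx=5)

root.mainloop()
```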
figure 5.5 the sentence "DEER" is predicted and spoken aloud
The sign shown below is used after each predicted alphabet to move on to the next character.
figure 5.6 showing the palm indicates that the next character should be captured
Chapter 6 : Conclusion and Future Work
Finally, we are able to predict any alphabet [a-z] with 97% accuracy (with or without a clean background and proper lighting conditions) using our method. If the background is clear and the lighting conditions are good, we get up to 99% accurate results.
In future work we will build an Android application that implements this algorithm for gesture prediction.