The document outlines a project aimed at developing a Sign-to-Text conversion system to bridge communication gaps between sign language users and non-users. It details the methodology, including data set generation, gesture classification using CNN, and the implementation of features like autocorrect. The project focuses on achieving high accuracy in recognizing various sign languages, particularly American Sign Language (ASL), and aims to create a user-friendly interface for effective communication.

Synopsis

On

CONVERSION OF SIGN TO TEXT


Submitted in partial fulfillment of the requirement
For the award of the degree of
B.TECH

In

Computer Science & Engineering


(Artificial Intelligence & Machine Learning)

Submitted By

ISHU SINGH
(2000681530027)
RAMAN BALIYAN
(2000681530039)
HARSH TYAGI
(2000681530024)
RITIK CHAUHAN
(2000681530040)

2023-2024

7th Sem

Department of Computer Science & Engineering


PROBLEM STATEMENT

The project aims to develop a Sign-to-Text conversion system, addressing the communication gap between individuals who use sign language and those who are not familiar with it. Sign language is a crucial means of communication for the Deaf and Hard of Hearing community, yet barriers arise when interacting with individuals who do not understand it.
Why was this topic chosen?

For interaction between hearing people and Deaf and Mute (D&M) people, a language barrier exists because the structure of sign language differs from that of written text. D&M people therefore depend on vision-based communication for interaction.

If there were a common interface that converts sign language to text, gestures could be easily understood by non-D&M people. Research has therefore been carried out on vision-based interface systems through which D&M people can communicate without either party needing to know the other's language.

The aim is to develop a user-friendly Human-Computer Interface (HCI) through which the computer understands human sign language. There are various sign languages around the world, namely American Sign Language (ASL), French Sign Language, British Sign Language (BSL), Indian Sign Language, and Japanese Sign Language, and work has been done on many other languages as well.
Objective and Scope

We plan to achieve higher accuracy even with complex backgrounds by trying out various background-subtraction algorithms.

We also intend to improve the pre-processing so that gestures can be predicted with higher accuracy in low-light conditions.

This project can be enhanced by building it as a web/mobile application so that users can access it conveniently. Also, the existing project only works for ASL; it can be extended to other native sign languages given a sufficient data set and training. This project implements a finger-spelling translator; however, sign languages are also used contextually, where each gesture can represent an object or a verb. Identifying this kind of contextual signing would require a higher degree of processing and natural language processing (NLP).
METHODOLOGY

The system is a vision-based approach. All signs are made with bare hands, which eliminates the need for any artificial device for interaction.

5.1 Data Set Generation:


For the project we tried to find ready-made datasets, but we could not find one in the form of raw images that matched our requirements; all we could find were datasets given as RGB values. Hence, we decided to create our own data set. The steps we followed are as follows.

We used the Open Computer Vision (OpenCV) library to produce our dataset.

First, we captured around 800 images of each symbol in ASL (American Sign Language) for training purposes and around 200 images per symbol for testing purposes.

We capture each frame shown by the webcam of our machine. In each frame we define a Region of Interest (ROI), denoted by a blue bounded square, as shown in the image below.

We then apply a Gaussian blur filter to the ROI, which helps us extract features of the image.

5.2 Gesture Classification:


Our approach uses two layers of algorithms to predict the final symbol shown by the user.
Algorithm Layer 1:

1. Apply a Gaussian blur filter and threshold to the frame captured with OpenCV to obtain the processed image after feature extraction.
2. The processed image is passed to the CNN model for prediction; if a letter is detected for more than 50 frames, it is printed and taken into consideration for forming the word.
3. The space between words is inserted using the blank symbol.

Algorithm Layer 2:

1. We detect the sets of symbols that give similar results when detected.
2. We then classify between the symbols in those sets using classifiers trained on those sets only.

Layer 1:

 CNN Model:
1. 1st Convolution Layer: The input picture has a resolution of 128x128 pixels. It is first processed in the first convolutional layer using 32 filter weights (3x3 pixels each), resulting in a 126x126 pixel image for each filter weight.
2. 1st Pooling Layer: The pictures are downsampled using 2x2 max pooling, i.e. we keep the highest value in each 2x2 square of the array. The picture is therefore downsampled to 63x63 pixels.
3. 2nd Convolution Layer: The 63x63 output of the first pooling layer serves as input to the second convolutional layer. It is processed using 32 filter weights (3x3 pixels each), resulting in a 61x61 pixel image.
4. 2nd Pooling Layer: The resulting images are downsampled again using 2x2 max pooling and are reduced to a 30x30 resolution.
5. 1st Densely Connected Layer: The output of the second pooling layer is reshaped into an array of 30x30x32 = 28,800 values, which is used as the input to a fully connected layer with 128 neurons. The output of this layer is fed to the 2nd densely connected layer. We use a dropout layer with rate 0.5 to avoid overfitting.
6. 2nd Densely Connected Layer: The output of the 1st densely connected layer is used as the input to a fully connected layer with 96 neurons.
7. Final Layer: The output of the 2nd densely connected layer serves as input to the final layer, which has as many neurons as the number of classes we are classifying (alphabets + blank symbol).
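The layer stack above can be sketched in Keras. The layer sizes follow the synopsis (26 letters plus the blank symbol assumed as the class count); details such as the exact dropout placement and compile settings are assumptions, not the project's actual code:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 27  # assumed: 26 letters + blank symbol

# Sketch of the described architecture for grayscale 128x128 input.
model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),  # -> 126x126x32
    layers.MaxPooling2D((2, 2)),                   # -> 63x63x32
    layers.Conv2D(32, (3, 3), activation="relu"),  # -> 61x61x32
    layers.MaxPooling2D((2, 2)),                   # -> 30x30x32
    layers.Flatten(),                              # -> 28,800 values
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),                           # avoid overfitting
    layers.Dense(96, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```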

 Activation Function:
We used ReLU (Rectified Linear Unit) in each of the layers (convolutional as well as fully connected). ReLU computes max(x, 0) for each input value. This adds non-linearity and helps the network learn more complicated features. It mitigates the vanishing gradient problem and speeds up training by reducing computation time.
 Pooling Layer:
We apply max pooling to the input image with a pool size of (2, 2). This reduces the number of parameters, lessening the computation cost and reducing overfitting.

 Dropout Layers:
Dropout layers address overfitting: after training, the weights of the network can become so tuned to the training examples that the network does not perform well on new examples. A dropout layer "drops out" a random set of activations by setting them to zero. The network should still provide the right classification for an example even if some activations are dropped out [5].

 Optimizer:
We used the Adam optimizer to update the model in response to the output of the loss function. Adam combines the advantages of two extensions of stochastic gradient descent, namely the adaptive gradient algorithm (AdaGrad) and root mean square propagation (RMSProp).

Layer 2:

We use a second layer of classifiers to verify and disambiguate symbols that look similar to each other, so that the detected symbol is as close as possible to the one actually shown. In our testing we found that the following symbols were often misclassified as the listed alternatives:

1. For D: R and U
2. For U: D and R
3. For I: T, D and K
4. For S: M and N
So, to handle the above cases, we made three different classifiers for classifying these sets:
1. {D, R, U}
2. {T, K, D, I}
3. {S, M, N}
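The dispatch between the two layers can be sketched as follows. The classifier objects here are toy stand-ins for the set-specific CNNs, and since D appears in two confusion sets, this sketch simply routes to the first matching set; the project's actual resolution logic may differ:

```python
# Confusion sets taken from the lists above.
CONFUSION_SETS = [{"D", "R", "U"}, {"T", "K", "D", "I"}, {"S", "M", "N"}]

def resolve_symbol(primary_prediction, set_classifiers, features):
    """If the main CNN's guess falls in a known confusion set,
    defer to the dedicated classifier trained on that set only."""
    for confusion_set, classifier in zip(CONFUSION_SETS, set_classifiers):
        if primary_prediction in confusion_set:
            return classifier(features)
    return primary_prediction  # unambiguous symbols pass straight through

# Toy classifiers that always pick a fixed letter, for illustration only.
classifiers = [lambda f: "D", lambda f: "I", lambda f: "S"]
```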

5.3 Finger Spelling Sentence Formation Implementation:


1. Whenever the count of a detected letter exceeds a specific value and no other letter is within a threshold of that count, we print the letter and add it to the current string (in our code we used a count of 50 and a difference threshold of 20).
2. Otherwise, we clear the dictionary holding the detection counts of the current symbol, to avoid the probability of a wrong letter being predicted.
3. Whenever the count of detected blanks (plain background) exceeds a specific value and the current buffer is empty, no space is printed.
4. Otherwise, the end of the word is predicted by printing a space, and the current word is appended to the sentence below.
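The counting rules above can be sketched as a small state machine. The thresholds mirror the values quoted in the text (50 and 20); the reset-on-ambiguity behaviour is simplified here, so this is an illustration of the idea rather than the project's exact code:

```python
FRAME_THRESHOLD = 50  # frames a letter must dominate before it is committed
DIFF_THRESHOLD = 20   # required margin over every competing letter

class SentenceFormer:
    def __init__(self):
        self.counts = {}    # detection count per symbol
        self.word = ""      # current word buffer
        self.sentence = ""  # completed sentence so far

    def observe(self, symbol):
        """Record one detected frame; commit a letter or space once stable."""
        self.counts[symbol] = self.counts.get(symbol, 0) + 1
        best = max(self.counts, key=self.counts.get)
        others = [c for s, c in self.counts.items() if s != best]
        if self.counts[best] > FRAME_THRESHOLD and \
           all(self.counts[best] - c > DIFF_THRESHOLD for c in others):
            if best == "blank":
                if self.word:  # end of word: emit a space
                    self.sentence += self.word + " "
                    self.word = ""
            else:
                self.word += best
            self.counts = {}   # reset counts for the next letter
```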

5.4 AutoCorrect Feature:

The Python library Hunspell_suggest is used to suggest correct alternatives for each (possibly incorrect) input word. We display a set of words matching the current word, from which the user can select one to append to the current sentence. This helps reduce spelling mistakes and assists in predicting complex words.
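The project uses Hunspell_suggest, which needs dictionary files; as a dictionary-free stand-in, difflib from the Python standard library illustrates the same idea of offering close matches for the user to pick from (the vocabulary here is a toy assumption):

```python
import difflib

WORD_LIST = ["hello", "help", "world", "sign", "language"]  # toy vocabulary

def suggest(word, vocabulary=WORD_LIST, n=3):
    """Return up to n vocabulary words closest to the recognised word."""
    return difflib.get_close_matches(word.lower(), vocabulary, n=n)

# E.g. a finger-spelled "helo" yields "hello" as the top suggestion.
suggestions = suggest("helo")
```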

5.5 Training and Testing:

We convert our input images (RGB) to grayscale and apply a Gaussian blur to remove unnecessary noise. We then apply an adaptive threshold to extract the hand from the background and resize the images to 128x128.

After applying all the operations mentioned above, we feed the pre-processed images to our model for training and testing.

The prediction layer estimates how likely the image is to fall under each of the classes. The output is normalized between 0 and 1 such that the values across all classes sum to 1. We achieve this using the softmax function.

At first the output of the prediction layer will be somewhat far from the actual value, so we train the network using labelled data. Cross-entropy is a performance measure used in classification. It is a continuous function that is positive wherever the output differs from the labelled value and is zero exactly when they are equal. We therefore optimize the cross-entropy by minimizing it toward zero, adjusting the weights of the network to do so. TensorFlow has a built-in function to calculate cross-entropy.
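A minimal NumPy sketch of the softmax normalisation and cross-entropy measure described above, using a toy 3-class example (the real model has one class per letter plus the blank symbol):

```python
import numpy as np

def softmax(logits):
    """Normalise raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

def cross_entropy(probs, label_index):
    """Cross-entropy against a one-hot label: -log p(correct class).
    Zero exactly when the correct class has probability 1."""
    return -np.log(probs[label_index])

# Toy logits for three classes; the largest logit gets the largest probability.
probs = softmax(np.array([2.0, 1.0, 0.1]))
```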

Having defined the cross-entropy loss, we optimized it using gradient descent, specifically the Adam optimizer, which performed best among the gradient descent optimizers we tried.
HARDWARE AND SOFTWARE USED

Hardware Specification (minimum requirement):
Processor: Any processor above 2.5 GHz
RAM: 8 GB
Hard Disk: 50 GB (solid-state drive)
System: Intel Core i5
Internet Connection: Active

Software Specification:
Operating System: Any operating system
Web Browser: Any web browser

Any system with the above or a higher configuration is suitable for this project.
Tools and Technology Used
Programming Languages and Libraries: Python, Matplotlib, NumPy, TensorFlow, Keras.
Techniques:
Convolutional Neural Network (CNN), Image Preprocessing, Model
Training, Model Evaluation
Other tools: Google Colab, Google Drive.
References

https://en.wikipedia.org/wiki/TensorFlow

https://en.wikipedia.org/wiki/Convolutional_neural_

http://hunspell.github.io/

Number System Recognition (https://github.com/chasinginfinity/number-sign-recognition)

https://opencv.org/
