
VISVESVARAYA TECHNOLOGICAL UNIVERSITY

“Jnana Sangama”, Belagavi – 590018

A Project Report

ON

“SIGN LANGUAGE RECOGNITION SYSTEM”

BY

AMBIKA G PRABHU 4MT17CS013


DIVYAKEERTHI 4MT17CS033
KEERTHINA V SHETTY 4MT17CS048
MALISSA LIANA RODRIGUES 4MT17CS052

Project Guide
Ms. Aishwarya M Bhat,
Assistant Professor
Department of Computer Science & Engineering
MITE, Moodabidri
February – May 2021

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


(Accredited by NBA)

MANGALORE INSTITUTE OF TECHNOLOGY & ENGINEERING


(An ISO 9001:2015 Certified Institution)
BADAGA MIJAR, MOODABIDRI
DK DIST-574225
MANGALORE INSTITUTE OF TECHNOLOGY & ENGINEERING
(An ISO 9001:2015 Certified Institution)
BADAGA MIJAR, MOODABIDRI, DK DIST – 574225

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

CERTIFICATE

This is to certify that the project work entitled “SIGN LANGUAGE RECOGNITION SYSTEM” is a bonafide work carried out by AMBIKA G PRABHU (4MT17CS013), DIVYAKEERTHI (4MT17CS033), KEERTHINA V SHETTY (4MT17CS048) and MALISSA LIANA RODRIGUES (4MT17CS052) in partial fulfillment for the award of the degree of Bachelor of Engineering in Computer Science & Engineering of the Visvesvaraya Technological University, Belagavi, during the year 2020 – 21. It is certified that all corrections and suggestions indicated for Internal Assessment have been incorporated in the report deposited in the departmental library. The project has been approved as it satisfies the academic requirements in respect of project work prescribed for the Bachelor of Engineering degree.

Ms. Aishwarya M Bhat Dr. Venkatramana Bhat P Dr. G. L. Easwara Prasad


Project Guide Head of the Department Principal

External Viva

Name of the Examiners Signature with Date

1)

2)
ABSTRACT

Sign language is an indispensable means of communication for deaf-mute people because of their hearing and speech impairment. At present, sign language is not a widely understood method of communication outside the deaf and mute community, so the majority of hearing people are either unwilling to talk with the deaf-mute or must spend much time and energy trying to figure out the intended meaning. Sign Language Recognition (SLR), which aims to translate sign language into text or speech for people who know little about it, can be a great help for deaf-mute and hearing people to communicate. In this proposal, we plan to design a real-time system that processes hand gestures, identifies each gesture, and converts the sign language into a voice signal that everyone can understand, making communication between impaired persons and the public easier.

ACKNOWLEDGEMENTS
The satisfaction of the successful completion of this project would be incomplete without mentioning the people who made it possible, whose constant guidance and encouragement crowned our efforts with success.

This project was made under the guidance of Ms. Aishwarya M Bhat, Senior Assistant Professor, Department of Computer Science and Engineering. We would like to express sincere gratitude to our guide for all the help and guidance in this project.

We would like to thank our project coordinator Mr. Shivaprasad T K, Senior Assistant Professor, Department of Computer Science and Engineering, for his cordial support, valuable information and guidance, which helped us in completing this project through its various stages.

We would like to express appreciation to Dr. Venkatramana Bhat P., Professor and Head of
the department, Computer Science and Engineering, for his support and guidance.

We would like to thank our Principal Dr. G.L. Easwara Prasad, for encouraging us and
giving us an opportunity to accomplish the project.

We also thank our management, who helped us directly and indirectly in the completion of this project.

Our special thanks to faculty members and others for their constant help and support.

Above all, we extend our sincere gratitude to our parents and friends for their constant
encouragement with moral support.

AMBIKA G PRABHU 4MT17CS013


DIVYAKEERTHI 4MT17CS033
KEERTHINA V SHETTY 4MT17CS048
MALISSA LIANA RODRIGUES 4MT17CS052

TABLE OF CONTENTS

Contents

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES

Chapter 1  INTRODUCTION
    1.1 Introduction
    1.2 Problem Statement
    1.3 Objectives (Purpose of the Project)
    1.4 Scope of the Project
    1.5 Organization of the Report

Chapter 2  LITERATURE SURVEY
    2.1 Existing System
    2.2 Limitations of Existing Systems
    2.3 Proposed System
        2.3.1 Convolutional Neural Network
        2.3.2 Convolutional Layers
        2.3.3 Pooling Layers
        2.3.4 Fully Connected Layers
        2.3.5 Receptive Fields
        2.3.6 Weights
        2.3.7 Preprocessing and Feature Extraction

Chapter 3  SYSTEM REQUIREMENTS SPECIFICATION
    3.1 Overall Description
        3.1.1 Product Perspective
        3.1.2 Product Functions
        3.1.3 Assumptions and Dependencies
    3.2 Specific Requirements
        3.2.1 Hardware Requirements
        3.2.2 Software Requirements
        3.2.3 Functional Requirements
        3.2.4 Non-functional Requirements

Chapter 4  GANTT CHART

Chapter 5  SYSTEM DESIGN
    5.1 Architectural Diagram
    5.2 Sequence Diagram
    5.3 Activity Diagram
    5.4 Data Flow Diagram

Chapter 6  IMPLEMENTATION
    6.1 Code Snippets
        6.1.1 Initializing the Camera
        6.1.2 Capturing the Video
        6.1.3 Obtaining the Frames
        6.1.4 Classifying the Alphabets
        6.1.5 Obtaining Audio as Output

Chapter 7  TESTING
    7.1 Testing Levels
        7.1.1 Unit Testing
        7.1.2 Integration Testing
        7.1.3 System Testing
        7.1.4 Acceptance Testing
    7.2 Test Cases

Chapter 8  RESULTS AND SNAPSHOTS
    8.1 Snapshot 1: Activate the Virtual Environment
    8.2 Snapshot 2: User Input
    8.3 Snapshot 3: Sign Language Converted to Text

Chapter 9  CONCLUSION AND FUTURE WORK
    9.1 Conclusion
    9.2 Future Work

REFERENCES

LIST OF TABLES

Table No.   TABLE NAME
Table 4.1   Gantt chart of planning and scheduling of the project

Chapter 1

INTRODUCTION

This chapter gives an introduction to sign language and to the project.

1.1 Introduction

For deaf-mute people, the importance of body language cannot be overstated. They cannot speak or hear, so hand gestures are their most commonly used, and often their only, tool for communication. They are primarily dependent on sign language to meet the needs of their daily lives. The world where the deaf live and the world where the hearing live is the same, yet only a few people are willing to communicate with deaf-mute people. The reason is simple: the majority of hearing people cannot understand sign language and have difficulty communicating with the deaf-mute. A sign language recognition system is one solution to this communication problem. Our project aims to bridge the gap between speech- and hearing-impaired people and other people. The basic idea of this project is to build a system with which deaf-mute people can communicate effectively with all other people using their normal gestures.

1.2 Problem Statement

The only way speech- and hearing-impaired (deaf and dumb) people can communicate is by sign language. The main problem with this mode of communication is that people who cannot understand sign language cannot communicate with them, and vice versa. The basic idea of this project is to build a system with which deaf-mute people can communicate effectively with all other people using their normal gestures. The project uses image processing to identify the English-alphabet sign language used by deaf people and converts it into text so that others can understand.


1.3 Objectives
The objectives of this system are as follows:

1. The proposed system aims to recognize hand gestures for sign language.
2. The process of gesture recognition is categorized into four stages, namely data acquisition, preprocessing, feature extraction and classification.
3. The objective of the proposed work is to identify sign language from input video captured through a web camera or the inbuilt camera of a laptop. The recognized alphabets are displayed on the monitor as the result.

1.4 Scope

The scope of this project is to design a real-time system that recognizes hand gestures of the English-alphabet sign language and converts them into text and speech. The system is intended to work with an ordinary web camera, without specialized sensors or gloves, so that it can serve as a practical communication aid between deaf-mute people and the public.

1.5 Organization of the Report

1. Chapter 1 of this document consists of the Introduction, which gives a brief description of the project and its scope.

2. Chapter 2 of this document describes the Literature Survey. It provides details about the
existing system, the limitations that the existing system experiences and the proposed system for
the project.

3. Chapter 3 of this document describes the Software Requirements Specification. It includes


overall description and specific requirements. The overall requirement is further classified as
product perspective, product functions, user classes and characteristics, design and
implementation constraints, assumptions and dependencies. Specific requirements are classified as hardware requirements, software requirements, functional requirements and non-functional requirements.

4. Chapter 4 is the Gantt Chart which is a bar chart showing the project schedule.

5. Chapter 5 is concerned with the System Design. It includes the architectural diagram, sequence diagram, activity diagram and data flow diagram.

6. Chapter 6 describes the Implementation. It includes a detailed description of how the project has been implemented.

7. Chapter 7 describes the Testing, where the proposed system is tested at various levels such as unit test, integration test and system test, and how the program is executed with the set of test cases.

8. Chapter 8 describes the Results and Snapshots of the project.

9. Chapter 9 describes the Conclusion and Future work of the project.


Chapter 2
LITERATURE SURVEY

A literature survey or literature review in a project report is the section that presents the various analyses and research carried out in the field of interest and the results already published, taking into account the various parameters and the extent of the project.

2.1 Existing System


Davi Hirafuji Neiva et al. have published a paper entitled “Gesture Recognition: a Review Focusing on Sign Language in a Mobile Context” [1].
Sign languages, which consist of a combination of hand movements and facial expressions, are
used by deaf persons around the world to communicate. However, hearing persons rarely know
sign languages, creating barriers to inclusion. The increasing progress of mobile technology,
along with new forms of user interaction, opens up possibilities for overcoming such barriers,
particularly through the use of gesture recognition through smartphones. This Literature Review
discusses works from 2009 to 2017 that present solutions for gesture recognition in a mobile
context as well as facial recognition in sign languages. Among a diversity of hardware and
techniques, sensor-based gloves were the most used special hardware, along with brute force
comparison to classify gestures. Works that did not adopt special hardware mostly used skin
color for feature extraction in gesture recognition. Classification algorithms included: Support
Vector Machines, Hierarchical Temporal Memory and Feedforward backpropagation neural
network, among others. Recognition of static gestures typically achieved results higher than
80%. Fewer papers recognized dynamic gestures, obtaining results above 90%. However, most
experiments were performed under controlled environments, with specific lighting conditions, and used only a small set of gestures. In addition, the majority of works dealt with a simple
background and used special hardware (which is often cumbersome for the user) to facilitate
feature extraction. Facial expression recognition achieved high classification results using Random Forest and Multi-layer Perceptron. Despite the progress being made with the increasing
interest in gesture recognition, there are still important gaps to be addressed in the context of sign
languages. Besides improving usability and efficacy of the solutions, recognition of facial
expression and of both static and dynamic gestures in complex backgrounds must be considered.

Ming Jin Cheok et al. have published a paper entitled “A review of hand gesture and sign language recognition techniques” [2].
Hand gesture recognition serves as a key for overcoming many difficulties and providing
convenience for human life. The ability of machines to understand human activities and their
meaning can be utilized in a vast array of applications. One specific field of interest is sign
language recognition. This paper provides a thorough review of state-of-the-art techniques used
in recent hand gesture and sign language recognition research. The techniques reviewed are
suitably categorized into different stages: data acquisition, pre-processing, segmentation, feature
extraction and classification, where the various algorithms at each stage are elaborated and their
merits compared. When segmenting the hand area using a skin color threshold algorithm, HSV is a color space that is generally robust to illumination conditions. From previous works, HMM has been successfully applied in much gesture recognition research, while SVM appears to be a popular approach to static gesture recognition, with better performance.
Further, we also discuss the challenges and limitations faced by gesture recognition research in
general, as well as those exclusive to sign language recognition. Overall, it is hoped that the
study may provide readers with a comprehensive introduction into the field of automated gesture
and sign language recognition, and further facilitate future research efforts in this area.

Tejashri J. Joshi et al. have published a paper on a feature extraction method using Principal Component Analysis (PCA) [3].
This paper presents a common dimension-reduction method, whose aim is to retain the most significant features of high-dimensional data while removing noise and unimportant features, so as to improve data processing speed.

This paper developed a principal component analysis (PCA)-integrated algorithm for feature
identification in manufacturing; this algorithm is based on an adaptive PCA-based scheme for
identifying image features in vision-based inspection. PCA is a commonly used statistical
method for pattern recognition tasks, but an effective PCA-based approach for identifying suitable image features in manufacturing has yet to be developed. Unsuitable image features tend
to yield poor results when used in conventional visual inspections. Furthermore, research has
revealed that the use of unsuitable or redundant features might influence the performance of
object detection. To address these problems, the adaptive PCA-based algorithm developed in this
study entails the identification of suitable image features using a support vector machine (SVM) model for inspecting various object images.

Christopher Lee and Yangsheng Xu have published a paper on a glove-based gesture recognition system [4].
Most research in this field has been done using glove-based systems. In a glove-based system, sensors such as potentiometers and accelerometers are attached to each finger, and based on their readings the corresponding alphabet is displayed. Christopher Lee and Yangsheng Xu developed a glove-based gesture recognition system that was able to recognize 14 letters of the hand alphabet, learn new gestures, and update the model of each gesture in the system online. Over the years, advanced glove devices have been designed, such as the Sayre Glove, Dexterous Hand Master and Power Glove. The main problem faced by glove-based systems is that they have to be recalibrated every time a new user uses them. The connecting wires also restrict the freedom of movement.

Byung-Woo Min et al. have published a paper entitled “Sign Language Recognition using Hidden Markov Model” [5].
A gesture is a movement of the hands, face or other parts of the body used to communicate a specific message or to express thoughts, ideas and emotions. Though other body parts can be used for gestures, the hand is the easiest. Hence, in the field of Human-Computer Interaction (HCI), hand gesture recognition is an active area of research. Hand gesture recognition approaches can be mainly divided into Data-Glove based and Vision based. Data-Glove based methods use sensor devices for digitizing the hand. The extra sensors make it easy to capture hand configuration and movement, and the performance is good, but the devices are quite expensive. Vision based methods, in contrast, require only a camera, allowing natural interaction between humans and computers without any extra devices, and are therefore efficient to use and cost effective. An HMM is a Markov process with hidden states, whose hidden parameters are inferred from observable parameters. HMMs can model spatio-temporal information: each gesture is modeled by a different HMM, an unknown gesture is tested against every HMM, and the HMM output with the maximum matching probability gives the final recognized result; this is the main advantage of HMMs. The paper focuses on the diverse stages involved in hand posture recognition, from the originally captured image to its final classification.

Asanterabi Malima, Erol Özgür, and Müjdat Çetin have published a paper entitled “A fast algorithm for vision-based hand gesture recognition for robot control” [6].
Vision-based automatic hand gesture recognition has been a very active research topic in recent years, with motivating applications such as human-computer interaction (HCI), robot control, and sign language interpretation. The general problem is quite challenging due to a number of issues, including the complicated nature of static and dynamic hand gestures, complex backgrounds, and occlusions. Attacking the problem in its generality requires elaborate algorithms requiring
intensive computer resources. What motivates us for this work is a robot navigation problem, in
which we are interested in controlling a robot by hand pose signs given by a human. Due to real-
time operational requirements, we are interested in a computationally efficient algorithm. Our
focus is the recognition of a fixed set of manual commands by a robot, in a reasonably structured
environment in real time. Therefore the speed, hence simplicity of the algorithm is important.
We develop and implement such a procedure in this work. Our approach involves segmenting
the hand based on skin color statistics, as well as size constraints. We then find the center of gravity (COG) of the hand region, as well as the farthest point from the COG. Based on these
preprocessing steps, we derive a signal that carries information on the activity of the fingers in
the sign. Finally we identify the sign based on that signal. Our algorithm is invariant to rotations,
translations and scale of the hand. Furthermore, the technique does not require the storage of a
hand gesture database in the robot’s memory. We demonstrate the effectiveness of our approach
on real images of hand gestures.

2.2 Limitations of Existing Systems

1. The main problem faced by the glove-based system is that it has to be recalibrated every time a new user uses the system.
2. The connecting wires also restrict the freedom of movement.
3. A sign language recognition system has also been implemented using image processing, where the recognition is done by image processing instead of gloves.
4. The only problem with that system was that the background had to be black; otherwise the system would not work.
5. Some of the systems also required colour bands to be worn on the fingertips so that the fingertips could be identified by the image processing unit.

2.3 Proposed System

In this study, the proposed system aims to recognize hand gestures for sign language.

1. The proposed system has two modules: admin and user.
2. The admin manages the dataset. The dataset is required to train the system.
3. We train the system with the dataset images, using processes such as preprocessing, feature extraction and classification.
4. Once the features are extracted from the dataset images, we create a weight file. This weight file contains the summarized information of the extracted features.
5. It is a permanent file, so it can be reused; later, anyone can use the system.
6. Here we make use of the CNN (Convolutional Neural Network) algorithm.
7. A convolutional neural network can take in an input image, assign importance (learnable weights and biases) to various aspects or objects in the image, and differentiate one from another.


Fig 2.3 Block Diagram of the Proposed System

2.3.1 Convolutional Neural Network

A convolutional neural network consists of an input layer, hidden layers and an output layer. In
any feed-forward neural network, any middle layers are called hidden because their inputs and
outputs are masked by the activation function and final convolution. In a convolutional neural
network, the hidden layers include layers that perform convolutions. Typically this includes a
layer that performs a dot product of the convolution kernel with the layer's input matrix. This
product is usually the Frobenius inner product, and its activation function is commonly ReLU.
As the convolution kernel slides along the input matrix for the layer, the convolution operation
generates a feature map, which in turn contributes to the input of the next layer. This is followed
by other layers such as pooling layers, fully connected layers, and normalization layers.
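
To make this layer structure concrete, the following is a minimal sketch of such a stack using TensorFlow's Keras API. It is illustrative only, not the exact network used in this project: the 64 x 64 RGB input size, the layer widths and the 26-class output (one per letter A-Z) are all assumptions.

import tensorflow as tf
from tensorflow.keras import layers, models

# Input layer -> convolution/pooling hidden layers -> fully connected output
model = models.Sequential([
    layers.Input(shape=(64, 64, 3)),               # assumed 64 x 64 RGB frames
    layers.Conv2D(32, (3, 3), activation='relu'),  # convolution kernel + ReLU activation
    layers.MaxPooling2D((2, 2)),                   # pooling layer
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),          # fully connected layer
    layers.Dense(26, activation='softmax'),        # assumed one class per letter A-Z
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()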

2.3.2 Convolutional layers


In a CNN, the input is a tensor with a shape: (number of inputs) x (input height) x (input width) x
(input channels). After passing through a convolutional layer, the image becomes abstracted to a
feature map, also called an activation map, with shape: (number of inputs) x (feature map height)
x (feature map width) x (feature map channels). A convolutional layer within a CNN generally
has the following attributes:


1. Convolutional filters/kernels defined by a width and a height (hyper-parameters).
2. The number of input channels and output channels (hyper-parameters). One layer's input channels must equal the number of output channels (also called the depth) of its input.
3. Additional hyper-parameters of the convolution operation, such as padding, stride and dilation.

Convolutional layers convolve the input and pass its result to the next layer. This is similar to the
response of a neuron in the visual cortex to a specific stimulus. Each convolutional neuron
processes data only for its receptive field. Although fully connected feedforward neural
networks can be used to learn features and classify data, this architecture is generally impractical
for larger inputs such as high resolution images. It would require a very high number of neurons,
even in a shallow architecture, due to the large input size of images, where each pixel is a
relevant input feature. For instance, a fully connected layer for a (small) image of size 100 x 100
has 10,000 weights for each neuron in the second layer. Instead, convolution reduces the number
of free parameters, allowing the network to be deeper. For example, regardless of image size,
using a 5 x 5 tiling region, each with the same shared weights, requires only 25 learnable
parameters. Using regularized weights over fewer parameters avoids the vanishing gradients and
exploding gradients problems seen during backpropagation in traditional neural networks.
Furthermore, convolutional neural networks are ideal for data with a grid-like topology (such as
images) as spatial relations between separate features are taken into account during convolution
and/or pooling.
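
As a quick check of the counts quoted above, the hedged snippet below (assuming TensorFlow/Keras is available) builds a single convolutional layer with one 5 x 5 filter over a 100 x 100 single-channel image. Its summary reports 26 parameters (25 shared weights plus one bias) regardless of the image size, whereas one fully connected neuron over the same image would need 10,000 weights.

import tensorflow as tf

# One 5 x 5 filter over a 100 x 100 single-channel image
inputs = tf.keras.Input(shape=(100, 100, 1))
outputs = tf.keras.layers.Conv2D(filters=1, kernel_size=(5, 5))(inputs)
model = tf.keras.Model(inputs, outputs)
model.summary()  # Conv2D parameters: 5*5*1 weights + 1 bias = 26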

2.3.3 Pooling layers

Convolutional networks may include local and/or global pooling layers along with traditional
convolutional layers. Pooling layers reduce the dimensions of data by combining the outputs of
neuron clusters at one layer into a single neuron in the next layer. Local pooling combines small
clusters; tiling sizes such as 2 x 2 are commonly used. Global pooling acts on all the neurons of the feature map. There are two common types of pooling in popular use: max and average. Max pooling uses the maximum value of each local cluster of neurons in the feature map, while average pooling takes the average value.
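
The difference between the two pooling types can be seen in a small hedged example (again assuming TensorFlow/Keras; the 4 x 4 feature map is made up for illustration). Pooling with 2 x 2 windows shrinks the map to 2 x 2, with max pooling keeping the largest value of each cluster and average pooling keeping the mean.

import numpy as np
import tensorflow as tf

# A 4 x 4 single-channel feature map holding the values 0..15
fmap = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)

max_pool = tf.keras.layers.MaxPooling2D((2, 2))(fmap)
avg_pool = tf.keras.layers.AveragePooling2D((2, 2))(fmap)

print(max_pool.numpy().reshape(2, 2))  # [[ 5.  7.] [13. 15.]]
print(avg_pool.numpy().reshape(2, 2))  # [[ 2.5  4.5] [10.5 12.5]]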


2.3.4 Fully connected layers

Fully connected layers connect every neuron in one layer to every neuron in another layer.
It is the same as a traditional multi-layer perceptron neural network (MLP). The flattened matrix
goes through a fully connected layer to classify the images.
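
As an illustrative sketch (the 7 x 7 x 64 feature-map shape and the 26-class output are assumptions, not this project's actual values), flattening a feature map and classifying it with a fully connected layer looks like this in Keras:

import tensorflow as tf
from tensorflow.keras import layers

# Flatten an assumed 7 x 7 x 64 feature map and classify it into 26 classes
head = tf.keras.Sequential([
    layers.Input(shape=(7, 7, 64)),
    layers.Flatten(),                        # 7 * 7 * 64 = 3136 values in one vector
    layers.Dense(26, activation='softmax'),  # every input connects to every output neuron
])
head.summary()  # Dense parameters: 3136 * 26 weights + 26 biases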

2.3.5 Receptive field

In neural networks, each neuron receives input from some number of locations in the
previous layer. In a convolutional layer, each neuron receives input from only a restricted area of
the previous layer called the neuron's receptive field. Typically the area is a square (e.g. 5 by 5
neurons). Whereas in a fully connected layer, the receptive field is the entire previous layer.
Thus, in each convolutional layer, each neuron takes input from a larger area in the input than
previous layers. This is due to applying the convolution over and over, which takes into account
the value of a pixel, as well as its surrounding pixels. When using dilated layers, the number of
pixels in the receptive field remains constant, but the field is more sparsely populated as its
dimensions grow when combining the effect of several layers.
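
The growth of the receptive field with depth can be verified with a short calculation (standard receptive-field arithmetic, sketched here rather than taken from the report): stacking 3 x 3, stride-1 convolutions widens the field by two pixels per layer.

def stacked_receptive_field(kernel_size, num_layers, stride=1):
    """Receptive field (in input pixels) after stacking identical conv layers."""
    rf, jump = 1, 1
    for _ in range(num_layers):
        rf += (kernel_size - 1) * jump  # each layer widens the field
        jump *= stride                  # stride > 1 spaces the samples further apart
    return rf

for n in (1, 2, 3):
    print(n, stacked_receptive_field(kernel_size=3, num_layers=n))  # 3, 5, 7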

2.3.6 Weights

Each neuron in a neural network computes an output value by applying a specific function to the input values received from the receptive field in the previous layer. The function applied to the input values is determined by a vector of weights and a bias (typically real numbers). Learning consists of iteratively adjusting these weights and biases. The vector of weights and the bias are called a filter and represent particular features of the input (e.g., a particular shape). A distinguishing feature of CNNs is that many neurons can share the same filter. This reduces the memory footprint because a single bias and a single vector of weights are used across all receptive fields that share that filter, rather than each receptive field having its own bias and weight vector.
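
As a toy illustration of this description (a hedged sketch, not the project's code), a neuron's output is an activation applied to the weighted sum of its receptive-field inputs plus a bias; reusing the same weight vector and bias across different receptive fields is exactly what sharing a filter means.

import numpy as np

def neuron_output(x, w, b):
    """ReLU activation applied to (weights . inputs + bias)."""
    return np.maximum(0.0, np.dot(w, x) + b)

w = np.array([0.2, 0.5, 0.1])  # one shared weight vector (the "filter")
b = 0.05                       # one shared bias

print(neuron_output(np.array([1.0, 2.0, 3.0]), w, b))  # receptive field 1 -> 1.55
print(neuron_output(np.array([4.0, 5.0, 6.0]), w, b))  # receptive field 2 -> 3.95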


2.3.7 Preprocessing and Feature Extraction

Bottleneck features are the values computed in the pre-classification layer. The basic technique to get transfer learning working is to take a pre-trained model (with the weights loaded) and remove the final fully connected layers from that model. We then use the remaining portion of the model as a feature extractor for our smaller dataset. These extracted features are called "bottleneck features" (i.e., the last activation maps before the fully connected layers in the original model). We then train a small fully connected network on those extracted bottleneck features to get the classes we need as outputs for our problem. Once prepared, the data is divided into three parts: training, testing and validation. Each folder is trained. We make use of the TensorFlow InceptionV3 model, a convolutional neural network that has 48 layers and can process images of 299 x 299 dimensions.
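
As a hedged modern equivalent of this workflow (a sketch using the Keras applications API rather than the exact retraining scripts used for the project; the 26-class head is an assumption), removing the top layers of a pre-trained InceptionV3 and training a small classifier on the bottleneck features looks like this:

import tensorflow as tf

# Pre-trained InceptionV3 with the final fully connected layers removed;
# pooling='avg' yields one fixed-length bottleneck vector per image
base = tf.keras.applications.InceptionV3(weights='imagenet',
                                         include_top=False,
                                         pooling='avg',
                                         input_shape=(299, 299, 3))
base.trainable = False  # use the network purely as a fixed feature extractor

# Small fully connected head trained on the bottleneck features
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax'),  # assumed 26 alphabet classes
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])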


Chapter 3

SYSTEM REQUIREMENTS SPECIFICATION

A software requirements specification (SRS) is a description of a software system to be


developed. The software requirements specification lays out functional and non-functional requirements, and it may include a set of use cases that describe user interactions that the software must provide. Use cases are also known as functional requirements. In addition to use cases, the SRS also contains non-functional requirements, which impose constraints on the design or implementation. For the hardware requirements, the SRS specifies the logical characteristics of each interface between the software product and the hardware components. It specifies hardware requirements such as memory restrictions, cache size, processor and RAM size that are required for the software to run.
Software requirements specification is a rigorous assessment of requirements before the more
specific system design stages, and its goal is to reduce later redesign.

3.1 Overall Description

The SRS is a document, which describes completely the external behavior of the system.
This section of the SRS describes the general factors that affect the product and its requirements.
The system will be explained in its context to show how the system interacts with other systems
and introduce the basic functionalities of it.

3.1.1 Product Perspective

Product perspective is essentially the relationship of the product to other products, defining if it
is independent or is part of a larger product. The proposed system requires the Python programming language platform and OpenCV for image processing to get the desired output. Worldwide efforts have been made to aid the deaf community in communicating with non-signers, but most existing systems either use specialized sensors or have low performance. This system helps normal people communicate with deaf people, and deaf people can in turn communicate effectively with the public. The system recognizes their sign notations, produces the English equivalent, and then converts the resulting text into a voice signal.

3.1.2 Product Functions

This system is developed to help the deaf community by converting their sign language into text. The system also converts the text into voice signals to help the public understand their language. The system uses a web camera through which it receives video input. A deaf person stands in front of the camera and shows the sign notation. The system converts the video data into a number of frames and determines the alphabet equivalent of each. The system then converts the resulting word to audio data, which can be played through speakers.

3.1.3 Assumptions and Dependencies

The assumptions and dependencies are:

1. We assume that the system is installed with sufficient resources, since it requires a higher-end configuration than average.
2. We assume that the system has a built-in camera to obtain quality video frames as input.
3. The system may show a slight delay in identifying the sign signal because of its computations.

3.2 Specific Requirements

This section includes the detailed description of the hardware requirements, software requirements, functional requirements and non-functional requirements.

3.2.1 Hardware Requirements

Hardware requirements refer to the physical parts of a computer and related devices. Internal hardware devices include motherboards, hard drives and RAM. External hardware devices include monitors, keyboards, mice, printers and scanners.

RAM: 4 GB minimum


Disk: 500 MB of hard disk space

CPU: Intel(R) Core(TM) i5-8265U @ 1.60 GHz

3.2.2 Software Requirements

Operating system: Ubuntu

Language: Python

3.2.3 Functional Requirements:

The system has the following functional modules:

1. Upload Module: This module is used to upload the dataset to the system memory.
2. Presentation Module: This module is used to present the result to end user.
3. Create model: This module is used to create a CNN model to train the system.
4. Train Module: This module is used to train the system with dataset images.
5. Initialize Camera: This module is used to initialize the camera.
6. Classification module: This module is used to classify the input video frames into
different alphabet labels.
7. Text to voice converter: This module is used to convert text into voice signal.

3.2.4 Non-functional Requirements

A non-functional requirement specifies criteria that can be used to judge the operation of a system, rather than specific behaviors. Non-functional requirements are contrasted with functional requirements, which define specific behaviors or functions.

1. Usability: The system can be used by any deaf person without any effort, and it has an appropriate user interface.
2. Maintainability: The software is designed to be user friendly, can be maintained even by less-educated users, and requires little maintenance.
3. Response time: The system has a good response time, so the end user gets the result within the estimated time.

Dept of CSE,MITE MOODBIDRI Page 15


Sign Language Recognition System

4. Software development life cycle: Here the agile method is used, which combines the advantages of the waterfall approach and the iterative model.


CHAPTER 4

GANTT CHART

A Gantt chart is a type of bar chart, developed by Henry Gantt, that illustrates a project schedule. Gantt charts illustrate the start and finish dates of the terminal elements and summary elements of a project. Terminal elements and summary elements comprise the work breakdown structure of the project.

The following is the Gantt chart of the project “Sign Language Recognition System”

No.   Task                                  Start         End           Duration (days)

1     Synopsis                              20-Oct-2020   6-Nov-2020    17
2     Presentation on idea                  9-Nov-2020    16-Nov-2020   7
3     Software Requirement Specification    5-Mar-2021    20-Mar-2021   15
4     System Design                         20-Mar-2021   31-Mar-2021   11
5     Implementation                        1-Apr-2021    29-Apr-2021   29
6     Presentation on work progress         1-May-2021    13-May-2021   13
7     Testing                               16-May-2021   30-May-2021   14
8     Result and Report                     1-Jun-2021    23-Jun-2021   22

Table 4.1: Gantt chart of planning and scheduling of the project


Chapter 5
SYSTEM DESIGN
The system overview provides a top-level view of the entire software product. It highlights the major components without taking into account the inner details of the implementation, and describes the functionality, context and design of the software product. The application is developed in a way that lets the user interact with the system and simplifies tasks by providing a smooth user interface and user experience with an easily readable and understandable view.

5.1 Architectural Diagram


The architectural design gives a description of the overall system design. It is specified by identifying the components and defining the control and data flow between them. The arrows indicate the connections and the rectangular boxes represent the functional units. Fig. 5.1 shows the architectural diagram of this project, covering the overall operation from capturing the video to producing the audio result.

(Pipeline: Capture Video → Preprocess the Image → Feature Extraction → Classify the Image → Determine Alphabet → Audio Result)

Fig. 5.1: Architectural Design Diagram

5.2 Sequence Diagram


A sequence diagram shows how a set of objects communicate with each other to complete a complex task. Fig. 5.2 shows the sequence of operations between the different modules involved in the project.

Dept of CSE,MITE,MOODBIDRI Page 18


Sign Language Recognition System

Fig. 5.2: Sequence Diagram

5.3 Activity Diagram


An activity diagram shows the sequence of steps that make up a complex process. It shows the flow of control, similar to a sequence diagram, but focuses on operations rather than on objects. The components used are as follows:

1. A rounded rectangle indicates a process.
2. An arrow indicates a transition line.
3. A rhombus indicates a decision.
4. Bars represent the start or end of concurrent activities.
5. A solid circle represents the initial state of the workflow.
6. An encircled black circle represents the final state of the workflow.

Dept of CSE,MITE,MOODBIDRI Page 19


Sign Language Recognition System

(Flow: Is Camera Initialized? → Capture Video → Obtain Frames → Determine Alphabets → Audio Result)

Fig. 5.3: Activity Diagram

5.4 Data Flow Diagram


A data flow diagram is a graphical representation of the “flow” of data through an information system, modelling its process aspects. It is often used as a preliminary step to create an overview of the system, which can later be elaborated.

Dept of CSE,MITE,MOODBIDRI Page 20


Sign Language Recognition System

(Admin flow: Admin → Dataset → Preprocessing → Create Bottleneck; Model Creation [number of layers, image dimensions] → Train Model [dataset, no. of iterations] → Model Validation)

Fig. 5.4.1: Admin Data Flow Diagram

Fig. 5.4.1 shows the data flow between the components on the admin side. First the admin preprocesses the dataset to create a bottleneck. The admin also creates a model based on the number of layers and the image dimensions. The model is then trained with the dataset and the number of iterations as inputs; after the training process is completed, the model is validated.

(User flow: User → Input Video [notations, duration] → Split Video into Frames → Classification [model weights] → Alphabet → Text → Text to Voice)

Fig. 5.4.2: User Data Flow Diagram


Fig. 5.4.2 shows the data flow between the components on the user side. First the input video of the sign notations is taken from the user. The video is then split into frames based on its duration. Alphabet classification then takes place, which gives text as output, and finally the text is converted to voice.


Chapter 6
IMPLEMENTATION

Implementation is the core step of the software development life cycle. It gives the detailed view of the project and describes the pseudo code and the various important functions in the project. The implementation phase of software development is concerned with translating design specifications into source code. The user tests the developed system and changes are made according to their needs. Our system has been successfully implemented. Before implementation, several tests were conducted to ensure that no errors are encountered during operation. The implementation phase ends with an evaluation of the system after placing it into operation for a period of time. Implementation is the third phase of the system process.

6.1 Code Snippets


6.1.1 Initializing the camera

The following code is used to check whether the camera is initialized.

import cv2

# Open the device at ID 0
cap = cv2.VideoCapture(0)

# Check whether the user-selected camera opened successfully
if cap.isOpened() == False:
    print("Could not open video device")


6.1.2 Capturing the video

The following code is used to capture the video.

# Optionally set the capture resolution (the old CV_CAP_PROP_* names come
# from OpenCV's legacy C API; the modern constants are cv2.CAP_PROP_*)
# cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
# cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()

    # Mirror the frame horizontally so the preview matches the signer's view
    frame = cv2.flip(frame, 1)
    # print(frame)

6.1.3 Obtaining the frames

The following code, which continues the capture loop above, displays each frame and releases the capture when the user quits.


    # Display the resulting frame (still inside the capture loop above)
    cv2.imshow("preview", frame)

    # Wait for a user input to quit the application
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
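
The loop above only displays the live frames. To actually persist individual frames for later classification, a hedged addition (not shown in the original report) could write every 30th frame to disk with cv2.imwrite:

import cv2

# Hedged sketch: save roughly one frame per second (assuming ~30 fps) so each
# still image can later be passed to the classifier.
cap = cv2.VideoCapture(0)
count = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break
    if count % 30 == 0:
        cv2.imwrite("frame_%d.jpg" % count, frame)
    count += 1
    if count >= 300:  # stop after about ten seconds for this example
        break
cap.release()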

6.1.4 Classifying the alphabets

The following code is used to classify the alphabets and give text as output.


import sys
import os

# Disable TensorFlow compilation warnings (must be set before importing TensorFlow)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf

image_path = sys.argv[1]

# Read the image data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()

# Load the label file and strip off carriage returns
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("logs/output_labels.txt")]

# Unpersist the trained graph from file
with tf.gfile.FastGFile("logs/output_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    # Feed the image data as input to the graph and get the first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})

    # Sort to show the labels of the first prediction in order of confidence
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))


6.1.5 Obtaining Audio as Output

The following code is used to get audio as output.

from gtts import gTTS
from playsound import playsound

message = 'Good Morning'
language = 'en'

# Convert the text to speech and save it as an MP3 file
audio = gTTS(text=message, lang=language, slow=False)
audio.save('myvoice.mp3')

# Play the saved audio file
# call(['vlc', 'myvoice.mp3'])
playsound('myvoice.mp3')


Chapter 7
TESTING
Testing is an activity to check whether the actual results match the expected results.
Testing also helps to identify errors, gaps or missing requirements relative to the actual requirements. Testing is an important phase in the development life cycle of the product. During
the testing, the program to be tested was executed with a set of test cases and the output of the
program for the test cases was evaluated to determine whether the program is performing as
expected. Errors were found and corrected by using the following testing steps and correction
was recorded for future references. Thus, a series of testing was performed on the system before
it was ready for implementation. An important point is that software testing should be
distinguished from the separate discipline of Software Quality Assurance (SQA), which
encompasses all business process areas, not just testing.

7.1 Testing Levels

Testing is part of Verification and Validation. Testing plays a very critical role for quality
assurance and for ensuring the reliability of the software. The objective of testing can be stated in
the following ways.

1. A successful test is one that uncovers as-yet-undiscovered bugs.
2. A better test case has a high probability of finding unnoticed bugs.
3. Testing takes a pessimistic approach, running the software with the intent of finding errors.

Testing can be performed at various levels, such as unit test, integration test and system test.

7.1.1 Unit Test

Unit testing tests the individual components to ensure that they operate correctly. Each component is tested independently, without the other system components. This system was tested with a set of proper test data for each module, and the results were checked against the expected output. Unit testing focuses verification effort on the smallest unit of the software design: the module.

Dept of CSE,MITE,MOODBIDRI Page 28


Sign Language Recognition System

1. After every step of the algorithm is prepared, debugging is carried out to ensure its proper functioning.
2. A test is carried out to check whether the input video is captured or not.
3. A test is carried out to check whether the video is split into image frames and the image is normalized properly.
4. A test is carried out to check whether the symbol in the captured image matches any of the symbol images in the database.

7.1.2 Integration Testing

Integration testing is another aspect of testing that is generally done in order to uncover errors associated with the flow of data across interfaces. The unit-tested modules are grouped together and tested in small segments, which makes it easier to isolate and correct errors. This approach is continued until all the modules have been integrated to form the system as a whole. After the completion of each step, it is combined with the remaining modules to ensure that the project works as expected.

7.1.3 System Testing

System testing tests a completely integrated system to verify that it meets its requirements. After the completion of all the modules, they are combined to test whether the entire project works properly. It deals with testing the whole project for its intended purpose; in other words, the whole system is tested here. System testing was carried out by selecting an image from the project folder and normalizing it for sharpness. The image is then distorted, and for each dataset image in each class (A, B, C, ..., Z) a bottleneck file is created.

7.1.4 Acceptance Testing

The project is tested at different levels to ensure that it works properly and meets the requirements specified in the requirements analysis. Acceptance testing is done once the project is complete and checked for acceptance. The results from the system were compared with the results from the traditional evaluation approach. Then the accuracy of the system was tested by increasing the epochs (iterations).

7.2 Test Cases


A test case is a software testing document which consists of events, actions, input, output, expected result and actual result. Technically, a test case includes a test description, procedure, expected result and remarks. Test cases should be based primarily on the software requirements and developed to verify correct functionality and to establish conditions that reveal potential errors. Individual PASS/FAIL criteria are written for each test case. All the tests need to get a PASS result for the application to be considered working properly.

Test Case 1: Activate the web camera

Objective: To capture the video

Steps: The following steps are followed to carry out the test.

1. In the terminal, give the command source venv/bin/activate to activate the virtual environment.
2. Enter the ls command to list all the files.
3. Enter the command python classify_webcam.py.
4. Two pop-up windows will be displayed.

Expected Result: The web camera is activated.

Result: Successful

Test Case 2: Conversion of text to audio

Objective: To convert sign language to text as well as audio

Steps: The following steps are followed to carry out the test.

1. Once the user stops showing the sign language in front of the camera, the pop-up window closes.
2. The sign language is converted and displayed as text in the terminal, and the audio of the text is also played.

Expected Result: The text is converted to audio.

Result: Successful


CHAPTER 8
RESULTS AND SNAPSHOTS
8.1 Snapshot 1
Figure 8.1 shows how the virtual environment is activated and how the program is run.

Fig 8.1: Activating the virtual environment and running the program

8.2 Snapshot 2
Figure 8.2 shows the user showing the sign language in front of the webcam while the video is captured.

Fig 8.2: User input


8.3 Snapshot 3
Figure 8.3 shows the sign language converted to text and displayed in the terminal.

Fig 8.3: Sign language converted to text


Chapter 9
CONCLUSION AND FUTURE WORK
9.1 Conclusion

Our project aims to make communication simpler between deaf-mute people and others by introducing the computer into the communication path, so that sign language can be automatically captured, recognized, translated to text and displayed on an LCD. The output of the sign language is presented as text as well as a voice signal. This makes the system more efficient and hence makes communication easier for hearing- and speech-impaired people.

9.2 Future Work

In future work, the proposed system can be developed and implemented using a Raspberry Pi. The image processing part should be improved so that the system is able to communicate in both directions, i.e., it should be capable of converting normal language to sign language and vice versa. We will try to recognize signs that include motion. Moreover, we will focus on converting sequences of gestures into text, i.e., words and sentences, and then converting them into speech that can be heard.


REFERENCES

[1] Davi Hirafuji Neiva, Cleber Zanchettin, Gesture recognition: A review focusing on sign language in a mobile context, Expert Systems with Applications, 103, 159-183 (2018).

[2] Suharjito, Ricky Anderson, Fanny Wiryana, Meita Chandra Ariesta, Gede Putra Kusuma, Sign Language Recognition Application Systems for Deaf-Mute People: A Review Based on Input-Process-Output, 2nd International Conference on Computer Science and Computational Intelligence (ICCSCI 2017), 13-14 October 2017, Bali, Indonesia (2017).

[3] Ming Jin Cheok, Zaid Omar, Mohamed Hisham Jaward, A review of hand gesture and sign language recognition techniques (2017).

[4] Tejashri J. Joshi, Shiva Kumar, N. Z. Tarapore, Vivek Mohile, Static Hand Gesture Recognition using an Android Device, International Journal of Computer Applications (0975-8887), Vol. 12, No. 21 (2015).

[5] Pan, T.-Y., Lo, L.-Y., Yeh, C.-W., Li, J.-W., Liu, H.-T., & Hu, M.-C., Real-time sign language recognition in complex background scene based on a hierarchical clustering classification method, in Proceedings of the IEEE Second International Conference on Multimedia Big Data (BigMM), 64-67, IEEE (2016).

