Hand Gesture Recognition
Abstract—A hand gesture recognition system provides a natural, innovative and modern way of non-verbal communication. The setup consists of a computer's web camera, which captures an image of the gesture performed by the user; this hand image is the input to the proposed algorithm, in which the captured image is first segmented and the collected image dataset is then fed into a machine learning model. The intention of this paper is to discuss a novel approach to hand gesture recognition based on the detection of shape-based features. The overall algorithm is divided into several steps: hand segmentation, orientation detection, feature extraction and classification. The proposed algorithm is independent of user characteristics. Our model achieves an accuracy of 99.0% in recognizing ten different hand gestures, which shows the reliability and feasibility of the proposed method. Further, we also developed a robotic mini car that is controlled through the predicted gestures.

Index Terms—Hand Gesture Recognition; Computer Vision; Convolutional Neural Network; Gesture Controlled Mini Car.

I. INTRODUCTION

With the advancement of technology, more and more tasks are automated, and robots are employed to accomplish them across many application areas. Computer vision and machine learning have proved to be especially effective in automation. One such application is hand gesture recognition, in which different gestures are recognized by a machine after it is trained on a large number of instances of such gesture images. Hand gesture recognition is useful in many settings; for instance, it can help physically handicapped people convey messages. In this paper, OpenCV libraries and machine learning algorithms are used to recognize hand gestures, and an Arduino-based robot performs a specific task corresponding to each gesture: when a forward gesture is made, for example, the machine reads the gesture as an image from its webcam, predicts the type of gesture, and the robot moves forward. To recognize gestures, the input image is first converted to a segmented form using a computer vision pipeline comprising several steps: background subtraction using the running average technique, motion detection and thresholding, and contour extraction. For training, a deep convolutional neural network is built together with the pretrained RESNET 50 model. Recently, classification with deep convolutional neural networks has been successful in various recognition challenges, and multi-column deep CNNs that employ multiple parallel networks have been shown to improve on the recognition rates of single networks.

II. METHODOLOGY

The work we present in this paper follows a three-step process:
• Computer vision techniques applied for preprocessing the dataset.
• A convolutional neural network to predict gestures performed by the user in real time.
• Building the autonomous robot using Arduino and operating it with the predicted gestures.

A. Computer Vision

The computer vision techniques used to preprocess the hand detected in the webcam are implemented in Python with OpenCV. Before anything else, the hand region needs to be segmented from the background environment [11]. This objective is achieved through three underlying methods:

1) Background Subtraction: Background subtraction essentially means eliminating the background from the foreground. It employs the concept of running averages, in which the system is made to study the scene for some 30 frames while the running average is computed; this duration allows the system to register the background in its memory. The running average algorithm takes into account the current frame, the previous frames and a predefined weight on the input image. The function that continually updates the running average of a frame sequence is:

dst(x, y) = (1 − a) · dst(x, y) + a · src(x, y)

where:
• src — the input image (1 or 3 channels);
• dst — the output (accumulator) image, which must have the same number of channels as the source image;
• a — the weight of the input image, which determines how fast the update happens, i.e., how fast the accumulator dismisses the contents of previous frames from its memory [1].

The background is therefore modelled using running averages, and this model is used to eliminate the background from the foreground, i.e., the hand, once it is brought into the frame of the camera. The absolute difference between the background just modelled and each foreground frame then isolates the hand region for the subsequent thresholding step.
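As a concrete illustration, the following is a minimal sketch of this background-subtraction step using OpenCV's running-average accumulator (cv2.accumulateWeighted implements the update rule above). The camera index, blur kernel, weight and threshold value here are illustrative assumptions, not the paper's exact settings.

```python
# Sketch of running-average background subtraction (OpenCV 4.x assumed).
import cv2

ACCUM_WEIGHT = 0.5       # 'a' in dst = (1 - a)*dst + a*src  (assumed value)
CALIBRATION_FRAMES = 30  # frames used to learn the background

cap = cv2.VideoCapture(0)  # camera index assumed
background = None

# Phase 1: study the scene and model the background.
for _ in range(CALIBRATION_FRAMES):
    _, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (7, 7), 0)
    if background is None:
        background = gray.astype("float")  # accumulator must be float
    else:
        # dst(x, y) = (1 - a)*dst(x, y) + a*src(x, y)
        cv2.accumulateWeighted(gray, background, ACCUM_WEIGHT)

# Phase 2: absolute difference against the model, threshold to a
# binary image, then extract the hand as the largest contour.
_, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7, 7), 0)
diff = cv2.absdiff(cv2.convertScaleAbs(background), gray)
_, thresh = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
if contours:
    hand = max(contours, key=cv2.contourArea)  # largest region = hand

cap.release()
```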
C. Experimental Results

In this paper, we chose ten gesture types [10] for recognition; Fig. 2 shows the gestures to be recognized. The images, of size 215×240 pixels, were captured through a web cam. First, background elimination was performed on each image to extract the foreground object. The resulting image was then thresholded so that each pixel takes only the value 1 or 0 [13]; performing this after background elimination saves computational time for the CNN. Contouring was then applied to the resulting image to find the boundaries of the foreground object. We took 10000 images from a subject to train and 1000 images to test the CNN architecture. To validate the proposed method, we used the categorical cross-entropy loss function. The network was trained for 10 iterations with a batch size of 64, and the ratio of training set to validation set was 1000:100. The proposed model achieves an accuracy of 99% on the validation dataset. The confusion matrix shown in Fig. 4 presents the validation results: it shows which classes were correctly classified and which were not, with the first column representing the actual classes and the first row the predicted classes. This trained model was further used for real-time gesture prediction. As shown in Fig. 5, the web cam was turned on and the subject had to position his hand gesture inside the displayed box; the predicted class and its confidence score were then displayed on the screen.

Fig. 5: Real-time prediction
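To make the training setup concrete, here is a minimal Keras sketch of a classifier built on the pretrained RESNET 50 backbone and trained with categorical cross-entropy for 10 iterations at batch size 64, as described above. The paper does not name its framework, head architecture or data layout, so the dense head, input size and directory structure are assumptions.

```python
# Sketch of the training setup; framework (Keras) and head are assumed.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input

NUM_CLASSES = 10
BATCH_SIZE = 64
EPOCHS = 10  # the "10 iterations" reported in the paper

# Pretrained ResNet50 backbone with a small classification head.
base = ResNet50(weights="imagenet", include_top=False,
                input_shape=(224, 224, 3))
base.trainable = False  # reuse pretrained features

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # loss used in the paper
              metrics=["accuracy"])

# Preprocessed gesture images, assumed stored one folder per class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "gestures/train", image_size=(224, 224),
    batch_size=BATCH_SIZE, label_mode="categorical")
val_ds = tf.keras.utils.image_dataset_from_directory(
    "gestures/val", image_size=(224, 224),
    batch_size=BATCH_SIZE, label_mode="categorical")

# Apply ResNet50's expected input normalisation.
train_ds = train_ds.map(lambda x, y: (preprocess_input(x), y))
val_ds = val_ds.map(lambda x, y: (preprocess_input(x), y))

model.fit(train_ds, validation_data=val_ds, epochs=EPOCHS)
```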
D. Hardware Implementation Using Arduino

Arduino boards are used extensively in robotics and embedded systems. Our gesture controlled mini car is built around an Arduino Nano, a microcontroller board developed by Arduino.cc and based on the ATmega328P microcontroller [4]. This microcontroller supports 8-bit data and has 32 KB of built-in internal memory. The library employed to let Python establish serial communication with the Arduino is pySerial, a Python API module that provides access to the serial port and works consistently across operating systems such as Windows and Linux [7]. The gesture predictor code is modified to include the serial libraries and to pass a value for each predicted gesture to the Arduino (Palm sends '0', ThumbsUp sends '1', ThumbsDown sends '2', right sends '3' and left sends '4'). Through the Arduino code, the bot is instructed to perform a specific, distinct action for each data value it receives: it stops for Palm, moves forward for ThumbsUp, moves backward for ThumbsDown, turns right for the gesture right and turns left for the gesture left (see Fig. 6) [5].
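The following is a minimal sketch of this pySerial link. The port name and baud rate are assumptions, while the gesture-to-character mapping follows the paper.

```python
# Sketch of the Python-to-Arduino serial link using pySerial.
import serial

# Mapping taken from the paper: Palm '0', ThumbsUp '1',
# ThumbsDown '2', right '3', left '4'.
GESTURE_CODES = {
    "Palm": b"0",
    "ThumbsUp": b"1",
    "ThumbsDown": b"2",
    "right": b"3",
    "left": b"4",
}

# Port name and baud rate are assumed; adjust for your setup.
arduino = serial.Serial("/dev/ttyUSB0", 9600, timeout=1)

def send_gesture(label: str) -> None:
    """Forward a predicted gesture label to the Arduino Nano."""
    code = GESTURE_CODES.get(label)
    if code is not None:
        arduino.write(code)
```

On the Arduino side, the sketch can read each incoming byte with Serial.read() and branch on its value to drive the motors for the corresponding action.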
III. CONCLUSION

In this paper we developed a convolutional neural network based human hand gesture recognition system. The main objective of the project is to make the lives of physically challenged and paralysed people easier. The image is preprocessed using the computer vision techniques described in the paper, and the resulting image is fed into the deep neural network for training. In this way the model was trained on 10000 images of 10 different gestures, and it achieved an accuracy of 99% on a validation data set of 1000 test images. Lastly, the trained model was successfully used to control a prototype of a simplified gesture-controlled car: the system steers the car, which runs at a fixed speed, and can be operated through simple hand gestures from a few feet away. The result is economical, easily accessible and, above all, a practical aid for its intended users.
REFERENCES
[1] Qing Chen, Nicolas D. Georganas, and Emil M. Petriu. Real-time vision-based hand gesture recognition using Haar-like features. pages 1–6, 2007.
[2] Hong Cheng, Lu Yang, and Zicheng Liu. Survey on 3D hand gesture recognition. IEEE Transactions on Circuits and Systems for Video Technology, 26(9):1659–1673, 2016.
[3] Srinivas Ganapathyraju. Hand gesture recognition using convexity hull defects to control an industrial robot. pages 63–67, 2013.
[4] Vinayak Kamath and Sandeep Bhat. Kinect sensor based real-time robot path planning using hand gesture and clap sound. pages 129–134, 2014.
[5] Wei Lu, Zheng Tong, and Jinghui Chu. Dynamic hand gesture recognition with Leap Motion controller. IEEE Signal Processing Letters, 23(9):1188–1192, 2016.
[6] Sébastien Marcel, Olivier Bernier, J.-E. Viallet, and Daniel Collobert. Hand gesture recognition using input-output hidden Markov models. pages 456–461, 2000.
[7] Pavlo Molchanov, Shalini Gupta, Kihwan Kim, and Kari Pulli. Multi-sensor system for driver's hand-gesture recognition. 1:1–8, 2015.
[8] G. R. S. Murthy and R. S. Jadon. Hand gesture recognition using neural networks. pages 134–138, 2010.
[9] Guillaume Plouffe and Ana-Maria Cretu. Static and dynamic hand gesture recognition in depth data using dynamic time warping. IEEE Transactions on Instrumentation and Measurement, 65(2):305–316, 2016.
[10] U. Rajkanna, M. Mathankumar, and K. Gunasekaran. Hand gesture based mobile robot control using PIC microcontroller. pages 1687–1691, 2014.
[11] Joyeeta Singha, Amarjit Roy, and Rabul Hussain Laskar. Dynamic hand gesture recognition using vision-based approach for human–computer interaction. Neural Computing and Applications, 29(4):1129–1141, 2018.
[12] Gjorgji Strezoski, Dario Stojanovski, Ivica Dimitrovski, and Gjorgji Madjarov. Hand gesture recognition using deep convolutional neural networks. pages 49–58, 2016.
[13] Chong Wang, Zhong Liu, and Shing-Chow Chan. Superpixel-based hand gesture recognition with Kinect depth camera. IEEE Transactions on Multimedia, 17(1):29–39, 2015.