Research Paper Final Year Final
Dr. Surekha KB, Prerna Shankar, Priya Pawan Chaudhary, Piyush Kumar, Ritesh Mehta
BMSIT&M, Bangalore
ABSTRACT
The Gesture and Voice Controlled Virtual Mouse changes the way users interact with computers by integrating voice commands and hand gestures, reducing the reliance on direct physical contact. Through the application of machine learning techniques, the system manages input and output processes digitally, enabling users to navigate and execute actions such as left- and right-clicking using dynamic hand motions in lieu of a traditional physical mouse. The implementation uses machine learning for precise hand identification and integrates Python modules for voice assistance. With this approach, the project not only simplifies fundamental mouse operations but also extends to tasks such as adjusting brightness, controlling volume, and managing applications, all while accommodating variations in ambient noise. The project also includes a voice assistant feature, providing an additional layer of convenience: users can communicate with the virtual assistant through verbal or text instructions. In summary, the Gesture and Voice Controlled Virtual Mouse combines Computer Vision, Human-Computer Interaction, Speech Recognition, and Hand Gesture Recognition.

Keywords: Computer Vision, Human-Computer Interaction, Speech Recognition, Hand Gesture Recognition

I. INTRODUCTION

In the ever-evolving landscape of technology, the significance of Human-Computer Interaction (HCI) is reaching new heights. With the widespread adoption of touch screen technology in mobile devices, human-computer interaction has become more intuitive and natural. However, this seamless interaction paradigm has been somewhat limited in desktop systems, where the cost and complexity of implementing touch screen interfaces have posed significant challenges.

This is precisely where Computer Vision steps in as a game-changing innovation. Harnessing Computer Vision can change the way humans interact with computers. Instead of relying on conventional input devices such as mice and keyboards, Computer Vision offers an alternative: a virtual engagement device that operates solely with a camera. This approach eliminates the need for physical touch-based interfaces, opening up possibilities for more affordable and accessible HCI solutions.

The core concept behind this technology lies in hand gesture recognition algorithms and methods. By leveraging these algorithms, computers can interpret and respond to various hand gestures made in front of a camera. Gesture recognition technology lets individuals interact with machines in a manner that feels natural, bridging the gap between human actions and digital responses.

The applications of gesture recognition technology are vast and diverse. They range from augmented reality experiences that blend the virtual and real worlds to computer graphics for immersive visualizations. In computer games, the technology enables interactive experiences that respond dynamically to players' gestures, creating a new level of engagement.

Moreover, gesture recognition technology holds promise in the field of biomedical equipment, offering solutions for individuals who face challenges in controlling their limbs. By translating subtle hand movements into digital commands, individuals with limited mobility can gain independence and access to digital interfaces, improving their overall quality of life.

This endeavor represents a step toward a future where technology adapts to human behavior, making interactions with computers more natural, accessible, and inclusive. Such projects harness technologies such as computer vision, speech recognition, and machine learning to create a more intuitive and accessible human-computer interaction experience. They offer increased convenience and productivity and enable users to control their computers and interact with digital devices through gestures and voice commands.
II. LITERATURE REVIEW

The paper "Human Computer Interaction using manual Hand gestures in real time" by Mohammad Alsaffar, Abdullah Alshammari et al. describes a system whose development primarily relied on key components such as the ADV7183 video encoder, the parallel peripheral interface (PPI), and the DMA Controller, along with asynchronous memory SDRAM, which work in tandem. The PPI, when combined with the DMA Controller, facilitates hardware-based image processing. Subsampling enhances processing speed by reducing memory access without imposing a substantial performance impact on the application.

The system is implemented using a dedicated processor with encoders and controllers. Its drawbacks are the limited storage space on the processor, since the gesture is saved in each iteration, and the need for hardware devices to perform the mouse operations. [1]

The paper "Virtual Mouse and Assistant: A Technological Revolution Of AI" by Jagbeer Singh, Yash Goel et al. proposes a system that operates without the need for external hardware, relying on Computer Vision and AI/ML algorithms for automatic hand gesture recognition and assistance. To perform mouse operations via hand gestures, the system first uses OpenCV to activate a webcam and capture real-time images of the user's hands. These images are processed to track landmarks, and the corresponding mouse operations are then executed. The drawback of the system is that it relies on intricate hand gestures to perform mouse operations, which could be difficult for users. [2]

The paper "Virtual Mouse Using Hand Gesture Recognition- A Systematic Literature Review" by Prathamesh Shinde, Prof. Pradnya V. Kulkarni and others provides a thorough examination of existing methods and techniques, implemented or proposed, along with an exploration of challenges in the field of virtual mouse control using hand gestures. The suggested approach outlines a systematic methodology beginning with image capture through a webcam or built-in camera. The system is then configured to identify hands, allowing users to specify parameters such as the number of hands, minimum detection confidence, and minimum tracking confidence. After image acquisition, conversion to RGB and the application of image processing techniques, such as background subtraction and color segmentation, facilitate the extraction of hand and finger landmarks. The system subsequently tracks the hand, determining its location and unique identification. The final step involves simulating mouse actions such as left and right clicks, both single and double. The result is a comprehensive process for computer interaction through hand gestures.

One limitation of this system arises when the background color coincides with the designated hand color, potentially causing malfunctions. This issue is exacerbated in environments with varying ambient background lighting. [3]

"Cursor Control System Using Hand Gesture Recognition" by Ashwini M. Patil, Sneha U. Dudhane et al. is a related work in which a convex hull algorithm is employed for hand detection and gesture recognition, enabling control of computers and other applications through detected gestures. Skin color characteristics within the YCrCb color space are used, together with background subtraction, to distinguish human skin from background objects. The process captures an initial frame containing only the background, and subsequent frames are compared to it. Pixels that pass a specific threshold are considered part of the human body and are retained in a new frame, while the others are treated as background. This results in a frame with only the human figure against a zero-color background, facilitating more effective detection. The algorithm detects the fingertips of the hand and recognizes whether a finger is folded.

The drawback is that the system detects skin pixels and segments the hand, both of which need previously captured images. [4]

The paper "Hand Gesture Recognition System Using Camera" by Viraj Shinde, Tushar Bacchav, Jitendra Pawar et al. describes a system consisting of three primary stages. The first stage involved object detection, where the goal was to identify hand objects within digital images or videos. While improved environmental conditions and camera equipment can help mitigate image issues at this stage, achieving control in real-world settings or product applications can be challenging. Therefore, image processing methods served as a more effective solution for handling these image issues and constructing a gesture recognition system that is adaptable and resilient.

In the second stage, hand objects were analyzed to identify specific gestures, addressing key research issues such as extracting distinctive features and selecting effective classifiers. Finally, the third stage entailed the analysis of sequential gestures to recognize user instructions or behaviors.

A limitation of the system was its reliance on the user wearing a black belt on the wrist of the gesturing hand to achieve accurate segmentation of the hand shape, a requirement that is eliminated in our proposed methodology. [5]
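The background-subtraction step described for the cursor control system [4] can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested Python lists and a hypothetical threshold of 30; the function name `subtract_background` and the sample pixel values are illustrative only and are not taken from the reviewed paper, which works on camera frames in the YCrCb color space.

```python
from typing import List

# A grayscale frame as rows of 0-255 intensities (assumed representation).
Frame = List[List[int]]


def subtract_background(background: Frame, frame: Frame, threshold: int = 30) -> Frame:
    """Keep pixels that differ from the background by more than `threshold`.

    All other pixels are set to 0, giving the "zero-color background"
    described in the text. The threshold value is a hypothetical choice.
    """
    same_height = len(background) == len(frame)
    same_width = all(len(b) == len(f) for b, f in zip(background, frame))
    if not (same_height and same_width):
        raise ValueError("background and frame must have the same dimensions")

    result: Frame = []
    for bg_row, fr_row in zip(background, frame):
        result.append(
            [px if abs(px - bg) > threshold else 0 for bg, px in zip(bg_row, fr_row)]
        )
    return result


# Initial frame containing only the background.
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
# Later frame in which a bright object (e.g. a hand) has entered the scene.
frame = [
    [12, 10, 11, 10],
    [10, 180, 175, 10],
    [10, 190, 10, 9],
]

foreground = subtract_background(background, frame)
for row in foreground:
    print(row)
```

Small intensity changes caused by sensor noise stay below the threshold and are discarded, while the pixels belonging to the new object are retained. Because the comparison is made against a fixed initial frame, the sketch also shows why such systems depend on a previously captured background image.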
"Real-Time Hand Gesture Detection and
Recognition Using Bag-of-Features and Support Vector Machine Techniques" by Nasser H. Dardas and Nicolas D. Georganas introduces a real-time system with three components: hand detection and tracking through background subtraction, skin detection, and contour comparison algorithms. Posture recognition is performed using bag-of-features and a multiclass SVM. The algorithm extracts key points from training images and maps them into a unified bag-of-words vector with k-means clustering, which is the input to the multiclass SVM classifier. In testing, the classifier categorizes detected hand postures from a webcam, constructing visual word vectors for key points in a small image. A drawback is the use of cluster and multiclass SVM models, which can be challenging and time-consuming for complex tasks like hand gesture recognition. System accuracy is affected by webcam quality, training image quantity, and the choice of cluster count for model construction. [6]

The paper "Static Hand Gesture Recognition with Parallel CNN" by Qing Gao, Zhaojie Ju, Jinguo Liu and others introduces a technique involving two channels of parallel Convolutional Neural Networks (CNNs). One channel analyzes hand gesture images in the RGB color space based on color information, while the other processes hand gesture images using depth information. In the experimental phase, the system's performance was compared to single-channel RGB-CNN, single-channel Depth-CNN, and prior research by other scholars. The outcomes revealed that the parallel CNN approach improves the precision of interpreting hand gestures. This supports smoother spatial Human-Robot Interaction (HRI) tasks, and the parallel CNN proves effective in varying illumination and complex backgrounds.

The drawback of this system is that it uses two parallel CNNs, one for the color of the hand and the other for depth images of the hand gesture, which makes fusing the two more complex. [9]
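The bag-of-features step summarized for [6] can be sketched as follows. This is a minimal illustration, assuming the visual vocabulary (the k-means cluster centers) has already been computed and that key-point descriptors are plain numeric vectors. The toy 2-D descriptors and the names `nearest_word` and `bag_of_words` are hypothetical and do not come from the reviewed paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor: Vector, vocabulary: List[Vector]) -> int:
    """Index of the cluster center (visual word) closest to the descriptor."""
    return min(
        range(len(vocabulary)),
        key=lambda i: squared_distance(descriptor, vocabulary[i]),
    )


def bag_of_words(descriptors: List[Vector], vocabulary: List[Vector]) -> List[float]:
    """Normalized histogram of visual words for one image.

    This fixed-length vector is what would be passed to a multiclass
    classifier such as an SVM.
    """
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]


# Assumed vocabulary of three visual words (would come from k-means in practice).
vocabulary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
# Toy key-point descriptors extracted from a single image.
descriptors = [(1.0, 1.0), (9.0, 9.0), (0.0, 1.0), (1.0, 9.0)]

histogram = bag_of_words(descriptors, vocabulary)
print(histogram)
```

Every image yields a vector whose length equals the number of clusters, regardless of how many key points were detected. This is why the choice of cluster count directly affects the classifier input and, as noted in the text, the system's accuracy.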
Ref. | Title | Method | Advantages | Limitations
[2] | Virtual Mouse and Assistant: A Technological Revolution Of AI | The hand gestures and assistant both work automatically without the help of any external hardware by using Computer Vision and AI ML algorithms. | The system achieves real-time capture of hands with substantial accuracy. | The system uses complex hand gestures to perform mouse operations.
[3] | Virtual Mouse Using Hand Gesture Recognition- A Systematic Literature Review | The paper comprehensively addresses all implemented or proposed methods and techniques and their diverse challenges in hand gesture recognition for a virtual mouse. | The system captures hand images, converts them to RGB, and conducts image processing, including background subtraction and color segmentation. | When the background color coincides with the designated hand color, malfunctions can occur. This issue is exacerbated in environments with varying ambient background lighting.
[4] | Cursor Control System Using Hand Gesture Recognition | The convex hull algorithm is employed to determine whether a finger is located within the palm area by addressing the challenge of identifying the largest polygon encompassing all vertices. | The characteristics of the algorithm help in detecting the fingertips of the hand and recognizing whether a finger is folded. | The system detects skin pixels and segments the hand, both of which need previously captured images.
[5] | Hand Gesture Recognition System Using Camera | The stages are object detection, object recognition, and analysis of sequential gestures. | The system's advantage lies in minimizing external interfaces such as a mouse and keyboard. | For precise segmentation of the hand shape, users need to wear a black belt on the wrist of the gesturing hand.
[6] | Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques | During the training stage, the cluster and multiclass SVM classifier models were constructed, and these models were subsequently employed in the testing phase to identify hand gestures captured by a webcam. | The system has effective real-time performance, maintaining satisfactory results across varying frame resolutions. It also exhibits high classification accuracy under variable scale, orientation, and illumination, and against cluttered backgrounds. | The algorithms for the multiclass SVM classifier models that were used can be challenging and time-consuming for complex tasks.
[7] | A Real Time Hand Gesture Recognition System Using Motion History Image | The paper suggests an adaptive skin color model based on facial features and a method utilizing motion history images derived from hand movement direction. | Recognition rates are good not only for static gestures but also for dynamic gestures. | More intricate hand movements are the principal drawback of its methodology.
[8] | Hand Gesture Recognition Based on Computer Vision: A Review of Techniques | The paper reviews the different techniques, including glove-based attached-sensor and color-based hand gesture recognition. | The various techniques for hand gesture recognition identify a variety of gestures to perform mouse operations. | Hand gesture recognition seeks to rectify a flaw in interaction systems, emphasizing the necessity for the development of reliable and robust algorithms.
[9] | Static Hand Gesture Recognition with Parallel CNN | Parallel CNNs are designed in which one channel analyzes images capturing hand gestures in the RGB color space based on color information, while the other processes hand gesture images using depth information. | It works well in changing illumination and complicated backgrounds. | It uses two parallel CNNs, one for the color of the hand and the other for depth images of the hand gesture, which makes fusing the two more complex.

Ref. | Title | Method | Advantages | Limitations
[2] | Virtual Mouse and Assistant: A Technological Revolution Of AI | The system identifies user commands using a speech recognition module. | The suggested system functions as a proficient voice assistant, accomplishing tasks with optimal efficiency. | Only limited functionalities are added.