
Research Paper Final Year Final

The document discusses the development of a Gesture and Voice Controlled Virtual Mouse that utilizes computer vision and machine learning to enable users to interact with computers through hand gestures and voice commands, eliminating the need for traditional input devices. It highlights the potential applications of this technology in various fields, including gaming, augmented reality, and assisting individuals with limited mobility. The paper also reviews existing literature on gesture recognition systems, outlining their advantages and limitations.

Uploaded by

Priya Chaudhary

A Survey on Gesture Controlled Virtual Mouse and Voice Assistant in the field of AI and ML

Dr. Surekha KB, Prerna Shankar, Priya Pawan Chaudhary, Piyush Kumar, Ritesh Mehta
BMSIT&M, Bangalore

ABSTRACT
The Gesture and Voice Controlled Virtual Mouse changes the way users interact with computers by integrating voice commands and hand gestures, reducing the reliance on direct physical contact. Through the application of machine learning techniques, this system digitally manages input and output processes, enabling users to navigate and execute actions like left- and right-clicking using dynamic hand motions in lieu of a traditional physical mouse. The implementation leverages machine learning for precise hand identification and integrates Python modules for effective voice assistance. By adopting this approach, the project not only simplifies fundamental mouse operations but also extends its capabilities to include tasks such as adjusting brightness, controlling volume, and managing applications, all while accommodating variations in ambient noise. Notably, the project includes a voice assistant feature, providing an additional layer of convenience. Users can communicate with the virtual assistant through verbal or text instructions. In summary, the Gesture and Voice Controlled Virtual Mouse represents a fusion of Computer Vision, Human-Computer Interaction, Speech Recognition, and Hand Gesture Recognition.

Keywords: Computer Vision, Human-Computer Interaction, Speech Recognition, Hand Gesture Recognition

I. INTRODUCTION
In the ever-evolving landscape of technology, the significance of Human-Computer Interaction (HCI) is reaching new heights. With the widespread adoption of touch screen technology in mobile devices, human-computer interaction has become more intuitive and natural. However, this seamless interaction paradigm has been somewhat limited in desktop systems, where the cost and complexity of implementing touch screen interfaces have posed significant challenges.

This is precisely where Computer Vision steps in as a game-changing innovation. By harnessing the power of Computer Vision, developers can change the way humans interact with computers. Instead of relying on conventional input devices like mice and keyboards, Computer Vision offers an exciting alternative: a virtual engagement device that operates solely with a camera. This approach eliminates the need for physical touch-based interfaces, opening up possibilities for more affordable and accessible HCI solutions.

The core concept behind this technology lies in hand gesture recognition algorithms and methods. By leveraging advanced algorithms, computers can interpret and respond to various hand gestures made in front of a camera. This gesture recognition technology empowers individuals to interact with machines in a manner that feels natural, bridging the gap between human actions and digital responses.

The applications of gesture recognition technology are vast and diverse. From augmented reality experiences that blend the virtual and real worlds, to enhancing computer graphics for immersive visualizations, the potential applications span a wide spectrum. In the realm of computer games, this technology opens doors to interactive gaming experiences that respond dynamically to players' gestures, creating a new level of engagement and excitement.

Moreover, gesture recognition technology holds promise in the field of biomedical equipment, offering solutions for individuals who face challenges in controlling their limbs. By translating subtle hand movements into digital commands, individuals with limited mobility can gain newfound independence and access to digital interfaces, improving their overall quality of life.

This endeavor represents a step toward a future where technology adapts to human behavior, making interactions with computers more natural, accessible, and inclusive. Such projects harness technologies such as computer vision, speech recognition, and machine learning to create a more intuitive and accessible human-computer interaction experience, offering increased convenience and productivity for users and enabling them to control their computers and interact with digital devices through gestures and voice commands.

II. LITERATURE REVIEW
The paper "Human Computer Interaction using manual Hand gestures in real time" by Mohammad Alsaffar, Abdullah Alshammari et al. reports that the system's development primarily relied on key components such as the ADV7183 video encoder, the parallel peripheral interface (PPI), and the DMA Controller, along with asynchronous memory SDRAM, which work in tandem. The PPI, when combined with the DMA Controller, facilitates hardware-based image processing. Subsampling enhances processing speed by reducing memory access without imposing a substantial performance impact on the application.

The system is implemented using a dedicated processor with encoders and controllers. Its drawbacks are the processor's limited storage space, since the gesture is saved in each iteration, and the need for hardware devices to perform the mouse operations.[1]

The paper "Virtual Mouse and Assistant: A Technological Revolution Of AI" by Jagbeer Singh, Yash Goel et al. proposes a system that operates without external hardware, relying on Computer Vision and AI/ML algorithms for automatic hand gesture recognition and assistance. To perform mouse operations via hand gestures, the first step uses OpenCV to activate a webcam and capture real-time images of the user's hands. These images are processed to track landmarks, and the corresponding mouse operations are then executed. The drawback of the system is that it relies on intricate hand gestures to execute mouse functions, which could be difficult for users.[2]

The paper titled "Virtual Mouse Using Hand Gesture Recognition- A Systematic Literature Review" by Prathamesh Shinde, Prof. Pradnya V. Kulkarni and others provides a thorough examination of existing methods and techniques implemented or proposed, along with an exploration of challenges in the field of virtual mouse control using hand gestures. The suggested approach outlines a systematic methodology beginning with image capture through a webcam or built-in camera. The system is then configured to identify hands, allowing users to specify parameters such as the number of hands, minimum detection confidence, and minimum tracking confidence. After image acquisition, conversion to RGB and image processing techniques such as background subtraction and color segmentation facilitate the extraction of hand and finger landmarks. The system subsequently tracks the hand, determining its location and unique identification. The final step involves simulating various mouse actions such as left and right clicks and single and double clicks. This results in a comprehensive process for computer interaction through hand gestures.

One limitation of this system arises when the background color coincides with the designated hand color, potentially causing malfunctions. This issue is exacerbated in environments with varying ambient background lighting.[3]

"Cursor Control System Using Hand Gesture Recognition" by Ashwini M. Patil, Sneha U. Dudhane et al. employs a convex hull algorithm for hand detection and gesture recognition, enabling control of computers and other applications through detected gestures. Skin color characteristics within the YCrCb color space, together with background subtraction, are used to distinguish between human skin and background objects. This process involves capturing an initial frame with just the background, to which subsequent frames are compared. Pixels that pass a specific threshold in this comparison are considered part of the human body and are retained in a new frame, while others are treated as background. This results in a frame with only the human figure against a zero-color background, facilitating more effective detection. The algorithm helps in detecting the fingertips of the hand and in recognizing whether a finger is folded.

The drawback is that the system detects skin pixels and segments the hand, both of which need previously captured images.[4]

The paper "Hand Gesture Recognition System Using Camera" by Viraj Shinde, Tushar Bacchav, Jitendra Pawar et al. describes a system consisting of three primary stages. The first stage involved object detection, where the goal was to identify hand objects within digital images or videos. While improved environmental conditions and camera equipment can help mitigate these problems, achieving control in real-world settings or product applications can be challenging. Therefore, image processing methods served as a more effective solution for handling these image issues and constructing a gesture recognition system that is adaptable and resilient.

In the second stage, hand objects were analyzed to identify specific gestures, addressing key research issues such as extracting distinctive features and selecting effective classifiers. Finally, the third stage entailed the analysis of sequential gestures to recognize user instructions or behaviors.

A limitation of the system was its reliance on the user wearing a black belt on the wrist of the gesturing hand to achieve accurate segmentation of the hand shape, a requirement that will be eliminated in our proposed methodology.[5]

"Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques" by Nasser H. Dardas and Nicolas D. Georganas introduces a real-time system with three stages: hand detection and tracking through background subtraction, skin detection, and contour comparison algorithms. Posture recognition uses bag-of-features and a multiclass SVM. The algorithm extracts key points from training images and maps them into a unified bag-of-words vector with k-means clustering, which is the input to the multiclass SVM classifier. In testing, the classifier categorizes hand postures detected from a webcam, constructing visual word vectors for the key points in a small image. A drawback is the use of cluster and multiclass SVM models, which can be challenging and time-consuming for complex tasks like hand gesture recognition. System accuracy is affected by webcam quality, training image quantity, and the choice of cluster count for model construction.[6]

"A Real Time Hand Gesture Recognition System Using Motion History Image," authored by Chen-Chiung Hsieh and others, introduces a face-centric, adaptive skin color model alongside a motion history image technique that relies on the direction of hand movements. The paper defines six hand gestures, comprising four dynamic gestures and two static gestures. Specifically designed Haar-like features are employed to detect the four directional dynamic hand gestures, while skin color models that adapt to facial characteristics identify the static hand gestures within an area of focus (AOF) around the face. The system was tested with five participants, and the experimental results indicated an average accuracy of 94.1%, validating the proposed system's feasibility. A constraint of the system was its use of more intricate hand movements for mouse operations.[7]

The paper "Hand Gesture Recognition Based on Computer Vision: A Review of Techniques" by Munir Oudah et al. examines various techniques for gesture recognition. The methods explored include color-based recognition, appearance-based recognition, the computer vision approach, and the instrumental glove approach, among others. The paper highlights a flaw in interaction systems, underscoring the necessity for reliable and robust algorithms. Our suggested methodology will enhance this approach using computer vision for hand gesture detection. A camera vision-based sensor proves advantageous, as it enables contactless communication between humans and computers.[8]

The paper "Static Hand Gesture Recognition with Parallel CNN" by Qing Gao, Zhaojie Ju, Jinguo Liu and others introduces a technique involving two channels of parallel Convolutional Neural Networks (CNNs). One channel analyzes hand gesture images in the RGB color space based on color information, while the other processes hand gesture images using depth information. In the experimental phase, the system's performance was compared to a single-channel RGB-CNN, a single-channel Depth-CNN, and prior research by other scholars. The outcomes showed that the parallel CNN approach improves the precision of interpreting hand gestures. This supports smoother spatial Human-Robot Interaction (HRI) tasks, and the parallel CNN proves effective in varying illumination and complex backgrounds. The drawback of this system is that it uses two parallel CNNs, one for the color images of the hand and the other for the depth images, which makes fusing the two more complex.[9]

The paper "The Virtual Assistant- A wearable device for Independent Living of the Visually Impaired" by Balamurugan M et al. describes a device made up of two main parts: the first is a headset that includes an IR sensor, microphone, and earbuds, and the second is a processing unit. The RF module and Arduino UNO provide wireless communication, while IR sensors detect objects in front of the user.

The drawback is that hardware is required to build the RF receiver, decoder, encoder, transmitter, etc. for the smart device, which is costly and difficult to maintain. The proposed technology will not make use of any additional hardware devices in order to execute tasks according to user instructions.[10]

The paper "A Study on Speech Recognition Technology" authored by Dr. Ramalatha Marimuthu et al. underscores the intricacies of speech technology, encompassing processes like sampling, bit resolution, and the recognition of vocal signals. An inherent strength of this technology lies in its natural and unobtrusive signal production, rendering it well-suited for applications involving remote users and tasks reliant on speech interfaces. Notably, speech verification achieves impressive accuracy while placing minimal computational demands on systems.

Nevertheless, a notable drawback of this system is the inherent challenge of maintaining robustness amidst the variability of the communication channel. This limitation poses a considerable hurdle to achieving consistent performance in diverse settings, pointing to an area for potential improvement in future developments of speech recognition technology.[11]
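
To make the bag-of-features step reviewed for [6] more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes key-point descriptors are already available as numeric tuples (the descriptor extraction and the multiclass SVM are omitted), and the function names `kmeans`, `nearest_word`, and `bag_of_words` are hypothetical. The k-means here uses a simple deterministic initialization chosen only for illustration.

```python
import math

def nearest_word(descriptor, codebook):
    """Index of the codebook centroid (visual word) closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: math.dist(descriptor, codebook[i]))

def kmeans(points, k, iterations=10):
    """Tiny Lloyd's k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_word(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids

def bag_of_words(descriptors, codebook):
    """Normalized histogram of visual words for one image's descriptors."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else [0.0] * len(codebook)

# Toy 2-D "descriptors" standing in for real key-point descriptors.
training = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
codebook = kmeans(training, k=2)
vector = bag_of_words([(0, 0), (1, 1), (9, 9)], codebook)
```

In a full pipeline, a vector like `vector` would be computed per image and passed to the classifier; the number of clusters `k` corresponds to the cluster count that the paper identifies as affecting accuracy.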
Table 2.1: REVIEW OF RECENT RESEARCH ARTICLES ON GESTURE CONTROLLED VIRTUAL MOUSE

| Article | Title Of Paper | Description | Advantages | Disadvantages |
|---|---|---|---|---|
| [1] | Human Computer Interaction using manual hand gestures in real time. | The implementation of the system is done using a dedicated processor with few encoders and controllers. | The system identifies 79% of the frames successfully, resulting in a notable reduction in processing time. | Space limitation is a drawback of the processor while saving the gesture in each iteration. |
| [2] | Virtual Mouse and Assistant: A Technological Revolution Of AI | The hand gestures and assistant both work automatically without the help of any external hardware by using Computer Vision and AI ML algorithms. | The system achieves real-time capture of hands with substantial accuracy. | The system uses complex hand gestures to perform mouse operations. |
| [3] | Virtual Mouse Using Hand Gesture Recognition- A Systematic Literature Review | The paper comprehensively addresses all implemented or proposed methods and techniques and their diverse challenges in hand gesture recognition for a virtual mouse. | The system captures hand images, converts them to RGB, and conducts image processing, including background subtraction and color segmentation. | The background color coincides with the designated hand color, causing some malfunctions. This issue is exacerbated in environments with varying ambient background lighting. |
| [4] | Cursor Control System Using Hand Gesture Recognition | The convex hull algorithm is employed to determine whether a finger is located within the palm area by addressing the challenge of identifying the largest polygon encompassing all vertices. | The characteristics of the algorithm help in detecting the tips of the fingers of the hand and recognizing whether a finger is folded or not. | The system detects skin pixels and segments the hand, both of which need previously captured images. |
| [5] | Hand Gesture Recognition System Using Camera | The stages are object detection, object recognition and analyzing sequential gestures. | The system's advantage lies in minimizing external interfaces such as a mouse and keyboard. | For the precise segmentation of hand shape, users need to wear a black belt on the wrist of the gesturing hand. |
| [6] | Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques | During the training stage, the cluster and multiclass SVM classifier models were constructed, and these models were subsequently employed in the testing phase to identify hand gestures captured by a webcam. | The system has effective real-time performance, maintaining satisfactory results across varying frame resolutions. It also exhibits high classification accuracy even under conditions of variable scale, different orientation, illumination, and cluttered background. | The algorithms for multiclass SVM classifier models that were used can be challenging and time-consuming for complex tasks. |
| [7] | A Real Time Hand Gesture Recognition System Using Motion History Image | The paper suggests an adaptive skin color model based on facial features and a method utilizing motion history images derived from hand movement direction. | Recognition rates are good not only for static gestures but also for dynamic gestures. | More intricate hand movements are the principal drawback of its methodology. |
| [8] | Hand Gesture Recognition Based on Computer Vision: A Review of Techniques | The paper reviews the different techniques, including glove-based attached sensors and color-based hand gesture recognition. | The various techniques for hand gesture recognition identify a variety of gestures to perform mouse operations. | Hand gesture recognition seeks to rectify a flaw in interaction systems, emphasizing the necessity for the development of reliable and robust algorithms. |
| [9] | Static Hand Gesture Recognition with Parallel CNN | Parallel CNNs are designed where one channel analyzes images capturing hand gestures in the RGB color space based on color information, while the other processes hand gesture images using depth information. | It works well in changing illumination and complicated backgrounds. | It uses parallel CNNs, one for the color of the hand and the other for the depth of the hand gesture images, making it more complex to fuse the two. |
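
The segmentation and convex hull steps summarized for [4] can be illustrated with a small, dependency-free sketch; it is not the cited authors' code. It assumes frames are nested lists of 8-bit RGB tuples, uses the BT.601 weights for the RGB to YCrCb conversion, and adopts a commonly cited skin range (Cr 133 to 173, Cb 77 to 127) and a difference threshold of 40, all of which are assumptions rather than values from the paper. The hull is computed with Andrew's monotone chain, one of several standard algorithms, and returns the smallest convex polygon enclosing the detected points.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb) with BT.601 weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, (r - y) * 0.713 + 128, (b - y) * 0.564 + 128

def is_skin(pixel, cr_range=(133, 173), cb_range=(77, 127)):
    """Skin test on chroma only; the ranges are assumed, not from the paper."""
    _, cr, cb = rgb_to_ycrcb(*pixel)
    return cr_range[0] <= cr <= cr_range[1] and cb_range[0] <= cb <= cb_range[1]

def hand_pixels(frame, background, diff_threshold=40):
    """(x, y) of pixels that differ from the background frame and look like skin."""
    coords = []
    for y, (row, bg_row) in enumerate(zip(frame, background)):
        for x, (px, bg) in enumerate(zip(row, bg_row)):
            changed = sum(abs(a - b) for a, b in zip(px, bg)) > diff_threshold
            if changed and is_skin(px):
                coords.append((x, y))
    return coords

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain: hull vertices in order, collinear points dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]
```

A typical use would be `convex_hull(hand_pixels(frame, background))`. Hull vertices that lie far from the palm center are natural fingertip candidates, while a folded finger does not protrude and so contributes no hull vertex; that interpretation is a simplification of the approach described in [4].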

Table 2.2: REVIEW OF RECENT RESEARCH ARTICLES ON VOICE ASSISTANTS

| Article | Title Of Paper | Description | Advantages | Disadvantages |
|---|---|---|---|---|
| [10] | The Virtual Assistant - A wearable device for Independent Living | The device is primarily made up of two parts: the first is a headset that includes an IR sensor, microphone, and earbuds, and the second is a processing unit. | The RF module and Arduino provide efficient wireless communication, while IR sensors detect objects in front of the user. | Hardware is required for building the RF receiver, decoder, encoder, transmitter, etc. for developing the smart device. |
| [2] | Virtual Mouse and Assistant: A Technological Revolution Of AI | The system identifies user commands using a speech recognition module. | The suggested system functions as a proficient voice assistant, accomplishing tasks with optimal efficiency. | Only limited functionalities are added. |
| [11] | A Study on Speech Recognition Technology | Speech technology is described using processes like sampling, bit resolution, and identification of speech signals. | Application of speech recognition technology is increasing day by day for remote users and speech interface tasks. | Robustness to channel variability is the biggest challenge to the current systems. |
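
The sampling and bit resolution steps mentioned for [11] can be illustrated with a short sketch. It is a generic example of uniform sampling and quantization, not a description of any reviewed system; the function names and the 16 kHz and 8-bit values are assumptions chosen for illustration.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Sample a unit-amplitude sine tone at a fixed sample rate."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def quantize(sample, bits):
    """Map a sample in [-1.0, 1.0] to a signed integer code of the given bit depth."""
    max_code = 2 ** (bits - 1) - 1
    clipped = max(-1.0, min(1.0, sample))
    return round(clipped * max_code)

def pcm_bit_rate(sample_rate_hz, bits, channels=1):
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bits * channels

# A 4 kHz tone sampled at 16 kHz and stored with 8-bit resolution.
codes = [quantize(s, 8) for s in sample_sine(4000, 16000, 4)]
rate = pcm_bit_rate(16000, 16)  # 16 kHz, 16-bit, mono
```

Higher bit resolution reduces quantization error at the cost of a proportionally higher data rate, which is the trade-off behind the "sampling" and "bit resolution" parameters named in the table.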
Fig 2.1: Statistics of Consumer Adoption of Voice Technology and Digital Assistants

As shown in Fig. 2.1, voice technology and digital assistants have changed the way users interact with technology, offering a hands-free and natural user experience. Their convenience, accessibility, and personalization have fueled their rapid adoption, particularly among younger generations.

III. CONCLUSION
In conclusion, the intersection of Computer Vision and Human-Computer Interaction (HCI) heralds a new era in technology, offering innovative solutions to longstanding challenges. Utilizing gesture recognition algorithms and leveraging cameras as virtual engagement tools has the potential to transform our computer interaction methods. The introduction of this technology opens doors to a range of possibilities, from creating immersive augmented reality experiences to changing gameplay. Moreover, the impact extends far beyond entertainment, reaching into the realm of biomedical applications, where individuals with limited mobility can regain independence through intuitive digital interfaces.

At the heart of our initiative lies the goal to democratize HCI, making it more affordable, accessible, and inclusive for everyone. By eliminating the barriers posed by expensive touch screen technology and intricate desktop setups, it is paving the way for a future where individuals can effortlessly communicate with computers using natural hand gestures.

As researchers delve deeper into the realms of computer vision and gesture recognition, they are not merely redefining the way users interact with machines; they are also fostering a sense of empowerment and independence among users, particularly those with physical limitations. Our vision is to create a world where technology adapts seamlessly to human behavior, enhancing lives, fostering creativity, and bridging the gap between the digital and physical worlds.

In essence, the fusion of Computer Vision and HCI represents a significant leap forward, marking a paradigm shift in the way users perceive and engage with technology. As developers continue to innovate and explore the potential of these technologies, they are shaping a future where human-computer interactions are not just interfaces but meaningful and natural dialogues. Such a system has the potential to enable a multitude of operations to be executed without the need for physical touch.

IV. REFERENCES
[1] Mohammad Alsaffar, Abdullah Alshammari, Gharbi Alshammari, Tariq S Almurayziq, Saud Aljaloud, Dhahi Alshammari, and Assaye Belay, "Human Computer Interaction using manual Hand gestures in real time", 2023
[2] Jagbeer Singh, Yash Goel, Shubhi Jain, Shiva Yadav, "Virtual Mouse and Assistant: A Technological Revolution Of AI", 2022
[3] Renuka Annachhatre, Miti Tamakuwala, Prathamesh Shinde, Prof. Pradnya V. Kulkarni, "Virtual Mouse Using Hand Gesture Recognition- A Systematic Literature Review", 2022
[4] Ashwini M. Patil, Sneha U. Dudhane, Monika B. Gandhi, Nilesh J. Uke, "Cursor Control System Using Hand Gesture Recognition", 2021
[5] Viraj Shinde, Tushar Bacchav, Jitendra Pawar, et al., "Hand Gesture Recognition System Using Camera"
[6] Nasser H. Dardas and Nicolas D. Georganas, "Real-Time Hand Gesture Detection and Recognition Using Bag-of-Features and Support Vector Machine Techniques", 2021
[7] Chen-Chiung Hsieh, Dung-Hua Lio and David Lee, "A Real Time Hand Gesture Recognition System Using Motion History Image", 2020
[8] Munir Oudah, Ali Al-Naji and Javaan Chahl, "Hand Gesture Recognition Based on Computer Vision: A Review of Techniques", 2020
[9] Qing Gao, Jinguo Liu, Zhaojie Ju, Yangmin Li, Tian Zhang, and Lu Zhang, "Static Hand Gesture Recognition with Parallel CNN", 2020
[10] Balamurugan M, Logesh, Nirmal Kumar, Prasath S, Saranraj N, "The Virtual Assistant- A wearable device for Independent Living", 2023
[11] Arul.V.H, Dr. Ramalatha Marimuthu, "A Study on Speech Recognition Technology", 2021
[12] Md. Zahirul Islam, Department of Computer Science and Engineering, University of Chittagong, Chittagong, Bangladesh, "Static Hand Gesture Recognition using Convolutional Neural Network", Article, April 2021
[13] Mayur V. Gore, Department of Electronics, Government College of Engineering, Aurangabad, "Human Computer Interaction using Hand Gesture Recognition", Article, April 2021
[14] Lokesh Gagnani, Hetvi Patel, Sakshi Chaturvedis, Rajeshwari Jaiswal, Computer Engineering Department, LDRP Institute of Technology and Research, Gandhinagar, Gujarat, India, "Gesture Controlled Mouse and Voice Assistant", Article, 10 Oct 2022
[15] Swapnil D. Badgujar, Gourab Talukdar, Omkar Gondhalekar, Mrs. S.Y. Kulkarni, "Hand Gesture Recognition System", 2020
[16] Hsiang-Yueh Lai, Han-Jheng Lai, "Real-Time Dynamic Hand Gesture Recognition", 2020
[17] Ahmad Puad Ismail, Farah Athirah Abd Aziz, Nazirah Mohamat Kasim, Kamarulazhar Daud, "Hand gesture recognition on python and opencv", 2020
[18] Ruchi Manish Gurav, Premanand K. Kadbe, "Real time Finger Tracking and Contour Detection for Gesture Recognition using OpenCV", 2020
[19] Ehsan Niloy, Jannatun Meghna, Mohammad Shahriar, "Hand Gesture-Based Character Recognition Using OpenCV and Deep Learning", 2021
[20] Abhay Dekate, Chaitanya Kulkarni, Rohan Killedar, Department of Computer Engineering, AISSMS College of Engineering, Pune, Maharashtra, India, "Study of Voice Controlled Personal Assistant Device", Article, December 2021