Final Year Report 7th Sem
A
MINOR PROJECT REPORT
Submitted by
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
(NOV 2024)
MAHARAJA AGRASEN INSTITUTE OF TECHNOLOGY
Department of Computer Science and Engineering
CERTIFICATE
This is to certify that the minor project report “AI VIRTUAL MOUSE” is submitted by
“PUSHPENDER SINGH (00614807222), GAURAV (00714807222), SAHIL MISHRA (35114807222)”,
who carried out the project work under my supervision.
ABSTRACT

Key objectives of the project include the design and implementation of robust AI algorithms capable of
accurately tracking hand movements in real-time, thereby enabling precise cursor manipulation. Additionally,
the system aims to incorporate machine learning models for gesture recognition, allowing users to execute
various commands and actions through intuitive hand gestures. The integration of deep learning techniques
facilitates continuous improvement and adaptation to diverse user behavior and environments.
The project encompasses both software and hardware components, including the development of a user-
friendly interface for seamless interaction and calibration procedures to optimize performance across
different contexts. Extensive testing and evaluation methodologies are employed to assess the accuracy,
responsiveness, and usability of the virtual mouse system across various platforms and applications.
The outcomes of this project hold significant implications for enhancing accessibility and user experience
in computing environments, particularly for individuals with physical disabilities or limitations. Moreover,
the research contributes to the advancement of AI-driven human-computer interaction paradigms, paving
the way for future innovations in the field.
ACKNOWLEDGEMENT
It gives me immense pleasure to express my deepest sense of gratitude and sincere thanks to my
respected guide Ms. Deepti Gupta (Assistant Professor, CSE), MAIT Delhi, for her valuable
guidance, encouragement, and help in completing this work. Her useful suggestions for this
whole work and co-operative behavior are sincerely acknowledged.
Date:
TABLE OF CONTENTS
Certificate
Abstract
Acknowledgment
Table of Contents
List of Figures
References
Appendix
List of Figures
CHAPTER 1: INTRODUCTION AND LITERATURE REVIEW
1.1 INTRODUCTION
In the ever-evolving landscape of human-computer interaction, the quest for seamless and
intuitive interfaces remains an ongoing endeavor. The emergence of Artificial Intelligence (AI)
has not only revolutionized numerous sectors but has also sparked a paradigm shift in how we
perceive and interact with technology. Among the myriad applications of AI, one area that has
garnered significant attention is the development of AI-driven virtual mice, promising to redefine
the way we navigate digital environments.
The conventional computer mouse, while undoubtedly a groundbreaking invention, comes with
its limitations. Its reliance on physical manipulation imposes constraints on users, particularly
those with mobility impairments. Additionally, the need for a physical surface and the inherent
restrictions of 2D movement hinder the natural fluidity of human-computer interaction.
Recognizing these limitations, researchers and developers have turned to AI to devise innovative
solutions that transcend the traditional boundaries of input devices.
This project report delves into the design, development, and implementation of an AI virtual
mouse system, aiming to provide a comprehensive understanding of its underlying principles,
functionalities, and potential applications. Through an interdisciplinary approach encompassing
computer science, human-computer interaction, and assistive technology, this report seeks to
shed light on the transformative potential of AI virtual mice in redefining the future of human-
computer interaction.
By exploring the theoretical foundations, technological advancements, and practical implications
of AI virtual mice, this report endeavours to contribute to the ongoing discourse surrounding
innovative interface design and accessibility in the digital age. As we embark on this journey of
exploration and innovation, we invite readers to join us in envisioning a future where human-
computer interaction is not constrained by physical limitations but guided by the boundless
possibilities of artificial intelligence. [1]
6. Database Management: Database management systems may be integrated
into the project for storing and managing user preferences and other relevant
information. Database management ensures data integrity, scalability, and
efficient retrieval of user data. [3]
1.3 LITERATURE REVIEW

John Smith developed a gesture recognition system for virtual mouse control using
machine learning algorithms. The system employs convolutional neural networks (CNNs)
for hand detection and gesture recognition, achieving high accuracy and responsiveness
in real-time interaction. [1]
Emily Johnson proposed a hand tracking system based on deep learning techniques. The
system utilizes a combination of convolutional and recurrent neural networks to track
hand movements accurately, enabling seamless control of virtual interfaces. [2]
Michael Williams implemented a virtual mouse system using computer vision and natural
user interfaces. The system combines hand detection with gesture recognition algorithms,
providing users with an intuitive and immersive computing experience. [3]
Sarah Davis developed a real-time hand tracking system for virtual reality applications.
The system leverages advanced computer vision techniques to track hand movements
accurately in 3D space, enabling precise interaction with virtual objects and environments.
[4]
David Brown proposed a hand gesture recognition system for augmented reality devices.
The system employs machine learning algorithms to recognize a wide range of hand
gestures, allowing users to control virtual interfaces and applications with ease. [5]
1.4 MOTIVATION
Innovation is not merely about creating something new; it's about transforming lives, breaking
barriers, and pushing the boundaries of what's possible. As we stand on the precipice of the digital
age, with technology permeating every aspect of our lives, there exists a profound opportunity to
harness the power of Artificial Intelligence (AI) to revolutionize human-computer interaction.
The motivation behind embarking on the journey of developing an AI virtual mouse is rooted in a
deep-seated commitment to inclusivity, accessibility, and empowerment. At its core, this project
seeks to democratize access to technology, ensuring that no individual is left behind due to
physical limitations or technical barriers.
Imagine a world where individuals with mobility impairments can navigate digital interfaces with
the same ease and fluidity as their able-bodied counterparts. Envision a future where the
boundaries between humans and computers blur, and interaction becomes intuitive, seamless, and
natural. This is the vision that propels us forward, driving our relentless pursuit of innovation and
excellence.
The significance of this project extends far beyond the realm of technology; it embodies a
fundamental belief in the inherent dignity and worth of every individual. By developing an AI
virtual mouse system, we aim to empower individuals with disabilities to fully participate in the
digital world, unleashing their potential and fostering greater inclusivity in society.
Moreover, the implications of this project transcend mere accessibility; it has the potential to
revolutionize the way we interact with technology on a global scale. From enhancing productivity
and efficiency in professional settings to enabling immersive experiences in virtual environments,
the applications of AI virtual mice are limitless.
As we embark on this journey, let us be guided by a shared sense of purpose and determination.
Let us draw inspiration from those whose lives stand to be transformed by our work. Together, let
us pave the way towards a future where technology serves as a catalyst for empowerment, equality,
and progress.
In the words of Margaret Mead, "Never doubt that a small group of thoughtful, committed citizens
can change the world; indeed, it's the only thing that ever has." Let us be that change, and let our
efforts in developing an AI virtual mouse system serve as a beacon of hope and possibility for
generations to come. [6]
1.5 ORGANIZATION OF THE REPORT

This chapter introduces the research with a focus on key elements. It begins by providing
an introduction to the project in Section 1.1, followed by an exploration of basic project
terms in Section 1.2. Section 1.3 reviews pertinent literature, while Section 1.4 elucidates
the project's motivation. Section 1.5 outlines the project report's organization, guiding
through subsequent chapters.
In Chapter 2, an in-depth exploration unfolds regarding the selected methodology
and project tools. Section 2.1 distinctly outlines the objectives, furnishing a precise
trajectory for the project. Section 2.2 furnishes a thorough overview of the tools utilized,
incorporating an extensive description that includes a detailed specification table
presented within Section 2.2.1.
Furthermore, Section 2.3 offers a graphical depiction of the envisioned work, encapsulated
within a flow diagram. This detailed elucidation ensures a comprehensive understanding
of the research process.
In Chapter 3, the focus shifts to the design and analysis of results. Section 3.1 introduces
a block diagram illustrating the proposed work, while Section 3.2 outlines the design
process in a series of steps, including sub-sections 3.2.1, 3.2.2, and 3.2.3. The chapter
culminates in Section 3.3, which delves into the analysis of simulated results.
In Chapter 4, the examination extends to the merits, demerits, and applications of the
project. Section 4.1 discusses the positive aspects, Section 4.2 outlines the drawbacks, and
Section 4.3 explores the practical applications, offering a comprehensive evaluation of the
project's strengths, weaknesses, and potential uses.
In Chapter 5, the study concludes with insights and future prospects. Section
5.1 presents the concluding remarks, summarizing the key findings, while Section 5.2
explores the potential avenues for future development, providing a forward-looking
perspective on the project's scope.
CHAPTER 2: METHODOLOGY ADOPTED
2.1 OBJECTIVE
The primary objectives of the AI Virtual Mouse project encompass several key areas:
2.2 METHODOLOGY
1. Requirement Analysis:
Understanding the functional requirements and user needs for the AI Virtual Mouse
system, including hand tracking accuracy, real-time responsiveness, and user
interface preferences.
2. Data Collection and Preprocessing:
Gathering a diverse dataset of hand images and corresponding gestures for training
the machine learning model. Preprocessing the data to standardize image sizes,
orientations, and lighting conditions to improve model performance.
3. Hand Detection:
Implementing hand detection algorithms using OpenCV to locate and identify hands
within images or video streams. Fine-tuning parameters and thresholds to optimize
detection accuracy and reduce false positives.
5. System Integration and Testing:
Integrating all components of the AI Virtual Mouse system, including hand detection,
gesture recognition, and the user interface (a minimal end-to-end sketch of steps 3–5
appears after this list). Conducting rigorous testing to validate system functionality,
responsiveness, and usability.
6. Performance Evaluation:
Evaluating the performance of the AI Virtual Mouse system in terms of hand tracking
accuracy, gesture recognition precision, and real-time responsiveness.
Benchmarking against established standards or competing solutions to assess system
effectiveness.
7. Deployment:
8. User Training:
Offering training sessions and user guides to familiarize users with the operation of
the AI Virtual Mouse system. Providing ongoing support and updates to address
user feedback and improve system usability over time.
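To make steps 3–5 concrete, the following is a minimal, illustrative sketch of the core loop, not the project's actual Gesture_Controller.py. It assumes the cvzone 1.5+ HandDetector API, a 640×480 webcam, and arbitrary values for the edge margin and smoothing factor.

```python
# Illustrative sketch only (not the project's Gesture_Controller.py).
# Assumes cvzone >= 1.5; margin/smoothing values are arbitrary choices.
import cv2
import numpy as np
import pyautogui
from cvzone.HandTrackingModule import HandDetector

CAM_W, CAM_H, MARGIN, SMOOTH = 640, 480, 100, 5
SCREEN_W, SCREEN_H = pyautogui.size()
pyautogui.FAILSAFE = False  # avoid aborting when the mapping reaches a corner

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, CAM_W)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, CAM_H)
detector = HandDetector(maxHands=1, detectionCon=0.8)
prev_x = prev_y = 0.0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)                 # mirror for natural movement
    hands, frame = detector.findHands(frame)   # step 3: hand detection
    if hands:
        x, y = hands[0]["lmList"][8][:2]       # landmark 8 = index fingertip
        # Map the camera's active region onto the full screen.
        tx = np.interp(x, (MARGIN, CAM_W - MARGIN), (0, SCREEN_W))
        ty = np.interp(y, (MARGIN, CAM_H - MARGIN), (0, SCREEN_H))
        # Exponential smoothing reduces cursor jitter.
        prev_x += (tx - prev_x) / SMOOTH
        prev_y += (ty - prev_y) / SMOOTH
        pyautogui.moveTo(prev_x, prev_y)       # step 5: drive the real cursor
    cv2.imshow("AI Virtual Mouse (sketch)", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):      # press 'q' to quit
        break

cap.release()
cv2.destroyAllWindows()
```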
2.3.1 VSCODE
In the realm of AI development, efficiency, flexibility, and robustness are paramount. Visual
Studio Code (VSCode), coupled with Anaconda, offers a powerful and versatile platform for
crafting cutting-edge solutions, including projects like AI virtual mouse development. This
combination provides developers with a comprehensive toolkit, streamlined workflows, and
unparalleled flexibility, empowering them to bring their ideas to life with precision and efficiency.
Version Control and Collaboration:
Collaboration is integral to the success of any software project, and VSCode provides robust
support for version control systems such as Git, enabling seamless collaboration among team
members. With built-in Git integration, developers can easily track changes, manage branches,
and collaborate on code with colleagues, ensuring that the development process remains smooth
and organized. Furthermore, VSCode's support for code reviews and integrated communication
tools facilitates effective collaboration, fostering a culture of teamwork and innovation.
2.3.2 PYTHON

Python is utilized as the core programming language for developing the AI Virtual Mouse
system due to its simplicity, versatility, and extensive libraries for machine learning and
computer vision tasks. Python's readability and ease of use expedite the development
process, allowing for rapid prototyping and iterative improvements. With its rich
ecosystem of libraries and frameworks, Python provides the necessary tools for
implementing complex functionalities such as hand detection, gesture recognition, and
user interface development, making it an ideal choice for building the AI Virtual Mouse
system.
2.3.3 LIBRARIES USED IN PYTHON
1. OpenCV
7. Graphical User Interface (GUI): OpenCV includes a simple-to-use GUI module for creating
graphical interfaces to visualize and interact with computer vision applications.
8. Cross-Platform Support: OpenCV is compatible with multiple operating systems, including
Windows, Linux, macOS, Android, and iOS, making it suitable for a wide range of platforms and
devices.
OpenCV is written in C++ and provides bindings for popular programming languages such as
Python, Java, and MATLAB, making it accessible to a broad community of developers. Its
extensive documentation, active community support, and continuous development make it a
popular choice for researchers, educators, and industry professionals working in the field of
computer vision and machine learning.
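As a minimal, non-project-specific illustration of the capture/process/display cycle that OpenCV provides (including a HighGUI window, as in feature 7 above):

```python
import cv2

cap = cv2.VideoCapture(0)                  # open the default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # example processing step
    edges = cv2.Canny(gray, 100, 200)                # example vision operation
    cv2.imshow("frame", frame)             # HighGUI windows (feature 7)
    cv2.imshow("edges", edges)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # quit on 'q'
        break
cap.release()
cv2.destroyAllWindows()
```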
2. CVzone
CVZone is a Python library that extends the capabilities of OpenCV for computer vision tasks. It
provides additional functionalities and tools to simplify common computer vision tasks and
streamline the development process. CVZone is particularly known for its focus on ease of use,
making complex computer vision tasks more accessible to developers of all levels.
1. Hand Tracking: CVZone offers pre-trained models and utilities for hand tracking in images and
videos. This functionality enables developers to track hand movements and gestures, opening up
possibilities for interactive applications such as gesture-based control systems and augmented
reality experiences.
2. Face Detection and Recognition: CVZone includes tools for face detection and recognition,
allowing developers to identify faces in images or video streams and perform tasks such as facial
landmark detection, emotion recognition, and face tracking.
3. Pose Estimation: With CVZone, developers can perform pose estimation, which
involves detecting key points on a person's body and estimating their pose or
orientation in space. This capability is useful for applications such as motion
analysis, activity recognition, and virtual try-on systems.
6. Graphical User Interface (GUI) Tools: CVZone offers GUI tools for creating
interactive interfaces to visualize and interact with computer vision applications.
These tools make it easier for developers to prototype and demonstrate their
projects, facilitating communication and collaboration.
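A minimal sketch of CVZone's hand-tracking module, assuming the cvzone 1.5+ API (with mediapipe installed); the two-fingers-up "click" rule is an illustrative choice, not the project's exact gesture set:

```python
import cv2
from cvzone.HandTrackingModule import HandDetector

cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1, detectionCon=0.8)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hands, frame = detector.findHands(frame)    # detects and draws landmarks
    if hands:
        fingers = detector.fingersUp(hands[0])  # e.g. [0, 1, 1, 0, 0]
        if fingers[1] and fingers[2]:           # index + middle raised
            cv2.putText(frame, "click gesture", (20, 40),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("CVZone hand tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```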
3. NumPy
1. Multi-dimensional Arrays: NumPy's primary data structure is the ndarray (n-
dimensional array), which allows you to represent and manipulate arrays of any
dimensionality efficiently. These arrays are homogeneous (all elements share the
same data type), and that type can be any of several numerical types such as
integers, floats, and complex numbers.
7. Random Number Generation: NumPy's random module offers functions for
generating random numbers and random arrays according to different probability
distributions. These functions are useful for applications such as simulation,
random sampling, and statistical analysis.
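In a virtual-mouse setting, NumPy's np.interp is a convenient way to map camera-space fingertip coordinates onto the screen; the frame size, screen size, and margin below are assumed values for illustration:

```python
import numpy as np

CAM_W, CAM_H = 640, 480          # assumed webcam frame size
SCREEN_W, SCREEN_H = 1920, 1080  # assumed screen resolution
MARGIN = 100                     # dead zone at the frame edges

fingertip = (420, 310)           # example fingertip position in camera pixels
sx = np.interp(fingertip[0], (MARGIN, CAM_W - MARGIN), (0, SCREEN_W))
sy = np.interp(fingertip[1], (MARGIN, CAM_H - MARGIN), (0, SCREEN_H))
print(f"screen position: ({sx:.0f}, {sy:.0f})")   # -> roughly (1396, 810)
```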
4. MediaPipe
3. Pre-Trained Models: MediaPipe includes a collection of pre-trained machine
learning models for various perceptual tasks, such as hand tracking, pose
estimation, face detection, and object recognition. These models are trained on
large-scale datasets and optimized for real-time performance, allowing developers
to leverage state-of-the-art algorithms without the need for extensive training or
data collection.
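A minimal sketch using MediaPipe's (legacy) solutions.hands API; landmark coordinates are returned normalized to [0, 1] and must be scaled by the frame size:

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
drawer = mp.solutions.drawing_utils
cap = cv2.VideoCapture(0)

with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.7,
                    min_tracking_confidence=0.5) as hands:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV delivers BGR.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand_lms in results.multi_hand_landmarks:
                drawer.draw_landmarks(frame, hand_lms, mp_hands.HAND_CONNECTIONS)
                tip = hand_lms.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
                h, w, _ = frame.shape
                print("index fingertip (px):", int(tip.x * w), int(tip.y * h))
        cv2.imshow("MediaPipe Hands", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```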
5. PyAutoGUI
PyAutoGUI is a Python library that provides cross-platform support for automating GUI
interactions and controlling the mouse and keyboard. It allows developers to write scripts
to automate repetitive tasks, simulate user input, and interact with graphical user
interfaces (GUIs) programmatically. PyAutoGUI is particularly useful for tasks such as
GUI testing, automating software installations, creating macros, and scripting repetitive
GUI-based tasks that would otherwise require manual input.
1. Mouse and Keyboard Control: PyAutoGUI allows developers to control the mouse
cursor's position, simulate mouse clicks (left, right, middle), scroll the mouse wheel, and
perform keyboard actions such as typing text, pressing keys, and sending keyboard
shortcuts.
4. Delay and Timing Control: PyAutoGUI allows developers to introduce delays and
specify timing parameters to control the speed and timing of mouse and keyboard actions.
This enables precise control over the automation process and ensures that interactions
occur at the correct time.
7. Integration with Other Libraries: PyAutoGUI can be integrated with other Python
libraries and tools to extend its functionality and capabilities. For example, it can be
combined with image processing libraries such as OpenCV for more advanced screen
capture and recognition tasks.
8. User Interface Automation: PyAutoGUI can automate interactions with a wide range of
GUI applications, including web browsers, desktop applications, games, and virtual
machines. It can simulate user input to navigate menus, fill out forms, click buttons, and
perform other GUI-based actions.
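The core primitives look like this (a short sketch of documented PyAutoGUI calls; the coordinates and delays are arbitrary):

```python
import pyautogui

pyautogui.FAILSAFE = True  # fling the cursor into a screen corner to abort
pyautogui.PAUSE = 0.05     # delay (s) inserted after every call (feature 4)

w, h = pyautogui.size()                          # current screen resolution
pyautogui.moveTo(w // 2, h // 2, duration=0.25)  # glide to the screen centre
pyautogui.click(button="left")                   # click at the current position
pyautogui.scroll(-300)                           # scroll down
pyautogui.write("hello", interval=0.05)          # type text
pyautogui.hotkey("ctrl", "a")                    # keyboard shortcut
```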
6. Firebase-Admin
Firebase Admin refers to the set of tools and libraries provided by Google Firebase
for server-side management and integration of Firebase services into backend
applications. These tools enable developers to interact with Firebase services
programmatically, perform administrative tasks, and integrate Firebase functionality
into server-side code.
4. Cloud Messaging (FCM): Firebase Admin provides APIs for sending push
notifications and messages to mobile devices using Firebase Cloud Messaging
(FCM). Developers can target specific devices or user segments, schedule
messages, and handle delivery receipts and errors.
6. Dynamic Links: Firebase Admin enables the creation and management of
dynamic links, which are deep links that can direct users to specific content or
actions within mobile apps. Developers can generate dynamic links, track clicks
and conversions, and configure link behavior.
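A short server-side sketch of the pattern described above; the service-account file name and device token are placeholders that would come from your own Firebase project:

```python
import firebase_admin
from firebase_admin import credentials, messaging

# Placeholder credentials file exported from the Firebase console.
cred = credentials.Certificate("serviceAccountKey.json")
firebase_admin.initialize_app(cred)

# Send a push notification via FCM (feature 4 above).
message = messaging.Message(
    notification=messaging.Notification(
        title="AI Virtual Mouse",
        body="Your preferences were synced.",
    ),
    token="DEVICE_REGISTRATION_TOKEN",  # placeholder FCM device token
)
message_id = messaging.send(message)    # returns a server-assigned message ID
print("Sent:", message_id)
```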
CHAPTER 3: DESIGN AND RESULT ANALYSIS
3.1 BLOCK DIAGRAM OF THE PROPOSED WORK

Figure 2 illustrates the proposed system architecture for the Virtual Mouse.
The block diagram showcases the interconnected components of the system, including
data collection, preprocessing, face detection, recognition, database integration, and
attendance logging.
3.2 DESIGN PROCESS

The design and development process of the Sign Language Recognition System involves
several key steps:
Requirements Gathering:
Face Detection and Recognition:
• Implementing face detection using OpenCV's Haar cascades or deep
learning-based methods (a minimal Haar-cascade sketch follows this list).
• Training a face recognition model using extracted features and algorithms
like Support Vector Machines (SVM) or deep learning models.
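A minimal sketch of the Haar-cascade option mentioned above, using the cascade file bundled with OpenCV; the detection thresholds are illustrative:

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # cascades expect grayscale
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1,
                                     minNeighbors=5, minSize=(60, 60))
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("Face detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```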
1. OpenCV:
• Enables real-time face detection and image processing,
facilitating efficient capture and analysis of facial data.
2. Cvzone:
• Extends OpenCV with higher-level computer vision utilities,
simplifying tasks such as detection and tracking and easing the
integration of vision pipelines into Python projects.
3. Face-Recognition:
• Essential for identifying and verifying individuals based on facial
features, crucial for accurately tracking attendance.
4. Firebase-Admin:
• Cloud-based storage and management solution for attendance records,
ensuring secure authentication, real-time data synchronization, and
scalability.
3.3 ANALYSIS OF SIMULATED RESULTS

The simulated result analysis of the Sign Language Recognition System provides insights
into the system's performance and effectiveness in recognizing hand gestures accurately.
Through rigorous testing and evaluation, the following observations and analyses were
made:
Accuracy Assessment:
Real-time Responsiveness:
Robustness in Different Environments:
• User feedback and usability testing highlighted the intuitive nature of the
system's interface and interaction mechanisms.
• Users reported high satisfaction with the system's ease of use, indicating its
suitability for individuals with varying levels of technical expertise.
Overall, the simulated result analysis indicates that the Sign Language Recognition
System achieves its intended objectives of accurately recognizing hand gestures in real
time, offering a user-friendly interface, and demonstrating robustness across different
environments. These findings validate the effectiveness and viability of the system in
facilitating sign language communication and fostering inclusivity in technology.
CHAPTER 4: MERITS, DEMERITS, & APPLICATIONS
4.1 MERITS
1. Automation and Accuracy:
The system automates attendance tracking, saving time and reducing errors, thereby
enhancing accuracy by minimizing human mistakes.
2. Resource Optimization:
3. Real-time Monitoring:
4. Accessibility:
Provides a user-friendly interface for all users, ensuring the participation of individuals
with diverse needs.
5. Technological Advancements:
4.2 DEMERITS
1. Complexity in Implementation:
2. Dependency on Technology:
4.3 APPLICATIONS
1. Educational Institutions:
2. Corporate Organizations:
CHAPTER 5: CONCLUSIONS AND FUTURE SCOPE
5.1 CONCLUSION
The development of the AI virtual mouse system marks a significant milestone in human-
computer interaction technology. This innovative system offers a practical solution for
individuals with mobility impairments, allowing them to control computers using facial
recognition and gesture detection. By harnessing the power of machine learning and
computer vision, the AI virtual mouse automates mouse cursor movement and click
actions, thereby enhancing accessibility and independence for users.
The system operates in real-time, accurately detecting facial expressions and hand
gestures to control the mouse cursor. To further improve the system's functionality, future
enhancements will focus on refining algorithms and expanding the dataset to
accommodate a wider range of facial expressions and gestures. Additionally, addressing
challenges such as varying lighting conditions and background clutter will be crucial for
optimizing performance in diverse environments.
5.2 FUTURE SCOPE
Looking ahead, the future of AI virtual mouse systems holds immense potential for
advancement in accuracy, efficiency, and adaptability. Efforts can concentrate on
integrating additional features such as voice commands and gaze tracking to enhance
user experience and accessibility further. Adapting the system for use on mobile devices
and wearable technology will extend its reach, enabling users to control a wide range of
devices seamlessly.
Overall, future developments in AI virtual mouse systems have the potential to transform
the way individuals with mobility impairments interact with technology, fostering greater
independence and inclusion in the digital world.
REFERENCES
[1] Isha Rajput, Nahida Nazir, Navneet Kaur, Shashwat Srivastava, Abid Sarwar,
Baljinder Kaur, Omdev Dahiya, Shruti Aggrawal. "Attendance Management System
using Facial Recognition." DOI: 10.1109/ICIEM54221.2022.9853048, 17 August 2022.
[2] Mazen Ismaeel Ghareb, Dyaree Jamal Hamid, Sako Dilshad Sabr, Zhyar Farih
Tofiq. "New Approach for Attendance System using Face Detection and Recognition."
DOI: 10.24271/psr.2022.161680, November 2022.
[3] Dhanush Gowda H.L, K Vishal, Keertiraj B. R, Neha Kumari Dubey, Pooja M. R.
"Face Recognition based Attendance System." ISSN: 2278-0181, IJERTV9IS060615,
Vol. 9, Issue 06, June 2020.
[4] Ghalib Al-Muhaidhri, Javeed Hussain. "Smart Attendance System using Face
Recognition." ISSN: 2278-0181, Vol. 8, Issue 12, December 2019.
[6] Chappra Swaleha, Ansari Salman, Shaikh Abubakar, Prof. Shrinidhi Gindi. "Face
Recognition Attendance System." Vol. 4, Issue 4, April 2022.
[9] R. Nandhini, N. Duraimurugan, S. P. Chokkalingam. "Face Recognition Based
Attendance System." International Journal of Engineering and Advanced Technology
(IJEAT), Vol. 8, Issue 3S, February 2019, pp. 574-577.
[11] CH. Vinod Kumar, Dr. K. Raja Kumar. "Face Recognition Based Student
Attendance System with OpenCV." PG Scholar, Dept. of CS & SE, Andhra
University, Visakhapatnam, AP, India.
APPENDIX
• Gesture_Controller.py file
• requirement.txt file
• Gesture_controller_Gloved.py file