A PROJECT REPORT
Submitted by
HARISHANKAR M (913121104030)
BARATHVAJ T K S (913121104302)
BACHELOR OF ENGINEERING
MAY 2025
BONAFIDE CERTIFICATE
The system delivers the accuracy and responsiveness, with virtually no
perceptible latency, required for a broad range of uses. It is designed to be
user-friendly, providing a clean, natural experience by allowing users to
interact with computers through hand gestures alone. This report presents the
system's architecture, its development process, and its performance across
several scenarios, from everyday environments to settings where touchless
control is a necessity. The Hand-Tracking Virtual Mouse stands among the
emerging gesture-based technologies poised to shape the future of
human-computer interaction.
CONCLUSION
The Hand-Tracking Virtual Mouse (HTVM) represents a significant
advancement in the field of human-computer interaction, providing a hygienic,
accessible, and efficient alternative to traditional input devices. By leveraging
real-time hand tracking and gesture recognition, HTVM achieves the accuracy
and responsiveness required for practical applications, from everyday tasks to
specialized environments like medical and AR/VR settings. This touch-free
interface not only demonstrates the potential of gesture-based controls but also
paves the way for future innovations in touchless technology. As gesture-based
interfaces continue to evolve, HTVM underscores the feasibility and potential
of these systems to transform how users interact with digital devices, heralding
a more natural and intuitive era of computing.
TABLE OF CONTENTS
1. Introduction
1.1 Overview
1.2 Objective
2. Literature Survey
3. System Study
4. System Proposal
4.3 Advantages
4.4 Disadvantages
5. System Requirements
7. System Design
8. System Architecture
9. System Implementation
9.1 Step-by-Step Implementation
10. Conclusion
11. Appendices
12. References
1. INTRODUCTION
1.1 OVERVIEW
2. LITERATURE SURVEY
Research in botnet detection within IoT networks has explored various machine
learning approaches, each offering distinct methodologies and insights to
address the growing threat of botnet attacks. Several key studies have
contributed to this area, highlighting both advancements and limitations.
Zhang, Liu, and Wang (2023) combined Random Forest (RF) and Support
Vector Machine (SVM) techniques for anomaly detection in IoT networks.
Their hybrid model improved detection accuracy by analyzing anomalous
patterns in network traffic. However, the approach increased computational
complexity and resource demand, posing challenges for large datasets common
in IoT environments.
Wang, Chen, and Zhang (2023) applied Graph Neural Networks (GNNs) to
model network traffic as a graph, offering valuable insights into network
behavior and relationships between devices. Although this approach provided
detailed understanding, its significant computational demands hindered
scalability, particularly for large and complex IoT networks.
Gupta and Singh (2023) utilized Principal Component Analysis (PCA) for
dimensionality reduction, combined with ensemble learning techniques like
Random Forest (RF) and Gradient Boosting, to enhance botnet detection. While
this method improved classification accuracy, the high computational resources
required limited its applicability in resource-constrained IoT environments.
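To make the flavor of this approach concrete, the sketch below pairs PCA with a Random Forest in a scikit-learn pipeline. It is illustrative only: the synthetic data, feature count, and hyperparameters are placeholders, not those used by Gupta and Singh.

# Illustrative sketch: PCA for dimensionality reduction feeding an ensemble
# classifier, in the spirit of the approach described above. The data and
# parameter choices here are assumptions, not the cited authors' setup.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split

# Placeholder traffic features and botnet labels; real work would use flow
# statistics extracted from IoT network captures.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 40))
y = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Pipeline([
    ("pca", PCA(n_components=10)),  # reduce 40 raw features to 10 components
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))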
3. SYSTEM STUDY
Economic Feasibility
Technical Feasibility
Social Feasibility
Cost Management: The project can minimize initial costs by using open-
source libraries like OpenCV, Mediapipe, and Autopy, along with
consumer-grade webcams. This reduces expenses on proprietary software
and specialized hardware, enabling cost-effective development.
Future Savings: The touch-free interface can lead to long-term savings
in settings where traditional input devices require regular sanitization or
maintenance, such as healthcare or food services. By removing the need
for physical contact, HTVM reduces equipment wear and cleaning costs.
Market Viability: Although immediate financial returns may be modest,
the HTVM system’s unique application in emerging markets like AR/VR
and remote healthcare presents growth opportunities and potential
revenue in industries prioritizing hygienic, non-contact interactions.
The HTVM system detects specific gestures and maps each one to a function
such as cursor movement, clicking, scrolling, or dragging; a minimal sketch of
this mapping appears after the next paragraph.
Through these features, the HTVM system seeks to redefine user interaction
with computers, enabling practical, touch-free control across multiple
application areas.
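The following is a minimal sketch of such a gesture-to-function mapping; the gesture names and handler bodies are placeholders for illustration, not HTVM's actual code.

# Illustrative gesture-to-action dispatch table; gesture names and handler
# bodies are assumptions for this sketch, not HTVM's exact implementation.
def move_cursor(x, y):
    print(f"move cursor to ({x}, {y})")  # stand-in for a real mouse call

def click(x, y):
    print("left click")  # stand-in for a real mouse call

GESTURE_ACTIONS = {
    "index_up": move_cursor,  # pointing with the index finger moves the cursor
    "pinch": click,           # a thumb-index pinch triggers a click
}

def dispatch(gesture, x, y):
    handler = GESTURE_ACTIONS.get(gesture)
    if handler is not None:
        handler(x, y)

dispatch("pinch", 320, 240)  # -> "left click"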
4.3 ADVANTAGES:
4.4 DISADVANTAGES:
· Latency Issues: Some existing systems may have noticeable latency, which
can disrupt smooth cursor control and lead to a suboptimal user experience,
particularly in real-time applications.
· Limited Use Cases: Many existing hand-tracking systems are not designed to
integrate into environments that demand high hygienic standards, like
healthcare settings, or to seamlessly function in AR/VR applications, limiting
their versatility.
· Lack of Scalability: Current hand-tracking systems may struggle to adapt to
new gestures or integrate with multiple applications, restricting their
adaptability across different usage scenarios.
5. SYSTEM REQUIREMENTS
Additional Considerations:
The software requirements outline the essential tools, libraries, and frameworks
for developing, deploying, and operating the HTVM system. Below are the
software components:
Autopy is a Python library that provides programmatic control over the
operating system's native input devices. In HTVM, Autopy simulates traditional
mouse inputs, such as cursor movement and clicks, based on the landmark data
supplied by MediaPipe; a brief sketch of the relevant calls appears below.
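The following minimal sketch shows the Autopy calls on which HTVM's output stage rests; the coordinate values are illustrative.

import autopy

# Query the screen resolution so that hand coordinates can later be
# mapped onto it.
screen_w, screen_h = autopy.screen.size()

# Move the cursor to the centre of the screen.
autopy.mouse.move(screen_w / 2, screen_h / 2)

# Simulate a left click at the current cursor position.
autopy.mouse.click()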
7. SYSTEM DESIGN
The system design for the Hand-Tracking Virtual Mouse (HTVM) involves a
layered architecture that organizes the components and flow of data from input
(hand movements) to output (simulated cursor control). HTVM’s design
consists of multiple stages, including input capture, hand detection, gesture
recognition, and output simulation.
The HTVM architecture is a pipeline design, where each stage feeds into the
next:
Input Capture Layer
Webcam
o Captures video frames in real time.
o Sends the captured frames to the Hand Detection Layer for analysis.
OpenCV
o Preprocesses video frames for clarity and contrast.
o Defines a region of interest (ROI) to reduce processing load by
focusing on areas where the hand is most likely to appear.
MediaPipe Hand Tracking
o Detects and tracks the hand within the frame using 21 key
landmarks.
o Outputs hand landmark coordinates to the Gesture Recognition
Layer.
Autopy
o Maps hand landmark coordinates to screen coordinates.
o Simulates mouse actions based on interpreted gestures, including:
Cursor movement.
Left and right clicks.
Scrolling.
Coordinate Mapper
o Translates hand-landmark positions from camera-frame coordinates to
screen coordinates for each frame, providing precise data on hand
posture and finger positions.
Gesture Interpretation Module
o Examines the landmark data each frame and classifies gestures, such as
pointing or pinching, before handing them to Autopy for output
simulation.
This comprehensive design ensures that the HTVM system provides a seamless,
touch-free interface for real-time computer interaction. The layered approach
facilitates modular updates, high responsiveness, and scalability, creating a
future-proof system suitable for diverse applications.
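As a condensed, illustrative sketch of the pipeline above (not HTVM's exact implementation), the loop below captures frames, detects the index fingertip with the standard MediaPipe Hands API, maps it through a region of interest to screen coordinates, and moves the cursor. The resolution, camera index, and margin are assumed values.

# Sketch of the capture -> detect -> map -> output pipeline; window size,
# camera index, and ROI margin are illustrative assumptions.
import cv2
import numpy as np
import mediapipe as mp
import autopy

cap = cv2.VideoCapture(0)  # input capture layer: default webcam
hands = mp.solutions.hands.Hands(max_num_hands=1)
screen_w, screen_h = autopy.screen.size()
cam_w, cam_h = 640, 480
margin = 100  # region of interest inset from the frame edges

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.resize(frame, (cam_w, cam_h))
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    result = hands.process(rgb)
    if result.multi_hand_landmarks:
        lm = result.multi_hand_landmarks[0].landmark[8]  # index fingertip
        x, y = lm.x * cam_w, lm.y * cam_h
        # Coordinate Mapper: interpolate from the ROI to screen coordinates,
        # mirroring x so on-screen motion matches the hand's motion.
        sx = np.interp(x, (margin, cam_w - margin), (screen_w - 1, 0))
        sy = np.interp(y, (margin, cam_h - margin), (0, screen_h - 1))
        autopy.mouse.move(sx, sy)
    cv2.imshow("HTVM", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()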
8. SYSTEM ARCHITECTURE
Input Layer
Processing Layer
Output Layer
9. SYSTEM IMPLEMENTATION
Based on the provided flow diagram, the system implementation of the Hand-
Tracking Virtual Mouse (HTVM) involves several key steps that transform
video frames from a webcam into real-time mouse control using hand gestures.
Here is a breakdown of the system implementation according to the diagram:
Pinch gesture: The system checks for a pinch gesture, in which the thumb
and index finger come close together. If detected, it simulates a mouse
click.
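A minimal sketch of this pinch check follows, assuming the landmark-list format used in the appendices ([id, x, y] per landmark) and an illustrative pixel threshold.

# Sketch of pinch-click detection: measure the distance between the thumb
# tip (landmark 4) and index fingertip (landmark 8) and click when they come
# close. The threshold value is an assumption.
import math
import autopy

PINCH_THRESHOLD = 40  # pixel distance treated as "fingers touching"

def detect_pinch(lm_list):
    """lm_list: [[id, x, y], ...] as returned by the hand detector."""
    if len(lm_list) < 9:
        return False
    _, x1, y1 = lm_list[4]  # thumb tip
    _, x2, y2 = lm_list[8]  # index fingertip
    return math.hypot(x2 - x1, y2 - y1) < PINCH_THRESHOLD

# Inside the main loop: click whenever a pinch is detected.
# if detect_pinch(lmList):
#     autopy.mouse.click()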
10. CONCLUSION
11. APPENDICES:
APPENDIX 1
Library 2:
Library 3:
Library 4:
APPENDIX 2
import cv2
import time
from HandTrackingModule import handDetector  # project module (see Appendix 3)

pTime = 0  # previous frame timestamp, used for the FPS calculation
cTime = 0  # current frame timestamp
cap = cv2.VideoCapture(1)  # camera index 1; use 0 for the default webcam
detector = handDetector()

while True:
    success, img = cap.read()
    img = detector.findHands(img)
    lmList, bbox = detector.findPosition(img)
    if len(lmList) != 0:
        print(lmList[4])  # landmark 4: thumb tip position
    cTime = time.time()
    fps = 1 / (cTime - pTime)
    pTime = cTime
    # Display the frame rate on the image.
    cv2.putText(img, str(int(fps)), (10, 70), cv2.FONT_HERSHEY_PLAIN,
                3, (255, 0, 255), 3)
    cv2.imshow("Image", img)
    cv2.waitKey(1)
APPENDIX 3
import cv2
import numpy as np
import HandTrackingModule as htm
import time
import autopy

wCam, hCam = 640, 480  # capture resolution (assumed; not given in the extract)
frameR = 100           # frame-reduction margin for the active region (assumed)
smoothening = 7        # smoothing factor for cursor motion (assumed)

pTime = 0
plocX, plocY = 0, 0  # previous cursor location
clocX, clocY = 0, 0  # current cursor location
cap = cv2.VideoCapture(1)
cap.set(3, wCam)  # property 3: frame width
cap.set(4, hCam)  # property 4: frame height
detector = htm.handDetector(maxHands=1)
wScr, hScr = autopy.screen.size()
# print(wScr, hScr)

while True:
    # 1. Find hand landmarks
    success, img = cap.read()
    img = detector.findHands(img)
    lmList, bbox = detector.findPosition(img)
    # 2. Get the tips of the index and middle fingers
    if len(lmList) != 0:
        x1, y1 = lmList[8][1:]   # index fingertip
        x2, y2 = lmList[12][1:]  # middle fingertip
        print(x1, y1, x2, y2)
        # Steps 3-6 (finger-state check, coordinate conversion, smoothing)
        # were lost in extraction; reconstructed here in the standard form:
        # interpolate from the camera's active region to screen coordinates,
        # then smooth the motion between frames.
        x3 = np.interp(x1, (frameR, wCam - frameR), (0, wScr))
        y3 = np.interp(y1, (frameR, hCam - frameR), (0, hScr))
        clocX = plocX + (x3 - plocX) / smoothening
        clocY = plocY + (y3 - plocY) / smoothening
        # 7. Move mouse (x mirrored so motion feels natural)
        autopy.mouse.move(wScr - clocX, clocY)
        cv2.circle(img, (x1, y1), 15, (255, 0, 255), cv2.FILLED)
        plocX, plocY = clocX, clocY
    cv2.imshow("Image", img)
    cv2.waitKey(1)
12. REFERENCES
Manresa, J. C., Varona, J., Mas, R., Perales, F. J. (2005). Hand Tracking and
Gesture Recognition for Human-Computer Interaction. Electronics Letters on
Computer Vision and Image Analysis, 5(3), 96-104.
A study on hand tracking and gesture recognition, focusing on the
application of these techniques in HCI systems.
Kabid Hassan, S., et al. (2019). Design and Development of Hand Gesture-
Based Virtual Mouse. Proceedings of the International Conference on Advances
in Science, Engineering and Robotics Technology (ICASERT), IEEE, 2019.
This paper presents the design of a gesture-based virtual mouse using
hand tracking, closely related to your project.
Zeng, M., Luo, H., Cao, Z., Zhang, J. (2021). Context- Aware Gesture
Recognition in Smart Environments: Applications and Challenges. IEEE
Transactions on Emerging Topics in Computing, 9(1), 139-151.
A discussion of context-aware gesture recognition systems in smart
environments, exploring the adaptability of gesture recognition in
different settings.