Puthan Specs
ABHAY A V (CEM20CS002)
ADITHYAN P B (CEM20CS004)
Equipped with a camera and image processing capabilities, the glasses detect obstacles
and provide real-time feedback.
Automated voice alerts through earphones inform users about detected obstacles, enabling
swift navigation and safety.
Integrated health sensors monitor vital signs like temperature and pulse, ensuring holistic
well-being support.
Combining technology and compassion, these smart spectacles revolutionize the experience of
visually impaired users, fostering confidence and convenience.
PROBLEM STATEMENT
• Lack of real-time support for the visually impaired in existing
technology solutions.
• Develop smart spectacles with advanced audio assistance capabilities that help visually impaired
individuals navigate their surroundings, using a Raspberry Pi 4 and Pi Camera for seamless
integration and optimal performance.
• Implement object recognition technology to enhance the spectacles' functionality, allowing users to
identify objects and obstacles in their environment.
• Integrate health monitoring features, including temperature sensing with an LM35, pulse sensing with
a pulse sensor, and blood-oxygen measurement with a MAX30100 oximetry IC, to provide
real-time health data and alerts.
• Design a user-friendly interface based on voice commands and ensure hands-free compatibility.
LITERATURE SURVEY
Design and Implementation of Voice Assisted Smart Glasses for Visually Impaired People
Using Google Vision API [1]
• Hardware Integration: The first step involves equipping smart glasses with essential hardware
components, including a camera or sensor, Raspberry Pi for processing, and an ultrasonic sensor for
obstacle detection. These components work together to capture and process real-time data.
• Object Detection: Google Vision API, along with a deep learning model based on the You Only Look
Once (YOLO) algorithm, is used to detect objects in the images captured by the smart glasses. This
enables the system to recognize and classify objects present in the user's environment.
• Voice Alerts: When objects are detected, the system converts this information into spoken alerts.
These alerts are conveyed to the user in real-time, providing details about detected objects and their
location.
• User Interaction: The smart glasses offer voice-controlled interaction. Users can use voice
commands to control the system, like stopping vibrations or requesting information.
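As a rough illustration of this detection-plus-voice-alert pipeline (not the paper's own implementation), the sketch below pairs a pretrained YOLO model from the ultralytics package with pyttsx3 for spoken alerts; the model file, image name, and left/right heuristic are assumptions.

import pyttsx3
from ultralytics import YOLO

model = YOLO("yolov8n.pt")     # small pretrained COCO model (an assumption)
engine = pyttsx3.init()

results = model("frame.jpg")   # detect objects in one captured frame
for box in results[0].boxes:
    label = model.names[int(box.cls[0])]        # class label, e.g. "person"
    x_left = float(box.xyxy[0][0])              # left edge of the bounding box
    side = "left" if x_left < 320 else "right"  # crude cue for a 640 px frame
    engine.say(f"{label} ahead on your {side}")
engine.runAndWait()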
Benefits
• Real-time object recognition through Google Vision API improves navigation for
visually impaired users by providing auditory cues about obstacles and surroundings.
• The smart glasses utilize text recognition capabilities, reading printed text aloud and
assisting users in understanding their surroundings, fostering independence.
Constraints
• Heavy reliance on stable internet connectivity may compromise the performance of the
smart glasses during network issues, impacting real-time assistance.
• The use of cloud-based services, like Google Vision API, raises concerns about the
privacy and security of user data, potentially leading to unauthorized access or misuse.
• Lack of customization options may restrict the adaptability of the smart glasses to
individual preferences, potentially limiting the user experience for some individuals.
Crosswalk Guidance System for the Blind [2]
• Wearable Assistive Technology: The study proposes a wearable assistive technology solution for
individuals with visual impairment (VI) to address the challenges of street crossing at signalized
crosswalks.
• Detection and Classification Module: The system utilizes a detection and classification module that
includes a cascade classifier and convolutional neural network (CNN) to identify crosswalk signals
and classify them as "safe-to-cross" or "do-not-cross."
• Navigation Module: The navigation module is responsible for providing real-time guidance to users.
It uses visual-inertial odometry and geometry constraints to localize users and plan their paths.
• Experiment with Visually Impaired Subjects: The methodology involved testing the system with
visually impaired subjects, including those with varying degrees of visual impairment, both indoors
and outdoors.
Benefits
• The system provides clear guidance, reducing the risk of accidents and enhancing safety for visually
impaired individuals when crossing roads.
• By offering real-time information and guidance, the system promotes independence, enabling users
to confidently navigate crosswalks without constant assistance.
• The integration of tactile, auditory, and smartphone-based guidance ensures a versatile approach,
catering to various user preferences and needs.
Constraints
• The system's effectiveness is contingent on technological reliability, and any malfunction could
result in inaccurate guidance.
• Comprehensive Crosswalk Guidance Systems can be expensive to install and maintain, posing
financial challenges for widespread implementation.
• Users may face a learning curve in adapting to the system, with some individuals being hesitant or
resistant to embracing new technologies.
A Google Glass Based Real-Time Scene Analysis for the Visually Impaired [3]
• System Overview: The system comprises Google Glass for capturing images, a smartphone for
image processing, and the Azure Vision API for object recognition.
• User Interaction: Users employ voice commands to initiate the image capture and recognition
process. The device captures the surroundings and sends the image to the smartphone app.
• Image Processing: The smartphone decompresses the image, uses the Vision API to generate
captions and identify objects, and sends this information back to the Google Glass.
• Audio Output: The Google Glass converts the text response into audio using a text-to-speech API,
enabling BVIP users to hear the descriptions in real-time.
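As a hedged illustration of this capture-describe-speak loop (not the authors' code), the sketch below sends a saved photo to the Azure Computer Vision "describe image" API and speaks the top caption; the endpoint, key, and file name are placeholders.

import pyttsx3
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

# Placeholder credentials: substitute a real Azure endpoint and key.
client = ComputerVisionClient("https://<region>.api.cognitive.microsoft.com",
                              CognitiveServicesCredentials("<key>"))

with open("scene.jpg", "rb") as image:        # image captured by the glasses
    description = client.describe_image_in_stream(image)

engine = pyttsx3.init()
if description.captions:
    caption = description.captions[0]         # highest-confidence caption
    engine.say(f"I see {caption.text}")       # spoken description for the user
    engine.runAndWait()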
Benefits
• The Google Glass-based system provides real-time scene analysis, allowing visually impaired
users to receive immediate information about their surroundings.
• Users can interact with the system hands-free, enhancing convenience and allowing them to
navigate and access information without physical input.
• Leveraging Google Glass allows seamless integration with various Google services, providing a
comprehensive and versatile platform for the visually impaired.
Constraints
• The Google Glass's limited field of view may restrict the amount of visual information processed
in real-time, potentially missing important details in the user's surroundings.
• The system's functionality is dependent on the battery life of Google Glass, and extended use may
require frequent recharging, potentially causing interruptions in assistance.
• Real-time scene analysis may unintentionally record sensitive information, raising privacy
concerns for both users and those in their vicinity.
Glasses Connected to Google Vision that Inform Blind People about what is in Front of Them [4]
• Understanding and Observing: The authors conducted a study with 150 visually impaired individuals from
the National Union of the Blind of Peru (NUBP) to understand their needs and levels of visual impairment,
classifying them into different categories.
• Find Patterns (Define the Problem): Interviews were conducted to understand the users' autonomy and
requirements, including the need for assistance in navigation, the ability to recognize pedestrian walkways,
vehicles, traffic lights, and other aspects.
• Devise Possible Solutions (Design Principles): The authors designed a prototype of smart glasses,
considering factors like aesthetics, portability, and ease of use.
• Prototype the Models (Make Them Tangible): The authors conducted tests with the developed
prototype. They provided a detailed diagram of the lens prototype and the overall system architecture.
Benefits
• The glasses connected to Google Vision provide visually impaired individuals with real-time
information about their surroundings, enhancing situational awareness.
• Users can navigate their environment more independently with auditory cues, allowing them to
identify objects and obstacles in front of them, improving mobility.
• The integration with Google Vision offers versatile assistance, allowing users to receive detailed
information about a wide range of objects and scenes.
Constraints
• Constantly connected glasses using Google Vision may raise privacy concerns as they capture and
process visual information, potentially unintentionally recording sensitive data.
• The functionality of the glasses is reliant on a stable internet connection, and disruptions may
hinder real-time information updates, impacting the user's experience.
• The effectiveness of informing about the surroundings depends on the accuracy of Google Vision,
and there could be limitations or inaccuracies in recognizing certain objects or scenes.
Real-Time Family Member Recognition Using Raspberry Pi for Visually Impaired People [5]
• Hardware Setup: The research employed a Raspberry Pi 4 Model B along with a Pi camera, a power
source, and headphones to create a compact, wearable smart glass system.
• Face Database Creation: Images of family members' faces were captured using the Pi camera,
encompassing various angles and facial expressions. These images were stored with corresponding
family member names.
• Face Recognition Algorithm: The research utilized Haar-like features for face recognition due to
their efficiency in face detection. A classifier was trained extensively with a substantial dataset of
facial images for real-time recognition.
• Real-Time Recognition: In real-time, the system continuously captured video frames. It performed
face encoding and employed the trained classifier to detect family members. When a family member
was recognized, the system immediately provided an audio name announcement to assist the visually
impaired user.
Benefits
• Real-time family member recognition enables visually impaired users to actively engage with the
people around them, fostering a more socially connected and inclusive experience.
• Hands-free control via a voice-assisted interface offers convenience and ease, promoting
an intuitive interaction method for users with varying motor skills.
DESIGN
3D MODEL
MODULE DESCRIPTION
1. Camera Module
2. Voice Commands Module
3. Face Recognition Module
4. OCR Module
5. Object Detection Module
6. Health Monitoring Module
CIRCUIT DIAGRAM
PROJECT WORKFLOW
MODULE 1 Camera
• The camera script initializes the Pi Camera with a preview resolution of 640x360 pixels (RGB888
format) and a 30 FPS frame rate.
• The script captures frames continuously, displaying them via OpenCV's imshow for real-time feedback.
• It responds to voice commands, such as "capture" or "take a photo," to trigger photo capture.
• This script offers hands-free camera interaction, enhancing accessibility and the user experience
when working with the Raspberry Pi's camera in Python (a sketch follows below).
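Under the assumptions that Picamera2 drives the Pi Camera and the speech_recognition package handles the voice trigger (neither is named explicitly above), the script could look roughly like this; the phrase list and output file name are illustrative.

import cv2
import speech_recognition as sr
from picamera2 import Picamera2

# Configure a 640x360 RGB888 preview at 30 FPS, as described above.
picam2 = Picamera2()
picam2.configure(picam2.create_preview_configuration(
    main={"size": (640, 360), "format": "RGB888"},
    controls={"FrameRate": 30}))
picam2.start()

recognizer = sr.Recognizer()

def heard_capture_command():
    # Listen for a short phrase; in the full system this would run on a
    # background thread so the preview is not blocked.
    with sr.Microphone() as source:
        audio = recognizer.listen(source, phrase_time_limit=2)
    try:
        text = recognizer.recognize_google(audio).lower()
    except (sr.UnknownValueError, sr.RequestError):
        return False
    return "capture" in text or "take a photo" in text

while True:
    frame = picam2.capture_array()           # grab the latest preview frame
    cv2.imshow("Preview", frame)             # real-time feedback via OpenCV
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
    if heard_capture_command():
        picam2.capture_file("photo.jpg")     # hypothetical output file name

cv2.destroyAllWindows()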
MODULE 2 Voice Commands
In this code snippet, we import the pyttsx3 library and initialize the speech
synthesis engine using the init() function. The argument 'sapi5' represents the Speech
API provided by Microsoft, which is available on Windows.
The text_to_speech() function takes a text input and uses the say() method
of the engine object to convert the text into speech. The runAndWait() method
is then called to play the synthesized speech output, as in the sketch below.
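A minimal sketch of the helper described above. 'sapi5' is the Windows Speech API; on a Raspberry Pi, pyttsx3 falls back to the espeak driver, so the argument can simply be omitted there. The sample sentence is illustrative.

import pyttsx3

engine = pyttsx3.init('sapi5')  # use pyttsx3.init() on Linux/Raspberry Pi

def text_to_speech(text):
    # Convert the given text into speech and play it through the earphones.
    engine.say(text)
    engine.runAndWait()

text_to_speech("Obstacle detected ahead")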
MODULE 3 Face Recognition
• The code segment uses OpenCV's Haar Cascade algorithm for face detection, coupled with the
face_recognition library for face encoding, to train a face recognition model.
• It extracts the person's name from the image path and loads the input image in
BGR format, converting it to RGB for processing.
• Faces are located in the image with face_recognition.face_locations using the "hog" model.
• Facial encodings are computed for each detected face with face_recognition.face_encodings
and stored in lists for training the recognition model.
• The serialized facial encodings and names are saved into a pickle file named
"encodings.pickle" for future use in recognition tasks (see the sketch below).
• Frames are processed for face detection and recognition using the face_recognition library.
• Detected faces are labeled with their corresponding names and displayed in real time
using OpenCV's imshow.
• The script stops when the 'q' key is pressed and then reports its frames-per-second
(FPS) performance (see the sketch below).
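A sketch of the real-time recognition loop under the assumption that frames come from Picamera2 (the same camera setup as Module 1): each frame is matched against the pickled encodings, labeled with OpenCV, and the FPS is reported on exit.

import time
import pickle
import cv2
import face_recognition
from picamera2 import Picamera2

# Load the encodings produced by the training step above.
with open("encodings.pickle", "rb") as f:
    data = pickle.load(f)

picam2 = Picamera2()
picam2.configure(picam2.create_preview_configuration(
    main={"size": (640, 360), "format": "RGB888"}))
picam2.start()

start, frames = time.time(), 0
while True:
    frame = picam2.capture_array()                 # BGR-ordered array in practice
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # face_recognition expects RGB
    boxes = face_recognition.face_locations(rgb, model="hog")
    for (top, right, bottom, left), encoding in zip(
            boxes, face_recognition.face_encodings(rgb, boxes)):
        matches = face_recognition.compare_faces(data["encodings"], encoding)
        name = data["names"][matches.index(True)] if True in matches else "Unknown"
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 0), 2)
        cv2.putText(frame, name, (left, top - 8),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("Face Recognition", frame)
    frames += 1
    if cv2.waitKey(1) & 0xFF == ord("q"):          # 'q' stops the script
        break

print(f"Approx. FPS: {frames / (time.time() - start):.1f}")
cv2.destroyAllWindows()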
MODULE 5 Object Detection
• The CNN model is trained on diverse datasets covering a wide range of object classes, with
careful hyperparameter optimization for superior model performance (a minimal training
sketch follows).
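A minimal, hypothetical sketch of the kind of CNN training described above; the architecture, dataset directory, image size, class count, and hyperparameters are all assumptions, since the slides do not specify them.

import tensorflow as tf

# Assumed dataset layout: objects/train/<class_name>/<image>.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "objects/train", image_size=(128, 128), batch_size=32)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 classes assumed
])

# Hyperparameters such as the learning rate would be tuned in practice.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)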
MODULE 6 Health Monitoring
• The health monitoring system integrates an LM35 temperature sensor, a DHT11 humidity
sensor, an analog-to-digital converter (ADC), and a pulse sensor with the Raspberry Pi,
leveraging Adafruit components for seamless functionality.
• It continuously monitors vital health parameters such as temperature, humidity, and
pulse rate in real time, enabling proactive health management.
• Real-time data processing and analysis give users actionable insights for
informed decision-making and personalized healthcare (a sketch of the monitoring loop follows).
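A sketch of the monitoring loop under these assumptions: the DHT11 is read on GPIO4 with the legacy Adafruit_DHT library, while the analog LM35 and pulse-sensor outputs go through an MCP3008 ADC read via gpiozero. Pin and channel assignments are illustrative.

import time
import Adafruit_DHT                  # legacy Adafruit library (assumed)
from gpiozero import MCP3008

DHT_PIN = 4                          # DHT11 data pin (assumption)
lm35 = MCP3008(channel=0)            # LM35 output on ADC channel 0 (assumption)
pulse = MCP3008(channel=1)           # pulse-sensor output on channel 1 (assumption)

while True:
    humidity, _ = Adafruit_DHT.read_retry(Adafruit_DHT.DHT11, DHT_PIN)
    temp_c = lm35.value * 3.3 * 100  # ADC fraction -> volts -> deg C (LM35: 10 mV/deg C)
    if humidity is not None:         # DHT reads occasionally fail; skip those
        print(f"Temperature: {temp_c:.1f} C  Humidity: {humidity:.0f} %  "
              f"Pulse signal: {pulse.value:.2f}")
    time.sleep(1)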
RESULT & DISCUSSION
FINAL STRUCTURE OF THE SPECTACLES
Implementation Video
FUTURE SCOPE
• Expand object detection capabilities with a wider range of classes and improved accuracy using
advanced CNN architectures.
• Incorporate additional sensors for comprehensive health monitoring, such as blood pressure
sensors or oxygen saturation monitors.
• Integrate AR overlays for enhanced user guidance, providing contextual information about
surroundings and detected objects.
• Implement cloud-based data storage and analysis for long-term health trend monitoring and
personalized insights.
• Develop a user-friendly mobile application for remote monitoring, allowing caregivers to access
real-time health data and receive alerts.
Conclusion
• Our project on smart spectacles for the blind with audio assistance and health
monitoring is a significant step forward in assistive technology.
• By integrating key technologies like OCR, voice commands, face recognition, object
detection, and health monitoring, we've created a comprehensive solution for daily
challenges faced by visually impaired individuals.
REFERENCES
[1] Rajendran, P. Selvi, Krishnan, Padmaveni, & Aravindhar, D. John. (2020, November 17).
Voice-Assisted Smart Glasses for the Visually Impaired. Hindustan Institute of Technology and
Science, India.
[2] Son, Hojun, Krishnagiri, Divya, Jeganathan, V. Swetha, & Weiland, James. (2020, June 6).
Crosswalk Guidance System for the Blind. 978-1-7281-1990-8/20/$31.00 ©2020 IEEE.
[3] A., Hafeez Ali, et al. (2021, December 13). Real-Time Scene Analysis Using Google Glass.
National Institute of Technology Karnataka, Surathkal. DOI: 10.1109/ACCESS.2021.3135024.
[4] Cabanillas-Carbonell, Michael, Aguilar Chávez, Alexander, & Banda Barrientos, Jeshua. (2020,
October 29-30). Glasses for Assisting the Blind with Google Vision. Universidad Autónoma del
Perú. 978-1-7281-8803-4/20/$31.00 ©2020 IEEE.
[5] Islam, Md. Tobibul, Ahmad, Mohiuddin, & Bappy, Akash Shingha. (2020, June 5-7). Real-Time
Family Member Recognition Using Raspberry Pi. Khulna University of Engineering & Technology,
Bangladesh. 978-1-7281-7366-5/20/$31.00 ©2020 IEEE.