THE DSKC
PROJECT REPORT
AREA -1:
INTERFACING MODULES
SMART NAVIGATION
ASSISTANT FOR THE VISUALLY
CHALLENGED
Under the supervision of:
Prof. Bilasini Devi Naorem, Prof. Monika
SUBMITTED BY: -
(in format: NAME (Roll No., E-mail ID))
Aakansha Maheshwari (24047567044,
[email protected] )
Jaskirat Singh*(24068567001,
[email protected])
*Project leader
Table of Contents
ABSTRACT
CHAPTER – 1: INTRODUCTION
    HARDWARE REQUIREMENTS: -
    SOFTWARE REQUIREMENTS: -
        METHOD 1
        METHOD 2
    PROPOSED METHODOLOGY: -
CHAPTER-2: GETTING STARTED WITH RASPBERRY PI
    INTRODUCTION TO RASPBERRY PI: -
        Key Features of Raspberry Pi: -
        Hardware Overview: -
    INSTALLING OS IN RASPBERRY PI: -
        METHOD 1
        METHOD 2
        INSTALLING THE REQUIRED MODULES: -
        CONFIGURING THE KEYBOARD: -
CHAPTER-3: PROGRAMMING USING PYTHON
    REQUIRED LIBRARIES: -
    THE YOLOv3 ALGORITHM: -
    THE CODE STRUCTURE: -
        IMPORTING THE REQUIRED LIBRARIES
        DEFINING THE FUNCTIONS AND VARIABLES
        THE MAIN LOOP
CHAPTER-4: ASSEMBLING THE MODEL
CHAPTER-5: RESULTS AND DISCUSSION
CHAPTER-6: FUTURE SCOPE AND LIMITATIONS
CHAPTER-7: BIBLIOGRAPHY
ABSTRACT
This project presents the design and implementation of a real-time object detection and navigation assistance system using the YOLOv3-tiny deep learning model on a Raspberry Pi. The system captures a video feed from the Raspberry Pi camera, identifies objects using a pre-trained YOLOv3-tiny neural network, and provides audio feedback through text-to-speech for detected objects. In addition to object detection, the system performs lane detection using image processing techniques such as edge detection and the Hough Line Transform to infer road direction. This hybrid approach of combining deep learning with classical computer vision enables low-latency performance suitable for embedded systems. The project demonstrates potential applications in assistive technologies for the visually impaired and in autonomous navigation.
CHAPTER – 1: INTRODUCTION
Visually challenged people often find it extremely hard to navigate through a city. They remain dependent on others for navigation, especially in metropolitan cities like Delhi, where buses running on the roads pose a constant danger to them. Existing equipment such as smart sticks relies on infrared sensors, which are unreliable in weather conditions like the monsoon: the device vibrates constantly because of the rain and fails to serve its purpose. Consider, too, an obstacle such as a tree bent right in front of a pedestrian's face; a smart stick simply cannot detect it, and the person may walk head-first into the trunk.
We have designed a Smart Navigation Assistant that uses a Raspberry Pi to scan the environment and convert it into speech instructions about the objects and the navigable path ahead.
HARDWARE REQUIREMENTS: -
Raspberry Pi 4 Model B
Raspberry Pi Camera Module v1.3
SD Card - 32 GB
SD Card Reader
IR Sensor Module
HDMI to Micro-HDMI converter
A stable 5V-3A type-C Power Bank
Earphones
Jumper Wires
Monitor/Keyboard/Mouse
SOFTWARE REQUIREMENTS: -
METHOD 1
Raspberry Pi Imager (for Windows):
https://www.raspberrypi.com/software/
(OS chosen by us:
Raspberry Pi OS with desktop and recommended software
Release date: May 13th 2025
System: 64-bit
Kernel version: 6.12
Debian version: 12 (bookworm)
Size: 3,113MB)
{This single tool downloads the OS and flashes it to the SD card, formatting the card automatically.}
METHOD 2
SD Card formatter: https://www.sdcard.org/downloads/formatter
7-zip: https://www.7-zip.org
{Converts the XZ-format OS file to a disk image for flashing to the SD card, and extracts the SD Card Formatter's (.exe) installer from its ZIP file}
Balena Etcher: https://etcher.balena.io/#download-etcher
{Flashing the .img Image file into the SD Card}
Raspberry Pi OS (64-bit) - (Raspberry Pi OS with desktop and
recommended software):
https://www.raspberrypi.com/software/operating-systems/
(OS chosen by us:
Raspberry Pi OS with desktop and recommended software
Release date: May 13th 2025
System: 64-bit
Kernel version: 6.12
Debian version: 12 (bookworm)
Size: 3,113MB)
(An XZ file is a compressed version of a large file; our 10.9 GB OS image was compressed to 2.39 GB in the XZ format!)
PROPOSED METHODOLOGY: -
The system workflow runs in a loop for as long as the Pi is powered on:
1. Start: boot the Pi and initialize the camera.
2. Capture a frame and run YOLO object detection.
3. If an object is detected, get its class; if its label has not been spoken recently, speak it using text-to-speech.
4. Detect the path direction and, if the path is turning, speak the direction.
5. Stop when the Pi is powered off.
CHAPTER-2: GETTING
STARTED WITH RASPBERRY PI
INTRODUCTION TO RASPBERRY PI: -
Raspberry Pi is a series of small, single-board computers (SBCs) developed by
the Raspberry Pi Foundation. It is designed to promote computer science
education and provide an affordable computing platform for hobbyists,
developers, and researchers. These credit card-sized computers are capable of
running Linux-based operating systems and performing various tasks, including
programming, robotics, IoT applications, and even basic desktop computing.
Key Features of Raspberry Pi: -
Raspberry Pi is a low-cost computer which offers many features such as
USB ports, HDMI output, and built-in Wi-Fi (depending on the model).
It includes GPIO (General Purpose Input/Output) pins for connecting
external devices like sensors and motors, making it perfect for DIY
projects.
The Raspberry Pi runs on Linux-based operating systems, with Raspberry
Pi OS being the most commonly used one.
Hardware Overview: -
Components of Raspberry Pi
BCM2711 CHIPSET
A Broadcom system-on-chip (SoC) used in the Raspberry Pi 4 Model B,
Raspberry Pi 400, and Raspberry Pi Compute Module 4. It features a
quad-core ARM Cortex-A72 processor, a VideoCore VI GPU, and supports
various multimedia functionalities.
40 PIN GPIO HEADER
Provides a way to connect the board to external hardware, allowing it to interact with the physical world. These pins can be configured as inputs or outputs, enabling the Pi to both sense and control external devices. The header includes a mix of General-Purpose Input/Output (GPIO) pins, power pins (5V and 3.3V), and ground pins.
Navigation of GPIO pins
WIRELESS BLUETOOTH
This allows it to connect to various Bluetooth devices like speakers,
headphones, keyboards, and mice.
MICRO SD CARD SLOT
This slot is where you insert the microSD card containing the operating
system and other files needed to boot and run the Raspberry Pi.
DISPLAY PORT
The Raspberry Pi 4 has three display connectors in total: two
micro-HDMI connectors and one DSI connector.
POWER PORT
A USB-C port used to provide power to the device. It's the primary and
recommended method for powering the Raspberry Pi 4. A 5V, 3A power
supply is recommended, and using the official Raspberry Pi USB-C power
supply is suggested for optimal performance and stability.
CAMERA PORT
The Raspberry Pi 4 uses a MIPI CSI connector for its camera, which is a
standard interface for connecting cameras to embedded systems. This
connector is a 15-pin, 1.0mm pitch FPC/FFC connector. It allows for the
transmission of both image data and control signals between the camera
and the Raspberry Pi.
AUDIO JACK
The Raspberry Pi 4 includes a 4-pole 3.5mm audio/video jack that can
output both audio and composite video signals. This jack, also known as
a TRRS connector, has four conductors: tip for left audio, ring for right
audio, another ring for ground, and the sleeve for video.
USB PORT
The Raspberry Pi 4 has four USB ports: two USB 3.0 and two USB
2.0. These ports allow for connecting various peripherals like keyboards,
mice, storage devices, and other USB-enabled devices. The USB 3.0 ports
offer significantly faster data transfer speeds compared to the USB 2.0
ports.
ETHERNET PORT
Allows for a wired network connection. It's a standard RJ45 connector
that supports speeds up to 1000 megabits per second, also known as
Gigabit Ethernet, unlike the Raspberry Pi 3 which was limited to 300
Mbps. This port is essential for connecting to networks, including the
internet, and for tasks like software updates or installing packages from
online repositories.
INSTALLING OS IN RASPBERRY PI: -
METHOD 1
1. Connect your SD card to your device using an SD card reader.
2. Download and launch the Raspberry Pi Imager from the official website
of Raspberry Pi.
3. Select the model of your Raspberry Pi through “CHOOSE DEVICE”, select
the OS through “CHOOSE OS” and select your SD card through “CHOOSE
STORAGE” to flash the desired OS in the SD card.
4. Click “NEXT” and customize OS settings by selecting “EDIT SETTINGS”.
5. Customise the settings as shown below
6. The Imager then proceeds to write the OS in the SD card by first
formatting it.
7. The SD card is now ready to be inserted in the Raspberry Pi.
METHOD 2
1. Download all the required software, and the OS, from the official
Raspberry Pi website; for newer versions the OS is distributed as an XZ file.
Downloaded Software
2. Format the SD Card, either through the in-built way, or through the
software.
3. Open Etcher, select the file to be flashed along with the target
drive, and click “Flash!”. Once the process finishes, the SD card is
ready to be used.
WORKING WITH RASPBERRY PI: -
With the OS installed on the SD card, we insert it into the Raspberry Pi and
connect the Pi to the desktop monitor using the HDMI cable. The keyboard and
mouse are connected to the USB 2.0 ports. The camera module goes
into its assigned port.
Once power is given to the Raspberry Pi, an
interface window is displayed as shown in the
figure.
INSTALLING THE REQUIRED MODULES: -
The commands given on the terminal in order to install all the required libraries
and modules are as follows:
cat /etc/os-release: For checking the OS version and other relevant
information on Linux systems.
sudo apt update: Updates the package lists (run this before installing new packages).
sudo raspi-config: Opens the Raspberry Pi Configuration tool in the
terminal, allowing you to configure various aspects of your Raspberry Pi
system.
sudo apt install gedit: Installs the Gedit text editor.
sudo apt install python3-<library name>: Installs the required library.
(For example, sudo apt install python3-opencv installs OpenCV.)
vcgencmd get_camera: Tells you whether the system recognizes a camera
is connected (detected) and whether the camera is supported by the
current software configuration (supported).
libcamera-hello: Previews the camera, displaying the camera image on
the screen.
CONFIGURING THE KEYBOARD: -
When you first set up your Raspberry Pi, the keyboard is configured to use
the standard UK layout, which includes the £ symbol instead of # above the
number 3. This differs from the layout we generally use, so to change it we
configure the keyboard using the following steps:
1. Use the command sudo raspi-config and go to Localisation Options → Keyboard.
2. Selecting keyboard model: We were using the HP 125 Wired Keyboard;
hence we chose the “Hewlett-Packard Internet” model from the list
which appeared.
3. Selecting keyboard layout: The most common layout on keyboards used
here in India is the US layout, so we chose “English (US, Symbolic)”.
4. Selecting the AltGr key: Some keyboards have the AltGr key. You may
select the appropriate option accordingly.
5. Selecting the Compose key: Some keyboards have the Compose key. You
may select the appropriate option accordingly.
6. Finally, a window appears asking if you need the Control+Alt+Backspace
combination (used to forcibly terminate the X server).
7. Your keyboard is now successfully configured.
The flow is shown below:
CHAPTER-3: PROGRAMMING
USING PYTHON
REQUIRED LIBRARIES: -
1. os: Provides a portable way to interact with the operating system.
2. cv2: Refers to the Python bindings for OpenCV (Open-Source
Computer Vision Library). OpenCV is a widely used open-source
library for computer vision and machine learning.
3. numpy: Short for Numerical Python, is a foundational open-source
library in Python for scientific computing. Its primary contribution
is the ndarray object, which is a powerful and efficient multi-
dimensional array data structure.
4. picamera2: Provides an interface for interacting with the camera
system of Raspberry Pi devices.
5. time: Provides various functions for handling time-related
operations.
THE YOLOv3 ALGORITHM: -
You Only Look Once (YOLO) is a series of real-time object detection systems
based on convolutional neural networks. YOLO has undergone several
iterations and improvements, becoming one of the most popular object
detection frameworks.
The name "You Only Look Once" refers to the fact that the algorithm requires
only one forward propagation pass through the neural network to make
predictions. Compared to previous methods like R-CNN and OverFeat, instead
of applying the model to an image at multiple locations and scales, YOLO
applies a single neural network to the full image. This network divides the
image into regions and predicts bounding boxes and probabilities for each
region. These bounding boxes are weighted by the predicted probabilities.
YOLOv3, introduced in 2018, contained only "incremental" improvements,
including the use of a more complex backbone network, multiple scales for
detection, and a more sophisticated loss function.
THE CODE STRUCTURE: -
The code can be broken into fragments to have a clear understanding.
IMPORTING THE REQUIRED LIBRARIES
The first and most fundamental step of any program is importing the necessary libraries. These modules handle file-system interaction, computer-vision processing, numerical operations, and camera input.
DEFINING THE FUNCTIONS AND VARIABLES
The functions and variables used in the program are as follows:
1. Non-Maximum Suppression (NMS)
This function removes redundant bounding boxes based on overlap
threshold, keeping only the most confident detections.
2. YOLO Output Extraction
Extracts the output layers from the YOLO neural network and filters out
weak predictions.
3. Drawing Bounding Boxes
Draws rectangles around detected objects on the image.
4. Audio Alerts
Uses espeak to provide audible alerts for detected objects, ensuring each
label is not repeated more often than the specified cooldown time.
5. Direction Detection from Lane Lines
Analyses the slopes of detected lines to infer navigation directions
(Left Turn, Right Turn, or No Turn).
6. Model and Class Initialization
Loads YOLOv3-tiny model configuration, weights, and class names from
disk and initializes the camera.
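To make the NMS step concrete, here is a minimal NumPy sketch of greedy IoU-based suppression. This is our own illustrative re-implementation (the function name and threshold are ours); the project's code may instead use OpenCV's built-in cv2.dnn.NMSBoxes, which implements the same idea.

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.4):
    """Return indices of boxes to keep, greedily suppressing boxes
    that overlap a more confident box by more than iou_threshold.
    boxes: list of [x, y, w, h]; scores: list of confidences."""
    if len(boxes) == 0:
        return []
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    x1, y1 = boxes[:, 0], boxes[:, 1]
    x2, y2 = boxes[:, 0] + boxes[:, 2], boxes[:, 1] + boxes[:, 3]
    areas = boxes[:, 2] * boxes[:, 3]
    order = scores.argsort()[::-1]          # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the top box with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Keep only boxes that do not overlap the top box too much
        order = order[1:][iou <= iou_threshold]
    return keep
```

For two heavily overlapping boxes with scores 0.9 and 0.8, only the 0.9 box survives, while a distant third box is kept.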
THE MAIN LOOP
This loop is the heart of the application, combining deep learning-based object
detection with classical computer vision techniques for a practical real-time
navigation assistant. It continuously processes video frames and gives the user
both visual and auditory cues, making it suitable for embedded vision tasks like
robotic pathfinding or assistive technology.
Let us break it down one by one.
1. Image Capture and Preprocessing
Captures a frame from the Pi camera and converts it to BGR, which is compatible with the YOLO model and OpenCV.
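The colour conversion in this step amounts to reversing the channel order, since Picamera2 typically delivers RGB arrays while OpenCV works in BGR. A minimal sketch (the commented capture lines show assumed Picamera2 usage, not the project's verbatim code):

```python
import numpy as np

def rgb_to_bgr(frame: np.ndarray) -> np.ndarray:
    """Reverse the channel order of a 3-channel image; equivalent in
    effect to cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)."""
    return frame[..., ::-1].copy()

# In the real loop this would be applied to each captured frame, e.g.:
#   frame = picam2.capture_array()   # RGB frame from Picamera2
#   frame = rgb_to_bgr(frame)        # BGR for OpenCV / YOLO
```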
2. YOLO Model Input Preparation
The frame dimensions are extracted and the frame is converted into a blob, resized to 320×320, for YOLO to take as input. The blob is then passed into the network and the output detections are retrieved.
3. Detection Parsing
Bounding box coordinates, confidence, and class scores are extracted from each detection, and the normalized YOLO outputs are converted to pixel coordinates relative to the image dimensions. The detection data is then stored in lists for further processing.
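The coordinate conversion above can be sketched as follows. YOLO emits centre-based, normalized (cx, cy, w, h) values, which must be scaled by the frame size and shifted to a top-left corner; the function name is ours, for illustration:

```python
def yolo_to_pixel_box(detection, frame_w, frame_h):
    """Convert one detection's normalized (cx, cy, w, h) values to a
    pixel-space box [x, y, w, h], where (x, y) is the top-left corner."""
    cx, cy, w, h = detection[:4]
    box_w = int(w * frame_w)
    box_h = int(h * frame_h)
    x = int(cx * frame_w - box_w / 2)   # shift from centre to top-left
    y = int(cy * frame_h - box_h / 2)
    return [x, y, box_w, box_h]
```

For a 320×320 input, a detection centred mid-frame with normalized size 0.25×0.5 maps to an 80×160-pixel box whose top-left corner is at (120, 80).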
4. Non-Maximum Suppression (NMS)
Applies Non-Maximum Suppression to filter overlapping bounding
boxes, retaining only the most confident detections.
5. Drawing Bounding Boxes and Speaking Labels
For each detection, a bounding box and label are drawn on the frame, and espeak is used to say the label out loud.
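The cooldown behaviour described for the audio alerts can be sketched as below. This is a reconstruction of the idea, not the project's exact code: the should_speak helper, its default cooldown, and the espeak invocation shown in the comment are illustrative assumptions.

```python
import time

# label -> timestamp of the last time that label was spoken
_last_spoken = {}

def should_speak(label, cooldown=5.0, now=None):
    """Return True (and record the time) only if `label` has not been
    announced within the last `cooldown` seconds."""
    now = time.monotonic() if now is None else now
    last = _last_spoken.get(label)
    if last is None or now - last >= cooldown:
        _last_spoken[label] = now
        return True
    return False

# In the loop, speaking a detection might then look like:
#   if should_speak(label):
#       os.system(f'espeak "{label}" &')   # non-blocking audio alert
```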
6. Lane Detection Using Edge Detection and Hough Transform
The frame is converted to greyscale and a Gaussian blur is applied to reduce noise. Canny edge detection is then applied, followed by the Hough Line Transform to detect lines, which are candidates for lane boundaries.
7. Slope Calculation and Line Filtering
The slope of each line is calculated, and very steep or near-horizontal slopes are filtered out. Duplicate and redundant slopes are avoided for increased accuracy of the direction analysis.
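A minimal sketch of this slope computation and filtering; the numeric thresholds are illustrative, since the report does not state the exact cutoffs used:

```python
def filter_slopes(lines, min_abs=0.3, max_abs=5.0, tol=0.05):
    """Compute slopes for Hough line segments, dropping near-horizontal,
    near-vertical, and near-duplicate slopes.
    lines: iterable of (x1, y1, x2, y2) endpoints."""
    slopes = []
    for x1, y1, x2, y2 in lines:
        if x2 == x1:
            continue                       # vertical line: undefined slope
        slope = (y2 - y1) / (x2 - x1)
        if not (min_abs <= abs(slope) <= max_abs):
            continue                       # too flat or too steep
        if any(abs(slope - s) < tol for s in slopes):
            continue                       # near-duplicate slope
        slopes.append(slope)
    return slopes
```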
8. Direction Estimation
The slope list is passed to the get_direction() function, which determines the most probable movement direction (e.g., Left Turn, Right Turn, No Turn) based on the prevalence of left- or right-tilted lines, and the result is displayed on the live feed.
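Since the body of get_direction() is not reproduced in the report, the following is one plausible reconstruction based on the description above: it counts left- versus right-tilted slopes and requires a small margin before declaring a turn. The sign convention (negative slope = left tilt in image coordinates) and the margin value are assumptions.

```python
def get_direction(slopes, margin=1):
    """Infer a turn direction from lane-line slopes.
    Assumes image coordinates (y grows downward): a prevalence of
    negative slopes is treated as a left tilt, positive as a right tilt."""
    left = sum(1 for s in slopes if s < 0)
    right = sum(1 for s in slopes if s > 0)
    if left - right >= margin:
        return "Left Turn"
    if right - left >= margin:
        return "Right Turn"
    return "No Turn"
```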
9. Display and Exit Handling
Displays the processed frame in a window and monitors for the ESC key (ASCII 27) to break the loop and terminate the program.
CHAPTER-4: ASSEMBLING THE
MODEL
We have used a 3D model to compile all the components in one place and
make it portable. A 3D illustration of the same is designed using ‘Tinkercad’:
All the elements are connected as shown below:
CHAPTER-5: RESULTS AND
DISCUSSION
1. Object Detection Performance
The system successfully detected a wide range of objects such as people, vehicles, traffic signs, and obstacles using the YOLOv3-tiny model. The detections were displayed on the live video feed in real time with bounding boxes and corresponding class labels. Additionally, audio alerts were accurately delivered using the espeak module, providing timely voice feedback.
Detection Accuracy: The model demonstrated high precision for
commonly seen objects, especially when lighting conditions were
favourable.
Speed: Real-time detection was achieved with minimal latency due to
the lightweight nature of the YOLOv3-tiny architecture and the use of
optimized image resolutions.
Audio Feedback: The cooldown mechanism ensured that labels were not
spoken repeatedly, making the system more user-friendly.
2. Direction Detection (Lane Guidance)
The project also implemented a basic lane detection algorithm using edge
detection and the Hough Transform. The system could effectively identify the
general direction of road curvature and announce instructions such as “Left
Turn,” “Right Turn,” or “No Turn.”
Robustness: While the direction detection worked well on clear road
patterns, performance degraded in poor lighting or noisy environments
with many vertical lines (e.g., fences).
False Positives: Occasional noise in line detection led to inaccurate slope
calculations, though this was partially mitigated by slope filtering and de-
duplication logic.
3. Integration and Real-Time Performance
The system ran continuously on a Raspberry Pi, demonstrating that both
object detection and lane estimation can be performed efficiently on
low-power embedded devices.
The live feed displayed processed frames with detected objects and
directional cues, making it suitable for assistive or robotics applications.
CHAPTER-6: FUTURE SCOPE
AND LIMITATIONS
Future Improvements
Integration of more advanced lane detection using deep learning.
Addition of obstacle avoidance and GPS integration for mobile robotic
applications.
Use of threading or multiprocessing to improve frame processing speed.
Model training on a custom dataset.
Limitations
The YOLOv3-tiny model trades off some accuracy for speed. More
accurate models could be used if computational resources allow.
The system lacks temporal smoothing, so occasional false positives or
jitter in object direction may occur.
CHAPTER-7: BIBLIOGRAPHY
1. Software Installation (Method 1): https://youtu.be/ntaXWS8Lk34?si=wog5Ip1nEMBtnw-7 / https://youtu.be/sxLs-2bz23w?si=VDCqcn8tmdrYgpgY
2. Software Installation (Method 2) + Setup of Raspberry Pi 4: https://youtu.be/2RHuDKq7ONQ?si=tpyU421pOmkNx_Rx
3. Training YOLOv3: https://youtu.be/2_9M9XH8EDc?si=MEIUTg3sCX5WtqyP
4. Understanding YOLO: https://youtu.be/ag3DLKsl2vk?si=GNVkZR8OTzBB3hmh / https://en.wikipedia.org/wiki/You_Only_Look_Once
5. Integrating YOLOv3 with Python (Reference for Code): https://youtu.be/yZ1b3PENhhM?si=rg-p7AspN5lqj-Bg
6. Introduction to 3D Printers/Printing: https://youtu.be/2vFdwz4U1VQ?si=QadipuazwOB04xR7
7. Understanding how Tinkercad works: https://youtu.be/24ByEWUmJ3g?si=ZKFQURtZdc5_xFvM