
2019 International Conference on Nascent Technologies in Engineering (ICNTE 2019)

Object Recognition for The Visually Impaired


Heetika Gada¹, Vedant Gokani², Abhinav Kashyap³, Amit A. Deshmukh⁴
¹,²,³ U. G. Student, Department of Electronics and Telecommunication, D. J. Sanghvi College of Engineering
⁴ Professor and Head of Electronics and Telecommunication Department, D. J. Sanghvi College of Engineering
E-mail: [email protected], [email protected], [email protected], [email protected]

Abstract—Today, images and videos are everywhere; the sheer quantity of images on social media and networking sites is unfathomable. Every device is now fitted with a camera, and this opens up huge possibilities. Object recognition is the process of detecting an object and identifying it using various image algorithms. The main purpose of this paper is to recognize objects in real time and assign them to previously defined classes. The algorithms we utilize are computationally efficient. Previously, object detection was done using RFID and IR technologies, which required dedicated hardware, but with the advent of image processing and neural networks we require almost no new hardware: almost everything has a camera these days, from pens to mobile phones. This has given rise to a new field called computer vision, i.e. using pictures and videos to detect, segregate and track objects or events so that we can "understand" a real-world scenario.

Keywords—Haar Cascade, numpy array, depth calculation, tensorflow, datasets

I. INTRODUCTION

Blind people have traditionally relied on guide canes and physical touch to sense objects. With about 285 million people worldwide having some form of visual impairment, according to the World Health Organization, developing visual aids is one of the most vibrant research areas in the computer vision community. We have designed one such aid by combining the traditional cane with a device which uses neural networks and image processing to guide the visually impaired.

One of the biggest challenges in computer vision and image processing is achieving truly invariant object recognition. Although concepts like image matching, robust feature detection and 3D models have been in the conversation for a long time, it is only recently, more specifically since the twilight of the last decade, that researchers and professionals have approached this problem seriously, and only recently has there been substantial progress in implementing algorithms that detect invariant features in everyday, more complex images. The early endeavors towards digital image recognition were limited in scope: identification was limited to corners and edges. This proved effective but had many limitations, as the mere recognition of corners was not enough for the elaboration of 3D models, and object reconstruction suffered in many cases. Hence another class of algorithms, focused on matching textures, was included.

There have been many projects and technical publications in this domain. One such system uses ultrasonic sensors to discover objects and hurdles [1]. High-grade ultrasonic sensors may be user friendly, but they are strongly affected by temperature variations and have trouble scrutinizing reflections from soft, curved, slim and tiny objects. Alerts are generated with buzzers and alarms to make sure the person does not meet with an accident. That paper also makes use of a dangling-object detection algorithm which determines the position of an object and whether it falls in the warning range; its analysis shows that users can quickly obtain real-time output from the device while moving. The use of buzzers is a viable option in our project as well, as it is an ideal output mode for the visually impaired. Using the Haar Cascade algorithm, one can build a similar comparison model for recognition, extracting images and making the identification by dividing each image into various regions and running the algorithm in every region for higher operational accuracy [2]. The system can also be implemented using RFID tags, which increases the overall accuracy. This is based on transmitter-receiver technology: objects are fitted with a transmitter, the person holds the receiver, and data transferred from a tag to a reader can then inform decisions [3]. Using the RSSI (Received Signal Strength Indicator) value, one can measure the distance between the detector and the object, as well as the motion of the locator with respect to the RFID tag. The usage of Haar-like features makes the task simpler and easier to design: a Haar-like feature takes neighboring pixel regions at a given point in a detection frame, sums the pixel intensities in each region, and computes the difference between these summations. This difference is then used to classify sub-regions of an image [4]. The detection frame slides over the entire image and yields a positive or negative value at each position: if the object is present the frame gives a positive value, otherwise a negative one. By carrying out this process over the entire image we can accurately identify the object. The overall speed of this process is high, so there is little chance of a delayed or incorrect output. A further advancement in the object detection domain has been the use of haptic technology in a virtual environment [5]. With the help of virtual reality one can construct an ideal route for the visually impaired. In the object recognition scenario, this reduces the time required for the system to identify objects based on different shapes, which are comprehended using different edges, points and lines.
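The Haar-like feature computation outlined above (summing pixel intensities in neighboring rectangular regions and differencing the sums) is conventionally accelerated with an integral image, from which any rectangle sum can be read in constant time. The following is a minimal illustrative sketch in numpy, not the code used in this work:

```python
import numpy as np

def integral_image(img):
    # Cumulative sums along both axes; entry (r, c) holds the sum of all
    # pixels in img[:r+1, :c+1].
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    # Sum of pixels in the half-open rectangle [r0:r1, c0:c1], recovered
    # from the integral image with at most four lookups.
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

def haar_two_rect(img, r, c, h, w):
    # A two-rectangle Haar-like feature: the difference between the pixel
    # sums of the left and right halves of an h x w window at (r, c).
    ii = integral_image(img.astype(np.int64))
    left = rect_sum(ii, r, c, r + h, c + w // 2)
    right = rect_sum(ii, r, c + w // 2, r + h, c + w)
    return left - right
```

A window whose left half is bright and right half dark produces a large positive response, which is how such features flag edge-like structure; a full cascade evaluates many such features per window as the frame slides over the image.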
978-1-5386-9166-3/19/$31.00 ©2019 IEEE

Systems like the vOICe are very efficient at converting captured images to sound signals which the person can hear on a headset [6]. One can also use the Vibe system, which takes into account certain scanning laws and correlates image intensity to sound, giving a different sound level for different pixel intensities [7].

Traditionally, radars have also been used for object detection, especially when the objects are moving vehicles, but even in this scenario cameras are found to offer significant assistance to radars, giving more accuracy in motion, more desirable resolution and lower overall cost; Volvo's Blind Spot Information System is one example [8]. The fact that extra features can be added to already existing camera systems further favors using a camera over other devices. Detection of an object and its moving direction using depth calculation has also been realized [9]. For image acquisition, that work used the RGB camera and depth sensor of a Microsoft Kinect; for data collection, a total of 600 depth images of many different front scenes, along with their respective RGB images, were scrutinized to test the system. This again becomes a problem because of its computationally heavy nature, and real-time application would be very arduous. A near-range object detection system using randomly aligned stereo cameras is presented in [10]. It is based on stereo inverse perspective mapping: the left and right images are compared, obstacle detection is achieved using their differences, and after the transformation phase a comparison is made using a polar histogram. The system holds up well in a variety of conditions, but it requires two cameras, which increases cost and space requirements. A fisheye camera can be used for tracking multiple vehicles and monitoring blind areas [11]. That paper uses only one fisheye camera, which can view up to 180°, but fisheye cameras suffer from wide lens distortion towards the edges, and correction functions are expensive and complex, which is not feasible here. The paper also makes use of the AdaBoost algorithm, which again would prove computationally heavy for the limited RAM of a Raspberry Pi. A system based on a neural network trained with the back-propagation algorithm can be used to discriminate a specific object from other objects [12], but with back-propagation, once a network learns one set of weights, any new learning causes catastrophic forgetting. That paper itself concedes that its reliance on rhythmic features is detrimental; ground visibility is required, which is the main constraint. We feel that the Haar Cascade algorithm is better than alternatives like SIFT (Scale Invariant Feature Transform), SURF (Speeded Up Robust Features) and MSER (Maximally Stable Extremal Regions), which are mathematically complicated, computationally heavy and vulnerable to scale variations. The usage of Raspberry Pi hardware makes the development more cost efficient, and at the same time objects at larger distances can be detected more readily. We have used a basic camera to capture objects and implemented an algorithm to compare and give the result.

II. IMPLEMENTATION

We are implementing an object recognition system using the Haar Cascade algorithm. We have taken various real-time input datasets which are stored internally, and we import these images into our algorithm to keep as a reference model for comparison. Our input datasets comprise all the common objects a blind person could encounter in day-to-day life. We take our input image using a Raspberry Pi camera module interfaced with the Raspberry Pi hardware. We set our frame rate to 1 fps: at a frame rate of 10 fps, the Raspberry Pi takes more than 20 seconds to process the images and recognize the objects, and this time lag is not desirable. Keeping the frame rate low therefore helps us get an output with good accuracy without compromising on speed.

Fig. 1 Flowchart of the algorithm

We took objects from real life and captured more than 200 images of each object. This dataset was then trained for more than 40 epochs, i.e. cycles, so that the edges were clearly defined in these datasets. To add simplicity to our project, we utilized
2019 International Conference on Nascent Technologies in Engineering (ICNTE 2019)

the imageai library. Along with our original datasets, this library helped us recognize more objects. These images are made into model files by tensorflow. The real-time data was then compared with the datasets. The images were converted to a numpy array; numpy provides a high-performance array object and various tools for operating on it. The numpy array we made is a grid of values indexed by a tuple of non-negative integers. We also used the SciPy library, which provides primary functions for working with images: it read images from disk into numpy arrays and resized them. The numpy array was particularly useful in creating a set of image pixels on which various operations can be performed. Once the image was obtained, we applied the numpy function which gives the required value to every pixel and creates a grid. This array makes it easy for the classifier to initialize the recognition process and implement the required logic. We ran inference on one frame per second. In each frame the objects are detected, the appropriate functions are called, and in the end the objects are labelled with the classes we defined before. The probability of each object actually being present was also found. We set the confidence threshold to 55%: below this threshold, even though an object is detected and recognized, it is not shown in the output.

Fig. 2 Interfacing of Raspberry Pi with the camera

We interface the Raspberry Pi camera and the board, the two hardware platforms, on a blind stick which is manually used by the person, who moves the stick according to his or her convenience. This eliminates the need to keep servo motors below the camera to capture the image. In summary, when the input image is matched with a previously stored dataset image, the classifier recognizes a particular object and gives it the required name. It also calculates the probability of the image being recognized correctly and only displays it if it is above the threshold. Additionally, after the recognition, we used the depth calculation feature, with which we can estimate how far the object is from the person. The depth calculation can be done using various techniques such as triangle similarity, binocular disparity, photometric stereo and many more. After calculating the distance between the object and the identifier, we can give an output signal conveying the position of the object, how far it is and what route should be adopted to avoid collision.

We also added features for obstacle aversion. This is necessary as we want to avoid obstacles and design a clear path for movement. The object estimation has to be done perfectly so as to avoid any mishap. The signals could be as simple as 'left' and 'right' signals which notify the user about the object and update him or her on the change in route. The output also gives the object's name so that the user has a clear idea of what is in front of him or her. This output is given as audio so that it is friendly to the visually impaired person: first a buzzer gives an alert about the presence of an object, and then the audio output gives information about the object, how far it is and what direction change to adopt. Using these audio directions, one can easily and safely manage one's route.

III. RESULTS AND CONCLUSION

As we obtained the input images, the classifier converted them into a numpy array of defined values. After detecting the image, certain functions were called to find the edges and points of the image and determine its particular shape and dimensions. As we took input datasets of various objects, we were able to identify these objects by the class names we had defined and state a probability of each object appearing in the image. In the first part of this project we designed the algorithm for identifying objects in a given captured image. The objects covering a larger area of the image are considered large objects, and their probability of occurrence is high. We ran the code on various images with varying light and background to check the accuracy. The output showed even the smallest of objects being identified by the classifier.

Fig. 3 Object recognition on an image

As shown above, we took an image with various objects and the system detected all of them: we identified the person, the truck, and airplanes 1 and 2. The advantage of running on the image directly is computational efficiency; the image is processed much faster.
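Of the depth-calculation techniques named earlier, triangle similarity is the simplest to sketch: for an object of known physical width W that appears P pixels wide to a camera with focal length F (expressed in pixels), the distance is D = W·F / P, with F obtained from a one-time calibration shot. The sketch below is a hypothetical illustration of that relation; the calibration numbers are made up, not measurements from this system:

```python
def focal_length_px(known_distance_cm, known_width_cm, perceived_width_px):
    # One-time calibration: photograph an object of known width at a known
    # distance and invert the similar-triangles relation, F = P * D / W.
    return perceived_width_px * known_distance_cm / known_width_cm

def distance_cm(known_width_cm, focal_px, perceived_width_px):
    # Triangle similarity: D = W * F / P. The nearer the object, the wider
    # it appears in the frame, so distance falls as pixel width grows.
    return known_width_cm * focal_px / perceived_width_px
```

For example, calibrating on a 20 cm-wide object that appears 200 px wide at 100 cm gives F = 1000 px; if the same object later appears 400 px wide, it is estimated to be about 50 cm away. In practice the perceived width would come from the bounding box reported by the detector.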

Fig. 4 Object recognition on real time data

The image above is a real-time image; the system identifies the two persons and the chair properly, but incorrectly identifies the PCB board as a laptop. This error arises because the PCB board has edge features similar to a laptop's. It can be further reduced by running more training cycles; additionally, giving more images in the database will reduce the error. It is thus seen that it is difficult to identify smaller objects, or objects which cover a smaller area of the image, as they have a smaller probability of occurrence. This problem only occurs for real-time imaging, and hence we plan to work with more real-time datasets so as to eliminate this error.

IV. SOCIAL IMPACT

It is true that a lot of research has been done to aid the visually impaired, from wearable devices which use RFID tags [13] to mobile headsets that make use of ultrasonic sensors [1]. But all these solutions require the introduction of a device which might feel foreign to the user and require a learning period; people accustomed to traditional aids like guide canes might be uncomfortable with this transition. We propose an aid which, at its essence, makes the guide cane, the most commonly used visual aid, smarter: we simply attach our device to any regular cane. We believe the popularity of the cane is the reason our approach can make a significant impact. Also, in third world countries, where average income is very low, most devices would be financially out of reach for the general population. Our usage of cheap equipment like the Raspberry Pi camera and processor makes the aid extremely feasible and affordable; this was intentional, to cater to the poorer population of the blind. Our smarter cane can also be used in combination with the already available devices mentioned above; as it is based on image processing rather than IR or ultrasonic sensing, this independence of the input can enhance the overall experience. By focusing on the cane itself, our approach does not replace a timeless aid but rather improves its performance.

V. FUTURE WORK

With the advent of image processing, we require almost no new hardware: cameras are present almost everywhere. We can exploit this fact and extend the same technology we have used for object recognition to a variety of applications. The same algorithm can be used to detect and recognize a variety of faults or abnormalities in bone structures. This can be achieved by using a higher-resolution camera and extracting more features. Once this is designed, human supervision could be eliminated entirely for fundamental inspections of bone-related fractures, although this would require the introduction of vast and extensive datasets. The system could also, in theory, be used for home applications like security cameras that recognize certain people. Another real-time application could be rear-view-mirror moving vehicle detection; passenger alerting is a wide area where we can implement our system, since once an obstacle is tracked we can easily re-route, ensuring passenger safety. Beyond industrial, medical and security purposes, object detectors find a big role in household applications: if we are able to integrate such a system with a basic automation system, our workload would become much smaller.

VI. REFERENCES

[1] C. H. Lin, P. H. Cheng and S. T. Shen, "Real-time dangling objects sensing: A preliminary design of mobile headset ancillary device for visually impaired," 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, August 2014.
[2] Aniqua Nusrat Zereen and Sonia Corraya, "Detecting real time object along with the moving direction for visually impaired people," 2016 2nd International Conference on Electrical, Computer & Telecommunication Engineering (ICECTE), December 2016.
[3] Alessandro Dionisi, Emilio Sardini and Mauro Serpelloni, "Wearable object detection system for the blind," 2012 IEEE International Instrumentation and Measurement Technology Conference Proceedings, May 2012.
[4] Sander Soo, "Object detection using Haar-cascade Classifier," Distributed Systems Group, University of Tartu.
[5] Georgios Nikolakis, Dimitrios Tzovaras and Michael G. Strintzis, "Object recognition for the blind," 2005 13th European Signal Processing Conference, April 2005.
[6] M. Auvray, S. Hanneton and J. K. O'Regan, "Learning to perceive with a visuo-auditory substitution system: Localizations with 'The vOICe'," Perception, vol. 36, pp. 416-430, 2007.
[7] B. Durette, N. Louveton, D. Alleysson and J. Hérault, "Visuo-auditory sensory substitution for mobility assistance: testing The Vibe," Workshop on Computer Vision Applications for the Visually Impaired, 2008.
[8] Volvo Blind Spot Information System. [Online]. Available: http://qalive.volvocars.com/ie/top/myvolvo/guides-manuals/Pages/Volvo-BLIS.aspx
[9] L. Zhao and C. Thorpe, "Stereo and neural network-based pedestrian detection," in ITS, 2000, vol. 1, no. 3, pp. 148-154.
[10] Aniqua Nusrat Zereen and Sonia Corraya, "Detecting Real Time Object Along with the Moving Direction for Visually Impaired People," International Conference on Electrical, Computer & Telecommunication Engineering (ICECTE).
[11] Damien Dooley, Brian McGinley, Ciarán Hughes, Liam Kilmartin, Edward Jones and Martin Glavin, "A Blind-Zone Detection Method Using a Rear-Mounted Fisheye Camera With Combination of Vehicle Detection Methods."

[12] Alberto Broggi, Paolo Medici and Pier Paolo Porta, "StereoBox: A Robust and Efficient Solution for Automotive Short-Range Obstacle Detection," EURASIP Journal on Embedded Systems, vol. 2007.
[13] Alessandro Dionisi, Emilio Sardini and Mauro Serpelloni, "Wearable object detection system for the blind," 2012 IEEE International Instrumentation and Measurement Technology Conference Proceedings.
[14] Keisuke Kubo, Teijiro Isokawa and Nobuyuki Matsui, "On Recognition of Three-Dimensional Objects Using Curvature Information," 2015 7th International Conference on Emerging Trends in Engineering & Technology (ICETET), November 2015.
[15] J. Kaszubiak, M. Tornow, R. W. Kuhn and B. Michaelis, "Real-time, 3-D multi object position estimation and tracking," Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), August 2004.
[16] M. M. Farhad, S. M. Nafiul Hossain, Md. Imtiaz Hossain, Md. Sohorab Hossain and Mohiuddin Ahmad, "An efficient moving object detection and distance measurement algorithm using correlation window," 2014 9th International Forum on Strategic Technology (IFOST), October 2014.
