
IEEE International Conference on Computer, Communication and Control (IC4-2015).

VisualPal: A Mobile App for Object Recognition for the Visually Impaired

Shagufta Md. Rafique Bagwan
Sinhgad Academy of Engineering, Kondhwa, Pune, India
[email protected]

Prof. L. J. Sankpal
Sinhgad Academy of Engineering, Kondhwa, Pune, India
[email protected]

Abstract—As far as outdoor activities are concerned, the blind face difficulties in safe and independent mobility, depriving them of a normal professional and social life. There are also issues of communication and access to information. Software applications exist for computers and touch screen devices equipped with speech synthesizers. This project is for the visually impaired and is based on the Android platform. The project detects the direction of maximum brightness and the major colors. The major module of the project scans and detects the object in the image captured by the in-built camera of a smartphone. It is a dedicated image recognition application running on an Android smartphone. It overcomes the limitation of previously existing technologies that require the visually impaired to possess special hardware or devices. Image recognition results are communicated to the blind user by means of pre-recorded verbal messages. By the use of a hybrid algorithm, the computation time is reduced and the correct object is detected very fast. The visually impaired thus benefit greatly from this project. Use of the Android platform overcomes the shortcomings of the already existing devices. Color information is used in addition to grayscale information. The object can be recognized correctly from any orientation.

Keywords—Image Processing; Smartphone; Visually Impaired; Android

I. INTRODUCTION

Visually impaired people face many problems in their day-to-day life. Unlike a normally sighted person, they are unable to view their surroundings. Hence, they have limitations in almost every aspect of their lives, such as mobility and decision making. They face difficulties in accessing information and communicating it. Thus, their personal, social as well as professional life is affected [1].

Keeping such people in mind, many technologies have been developed for their assistance. Blind people have certainly been benefitting from modern technologies. These technologies, however, also have limitations. They are mostly dedicated devices, such as navigation devices and color readers, that have to be handled by the blind user constantly. Thus, they have to be carried around by the blind users, which is inconvenient, and they are expensive. There are a number of software applications too, that provide assistance in accessing information through speech synthesizers.

Today, the smartphone has become a basic necessity of every individual, including blind users. Earlier there were phones with screen readers that read the entire content on screen; phones with physical keyboards were also popular. The advent of the smartphone has helped blind users too. The Android smartphone provides a great platform for dedicated applications for blind users, as it is equipped with a speech synthesizer to support them.

Here, we propose 'VisualPal', a mobile phone application that detects the direction of maximum light, detects colors and, most importantly, recognizes objects. We also propose a hybrid algorithm that uses an Artificial Neural Network and Euclidean Distance measures in combination, which makes the proposed system a strong object recognition mobile application for blind users. The results are communicated to the blind user using verbal messages.

The paper is structured as follows. Section II reviews related work. Section III describes the existing system, Section IV describes the proposed system, Section V compares the proposed system with the existing system, Section VI presents the mathematical model, Section VII covers the results and discussion, Section VIII mentions the future scope and finally we conclude.

II. RELATED WORK

Computer vision is an advanced field comprising different methods to acquire, analyze, process and understand images. This technology is useful in today's world, where there is much scope for research. It is useful for physically disabled people too, and visually impaired users can make use of it. Research shows that there is a constant urge for better technology for such people.

Researchers have done considerable work in the past, and technology aids visually impaired users to a great extent. Software applications for computers and touch screen devices are available which are equipped with speech synthesizers. They help blind users in internet surfing and accessing

information like text documents. Also, there are low-tech labeling systems [2-3], in which labels such as text messages (in Braille) or tactile signs are attached to objects, and there are high-tech systems like RFID [4].

Recognizer [5] is software developed by LookTel that recognizes household objects. The EyeRing project [6] is another application, worn on the finger by the blind user. It communicates with an Android phone, detects banknotes and colors, and calculates distances. A major drawback is that it is expensive and requires the blind user to wear it constantly.

David G. Lowe presented a method for extracting distinctive invariant features from images [7] that can be used to perform reliable matching between different views of an object or scene. These features are used in recognition, together with a recognition approach that robustly identifies objects among clutter and occlusion in real time.

Kanghun Jeong and Hyeonjoon Moon proposed a real-time object recognition system for smartphone environments exploring the SIFT, SURF and FAST corner detection algorithms, which provide faster computation of features as only corner information is extracted [8].

The book "Assistive Technology for Visually Impaired and Blind People" [9] describes the assistive technology available for the visually impaired to overcome their limitations in society as well as in their homes. It also describes the engineering principles and the ways of their usage. In the book "Computer Vision: Algorithms and Applications" [10], Richard Szeliski presents several computer vision approaches, image processing basics, concepts in recognition and many other concepts required to carry out an image processing and computer vision project.

Artificial Neural Networks have been used for ECG pattern recognition [11]. Sung-Hyuk Cha states that distance or similarity measures are essential for solving many pattern recognition problems such as classification and retrieval [13].

III. EXISTING SYSTEM

The existing system is a dedicated mobile phone application that detects colors, light and objects [12]. The image is captured using the camera of the Android mobile with automatic flash. The RGB images are converted into Hue-Saturation-Intensity color images. The color is detected and a message is conveyed, or, if the image is too dark or too bright, a warning message is conveyed. The light detector module generates a high-frequency audio signal on detection of bright light. The frequency of the generated sound depends on the brightness of the light.

The existing system recognizes objects from images recorded by the camera of the mobile device. SIFT was applied in this application. First, an RGB image is captured by the camera of the mobile phone, then the image is converted into a grayscale image. Next, keypoints are detected and, finally, a descriptor is calculated for each detected keypoint [7]. Pairing the keypoints describing objects from the database with the keypoints describing newly scanned objects produced many false positive recognition results. Pairs of classified keypoints that feature different rotations were treated as invalid and excluded from the recognition algorithm.

IV. PROPOSED SYSTEM

Technology can aid visually impaired users to a large extent. As the Android smartphone has become a basic necessity even for visually impaired users, a dedicated application can be very beneficial for such people.

Here, we propose "VisualPal", a system for object recognition for visually impaired users. It also detects major colors and the direction of maximum brightness. This system would be a boon to partially blind users and also to color blind users.

An image is captured by the visually impaired user using the in-built camera of the smartphone at the minimum possible resolution. The object in the captured image is detected and recognized by the application on the mobile and conveyed to the blind user in the form of verbal messages. For object recognition, we propose a hybrid algorithm in which Artificial Neural Networks and Euclidean Distance measures [13] are used in combination.

The major colors of the object can be recognized and the direction of maximum brightness can be detected by this application.

A. Color Detection

The major colors are detected by the application and conveyed to the user in the form of speech. The image is captured by the in-built camera of the smartphone at the smallest possible resolution. The RGB image is converted to an HSV image. We separate the color components from intensity for various reasons, such as robustness to lighting conditions and removal of shadows. The 'H' component represents the actual color and can be used for major color detection. The average value of the H component is calculated and compared with a table of reference colors, and the color of the object is detected.

B. Detection of Direction of Maximum Brightness

A partially blind person can benefit from this module, which detects the direction of maximum brightness. As the intensity of brightness increases, the frequency of the generated sound signal increases. The camera preview is used for this. Thus the partially blind user can find the direction in which he should proceed.
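The color-detection step of subsection A (convert RGB to HSV, average the H component, look the average up in a reference table) can be sketched as follows. The hue boundaries and color names below are illustrative assumptions, not the table used by VisualPal, and the circular nature of hue is ignored for simplicity:

```python
import colorsys

# Assumed reference table: upper hue bound (degrees) -> color name.
HUE_NAMES = [
    (15, "red"), (45, "orange"), (70, "yellow"),
    (160, "green"), (260, "blue"), (330, "magenta"), (360, "red"),
]

def average_hue(pixels):
    """Naive average hue (0-360) of an iterable of (r, g, b) pixels, 0-255 each."""
    hues = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0] * 360
            for r, g, b in pixels]
    return sum(hues) / len(hues)

def major_color(pixels):
    """Map the average H component to a color name via the reference table."""
    h = average_hue(pixels)
    for upper, name in HUE_NAMES:
        if h <= upper:
            return name
    return "red"  # hue wraps around to red at 360 degrees

# A patch of yellowish pixels is reported as "yellow".
print(major_color([(250, 240, 10), (240, 230, 20), (255, 250, 0)]))
```

In a real Android implementation the HSV conversion would come from the camera frame rather than per-pixel library calls, but the lookup logic is the same.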

C. Object Recognition

Fig. 1 shows a simple block diagram of the application on an Android smartphone.

The main module of the system is the object recognition module. It detects objects in the image captured by the in-built camera of an Android smartphone and conveys the result, which is the identified object, in the form of verbal messages. TextToSpeech is used to convey the results. The object recognition algorithm is thus a dedicated application for the Android smartphone. Various image processing algorithms are applied to the image captured by the in-built camera, with the flash either on or off. These algorithms include blurring, edge detection, thresholding and boundary detection.

The captured image goes through a number of processes. Edge features as well as color features are extracted. The use of color features makes it possible to get correct results irrespective of the brightness or lighting conditions. An Artificial Neural Network (ANN) is used for training and detection. The ANN is used as a classifier, to which the extracted features are given as input. The back-propagation algorithm is used for training the neural network. In training, the input is given to the ANN and the output obtained is the set of trained objects. The objects have object tags, i.e. a name is assigned to each object image. We propose a hybrid algorithm that recognizes an object correctly and conveys the result using TextToSpeech.

Hybrid Algorithm:
In the detection phase,
- The features extracted from the image are given as input to the ANN.
- For all trained objects, we obtain output values.
- The Euclidean Distance 'd' is applied using its formula.
- The Euclidean Distance measure calculates the error, i.e. the value by which the output differs from the trained objects.
- The error is compared with the threshold.
- If the error obtained is within the acceptable range, the image is identified.
- A verbal message is generated for the user.
- If the error obtained is greater than the threshold, the object is a new object.
- The new object image is trained.

In detection, the output obtained is a value, and the object can be identified based on the values obtained. Further, the Euclidean distance measure is used to calculate the error between the values. Comparison with the threshold values increases the possibility of getting the correct result. For example, if the user captures a completely new image, i.e. an image on which the ANN has not been trained, the system does not forcefully recognize it and give a wrong result. Thus the hybrid algorithm, combining Artificial Neural Networks and Euclidean distance measures [13], gives correct results rather than wrong, forced ones. The use of the ANN makes the application strong by increasing its performance efficiency.

Fig. 2. Proposed system processing steps

Fig. 2 shows the steps of processing an image after it is captured in the proposed system. The hybrid algorithm is applied before the recognition step.
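The detection phase listed above can be sketched as follows. The trained output vectors, their dimensions and the threshold value are illustrative assumptions (the paper's own error range of 0-10 is on a different, unspecified scale), and the ANN forward pass is assumed to have already produced `ann_output`:

```python
import math

# Assumed training results: object tag -> ANN output vector recorded in training.
TRAINED = {
    "mouse":  [0.9, 0.1, 0.0],
    "bottle": [0.1, 0.8, 0.1],
}
THRESHOLD = 0.3  # assumed acceptable error range

def recognize(ann_output):
    """Hybrid-algorithm detection: compare the ANN output against every
    trained object using the Euclidean distance and accept the closest
    match only if its error is within the threshold."""
    best_tag, best_err = None, float("inf")
    for tag, target in TRAINED.items():
        err = math.dist(ann_output, target)  # Euclidean distance 'd' as error
        if err < best_err:
            best_tag, best_err = tag, err
    if best_err <= THRESHOLD:
        return best_tag   # identified: a verbal message would be generated
    return None           # new object: the user would be prompted to train it

print(recognize([0.85, 0.15, 0.05]))  # close to the "mouse" vector
print(recognize([0.3, 0.3, 0.4]))     # unfamiliar output, rejected
```

The threshold comparison is what prevents a forced wrong answer: an output far from every trained vector yields no match instead of the nearest (wrong) tag.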

V. COMPARISON WITH EXISTING SYSTEM

The existing system uses SIFT (Scale Invariant Feature Transform), a powerful computer vision algorithm, as its object recognition algorithm. SIFT detects and describes local image features, and allows feature descriptors to be computed which are used to recognize objects. The RGB image is converted into greyscale, and color information is not used. The keypoint detection procedure is then performed. The proposed system instead uses Artificial Neural Networks for training, and Euclidean distance measures for calculating the error. Thus a hybrid algorithm, a combination of Artificial Neural Networks and Euclidean distance measures, is used, which increases the accuracy of the application by reducing the rate of false positive results.

VI. MATHEMATICAL MODEL

Let S = {I, IGS, ITH, IB, IED, IBD, IC, IHSV, IH, IN, IREG, IREC}
where
I: original image set extracted from a video feed
IB: blurred image set obtained by applying a blur filter to the original image to smoothen the object within the frame
IGS: image set after greyscale conversion
ITH: image set after applying the threshold filter
IED: image set after the edges have been detected by the edge detection algorithm
IBD: image set after blob detection
IC: image set after cropping the object
IHSV: image set after conversion to Hue-Saturation-Value
IH: image set after the histograms have been computed
IN: image set after normalization
IREG: the set of registered images
IREC: the set of images that are recognized

1) Relevant Mathematics:

i = ExtractFrame()    (1)

a) Blur
iB = Blur(i)    (2)
A Gaussian blur is used for this.

b) Greyscale
IGS = Grayscale(iB)    (3)

The Sobel operator is used in the edge detection algorithm. A convolution mask is much smaller than the actual image; the mask is slid over the image, manipulating a square of pixels at a time. If Gx and Gy are the Sobel masks represented by matrices, then the magnitude of the gradient is calculated as:

|G| = sqrt(Gx^2 + Gy^2)

c) Threshold
ITH = Threshold(IGS)    (4)

In the analysis of objects in images we should be able to distinguish between the relevant parts, i.e. the objects of interest, and the rest, which we call the background. Segmentation techniques are used to find the objects of interest. A parameter called the brightness threshold θ is chosen and applied to the image a[m, n] as follows.

For light objects on a dark background:
If a[m, n] ≥ θ then a[m, n] = object = 1
Else a[m, n] = background = 0

For dark objects on a light background:
If a[m, n] < θ then a[m, n] = object = 1
Else a[m, n] = background = 0

d) RGB to HSV
The RGB to HSV conversion is carried out to find the value of 'H', i.e. Hue.
IHSV = RGBtoHSV(IC)    (5)

e) Artificial Neural Networks
A neuron with k inputs transforming a set X ⊆ IR^k of input signals (a k-neuron on X) is a function

n(x) = f(⟨w, x⟩)

where 'w' is a weights vector, ⟨·,·⟩ is a real scalar product, and f : IR → IR is called the activation function of the neuron. If f is a linear operator, the neuron is called linear. A function of this form, with the weights vector obtained by training, is said to be a trained k-neuron on X.

f) Euclidean Distance Measures
The Euclidean distance between a point X = (X1, X2, ...) and a point Y = (Y1, Y2, ...) is given by the formula

d(X, Y) = sqrt((X1 − Y1)^2 + (X2 − Y2)^2 + ...)

VII. RESULTS AND DISCUSSION

Fig. 3 shows the GUI of the Object Recognition module. It shows three options, namely Train New Objects, Detect Objects and Reset Memory. Genymotion is used as a virtual environment to test the application.
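The brightness thresholding of step (4) in Section VI can be sketched as follows; the threshold value θ passed in is an assumed parameter, since the paper does not fix one:

```python
def threshold(image, theta, light_object=True):
    """Segment a greyscale image a[m, n] into object (1) and background (0).
    For a light object on a dark background, pixels >= theta are object;
    for a dark object on a light background, pixels < theta are object."""
    if light_object:
        return [[1 if p >= theta else 0 for p in row] for row in image]
    return [[1 if p < theta else 0 for p in row] for row in image]

a = [[ 12, 200],
     [180,  30]]
print(threshold(a, 128))         # light object: [[0, 1], [1, 0]]
print(threshold(a, 128, False))  # dark object:  [[1, 0], [0, 1]]
```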

Fig. 3. GUI of Object Detection Module

Fig. 4 shows the GUI for registration and training of the objects. The objects can be registered as objects one to five. Each object is tagged, i.e. each object is given a name. The training of these objects can then be carried out. The object will be recognized by its given name in the detection phase. The user is prompted to train the registered objects if he presses the back key.

Fig. 4. Registration and Training of the Objects

Fig. 5. Recognized object on Samsung Smartphone

Fig. 5 shows that an object has been recognized as a mouse. The verbal message given to the user is "The current object is a mouse". A change in the orientation of the object also gives correct results.

If the "detect color" checkbox is checked and the user touches the screen of the smartphone, the color detector module produces a verbal message for the detected color.

Fig. 6. Color Detection Result

Fig. 6 shows the color detection result. The color detected is yellow and the verbal message given to the user is "Color is Yellow". The conversion from RGB to HSV makes it easier to detect the correct color. Color blind users who cannot differentiate between colors can benefit

from this. The light detector module shows the direction of maximum brightness. A user with low vision can be shown the way by this module through the increasing frequency of the audible sound in the direction of brightness.

Fig. 7 shows a graph of the accuracy percentage for Class 1 and Class 2.

Fig. 7. Increase in Accuracy Percentage using Hybrid Algorithm (Class 2)

Class 1 shows the accuracy percentage when only the ANN is used, and Class 2 (in red) shows the increased accuracy percentage due to the use of the hybrid algorithm. Use of the hybrid algorithm increases the accuracy percentage, which is not the case when the ANN is not used in combination with the Euclidean distance. For example, for a set of 100 objects, if the hybrid algorithm detects 98 objects, then all 98 objects are correctly detected; it cannot give a false positive result. The Euclidean distance calculates the error, which is compared with the threshold value that has been set. For example, we set the threshold to detect an object only if the error is in the range 0-10; we get a result only when the output values are within this range. In this case the accuracy percentage is 97.5%.

VIII. FUTURE WORK

The application is under test on different Android smartphones. The work can be extended further by making access to the application completely touch free. This would make it easier for the visually impaired user to access the mobile app.

IX. CONCLUSION

'VisualPal', an object recognition system, is proposed for visually impaired users. It is a dedicated mobile application for the visually impaired that detects colors and the direction of maximum brightness and, most importantly, recognizes objects. A hybrid algorithm combining an Artificial Neural Network and the Euclidean distance measure is proposed, which increases the performance efficiency of the system. We obtain a high accuracy percentage.

ACKNOWLEDGMENT

I would like to thank the people who directly or indirectly contributed to the completion of this research. In particular, I would like to thank my guide, Prof. L. J. Sankpal, for her guidance in this field, and the Head of Department, Prof. B. B. Gite, for allowing me to continue this topic. I would also like to thank the Department of Computer Engineering, Sinhgad Academy of Engineering, Kondhwa, Pune for providing me with the facilities and for sharing their knowledge with me during my research work.

REFERENCES
[1] P. Strumillo, "Electronic navigation systems for the blind and the visually impaired", Lodz University of Technology Publishing House (in Polish), 2012.
[2] J. Gill, "Assistive devices for people with visual impairments", in S. Helal, M. Mokhtari and B. Abdulrazak (eds.), The Engineering Handbook of Smart Technology for Aging, Disability, and Independence, John Wiley and Sons, Inc., Hoboken, New Jersey, 2008, pp. 163-190.
[3] J. Onishi and T. Ono, "Contour pattern recognition through auditory labels of Freeman chain codes for people with visual impairments", IEEE International Conference on Systems, Man, and Cybernetics, Anchorage, Alaska, USA, 2011, pp. 1088-1093.
[4] K. Tinnakorn and P. Punyathep, "A voice system, reading medicament label for visually impaired people", Proceedings of RFID SysTech 2011, 7th European Workshop on Smart Objects: Systems, Technologies and Applications, Dresden, Germany, 2011.
[5] LookTel Recognizer, http://www.looktel.com/recognizer
[6] S. C. Nanayakkara, R. Shilkrot and P. Maes, "EyeRing: An Eye on a Finger", Intl. Conf. on Human Factors in Computing (CHI 2012), 2012.
[7] D. Lowe, "Object Recognition from Local Scale-Invariant Features", International Conference on Computer Vision, 1999, pp. 1150-1157.
[8] K. Jeong and H. Moon, "Object Detection using FAST Corner Detector based on Smartphone Platforms", International Conference on Computers, Networks, Systems, and Industrial Engineering, 2011.
[9] M. A. Hersh and M. A. Johnson (eds.), Assistive Technology for Visually Impaired and Blind People, Springer, London, 2008.
[10] R. Szeliski, Computer Vision: Algorithms and Applications, Springer, 2010.
[11] J. K. Basu, D. Bhattacharyya and T. Kim, "Use of Artificial Neural Network in Pattern Recognition", International Journal of Software Engineering and Its Applications, vol. 4, no. 2, April 2010.
[12] K. Matusiak, P. Skulimowski and P. Strumillo, "Object recognition in a mobile phone application for visually impaired users", HSI, Sopot, Poland, 2013.
[13] S. Cha, "Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions", International Journal of Mathematical Models and Methods in Applied Sciences, 2007.
