A Smart Reader For Visually Impaired People Using Raspberry PI
ISSN 2321 3361 © 2016 IJESC
Abstract:
According to the World Health Organization (WHO), an estimated 285 million people worldwide are visually impaired,
90% of whom live in developing countries [1], and forty-five million individuals worldwide are blind [2]. Although there are
many existing solutions to the problem of assisting blind individuals to read, none of them provides a reading
experience that in any way parallels that of the sighted population. In particular, there is a need for a portable text reader that is
affordable and readily available to the blind community. Inclusion of the specially enabled in the IT revolution is both a social
obligation and a computational challenge in today's rapidly advancing digital world. This work proposes a smart reader
for visually challenged people using Raspberry Pi. The paper addresses the integration of a complete text read-out system
designed for the visually challenged. The system consists of a webcam interfaced with a Raspberry Pi, which accepts a page of
printed text. The OCR (Optical Character Recognition) package installed on the Raspberry Pi scans it into a digital document, which is
then subjected to skew correction and segmentation, followed by feature extraction, to perform classification. Once classified, the text is
read out by a text-to-speech conversion unit (TTS engine) installed on the Raspberry Pi. The output is fed to an audio amplifier before it
is read out. The proposed system can be simulated in MATLAB. The simulation covers only the initial image
processing stage, i.e., the image-to-text and text-to-speech conversion performed by the OCR software installed on the Raspberry Pi. The
system finds interesting applications in libraries, auditoriums and offices where instructions and notices are to be read, and also in
the assisted filling of application forms. Results along with analysis are presented.
Keywords: Raspberry Pi, webcam, optical character recognition, text-to-speech engine, audio amplifier.
I. INTRODUCTION
Visually impaired people report numerous difficulties with accessing printed text using existing
technology, including problems with alignment, focus, accuracy, mobility and efficiency. We present a smart
device that assists the visually impaired by effectively and efficiently reading paper-printed text. The proposed
project uses a camera-based assistive device that people can use to read text documents.
The framework implements an image capturing technique in an embedded system based on the Raspberry Pi
board. The design is motivated by preliminary studies with visually impaired people, and it is small-scale and mobile,
which enables more manageable operation with little setup. In this project we propose a text read-out
system for the visually challenged. The proposed fully integrated system has a camera as an input device to feed in
the printed text document for digitization, and the scanned document is processed by a software module, the OCR
(optical character recognition) engine. A methodology is implemented to recognize the sequence of characters and the
line of reading. As part of the software development, the OpenCV (Open Source Computer Vision) libraries are
utilized to capture the image of the text and to perform the character recognition.

Most of the access technology tools built for people with blindness and limited vision are built on the two
basic building blocks of OCR software and Text-to-Speech (TTS) engines. Optical character recognition (OCR) is the
translation of captured images of printed text into machine-encoded text. OCR is a process that associates a symbolic
meaning (letters, symbols and numbers) with the image of a character. It is defined as the process of
converting scanned images of machine-printed text into a computer-processable format. Optical character recognition
is also useful for visually impaired people who cannot read text documents but need to access their content.
Optical character recognition is used to digitize and reproduce texts that have been produced with
non-computerized systems. Digitizing texts also helps reduce storage space. Editing and reprinting text documents that
were printed on paper are time consuming and labour intensive. OCR is widely used to convert books and documents
into electronic files for use in storage and document analysis. OCR makes it possible to apply techniques such as
machine translation, text-to-speech and text mining to the captured or scanned page [11].

The final recognized text document is fed to the output devices depending on the choice of the user. The
output device can be a headset connected to the Raspberry Pi
board or a speaker which can read the text document aloud.
International Journal of Engineering Science and Computing, March 2016 2997 http://ijesc.org/

Prevalence of blindness (per thousand)
As per the estimate of 2015
Types of blindness     Urban    Rural    Total
Total blindness         4.43     5.99     5.40
Economic blindness     11.14    15.44    13.83
One eye blindness       8.23     7.00     7.46

Power is supplied to the 5V micro USB connector of the Raspberry Pi through the Switched Mode Power Supply
(SMPS). The SMPS converts the 230V AC supply to 5V DC. The web camera is connected to the USB port of the
Raspberry Pi. The Raspberry Pi runs an OS named Raspbian, which processes the conversions. The audio output is taken
from the audio jack of the Raspberry Pi. The converted speech output is amplified using an audio amplifier. The
Internet is connected through the Ethernet port of the Raspberry Pi. The page to be read is placed on a base and the
camera is focused to capture the image. The captured image is processed by the OCR software installed on the
Raspberry Pi, which converts the image to text. The text is converted into speech by the TTS engine. The final
output is given to the audio amplifier, which is connected to the speaker. The speaker can also be replaced by
headphones for convenience.
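The chain described above (captured image, OCR, text, TTS, audio output) can be sketched as two command-line calls. This is a minimal sketch, not the authors' implementation: the paper does not name its OCR package, so the Tesseract command line is an assumption here, while eSpeak is the TTS engine the paper reports using. The function names, file names and the speaking rate of 140 words per minute are hypothetical.

```python
import shutil
import subprocess


def build_ocr_command(image_path, output_base):
    # Assumed OCR tool: `tesseract <image> <output base>` writes <output base>.txt
    return ["tesseract", image_path, output_base]


def build_tts_command(text_path, words_per_minute=140):
    # eSpeak: -s sets the speaking rate, -f reads the text to speak from a file
    return ["espeak", "-s", str(words_per_minute), "-f", text_path]


def read_page_aloud(image_path, output_base="page"):
    """Run OCR, then TTS. Returns False if either tool is not installed."""
    ocr = build_ocr_command(image_path, output_base)
    tts = build_tts_command(output_base + ".txt")
    if shutil.which(ocr[0]) is None or shutil.which(tts[0]) is None:
        return False
    subprocess.run(ocr, check=True)   # image -> page.txt
    subprocess.run(tts, check=True)   # page.txt -> audio jack
    return True
```

Calling read_page_aloud("capture.jpg") on a board with both tools installed would write page.txt and then speak it through the default audio output; the amplifier and speaker stages are hardware and are not modelled.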
Figure. 2 Block diagram of Proposed Method (blocks: power supply, SMPS, Raspberry Pi, web camera on USB port, Internet through Ethernet, speaker)

Figure. 3 Flow of process (blocks: character segmentation, character database, feature extraction, image to text converter)

The function for skew detection checks for an angle of orientation between ±15 degrees; if skew is detected, a
simple image rotation is carried out until the lines match the true horizontal axis, which produces a skew-corrected
image. The noise introduced during capture or due to poor quality of the page has to be cleared before further
processing.

3.3 SEGMENTATION
After pre-processing, the noise-free image is passed to the segmentation phase. Segmentation is an operation that seeks to
decompose an image of a sequence of characters into sub-images of individual symbols (characters). The binarized
image is checked for inter-line spaces. If inter-line spaces are detected, the image is segmented into sets of
paragraphs across the inter-line gap. The lines in the paragraphs are scanned for horizontal space intersection
with respect to the background. The histogram of the image is used to detect the width of the horizontal lines. Then the
lines are scanned vertically for vertical space intersection. Here histograms are used to detect the width of the words.
Then the words are decomposed into characters using character width computation.

3.4 FEATURE EXTRACTION
In feature extraction, each individual image glyph is considered and its features are extracted. First, a character glyph
is defined by the following attributes:
(1) Height of the character;
(2) Width of the character;
(3) Number of horizontal lines present (short and long);
(4) Number of vertical lines present (short and long);
(5) Number of circles present;
(6) Number of horizontally oriented arcs;
(7) Number of vertically oriented arcs;
(8) Centroid of the image;
(9) Position of the various features;
(10) Pixels in the various regions.

... conversion, which is done in OCR, can be simulated in MATLAB. The conversion process in MATLAB includes
the following steps.
1. Binary image conversion.
2. Complementation.
3. Segmentation and labeling.
4. Isolating the skeleton of the character.

4.1 SAMPLE IMAGE
The following image, captured by the webcam, contains the following word. The image is in
JPEG format and has to be converted into text.
Figure. 4 Sample Image

4.2 BINARY CONVERSION
In this section the sample image is converted into binary format. The image, which was initially a 3D image, is
converted to a 2D image. Binary 0 represents black and binary 1 represents white.
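The conversion steps discussed in this part of the paper (binary conversion, complementation, histogram-based segmentation, and isolating the skeleton) can be illustrated with a short sketch. It is written in plain Python rather than MATLAB and is not the authors' code. The threshold of 128, the function names, and the use of row and column sums as the "histograms" are assumptions; the skeleton step is a simplified version of the fire-propagation idea described in the paper, taking quench points as local maxima of the burn time under 4-connectivity.

```python
STEPS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connected neighbours


def binarize(gray, threshold=128):
    """Step 1: grayscale rows (0-255) -> binary image, 0 = black, 1 = white."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def complement(binary):
    """Step 2: invert so that text pixels become 1 and the background 0."""
    return [[1 - px for px in row] for row in binary]


def runs(profile):
    """(start, end) spans of consecutive non-zero histogram bins, end exclusive."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment(ink):
    """Step 3: text lines from the row histogram, then column spans per line."""
    result = []
    for top, bottom in runs([sum(row) for row in ink]):
        band = ink[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]
        result.append(((top, bottom), runs(col_profile)))
    return result


def is_ink(ink, r, c):
    return 0 <= r < len(ink) and 0 <= c < len(ink[0]) and ink[r][c] == 1


def burn_times(ink):
    """Step 4a: step at which the fire, lit on the boundary, reaches each text pixel."""
    time = [[0] * len(ink[0]) for _ in ink]
    frontier = [(r, c) for r in range(len(ink)) for c in range(len(ink[0]))
                if ink[r][c] and any(not is_ink(ink, r + dr, c + dc) for dr, dc in STEPS)]
    step = 1
    while frontier:
        for r, c in frontier:
            time[r][c] = step
        frontier = sorted({(r + dr, c + dc) for r, c in frontier for dr, dc in STEPS
                           if is_ink(ink, r + dr, c + dc) and time[r + dr][c + dc] == 0})
        step += 1
    return time


def quench_points(ink):
    """Step 4b: text pixels that no neighbour outlasts (where the fire fronts meet)."""
    time = burn_times(ink)
    return [(r, c) for r in range(len(ink)) for c in range(len(ink[0]))
            if ink[r][c] and all(time[r][c] >= time[r + dr][c + dc]
                                 for dr, dc in STEPS if is_ink(ink, r + dr, c + dc))]
```

On a 3-pixel-thick horizontal bar, burn_times gives 1 for the outer pixels and 2 for the centre row, and quench_points returns the centre row plus the four corners. The result may be disconnected, so a production skeletonization would need a proper thinning algorithm.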
... character is programmed and it can vary from 0 to 255 bits of characters occupying memory in the database.

Light fires simultaneously at all points along the boundary of this region and watch the fire move into the interior. At
points where the fire travelling from two different boundaries meets itself, the fire will extinguish itself, and the
points at which this happens form the so-called `quench line'.

VI. EXPERIMENTAL OUTPUTS
Figure 10(c) Text conversion

The text document to be read out has to be placed at a suitable distance from the webcam so that
the image is clear enough, with proper illumination. Figure 10(b) shows the terminal window that appears as
soon as Raspbian OS starts. In the terminal window, the command for image-to-text conversion has to be given.
A form window opens immediately. In the form window, a dialog box named ‘image to read’ is seen. That option has to
be clicked to enable the webcam. The webcam auto-focuses on the page and the image is captured.

Figure 10(a) shows a sample image captured using the webcam. The processed image is displayed in the form
window, as shown in Figure 10(c). The displayed image is read out by the text-to-speech engine eSpeak.

VII. CONCLUSION
We have implemented an image-to-speech conversion technique using Raspberry Pi. The simulation results have
been successfully verified and the hardware output has been tested using different samples. Our algorithm successfully
processes the image and reads it out clearly. This is an economical as well as efficient device for visually
impaired people. We have applied our algorithm to many images and found that it performs the conversion successfully.
The device is compact and helpful to society.

VIII. REFERENCES
[6] Vikram Shirol, Abhijit M, Savitri A et al., “DRASHTI - an Android reading aid,” International Journal of
Computer Science and Information Technologies, vol. 6 (July 2015).

[7] Catherine A. Todd, Ammara Rounaq et al., “An audio haptic tool for visually impaired web users,” Journal
of Emerging Trends in Computing and Information Science, vol. 3, no. 8, Aug 2012.

[8] Hay Mar Htun, Theingi Zin, Hla Myo Tun, “Text to speech conversion using different speech synthesis,”
International Journal of Scientific & Technology Research, volume 4, issue 07, July 2015.

[9] Jaiprakash Verma, Khushali Desai, “Image to sound conversion,” International journal of
advance research