
DOI 10.4010/2016.699
ISSN 2321 3361 © 2016 IJESC

Research Article Volume 6 Issue No. 3

A Smart Reader for Visually Impaired People Using Raspberry PI


D. Velmurugan¹, M. S. Sonam², S. Umamaheswari³, S. Parthasarathy⁴, K. R. Arun⁵
Assistant Professor¹, UG Students²,³,⁴,⁵
Department of Electrical and Electronics Engineering
Info Institute of Engineering, Kovilpalayam, Coimbatore, Tamilnadu, India.

Abstract:
According to the World Health Organization (WHO), an estimated 285 million people worldwide are visually impaired,
90% of whom live in developing countries [1], and forty-five million individuals worldwide are blind [2]. Although
many solutions exist to help blind individuals read, none of them provides a reading experience that parallels that
of the sighted population. In particular, there is a need for a portable text reader that is affordable and readily
available to the blind community. Inclusion of the specially enabled in the IT revolution is both a social obligation
and a computational challenge in today's rapidly advancing digital world. This work proposes a smart reader for
visually challenged people using Raspberry Pi and addresses the integration of a complete text read-out system
designed for them. The system consists of a webcam interfaced with the Raspberry Pi, which accepts a page of printed
text. The OCR (Optical Character Recognition) package installed on the Raspberry Pi scans it into a digital document,
which is then subjected to skew correction and segmentation before feature extraction and classification. Once
classified, the text is read out by a text-to-speech conversion unit (TTS engine) installed on the Raspberry Pi. The
output is fed to an audio amplifier before it is read out. The proposed project can be simulated in MATLAB. The
simulation is only an initial demonstration of the image processing, i.e., the image-to-text and text-to-speech
conversion done by the OCR software installed on the Raspberry Pi. The system finds applications in libraries,
auditoriums and offices where instructions and notices are to be read, and also in assisted filling of application
forms. Results along with analysis are presented.

Keywords: Raspberry Pi, webcam, optical character recognition, text-to-speech engine, audio amplifier.

I. INTRODUCTION

Visually impaired people report numerous difficulties in accessing printed text with existing technology,
including problems with alignment, focus, accuracy, mobility and efficiency. We present a smart device that
assists the visually impaired by reading paper-printed text effectively and efficiently. The proposed project
is a camera-based assistive device that people can use to read text documents. The framework implements an
image capturing technique in an embedded system based on the Raspberry Pi board. The design is motivated by
preliminary studies with visually impaired people; it is small-scale and mobile, which enables a more
manageable operation with little setup. In this project we propose a text read-out system for the visually
challenged. The proposed fully integrated system has a camera as an input device to feed in the printed text
document for digitization, and the scanned document is processed by a software module, the OCR (optical
character recognition) engine. A methodology is implemented to recognize the sequence of characters and the
line of reading. As part of the software development [11], the OpenCV (Open Source Computer Vision) libraries
are used to capture the image of the text and to perform the character recognition.

Most of the access technology tools built for people with blindness and limited vision are built on two basic
building blocks: OCR software and Text-to-Speech (TTS) engines. Optical character recognition (OCR) is the
translation of captured images of printed text into machine-encoded text. OCR is a process that associates a
symbolic meaning (letters, symbols and numbers) with the image of a character. It is defined as the process of
converting scanned images of machine-printed text into a computer-processable format. OCR is also useful for
visually impaired people who cannot read text documents but need to access their content. OCR is used to
digitize and reproduce texts that were produced with non-computerized systems. Digitizing texts also helps
reduce storage space. Editing and reprinting text documents that were printed on paper is time consuming and
labour intensive. OCR is widely used to convert books and documents into electronic files for use in storage
and document analysis. It makes it possible to apply techniques such as machine translation, text-to-speech
and text mining to the captured or scanned page.

The final recognized text document is fed to the output device chosen by the user. The output device can be a
headset connected to the Raspberry Pi board or a speaker that reads the text document aloud.

International Journal of Engineering Science and Computing, March 2016 2997 http://ijesc.org/

Prevalence of blindness (per thousand), as per the estimate of 2015

Types of blindness     Urban    Rural    Total
Total blindness         4.43     5.99     5.40
Economic blindness     11.14    15.44    13.83
One eye blindness       8.23     7.00     7.46

Figure 1. Survey of blind people

Total blindness = Visual acuity less than 3/60 in the better eye with spectacle correction.
Economic blindness = Visual acuity less than 6/60 in the better eye with spectacle correction.
One eye blindness = Visual acuity less than 3/60 in one eye and better than 6/60 in the other eye with
spectacle correction.

II. BLOCK DIAGRAM OF PROPOSED METHOD

Figure 2 illustrates the block diagram of the proposed method. The framework of the proposed project is the
Raspberry Pi board. The Raspberry Pi B+ is a single-board computer which has 4 USB ports, an Ethernet port for
internet connection, 40 GPIO pins for input/output, a CSI camera interface, an HDMI port, a DSI display
interface, an SoC (system on a chip), a LAN controller, an SD card slot, an audio jack, an RCA video socket and
a 5V micro USB connector.

[Figure 2. Block diagram of proposed method. Blocks: power supply, SMPS, internet through Ethernet,
Raspberry Pi, USB port, web camera, speaker.]

The power supply is given to the 5V micro USB connector of the Raspberry Pi through the Switched Mode Power
Supply (SMPS). The SMPS converts the 230V AC supply to 5V DC. The web camera is connected to the USB port of
the Raspberry Pi. The Raspberry Pi runs an OS named Raspbian, which processes the conversions. The audio output
is taken from the audio jack of the Raspberry Pi. The converted speech output is amplified using an audio
amplifier. The internet is connected through the Ethernet port of the Raspberry Pi. The page to be read is
placed on a base and the camera is focused to capture the image. The captured image is processed by the OCR
software installed on the Raspberry Pi, which converts it to text. The text is converted into speech by the
TTS engine. The final output is given to the audio amplifier, which is connected to the speaker. The speaker
can also be replaced by a headphone for convenience.

III. FLOW OF PROCESS

3.1 IMAGE CAPTURING

In the first step, the device is moved over the printed page and the inbuilt camera captures images of the
text. The high-resolution camera yields a high-quality image, which allows fast and clear recognition.

3.2 PRE-PROCESSING

The pre-processing stage consists of three steps: skew correction, linearization and noise removal. The
captured image is checked for skew, since the image may be skewed to either the left or the right. The image
is first brightened and binarized. The skew detection function checks for an angle of orientation within ±15
degrees; if skew is detected, a simple image rotation is carried out until the lines match the true horizontal
axis, which produces a skew-corrected image. The noise introduced during capture or due to poor quality of the
page has to be cleared before further processing.

[Figure 3. Flow of process. Blocks: scanned image, enhanced image, character segmentation, features
extraction, character database, image to text converter, speech synthesis.]

3.3 SEGMENTATION

After pre-processing, the noise-free image is passed to the segmentation phase, an operation that seeks to
decompose an image of a sequence of characters into sub-images of individual symbols (characters). The
binarized image is checked for inter-line spaces. If inter-line spaces are detected, the image is segmented
into sets of paragraphs across the inter-line gap. The lines in the paragraphs are scanned for horizontal
space intersection with respect to the background, and the histogram of the image is used to detect the width
of the horizontal lines. The lines are then scanned vertically for vertical space intersection, where
histograms are used to detect the width of the words. Finally, the words are decomposed into characters using
character width computation.

3.4 FEATURE EXTRACTION

In feature extraction, each individual image glyph is considered and its features are extracted. A character
glyph is defined by the following attributes:
(1) Height of the character;
(2) Width of the character;
(3) Numbers of horizontal lines present, short and long;
(4) Numbers of vertical lines present, short and long;
(5) Numbers of circles present;
(6) Numbers of horizontally oriented arcs;
(7) Numbers of vertically oriented arcs;
(8) Centroid of the image;
(9) Position of the various features;
(10) Pixels in the various regions.

3.5 IMAGE TO TEXT CONVERTER

The ASCII values of the recognized characters are processed by the Raspberry Pi board. Each character is
matched with its corresponding template and saved as a normalized text transcription. This transcription is
then delivered to the audio output.

3.6 TEXT TO SPEECH

This module starts where the preceding character recognition module ends. It converts the transformed Tamil
text to audible form.

The Raspberry Pi has an on-board audio jack; the on-board audio is generated by a PWM output and is minimally
filtered. A USB audio card can greatly improve the sound quality and volume. There are two options for
attaching a microphone to the Raspberry Pi: a USB microphone or an external USB sound card.

IV. SIMULATION ENVIRONMENT

The image-to-text and text-to-speech conversion is done by the OCR software installed on the Raspberry Pi.
The conversion done in OCR can be simulated in MATLAB. The conversion process in MATLAB includes the
following steps:

1. Binary image conversion.
2. Complementation.
3. Segmentation and labelling.
4. Isolating the skeleton of the character.

4.1 SAMPLE IMAGE

The following image, captured by the webcam, contains the following word. The image is in JPEG format and has
to be converted into text.

Figure 4. Sample image

4.2 BINARY CONVERSION

In this section the sample image is converted into binary format. The image, initially a 3D image (that is, a
colour image with three channels), is converted to a 2D image. Binary 0 represents the black colour of the
characters. Binary 1 represents the white colour of the characters.

Figure 5. Binary 0 text representation

Figure 6. Binary 1 text representation

4.3 BOUNDARY MARKING

The area of the text is bordered and the boundary for each character is isolated. The boundary for each
character is programmed, and it can vary from 0 to 255 bits of characters occupying memory in the database.

4.4 SEGMENTATION AND LABELLING

The isolated blocks of characters are segmented and automatically labelled for identity. Image segmentation
is the process of partitioning a digital image into multiple segments (sets of pixels, also known as super
pixels).

Figure 7. Segmentation and labelling

The result of image segmentation is a set of segments that collectively cover the entire image, or a set of
contours extracted from the image. The pixels in a region are similar with respect to some characteristic or
computed property, such as colour, intensity or texture. Adjacent regions are significantly different with
respect to the same characteristics.

Connected-component labelling is used in computer vision to detect connected regions in binary digital images,
although colour images and data with higher dimensionality can also be processed. When integrated into an
image recognition system or human-computer interaction interface, connected-component labelling can operate on
a variety of information. Blob extraction is generally performed on the binary image that results from a
thresholding step. Blobs may be counted, filtered and tracked.

4.5 FORMING CHARACTER SKELETON

Skeletonization is a process for reducing foreground regions in a binary image to a skeletal remnant that
largely preserves the extent and connectivity of the original region while throwing away most of the original
foreground pixels. To see how this works, imagine that the foreground regions in the input binary image are
made of some uniform slow-burning material. Light fires simultaneously at all points along the boundary of
this region and watch the fire move into the interior. At points where the fire travelling from two different
boundaries meets itself, the fire will extinguish itself, and the points at which this happens form the
so-called 'quench line'.

Figure 7. Character skeleton

4.6 AUDIO OUTPUT

The program code is run in MATLAB and the corresponding output is generated in the form of audio. The audio is
heard through a headphone or speaker connected to the system. Each character of the word is spelled out first
and then the entire word is read out.

Figure 8. Audio output

V. HARDWARE IMPLEMENTATION

The hardware of the proposed work consists of a Raspberry Pi board interfaced with a USB camera. A Wi-Fi dongle
is connected to the system for the internet connection, which is taken to the Pi through a LAN cable. A 5 MP
camera is connected to one of the USB ports of the Raspberry Pi. A 5V supply is given to the Raspberry Pi from
the system through a power cable.

Figure 9. Hardware setup

VI. EXPERIMENTAL OUTPUTS

Figure 10(a). Terminal and form window

Figure 10(b). Image capturing

Figure 10(c). Text conversion

The text document to be read out has to be placed at a considerable distance from the webcam so that the image
is clear enough, with proper illumination. Figure 10(a) shows the terminal window that appears as soon as
Raspbian OS is started. The command for image-to-text conversion has to be given in the terminal window, and a
form window opens immediately. The form window contains a dialog box named 'image to read'. That option has to
be clicked to enable the webcam. The webcam auto-focuses on the page and the image is captured.

Figure 10(b) shows a sample image that has been captured using the webcam. The processed image is displayed in
the form window, as shown in Figure 10(c). The displayed text is read out by the text-to-speech engine eSpeak.

VII. CONCLUSION

We have implemented an image-to-speech conversion technique using Raspberry Pi. The simulation results have
been successfully verified and the hardware output has been tested using different samples. Our algorithm
successfully processes the image and reads it out clearly. This is an economical as well as efficient device
for visually impaired people. We have applied our algorithm to many images and found that it performs the
conversion successfully. The device is compact and helpful to society.

VIII. REFERENCES

[1] Bindu Philip and R. D. Sudhaker Samuel (2009), "Human machine interface – a smart OCR for the visually
    challenged", International Journal of Recent Trends in Engineering, vol. no. 3, November.

[2] Roy Shilkrot, Pattie Maes, Jochen Huber, Suranga C. Nanayakkara, Connie K (April-May 2014), "Finger
    reader: a wearable device to support text reading on the go", Journal of emerging trend and information

[3] V. Ajantha Devi, Dr. Santhosh Baboo, "Embedded optical character recognition on Tamil text image using
    Raspberry Pi", International Journal of Computer Science Trends and Technology (IJCST), volume 2,
    issue 4, Jul-Aug 2014.

[4] Prachi Khilari, Bhope V., "Online speech to text engine", International Journal of Innovative Research
    in Science, Engineering and Technology, vol. 4, issue 7, July 2015.

[5] Gopinath, Aravind, Pooja et al., "Text to speech conversion using MATLAB", International Journal of
    Emerging Technology and Advanced Engineering, volume 5, issue 1, January 2015.

[6] Vikram Shirol, Abhijit M, Savitri A et al., "DRASHTI-an android reading aid", International Journal of
    Computer Science and Information Technologies, vol. 6, July 2015.

[7] Catherine A. Todd, Ammara Rounaq et al., "An audio haptic tool for visually impaired web users", Journal
    of Emerging Trends in Computing and Information Science, vol. 3, no. 8, Aug 2012.

[8] Hay Mar Htun, Theingi Zin, Hla Myo Tun, "Text to speech conversion using different speech synthesis",
    International Journal of Scientific & Technology Research, volume 4, issue 07, July 2015.

[9] Jaiprakash Verma, Khushali Desai, "Image to sound conversion", International journal of advance research

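
Illustrative sketch for Sections 3.3 and 4.2 (binarization and histogram-based segmentation). It assumes a grayscale image stored as rows of 0-255 integers and a fixed threshold of 128; the paper gives neither, and all function names here are hypothetical. Ink is mapped to binary 0 and paper to binary 1, following Section 4.2. Unlike the paper's description, the sketch splits at every blank column and does not distinguish word gaps from character gaps.

```python
def binarize(gray, threshold=128):
    """Map grayscale rows (0-255) to 0 for ink and 1 for paper."""
    return [[0 if px < threshold else 1 for px in row] for row in gray]


def split_runs(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero entries."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(binary):
    """Text lines are the row ranges whose ink histogram is non-zero."""
    profile = [sum(1 for px in row if px == 0) for row in binary]
    return split_runs(profile)


def segment_columns(binary, top, bottom):
    """Within rows [top, bottom), return the column ranges that contain ink."""
    width = len(binary[0])
    profile = [
        sum(1 for r in range(top, bottom) if binary[r][c] == 0)
        for c in range(width)
    ]
    return split_runs(profile)
```

A caller would first run `segment_lines` on the whole page and then `segment_columns` on each returned row range; merging column runs separated by narrow gaps would be one way to group characters into words.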

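
Illustrative sketch for Section 4.4 (connected-component labelling). The paper runs this step in MATLAB and does not give the algorithm or the connectivity used; the version below assumes 4-connectivity, a breadth-first flood fill, and ink pixels stored as binary 0. The function name `label_components` is hypothetical.

```python
from collections import deque


def label_components(binary):
    """Label 4-connected regions of ink (0) pixels.

    Returns (labels, count), where `labels` has the same shape as `binary`,
    holds 0 for background and 1..count for each region in scan order.
    """
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if binary[r][c] != 0 or labels[r][c]:
                continue  # background pixel, or ink already labelled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 0 and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

Each label then identifies one blob, which can be counted, filtered by size, or cropped to its bounding box before skeletonization or feature extraction.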