Optical Character Recognition Project Report

The document discusses optical character recognition (OCR) technology. OCR involves scanning documents and using image processing and classification algorithms to convert images of text into machine-readable text. It has many applications, including making documents searchable, processing bank checks, and creating digital libraries. OCR has improved accessibility for the blind by allowing scanned documents to be read aloud or in braille. Key aspects of OCR systems include segmentation, feature extraction, and classification of character images. MATLAB is well-suited for developing OCR applications due to its image processing and machine learning tools.

Optical Character Recognition

1. INTRODUCTION
1.1 Introduction to Project:

OCR technology gives scanning and imaging systems the ability to convert images of
machine-printed characters into characters that can be understood or recognized by a
computer. The character images are extracted from the bitmap of the image produced by
the scanner. This project is based entirely on the concept of IMAGE PROCESSING (IP).

The OCR process involves several aspects such as segmentation, feature extraction and
classification. The Image Processing Toolbox for MATLAB provides a feature set that extends
the product's capabilities for developing new algorithms and applications in the field of
image processing and analysis. The mathematical environment of MATLAB is ideal for image
processing, because images are, after all, matrices. This toolbox includes functions for:

* Filter design.

* Image enhancement and retouching.

* Image analysis and statistics.

* Morphological, geometric and color operations.

* 2D transformations.

Parshvanath College of Engineering(Computer Dept) Page 1



Image processing is an absolutely crucial area of work for those groups and industries that
are working in areas such as medical diagnostics, astronomy, geophysics, environmental
science, data analysis in laboratories, industrial inspection, etc. Although not originally
developed for users who are visually impaired, Optical Character Recognition (OCR)
technology has become an aid for inputting documents quickly by and for users with vision
impairments. A complete OCR system consists of a scanner, the recognition component, and
OCR software that interacts with the other components to store the computerized document
in the computer. The process of inputting the material into the computer begins with the
scanner taking a picture of the printed material. Then, during the recognition process, the
picture is analyzed for layout, fonts, text and graphics. Finally, the picture of the document is
converted into an electronic format that can be edited with application software. OCR
systems designed specifically for users with visual impairments have modified interfaces that
can be used with minimal assistance. In addition, technological developments will see an
increase in accuracy, including the ability of these products to decipher even handwritten
materials.

In general, OCR systems work as an external device with the user's existing assistive
technology. Once the picture is in electronic format, it is accessed for reading and/or editing
through the user's braille, speech or magnification technology. Since some products work
better with certain speech or braille systems, it is important to check a product's
compatibility with the other products in the user's computer system. Some products,
however, have an adaptive device built in. These are referred to as "stand-alone reading
machines." Of these, some products have the added flexibility of working either as a
stand-alone device or with a computer.

In addition to whether the system is stand-alone or works in conjunction with the user's
computer, there are many other features to consider depending on the user's needs, including:
accuracy of scanning, and whether the product can handle poor-quality print or image print,
such as faxes; level of flexibility in terms of the material's size and format, such as large
paper documents and books, and whether the system offers automatic adjustment of
brightness, contrast and orientation of print (vertical or horizontal); whether the user controls
are accessible, such as braille keypads or speech feedback; whether manuals are supplied in
braille and/or on cassette; and the level of online technical support.

1.2 Why this Project:


Machine replication of human functions, like reading, is an ancient dream. However, over the
last five decades, machine reading has grown from a dream to reality. Optical character
recognition has become one of the most successful applications of technology in the field of
pattern recognition and artificial intelligence. Many commercial systems for performing
OCR exist for a variety of applications, although the machines are still not able to compete
with human reading capabilities.

1.3 Advantages:
* Many documents in the world are not available in a computer-readable
format (especially those in regional languages).

* Computer vision and robotics are hard to imagine without OCR.

* It benefits blind persons.

* OCR has enabled scanned documents to become more than just image files, turning them
into fully searchable documents with text content that is recognized by computers.

* With the help of OCR, people no longer need to manually retype important
documents when entering them into electronic databases. Instead, OCR extracts
relevant information and enters it automatically. The result is accurate, efficient
information processing in less time.


1.4 Applications:
Basic Application:
OCR can be used to enter data automatically into a computer for dissemination and
processing.

Banking:
The uses of OCR vary across different fields. One widely known application is in banking,
where OCR is used to process checks without human involvement. A check can be inserted
into a machine, the writing on it is scanned instantly, and the correct amount of money is
transferred. This technology has nearly been perfected for printed checks, and is fairly
accurate for handwritten checks as well, though it occasionally requires manual confirmation.
Overall, this reduces wait times in many banks.

Legal:
In the legal industry, there has also been a significant movement to digitize paper documents.
In order to save space and eliminate the need to sift through boxes of paper files, documents
are being scanned and entered into computer databases. OCR further simplifies the process
by making documents text-searchable, so that they are easier to locate and work with once in
the database. Legal professionals now have fast, easy access to a huge library of documents
in electronic format, which they can find simply by typing in a few keywords.

Digital Library:
A digital library can be built with the help of OCR by converting large book collections into
computer-readable text format for on-line viewing of content.

Invoice and Shipping Receipt Processing:
Customers can use OCR to extract the invoice or bill of lading number from the document
and rename the scanned document to match the invoice number, or convert the image to a
different format for viewing on the web. With the invoice or bill of lading number they are
able to quickly perform an electronic search and retrieval of the scanned document,
improving customer service.


Robotics and Computer vision:
Robotics and computer vision are hard to imagine without OCR.

Document identification & Automatic number-plate readers:
OCR also allows software to accurately identify different types of documents and
number plates.


2. Review of Literature
2.1 Goal:
* Performing the mechanical or electronic translation of images of handwritten,
typewritten or printed text (usually captured by a scanner) into machine-editable text.
* Accuracy in character recognition.
* Fast performance, so that more documents can be processed in a short time.

2.2 Other Existing Systems:

Object Recognition:
Object recognition in computer vision is the task of finding a given object in an image or
video sequence.

Face Recognition:
A facial recognition system is a computer application for automatically identifying or
verifying a person from a digital image or a video frame from a video source. One of the
ways to do this is by comparing selected facial features from the image with a facial database.

Speech Recognition:
While OCR converts visible words or characters to text, a speech recognition system converts
audible (spoken) words to text. It is also known as automatic speech recognition or computer
speech recognition.


3. System Analysis:
3.1 Requirement Analysis
Hardware Requirements:
* Windows OS, 1 GB RAM, 80-120 GB HDD
* 2.0 GHz processor
* Scanner, webcam

Software Requirements:
* MATLAB 7.0 or above
* Text editor

WHY MATLAB?

* MATLAB integrates mathematical computing, visualization, and a powerful language
to provide a flexible environment for technical computing.
* MATLAB includes tools for:
  data acquisition,
  data analysis and exploration,
  visualization and image processing,
  algorithm prototyping and simulation,
  programming and application development.
* MATLAB allows matrix manipulations and plotting of functions and data.
* It supports image processing.
* It supports complicated mathematical operations.
* Algorithms can be designed and developed easily.
* Images can be conveniently represented as matrices.
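As a small illustration of the last point, the sketch below stores a tiny grayscale image as an ordinary matrix and manipulates it with plain matrix operations. The pixel values are invented for this example and are not taken from the project.

```matlab
% A tiny 5-by-5 grayscale "image" stored as an ordinary uint8 matrix.
% The pixel values are made up purely for illustration.
I = uint8([255 255 255 255 255
           255   0   0   0 255
           255   0 255   0 255
           255   0   0   0 255
           255 255 255 255 255]);

[rows, cols] = size(I);        % image dimensions are simply matrix dimensions
darkPixels   = nnz(I < 128);   % count the pixels darker than mid-gray
inverted     = 255 - I;        % invert the whole image with one matrix operation

fprintf('%d-by-%d image, %d dark pixels\n', rows, cols, darkPixels);
```

Because the image is just a matrix, indexing, comparison and arithmetic apply to it directly, without any image-specific data structure.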


3.2 Software Requirement Specification:

No. | Requirement | Essential or Desirable | Description of the Requirement | Remarks
RS1 | The system should recognize characters. | Very Essential | All the characters in the image should be recognized. | Primary function of the system.
RS2 | The system should filter the image before processing it. | Essential | Noise in the image should be removed by filtering so that processing does not have any glitch. | Since real-life images have noise, the system should have a noise removal facility.
RS3 | The system should work on backgrounds of various colours. | Essential | The system should work irrespective of background colour (yellow, pink etc.). | This will make the system work for various documents.


3.3 Feasibility:
The feasibility study of "Optical Character Recognition" is based on the following
aspects:

* Technical
The implementation of our project is quite simple. Very few components are
required: a scanner, MATLAB software, and a working Pentium 4 PC. Because so few
components are required, "Optical Character Recognition" is technically feasible.

* Financial
Besides the PC, the system requires only two major components, i.e. a scanner and MATLAB.
Of these, MATLAB is somewhat expensive, while the scanner and the PC are easily affordable.
Also, no servicing of the product is required since it is software based.

* Resources
All the resources are easily available and the time available is sufficient to complete
the project. However, the testing phase may suffer due to the short span of time.

* Operational
The system is operationally feasible since it can satisfy all requirements identified in
the requirement analysis phase.


3.4 Risk Analysis:

Risk Identification:
Risk identification is a systematic process to specify threats to the project plan, including
estimates, schedule, and resource loading. The following risk item checklist analyzes
each risk and its related counter effects and policies.

* Product Size
No product size risk is involved.

* Business Impact
There are no specific business risks associated with the project: it is being built as a
mini project and is not associated with any budget.

* Development Environment
The development environment is not too complicated, as the project is being built in
MATLAB.

* Technology Newness
The software used for this project is not an old technology, and it can be understood by
people who have little knowledge of this language.

* Staff Size and Experience
There is no risk involved with the staff, as it is possible for four people to develop this
software.


Risk Projection:

Risks | Category | Probability | Impact
Faulty scans | TE | 50% | 2
Multicoloured background | TE | 40% | 3
Deadline tightened | BU | 60% | 2

Category codes:
PS - Project Size
BU - Business Impact
SSE - Staff Size and Experience
TE - Technology to be built

Impact values:
1 - Catastrophic
2 - Critical
3 - Marginal
4 - Negligible


Sort the risk table with respect to the impact values

Risk Mitigation, Monitoring & Management

3.4.1.1 Risk Information Sheet 1:

Risk information sheet
Risk Id: 001 Date: 15/08/09 Prob: 60% Impact: Critical

Description: Deadline being tightened

The time required to build the project was calculated. As the size of the project
increased, less time was left to complete it.

Refinement/context:
1. Misinterpretation of some aspect of the system.
2. Technology to be used changes.

Mitigation/monitoring: Repeated assessment after completion of each function.
Management/contingency plan/trigger: Work overtime.
Current Status: Mitigation steps initiated.


3.4.1.2 Risk Information Sheet 2:

Risk information sheet
Risk Id: 002 Date: 02/03/10 Prob: 40% Impact: Marginal

Description: Faulty scans

Refinement/context: Due to faulty scans, characters in the image cannot be recognized
properly.

Mitigation/monitoring: Perform filtering and other image enhancement operations on images
that are to be processed.

Management/contingency plan/trigger: Use various image enhancement techniques, especially
image filtering, to remove high-frequency noise.

Current Status: Risk mitigation is in progress using median filtering.


3.4.1.3 Risk Information Sheet 3:

Risk information sheet
Risk Id: 003 Date: 21/09/09 Prob: 40% Impact: Marginal

Description: Multicoloured background

Refinement/context:
1. Real-life images have multicoloured backgrounds.
2. Characters (if present) must still be recognized by the system.

Mitigation/monitoring: Work on white background images.

Management/contingency plan/trigger: Work on the gray scale image instead of the RGB image.

Current Status: Mitigation steps initiated.


4. Project Planning:
4.1 Timeline Chart

Phases 1 to 3 are scheduled over July to October and phases 4 to 9 over January to April,
week by week (W1 to W4 of each month).

1) Gathering Initial Requirements
* Meet internal guide
* Identify needs & constraints
* Determine goals & scope
* Establish specification
Milestone: Product specification complete.

2) Feasibility Phase
* Technical (S/W & H/W) feasibility
* Economic feasibility
* Application (code) feasibility
* Operational feasibility
Milestone: Feasibility study complete.

3) Requirement Determination Phase
* Determine inputs
* Determine outputs
* Process control
* Synopsis
Milestone: Synopsis is ready.

4) Design
* Designing different functions to perform
* Construction of template
* Gathering standard characters and interrelating various functions
Milestone: Design specification complete.

5) Development
* Developing code for various supportive functions
* Developing code for main OCR function
Milestone: Development phase complete.

7) Testing Phase
* Testing characters
* Valid input testing
* System testing & load testing
Milestone: Testing phase complete.

8) Implementation Phase
* Implementation of OCR system
Milestone: Implementation phase complete.

9) Evaluation Phase
* Documentation and recommendation
Milestone: Evaluation phase complete.


4.2 Gantt Chart:

[Gantt chart: proposed time versus actual time for System Analysis, System Design, Coding
and Testing; time axis: July, October, January, February, April]


5. System Design:
5.1 Block Diagram


* Image Acquisition:
In this step we acquire the image with the help of a scanner.

* RGB to Gray conversion:
In this step we convert the RGB image to a gray scale image.

* Filtering Process:
In this step, noise in the image is removed.

* Character recognition:
In this step the various characters in the image are recognized. All processing is done in
MATLAB software.

* Save to Text File:
As the characters are recognized, we simultaneously save them into a text file.

* Display Output:
The optically recognized characters, i.e. the output of the system, are displayed on screen.
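The stages above can be sketched end to end on a synthetic image. This is only an illustrative outline and not the project's ocr.m: the image is generated in code instead of being scanned, the recognition stage is reduced to counting connected blobs, and the output file name ocr_output.txt is hypothetical. It assumes the Image Processing Toolbox is available for rgb2gray, medfilt2 and bwlabel.

```matlab
% Sketch of the block-diagram stages on a synthetic image (assumptions as stated above).

% 1. Image acquisition: stand-in for a scanned page (white page, two dark blobs).
rgb = 255 * ones(20, 40, 3, 'uint8');
rgb(5:15, 5:12, :)  = 0;          % first "character"
rgb(5:15, 25:32, :) = 0;          % second "character"
rgb(2, 20, :)       = 0;          % one isolated noise pixel

% 2. RGB to gray conversion.
gray = rgb2gray(rgb);

% 3. Filtering: invert so the background is 0 (medfilt2 pads the borders with zeros),
%    then a 3-by-3 median filter removes the isolated pixel.
inverted = 255 - gray;
clean    = medfilt2(inverted, [3 3]);

% 4. Character recognition (placeholder): binarize and count the connected blobs.
bw = clean > 128;
[labels, numChars] = bwlabel(bw, 8);

% 5. Save to text file and 6. display output.
outFile = fullfile(tempdir, 'ocr_output.txt');   % hypothetical output file name
fid = fopen(outFile, 'w');
fprintf(fid, 'Characters found: %d\n', numChars);
fclose(fid);
disp(fileread(outFile));
```

Inverting before filtering is a design choice of this sketch: with a white background, the zero padding used by medfilt2 would otherwise darken the image corners and create spurious blobs.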


5.2 Structural Design:

DATA FLOW DIAGRAM FOR OCR SYSTEM

5.3 Flowchart of OCR system:

Flowchart for ocr.m

Flowchart for lines.m

Flowchart for columns.m

Flowchart for clip.m

Flowchart for samedim.m

6. IMPLEMENTATION
6.1 Functions used
The following functions have been used while implementing the OCR system:

BWLABEL

Label connected components in a binary image.

Syntax:
L = bwlabel(BW,n)

[L,num] = bwlabel(BW,n)

Description:
L = bwlabel(BW,n) returns a matrix L, of the same size as BW, containing labels for the
connected objects in BW. n can have a value of either 4 or 8, where 4 specifies 4-connected
objects and 8 specifies 8-connected objects; if the argument is omitted, it defaults to 8.

The elements of L are integer values greater than or equal to 0. The pixels labeled 0 are the
background. The pixels labeled 1 make up one object, the pixels labeled 2 make up a second
object, and so on.

[L,num] = bwlabel(BW,n) returns in num the number of connected objects found in BW.
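A short sketch of the difference between 4- and 8-connectivity follows. The test matrix is invented for illustration, and bwlabel requires the Image Processing Toolbox.

```matlab
% Small binary image: a 2-by-2 block, two diagonally touching pixels, one lone pixel.
BW = logical([1 1 0 0 0
              1 1 0 0 1
              0 0 0 1 0
              0 0 0 0 0
              1 0 0 0 0]);

[L4, num4] = bwlabel(BW, 4);   % diagonal neighbours are NOT treated as connected
[L8, num8] = bwlabel(BW, 8);   % diagonal neighbours ARE treated as connected

fprintf('4-connected objects: %d\n', num4);
fprintf('8-connected objects: %d\n', num8);
```

The pixels at (2,5) and (3,4) touch only diagonally, so they form two objects under 4-connectivity and one object under 8-connectivity.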


FIND:
Find indices and values of nonzero elements

Syntax:
indices = find(X)

[i,j] = find(X)

[i,j,v] = find(X)

[...] = find(X, k)

[...] = find(X, k, 'first')

[...] = find(X, k, 'last')

Description:
indices = find(X) returns the linear indices corresponding to the nonzero entries of the array
X. If none are found, find returns an empty matrix. In general, find(X) regards X as X(:),
which is the long column vector formed by concatenating the columns of X.

[i,j] = find(X) returns the row and column indices of the nonzero entries in the matrix X. This
syntax is especially useful when working with sparse matrices. If X is an N-dimensional
array with N > 2, j contains linear indices for the dimensions of X other than the first.

[i,j,v] = find(X) returns a column vector v of the nonzero entries in X, as well as row and
column indices.

[...] = find(X, k) or [...] = find(X, k, 'first') returns at most the first k indices corresponding to
the nonzero entries of X. k must be a positive integer, but it can be of any numeric data type.

[...] = find(X, k, 'last') returns at most the last k indices corresponding to the nonzero entries
of X.
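The call forms described above can be tried on a small matrix. The values are arbitrary and chosen only to make the column-major ordering visible.

```matlab
X = [0 2 0
     0 0 5
     7 0 0];

idx       = find(X);              % linear indices, in column-major order
[r, c, v] = find(X);              % row, column and value of each nonzero entry
firstTwo  = find(X, 2);           % at most the first two nonzero entries
lastOne   = find(X, 1, 'last');   % the last nonzero entry

disp([r c v]);                    % one row per nonzero entry
```

Since X(:) stacks the columns, the entry 7 at (3,1) comes first, followed by 2 at (1,2) and 5 at (2,3).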


IMREAD
Read image from graphics file

Syntax
A = imread(filename,fmt)

[X,map] = imread(filename,fmt)

[...] = imread(filename)

[...] = imread(URL,...)

[...] = imread(...,idx) (CUR, GIF, ICO, and TIFF only)

[...] = imread(...,'PixelRegion',{ROWS, COLS}) ( TIFF only)

[...] = imread(...,'frames',idx) (GIF only)

[...] = imread(...,ref) (HDF only)

[...] = imread(...,'BackgroundColor',BG) (PNG only)

[A,map,alpha] = imread(...) (ICO, CUR, and PNG only)

Description
The imread function supports four general syntaxes, described below. The imread function
also supports several other format-specific syntaxes. See Special Case Syntax for information
about these syntaxes.
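A minimal round trip with imread is sketched below. To stay self-contained it first writes a small test pattern to a temporary PNG file (the file name comes from tempname, so it does not refer to any real scan) and then reads it back using two of the general syntaxes.

```matlab
% Write a small grayscale test pattern to a temporary file, then read it back.
original = uint8(magic(4) * 15);     % 4-by-4 pattern with values from 15 to 240
fileName = [tempname '.png'];        % temporary file name, illustration only
imwrite(original, fileName);         % PNG is lossless, so the pixels survive unchanged

A = imread(fileName);                % format inferred from the file
B = imread(fileName, 'png');         % format stated explicitly

delete(fileName);                    % clean up the temporary file
disp(isequal(A, original) && isequal(B, original));
```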


MAX
Maximum elements of an array

Syntax
C = max(A)

C = max(A,B)

C = max(A,[],dim)

[C,I] = max(...)

Description
C = max(A) returns the largest elements along different dimensions of an array. If A is a
vector, max(A) returns the largest element in A. If A is a matrix, max(A) treats the columns
of A as vectors, returning a row vector containing the maximum element from each column.
If A is a multidimensional array, max(A) treats the values along the first non-singleton
dimension as vectors, returning the maximum value of each vector.

C = max(A,B) returns an array the same size as A and B with the largest elements taken from
A or B.

C = max(A,[],dim) returns the largest elements along the dimension of A specified by scalar
dim. For example, max(A,[],1) produces the maximum values along the first dimension (the
rows) of A.

[C,I] = max(...) finds the indices of the maximum values of A, and returns them in output
vector I. If there are several identical maximum values, the index of the first one found is
returned.
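The following sketch exercises each max syntax on a small matrix whose values are arbitrary.

```matlab
A = [3 9 2
     8 1 7];

colMax      = max(A);          % one maximum per column
rowMax      = max(A, [], 2);   % one maximum per row (along dimension 2)
[m, i]      = max(A(:));       % largest element and its linear index
elementwise = max(A, 5);       % compare every element of A with the scalar 5

disp(colMax);
disp(rowMax);
```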


MIN
Minimum elements of an array

Syntax
C = min(A)

C = min(A,B)

C = min(A,[],dim)

[C,I] = min(...)

Description
C = min(A) returns the smallest elements along different dimensions of an array. If A is a
vector, min(A) returns the smallest element in A.

If A is a matrix, min(A) treats the columns of A as vectors, returning a row vector containing
the minimum element from each column. If A is a multidimensional array, min operates along
the first nonsingleton dimension.

C = min(A,B) returns an array the same size as A and B with the smallest elements taken
from A or B.

C = min(A,[],dim) returns the smallest elements along the dimension of A specified by scalar
dim. For example, min(A,[],1) produces the minimum values along the first dimension (the
rows) of A.

[C,I] = min(...) finds the indices of the minimum values of A, and returns them in output
vector I. If there are several identical minimum values, the index of the first one found is
returned.
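In an OCR setting, min and max are often combined with find to crop a character to its bounding box. The sketch below shows that idea on an invented binary image; it is an assumption about how such clipping can be done and is not necessarily how the project's clip.m works.

```matlab
% Binary image with a small glyph surrounded by empty margins.
bw = false(6, 7);
bw(3:5, 2:4) = true;

[r, c]  = find(bw);                              % coordinates of all foreground pixels
clipped = bw(min(r):max(r), min(c):max(c));      % crop to the tight bounding box

% With repeated minimum values, the index of the first occurrence is returned.
[smallest, where] = min([4 1 6 1]);

disp(size(clipped));
```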


IMVIEW
Display image in the Image Viewer

Syntax
imview(I)

imview(RGB)

imview(X,map)

imview(I,range)

imview(filename)

imview(...,'InitialMagnification',initial_mag)

h = imview(...)

imview close all

Description:
imview(I) displays the intensity image I. imview(RGB) displays the true-color image RGB.

imview(X,map) displays the indexed image X with colormap map.

imview(I,range) displays the intensity image I, where range is a two-element vector [LOW
HIGH] that controls the black-to-white range in the displayed image. imview displays the
value LOW (and any value less than LOW) as black, and the value HIGH (and any value
greater than HIGH) as white. Values in between are displayed as intermediate shades of gray.
range can also be empty ([]), in which case imview displays the minimum value of I as black
and the maximum value of I as white. In other words, imview(I,[]) is equivalent to
imview(I,[min(I(:)) max(I(:))]).

imview(filename) displays the image contained in the file specified by filename. The file
must contain an image that can be read by imread. If the file contains multiple images, the
first one is displayed.

With no input arguments, imview displays a file chooser dialog box so you can select an
image file interactively. h = imview(...) returns a handle h to the tool. close(h) closes the
image viewer. imview close all closes all image viewers.


FUNCTION
Function M-files

Description
You add new functions to the MATLAB vocabulary by expressing them in terms of existing
functions. The existing commands and functions that compose the new function reside in a
text file called an M-file.

M-files can be either scripts or functions. Scripts are simply files containing a sequence of
MATLAB statements. Functions make use of their own local variables and accept input
arguments.

The name of an M-file begins with an alphabetic character and has a filename extension of
.m. The M-file name, less its extension, is what MATLAB searches for when you try to use
the script or function.

A line at the top of a function M-file contains the syntax definition. The name of a function,
as defined in the first line of the M-file, should be the same as the name of the file without
the .m extension. For example, the existence of a file on disk called stat.m with

function [mean,stdev] = stat(x)

n = length(x);

mean = sum(x)/n;

stdev = sqrt(sum((x-mean).^2/n));

defines a new function called stat that calculates the mean and standard deviation of a
vector. The variables within the body of the function are all local variables. A subfunction,
visible only to the other functions in the same file, is created by defining a new function with
the function keyword after the body of the preceding function or subfunction.

For example, avg is a subfunction within the file stat.m:


function [mean,stdev] = stat(x)

n = length(x);

mean = avg(x,n);

stdev = sqrt(sum((x-avg(x,n)).^2)/n);

function mean = avg(x,n)

mean = sum(x)/n;

Subfunctions are not visible outside the file where they are defined. Functions normally
return when the end of the function is reached. Use a return statement to force an early
return.


ISEMPTY
Test if array is empty

Syntax:
tf = isempty(A)

Description
tf = isempty(A) returns logical true (1) if A is an empty array and logical false (0) otherwise.
An empty array has at least one dimension of size zero, for example, 0-by-0 or 0-by-5.

Examples

B = rand(2,2,2);

B(:,:,:) = [];

isempty(B)

ans = 1


MEDFILT2
Perform two-dimensional median filtering

Syntax
B = medfilt2(A,[m n])

B = medfilt2(A)

B = medfilt2(A,'indexed',...)

Description
Median filtering is a nonlinear operation often used in image processing to reduce "salt and
pepper" noise. Median filtering is more effective than convolution when the goal is to
simultaneously reduce noise and preserve edges.

B = medfilt2(A,[m n]) performs median filtering of the matrix A in two dimensions. Each
output pixel contains the median value in the m-by-n neighborhood around the corresponding
pixel in the input image. medfilt2 pads the image with 0's on the edges, so the median values
for the points within [m n]/2 of the edges might appear distorted.

B = medfilt2(A) performs median filtering of the matrix A using the default 3-by-3
neighborhood.

B = medfilt2(A,'indexed',...) processes A as an indexed image, padding with 0's if the class of
A is uint8, or 1's if the class of A is double.
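The effect on "salt and pepper" noise, and the edge distortion caused by zero padding, can both be seen on a small synthetic matrix. The values below are invented for illustration, and medfilt2 requires the Image Processing Toolbox.

```matlab
A = 100 * ones(7, 7);        % flat gray region
A(4, 4) = 255;               % one "salt" pixel
A(2, 6) = 0;                 % one "pepper" pixel

B = medfilt2(A, [3 3]);      % 3-by-3 neighborhood (also the default)

interior = B(2:6, 2:6);      % pixels whose neighborhood lies fully inside the image
cornerValue = B(1, 1);       % a corner pixel, affected by the zero padding

disp(all(interior(:) == 100));
disp(cornerValue);
```

Both outliers disappear from the interior, while the corner value is pulled to 0 because five of the nine values in its neighborhood come from the zero padding.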


WHILE
Repeat statements an indefinite number of times

Syntax
while expression

statements

end

Description
while repeats statements an indefinite number of times. The statements are executed while
the real part of expression has all nonzero elements.

expression is usually of the form

expression rel_op expression

where rel_op is ==, <, >, <=, >=, or ~=.

The scope of a while statement is always terminated with a matching end.

Arguments
Expression:
expression is a MATLAB expression, usually consisting of variables or smaller expressions
joined by relational operators (e.g., count < limit) or logical functions (e.g., isreal(A)).

Simple expressions can be combined by logical operators (&, |, ~) into compound
expressions such as the following. MATLAB evaluates compound expressions from left to
right, adhering to operator precedence rules.

(count < limit) & ((height - offset) >= 0)

Statements
statements is one or more MATLAB statements to be executed only while the expression is
true or nonzero.
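A typical use of while in text-line segmentation is to skip blank rows until the first row containing text is reached. The sketch below is an illustrative assumption on an invented binary image and is not taken from the project's lines.m; it uses the short-circuit operator && so that the row index is checked before the image is indexed.

```matlab
% Binary image whose first three rows are blank.
bw = false(6, 5);
bw(4:5, 2:3) = true;

row = 1;
while (row <= size(bw, 1)) && ~any(bw(row, :))
    row = row + 1;               % skip rows that contain no foreground pixel
end

fprintf('First text row: %d\n', row);
```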


SIZE
Array dimensions

Syntax
d = size(X)

[m,n] = size(X)

m = size(X,dim)

[d1,d2,d3,...,dn] = size(X)

Description
d = size(X) returns the sizes of each dimension of array X in a vector d with ndims(X)
elements.

[m,n] = size(X) returns the size of matrix X in separate variables m and n.

m = size(X,dim) returns the size of the dimension of X specified by scalar dim.

[d1,d2,d3,...,dn] = size(X) returns the sizes of the first n dimensions of array X in separate
variables. If the number of output arguments n does not equal ndims(X), then:

n > ndims(X): size returns ones in the "extra" variables, i.e., outputs ndims(X)+1 through n.

n < ndims(X): dn contains the product of the sizes of the remaining dimensions of X, i.e.,
dimensions n through ndims(X).
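
The different call forms can be compared on a small three-dimensional array. This is an illustrative sketch; the 2-by-3-by-4 array is an arbitrary example:

```matlab
X = zeros(2, 3, 4);            % arbitrary 3-D array for illustration

d = size(X);                   % vector of all dimension sizes: [2 3 4]
[m, n] = size(X);              % fewer outputs than dimensions: m = 2, n = 3*4 = 12
r = size(X, 2);                % size of the second dimension only: 3
[d1, d2, d3, d4] = size(X);    % more outputs than dimensions: d4 = 1
```

For a grayscale image stored as a matrix, [m,n] = size(X) gives the number of rows and columns directly.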

FOR
Repeat statements a specific number of times

Syntax
for variable = expression
    statements
end

Description
The general format is

for variable = expression
    statement
    ...
    statement
end

The columns of the expression are stored one at a time in the variable while the following
statements, up to the end, are executed.

In practice, the expression is almost always of the form scalar:scalar, in which case its
columns are simply scalars.

The scope of the for statement is always terminated with a matching end.
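
Both behaviours described above, the common scalar:scalar form and iteration over the columns of a matrix, can be seen in a short sketch (the values are arbitrary and chosen only for illustration):

```matlab
% Common case: the loop variable takes the scalar values 1, 2, ..., 5.
total = 0;
for k = 1:5
    total = total + k;
end

% General case: the loop variable takes each column of M in turn.
M = [1 2 3; 4 5 6];
colSums = zeros(1, 0);
for col = M
    colSums(end + 1) = sum(col);   % col is a 2-by-1 column vector
end
```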

IF
Conditionally execute statements

Syntax
if expression
    statements
end

Description

MATLAB evaluates the expression and, if the evaluation yields a logical true or nonzero
result, executes one or more MATLAB commands denoted here as statements.

When you are nesting ifs, each if must be paired with a matching end.

When using elseif and/or else within an if statement, the general form of the statement is

if expression1
    statements1
elseif expression2
    statements2
else
    statements3
end

Arguments
Expression
expression is a MATLAB expression, usually consisting of variables or smaller expressions
joined by relational operators (e.g., count < limit), or logical functions (e.g., isreal(A)).

Simple expressions can be combined by logical operators (&, |, ~) into compound
expressions such as the following. MATLAB evaluates compound expressions from left to
right, adhering to operator precedence rules.

(count < limit) & ((height - offset) >= 0)

Statements
statements is one or more MATLAB statements to be executed only if the expression is
true or nonzero.
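
A short sketch of the if / elseif / else form, reusing the compound expression from above. The variable values and messages are made up for illustration:

```matlab
% Illustrative values only.
count  = 3;
limit  = 5;
height = 10;
offset = 12;

if (count < limit) & ((height - offset) >= 0)
    msg = 'both conditions hold';
elseif count < limit
    msg = 'only the count condition holds';
else
    msg = 'neither condition holds';
end

disp(msg);
```

With these values height - offset is negative, so the first branch is skipped and the elseif branch runs.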

ZERO
Transmission zeros of LTI models

Syntax
z = zero(sys)

[z,gain] = zero(sys)

Description
zero computes the zeros of SISO systems and the transmission zeros of MIMO systems. For
a MIMO system with matrices (A,B,C,D), the transmission zeros are the complex values
lambda for which the normal rank of the matrix [A - lambda*I, B; C, D] drops.

z = zero(sys) returns the (transmission) zeros of the LTI model sys as a column vector.

[z,gain] = zero(sys) also returns the gain (in the zero-pole-gain sense) if sys is a SISO
system.
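
A minimal sketch of the SISO case, assuming the Control System Toolbox is available. The transfer function G(s) = (s + 3) / (s^2 + 3s + 2) is an arbitrary example and is not part of the OCR project:

```matlab
% Requires the Control System Toolbox.
sys = tf([1 3], [1 3 2]);   % G(s) = (s + 3) / (s^2 + 3s + 2)
[z, gain] = zero(sys);      % z is the single zero at s = -3; gain is 1
```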

FCLOSE
Close one or more open files.

Syntax
status = fclose(fid)

status = fclose('all')

Description
status = fclose(fid) closes the specified file if it is open, returning 0 if successful and -1 if
unsuccessful. Argument fid is a file identifier associated with an open file.

status = fclose('all') closes all open files (except standard input, output, and error), returning 0
if successful and -1 if unsuccessful.
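
A small sketch of checking the status returned by fclose. The temporary file name comes from tempname and is used only for illustration:

```matlab
fname = [tempname '.txt'];     % temporary file name (illustrative)
fid = fopen(fname, 'w');       % open for writing
fprintf(fid, 'A');             % e.g. one recognized character
status = fclose(fid);          % 0 if the file was closed successfully
if status ~= 0
    warning('Could not close %s', fname);
end
delete(fname);                 % clean up the temporary file
```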

EDGE AND SOBEL


Find edges in an intensity image.

Syntax
BW = edge(I,'sobel')

BW = edge(I,'sobel',thresh)

BW = edge(I,'sobel',thresh,direction)

[BW,thresh] = edge(I,'sobel',...)

----------------

BW = edge(I,'prewitt')

BW = edge(I,'prewitt',thresh)

BW = edge(I,'prewitt',thresh,direction)

[BW,thresh] = edge(I,'prewitt',...)

------

BW = edge(I,'roberts')

BW = edge(I,'roberts',thresh)

[BW,thresh] = edge(I,'roberts',...)

---------------

BW = edge(I,'log')

BW = edge(I,'log',thresh)

BW = edge(I,'log',thresh,sigma)

[BW,threshold] = edge(I,'log',...)

-------

BW = edge(I,'zerocross',thresh,h)

[BW,thresh] = edge(I,'zerocross',...)

-----------

BW = edge(I,'canny')

BW = edge(I,'canny',thresh)

BW = edge(I,'canny',thresh,sigma)

[BW,threshold] = edge(I,'canny',...)

Description
edge takes an intensity image I as its input, and returns a binary image BW of the same size
as I, with 1's where the function finds edges in I and 0's elsewhere. edge supports six
different edge-finding methods:

The Sobel method finds edges using the Sobel approximation to the derivative. It returns
edges at those points where the gradient of I is maximum.

The Prewitt method finds edges using the Prewitt approximation to the derivative. It returns
edges at those points where the gradient of I is maximum.

The Roberts method finds edges using the Roberts approximation to the derivative. It returns
edges at those points where the gradient of I is maximum.

The Laplacian of Gaussian method finds edges by looking for zero crossings after filtering I
with a Laplacian of Gaussian filter. The zero-cross method finds edges by looking for zero
crossings after filtering I with a filter you specify.

The Canny method finds edges by looking for local maxima of the gradient of I. The gradient
is calculated using the derivative of a Gaussian filter. The method uses two thresholds, to
detect strong and weak edges, and includes the weak edges in the output only if they are
connected to strong edges. This method is therefore less likely than the others to be fooled by
noise, and more likely to detect true weak edges.

The parameters you can supply differ depending on the method you specify. If you do not
specify a method, edge uses the Sobel method.

Sobel Method
BW = edge(I,'sobel') specifies the Sobel method.

BW = edge(I,'sobel',thresh) specifies the sensitivity threshold for the Sobel method. edge
ignores all edges that are not stronger than thresh. If you do not specify thresh, or if thresh is
empty ([]), edge chooses the value automatically.

BW = edge(I,'sobel',thresh,direction) specifies the direction of detection for the Sobel
method. direction is a string specifying whether to look for 'horizontal' or 'vertical' edges,
or 'both' (the default).

[BW,thresh] = edge(I,'sobel',...) returns the threshold value.
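
The Sobel syntaxes can be tried on a synthetic image. This is a sketch that assumes the Image Processing Toolbox is installed; the 20-by-20 image with a white square is made up for illustration and stands in for a scanned character:

```matlab
% Requires the Image Processing Toolbox.
I = zeros(20);
I(6:15, 6:15) = 1;                               % white square on a black background

BW = edge(I, 'sobel');                           % threshold chosen automatically
[BW2, thresh] = edge(I, 'sobel');                % same result, threshold also returned
BWh = edge(I, 'sobel', thresh, 'horizontal');    % only horizontal edges
```

BW is a logical image of the same size as I, with true values along the border of the square.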

FOPEN
Open a file or obtain information about open files

Syntax
fid = fopen(filename)

fid = fopen(filename, mode)

[fid,message] = fopen(filename, mode, machineformat)

fids = fopen('all')

[filename, mode, machineformat] = fopen(fid)

Description
fid = fopen(filename) opens the file filename for read access. (On PCs, fopen opens
files for binary read access.)

fid is a scalar MATLAB integer, called a file identifier. You use the fid as the first
argument to other file input/output routines. If fopen cannot open the file, it returns -1.
Two file identifiers are automatically available and need not be opened. They are fid=1
(standard output) and fid=2 (standard error).

fid = fopen(filename, mode) opens the file filename in the specified mode. The mode
argument can be any of the following:

'r' Open file for reading (default).

'w' Open file, or create new file, for writing; discard existing contents, if any.

'a' Open file, or create new file, for writing; append data to the end of the file.

'r+' Open file for reading and writing.

'w+' Open file, or create new file, for reading and writing; discard existing contents, if any.

'a+' Open file, or create new file, for reading and writing; append data to the end of the file.

'A' Append without automatic flushing; used with tape drives.

'W' Write without automatic flushing; used with tape drives.

filename can be a MATLABPATH relative partial pathname if the file is opened for reading
only. A relative path is always searched for first with respect to the current directory. If it
is not found, and reading only is specified or implied, then fopen does an additional search of
the MATLABPATH.
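
The modes and the -1 return value can be exercised with a short sketch. File names are generated with tempname and are illustrative only:

```matlab
fname = [tempname '.txt'];                 % temporary file name (illustrative)

[fid, message] = fopen(fname, 'w');        % create the file, discarding any contents
if fid == -1
    error('fopen failed: %s', message);
end
fprintf(fid, 'first line\n');
fclose(fid);

fid = fopen(fname, 'a');                   % reopen and append to the end
fprintf(fid, 'second line\n');
fclose(fid);

fid = fopen(fname, 'r');                   % read the two lines back
line1 = fgetl(fid);
line2 = fgetl(fid);
fclose(fid);
delete(fname);

badFid = fopen([tempname '.missing'], 'r');   % file does not exist, so fopen returns -1
```

Checking for -1 immediately after fopen avoids confusing errors from later read or write calls.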

WINOPEN
Open file in appropriate application (Windows only)

Syntax
winopen('filename')

Description
winopen('filename') opens filename in the appropriate Microsoft Windows application. The
winopen function uses the appropriate Windows shell command, and performs the same
action as if you double-click the file in the Windows Explorer. If filename is not in the
current directory, specify the absolute path for filename.

FPRINTF
Write formatted data to file

Syntax

count = fprintf(fid,format,A,...)

Description
count = fprintf(fid,format,A,...) formats the data in the real part of matrix A (and in any
additional matrix arguments) under control of the specified format string, and writes it to the
file associated with file identifier fid. fprintf returns a count of the number of bytes written.

Argument fid is an integer file identifier obtained from fopen. (It can also be 1 for standard
output (the screen) or 2 for standard error.) Omitting fid causes output to appear on the
screen.

Format String

The format argument is a string containing C language conversion specifications. A
conversion specification controls the notation, alignment, significant digits, field width, and
other aspects of output format. The format string can contain escape characters to represent
nonprinting characters such as newline characters and tabs.

Conversion specifications begin with the % character and contain these optional and required
elements:

Flags (optional)

Width and precision fields (optional)

A subtype specifier (optional)

Conversion character (required)
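
A minimal sketch of a format string with a width and precision field. The label 'score' and the value 3.14159 are arbitrary; the temporary file stands in for whatever output file a program writes to:

```matlab
fname = [tempname '.txt'];     % temporary file name (illustrative)
fid = fopen(fname, 'w');

% %s inserts a string; %5.2f prints a number in a field 5 characters wide
% with 2 digits after the decimal point; \n ends the line.
count = fprintf(fid, '%s: %5.2f\n', 'score', 3.14159);

fclose(fid);
written = fileread(fname);     % read the file back to inspect what was written
delete(fname);
```

The call writes the 13 bytes "score:  3.14" followed by a newline, and count holds that byte count.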

LOAD
Load workspace variables from disk

Syntax
load

load('filename')

load('filename', 'X', 'Y', 'Z')

load('filename', '-regexp', exprlist)

load('-mat', 'filename')

load('-ascii', 'filename')

S = load(...)

load filename -regexp expr1 expr2 ...

Description
load loads all the variables from the MAT-file matlab.mat, if it exists, and returns an error if
it doesn't exist.

load('filename') loads all the variables from filename given a full pathname or a
MATLABPATH relative partial pathname. If filename has no extension, load looks for a file
named filename.mat and treats it as a binary MAT-file. If filename has an extension other
than .mat, load treats the file as ASCII data.

load('filename', 'X', 'Y', 'Z') loads just the specified variables from the MAT-file. The
wildcard '*' loads variables that match a pattern (MAT-file only).

load('filename', '-regexp', exprlist) loads those variables that match any of the
regular expressions in exprlist, where exprlist is a comma-delimited list of quoted regular
expressions.

load('-mat', 'filename') forces load to treat the file as a MAT-file, regardless of file extension.
If the file is not a MAT-file, load returns an error.

load('-ascii', 'filename') forces load to treat the file as an ASCII file, regardless of file
extension. If the file is not numeric text, load returns an error.

S = load(...) returns the contents of a MAT-file in the variable S. If the file is a MAT-file, S is
a struct containing fields that match the variables retrieved. When the file contains ASCII
data, S is a double-precision array.

load filename -regexp expr1 expr2 ... is the command form of the syntax.

Use the functional form of load, such as load('filename'), when the file name is
stored in a string, when an output argument is requested, or if filename contains spaces. To
specify a command-line option with this functional form, specify any option as a string
argument, including the hyphen.
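
A short sketch of the functional form with an output argument. The variable names X and Y and their contents are made up for illustration; an OCR program might store its character templates in a MAT-file in a similar way, but that is an assumption rather than something stated in this report:

```matlab
fname = [tempname '.mat'];     % temporary MAT-file name (illustrative)
X = magic(3);
Y = 'template';
save(fname, 'X', 'Y');         % write both variables to the MAT-file

S = load(fname);               % struct with fields X and Y
T = load(fname, 'X');          % struct containing only the requested variable X

delete(fname);
```

Loading into a struct keeps the loaded variables separate from those already in the workspace.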

UNLOAD
Remove the current target application from the target PC

Syntax

MATLAB command line

unload(target_object)

target_object.unload

Arguments

target_object Name of a target object that represents a target application.

Description

Method of a target object. The kernel goes into loader mode and is ready to download a new
target application from the host PC.

6.2 Screenshots

[Figure: Input image]

[Figure: Filtered Image]

[Figure: Inverted Clipped]

[Figure: Character wise Clipped Image]

[Figure: Output]

7. Testing

Unit Testing:
In computer programming, a unit test is a method of testing the correctness of a particular
module of source code. Each function in the source code has been tested individually, and the
results were found satisfactory: each function gave correct results while working on images
with various file extensions.
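
A minimal sketch of what one such unit test might look like. The helper clipImage is hypothetical (the project's source code is not listed in this report); it is assumed here to crop a binary image to the bounding box of its nonzero pixels, built from the any, find, min and max functions:

```matlab
% Hypothetical helpers, defined only for this illustration.
rowsOf = @(bw) find(any(bw, 2));     % indices of rows containing a nonzero pixel
colsOf = @(bw) find(any(bw, 1));     % indices of columns containing a nonzero pixel
clipImage = @(bw) bw(min(rowsOf(bw)):max(rowsOf(bw)), ...
                     min(colsOf(bw)):max(colsOf(bw)));

% Test case: a 3-by-5 block of ones inside a 6-by-8 blank image.
bw = false(6, 8);
bw(2:4, 3:7) = true;
clipped = clipImage(bw);

% The clipped result should be exactly the block, with no blank border left.
assert(isequal(size(clipped), [3 5]));
assert(all(clipped(:)));
```

Each assert stops with an error if its condition is false, so a silent run means the test passed.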

White Box Testing:


White box testing is also called glass box testing, transparent box testing, or structural
testing. It uses an internal perspective of the system to design test cases based on the
internal structure. In the OCR system, testing has been done on all internal structures in the
code, and the results were found satisfactory.

Black Box Testing:


Black box testing treats the system as a "black box", so it does not explicitly use knowledge
of the internal structure. Black box test design is usually described as focusing on testing
functional requirements. Synonyms for black box include behavioral, functional, opaque-box,
and closed-box. The Optical Character Recognition system has been tested on various images
(such as color images, gray images, and images with backgrounds of various colors), and the
results were found satisfactory.

8. Future
• Recognition will be done from captured images rather than from the actual item
• Robotics and computer vision
• Mixing methodologies and making more use of context
• Processing of skewed documents
• Grouping of symbols and identification of characters

9. Conclusion
What does the future hold for OCR? Given enough entrepreneurial designers and sufficient
research and development dollars, OCR can become a powerful tool for future data entry
applications. However, limited availability of funds in a capital-short environment could
restrict the growth of this technology. It will be very difficult to identify a single application
that could generate a sufficient return on investment for extensive research. Marketing
professionals will have to create enough general-use applications to justify these
expenditures. But, given the proper impetus and encouragement, the automated entry of data
by OCR is one of the most attractive, labor-reducing technologies available.

10. Bibliography
10.1 Internet References:

• http://en.wikipedia.org/wiki/Optical_character_recognition

• http://www.mathworks.com/

• http://www.mathworks.com/image-video-processing/

• http://www.mathtools.net/MATLAB/

10.2 Books:

• Gonzalez, Rafael C., "Digital Image Processing", Second Edition, Pearson Education,
Upper Saddle River, New Jersey, USA, 2002
• Chapman, Stephen J., "MATLAB Programming for Engineers", Second Edition, 2002
• Forsyth, David A. and Ponce, Jean, "Computer Vision: A Modern Approach", First Edition,
Pearson Education, Upper Saddle River, New Jersey, USA, 2003

Table of Contents
1. INTRODUCTION
    1.1 Introduction to Project:
    1.2 Why this Project:
    1.3 Advantages:
    1.4 Applications:
        Basic Application:
        Banking:
        Healthcare:
        Digital Library:
        Invoice and Shipping Receipt Processing:
        Robotics and Computer vision:
        Document identification & Automatic number-plate readers:
2. Review of Literature
    2.1 Goal:
    2.2 Other Existing Systems:
        Face Recognition:
        Speech Recognition:
3. System Analysis:
    3.1 Requirement Analysis
        Hardware Requirements:
        Software Requirements:
        WHY MATLAB?
    3.2 Software Requirement Specification:
    3.3 Feasibility:
    3.4 Risk Analysis:
        Risk Identification:
        Risk Projection:
        Sort the risk table with respect to the impact values
4. Project Planning:
    4.1 Timeline Chart
    4.1 Gantt Chart:
5. System Design:
    5.1 Block Diagram
    5.2 Structural Design:
        DATA FLOW DIAGRAM FOR OCR SYSTEM
    5.3 Flowchart of OCR system:
6. IMPLEMENTATION
    6.1 Functions used
        BWLABEL
        FIND:
        IMREAD
        MAX
        MIN
        IMVIEW
        FUNCTION
        ISEMPTY
        MEDFILT2
        WHILE
        SIZE
        FOR
        IF
        ZERO
        FCLOSE
        EDGE AND SOBEL
        FOPEN
        WINOPEN
        FPRINTF
        LOAD
        UNLOAD
    6.2 Screenshots
        Input image
        Inverted Clipped
        Output
7. Testing
8. Future
9. Conclusion
10. Bibliography
    10.1 Internet References:
    10.2 Books: