ANPR Project Documentation
A Project Report
In
ELECTRONICS AND COMMUNICATION ENGINEERING
By
A. DIVYASRI
B. SRUTHILAYA
KARTHEEK
ANAND PAUL
BONAFIDE CERTIFICATE
This is to certify that this project work entitled
“AUTOMATIC NUMBER PLATE RECOGNITION USING CNN”
is the bonafide work of
Mr/Miss……………………………………………………………………………….
Regd.no ……………………………. of final year B.Tech along with his/her batch
mates submitted in partial fulfilment of the requirements for the award of Degree in
Bachelor of Technology in Electronics and Communication
Engineering during the academic year 2016-2020.
Guide:
Head of the Department:
This is to certify that we have examined the report and hereby accord our
approval of it as a project carried out and presented in a manner required for its
acceptance in partial fulfilment of the degree for which it has been submitted.
This approval does not necessarily endorse or accept every statement made,
opinion expressed or conclusion drawn as recorded in the project; it only signifies the
acceptance of the report for the purpose for which it is submitted.
Our most sincere and grateful acknowledgement to our alma mater SAGI RAMA
KRISHNAM RAJU ENGINEERING COLLEGE for giving us the opportunity to carry out
this project.
We are highly indebted to Smt. B. Revathi, Asst. Prof., Department of Electronics and
Communication Engineering, our project guide, for giving valuable and timely suggestions.
We thank the Head of the Department, Electronics and Communication Engineering, for
his kind cooperation in the successful completion of this project.
We are grateful to our principal for providing us with the necessary facilities to carry out
our project.
We extend our sense of gratitude to all our teaching and non-teaching staff.
We hereby declare that this project work, submitted to our college, affiliated to JNTU,
KAKINADA, comprises only our original work, and due acknowledgement has been made
in the text to all material used.
A.DIVYASRI(18B91A0409)
B.SRUTHILAYA(18B91A0421)
KARTHEEK(18B91A0457)
ANANDPAUL(18B91A0410)
CONTENTS
ABSTRACT
Traffic control and vehicle owner identification have become major issues in every country.
It can be difficult to identify the owner of a vehicle that violates traffic laws or drives too
fast: traffic officers may be unable to read the number plate of a moving vehicle because of
its speed, so such individuals often cannot be apprehended and punished.
One solution to this problem was the development of Automatic Number Plate Recognition
(ANPR). We propose an Optical Character Recognition (OCR) technique using the
ResNet 50 network model, which has higher character-level accuracy rates, with the goal of
addressing current challenges such as working in low and bright light, blurred images, and
detecting fast-moving vehicles. The characters in the number plate are pre-processed and
segmented using morphological operations, and a good recognition rate is achieved.
Initially we achieved an accuracy of 90%. Because some printed characters are easily
confused (like ‘B’ and ‘8’), the character recognition is sometimes mismatched, and we use
conditional rendering to get the best possible results. Finally, we achieved an accuracy of
more than 90%.
CHAPTER 1: INTRODUCTION
1.1 Chapter overview:
Technologies toward smart vehicles, smart cities, and intelligent transportation systems continue
to transform many facets of human life. As a consequence, technologies such as automatic
number plate recognition (ANPR) have become part of our everyday activities. Moreover, the
concept of ANPR is promising to contribute toward various use cases while eliminating human
intervention.
Automatic number plate recognition is a computer vision practice that allows devices to read
license number plates on vehicles quickly and automatically, without any human interaction.
Hence, ANPR is used to capture and identify any number plate accurately through the use of
video or photo footage from cameras.
ANPR is one of the most accurate applications of computer vision systems. Systems for
automated number plate recognition use optical character recognition (OCR) to read vehicle
registration plates. Cameras capture high-speed images of number plates, and software for image
processing is used to detect characters, verify the sequence of those characters, and convert the
number plate image to text.
Typical ANPR systems include a digital image capture unit (camera), a processing unit, and
different algorithms for video analytics. In addition, the use of infrared lighting allows such
systems to capture vehicle registration plates at night, making it possible to operate ANPR at all
hours of the day.
1. Firstly, the ANPR camera captures images that contain a license plate (video stream or
photo).
2. Then, the plate is detected using machine learning and computer vision processes
(object detection).
3. Finally, OCR software is applied to the detected plate area to return the license plate
number in text format. The converted number is usually stored in a database for
integration with other IT systems.
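As an illustration, these three steps can be sketched in a few lines of Python, assuming OpenCV's bundled Russian-plate Haar cascade for detection and the pytesseract wrapper for the OCR step (both are illustrative stand-ins, not the exact models used in this project):

import cv2
import pytesseract  # assumes the Tesseract OCR engine is installed

# Step 1: load a captured frame containing a vehicle
frame = cv2.imread('car.jpg')
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Step 2: detect the plate region with a Haar cascade shipped with OpenCV
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_russian_plate_number.xml')
plates = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)

# Step 3: run OCR on each detected plate and print the text
for (x, y, w, h) in plates:
    text = pytesseract.image_to_string(gray[y:y + h, x:x + w], config='--psm 7')
    print(text.strip())  # in a real system this would be stored in a database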
1.2 Motivation:
Similar to CCTV, automatic number plate recognition systems can provide you with the details
regarding when someone was at your premises, whenever they are required. The images taken by
this camera can be used as evidence and can provide valuable information that can be used in
investigations.
1.3 Problem definition:
An image containing the vehicle's number plate is taken as input; the system should extract the
number from the image and search the database for the recognized number plate. It should
recognize number plates even in low-light or shadowed conditions.
1.4 Work carried out in thesis:
The image of a vehicle whose number plate is to be recognised is taken with a digital camera
and loaded onto a local computer for further processing. OpenCV (Open Source Computer
Vision) is a library of programming functions mainly aimed at real-time computer vision; in
simple terms, it is a library for image processing and is used here for all operations related
to images. Python, being a versatile language, is used as the programming language. Python
and its modules such as NumPy, SciPy and Matplotlib provide the functionality needed to
cope with the flood of pictures. To enhance the number plate recognition further, we use a
median filter, which eliminates noise while preserving high-frequency detail such as edges.
This matters for edge detection: number plates are generally rectangular, so we need to detect
the edges of the rectangular plate reliably. Image processing mainly involves the following steps:
1. Image acquisition: This is the first step or process of the fundamental steps of digital image
processing. Image Acquisition is the capturing of an image by any physical device (in this case
the primary camera of the computer) so as to take the input as a digital image in the computer.
2. Image Enhancement: Image enhancement is among the simplest and most appealing
areas of digital image processing. Basically, the idea behind enhancement techniques is to bring
out detail that is obscured, or simply to highlight certain features of interest in an image. Such as,
changing brightness & contrast etc. In this step the quality or rather the clarity of the input image
is enhanced and the image is made clear enough to be processed.
6. Recognition: Recognition is the process that assigns a label, such as “Plate”, to an object
based on its descriptors.
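For instance, the acquisition and enhancement steps can be sketched with OpenCV as follows (the brightness/contrast values are illustrative, not the project's exact parameters):

import cv2

# 1. Image acquisition: grab one frame from the computer's primary camera
cap = cv2.VideoCapture(0)
ok, img = cap.read()
cap.release()

if ok:
    # 2. Image enhancement: adjust contrast (alpha) and brightness (beta),
    #    then apply a median filter, which removes noise but preserves edges
    img = cv2.convertScaleAbs(img, alpha=1.3, beta=20)
    img = cv2.medianBlur(img, 3)
    cv2.imwrite('enhanced.jpg', img)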
OCR systems are made up of a combination of hardware and software that is used to convert
physical documents into machine-readable text. Hardware, such as an optical scanner or
specialized circuit board is used to copy or read text while software typically handles the
advanced processing. Software can also take advantage of artificial intelligence (AI) to
implement more advanced methods of intelligent character recognition (ICR), like identifying
languages or styles of handwriting.
Fig 3.1: The different areas of character recognition
The process of OCR is most commonly used to turn hard-copy legal or historic documents into
PDFs. Once converted to this soft copy, users can edit, format and search the document as if it
were created with a word processor.
The first step of OCR is using a scanner to process the physical form of a document. Once all
pages are copied, OCR software converts the document into a two-color, or black and white,
version. The scanned-in image or bitmap is analyzed for light and dark areas, where the dark
areas are identified as characters that need to be recognized and light areas are identified as
background.
The dark areas are then processed further to find alphabetic letters or numeric digits. OCR
programs can vary in their techniques, but typically involve targeting one character, word or
block of text at a time. Characters are then identified using one of two algorithms:
1. Pattern recognition- OCR programs are fed examples of text in various fonts and formats
which are then used to compare, and recognize, characters in the scanned document.
2. Feature detection- OCR programs apply rules regarding the features of a specific letter or
number to recognize characters in the scanned document. Features could include the
number of angled lines, crossed lines or curves in a character for comparison. For
example, the capital letter “A” may be stored as two diagonal lines that meet with a
horizontal line across the middle.
The origin of character recognition can actually be traced to as far back as 1870. This was the
year that C. R. Carey invented the Retina Scanner, which was an image transmission system using
a mosaic of photocells. Further developments are as follows.
• The first attempts of OCR are credited to Tauschek in 1929 and Handel in 1933. In 1929
Gustav Tauschek obtained a patent on OCR in Germany, followed by Handel who obtained a US
patent on OCR in 1933. Tauschek's machine was a mechanical device that used templates and a
photodetector.
• In 1950, David H. Shepard addressed the problem of converting printed messages into
machine language for computer processing and built a machine to do this. Shepard founded
Intelligent Machines Research Corporation (IMR), which went on to deliver the world's first
several OCR systems used in commercial operation.
• In about 1965, Reader's Digest and RCA collaborated to build an OCR document reader
designed to digitize the serial numbers on Reader's Digest coupons
returned from advertisements. This reader was followed by a specialized document reader
installed at TWA where the reader processed Airline Ticket stock. The readers processed
documents at a rate of 1,500 documents per minute, and checked each document, rejecting those
it was not able to process correctly. The product became part of the RCA product line as a reader
designed to process 'Turnaround Documents' such as those Utility and insurance bills returned
with payments.
• The United States Postal Service has been using OCR machines to sort mail since 1965 based
on technology devised primarily by the prolific inventor Jacob Rabinow. The first use of OCR in
Europe was by the British General Post Office (GPO). Canada Post has been using OCR systems
since 1971. OCR systems read the name and address of the addressee at the first mechanized
sorting center and print a routing barcode on the envelope based on the postal code. Envelopes
may then be processed with equipment based on simple barcode readers.
• In 1974, Ray Kurzweil started the company Kurzweil Computer Products, Inc. and led
development of the first omni-font optical character recognition system, a computer program
capable of recognizing text printed in any normal font. He created a reading machine for the
blind in 1976. This device required the invention of two enabling technologies - the CCD flatbed
scanner and the text-to-speech synthesizer.
• In 1978, Kurzweil Computer Products began selling a commercial version of the optical
character recognition computer program. LexisNexis was one of the first customers who bought
the program to upload paper legal and news documents onto its nascent online databases. Two
years later, Kurzweil sold his company to Xerox, which had an interest in further
commercializing paper-to-computer text conversion. Kurzweil Computer Products became a
subsidiary of Xerox known as ScanSoft, now Nuance Communications.
CHAPTER 4: IMAGE SEGMENTATION AND EDGE BASED METHOD
1. Thresholding Method:
Threshold values are identified from the peaks of the image's histogram. This is the
simplest method of image segmentation, and no prior knowledge is required; its limitation is
that it is strongly reliant on those peaks. Binary images can be created via thresholding.
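A minimal OpenCV sketch of this method, using Otsu's algorithm to pick the threshold automatically from the histogram (the file name is illustrative):

import cv2

gray = cv2.imread('plate.jpg', cv2.IMREAD_GRAYSCALE)

# Otsu's method chooses the threshold from the histogram itself,
# producing a binary (two-region) segmentation of the image
t, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('chosen threshold:', t)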
2. Edge Based Method:
This method locates region boundaries by detecting discontinuities in intensity, so it works
well when regions contrast strongly, but it struggles when edges are weak or the image is noisy.
3. Region-based method:
This method is based on splitting a picture into homogeneous regions that are
more noise-resistant, and it is beneficial when defining similarity criteria is simple. The
disadvantage is that the process is time and memory intensive.
4. Clustering Method:
It is based on homogeneous cluster division. This is primarily used to solve
problems in real time. However, establishing the membership function is difficult in this case. As
a result, we will not always prefer this strategy.
5. Watershed Method:
This method treats the gradient image as a topographic surface that is flooded from markers,
so that region boundaries emerge as watershed lines; it yields closed contours but can
over-segment noisy images.
Image processing on computers serves two purposes. The first is to create more appropriate
visuals for people to observe and recognise; the second is our wish that computers could
recognise and understand images automatically. The most fundamental features of an image are
its edges, which carry a great deal of the image's internal information. As a result, edge
detection is one of the most important areas of image processing research.
There are numerous methods for detecting edges (Sobel [1,2], Prewitt [3], Roberts [4],
Canny [5]). These techniques have been presented for identifying image transitions.
Edges define boundaries and are hence a critical issue in image processing. Points where the
image has strong intensity contrast (a jump in intensity from one pixel to the next) are called
edges. Edge detection reduces the quantity of data in an image and filters away unnecessary
information while keeping the image's crucial structural qualities. Edge detection can be done
in a variety of ways, but the majority of approaches may be divided into two categories,
gradient and Laplacian. The gradient approach finds edges by looking for maxima and minima
in the image's first derivative.
If we take the gradient of this signal (which, in one dimension, is just the first derivative with
respect to t) we get the following:
Fig 3.2: Gradient of the above signal
The derivative clearly reveals a maximum in the original signal at the middle of the edge.
The Sobel method, which locates edges in this way, belongs to the "gradient filter" family of
edge detection filters. If the gradient value exceeds a certain threshold, a pixel position is
deemed an edge location. Edges have higher pixel intensity values than the surrounding
intensity values, so after setting a threshold we can compare the gradient value to the
threshold value and detect an edge wherever the threshold is exceeded. Furthermore, the
second derivative is zero where the first derivative is at its maximum; as a result, finding the
zero crossings of the second derivative provides another way of locating the position of an edge.
The Sobel operator performs a 2-D spatial gradient measurement on an image and
emphasizes regions of high spatial gradient that correspond to edges. Typically, it is used to
find the approximate absolute gradient magnitude at each point in an input grayscale image.
Compared to other edge operators, Sobel has two main advantages:
1. Because of the averaging factor it introduces, it has some smoothing effect on the
random noise of the image.
2. Because it differences across two rows or two columns, the edge elements on both
sides are enhanced, so the edge appears thick and bright.
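For reference, the standard 3x3 Sobel kernels that approximate the horizontal and vertical derivatives are (one common sign convention):

        | -1  0  +1 |           | -1  -2  -1 |
G_x =   | -2  0  +2 |,  G_y =   |  0   0   0 |
        | -1  0  +1 |           | +1  +2  +1 |

The gradient magnitude at each pixel is then approximated as |G| = √(G_x² + G_y²).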
Canny Edge Detection is a popular edge detection algorithm. It was developed by John F.
Canny in 1986. It is a multi-stage algorithm and we will go through each stage.
1. Noise Reduction
Since edge detection is susceptible to noise in the image, the first step is to remove the
noise in the image with a 5x5 Gaussian filter. We have already seen this in previous chapters.
2. Finding Intensity Gradient of the Image
The smoothened image is then filtered with a Sobel kernel in both the horizontal and vertical
directions to get the first derivative in the horizontal direction (G_x) and the vertical
direction (G_y). From these two images, we can find the edge gradient and direction for each
pixel as follows:

Edge gradient:  G = √(G_x² + G_y²)
Direction:      θ = tan⁻¹(G_y / G_x)
3. Non-maximum Suppression
After getting the gradient magnitude and direction, a full scan of the image is done to
remove any unwanted pixels which may not constitute the edge. For this, every pixel is
checked to see whether it is a local maximum in its neighborhood in the direction of the
gradient. Check the image below:
Point A is on the edge (in vertical direction). Gradient direction is normal to the edge. Point
B and C are in gradient directions. So, point A is checked with point B and C to see if it
forms a local maximum. If so, it is considered for the next stage, otherwise, it is suppressed
(put to zero).
In short, the result you get is a binary image with “thin edges”.
4. Hysteresis Thresholding
This stage decides which of the candidate edges are really edges and which are not. For
this, we need two threshold values, minVal and maxVal. Any edges with an intensity gradient
above maxVal are sure to be edges, and those below minVal are sure to be non-edges and are
discarded. Pixels that lie between these two thresholds are classified as edges or non-edges
based on their connectivity: if they are connected to "sure-edge" pixels, they are considered
part of an edge; otherwise, they are also discarded. See the image below:
Fig 3.5: Hysteresis Thresholding
The edge A is above maxVal, so it is considered a "sure edge". Although edge C is below
maxVal, it is connected to edge A, so it is also considered a valid edge and we get the
full curve. Edge B, although it is above minVal and in the same region as edge C, is not
connected to any "sure edge", so it is discarded. It is therefore very important to select
minVal and maxVal appropriately to get the correct result.
This stage also removes small pixel noises on the assumption that edges are long
lines. So what we finally get is strong edges in the image.
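The whole pipeline described above is available as a single call in OpenCV; a minimal sketch (the two hysteresis thresholds are illustrative):

import cv2

gray = cv2.imread('car.jpg', cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(gray, (5, 5), 0)  # stage 1: 5x5 Gaussian noise reduction

# Stages 2-4 (Sobel gradients, non-maximum suppression, hysteresis
# thresholding) are performed internally; 100 and 200 are minVal and maxVal
edges = cv2.Canny(blur, 100, 200)
cv2.imwrite('edges.jpg', edges)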
Sobel Operator:
The Sobel operator's greatest advantage is its simplicity. The gradient magnitude is
approximated using the Sobel method. The Sobel operator also has the ability to recognise
edges and their orientations. The detection of edges and their orientations is said to be simple
in this cross operator due to the approximation of the gradient magnitude. The Sobel
approach, on the other hand, has several downsides: it is affected by noise. As the
amount of noise in the image rises, the magnitude of the edges deteriorates. As a
result, as the magnitude of the edges diminishes, the Sobel operator's accuracy suffers.
Overall, the Sobel approach is unable to detect a thin and smooth edge accurately.
Canny Operator:
Any noise in an image can be eliminated by the Gaussian filter stage. The second benefit is
that the signal is enhanced relative to the noise. The non-maxima suppression stage yields
output ridges that are one pixel wide. The third benefit, from the thresholding stage, is
improved edge identification, especially in noisy situations. Adjustable parameters influence
the efficiency of the Canny technique: fine filters are preferred for detecting small, crisp
lines since they produce less blurring, while large filters are preferred for identifying
larger, smoother edges, at the cost of more blurring. The main downside of the Canny edge
detector is that it takes a long time to compute due to its complexity.
CHAPTER 5
INTRODUCTION OF NEURAL NETWORKS
The word "neural networks" conjures up images of brain activity. It conjures up images of robots
that resemble brains, and it could be packed with the Frankenstein mythos' science fiction
undertones. One of this chapter's main goals is to deconstruct neural networks and demonstrate
how, while they do have something to do with brains, their study intersects with other fields of
science, engineering, and mathematics. Although some mathematical notation is required to
quantitatively explain particular principles, procedures, and structures, the goal is to do it in as
non-technical a manner as possible. Nonetheless, all symbols and expressions will be explained
as they come, so that they do not get in the way of the essentials, which are the thoughts and
ideas that are being discussed.
To flesh this out a little, we first take a quick look at some basic neurobiology. The human
brain consists of an estimated 10^11 (100 billion) nerve cells or neurons, a highly stylized
example of which is shown in the figure below.
Fig: Essential components of a neuron, shown in stylized form.
Neurons communicate via electrical signals that are short-lived impulses or "spikes" in the
voltage of the cell wall or membrane. The interneuron connections are mediated by
electrochemical junctions called synapses, which are located on branches of the cell referred
to as dendrites. Each neuron typically receives many thousands of connections from other
neurons and is therefore constantly receiving a multitude of incoming signals, which
eventually reach the cell
body. Here, they are integrated or summed together in some way and, roughly speaking, if the
resulting signal exceeds some threshold then the neuron will "fire" or generate a voltage impulse
in response. This is then transmitted to other neurons via a branching fiber known as the axon. In
determining whether an impulse should be produced or not, some incoming signals produce an
inhibitory effect and tend to prevent firing, while others are excitatory and promote impulse
generation. The distinctive processing ability of each neuron is then supposed to reside in the
type—excitatory or inhibitory—and strength of its synaptic connections with other neurons.
An artificial neural network (ANN) is a hardware or software system that mimics the way
neurons in the brain and nervous system work. Artificial neural networks are a type of deep
learning technology that falls under the Artificial Intelligence umbrella.
Deep learning is a subset of machine learning that employs a variety of neural networks. Because
these algorithms are based on how our brains work, many scientists feel they are our best chance
at achieving true AI (Artificial Intelligence).
Different types of neural networks use different principles in determining their own rules.
There are many types of artificial neural networks, each with its unique strengths. The main
types and their applications are described below.
5.3.1) Feedforward Neural Network – Artificial Neuron
One of the most basic types of artificial neural networks is the feedforward network. The
data in a feedforward neural network flows via many input nodes before reaching the output
node. In other words, data moves in only one direction, from the input layer to the output
node. This is also known as a front-propagating wave, and it is normally accomplished with
the help of a classifying activation function.
There is no backpropagation in this sort of neural network; data moves in one way only. A
feedforward neural network may have a single layer or multiple hidden layers.
In a feedforward neural network, the sum of the products of the inputs and their weights is
computed and passed to the output.
A simple feedforward neural network is equipped to deal with data which contains a lot
of noise. Feedforward neural networks are also relatively simple to maintain.
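As a toy illustration of this sum-of-products computation, a one-layer feedforward pass in NumPy (the weights and sizes are arbitrary):

import numpy as np

def feedforward(x, W, b):
    # Weighted sum of the inputs, then a sigmoid activation
    z = W @ x + b
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])      # input vector (3 features)
W = np.random.randn(2, 3) * 0.1     # weights: 3 inputs -> 2 output neurons
b = np.zeros(2)                     # biases
print(feedforward(x, W, b))         # data flows one way: input -> output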
5.3.2) Radial Basis Function Neural Network
A radial basis function considers the distance of any point relative to the center. Such
neural networks have two layers: in the inner layer, the features are combined with the
radial basis function, and the output of these features is then taken into account when
calculating the same output in the next time-step.
The radial basis function neural network is applied extensively in power restoration
systems. In recent decades, power systems have become bigger and more complex.
This increases the risk of a blackout. This neural network is used in the power restoration
systems in order to restore power in the shortest possible time.
5.3.3) Multilayer Perceptron
A multilayer perceptron has three or more layers. It is used to classify data that cannot be
separated linearly. It is a type of artificial neural network that is fully connected, because
every single node in a layer is connected to each node in the following layer.
A multilayer perceptron uses a nonlinear activation function (mainly the hyperbolic tangent
or the logistic function).
This type of neural network is applied extensively in speech recognition and machine
translation technologies.
5.3.4) Convolutional Neural Network
A convolutional neural network (CNN) employs a variation of the multilayer perceptron. A CNN
may have one or more convolutional layers, which can be entirely integrated (fully connected)
or pooled together.
The convolutional layer performs a convolutional operation on the input before transferring the
result to the next layer. The network can be much deeper but with fewer parameters thanks to
this convolutional process.
Convolutional neural networks excel at image and video recognition, natural language
processing, and recommender systems because of this ability.
Convolutional neural networks excel at semantic parsing and paraphrase identification as well.
They're also used in picture categorization and signal processing.
CNNs are also used in agriculture for picture analysis and recognition where meteorological
features are important.
The convolutional layer computes, at each location, a dot product of the input function and a
kernel function:

(I * K)(i, j) = Σ_m Σ_n I(i + m, j + n) · K(m, n)

In image processing, it is easiest to visualize the kernel as sliding over the entire image
and thus changing the value of each pixel in the process.
Fig 4.6: Kernel matrix
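To make the sliding-kernel picture concrete, here is a naive NumPy convolution (stride 1, no padding; the kernel values are illustrative):

import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Dot product of the kernel with the image patch beneath it
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # simple vertical-edge kernel
print(conv2d(img, kernel))                 # 3x3 output feature map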
Receptive field: In neural networks, each neuron receives input from some number of
locations in the previous layer. In a fully connected layer, each neuron receives input from every
element of the previous layer. In a convolutional layer, neurons receive input from only a
restricted subarea of the previous layer. Typically, the subarea is of a square shape (e.g., size 5
by 5). The input area of a neuron is called its receptive field. So, in a fully connected layer, the
receptive field is the entire previous layer. In a convolutional layer, the receptive area is smaller
than the entire previous layer.
The vector of weights and the bias are called filters and represent particular features of
the input (e.g., a particular shape). A distinguishing feature of CNNs is that many neurons can
share the same filter. This reduces memory footprint because a single bias and a single vector of
weights are used across all receptive fields sharing that filter, as opposed to each receptive field
having its own bias and vector weighting.
Distinguishing features:
Local connectivity: following the concept of receptive fields, CNNs exploit spatial locality
by enforcing a local connectivity pattern between neurons of adjacent layers. The architecture
thus ensures that the learned "filters" produce the strongest response to a spatially local input
pattern. Stacking many such layers leads to non-linear filters that become increasingly global
(i.e. responsive to a larger region of pixel space) so that the network first creates representations
of small parts of the input, then from them assembles representations of larger areas.
Shared weights: In CNNs, each filter is replicated across the entire visual field. These
replicated units share the same parameterization (weight vector and bias) and form a feature
map. This means that all the neurons in a given convolutional layer respond to the same feature
within their specific response field. Replicating units in this way allows for the resulting feature
map to be equivariant under changes in the locations of input features in the visual field, i.e. they
grant translational equivariance.
Pooling: In a CNN's pooling layers, feature maps are divided into rectangular sub-regions, and
the features in each rectangle are independently down-sampled to a single value, commonly by
taking their average or maximum value. In addition to reducing the sizes of feature maps, the
pooling operation grants a degree of translational invariance to the features contained therein,
allowing the CNN to be more robust to variations in their positions.
Spatial arrangement: Three hyperparameters control the size of the output volume of the
convolutional layer: the depth, stride and zero-padding.
● The Depth of the output volume controls the number of neurons in a layer that connect to
the same region of the input volume. These neurons learn to activate different features in
the input. For example, if the first convolutional layer takes the raw image as input, then
different neurons along the depth dimension may activate in the presence of various
oriented edges, or blobs of color.
● Stride controls how depth columns around the spatial dimensions (width and height) are
allocated. When the stride is 1 then we move the filters one pixel at a time. This leads to
heavily overlapping receptive fields between the columns, and also to large output
volumes. When the stride is 2 then the filters jump 2 pixels at a time as they slide around.
Similarly, for any integer S > 0, a stride of S causes the filter to be translated by S units at
a time per output. In practice, stride lengths of S >= 3 are rare. The receptive fields
overlap less and the resulting output volume has smaller spatial dimensions when stride
length is increased.
● Sometimes it is convenient to pad the input with zeros on the border of the input volume.
The size of this padding is a third hyper parameter. Padding provides control of the
output volume and spatial size. In particular, sometimes it is desirable to exactly preserve
the spatial size of the input volume.
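These three hyperparameters combine into the standard output-size formula: for input width W, filter size F, zero-padding P and stride S, the output width is

W_out = (W − F + 2P) / S + 1

For example, a 32-pixel-wide input with F = 5, P = 2 and S = 1 gives (32 − 5 + 4)/1 + 1 = 32, exactly preserving the spatial size of the input.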
A modular neural network is composed of several independent networks, each handling a
separate sub-task. As a result, a large and complex computational process can be done
significantly faster by breaking it down into independent components. The computation speed
increases because the networks are not interacting with, or even connected to, each other.
Here's a visual representation of a modular neural network.
Fig 4.8: Modular Neural Network
ADVANTAGES:
● Added security
ANPR largely acts as a deterrent. The knowledge that their number plate is being recorded and
checked is usually enough to stop criminal behavior in advance. ANPR is also useful for the
police, who can browse the data collected and check for suspicious vehicles, or vehicles that
were involved in a crime. Because the data is stored for a period of time, ANPR records can
provide both alibis and incriminating evidence. ANPR also provides security on a lower level, such as
open workplace parking where it can manage permit parking for staff vehicles, or recognise a
vehicle that has previously been banned from your premises. ANPR offers an extra measure of
security for both public and private use.
● Automated service
ANPR cameras are an efficient and cost-effective way to monitor parking solutions. In car parks,
they negate the need for parking wardens. Thanks to their high-accuracy readings and 24/7
operation, they are more efficient than most individuals and therefore provide a more dependable
service. They also offer a confrontation-free parking solution, which some have found to be
beneficial when delivering fines to drivers. Parking management teams often find that both
traffic personnel and ANPR cameras work well together, especially in traffic and parking
enforcement, where staff can rely on ANPR to provide the necessary information, minimizing
the time they spend on the streets.
● Real-time benefits
ANPR is beneficial to many industries thanks to the real-time imaging it offers. Historically,
number plate recording would take time, and then longer still to send out penalty notices to those
who violate traffic laws. With ANPR however, number plates can be recognised and checked
against the database almost instantaneously. From this, it takes as little as 48 hours to issue a
penalty notice. The fast nature of these cameras allows for an immediate response to criminal
activity, making sure no unwanted behavior goes unchecked.
● Cost-effective
As well as being easier and more efficient, ANPR technology is also one of the most cost-
effective solutions for managing your car park. You will be able to cut costs and reduce the need
for security personnel when you choose this smart solution. Many companies will also issue
fines to anyone picked up by their ANPR system that shouldn’t be on their private property or
anyone that has exceeded the maximum time limit. This can bring in extra money for the
company and may even end up paying for this security solution.
DISADVANTAGES:
● Privacy Concerns
Using ANPR cameras raises privacy concerns for many people, who dislike the idea of their data
being stored for months. There is the concern that storing information could lead to data leaks
and theft, or misuse of their personal information. People also dislike the idea of their
whereabouts being known at all times. However, ANPR is not generally considered an
infringement of an individual's privacy; the data is expected to be stored securely and
accessed only for good reason by a senior official.
● Extreme circumstances
While a great addition to a car park, ANPR is not a fool-proof method. ANPR cameras may
struggle to work in adverse weather conditions, such as heavy rain, or snow, where the number
plate is obscured or distorted. These cameras also rely on sensible driving from cars. For
example, if, when leaving the car park, you were too close to the car in front and your number
plate was obscured, the cameras may not recognise that you had left, and could end up
overcharging you. Not only this, some ANPR cameras are not advanced enough to recognise
number plates that vary from the standard, such as vanity or foreign plates. In these situations, it
would be useful to mix both personnel and automated systems.
● Human behavior
A disadvantage of ANPR parking systems is that they rarely take into account human error and
behavior. ANPR systems do not usually consider giving a grace period when you enter a car
park. This means that those drivers who enter the car park and don’t find a space can be charged,
as the camera saw them enter and leave, but can find no matching ticket. Similarly, a mistyped
ticket at a ticket machine, for example, using the letter ‘O’ instead of a zero, can result in a fine,
as the system cannot find a matching ticket to the number plate the cameras read.
5.1 PREPROCESSING:
● The aim of pre-processing is an improvement of the image data that suppresses
unwanted distortions or enhances some image features important for further
processing.
● Pre-processing includes the following stages
Binarization.
Skew removal.
Noise removal.
Processed image.
5.1.1 Binarization:
Binarization converts the scanned grayscale image into a two-level (black and white) image by
thresholding, so that character pixels are separated from the background before further
processing.
5.1.2 Skew removal:
The image obtained after scanning an opened book page usually suffers from various scanning
artefacts. One major artefact is the skew defect. This defect reduces the quality of the
scanned images and causes many problems for document image analysis; it is difficult for an
Optical Character Recognizer (OCR) to read such documents. Some effective methods exist to
rectify this error.
5.1.3 Noise removal:
Images are often degraded by noise, which can be introduced during image capture,
transmission, etc. Noise removal is an important task in image processing. In general, the
results of the noise removal have a strong influence on the quality of the image processing
techniques. Several techniques for noise removal are well established in colour image
processing. The nature of the noise removal problem depends on the type of noise corrupting
the image. In the field of image noise reduction, several linear and nonlinear filtering
methods have been proposed. Linear filters are not able to effectively eliminate impulse
noise as they have a tendency to blur the edges of an image. On the other hand, nonlinear
filters are suited for dealing with impulse noise. Several nonlinear filters based on Classical
and fuzzy techniques have emerged in the past few years. For example, most classical filters
that remove noise simultaneously blur the edges, while fuzzy filters have the ability to combine
edge preservation and smoothing. Compared to other nonlinear techniques, fuzzy filters are
able to represent knowledge in a comprehensible way.
The Haar cascade is used for the line and edge detection features, and it is fast at computing
the features due to the use of integral images. The detection proceeds as follows (a code
sketch follows this list):
● Read the .xml cascade file; convert the image to grayscale and use contouring to detect
the number plate in the image by its width and height
● Read the input image file and mark the number plate
● Convert the image, now with the detected number plate, to grayscale again
● Perform a morphological transform on the image
● Perform edge detection, and crop the image so as to focus on the number plate
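A minimal sketch of these steps, assuming OpenCV and a plate-cascade .xml file (the file path and parameter values are illustrative):

import cv2

plate_cascade = cv2.CascadeClassifier('haarcascade_plate.xml')  # hypothetical path

img = cv2.imread('vehicle.jpg')               # read the input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to grayscale

for (x, y, w, h) in plate_cascade.detectMultiScale(gray, 1.1, 4):
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)  # mark the plate
    plate = gray[y:y + h, x:x + w]            # crop to focus on the plate

    # Morphological transform (opening) to clean up the plate region
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    plate = cv2.morphologyEx(plate, cv2.MORPH_OPEN, kernel)

    edges = cv2.Canny(plate, 100, 200)        # edge detection on the plate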
Step 1: First, find the pixel intensities of the image along its width, add each to the
previous running value, and place the results in a list.
Step 2: Split the list wherever the same value occurs twice (if there is no character, the
pixel intensity at that position is 0, so adding the previous value yields the same value).
Step 3: Subtract each value in the list from the previous one to obtain the character widths.
Step 4: We now have a list of values that segment the characters perfectly; a sketch of this
procedure follows.
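A minimal NumPy sketch of these four steps, assuming a binary plate image in which character pixels are 1 (the synthetic plate below is only for illustration):

import numpy as np

def character_gaps(binary):
    # Step 1: column intensity sums accumulated along the width
    running = np.cumsum(binary.sum(axis=0))
    # Step 2: the running total repeats wherever a column contains no
    # character pixels, so consecutive equal values mark a gap
    return np.where(np.diff(running) == 0)[0] + 1

plate = np.zeros((10, 12), dtype=int)
plate[2:8, 1:3] = 1    # synthetic character in columns 1-2
plate[2:8, 6:9] = 1    # synthetic character in columns 6-8
print(character_gaps(plate))   # -> [ 3  4  5  9 10 11] (gap columns)
# Steps 3-4: differences between successive gap boundaries give the
# character widths, i.e. the segmented character regions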
Our neural network consists of a convolutional neural network that is used for feature
extraction. The output of the convolutional network is then connected to the ResNet neural
network.
The ResNet network uses different sizes of filters to extract features at different levels of
the image. Our network consists of four convolutional filters of different sizes (1x1, 3x3).
The outputs of all the filters are concatenated using a concatenation layer, and the most
important features are extracted using the max-pool layer.
The creation of the different layers of the neural network can be achieved with Python
commands such as the following (TensorFlow 1.x; the layer parameters shown are illustrative,
only the initializer comes from the original listing):

conv = tf.layers.conv2d(
    inputs, filters=64, kernel_size=3, activation=tf.nn.relu,
    kernel_initializer=tf.contrib.layers.variance_scaling_initializer())
First, the images are converted to grayscale, which removes unwanted detail, and then fed
into our neural network. The dataset contains more than 2000 samples and is divided into
training and testing parts: 80% of the dataset goes to training and 20% to testing. We
trained on around 2000 images for 20,000 training steps with a batch size of 64, and finally
achieved an accuracy of more than 90%.
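A minimal sketch of this 80/20 split (the random arrays below are placeholders standing in for the real dataset):

import numpy as np

images = np.random.rand(2000, 64, 64, 1)  # placeholder for the ~2000 samples
labels = np.random.randint(0, 36, 2000)   # placeholder labels (0-9, A-Z)

idx = np.random.permutation(len(images))  # shuffle before splitting
split = int(0.8 * len(images))            # 80% train / 20% test
train_x, test_x = images[idx[:split]], images[idx[split:]]
train_y, test_y = labels[idx[:split]], labels[idx[split:]]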
CHAPTER 8: RESULTS
In the training graph, the X-axis represents the training steps and the Y-axis the accuracy.
The curve starts at an accuracy of 25%, increases as the number of batches grows, and
saturates at around 20,000 steps. Continued training past this point leads to overfitting and
decreases the test accuracy of the model, so the highest accuracy is obtained at the
saturation point.
CHAPTER 9: CONCLUSION
In conclusion, it is difficult to retrieve the vehicle number from a moving vehicle because
of the vehicle's speed. We therefore proposed an OCR technique using the ResNet 50 network
model, which has higher character-level accuracy rates. The time required for training the
model is less for ResNet than for other models. The Haar Cascade algorithm efficiently
performs the edge detection of the number plate from an input image. Due to some confusing
characters in printed letters (like ‘B’ and ‘8’), the character recognition is sometimes
mismatched; we use conditional rendering to get the best possible results. Finally, we
achieved an accuracy of more than 90%.
APPENDIX: RECOGNITION PIPELINE CODE

import sys
import numpy as np
import pandas as pd
import tensorflow as tf
import cv2

sys.path.append('../src')
# Helper modules from the project's src/ directory (module names assumed)
from ocr import page, words
from ocr.helpers import implt
from ocr.normalization import word_normalization
from ocr.tfhelpers import Model
from ocr.datahelpers import idx2char
from ocr.charSeg import segment

%matplotlib inline

IMG = '../data/pages/v2.jpeg'
LANG = 'en_hw'
MODEL_LOC_CHARS = '../models/char-clas/' + LANG + '/CharClassifier'
CHARACTER_MODEL = Model(MODEL_LOC_CHARS)

# Load the page image (this load step is assumed; implt() displays it)
image = cv2.cvtColor(cv2.imread(IMG), cv2.COLOR_BGR2RGB)
implt(image)

crop = page.detection(image)     # detect and crop the page region
implt(crop)
boxes = words.detection(crop)    # detect word bounding boxes
lines = words.sort_words(boxes)  # group boxes into text lines
def recognise(img):
    # Recognise a single word image and return the predicted string
    img = word_normalization(
        img,
        60,
        border=False,
        tilt=True,
        hyst_norm=True)
    # Separate letters: pad the word so edge characters are not clipped
    img = cv2.copyMakeBorder(
        img,
        0, 0, 30, 30,
        cv2.BORDER_CONSTANT,
        value=[0, 0, 0])
    # implt(img)
    gaps = segment(img)  # column positions separating individual characters
    # print(gaps)
    chars = []
    for i in range(len(gaps) - 1):
        # Crop one character between consecutive gap positions (this crop
        # line is assumed; the original listing omitted it)
        char = img[:, gaps[i]:gaps[i + 1]]
        implt(char)
        chars.append(char.flatten())
    chars = np.array(chars)
    word = ''
    if len(chars) != 0:
        pred = CHARACTER_MODEL.run(chars)  # classify all characters at once
        # print(pred)
        for c in pred:
            word += idx2char(c)
    return word
implt(crop)

# Recognise every word, line by line (loop body reconstructed; the box
# format (x1, y1, x2, y2) is assumed)
for line in lines:
    print(" ".join(recognise(crop[y1:y2, x1:x2]) for (x1, y1, x2, y2) in line))