
2018 International Conference on Applied Smart Systems (ICASS'2018)

24-25 November 2018, Médéa, ALGERIA

Traffic signs recognition with deep learning


Djebbara Yasmina
Department of Electrical Engineering
Ecole Nationale Supérieure de Technologie
Algiers, Algeria
[email protected]

Rebai Karima
Department of Electrical Engineering
Ecole Nationale Supérieure de Technologie
Algiers, Algeria
[email protected]

Azouaoui Ouahiba
Division Productique et Robotique
Centre de Développement des Technologies Avancées
Algiers, Algeria
[email protected]

Abstract: In this paper, a deep learning based method for road traffic sign recognition is developed, which is very promising for the development of Advanced Driver Assistance Systems (ADAS) and autonomous vehicles. The system architecture is designed to extract the main features from images of traffic signs in order to classify them under different categories. The presented method uses a modified LeNet-5 network to extract a deep representation of traffic signs and perform the recognition. It consists of a Convolutional Neural Network (CNN) modified by connecting the output of all convolutional layers to the Multilayer Perceptron (MLP). The training is conducted using the German Traffic Sign Dataset and achieves good results on recognizing traffic signs.

Keywords: Classification, Recognition, Artificial Neural Network (ANN), Convolutional Neural Network (CNN), Multilayer Perceptron (MLP), Deep learning, Artificial Intelligence, Road signs, Autonomous vehicles.

I. INTRODUCTION

The human factor remains the most common cause of road mortality. Indeed, the potentially dangerous choices made by the driver might be intentional (speeding, for example), or they might be the result of physical tiredness, drowsiness, or poor perception and interpretation of the observed scene. The introduction of autonomous vehicles will certainly reduce these causes or even make them disappear.

As part of the development of these autonomous vehicles, particularly driving assistance systems, several manufacturers and laboratories have oriented their work towards the exploitation of visual information because of its usefulness for the detection of roads, vehicles, pedestrians and traffic signs. The principle of driving assistance systems aimed at road sign recognition is to detect signs, interpret their meaning, then transmit the information to the driver (by projection on the windshield, a screen or a smartphone) or, even better, transmit the information to the vehicle, which acts on it without needing a human decision. However, since the classical approach has been limited to well-structured models of traffic signs (undistorted and completely visible), it became necessary to consider the real characteristics of the road environment. For this reason, current research is moving towards the development of recognition systems that are better adapted to real images of road signs, which generally do not look like their models.

Motivated by the success of deep learning based classification and recognition methods in different domains, we are interested in using these new advances in machine learning for traffic sign recognition.

The remainder of this paper is organized as follows. Section 2 discusses some related works in Traffic Signs Recognition (TSR). In section 3, the dataset used in the development of our approach is presented. Section 4 details the proposed method and section 5 discusses the ideas developed to improve its performance. Section 6 presents the implementation results of the network before and after the application of the improvement operations. A summary of the key points and future works concludes the paper in section 7.

II. RELATED WORKS

The last decade has seen growth in the development of intelligent transportation systems (ITS), especially ADAS and Self-Driving Cars (SDC). In these systems, traffic sign detection and recognition is one of the difficult tasks that confront researchers and developers. This issue is addressed as a problem of detecting, recognizing, and classifying objects (traffic signs) using computer vision, and it remains a challenge.

The work presented in this paper focuses on traffic sign recognition without considering the detection step. For this reason, this section discusses only related works from this angle. Traffic sign recognition is divided into two parts: feature extraction and sign recognition. For the first step, several methods have been proposed, including edge detection [1], scale-invariant feature transform (SIFT) [2], speeded-up robust features (SURF) [3], histogram of oriented gradients (HOG) [4] and others. In [5], Bag of Words (BoW) exploiting SURF and k-means was used. Typically, the output of this step is the input of the classification algorithms that recognize the road signs. Many algorithms have been used for traffic sign classification, such as the K-Nearest Neighbor (KNN) classifier [3], Support Vector Machine (SVM) [6] and neural networks [5][7]. The authors of [5] evaluated three methods, namely Artificial Neural Network (ANN), Support Vector Machine (SVM) and Ensemble subSpace KNN, using

978-1-5386-6866-5/18/$31.00 ©2018 IEEE



BoW where every road sign is encoded with 200 features. The Multi-layer Perceptron neural network provides better results than the other two.

Currently, convolutional networks are gradually replacing traditional computer vision algorithms in different applications such as object classification and pattern recognition [7][8]. They are used to extract and learn a deep description of the traffic signs. This solution removes the separate descriptor extraction step, which is very sensitive to different factors. The network takes a 2D image and processes it with convolution operations. It is able to learn a representative description of the image.

III. TRAFFIC SIGNS DATASET

A rich dataset is needed in neural network based object recognition in order to train the system and evaluate its results. For traffic sign classification, we used the German Traffic Sign Benchmark (GTSB) [9], which contains 43 classes and is divided into 3 categories as represented in Table I.

TABLE I. THE DATASET DISTRIBUTION

Category          Task                                                   Number of images
Training data     Used to train the network                              34799
Validation data   Used to supervise the network performance during      4410
                  training (a reduced version of the testing data)
Testing data      Used to evaluate the final network                     12630

Shape (all categories): 4-dimensional tensor that determines the image index in the dataset, the pixel's row and column, and the information it carries (Red, Green, Blue value).

IV. THE NEURAL NETWORK ARCHITECTURE

Using a fully connected neural network for image classification requires a large number of layers and neurons, which increases the number of parameters and leads the network to over-fitting (memorizing the training data only). The input image may also lose its pixel correlation properties, since all neurons (carrying pixel values) are connected to each other [7].

Convolutional neural networks have emerged to solve these problems: their kernel filters extract the main features of the input image, which are then injected into a fully connected network to define the class [7].

The architecture chosen in our application is the LeNet-5 convolutional neural network (Fig. 1), first used for handwritten digit recognition [10]. It contains 9 layers: 5 layers of convolution and simplification functions made of 22 5x5 kernel filters and a 2x2 max pooling filter, which finally reduce the 32x32 input image to 16 maps of 5x5. These feature maps carry the most important features needed to define a specific traffic sign class by processing them in a 4-layer fully connected network.

Fig. 1. LeNet-5 architecture

The training phase of our neural network updates its parameters Φ (weights and biases) in order to reach an adequate accuracy value. The update algorithm chosen in our application is a supervised learning algorithm called gradient descent with mini-batches, where a multi-dimensional error function C (which depends on all the network parameters, over 70 000 in the case of LeNet-5) is calculated over mini-batches of 64 training examples (to avoid computing over all 34799 images at every stage). Once the error function is obtained, the algorithm searches for the function's decreasing direction by using the gradient with respect to each parameter and then updates the parameters according to formula (1) [8], where γ is the learning rate:

$$\Phi_{t+1} = \Phi_t - \gamma \, \nabla C(\Phi_t) \tag{1}$$

The algorithm repeats the described process until it reaches the desired results. At the end, the parameters of the neural network are trained so that the network knows which features it must extract (convolution phase) and which class it must attribute to the input (classification phase).

V. PERFORMANCE IMPROVEMENT

A. Training data

The unbalanced distribution of images in the German Traffic Sign Benchmark favors some classes over others during the training phase because they are better represented in terms of number of images. In order to make sure that the network learns well, some classes are augmented by applying geometric transformations (rotation, translation, and shear mapping) to many of their images, as shown in Fig. 2.
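As a rough illustration of the class balancing used for data augmentation in section V.A (classes with fewer than 1000 images are topped up with randomly transformed copies), a minimal pure-Python sketch follows. It is not the authors' implementation: the function names `affine_warp`, `random_transform` and `balance_classes`, the parameter ranges, the nearest-neighbour sampling, the single-channel image layout and the choice to sample only from the original images are all assumptions made for the example.

```python
import math
import random


def affine_warp(image, matrix, fill=0):
    """Warp a 2D image (list of rows) with nearest-neighbour sampling.

    `matrix` is a 2x3 inverse map: for every output pixel, expressed relative
    to the image centre, it gives the source coordinates to copy from.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    (a, b, tx), (c, d, ty) = matrix
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            sx = int(round(a * dx + b * dy + tx + cx))
            sy = int(round(c * dx + d * dy + ty + cy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def random_transform(image, rng):
    """Apply one randomly chosen operation: rotation, translation or shear."""
    kind = rng.choice(["rotation", "translation", "shear"])
    if kind == "rotation":
        t = math.radians(rng.uniform(-15, 15))  # assumed angle range
        m = [(math.cos(t), -math.sin(t), 0.0), (math.sin(t), math.cos(t), 0.0)]
    elif kind == "translation":
        m = [(1.0, 0.0, rng.uniform(-3, 3)), (0.0, 1.0, rng.uniform(-3, 3))]
    else:
        m = [(1.0, rng.uniform(-0.2, 0.2), 0.0), (0.0, 1.0, 0.0)]
    return affine_warp(image, m)


def balance_classes(images_by_class, target=1000, seed=0):
    """Top up every class below `target` images with transformed copies."""
    rng = random.Random(seed)
    balanced = {}
    for label, images in images_by_class.items():
        originals = list(images)
        augmented = list(images)
        # Classes already at or above the target are left untouched.
        while originals and len(augmented) < target:
            augmented.append(random_transform(rng.choice(originals), rng))
        balanced[label] = augmented
    return balanced
```

Calling `balance_classes({0: few_images, 1: many_images})` returns a dictionary in which class 0 has been filled up to 1000 images while class 1 is unchanged. In practice an image library would replace `affine_warp`; the loop structure is the part that mirrors the text.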

Fig. 2. Comparative histograms of data augmentation

The algorithm considers only classes with fewer than 1000 images, randomly picks images from them and applies one of the transformation operations (Fig. 3). The resulting images are added to the same class until its number of elements reaches the threshold of 1000 images.

Fig. 3. Geometric transformation for data augmentation

B. The network architecture

In its established architecture, LeNet-5 only takes the features resulting from the second convolution operation, while those of the first convolution might contain elements as important as the ones injected into the fully connected network. Considering this hypothesis, we modified the first layer of the fully connected network by adding the results of the first convolution operation. This layer then becomes a 1576-neuron layer instead of a 400-neuron layer, as shown in Fig. 4.

Fig. 4. Modified LeNet 5 architecture

C. The dropout operation

In neural network training, the dropout method is used to prevent over-fitting by shutting down some neurons during the training phase, which gives the network a flexible margin to react to inputs outside the training examples [11]. In our case, we shut down 30% and 50% of the neurons of the second and third layers of the fully connected network, respectively. In addition, a dropout with a 90% keep rate (shutting down 10% of neurons) is used on the layer preceding the fully connected network.

VI. IMPLEMENTATION RESULTS

To build and train the network, the TensorFlow deep learning library [12] is used. Training and testing were implemented using the dataset described in section III, and the developed method succeeds in classifying the 43 traffic sign classes.

The implementation results of the LeNet-5 network and its improvement operations show the impact of each changed element. As represented in the curves of Fig. 5, the enrichment of the first layer of the LeNet-5 fully connected network made the validation accuracy rise from 94.7% (blue) to 95.1% (green) after 120 iterations of the learning algorithm. The new architecture can now combine many more factors to classify traffic signs. After applying data augmentation (red), an accuracy of 95.3% is observed at the 120th iteration, a further improvement in the network's performance. This is also due to the more balanced distribution of the training data across classes.
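The layer sizes quoted for the enriched first fully connected layer (400 neurons originally, 1576 after adding the first convolution's results) can be checked with a short calculation. The sketch below assumes unpadded 5x5 convolutions with stride 1, 2x2 pooling with stride 2, a 6/16 split of the 22 filters as in the standard LeNet-5, and that the first-stage features are taken after pooling. The paper does not state these details explicitly; they are the assumptions under which the numbers match. The function names are illustrative only.

```python
def conv_out(size, kernel, stride=1):
    """Output side length of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


def lenet5_feature_sizes(input_side=32, kernel=5, pool=2, filters=(6, 16)):
    """Return the flattened size of each pooled convolution stage."""
    sizes = []
    side = input_side
    for n_maps in filters:
        side = conv_out(side, kernel)              # 5x5 convolution
        side = conv_out(side, pool, stride=pool)   # 2x2 max pooling
        sizes.append(n_maps * side * side)
    return sizes


stage1, stage2 = lenet5_feature_sizes()
original_fc_input = stage2            # 16 maps of 5x5
modified_fc_input = stage1 + stage2   # first-stage maps concatenated as well
print(stage1, stage2, modified_fc_input)  # prints: 1176 400 1576
```

Under these assumptions the first stage contributes 6 maps of 14x14 (1176 values), which together with the 400 values of the second stage gives the 1576 inputs reported in the text.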

Fig. 5. Accuracies evolution after different implementations

In addition to the augmentation of the learning data, a dropout on 2 layers of the fully connected network is applied, and the result is a 97.1% validation accuracy (yellow), which corresponds to a 95.2% test accuracy. However, when the dropout is applied to the fully connected network and one of the convolutional layers, the performance decreases from 97.1% to 92.5% validation accuracy (black). This degradation is explained by the fact that applying a dropout on a convolutional layer shuts down some neurons that hold convolution results already judged essential for recognizing an object.

The obtained results show the effectiveness of the developed method, since a validation accuracy of 97.1% is achieved. However, the built network could not classify some examples correctly, as illustrated in Fig. 6. This figure shows, in gray levels, the accuracies of correctly recognized traffic signs and the cases where the network confused classes. It is clear that the proposed network achieves a 100% recognition rate for some traffic sign classes. The matrix also shows that most confusions (in the red circle) concern classes 20 to 31, represented in Fig. 7.

Fig. 6. Confusion matrix

An important confusion has been observed between classes 21, 24 and 27, which is due to their common category (warning signs) and to the linear symbols they contain.

Fig. 7. Examples of confusion between traffic signs (classes 20 to 31)

VII. CONCLUSION

This paper presented a convolutional neural network implementation used for traffic sign recognition. The basic network together with the different improvement operations made us aware of which parts and phases control the system's reliability. It is likely that we would obtain better results by reinforcing the convolution stage of the network with more layers in order to extract more features. It would also be interesting to reduce confusions by comparing the classes with the highest proportions in the confusion matrix and identifying their common factors, in order to counteract them by image adjustment.

REFERENCES

[1] P. Dewan, R. Vig, N. Shukla and B. K. Das, “An Overview of Traffic Signs Recognition Methods,” International Journal of Computer Applications, Vol. 168, N. 11, June 2017.
[2] D. Jianmin and V. Malichenko, “Real time road edges detection and road signs recognition,” IEEE International Conference on Control, Automation and Information Sciences (ICCAIS), Changshu, China, 29-31 Oct. 2015.
[3] Y. Han, K. Virupakshappa, E. Vitor, S. Pinto and E. Oruklu, “Hardware/Software Co-Design of a Traffic Sign Recognition System Using Zynq FPGAs,” Electronics, 2015, Vol. 4, p. 1062-1089; doi:10.3390/electronics4041062.
[4] F. Zaklouta and B. Stanciulescu, “Real-Time Traffic-Sign Recognition Using Tree Classifiers,” IEEE Transactions on Intelligent Transportation Systems, Vol. 13, N. 4, December 2012, p. 1507-1514.
[5] K. Tohidul Islam, R. Gopal Raj and G. Mujtaba, “Recognition of Traffic Sign Based on Bag-of-Words and Artificial Neural Network,” Symmetry, 2017, Vol. 9, 138; doi:10.3390/sym9080138.
[6] S. Maldonado-Bascón, S. Lafuente-Arroyo, P. Gil-Jiménez, H. Gómez-Moreno and F. López-Ferreras, “Road-Sign Detection and Recognition Based on Support Vector Machines,” IEEE Transactions on Intelligent Transportation Systems, Vol. 8, N. 2, June 2007, p. 264-278.

[7] L. Abdi, “Deep learning traffic sign detection, recognition and augmentation,” Proceedings of the Symposium on Applied Computing, Morocco, 2017, p. 131-136.
[8] Y. Moualek, “Deep learning pour la classification des images,” Master’s thesis, Abou Bakr Belkaid University, Tlemcen, 2017.
[9] http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset
[10] Y. Le Cun, L. Bottou, Y. Bengio and P. Haffner, “Gradient-Based learning applied to document recognition,” Proceedings of the IEEE, Vol. 86, N. 11, 1998, p. 2278-2324.
[11] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever and R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” Journal of Machine Learning Research, Vol. 15, 2014, p. 1929-1958.
[12] https://www.tensorflow.org
