Final Report Phase 2

SMART BANANA LEAF DISEASE

CLASSIFIER USING CONVOLUTIONAL NEURAL NETWORK

A report submitted in partial fulfillment of the requirements for


the award of the degree of

Bachelor of Technology

in

Department of Electronics and Communication Engineering

by

Chandru S (EC20B1004)
Dheeban Kumar G (EC20B1007)
Mahesh D (EC20B1026)

DEPARTMENT OF
ELECTRONICS AND COMMUNICATION ENGINEERING
NATIONAL INSTITUTE OF TECHNOLOGY PUDUCHERRY
KARAIKAL – 609 609

APRIL 2024
BONAFIDE CERTIFICATE

This is to certify that the project work entitled “Smart banana leaf disease classifier using
Convolutional Neural Network” is a bonafide record of the work done by

Chandru S (EC20B1004)
Dheeban Kumar G (EC20B1007)
Mahesh D (EC20B1026)

in partial fulfillment of the requirements for the award of the degree of Bachelor of
Technology in Electronics and Communication Engineering of the NATIONAL
INSTITUTE OF TECHNOLOGY PUDUCHERRY during the year 2023 - 2024.

Dr. G. Lakshmi Sutha Dr. Malaya Kumar Nath


Associate Professor Assistant Professor
Project Guide Head of the Department

Project viva-voce held on ___________________

Internal Examiner External Examiner


ABSTRACT
Plant diseases threaten food security by reducing yield and lowering the quality of fruits and
food grains. It is therefore essential to identify and classify plant diseases at an early stage
to avoid catastrophic effects. Deep learning models are a recently adopted technology for disease
identification and classification. This project aims to classify banana leaves into 13 categories
(1 healthy class, 4 disease classes and 8 deficiency classes). Various Convolutional Neural
Network (CNN) architectures, such as AlexNet, VGG-16, VGG-19 and ResNet-50, are used for disease
classification with images collected from the Mendeley dataset and the MUSA dataset. Because the
data is collected from various sources, it is noisy and inconsistent, which makes data analysis
mandatory. To rectify these issues, image pre-processing is done using nine distinct image
enhancement techniques. The performance analysis of these techniques shows that Single Image
Super Resolution (SISR) performs well compared to all the other algorithms. The images are
therefore pre-processed using SISR and applied to the CNN models for banana leaf disease
classification.
The performance comparison of AlexNet and VGG-16 shows that AlexNet performs well in terms of
training time and number of parameters, so further analysis was performed on the AlexNet model.
Four AlexNet models are considered by changing the learning rate (0.1, 0.01, 0.001 and 0.0001),
the optimizer (Stochastic Gradient Descent with Momentum (SGDM) and Adaptive Moment Estimation
(ADAM)) and the number of epochs (5, 10 and 15). The CNN models are tested on the Mendeley dataset
and the MUSA dataset. The AlexNet model with a learning rate of 0.0001 and 10 epochs using the
ADAM optimizer is identified as the suitable model and is named Model 4. Model 4 is further
optimized using transfer learning, structural remodeling and ensemble methods to improve the
accuracy. The ensemble model gives an accuracy of 96.15%. The DL models are stored in Dropbox,
and an access token is generated for Dropbox App access. The MATLAB Analysis tab in ThingSpeak is
used to download the trained (.mat) files from Dropbox. The input image is also stored in Dropbox
and is downloaded into the ThingSpeak cloud using the MATLAB Analysis tab. The input image is
resized accordingly in ThingSpeak and given as input to the DL model. If the input image is not a
leaf image, the classifier requests a leaf image by printing a message. If a leaf image is given
as input, the classifier classifies the image and displays the class name (one of the 13 classes)
as output.

Keywords: Convolutional Neural Networks, Deep learning, Disease classification, Banana leaf.
ACKNOWLEDGEMENTS
We would like to express our deep sense of gratitude to the Lord Almighty for giving
us an opportunity to do this project and for showering His blessings during the course of the
project. We would like to thank the Director In-charge, Dr. Usha Natesan, for permitting us to
undertake this project work. We would like to extend our thanks to the Registrar, Dr.
Sundaravardhan S, for permitting us to undertake this project work. We extend our sincere
thanks to the Dean (Academic), Dr. G Lakshmi Sutha, for permitting us to undertake this project
work.
We express our sincerest thanks to our project coordinator Dr. Suresh Balanethiram,
Assistant Professor, Department of ECE, for his motivation in various reviews. We would like
to convey our thanks to Dr. Malaya Kumar Nath, Head of the Department of ECE, for his
constant support. We want to genuinely convey our thanks to Dr. G Lakshmi Sutha, Associate
Professor and project supervisor, Department of ECE, for her valuable inputs, able guidance,
encouragement, whole-hearted cooperation and constructive criticism throughout the duration
of our project.
Last but not least, we thank our families and friends, whose support and suggestions
helped us mold this project.
TABLE OF CONTENTS
CHAPTER NO    TITLE
ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS AND ABBREVIATIONS
1 INTRODUCTION
  1.1 General Introduction
  1.2 Motivation
  1.3 Literature Review
  1.4 Objectives
  1.5 Dataset
2 IMAGE PRE-PROCESSING TECHNIQUES
  2.1 Image Enhancement Techniques
    2.1.1 Image Enhancement Algorithms
      2.1.1.1 Histogram Equalization
      2.1.1.2 Bi-Histogram Equalization
      2.1.1.3 Dynamic Stretching based Brightness Preservation
      2.1.1.4 Contrast Stretching
      2.1.1.5 Gaussian Filter for Noise Reduction
      2.1.1.6 Contrast Limited Adaptive Histogram Equalization
      2.1.1.7 Dualistic Sub-Image Histogram Equalization
      2.1.1.8 Single Image Super Resolution
      2.1.1.9 Adaptive Gamma Correction with Weighting Distribution
    2.1.2 Performance Metrics
      2.1.2.1 Absolute Mean Brightness Error
      2.1.2.2 Structural Similarity Index
      2.1.2.3 Peak Signal to Noise Ratio
      2.1.2.4 Discrete Entropy
      2.1.2.5 Enhancement Measure
      2.1.2.6 Lightness Order Error
    2.1.3 Qualitative and Quantitative Analyses of Image Enhancement Algorithms
  2.2 Data Augmentation
3 DEEP LEARNING TECHNIQUES
  3.1 Methodology
  3.2 Performance Metrics
  3.3 Results of Deep Learning Techniques
4 CLASSIFIER MODEL
  4.1 Proposed Model
  4.2 Dataset
    4.2.1 Methods to Overcome Overfitting
  4.3 Discussion on Plant Parts Classifier
    4.3.1 Dataset
5 ACCURACY IMPROVEMENT
  5.1 Optimization Techniques
  5.2 Transfer Learning
  5.3 Structural Modification
  5.4 Voting Technique
6 CLOUD OPERATION
  6.1 Cloud Configuration
    6.1.1 ThingSpeak
    6.1.2 Dropbox
    6.1.3 ThingSpeak and Dropbox configuration
7 CONCLUSION
  7.1 Conclusion
  7.2 Future Scope
PUBLICATION
REFERENCES
LIST OF TABLES
Table No    Title
1    Banana Leaf Disease Classification Models using Neural Networks
2    Literature review on Image Processing Techniques
3    Qualitative Analysis of Image Processing Techniques
4    Quantitative Analysis of Image Processing Techniques
5    Augmented images
6    Augmented Datasets
7    Mendeley dataset analysis
8    MUSA dataset analysis
9    Combined dataset analysis
10   Disease-deficiency Dataset Description
11   Performance analysis on disease-deficiency dataset
12   Dataset Description
13   Analysis on Dataset (LeNet)
14   Improvement in accuracy after Ensemble Techniques
LIST OF FIGURES
Figure No    Title
1    Analysis on AlexNet Architecture
2    Analysis on VGG-16 Architecture
3    Analysis on AlexNet Architecture with SGDM
4    Analysis on AlexNet Architecture with ADAM
5    Analysis on Mendeley dataset
6    Analysis on MUSA dataset
7    Analysis on Combined dataset
8    Proposed Model
9    Analysis on Disease-deficiency Dataset
10   Confusion matrix of Model 1
11   Confusion matrix of Model 3
12   Confusion matrix of Model 4
13   Confusion matrix of LeNet
14   Architecture of modified AlexNet
15   ThingSpeak Platform
16   MATLAB Analysis Window
17   Classification Output of DL model
LIST OF SYMBOLS
σ - Standard deviation
π - pi
∑ - Summation

LIST OF ABBREVIATIONS
VGG16 - Visual Geometry Group 16
ResNet - Residual Network
GLCM - Gray Level Co-occurrence Matrix
DCNN - Deep Convoluted Neural Network
FSVM - Fuzzy Support Vector Machine
TGVFCMS - Total Generalized Variation Fuzzy C Means
ANN - Artificial Neural Network
NGTDM - Neighborhood Gray-Tone Difference Matrix
SGDM - Stochastic Gradient Descent with Momentum
ADAM - Adaptive Moment Estimation
CNN - Convolutional Neural Network
SVM - Support Vector Machine
KNN - K-Nearest Neighbors
BLR - Binary Logistic Regression
HE - Histogram Equalization
BBHE - Brightness-preserving Bi-Histogram Equalization
DSBP - Dynamic Stretching-Based Brightness Preservation
CLAHE - Contrast Limited Adaptive Histogram Equalization
DSIHE - Dualistic Sub-Image Histogram Equalization
SISR - Single Image Super Resolution
AGCWD - Adaptive Gamma Correction with Weighting Distribution
AMBE - Absolute Mean Brightness Error
SSIM - Structural Similarity Index
PSNR - Peak Signal to Noise Ratio
DE - Discrete Entropy
EME - Enhancement Measure
LOE - Lightness Order Error
MATLAB - Matrix Laboratory
DBX - Dropbox
CHAPTER 1
INTRODUCTION
1.1 GENERAL INTRODUCTION
Agriculture is still one of the largest industries in the world and employs around
27% of the world population. The yield of a farm is the main source of income for such
a large population. Bananas are grown in 130 countries, primarily in tropical and
subtropical regions. Their origin can be traced back to South-East Asia. Bananas are a
highly sought-after staple food, accounting for nearly 16% of global fruit production
and ranking as the second most produced fruit after citrus. In terms of world trade,
bananas are the fifth most important food crop. Bananas can be consumed raw or processed
and contain various bio-active molecules such as phenolics, carotenoids, biogenic amines,
and phytosterols, which are beneficial for human health, and they are a great source of
energy. They also have high levels of antioxidants. Historically, bananas have been
used to treat various chronic degenerative disorders. Banana is a major crop in most of
the developing nations in the world [2]. India is the largest producer of bananas,
accounting for 27% of global production. Global banana production totals 128,778,738 tons
from an area of 5,517,027 hectares. Bangladesh produces 833,309 tons of bananas from
48,850 hectares of land.

Recent problems such as climate change and attacks from pests and pathogens
affect plant growth to a very large extent and result in lower yields. The banana plant
is susceptible to various diseases that require early detection to be cured, and many
diseases frequently infect banana crops. Commonly reported leaf spot diseases
include exserohilum leaf spot (Exserohilum rostratum), cordana leaf spot
(Cordana musae), plantain zonate leaf spot (Pestalotiopsis menezesiana),
banana freckle disease (Phyllosticta musarum) [1], black sigatoka
(Mycosphaerella fijiensis) [3], eumusae leaf spot (Mycosphaerella eumusae) [4],
and yellow sigatoka (Pseudocercospora musicola) [7].

Plant diseases largely limit banana production. To curb a disease's
progression, it is crucial to assess its severity. Traditionally, plant
pathologists estimate plant disease severity by visually inspecting the disease
symptoms. Unfortunately, this technique is ineffective and very expensive if the area
of cultivation is large. With the advent of digital cameras and computer technology,
agriculturists are increasingly using automated disease diagnosis models. In recent
times, plant disease severity has been diagnosed by automatic image analysis based on
deep learning. Artificial Intelligence (AI) and remote sensing technology have also
been used to detect various crop diseases. Some countries have implemented deep
learning techniques such as LeNet, VGG16, ResNet18, ResNet50, ResNet152, and
InceptionV3 to classify banana leaf diseases [1]. It is essential that the images used
for classification are of the same quality; hence, image pre-processing is one of
the basic steps in image classification.

1.2 MOTIVATION
The motivation behind the project is to develop an advanced and efficient
system that uses deep learning and computer vision technologies to identify diseases in
banana leaves. By automating disease detection, farmers can quickly and accurately
assess the health of their crops, enabling timely interventions to prevent further spread
and minimize crop losses. This project aims to empower farmers, increase crop yields,
and contribute to a greener and healthier farming ecosystem.

1.3 LITERATURE REVIEW / STATE OF THE ART


Table 1 summarizes the literature review on the banana leaf disease
classification models using various deep learning algorithms.

Table 1. Banana Leaf Disease Classification Models using Neural Networks

Ref. No. | Paper Title | Publisher and Year | Dataset Used | Techniques | Performance Metrics
[3] | Banana leaf disease detection using GLCM based Feature Extraction and classification using Deep Convoluted Neural Networks (DCNN), Journal of Positive School Psychology | 2022 | Real-time | DCNN along with feature extraction using GLCM | Accuracy=90%; Precision=86%; Recall=81%; F1 score=70%
[9] | CRUN-based leaf disease segmentation and morphological-based stage identification, Mathematical Problems in Engineering | Hindawi, 2022 | Real-time and public datasets | Image processing and CRUN | Accuracy=99%; Sensitivity=99.2%; Specificity=99.3%
[10] | Banana plant disease classification using Hybrid Convolutional Neural Network, Computational Intelligence and Neuroscience | Hindawi, 2022 | Real time (3500 images) | Feature extraction using CNN and classification using FSVM (binary SVM + multiclass SVM) | Accuracy=97.70%; Precision=0.97; Recall=0.94; F1 score=0.95
[4] | IoT based banana leaf disease identification system, International Research Journal of Modernization in Engineering Technology and Science | IRJMETS, 2022 | Real-time | Temperature, humidity, and color sensors used to collect various data with Arduino UNO | Accuracy=92.66%
[11] | An automated segmentation and classification model for banana leaf disease detection, Journal of Applied Biology & Biotechnology | CrossMark, 2022 | CIAT image library | Segmentation using TGVFCMS and classification using the CNN technique | Accuracy=93.45%; Sensitivity=89.04%; Specificity=96.38%
[1] | BananaSqueezeNet: A very fast, lightweight convolutional neural network for the diagnosis of three prominent banana leaf diseases, Smart Agricultural Technology | Elsevier, 2022 | Bangabandhu Sheikh Mujibur Rahman Agricultural University (BSMRAU) | A lightweight CNN architecture named BananaSqueezeNet | Performance metrics
[12] | Detection of banana leaf and fruit diseases using Neural Networks, Second International Conference on Inventive Research in Computing Applications (ICIRCA) | IEEE, 2020 | Real-time | ANN and segmentation using Fuzzy c-means | Accuracy=96.25%; Precision=96.54%; Recall=96.25%; F1 score=96.17%
[5] | Development of a digital image classification system to support technical assistance for Black Sigatoka detection, Revista Brasileira de Fruticultura | SciELO Brasil, 2020 | Public dataset | Support vector machines (SVM), Bayes classifiers, decision trees, K-means, K-nearest neighbors (KNN) | Mean=0.324; Standard Deviation=0.026; Variance=0.00005; Skewness=-0.6659; Kurtosis=4.948
[8] | Dataset of banana leaves and stem images for object detection, classification and segmentation: A case of Tanzania, Data in Brief | Elsevier, 2023 | Harvard Dataverse | NIL | Perception=80%
[13] | Recognition of banana fusarium wilt based on UAV remote sensing, Remote Sensing | MDPI, 2020 | Real-time | Binary logistic regression (BLR) | NIL
[14] | Disease detection in banana leaf plants using DenseNet and Inception Method, Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) | IAII, 2022 | Real-time | Oversampling and under-sampling | OA=91.7%; Kappa=0.83
[15] | Banana leaf disease detection with Multi Feature Extraction Techniques using SVM, School of Computing | National College of Ireland, 2022 | Real-time (1289 images) | Feature Extraction, GLCM and NGTDM | Accuracy=84.73%; Recall=84.73%; Precision=84.80%; F1 score=84.62%
Table 2 summarizes the literature review on the various image processing
techniques which are used to enhance the images.

Table 2. Literature review on Image Processing Techniques

Ref. No. | Paper Title | Journal Name | Year | Dataset Used | Performance Metrics
[16] | A comprehensive survey on Image contrast enhancement techniques in Spatial Domain | Sensing and Imaging | 2020 | USC-SIPI | AMBE, CII, SD, Contrast, SSIM, Entropy, DEN, PSNR and GMSD
[22] | Soft-Edge assisted network for Single Image Super-Resolution | IEEE Transactions on Image Processing | 2020 | DIV2K, Set5, Set14, BSDS100, Urban100 and Manga109 | PSNR, SSIM
[19] | Improving retinal image quality using the contrast stretching, histogram equalization, and CLAHE methods with median filters | International Journal of Image, Graphics and Signal Processing | 2020 | STARE | MSE, PSNR, and SSIM
[18] | Image contrast enhancement for Brightness Preservation Based on Dynamic Stretching | International Journal of Image Processing (IJIP) | 2015 | Real Time Image | Mean and Median
[23] | Contrast enhancement using improved adaptive gamma correction with weighting distribution technique | International Journal of Computer Applications | 2014 | Real Time (10 Images) | PSNR (Peak Signal to Noise Ratio), MSE (Mean Square Error) and AMBE (Absolute Mean Brightness Error)
[20] | Realization of the Contrast Limited Adaptive Histogram Equalization (CLAHE) for Real-Time Image Enhancement | Journal of VLSI signal processing systems for signal, image and video technology | 2004 | Real Time Image | -
[21] | Image enhancement based on equal area Dualistic Sub-image Histogram Equalization Method | IEEE Transactions on Consumer Electronics | 1999 | Real Time Image | Entropy
[17] | Contrast enhancement using Brightness Preserving Bi-Histogram Equalization | IEEE Transactions on Consumer Electronics | 1997 | Real Time Image | Mean

In this report, suitable image enhancement techniques are identified from
Table 2 and used to enhance the quality of the banana leaf images in order to improve
the accuracy of the deep learning models. A comprehensive review of ensemble
techniques for DL models is given in [27]. In this report, parallel and sequential
ensemble techniques for DL are used.

1.4 OBJECTIVES
The main objective of this work is to perform banana leaf disease classification
for 13 different classes. To fulfil this objective, the work is split as follows:
1. To perform image enhancement to reduce noise and increase the quality of the
image.
2. To augment the dataset to avoid the class imbalance problem.
3. To identify a suitable DL model that provides better accuracy by using various
techniques such as hyperparameter tuning, structural remodeling, transfer learning
and ensemble methods.
4. To send the input image and the classifier models to the cloud (ThingSpeak
platform), implement the entire classification process in the cloud and display the
output.
1.5 DATASET
The efficiency of a deep learning model depends on how the model is trained and
tested. To train and test the deep learning model effectively, suitable datasets have
to be chosen. Thus, two different datasets are identified: the Mendeley dataset
(Healthy, Xanthomonas, Yellow Sigatoka) [6] and the MUSA dataset (Healthy, Potassium
Deficiency, Yellow Sigatoka, Black Sigatoka, Banana Aphids) [8]. The combined dataset
is obtained by combining these two datasets. As a result, the combined dataset has
images of 6 classes, namely Healthy, Xanthomonas, Potassium deficiency, Banana aphids,
Black Sigatoka and Yellow Sigatoka. Further, for early identification of diseases, an
enhanced dataset called the Disease-deficiency dataset has been formed by adding 7
mineral deficiency classes to the existing combined dataset, giving 13 classes in
total (1 healthy, 4 disease classes, 8 deficiency classes).
CHAPTER 2
IMAGE PRE-PROCESSING TECHNIQUES

2.1 IMAGE ENHANCEMENT TECHNIQUES


A total of nine image enhancement techniques have been compared to find the
most suitable one. The techniques used are:
• Histogram Equalization (HE)
• Brightness-preserving Bi-Histogram Equalization (BBHE)
• Dynamic Stretching-Based Brightness Preservation (DSBP)
• Contrast Stretching (CS)
• Gaussian Filter for Noise Reduction
• Contrast Limited Adaptive Histogram Equalization (CLAHE)
• Dualistic Sub-Image Histogram Equalization (DSIHE)
• Single Image Super Resolution (SISR)
• Adaptive Gamma Correction with Weighting Distribution (AGCWD)

2.1.1 IMAGE ENHANCEMENT ALGORITHMS


2.1.1.1 Histogram Equalization
The histogram of an image represents the distribution of the intensity values of its
pixels. It gives an insight into how the pixel values of an image can be altered so
that the result looks natural. Histogram equalization (HE) is one of the popular
spatial domain techniques due to its easy implementation and performance, which makes
HE suitable for real-world applications [16]. The output image of histogram
equalization, Y = {Y(i, j)}, can be expressed as [17]

$$Y = f(X) = \{\, f(X(i,j)) \mid \forall X(i,j) \in X \,\}$$

where X = {X(i, j)} denotes a given image composed of L discrete gray levels denoted
as {X_0, X_1, ..., X_{L-1}}, and X(i, j) represents the intensity of the image at the
spatial location (i, j), with X(i, j) ∈ {X_0, X_1, ..., X_{L-1}}.
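
A minimal sketch of the mapping f described above, written in plain Python with the image as a list of rows so that it runs without extra libraries. The report does not spell out f; the sketch assumes the usual cumulative-distribution mapping f(x) = (L - 1) · c(x), and the function name `histogram_equalize` is hypothetical.

```python
def histogram_equalize(image, levels=256):
    """Equalize a grayscale image given as rows of integers in [0, levels - 1]."""
    pixels = [p for row in image for p in row]
    n = len(pixels)

    # Histogram: number of pixels at each gray level.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution c(x), then the assumed mapping f(x) = (L - 1) * c(x).
    mapping = []
    running = 0
    for count in hist:
        running += count
        mapping.append(round((levels - 1) * running / n))

    return [[mapping[p] for p in row] for row in image]


low_contrast = [[52, 52, 60], [60, 70, 70], [70, 70, 80]]
equalized = histogram_equalize(low_contrast)
```

The narrow input range 52 to 80 is spread over most of the 0 to 255 range, and the ordering of the gray levels is preserved.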

2.1.1.2 Brightness-preserving Bi-Histogram Equalization
BBHE divides the input histogram into two sub-histograms at the mean
intensity of the image and equalizes the sub-histograms independently to enhance the
image. It retains the mean brightness while reducing the saturation effect, avoiding
abnormal enhancement and undesirable artifacts.
The generalized algorithm for the bi-histogram based methods is given below.
X_m represents the intensity value which separates the histogram into two parts.

$$X = X_L \cup X_U$$

$$X_L = \{\, X(i,j) \mid X(i,j) \le X_m,\ \forall X(i,j) \in X \,\} \tag{1}$$

and

$$X_U = \{\, X(i,j) \mid X(i,j) > X_m,\ \forall X(i,j) \in X \,\} \tag{2}$$

The probability density functions of X_L and X_U are defined as

$$P_L(X_k) = \frac{n_k^L}{n_L}, \quad k = 0, 1, \ldots, m$$

$$P_U(X_l) = \frac{n_l^U}{n_U}, \quad l = m+1, m+2, \ldots, L-1$$

where n_L and n_U represent the total number of pixels in the lower (X_L) and the
upper (X_U) sub-histograms, and n_k^L and n_l^U denote the number of pixels having the
intensity values X_k and X_l in X_L and X_U respectively. The respective cumulative
distribution functions are defined in [17] as

$$C_L(X_k) = \sum_{j=0}^{k} P_L(X_j) \tag{3}$$

$$C_U(X_l) = \sum_{j=m+1}^{l} P_U(X_j) \tag{4}$$
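
The split-and-equalize procedure can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: the threshold X_m is taken as the integer part of the mean, each sub-histogram is mapped onto its own side of the threshold ([0, X_m] and [X_m + 1, L - 1]) using its cumulative distribution, and the helper names are hypothetical.

```python
def equalize_range(pixels, lo, hi):
    """Map the gray levels found in `pixels` onto [lo, hi] using their own CDF."""
    hist = {}
    for p in pixels:
        hist[p] = hist.get(p, 0) + 1
    mapping = {}
    running = 0
    for value in sorted(hist):
        running += hist[value]
        mapping[value] = round(lo + (hi - lo) * running / len(pixels))
    return mapping


def bbhe(image, levels=256):
    """Bi-histogram equalization of a grayscale image (rows of integers)."""
    pixels = [p for row in image for p in row]
    x_m = sum(pixels) // len(pixels)            # assumed threshold: integer mean

    lower = [p for p in pixels if p <= x_m]     # X_L, equation (1)
    upper = [p for p in pixels if p > x_m]      # X_U, equation (2)

    mapping = equalize_range(lower, 0, x_m)
    if upper:                                   # empty when the image is constant
        mapping.update(equalize_range(upper, x_m + 1, levels - 1))
    return [[mapping[p] for p in row] for row in image]


enhanced = bbhe([[10, 20], [200, 250]])
```

Because each half is stretched only within its own side of X_m, dark pixels stay at or below the mean and bright pixels stay above it, which is how the method limits the shift in mean brightness.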

2.1.1.3 Dynamic Stretching-Based Brightness Preservation


DSBP (Dynamic Stretching-Based Brightness Preservation) is an image
enhancement technique that improves image contrast while maintaining the original
image's mean brightness. It achieves this by separating the image into high and low-
intensity regions, stretching their intensity ranges, and iteratively adjusting the
separation threshold to minimize brightness changes, resulting in visually enhanced
images [18].

2.1.1.4 Contrast Stretching


Contrast stretching expands the intensity values contained in the image over a
wider dynamic range by applying linear scaling to the pixel values. To normalize
(contrast stretch) an image, it is necessary to determine the minimum and maximum
values of the image. These minimum and maximum values determine the boundaries of the
stretch. For an image with an 8-bit gray level, the output range has 0 as the lower
limit and 255 as the upper limit. In [19], the digital images are taken using a fundus
camera. From an overall perspective, digital image processing techniques are divided
into 3 processing levels, namely:
1. Low-level processing: basic operations in image processing such as noise
reduction, image improvement, and image restoration.
2. Mid-level processing: object description, image segmentation, and classification
of individual objects.
3. High-level processing: analysis of an image.

$$g(x, y) = \frac{f(x, y) - \min}{\max - \min} \times 255 \tag{5}$$

where,
g (x, y) = matrix of the resulting image
f (x, y) = original image matrix value

Here g (x, y) represents the output and f (x, y) represents the input, with image
intensity values ranging from 0 as the lowest value to 255 as the highest value [17].
The minimum and maximum values are determined using stretchlim. The new image value
g (x, y) is obtained by subtracting the minimum value from the original value
f (x, y) and dividing by the difference between the maximum and minimum values. The
result is multiplied by 255 to give the pixel value [19].
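
Equation (5) can be illustrated with a short plain-Python sketch. It takes the plain minimum and maximum of the image as the limits, which is a simplification: MATLAB's stretchlim, mentioned above, can pick the limits from histogram percentiles instead. The function name `contrast_stretch` is hypothetical.

```python
def contrast_stretch(image, out_max=255):
    """Apply g = (f - min) / (max - min) * out_max to a grayscale image."""
    pixels = [p for row in image for p in row]
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A constant image has no contrast to stretch; return it unchanged.
        return [row[:] for row in image]
    return [[round((p - lo) * out_max / (hi - lo)) for p in row] for row in image]


stretched = contrast_stretch([[50, 100], [150, 200]])
```

The darkest input pixel maps to 0 and the brightest to 255, with the values in between scaled linearly.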

2.1.1.5 Gaussian Filter for Noise Reduction


It is a type of linear filter that works by applying a weighted average to the
pixels in the image, with the weights determined by a Gaussian distribution. This filter
is effective at reducing high-frequency noise while preserving the edges and details in
the image.
The Gaussian function h_g(x, y) is given as

$$h_g(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{6}$$

$$U(x, y) = \frac{h_g(x, y)}{\sum_x \sum_y h_g} \tag{7}$$

U(x, y) represents the unsharp mask obtained from the Gaussian kernel. The sharpened
image F(x, y) is obtained from the input image I(x, y) by

$$F(x, y) = \frac{c}{2c - 1}\, I(x, y) - \frac{1 - c}{2c - 1}\, U(x, y) \tag{8}$$

where c lies between 0.5 and 1, and σ is the standard deviation calculated from the
input image [16].
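
A small sketch of equations (6) to (8) in plain Python. It rests on an interpretation that should be read as an assumption: the U(x, y) used in equation (8) is taken to be the input image smoothed with the normalized kernel of equation (7). Border pixels are handled by replication, and all function names are hypothetical.

```python
import math


def gaussian_kernel(size, sigma):
    """Normalized Gaussian kernel of odd width `size`, equations (6) and (7)."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def smooth(image, kernel):
    """Weighted average of each pixel's neighbourhood (borders replicated)."""
    h, w, half = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    out[i][j] += kernel[di + half][dj + half] * image[ii][jj]
    return out


def sharpen(image, smoothed, c=0.7):
    """Equation (8); c must be greater than 0.5 to avoid division by zero."""
    a, b = c / (2 * c - 1), (1 - c) / (2 * c - 1)
    return [[a * p - b * u for p, u in zip(row_i, row_u)]
            for row_i, row_u in zip(image, smoothed)]


kernel = gaussian_kernel(3, 1.0)
flat = [[100] * 4 for _ in range(3)]
result = sharpen(flat, smooth(flat, kernel))
```

The two coefficients in equation (8) differ by exactly 1, so a flat region passes through unchanged and only local deviations from the smoothed image are amplified.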

2.1.1.6 Contrast Limited Adaptive Histogram Equalization


Contrast Limited Adaptive Histogram Equalization (CLAHE) is a modified form
of adaptive histogram equalization. In this method, the enhancement function is applied
over all neighborhood pixels and the transformation function is derived [20].

2.1.1.7 Dualistic Sub-Image Histogram Equalization


Equal Area Dualistic Sub-Image Histogram Equalization (DSIHE) follows the
same idea as BBHE, which decomposes the original image histogram into two
sub-histograms based on the mean value. The DSIHE method instead decomposes the image
at the gray level where the cumulative probability density equals 0.5, so that the two
sub-images have equal area [21].

2.1.1.8 Single Image Super Resolution


Single image super-resolution (SISR) is an active research topic in the field of
computer vision. It aims to reconstruct a super-resolution (SR) image from a single
low-resolution (LR) one. It has been widely used in computer vision tasks such as
medical image enhancement, video super-resolution, and face hallucination. Meanwhile,
the quality of reconstructed images significantly affects the accuracy of high-level
tasks, such as image classification, object detection, and image segmentation.
Although SISR has a wide range of applications, it is still considered a highly
ill-posed problem due to information loss [22].

2.1.1.9 Adaptive Gamma Correction with Weighting Distribution


Contrast enhancement is used to enhance images for viewing or for further
analysis. The main idea behind contrast enhancement techniques is to increase the
contrast while preserving the original brightness of the image. In [23], a contrast
enhancement technique is proposed that first segments the histogram of the image
recursively and then applies the Adaptive Gamma Correction with Weighting
Distribution (AGCWD) technique. That technique is an improvement over AGCWD and aims
to achieve better contrast enhancement and brightness preservation than AGCWD [23].
According to the AGCWD method, adaptive gamma correction is formulated as:

$$T(l) = l_{max}\left(\frac{l}{l_{max}}\right)^{\gamma} = l_{max}\left(\frac{l}{l_{max}}\right)^{1 - CDF(l)} \tag{9}$$

The weighting distribution function is applied as:

$$PDF_w(l) = PDF_{max}\left(\frac{PDF(l) - PDF_{min}}{PDF_{max} - PDF_{min}}\right)^{\alpha} \tag{10}$$

where α is the adjusted parameter, PDF_max is the maximum PDF of the statistical
histogram, and PDF_min is the minimum PDF. The modified CDF, CDF_w(l), is the
cumulative sum of PDF_w up to level l, normalized by

$$\sum PDF_w = \sum_{l=0}^{l_{max}} PDF_w(l) \tag{11}$$

and gamma is calculated as γ = 1 − CDF_w(l).
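
Equations (9) to (11) can be put together in a short sketch. This is an illustration of plain AGCWD under stated assumptions, not the recursive variant of [23]: the image is a grayscale list of rows, α = 0.5 is an arbitrary choice, level 0 is kept at 0, and the function name `agcwd` is hypothetical.

```python
def agcwd(image, alpha=0.5, levels=256):
    """Adaptive gamma correction with weighting distribution (grayscale sketch)."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    pdf = [count / len(pixels) for count in hist]

    pdf_max, pdf_min = max(pdf), min(pdf)
    if pdf_max == pdf_min:
        return [row[:] for row in image]        # flat histogram: nothing to weight

    # Equation (10): weighted PDF.
    pdf_w = [pdf_max * ((p - pdf_min) / (pdf_max - pdf_min)) ** alpha for p in pdf]
    total = sum(pdf_w)                          # equation (11): normalizing sum

    l_max = levels - 1
    mapping = [0]                               # level 0 is kept at 0
    running = pdf_w[0]
    for l in range(1, levels):
        running += pdf_w[l]
        gamma = 1.0 - running / total           # gamma = 1 - CDF_w(l)
        mapping.append(round(l_max * (l / l_max) ** gamma))   # equation (9)
    return [[mapping[p] for p in row] for row in image]


dark = [[10, 20], [30, 40]]
brightened = agcwd(dark)
```

Since γ never exceeds 1, every gray level is mapped to a value at least as bright as the input, which is why the method lifts dim images.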

2.1.2 PERFORMANCE METRICS


A total of 6 performance metrics are used to measure the performance of the
image enhancement techniques:
• Absolute Mean Brightness Error (AMBE)
• Structural Similarity Index (SSIM)
• Peak Signal to Noise Ratio (PSNR)
• Discrete Entropy (DE)
• Enhancement Measure (EME)
• Lightness Order Error (LOE)

2.1.2.1 Absolute mean brightness error (AMBE)


It is a measure of how well the original image brightness is preserved. It is defined as

$$AMBE = |M(I) - M(J)| \tag{12}$$

where M(I) and M(J) represent the mean values of the low contrast (I) and enhanced
(J) images respectively. A lower value of AMBE indicates better preservation of the
original image brightness [24].
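
Equation (12) amounts to two means and a subtraction. A minimal sketch, assuming grayscale images stored as lists of rows (the function names are hypothetical):

```python
def mean_brightness(image):
    """Mean gray level M(.) of an image given as rows of numbers."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def ambe(original, enhanced):
    """Absolute mean brightness error, equation (12)."""
    return abs(mean_brightness(original) - mean_brightness(enhanced))


error = ambe([[10, 20], [30, 40]], [[20, 30], [40, 50]])
```

Here every pixel was brightened by 10 gray levels, so the mean brightness shifts by the same amount.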

2.1.2.2 Structural Similarity Index (SSIM)


SSIM is a perceptual quality measurement of a processed image with respect to a
reference image. It varies from 0 to 1, where '1' indicates that the structural
information of the image is preserved and '0' indicates that structural information is
lost during enhancement. It is calculated from the statistical parameters of the input
and enhanced images. It is defined as [8]

$$SSIM(I, J) = \frac{(2\mu_I \mu_J + c_1)(2\sigma_{I,J} + c_2)}{(\mu_I^2 + \mu_J^2 + c_1)(\sigma_I^2 + \sigma_J^2 + c_2)} \tag{13}$$
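
A sketch of equation (13) computed from whole-image statistics. Two points are assumptions rather than something the report states: the constants follow the common convention c1 = (0.01 (L - 1))^2 and c2 = (0.03 (L - 1))^2, and the statistics are taken over the full image, whereas practical SSIM implementations usually average the index over local windows. The function name `ssim_global` is hypothetical.

```python
def ssim_global(img_i, img_j, levels=256, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized grayscale images."""
    x = [p for row in img_i for p in row]
    y = [p for row in img_j for p in row]
    n = len(x)

    mu_i, mu_j = sum(x) / n, sum(y) / n
    var_i = sum((a - mu_i) ** 2 for a in x) / n
    var_j = sum((b - mu_j) ** 2 for b in y) / n
    cov = sum((a - mu_i) * (b - mu_j) for a, b in zip(x, y)) / n

    c1 = (k1 * (levels - 1)) ** 2   # assumed stabilizing constants
    c2 = (k2 * (levels - 1)) ** 2

    return ((2 * mu_i * mu_j + c1) * (2 * cov + c2)) / (
        (mu_i ** 2 + mu_j ** 2 + c1) * (var_i + var_j + c2))


reference = [[10, 20], [30, 40]]
same = ssim_global(reference, reference)
flipped = ssim_global(reference, [[40, 30], [20, 10]])
```

An image compared with itself scores 1, while reversing the intensity order makes the covariance negative and pulls the score well below 1.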

2.1.2.3 Peak Signal to Noise Ratio (PSNR)


PSNR is used for evaluating image quality. It indicates the level of degradation of
the enhanced image when compared to the input image:

$$PSNR = 10 \log_{10}\left[\frac{(L - 1)^2}{\frac{1}{n}\sum_x \sum_y |I(x, y) - J(x, y)|^2}\right] \tag{14}$$

where (L − 1) indicates the maximum intensity of the image and the total number of
pixels in the image is represented by 'n' [16].
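
Equation (14) can be sketched directly from the mean squared error. The handling of identical images, which would otherwise divide by zero, is an assumption of this sketch: it returns infinity. The function name `psnr` is hypothetical.

```python
import math


def psnr(original, enhanced, levels=256):
    """Peak signal to noise ratio in dB between two equally sized images."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(original, enhanced)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf   # identical images: no degradation to measure
    return 10 * math.log10((levels - 1) ** 2 / mse)


worst_case = psnr([[0, 0], [0, 0]], [[255, 255], [255, 255]])
```

When every pixel differs by the full intensity range, the error equals the peak signal and the ratio is 0 dB; smaller errors give larger PSNR values.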

2.1.2.4 Discrete Entropy (DE)


Entropy is a measure of the uncertainty of a random variable: the more random
the variable, the higher its entropy. In image processing, low entropy means that the
image has low contrast. In [18], the discrete entropy is computed for each algorithm,
and in most cases Exact Histogram Specification (EHS) holds the largest entropy, so
EHS performs better than the other algorithms when importance is given to the
contrast of the enhanced image [18].

$$H(X) = -\sum_x P(x) \log_2 P(x) \tag{15}$$

H(X) is the discrete entropy of the random variable X.
Σ represents the summation over all possible values of x that X can take.
P(x) is the probability of X taking the value x.
log2(P(x)) is the base-2 logarithm of the probability P(x).
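
Equation (15) applied to an image histogram, as a minimal sketch. P(x) is estimated as the fraction of pixels at gray level x, and levels that do not occur are skipped because they contribute nothing to the sum. The function name `discrete_entropy` is hypothetical.

```python
import math


def discrete_entropy(image):
    """Entropy in bits of the gray-level distribution of an image."""
    pixels = [p for row in image for p in row]
    counts = {}
    for p in pixels:
        counts[p] = counts.get(p, 0) + 1
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


flat_entropy = discrete_entropy([[7, 7], [7, 7]])
varied_entropy = discrete_entropy([[0, 85], [170, 255]])
```

A constant image carries no information (0 bits), while four equally likely gray levels give 2 bits, matching the remark that low entropy goes with low contrast.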

2.1.2.5 Enhancement Measure (EME)


The Measure of Enhancement (EME) approximates an average contrast in the
image by dividing the image into non-overlapping blocks and finding a measure based
on the minimum and maximum intensity values in each block. The EME is calculated over
the n blocks as [25]

$$EME = \frac{1}{n} \sum 20 \log\left(\frac{I_{max}}{I_{min}}\right) \tag{16}$$
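
A block-wise sketch of equation (16). Several details are assumptions made for the illustration: the logarithm is taken as base 10, a small offset `eps` is added to both intensities so that a block containing 0 does not cause a division by zero, and incomplete blocks at the right and bottom edges are ignored. The function name `eme` is hypothetical.

```python
import math


def eme(image, block=2, eps=1.0):
    """Average of 20*log10(I_max / I_min) over non-overlapping blocks."""
    h, w = len(image), len(image[0])
    scores = []
    for top in range(0, h - block + 1, block):
        for left in range(0, w - block + 1, block):
            values = [image[i][j]
                      for i in range(top, top + block)
                      for j in range(left, left + block)]
            # eps is an assumed guard against zero-valued pixels.
            scores.append(20 * math.log10((max(values) + eps) / (min(values) + eps)))
    if not scores:
        raise ValueError("image is smaller than one block")
    return sum(scores) / len(scores)


flat_score = eme([[50, 50], [50, 50]])
contrast_score = eme([[9, 99], [9, 99]])
```

A block with no intensity variation scores 0, and the score grows with the ratio between the brightest and darkest pixel in each block.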

2.1.2.6 Lightness Order Error (LOE)


The Lightness-order-error (LOE) measure for the naturalness preservation is
proposed to assess enhanced images. Secondly, we decompose the image through the
proposed bright-pass filter, which ensures the reflectance is restricted in the range [0,
1]. Thirdly, the bi-log transformation is proposed to process the illumination, so that
the illumination will not flood details due to spatial variation while the lightness order
is preserved.

The LOE measure is defined as:


1
𝐿𝑂𝐸 = 𝑚∗𝑛 ∑𝑚 𝑛
𝑖=1 ∑𝑗=1 𝑅𝐷 (𝑖, 𝑗 ) (17)

From the definition of LOE, it can be seen that the smaller the LOE value is, the better
the lightness order is preserved. In order to reduce the computational complexity, the
down-sampled versions DL and DLe of size dm × dn are used instead of L and Le. The
ratio r between the size of the down-sampled image and that of the original image is set
as r = 50/min(m, n). As a result, the size dm × dn of the down-sampled image is
[m · r] × [n · r] [26].
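
The term RD(i, j) is not defined in this section. The sketch below assumes the definition commonly associated with [26]: for each pixel, RD counts the pixels whose relative lightness order with respect to it differs between the original and the enhanced image. The inputs are assumed to be already down-sampled lightness maps (for example, the per-pixel maximum of the R, G and B channels), and the brackets in [m · r] are assumed to mean truncation to an integer. All function names are illustrative.

```python
def lightness_order_error(light, light_enh):
    """LOE between two equally sized lightness maps given as lists of rows."""
    a = [v for row in light for v in row]
    b = [v for row in light_enh for v in row]
    total = 0
    for p in range(len(a)):
        for q in range(len(a)):
            order_original = a[p] >= a[q]
            order_enhanced = b[p] >= b[q]
            # Count every pixel pair whose lightness order has changed.
            total += order_original != order_enhanced
    return total / len(a)

def downsampled_size(m, n, target=50):
    """Size of the down-sampled image for the ratio r = target / min(m, n)."""
    r = target / min(m, n)
    return int(m * r), int(n * r)
```

The pairwise comparison costs the square of the pixel count, which is why the down-sampling step matters in practice.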

2.1.3 Qualitative and Quantitative Analyses of Image Enhancement Algorithms


In Histogram Equalization, higher intensity values can be used to increase
contrast, but extremely high values should be avoided to prevent over-amplification of
noise. In Bi-Histogram Equalization (BBHE), a moderate pixel intensity value is
often suitable to maintain color balance. Dynamic Stretching-Based Brightness
Preservation adjusts the intensity values moderately to balance brightness preservation
and detail enhancement. In Contrast Stretching, a moderate intensity value is used to
avoid excessive clipping or spreading of pixel values. In the Gaussian Filter for noise
reduction, higher values give stronger noise reduction, but care is needed not to blur
the image excessively.

In Adaptive Histogram Equalization (CLAHE), a balanced pixel intensity value
is typically preferable to prevent over-enhancement in local areas. In DSIHE (Dualistic
Sub-Image Histogram Equalization), the number and size of sub-images influence the
result, and too many sub-images may lead to fragmented results. In Super Resolution, a
higher interpolation factor increases the image resolution, but excessively high values
may generate unrealistic details. AGCWD (Adaptive Gamma Correction with Weighting
Distribution) adjusts the weighting factor moderately to correct brightness variations
without amplifying noise. Table 3 shows the qualitative analysis of the image
enhancement techniques, which is performed by visually analyzing the enhanced images.

Table 3. Qualitative Analysis of Image Processing Techniques

[Image grid: the original image followed by the enhanced images for qualitative
analysis, in the order Histogram Equalization; Bi-Histogram Equalization (BBHE);
Dynamic Stretching-Based Brightness Preservation; Contrast Stretching; Gaussian Filter
for Noise Reduction; Adaptive Histogram Equalization (CLAHE); DSIHE (Dualistic
Sub-Image Histogram Equalization); Single Image Super Resolution; AGCWD.]

Table 3 shows the enhanced images obtained using different image enhancement
techniques. It is evident from Table 3 that SISR enhances the brightness of the original
image optimally and does not add noise or disturbance to the image.

Table 4. Quantitative Analysis of Image Processing Techniques (Original DE = 5.8517)

AMBE: Absolute Mean Brightness Error; SSIM: Structural Similarity Index; PSNR: Peak
Signal to Noise Ratio (in dB); DE: Discrete Entropy; EME: Enhancement Measure;
LOE: Lightness Order Error.

S. No  Technique                                              AMBE    SSIM    PSNR (dB)  DE      EME     LOE
1      Histogram Equalization                                 0.2353  0.5126  35.84      3.8794  3.8825  7361.6994
2      Bi-Histogram Equalization (BBHE)                       0.2140  0.5398  35.94      4.2585  4.3564  6567.2578
3      Dynamic Stretching-Based Brightness Preservation       0.2466  0.5114  36.52      3.3149  3.7265  7993.2008
4      Contrast Stretching                                    0.2584  0.4993  36.94      3.1651  3.5541  8341.4552
5      Gaussian Filter for Noise Reduction                    0.2586  0.5089  37.77      3.1651  3.6583  8165.0184
6      Adaptive Histogram Equalization (CLAHE)                0.2570  0.4739  35.48      3.3932  3.3433  8522.1691
7      DSIHE (Dualistic Sub-Image Histogram Equalization)     0.2372  0.5111  36.09      3.8694  3.8678  7369.6411
8      Single Image Super Resolution                          0.1658  0.6138  34.95      4.2901  5.4656  5518.2517
9      AGCWD                                                  0.2281  0.5021  33.97      4.3349  3.6169  8452.1307

The quantitative analysis of the image enhancement techniques is performed by
calculating performance metrics such as AMBE, SSIM, PSNR, DE, EME and LOE.
The values of the performance metrics are tabulated in Table 4. Lower values of AMBE
and LOE are required for better image enhancement. SISR produced the lowest AMBE
of 0.1658 and the lowest LOE of 5518.2517 among all the enhancement techniques.
SSIM is an index used to measure the naturalness of the image, hence higher values are
preferred; SISR has the highest SSIM of 0.6138. The higher the PSNR, the better the
noise rejection; the Gaussian filter (GF) provides the highest PSNR of 37.77 dB. EME
is an enhancement parameter, so a higher value is preferred; the EME of SISR, 5.4656,
is the highest. Based on the performance analysis, the SISR enhancement algorithm
performed well compared to the other algorithms, except for PSNR. Therefore, in this
project, SISR is selected for enhancing the banana leaf images.

2.2 DATA AUGMENTATION


Mendeley dataset, MUSA dataset and a combination of Mendeley dataset and
MUSA dataset are used in this report for banana leaf disease classification. The number
of images in each class is different in the datasets considered. This leads to class
imbalance problem. Hence image augmentation techniques such as rotation, reflection
and scaling are used to increase the no. of images in each class. The augmented dataset
is limited to 1500 images in each class to eliminate the class imbalance problem. Table
5 shows the augmented images.
Table 5. Augmented images

(a)Original Image (b)Rotated Image

(c)Reflected Image (d)Scaled Image

Class imbalance is the condition where the numbers of images in the different
classes are not equal. As a result, the classes with more images are seen more frequently
during training, and the model classifies them more accurately than the classes with
fewer images. In order to make the deep learning model more robust, the dataset is
augmented using rotation, scaling and reflection, and the augmented dataset is limited
to 1500 images in each class to eliminate the class imbalance problem. The combined
augmented dataset is shown in Table 6.

Table 6. Augmented Datasets


Classes No. of Images
Healthy 1500
Banana Aphids 1500
Black Sigatoka 1500
Yellow Sigatoka 1500
Potassium Deficiency 1500
Xanthomonas 1500
Total no of images 9000
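
The balancing step can be sketched in Python as below. The three transforms operate on a grayscale image stored as a list of rows; the 90-degree rotation angle, the scale factor of two and the per-class counts in the usage note are hypothetical, since the report does not state them. All function names are illustrative.

```python
def reflect(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]

def scale2x(image):
    """Nearest-neighbour upscaling by a factor of two."""
    return [[p for p in row for _ in range(2)] for row in image for _ in range(2)]

def balance_plan(class_counts, target=1500):
    """Number of augmented images each class needs to reach the target."""
    return {name: max(target - count, 0) for name, count in class_counts.items()}
```

For example, `balance_plan({"Healthy": 1200, "Xanthomonas": 1600})` reports that 300 augmented images are needed for the first class and none for the second, which would instead be capped at the 1500-image limit.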

CHAPTER 3
DEEP LEARNING TECHNIQUES
3.1 METHODOLOGY
After pre-processing the original image, image classification is performed
using the state-of-the-art architectures AlexNet [21] and VGG-16 [22]. AlexNet has
8 layers with learnable parameters: 5 convolution layers and 3 fully connected layers,
along with 3 max-pooling layers. The total number of parameters in the AlexNet model
is 62.3 million. VGG-16 has 16 layers and is a larger network with approximately
138 million parameters. The activation function used in all the layers of AlexNet,
VGG-16 and ResNet-50 is ReLU, except for the output layers, where Softmax is used.
The inputs to the CNN architectures are RGB images, and their size varies based on
the size of the input layer of the architecture.
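
The parameter count quoted for AlexNet can be checked with a short Python calculation. The layer shapes below are not listed in the report; they are assumed to be those of the standard AlexNet without grouped convolutions, with a 1000-class output layer. Each layer contributes its weights plus one bias per filter or output neuron.

```python
def conv_params(kernel, in_channels, filters):
    """Weights and biases of a square convolution layer."""
    return (kernel * kernel * in_channels + 1) * filters

def fc_params(inputs, outputs):
    """Weights and biases of a fully connected layer."""
    return (inputs + 1) * outputs

# (kernel size, input channels, number of filters): assumed standard AlexNet shapes
CONV_LAYERS = [(11, 3, 96), (5, 96, 256), (3, 256, 384), (3, 384, 384), (3, 384, 256)]
# (inputs, outputs): the first fully connected layer sees a 6 x 6 x 256 feature map
FC_LAYERS = [(6 * 6 * 256, 4096), (4096, 4096), (4096, 1000)]

def alexnet_parameter_count():
    total = sum(conv_params(*layer) for layer in CONV_LAYERS)
    total += sum(fc_params(*layer) for layer in FC_LAYERS)
    return total
```

Under these assumptions the total is 62,378,344, in line with the 62.3 million stated above. Replacing the 1000-class output layer with one sized for the banana leaf classes would reduce the count slightly.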

3.2 PERFORMANCE METRICS


The metrics used to investigate the model are accuracy, precision, recall,
F1-Score and specificity. They are calculated using Eq. 18 to Eq. 22.
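$$\text{Accuracy} = \frac{\text{Number of correctly classified images}}{\text{Total number of images}} \tag{18}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{19}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{20}$$

$$\text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{21}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{22}$$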

Where,
TN = True Negative,
TP = True Positive,
FN = False Negative and FP = False Positive.

A true positive is the outcome when the model correctly predicts the healthy class,
and a true negative is the outcome when the model correctly predicts the diseased class.
A false positive is produced when the model incorrectly predicts the healthy class, and
a false negative when the model incorrectly predicts the diseased class. Accuracy is the
ratio of correctly predicted observations to the total observations. Precision is the ratio
of correctly predicted healthy observations to the total predicted healthy observations.
Recall is the ratio of correctly predicted images to all the images in that particular
class. F1-Score is the harmonic mean of precision and recall.
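
For a multi-class problem, Eq. 18 to Eq. 22 can be evaluated per class from the confusion matrix by treating one class as positive and all the others as negative. The Python sketch below assumes that rows hold the actual labels and columns the predicted labels; the function names are illustrative and are not taken from the MATLAB implementation.

```python
def accuracy(confusion):
    """Overall accuracy: correctly classified images over all images."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)

def per_class_metrics(confusion, k):
    """Precision, recall, specificity and F1-Score of class k (one-vs-rest)."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                  # class k predicted as another class
    fp = sum(row[k] for row in confusion) - tp   # another class predicted as class k
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}
```

Averaging the per-class values over all classes gives one figure per metric for the whole model.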

3.3 RESULTS OF DEEP LEARNING TECHNIQUES


All the models are programmed in MATLAB R2023b and implemented on a
desktop PC. The metrics used to evaluate the models are accuracy, precision, recall
and F1-Score. They are calculated from the confusion matrix, which is formed from the
predicted labels and the actual labels. AlexNet and VGG-16 are taken as the basic CNN
models, since they are the most commonly used CNN models for plant disease
classification. The models are trained and tested using different training options. The
number of epochs is fixed to 6 and the learning rate to 0.0001 for effective comparison.
The optimizers used are SGDM and ADAM, and the models are trained with batch sizes
of 32 and 64, both with and without transfer learning. Fig. 1 shows the AlexNet
implementation and Fig. 2 shows the VGG-16 implementation. From Fig. 1 and Fig. 2,
the AlexNet model trained with weight transfer performs better for both batch sizes.
For a batch size of 32, the training accuracy is 91.82% and the testing accuracy is
91.64%. For a batch size of 64, the training accuracy is 92.65% and the testing accuracy
is 90.25%. Without weight transfer, the training accuracy is 77.67% and the testing
accuracy is 66.3% for a batch size of 32, and the training and testing accuracies are
66.3% and 76.04% respectively for a batch size of 64. These models were trained and
tested on the combined dataset without data augmentation.

Fig. 1. Analysis on AlexNet Architecture.

For the VGG-16 model with a batch size of 32, the transfer learning (TL) model
gives higher accuracy than the model without TL. For a batch size of 64, however,
the model without TL gives higher accuracy than the TL model. Compared with
VGG-16, AlexNet is composed of fewer layers, so the model is smaller and needs
less memory when stored on a cloud platform. The AlexNet and VGG-16 models
have 62.30 million and 138 million parameters respectively, so the training time
of VGG-16 is significantly longer: the AlexNet model takes 4.91 minutes to train
with a batch size of 32, whereas the VGG-16 model takes 11.18 hours. Thus, the
AlexNet model is chosen for the banana leaf disease classification task.
With the VGG-16 architecture, the TL models produce high accuracy for a
batch size of 32, but the accuracy is much lower for a batch size of 64. The models
trained with transfer learning learn faster. Transfer learning also reduces the
training time significantly for the VGG-16 model, but only by about 1 minute for
the AlexNet model. In order to avoid ambiguity in the analysis, the model trained
without transfer of weights is taken into consideration. The batch size used for
training is fixed to 32, because the batch size changes the training time of the
VGG-16 model considerably while no significant change in accuracy is observed.
Hence, the AlexNet model without transfer of weights and with a batch size of 32
is chosen for further analysis.

Fig. 2. Analysis on VGG- 16 Architecture.

Thus, the AlexNet model without pre-trained weights is trained with
various training options. The batch size is fixed to 32 and the number of epochs
is varied as 5, 10 and 15. The learning rates (LR) are chosen as 0.1, 0.01, 0.001
and 0.0001. For the higher learning rates, the models do not produce the expected
result. Hence, the learning rate is restricted to 0.001 and 0.0001. Fig. 3 shows the
results of the AlexNet model trained using the SGDM optimizer for a batch size
of 32. It can be inferred from Fig. 3 that the training and testing accuracies for
15 epochs and an LR of 0.0001 are 85.16% and 82.73% respectively. Similarly, the
training and testing accuracies for 15 epochs and an LR of 0.001 are 87.33% and
86.33% respectively. Thus, with SGDM as the optimizer, the model performs well
for LRs of 0.0001 and 0.001 at 15 epochs. Fig. 4 shows the results of training the
AlexNet model using the ADAM optimizer for a batch size of 32. It can be inferred
from Fig. 4 that the training and testing accuracies for 5 epochs and an LR of
0.0001 are 84.74% and 86.07% respectively. Similarly, the training and testing
accuracies for 10 epochs and an LR of 0.0001 are 88.9% and 91.36% respectively.
Thus, with ADAM as the optimizer, the model performs well for an LR of 0.0001
at 5 and 10 epochs. Four models of higher accuracy, two using SGDM and two
using ADAM, are thereby identified. These models are further trained on different
datasets to obtain a suitable model for banana leaf disease classification.

Fig. 3. Analysis on AlexNet Architecture with SGDM.

While analyzing the training and testing accuracies of the AlexNet model by
changing various hyperparameters, four models were identified to give good
performance. They are
Model1 (EPOCH=15; LR=0.001; SGDM),
Model2 (EPOCH=15; LR=0.0001; SGDM),
Model3 (EPOCH=5; LR=0.0001; ADAM) and
Model4 (EPOCH=10; LR=0.0001; ADAM)
These AlexNet models are trained on three different augmented datasets
to find a suitable robust model. Fig. 5 shows the analysis of the AlexNet models
on the Mendeley dataset. It can be identified from Fig. 5 that Model1 gives the
highest training and testing accuracies of 83.33% and 82.89% respectively.
Model4 provides almost the same training and testing accuracies of 82% and
82.89% respectively. The precision, recall, specificity and F1-Score of Model1
and Model4 are identified to be the best. However, the training time is 12.56 mins
for Model1 and 8.28 mins for Model4, so Model4 reaches a comparable accuracy
in fewer epochs than Model1.
Fig. 6 shows the analysis of the AlexNet models on the MUSA dataset. From
Fig. 6, it can be identified that Model4 gives the highest training and testing
accuracies of 92% and 90.67% respectively. Model1 provides almost the same
training and testing accuracies of 91.2% and 89.6% respectively, and Model3
follows with 90% and 87.73% respectively. The precision, recall, specificity and
F1-Score of Model1 and Model4 are identified to be the best. The training time is
25.25 mins for Model1 and 24.56 mins for Model4, hence there is no significant
difference in timing between the models. Model4, however, achieves better accuracy
in fewer epochs than Model1 for the MUSA dataset as well.
Fig. 7 shows the analysis of the AlexNet models on the combined dataset.
From Fig. 7, it can be identified that Model1 gives the highest training and
testing accuracies of 89.67% and 87.42% respectively. Model2 provides training
and testing accuracies of 83.92% and 83.42% respectively, and Model3 provides
85.25% and 78.55% respectively. The precision, recall, specificity and F1-Score
of Model1 are identified to be the best. The training time is 58.58 mins for
Model1 and 25.32 mins for Model4. Model3 achieves its accuracy in fewer epochs,
and increasing the number of epochs does not show any significant improvement
in the accuracy of the model.
From the above analysis, Model1 and Model4 performed better for the smaller
datasets, while Model4 did not produce the expected performance for the combined
dataset. Model1 consistently produces high accuracy, precision, recall, specificity
and F1-Score in all the cases. Hence, the AlexNet model with a learning rate of
0.001, the SGDM optimizer and 15 epochs can be chosen to perform banana leaf
disease classification.

Fig. 4. Analysis on AlexNet Architecture with ADAM.

Fig. 5. Analysis on Mendeley dataset.

Fig. 6. Analysis on MUSA dataset.

Fig. 7. Analysis on Combined dataset.

Table 7. Mendeley dataset analysis

Table 8. MUSA dataset analysis.

Table 9. Combined dataset analysis

Tables 7, 8 and 9 show the performance metrics for the Mendeley, MUSA and
combined datasets respectively, which are also shown in Fig. 5, Fig. 6 and Fig. 7
respectively.

CHAPTER 4
CLASSIFIER MODEL
4.1 PROPOSED MODEL

Fig 8. Smart Banana Leaf Disease Classifier using CNN

The block diagram of the Smart banana leaf disease classifier using CNN is
shown in Fig. 8. The proposed DL model is trained using the datasets in [6] and [8].
The dataset is divided in a 70:30 ratio and used for training and testing respectively.
The real-time image collected from the banana tree farm is sent to Dropbox. The
MATLAB analysis tab in ThingSpeak is used to download the trained (.mat) file from
Dropbox. The ‘.mat’ file includes i) image enhancement using the SISR technique,
ii) identification of the plant part using the pre-trained LeNet classifier and
iii) classification of the image using the pre-trained deep learning model. Once the
image is received, it is enhanced, the LeNet classifier identifies the plant part and the
pre-trained deep learning model classifies the image in the ThingSpeak cloud. The
output class is shown in the MATLAB analysis tab.
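
The 70:30 division can be sketched in Python as below. The sketch assumes a per-class (stratified) random split with a fixed seed so that each class keeps the same ratio; the report does not state whether the split was stratified, and the function name is illustrative.

```python
import random

def split_70_30(items_by_class, train_fraction=0.7, seed=0):
    """Split each class into training and testing items with the given ratio."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train += [(item, label) for item in shuffled[:cut]]
        test += [(item, label) for item in shuffled[cut:]]
    return train, test
```

With 1500 images in a class, this yields 1050 training images and 450 testing images for that class.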

4.2 DATASET
A new dataset has been formed by adding 7 mineral deficiency classes (taken
from the Mendeley dataset) to the combined dataset (6 classes), and it is named the
disease-deficiency dataset. Since a deficiency of these minerals makes the plant more
prone to disease, early identification of mineral deficiency is required. Table 10 shows
the disease-deficiency dataset description.
Table 10. Disease-deficiency Dataset Description
Classes No. of images
Healthy 1500
Banana aphids 1500
Black sigatoka 1500
Boron deficiency 1500
Calcium deficiency 1500
Iron deficiency 1500
Magnesium deficiency 1500
Manganese deficiency 1500
Potassium deficiency 1500
Sulphur deficiency 1500
Xanthomonas 1500
Yellow sigatoka 1500
Zinc deficiency 1500
Total No. images 19500

The performance comparison of the four models in terms of accuracy, precision,
recall, F1-Score and specificity, when trained on the MUSA, Mendeley and combined
datasets, shows that Model 2 did not perform well.

Table 11. Performance analysis on disease-deficiency dataset


Parameters Model 1 Model 3 Model 4
Training Accuracy 71.08 67.94 74.67
Training time (min) 353.4 148.52 251.21
Precision (in %) 51.41 47.85 57.33
Recall (in %) 62.19 58.18 68.41
Accuracy (in %) 67.24 62.57 75.49
Specificity (in %) 97.97 97.66 98.48
F1-Score (in %) 54.58 51.96 61.06

[Bar chart comparing Model 1, Model 3 and Model 4 on training accuracy, training
time, precision, accuracy, specificity, F1-Score and recall. The plotted values are those
of Table 11, with the training time shown as 5.89, 2.47 and 4.18, which correspond to
the Table 11 times expressed in hours.]

Fig 9. Analysis on Disease-deficiency dataset


Hence, Model 1, Model 3 and Model 4 were trained on this new dataset, and
parameters such as accuracy, precision, recall, F1-Score and specificity are calculated
for each model. The values are shown in Table 11.

Fig 10. Confusion matrix of Model 1


Fig 11. Confusion matrix of Model 3

Fig 12. Confusion matrix of Model 4

From the analysis done on the disease-deficiency dataset, it is found that the highest
training accuracy, 74.67%, is given by Model 4. The models that performed better in
the combined dataset analysis do not perform well in the disease-deficiency dataset
analysis due to overfitting. Fig. 10, Fig. 11 and Fig. 12 show the confusion matrices
of Model 1, Model 3 and Model 4 respectively.

4.2.1 METHODS TO OVERCOME OVERFITTING


• Data augmentation
• Dropout regularization
• L2 regularization
• Early stopping

4.3 DISCUSSION ON PLANT PARTS CLASSIFIER


In order to get more accuracy in disease classification, only leaf images must be
given as input to disease classifier (AlexNet). To classify between various plant parts
such as leaf, stem and fruit, LeNet model is used.

4.3.1 DATASET
For training the LeNet, a dataset with three classes has been used and it is
described in Table 12. Table 13 shows the performance parameters of the trained LeNet
model.

Table 12. Dataset Description

Classes No. of images

Leaf 500

Stem 500

Fruit 500

Total 1500

Table 13. Analysis on Dataset (LeNet)

Parameters Value
Training Accuracy (in %) 84.33
Training time (in sec) 49
Precision (in %) 91.13
Recall (in %) 91.28
Accuracy (in %) 91.13
Specificity (in %) 95.62
F1-Score (in %) 91.11

Fig 13. Confusion matrix of LeNet

CHAPTER 5
ACCURACY IMPROVEMENT

5.1 OPTIMIZATION TECHNIQUES


Model4 has given a training accuracy of 74.67%. The following optimization
techniques can be used to improve the accuracy of a deep learning model:
• The use of a pre-trained model to obtain a faster learning curve and to increase
accuracy. This is referred to as the transfer learning model in the literature [27].
• Structural modification in the number of layers of the DL model. For example,
the number of layers can be increased to increase the accuracy.
• The use of ensemble techniques to increase the accuracy.

5.2 TRANSFER LEARNING


An AlexNet model pre-trained on the ImageNet dataset was used to classify
the banana leaves. This model provided an accuracy of 94%.

5.3 STRUCTURAL MODIFICATION


The structure of the existing model was analyzed. The structural modification
in the AlexNet was done by changing the size of stride from 4 to 2 in the first
convolutional layer of the architecture. Fig. 14 shows the architecture of the modified
AlexNet. This architecture gives the accuracy of 89.63 %.
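
The effect of the stride change on the first convolutional layer can be illustrated with the standard output-size formula. The sketch below assumes a 227 × 227 input, an 11 × 11 kernel and no padding, which are the usual AlexNet settings and are not stated in this section.

```python
def conv_output_size(input_size, kernel, stride, padding=0):
    """Spatial size of a convolution output along one dimension."""
    return (input_size + 2 * padding - kernel) // stride + 1

original_size = conv_output_size(227, 11, stride=4)   # 55
modified_size = conv_output_size(227, 11, stride=2)   # 109
```

Under these assumptions, halving the stride roughly doubles the width and height of the first feature map, so the later layers receive a finer spatial description of the leaf at a higher computational cost.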

5.4 VOTING TECHNIQUE:


There are a large variety of ensemble techniques used to improve the accuracy
of the CNN model. The most common and successful approach is max voting technique
and it is explained in Eq. (23) and Eq. (24).

$$y^* = \mathrm{mode}\{C_1(x), C_2(x), \ldots, C_n(x)\} \tag{23}$$

where y* is the class label predicted via majority (plurality) voting over the
classifiers C1, ..., Cn.

$$y^* = \arg\max_i \sum_{j=1}^{n} W_j P_{ij} \tag{24}$$

where Wj is the weight that can be assigned to the jth classifier [27]. Table 14 shows the
accuracies of the various optimization techniques used. It is found that the ensemble
model gives the highest accuracy, 96.15%, as compared to Model 4, the structural
modification model and the transfer learning model. Therefore, the ensemble model is
preferred for classification in the cloud.
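
The two voting rules in Eq. (23) and Eq. (24) can be sketched in Python as follows. The sketch assumes that Pij is the probability that classifier j assigns to class i, which is not defined in the text, and the function names and example labels are illustrative.

```python
from collections import Counter

def majority_vote(labels):
    """Eq. (23): the label predicted by the largest number of classifiers."""
    return Counter(labels).most_common(1)[0][0]

def weighted_vote(probabilities, weights):
    """Eq. (24): index of the class with the largest weighted probability sum.

    probabilities[j][i] is the probability classifier j gives to class i,
    and weights[j] is the weight of classifier j.
    """
    n_classes = len(probabilities[0])
    scores = [sum(w * p[i] for p, w in zip(probabilities, weights))
              for i in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)
```

For example, `majority_vote(["Healthy", "Xanthomonas", "Healthy"])` returns "Healthy". With equal weights, the weighted rule reduces to averaging the class probabilities of the individual models.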

Fig 14. Architecture of modified AlexNet

Table 14. Improvement in accuracy after Ensemble Techniques.
S. No Model Accuracy
1 Transfer learning model 94%
3 Structural modification model 89.63%
4 Ensemble Model 96.15%

CHAPTER 6
CLOUD OPERATIONS
6.1 CLOUD CONFIGURATION
6.1.1 ThingSpeak
ThingSpeak is an open-source software written in Ruby which allows
users to communicate with internet enabled devices. It facilitates data access, retrieval
and logging of data by providing an API to both the devices and social network
websites. Fig. 15 shows the working of ThingSpeak platform.

Fig 15. ThingSpeak Platform

6.1.2 Dropbox
Dropbox is a cloud storage solution with features for productivity and
collaboration. It stores files, documents and photos online and makes them accessible
from any device. To load the trained DL models into ThingSpeak, an app is created in
Dropbox, and an access token with read and write permission for the app is generated.

6.1.3 ThingSpeak and Dropbox configuration


For classification of the image in ThingSpeak, a trained DL model (.mat) file
needs to be loaded to ThingSpeak. ThingSpeak does not support loading a (.mat) file
directly, so the file is uploaded to Dropbox. The image to be classified is also stored
in Dropbox. After the model is uploaded to Dropbox, the MATLAB analysis window in
ThingSpeak is used to download it from Dropbox using the function
“downloadFromDropbox”. The image uploaded to Dropbox is the input image; it is
resized in ThingSpeak to match the input layer and given as input to the trained DL
model (.mat file). If the input image is not a leaf image, the classifier requests a leaf
image by printing a message. If a leaf image is given as input, the classifier classifies
the image and displays the class name as output.
Fig. 16 shows the MATLAB analysis window. Fig. 17 shows the classification
output of the DL model.
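
The decision flow described above can be summarized by the following Python sketch. It is not the MATLAB code that runs in ThingSpeak: the three stages are passed in as hypothetical callables standing for the SISR enhancement, the LeNet plant part classifier and the disease classifier, and the message text is a placeholder.

```python
NOT_A_LEAF_MESSAGE = "Please provide a leaf image."  # placeholder wording

def classify_upload(image, enhance, plant_part_classifier, disease_classifier):
    """Enhance the image, check that it shows a leaf, then classify the disease."""
    enhanced = enhance(image)
    part = plant_part_classifier(enhanced)  # expected: "Leaf", "Stem" or "Fruit"
    if part != "Leaf":
        return NOT_A_LEAF_MESSAGE
    return disease_classifier(enhanced)
```

Placing the plant part check before the disease classifier keeps stem and fruit images from being forced into one of the leaf disease classes.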

Fig 16. MATLAB analysis window

Fig 17. Classification output of DL model

CHAPTER 7
CONCLUSION
7.1 CONCLUSION
Nine image enhancement techniques are used to enhance the images. Based on
metrics such as Absolute Mean Brightness Error (AMBE), Discrete Entropy (DE),
Structural Similarity Index (SSIM), Enhancement Measure (EME) and Lightness Order
Error (LOE), Single Image Super Resolution (SISR) is found to be the most efficient
enhancement technique among the nine.
The performance of the AlexNet and VGG-16 models is optimized through
extensive experimentation and fine-tuning. Through data augmentation and
systematic variations in training parameters, including the number of epochs and the
learning rate, the aim is to identify the most accurate and efficient model configuration.
Model1 (EPOCH=15; LR=0.001; SGDM), Model2 (EPOCH=15; LR=0.0001;
SGDM), Model3 (EPOCH=5; LR=0.0001; ADAM) and Model4 (EPOCH=10;
LR=0.0001; ADAM) are trained on three different datasets, namely the Mendeley
dataset, the MUSA dataset and the combined dataset. Out of these four models, Model1
provided higher training and testing accuracy when the Mendeley dataset is used, while
Model4 provided higher accuracy, precision, recall and specificity for the same dataset.
In the case of the MUSA dataset, Model4 performs well in terms of training and testing
accuracy as well as accuracy, precision, recall and specificity. Model1 provided the
highest accuracy, precision, recall, F1-Score and specificity, followed by Model4, when
the combined dataset is used. Model3 takes less training time than the other models
considered.
From the analysis done on the disease-deficiency dataset, it is found that the highest
training accuracy, 74.67%, is given by Model4. The models that performed better in
the combined dataset analysis do not perform well in the disease-deficiency dataset
analysis due to overfitting. Various optimization techniques, such as transfer learning,
structural modification and ensemble techniques, were used to improve the accuracy
of the deep learning model. The ensemble model gives the highest accuracy of 96.15%
as compared to the other models used.
The DL models are stored in Dropbox, and an access token is generated for
Dropbox app access. The MATLAB analysis tab in ThingSpeak is used to download the
trained (.mat) file from Dropbox. The input image is stored in Dropbox and downloaded
into the ThingSpeak cloud using MATLAB analysis. The input image is resized in
ThingSpeak and given as input to the DL model. If the input image is not a leaf image,
the classifier requests a leaf image by printing a message. If a leaf image is given as
input, the classifier classifies the image and displays the class name as output.

7.2 FUTURE SCOPE


1. Different datasets, such as the Banana Leaf Spot Diseases (BananaLSD)
dataset, will be used for banana leaf disease classification.

2. FPGA implementation of banana leaf disease classification will be done using
the ZedBoard for real-time classification applications.

3. Specific CNN models will be developed for banana leaf disease classification.

PUBLICATION

Jayanthi B, Chandru S, Mahesh D, Dheeban Kumar G and Lakshmi Sutha Kumar,
“Optimization of AlexNet architecture for Banana Leaf Disease classification”,
submitted to SN Computer Science on 13th December 2023; the paper is now ‘under
review’.

REFERENCES

[1]. Bhuiyan, Md Abdullahil Baki, et al. "BananaSqueezeNet: A very fast, lightweight
convolutional neural network for the diagnosis of three prominent banana leaf
diseases." Smart Agricultural Technology 4 (2023): 100214.
[2]. Singh, Balwinder, et al. "Bioactive compounds in banana and their associated health
benefits–A review." Food chemistry 206 (2016): 1-11.
[3]. Mahendran, T., and K. Seetharaman. "Banana Leaf Disease Detection Using GLCM
Based Feature Extraction and Classification Using Deep Convoluted Neural Networks
(Dcnn)." Journal of Positive School Psychology 6.10 (2022): 2553-2562.
[4]. Duraianand, T., and R. Sivasangari. "IoT based banana leaf disease identification
system." International Research Journal of Modernization in Engineering Technology
and Science 2022.
[5]. Escudero, Cristian Andrés, et al. "Development of a digital image classification
system to support technical assistance for Black Sigatoka detection." Revista Brasileira
de Fruticultura 43 (2021).
[6]. Medhi, Epsita, and Nabamita Deb. "PSFD-Musa: A dataset of banana plant, stem,
fruit, leaf, and disease." Data in Brief 43 (2022): 108427.
[7]. Shanthiyaa, V., et al. "Prevalence of banana yellow Sigatoka disease caused by in
Tamil Nadu." J. Mycol. Plant Pathol 43.4 (2013): 414.
[8]. Mduma, Neema, and Judith Leo. "Dataset of Banana Leave and Stem Images for
Object Detection, Classification and Segmentation: A Case of Tanzania." Data in
Brief (2023): 109322.
[9]. Sujithra, J., and M. Ferni Ukrit. "CRUN-Based Leaf Disease Segmentation and
Morphological-Based Stage Identification." Mathematical Problems in Engineering
2022 (2022).
[10]. Narayanan, K. Lakshmi, et al. "Banana plant disease classification using hybrid
convolutional neural network." Computational Intelligence and Neuroscience 2022
(2022).
[11]. Krishnan, V. Gokula, et al. "An automated segmentation and classification model for
banana leaf disease detection." Journal of Applied Biology and Biotechnology 10.1
(2022): 213-220
[12]. Saranya, N., et al. "Detection of banana leaf and fruit diseases using neural networks."

2020 Second International Conference on Inventive Research in Computing
Applications (ICIRCA). IEEE, 2020.
[13]. Ye, Huichun, et al. "Recognition of banana fusarium wilt based on UAV remote
sensing." Remote Sensing 12.6 (2020): 938.
[14]. Ridhovan, Andreanov, Aries Suharso, and Chaerur Rozikin. "Disease Detection in
Banana Leaf Plants using DenseNet and Inception Method." Jurnal RESTI (Rekayasa
Sistem dan Teknologi Informasi) 6.5 (2022): 710-718
[15]. Evuri, Sai Rajasekhar Reddy. “Banana Leaf Disease Detection With Multi Feature
Extraction Techniques Using SVM”. Diss. Dublin, National College of Ireland, 2022.
[16]. Vijayalakshmi, D., Malaya Kumar Nath, and Om Prakash Acharya. "A
comprehensive survey on image contrast enhancement techniques in spatial
domain." Sensing and Imaging 21.1 (2020): 40.
[17]. Kim, Yeong-Taeg. "Contrast enhancement using brightness preserving bi-histogram
equalization." IEEE transactions on Consumer Electronics 43.1 (1997): 1-8.
[18]. Rahman, M. A., et al. "Image contrast enhancement for brightness preservation based
on dynamic stretching." International Journal of Image Processing (IJIP) 9.4 (2015):
241.
[19]. Ningsih, Dwi Ratna. "Improving retinal image quality using the contrast stretching,
histogram equalization, and CLAHE methods with median filters." International
Journal of Image, Graphics and Signal Processing 10.2 (2020): 30.
[20]. Reza, Ali M. "Realization of the contrast limited adaptive histogram equalization
(CLAHE) for real-time image enhancement." Journal of VLSI signal processing
systems for signal, image and video technology 38 (2004): 35-44.
[21]. Wang, Yu, Qian Chen, and Baeomin Zhang. "Image enhancement based on equal
area dualistic sub-image histogram equalization method." IEEE transactions on
Consumer Electronics 45.1 (1999): 68-75.
[22]. Fang, Faming, Juncheng Li, and Tieyong Zeng. "Soft-edge assisted network for
single image super-resolution." IEEE Transactions on Image Processing 29 (2020):
4656-4668.
[23]. Rani, Seema, and Manoj Kumar. "Contrast enhancement using improved adaptive
gamma correction with weighting distribution technique." International Journal of
Computer Applications 101.11 (2014).
[24]. Rahman, Shanto, et al. "Image enhancement in spatial domain: A comprehensive

study." 2014 17th international conference on computer and information technology
(ICCIT). IEEE, 2014.
[25]. Subramani, Bharath, and Magudeeswaran Veluchamy. "Fuzzy contextual
inference system for medical image enhancement." Measurement 148 (2019): 106967.
[26]. Wang, Shuhang, et al. "Naturalness preserved enhancement algorithm for non-
uniform illumination images." IEEE transactions on image processing 22.9 (2013):
3538-3548.
[27]. Mohammed, Ammar, and Rania Kora. "A comprehensive review on ensemble deep
learning: Opportunities and challenges." Journal of King Saud University-Computer
and Information Sciences (2023).
