Final Report Phase 2

Smart banana leaf disease classifier using Convolutional Neural Network

Bachelor of Technology
in
Electronics and Communication Engineering

by

Chandru S (EC20B1004)
Dheeban Kumar G (EC20B1007)
Mahesh D (EC20B1026)

DEPARTMENT OF
ELECTRONICS AND COMMUNICATION ENGINEERING
NATIONAL INSTITUTE OF TECHNOLOGY PUDUCHERRY
KARAIKAL – 609 609
APRIL 2024
BONAFIDE CERTIFICATE
This is to certify that the project work entitled “Smart banana leaf disease classifier using
Convolutional Neural Network” is a bonafide record of the work done by
Chandru S (EC20B1004)
Dheeban Kumar G (EC20B1007)
Mahesh D (EC20B1026)
in partial fulfillment of the requirements for the award of the degree of Bachelor of
Technology in Electronics and Communication Engineering of the NATIONAL
INSTITUTE OF TECHNOLOGY PUDUCHERRY during the year 2023 - 2024.
ACKNOWLEDGEMENTS
We would like to express our deep sense of gratitude to the Lord Almighty for giving
us an opportunity to do this project and showering his blessing in the due course of the project.
We would like to thank the Director In-charge, Dr. Usha Natesan for permitting us to
undertake this project work. We would like to extend our thanks to the Registrar, Dr.
Sundaravardhan S for permitting us to undertake this project work. We extend our sincere
thanks to Dean (Academic), Dr. G Lakshmi Sutha for permitting us to undertake this project
work.
We express our sincerest thanks to our project coordinator Dr. Suresh Balanethiram,
Assistant Professor, Department of ECE, for his motivation during the various reviews. We would
like to convey our thanks to Dr. Malaya Kumar Nath, Head of the Department, Department of
ECE, for his constant support. We genuinely convey our thanks to Dr. G Lakshmi Sutha,
Associate Professor and project supervisor, Department of ECE, for her valuable inputs, able
guidance, encouragement, whole-hearted cooperation and constructive criticism throughout the
duration of our project.
Last but not least, we thank our families and friends whose support and suggestions
helped us mold this project.
TABLE OF CONTENTS

CHAPTER NO   TITLE   PAGE NO

ABSTRACT   i
ACKNOWLEDGEMENT   ii
TABLE OF CONTENTS   iii
LIST OF TABLES   v
LIST OF FIGURES   vi
LIST OF SYMBOLS AND ABBREVIATIONS   vii

1 INTRODUCTION
1.1 General Introduction   01
1.2 Motivation   02
1.3 Literature Review   02
1.4 Objectives   06
1.5 Dataset   07

2 IMAGE PRE-PROCESSING TECHNIQUES
2.1 Image Enhancement Techniques   08
2.1.1 Image Enhancement Algorithms   08
2.1.1.1 Histogram Equalization   08
2.1.1.2 Bi-Histogram Equalization   09
2.1.1.3 Dynamic Stretching based Brightness Preservation   09
2.1.1.4 Contrast Stretching   09
2.1.1.5 Gaussian Filter for Noise Reduction   10
2.1.1.6 Contrast Limited Adaptive Histogram Equalization   11
2.1.1.7 Dualistic Sub-Image Histogram Equalization   11
2.1.1.8 Single Image Super Resolution   11
2.1.1.9 Adaptive Gamma Correction with Weighting Distribution   12
2.1.2 Performance Metrics   12
2.1.2.1 Absolute Mean Brightness Error   13
2.1.2.2 Structural Similarity Index   13
2.1.2.3 Peak Signal to Noise Ratio   13
2.1.2.4 Discrete Entropy   13
2.1.2.5 Enhancement Measure   14
2.1.2.6 Lightness Order Error   14
2.1.3 Qualitative and Quantitative Analyses of Image Enhancement Algorithms   15
2.2 Data Augmentation   18

3 DEEP LEARNING TECHNIQUES
3.1 Methodology   20
3.2 Performance Metrics   20
3.3 Results of Deep Learning Techniques   21

4 CLASSIFIER MODEL
4.1 Proposed Model   29
4.2 Dataset   29
4.2.1 Methods to Overcome Overfitting   33
4.3 Discussion on Plant Parts Classifier   33
4.3.1 Dataset   33

5 ACCURACY IMPROVEMENT
5.1 Optimization Techniques   35
5.2 Transfer Learning   35
5.3 Structural Modification   35
5.4 Voting Technique   35

6 CLOUD OPERATION
6.1 Cloud Configuration   38
6.1.1 ThingSpeak   38
6.1.2 Dropbox   38
6.1.3 ThingSpeak and Dropbox Configuration   38

7 CONCLUSION
7.1 Conclusion   40
7.2 Future Scope   41

PUBLICATION   42
REFERENCES   43
LIST OF TABLES
Table No Title Page No
1 Banana Leaf Disease Classification Models using Neural 02
Networks
2 Literature review on Image Processing Techniques 05
3 Qualitative Analysis of Image Processing Techniques 16
4 Quantitative Analysis of Image Processing Techniques 17
5 Augmented images 18
6 Augmented Datasets 19
7 Mendeley dataset analysis 28
8 MUSA dataset analysis 28
9 Combined dataset analysis 28
10 Disease-deficiency Dataset Description 30
11 Performance analysis on disease-deficiency dataset 30
12 Dataset Description 33
13 Analysis on Dataset (LeNet) 34
14 Improvement in accuracy after Ensemble Techniques. 37
LIST OF FIGURES
LIST OF SYMBOLS
σ – Standard deviation
π – Pi
∑ – Summation

LIST OF ABBREVIATIONS
HE – Histogram Equalization
DSBP – Dynamic Stretching-Based Brightness Preservation
DE – Discrete Entropy
DBX – Dropbox
CHAPTER 1
INTRODUCTION
1.1 GENERAL INTRODUCTION
Agriculture is still one of the largest industries in the world and employs around
27% of the world's population. The yield of a farm is the main source of income for this
large population. Bananas are grown in 130 countries, primarily in tropical and
subtropical regions, and their origin can be traced back to South-East Asia. Bananas are a
highly sought-after staple food, accounting for nearly 16% of global fruit production
and ranking as the second largest fruit crop behind citrus. In terms of world trade, bananas
are the fifth most important food crop. Bananas can be consumed raw or processed and
contain various bio-active molecules such as phenolics, carotenoids, biogenic amines,
and phytosterols, which are beneficial for human health and provide a great source of
energy. They also have high levels of antioxidants, and historically bananas have been
used to treat various chronic degenerative disorders. The banana is a major crop in most of the
developing nations of the world [2]. India is the largest producer of bananas, accounting
for 27% of global production. Global banana production totals 128,778,738 tons from
an area of 5,517,027 hectares, and Bangladesh produces 833,309 tons of bananas from
48,850 hectares of land.
Recent problems such as climate change and attacks from pests and pathogens
affect plant growth to a very large extent and result in lower yields. The banana plant is
susceptible to various diseases that require early detection in order to be cured. Many
diseases frequently infect banana crops. Some commonly reported leaf spot diseases
include exserohilum leaf spot (Exserohilum rostratum), cordana leaf spot (Cordana musae),
plantain zonate leaf spot (Pestalotiopsis menezesiana), banana freckle disease
(Phyllosticta musarum) [1], and the sigatoka diseases such as black sigatoka
(Mycosphaerella fijiensis) [3], eumusae leaf spot (Mycosphaerella eumusae) [4], and
yellow sigatoka (Pseudocercospora musicola) [7].
Plant diseases largely limit banana production. In order to curb a disease's
progression, it is crucial to assess its severity. Traditionally, plant pathologists estimate
plant disease severity by visually inspecting the disease symptoms. Unfortunately, this
technique is ineffective and very expensive if the area of cultivation is large.
Agriculturists are increasingly using automated disease diagnosis models owing to the advent
of digital cameras and computer technology. In recent times, the diagnosis of plant disease
severity has been undertaken by automatic, deep-learning-based image analysis, and
Artificial Intelligence (AI) and remote sensing technology have been used to detect
various crop diseases. Some countries have implemented deep learning techniques such as
LeNet, VGG16, ResNet18, ResNet50, ResNet152, and InceptionV3 to classify banana leaf
diseases [1]. It is essential that the images used for classification are of the same quality;
hence, image pre-processing is one of the basic steps in image classification.
1.2 MOTIVATION
The motivation behind the project is to develop an advanced and efficient
system that uses deep learning and computer vision technologies to identify diseases in
banana leaves. By automating disease detection, farmers can quickly and accurately
assess the health of their crops, enabling timely interventions to prevent further spread
and minimize crop losses. This project aims to empower farmers, increase crop yields,
and contribute to a greener and healthier farming ecosystem.
1.3 LITERATURE REVIEW
Table 1 summarizes the literature on banana leaf disease classification models using neural networks.

Table 1. Banana Leaf Disease Classification Models using Neural Networks

[9] "CRUN-based leaf disease segmentation and morphological-based stage identification", Mathematical Problems in Engineering, Hindawi, 2022. Dataset: real-time and public datasets. Method: image processing and CRUN. Performance: Accuracy = 99%; Sensitivity = 99.2%; Specificity = 99.3%.

[10] "Banana plant disease classification using Hybrid Convolutional Neural Network", Computational Intelligence and Neuroscience, Hindawi, 2022. Dataset: real time (3500 images). Method: feature extraction using CNN and classification using FSVM (binary SVM + multiclass SVM). Performance: Accuracy = 97.70%; Precision = 0.97; Recall = 0.94; F1 score = 0.95.

[4] "IoT based banana leaf disease identification system", International Research Journal of Modernization in Engineering Technology and Science (IRJMETS), 2022. Dataset: real-time. Method: temperature, humidity and color sensors used to collect data with an Arduino UNO. Performance: Accuracy = 92.66%.

[11] "An automated segmentation and classification model for banana leaf disease detection", Journal of Applied Biology & Biotechnology, CrossMark, 2022. Dataset: CIAT image library. Method: segmentation using TGVFCMS and classification using the CNN technique. Performance: Accuracy = 93.45%; Sensitivity = 89.04%; Specificity = 96.38%.

[1] "BananaSqueezeNet: A very fast, lightweight convolutional neural network for the diagnosis of three prominent banana leaf diseases", Smart Agricultural Technology, Elsevier, 2022. Dataset: Bangabandhu Sheikh Mujibur Rahman Agricultural University (BSMRAU). Method: a lightweight CNN architecture named BananaSqueezeNet. Performance: multiple performance metrics reported.

[12] "Detection of banana leaf and fruit diseases using Neural Networks", Second International Conference on Inventive Research in Computing Applications (ICIRCA), IEEE, 2020. Dataset: real-time. Method: ANN and segmentation using Fuzzy c-means. Performance: Accuracy = 96.25%; Precision = 96.54%; Recall = 96.25%; F1 score = 96.17%.

[5] "Development of a digital image classification system to support technical assistance for Black Sigatoka detection", Revista Brasileira de Fruticultura, SciELO Brasil, 2020. Dataset: public dataset. Method: support vector machines (SVM), Bayes classifiers, decision trees, K-means and K-nearest neighbors (KNN). Performance: Mean = 0.324; Standard Deviation = 0.026; Variance = 0.00005; Skewness = -0.6659; Kurtosis = 4.948.

[8] "Dataset of banana leaves and stem images for object detection, classification and segmentation: A case of Tanzania", Data in Brief, Elsevier, 2023. Dataset: Harvard Dataverse. Method: NIL. Performance: Perception = 80%.

[13] "Recognition of banana fusarium wilt based on UAV remote sensing", Remote Sensing, MDPI, 2020. Dataset: real-time. Method: binary logistic regression (BLR). Performance: NIL.

[14] "Disease detection in banana leaf plants using DenseNet and Inception Method", Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), IAII, 2022. Dataset: real-time. Method: oversampling and under-sampling. Performance: OA = 91.7%; Kappa = 0.83.

[15] "Banana leaf disease detection with Multi Feature Extraction Techniques using SVM", School of Computing, National College of Ireland, 2022. Dataset: real-time (1289 images). Method: feature extraction with GLCM and NGTDM. Performance: Accuracy = 84.73%; Recall = 84.73%; Precision = 84.80%; F1 score = 84.62%.
Table 2 summarizes the literature review on the various image processing
techniques which are used to enhance the images.
Table 2. Literature Review on Image Processing Techniques

[22] "Soft-edge assisted network for single image super-resolution", IEEE Transactions on Image Processing, 2020. Dataset: DIV2K, Set5, Set14, BSDS100, Urban100 and Manga109. Metrics: PSNR, SSIM.

[19] "Improving retinal image quality using the contrast stretching, histogram equalization, and CLAHE methods with median filters", International Journal of Image, Graphics and Signal Processing, 2020. Dataset: STARE. Metrics: MSE, PSNR and SSIM.

[18] "Image contrast enhancement for brightness preservation based on dynamic stretching", International Journal of Image Processing (IJIP), 2015. Dataset: real-time image. Metrics: mean and median.

[23] "Contrast enhancement using improved adaptive gamma correction with weighting distribution technique", International Journal of Computer Applications, 2014. Dataset: real-time (10 images). Metrics: PSNR (Peak Signal to Noise Ratio), MSE (Mean Square Error) and AMBE (Absolute Mean Brightness Error).

[20] "Realization of the Contrast Limited Adaptive Histogram Equalization (CLAHE) for real-time image enhancement", Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, 2004. Dataset: real-time image. Metrics: -.

[21] "Image enhancement based on equal area dualistic sub-image histogram equalization method", IEEE Transactions on Consumer Electronics, 1999. Dataset: real-time image. Metrics: entropy.

[17] "Contrast enhancement using brightness preserving bi-histogram equalization", IEEE Transactions on Consumer Electronics, 1997. Dataset: real-time image. Metrics: mean.
In this report, suitable image enhancement techniques are identified from Table 2
and used to enhance the quality of the banana leaf images so as to improve the accuracy
of the deep learning models. A comprehensive review of ensemble techniques for DL
models was presented in [27]; in this report, parallel and sequential ensemble techniques
for DL are used.
1.4 OBJECTIVES
The main objective of this work is to perform banana leaf disease classification
for 13 different classes. In order to fulfil this objective, the work is split as follows:
1. To perform image enhancement to reduce noise and to increase the quality of the images.
2. To augment the dataset to avoid the class imbalance problem.
3. To identify a suitable DL model that provides better accuracy by using various
techniques such as hyperparameter tuning, structural remodeling, transfer learning
and ensemble methods.
4. To send the input image and classifier models to the cloud (ThingSpeak platform),
implement the entire classification process in the cloud, and display the output.
1.5 DATASET
The efficiency of a deep learning model depends on its training and testing. To
perform the training and testing of the deep learning model effectively, suitable datasets
have to be chosen. Two different datasets are therefore identified: the Mendeley dataset
(Healthy, Xanthomonas, Yellow Sigatoka) [6] and the MUSA dataset (Healthy, Potassium
Deficiency, Yellow Sigatoka, Black Sigatoka, Banana Aphids) [8]. The combined dataset
is obtained by merging these two datasets; as a result, it has images of 6 classes: Healthy,
Xanthomonas, Potassium Deficiency, Banana Aphids, Black Sigatoka and Yellow Sigatoka.
Further, for early identification of diseases, an enhanced dataset, namely the
disease-deficiency dataset, has been formed by adding 7 mineral deficiency classes to the
existing combined dataset (giving 1 healthy class, 4 disease classes and 8 deficiency
classes). A sketch of how the combined dataset can be assembled is given below.
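As a minimal sketch of how such a combined dataset can be assembled in MATLAB (the folder names 'Mendeley' and 'MUSA' are placeholders, and each class is assumed to sit in its own sub-folder), the two image sets can be pooled into a single labelled datastore:

```matlab
% Sketch: building the combined dataset from the Mendeley and MUSA images.
% Folder names are placeholders; one sub-folder per class is assumed.
imdsMendeley = imageDatastore('Mendeley', 'IncludeSubfolders', true, ...
                              'LabelSource', 'foldernames');
imdsMusa     = imageDatastore('MUSA',     'IncludeSubfolders', true, ...
                              'LabelSource', 'foldernames');

imdsCombined        = imageDatastore([imdsMendeley.Files; imdsMusa.Files]);  % pool the file lists
imdsCombined.Labels = [imdsMendeley.Labels; imdsMusa.Labels];                % pool the class labels

countEachLabel(imdsCombined)   % per-class image counts for the six combined classes
```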
CHAPTER 2
IMAGE PRE-PROCESSING TECHNIQUES
2.1 IMAGE ENHANCEMENT TECHNIQUES
2.1.1 Image Enhancement Algorithms
2.1.1.1 Histogram Equalization
Histogram equalization maps each input intensity through a transformation function f:

Y = f(X) = {f(X(i, j)) | ∀ X(i, j) ∈ X}

where X = {X(i, j)} denotes a given image composed of L discrete gray levels
{X0, X1, ..., XL−1}, X(i, j) represents the intensity of the image at the spatial location
(i, j), and X(i, j) ∈ {X0, X1, ..., XL−1}.
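As a brief illustration (assuming MATLAB's Image Processing Toolbox and a placeholder file name), global histogram equalization of a leaf image can be performed with histeq:

```matlab
% Minimal sketch: global histogram equalization of a banana leaf image.
% 'leaf.jpg' is a placeholder file name.
I = imread('leaf.jpg');
if size(I, 3) == 3
    I = rgb2gray(I);          % equalize the grayscale version of the image
end
J = histeq(I);                % histogram equalization (64 output bins by default)
imshowpair(I, J, 'montage');  % original vs. equalized image
```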
2.1.1.2 Brightness-Preserving Bi-Histogram Equalization
BBHE divides the input histogram into two sub-histograms at the mean intensity
of the image and independently equalizes each sub-histogram to enhance the image. It
retains the mean brightness while reducing the saturation effect, thereby avoiding
abnormal enhancement and undesirable artifacts.
The generalized decomposition used by the bi-histogram-based methods is given below,
where Xm represents the intensity value that separates the histogram into two parts:

X = XL ∪ XU
XL = {X(i, j) | X(i, j) ≤ Xm, ∀ X(i, j) ∈ X} (1)
XU = {X(i, j) | X(i, j) > Xm, ∀ X(i, j) ∈ X} (2)
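A simplified sketch of the BBHE idea is given below, assuming an 8-bit grayscale image; boundary cases are not handled, and the code is only meant to mirror Eqs. (1)-(2) by equalizing the two sub-histograms independently.

```matlab
% Simplified BBHE sketch for an 8-bit grayscale image I (uint8): the histogram
% is split at the mean intensity Xm and each half is equalized independently.
function J = bbhe(I)
    Xm = round(mean(double(I(:))));         % mean intensity separates the histogram
    h  = imhist(I, 256);                    % 256-bin histogram of the input

    lut = zeros(256, 1);                    % intensity mapping (look-up table)

    cdfL = cumsum(h(1:Xm+1));               % lower sub-histogram, Eq. (1)
    lut(1:Xm+1) = (cdfL / cdfL(end)) * Xm;  % mapped into [0, Xm]

    if Xm < 255
        cdfU = cumsum(h(Xm+2:end));         % upper sub-histogram, Eq. (2)
        lut(Xm+2:end) = (Xm + 1) + (cdfU / cdfU(end)) * (254 - Xm);  % mapped into [Xm+1, 255]
    end

    J = uint8(lut(double(I) + 1));          % apply the mapping to every pixel
end
```

A typical call would be J = bbhe(rgb2gray(imread('leaf.jpg'))), with the file name again a placeholder.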
2.1.1.4 Contrast Stretching
In contrast stretching, the image is expanded to a wider dynamic range by applying
a linear scaling to the pixel values. To normalize or contrast-stretch the image, it is
necessary to determine its minimum and maximum intensity values; these values determine
the boundaries of the stretch. In this method, an 8-bit gray-level image is used, with 0 as
the lower limit and 255 as the upper limit. Digital images are acquired using a fundus
camera. Viewed from an outline perspective, digital image processing techniques are
divided into three categories based on the processing level:
1. Low-Level Process: basic operations in image processing, for example noise reduction,
image improvement, and image restoration.
2. Mid-Level Process: operations such as object description, image segmentation, and
separate object classification.
3. High-Level Process: the analysis of an image.

g(x, y) = ((f(x, y) − min) / (max − min)) × 255 (5)

where g(x, y) is the matrix of the resulting image and f(x, y) is the original image matrix,
with intensity values ranging from 0 (lowest) to 255 (highest) [17]. The function stretchlim
is used to determine the minimum and maximum values. The new image g(x, y) is obtained by
subtracting the minimum value from the input value f(x, y), dividing by the difference
between the maximum and minimum values, and multiplying the result by 255 to obtain the
output pixel value [19]. A short sketch of this operation is given below.
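As an illustration (assuming a grayscale uint8 image I has already been loaded), Eq. (5) can be applied directly; stretchlim with imadjust gives a closely related toolbox route, although stretchlim saturates 1% of the intensity tails by default, so the two results are not identical.

```matlab
% Contrast stretching per Eq. (5); I is assumed to be a grayscale uint8 image.
Imin = double(min(I(:)));
Imax = double(max(I(:)));
g = uint8((double(I) - Imin) ./ (Imax - Imin) * 255);   % Eq. (5)

% Closely related toolbox route mentioned in the text: stretchlim + imadjust
g2 = imadjust(I, stretchlim(I), []);
```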
2.1.1.5 Gaussian Filter for Noise Reduction

hg(x, y) = (1 / (2πσ²)) e^(−(x² + y²) / (2σ²)) (6)

U(x, y) represents the unsharp mask obtained from the Gaussian kernel. The sharpened
image F(x, y) is obtained by

F(x, y) = (c / (2c − 1)) · I(x, y) − ((1 − c) / (2c − 1)) · U(x, y) (8)

where c lies between 0.5 and 1, and σ is the standard deviation calculated from the input
image [16].
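A sketch of this smoothing and sharpening step is shown below; I is assumed to be a grayscale image of class double in [0, 1], and the values of sigma and c are illustrative rather than taken from the report.

```matlab
% Gaussian smoothing and the sharpening rule of Eq. (8).
sigma = 1.5;                                  % standard deviation of the Gaussian kernel, Eq. (6)
c     = 0.8;                                  % sharpening weight, with 0.5 < c <= 1

U = imgaussfilt(I, sigma);                    % Gaussian-blurred image used as the unsharp mask U(x, y)
F = (c/(2*c - 1)) .* I - ((1 - c)/(2*c - 1)) .* U;   % sharpened image, Eq. (8)
F = min(max(F, 0), 1);                        % clip back to the valid display range
```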
Single image super resolution (SISR) is an ill-posed problem due to information loss [22].

2.1.1.9 Adaptive Gamma Correction with Weighting Distribution
In AGCWD, the transformed intensity for level l is

T(l) = lmax (l / lmax)^γ = lmax (l / lmax)^(1 − CDFw(l)) (9)

The weighted probability density function is

PDFw(l) = PDFmax ((PDF(l) − PDFmin) / (PDFmax − PDFmin))^α (10)

where α is the adjustment parameter, PDFmax is the maximum PDF of the statistical histogram,
and PDFmin is the minimum PDF. The modified CDF is then obtained by normalizing with

ΣPDFw = Σ (l = 0 to lmax) PDFw(l) (11)

and gamma is calculated as γ = 1 − CDFw(l).
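The AGCWD mapping of Eqs. (9)-(11) can be sketched as follows for an 8-bit grayscale image I; the value of the adjustment parameter alpha is an assumption for illustration.

```matlab
% Sketch of AGCWD for an 8-bit grayscale image I (uint8).
alpha = 0.5;                                        % adjustment parameter of Eq. (10)
lmax  = 255;

h   = imhist(I, 256);
pdf = h / sum(h);                                   % statistical histogram (PDF)

% Weighted PDF, Eq. (10)
pdfw = max(pdf) * ((pdf - min(pdf)) / (max(pdf) - min(pdf))).^alpha;

% Weighted CDF built with the normalizer of Eq. (11)
cdfw = cumsum(pdfw) / sum(pdfw);

% Gamma mapping, Eq. (9): gamma = 1 - CDFw(l)
l = (0:lmax)';
T = lmax * (l / lmax).^(1 - cdfw);

J = uint8(T(double(I) + 1));                        % apply the intensity mapping
```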
2.1.2 Performance Metrics
The image enhancement algorithms are evaluated using the following metrics: Absolute
Mean Brightness Error (AMBE), Structural Similarity Index (SSIM), Peak Signal to Noise
Ratio (PSNR), Discrete Entropy (DE), Enhancement Measure (EME) and Lightness Order Error (LOE).

2.1.2.1 Absolute Mean Brightness Error

AMBE = |M(I) − M(J)| (12)

where M(I) and M(J) represent the mean values of the low-contrast image I and the enhanced
image J respectively. Good preservation of the original image brightness is linked to a lower
value of AMBE [24].

2.1.2.3 Peak Signal to Noise Ratio

PSNR = 10 log10[(L − 1)² / ((1/n) Σx Σy |I(x, y) − J(x, y)|²)] (14)

where (L − 1) indicates the maximum intensity of the image and n is the total number of
pixels in the image [16].
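As an illustration (assuming grayscale uint8 images I (original) and J (enhanced), and the Image Processing Toolbox functions psnr, ssim and entropy), the main metrics can be computed as follows; AMBE follows Eq. (12):

```matlab
% Sketch: computing AMBE, PSNR, SSIM and discrete entropy for an original
% image I and an enhanced image J (both grayscale uint8).
ambeVal = abs(mean(double(I(:))) - mean(double(J(:))));   % Eq. (12)
psnrVal = psnr(J, I);            % peak signal-to-noise ratio in dB
ssimVal = ssim(J, I);            % structural similarity index
deVal   = entropy(J);            % discrete entropy of the enhanced image
fprintf('AMBE=%.4f  PSNR=%.2f dB  SSIM=%.4f  DE=%.4f\n', ...
        ambeVal, psnrVal, ssimVal, deVal);
```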
A small discrete entropy (DE) value indicates that the image has low contrast. Comparing
the discrete entropy obtained for each algorithm shows that, in most cases, Exact Histogram
Specification (EHS) yields a large entropy; hence EHS also performs better than the other
methods if importance is given to the contrast of the enhanced image [18].
EME = (1/n) Σ 20 log(Imax / Imin) (16)

where the image is divided into n blocks, and Imax and Imin denote the maximum and minimum
intensities within each block.
From the definition of LOE, it can be seen that the smaller the LOE value, the better the
lightness order is preserved. In order to reduce the computational complexity, down-sampled
versions DL and DLe of size dm × dn are used instead of L and Le. The ratio r between the
size of the down-sampled image and that of the original image is set as r = 50 / min(m, n).
As a result, the size dm × dn of the down-sampled image is [m · r] × [n · r] [26].
Table 3. Qualitative Analysis of Image Processing Techniques
(The table presents the original image alongside the enhanced images produced by the algorithms, including Histogram Equalization, Bi-Histogram Equalization (BBHE) and Dynamic Stretching-Based Brightness Preservation; the image panels are not reproduced here.)
Table 4. Quantitative Analysis of Image Processing Techniques (Original DE = 5.8517)
AMBE = Absolute Mean Brightness Error; SSIM = Structural Similarity Index; PSNR = Peak Signal to Noise Ratio (in dB); DE = Discrete Entropy; EME = Enhancement Measure; LOE = Lightness Order Error.

S.No  Technique                                                        AMBE     SSIM     PSNR (dB)  DE       EME      LOE
1     Histogram Equalization                                           0.2353   0.5126   35.84      3.8794   3.8825   7361.6994
2     Bi-Histogram Equalization (BBHE)                                 0.2140   0.5398   35.94      4.2585   4.3564   6567.2578
3     Dynamic Stretching-Based Brightness Preservation                 0.2466   0.5114   36.52      3.3149   3.7265   7993.2008
4     Contrast Stretching                                              0.2584   0.4993   36.94      3.1651   3.5541   8341.4552
5     Gaussian Filter for Noise Reduction                              0.2586   0.5089   37.77      3.1651   3.6583   8165.0184
6     Contrast Limited Adaptive Histogram Equalization (CLAHE)         0.2570   0.4739   35.48      3.3932   3.3433   8522.1691
7     Dualistic Sub-Image Histogram Equalization (DSIHE)               0.2372   0.5111   36.09      3.8694   3.8678   7369.6411
8     Single Image Super Resolution (SISR)                             0.1658   0.6138   34.95      4.2901   5.4656   5518.2517
9     Adaptive Gamma Correction with Weighting Distribution (AGCWD)    0.2281   0.5021   33.97      4.3349   3.6169   8452.1307
A higher SSIM value is preferred; SISR has the highest SSIM of 0.6138. The higher the PSNR,
the better the noise rejection; the Gaussian filter (GF) provides the highest PSNR of 37.77 dB.
EME is an enhancement parameter, so a higher value is preferred; the EME of SISR is 5.4656.
Based on the performance analysis, the SISR enhancement algorithm performed better than the
other algorithms on every metric except PSNR. Therefore, in this project, SISR is selected
for enhancing the banana leaf images.
2.2 DATA AUGMENTATION
Class imbalance is the condition in which the numbers of images in the different classes
are not equal; as a result, the classes with more images are trained more frequently. Hence,
the model classifies the classes with more images correctly compared to the classes with
fewer images. In order to make the deep learning model more robust, the dataset is augmented
and a limit is set on the number of images used per class for classification. Data
augmentation techniques such as rotation, scaling and reflection are used to increase the
number of images, and the augmented dataset is limited to 1500 images in each class to
eliminate the class imbalance problem. The combined augmented dataset is shown in Table 6.
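As a sketch of this step (folder names and augmentation ranges are assumptions, not taken from the report), MATLAB's imageDataAugmenter covers the rotation, scaling and reflection operations mentioned above. Note that an augmentedImageDatastore applies these transformations on the fly, whereas the report instead caps each class at 1500 stored images.

```matlab
% Sketch: augmenting the banana leaf dataset with rotation, scaling and
% reflection. 'dataset' is a placeholder folder with one sub-folder per class.
imds = imageDatastore('dataset', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

aug = imageDataAugmenter( ...
    'RandRotation',    [-30 30], ...    % random rotation (degrees)
    'RandScale',       [0.8 1.2], ...   % random scaling
    'RandXReflection', true);           % horizontal reflection

% Augmented datastore that feeds 227x227 RGB images (AlexNet input size)
augimds = augmentedImageDatastore([227 227 3], imds, ...
    'DataAugmentation', aug);
```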
CHAPTER 3
DEEP LEARNING TECHNIQUES
3.1 METHODOLOGY
After pre-processing the original images, image classification is performed using the
state-of-the-art architectures AlexNet [21] and VGG-16 [22]. AlexNet has 8 layers with
learnable parameters (5 convolution layers and 3 fully connected layers), along with 3
max-pooling layers. The total number of parameters in the AlexNet model is 62.3 million.
VGG-16 has 16 layers; it is a large network with approximately 138 million parameters.
The activation function used in all layers of AlexNet, VGG-16 and ResNet-50 is ReLU,
except for the output layers, where Softmax is used. The inputs to the CNN architectures
are RGB images, and their size varies based on the size of the input layer of each
architecture.
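A sketch of how such a model can be configured in MATLAB's Deep Learning Toolbox is shown below. The layer replacement shown is for the transfer-learning case using the stock pretrained alexnet (layer indices 23 and 25 refer to its final fully connected and classification layers); when training without transferred weights, as finally chosen in this report, the layers would instead be initialized randomly. The datastore names and training options are assumptions for illustration and match the Model1 settings discussed later.

```matlab
% Sketch: configuring AlexNet-style training. The datastores imdsTrain,
% augimdsTrain and augimdsTest are assumed to have been prepared beforehand;
% the AlexNet support package for the Deep Learning Toolbox is assumed installed.
net    = alexnet;                                   % pretrained AlexNet (transfer-learning case)
layers = net.Layers;

numClasses = numel(categories(imdsTrain.Labels));   % e.g. 6 for the combined dataset
layers(23) = fullyConnectedLayer(numClasses);       % replace the final fully connected layer
layers(25) = classificationLayer;                   % replace the classification output layer

options = trainingOptions('sgdm', ...               % SGDM, batch size 32, LR 0.001 (Model1 settings)
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs',        15, ...
    'MiniBatchSize',    32, ...
    'Shuffle',          'every-epoch', ...
    'ValidationData',   augimdsTest, ...
    'Verbose',          false);

trainedNet = trainNetwork(augimdsTrain, layers, options);
```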
3.2 PERFORMANCE METRICS

Precision = TP / (TP + FP) (19)

Recall = TP / (TP + FN) (20)

F1 Score = 2 × (Precision × Recall) / (Precision + Recall) (21)

Specificity = TN / (TN + FP) (22)

where TN = True Negative, TP = True Positive, FN = False Negative and FP = False Positive.
A true positive is an outcome where the model correctly predicts the healthy class, a true
negative is an outcome where the model correctly predicts the diseased class, a false
positive is produced when the model incorrectly predicts the healthy class, and a false
negative is an outcome where the model incorrectly predicts the diseased class. Accuracy is
the ratio of correctly predicted observations to the total observations. Precision is the
ratio of correctly predicted healthy observations to the total predicted healthy
observations. Recall is the ratio of correctly predicted images to all the images in that
particular class. F1-score is the weighted average of precision and recall.
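A sketch of how Eqs. (19)-(22) can be evaluated from the predictions is given below; YTrue and YPred are assumed to be categorical label vectors (for example from classify), and the counts are computed per class in a one-vs-rest manner.

```matlab
% Sketch: deriving the metrics of Eqs. (19)-(22) from predicted and true labels.
C = confusionmat(YTrue, YPred);          % confusion matrix (rows: true classes)

% Per-class one-vs-rest counts
TP = diag(C);
FP = sum(C, 1)' - TP;
FN = sum(C, 2)  - TP;
TN = sum(C(:))  - TP - FP - FN;

precision   = TP ./ (TP + FP);           % Eq. (19)
recall      = TP ./ (TP + FN);           % Eq. (20)
f1score     = 2 * (precision .* recall) ./ (precision + recall);   % Eq. (21)
specificity = TN ./ (TN + FP);           % Eq. (22)
accuracy    = sum(TP) / sum(C(:));       % overall accuracy
```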
Fig. 1. Analysis on AlexNet Architecture.
For the VGG-16 model with a batch size of 32, the TL model gives higher accuracy than
the model without TL, but for a batch size of 64, the model without TL gives higher accuracy
than the TL model. Comparing AlexNet and VGG-16, AlexNet is composed of fewer layers, so
the model size is smaller and the memory required to store the model on a cloud platform is
lower. The AlexNet and VGG-16 models have 62.30 and 138 million parameters respectively;
consequently, the training time for VGG-16 increases significantly. The AlexNet model takes
4.91 minutes to train with a batch size of 32, whereas the VGG-16 model takes 11.18 hours.
Thus, the AlexNet model is chosen for the banana leaf disease classification task.

For the VGG-16 architecture, the TL models produce high accuracy for a batch size of 32,
but for a batch size of 64 the accuracy is much lower. Models trained with transfer learning
can learn faster. There is also a significant decrease in training time for the VGG-16 model
with transfer learning, but only about a one-minute decrease for the AlexNet model. In order
to avoid ambiguity in the analysis, the models trained without transfer of weights are taken
into consideration. The batch size used for training is fixed at 32, because the training
time of the VGG-16 model changes drastically with batch size while no significant change in
accuracy is observed. Hence, the AlexNet model without transfer of weights, trained with a
batch size of 32, is chosen for further analysis.
Thus, the AlexNet model without pre-trained weights is trained with various training
options: the batch size is fixed at 32 and the number of epochs is varied as 5, 10 and 15.
The learning rates considered are 0.1, 0.01, 0.001 and 0.0001. For the higher learning rates,
the models do not produce the expected results; hence the learning rate is restricted to
0.001 and 0.0001. Fig. 3 shows the results of the AlexNet model trained using the SGDM
optimizer with a batch size of 32. It can be inferred from Fig. 3 that the training and
testing accuracies for 15 epochs and an LR of 0.0001 are 85.16% and 82.73% respectively.
Similarly, the training and testing accuracies for 15 epochs and an LR of 0.001 are 87.33%
and 86.33% respectively. Thus, with SGDM as the optimizer, the model performs well for LRs
of 0.0001 and 0.001 at 15 epochs. Fig. 4 shows the results of training the AlexNet model
using the ADAM optimizer with a batch size of 32. It can be inferred from Fig. 4 that the
training and testing accuracies for 5 epochs and an LR of 0.0001 are 84.74% and 86.07%
respectively. Similarly, the training and testing accuracies for 10 epochs and an LR of
0.0001 are 88.9% and 91.36% respectively. Thus, with ADAM as the optimizer, the model
performs well for an LR of 0.0001 at 5 and 10 epochs. In this way, four models with higher
accuracy using the SGDM and ADAM optimizers are identified. These models are further trained
on the different datasets to obtain a suitable model for banana leaf disease classification.
But Model4 achieves better accuracy in a lesser number of epochs as compared to Model1 for the Mendeley dataset.
Fig. 6 shows the analysis of the AlexNet models on the MUSA dataset. From Fig. 6, it can
be identified that Model1 gives training and testing accuracies of 91.2% and 89.6%
respectively. Model3 provides almost the same training and testing accuracies of 90% and
87.73% respectively, and Model4 provides training and testing accuracies of 92% and 90.67%
respectively. The precision, recall, specificity and F1-score of Model1 and Model4 are
identified to be the best. The training time is 25.25 mins for Model1 and 24.56 mins for
Model4, so there is no significant difference in training time between the models, but
Model4 achieves better accuracy in a lesser number of epochs as compared to Model1 for the
MUSA dataset as well.

Fig. 7 shows the analysis of the AlexNet models on the Combined dataset. From Fig. 7, it
can be identified that Model1 gives the highest training and testing accuracies of 89.67%
and 87.42% respectively. Model2 provides training and testing accuracies of 83.92% and
83.42% respectively, and Model3 provides training and testing accuracies of 85.25% and
78.55% respectively. The precision, recall, specificity and F1-score of Model1 are
identified to be the best. The training time is 58.58 mins for Model1 and 25.32 mins for
Model4. Model3 achieves reasonable accuracy in a lesser number of epochs, while increasing
the number of epochs does not show any significant improvement in the accuracy of the model.

From the above analysis, Model1 and Model4 performed better for the smaller datasets,
while Model4 did not produce the expected performance for the Combined dataset. Model1
consistently produces high accuracy, precision, recall, specificity and F1-score in all the
cases. Hence, the AlexNet model with a learning rate of 0.001, the SGDM optimizer and 15
epochs can be chosen to perform banana leaf disease classification.
Fig. 4. Analysis on AlexNet Architecture with ADAM.
Fig. 6. Analysis on MUSA dataset.
Table 7. Mendeley dataset analysis
Tables 7, 8, and 9 show the performance metrics for the Mendeley, MUSA and combined
datasets respectively, which are also shown in Fig. 5, Fig. 6 and Fig. 7 respectively.
CHAPTER 4
CLASSIFIER MODEL
4.1 PROPOSED MODEL
The block diagram of the smart banana leaf disease classifier using CNN is shown in
Fig. 8. The proposed DL model is trained using the datasets in [6] and [8]. The dataset is
divided in a 70:30 ratio for training and testing respectively. The real-time image collected
from the banana farm is sent to Dropbox. The MATLAB analysis tab in ThingSpeak is used to
download the trained (.mat) file from Dropbox. The '.mat' file covers i) image enhancement
using the SISR technique, ii) identification of the plant part using the pre-trained LeNet
classifier, and iii) classification of the image using the pre-trained deep learning model.
Once the image is received, it is enhanced, the LeNet classifier identifies the plant part,
and the pre-trained deep learning model is used in the ThingSpeak cloud to classify the
image. The predicted class is shown in the MATLAB analysis tab.
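The flow described above can be sketched as follows. This is only an outline under stated assumptions: the .mat file contents (leNetPartNet, diseaseNet), the enhanceSISR helper, the class name "Leaf" and the resize dimensions are all placeholders, not names taken from the report.

```matlab
% Sketch of the three-stage pipeline: enhance, check plant part, classify.
load('classifierModels.mat', 'leNetPartNet', 'diseaseNet');   % assumed contents of the .mat file

I = imread('input.jpg');                      % image received from the farm (placeholder name)
I = enhanceSISR(I);                           % (i) image enhancement using SISR (assumed helper)

partLabel = classify(leNetPartNet, imresize(I, [32 32]));     % (ii) plant-part check with LeNet
if partLabel ~= "Leaf"
    disp('Please provide a banana leaf image.');
else
    diseaseLabel = classify(diseaseNet, imresize(I, [227 227]));   % (iii) disease classification
    disp("Predicted class: " + string(diseaseLabel));
end
```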
4.2 DATASET
A new dataset has been formed by adding 7 mineral deficiency classes (taken from the
Mendeley dataset) to the combined dataset (6 classes), and it is named the disease-deficiency
dataset. As deficiency of these minerals makes the plant more prone to disease, early
identification of mineral deficiency is required. Table 10 shows the disease-deficiency
dataset description.
Table 10. Disease-deficiency Dataset Description
Classes No. of images
Healthy 1500
Banana aphids 1500
Black sigatoka 1500
Boron deficiency 1500
Calcium deficiency 1500
Iron deficiency 1500
Magnesium deficiency 1500
Manganese deficiency 1500
Potassium deficiency 1500
Sulphur deficiency 1500
Xanthomonas 1500
Yellow sigatoka 1500
Zinc deficiency 1500
Total no. of images 19500
Performance of the AlexNet models on the disease-deficiency dataset (Model-1, Model-3 and Model-4):

Metric               Model-1    Model-3    Model-4
Training Accuracy    71.08      67.94      74.67
Training time        5.89       2.47       4.18
Precision (in %)     51.41      47.85      57.33
Accuracy (in %)      67.24      62.57      75.49
Specificity (in %)   97.97      97.66      98.48
F1-Score (in %)      54.58      51.96      61.06
Recall (in %)        62.19      58.18      68.41
From the analysis done on the disease-deficiency dataset, it is found that the highest
achievable training accuracy, 74.67%, is given by Model 4. The models that performed better
in the combined dataset analysis do not perform well in the disease-deficiency dataset
analysis due to overfitting. Figs. 10, 11 and 12 show the confusion matrices of Model 1,
Model 3 and Model 4 respectively.
4.3 DISCUSSION ON PLANT PARTS CLASSIFIER
4.3.1 DATASET
For training the LeNet, a dataset with three classes has been used; it is described in
Table 12. Table 13 shows the performance parameters of the trained LeNet model.

Table 12. Dataset Description
Classes   No. of images
Leaf      500
Stem      500
Fruit     500
Total     1500
Table 13. Analysis on Dataset (LeNet)
Parameters Value
Training Accuracy (in %) 84.33
Training time (in sec) 49
Precision (in %) 91.13
Recall (in %) 91.28
Accuracy (in %) 91.13
Specificity (in %) 95.62
F1-Score (in %) 91.11
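For reference, a LeNet-style network for the three plant-part classes could be defined as below. This is only a sketch, since the report does not list the exact LeNet configuration used; the input size, filter counts and activations are assumptions.

```matlab
% Sketch of a LeNet-style CNN for the three plant-part classes (leaf/stem/fruit).
% Input size and filter counts are illustrative assumptions.
layers = [
    imageInputLayer([32 32 1])                 % grayscale 32x32 input (assumed)
    convolution2dLayer(5, 6, 'Padding', 'same')
    tanhLayer
    averagePooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(5, 16)
    tanhLayer
    averagePooling2dLayer(2, 'Stride', 2)
    fullyConnectedLayer(120)
    tanhLayer
    fullyConnectedLayer(84)
    tanhLayer
    fullyConnectedLayer(3)                     % three classes: leaf, stem, fruit
    softmaxLayer
    classificationLayer];
```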
CHAPTER 5
ACCURACY IMPROVEMENT
where y* predicts the class label via majority (plurality) voting over the individual
classifiers Cn. The weighted form is

y* = argmax Σ (j = 1 to n) Wj pij (24)

where Wj is the weight that can be assigned to the j-th classifier and pij is the score of
class i given by classifier j [27]. Table 14 shows the accuracies of the various optimization
techniques used. It is found that the ensemble model gives a higher accuracy of 96.15% as
compared to Model 4, the structural-modification model and the transfer-learning model.
Therefore, the ensemble model is preferred for performing the classification in the cloud.
Table 14. Improvement in accuracy after Ensemble Techniques.
S. No Model Accuracy
1 Transfer learning model 94%
3 Structural modification model 89.63%
4 Ensemble Model 96.15%
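A sketch of weighted soft voting over several trained networks, in the spirit of Eq. (24), is given below; the networks netA, netB and netC, their equal weights, and the preprocessed input image I are assumptions, and all ensemble members are assumed to share the same input size.

```matlab
% Sketch: ensemble prediction by weighted soft voting over trained networks.
nets    = {netA, netB, netC};          % assumed: individually trained classifiers
weights = [1 1 1];                     % Wj; equal weights as a simple default

scores = 0;
for j = 1:numel(nets)
    [~, p] = classify(nets{j}, I);     % p(i) = score of class i from classifier j
    scores = scores + weights(j) * p;  % accumulate weighted class scores
end
[~, idx]   = max(scores);              % arg max over classes, as in Eq. (24)
classNames = nets{1}.Layers(end).Classes;
yStar      = classNames(idx);
```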
CHAPTER 6
CLOUD OPERATIONS
6.1 CLOUD CONFIGURATION
6.1.1 ThingSpeak
ThingSpeak is open-source software written in Ruby that allows users to communicate
with internet-enabled devices. It facilitates data access, retrieval and logging by providing
an API to both the devices and social network websites. Fig. 15 shows the working of the
ThingSpeak platform.
6.1.2 Dropbox
Dropbox is a cloud storage solution equipped with features that help save time, improve
productivity, and support collaboration. Files, documents, and photos can be stored online
and accessed from any device. To load the trained DL models into ThingSpeak, an App is
created in Dropbox, and an access token with read and write permission for the App is
generated.
6.1.3 ThingSpeak and Dropbox Configuration
The MATLAB analysis window in ThingSpeak is used to download the model from Dropbox using
the function "downloadFromDropbox". The image uploaded to Dropbox is the input image; it is
resized accordingly in ThingSpeak and given as input to the trained DL model (.mat file).
If the input image is not a leaf image, the classifier requests a leaf image by printing a
message. If a leaf image is given as input, the classifier classifies the image and displays
the class name as output.
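For illustration, the download step can be sketched with MATLAB's generic web functions against the Dropbox HTTP API; the access token and file path are placeholders, and whether this is done directly or through the downloadFromDropbox helper mentioned above depends on the ThingSpeak configuration used.

```matlab
% Sketch: fetching a file from Dropbox inside a ThingSpeak MATLAB analysis.
% ACCESS_TOKEN and the Dropbox path are placeholders.
token   = 'ACCESS_TOKEN';
dbxPath = '/models/classifierModels.mat';

opts = weboptions('RequestMethod', 'post', 'HeaderFields', { ...
    'Authorization',   ['Bearer ' token]; ...
    'Dropbox-API-Arg', sprintf('{"path": "%s"}', dbxPath)});

% Dropbox API v2 content endpoint for downloading a file
localFile = websave('classifierModels.mat', ...
    'https://content.dropboxapi.com/2/files/download', opts);

S = load(localFile);      % the .mat file holding the trained model(s)
```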
Fig. 16 shows the MATLAB analysis window. Fig. 17 shows the classification output of the DL model.
CHAPTER 7
CONCLUSION
7.1 CONCLUSION
Nine image enhancement techniques are used to enhance the images. Based on metrics such
as Absolute Mean Brightness Error (AMBE), Discrete Entropy (DE), Structural Similarity Index
(SSIM), Enhancement Measure (EME) and Lightness Order Error (LOE), it is found that Single
Image Super Resolution (SISR) is the most efficient enhancement technique among the nine
techniques.
The performance of the AlexNet and VGG-16 models is optimized through extensive
experimentation and fine-tuning. Through data augmentation and systematic variation of the
training parameters, including the number of epochs and the learning rate, the aim is to
identify the most accurate and efficient model configuration. Model1 (EPOCH=15; LR=0.001;
SGDM), Model2 (EPOCH=15; LR=0.0001; SGDM), Model3 (EPOCH=5; LR=0.0001; ADAM) and Model4
(EPOCH=10; LR=0.0001; ADAM) are trained on three different datasets, namely the Mendeley
dataset, the MUSA dataset and the Combined dataset. Out of these four models, Model1
provided higher training and testing accuracy when the Mendeley dataset is used, while
Model4 provided higher accuracy, precision, recall and specificity for the same dataset.
In the case of the MUSA dataset, Model4 performs well both in terms of training and testing
accuracy and in terms of accuracy, precision, recall and specificity. Model1 provided the
highest accuracy, precision, recall, F1-score and specificity, followed by Model4, when the
combined dataset is used. Model3 takes less training time as compared to the other models
considered.
From the analysis done on the disease-deficiency dataset, it is found that the highest
achievable training accuracy, 74.67%, is given by Model4. The models that performed better
in the combined dataset analysis do not perform well in the disease-deficiency dataset
analysis due to overfitting. Various optimization techniques such as transfer learning,
structural modification and ensemble techniques were used to improve the accuracy of the
deep learning model. The ensemble model gives the highest accuracy of 96.15% as compared
to the other models used.
The DL models are stored in Dropbox, and an access token is generated for Dropbox App
access. The MATLAB analysis tab in ThingSpeak is used to download the trained (.mat) file
from Dropbox. The input image is stored in Dropbox and downloaded into the ThingSpeak cloud
using MATLAB analysis; it is then resized accordingly in ThingSpeak and given as input to
the DL model. If the input image is not a leaf image, the classifier requests a leaf image
by printing a message. If a leaf image is given as input, the classifier classifies the
image and displays the class name as output.
3. Specific CNN models will be developed for banana leaf disease classification
PUBLICATION
REFERENCES
2020 Second International Conference on Inventive Research in Computing
Applications (ICIRCA). IEEE, 2020.
[13]. Ye, Huichun, et al. "Recognition of banana fusarium wilt based on UAV remote
sensing." Remote Sensing 12.6 (2020): 938.
[14]. Ridhovan, Andreanov, Aries Suharso, and Chaerur Rozikin. "Disease Detection in
Banana Leaf Plants using DenseNet and Inception Method." Jurnal RESTI (Rekayasa
Sistem dan Teknologi Informasi) 6.5 (2022): 710-718
[15]. Evuri, Sai Rajasekhar Reddy. “Banana Leaf Disease Detection With Multi Feature
Extraction Techniques Using SVM”. Diss. Dublin, National College of Ireland, 2022.
[16]. Vijayalakshmi, D., Malaya Kumar Nath, and Om Prakash Acharya. "A
comprehensive survey on image contrast enhancement techniques in spatial
domain." Sensing and Imaging 21.1 (2020): 40.
[17]. Kim, Yeong-Taeg. "Contrast enhancement using brightness preserving bi-histogram
equalization." IEEE transactions on Consumer Electronics 43.1 (1997): 1-8.
[18]. Rahman, M. A., et al. "Image contrast enhancement for brightness preservation based
on dynamic stretching." International Journal of Image Processing (IJIP) 9.4 (2015):
241.
[19]. Ningsih, Dwi Ratna. "Improving retinal image quality using the contrast stretching,
histogram equalization, and CLAHE methods with median filters." International
Journal of Image, Graphics and Signal Processing 10.2 (2020): 30.
[20]. Reza, Ali M. "Realization of the contrast limited adaptive histogram equalization
(CLAHE) for real-time image enhancement." Journal of VLSI signal processing
systems for signal, image and video technology 38 (2004): 35-44.
[21]. Wang, Yu, Qian Chen, and Baomin Zhang. "Image enhancement based on equal
area dualistic sub-image histogram equalization method." IEEE transactions on
Consumer Electronics 45.1 (1999): 68-75.
[22]. Fang, Faming, Juncheng Li, and Tieyong Zeng. "Soft-edge assisted network for
single image super-resolution." IEEE Transactions on Image Processing 29 (2020):
4656-4668.
[23]. Rani, Seema, and Manoj Kumar. "Contrast enhancement using improved adaptive
gamma correction with weighting distribution technique." International Journal of
Computer Applications 101.11 (2014).
[24]. Rahman, Shanto, et al. "Image enhancement in spatial domain: A comprehensive
study." 2014 17th international conference on computer and information technology
(ICCIT). IEEE, 2014.
[25]. Subramani, Bharath, and Magudeeswaran Veluchamy. "Fuzzy contextual
inference system for medical image enhancement." Measurement 148 (2019): 106967.
[26]. Wang, Shuhang, et al. "Naturalness preserved enhancement algorithm for non-
uniform illumination images." IEEE transactions on image processing 22.9 (2013):
3538-3548.
[27]. Mohammed, Ammar, and Rania Kora. "A comprehensive review on ensemble deep
learning: Opportunities and challenges." Journal of King Saud University-Computer
and Information Sciences (2023).