A Lightweight Meta-Ensemble Approach For Plant Disease Detection Suitable For IoT-Based Environments
INDEX TERMS Convolution neural network, ensemble, artificial intelligence, deep learning, Internet of
Things, disease detection.
2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License.
28096 For more information, see https://creativecommons.org/licenses/by/4.0/ VOLUME 12, 2024
R. Maurya et al.: Lightweight Meta-Ensemble Approach for Plant Disease Detection
[8] and attention-based techniques [9], [10], [11] have also been deployed for plant disease classification. However, the deployment of such models on low-powered Internet of Things (IoT) devices with limited computational resources remains a challenge.
Recently, the lightweight MLP-Mixer architecture has gained attention due to its lower architectural complexity and competitive performance on the ImageNet dataset [12]. This architecture, which relies solely on multi-layer perceptrons (MLPs) without convolution or attention mechanisms, presents a promising solution for resource-constrained IoT environments.
A. MOTIVATION AND CONTRIBUTION
Despite existing literature on machine learning and deep learning algorithms for plant disease diagnostics, there is still a need for lightweight solutions that can be easily implemented in resource-constrained environments with limited memory and computation power. This research presents a two-tier meta-ensemble approach to address the need for lightweight models that can be deployed in resource-constrained IoT-based settings for automated plant disease diagnosis. The proposed approach harnesses the benefits of the MLP-Mixer and Long Short-Term Memory (LSTM) models to improve classification performance while remaining suitable for use in resource-constrained contexts.
The adoption of the suggested meta-ensemble technique is supported by its lightweight nature, which makes it suited for deployment on resource-constrained platforms such as IoT devices. Integrating the MLP-Mixer and LSTM models into the proposed meta-ensemble allows their complementary capabilities to be used together, thereby enhancing the classification performance.
The rest of the article has been split up into the following sections: Section II details the related works. Section III provides details about the methods deployed in the proposed work. Section IV displays experimental results and provides detailed discussions of them. Section V provides a concrete outline of the proposed work.
II. RELATED WORKS
Some of the prior works related to the plant disease categorisation task are discussed in this section. This paragraph describes some of the convolutional neural network (CNN) based models proposed by different researchers. Zhao et al. [9] have proposed a method consisting of an inception module and residual connection for the identification of diseases related to the corn, potato and tomato plants. They also suggested the use of a web-based system for the real-time identification of plant diseases [9]. Pandey and Jain have proposed an attention-based dense CNN model for the detection of 44 diverse types of plant diseases using a dataset constructed from 10,851 images captured in the field and achieved 97.33% categorisation accuracy [13]. Bedi and Gole proposed a convolutional autoencoder and CNN-based method for the categorisation of Bacterial Spot disease of peach plants with 98.38% categorisation accuracy [14]. Ferentinos tested different CNNs such as AlexNet, GoogLeNet, and VGGNet for the categorisation of 17,548 images of 58 different classes of plant disease and obtained a categorisation accuracy of 99.53% [15]. The MobileNet model developed by Kamal et al. with depthwise separable convolution achieved 97.65% categorisation accuracy on the PlantVillage dataset [16]. A customised CNN model has been suggested by Chohan et al. for the classification of diseases in 15 distinct plants [17]. The InceptionResNet model was suggested by Hassan and Maji for the categorisation of 15 different plant disease types [18]. Atila et al. have proposed the EfficientNet model for the categorisation of 39 different diseases present in the PlantVillage (PV) dataset [19]. Amin et al. have proposed a method for corn leaf disease classification by combining the features extracted from the EfficientNetB0 and DenseNet121 deep CNN models and achieved 98.56% classification accuracy [20]. Maurya et al. have proposed a method for the classification of diseases present in the PlantVillage dataset using a pre-trained Vision Transformer network and interpreted the performance of the model using the GradCAM algorithm [21].
Some of the works under the miscellaneous category, proposed by different researchers for plant disease categorisation, are summarised as follows. Abbas et al. have utilised generative adversarial networks to produce synthetic images of the diseased leaves of the tomato plant [22]. Five different types of potato plant diseases have been classified with the DenseNet121 model with 97.11% categorisation accuracy. Thakur et al. have utilised the ViT architecture for the categorisation of images of plant diseases and achieved an average accuracy of more than 93% on the Apple, Maize, and Rice datasets [23]. For tomato leaf disease classification, Karthik et al. [24] proposed a strategy based on the use of the attention mechanism in a deep CNN. Their model achieved 98% categorisation accuracy when evaluated on 24,001 images [24]. Shah et al. suggested a teacher/student architecture for identifying 14 different plant diseases [25].
Most of the works discussed above used either convolution or attention mechanisms embedded in a CNN architecture. These models cannot be readily adapted to an IoT-based environment where memory and computational power are constrained. The Internet of Things faces several challenges, such as limited resources in terms of computing, power and memory capacity [26]. Therefore, the proposed work presents a lightweight approach that does not rely on convolution or attention mechanisms and is thereby well suited for IoT-based deployment. The proposed model also utilises a multi-tier meta-ensemble approach in which the prediction probabilities obtained from the trained models at the first level are used as a feature set to train the model at the second level. The meta-ensemble approach helps in further improving the categorisation performance of the proposed method.
FIGURE 1. Images of the samples taken from each dataset (a) Cotton
Dataset (b) Tomato/Potato/Pepper Dataset (c) Maize Dataset.
TABLE 2. Number of sample images present in each class.
A. SPLIT THE DATASET
Training and test sets have been created from the entire
dataset. While the test set photos were used to gauge how
well the proposed meta ensemble framework performed at
categorising images, the training set images were utilised to
train the models. The experimental findings section contains a
description of the number of sample photos that were utilised
for training and testing.
divided into several patches and each of these patches has been further projected into a D-dimensional space (here, D = 128) of fixed size; the projected embeddings are termed "tokens". The core functionality of the Mixer architecture lies in its mixer layers. Mixer layers are composed of MLPs that perform two different operations, i.e., mixing of the tokens and mixing of the channels. The token mixing allows the MLP Mixer architecture to learn the spatial relationship between the tokens (patch embeddings), whereas the channel-mixing MLP allows the model to learn the inter-relationship between the channels present within a single token.
Thus, in any mixer layer an input matrix of shape N × D (N = 9, D = 128), where N is the number of patches and D is the embedding dimension, passes through the token-mixing and channel-mixing MLPs by transposing the input matrix accordingly. An MLP in the mixer layer consists of two fully-connected (FCN) layers with Gaussian Error Linear Unit (GELU) non-linearity. In every mixer layer, layer normalisation, GELU non-linearity and skip connections between the two MLPs are used for a smoother flow of the gradient among the layers. A series of mixer layers of the same form makes up the MLP Mixer. Because increasing the number of mixer layers also makes the MLP Mixer more complex, a suitable number of mixer layers (seven in this case) has been chosen for the current plant disease classification task. The output of the final mixer layer is passed through the normalisation layer first, then the dropout layer (rate = 0.25), the global average pooling (GAP) layer, and finally the categorisation layer with the "softmax" activation function for each row of an input matrix. The mixer layer in the MLP Mixer model can be represented using Equations 1 and 2.
$$T_{*,i} = X_{*,i} + W_2\,\sigma\big(W_1\,\mathrm{LN}(X)_{*,i}\big), \quad \text{for } i = 1, \ldots, C, \tag{1}$$
$$J_{j,*} = T_{j,*} + W_4\,\sigma\big(W_3\,\mathrm{LN}(T)_{j,*}\big), \quad \text{for } j = 1, \ldots, S, \tag{2}$$
where T and J denote the output of the first and the second FCN layers. LN denotes the layer normalisation operation and σ denotes the GELU non-linearity. W_1, W_2, W_3 and W_4 denote the weight matrices. X denotes the input to the first FCN layer. C and S denote the number of channels and tokens respectively. The other hyperparameters related to the proposed MLP Mixer are presented in Section V.
b: LONG SHORT TERM MEMORY (LSTM)
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that addresses the vanishing gradient problem in regular RNNs. LSTMs regulate information flow by using a memory cell and three gates (input, forget, and output). The cell stores information across long sequences, allowing the network to capture and learn data dependencies more efficiently. Their capacity to handle long-range dependencies makes them well suited to a variety of sequential data applications. The reason for choosing the LSTM architecture for the proposed meta-ensemble is that the LSTM model applies operations directly to the data and uses neither convolution nor an attention mechanism. Therefore, LSTM is also lightweight in comparison to CNN and ViT architectures, and it has been deployed in the proposed meta-ensemble. It is not reasonable, however, to train the LSTM directly on raw input images; therefore, the features drawn from the last convolution layer of ImageNet-trained CNNs were used to train the proposed LSTM. Input images were resized to 64 × 64 × 3 before passing them as input to these ImageNet-trained CNNs for the extraction of features. These ImageNet-trained CNNs had their top-most FC layers removed, and the activations of the last convolution layer were then passed to a global average pooling layer to be used in feature extraction. The weights of the convolution
base in ImageNet-trained CNNs were kept frozen. The features drawn from four different pre-trained CNNs, namely MobileNet, DenseNet121, DenseNet169 and DenseNet201, were concatenated to form a feature set of dimensionality D = 5632. The dimensionality of the individual feature sets drawn from MobileNet, DenseNet121, DenseNet201 and DenseNet169 was 1024, 1024, 1536, and 2048, respectively. The concatenated features of shape (1, 5632) were given as input to the LSTM architecture. The value of the time-step chosen for the LSTM architecture was 1. The total number of cells chosen for the present categorisation task was 30. The number of trainable parameters present in the proposed LSTM was less than 0.3 million. The other hyperparameters used in the proposed LSTM are described in Section V.
The combination of LSTM and MLP-Mixer used at level 1 of the proposed method helps in learning the patch-level, channel-level and feature-level dependencies present in an input image. The LSTM model has been used to learn the distinguishing characteristics present in the one-dimensional combined feature vector obtained by combining the feature sets from the MobileNet, DenseNet121, DenseNet201 and DenseNet169 models. Both models, when used together in the meta-ensemble at level 1, give better classification performance with optimised run time in comparison to other combinations of models, such as other variants of the vision transformer model.
2) DESCRIPTION OF THE MODEL PRESENT AT LEVEL 2
As shown in Fig. 2, the proposed meta ensemble is composed of models present at two different levels. The predictions made by the models (MLP Mixer and LSTM) present at level 1 are used as a feature set to train the ML model (support vector machine) present at level 2.
Support Vector Machine (SVM): The SVM classifier is based on the theory of maximising the margin between separating hyperplanes [30]. SVM is well known for its good performance with a limited amount of training data [31]. Therefore, it has been chosen as the final classifier in the proposed two-level meta ensemble approach. SVM takes its input from the predictions made by the models present at level 1. The SVM classifier has been chosen due to its better performance in contrast to other ML classifiers such as Naïve Bayes, Random Forest and Nearest-Neighbor. The SVM classifier has also been found to be superior to other classifiers in the context of the current categorisation task, and the related experimental findings are presented in the results part of the current publication.
how the suggested technique has been divided into two levels: level 1 of the proposed meta ensemble contains the MLP Mixer and LSTM models, while level 2 of the ensemble contains the SVM classifier.
The overall objective of this study is to build a lightweight framework that can be deployed on IoT-based devices; therefore, MLP-Mixer and LSTM were found suitable for the meta ensemble since they are lightweight, accurate and fast at prediction time. Pre-processed augmented training set images were used to train the MLP Mixer model present at level 1, while the LSTM model has been trained on the concatenated features drawn from the four different ImageNet-trained CNNs. All images were resized to 64 × 64 before passing them to the ImageNet-trained CNNs for feature extraction. The features drawn from the pre-trained MobileNet, DenseNet121, DenseNet169 and DenseNet201 architectures were concatenated to form the combined feature vector of shape (1, 5632), which is used to train the LSTM. After training both models present at level 1, the prediction probabilities of these trained models were recorded by providing training set images as input to them. The shape of the prediction probability matrix obtained from each of these models was (No. of training images × num_classes). After concatenating the prediction probability vectors obtained from both trained models present at level 1, the combined feature representation matrix of shape (No. of training images × (2 × num_classes)) was used to train the SVM classifier present at level 2.
The following procedure has been used to test the proposed method: first, unseen test images were pre-processed in the same way as the training set images. Then the models (LSTM and MLP-Mixer) trained during the training phase were used to obtain the prediction probabilities by giving test set images as input to them. The prediction probabilities obtained from these models (LSTM and MLP-Mixer) were concatenated and then passed as input to the trained SVM classifier to make the final decision about the class of an input test set image. The testing phase of the proposed meta ensemble can be represented mathematically using Eq. 3-7.
$$X_{\text{test-LSTM}} = [F_{\text{MobileNet}}, F_{\text{DenseNet121}}, F_{\text{DenseNet169}}, F_{\text{DenseNet201}}] \tag{3}$$
$$P_{\text{LSTM}} = \mathrm{LSTM}(X_{\text{LSTM}}, \Theta_{\text{LSTM}}) \tag{4}$$
$$P_{\text{Mixer}} = \mathrm{MLP\_Mixer}(X_{\text{Mixer}}, \Theta_{\text{Mixer}}) \tag{5}$$
$$P_{\text{Concat}} = [P_{\text{Mixer}}, P_{\text{LSTM}}] \tag{6}$$
$$Y_{\text{Final}} = \mathrm{SVM\_Predict}(P_{\text{Concat}}, \Theta_{\text{SVM}}) \tag{7}$$
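The level-2 step in Eq. 6 and Eq. 7 can be illustrated with a minimal sketch, assuming scikit-learn is available. The level-1 probability matrices are replaced here by synthetic stand-ins produced by a hypothetical helper, `synthetic_level1_probs`, because the trained MLP Mixer and LSTM models are not part of this text. The grid values are those reported for the SVM in Section V; the lowercase `"rbf"` spelling and the 3-fold cross-validation are assumptions of this sketch, not details from the paper.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC


def stack_probabilities(p_mixer, p_lstm):
    """Eq. 6: concatenate level-1 probabilities into (n_samples, 2 * num_classes)."""
    return np.concatenate([p_mixer, p_lstm], axis=1)


def fit_level2_svm(p_concat_train, y_train, cv=3):
    """Grid-search the level-2 SVM over the values listed in Section V."""
    param_grid = {
        "C": [0.1, 1, 10, 100, 1000],
        "gamma": [1, 0.1, 0.01, 0.001, 0.0001],
        "kernel": ["linear", "rbf"],  # scikit-learn spells the RBF kernel in lowercase
    }
    search = GridSearchCV(SVC(), param_grid, cv=cv)  # cv=3 is an assumption
    search.fit(p_concat_train, y_train)
    return search.best_estimator_


def synthetic_level1_probs(y, num_classes, rng, sharpness=3.0):
    """Hypothetical stand-in for a trained level-1 model: noisy softmax outputs."""
    logits = rng.normal(size=(len(y), num_classes))
    logits[np.arange(len(y)), y] += sharpness  # favour the true class
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
num_classes = 4
y_train = rng.integers(0, num_classes, size=200)
y_test = rng.integers(0, num_classes, size=50)

# Training phase: level-1 probabilities on the training set feed the SVM.
P_train = stack_probabilities(
    synthetic_level1_probs(y_train, num_classes, rng),
    synthetic_level1_probs(y_train, num_classes, rng),
)
svm = fit_level2_svm(P_train, y_train)

# Testing phase (Eq. 6 and Eq. 7): same stacking, then the trained SVM decides.
P_test = stack_probabilities(
    synthetic_level1_probs(y_test, num_classes, rng),
    synthetic_level1_probs(y_test, num_classes, rng),
)
y_pred = svm.predict(P_test)
accuracy = float((y_pred == y_test).mean())
```

The SVM only ever sees a feature matrix with 2 × num_classes columns, which is why the second level adds little memory or compute on top of the two level-1 models.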
predicted probabilities of these models. After concatenating these probabilities into a vector, the final matrix denoted by P_Concat in Eq. 6 is used to test the SVM model trained during the training phase, as shown in Eq. 7.
The MLP-Mixer architecture consists of 7 mixer layers. A detailed overview of a single mixer layer, including the name, output size and number of parameters, is provided in Table 7.
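To make the token-mixing and channel-mixing steps of Eq. 1 and Eq. 2 concrete, the following is a minimal NumPy sketch of a single mixer layer, stacked seven times. It assumes S = 9 tokens and C = 128 channels as stated in the text; the hidden widths `D_S` and `D_C`, the random weights, and the reuse of one set of weights across all seven layers are illustrative assumptions only. Learnable scale and shift terms of layer normalisation are omitted.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalise each token (row) over its channels; learnable scale/shift omitted."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def gelu(x):
    """Tanh approximation of the GELU non-linearity (sigma in Eq. 1 and Eq. 2)."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))


def mixer_layer(X, W1, W2, W3, W4):
    """Apply one mixer layer to X of shape (S tokens, C channels)."""
    # Eq. 1, token mixing: every channel column X[:, i] is mixed across tokens.
    T = X + W2 @ gelu(W1 @ layer_norm(X))
    # Eq. 2, channel mixing: every token row T[j, :] is mixed across channels.
    J = T + gelu(layer_norm(T) @ W3.T) @ W4.T
    return J


rng = np.random.default_rng(0)
S, C = 9, 128            # tokens (patches) and channels, as stated in the text
D_S, D_C = 64, 256       # hidden widths: hypothetical, not taken from the paper
X = rng.normal(size=(S, C))
W1 = rng.normal(scale=0.02, size=(D_S, S))
W2 = rng.normal(scale=0.02, size=(S, D_S))
W3 = rng.normal(scale=0.02, size=(D_C, C))
W4 = rng.normal(scale=0.02, size=(C, D_C))

out = X
for _ in range(7):       # seven mixer layers; weights are reused only for brevity
    out = mixer_layer(out, W1, W2, W3, W4)
print(out.shape)
```

Because both MLPs are wrapped in skip connections, the layer preserves the (S, C) shape, which is what allows identical layers to be stacked.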
TABLE 7. Detailed overview of single Mixer layer of MLP-Mixer model.
V. EXPERIMENTAL RESULTS AND DISCUSSIONS
Python 3.6 was used to implement each experiment, and the experiments were run on an Nvidia K80 GPU with 16 GB of RAM. The effectiveness of the suggested meta-ensemble has been evaluated using a variety of assessment measures, including precision, recall, F1 score, and accuracy. An ROC (receiver operating characteristic) curve has also been plotted for each class represented in each dataset. The dataset has been split in the ratio 0.8:0.2: 80% of the data was used for training and the remaining 20% was used for testing. The division of the entire dataset into a training set and a test set is shown in Table 4.
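As an illustration of the 0.8:0.2 split and of how the reported measures follow from a confusion matrix, a minimal sketch is given below, assuming scikit-learn. The stratified split, the fixed random seed and the toy labels are assumptions of this sketch; the text does not state whether stratification was used.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split


def per_class_metrics(cm):
    """Precision, recall and F1 per class, plus overall accuracy.

    cm follows the scikit-learn convention: rows are true classes,
    columns are predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / np.maximum(cm.sum(axis=0), 1e-12)  # column sums: predicted counts
    recall = tp / np.maximum(cm.sum(axis=1), 1e-12)     # row sums: true counts
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy


# Toy labels: 3 classes with 100 images each (hypothetical sizes).
labels = np.repeat(np.arange(3), 100)
indices = np.arange(len(labels))

# 80% training, 20% test; stratification keeps class proportions equal.
train_idx, test_idx = train_test_split(
    indices, test_size=0.2, stratify=labels, random_state=0
)

# Metrics from a small hand-made set of true and predicted labels.
y_true = np.array([0] * 10 + [1] * 10)
y_pred = np.array([0] * 8 + [1] * 2 + [0] * 1 + [1] * 9)
cm = confusion_matrix(y_true, y_pred)
precision, recall, f1, accuracy = per_class_metrics(cm)
```

The same `per_class_metrics` helper (a hypothetical name) can be applied to any of the confusion matrices produced by the final classifier.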
TABLE 4. The number of training and test set images present in each
dataset.
The hyperparameters of the different architectures used in the proposed meta ensemble are shown in Table 5 and Table 6. Table 5 shows the hyperparameters of the proposed MLP Mixer model and Table 6 shows the hyperparameters of the LSTM architecture used in designing the proposed meta-ensemble.
TABLE 5. Hyperparameters of the proposed MLP Mixer Architecture.
TABLE 6. Hyperparameters of the proposed LSTM architecture.
The SVM classifier used at the second level of the proposed meta-ensemble has been fine-tuned using a grid-search strategy. The grid search has been performed using the following values: 'C': [0.1, 1, 10, 100, 1000], 'gamma': [1, 0.1, 0.01, 0.001, 0.0001] and 'kernel': ['linear', 'RBF'].
Techniques such as dropout, layer normalization and advanced activation functions such as the Gaussian error linear unit (GELU) have been used to avoid local minima. Moreover, the performance of the proposed method has been analysed on the validation set and the unseen test set to measure the generalisability of the proposed model. To test the generalisation of the MLP Mixer and LSTM models used in the proposed meta ensemble, training and validation accuracy and loss curves have also been plotted for each dataset, as shown in Fig. 4 and Fig. 5 respectively. The training and validation accuracy curves indicate that the trained models have neither high bias nor high variance and that both models (MLP Mixer and LSTM) have converged. It can also be observed from Fig. 4 and Fig. 5 that the LSTM architecture converges faster than the MLP Mixer architecture.
The confusion matrices obtained for the final SVM classifier for each dataset are shown in Fig. 6(a), 6(b), and 6(c) for the TPP, Maize and Cotton datasets respectively. The performance metrics calculated from these confusion matrices are presented in Tables 8(a), 8(b) and 8(c) for each dataset. As shown in Table 8(a), in the case of the Maize dataset the worst F1 score of 0.83 has been obtained for the 'healthy' class, whereas the best F1 score of 1 has been obtained for the 'blight' and 'grey leaf spot' disease classes. An average categorisation accuracy of 94.27% has been obtained in the case of the Maize dataset. As shown in Table 8(b), for the Cotton dataset, the best F1 score of 0.99 has been obtained for the 'curl_virus' and 'healthy' classes. An average categorisation accuracy of 98.43% has been obtained in the case of the Cotton dataset. It can be seen from Table 8(c) that for the TPP dataset, the lowest F1 score of 0.89 has been obtained for the 'early blight' disease class and the lowest precision and recall have been obtained for the 'healthy' and 'spider mite' classes of the tomato plant. The highest F1 score of 0.99 has been obtained for two different classes of the tomato plant, 'late blight' and 'mosaic virus', whereas, in the case of the potato plant, the 'early blight' class has obtained the highest F1 score. An average categorisation accuracy of 97.45% has been obtained in the case of the TPP dataset.
FIGURE 4. The training and validation accuracy curves of the MLP mixer and LSTM models used in the proposed meta ensemble.
FIGURE 5. The training and validation loss curves of the MLP mixer and LSTM models used in the proposed meta ensemble.
As presented in Table 9, the number of parameters in the proposed meta ensemble is about one million. The time taken by the proposed meta ensemble, i.e., the time required in getting output from level 1 models (MLP Mixer
TABLE 8. Performance metrics for (a) Maize dataset (b) Cotton dataset
(c) TPP Dataset.
FIGURE 6. Confusion matrix for (a) TPP dataset, (b) Maize dataset, and
(c) Cotton Dataset.
TABLE 9. % Accuracy, prediction time, and total count of trainable parameters present in the models used in the proposed meta ensemble. The
prediction time of the proposed meta ensemble includes the time required to extract the features from LSTM and Mixer models present at level 1 and the
prediction time of the SVM classifier present at level 2. The number of trainable parameters includes the number of trainable parameters of LSTM and
Mixer models; SVM cannot be compared with other neural network-based models in terms of the number of parameters; therefore, no parameter has
been shown in the case of the SVM classifier.
TABLE 10. Comparison of the proposed method with other state-of-the-art vision transformer and convolutional neural networks.
TABLE 11. Comparison of the SVM classifier used in the proposed meta ensemble with other classifiers.
TABLE 12. Performance comparison of different methods on Cotton, Maize, and TPP datasets.
Other related methods, using different dataset sizes and different numbers of classes for similar plant disease categorisation tasks, have been compared with the proposed meta ensemble as presented in Table 12.
As shown in Table 12, the proposed two-level meta-ensemble composed of MLP-Mixer, LSTM and SVM has demonstrated superior categorisation accuracy across all the datasets used for comparison, despite its lightweight architecture. It has achieved 98.45%, 94.26% and 97.45% categorisation accuracy on the Cotton, Maize and TPP datasets respectively. The proposed framework is able to achieve this performance with a smaller dataset and fewer trainable parameters, highlighting its suitability and efficiency for deployment in resource-constrained IoT environments.
In contrast to Rai et al. [33], who utilised a customised CNN on the Cotton dataset, the proposed meta-ensemble surpasses their categorisation accuracy of 97.98% by achieving 98.45% accuracy even with a smaller dataset. Similarly, on the Maize dataset, the proposed work has outperformed Mishra et al. [37], Waheed et al. [38] and Arvind et al. [39], even with a smaller dataset. On the TPP dataset, the proposed meta-ensemble achieves the highest accuracy, surpassing Abbas et al. [22], who utilised conditional generative adversarial networks for data augmentation and DenseNet121 for classification. It has also surpassed the performance of the other methods listed in Table 12.
The fact that the proposed method performs best among all the compared methods, with a limited dataset size and few model parameters, makes it useful for deployment on resource-constrained IoT devices.
VI. CONCLUSION
This paper aims at building a lightweight framework for plant disease categorisation that can easily be deployed in a resource-constrained IoT-based environment. To meet this goal, a meta-ensemble approach has been proposed in this work, which is composed of lightweight state-of-the-art architectures such as MLP Mixer and LSTM. The modular and lightweight nature of the proposed model allows it to scale when integrated with resource-constrained IoT devices. The proposed model has been trained on three diverse datasets; therefore, it tends to learn common features across diverse types of plant diseases. In addition, the use of features extracted from multiple CNN models further supports the generalisability of the proposed solution. The proposed meta ensemble approach has achieved categorisation accuracies of 98.43%, 94.27% and 97.45% on the Cotton, Maize and TPP datasets respectively. Due to its lightweight nature, the proposed meta ensemble also shortens the prediction time. Thus, considering the overall benefits of the suggested method in terms of its accuracy, smaller number of trainable parameters and fast processing capability, the proposed meta ensemble is a strong candidate for deployment on Internet of Things-based platforms.
In future, the proposed method can be advanced so that it can be utilised in precision agriculture, by enhancing the capabilities of the model through training on multimodal data, including soil information and real-time weather conditions that affect plant health. This will help in improving the adaptability of the model and in automatically adjusting its predictions based on real-time changes in environmental conditions, promoting accurate and real-time response.
We see the necessity for a more thorough investigation of the policy implications to further enhance the conversation. By expanding the conclusion to discuss prospective investment possibilities and strategic considerations for policymakers, we want to offer insightful information that closes the knowledge gap between cutting-edge research and real-world applications. This improvement will serve the needs of investors and legislators, presenting our lightweight framework as a valuable resource in the field of agricultural technology.
DECLARATION OF COMPETING INTEREST
The authors affirm that they do not have any competing financial interests or personal relationships that could have influenced the reported work in this paper.
FUNDING
This research did not receive any particular funding from public, commercial, or not-for-profit organizations.
DATA AVAILABILITY
The authors have also stated that data will be accessible upon request.
REFERENCES
[1] S. Deepa and R. Umarani, "Steganalysis on images using SVM with selected hybrid features of Gini index feature selection algorithm," Int. J. Adv. Res. Comput. Sci., vol. 8, no. 5, p. 1503, 2017.
[2] S. Zhang and Z. Wang, "Cucumber disease recognition based on global–local singular value decomposition," Neurocomputing, vol. 205, pp. 341–348, Sep. 2016.
[3] S. Zhang, X. Wu, Z. You, and L. Zhang, "Leaf image based cucumber disease recognition using sparse representation classification," Comput. Electron. Agricult., vol. 134, pp. 135–141, Mar. 2017.
[4] A. Loddo, M. Loddo, and C. Di Ruberto, "A novel deep learning based approach for seed image classification and retrieval," Comput. Electron. Agricult., vol. 187, Aug. 2021, Art. no. 106269, doi: 10.1016/j.compag.2021.106269.
[5] C. Qian, M. Tong, X. Yu, and S. Zhuang, "CNN-based visual processing approach for biological sample microinjection systems," Neurocomputing, vol. 459, pp. 70–80, Oct. 2021, doi: 10.1016/j.neucom.2021.06.085.
[6] R. Maurya, V. K. Pathak, and M. K. Dutta, "Deep learning based microscopic cell images classification framework using multi-level ensemble," Comput. Methods Programs Biomed., vol. 211, Nov. 2021, Art. no. 106445, doi: 10.1016/j.cmpb.2021.106445.
[7] S. Huang, G. Zhou, M. He, A. Chen, W. Zhang, and Y. Hu, "Detection of peach disease image based on asymptotic non-local means and PCNN-IPELM," IEEE Access, vol. 8, pp. 136421–136433, 2020.
[8] S. Yadav, N. Sengar, A. Singh, A. Singh, and M. K. Dutta, "Identification of disease using deep learning and evaluation of bacteriosis in peach leaf," Ecolog. Informat., vol. 61, Mar. 2021, Art. no. 101247.
[9] Y. Zhao, C. Sun, X. Xu, and J. Chen, "RIC-Net: A plant disease classification model based on the fusion of inception and residual structure and embedded attention mechanism," Comput. Electron. Agricult., vol. 193, Feb. 2022, Art. no. 106644, doi: 10.1016/j.compag.2021.106644.
[10] R. Maurya, R. Burget, R. Shaurya, M. Kiac, and M. K. Dutta, "Multi-head attention-based transfer learning approach for porato disease detection," in Proc. 15th Int. Congr. Ultra Modern Telecommun. Control Syst. Workshops (ICUMT), Oct. 2023, pp. 165–169, doi: 10.1109/ICUMT61075.2023.10333272.
[11] X. Chen, G. Zhou, A. Chen, J. Yi, W. Zhang, and Y. Hu, "Identification of tomato leaf diseases based on combination of ABCK-BWTR and B-ARNet," Comput. Electron. Agricult., vol. 178, Nov. 2020, Art. no. 105730.
[12] I. Tolstikhin, N. Houlsby, A. Kolesnikov, L. Beyer, X. Zhai, T. Unterthiner, J. Yung, A. Steiner, D. Keysers, J. Uszkoreit, M. Lucic, and A. Dosovitskiy, "MLP-mixer: An all-MLP architecture for vision," 2021, arXiv:2105.01601.
[13] A. Pandey and K. Jain, "A robust deep attention dense convolutional neural network for plant leaf disease identification and classification from smart phone captured real world images," Ecolog. Informat., vol. 70, Sep. 2022, Art. no. 101725, doi: 10.1016/j.ecoinf.2022.101725.
[14] P. Bedi and P. Gole, "Plant disease detection using hybrid model based on convolutional autoencoder and convolutional neural network," Artif. Intell. Agricult., vol. 5, pp. 90–101, Jan. 2021.
[15] K. P. Ferentinos, "Deep learning models for plant disease detection and diagnosis," Comput. Electron. Agricult., vol. 145, pp. 311–318, Feb. 2018, doi: 10.1016/j.compag.2018.01.009.
[16] K. Kc, Z. Yin, M. Wu, and Z. Wu, "Depthwise separable convolution architectures for plant disease classification," Comput. Electron. Agricult., vol. 165, Oct. 2019, Art. no. 104948, doi: 10.1016/j.compag.2019.104948.
[17] M. Chohan, A. Khan, R. Chohan, S. H. Katpar, and M. S. Mahar, "Plant disease detection using deep learning," Int. J. Recent Technol. Eng. (IJRTE), vol. 9, no. 1, pp. 909–914, May 2020, doi: 10.35940/ijrte.a2139.059120.
[18] S. M. Hassan and A. K. Maji, "Plant disease identification using a novel convolutional neural network," IEEE Access, vol. 10, pp. 5390–5401, 2022.
[19] Ü. Atila, M. Uçar, K. Akyol, and E. Uçar, "Plant leaf disease classification using EfficientNet deep learning model," Ecolog. Informat., vol. 61, Mar. 2021, Art. no. 101182, doi: 10.1016/j.ecoinf.2020.101182.
[20] H. Amin, A. Darwish, A. E. Hassanien, and M. Soliman, "End-to-end deep learning model for corn leaf disease classification," IEEE Access, vol. 10, pp. 31103–31115, 2022, doi: 10.1109/ACCESS.2022.3159678.
[21] R. Maurya, N. N. Pandey, V. P. Singh, and T. Gopalakrishnan, "Plant disease classification using interpretable vision transformer network," in Proc. Int. Conf. Recent Adv. Electr., Electron. Digit. Healthcare Technol. (REEDCON), May 2023, pp. 688–692, doi: 10.1109/REEDCON57544.2023.10151342.
[22] A. Abbas, S. Jain, M. Gour, and S. Vankudothu, "Tomato plant disease detection using transfer learning with C-GAN synthetic images," Comput. Electron. Agricult., vol. 187, Aug. 2021, Art. no. 106279, doi: 10.1016/j.compag.2021.106279.
[23] P. S. Thakur, P. Khanna, T. Sheorey, and A. Ojha, "Explainable vision transformer enabled convolutional neural network for plant disease identification: PlantXViT," 2022, arXiv:2207.07919.
[24] R. Karthik, M. Hariharan, S. Anand, P. Mathikshara, A. Johnson, and R. Menaka, "Attention embedded residual CNN for disease detection in tomato leaves," Appl. Soft Comput., vol. 86, Jan. 2020, Art. no. 105933, doi: 10.1016/j.asoc.2019.105933.
[40] C. Yin, T. Zeng, H. Zhang, W. Fu, L. Wang, and S. Yao, "Maize small leaf spot classification based on improved deep convolutional neural networks with a multi-scale attention mechanism," Agronomy, vol. 12, no. 4, p. 906, Apr. 2022, doi: 10.3390/agronomy12040906.
[41] M. Agarwal, A. Singh, S. Arjaria, A. Sinha, and S. Gupta, "ToLeD: Tomato leaf disease detection using convolution neural network," Proc. Comput. Sci., vol. 167, pp. 293–301, Jan. 2020, doi: 10.1016/j.procs.2020.03.225.
[42] A. Elhassouny and F. Smarandache, "Smart mobile application to recognize tomato leaf diseases using convolutional neural networks," in Proc. Int. Conf. Comput. Sci. Renew. Energies (ICCSRE), Jul. 2019, pp. 1–4, doi: 10.1109/ICCSRE.2019.8807737.
[43] S. Widiyanto, R. Fitrianto, and D. T. Wardani, "Implementation of convolutional neural network method for classification of diseases in tomato leaves," in Proc. 4th Int. Conf. Informat. Comput. (ICIC), Oct. 2019, pp. 1–5, doi: 10.1109/icic47613.2019.8985909.
[44] D. Oppenheim and G. Shani, "Potato disease classification using convolution neural networks," Adv. Animal Biosci., vol. 8, no. 2, pp. 244–249, 2017, doi: 10.1017/s2040470017001376.
[25] D. Shah, V. Trivedi, V. Sheth, A. Shah, and U. Chauhan, ‘‘ResTS: Residual
deep interpretable architecture for plant disease detection,’’ Inf. Process.
Agricult., vol. 9, no. 2, pp. 212–223, Jun. 2022. RITESH MAURYA received the B.Tech. degree
[26] B. B. Gupta and M. Quamara, ‘‘An overview of Internet of Things
in computer science and engineering, the M.Tech.
(IoT): Architectural aspects, challenges, and protocols,’’ Concurrency
degree in computer science and engineering from
Comput., Pract. Exper., vol. 32, no. 21, Nov. 2020, Art. no. e4946, doi:
10.1002/cpe.4946. the ABV-Indian Institute of Information Tech-
[27] S. K. Noon, M. Amjad, M. A. Qureshi, and A. Mannan, ‘‘Computationally nology and Management, and the Ph.D. degree
light deep learning framework to recognize cotton leaf diseases,’’ J. Intell. in computer science and engineering from the
Fuzzy Syst., vol. 40, no. 6, pp. 12383–12398, Jun. 2021. Centre for Advanced Studies, Dr. A.P.J. Abdul
[28] D. Singh, N. Jain, P. Jain, P. Kayal, S. Kumawat, and N. Batra, ‘‘PlantDoc: Kalam Technical University, with a focus on
A dataset for visual plant disease detection,’’ in Proc. 7th ACM IKDD machine learning and deep learning applications in
CoDS 25th (COMAD), Jan. 2020, pp. 249–253. biology and medicine. He is currently an Associate
[29] G. Geetharamani and J. A. Pandian, ‘‘Identification of plant leaf Professor with the Amity Centre for Artificial Intelligence, Amity University,
diseases using a nine-layer deep convolutional neural network,’’ Comput. Noida, India. He has published research in esteemed journals, including
Elect. Eng. J., vol. 76, pp. 323–338, Jun. 2019. [Online]. Available: IEEE, Wiley, Springer, and Elsevier. His research interests include machine
https://doi.org/10.1016/j.compeleceng.2019.08.010 learning, and deep learning and its applications in diverse domains.
[30] V. N. Vapnik, The Nature of Statistical Learning Theory. New York, NY,
USA: Springer, 2000, doi: 10.1007/978-1-4757-3264-1.
[31] A. A. Nurhanna and M. F. Othman, ‘‘Multi-class support vector machine
application in the field of agriculture and poultry: A review,’’ Malaysian J.
Math. Sci., vol. 11, pp. 35–52, Feb. 2017.
[32] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, SATYAJIT MAHAPATRA received the B.Tech.
T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, degree in electronics and telecommunication engi-
J. Uszkoreit, and N. Houlsby, ‘‘An image is worth 16×16 words: neering from the Biju Patnaik University of
Transformers for image recognition at scale,’’ 2021, arXiv:2010.11929. Technology, the M.Tech. degree in electronics
[33] C. K. Rai, ‘‘Automatic categorisation of real-time diseased cotton leaves and communication engineering from Siksha ‘O’
and plants using a deep convolutional neural network,’’ Res. Square, Anusandhan University, and the Ph.D. degree in
Durham, NC, USA, 2022, doi: 10.21203/rs.3.rs-1440994/v1. electronics and communication engineering from
[34] A. M., M. Zekiwos, and A. Bruck, ‘‘Deep learning-based image processing the Birla Institute of Technology, Mesra, with a
for cotton leaf disease and pest diagnosis,’’ J. Electr. Comput. Eng.,
focus on machine learning and signal processing
vol. 2021, pp. 1–10, Jun. 2021, doi: 10.1155/2021/9981437.
for genomic data analysis. He is currently an
[35] X. Liang, ‘‘Few-shot cotton leaf spots disease classification based on
metric learning,’’ Plant Methods, vol. 17, no. 1, p. 114, Dec. 2021, doi:
Assistant Professor with the Department of Information and Communication
10.1186/s13007-021-00813-7. Technology, Manipal Institute of Technology, MAHE. He has published
[36] Y. Dong, Z. Fu, S. Stankovski, Y. Peng, and X. Li, ‘‘A cotton disease research in esteemed journals, including IEEE, Oxford University Press,
diagnosis method using a combined algorithm of case-based reasoning and Wiley. His research interests include applied machine learning, image
and fuzzy logic,’’ Comput. J., vol. 64, no. 1, pp. 155–168, Nov. 2019, doi: processing, and genomic signal processing.
10.1093/comjnl/bxaa098.
[37] S. Mishra, R. Sachan, and D. Rajpal, ‘‘Deep convolutional neural
network based detection system for real-time corn plant disease recog-
nition,’’ Proc. Comput. Sci., vol. 167, pp. 2003–2010, Jan. 2020, doi:
10.1016/j.procs.2020.03.236.
[38] A. Waheed, M. Goyal, D. Gupta, A. Khanna, A. E. Hassanien, and LUCKY RAJPUT received the B.Sc. degree in
H. M. Pandey, ‘‘An optimized dense convolutional neural network physics and mathematics, in 2018, and the master’s
model for disease recognition and classification in corn leaf,’’ Com- degree in physics, in 2021. She is currently
put. Electron. Agricult., vol. 175, Aug. 2020, Art. no. 105456, doi: pursuing the M.Tech. degree in data science with
10.1016/j.compag.2020.105456. Amity University, Noida. Her research interests
[39] K. R. Aravind, P. Raja, K. V. Mukesh, R. Aniirudh, R. Ashiwin, include machine learning and deep learning.
and C. Szczepanski, ‘‘Disease classification in maize crop using bag
of features and multiclass support vector machine,’’ in Proc. 2nd Int.
Conf. Inventive Syst. Control (ICISC), Jan. 2018, pp. 1191–1196, doi:
10.1109/icisc.2018.8398993.