
Journal of Agriculture and Food Research 18 (2024) 101474

Contents lists available at ScienceDirect

Journal of Agriculture and Food Research


journal homepage: www.sciencedirect.com/journal/journal-of-agriculture-and-food-research

XAI-FruitNet: An explainable deep model for accurate fruit classification


Shirin Sultana a, Md All Moon Tasir a, S.M. Nuruzzaman Nobel a, Md Mohsin Kabir b,*, M.F. Mridha c,**
a Bangladesh University of Business and Technology, Dhaka, Bangladesh
b Eötvös Loránd University, Budapest, 1117, Hungary
c American International University - Bangladesh, Dhaka, 1229, Bangladesh

ARTICLE INFO

Keywords: CNN; Fruit classification; XAI model; Interpretability; Hybrid pooling; Deep learning

ABSTRACT

In agricultural technology, precise fruit classification is essential yet challenging due to inherent inter-class similarities and intra-class variabilities among fruit species. Despite their impressive performance, traditional deep learning models suffer from a lack of interpretability, which hampers their transparency and trustworthiness in practical applications. To address these issues, we present XAI-FruitNet, a novel hybrid deep learning architecture designed to enhance feature discrimination by integrating average and max pooling techniques. XAI-FruitNet, an architecture optimized for efficiency and evaluated across the Fruits-360, Fruit Recognition, Fruit and Vegetables Image Recognition, and Dry Fruit datasets, consistently achieves over 97 % accuracy, surpassing existing state-of-the-art models and underscoring its remarkable generalization capability. A significant advancement of XAI-FruitNet is its built-in interpretability, which enhances the model's transparency and fosters trust among end users. Through rigorous experimentation, we demonstrate that XAI-FruitNet advances state-of-the-art fruit classification accuracy and sets a new standard for explainable artificial intelligence (XAI) in agricultural applications. This hybrid approach ensures that stakeholders can rely on both the high performance and the comprehensible nature of the classification outcomes, thereby offering a robust and trustworthy solution for modern agricultural needs.

1. Introduction

The botanical world boasts a diverse range of fruits, each with its unique shape, color, and texture. Botanists classify fruits to categorize plants into different families, genera, and species. In recent years, advancements in deep learning (DL) and artificial intelligence (AI) have paved the way for innovative approaches to fruit classification. Numerous challenges in agriculture are presently being addressed through computer vision based on deep learning, encompassing many applications, including weed detection [1], vegetable classification [2], and fruit detection [3]. While extensive research has been conducted on classifying individual fruits such as grapes [4], mangos [5], apples [6–10,13], kiwis [14], and strawberries [15], there is a recognized need for more comprehensive investigations and practical techniques for generic fruit classification. The distinctive attributes of fruit, encompassing shape, size, colour, and arrangement, present substantial obstacles that render generic fruit classification more intricate than the classification of individual fruits.

Current methodologies for fruit categorization, tailored to specific fruit scenarios, face challenges in effectively addressing these complexities and may not yield favourable outcomes when applied to multiple fruits simultaneously. Deep learning has demonstrated superior performance in fruit classification compared to traditional machine learning (ML). Still, both ML and DL models are frequently regarded as "black boxes" due to their lack of interpretability, posing challenges for end users, such as agricultural experts or fruit vendors, who require insights into the classification process for validation. The provision of transparent explanations and interpretations of model outputs in fruit classification holds the potential to enhance the adoption of artificial intelligence in this domain. Consequently, this enhancement would assist stakeholders in making evidence-based decisions, ultimately leading to broader acceptance and utilization of these technologies, and could pave the way for a promising future where AI significantly augments efficiency and innovation in the fruit industry. This research

* Corresponding author.
** Corresponding author.
E-mail addresses: [email protected] (M.M. Kabir), [email protected] (M.F. Mridha).

https://doi.org/10.1016/j.jafr.2024.101474
Received 2 December 2023; Received in revised form 13 October 2024; Accepted 21 October 2024
Available online 31 October 2024
2666-1543/© 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

addresses this gap by introducing XAI-FruitNet, a custom Convolutional Neural Network (CNN) architecture that demonstrates exceptional accuracy and provides explainability. This innovative approach significantly enhances the model's generalization capabilities, rendering it resilient across many datasets. Notably, XAI-FruitNet distinguishes itself by augmenting feature extraction through a combination of max and average pooling, a departure from prior methodologies that treated these techniques independently. Moreover, prevailing fruit classification models often lack transparency and interpretability [57–60], impeding widespread adoption. XAI-FruitNet endeavors to bridge this gap by integrating Explainable Artificial Intelligence (XAI), which furnishes transparent explanations and interpretations of its classification decisions. This distinctive feature empowers stakeholders in the agricultural sector with invaluable insights into the classification process, thereby revolutionizing industry practices. The proposed XAI-FruitNet:

• Demonstrates robustness by achieving high accuracy across diverse fruit image datasets.
• Leverages a hybrid pooling technique within the architecture that aims to mitigate the weaknesses of individual pooling techniques and enhance the model's ability to extract discriminative features from fruit images.
• Provides interpretations that enable stakeholders, such as agricultural experts and fruit vendors, to understand the reasoning behind the model's predictions, which fosters trust in adopting AI technologies in the fruit industry.

The remainder of this paper is organized as follows: Section 2 provides an overview of related works on fruit classification and the different pooling techniques employed in CNN architectures. Section 3 describes the formulation of the proposed architecture. Section 4 presents and discusses the results obtained from our experiments and compares the performance of our architecture with that of conventional techniques. Finally, Section 6 provides a summary of our conclusions and potential future research directions for generic fruit classification using CNNs.

2. Related work

Vasumathi et al. [16] used deep learning techniques to classify pomegranate fruits as normal (healthy) or abnormal (diseased) based on features such as fruit colour, number of spots, and shape. The main contribution of this study was to combine the CNN-LSTM deep learning model for classification with CNN for deep feature extraction and LSTM for class detection based on these features. The CNN-LSTM model achieved impressive performance, with 98.17 % accuracy, 98.65 % specificity, 97.77 % sensitivity, and 98.39 % F1-score. This study has several research gaps, including extending the model to multi-class classification for specific pomegranate diseases. Gulzar et al. [17] presented TLMobileNetV2, a variant of the MobileNetV2 model which takes advantage of transfer learning, to create an automated system for fruit classification. By using dropout and data augmentation to prevent overfitting, the suggested model achieves a remarkable accuracy of 99 %. Duong et al. [18] demonstrated the high accuracy of MixNet and EfficientNet lightweight models in fruit classification but did not assess fine-grained properties for targeted enhancements. Mahmood et al. [19] compared two popular architectures (AlexNet and VGG16) to determine whether jujube fruit was ripe. They stated that VGG16's 98 % accuracy was better than the 95 % accuracy of AlexNet. Manliguez et al. [20] devised a model using hyperspectral and visible-light photos to predict ripened papaya, achieving 97 % accuracy in maturity prediction. Oil palm fruits were used for maturity testing by Herman et al. [21]. They utilized a dataset containing oil palm fruits at seven maturity stages to train AlexNet and DenseNet architectures, with DenseNet achieving an 8 % higher accuracy than AlexNet.

Nadhif et al. [22] and Nobel et al. [27] used the CNN algorithm to distinguish types of date fruit, with the MobileNet V2 architecture as the CNN model in the training process, and obtained an accuracy of 96 %. A deep learning-based apple classification system was built by Yu et al. [10] to investigate the impact of different CNN frameworks, network depths and dataset settings on classification results [11,12]. The precision ranged from 96.1 % on one dataset to 94.4 % on the other. Some factors, such as lighting and picture quality, must still be addressed to ensure practical applicability. With 93.8 % accuracy, Zeeshan et al. [26] created a CNN model that used deep learning methods to recognize oranges in a dynamic setting [23–25]. Obstacles such as occlusion and lighting changes may affect the performance of the model while detecting fruits. These studies are limited to single fruits, suggesting that extending the research to other fruit varieties or agricultural product types could broaden the applicability of the proposed approach. For cross-domain fruit categorization, Wang et al. [29] proposed an unsupervised domain adaptation technique using HAM-MobileNet, which integrated a hybrid attention module into MobileNet V3 to mitigate complex backdrops and extract discriminative features. Training models using a hybrid loss function that considers both subdomain alignment and implicit distribution metrics leads to better classification accuracy. For the two datasets, the proposed technique achieved accuracies of 95.0 % and 93.2 %, respectively. To grade the quality of apples with two colors, Unay [28,30] developed a CNN-based model that uses multispectral images, whereas Lu et al. [31] used a CNN-based model to determine whether apples in an orchard were immature or mature. Gill et al. [32] proposed a fruit classification scheme using CNN, RNN, and LSTM deep learning models, with 96.08 % accuracy. They also presented a hierarchical fruit image classification system [33] combining CNN, RNN, and LSTM, achieving an impressive accuracy rate of 97.4 %. By leveraging LSTM, RNN architecture, and CNN characteristics, a novel approach was proposed for fruit categorization [34]. In addition, type-II fuzzy advance preprocessing was applied to the images and the hyperparameters of the proposed technique were tuned using TLBO-MCET. Pajaziti et al. [35] aimed to improve the process of classifying various fruits and vegetables within industries using artificial intelligence and robotic arms. The researchers used image processing techniques with the OpenCV library and the TensorFlow platform to train machine learning models. They used a dataset of 350 images that captured apples, pears, mandarins, lemons, and strawberries from various angles. The results showed that the robotic system could accurately identify and classify fruits with precision exceeding 90 %. However, the system cannot account for variations in size, weight, or defects within the same type of fruit. The study in Ref. [36] introduced CAE-ADN, a hybrid deep learning framework that combines CAE pretraining with ADN. It was enhanced with a CBAM attention module within DenseNet for precise fruit image classification across diverse varieties and types. The results show that the CAE-ADN model achieved the highest accuracy of 95.86 % and 93.78 %, surpassing benchmarks such as ResNet50, DenseNet-169, and traditional machine learning models. The CAE-ADN model significantly advanced state-of-the-art, accurate multi-class fruit classification by incorporating attention mechanisms, dense connections, and autoencoder pre-training. However, several limitations underscore the need for future research. Manual feature engineering has proved insufficient for fine-grained fruit classification with numerous subspecies. In addition, the complexity of the model may pose challenges for deployment in embedded systems with limited computational resources. This study utilized three deep learning models: a Convolutional Neural Network (CNN) for feature extraction, a Recurrent Neural Network (RNN) for feature selection and labeling, and a Long Short-Term Memory (LSTM) network for final classification to develop an effective method for categorizing various types of fruits based on their image characteristics. Additionally, the study introduces a feature fusion strategy that combines features extracted from CNN, labeled features from RNN, and classification results from LSTM to enhance the overall classification performance. The proposed methodology may encounter difficulties for certain fruit types or imaging


conditions. Using a unique modified cascaded-ANFIS algorithm, Rathnayake et al. [37] proposed an effective fruit identification and recognition system with an accuracy of 98.36 %. To extract useful information from the fruit photos, a feature extraction step was implemented. In the classification phase, images of fruits were fed into a cascaded-ANFIS model to determine which of the 131 distinct categories they most closely resemble. Mamat et al. [38] provided an automated image annotation system that can distinguish between various types of fruit with a 99.5 % success rate. However, the small size of the dataset may limit the applicability of the approach. The AFC-HPODTL model, developed by Shankar et al. [39] as an autonomous fruit categorization system using hyperparameter-optimized deep transfer learning, shows promising performance. Wang et al. [40] achieved a maximum mean average precision (mAP) of 98.46 % in young tomato fruit identification by utilizing RCNN accelerated by the ResNet50 architecture. Zu et al. [41] used an autonomous image-gathering Mask R-CNN model to detect and segment ripe green tomatoes. The mask area and bounding box were shown to have F1 scores of 92.0 %. Gill et al. [42] used convolutional neural network (CNN), recurrent neural network (RNN) and long short-term memory (LSTM) deep learning techniques for effective image feature extraction and feature selection in fruit classification, reaching an accuracy of 97 % with the aim of improving production efficiency, classification accuracy, product quality control and data analysis within the industry. Min et al. [43] developed a Multiscale Attention Network called MSANet, which combines visual features from different levels to comprehensively represent each image. By aggregating multi-scale visual features that focus on the essential aspects of images, the attention-based MSANet aims to improve fruit recognition performance.

Our model addresses several key issues highlighted in the literature review through an innovative methodology. Our model prioritizes

Fig. 1. Workflow of the proposed CNN architecture: XAI-FruitNet with multiple convolutional layers, hybrid pooling layers, activation functions and dropout layers.
The step-by-step workflow of our process starts with data augmentation, followed by training the CNN architecture. The performance evaluation phase involves
assessing the proposed method’s efficiency.


interpretability by incorporating Grad-CAM (Gradient-weighted Class Activation Mapping) along with the traditional CNN architecture. This addition allows for visualizing which parts of the input image contribute the most to the model's classification decision, aiding in understanding and explaining its predictions. We also address potential biases and overfitting concerns by employing robust data augmentation techniques during dataset collection and preprocessing. By introducing variations such as rotation, zooming, and cropping, we ensured that the model learned to recognize fruits under diverse conditions, thereby enhancing its generalization capabilities. Furthermore, our customized CNN architecture integrates a hybrid pooling method that combines average and max pooling. This hybrid approach addresses the limitations of individual pooling techniques, resulting in a more comprehensive representation of the fruit images. Overall, our methodology not only enhances the performance of fruit classification but also prioritizes interpretability and addresses various challenges identified in the existing literature.

3. Methods and materials

The workflow involves dataset collection, CNN architecture design, model training, and rigorous testing and validation, as shown in Fig. 1. The images were resized to the standard dimensions of (224, 224) to maintain consistent proportions. Data augmentation techniques were employed to address potential biases and overfitting. Spatial-level modifications, especially rotation, zooming, and cropping, effectively improve object recognition and detection. A customized CNN architecture was developed using a hybrid pooling method that combined average and max pooling. Average pooling retains spatial information, whereas max pooling focuses on capturing relevant features. The hybrid approach aims to mitigate the weaknesses of individual pooling techniques, leading to a more robust representation of the fruit images. The convolutional layers utilize 3x3 filters/kernels to extract significant features from the input data. Convolution operations involve applying filters to the input data to capture patterns systematically. The ReLU activation function introduces non-linearity to the CNN, thereby addressing the vanishing gradient issue observed in alternative functions. The hybrid pooling layer is integrated into the first convolution layer to balance preserving spatial information against capturing patterns. Hybrid pooling optimization enhances the generalization capabilities of CNNs for accurate and efficient fruit classification.

3.1. Dataset description

The robustness of XAI-FruitNet was assessed using four datasets, which included multi-class datasets to provide a thorough evaluation.

3.1.1. Fruits-360 dataset

Fruits-360 [44] contains 90,483 fruit images. Each image is 100x100 pixels, with approximately 70 % used for training, 10 % for validation and 20 % for testing. The fruit images were captured by filming each fruit as it rotated on a motor for 20 s and then grabbing stills, which were then processed using a specialized algorithm to remove the background. Fig. 2 illustrates some images of our dataset (see Table 1).

Fig. 2. Visualizing the images of our dataset.

3.1.2. Fruit and vegetable image recognition dataset

We employed another publicly available dataset [45] comprising diverse images depicting fruits, including bananas, apples, pears, grapes,


oranges, kiwis, watermelons, pomegranates, pineapples and mangos. The dataset also encompasses vegetables such as beetroot, bell pepper, cabbage, capsicum, carrot, cauliflower, chili pepper, corn, cucumber, eggplant, garlic, ginger, jalapeno, lemon, lettuce, onion, paprika, peas, potato, radish, spinach, soybean, sweet potato, sweetcorn, tomato and turnip. The dataset producers have already predefined the training-testing-validation split, which we directly adopted for our evaluation. Refer to Table 2 for the original image count per fruit category.

3.1.3. Fruit recognition

An additional dataset [46] was also utilized, comprising 44,406 fruit images gathered over 6 months. These images were captured against a clear background with a 320 x 258-pixel resolution. During the collection process, a multitude of challenges were introduced that are commonly encountered in real-world recognition scenarios within supermarkets and fruit shops. These challenges encompassed diverse factors such as varying lighting conditions, shadows, sunlight and pose variations. To ensure the robustness of our proposed model, it becomes imperative to effectively handle illumination variations, artifacts arising from camera capture, specular reflection shading and shadows. Through deliberate diversification, images of identical fruit categories were acquired at various times and on various days, enriching the dataset with greater realism. Furthermore, the images exhibited a wide range of quality and lighting variations. Notably, illumination emerged as one of the crucial sources of variation in the imagery. In fact, the illumination level could render two images of the same fruit less similar than two images depicting different types of fruits. Consequently, the dataset comprises images captured with room lights on and off, with windows open and closed, and with window curtains open and closed.

3.1.4. Dry Fruit Image Dataset

There are 12 different types of dry fruits included in the "Dry Fruit Image Dataset" [47], which is a collection of 11500+ processed high-quality photographs. Almonds, cashew nuts, raisins and dried figs (Anjeer) make up the four main varieties of dry fruit, with three sub-categories each. Mobile phones equipped with high-definition cameras were used to capture these images, which were taken in a variety of environments.

Table 1
Comparing the performance and limitations of existing approaches.

| Reference | Focused methods | Contribution | Limitations |
|---|---|---|---|
| Ezzat et al. [34] | Deep learning | Proposed a method for estimating fruit maturity using multimodal classification and made modifications to deep learning architectures for sensitivity analysis. | Conducted in a controlled laboratory environment, so it may not directly apply to real-world field conditions. |
| Gill et al. [32] | CNN, RNN and LSTM | Introduces a new method for categorizing fruit images using deep learning and combined features. Optimal image features are extracted using CNNs, RNNs and LSTMs. | More testing is necessary to determine the effectiveness of this technique on a more extensive and diverse dataset. Its computing cost, interpretability and resilience should also be evaluated, as these could limit its usefulness in fields that require interpretability. |
| Nadhif et al. [22] | MobileNet V2 | A CNN-based date classification system has been developed using MobileNet V2 architecture and feature analysis. Promising results suggest the potential for advanced applications in agriculture. | Only looked at 9 types of date fruits from one shop, so the results may not apply to date fruits from all regions. |
| Xue et al. [36] | CAE-ADN | Introduced CAE-ADN, a hybrid deep learning framework that combines CAE pre-training with ADN. It is enhanced with a CBAM attention module within DenseNet for precise fruit image classification across diverse varieties and types. | The complexity of the model may pose challenges for deployment in embedded systems with limited computational resources. |
| Pajaziti et al. [35] | Robotic system using artificial intelligence | Used image processing techniques with the OpenCV library and the TensorFlow platform for training the machine learning model to recognize different fruits. The trained model was then integrated with a Dobot Magician robotic arm equipped with a conveyor belt and sensors. | Did not account for variations in size, weight, or defects within the same fruit type. |
| Vasumathi et al. [16] | CNN-LSTM | Combined the CNN-LSTM deep learning model for classification with CNN for deep feature extraction and LSTM to classify pomegranate fruits as normal (healthy) or abnormal (diseased) based on features such as fruit color, number of spots and shapes. | Limited to single fruits. |

3.2. Data preprocessing

3.2.1. Scaling

During the data preprocessing phase, the necessary steps were taken to prepare the data for analysis. Each image was resized to conform to the standard dimensions of (224, 224). This step was crucial in maintaining consistent proportions and preparing the images for subsequent analysis.

3.2.2. Data augmentation

The dataset under consideration may possess inherent biases that are not readily observable. These biases may lead to overfitting, wherein the model becomes excessively tailored to the training dataset. To mitigate this concern, further insights can be derived from the training dataset by applying modifications to the images. Data augmentation, a technique widely recognized in the field, is employed to generate a more diverse range of images depicting fruits and vegetables. The primary purpose of this approach is to mitigate the risk of overfitting the training dataset, as indicated by previous research [48]. Data augmentation techniques can be classified into two main types: pixel-level and spatial-level. Pixel-level modifications, such as blurring, brightness adjustments, noise addition and CutMix or Cutout, are frequently used in image editing tasks, aimed at maintaining bounding


box integrity and preserving spatial information for object localization. On the other hand, spatial-level transformations are more intricate, as they affect both content and bounding boxes, making their implementation more challenging compared to pixel-level techniques. Our study found that spatial-level modifications, including rotation, zooming and cropping, were highly effective for improving object recognition and detection [2]. As depicted in Fig. 3, we expanded the fruit and vegetable classification dataset by enlarging, cropping, and rotating each initial image, enabling the model to learn from diverse examples.

Table 2
Total number of images and classes in each dataset used.

| Name of dataset | Source | Number of images | Number of classes |
|---|---|---|---|
| Fruit Recognition [46] | Kaggle | 44406 | 15 |
| Fruits-360 [44] | Mendeley | 67692 | 131 |
| Fruits and Vegetables [45] | Kaggle | 3468 | 36 |
| Dry fruit [47] | Mendeley | 11500 | 12 |

Table 3
List of notations.

| Symbol | Meaning |
|---|---|
| $\lvert P_j \rvert$ | Total number of items inside the pooling region. |
| $P_j$ | Pooling region of the input feature map. |
| $P_{ave}$ | Outcomes from average pooling. |
| $P_j^{\max}$ | Maximum value of the $P_j$ pooling region. |
| $P_j^{ave}$ | Average value of the $P_j$ pooling region. |
| TP | True Positive, the number of positive instances correctly identified as positive by the model. |
| FP | False Positive, the number of negative instances inaccurately labeled as positive by the model. |
| TN | True Negative, the number of negative instances correctly identified as negative by the model. |
| FN | False Negative, the number of positive instances inaccurately labeled as negative by the model. |

3.3. Customized model based on improved CNN

Our customized model based on the improved CNN architecture aims to enhance the accuracy of generic fruit classification by implementing a hybrid pooling method. By incorporating a hybrid approach that combines both average pooling and max pooling, we leverage the strengths of each pooling technique while mitigating their respective weaknesses. Average pooling helps retain valuable spatial information, whereas max pooling focuses on capturing the most relevant features within the data. This synergy leads to a more robust and discriminative representation of fruit images during classification. Additionally, we fine-tuned the model by training it on several large and diverse fruit datasets, enabling it to effectively generalize to unseen fruit categories. In rigorous experimentation and evaluation, our customized model demonstrated superior performance compared to traditional CNN architectures, making it a promising solution for accurate and efficient fruit classification tasks.

3.3.1. Convolution layer

The convolutional layer is a pivotal component within the architecture of a convolutional neural network (CNN), designed specifically for processing and analyzing images effectively. The major function of the convolutional layer is to apply a series of learnable filters or kernels to the input data to extract useful features. Key aspects of a convolutional layer are as follows.

1. Filters/Kernels: Convolutional neural networks (CNNs) employ small matrices, commonly sized at 3x3 or 5x5, denoted as filters or kernels, to extract significant features from the input data. Each filter executes a convolution operation as it traverses the input data, systematically capturing patterns or features at every position.
2. During training, the kernel elements undergo optimization to effectively capture various input data patterns.
3. Convolution Operation: The 2D matrix representation of the image (I) is convolved with a smaller 2D kernel matrix (K) in the convolution procedure, which is widely used in digital image processing.
The zero-padding mathematical formulation in Ref. [21] is as follows:

\[ (I \times K)(x, y) = \sum_{i=-m}^{m} \sum_{j=-n}^{n} I(x + i, y + j) \times K(i, j) \]

A filter is used to perform horizontal (from left to right) and vertical (from top to bottom) convolution on an image. To demonstrate, Fig. 4 displays the convolution procedure using a 16x16 input image and a 3x3 convolution kernel, yielding a convolved output image.

4. Activation Function: The rectified linear unit (ReLU) activation function is incorporated to introduce nonlinearity to the CNN, enabling the learning and modelling of intricate relationships between inputs and outputs of neurons or layers. The ReLU activation function is characterized by a straightforward thresholding operation in which negative values yield an output of zero, while positive values remain unaltered; expressed mathematically as follows:

\[ \mathrm{ReLU}(x) = \max(0, x) = \begin{cases} 0, & x < 0 \\ x, & x \ge 0 \end{cases} \]

The primary advantage of ReLU lies in its ability to circumvent the "vanishing gradient" issue observed in alternative activation functions such as sigmoid or tanh, wherein the gradients diminish during backpropagation, impeding the learning process in deep networks. By ensuring a constant gradient for positive inputs, ReLU mitigates this problem, resulting in faster and more stable training.
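The convolution sum and the ReLU threshold described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names `conv2d_zero_pad` and `relu`, the odd kernel size, and the 4x4 demo image are all assumptions made for the example.

```python
import numpy as np


def conv2d_zero_pad(image, kernel):
    """Sum over i in [-m, m], j in [-n, n] of I(x+i, y+j) * K(i, j), with zero padding.

    Hypothetical helper: assumes a 2D image and a kernel with odd height and width.
    """
    kh, kw = kernel.shape
    m, n = kh // 2, kw // 2  # half-sizes of the kernel
    padded = np.pad(image, ((m, m), (n, n)), mode="constant")  # zeros outside the image
    out = np.zeros(image.shape, dtype=float)
    for x in range(image.shape[0]):
        for y in range(image.shape[1]):
            # Window centred on (x, y); kernel[a, b] plays the role of K(a - m, b - n).
            window = padded[x:x + kh, y:y + kw]
            out[x, y] = np.sum(window * kernel)
    return out


def relu(x):
    """Element-wise max(0, x): negatives become zero, positives pass through."""
    return np.maximum(0, x)


# Illustrative input only: a 4x4 ramp image and a 3x3 averaging kernel.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0
feature_map = relu(conv2d_zero_pad(image, kernel))
```

Because of the zero padding, the output keeps the spatial size of the input. In a trained network the kernel values would be learned rather than fixed as they are here.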

Fig. 3. Snapshots of augmented images: (a) original, (b) flipped, (c) rotated, (d) clipped.

3.3.2. Pooling layer

By decreasing the spatial size of convolutional outputs, the network effectively reduces its parameter count. Moreover, pooling operations aid in obtaining input-invariant representations for minor translations. Fig. 4 illustrates two primary pooling operations using a 2x2 filter.
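The two pooling operations (max and average) with a 2 × 2 filter and a stride of 2 can be sketched as follows. The function `pool2d` and the sample feature map are hypothetical and serve only to show how the spatial size is halved.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample a 2D feature map with max or average pooling."""
    rows, cols = len(feature_map), len(feature_map[0])
    out = []
    for r in range(0, rows - size + 1, stride):
        row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the values inside the current pooling region.
            patch = [feature_map[r + dr][c + dc]
                     for dr in range(size) for dc in range(size)]
            row.append(max(patch) if mode == "max" else sum(patch) / len(patch))
        out.append(row)
    return out


fmap = [[1, 3, 2, 4],
        [5, 7, 6, 8],
        [9, 2, 1, 0],
        [3, 4, 5, 6]]
max_pooled = pool2d(fmap, mode="max")
avg_pooled = pool2d(fmap, mode="avg")
```

A 4 × 4 map becomes 2 × 2 in both cases; max pooling keeps the strongest response per region, while average pooling keeps the mean.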

S. Sultana et al. Journal of Agriculture and Food Research 18 (2024) 101474

Fig. 4. Illustration of convolution with a 4 × 4 input image and a 3 × 3 kernel, accompanied by pooling using 2 × 2 filters and a stride of 2.

3.3.3. Max pooling

The max-pooling layer computes the maximum value within each input patch [54,55], preserving this value while sliding the filter across the feature map, and can be represented mathematically as follows:

$$P_{\max} = P_{\max}^{1}, \ldots, P_{\max}^{n}, \qquad P_{\max}^{j} = \max_{i \in P_j} i$$

where $P_j$ is the $j$-th input patch.

3.3.4. Average pooling

The average pooling process involves calculating the average value within specific patches of the input [54,55]. The average pooling layer downsamples the convolutional activation by dividing the input into pooling regions and computing their average value.

3.3.5. Hybrid pooling

The pooling layer, following the convolutional layer, was incorporated to reduce the network parameters in accordance with a previously introduced hybrid pooling method, which forms the basis of the pooling approach. The objective of the pooling method centers on computing the output $O_j$ for a distinct pooling region. As a result of aggregating the outcomes of all these pooling regions, we obtain a pooling feature map $O$, represented as a set containing elements $O_1, \ldots, O_J$. In the training stage, we employ average pooling and max pooling for all the pooling regions within the initial convolutional layer's feature map. Consequently, the pooling process of the convolutional feature map leads to the generation of the ensuing pooling feature map. Fig. 5 demonstrates the application of the hybrid pooling technique using a filter size of 2 and a stride of 2. By employing a probability-based approach to choose the pooling method for each convolutional feature map, this method enhances the CNN's generalization capabilities through diverse feature extraction training.

3.4. XAI-FruitNet

The workflow for building the CNN architecture for generic fruit classification is shown in Fig. 1. First, we gathered a diverse and extensive dataset of fruit images, encompassing various types, shapes and colors, to ensure the robustness of the model. Next, we designed the CNN architecture with the chosen convolutional layers, activation functions and fully connected layers. In the first convolution layer of our architecture, we integrate the hybrid pooling layer, which selectively retains the crucial features from both average and maximum pooling, striking a balance between preserving the spatial information and capturing significant patterns. We then trained the model using advanced optimization techniques and evaluated its performance through rigorous testing and validation. By leveraging the power of


hybrid pooling, the CNN architecture can be optimized for generic fruit classification to provide accurate and efficient fruit recognition capabilities.

Fig. 5. Hybrid pooling.

4. Performance evaluations

4.1. Experimental setup

The proposed model was implemented on a Windows 10 system with Python 3.0, an i7 processor, 20 GB of RAM, and a 16 GB GPU, using TensorFlow and Keras frameworks for the deep learning models. Deep learning model training encompassed standard augmentation methods, including zoom, clip and rotation. Training was performed over 100 epochs with a batch size of 32. RMSprop was employed as the optimizer to update the model parameters during training, using a learning rate of 0.0001. Prior to implementing the data augmentation techniques, the datasets were split into two subsets with a 4:1 ratio, where one subset included 80 % of the images for training the models and the other subset comprised 20 % of the images used for testing the models. Table 4 lists the configuration details of the proposed approach for 224 x 224-pixel images, comprising six convolutional layers, one hybrid pooling layer and five max-pooling layers. The CNN takes a fixed-size color image as input, uses the rectified linear unit (ReLU) [53] as the activation function, and applies the softmax classifier cost function in the fully connected layer. Fig. 6 shows a pictorial view of our customized CNN architecture.

4.2. Statistical analysis

In evaluating XAI-FruitNet, various metrics assess different aspects of performance. Accuracy, the most basic metric, is the number of correctly classified instances (TP and TN) divided by the total number of instances. Precision, calculated as TP divided by the sum of TP and FP, measures the proportion of actual positives among those the model predicted as positive. Conversely, recall, the ratio of TP to the sum of TP and FN, indicates the model's ability to identify all actual positives. Specificity gauges the model's ability to identify negative instances correctly. A high specificity value (closer to 1) indicates that the model avoids false positives. In simpler terms, the model rarely mistakes a negative instance for a positive one. Finally, the F1-Score incorporates both Precision and Recall, harmonically averaging them to provide a balanced view of the model's performance.

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{Specificity} = \frac{TN}{FP + TN}$$

$$\text{F1-Score} = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}}$$

4.3. Result analysis

Table 5 provides both the training accuracy and validation accuracy for each model and dataset combination, reflecting the models' proficiency in fruit classification during training and their generalization to new data during validation. MobileNetv2 consistently exhibited strong performance across different datasets, achieving notable training accuracies (93.61 %–99.67 %) and relatively high validation accuracies (92.48 %–97.66 %). Similarly, the ResNet50 model demonstrated competitive results, with training accuracies ranging from 94.89 % to 99.67 % and validation accuracies ranging from 93.45 % to 96.95 %, similar to MobileNetv2. On the other hand, the VGG16 model achieved a

Table 4
Configuration of the proposed method for 224 × 224 images.
Layer  Type              Filter size  Number of filters  Stride  Padding  Data depth       Parameters
1      Input             –            –                  –       –        224 × 224 × 3    –
2      Convolution       3            16                 1       0        224 × 224 × 3    448
3      Hybrid Pooling    2            –                  2       0        112 × 112 × 16   0
4      Convolution       3            32                 1       0        112 × 112 × 16   4640
6      Convolution       3            32                 1       0        56 × 56 × 32     9248
7      MaxPooling        2            –                  2       0        28 × 28 × 32     0
8      Convolution       3            64                 1       0        28 × 28 × 32     18496
9      MaxPooling        2            –                  2       0        14 × 14 × 64     0
10     Convolution       3            128                1       0        14 × 14 × 64     73856
11     MaxPooling        2            –                  2       0        7 × 7 × 128      0
12     Convolution       3            128                1       0        7 × 7 × 128      147584
13     MaxPooling        2            –                  2       0        3 × 3 × 128      0
14     Flattened matrix  –            –                  –       –        1152             0
15     Dense             –            256                –       –        256              295168
16     ReLU              –            –                  –       –        256              0
17     Dropout           –            –                  –       –        256              0
18     Dense             –            12                 –       –        12               3084
19     Softmax           –            –                  –       –        12               0
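The parameter counts in Table 4 can be cross-checked with a few lines of arithmetic. The sketch below assumes that each convolution uses 3 × 3 kernels with one bias per filter and that each dense layer has one bias per output unit; the helper names are illustrative and do not come from the paper.

```python
def conv_params(kernel, in_channels, filters):
    # Each filter has kernel * kernel * in_channels weights plus one bias.
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(in_units, out_units):
    # Each output unit has in_units weights plus one bias.
    return (in_units + 1) * out_units


# (kernel size, input channels, number of filters) for the six convolutions.
conv_layers = [(3, 3, 16), (3, 16, 32), (3, 32, 32),
               (3, 32, 64), (3, 64, 128), (3, 128, 128)]
conv_counts = [conv_params(k, c_in, f) for k, c_in, f in conv_layers]

flattened = 3 * 3 * 128  # size of the flattened 3 x 3 x 128 feature map
dense_counts = [dense_params(flattened, 256), dense_params(256, 12)]

total_params = sum(conv_counts) + sum(dense_counts)
```

Under these assumptions the computed values reproduce the Parameters column of Table 4 for every convolution and dense row, which suggests the table follows the standard weight-plus-bias counting.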


Fig. 6. Proposed CNN architecture with 5 convolution layers, one hybrid pooling layer and 5 max pooling layers.
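A minimal sketch of how the hybrid pooling layer in this architecture might behave during training, based on the description in Section 3.3.5 that the pooling method is chosen for each feature map with a probability-based rule. The selection probability `p_max`, the function names, and the plain-list data layout are assumptions for illustration; the paper does not specify them, and its behaviour at inference time is not stated.

```python
import random


def pool_region(patch, mode):
    # Reduce one pooling region to a single value.
    return max(patch) if mode == "max" else sum(patch) / len(patch)


def hybrid_pool(feature_maps, p_max=0.5, size=2, stride=2, rng=None):
    """Pool each feature map with max pooling (probability p_max) or average pooling."""
    rng = rng or random.Random()
    pooled = []
    for fmap in feature_maps:
        # One random draw per feature map decides the pooling method.
        mode = "max" if rng.random() < p_max else "avg"
        rows, cols = len(fmap), len(fmap[0])
        out = []
        for r in range(0, rows - size + 1, stride):
            out.append([
                pool_region([fmap[r + dr][c + dc]
                             for dr in range(size) for dc in range(size)], mode)
                for c in range(0, cols - size + 1, stride)
            ])
        pooled.append(out)
    return pooled


maps = [[[1, 3], [5, 7]],
        [[2, 4], [6, 8]]]
always_max = hybrid_pool(maps, p_max=1.0)
always_avg = hybrid_pool(maps, p_max=0.0)
mixed = hybrid_pool(maps, p_max=0.5, rng=random.Random(0))
```

Setting `p_max` to 1 or 0 recovers plain max or average pooling, so the two standard layers are special cases of this sketch.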

Table 5
Classification performance of proposed model.
                                        Fruit 360               Dry Fruit               Fruit and Vegetable     Fruit Recognition
Model                                   Train Acc.  Val. Acc.   Train Acc.  Val. Acc.   Train Acc.  Val. Acc.   Train Acc.  Val. Acc.
MobileNetv2                             99.67       96.42       99.15       97.66       93.61       92.48       96.89       93.53
ResNet50                                99.45       98.01       99.67       96.95       94.89       93.45       99.23       96.92
VGG16                                   99.92       97.11       68.45       55.92       77.26       48.38       98.88       94.65
EfficientNetB0                          99.60       96.52       96.25       80.52       90.48       87.06       98.77       95.59
Proposed Model without Hybrid Pooling   99.87       97.59       98.97       97.27       93.85       95.42       98.34       96.37
Proposed Model with Hybrid Pooling      99.94       99.99       99.34       97.79       95.67       96.41       99.38       97.12

remarkably high training accuracy (99.92 %) on the Fruit 360 dataset. However, its performance substantially declined on different datasets, with validation accuracies as low as 48.38 % and 55.92 %. This indicates that VGG16 may generalize poorly to these datasets. In contrast, the proposed model without hybrid pooling consistently performs well across datasets, boasting training accuracies ranging from 93.85 % to 99.87 % and validation accuracies ranging from 95.42 % to 97.59 %, indicating its robustness in generalization. But the proposed model with Hybrid Pooling attains the highest validation accuracy among all models and maintains strong performance on all datasets,

Fig. 7. Visualizing performance of our proposed model: (a) Training Accuracy, (b) Validation Accuracy.


with validation accuracies ranging from 95.67 % to 97.79 %. This suggests that the hybrid pooling technique enhances the model's generalization capacity and ability to perform well across diverse datasets. The proposed model with hybrid pooling emerges as the top-performing model, exhibiting the highest validation accuracy. This outcome underscores the significant potential of the hybrid pooling technique for enhancing fruit recognition tasks. The results presented in Table 3 are depicted graphically in Fig. 7.

The predictions generated by the proposed architecture are illustrated in Fig. 8. This figure shows the actual and forecasted class labels for each picture. For example, the initial cell includes an image with the actual class label "Cherry Rainier" and the predicted class label "Cherry Rainier". Likewise, every subsequent cell in the row includes an image with both the actual and predicted class labels. Based on Table 6, the proposed model demonstrates varying performance across different datasets but shows promising results overall based on the performance metrics. With an impressive average accuracy of 97.75 % across all four datasets, the proposed model shows proficiency in correctly classifying fruits and vegetables. It maintains an average precision of 97.2 %, indicating its ability to predict positive classes accurately. The high recall (0.96) suggests that the model effectively identifies actual fruits and vegetables. However, a precision of 0.97 indicates that the model might still misclassify some non-fruits or vegetables as fruits or vegetables. Overall, the proposed model performs well in classifying fruits and vegetables across various datasets. It achieves high accuracy and recall, successfully identifying most fruits and vegetables in the images. Table 7 showcases the XAI-FruitNet model's state-of-the-art performance in fruit classification across various datasets, demonstrating its potential for real-world applications in agriculture and related industries. The model consistently achieves higher validation accuracy than other models, ranging from 96.92 % on the Fruit Recognition dataset to 98.97 % on the Fruits-360 dataset, indicating its strong generalization capability across different fruit categories, image qualities, and dataset characteristics. Even without hybrid pooling, XAI-FruitNet performs competitively, but including hybrid pooling consistently enhances performance. The model's customized CNN architecture excels in feature extraction compared to generic architectures like ResNet50 or MobileNetV2, contributing to its superior accuracy in fruit classification tasks.

4.4. Explainability analysis

Explainability analysis refers to understanding and interpreting the decisions made by AI models, which are often considered black boxes due to their complex architectures and high-dimensional data processing. XAI is essential for several reasons, including building trust with end-users [55], regulatory compliance, detecting bias and discrimination, and identifying potential weaknesses or vulnerabilities in the model. In the context of XAI, various techniques and methods are employed to shed light on how AI models arrive at specific predictions or decisions. These techniques aim to answer questions such as "Why did the model predict this outcome?" or "What are the significant features that influenced the model's decision?"

4.4.1. Grad-CAM

Grad-CAM, introduced by Selvaraju et al. in 2017 [56], is a visualization technique that aids in comprehending and interpreting CNN decisions during image classification tasks, with the primary goal of providing visual explanations for the predictions made by the CNN. When a CNN processes an image and makes a prediction, Grad-CAM helps identify which regions of the image were critical in influencing that prediction. This can be particularly helpful in understanding why the CNN might have classified an image in a certain way, providing insights into its decision-making process. By visualizing the Grad-CAM heatmap, we can obtain insights into which parts of the image the CNN focuses on to make its decision. For example, in an image of an apple, Grad-CAM might highlight the shape, color, texture, or specific markings or blemishes, indicating that those regions contributed most to the CNN's prediction of the "apple" class. The visualization results of the customized architecture are shown in Fig. 9 using Grad-CAM.

5. Discussion

The results demonstrate that the proposed XAI-FruitNet model with hybrid pooling achieves superior performance for fruit classification compared to other benchmark models and attains the highest validation

Fig. 8. Prediction using our proposed model.


Table 6
Classification performance of proposed model for each dataset.
Dataset                Test Accuracy  Average Precision  Average Recall  Average F1-Score  Average Sensitivity
Fruits-360             98.97 %        0.98               0.97            0.97              0.96
Fruits and Vegetables  97.01 %        0.97               0.96            0.96              0.96
Fruit Recognition      96.92 %        0.96               0.96            0.96              0.94
Dry Fruit              98.11 %        0.98               0.98            0.98              0.98
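The metrics reported in Table 6 follow the definitions given in Section 4.2. The sketch below shows how they could be computed from confusion counts; the counts used here are hypothetical, and how per-class values were averaged in the paper is not stated, so this covers a single binary (one-vs-rest) case only.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the Section 4.2 metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (fp + tn),
        # Harmonic mean of precision and recall.
        "f1": 2 * recall * precision / (recall + precision),
    }


# Hypothetical counts for one class treated as "positive".
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
```

For a multi-class dataset, one plausible approach is to compute these values for each class in turn and then average them, which would match the "Average" column headings.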

accuracy across multiple datasets: Fruits-360, Fruits and Vegetables Image Recognition, Fruit Recognition, and Dry Fruit. The validation accuracies range from 95.67 % to 97.79 %, beating models such as MobileNetv2, ResNet50, and VGG16. The architectural design of XAI-FruitNet, comprising six convolutional layers, one hybrid pooling layer and five max-pooling layers, was crafted to handle fruit classification tasks efficiently. The consistent validation accuracy of XAI-FruitNet underlines the advantages of its customized CNN architecture and hybrid pooling technique. By utilizing both average and max pooling, XAI-FruitNet can extract vital spatial information while concentrating on the essential features that help the model learn a varied yet distinctive representation of fruit images, thereby improving its generalization ability and demonstrating outstanding performance across various datasets without overfitting any particular one. The ReLU activation function overcomes issues such as the vanishing gradient problem and enables faster and more stable training by introducing non-linearity. Table 6 shows that XAI-FruitNet achieved the highest validation accuracy across all the datasets. This indicates that the model's hybrid pooling strategy, data augmentation techniques, and fine-tuned architecture work together to create a powerful and accurate fruit classification system. In addition to strong accuracy, XAI-FruitNet also achieved high average precision (0.96–0.98), recall (0.96–0.98), and F1-scores (0.96–0.98) across datasets. Incorporating XAI into the XAI-FruitNet model fosters trust, enables error detection, provides domain insights, promotes accessibility, guides model refinement, and provides valuable insights into visual cues that are most discriminative for different fruit classes. The transparency provided by XAI allows users to verify whether highlighted features align with their domain knowledge and expectations. In addition, this knowledge can guide farmers or fruit inspectors to prioritize specific visual traits during manual inspections. Additionally, challenging cases such as occluded, oddly shaped, or damaged fruits that are frequently misclassified can be identified through the explanations, prompting targeted improvements to the model's robustness. For real-world deployment, particularly in agricultural settings, factors such as input data size and number of classes will likely need to be considered, as they could impact the scalability of the XAI-FruitNet model. In real-world scenarios, the diversity of fruit and vegetable varieties is expected to significantly increase the computational requirements for training and inference, resulting in longer training times, higher memory demands, and challenging deployment on edge devices or embedded systems with limited computational capabilities. In addition, fruits may be partially occluded by leaves, branches, or other objects, potentially making accurate classification by the model more complex. Agricultural systems are dynamic, with new fruit varieties, environmental changes, and evolving agricultural practices. Continuous monitoring and adaptation of the model will probably be necessary to maintain its accuracy and relevance over time, which could be resource-intensive and require ongoing data collection and model updates. To address the challenges posed by the real-world deployment of the XAI-FruitNet model in agricultural settings, further research can focus on several key areas.

• Methods such as model pruning, quantization, and knowledge distillation can make the model more suitable for edge-device deployment.
• Using techniques such as Generative Adversarial Networks (GANs) to create synthetic data that include occlusions and other challenges can enhance the model's ability to handle real-world complexities.
• By integrating incremental learning strategies, the model can identify and request annotations for uncertain or novel instances, ensuring continuous improvement with minimal manual intervention.
• Leveraging distributed computing techniques, including federated learning, in which multiple devices collaboratively train a shared model while keeping data localized, can help manage resource constraints and privacy concerns.

By focusing on these research areas, the challenges of deploying the XAI-FruitNet model in real-world agricultural settings can be effectively addressed, ensuring scalability, robustness, and adaptability in dynamic agricultural environments.

Table 7
Comparison of the performance of existing approaches.
Author                  Best Model                              Dataset                Result
Aherwadi et al. [49]    Custom CNN model                        Fruit 360              81.96 %
Sulthan et al. [50]     PCA, deep learning, and k-NN methods    Fruits-360 dataset     92 %
Salim et al. [51]       ResNet-50                               Fruits-360             98.36 %
Zeeshan et al. [52]     Custom CNN model                        Fruits-360             93.8 %
Israel et al. [53]      Local Binary Pattern (LBP) with SVM     Fruits-360             89.44 %
Salim et al. [51]       DenseNet-201                            Fruit Recognition      99.13 %
Hussain et al. [54]     Deep CNN model                          Fruit Recognition      82 %
Xue et al. [36]         Attention-based densely connected       Fruit Recognition      93.78 %
                        convolutional networks with
                        convolutional autoencoder (CAE-ADN)
Our proposed model      XAI-FruitNet                            Fruits-360             98.97 %
                                                                Fruit Recognition      96.92 %
                                                                Fruits and Vegetables  97.01 %
                                                                Dry Fruit              98.11 %

6. Conclusion

The proposed XAI-FruitNet, an Explainable AI-integrated deep architecture, not only achieves state-of-the-art fruit classification but also achieves high accuracy across diverse datasets while maintaining model interpretability, surpassing popular CNN models such as MobileNetV2, ResNet50, and VGG16. XAI-FruitNet incorporates a hybrid pooling technique that selectively aggregates spatial information from average pooling and salient features from max pooling to enhance the representational power and generalization capability of the model. Furthermore, XAI-FruitNet enables model interpretability through Grad-CAM visualization, highlighting the influential regions in fruit images for predictions, shedding light on the model's decision-making process, and building trust. Deploying XAI-FruitNet in real-world fruit recognition systems would offer practical validation while developing a prototype


Fig. 9. Visualizing the images using Grad-Cam.
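The core Grad-CAM computation behind heatmaps of this kind can be sketched in pure Python. Following Selvaraju et al. [56], each feature map of the last convolutional layer is weighted by the spatial average of the class-score gradient, and the weighted sum is passed through a ReLU. The function name and the tiny 2 × 2 activations and gradients below are made-up values for illustration; in practice they would come from a deep learning framework, and the heatmap would be upsampled and overlaid on the input image.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps (H x W) and their gradients into one heatmap."""
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight: global average of the class-score gradient over the map.
    weights = [sum(map(sum, grad)) / (height * width) for grad in gradients]
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, activations):
        for r in range(height):
            for c in range(width):
                heatmap[r][c] += weight * fmap[r][c]
    # ReLU keeps only the regions that push the class score up.
    return [[max(0.0, value) for value in row] for row in heatmap]


# Hypothetical activations and gradients for two feature maps.
activations = [[[1.0, 0.0], [0.0, 2.0]],
               [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]],
             [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

In this toy case the second feature map has a negative weight, so the locations where it is active are suppressed to zero in the heatmap.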

mobile app or production-grade solution based on the model, which would uncover practical challenges and use cases.

CRediT authorship contribution statement

Shirin Sultana: Writing – original draft. Md All Moon Tasir: Writing – review & editing, Methodology. S.M. Nuruzzaman Nobel: Formal analysis, Conceptualization. Md Mohsin Kabir: Writing – review & editing, Validation. M.F. Mridha: Validation, Supervision.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Dr. M. F. Mridha reports was provided by American International University Bangladesh. Dr. M. F. Mridha reports a relationship with American International University Bangladesh that includes: employment.

Acknowledgements

The authors would like to thank the Advanced Machine Intelligence Research (AMIR) Lab for its supervision and resources.

Data availability

Data will be made available on request.

References

[1] G.C. Sunil, Yu Zhang, Cengiz Koparan, Mohammed Raju Ahmed, Kirk Howatt, Xin Sun, Weed and crop species classification using computer vision and deep learning technologies in greenhouse conditions, Journal of Agriculture and Food Research 9 (2022) 100325.
[2] Mukhriddin Mukhiddinov, Akmalbek Bobomirzaevich Abdusalomov, Jinsoo Cho, Automatic fire detection and notification system based on improved yolov4 for the blind and visually impaired, Sensors 22 (9) (2022) 3307.
[3] Van Lic Tran, Thi Ngoc Canh Doan, Fabien Ferrero, Trinh Le Huy, Nhan Le-Thanh, The novel combination of nano vector network analyzer and machine learning for fruit identification and ripeness grading, Sensors 23 (2) (2023) 952.
[4] RM.Rasika D. Abeyrathna, Victor Massaki Nakaguchi, Arkar Minn, Tofael Ahamed, Recognition and counting of apples in a dynamic state using a 3d camera and deep learning algorithms for robotic harvesting systems, Sensors 23 (8) (2023) 3810.
[5] R. Nithya, B. Santhi, R. Manikandan, Masoumeh Rahimi, Amir H. Gandomi, Computer vision system for mango fruit defect detection using deep convolutional neural network, Foods 11 (21) (2022) 3483.
[6] Nurulaqilla Khamis, Hazlina Selamat, ShuwaibatulAslamiah Ghazalli, Nurul Izrin Md Saleh, Nooraini Yusoff, Comparison of palm oil fresh fruit bunches (ffb) ripeness classification technique using deep learning method, in: 2022 13th Asian Control Conference (ASCC), IEEE, 2022, pp. 64–68.


[7] Siqi Liu, Yishu Jin, Zhiwen Ruan, Zheng Ma, Rui Gao, Zhongbin Su, Real-time detection of seedling maize weeds in sustainable agriculture, Sustainability 14 (22) (2022) 15088.
[8] Artur Janowski, Rafał Kaźmierczak, Cezary Kowalczyk, Jakub Szulwic, Detecting apples in the wild: potential for harvest quantity estimation, Sustainability 13 (14) (2021) 8054.
[9] Gregorius Natanael Elwirehardja, Jonathan Sebastian Prayoga, et al., Oil palm fresh fruit bunch ripeness classification on mobile devices using deep learning approaches, Comput. Electron. Agric. 188 (2021) 106359.
[10] Fanqianhui Yu, Tao Lu, Changhu Xue, Deep learning-based intelligent apple variety classification system and model interpretability analysis, Foods 12 (4) (2023) 885.
[11] SM Nuruzzaman Nobel, Md Asif Imran, Nahida Zaman Bina, Md Mohsin Kabir, Mejdl Safran, Sultan Alfarhood, Muhammad Firoz Mridha, Palm leaf health management: a hybrid approach for automated disease detection and therapy enhancement, IEEE Access 12 (2024) 9097–9111, https://doi.org/10.1109/ACCESS.2024.3351912.
[12] SM Nuruzzaman Nobel, SM Masfequier Rahman Swapno, Md Rajibul Islam, Mejdl Safran, Sultan Alfarhood, M.F. Mridha, A machine learning approach for vocal fold segmentation and disorder classification based on ensemble method, Sci. Rep. 14 (1) (2024) 14435.
[13] Yukun Ma, Automatic detection of oranges peel based on the yolov5 model, Highlights in Science, Eng. Technol. 34 (2023) 176–182.
[14] Zhongxian Zhou, Zhenzhen Song, Longsheng Fu, Fangfang Gao, Rui Li, Yongjie Cui, Real-time kiwifruit detection in orchard using deep learning on android™ smartphones for yield estimation, Comput. Electron. Agric. 179 (2020) 105856.
[15] Yanchao Zhang, Jiya Yu, Yang Chen, Yang Wen, Wenbo Zhang, Yong He, Real-time strawberry detection using deep neural networks on embedded system (rtsd-net): an edge ai application, Comput. Electron. Agric. 192 (2022) 106586.
[16] M.T. Vasumathi, M. Kamarasan, An effective pomegranate fruit classification based on cnn-lstm deep learning models, Indian J. Sci. Technol. 14 (16) (2021) 1310–1319.
[17] Yonis Gulzar, Fruit image classification model based on mobilenetv2 with deep transfer learning technique, Sustainability 15 (3) (2023) 1906.
[18] Linh T. Duong, Phuong T. Nguyen, Claudio Di Sipio, Davide Di Ruscio, Automated fruit recognition using efficientnet and mixnet, Comput. Electron. Agric. 171 (2020) 105326.
[19] Atif Mahmood, Sanjay Kumar Singh, Amod Kumar Tiwari, Pretrained deep learning-based classification of jujube fruits according to their maturity level, Neural Comput. Appl. 34 (16) (2022) 13925–13935.
[20] Cinmayii A. Garillos-Manliguez, John Y. Chiang, Multimodal deep learning via late fusion for non-destructive papaya fruit maturity classification, in: 2021 18th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE), IEEE, 2021, pp. 1–6.
[21] Herman Herman, Tjeng Wawan Cenggoro, Albert Susanto, Bens Pardamean, Deep learning for oil palm fruit ripeness classification with densenet, in: 2021 International Conference on Information Management and Technology (ICIMTech), volume 1, IEEE, 2021, pp. 116–119.
[22] M. Fajrun Nadhif, Saruni Dwiasnati, Classification of date fruit types using cnn algorithm based on type, MALCOM: Indonesian Journal of Machine Learning and Computer Science 3 (1) (2023) 36–42.
[23] Sm Nuruzzaman Nobel, Md Anwar Hussen Wadud, Anichur Rahman, Dipanjali Kundu, Airin Afroj Aishi, Sadia Sazzad, Muaz Rahman, et al., Categorization of dehydrated food through hybrid deep transfer learning techniques, Statistics, Optimization & Information Computing 12 (4) (2024) 1004–1018.
[24] SM Nuruzzaman Nobel, Md Ashraful Hossain, Md Mohsin Kabir, M.F. Mridha, Sultan Alfarhood, Mejdl Safran, SegX-Net: a novel image segmentation approach for contrail detection using deep learning, PLoS One 19 (3) (2024) e0298160.
[25] SM Nuruzzaman Nobel, SM Masfequier Rahman Swapno, A.C. Ramachandra, Hasibul Hossain Shajeeb, Md Babul Islam, Rezaul Haque, Hybrid CNN LSTM approach for sentiment analysis of Bengali language comment on facebook, in: 2024 International Conference on Integrated Circuits and Communication Systems (ICICACS), IEEE, 2024, pp. 1–8.
[26] Sadaf Zeeshan, Tauseef Aized, Fahid Riaz, The design and evaluation of an orange-fruit detection model in a dynamic environment using a convolutional neural network, Sustainability 15 (5) (2023) 4329.
[27] SM Nuruzzaman Nobel, Omar Faruque Sifat, Md Rajibul Islam, Md Shohel Sayeed, Md Amiruzzaman, Enhancing GI cancer radiation therapy: advanced organ segmentation with ResECA-U-net model, Emerging Science Journal 8 (3) (2024) 999–1015.
[28] SM Nuruzzaman Nobel, Shirin Sultana, Sondip Poul Singha, Sudipto Chaki, Md Julkar Nayeen Mahi, Tony Jan, Alistair Barros, Md Whaiduzzaman, Unmasking banking fraud: unleashing the power of machine learning and explainable AI (XAI) on imbalanced data, Information 15 (6) (2024) 298.
[29] Jin Wang, Cheng Zhang, Ting Yan, Jingru Yang, Xiaohui Lu, Guodong Lu, Bincheng Huang, A cross-domain fruit classification method based on lightweight attention networks and unsupervised domain adaptation, Complex & Intelligent Systems (2022) 1–21.
[30] Devrim Unay, Deep learning based automatic grading of bi-colored apples using multispectral images, Multimed. Tool. Appl. 81 (27) (2022) 38237–38252.
[31] Shenglian Lu, Wenkang Chen, Xin Zhang, Manoj Karkee, Canopy-attention-yolov4-based immature/mature apple fruit detection on dense-foliage tree architectures for early crop load estimation, Comput. Electron. Agric. 193 (2022) 106696.
[32] Harmandeep Singh Gill, Baljit Singh Khehra, Fruit Image Classification Using Deep Learning, 2022.
[33] Harmandeep Singh Gill, Baljit Singh Khehra, An integrated approach using cnn-rnn-lstm for classification of fruit images, Mater. Today: Proc. 51 (2022) 591–595.
[34] Dalia Ezzat, Aboul Ella Hassanien, Hassan Aboul Ella, An optimized deep learning architecture for the diagnosis of covid-19 disease based on gravitational search optimization, Appl. Soft Comput. 98 (2021) 106742.
[35] Arbnor Pajaziti, Fatmir Basholli, Ylber Zhaveli, Identification and classification of fruits through robotic system by using artificial intelligence, Engineering Applications 2 (2) (2023) 154–163.
[36] Gang Xue, Shifeng Liu, Yicao Ma, A hybrid deep learning-based fruit classification using attention model and convolution autoencoder, Complex & Intelligent Systems (2020) 1–11.
[37] Namal Rathnayake, Upaka Rathnayake, Tuan Linh Dang, Yukinobu Hoshino, An efficient automatic fruit-360 image identification and recognition using a novel modified cascaded-anfis algorithm, Sensors 22 (12) (2022) 4401.
[38] Normaisharah Mamat, Mohd Fauzi Othman, Rawad Abdulghafor, Ali A. Alwan, Yonis Gulzar, Enhancing image annotation technique of fruit classification using a deep learning approach, Sustainability 15 (2) (2023) 901.
[39] Kathiresan Shankar, Sachin Kumar, Ashit Kumar Dutta, Alkhayyat Ahmed, Anwar Ja'afar Mohamad Jawad, Hashim Abbas Ali, Yousif K. Yousif, An automated hyperparameter tuning recurrent neural network model for fruit classification, Mathematics 10 (13) (2022) 2358.
[40] Peng Wang, Niu Tong, Dongjian He, Tomato young fruits detection method under near color background based on improved faster r-cnn with attention mechanism, Agriculture 11 (11) (2021) 1059.
[41] Linlu Zu, Yanping Zhao, Jiuqin Liu, Fei Su, Yan Zhang, Pingzeng Liu, Detection and segmentation of mature green tomatoes based on mask r-cnn with automatic image acquisition approach, Sensors 21 (23) (2021) 7842.
[42] Harmandeep Singh Gill, G. Murugesan, Abolfazi Mehbodniya, Guna Sekhar Sajja, Gaurav Gupta, Abhishek Bhatt, Fruit type classification using deep learning and feature fusion, Comput. Electron. Agric. 211 (2023) 107990.
[43] Weiqing Min, Zhiling Wang, Jiahao Yang, Chunlin Liu, Shuqiang Jiang, Vision-based fruit recognition via multi-scale attention cnn, Comput. Electron. Agric. 210 (2023) 107911.
[44] Fruits 360 dataset data, mendeley.com. https://data.mendeley.com/datasets/rp73yg93n8/1. (Accessed 28 July 2023).
[45] Fruits and vegetables image recognition dataset, kaggle.com. https://www.kaggle.com/datasets/kritikseth/fruit-and-vegetable-image-recognition. (Accessed 28 July 2023).
[46] Fruit Recognition, kaggle.com. https://www.kaggle.com/datasets/chrisfilo/fruit-recognition. (Accessed 28 July 2023).
[47] Dry fruit image dataset, data.mendeley.com. https://data.mendeley.com/datasets/yfhgn8py5f/1. (Accessed 28 July 2023).
[48] Lilia Tightiz, Joon Yoo, Towards latency bypass and scalability maintain in digital substation communication domain with iec 62439-3 based network architecture, Sensors 22 (13) (2022) 4916.
[49] Nagnath Aherwadi, Usha Mittal, Jimmy Singla, N.Z. Jhanjhi, Abdulsalam Yassine, M Shamim Hossain, Prediction of fruit maturity, quality, and its life using deep learning algorithms, Electronics 11 (24) (2022) 4100.
[50] M. Burhanis Sulthan, Fiqih Rahman Hartiansyah, Fruit type recognition using hybrid method with principal component analysis (pca), Proceedings of Malikussaleh International Conference on Multidisciplinary Studies (MICoMS) 3 (2022) 47–49.
[51] Farsana Salim, Faisal Saeed, Shadi Basurra, Sultan Noman Qasem, Tawfik Al-Hadhrami, Densenet-201 and xception pre-trained deep learning models for fruit recognition, Electronics 12 (14) (2023) 3132.
[52] Sadaf Zeeshan, Tauseef Aized, Fahid Riaz, The design and evaluation of an orange-fruit detection model in a dynamic environment using a convolutional neural network, Sustainability 15 (5) (2023) 4329.
[53] Nohadra Behnam Israel, Adnan Ismail Al-Sulaifanie, Khorsheed Al-Sulaifanie Ahmed, A recognition and classification of fruit images using texture feature extraction and machine learning algorithms, Academic Journal of Nawroz University 13 (1) (2024) 92–104.
[54] Israr Hussain, Qianhua He, Zhuliang Chen, Automatic fruit recognition based on dcnn for commercial source trace system, Int. J. Comput. Sci. Appl. 8 (2/3) (2018) 1–14.
[55] Shamim Ahmed, Sm Nuruzzaman Nobel, Oli Ullah, An effective deep cnn model for multiclass brain tumor detection using mri images and shap explainability, in: 2023 International Conference on Electrical, Computer and Communication Engineering (ECCE), IEEE, 2023, pp. 1–6.
[56] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Parikh Devi, Dhruv Batra, Grad-cam: visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618–626.
[57] Understanding the black-box: towards interpretable and reliable deep learning models [peerj]. https://peerj.com/articles/cs-1629/. (Accessed 10 May 2024).
[58] Xuhong Li, Haoyi Xiong, Xingjian Li, Xuanyu Wu, Xiao Zhang, Ji Liu, Jiang Bian, Dejing Dou, Interpretable deep learning: interpretation, interpretability, trustworthiness, and beyond, Knowl. Inf. Syst. 64 (12) (2022) 3197–3234.
[59] Nourah Alangari, Mohamed El Bachir Menai, Hassan Mathkour, Ibrahim Almosallam, Exploring evaluation methods for interpretable machine learning: a survey, Information 14 (8) (2023) 469.
[60] Brent Mittelstadt, Interpretability and transparency in artificial intelligence, in: Oxford Handbook of Digital Ethics, 12, Oxford University Press, 2023.
