Fruit Classification
Scientific Programming
Volume 2022, Article ID 4194874, 16 pages
https://doi.org/10.1155/2022/4194874
Research Article
Fruits Classification and Detection Application Using
Deep Learning
Received 10 June 2022; Revised 4 October 2022; Accepted 4 November 2022; Published 17 November 2022
Copyright © 2022 Nur-E-Aznin Mimma et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
Computer vision and image processing techniques are considered efficient tools for classifying various types of fruits and vegetables. In this paper, automated fruit classification and detection systems have been developed using deep learning algorithms. In this work, we used two datasets of colored fruit images. The first, the FIDS-30 dataset of 971 images with 30 distinct classes of fruits, is publicly available. A major contribution of this work is to present a private dataset containing 761 images with eight categories of fruits, which have been collected and annotated by ourselves. In our work, the YOLOv3 deep learning object detection algorithm has been used for individual fruit detection across multiple classes, and the ResNet50 and VGG16 techniques have been utilized for the final classification, that is, the recognition of a single category of fruit in images. Next, we implemented the automatic fruit classification models with Flask as the web framework. We obtained 86% and 85% accuracy on the public dataset with ResNet50 and VGG16, respectively. We achieved 99% accuracy with ResNet50 and 98% accuracy with the VGG16 model on the custom dataset. The domain adaptation approach is used in this work so that the proposed deep learning-based prediction model can cope with real-world problems of diverse domains. Finally, an Android smartphone application has been developed to classify and detect fruits with the camera in real time. All the images uploaded from the Android device are automatically sent to and analyzed on the web, and finally, the processed data and results are returned to the smartphone. The custom dataset and implementation codes will be available after the manuscript has been accepted. The custom dataset can be found at: https://github.com/SumonAhmed334/dataset_fruit.
1. Introduction

Fruits contain vitamins and dietary fibers, a vital source of the human diet [1]. More than 2,000 different types of fruits are found worldwide, but most people are familiar with only 10% of them. According to fruit production statistics, million metric tons of fruits were produced worldwide in 2021, of which the largest producing countries are China, India, and Brazil [2]. An advanced agricultural fruit recognition system with a simple camera or sensor will play an excellent role for farmers and general people [3]. In this modern era of technological advancements, fruit classification and recognition systems can be used for kids' educational purposes, which interest them greatly [4]. The latest advanced computer vision technology with the utilization of deep neural networks can be used for object detection and semantic image segmentation [5].

Computer vision and advanced image processing techniques have been successfully employed in various works for automatic fruit recognition and classification. Some of the notable works on automatic fruit recognition and classification are briefly described in the following paragraphs.

Most of these automatic fruit classification works used a wide range of deep learning-based neural network frameworks. For instance, in a recent work [6], the authors designed an automatic model to recognize vegetables by image processing and computer vision approaches. The authors compare 24 different types of vegetables in the dataset by using 3,924 pictures. First, the authors trained the
data, and then preprocessed the images by resizing and normalizing. Next, they implemented the convolutional neural network (CNN) [7] learning model, which trains the data with a batch size of 16 for 100 epochs. Finally, the convolutional neural network approach developed on Keras recognized the vegetables employing color and texture characteristics and achieved approximately 95.50% accuracy. In [8], the authors utilized an MLP neural network, fuzzy logic, principal component analysis, and neural networks as additional techniques for automated skin defect identification of citrus fruits. Next, the accuracy of the proposed system has been evaluated in terms of its categorization of the decision-making approach. This work achieved defect detection accuracies of 100 percent, 91 percent, 89 percent, and 83 percent, with the decision tree, principal component analysis, fuzzy logic, and MLP neural network approaches, respectively. Joseph et al. proposed an automatic fruit classification system using a convolutional neural network framework [9]. This study uses a diverse dataset of Indian fruits of more than 130 classes. Finally, the authors claimed accuracy of approximately 95% for the proposed TensorFlow-based CNN model. Ponce and his colleagues [10] focused on the automatic classification of olives. For the experiment, 2,800 olive fruits of seven different varieties were photographed. This article used six different convolutional neural network architectures, but finally, Inception-ResNetV2 produced the best results. This model achieves a 95.91 percent accuracy rate for diverse categories of olive classification. In [11], the authors introduced computerized classification to classify fruits. They worked with a particular dataset containing four different types of fruits, all subjected to image processing algorithms. BPNN, SVM, and CNN classifiers are utilized in this paper, with CNN providing the highest accuracy of 90% compared to other classifiers. A novel fruit categorization approach based on fruit pictures and deep learning is presented in [12]. The authors created a six-layer CNN to recognize nine different types of fruits. Three datasets have been combined in this work, viz. one is a bespoke dataset and the other two are public datasets. According to the findings of the experiments, the proposed CNN method with six layers performed well in classification, with an accuracy of approximately 91.50%, which was better than the state-of-the-art classification approaches.

In some of the articles, advanced image processing and machine learning techniques have been used. Seng and his co-author [13] built an automatic fruit recognition system consisting of various image processing frameworks, i.e., input selection, color, shape, and size computing, and classification or recognition. The authors used 50 collected images for this recognition system, from a custom-made dataset that they created on their own. Next, 36 images were used to develop and train the KNN model. Finally, the authors utilized 14 fruit images to evaluate the proposed framework. Lastly, this model shows nearly 90% accuracy in computing the geometrical properties of various fruit categories. Shiv and Anand [3] built a fruit classification system using image processing techniques. The Gaussian Naive Bayes algorithm has been developed utilizing the Python platform environment. The findings for the various types of apples, i.e., Granny Smith, Braeburn, Golden Delicious, and Cripps Pink, and other fruits, for example, mandarin, lemon, and orange, indicated that the average accuracy values for the training and test datasets were 100% and 73%, respectively.

A few automatic plant and fruit classification works have been extended to different applications, e.g., disease detection, smart farming, and so on. Abayomi-Alli et al. [14] used MobileNetv2 and an open-source dataset to detect cassava leaf disease. This deep learning model achieved 97.7% accuracy and a 96.7% F1 score. Almadhor et al. [15] initiated plant disease recognition using machine learning techniques and complex sensors. The authors implemented image segmentation and feature extraction by applying image processing approaches. Oyewola et al. [16] identified cassava mosaic viruses in plants to enhance the production of crops. A custom convolutional recurrent neural network is applied to an open-source dataset of more than 5,600 images with five distinct classes.

After reviewing the above papers, we can confirm that computer vision and sophisticated deep learning approaches have been successfully applied for automatic fruit classification and detection. However, only a few studies have used deep learning models in a website framework or smartphone app for real-time classification. Most of the models are not tested to determine whether they can be implemented in diverse domains.

This paper implements automatic classification and detection of fruits using neural network approaches. This system can be used by children for educational purposes, in industries and supermarkets, and by people who want to learn about different fruits. The primary contributions of this work are described as follows:

(i) An automatic fruit detection and classification system has been developed by using two datasets, i.e., the open-source FIDS-30 of 30 classes and a collected custom dataset of eight categories of fruits.
(ii) For classification, we used the VGG16 and ResNet50 neural network models. The YOLOv3 and YOLOv7 deep learning frameworks have been employed to detect multiple fruits in the image.
(iii) The domain adaptation technique is applied so that the proposed deep learning-based fruit classification model can cope with real-world problems of diverse domains. Using this method, different sets of images of various fruits were used to train and test the proposed model.
(iv) The web framework of the proposed automatic fruit classification and detection system is created with the help of Flask.
(v) A Python-based API has been utilized to develop an Android smartphone application, which uses the phone's camera for instantaneous detection and recognition of fruits.
The primary contributions of this work are to create a custom dataset of eight species of fruits and to apply the YOLOv7 deep learning model and the domain adaptation technique.

The remainder of this manuscript is organized as follows. Section 2 discusses the materials and methods of the proposed automatic fruit classification system with appropriate tables, figures, and flowcharts. The actual results of the research are shown in Section 3. Finally, Section 4 provides some recommendations for the paper's future improvement.

2. Materials and Models

In this section, the dataset details, brief discussions about the employed deep learning models, and the performance of the system will be discussed.

2.1. Datasets. In this work, we used two datasets of various categories of fruits. One is the public dataset named FIDS-30 [17], and the pictures of this dataset are collected from Internet sources. This open-source dataset consists of 30 classes of fruits, as illustrated in Table 1. There are approximately 30 to 40 images in each category with a considerable variation of added noise. A significant contribution of this work is to present a dataset of various fruit images that we obtained and captured with smartphone cameras. In this second dataset, we work with eight classes of fruits, which are primarily available in Bangladesh. As illustrated in Table 2, this custom dataset contains 761 images of apples, coconuts, grapes, limes, oranges, tomatoes, bananas, and guavas. Figure 1 demonstrates some sample images of the open-source FIDS-30 dataset. Next, in the data splitting part, we assigned 80% of the images for training, 10% for testing, and 10% for validation. Next, traditional data augmentation techniques, e.g., projection, rotation, scaling, changing brightness and contrast, etc., are applied to both datasets to enhance the efficiency of the deep learning techniques.

Figure 2 demonstrates some sample images of the acquired custom dataset we captured with the mobile camera.

2.2. Model Descriptions. This section describes the deep learning approaches that have been employed in this work for automatic fruit detection of multiple classes and classification of a single category. In this paper, the YOLOv3 deep convolutional neural network framework has been used for fruit detection. Convolutional neural networks are a variation of deep networks, which automatically learn basic edge shapes from raw data and recognize the complicated patterns inside each picture through feature extraction [13]. Convolutional neural networks incorporate different layers, comparable to the human visual system. Among them, the convolutional layers, by and large, have filters with kernel sizes of 11 × 11, 9 × 9, 7 × 7, 5 × 5, or 3 × 3. The filters fit their weights through training and learning, and the weights can extract features, just like camera filters.

2.2.1. YOLOv3 and YOLOv7. The YOLO technique is faster than the Faster R-CNN approach because it can directly predict the bounding box and the class probability through a single neural architecture. This strategy can help to mitigate the effects of lighting variance, overlap, and occlusion [17]. YOLOv3 and YOLOv7 utilize Darknet53 and PyTorch, respectively, as the feature extraction network and use the FPN structure to accomplish the fusion of multiscale features and multiscale predictions [20]. The use of multiscale predictions helps YOLOv3 to identify tiny targets better. The most advanced single-stage object detection framework, YOLOv7 [21], offers faster inference with efficient model scaling. This model preserves multiple batches of weights for parts of the entire architecture. Hence, YOLOv3 and YOLOv7 have been used as the frameworks to detect fruits in this paper.

In the YOLO algorithm, the original pictures are first resized to the input size, utilizing a scale pyramid structure like the feature pyramid network. For instance, YOLOv3 predicts on three feature-map sizes of 13 × 13, 26 × 26, and 52 × 52 and uses upsampling to pass features between two adjacent scales. At each prediction scale, each grid cell predicts three bounding boxes with the help of 3 anchor boxes. The bounding box shape could increase detection performance when focusing on a particular task [17]. The architecture of the proposed YOLOv3 model is depicted in Figure 3. Figure 4 illustrates an example of individual fruit detection across multiple classes with the proposed YOLOv3 framework.

Next, the ResNet50 and VGG16 techniques are utilized for the final classification.

2.2.2. ResNet50. ResNet50 is a convolutional neural network that is 50 layers deep. This convolutional neural network comprises 5 stages, each with a convolution and identity block. Every convolution and identity block consists of three layers. More than 23 million trainable parameters are available in the ResNet50 framework. The top layers of ResNet50 are not frozen [9], so they learn through backpropagation. However, the rest of the ResNet50 layers remain frozen [22]. This operation is performed by building a ResNet50 model with pretrained parameters (weights) from the ImageNet dataset [23]. The pretrained network can classify images into 1,000 object classes.

The architecture of the ResNet50 model used is demonstrated in Figure 5, which is a 50-layer deep residual learning framework with five stages. This framework is devised for large-scale classification of fruits in the natural environment. The cross-entropy loss function for the ResNet50 model is expressed as follows:

\[ \mathrm{Loss} = -\frac{1}{n} \sum_{i=0}^{n} \log P(N), \tag{1} \]

where n is the output size of the classification network.
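As a worked illustration of the cross-entropy loss in (1), the following minimal sketch averages the negative log-probability that a classifier assigns to the correct class. It assumes the standard categorical form, since the paper does not define P(N) further; the function names and the example probabilities are hypothetical and are not taken from the authors' code.

```python
import math


def cross_entropy(probabilities, true_index):
    """Negative log-probability assigned to the true class of one image."""
    return -math.log(probabilities[true_index])


def mean_cross_entropy(batch_probabilities, true_indices):
    """Average the per-image loss over a batch of predictions."""
    losses = [cross_entropy(p, t) for p, t in zip(batch_probabilities, true_indices)]
    return sum(losses) / len(losses)


# Hypothetical softmax outputs for two images over three fruit classes.
batch = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
labels = [0, 1]
loss = mean_cross_entropy(batch, labels)
```

A confident, correct prediction contributes a loss near zero, while a low probability on the true class is penalized heavily, which is why this loss is a common choice for training classifiers such as ResNet50.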
Table 1: Class names and numbers of dataset-I.
Class names      Number of samples
Acerolas         24
Apples           38
Apricots         30
Avocados         26
Bananas          42
Blackberries     37
Blueberries      32
Cantaloupes      31
Cherries         33
Coconuts         26
Figs             26
Grapefruits      31
Grapes           38
Guava            33
Kiwifruit        36
Mangos           34
Olives           23
Oranges          35
Passion fruit    22
Peaches          27
Pears            32
Pineapples       34
Plums            31
Pomegranates     30
Raspberries      39
Strawberries     46
Tomatoes         46
Watermelons      39
Lemons           29
Limes            29

Table 2: Class names and numbers of dataset-II.
Class names      Number of samples
Apple            200
Orange           100
Guava            60
Lime             50
Tomato           95
Coconut          50
Banana           150
Grapes           30

2.2.3. VGG16. The main contribution of VGG16 is to demonstrate that, under certain circumstances, increasing the network's depth can improve performance [24]. This convolutional neural network model is vast, with approximately 138 million parameters. The model achieves 92.70% top-5 test accuracy on ImageNet, which is a dataset of more than 14 million pictures belonging to approximately 1,000 classes. Accordingly, the framework has learned rich feature representations for many pictures. The network has an image input size of 224 × 224. Next, we add another top layer comprising a large fully connected layer with dropout as regularization. Consequently, the pictures were downsized to 224 × 224 pixels to meet the input requirements of the pretrained VGG16 [25]. Since there are four output classes, the output layer has four nodes. We then freeze the VGG16 model, i.e., we do not train any of the weights; we use it as it is. VGG16 has 16 weight layers (13 convolutional and three fully connected), as depicted in Figure 6, and is particularly appealing due to its very homogeneous and uniform architecture.

Next, the proposed automatic fruit detection and classification systems have been deployed into a website framework and an Android smartphone application, which are discussed in the subsequent sections.

2.3. Web Framework. In this work, the Flask framework has been used to create a web application of the proposed fruit classification system. Python-based Flask is an easy-to-use and flexible web framework [26]. It is called a microframework because it does not require particular tools or libraries. It has no database abstraction layer, form validation layer, or other components where existing third-party libraries provide common functions. However, Flask supports extensions that can add application features as if they were implemented in Flask itself. Figure 7 illustrates how the proposed classification models have been deployed to web and smartphone applications using the Flask framework.

Finally, Figure 8 shows the complete fruit detection and recognition system deployed in a website framework. There are two categories: the first one is called classification, and the second one is called detection. When a user selects the classification category, the user needs to upload the image of a single fruit. After that, the name will be shown as the output on the final page of the mobile application. For detection, a user has to drop a photo of multiple fruits, and then the annotations and names will be returned as output. It is worth mentioning that the user can upload images from the mobile phone's memory or can capture images instantaneously by using its camera.

2.4. Android Application. We have created an Android smartphone application with which we can easily detect and classify fruits using a mobile camera in real time. After building the fruit detection and classification model, an API was created. We could directly load this model into the app, but we serve it via an API to decrease the final application's requirements. In addition, this system can be utilized in other applications such as websites, Telegram bots, Slack bots, and so on. Developing an API in Python is simple, and we have used the Flask REST framework in this research. It is a flexible microframework that can handle any web service with just a few lines of code. We have constructed an API endpoint for this application, which accepts all model parameters as input and returns the final prediction. The result from the API, the fruit images, and the user data will be stored in the database. This work used Firebase as a nonrelational database for the app management tasks. Figure 9 illustrates the working sequences in developing the proposed Android application of the fruit classification system.
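The request handling behind such an API endpoint might look like the sketch below. It is not the authors' implementation: it uses only the Python standard library, assumes a JSON request body carrying a base64-encoded image under a hypothetical `image` field, and takes the trained model as an injected `classifier` callable. In a Flask application, a route function would pass the request body to this handler and return its result.

```python
import base64
import json


def handle_classify_request(request_body, classifier):
    """Turn a JSON request holding a base64 image into (status, JSON response).

    classifier is any callable mapping raw image bytes to (label, confidence);
    here it stands in for the trained VGG16 or ResNet50 model.
    """
    try:
        payload = json.loads(request_body)
        image_bytes = base64.b64decode(payload["image"], validate=True)
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing field, or invalid base64 all end up here.
        return 400, json.dumps({"error": "expected JSON with a base64 'image' field"})
    label, confidence = classifier(image_bytes)
    return 200, json.dumps({"label": label, "confidence": confidence})


# Hypothetical stand-in model that always answers "apple".
def fake_classifier(image_bytes):
    return "apple", 0.99


body = json.dumps({"image": base64.b64encode(b"raw image bytes").decode("ascii")})
status, response = handle_classify_request(body, fake_classifier)
```

Keeping the model behind a small handler like this is what lets the same service answer the website, the Android app, and other clients without bundling the model into each of them.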
3. Results and Discussion

In this section, the results of the proposed automatic fruit classification system are discussed. In this work, an online framework, make sense, has been used to label the images and export them into corresponding XML files. The labeled images are preprocessed in the Roboflow tool by performing conventional preprocessing techniques, e.g., auto-orientation, object isolation (crop and extract bounding boxes into individual images), and resizing. The data augmentation process has been done to create new training examples for the model to learn from by generating augmented versions of each image in the training set [14]. In this work, bounding box level augmentation instead of picture level augmentation has been executed, as the bounding box level achieves better accuracy. Random horizontal and vertical flips, ±90° rotation, normalization, and brightness adjustment are implemented. The data augmentation techniques make the model insensitive to camera orientation, help it generalize better to various contexts, and reduce overfitting and underfitting problems. Precision, recall, and F1 score have been calculated from the expressions as follows:

\[ \mathrm{Precision} = \frac{TP}{TP + FP}, \tag{2} \]

\[ \mathrm{Recall} = \frac{TP}{TP + FN}, \tag{3} \]

\[ F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{4} \]

In (2) and (3), TP, FN, and FP are abbreviations for True Positive (correct detections), False Negative (missed identifications), and False Positive (inaccurate findings), respectively. The F1 score was used as a trade-off between recall and precision to show the overall performance of the trained models, as expressed in (4).

Figure 4: Demonstration of fruit detection of multiple classes of fruits of the implemented YOLOv3 model.

3.1. Results for the FIDS-30 Open-Source Dataset. Tables 3 and 4 demonstrate the F1 score for the public FIDS-30 dataset in the ResNet50 and VGG16 models, respectively. According to these tables, the mean F1 scores for the VGG16 and ResNet50 frameworks for the FIDS-30 dataset are 0.878 and 0.843, respectively.

Next, Figures 10 and 11 illustrate the performance of the ResNet50 and VGG16 classification algorithms, respectively, which have been summarized using confusion matrices. Figures 12 and 13 illustrate the training and validation behaviors versus epochs of the VGG16 model with the FIDS-30 dataset. We have used VGG16 as our final classification model because this technique gives us more accuracy than ResNet50.

In this work, the YOLOv7 deep learning model has been implemented in the Roboflow environment and the PyTorch framework. The hyperparameters used to train the YOLOv7 model are listed in Table 5. The model has been trained for ten epochs with a batch size of 16, and the training stops if the validation loss does not improve. The YOLOv7 model achieved 96.1% accuracy, 0.93 precision, and 0.89 recall for the FIDS-30 dataset of 30 fruit categories. Testing accuracy and precision with the increment of epochs for the YOLOv7 model are demonstrated in Figure 14.

3.2. Results for the Custom Dataset. In this section, the proposed models' performance has been evaluated using the custom dataset obtained by the authors of the manuscript. The proposed VGG16 and ResNet50 convolutional neural networks have attained 99% and 98% accuracy, respectively, on the test data after the end of the tenth epoch. The confusion matrix in Figure 15 demonstrates how the models performed on the test dataset for eight classes of fruits. Figure 16 shows the behavior of training and validation accuracies with respect to epoch numbers for the VGG16 technique. According to Figure 17, the training and validation losses of the VGG16 framework reduce significantly with the change of epochs. The detailed results for the ResNet50 model on the custom dataset have not been reported, as the VGG16 model performed slightly better than the ResNet50 technique.

3.3. Results for the Domain Adaptation Technique. Next, the domain adaptation technique is applied to substantiate the efficiency of the proposed fruit detection and classification system. According to this approach, the deep learning model was initially trained on source images and evaluated on a dataset from a different source [27]. Various cross-domain adaptation experiments have been performed to evaluate the performance of the proposed fruit classification system. First, the proposed models were trained with the custom dataset and tested with the open-source FIDS-30 dataset. According to Table 6, in this case, the ResNet50 and VGG16 techniques exhibit testing accuracies of 85% and 86%, respectively. Next, the proposed models were trained with the FIDS-30 public dataset and tested with the custom dataset. For this case, the ResNet50 and VGG16 techniques exhibit testing accuracies of 98% and 99%, respectively, with the custom dataset.

3.4. Android Development. In this work, we have designed a simple Android smartphone application with a user-friendly interface and named it Fruit Holmes. We used
Figure 5: Architecture of the ResNet50 model used.
[Figure 6: VGG16 layer sequence. INPUT; Conv 1-1, Conv 1-2, Pooling; Conv 2-1, Conv 2-2, Pooling; Conv 3-1, Conv 3-2, Conv 3-3, Pooling; Conv 4-1, Conv 4-2, Conv 4-3, Pooling; Conv 5-1, Conv 5-2, Conv 5-3, Pooling; Dense, Dense, Dense; OUTPUT.]
[Figure 7: Deployment flow with a Flask server and PyTorch. (1) URL from user; (3) send request to server and download images from URL; (4) store images and update; (5) request model to predict classes on new downloaded images; (6) send prediction to generate HTML web page.]
Figure 8: Work flowchart of the implementation of the proposed system in a web application.
Table 3: F1 score for all 30 classes in the ResNet50 model for the FIDS-30 dataset.
Class names F1 scores
Acerolas 0.80
Apples 0.60
Apricots 0.50
Avocados 1
Bananas 0.89
Blackberries 1
Blueberries 0.80
Cantaloupes 0.80
Cherries 1
Coconuts 1
Figs 0
Grapefruits 1
Grapes 0.86
Guava 0.67
Kiwifruit 1
Mangos 1
Olives 1
Oranges 0.80
Passion fruit 1
Peaches 0.75
Pears 1
Pineapples 1
Plums 0.86
Pomegranates 1
Raspberries 0.80
Strawberries 0.80
Tomatoes 0.80
Watermelons 0.89
Lemons 0.86
Limes 1
Table 4: F1 score for all 30 classes in the VGG model for the FIDS-30 dataset.
Class names F1 scores
Acerolas 0.57
Apples 1
Apricots 0.40
Avocados 0.89
Bananas 0.89
Blackberries 0.89
Blueberries 1
Cantaloupes 0.89
Cherries 0.86
Coconuts 1
Figs 0.89
Grapefruits 1
Grapes 0.80
Guava 0.57
Kiwifruit 1
Mangos 0.67
Olives 1
Oranges 0.67
Passion fruit 0.67
Peaches 0.86
Pears 1
Pineapples 0.75
Plums 0.86
Pomegranates 1
Raspberries 0.57
Strawberries 1
Tomatoes 1
Watermelons 1
Lemons 0.92
Limes 0.67
Figure 10: Confusion matrix of ResNet50 model for the FIDS-30 dataset.
Android Studio to create a remote Android app and to implement several libraries. We used the VGG16 technique for the proposed Android application because it gives us more accurate results than other models. Next, we used the Flask framework to connect the models. Flask assists us in creating the APK link. After copying the URL into the app-specific blank box, the app and models are connected. Finally, we add text-based and verbal speech descriptions of the detected fruits to this app using the necessary libraries. After the output fruit has been detected, there will be an option "say." After clicking that button, the designed mobile application will start reading the name and descriptions of the specific fruit. Figure 18 shows some example images of the proposed application in real time for fruit classification (left) and detection (right).
Figure 11: Confusion matrix of VGG16 for the FIDS-30 dataset.
Figure 12: Training and validation accuracies vs. epoch graphs of the VGG16 network for FIDS-30 dataset.
Table 7 illustrates the comparison of the proposed fruit detection and classification system with other similar works on the FIDS-30 dataset. According to this table, the proposed automatic fruit classification and detection system outperformed most of the works in terms of classification accuracy and other metrics.
Figure 13: Training and validation loss vs. epochs of the VGG16 model for FIDS-30 dataset.
Figure 14: Accuracy and precision vs. epochs for the FIDS-30 dataset in YOLOv7.
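The stopping rule described for YOLOv7 training in Section 3.1 (at most ten epochs, ending early when the validation loss stops improving) can be sketched as follows. This is a generic early-stopping loop, not the authors' training code: the `patience` value and the list of validation losses are assumptions made for illustration.

```python
def train_with_early_stopping(validation_losses, max_epochs=10, patience=2):
    """Return (epochs run, best validation loss).

    validation_losses stands in for the loss measured after each epoch;
    patience is the number of consecutive non-improving epochs tolerated.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for loss in validation_losses[:max_epochs]:
        epochs_run += 1
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss has stalled, so stop training
    return epochs_run, best_loss


# Hypothetical validation losses: improvement stalls after the third epoch.
stalled = train_with_early_stopping([1.0, 0.8, 0.7, 0.75, 0.72, 0.6])
```

With these made-up losses the loop stops after the fifth epoch, because the fourth and fifth epochs both fail to beat the best loss seen so far.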
Rows: actual class index. Columns: predicted class index.
        0    1    2    3    4    5    6    7
0      23    0    0    0    0    0    0    0
1       0    6    0    0    0    0    0    0
2       0    0    7    0    0    0    0    0
3       0    0    0    4    0    0    0    0
4       0    0    0    0   12    0    0    0
5       0    0    0    0    0   10    0    0
6       0    0    0    0    0    0   14    0
7       0    0    0    0    0    0    0    3
Figure 15: Confusion matrix of VGG16 technique for the custom dataset.
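Equations (2) to (4) can be applied directly to a confusion matrix like the one above. The sketch below computes per-class precision, recall, and F1 score; the function name and the small 3-class matrix are hypothetical and are used only to show the arithmetic, not to reproduce the paper's results.

```python
def per_class_metrics(confusion):
    """Return one (precision, recall, f1) tuple per class.

    confusion[i][j] counts images of actual class i predicted as class j.
    """
    size = len(confusion)
    metrics = []
    for k in range(size):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(size)) - tp  # predicted k, actually another class
        fn = sum(confusion[k]) - tp                          # actually k, predicted as another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics.append((precision, recall, f1))
    return metrics


# Hypothetical 3-class matrix (rows: actual, columns: predicted).
example = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 5],
]
results = per_class_metrics(example)
```

For the first class of this example, TP = 8, FP = 1, and FN = 2, so precision is 8/9, recall is 0.8, and the F1 score is 16/19. Averaging the per-class F1 values gives a mean F1 score of the kind reported for Tables 3 and 4.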
Figure 16: Training accuracy vs. validation accuracy graph for custom dataset in VGG16 model.
Figure 17: Training loss and validation loss graph for custom dataset in VGG16.
Figure 18: Examples of the sample work of the designed smartphone application interface.
Table 7: Comparison of the proposed system with other existing works on FIDS-30 dataset.
References   Networks      Accuracies (%)   Other metrics
[28]         ResNet50      89.16            F1 score: 0.89
[29]         MobileNetv3   89.7
[30]         Inceptionv3   89.7
This work    YOLOv7        96.1             Precision: 0.93, recall: 0.89