Classification of Nutmeg Ripeness Using Artificial Intelligence
Corresponding Author:
Imam Bil Qisthi
Department of Electrical Engineering, Faculty of Industrial Technology, Gunadarma University
Pesona Kahuripan 5 B7/36, Cilengsi, West Java, Indonesia
Email: [email protected]
1. INTRODUCTION
Nutmeg is a high-value plant originating from Indonesia. The plant, with the Latin name Myristica
fragrans, is classified as a spice plant, and almost all of its parts have potential uses. In the culinary
field, nutmeg pulp can be processed into candied nutmeg, nutmeg syrup, pickles, jams, and dodol. Myristica
fragrans has also been widely used in traditional medicine to treat several diseases, and it has potential
antimicrobial, antioxidant, anti-inflammatory, antiulcer, anticancer, aphrodisiac, and various other
activities because it contains various phytochemicals such as lignans, neolignans, diphenylalkanes,
phenylpropanoids, and terpenoids [1]–[4]. The seeds, mace, and oil of the nutmeg plant also have high
economic value and are export commodities. Indonesia is the largest nutmeg producer in the world; most
nutmeg plantations in Indonesia are cultivated by community plantations and the rest by large plantations.
According to Indonesian plantation development statistics, nutmeg production in Indonesia reached
39,577 tons in 2021 [5]. Nutmeg essential oil contains up to 80% trimyristin, which is a skin-whitening
agent. Nutmeg, especially the seed, is a source of essential oil with high economic value because this
part contains around 20% to 40% fixed oil [6].
Indonesia is the largest producer and exporter of nutmeg on the world market. Sukabumi is one of
the regencies in West Java with the largest nutmeg plantation areas, at 1,679 ha. Based on observations
we made at one of the distilleries in the Sukabumi area, nutmeg farmers do not pay attention to the
maturity level of nutmeg seeds after drying, so seeds with low maturity are mixed with seeds that have
an optimal maturity level. This has become a common problem for nutmeg refiners. Most of the oil in the
nutmeg plant is found in the seeds, and the oil content of the seeds is highest when their moisture
content is low, in other words when they are fully dried. Drying is one way to reduce the moisture
content of nutmeg seeds to the required standard. According to Jaiyeoba et al. [7], moisture content
decreases as the maturity level of the seeds increases, which means that the harvest time, or maturity
level of the fruit, affects the quality of nutmeg after drying. Farmers' lack of awareness of the
importance of the harvesting and post-harvest processes can therefore reduce the quality of nutmeg.
This research aims to make it easier for nutmeg oil refiners to classify the maturity of nutmeg
seeds. The study uses a convolutional neural network (CNN), a deep learning method, to solve the
classification problem. This type of deep learning uses a simpler network scheme with fewer connections,
which is computationally much more efficient [8]. CNN was chosen because of research conducted by
Bogar et al. [9]. Given the rapid development of mobile technology, this research uses Android mobile
technology to make the classifier accessible to the community, especially nutmeg oil refining
entrepreneurs [9].
2. RELATED WORKS
CNN is an established algorithm that continues to be developed by researchers in the field of
deep learning. This study builds on previous research as a reference. Several studies have applied
CNNs to a variety of objects.
Melinda et al. [9] introduce a mobile phone application capable of differentiating individuals with
autism spectrum disorder (ASD) from those with typical brain signals, based on asynchronous EEG brain
signals. The study develops a preprocessing algorithm and uses the BCI2000 EEG data signal, with the
process automated in Python. It employs a deep learning CNN as the output model, deployed using
Python-Flask, so that the diagnosis of EEG signals for ASD and normal patients is accessible across
various platforms through a REST API. Bacus and Linsangan [10] classify diseases in papaya leaves
using the MobileNet CNN architecture. The study used 1,394 images and obtained a high average accuracy
of 91.667%. It should be noted that the dataset used is imbalanced, so the model may overfit.
Paulson and Ravishankar [11] aim to identify types of herbal plants using deep learning. There are
64 types of herbal plants to be identified by a CNN model and by the pre-trained visual geometry group
(VGG) models VGG16 and VGG19. The reported accuracies are about 95.79% for the CNN, 97.8% for VGG16,
and 9.6% for VGG19. Roslan et al. [12] investigate the performance of CNN on a dataset of medicinal
herbal plants. The dataset covers herbal plants on the Island of Pinang, Malaysia, and consists of
original data plus additional augmented data. Testing the model with the original data resulted in 75%
accuracy, and testing it with the original and additional data resulted in 88% accuracy. The study
concludes that the accuracy of the model can increase with the amount of data.
Wang et al. [13] detect diseases and pest infections in plants using a CNN architecture called the
ultra-lightweight efficient network. The architecture consists of two modules: a feature extraction
module that adopts residual depth-wise convolution, and a qualification module that accepts multi-scale
features enhanced by a spatial pyramid pooling layer. The resulting model produces good results while
remaining lightweight. Wiryana et al. [14] classify store products using the CNN algorithm. The test
uses 1,050 product images divided into 35 labels and split into three sets: 80% training data, 10%
validation data, and 10% test data. The images are preprocessed to a size of 256×256 pixels. The model
was trained with six convolution layers for 50 epochs, with an execution time of 33 minutes, and
achieved an accuracy of 91.37%.
Desai et al. [15] classify flower types using the CNN algorithm, with the VGG19 architecture as a
feature extractor for flower image data. The study obtained 100% accuracy for training and 91.1% for
validation across 17 flower classes. Dyrmann et al. [16] classify plant species using a deep
convolutional neural network (DCNN). The network was built from scratch and was trained and tested on
a total of 10,413 images containing 22 weed and plant species in the early stages of growth. For these
22 species, the network achieves a classification accuracy of 86.2%.
Rathi et al. [17] classify fish species using a DCNN algorithm and image processing methods,
namely Otsu binarization, dilation, and erosion. The study obtained an accuracy of 96.29%. Liew et al.
[18] classify gender using a CNN algorithm. The study used two publicly available datasets, SUM and
AT&T, with an input image size of 32×32. The accuracy was 98.75% for SUM and 99.38% for AT&T.
Vishnupriya and Meenakshi [19] classify music genres using neural network (NN) algorithms and
mel frequency cepstral coefficient (MFCC) feature extraction. The dataset covers ten different genres,
and the study obtained 76% accuracy for training. Lu et al. [20] classify fruits using a CNN with six
layers consisting of convolution layers, pooling layers, and fully connected layers. The study obtains
91.44% accuracy, better than three state-of-the-art approaches: support vector machine, wavelet
entropy, and genetic algorithm. Razali et al. [21] classify nutrient deficiencies in oil palms from
leaf images using the CNN algorithm. The study used 180 data samples and several CNN architectures,
and found AlexNet to be an efficient architecture with few layers.
3. METHOD
This section describes the application in general terms and the stages of preparing data for the
model in this study. The preparation of image data runs from collection until the data can be used by
the model, including resizing, augmentation, and processing of the image data, so that the procedure
can be understood and developed further in the future.
day 0 (after peeling) to day 5. Fifth, the distance between the camera and the object during image
collection is 30 cm. These provisions are based on the results of direct observation. A total of 240
images were collected under these provisions. Not all of the collected images are displayed; one image
for each class is shown in Figure 2. Figure 2(a) shows the LowQuality class, Figure 2(b) the MidQuality
class, and Figure 2(c) the HighQuality class. The data labeling in this study is based on direct
observation together with nutmeg seed oil refining business actors located in Sukabumi, Indonesia.
Figure 2. Image for (a) LowQuality, (b) MidQuality, and (c) HighQuality
Next, the data are processed so that they can be used by the model. The images are resized to
224×224, multiplied using augmentation techniques, and the background is removed using image
segmentation. The model is thus expected to work more efficiently, because the smaller images still
contain information that can be learned and only the segmented pixels, not all image pixels, are
processed.
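The resizing step can be illustrated with a minimal sketch. The source does not state which library or
interpolation method was used, so the nearest-neighbour sampling and the function name `resize_nearest`
are assumptions made for illustration only; images are represented as plain nested lists of pixel
values, and only the 224×224 target size comes from the text.

```python
def resize_nearest(image, out_h=224, out_w=224):
    """Resize a 2-D image (list of rows) with nearest-neighbour sampling.

    Hypothetical helper: the interpolation method is an assumption;
    only the 224x224 target size is taken from the text.
    """
    in_h, in_w = len(image), len(image[0])
    return [
        # Map each output coordinate back to the source pixel it falls in.
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Usage: a synthetic 448x448 test pattern is reduced to 224x224.
source = [[(x + y) % 256 for x in range(448)] for y in range(448)]
resized = resize_nearest(source)
print(len(resized), len(resized[0]))
```

Nearest-neighbour sampling is chosen here only because it is easy to follow; an image library with
bilinear or area interpolation would normally give smoother results.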
The image data then go through the augmentation process. The general purpose of this stage is to
multiply the image data by modifying each image so that the computer treats it as a different image.
Two augmentations are used: rotation and horizontal flipping.
Figure 4. Figure 4(a) is the original image before augmentation, Figure 4(b) is the horizontally
flipped image, and Figure 4(c) is the image rotated by an angle of 359 degrees.
Figure 4. Image for (a) original image, (b) horizontal flip image, and (c) rotate image
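The two augmentations can be sketched as follows. This is an illustrative sketch, not the authors'
implementation: the helper names `hflip`, `rotate`, and `augment` are hypothetical, the rotation uses
simple nearest-neighbour inverse mapping about the image centre, and pixels that fall outside the
source are filled with 0. The 359-degree default follows the angle mentioned for Figure 4(c); the
direction of rotation is an arbitrary choice here.

```python
import math


def hflip(image):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate(image, degrees, fill=0):
    """Rotate about the image centre, keeping the original size.

    Assumption: nearest-neighbour inverse mapping; the source does not
    say how the rotation was performed.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # For each output pixel, look up the source pixel it came from.
            dx, dy = x - cx, y - cy
            sx = round(cos_t * dx + sin_t * dy + cx)
            sy = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def augment(images, degrees=359):
    """Return each original followed by its flipped and rotated copies."""
    result = []
    for image in images:
        result.extend([image, hflip(image), rotate(image, degrees)])
    return result


# Usage: every input image yields three images after augmentation.
tiny = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(len(augment([tiny, tiny])))
```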
After augmentation, the image data grew from the original 240 images to 720 images, consisting of
240 original images, 240 rotated images, and 240 horizontally flipped images. Next, the data are
segmented to separate the object from the background and to limit the image area processed by the
model.
Figure 5. Image for (a) original image, (b) segmented image, (c) inverted image, (d) binary image,
and (e) ROI image
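A minimal sketch of how a binary mask can be used to isolate a region of interest (ROI) is given
here. It is an assumption-laden illustration rather than the segmentation method used in this study:
it assumes a grayscale image stored as nested lists, an object darker than its background, and a fixed
threshold of 128; the helper names `binary_mask`, `bounding_box`, and `crop_roi` are hypothetical.

```python
def binary_mask(gray, threshold=128):
    """Mark pixels darker than the threshold as object (1), others as 0.

    Assumption: dark object on a light background with a fixed threshold.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def bounding_box(mask):
    """Return (top, bottom, left, right) of the object pixels, or None."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


def crop_roi(gray, mask):
    """Crop to the bounding box and zero out the background pixels."""
    box = bounding_box(mask)
    if box is None:
        return []
    top, bottom, left, right = box
    return [
        [gray[y][x] if mask[y][x] else 0 for x in range(left, right + 1)]
        for y in range(top, bottom + 1)
    ]


# Usage on a tiny synthetic image: 255 is background, low values are object.
gray = [
    [255, 255, 255, 255],
    [255, 40, 60, 255],
    [255, 50, 255, 255],
    [255, 255, 255, 255],
]
mask = binary_mask(gray)
roi = crop_roi(gray, mask)
print(roi)
```

With a fixed threshold, object pixels that are as bright as the background are dropped from the mask,
which is one way parts of an object can be lost during segmentation.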
wasted in Table 1, particularly in the LowQuality category. It seems that the segmentation process is not
perfect, which is causing some object parts to be lost. To improve the results of data processing, it may be
necessary to refine the segmentation process.
The model's classification performance is shown in the confusion matrix in Figure 7. From this
confusion matrix, the accuracy, precision, recall, and F1-score values are calculated. These values
are used to judge whether the model that has been created is a good model. The calculation results are
shown in Table 2.
Several conclusions can be drawn from Table 2. First, the model has high accuracy on the test data,
reaching 100% for all classes, which shows that the model can classify the data accurately. Second, the
precision for all classes reaches 100%, indicating that all of the positive predictions made by the
model are correct. Third, the recall is also very good: it is 100% for the "LowQuality" and
"MidQuality" classes and 92.3% for the "HighQuality" class, indicating that most samples in the
"HighQuality" class were identified by the model. Fourth, all classes have a high F1-score, reaching
100% for the "LowQuality" and "MidQuality" classes and 95.9% for the "HighQuality" class. Fifth, the
model performs well in all classes, with no class performing markedly worse than the others.
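For reference, the way accuracy, precision, recall, and F1-score are derived from a confusion matrix
can be sketched as follows. The matrix below is a made-up toy example and is not the confusion matrix
of Figure 7; the function name `per_class_metrics` and the convention that rows are true classes and
columns are predicted classes are assumptions for illustration.

```python
def per_class_metrics(cm):
    """Compute overall accuracy and per-class precision, recall, and F1.

    cm[i][j] is the number of samples of true class i predicted as class j
    (row = true class, column = predicted class; an assumed convention).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    metrics = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                        # missed samples of class i
        fp = sum(cm[r][i] for r in range(n)) - tp   # other classes predicted as i
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return correct / total, metrics


# Hypothetical counts for three classes (NOT the values behind Table 2).
toy_cm = [
    [10, 0, 0],
    [0, 9, 1],
    [0, 2, 8],
]
accuracy, metrics = per_class_metrics(toy_cm)
print(f"accuracy = {accuracy:.3f}")
for name, m in zip(["LowQuality", "MidQuality", "HighQuality"], metrics):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Because F1 is the harmonic mean of precision and recall, a class with perfect precision but imperfect
recall still has an F1-score below 100%.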
Based on these points, it can be concluded that the model classifies the three classes very well.
Its consistent performance and high evaluation metric values show that the model can be relied upon
for classification tasks on the given data. However, these results are based on the data that has been
provided, and model performance may vary on different data. Further testing with data that is
completely separate from the training data is required to validate the performance of the model and to
ensure that it generalizes well.
errors and save time. However, despite the success in building the model and implementing it in an
Android application, obstacles remain, such as misclassifications by the model, and the model may
continue to be developed, both in its structure and in how the data are processed.
5. CONCLUSION
Based on the results of the design and testing in this study, a model that classifies the maturity
level of nutmeg seeds using artificial intelligence has been successfully created and implemented in
an Android application. The training accuracy of 97.92% indicates that the CNN model is effective in
recognizing the features contained in the images. Testing the model and the application with new test
data also showed good results: the model was able to classify new test data into one of the three
existing classes, and the application ran properly on several Android devices with different
specifications. These results can be developed further. First, data processing can be improved,
because some parts of the object are still lost during image segmentation and the lost parts carry
information, so the segmentation process needs to be refined in further research. Second, some new
test data are misclassified, so future research can refine the model structure and increase the amount
of training data to improve accuracy. Finally, the research can be extended by implementing the
classification model on devices other than Android so that it can help in the industrial field.
REFERENCES
[1] V. Nikolic et al., “Chemical composition, antioxidant and antimicrobial activity of nutmeg (myristica fragrans houtt.) seed
essential oil,” Journal of Essential Oil Bearing Plants, vol. 24, no. 2, pp. 218–227, Mar. 2021, doi:
10.1080/0972060X.2021.1907230.
[2] C. R. Kholibrina and A. Aswandi, “The aromatherapy formulation of essential oils in reducing stress and blood pressure on
human,” IOP Conference Series: Earth and Environmental Science, vol. 914, no. 1, pp. 1-7, Nov. 2021, doi: 10.1088/1755-
1315/914/1/012072.
[3] S. Das et al., “Assessment of chemically characterised myristica fragrans essential oil against fungi contaminating stored scented
rice and its mode of action as novel aflatoxin inhibitor,” Natural Product Research, vol. 34, no. 11, pp. 1611–1615, Jun. 2020, doi:
10.1080/14786419.2018.1519826.
[4] M. T. Ha, N. K. Vu, T. H. Tran, J. A. Kim, M. H. Woo, and B. S. Min, “Phytochemical and pharmacological properties of
myristica fragrans houtt.: an updated review,” Archives of Pharmacal Research, vol. 43, no. 11, pp. 1067–1092, Nov. 2020, doi:
10.1007/s12272-020-01285-4.
[5] E. Kembauw et al., “Cultivation system and marketing chain of nutmeg in East Seram District, Maluku Province, Indonesia,”
International Journal of Multidisciplinary Sciences and Arts, vol. 1, no. 2, pp. 134–139, Jan. 2023, doi:
10.47709/ijmdsa.v1i2.2015.
[6] A. Trifan, G. Zengin, I. Korona-Glowniak, K. Skalicka-Woźniak, and S. V. Luca, “Essential oils and sustainability: in vitro
bioactivity screening of myristica fragrans houtt. post-distillation by-products,” Plants, vol. 12, no. 9, pp. 1-16, Apr. 2023, doi:
10.3390/plants12091741.
[7] K. F. Jaiyeoba, C. A. Ogunlade, O. S. Kwanaki, and O. K. Fadele, “Moisture dependent physical properties of nutmeg (myristica
fragrans) relevant for design of processing machines,” Current Journal of Applied Science and Technology, vol. 39, no. 12, pp.
74–85, Jun. 2020, doi: 10.9734/cjast/2020/v39i1230665.
[8] Y. Wang et al., “A survey on deploying mobile deep learning applications: a systemic and technical perspective,” Digital
Communications and Networks, vol. 8, no. 1, pp. 1–17, Feb. 2022, doi: 10.1016/j.dcan.2021.06.001.
[9] M. Melinda, F. Arnia, A. Yafi, N. A. C. Andryani, and I. K. A. Enriko, “Design and implementation of mobile application for
CNN-based EEG identification of autism spectrum disorder,” International Journal on Advanced Science, Engineering and
Information Technology, vol. 14, no. 1, pp. 57–64, Feb. 2024, doi: 10.18517/ijaseit.14.1.19676.
[10] J. A. Bacus and N. B. Linsangan, “Detection and identification with analysis of carica papaya leaf using android,” Journal of
Advances in Information Technology, vol. 13, no. 2, pp. 162–166, 2022, doi: 10.12720/jait.13.2.162-166.
[11] A. Paulson and S. Ravishankar, “AI based indigenous medicinal plant identification,” in 2020 Advanced Computing and
Communication Technologies for High Performance Applications (ACCTHPA), Jul. 2020, pp. 57–63, doi:
10.1109/ACCTHPA49271.2020.9213224.
[12] N. A. M. Roslan, N. M. Diah, Z. Ibrahim, Y. Munarko, and A. E. Minarno, “Automatic plant recognition using convolutional
neural network on malaysian medicinal herbs: the value of data augmentation,” International Journal of Advances in Intelligent
2450 ISSN: 2252-8938
BIOGRAPHIES OF AUTHORS