The Influence of Parallel Computing On Building Deep Learning Model For The Classification of Bean Diseases

In recent years, the use of deep learning techniques for image classification has made significant strides in the field of agriculture. One of the key areas of interest in agriculture is the early detection and classification of diseases in crops, as this can have a significant impact on crop revenue and quality. This research investigated the influence of parallel computing on the performance of a deep learning-based classification model for diagnosing bean diseases.
Volume 9, Issue 7, July – 2024 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24JUL1251

The Influence of Parallel Computing on Building Deep Learning Model for the Classification of Bean Diseases

1 Jean Bosco Gashugi; 2 Dr. Emmanuel Bugingo (PhD)
Master of Science with Honors in Information Technology, at the University of Kigali, Rwanda

Abstract:- In recent years, the utilization of deep learning techniques for image classification has made significant strides in the field of agriculture. One of the key areas of interest in agriculture is the early detection and classification of diseases in crops, as this can have a significant impact on crop revenue and quality. This research has investigated the influence of parallel computing on the performance of a deep learning-based classification model for diagnosing bean diseases. Specifically, we have explored the use of parallel computing frameworks to accelerate model training and inference, thereby enhancing the efficiency and effectiveness of disease classification. Our findings demonstrated the potential for parallel computing to accelerate model training. When training a bean disease classification model, we achieved an accuracy of 0.93 using parallel computing, compared to 0.83 with serial computing. Moreover, parallel computing significantly reduced training time, taking only 3 minutes compared to 51 minutes with serial computing.

I. INTRODUCTION

Parallel computing has emerged as a transformative technology in various domains, revolutionizing the efficiency and scalability of computational tasks (Wu, 2021). In the realm of agriculture, early and accurate disease detection in crops is crucial for maximizing yield and minimizing losses. Deep learning, particularly convolutional neural networks (CNNs), has emerged as a powerful tool for building disease classification models. However, these models often require significant computational resources, especially when dealing with large datasets of images.

This paper explores the influence of parallel computing on the development of deep learning models for bean disease classification. By harnessing the computational power of parallel architectures, such as Graphics Processing Units (GPUs) and distributed computing frameworks, we overcome the computational bottlenecks inherent in training deep neural networks on large-scale datasets (Geng, 2020).

II. METHODOLOGY

A. Data Collection Methods and Instruments/ Tools
Data collection in deep learning for image classification is a critical step that significantly impacts the performance of the model. For this research, we chose a dataset provided by AI-Lab Makerere because its leaf images are high-resolution, annotated, and standardized in size, format, and color balance, which enhances the quality and consistency of the dataset. Additionally, these leaf images closely resemble local bean leaves here in Rwanda (AI-Lab-Makerere, 2020).

• Jupyter Notebook
During this research, we used Jupyter Notebook as an interactive computing environment well suited to developing, training, and evaluating deep learning models. It works with popular deep learning frameworks such as TensorFlow and with many other installed libraries (Brian, 2014).

• Python Language
The Python language serves as a powerful and flexible tool for implementing algorithms and building models. It enables developers to easily express complex mathematical computations and build intricate neural networks, thanks to its simplicity, readability, and robust ecosystem (python, 2024).

• Parallel Computing
Parallel computing offers a compelling solution to address the computational challenges associated with deep learning for bean disease classification. Hegde Vishakh and Sheema Usmani (2017) investigated the use of parallel computing frameworks like TensorFlow or PyTorch to distribute the training workload across multiple GPUs, achieving significant speedups (Wang, 2019). SLURM was used as a job scheduler to efficiently distribute and manage the computational workload across multiple nodes of a cluster (Slurm & Deep Learning, n.d.).

• Ray Core
Ray Core is the heart of Ray, an open-source framework for parallel and distributed Python applications. It offers a small set of powerful primitives that enable you to easily leverage the processing power of multiple cores or even a cluster of machines (Team, 2024).
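
As a hedged illustration of the primitives described above, the sketch below wraps a function as a Ray task with `ray.remote`, submits several calls without blocking, and gathers the results with `ray.get`. The `score_image` function and the number of tasks are hypothetical placeholders rather than the authors' code, and the serial fallback exists only so the snippet still runs where Ray is not installed.

```python
try:  # Ray is optional so the sketch also runs where it is not installed
    import ray
except ImportError:
    ray = None


def score_image(image_id):
    """Hypothetical stand-in for independent per-image work."""
    return image_id * image_id


def run_tasks(image_ids):
    """Run score_image over image_ids, in parallel when Ray is available."""
    if ray is None:
        # Serial fallback: same results, no parallelism.
        return [score_image(i) for i in image_ids]
    ray.init(ignore_reinit_error=True)        # start (or reuse) a local Ray runtime
    remote_score = ray.remote(score_image)    # turn the function into a Ray task
    futures = [remote_score.remote(i) for i in image_ids]  # non-blocking submits
    results = ray.get(futures)                # wait for all tasks, input order kept
    ray.shutdown()
    return results


results = run_tasks(list(range(8)))
```

The same submit-then-gather shape scales from the cores of one machine to a cluster, because Ray schedules the tasks while the calling code stays unchanged.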

IJISRT24JUL1251 www.ijisrt.com 1498



B. Data Analysis
To enhance deep learning model performance in
distinguishing between healthy and diseased bean plants,
robust datasets comprising images of both conditions are
essential. Implementing data preprocessing techniques,
including image resizing, normalization, and augmentation,
is crucial to refine the data quality and augment model
accuracy. Furthermore, leveraging parallel computing
facilitates swift and efficient execution of tasks like parallel
image resizing and data augmentation across multiple
processors, thereby expediting the preprocessing phase and
overall model development process (Elhoucine Elfatimi,
2023).
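
The preprocessing steps named above (resizing, normalization, augmentation) can be sketched as a parallel map over images. This is a minimal, dependency-free illustration and not the authors' pipeline: the toy grayscale images, the 2x2 target size, and all function names are assumptions. A thread pool keeps the sketch portable; for CPU-bound work a process pool or Ray tasks would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an image stored as a list of pixel rows."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize(image):
    """Scale 0..255 pixel values into the range 0.0..1.0."""
    return [[px / 255.0 for px in row] for row in image]


def hflip(image):
    """Augmentation: mirror the image horizontally."""
    return [row[::-1] for row in image]


def preprocess(image):
    """Resize, normalize, and return the image plus one augmented copy."""
    base = normalize(resize_nearest(image, 2, 2))
    return [base, hflip(base)]


def preprocess_all(images, workers=4):
    """Apply preprocess to every image in parallel and flatten the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        groups = list(pool.map(preprocess, images))  # map keeps input order
    return [img for group in groups for img in group]


# Three toy 4x4 grayscale images (assumed data, for illustration only).
images = [[[(i + r + c) % 256 for c in range(4)] for r in range(4)] for i in range(3)]
batch = preprocess_all(images)
```

Each image is processed independently, which is what makes this stage easy to spread across workers.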

• Model Evaluation
Model evaluation is a crucial part of building an effective deep learning model because it explains how well the model performs. Evaluation metrics are used to compare model results. This research adopted three model evaluation metrics: classification accuracy, the confusion matrix, and the classification report (OCHIENG, 2022).
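
The three metrics can be computed directly from true and predicted labels. The sketch below is a plain-Python illustration; the class names and the six example predictions are assumptions made for the example, not results from the study.

```python
# Assumed class labels, for illustration only.
CLASSES = ["healthy", "angular_leaf_spot", "bean_rust"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def classification_report(matrix, classes):
    """Per-class precision, recall, and F1 derived from the confusion matrix."""
    report = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Hypothetical labels for six test images.
y_true = ["healthy", "healthy", "angular_leaf_spot",
          "angular_leaf_spot", "bean_rust", "bean_rust"]
y_pred = ["healthy", "bean_rust", "angular_leaf_spot",
          "angular_leaf_spot", "bean_rust", "healthy"]

acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, CLASSES)
report = classification_report(cm, CLASSES)
```

The confusion matrix shows which classes are mistaken for each other, which a single accuracy figure cannot reveal.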

C. Research Design
This study used stratified sampling, a technique in which the population is divided into smaller, more homogeneous subgroups called strata. It ensures that the sample is representative of the entire population, which is especially important when the population is not homogeneous. We used this method to select the target population corresponding to the research questions we intended to address. We then developed a model that farmers and others can use to detect diseases in bean plants in real time.
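
One common way to apply stratified sampling to an image dataset is a stratified train/test split, where each class contributes the same fraction of its images to the test set. The sketch below assumes that use; the class names, class sizes, and the 20% test fraction are illustrative and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so every class keeps the same test proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)       # group indices into strata

    rng = random.Random(seed)             # fixed seed for a repeatable split
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Assumed, deliberately imbalanced label list.
labels = ["healthy"] * 10 + ["angular_leaf_spot"] * 20 + ["bean_rust"] * 5
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
```

Sampling within each stratum keeps rare classes present in the test set, which a purely random split does not guarantee.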

The block diagram in Fig 1 shows how parallel computing, powered by Ray Core and SLURM, speeds up the development of a deep learning model for classifying bean diseases. Data is collected and prepared, then augmented to create a richer dataset. Ray Core and GPUs are used for parallel computing to train the model more efficiently. Once trained, the model is tested, evaluated, and potentially compressed through quantization for deployment on smartphones.
Fig 1 Parallel Computing Block Diagram
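
The quantization step in the pipeline above can be illustrated with a small worked example. The sketch shows generic 8-bit affine quantization of a weight list; it is an assumption about what such compression involves, not the scheme the authors used, and the sample weights are made up.

```python
def quantize(weights, num_bits=8):
    """Map float weights to unsigned integers using a scale and a zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min = min(min(weights), 0.0)   # the represented range must include zero
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / (qmax - qmin) or 1.0  # guard for all-zero weights
    zero_point = round(-w_min / scale)
    q = [min(qmax, max(qmin, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float weights from their integer codes."""
    return [(v - zero_point) * scale for v in q]


# Made-up weights, for illustration only.
weights = [-0.8, -0.25, 0.0, 0.4, 1.2]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored in one byte instead of the four bytes of a 32-bit float, at the cost of a rounding error of at most about half the scale per weight.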

III. CONCEPTUAL FRAMEWORK

The conceptual framework illustrates the independent and dependent variables to take into consideration when studying the influence of parallel computing on building a bean disease classification model using deep learning. Parallel computing techniques were implemented to speed up computations: many calculations are carried out simultaneously by taking advantage of modern multi-core processors and distributed computing environments, enabling efficient handling of complex and large-scale computational tasks. As a result, we obtained the best model to be deployed in real-world applications for agricultural contexts.


Fig 2 Conceptual Framework

IV. DATA PRESENTATION, ANALYSIS AND INTERPRETATION OF FINDINGS

A. Introduction of the Contents of the Chapter
This chapter presents the findings from the study on the influence of parallel computing on building deep learning models for the classification of bean diseases. It includes a detailed analysis of the data and the results obtained from various experiments. The objective was to provide a comprehensive demonstration of how parallel computing influences the efficiency and accuracy of deep learning models in this specific context.

B. Parallel vs Serial Computing
Parallel computing leverages multiple processors to perform computations simultaneously, significantly reducing training times for models. We compared the performance of parallel and serial computing approaches in the context of training deep learning models for the classification of bean diseases, and observed the following findings.

Fig 3 Model Training with Parallel Processing


In contrast, serial computing executes one instruction at a time in a sequential manner. Data were processed step by step, and performance was generally slow because tasks ran one after another. Speed was limited by a single core, so serial computing could not exploit modern multi-core architectures effectively, and training took a long time for large datasets and complex models.

C. Analysis of Findings

• Training Time Findings
To evaluate the impact of parallel computing on training time, we trained the same deep learning model using both serial and parallel computing approaches. The training times were recorded and compared as follows:

➢ With Parallel Computing Training Time: 201.33 s ≈ 3 mins
➢ With Serial Computing Training Time: 3049.52 s ≈ 51 mins

We observed that, when employing parallel computing, training the deep learning model took approximately 3 minutes. This efficiency was achieved by distributing computations across multiple processors, allowing simultaneous processing of data and tasks. In contrast, serial computing, which processes tasks sequentially, took around 51 minutes to complete the same training. The parallel run was therefore roughly 15 times faster (3049.52 s / 201.33 s ≈ 15.1). This stark difference highlights the advantage of parallel computing in handling the intensive computations typical in deep learning tasks, thereby accelerating the training process and improving overall efficiency.

• Model Performance Findings
Our model trained using the parallel computing approach achieved higher accuracy, which we attribute to the ability to process larger datasets and to reduce the risk of overfitting through the use of larger datasets and regularization techniques, as shown in the figures below.

➢ Accuracy

Fig 4 Accuracy Obtained on Parallel vs Serial Computing

Using parallel computing, the deep learning model achieved an accuracy of approximately 0.93. This is due to the ability to process larger batches of data simultaneously and utilize more complex architectures efficiently. In contrast, with serial computing, the model reached an accuracy of around 0.83, likely due to limitations in processing power and the inability to handle larger datasets effectively. This demonstrates the significant benefits of parallel computing in both speed and accuracy for deep learning tasks.

• Model in Real-World Application Findings
Deploying the model on mobile devices allows farmers and agricultural workers to use their smartphones to quickly and accurately diagnose bean diseases in the field. This has led to timely interventions and better crop management practices in bean disease detection.


Fig 5 Model deployment on Android Smartphones in Real-World Application

As a result, farmers will be able to diagnose bean diseases directly in the field by capturing images with their smartphones. This eliminates the need to send samples to labs for analysis, saving valuable time. Moreover, early detection is crucial for containing outbreaks and preventing significant yield loss.

V. CONCLUSION

This study showed that parallel computing accelerated the development and deployment of a deep-learning model for bean disease classification. Leveraging parallel computing yielded a significant improvement in model accuracy, achieving a classification accuracy of 0.93 compared to 0.83 obtained with serial computing. Furthermore, parallel computing drastically reduced training time, taking only 3 minutes compared to 51 minutes with serial processing. This enhanced efficiency facilitates faster model iterations, exploration of a broader range of hyperparameters, and ultimately, the development of more robust and accurate models. Additionally, the ability to deploy these models on mobile devices paves the way for real-time disease diagnosis in agricultural fields, empowering farmers and agricultural workers.

RECOMMENDATIONS

Based on the study findings, the following recommendations are made: First, we encourage individuals to create their own high-performance computing clusters using Raspberry Pi nodes, which offer a cost-effective solution for setting up HPC clusters and utilizing parallel computing for model training and heavy computational tasks. Second, researchers and practitioners should adopt parallel computing frameworks to accelerate the training process and enhance model performance. Third, leveraging transfer learning with pre-trained models can significantly reduce training time and improve accuracy, particularly when dealing with limited datasets. Fourth, implementing comprehensive data augmentation techniques consistently can enhance model robustness and generalization capabilities. Finally, optimizing models for mobile deployment is crucial, as it enables real-time, on-the-field diagnostics, greatly benefiting agricultural practices by providing valuable tools for farmers and agricultural workers.


REFERENCES

[1]. Cheng, Y. e. (2019). Bean leaf disease detection and classification based on deep residual learning. Computers and Electronics in Agriculture.
[2]. Deep Learning on Supercomputers. (n.d.). Retrieved from https://towardsdatascience.com/: https://towardsdatascience.com/deep-learning-on-supercomputers-96319056c61f
[3]. Elhoucine Elfatimi, R. E. (2023, November). Impact of datasets on the effectiveness of MobileNet for beans leaf disease detection. Retrieved from SpringerLink: https://link.springer.com/article/10.1007/s00521-023-09187-4
[4]. Geng. (2020). Parallel computing for training deep learning models for beans disease classification. IEEE international Conference.
[5]. Hoang-Tu Vo, L.-D. Q. (2023). Ensemble of Deep Learning Models for Multi-plant Disease Classification in Smart Farming. Cantho City, Vietnam: Software Engineering Department, FPT University.
[6]. Jean B. Ristainoa, P. K. (2021). The persistent threat of emerging plant disease pandemics to global food security. Manhattan: Barbara Valent, Kansas State University, Manhattan, KS.
[7]. Kahira, A. N. (2021). Convergence of Deep Learning and High Performance Computing: Challenges and Solutions. Barcelona: Universitat Politecnica de Catalunya.
[8]. Kolodziejczak, K. P. (2020). The Role of Agriculture in Ensuring Food Security in Developing Countries. Considerations in the Context of the Problem of Sustainable Food Production. Poznan, Poland: Department of Economics and Economic Policy in Agribusinesses, Faculty of Economics and Social Sciences, Poznan University of Life Sciences, Wojska Polskiego 28, 60-637 Poznan, Poland.
[9]. Michelle M. Nay, T. L.-V.-C. (2018). A Review of Angular Leaf Spot Resistance in Common Bean. Parana: Dep. Agronomia, Univ. Estadual de Maringá, Maringá, Paraná, Brazil.
[10]. NVIDIA cuDNN. (n.d.). Retrieved from https://docs.nvidia.com/cudnn/index.html: https://docs.nvidia.com/cudnn/index.html
[11]. NVIDIA Tesla P100 PCIe 16 GB. (n.d.). Retrieved from techpowerup: https://www.techpowerup.com/gpu-specs/tesla-p100-pcie-16-gb.c2888
[12]. P. Pamela, D. M. (2014). Severity of angular leaf spot and rust diseases on common beans in Central Uganda. Kampala: National Crops Resources Research Institute, Namulonge.
[13]. Paymode, A. S. (2021). Transfer Learning for Multi-Crop Leaf Disease Image Classification using Convolutional Neural Network VGG. MGM's Jawaharlal Nehru Engineering College, Aurangabad 431001, Maharashtra, India.
[14]. Prathamesh Borhade, R. D. (2020). Image Classification using Parallel CPU and GPU Computing. International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249 - 8958, Volume-9 Issue-4.
[15]. Seyed Hossein Nazer Kakhki, M. V. (2022). Predict bean production according to bean growth, root rots, fly and weed development under different planting dates and weed control treatments. Kermanshah: Plant Protection Research Department, Kermanshah Agricultural & Natural Resources Research & Education Center, AREEO, Kermanshah, Iran.
[16]. Shouan Zhang, N. D. (2019). Disease Control for Snap Beans in Florida. IFAS Extension, University of Florida.
[17]. Slurm & Deep Learning. (n.d.). Retrieved from run.ai: https://www.run.ai/guides/slurm/slurm-deep-learning
[18]. Stratified Random Sampling. (n.d.). Retrieved from Questionpro: https://www.questionpro.com/blog/stratified-random-sampling/
[19]. Wang, X. e. (2019). A GPU-accelerated deep learning framework for bean disease classification. Computers and Electronics in Agriculture.
[20]. What Is CUDA? (n.d.). Retrieved from https://blogs.nvidia.com: https://blogs.nvidia.com/blog/what-is-cuda-2/
[21]. Wu, W. e. (2021). Parallel computing for training models for multi-plant disease classification. In Proceedings of the 2021 International Conference on Artificial Intelligence and Machine Learning (ICAIML).
[22]. Xidong Wu, P. B. (2023). Performance and Energy Consumption of Parallel Machine Learning Algorithms. ECE 2166.
[23]. Yang, S. J. (2010). A survey on transfer learning. IEEE Transactions on knowledge and data engineering, 22(10):1345-1359.
