Efficient Crop Yield Analysis Prediction in Modern Agriculture System Using Machine Learning Algorithm
N. Rajkumar
Computer Science and Engineering
Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology
Chennai, India, [email protected]

M. A. Mukunthan
Computer Science and Engineering
Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology
Chennai, India, [email protected]
Abstract— In agriculture, accurate yield estimation is crucial. To increase production while decreasing operational costs and environmental effects, remote sensing (RS) systems are increasingly being employed in developing decision-support tools for modern agricultural systems. However, because RS-based technologies need to analyze vast quantities of remotely sensed data from many platforms, more focus is being placed on machine learning techniques, since systems based on machine learning can handle non-linear tasks and process a huge number of inputs. Machine learning is an essential decision-support tool for forecasting agricultural yields and for deciding which crops to plant and what to do during the growing season. Several other machine learning methods have been used, too. We take into account a few factors that influence crop productivity: rainfall, pH, temperature, humidity, and nutrients. A machine learning model is built from the data on these parameters to assess and forecast the ideal crop.

Keywords— Machine Learning, Crop yield, Agriculture, Random Forest, Remote sensing.

I. INTRODUCTION

India is mostly an agricultural country. Two thirds of the population depend on agriculture for their living, and it is the cornerstone of the nation's economy. A relatively large percentage of the population also finds work in agriculture. Crop yield is one of the problems that farmers have faced for many years. In the past, crop production predictions depended on the particular crops and on the cultivation expertise of the farmer. To address this problem, we propose a method that collects existing data and, based on it, estimates which crop gives the best yield using machine learning algorithms. For the best accuracy and crop prediction, we consider four main machine learning algorithms: SVM, naive Bayes, random forest, and decision tree. We consider components such as nitrogen, phosphorus, sulphur, humidity, rainfall and pH, which are helpful for crop prediction.

Our main aim is to analyse crop yield with an improved degree of accuracy using machine learning algorithms. In this study, crop yield is predicted using supervised learning techniques. The Random Forest, Naive Bayes, SVM, and Decision Tree algorithms will all be employed to predict yield.

Machine learning is a method by which artificial intelligence (AI) systems automatically learn from their experience and improve over time. Machine learning techniques focus on building computer programs that can access data and use it to improve themselves. Machine learning algorithms are categorized into two main subcategories: supervised and unsupervised. Only supervised algorithms are used in our project.

Supervised algorithms are trained on examples for which both the input and the desired output are provided, typically prepared by someone skilled in machine learning. The model learned from this previously studied data is then applied to new data. During training, the algorithm compares its output with the correct, intended outcome and modifies the model accordingly.

In the decision tree method, the predictions that arise from a sequence of feature-based splits are represented by a flowchart that resembles a tree structure. It begins at the root node and ends with the decision made at a leaf. Decision trees can solve both classification and regression problems.

The naive Bayes algorithm is based on Bayes' theorem and operates under the assumption that the predictors are independent. Put simply, the naive Bayes classifier assumes that the presence of one feature in a class has no effect on the presence of any other feature.

With support vector machines (SVM), every data point is represented as a point in n-dimensional space, where n is the number of features, and the value of each feature is a specific coordinate. Classification then consists of finding the hyperplane that best divides the two groups. Support vectors are simply the coordinates of individual observations, specifically those lying closest to the separating boundary. The frontier (hyperplane or line) used by the SVM classifier is the one that most effectively separates the two groups.

Random forest, a supervised machine learning method, is a popular approach to both regression and classification problems. It builds decision trees on many different samples of the data and combines them, averaging their outputs for regression and taking the majority vote for classification. The Random Forest Algorithm's
Authorized licensed use limited to: Institute of Space Technology. Downloaded on March 20,2024 at 09:10:28 UTC from IEEE Xplore. Restrictions apply.
capacity to handle data sets containing both continuous variables (used in regression) and categorical variables (used in classification) is one of its primary features. For classification problems, it produces superior results.

Our goal is to maximise crop productivity by utilising machine learning algorithms that provide more precision by analysing many characteristics, including nutrients, humidity, temperature, pH, and rainfall.

II. LITERATURE SURVEY

"The crop yield prediction using data analysis and hybrid approach" was put into practice in 2018 by B. Zhu et al. It addresses crop yield production by utilising a variety of existing data analysis methods [1].

"The Fuzzy logic based crop yield prediction using Temperature and Rainfall" was presented by E. Khosla et al. in 2019 [2]. It considers the water content and the temperature increase of the earth in relation to plant growth.

In 2021, "The Hybrid prediction strategy for predicting agricultural information" was discussed by F. H. Tseng et al. [3]. It claims that a hybrid prediction model can be used to assess crop information from a field.

2019 saw the implementation of "A Comparative Study of Chemical Components between New and Old Methodologies in Farming" by M. Alizamir et al. [4]. This addresses yield production as influenced by internal chemical components including potassium, sulphur, nitrogen, humidity, and water level.

In 2020, the authors of [5] compared a novel method with other machine learning techniques for estimating soil organic carbon and total nitrogen using near-infrared spectroscopy. The work also takes into account the soil yield level, which aids the healthier growth of the plant or crop.

P. S. Maya Gopal and R. Bhargavi implemented "The Advanced machine learning model for better prediction accuracy of soil temperature at various depths" in 2020 [6], which claims that crop output can be predicted using a variety of data analysis methods.

"The Applying machine learning for intelligent agriculture-based crop selection analysis" was introduced and put into practice in 2019 [7] by R. Reda et al. It uses a number of techniques, such as the random forest regressor, which helps make predictions with high accuracy even with a small dataset.

III. METHODOLOGY

A. Existing System

The existing system deals with fields such as state name, crop name, production, cost and the year of cultivation. The crops included belong to only one season, most crops are repeated in the dataset, and, with so much repeated data, the execution is also poorly suited to the task.

B. Proposed System

The proposed system includes internal factors that help the growth of the crop, such as potassium, sulphur, nitrogen, humidity and water level. Execution is done with a machine learning model, the random forest regressor, which helps generate highly accurate predictions even with a small amount of data; both training and testing are done using this module. In the framework that has been developed, machine learning is used to anticipate the optimum crop output. The suggested model is evaluated on a dataset of crops. The crop is chosen taking into account the soil's composition, the local environment, and the current meteorological conditions.

Four algorithms were employed, and we chose the one that predicts most accurately. The algorithms we used are Decision Tree, Naive Bayes, SVM and Random Forest. Random Forest predicted most accurately, so we used it as the final algorithm to predict the best crop.

[Figure: General Architecture]
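The selection procedure described in the proposed system (train the four classifiers, compare their accuracy, keep the best one) can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data are synthetic stand-ins generated with `make_classification`, and the number of features, the number of classes and all hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the crop data set (an assumption): 7 numeric
# features playing the role of soil and weather measurements, 4 crop classes.
X, y = make_classification(
    n_samples=400, n_features=7, n_informative=5, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# The four supervised algorithms named in the text, with default settings.
candidates = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Keep the model with the highest test accuracy as the final model.
best_name = max(accuracies, key=accuracies.get)
final_model = candidates[best_name]

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
print("Selected:", best_name)
```

Which model wins on synthetic data says nothing about the paper's dataset; the sketch only shows the mechanics of the comparison.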
and then forecast which algorithm will yield the best crop for the designated area.

C. Design Phase

Once the data has been collected, it goes to the pre-processing module, which removes redundant values and produces a clean dataset. The data then passes through the SVM, Random Forest, Decision Tree and Naive Bayes models, the training and testing sets are obtained, and the output is presented graphically.

D. Algorithm

step 1: Start the process.
step 2: Load the data set into Jupyter Notebook.
step 3: Separate the data set into training and testing sets.
step 4: Apply machine learning algorithms such as SVM, NB, DT and RF.
step 5: Find the accuracy of every machine learning algorithm.
step 6: Fit the model with the algorithm that gives the highest accuracy.
step 7: Open Visual Studio and load the code into it.
step 8: Create a web page using HTML and CSS and collect the data from the user.
step 9: Pass this data to the machine learning algorithm.
step 10: Finally, return the output to the web page using the Python framework Flask.
step 11: Exit the process.

E. Module Description

Collecting of Data for Dataset

LabelEncoder() function to turn categorical input into numerical data.

Splitting the Data and Applying Machine Learning Modules

In this stage the models are trained and tested on the input data. The loaded data is divided into two groups, training data and test data, using a split ratio of 80% to 20% (0.8 and 0.2). The learning (training) set is the readily available input data from which the classifier is built.

In this stage the classifier is fitted to the training data so that it can approximate and classify the target function. The testing process then evaluates the classifier on the test data. Pre-processing produces the final data, which is processed by the machine learning module.

We use four algorithms in our research to forecast crop production. The Decision Tree, SVM, Naive Bayes, and Random Forest algorithms are used to forecast which crop is suitable and will yield well.

IV. IMPLEMENTATION AND DISCUSSION

A. Input and Output

This is the dataset we supply to the machine learning algorithms to obtain the best accuracy. It does not contain any null or repeated values.
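The pre-processing and splitting stages described in this section (removing null and repeated values, encoding categorical labels with `LabelEncoder`, and an 80/20 train/test split) could look roughly like the sketch below. The toy records and the column names `nitrogen`, `rainfall`, `ph` and `crop` are hypothetical, and `dropna`/`drop_duplicates` are one possible way to obtain a data set free of null and repeated values, not necessarily the authors' procedure.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy records; column names and values are illustrative only.
# One row has a missing value and one row is an exact duplicate.
raw = pd.DataFrame({
    "nitrogen": [90, 85, 20, 25, 60, 60, 40, None, 70, 30, 55, 45],
    "rainfall": [200.0, 220.0, 60.0, 65.0, 110.0, 110.0,
                 90.0, 80.0, 150.0, 70.0, 120.0, 95.0],
    "ph": [6.5, 6.4, 7.1, 7.0, 6.8, 6.8, 6.2, 6.0, 6.6, 7.2, 6.7, 6.3],
    "crop": ["rice", "rice", "maize", "maize", "wheat", "wheat",
             "maize", "rice", "rice", "maize", "wheat", "wheat"],
})

# Pre-processing: drop rows with nulls, then drop repeated rows.
clean = raw.dropna().drop_duplicates().reset_index(drop=True)

# Turn the categorical crop label into integer codes.
encoder = LabelEncoder()
clean["crop_id"] = encoder.fit_transform(clean["crop"])

# 80% training data, 20% test data.
X = clean[["nitrogen", "rainfall", "ph"]]
y = clean["crop_id"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("rows after cleaning:", len(clean))
print("label classes:", list(encoder.classes_))
print("train/test sizes:", len(X_train), len(X_test))
```

`encoder.inverse_transform` maps predicted integer codes back to crop names when the result is shown to a user.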
Humidity, and elements. We have chosen four machine learning algorithms to obtain highly accurate predictions even with few attributes. The training and testing modules we used helped us achieve higher accuracy.

The graph shows that random forest is more accurate than the other algorithms and also predicts the suitable crop.
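Steps 8 to 10 of the algorithm in Section III-D (collect values from a web form, pass them to the fitted model, and return the result through Flask) might be wired together as in the sketch below. It is illustrative only: the `/predict` route, the field names in `FEATURES`, the JSON response and the placeholder model trained on synthetic data are all assumptions, and a real deployment would load the model selected in step 6 instead.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical form field names, one per input feature.
FEATURES = ["nitrogen", "phosphorus", "potassium",
            "temperature", "humidity", "ph", "rainfall"]

# Placeholder model fitted on synthetic data (an assumption); the model
# chosen in step 6 would be loaded here in a real application.
X, y = make_classification(
    n_samples=200, n_features=len(FEATURES), n_informative=5,
    n_redundant=0, n_classes=3, n_clusters_per_class=1, random_state=0,
)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    # Step 8: read the values submitted by the HTML form.
    try:
        row = [float(request.form[name]) for name in FEATURES]
    except (KeyError, ValueError):
        return jsonify(error="all fields must be present and numeric"), 400
    # Step 9: pass the values to the fitted model.
    label = int(model.predict([row])[0])
    # Step 10: return the prediction (as JSON here, for brevity).
    return jsonify(crop_class=label)
```

If the file is saved as `app.py`, it can be served locally with Flask's development server (`flask --app app run`) and queried by posting form data to `/predict`. Returning JSON keeps the sketch short; rendering an HTML template with the predicted crop would follow the same pattern.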