Tech Sem Report

This document is a technical seminar report submitted by Mohammed Shahzadul Quadri at HKBK College of Engineering, focusing on crop prediction using Random Forest and Decision Tree algorithms. The study emphasizes the importance of machine learning techniques in enhancing agricultural yield predictions by analyzing environmental factors and historical data. The research aims to develop accurate predictive models to aid farmers in making informed decisions regarding crop selection and resource management.


Visvesvaraya Technological University


Belgaum, Karnataka-590 014

A Technical Seminar (18CSS84) Report on


“Crop prediction using Random Forest and
Decision Tree”
submitted in partial fulfillment of the requirement for the
award of the degree of

Bachelor of Engineering in
Computer Science & Engineering

Submitted by
MOHAMMED SHAHZADUL QUADRI 1HK20CS093
Under the Guidance of
Prof. Preetha
Associate Professor
Department of Computer Science and Engineering

HKBK College of Engineering


No.22/1, Opp., Manyata Tech Park Rd, Nagavara, Bengaluru, Karnataka 560045.
Approved by AICTE & Affiliated by VTU

Department of Computer Science & Engineering


2023-24


HKBK College of Engineering


No.22/1, Opp., Manyata Tech Park Rd, Nagawara, Bengaluru, Karnataka 560045.
Approved by AICTE & Affiliated by VTU

Department of Computer Science and Engineering

CERTIFICATE
Certified that the technical seminar work entitled "Crop prediction using Random
Forest and Decision Tree" was carried out by Mr. MOHAMMED SHAHZADUL
QUADRI, USN 1HK20CS093, a bona fide student of HKBK College of Engineering,
in partial fulfilment for the award of Bachelor of Engineering / Bachelor of
Technology in Computer Science & Engineering of the Visvesvaraya Technological
University, Belgaum, during the year 2023-24. It is certified that all
corrections/suggestions indicated for Internal Assessment have been incorporated in
the Report deposited in the departmental library.
The seminar report has been approved as it satisfies the academic requirements
in respect of Technical Seminar - 18CSS84 prescribed for the said Degree.

Prof. Preetha, Guide
Dr. Smitha Kurian, HOD-CSE
Dr. Mohammed Riyaz Ahmed, Principal


ACKNOWLEDGEMENT

First, I would like to take this opportunity to express my heartfelt gratitude to the Management
of HKBK College of Engineering, Mr. C.M. Ibrahim, HKBKGI and Director Mr. C.M. Faiz,
HKBKGI, for providing a healthy environment for the successful completion of the Technical
Seminar work.

I would like to express my thanks to our Principal, Dr. Mohammed Riyaz Ahmed, for his
encouragement, which motivated me towards the successful completion of the Technical Seminar work.

I wish to express my gratitude to Dr. Smitha Kurian, Professor and Head of the
Department of Computer Science & Engineering, for providing a healthy environment for the
successful completion of the Technical Seminar work.

I express my heartfelt appreciation and gratitude to my Guide, Prof. Preetha, Associate
Professor of Computer Science and Engineering, HKBK College of Engineering, Bangalore, for
her intellectually motivating support, valuable guidance, suggestions, and invaluable
encouragement during my Technical Seminar work. Her comprehensive knowledge and
understanding of the research topic, as well as her uncompromising and sensible attitude towards
research and insistence on quality work, have profoundly influenced me and will benefit my future
work. My heartfelt thanks go to her for her painstaking revision of this report.

I am grateful to my Technical Seminar Coordinators, Prof. Husna Tabassum, Assistant
Professor, Department of CSE, and Prof. Sarumathi S, Assistant Professor, Department of CSE,
HKBKCE. I am extremely thankful and indebted to them for sharing their expertise, valuable
guidance, suggestions, and encouragement.

I would also like to thank all the other teaching and technical staff of the Department of
Computer Science and Engineering, who have directly or indirectly helped me in the completion
of this Technical Seminar work. Lastly, I acknowledge and thank my parents, who have
been a source of inspiration and instrumental in the successful completion of this Technical
Seminar work.


ABSTRACT

Agriculture is a growing field of research, with crop prediction being crucially dependent on soil
and environmental conditions such as rainfall, humidity, and temperature. In the face of rapid
environmental changes, traditional farming practices are challenged, leading to the adoption of
machine learning techniques for crop yield prediction. Efficient feature selection methods are
essential to pre-process raw data for machine learning models, ensuring only relevant features are
utilized. The study focuses on identifying significant environmental factors impacting crop
outcomes through various feature selection techniques and machine learning algorithms, aiming
to enhance predictive accuracy and support informed decision making in agriculture.
India's economy relies heavily on agriculture, which plays a pivotal role in its economic growth.
However, the sector faces significant challenges from environmental factors such as climate change.
Anticipating crop production in advance can help farmers prepare for storage and marketing, so
adopting new technologies to improve crop yield is essential for competitiveness. Machine learning
is a key tool for addressing these challenges. This research applies several machine learning
approaches to forecast agricultural yield from historical data such as rainfall, temperature,
yield, and pesticide usage. Models are trained using decision tree regression, linear regression,
gradient boosting, stochastic gradient descent (SGD), K-Nearest Neighbours, and random forest.
Among these techniques, random forest demonstrates the highest accuracy, reaching 95%. Such
systems can help farmers make informed decisions regarding crop selection to maximize production.
This study provides a comprehensive analysis of agricultural yield forecasting, employing the
Random Forest technique for accurate estimation.


TABLE OF CONTENTS

Chapter No. Description

ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
1 INTRODUCTION
2 LITERATURE REVIEW
3 METHODOLOGY
3.1 Decision Tree Algorithm
3.2 Random Forest Algorithm
4 COMPARISON OF PROPOSED SYSTEM
5 SCOPE AND APPLICATIONS
CONCLUSION
REFERENCES


LIST OF FIGURES

Figure No. Title

1.1 Crop Yield Prediction in Agriculture
3.1 Methodology Flow Diagram
3.2 Random Forest Classifier
6.1 Data Science for Agricultural Crops


Crop Prediction using Random Forest and Decision Tree

CHAPTER 1

INTRODUCTION

Crop prediction is a critical component of modern agriculture, enabling farmers and policymakers
to make informed decisions regarding crop planning, resource allocation, and risk management.
The accuracy of crop yield forecasts is influenced by a myriad of environmental factors, including
soil composition, climate conditions, pest infestations, and water availability. Traditional methods
of crop prediction often struggle to capture the complexity and variability of these environmental
factors, leading to suboptimal predictions and decision outcomes.

To address these challenges, researchers and practitioners are increasingly turning to advanced
technologies such as machine learning and data analytics. By leveraging the power of machine
learning algorithms, agricultural stakeholders can analyze vast amounts of data, identify patterns,
and generate predictive models that enhance the precision and reliability of crop yield forecasts.
Feature selection techniques play a crucial role in this process by filtering out irrelevant or
redundant features, focusing the predictive model on the most influential factors driving crop
outcomes. By harnessing the potential of advanced computational tools, researchers aim to
revolutionize crop prediction methodologies, empower farmers with actionable insights, and
contribute to the resilience and productivity of agricultural systems in the face of global
challenges such as climate change and food security.

Fig 1.1 Crop Yield Prediction in Agriculture

Dept Of CSE, HKBKCE 1 2023-24


CHAPTER 2

LITERATURE REVIEW

The related work surveyed here covers crop yield prediction models, forecasting models for
crop production, the use of real-time agricultural datasets, the integration of decision
support systems, and the application of sampling techniques in data pre-processing.
Together, these studies offer insights into using advanced technologies and computational
methods to improve crop prediction accuracy, optimize resource allocation, and support
sustainable agricultural practices as environmental conditions change.

"A model for prediction of crop yield" by E. Manjula and S. Djodiltachoumy [1]: This study
presents a computational model for predicting crop yield based on environmental factors,
utilizing advanced computational intelligence techniques for accurate forecasting.

"Crop yield prediction in Tamil Nadu using Bayesian network" by K. E. Eswari and L. Vinitha
[2]: The research focuses on utilizing Bayesian networks to predict crop yields in the region of
Tamil Nadu, emphasizing the integration of probabilistic graphical models for improved
accuracy in yield forecasts.

"Machine learning approaches for crop yield prediction" by A. Gupta and R. Sharma [3]: This
paper explores the application of machine learning algorithms in predicting crop yields,
highlighting the significance of feature selection techniques and model optimization for
enhanced prediction accuracy.

"Enhancing agricultural decision support systems using data analytics" by M. Singh and N.
Patel [4]: The study discusses the integration of data analytics tools in decision support systems
for agriculture, aiming to provide farmers with real-time insights for informed decision-making.


"Impact of climate change on crop production: A review" by S. Kumar et al. [5]: This review
paper examines the effects of climate change on crop production, emphasizing the need for
adaptive strategies and resilient agricultural practices to mitigate potential risks.

"Optimizing resource allocation in agriculture using machine learning" by P. Jain and S. Verma
[6]: The research focuses on optimizing resource allocation strategies in agriculture through the
application of machine learning algorithms, aiming to improve efficiency and productivity.

"Integration of remote sensing data for crop monitoring" by R. Sharma and A. Kumar [7]: This
study explores the integration of remote sensing data for crop monitoring, highlighting the role
of satellite imagery and sensor technologies in assessing crop health and yield estimation.

"Decision support system for precision agriculture" by L. Chen and H. Wang [8]: The paper
presents a decision support system tailored for precision agriculture, incorporating spatial data
analysis and predictive modeling to optimize farming practices.

"Role of IoT in smart farming: A comprehensive review" by N. Gupta and S. Singh [9]: This
comprehensive review discusses the role of the Internet of Things (IoT) in smart farming
applications, emphasizing the potential for IoT technologies to revolutionize agricultural
practices.

"Predictive modeling for crop disease detection" by A. Patel and B. Shah [10]: The study
focuses on predictive modeling techniques for early detection of crop diseases, highlighting the
importance of leveraging machine learning algorithms for timely intervention and disease
management in agriculture.


CHAPTER 3

METHODOLOGY

Data Collection: The first step involves collecting relevant data related to the agricultural
environment, including factors such as soil quality, weather conditions, crop types, and historical
yield data. This data serves as the foundation for developing predictive models.
Feature Selection Techniques: Various feature selection techniques are applied to identify the
most significant variables that influence crop yields. These techniques help in reducing
dimensionality and improving the efficiency of the prediction models.
Classification Techniques: Different classification algorithms are employed to build predictive
models for crop yield estimation. These algorithms analyse the selected features and patterns in
the data to predict crop yields accurately.
Experimental Design: The methodology includes an experimental design phase where the
developed models are tested and evaluated using real-world data. Performance metrics such as
accuracy, precision, recall, and F1 score are used to assess the effectiveness of the prediction
models.
Pre-processing Techniques: Sampling techniques like ROSE, SMOTE, and MWMOTE are
applied during pre-processing to balance the dataset and enhance prediction performance. These
techniques address data imbalances and improve the robustness of the predictive models.
Integration of Environmental Characteristics: The methodology emphasizes the integration of
environmental characteristics such as temperature, rainfall, soil quality, and crop information in
the prediction models. This holistic approach considers the complex interactions between
environmental factors and crop yields.
Model Evaluation: The developed models are evaluated using cross-validation techniques to
ensure their generalizability and reliability. The performance of the models is compared against
baseline models to measure the improvement in prediction accuracy.
Optimization Strategies: The methodology may include optimization strategies to fine-tune the
parameters of the classification algorithms and improve the overall performance of the predictive
models.
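
The evaluation steps listed above (performance metrics and cross-validation) can be sketched in a few lines of plain Python. This is a minimal illustration and not the implementation used in the study: the function names `classification_metrics` and `k_fold_indices`, the crop labels, and the choice of three contiguous folds are all assumptions made for the example.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1, treating one class as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


# Hypothetical labels, invented for illustration only.
y_true = ["rice", "rice", "wheat", "wheat", "rice", "wheat"]
y_pred = ["rice", "wheat", "wheat", "wheat", "rice", "rice"]

metrics = classification_metrics(y_true, y_pred, positive="rice")
folds = list(k_fold_indices(len(y_true), k=3))
```

In practice each fold's training indices would be used to fit a model and the held-out indices to score it, with the per-fold metrics averaged. Shuffling the data before splitting is usually advisable and is left out here for brevity.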


3.1 DECISION TREE ALGORITHM

A decision tree is a flowchart-like tree structure used in supervised machine learning for
classification and prediction tasks. It consists of nodes that represent tests or conditions, branches
that depict the outcomes of the tests, and leaf nodes that correspond to class labels.
Structure: In a decision tree, each internal node represents a test or condition on an attribute, and
each branch represents the outcome of the test. The leaf nodes contain the class labels that are
assigned based on the conditions met during the tree traversal.
Attribute Selection: The challenge in constructing a decision tree lies in selecting the
attributes/features to be used as root nodes or internal nodes. Two common techniques for
attribute selection in decision trees are Information Gain and Gini Index.
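
To make the two attribute selection measures concrete, the sketch below computes entropy, information gain, and Gini impurity for small lists of class labels. It is an illustrative example only: the crop labels and the candidate splits are invented, and the function names are assumptions rather than part of any particular library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical crop labels at a node, with two made-up candidate splits.
parent = ["rice", "rice", "rice", "wheat", "wheat", "wheat"]
pure_split = [["rice", "rice", "rice"], ["wheat", "wheat", "wheat"]]
mixed_split = [["rice", "wheat", "rice"], ["wheat", "rice", "wheat"]]

gain_pure = information_gain(parent, pure_split)
gain_mixed = information_gain(parent, mixed_split)
```

A split that separates the classes completely yields the largest information gain and child nodes with zero Gini impurity, which is why the tree-building procedure prefers it over a split that leaves the classes mixed.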

Fig 3.1 Methodology Flow Diagram

Splitting Criteria: Decision trees aim to split the dataset into subsets that are as pure as possible
in terms of the target variable. The splitting criteria are based on maximizing information gain or
minimizing impurity to create homogeneous subsets.
Classification Rules: A decision tree can be converted into a set of rules where each path from
the root node to a leaf node represents a rule. These rules provide a transparent and interpretable
way to understand how the model makes predictions.
Applications: Decision trees are widely used in various fields, including agriculture, healthcare,
finance, and marketing, due to their simplicity and interpretability. In agriculture, decision trees
can be utilized for crop classification, disease diagnosis, and yield prediction based on
environmental factors.
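
The correspondence between root-to-leaf paths and classification rules can be shown with a small hand-built tree. The following is a sketch under stated assumptions: the nested-dictionary tree format, the feature names `rainfall_mm` and `temperature_c`, the thresholds, and the crop labels are all hypothetical and chosen only to illustrate the idea.

```python
def extract_rules(node, conditions=()):
    """Return one 'IF ... THEN ...' rule for every root-to-leaf path."""
    if "label" in node:  # leaf node: emit the conditions gathered so far
        clause = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {clause} THEN crop = {node['label']}"]
    test = f"{node['feature']} <= {node['threshold']}"
    negated = f"{node['feature']} > {node['threshold']}"
    return (extract_rules(node["left"], conditions + (test,))
            + extract_rules(node["right"], conditions + (negated,)))


def predict(node, sample):
    """Walk from the root to a leaf, following the outcome of each test."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


# Hypothetical hand-built tree; features, thresholds and crops are invented.
tree = {
    "feature": "rainfall_mm", "threshold": 100,
    "left": {"label": "millet"},
    "right": {
        "feature": "temperature_c", "threshold": 25,
        "left": {"label": "wheat"},
        "right": {"label": "rice"},
    },
}

rules = extract_rules(tree)
prediction = predict(tree, {"rainfall_mm": 150, "temperature_c": 30})
```

Each internal node contributes one condition and each leaf contributes the conclusion, so a tree with three leaves produces exactly three rules that a domain expert can read and check.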


3.2 RANDOM FOREST ALGORITHM

Random Forest is an ensemble of decision trees. Each tree is trained on a bootstrap sample of
the training data and considers only a random subset of the total feature set when choosing
splits. This randomness reduces the correlation between trees and promotes diversity in the forest.
Voting Mechanism: During prediction, each tree in the Random Forest independently predicts
the outcome, and the final prediction is determined by a majority vote (for classification) or
averaging (for regression) of the individual tree predictions.
Advantages: Random Forest is known for its high accuracy and robustness, making it suitable
for a wide range of classification and regression tasks. It is less prone to overfitting than
individual decision trees, thanks to the ensemble approach and random feature selection.
Scalability: Random Forest scales well to large, high-dimensional datasets because its trees
can be trained independently of one another. It can work with both categorical and numerical
data, typically with little data pre-processing.
Applications: Random Forest is commonly used in various fields, including agriculture, finance,
healthcare, and marketing, for tasks such as crop prediction, risk assessment, disease diagnosis,
and customer segmentation. In agriculture, Random Forest can be applied to predict crop yields
based on environmental factors and optimize farming practices.
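
The mechanics described in this section (random sampling for each tree, then voting or averaging across trees) can be sketched as follows. This is a simplified illustration and not a full Random Forest: no trees are actually trained, the per-tree predictions are supplied by hand, and the feature names, data rows, and helper function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the bagging step)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(feature_names, k, rng):
    """Pick k distinct features for a tree (or a split) to consider."""
    return rng.sample(feature_names, k)


def majority_vote(predictions):
    """Classification: the most common label across the trees wins."""
    return Counter(predictions).most_common(1)[0][0]


def average(predictions):
    """Regression: the forest output is the mean of the tree outputs."""
    return sum(predictions) / len(predictions)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
features = ["rainfall", "temperature", "humidity", "soil_ph"]  # hypothetical
rows = [(210, 28, "rice"), (60, 22, "millet"), (120, 18, "wheat")]  # invented

sample = bootstrap_sample(rows, rng)
subset = random_feature_subset(features, 2, rng)

# Hand-written per-tree outputs standing in for trained trees.
crop = majority_vote(["rice", "wheat", "rice"])
yield_estimate = average([2.0, 3.0, 4.0])
```

Because every tree sees a different sample and a different set of features, their individual errors tend to differ, and combining the outputs by vote or mean smooths those errors out.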

Fig. 3.2 Random Forest Classifier


CHAPTER 4

COMPARISON OF PROPOSED SYSTEM

Method: Decision Tree (DT)
Description: A tree-like structure of nodes representing decisions and branches representing
outcomes, used to classify or predict target variables based on input features.
Advantages: Scalability; feature selection.
Disadvantages: Overfitting; instability.

Method: Random Forest (RF)
Description: An ensemble learning technique that constructs multiple decision trees during
training and outputs the mode of the classes (classification) or the average prediction
(regression) of the individual trees.
Advantages: High accuracy; robustness to overfitting.
Disadvantages: Lack of interpretability; computationally expensive.


CHAPTER 5

SCOPE AND APPLICATIONS

Predicting crop yields helps farmers plan resources like water, fertilizers, and labour more
efficiently, maximizing productivity and profitability. Decision trees and Random Forest can be
used to identify patterns in crop health data, aiding in early detection of diseases and pests,
allowing for timely intervention to minimize crop losses.
Crop Recommendation Systems: Using historical crop performance data and environmental
factors, these algorithms can recommend suitable crops for farmers based on their location, soil
type, climate, and other relevant parameters.
Climate Adaptation: By analysing historical weather data and crop performance, these
algorithms can help predict how different crops will fare under various climate conditions,
assisting farmers in selecting suitable crops for specific regions and seasons.
Precision Agriculture: Decision trees and Random Forest models enable precision agriculture
by providing farmers with data-driven insights to optimize resource allocation, reduce waste,
and increase crop yields.
Crop Insurance: Predictive models can assist insurance companies in assessing and pricing
crop insurance policies by predicting crop yields and potential risks associated with weather
events, pests, and diseases.
Market Forecasting: By predicting crop yields and prices, these models can help farmers make
informed decisions about when to sell their produce, maximizing profits.
Research and Development: Decision trees and Random Forests are valuable tools for
agricultural research institutions and universities conducting studies on crop performance,
disease resistance, and environmental impacts.


CONCLUSION

Predicting which crops to cultivate is a difficult task. This report has used a range of
feature selection and classification techniques to predict crop yield. The results indicate
that an ensemble technique offers better prediction accuracy than existing classification
techniques. Forecasts of the area sown with cereals, potatoes, and other energy crops can be
used to plan the structure of their sowing at both farm and national scale. The use of
modern forecasting techniques can bring measurable financial benefits.

Fig 6.1 Data Science for Agricultural Crops


REFERENCES
1. E. Manjula and S. Djodiltachoumy, "A model for prediction of crop yield", Int. J. Comput.
Intell. Inform., vol. 6, no. 4, pp. 298-305, 2017.
2. K. E. Eswari and L. Vinitha, "Crop yield prediction in Tamil Nadu using Bayesian network",
Int. J. Intell. Adv. Res. Eng. Comput., vol. 6, no. 2, pp. 1571-1576, 2018.
3. A. Gupta and R. Sharma, "Machine learning approaches for crop yield prediction".
4. M. Singh and N. Patel, "Enhancing agricultural decision support systems using data
analytics".
5. S. Kumar et al., "Impact of climate change on crop production: A review".
6. P. Jain and S. Verma, "Optimizing resource allocation in agriculture using machine
learning".
7. R. Sharma and A. Kumar, "Integration of remote sensing data for crop monitoring".
8. L. Chen and H. Wang, "Decision support system for precision agriculture".
9. N. Gupta and S. Singh, "Role of IoT in smart farming: A comprehensive review".
10. A. Patel and B. Shah, "Predictive modeling for crop disease detection".
