Project - Presentation For ML Based Weather Prediction
Weather Prediction: A
Comparative Study of
Regression and
Classification Algorithms
By Shaik Zubair, Mohammed
Zahid Ullah, and Parvez Ahmed.
Justification of Project Title:
"Machine Learning-based
Weather Prediction: A
Comparative Study of
Regression and
Classification Algorithms"
Why Machine Learning for
Weather Prediction?
Weather Impacts Our Lives
★ Agriculture 🌾: Weather forecasts help farmers plan planting, harvesting, and crop
management.
★ Aviation ✈️: Accurate weather forecasts ensure safe takeoffs, landings, and flight
routes.
★ Daily Life 🏠: Weather forecasts help us plan our daily activities, from commuting to
outdoor events.
Traditional Methods Have Limits
★ Limited Data: Traditional weather forecasting methods rely on limited data and
simple models, leading to inaccuracies.
★ These limitations cause inaccuracies for specific weather events, and traditional methods
also struggle with massive datasets from modern sensors.
Machine Learning Can Learn Complex Patterns for More Accurate
Forecasts!
● Regression: Suitable for continuous weather variables like temperature, humidity, and
wind speed
● Classification: Suitable for categorical weather variables like rain/no rain, sunny/cloudy, or
storm/no storm
Boosting for Enhanced Accuracy
● XGBoost: A popular gradient boosting algorithm that can improve weather prediction
accuracy
● AdaBoost: A boosting algorithm that can be used to boost the performance of decision
trees on binary classification problems
● Key benefits:
● Can significantly improve weather prediction accuracy
● Can handle complex patterns and relationships in the data
● Can be used to combine multiple models for better performance
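The boosting idea above can be sketched in a few lines. This is a minimal, illustrative AdaBoost on toy one-dimensional data with labels +1/-1 (for example rain / no rain), using single-threshold "stumps" as weak learners. The data and the helper names (`stump_predict`, `best_stump`, `adaboost`, `predict`) are assumptions made for this sketch and are not the project's code, which relies on library implementations.

```python
import math

def stump_predict(x, threshold, polarity):
    """A decision stump: predict `polarity` at or above the threshold, else the opposite."""
    return polarity if x >= threshold else -polarity

def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    """Train `rounds` stumps, reweighting misclassified samples after each one."""
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)      # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)    # vote strength of this stump
        model.append((alpha, threshold, polarity))
        # Misclassified samples gain weight, correctly classified ones lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in model)
    return 1 if score >= 0 else -1

# Toy data: no single threshold separates the two classes.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

model = adaboost(xs, ys, rounds=3)
single_stump_errors = sum(predict(model[:1], x) != y for x, y in zip(xs, ys))
boosted_errors = sum(predict(model, x) != y for x, y in zip(xs, ys))
print(single_stump_errors, boosted_errors)
```

On this toy data one stump alone misclassifies three points, while the weighted vote of three stumps fits all ten, which illustrates how combining weak models can capture a pattern none of them captures alone.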
Objective and Scope of the
Project
I. Objectives
Data-Centric Approach
○ Focus on Collecting & Preprocessing Historical Weather Data
○ Building ML Models for Informed Forecasts
Performance Evaluation
○ Evaluate the performance of the developed models using appropriate
metrics.
○ Compare regression and classification approaches to identify the most
suitable one for specific weather variables.
1. Villarreal Guerra et al. (2023) | New dataset (RFS) for weather classification (rain, snow, fog) using CNNs | Diverse weather & unclear features | Offers new dataset and data augmentation for CNNs
2. Tiwari et al. (2023) (Machine Learning for Complex Systems) | Classifying robot swarm behaviors (flocking vs. non-flocking) | Limited data availability | Highlights importance of data for machine learning models
Feature Selection
● Selected key features: precipitation, temperature (max & min), wind.
Algorithm Selection
● Evaluated ML algorithms:
○ Boosting (AdaBoost, XGBoost) for complex patterns.
○ Traditional (Decision Trees, Random Forests) for comparison.
Model Training
● Applied stratified k-fold cross-validation for thorough training.
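As a rough illustration of what stratified k-fold cross-validation does, the sketch below splits sample indices into k folds so that each fold keeps roughly the same class proportions as the full dataset. It uses only the standard library; the function names and the toy labels are assumptions for this example, and a real pipeline would normally shuffle the data and use a library implementation.

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds, dealing out each class round-robin."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

def train_test_splits(labels, k):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    folds = stratified_folds(labels, k)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train_idx), sorted(test_idx)

# Toy, imbalanced labels: 6 rain, 6 sun, 3 snow.
labels = ["rain"] * 6 + ["sun"] * 6 + ["snow"] * 3
splits = list(train_test_splits(labels, k=3))

for train_idx, test_idx in splits:
    print(len(train_idx), [labels[i] for i in test_idx])
```

Each of the three test folds here contains two rain, two sun, and one snow sample, so a rare class such as snow is represented in every fold.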
I. Research Methodology
Model Evaluation
● Assessed models with accuracy, precision, recall, F1-score, and AUC-ROC.
Further Analysis
● Conducted additional analyses (ROC curve analysis, lift curve analysis) to gain deeper insights
into algorithm performance under varying weather conditions. This helps refine and optimize the
models.
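To make the AUC-ROC metric concrete, here is a small sketch that computes it for a binary problem as the fraction of (positive, negative) pairs in which the positive sample receives the higher score. The labels and scores are made up for illustration; for a multi-class task such as this one, the same computation would typically be applied one class versus the rest.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

# Illustrative values: 1 = rain, 0 = no rain; scores are predicted rain probabilities.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
auc = roc_auc(y_true, scores)
print(round(auc, 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking; in this example five of the six positive/negative pairs are ordered correctly.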
II. Proposed Weather Prediction System:
Objective: Revolutionize forecasting accuracy and reliability through advanced machine learning and
data integration.
Key Features:
● Integration of Machine Learning: Leverages regression and classification algorithms to predict
diverse weather parameters (temperature, precipitation, wind speed, etc.).
● Diverse Data Sources: Utilizes historical data, satellite imagery, IoT sensor data, and crowd-
sourced observations for more precise predictions.
● User-Friendly Interfaces: Designed for accessibility with clear and intuitive interfaces for easy
forecast interpretation.
● Transparency: Provides users with information on data sources, algorithms used, and
methodologies, fostering trust in the predictions.
Overall Approach:
This project proposes a novel weather prediction system that addresses the limitations of traditional
methods.
By employing this innovative approach, the proposed system has the potential to significantly improve
weather prediction accuracy and reliability, benefiting sectors like agriculture, transportation, and
disaster management.
Algorithms Used to
Overcome the Specific
Problem
Current Limitations:
● Traditional methods struggle with limited accuracy for complex patterns and handling large
datasets.
● Machine Learning Solution: ML improves forecast accuracy by learning from data.
Evaluation Metrics
● Accuracy: Measures overall correctness.
● Precision: Ratio of true positives to predicted positives.
● Recall: Ratio of true positives to actual positives.
● F1-Score: Harmonic mean of precision and recall.
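The four metrics above can be computed directly from true and predicted labels. The sketch below treats one class as "positive" and everything else as "negative"; the function name and the sample labels are illustrative assumptions, not project results.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0   # true positives / predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0      # true positives / actual positives
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels for illustration only.
y_true = ["rain", "rain", "sun", "sun", "rain", "sun"]
y_pred = ["rain", "sun", "sun", "rain", "rain", "sun"]
scores = binary_metrics(y_true, y_pred, positive="rain")
print(scores)
```

Here two of the three predicted "rain" days are correct (precision 2/3) and two of the three actual "rain" days are found (recall 2/3), so the F1-score is also 2/3.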
Conclusion
● By exploring these diverse machine learning algorithms, we aim to:
○ Enhance weather prediction accuracy by leveraging their strengths in handling complex
weather patterns, potentially imbalanced data, and categorical features.
○ Identify the most suitable algorithm (or combination) for our specific project goals.
Data Collection
1. Weather Data Source:
● This project leverages a weather dataset from Kaggle, a reputable platform for sharing and
exploring machine learning datasets.
2. Dataset Description:
● The dataset contains various weather variables relevant to our prediction task:
○ Precipitation amount
○ Maximum and minimum daily temperatures
○ Wind speed
○ Categorical weather conditions (e.g., drizzle, rain, snow, sun)
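A minimal sketch of reading such a dataset into feature rows and labels is shown below. The feature names `precipitation`, `temp_max`, `temp_min`, and `wind` match the fields used elsewhere in this presentation; the label column name `weather` and the sample rows are assumptions made up for illustration and are not taken from the actual Kaggle file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows in the assumed column layout (not real data).
SAMPLE_CSV = """precipitation,temp_max,temp_min,wind,weather
0.0,12.8,5.0,4.7,drizzle
10.9,10.6,2.8,4.5,rain
0.8,11.7,7.2,2.3,rain
0.0,5.6,-1.1,3.4,sun
"""

FEATURES = ["precipitation", "temp_max", "temp_min", "wind"]

def load_rows(text):
    """Parse CSV text into a list of numeric feature rows and a list of labels."""
    X, y = [], []
    for row in csv.DictReader(io.StringIO(text)):
        X.append([float(row[name]) for name in FEATURES])
        y.append(row["weather"])   # assumed label column name
    return X, y

X, y = load_rows(SAMPLE_CSV)
class_counts = Counter(y)          # quick check for class imbalance
print(X[1], class_counts)
```

Counting the labels right after loading is a cheap way to spot class imbalance before choosing evaluation metrics.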
4. Conclusion:
● We have acquired a comprehensive weather dataset from Kaggle, eliminating the need for
extensive data cleaning or preparation.
● This well-structured data serves as a solid foundation for our weather prediction project.
Overall Experimental Setup for Weather Prediction
1. Development Environment:
● Primary development used Visual Studio Code (VS Code) for its user-friendly
interface and features.
2. training(request):
from django.shortcuts import render

def training(request):
    from .utility import module

    # Train each model and collect its classification report as HTML.
    rf_report = module.training()
    xgb_report = module.training_xgboost()
    adaboost_report = module.training_adaboost()
    nb_catboost = module.training_catboost()
    gradiantboost_report = module.training_gradiantboost()
    return render(request, 'users/Training.html', {
        'rf': rf_report,
        'xgb': xgb_report,
        'ada': adaboost_report,
        'cat': nb_catboost,
        'gradient': gradiantboost_report,
    })
Code Implementation
3. Prediction(request):
def Prediction(request):
    if request.method == "POST":
        from .utility import module

        # Form values arrive as strings and are passed on in this fixed order.
        precipitation = request.POST.get("precipitation")
        temp_max = request.POST.get("temp_max")
        temp_min = request.POST.get("temp_min")
        wind = request.POST.get("wind")
        test_set = [precipitation, temp_max, temp_min, wind]
        print(test_set)
        ot = module.prediction(test_set)
        print(ot)
        # Map the predicted class index to its weather label.
        if ot == 0:
            res = "Drizzle"
        elif ot == 1:
            res = "Fog"
        elif ot == 2:
            res = "Rain"
        elif ot == 3:
            res = "Snow"
        else:
            res = "Sun"
        # The context key 'raesult' is kept as-is because the template refers to it.
        return render(request, 'users/Predication.html', {'raesult': res})
    else:
        return render(request, 'users/Predication.html', {})
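The view above passes raw form strings to the model and treats any unexpected class index as "Sun". A possible hardening, sketched below under stated assumptions, is to convert the inputs to floats and decode the class index strictly. The helper names `parse_features` and `decode_label` are hypothetical, and the label order is taken from the if/elif chain in the view; a plain dict stands in for `request.POST`.

```python
LABELS = ["Drizzle", "Fog", "Rain", "Snow", "Sun"]   # index order as used in the view
FIELDS = ["precipitation", "temp_max", "temp_min", "wind"]

def parse_features(post):
    """Convert submitted form values to floats, in the order the model expects."""
    values = []
    for name in FIELDS:
        raw = post.get(name)
        if raw is None or str(raw).strip() == "":
            raise ValueError(f"missing field: {name}")
        values.append(float(raw))   # raises ValueError on non-numeric input
    return values

def decode_label(index):
    """Map a predicted class index to its label, rejecting unknown indices."""
    index = int(index)
    if not 0 <= index < len(LABELS):
        raise ValueError(f"unexpected class index: {index}")
    return LABELS[index]

# Usage with a plain dict standing in for request.POST.
form = {"precipitation": "0.5", "temp_max": "12", "temp_min": "3.5", "wind": "4"}
features = parse_features(form)
print(features, decode_label(3))
```

Rejecting bad input early lets the view show a validation message instead of passing malformed values to the model.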
Code Implementation For Algorithms
Random Forest:
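import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# X_train, X_test, y_train, y_test are assumed to be defined at module level.
def training():
    clf = RandomForestClassifier(random_state=0)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print("Accuracy:", accuracy_score(y_test, y_pred))
    print("Classification Report\n")
    print(classification_report(y_test, y_pred))
    print("Confusion Matrix\n")
    print(confusion_matrix(y_test, y_pred))
    # Report as a dict (assumed intent) so it can be converted to a DataFrame.
    nb_cr = classification_report(y_test, y_pred, output_dict=True)
    nb_rf = pd.DataFrame(nb_cr).transpose()
    return nb_rf.to_html()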
XGBoost:
from xgboost import XGBClassifier

# X_train, X_test, y_train, y_test are assumed to be defined at module level.
def training_xgboost():
    clf = XGBClassifier()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print("Accuracy:", accuracy_score(y_test, y_pred))
    print("Classification Report\n")
    print(classification_report(y_test, y_pred))
    print("Confusion Matrix\n")
    print(confusion_matrix(y_test, y_pred))
    # Report as a dict (assumed intent) so it can be converted to a DataFrame.
    nb_cr = classification_report(y_test, y_pred, output_dict=True)
    nb_xg = pd.DataFrame(nb_cr).transpose()
    return nb_xg.to_html()
AdaBoost:
from sklearn.ensemble import AdaBoostClassifier

# X_train, X_test, y_train, y_test are assumed to be defined at module level.
def training_adaboost():
    clf = AdaBoostClassifier()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print("Accuracy:", accuracy_score(y_test, y_pred))
    print("Classification Report\n")
    print(classification_report(y_test, y_pred))
    print("Confusion Matrix\n")
    print(confusion_matrix(y_test, y_pred))
    # Report as a dict (assumed intent) so it can be converted to a DataFrame.
    nb_cr = classification_report(y_test, y_pred, output_dict=True)
    nb_ada = pd.DataFrame(nb_cr).transpose()
    return nb_ada.to_html()
CatBoost:
from catboost import CatBoostClassifier

# X_train, X_test, y_train, y_test are assumed to be defined at module level.
def training_catboost():
    clf = CatBoostClassifier()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print("Accuracy:", accuracy_score(y_test, y_pred))
    print("Classification Report\n")
    print(classification_report(y_test, y_pred))
    print("Confusion Matrix\n")
    print(confusion_matrix(y_test, y_pred))
    # Report as a dict (assumed intent) so it can be converted to a DataFrame.
    nb_cr = classification_report(y_test, y_pred, output_dict=True)
    nb_cat = pd.DataFrame(nb_cr).transpose()
    return nb_cat.to_html()
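Each training function above prints a confusion matrix. As a reminder of what that matrix contains, the sketch below builds one by hand for a small multi-class example: rows are true classes, columns are predicted classes, and the diagonal holds the correct predictions. The labels and predictions are made up for illustration, and the helper names are assumptions for this sketch.

```python
def confusion_matrix_counts(y_true, y_pred, labels):
    """Count predictions per (true class, predicted class) pair."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def per_class_recall(matrix, labels):
    """Recall per class: diagonal count divided by the row total."""
    recalls = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        recalls[label] = matrix[i][i] / row_total if row_total else 0.0
    return recalls

labels = ["rain", "snow", "sun"]
y_true = ["rain", "rain", "snow", "sun", "sun", "sun"]
y_pred = ["rain", "sun", "snow", "sun", "sun", "rain"]

cm = confusion_matrix_counts(y_true, y_pred, labels)
recalls = per_class_recall(cm, labels)
accuracy = sum(cm[i][i] for i in range(len(labels))) / len(y_true)

for label, row in zip(labels, cm):
    print(label, row)
print(recalls, accuracy)
```

Reading the matrix row by row shows which classes are confused with each other, which a single accuracy figure hides.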
Conclusion
● Machine Learning for Weather Prediction: Explored various machine learning algorithms.
● Top Performers: XGBoost and AdaBoost achieved the highest accuracies (>87%).
● Validation Confirmed: Lift curve and ROC analysis support these findings.
○ Improve prediction accuracy for diverse weather events, including extreme ones.
change prediction.
2. Tiwari, R. G., Yadav, S. K., Misra, A., & Sharma, A. (2023). Classification of Swarm
Collective Motion Using Machine Learning. Smart Innovation, Systems and Technologies,
316, 173–181. doi: 10.1007/978-981-19-5403-0_14.
3. Tiwari, R. G., Agarwal, A. K., Jindal, R. K., & Singh, A. (2022). Experimental Evaluation of
Boosting Algorithms for Fuel Flame Extinguishment with Acoustic Wave. In 2022
International Conference on Innovation and Intelligence for Informatics, Computing, and
Technologies (3ICT) (pp. 413–418). doi: 10.1109/3ICT56508.2022.9990779.
4. Wang, W., & Sun, D. (2021). The improved AdaBoost algorithms for imbalanced data
classification. Information Sciences, 563, 358–374. doi: 10.1016/J.INS.2021.03.042.
References
5. Bahad, P., & Saxena, P. (2020). Study of AdaBoost and Gradient Boosting Algorithms for
Predictive Analytics (pp. 235–244). doi: 10.1007/978-981-15-0633-8_22.
6. Mitchell, R., Adinets, A., Rao, T., & Frank, E. (2018). XGBoost: Scalable GPU Accelerated
Learning. doi: 10.48550/arxiv.1806.11248.
7. Gautam, V., et al. (2022). A Transfer Learning-Based Artificial Intelligence Model for Leaf
Disease Assessment. Sustainability, 14(20), 13610. doi: 10.3390/SU142013610.
9. Scher, S., & Messori, G. (2018). Predicting weather forecast uncertainty with machine
learning. Quarterly Journal of the Royal Meteorological Society, 144(717), 2830–2841. doi:
10.1002/QJ.3410.
10. Markovics, D., & Mayer, M. J. (2022). Comparison of machine learning methods for
photovoltaic power forecasting based on numerical weather prediction. Renewable and
Sustainable Energy Reviews, 161, 112364. doi: 10.1016/J.RSER.2022.112364.
THANK YOU