Classification ML Project Report

A classification model was built to predict accident survival based on factors like age, gender, speed, and safety gear usage, using a dataset of 200 cases. The Random Forest model outperformed others with a balanced F1 score of 55%, while Logistic Regression had the highest precision but lower recall. Key insights indicated that helmet and seatbelt use significantly affect survival rates, and recommendations for model improvement include adding more variables and exploring different algorithms.

Uploaded by elliptiicclips

I chose to build a classification model to predict whether a person involved in an accident survives, based on age, gender, speed of impact, and helmet and seatbelt use. This prediction could help stakeholders improve safety regulations.

There were 200 accident cases in a CSV file I downloaded from Kaggle. I cleaned the data by encoding categorical fields as binary values, filled in missing information, and split the data into training and testing sets (80%/20%).

I trained three models: logistic regression, decision tree, and random forest. Their accuracy, precision, recall, and F1 scores are as follows:

Model 1: Logistic Regression

● Accuracy: 57.5%
● Precision: 60%
● Recall: 45%
● F1 Score: 51.4%

Model 2: Decision Tree

● Accuracy: 47.5%
● Precision: 47.4%
● Recall: 45%
● F1 Score: 46.1%

Model 3: Random Forest

● Accuracy: 55%
● Precision: 55%
● Recall: 55%
● F1 Score: 55%
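As a quick sanity check on the table above: the F1 score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). A minimal sketch recomputing the reported F1 values from each model's precision and recall (small differences from the table come from the inputs already being rounded):

```python
# F1 is the harmonic mean of precision and recall.
def f1_from_pr(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f1_from_pr(0.600, 0.45), 3))  # Logistic Regression -> 0.514
print(round(f1_from_pr(0.474, 0.45), 3))  # Decision Tree -> 0.462
print(round(f1_from_pr(0.550, 0.55), 3))  # Random Forest -> 0.55
```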

The Random Forest model performed best overall, with balanced precision and recall (both 55%), making it the most reliable predictor. While Logistic Regression had higher precision (60%), it suffered from lower recall (45%), meaning it missed more of the actual survival cases. The Decision Tree performed worst, likely due to overfitting.

Some key findings and insights: helmet and seatbelt use significantly impacted survival rates; higher speed was correlated with lower survival; age played a moderate role, with younger people surviving more often; and gender did not have much of an impact.

Some recommendations are adding more features such as weather conditions or time of day, tuning the models' hyperparameters, trying other algorithms like gradient boosting or neural networks, applying SMOTE to balance the classes, and eventually deploying the model to help people.
This analysis provides valuable insights into factors affecting road accident survival. The
Random Forest model proved to be the best predictor, but further improvements can be made
with additional data and tuning. These insights can be used to inform public safety policies,
vehicle safety features, and accident response strategies.


My Python code:

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split

# Load the dataset (raw string avoids backslash-escape issues on Windows paths)
df = pd.read_csv(r"C:\Users\sigle\OneDrive\Desktop\accident.csv")  # Replace with your file path

# Preprocess the data: fill missing values, then encode categorical columns as binary
df["Gender"] = df["Gender"].fillna(df["Gender"].mode()[0])
df["Speed_of_Impact"] = df["Speed_of_Impact"].fillna(df["Speed_of_Impact"].median())
df["Gender"] = df["Gender"].map({"Male": 0, "Female": 1})
df["Helmet_Used"] = df["Helmet_Used"].map({"No": 0, "Yes": 1})
df["Seatbelt_Used"] = df["Seatbelt_Used"].map({"No": 0, "Yes": 1})

# Define features and target variable
X = df.drop(columns=["Survived"])
y = df["Survived"]

# Split dataset into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Initialize models
log_reg = LogisticRegression()
decision_tree = DecisionTreeClassifier(random_state=42)
random_forest = RandomForestClassifier(random_state=42)

# Train models
log_reg.fit(X_train, y_train)
decision_tree.fit(X_train, y_train)
random_forest.fit(X_train, y_train)

# Make predictions
log_reg_preds = log_reg.predict(X_test)
decision_tree_preds = decision_tree.predict(X_test)
random_forest_preds = random_forest.predict(X_test)

# Function to evaluate models
def evaluate_model(y_true, y_pred):
    return {
        "Accuracy": accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred),
        "Recall": recall_score(y_true, y_pred),
        "F1 Score": f1_score(y_true, y_pred),
    }

# Evaluate models
log_reg_results = evaluate_model(y_test, log_reg_preds)
decision_tree_results = evaluate_model(y_test, decision_tree_preds)
random_forest_results = evaluate_model(y_test, random_forest_preds)

# Print results
print("Logistic Regression:", log_reg_results)
print("Decision Tree:", decision_tree_results)
print("Random Forest:", random_forest_results)
