Smoking & Drinking Prediction ML

Uploaded by

Yohannes Dereje
Copyright
© © All Rights Reserved
Smoking and Alcohol Drinking Prediction System

Machine Learning Project

Group Eight
1. Introduction
• The primary objective of this research is to shed light on the potential applications and implications, in modern healthcare, of systems that predict smoking and drinking status from body signals. By understanding the physiological impact of smoking and drinking through data analysis and prediction, we can empower individuals to make healthier choices and enable healthcare professionals to offer personalized interventions and support.
2. Statement of the Problem
• The problem at hand is the lack of an effective prediction system that can accurately assess the impact of smoking and drinking on the human body by analyzing physiological signals and parameters.
• Healthcare professionals struggle to provide targeted interventions and support because they have a limited understanding of how smoking and drinking affect each individual physiologically. There is therefore a need for a robust prediction system that offers personalized feedback and enables healthcare professionals to deliver effective interventions, ultimately reducing the burden of smoking- and drinking-related diseases.
3. Objectives of the Study

3.1. General Objective
• The general objective of this research is to develop a comprehensive and accurate prediction system that uses data analysis techniques to assess the impact of smoking and drinking on the human body.
3.2. Specific Objectives
1. Review and analyze existing literature on the physiological effects of smoking and drinking.
2. Develop a robust data collection framework to capture relevant physiological signals.
3. Apply data analysis techniques to identify patterns and correlations related to smoking and drinking habits.
4. Develop a user-friendly prediction system that provides personalized feedback.
5. Validate the prediction system and assess its effectiveness.
6. Evaluate the potential impact of the prediction system on public health.
4. Methodology
The methodology of the "Smoker and Alcohol Drinker Prediction" ML project consists of the following steps:
• Define Problem
• Data Collection & Ethics
• Exploratory Data Analysis
• Preprocessing
• Feature Engineering
• Model Selection
• Training & Evaluation
• Interpretability
• Deployment & Feedback Loop
• Ethical Considerations & Documentation
4.1. Data Collection
• The dataset we used to train our ML model is derived from the National Health Insurance Service in Korea.
• We found the data at the following link:
https://www.kaggle.com/datasets/sooyoungher/smoking-drinking-dataset
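As a minimal sketch of the loading step, assuming the Kaggle download is a single CSV file: the file name `smoking_drinking.csv` and the column names used below (including the label columns `SMK_stat_type_cd` and `DRK_YN`) are assumptions for illustration and are not stated in these slides, so they should be checked against the downloaded file. The sketch uses only the Python standard library and an in-memory sample so that it runs without the real data.

```python
import csv
import io

# Hypothetical two-row sample standing in for the real CSV file.
# The column names are assumptions; verify them against the download.
SAMPLE_CSV = """sex,age,SBP,DBP,hemoglobin,SMK_stat_type_cd,DRK_YN
Male,35,120,80,17.1,1,Y
Female,40,110,70,13.2,3,N
"""

def load_rows(file_obj):
    """Read CSV rows into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(file_obj))

rows = load_rows(io.StringIO(SAMPLE_CSV))

# For the real file, the call would look something like this
# (hypothetical file name):
#   with open("smoking_drinking.csv", newline="") as f:
#       rows = load_rows(f)

print(len(rows), "rows; columns:", list(rows[0].keys()))
```

Note that `csv.DictReader` returns every value as a string, so numeric columns still need to be converted with `float()` or `int()` before modeling.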
4.2. Data Type
• In our "Smoker and Drinker Prediction" ML project, we used numeric continuous data as the primary data type for training the predictive models. Working with quantitative variables lets us analyze correlations and dependencies within the dataset, which supports the accuracy and reliability of our machine learning algorithms in predicting smoking and alcohol consumption behaviors.
4.3. Machine Learning Algorithm
• We used the Logistic Regression algorithm for our project.
• Logistic regression is a commonly used algorithm for predictive modeling and classification tasks, which makes it a natural fit for the smoking and drinking prediction system. It is a statistical model of the relationship between a binary dependent variable (such as smoking or drinking behavior) and one or more independent variables (such as physiological signals and self-reported data). The goal is to predict the likelihood that an individual exhibits a certain behavior (smoking or drinking) given the input variables.
• The logistic regression algorithm uses the logistic (sigmoid) function to transform a weighted sum of the input variables into a probability value between 0 and 1. This probability represents the likelihood that the individual belongs to a particular class (e.g., smoker or non-smoker). The model estimates a coefficient for each independent variable, which indicates the strength and direction of that variable's influence on the outcome variable.
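The sigmoid transformation described above can be sketched in a few lines of Python. The coefficient values, intercept and feature values here are hypothetical and chosen only for illustration; they are not results from this project.

```python
import math

def sigmoid(z):
    """Map any real number z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, coefficients, intercept):
    """P(class = 1) = sigmoid(intercept + sum(coef_i * x_i))."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return sigmoid(z)

# Hypothetical coefficients for two standardized input variables.
# A positive coefficient pushes the probability up; a negative one
# pushes it down.
coefficients = [0.8, -0.5]
intercept = -0.2

p = predict_probability([1.0, 2.0], coefficients, intercept)
print(f"P(smoker) = {p:.3f}")
```

A weighted sum of exactly zero gives a probability of 0.5, which is why 0.5 is the usual default boundary between the two classes.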
• We used logistic regression to model the relationship between the physiological signals, the self-reported data, and the likelihood of an individual engaging in smoking or drinking behaviors. The algorithm is trained on a labeled dataset in which the input variables are the physiological signals and self-reported data, and the output variable represents the smoking or drinking behavior (e.g., smoker or non-smoker).
5. Data Preprocessing / Data Preparation
• In the data preprocessing phase of our project, we prepared and cleaned the dataset to ensure its quality and suitability for machine learning. This involved handling missing values, addressing outliers, and normalizing or scaling features so that all inputs are on a comparable footing for model training. Careful data preparation is pivotal to the performance of our machine learning algorithms, because it improves their ability to discern meaningful patterns and trends in the input data.
5.1. Importing the Dependencies
5.2. Data Cleaning
• Data cleaning is the systematic process of identifying and correcting errors or inconsistencies in a dataset, so that it is accurate and reliable for analysis or machine learning.
• Handling missing values means addressing data points that are absent from the dataset, using techniques such as imputation (filling in a substitute value) or removal of the affected rows, so that analysis and modeling work on complete data.
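The slides do not say which imputation or scaling method was used, so the following is only a sketch under stated assumptions: median imputation for missing values and standardization (zero mean, unit variance) for scaling. The column name and values are hypothetical.

```python
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation.

    Assumes the column is not constant; a constant column would give a
    standard deviation of zero and a division error.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical column with one missing reading.
hemoglobin = [14.0, None, 16.0, 12.0]

filled = impute_median(hemoglobin)
scaled = standardize(filled)

print(filled)
print([round(v, 3) for v in scaled])
```

Median imputation is chosen here because it is less sensitive to outliers than the mean; removing incomplete rows instead is the other option named above.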
5.3. Data Labeling and Annotation
6. Training the Model
• Logistic regression is a statistical method for binary classification that predicts the probability of an event occurring. It is widely applied in machine learning where the outcome is either yes/no or 0/1.

To understand it:
• Imagine you have a superhero friend who helps you decide whether to wear a jacket. This superhero looks at the weather and says, "I'm 80% sure it will rain today." That is a bit like logistic regression: it predicts outcomes with two choices, such as "rain" or "no rain," from different factors, and it reports how confident it is in each yes/no prediction.
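To make the training step concrete, here is a minimal from-scratch sketch of fitting a logistic regression by batch gradient descent. It is not the project's actual training code, which is not shown in these slides; the toy data, learning rate and number of epochs are assumptions chosen for illustration. In practice a library implementation would normally be used instead.

```python
import math

def sigmoid(z):
    """Map any real number z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, learning_rate=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            z = bias + sum(w * x for w, x in zip(weights, features))
            error = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Step against the average gradient over all samples.
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias

# Hypothetical toy data: one standardized feature, label 1 = smoker.
X = [[-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic_regression(X, y)
print("weights:", weights, "bias:", bias)
```

Because larger feature values go with label 1 in this toy data, the learned weight comes out positive, which matches the earlier description of coefficients indicating the direction of influence.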
7. Model Evaluation
• Model evaluation in machine learning is the process of assessing how well a trained model performs on new, unseen data. It uses metrics such as accuracy (the share of all predictions that are correct) and precision (the share of positive predictions that are correct) to measure the model's effectiveness. The goal is to ensure the model works reliably on data it was not trained on.
• Model evaluation is like checking how well our superhero friend (the model) is doing at predicting things. We give our superhero some test questions (new data it hasn't seen before) and see whether it gets them right or wrong. It's like asking, "Did you correctly predict if it would rain today?" We use metrics, like accuracy, to measure how good our superhero is at making predictions. The better our superhero performs on new challenges, the more we trust it to help us in the real world!
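The two metrics named above can be computed directly from the true and predicted labels. The labels in this sketch are hypothetical test-set values invented for illustration, not results from this project.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Share of all predictions that are correct."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def precision(y_true, y_pred):
    """Share of positive predictions that are correct."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp)

# Hypothetical held-out labels (1 = smoker) and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy:", accuracy(y_true, y_pred))    # 6 of 8 correct -> 0.75
print("precision:", precision(y_true, y_pred))  # 3 of 4 positives correct -> 0.75
```

These numbers are only meaningful when `y_true` comes from data that was held out from training, which is the "test questions" idea in the analogy above.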
8. Making Predictions
• Predicting smoking and alcohol consumption means using the trained model to estimate whether an individual is likely to be a smoker or to drink alcohol, based on the input features. The model takes relevant factors into account, such as demographic information, lifestyle choices, or health indicators, and outputs the likelihood of smoking and of alcohol consumption. These predictions help in understanding and addressing potentially harmful health behaviors in individuals.
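As a sketch of this step, the function below turns a model's probability into a yes/no label by applying a decision threshold. The weights, bias, feature values and the 0.5 threshold are all hypothetical; the slides do not report the trained parameters or the threshold used.

```python
import math

def predict(features, weights, bias, threshold=0.5):
    """Return (probability, label) for one individual.

    The label is 1 when the predicted probability reaches the threshold.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability, int(probability >= threshold)

# Hypothetical trained parameters for two standardized features.
weights = [1.2, -0.7]
bias = 0.1

# Hypothetical new individuals to score.
individuals = [[0.5, -1.0], [-2.0, 1.0]]
results = [predict(person, weights, bias) for person in individuals]

for person, (probability, label) in zip(individuals, results):
    print(person, f"probability={probability:.3f}", "label:", label)
```

Raising the threshold makes the system flag fewer people but with higher confidence, so the right value depends on how the predictions will be used.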
• The Smoker and Alcohol Drinker Prediction System has the potential to advance public health by predicting risky behaviors. Future improvements may enhance prediction accuracy, incorporate more diverse data, and foster collaborations for impactful preventive interventions.
Thank You