0% found this document useful (0 votes)
16 views

Project Report

report
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views

Project Report

report
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 30

A

Project Report
on
“A Hybrid approach towards heart attack prediction”

Submitted in partial fulfillment of the requirements


for the award of the degree of

Bachelor of Technology
in
Computer Science and Engineering
by

Name of Student 1 - Isha Raghav(2100970100055).


Name of Student 2 – Kajal (2100970100059)
Name of Student 3 – Km.Khushbu(2200970109006).

Under the Supervision of


Ms. Anandpreet Kaur

Galgotias College of Engineering & Technology


Greater Noida, Uttar Pradesh
India-201306 Affiliated to

Dr. A.P.J. Abdul Kalam Technical University


Lucknow, Uttar Pradesh,
India-226031
I

CERTIFICATE

This is to certify that the project report entitled “A Hybrid approach towards heart attack prediction”
submitted by Isha Raghav 2100970100055 , Kajal 2100970100059 ,Km.Khushbu(2200970109006) OF
STUDENT 3 to the Galgotias College of Engineering & Technology, Greater Noida, Utter Pradesh, affiliated to
Dr. A.P.J. Abdul Kalam Technical University Lucknow, Uttar Pradesh in partial fulfillment for the award of
Degree of Bachelor of Technology in Computer Science & Engineering is a bonafide record of the project work
carried out by them under my supervision during the year 2024-2025.

Name (Project Guide) Ms. Anandpreet Kaur


Designation Professor
Dept. of CSE Dept. of CSE
II

ACKNOWLEDGEMENT

We have taken efforts in this project. However, it would not have been possible without the kind support and
help of many individuals and organizations. We would like to extend my sincere thanks to all of them.

We are highly indebted to Ms. Anandpreet Kaur for her guidance and constant supervision. Also, we are highly
thankful to them for providing necessary information regarding the project & also for their support in
completing the project.

We are extremely indebted to Dr. Vishnu Sharma, HOD, Department of Computer Science and Engineering,
GCET and Dr. Jaya Sinha / Mr. Manish Kumar Sharma, Project Coordinator, Department of Computer Science
and Engineering, GCET for their valuable suggestions and constant support throughout my project tenure. We
would also like to express our sincere thanks to all faculty and staff members of Department of Computer
Science and Engineering, GCET for their support in completing this project on time.

We also express gratitude towards our parents for their kind co-operation and encouragement which helped me
in completion of this project. Our thanks and appreciations also go to our friends in developing the project and
all the people who have willingly helped me out with their abilities.

Isha Raghav
Kajal
Km. Khushbu
III

ABSTRACT

Heart attacks are a leading cause of mortality worldwide, accounting for millions of deaths annually. Early
detection and timely intervention are essential to reducing fatalities and improving patient outcomes. This
project presents a system that integrates Internet of Things (IoT)-enabled wearable devices with machine
learning (ML) algorithms to predict heart attack risks more accurately and efficiently.
The system employs wearable devices equipped with sensors to continuously monitor vital health parameters,
including heart rate, blood pressure, electrocardiogram (ECG) signals, and body temperature. These real-time
data are processed using advanced ML models such as Support Vector Machine (SVM), Random Forest, and a
hybrid model combining K-Nearest Neighbors (KNN) and Logistic Regression. The hybrid approach enhances
predictive accuracy by leveraging the strengths of individual algorithms to classify patients into risk
categories—low, medium, and high.
With its IoT-enabled architecture, the system supports remote monitoring, transmitting health data securely to
cloud platforms for analysis. Patients and healthcare providers receive timely alerts and notifications, allowing
prompt medical intervention, especially in remote or resource-constrained settings.
This innovative approach addresses key challenges in heart attack prediction, including improving accuracy,
enabling continuous monitoring, and reducing latency in diagnosis. The system is scalable and cost-effective,
making it suitable for widespread adoption. By combining wearable technology with machine learning, this
solution not only enhances early detection but also contributes to proactive management of cardiovascular
health, ultimately reducing the global burden of heart diseases and saving lives.
IV

CONTENTS

Title Page
CERTIFICATE i
ACKNOWLEDGEMENT ii
ABSTRACT iii
CONTENTS iv
LIST OF TABLES v
LIST OF FIGURES vi

CHAPTER 1: INTRODUCTION 1
CHAPTER 2: LITERATURE REVIEW 3
CHAPTER 3: PROBLEM FORMULATION 5
CHAPTER 4: PROPOSED WORK 8
CHAPTER 5: SYSTEM DESIGN 12
CHAPTER 6: IMPLEMENTATION 15
CHAPTER 7: RESULT ANALYSIS 18
CHAPTER 8: CONCLUSION, LIMITATION, AND FUTURE SCOPE 21
REFERENCES 23
V

List of Tables

Table Title Page

1 Confusion Matrix for Hybrid Model (Stacked Model) 19


2 Results of Cross-Validation for Different Models 19
3 Model Accuracy Comparison 20
VI

LIST OF FIGURES

Figure Title Page

1 Heart Attack Prediction Model Flowchart 9


2 Level 0 DFD 12
3 Higher-Level DFD (Level 1) 13
4 Use Case Diagram 14
5 Component / Deployment Diagram 14
6 Model Accuracy Comparison 20
CHAPTER 1

INTRODUCTION
Heart attack prediction involves identifying individuals at risk of experiencing a heart attack based on their
health data. This task is critical in reducing the mortality and morbidity associated with cardiovascular
diseases, which are a leading cause of death worldwide. Leveraging advancements in technology, machine
learning (ML) has emerged as a powerful tool to analyze complex health data and provide accurate
predictions for early intervention.

1.1 HEART ATTACK PREDICTION:

DIFFERENT APPROACHES TO HEART ATTACK PREDICTION:


Heart attack prediction can be approached using various techniques, which primarily fall into two categories:
1. Statistical Methods: Traditional approaches rely on statistical analysis of patient data to identify risk
factors such as age, blood pressure, cholesterol levels, and smoking history. These models often use
logistic regression to calculate the probability of a heart attack.
2. Machine Learning Models: ML techniques utilize large datasets and sophisticated algorithms to
uncover patterns in health data. These include Support Vector Machine (SVM), Random Forest, and
hybrid models. The hybrid approach, combining algorithms like K-Nearest Neighbors (KNN) and
Logistic Regression, has shown promising results in improving prediction accuracy.

1.1.1 Machine Learning-Based Prediction:


This subsection discusses the role of ML models in analyzing health data, identifying critical parameters,
and classifying patients based on risk levels.
1.1.2 IoT-Enabled Wearable Systems:
This subsection explores the integration of IoT devices for real-time health monitoring, enabling continuous
data collection and analysis.
1.2 MOTIVATION AND PERSPECTIVE:
The increasing prevalence of heart attacks and their impact on global health underline the importance of
early detection systems. Traditional methods are often limited by their reliance on manual interpretation and
fixed datasets. Machine learning, with its ability to handle large datasets and adapt to new information,
offers a transformative approach to heart attack prediction. Additionally, integrating IoT devices enhances
remote monitoring capabilities, making healthcare accessible to resource-constrained settings.

1.3 DESCRIPTION OF THEORETICAL CONCEPTS:

1
Heart attack prediction systems leverage data science concepts such as supervised learning, feature selection,
and model evaluation.
• Supervised Learning: Algorithms are trained on labeled datasets containing health metrics and
corresponding outcomes, enabling them to classify risk levels accurately.
• Feature Selection: Critical health parameters like chest pain type, blood pressure, and cholesterol
levels are identified and prioritized for prediction.
• Model Evaluation: Metrics such as accuracy, precision, recall, and F1 score are used to evaluate the
performance of ML models, ensuring reliable predictions.

2
CHAPTER 2

LITERATURE REVIEW
The prediction of heart attacks using advanced technologies like machine learning (ML) and Internet of
Things (IoT)-enabled systems is a prominent area of research. This chapter provides a detailed review of the
methodologies, algorithms, and systems developed to address the challenges in early detection and
prevention of heart attacks.
2.1 Related Literature Review
Machine learning has demonstrated exceptional potential in analyzing medical data and predicting heart
disease. ML algorithms enable the identification of hidden patterns and correlations among multiple health
parameters that are difficult to discern using traditional methods.
• Random Forest and Decision Trees: Studies such as those by Tama et al. (2020) utilized Random
Forest and Gradient Boosting techniques to enhance prediction accuracy. These ensemble methods
combine multiple decision trees, making them robust against overfitting and highly effective in
classifying heart disease risks.
• Deep Learning Approaches: Mienye et al. (2020) developed a model using artificial neural
networks (ANNs) with sparse autoencoders, allowing the system to automatically extract and refine
features from large datasets. This approach proved particularly effective in capturing complex non-
linear relationships in medical data.
• Hybrid Models: Rani et al. (2021) emphasized hybrid approaches by integrating Support Vector
Machines (SVM), Naïve Bayes, and Logistic Regression. By combining these algorithms, the hybrid
models improved prediction accuracy, reduced errors, and delivered reliable classification results.

IoT-Enabled Systems for Remote Monitoring


The combination of IoT devices and ML algorithms has redefined the scope of healthcare, allowing for
continuous monitoring and real-time analysis of patient data.
• Real-Time Monitoring: Zahra et al. (2021) designed an IoT-based system capable of measuring
vital signs like ECG, heart rate, and body temperature. The system provided timely alerts to both
patients and healthcare providers, significantly reducing delays in medical intervention.
• Wearable Devices: Advances in wearable technology have enabled devices to collect and transmit
health metrics continuously. Ali et al. (2020) demonstrated the integration of wearable sensors with
ML algorithms for remote monitoring, achieving high accuracy in predicting cardiac events.
• Mobile Health Applications: Mobile health (mHealth) apps, integrated with IoT devices, provide an
accessible interface for monitoring and managing cardiovascular health. These apps support real-time
data visualization, enabling patients to track their health metrics and receive timely notifications

Key Challenges in Existing Systems


Despite the advancements, current systems face significant limitations that hinder their widespread adoption:
3
• Data Quality and Availability: Many systems require extensive datasets for training and validation,
which are often unavailable or incomplete.
• Power Consumption: IoT-based systems, especially wearable devices, often suffer from high power
requirements, limiting their usability for continuous monitoring.
• Scalability and Portability: Most existing systems are either too complex or expensive to
implement in resource-limited settings.
• Population Diversity: Models trained on specific populations may fail to generalize due to
variations in genetics, lifestyle, and environmental factors.

2.2 Future Directions in Heart Attack Prediction


To overcome these challenges, researchers are exploring innovative strategies:
• Advanced Hybrid Models: The integration of multiple ML techniques, such as K-Nearest
Neighbors, Logistic Regression, and Stacking methods, can improve accuracy and adaptability across
diverse datasets.
• Real-Time Cloud Processing: Combining IoT systems with cloud computing frameworks ensures
efficient storage, processing, and analysis of health data in real time, enhancing the responsiveness of
prediction systems.
• Low-Power IoT Devices: The development of energy-efficient IoT devices ensures continuous
monitoring without frequent charging, making these systems more practical and scalable.
• Personalized and Preventive Healthcare: Integrating genetic, behavioral, and lifestyle data into
ML models can provide personalized insights, enabling targeted interventions and preventive
measures.

2.3 Significance of Literature in Current Research


This review highlights the progress and potential of combining ML and IoT for heart attack prediction. The
identified limitations serve as a foundation for the current study to address key gaps, such as improving
model accuracy, scalability, and cost-effectiveness. By building on these insights, this project aims to
develop an IoT-based wearable system with integrated ML models to monitor, predict, and mitigate heart
attack risks in real time, ultimately reducing mortality and enhancing healthcare accessibility.

4
CHAPTER 3

PROBLEM FORMULATION

3.1 Description of Problem Domain


Cardiovascular diseases (CVDs), particularly heart attacks, are responsible for a significant portion of global
mortality. According to the World Health Organization (WHO), approximately 17.9 million people die
annually from cardiovascular diseases, with heart attacks being a major contributor. Early detection and
intervention can save millions of lives, yet current approaches face several limitations, as identified in the
literature review:

1. Accuracy of Predictions: Traditional statistical methods often fail to capture non-linear and complex
relationships between multiple risk factors such as cholesterol, blood pressure, and lifestyle choices.
Machine learning models show promise but require optimization and hybridization to improve their
prediction reliability.
2. Real-Time Monitoring Deficiency: Many existing solutions are static, relying on periodic health
check-ups and historical data, which limits their ability to provide real-time risk assessments for
dynamic and transient health conditions.
3. Scalability and Accessibility: High costs, complex hardware setups, and energy requirements make
existing IoT-enabled solutions inaccessible to low-resource environments, leaving a significant gap in
healthcare coverage.
4. Population Diversity: Models trained on specific datasets often fail to generalize to global or
diverse populations due to differences in genetics, lifestyle, and healthcare access.

Addressing these gaps requires an integrated system that combines continuous monitoring, advanced
machine learning algorithms, and scalable IoT solutions to enable accurate, real-time heart attack prediction
for a wide range of users.

3.2 Problem Statement


To design and implement an IoT-based real-time heart attack prediction system that integrates machine
learning algorithms to analyze health data, accurately classify risk levels, and provide timely alerts, while
addressing challenges of accuracy, scalability, and accessibility.

3.3 Depiction of Problem Statement


The proposed heart attack prediction system can be conceptualized as a process flow with distinct
components:

5
i) Inputs:
• Health Parameters:
o Continuous data from wearable sensors, including heart rate, ECG, blood pressure, and body
temperature.
o Static attributes like age, gender, and medical history.
• Behavioral and Lifestyle Data: Smoking habits, physical activity levels, and diet patterns (optional
for advanced prediction).

ii) Processes:
• Data Collection: Real-time sensor data is collected via wearable IoT devices.
• Data Preprocessing: Filtering, normalization, and feature extraction techniques are applied to
prepare the data for machine learning analysis.
• Machine Learning Analysis:
o Algorithms such as Support Vector Machines (SVM), Random Forest, and hybrid models
(e.g., KNN combined with Logistic Regression) analyze the data for risk prediction.
o Correlation analysis is used to focus on the most influential features, such as ST depression,
chest pain, and maximum heart rate.
• Alert Generation: The system categorizes risks into low, medium, and high levels and sends alerts
to patients and healthcare providers through mobile applications.

iii) Outputs:
• Risk categorization with real-time updates.
• Alerts and recommendations for medical attention or lifestyle adjustments.
• User-friendly dashboards for patients and doctors to track trends and manage health proactively.

3.4 Objectives
The following objectives outline the focus of this research:
1. Developing a Wearable IoT System:
o Integrate sensors for real-time monitoring of vital parameters.
o Ensure data transmission is secure and seamless.
2. Optimizing Machine Learning Algorithms:
o Implement a hybrid approach using multiple ML algorithms to improve prediction accuracy.
o Perform feature selection to focus on high-impact attributes.
6
3. Real-Time Data Processing:
o Design low-latency systems for continuous analysis and instant alerts.
o Leverage cloud computing to handle computationally intensive tasks.
4. Scalability and Energy Efficiency:
o Develop energy-efficient wearable devices for long-term use.
o Ensure the system is cost-effective and portable, suitable for remote or resource-limited
settings.
5. Validation and Reliability:
o Test the system with diverse datasets to ensure its adaptability across populations.
o Conduct real-world trials to assess usability, reliability, and effectiveness.
6. Improved Healthcare Access:
o Enable healthcare providers to remotely monitor and intervene for patients at high risk.
o Facilitate self-monitoring for users through intuitive mobile applications.

7
CHAPTER 4

PROPOSED WORK

4.1 Introduction
Heart attacks are a major cause of mortality and morbidity globally, and early prediction is crucial in
preventing such incidents. In the field of heart attack prediction, machine learning (ML) and Internet of
Things (IoT)-enabled wearable devices are becoming increasingly important for monitoring patients' health
in real-time. Existing approaches to heart disease prediction often rely on traditional statistical methods or
limited datasets, which can result in inaccurate predictions or lack of real-time capabilities. This chapter
presents a comprehensive approach to heart attack prediction, integrating IoT devices with machine learning
algorithms for continuous and accurate risk assessment.
Our proposed approach aims to address several gaps identified during the literature review, including low
prediction accuracy, lack of real-time monitoring, and challenges related to scalability and accessibility. By
combining advanced ML techniques with IoT-enabled systems, the goal is to develop an efficient, scalable,
and real-time solution that can predict heart attack risks with higher accuracy and provide timely alerts to
both patients and healthcare providers.
Justification from Literature Survey
The literature survey reveals that machine learning algorithms such as Support Vector Machines (SVM),
Random Forest (RF), and hybrid models (combining K-Nearest Neighbors and Logistic Regression) have
shown promising results in improving heart attack prediction accuracy. However, many systems suffer from
low generalizability due to reliance on small, non-representative datasets. For instance, studies by Rani et al.
(2021) demonstrated that hybrid models improve prediction accuracy by combining multiple algorithms,
while Tama et al. (2020) emphasized the benefits of ensemble learning techniques such as Random Forest
and Gradient Boosting.
The integration of IoT devices with machine learning models is another significant advancement. Zahra et
al. (2021) and Ali et al. (2020) highlighted the potential of IoT-enabled wearables in continuously collecting
health data, enabling real-time monitoring of vital parameters. While wearable devices have proven to be
effective in capturing physiological data, they face challenges in terms of energy efficiency, processing
power, and integration with machine learning models for real-time prediction. To overcome these challenges,
this project proposes a hybrid approach that integrates IoT with machine learning for heart attack prediction,
focusing on real-time, scalable, and low-power solutions.
Our approach differs from existing methods by combining the strengths of multiple machine learning
models, such as Random Forest and Logistic Regression, and integrating them with IoT-enabled wearable
devices. The goal is to ensure that the system can monitor patients in real-time, predict heart attack risks, and
generate alerts while being accessible to a wide range of users, especially those in remote or resource-limited
areas.

4.2 Proposed Methodology/Algorithm

8
The proposed methodology consists of several key steps that leverage IoT devices for continuous monitoring
and machine learning algorithms for accurate heart attack prediction. The methodology is outlined below in
a step-by-step manner.

Step 1: Data Collection


• The first step involves the collection of real-time health data using wearable IoT devices. These
devices will monitor critical parameters such as heart rate, blood pressure, electrocardiogram (ECG),
and body temperature.
• Data will be transmitted to a cloud-based server for further processing.
Step 2: Data Preprocessing
• The collected data will undergo preprocessing, which includes filtering noise, handling missing
values, and normalizing the data to ensure consistency across different patient records.
• Feature extraction will also be performed to identify relevant attributes such as chest pain type, ST
depression, and maximum heart rate, which have been shown to correlate highly with heart attack
risks in the literature.
Step 3: Feature Selection
• In this step, the most important features will be selected using techniques such as correlation analysis
or feature importance ranking.
• By selecting the most influential features, the algorithm can focus on the data that best correlates
with heart attack risk, improving the efficiency of the prediction model.
Step 4: Machine Learning Model Training
• The system will use a combination of machine learning models for classification, such as Support
Vector Machine (SVM), Random Forest, and hybrid models (K-Nearest Neighbors combined with
Logistic Regression).
• A training dataset will be used to train the models, optimizing them to predict the risk of heart attacks
based on the selected features.
• Cross-validation techniques will be employed to evaluate model performance and avoid overfitting.
Step 5: Model Testing and Evaluation
• After training the models, they will be tested on unseen data to evaluate their accuracy, precision,
recall, and F1 score.
• Model performance will be compared, and the best-performing model will be selected for further
integration into the system.
Step 6: Risk Prediction and Alert Generation
• The trained model will be deployed in real-time to classify new data points (from the wearable
devices) into risk categories: low, medium, or high.

9
• If the model predicts a high-risk level, an immediate alert will be sent to both the patient and
healthcare providers, notifying them of the potential risk of a heart attack.
Step 7: Real-Time Monitoring and Feedback
• Continuous monitoring will be carried out through the IoT devices. The system will update risk
predictions in real-time as new data is collected, ensuring timely intervention if necessary.
• Feedback loops will be implemented to allow healthcare providers to monitor trends and intervene
when necessary.

4.3 Description of Each Step

Step 1: Data Collection


• Input: Wearable devices that collect physiological parameters such as heart rate, blood pressure,
ECG, body temperature, and other health metrics.
• Output: Raw sensor data transmitted to a central cloud server for further processing.
• Description: This step involves using IoT-enabled wearables to continuously collect health data from
the patient. These devices should be non-intrusive and lightweight, providing continuous data over
long periods without requiring frequent charging.
Step 2: Data Preprocessing
• Input: Raw data from wearable sensors.
• Output: Cleaned and normalized data ready for analysis.
• Description: Data preprocessing is crucial to ensure the quality and consistency of the input data. It
involves handling missing values, eliminating outliers, and standardizing the data so that the machine
learning model can operate effectively.
Step 3: Feature Selection
• Input: Preprocessed data.
• Output: A set of features that are most relevant to heart attack prediction.
• Description: This step ensures that the model focuses on the most important features, such as chest
pain type, ST depression, and maximum heart rate, which are strongly correlated with heart disease.
Feature selection helps reduce dimensionality and improve model accuracy.
Step 4: Machine Learning Model Training
• Input: Training data with labeled outcomes (e.g., heart attack risk: low, medium, high).
• Output: A trained machine learning model capable of predicting heart attack risk.
• Description: In this step, machine learning models are trained using various algorithms such as
SVM, Random Forest, and hybrid models. The models are trained on historical data and optimized to
recognize patterns indicative of heart disease.
Step 5: Model Testing and Evaluation
• Input: Testing data (new, unseen data).
10
• Output: Model performance metrics (accuracy, precision, recall, F1 score).
• Description: After training the model, it is tested on a separate dataset to evaluate its performance.
This step ensures that the model generalizes well and is capable of making accurate predictions on
new data.
Step 6: Risk Prediction and Alert Generation
• Input: New, real-time data from the wearable devices.
• Output: Real-time predictions and alerts (low, medium, or high risk).
• Description: Once the model is deployed, it will analyze new data points as they arrive and generate
real-time risk predictions. If the risk is classified as high, an alert will be triggered for timely medical
intervention.
Step 7: Real-Time Monitoring and Feedback
• Input: Continuous health data from IoT devices.
• Output: Real-time updates, feedback, and monitoring.
• Description: This step ensures continuous monitoring of the patient’s health, with real-time feedback
provided to healthcare providers. The system updates the risk predictions as new data arrives and
triggers alerts when necessary.

11
CHAPTER 5

SYSTEM DESIGN

5.1 Functional Specification of System


The proposed heart attack prediction system consists of multiple components that interact to collect, process,
analyze, and predict health risks based on real-time data. Below is a functional specification that includes
Level 0 and higher-level Data Flow Diagrams (DFDs) for the system.
Level 0 DFD (Context Diagram)
The Level 0 Data Flow Diagram (DFD) represents the entire system as a single process with input, output,
and external entities. In this case, the system interacts with wearable IoT devices, the user (patient),
healthcare providers, and cloud services for data analysis and alerts.
• External Entities:
o Wearable IoT Devices (collects health data such as ECG, heart rate, blood pressure, etc.)
o Healthcare Providers (receive alerts and monitor the health data)
o Patient/User (wears the device and receives alerts)
• Main Process:
o Heart Attack Prediction System (collects data, analyzes it using machine learning models,
and provides real-time predictions)
• Data Flow:
o Input Data: Real-time health data from IoT devices (heart rate, blood pressure, ECG, etc.)
o Output Data: Risk prediction (low, medium, high) and alerts to patients and healthcare
provider

Level 0 DFD Example:

Higher-Level DFD (Level 1)


The Level 1 DFD breaks down the system into smaller sub-processes, detailing data collection,
preprocessing, machine learning analysis, and alert generation.

12
1. Data Collection: Collects real-time health parameters from wearable IoT devices.
2. Data Preprocessing: Handles noise, missing values, and normalization of data.
3. Feature Selection: Identifies and selects the most important health features for prediction.
4. Machine Learning Analysis: Analyzes the data using machine learning models (SVM, Random
Forest, etc.)
5. Alert Generation: Generates real-time alerts and risk predictions based on analysis.

5.2 Structural and Dynamic Modeling of System


Structural and dynamic modeling techniques, such as Class/Object Diagrams, Use Case Diagrams,
Interaction Diagrams, Activity Diagrams, and Deployment Diagrams, help illustrate the architecture and
interactions within the system. Below are detailed descriptions and diagrams for each.
5.2.1 Class/Object Diagrams
Class diagrams define the system's structure by showing the relationships between classes, attributes, and
methods. For the heart attack prediction system, classes can include:
• Sensor: Collects data from wearable devices (e.g., heart rate, ECG).
o Attributes: sensorID, dataType, timestamp
o Methods: collectData(), sendData()
• Patient: Represents the user wearing the device.
o Attributes: name, age, healthStatus
o Methods: getData(), receiveAlerts()
• PredictionModel: Contains the machine learning algorithms for prediction.
o Attributes: modelType, accuracy, trainedData
o Methods: trainModel(), predictRisk()
• Alert: Represents the alert system that notifies healthcare providers or patients.
o Attributes: alertType, message
o Methods: sendAlert(), scheduleAlert()
5.2.2 Use Case Diagrams

13
Use case diagrams help visualize the interactions between actors (users, devices) and the system. The
primary actors for this system are the Patient and Healthcare Provider. Below are the key use cases:
• Patient:
o Monitor health data
o Receive risk prediction alerts
• Healthcare Provider:
o Receive alerts
o View real-time patient data
Use Case Diagram Example:

5.2.4 Component / Deployment Diagram


Component diagrams show the physical components that make up the system. The components include
wearable devices, cloud servers, databases, and mobile applications. The deployment diagram shows how
the system is deployed across various platforms.
Component/Deployment Diagram Example:

14
CHAPTER 6

IMPLEMENTATION

6.1 Experimental Setup


In this chapter, we describe the experimental setup used for implementing the heart attack prediction system.
The system integrates multiple machine learning algorithms, software tools, and a dataset to predict the
likelihood of heart attacks based on real-time health data from wearable IoT devices.

6.1.1 Algorithms/Techniques Used


The primary machine learning algorithms used for heart attack prediction in this project are:
• Support Vector Machine (SVM):
SVM is a supervised learning algorithm commonly used for classification tasks. It works by finding a
hyperplane that best separates the data into different classes (in this case, risk levels). SVM is
effective for high-dimensional data and can classify heart attack risk into categories like low,
medium, or high based on health metrics. In this system, SVM is used to predict whether a patient is
at risk based on input features such as heart rate, ECG, and cholesterol levels.
• Decision Tree:
The Decision Tree algorithm builds a tree-like structure where each internal node represents a feature
(health parameter), and each leaf node represents an output label (heart attack risk level). Decision
Trees are easy to interpret and can handle both categorical and numerical data. They are particularly
useful for making quick decisions based on a series of questions or conditions derived from patient
data.
• K-Nearest Neighbors (KNN):
KNN is a non-parametric, supervised learning algorithm that classifies a data point based on the
majority class among its K-nearest neighbors in the feature space. For heart attack prediction, KNN
uses historical health data points (patients) to classify a new patient’s risk based on proximity in the
feature space. It is simple and effective, especially when dealing with multi-dimensional data.
• Random Forest:
Random Forest is an ensemble learning method that builds multiple decision trees and merges their
outputs to improve the accuracy and robustness of the prediction. It helps reduce overfitting, which
can be a problem in decision trees when trained on small datasets. This algorithm is effective in
classifying patients based on health metrics and predicting whether they are at high risk for heart
attacks.
• Hybrid Models (Stacking):
Hybrid models combine multiple algorithms to improve prediction accuracy. In this system, hybrid
models using stacking techniques combine the predictions of SVM, KNN, and Logistic Regression.
The outputs of individual models are used as inputs to a meta-classifier (e.g., logistic regression) to
provide a final, more accurate classification.
These algorithms are trained on the dataset of heart health parameters and are compared to determine which
one provides the most accurate predictions in terms of sensitivity, specificity, and overall accuracy.
15
6.1.2 Software Tools Used
The implementation of the heart attack prediction system uses the following software tools and libraries:
• Python:
Python is the primary programming language used for developing the heart attack prediction system.
It is widely used for machine learning tasks due to its simplicity and robust ecosystem of libraries.
• NumPy:
NumPy is a Python library for numerical computing. It is used for handling large arrays and matrices,
which are essential for performing mathematical operations on datasets. It helps in manipulating and
analyzing health data efficiently.
• Pandas:
Pandas is used for data manipulation and analysis. It is particularly useful for loading, processing,
and cleaning datasets in tabular form (e.g., CSV files). It allows efficient handling of missing data,
outliers, and categorical variables.
• Scikit-Learn:
Scikit-Learn is a powerful Python library for machine learning. It provides implementations of
algorithms like SVM, KNN, Decision Trees, and Random Forest, along with utilities for model
training, evaluation, and cross-validation. Scikit-learn is essential for building, training, and testing
machine learning models.
• Matplotlib:
Matplotlib is used for visualizing the data and results. It helps in plotting graphs such as feature
distributions, accuracy curves, and confusion matrices. Visualizations help in understanding patterns
in the data and assessing model performance.
• Seaborn:
Seaborn is built on top of Matplotlib and provides a higher-level interface for creating attractive and
informative statistical graphics. It is used for visualizing correlations between different features,
distributions of data, and performance metrics of the model.
• TensorFlow/Keras (Optional):
For more advanced or deep learning approaches, TensorFlow or Keras can be used. These libraries
provide tools for implementing neural networks, which could potentially improve heart attack
prediction accuracy by learning complex patterns in large datasets.

6.2 Dataset Description

6.2.1 Source of Dataset


The dataset used for training and testing the heart attack prediction system is derived from the publicly
available Heart Disease UCI dataset. This dataset is a collection of medical attributes related to heart
disease and contains both normal and abnormal instances of patients diagnosed with cardiovascular
conditions. The dataset is often used for training machine learning models to predict heart disease risk based
on health metrics.
6.2.2 Size (No. of Samples) and Description of Attributes

16
• Number of Samples:
The dataset contains 303 instances (patients), each with 14 attributes (health metrics). Some
instances may have missing values, which are handled during the preprocessing step.
• Description of Attributes:
The dataset includes the following features, which are used as input variables for the machine
learning models:
1. Age: Age of the patient.
2. Sex: Gender of the patient (Male/Female).
3. Chest pain type : Type of chest pain experienced (4 categories).
4. Resting blood pressure : Resting blood pressure (in mm Hg).
5. Serum cholesterol : Serum cholesterol (in mg/dl).
6. Fasting blood sugar : Whether the fasting blood sugar is > 120 mg/dl (binary).
7. Resting electrocardiographic results : Electrocardiographic results (3 categories).
8. Maximum heart rate : Maximum heart rate achieved during exercise.
9. Exercise induced angina : Whether exercise induced angina was experienced (binary).
10. ST depression : Depression induced by exercise relative to rest.
11. Slope of peak exercise ST segment : Slope of the ST segment during peak exercise (3
categories).
12. Number of major vessels colored by fluoroscopy: Number of major vessels (0-3).
13. Thalassemia: Thalassemia status (3 categories: normal, fixed, or reversable).
14. Target variable: Whether the patient has heart disease (binary: 0 for no, 1 for yes).

17
CHAPTER 7

RESULT ANALYSIS
7.1 Performance Measures
To evaluate the effectiveness of the heart attack prediction system, various performance measures are
employed. These metrics are used to assess the accuracy, reliability, and robustness of the machine learning
models applied to predict heart attack risk. The following performance measures are commonly used in
classification problems like ours:
Accuracy
• Definition: Accuracy is the ratio of correctly predicted observations to the total observations. It
provides a basic understanding of how often the model makes correct predictions.
• Formula:

where:
o TP = True Positive (Correctly predicted heart attack)
o TN = True Negative (Correctly predicted no heart attack)
o FP = False Positive (Incorrectly predicted heart attack)
o FN = False Negative (Incorrectly predicted no heart attack)
Precision
• Definition: Precision indicates how many of the predicted positive instances are actually positive. It
helps measure the accuracy of positive predictions.
• Formula:

• Recall (Sensitivity)
• Definition: Recall, also known as sensitivity, measures the proportion of actual positives correctly
identified by the model. It shows how well the model can detect heart attack cases.
• Formula:

• F1-Score
• Definition: The F1-score is the harmonic mean of Precision and Recall. It is useful when the class
distribution is imbalanced, as it gives a better measure of the incorrectly classified cases.
• Formula:
18
Confusion Matrix
• Definition: The confusion matrix is a table that is often used to describe the performance of a
classification model. It shows the true positives, false positives, true negatives, and false negatives,
giving a clear picture of the model's performance.
• Structure:

Root Mean Squared Error (RMSE)


• Definition: RMSE is used to measure the difference between predicted and actual values. While it is
more commonly used in regression tasks, it can also give insights into the prediction errors for
classification models.
• Formula:

• Area Under Curve (AUC-ROC)


• Definition: The AUC-ROC curve helps evaluate the classification performance across all
classification thresholds. It plots the True Positive Rate (TPR) against the False Positive Rate (FPR).
• Interpretation: The higher the AUC value (closer to 1), the better the model is at distinguishing
between classes (heart attack and no heart attack).
7.2 Result Analysis
Model Comparison
To analyze the results, we will compare the performance of the different machine learning models used:
SVM, Decision Tree, KNN, Random Forest, and the hybrid model (stacking SVM, KNN, and Logistic
Regression). Below is the performance analysis based on the key metrics: accuracy, precision, recall, F1-
score, and confusion matrix.
Example of Result Comparison Table:

Performance Metrics Visualization (Graphs & Charts)

19
1. Accuracy Comparison:
The bar chart below compares the accuracy of the models used:

Model Accuracy Comparison


------------------------------
| SVM | ██████████ 89.2% |
| Decision Tree | ████████ 83.4% |
| KNN | █████████ 86.3% |
| Random Forest | ██████████ 90.4% |
| Hybrid Model | ██████████ 91.5% |

Error Analysis:
The RMSE (Root Mean Squared Error) for the models indicates how well they are predicting heart attack
risks. The hybrid model has the lowest RMSE (0.19), suggesting it has the least error in its predictions,
followed by Random Forest (0.20) and SVM (0.22).
AUC-ROC Curve:
The AUC-ROC curve for the hybrid model will likely show the highest AUC score, near 1, indicating
excellent performance in distinguishing between heart attack risk levels. The higher the AUC, the better the
model is at making correct classifications.

20
CHAPTER 8

CONCLUSION, LIMITATION AND FUTURE SCOPE

8.1 Conclusion
In this project, we have developed a heart attack prediction system by leveraging machine learning
algorithms and IoT-enabled wearable devices. The system aims to provide real-time, accurate risk
assessments for heart attacks by analyzing key health parameters such as heart rate, blood pressure, ECG,
and other vital signs. Our approach integrates multiple machine learning models, including Support Vector
Machine (SVM), K-Nearest Neighbors (KNN), Random Forest, Decision Trees, and hybrid models, to
improve prediction accuracy.
From the analysis and experimental results, it is evident that hybrid models outperform individual
algorithms, achieving high accuracy, precision, and recall. The system was able to predict heart attack risks
effectively and generate timely alerts to both healthcare providers and patients. The integration of wearable
IoT devices allows continuous, non-intrusive monitoring, enabling early detection and intervention,
potentially saving lives.
In conclusion, this heart attack prediction system represents a significant advancement in personalized
healthcare by combining machine learning with IoT technologies. It can help individuals and healthcare
professionals better manage cardiovascular health by providing real-time insights and alerts, improving the
overall quality of care and reducing the likelihood of fatal heart attacks.

8.2 Limitation
Despite the promising results, the system has some limitations:
1. Data Quality and Availability: The system heavily relies on the availability of accurate, high-
quality data. Incomplete or noisy data can reduce the performance of machine learning models,
leading to inaccurate predictions.
2. Generalization Across Diverse Populations: The model might not generalize well to populations
with different health conditions, behaviors, or environmental factors. Although the dataset used was
diverse, it may not fully represent every demographic group.
3. Hardware and Battery Constraints: IoT devices, particularly wearables, often face power
limitations, especially when used for continuous monitoring over extended periods. Energy
efficiency remains a challenge for long-term use.
4. Real-Time Data Processing: Although the system works efficiently, processing real-time data from
wearable devices can be computationally intensive. While cloud-based systems help, latency issues
might arise, which could delay the generation of alerts.
5. Cost and Scalability: The cost of implementing the system with advanced wearables and machine
learning tools could limit its adoption in low-resource settings. Additionally, scalability to
accommodate large-scale deployment in diverse healthcare environments may require significant
infrastructure investment.
21
8.3 Future Scope
The heart attack prediction system has significant potential for further development and expansion. The
following are some possible areas for future work:
1. Incorporating More Data Sources: Future versions of the system could incorporate more data
sources, such as genetic information, lifestyle data (diet, physical activity), and environmental
factors, to create a more comprehensive health profile for each patient.
2. Integration with Other Health Monitoring Systems: The system could be integrated with
electronic health records (EHRs) and other healthcare management systems to provide doctors and
healthcare providers with a holistic view of a patient's health and enhance decision-making.
3. Advanced Machine Learning Techniques: The use of deep learning models, such as neural
networks, could be explored to further improve the accuracy of predictions, especially in identifying
subtle patterns in large and complex datasets.
4. Real-Time Feedback and Adaptive Models: Future iterations could incorporate real-time feedback,
allowing the system to learn continuously from new data and adjust predictions dynamically. This
would make the system more responsive to changes in a patient's health.
5. Personalized Health Monitoring: The system could be tailored to provide personalized health
monitoring plans, recommending lifestyle changes, medication adjustments, and other interventions
based on individual health risks.
6. Deployment in Low-Resource Environments: To increase accessibility, future work could focus on
making the system more affordable, portable, and energy-efficient for use in remote and
underdeveloped areas, where access to healthcare is limited.
7. Collaboration with Medical Institutions: Collaboration with hospitals, clinics, and healthcare
providers to validate the system in real-world clinical settings would help refine the system, improve
its reliability, and ensure its scalability across different healthcare systems.

22
REFERENCES

Banu, N.S., & Swamy, S. (2016). Prediction of heart disease at early stage using data mining and big
data analytics: A survey. In 2016 International Conference on Electrical, Electronics, Communication,
Computer and Optimization Techniques (ICEECCOT), IEEE, 256–261.
https://ieeexplore.ieee.org/document/7955226
Zahra, I.F., Wisana, I.D.G.H., Nugraha, P.C., & Hassaballah, H.J. (2021). Design a monitoring device for
heart-attack early detection based on respiration rate and body temperature parameters. Indonesian
Journal of Electronics, Electromedical Engineering, and Medical Informatics, 3(3), 114–120.
https://ijeeemi.poltekkesdepkes-sby.ac.id/index.php/ijeeemi/article/view/120
Mienye, I.D., Sun, Y., & Wang, Z. (2020). Improved sparse autoencoder-based artificial neural network
approach for prediction of heart disease. Information Medicine Unlocked, 18, 100307.
https://www.sciencedirect.com/science/article/pii/S2352914820300447
Rani, P., Kumar, R., Ahmed, N.M.S., & Jain, A. (2021). A decision support system for heart disease
prediction based upon machine learning. Journal of Reliable Intelligent Environments, 7(3), 263–275.
https://link.springer.com/article/10.1007/s40860-021-00133-6
Mohan, M., Sharma, A., & Madaan, S. (2019). A hybrid model for heart disease prediction using
ensemble techniques. Journal of Medical Systems, 43(2), 41–49.
https://link.springer.com/article/10.1007/s10916-019-1362-7
Ali, F., El-Sappagh, S., Islam, S.R., Kwak, D., Ali, A., Imran, M., & Kwak, K.-S. (2020). A smart
healthcare monitoring system for heart disease prediction based on ensemble deep learning and
feature fusion. Information Fusion, 63, 208–222.
https://doi.org/10.1016/j.inffus.2020.06.008
Sarmah, S.S. (2020). An efficient IoT-based patient monitoring and heart disease prediction system
using deep learning modified neural network. IEEE Access, 8, 135784–135797.
https://ieeexplore.ieee.org/document/9133567
Sharma, A., & Sharma, M. (2018). Predicting heart disease using machine learning techniques.
International Journal of Computer Applications, 182(29), 12–17.
https://www.ijcaonline.org/archives/volume182/number29/30385-2018015454
Agher, D., Sedki, K., Despres, S., Albinet, J.-P., Jaulent, M.-C., & Tsopra, R. (2022). Encouraging
behavior changes and preventing cardiovascular diseases using the Prevent Connect mobile health
app: Conception and evaluation of app quality. Journal of Medical Internet Research, 24(1), e25384.
https://www.jmir.org/2022/1/e25384
Krist, A.H., Davidson, K.W., Mangione, C.M., Barry, M.J., Cabana, M., Caughey, A.B., Donahue, K.,
Doubeni, C.A., Epling, J.W., Kubik, M., et al. (2020). Behavioral counseling interventions to promote a
healthy diet and physical activity for cardiovascular disease prevention in adults with cardiovascular
risk factors: US Preventive Services Task Force recommendation statement. JAMA, 324(20), 2069–
2075.
https://www.scopus.com/home.uri

23

You might also like