Project Report
on
“A Hybrid approach towards heart attack prediction”
Bachelor of Technology
in
Computer Science and Engineering
by
CERTIFICATE
This is to certify that the project report entitled “A Hybrid approach towards heart attack prediction”
submitted by Isha Raghav (2100970100055), Kajal (2100970100059), and Km. Khushbu (2200970109006)
to the Galgotias College of Engineering & Technology, Greater Noida, Uttar Pradesh, affiliated to
Dr. A.P.J. Abdul Kalam Technical University, Lucknow, Uttar Pradesh, in partial fulfillment for the award of
Degree of Bachelor of Technology in Computer Science & Engineering is a bonafide record of the project work
carried out by them under my supervision during the year 2024-2025.
ACKNOWLEDGEMENT
We have taken efforts in this project. However, it would not have been possible without the kind support and
help of many individuals and organizations. We would like to extend our sincere thanks to all of them.
We are highly indebted to Ms. Anandpreet Kaur for her guidance and constant supervision. We are also highly
thankful to her for providing necessary information regarding the project and for her support in
completing the project.
We are extremely indebted to Dr. Vishnu Sharma, HOD, Department of Computer Science and Engineering,
GCET and Dr. Jaya Sinha / Mr. Manish Kumar Sharma, Project Coordinator, Department of Computer Science
and Engineering, GCET for their valuable suggestions and constant support throughout our project tenure. We
would also like to express our sincere thanks to all faculty and staff members of the Department of Computer
Science and Engineering, GCET for their support in completing this project on time.
We also express gratitude towards our parents for their kind co-operation and encouragement, which helped us
in completing this project. Our thanks and appreciation also go to our friends for their help in developing the
project and to all the people who have willingly helped us with their abilities.
Isha Raghav
Kajal
Km. Khushbu
ABSTRACT
Heart attacks are a leading cause of mortality worldwide, accounting for millions of deaths annually. Early
detection and timely intervention are essential to reducing fatalities and improving patient outcomes. This
project presents a system that integrates Internet of Things (IoT)-enabled wearable devices with machine
learning (ML) algorithms to predict heart attack risks more accurately and efficiently.
The system employs wearable devices equipped with sensors to continuously monitor vital health parameters,
including heart rate, blood pressure, electrocardiogram (ECG) signals, and body temperature. These real-time
data are processed using advanced ML models such as Support Vector Machine (SVM), Random Forest, and a
hybrid model combining K-Nearest Neighbors (KNN) and Logistic Regression. The hybrid approach enhances
predictive accuracy by leveraging the strengths of individual algorithms to classify patients into risk
categories—low, medium, and high.
With its IoT-enabled architecture, the system supports remote monitoring, transmitting health data securely to
cloud platforms for analysis. Patients and healthcare providers receive timely alerts and notifications, allowing
prompt medical intervention, especially in remote or resource-constrained settings.
This innovative approach addresses key challenges in heart attack prediction, including improving accuracy,
enabling continuous monitoring, and reducing latency in diagnosis. The system is scalable and cost-effective,
making it suitable for widespread adoption. By combining wearable technology with machine learning, this
solution not only enhances early detection but also contributes to proactive management of cardiovascular
health, ultimately reducing the global burden of heart diseases and saving lives.
CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
ABSTRACT
CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: INTRODUCTION
CHAPTER 2: LITERATURE REVIEW
CHAPTER 3: PROBLEM FORMULATION
CHAPTER 4: PROPOSED WORK
CHAPTER 5: SYSTEM DESIGN
CHAPTER 6: IMPLEMENTATION
CHAPTER 7: RESULT ANALYSIS
CHAPTER 8: CONCLUSION, LIMITATION, AND FUTURE SCOPE
REFERENCES
List of Tables
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
Heart attack prediction involves identifying individuals at risk of experiencing a heart attack based on their
health data. This task is critical in reducing the mortality and morbidity associated with cardiovascular
diseases, which are a leading cause of death worldwide. Leveraging advancements in technology, machine
learning (ML) has emerged as a powerful tool to analyze complex health data and provide accurate
predictions for early intervention.
Heart attack prediction systems leverage data science concepts such as supervised learning, feature selection,
and model evaluation.
• Supervised Learning: Algorithms are trained on labeled datasets containing health metrics and
corresponding outcomes, enabling them to classify risk levels accurately.
• Feature Selection: Critical health parameters like chest pain type, blood pressure, and cholesterol
levels are identified and prioritized for prediction.
• Model Evaluation: Metrics such as accuracy, precision, recall, and F1 score are used to evaluate the
performance of ML models, ensuring reliable predictions.
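Supervised learning and model evaluation both depend on keeping some labeled records aside for testing. The sketch below is a minimal illustration of a stratified hold-out split; the function name `stratified_split`, the 20% test fraction, and the toy labels are assumptions made for the example and are not taken from this project.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so both parts keep roughly the same class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out at least one record per class (assumed policy).
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels: 10 patients without and 10 with heart disease (illustrative only).
labels = [0] * 10 + [1] * 10
train_idx, test_idx = stratified_split(labels)
```

The model would be fitted on the records in `train_idx` and scored with accuracy, precision, recall, and F1 on the records in `test_idx`.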
CHAPTER 2
LITERATURE REVIEW
The prediction of heart attacks using advanced technologies like machine learning (ML) and Internet of
Things (IoT)-enabled systems is a prominent area of research. This chapter provides a detailed review of the
methodologies, algorithms, and systems developed to address the challenges in early detection and
prevention of heart attacks.
2.1 Related Literature Review
Machine learning has demonstrated exceptional potential in analyzing medical data and predicting heart
disease. ML algorithms enable the identification of hidden patterns and correlations among multiple health
parameters that are difficult to discern using traditional methods.
• Random Forest and Decision Trees: Studies such as those by Tama et al. (2020) utilized Random
Forest and Gradient Boosting techniques to enhance prediction accuracy. These ensemble methods
combine multiple decision trees, making them robust against overfitting and highly effective in
classifying heart disease risks.
• Deep Learning Approaches: Mienye et al. (2020) developed a model using artificial neural
networks (ANNs) with sparse autoencoders, allowing the system to automatically extract and refine
features from large datasets. This approach proved particularly effective in capturing complex non-
linear relationships in medical data.
• Hybrid Models: Rani et al. (2021) emphasized hybrid approaches by integrating Support Vector
Machines (SVM), Naïve Bayes, and Logistic Regression. By combining these algorithms, the hybrid
models improved prediction accuracy, reduced errors, and delivered reliable classification results.
CHAPTER 3
PROBLEM FORMULATION
1. Accuracy of Predictions: Traditional statistical methods often fail to capture non-linear and complex
relationships between multiple risk factors such as cholesterol, blood pressure, and lifestyle choices.
Machine learning models show promise but require optimization and hybridization to improve their
prediction reliability.
2. Real-Time Monitoring Deficiency: Many existing solutions are static, relying on periodic health
check-ups and historical data, which limits their ability to provide real-time risk assessments for
dynamic and transient health conditions.
3. Scalability and Accessibility: High costs, complex hardware setups, and energy requirements make
existing IoT-enabled solutions inaccessible to low-resource environments, leaving a significant gap in
healthcare coverage.
4. Population Diversity: Models trained on specific datasets often fail to generalize to global or
diverse populations due to differences in genetics, lifestyle, and healthcare access.
Addressing these gaps requires an integrated system that combines continuous monitoring, advanced
machine learning algorithms, and scalable IoT solutions to enable accurate, real-time heart attack prediction
for a wide range of users.
i) Inputs:
• Health Parameters:
o Continuous data from wearable sensors, including heart rate, ECG, blood pressure, and body
temperature.
o Static attributes like age, gender, and medical history.
• Behavioral and Lifestyle Data: Smoking habits, physical activity levels, and diet patterns (optional
for advanced prediction).
ii) Processes:
• Data Collection: Real-time sensor data is collected via wearable IoT devices.
• Data Preprocessing: Filtering, normalization, and feature extraction techniques are applied to
prepare the data for machine learning analysis.
• Machine Learning Analysis:
o Algorithms such as Support Vector Machines (SVM), Random Forest, and hybrid models
(e.g., KNN combined with Logistic Regression) analyze the data for risk prediction.
o Correlation analysis is used to focus on the most influential features, such as ST depression,
chest pain, and maximum heart rate.
• Alert Generation: The system categorizes risks into low, medium, and high levels and sends alerts
to patients and healthcare providers through mobile applications.
iii) Outputs:
• Risk categorization with real-time updates.
• Alerts and recommendations for medical attention or lifestyle adjustments.
• User-friendly dashboards for patients and doctors to track trends and manage health proactively.
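Two of the processes listed above, normalization and risk categorization, can be sketched in a few lines. This is a minimal illustration under stated assumptions: min-max scaling is one possible normalization, and the thresholds 0.33 and 0.66 are hypothetical cut-offs chosen for the example, since the report does not specify either.

```python
def min_max_normalize(values):
    """Scale a list of readings to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant signal carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def categorize_risk(probability, medium_threshold=0.33, high_threshold=0.66):
    """Map a predicted probability to 'low', 'medium', or 'high' (assumed thresholds)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= high_threshold:
        return "high"
    if probability >= medium_threshold:
        return "medium"
    return "low"


heart_rates = [60, 80, 100]          # toy readings in beats per minute
scaled = min_max_normalize(heart_rates)
levels = [categorize_risk(p) for p in (0.1, 0.5, 0.9)]
```

In practice the thresholds would need to be tuned with clinical input, because the cost of a missed high-risk case is much larger than the cost of a false alarm.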
3.4 Objectives
The following objectives outline the focus of this research:
1. Developing a Wearable IoT System:
o Integrate sensors for real-time monitoring of vital parameters.
o Ensure data transmission is secure and seamless.
2. Optimizing Machine Learning Algorithms:
o Implement a hybrid approach using multiple ML algorithms to improve prediction accuracy.
o Perform feature selection to focus on high-impact attributes.
3. Real-Time Data Processing:
o Design low-latency systems for continuous analysis and instant alerts.
o Leverage cloud computing to handle computationally intensive tasks.
4. Scalability and Energy Efficiency:
o Develop energy-efficient wearable devices for long-term use.
o Ensure the system is cost-effective and portable, suitable for remote or resource-limited
settings.
5. Validation and Reliability:
o Test the system with diverse datasets to ensure its adaptability across populations.
o Conduct real-world trials to assess usability, reliability, and effectiveness.
6. Improved Healthcare Access:
o Enable healthcare providers to remotely monitor and intervene for patients at high risk.
o Facilitate self-monitoring for users through intuitive mobile applications.
CHAPTER 4
PROPOSED WORK
4.1 Introduction
Heart attacks are a major cause of mortality and morbidity globally, and early prediction is crucial in
preventing such incidents. In the field of heart attack prediction, machine learning (ML) and Internet of
Things (IoT)-enabled wearable devices are becoming increasingly important for monitoring patients' health
in real-time. Existing approaches to heart disease prediction often rely on traditional statistical methods or
limited datasets, which can result in inaccurate predictions or lack of real-time capabilities. This chapter
presents a comprehensive approach to heart attack prediction, integrating IoT devices with machine learning
algorithms for continuous and accurate risk assessment.
Our proposed approach aims to address several gaps identified during the literature review, including low
prediction accuracy, lack of real-time monitoring, and challenges related to scalability and accessibility. By
combining advanced ML techniques with IoT-enabled systems, the goal is to develop an efficient, scalable,
and real-time solution that can predict heart attack risks with higher accuracy and provide timely alerts to
both patients and healthcare providers.
Justification from Literature Survey
The literature survey reveals that machine learning algorithms such as Support Vector Machines (SVM),
Random Forest (RF), and hybrid models (combining K-Nearest Neighbors and Logistic Regression) have
shown promising results in improving heart attack prediction accuracy. However, many systems suffer from
low generalizability due to reliance on small, non-representative datasets. For instance, studies by Rani et al.
(2021) demonstrated that hybrid models improve prediction accuracy by combining multiple algorithms,
while Tama et al. (2020) emphasized the benefits of ensemble learning techniques such as Random Forest
and Gradient Boosting.
The integration of IoT devices with machine learning models is another significant advancement. Zahra et
al. (2021) and Ali et al. (2020) highlighted the potential of IoT-enabled wearables in continuously collecting
health data, enabling real-time monitoring of vital parameters. While wearable devices have proven to be
effective in capturing physiological data, they face challenges in terms of energy efficiency, processing
power, and integration with machine learning models for real-time prediction. To overcome these challenges,
this project proposes a hybrid approach that integrates IoT with machine learning for heart attack prediction,
focusing on real-time, scalable, and low-power solutions.
Our approach differs from existing methods by combining the strengths of multiple machine learning
models, such as Random Forest and Logistic Regression, and integrating them with IoT-enabled wearable
devices. The goal is to ensure that the system can monitor patients in real-time, predict heart attack risks, and
generate alerts while being accessible to a wide range of users, especially those in remote or resource-limited
areas.
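The report does not state how the individual models are combined in the hybrid. One common option is soft voting, where the predicted probabilities of the models are averaged. The sketch below illustrates that idea for K-Nearest Neighbors and Logistic Regression using only the Python standard library; the equal weighting, the hyperparameters, and the toy data are assumptions made for the example, not the project's configuration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def knn_probability(train_X, train_y, x, k=3):
    """Share of the k nearest training points (Euclidean distance) with label 1."""
    neighbours = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    return sum(label for _, label in neighbours) / len(neighbours)


def fit_logistic(train_X, train_y, learning_rate=0.5, epochs=1000):
    """Fit Logistic Regression by batch gradient descent; returns (weights, bias)."""
    n, n_features = len(train_X), len(train_X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(train_X, train_y):
            error = sigmoid(sum(w * v for w, v in zip(weights, row)) + bias) - label
            for j, v in enumerate(row):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def logistic_probability(model, x):
    """Probability of class 1 under a fitted (weights, bias) pair."""
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def hybrid_probability(train_X, train_y, model, x, knn_weight=0.5, k=3):
    """Soft vote: weighted mean of the KNN and Logistic Regression probabilities."""
    p_knn = knn_probability(train_X, train_y, x, k)
    p_lr = logistic_probability(model, x)
    return knn_weight * p_knn + (1.0 - knn_weight) * p_lr


# Toy, already-normalized feature vectors (illustrative only).
train_X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.8, 0.9], [0.9, 0.7], [0.85, 0.8]]
train_y = [0, 0, 0, 1, 1, 1]
model = fit_logistic(train_X, train_y)

p_high = hybrid_probability(train_X, train_y, model, [0.9, 0.9])
p_low = hybrid_probability(train_X, train_y, model, [0.1, 0.1])
```

Averaging probabilities lets the local, instance-based view of KNN and the global linear boundary of Logistic Regression correct each other; stacking or weighted voting tuned on validation data would be alternative designs.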
The proposed methodology consists of several key steps that leverage IoT devices for continuous monitoring
and machine learning algorithms for accurate heart attack prediction. The methodology is outlined below in
a step-by-step manner.
• If the model predicts a high-risk level, an immediate alert will be sent to both the patient and
healthcare providers, notifying them of the potential risk of a heart attack.
Step 7: Real-Time Monitoring and Feedback
• Continuous monitoring will be carried out through the IoT devices. The system will update risk
predictions in real-time as new data is collected, ensuring timely intervention if necessary.
• Feedback loops will be implemented to allow healthcare providers to monitor trends and intervene
when necessary.
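The real-time update and alert behaviour described above can be sketched as a sliding window over the most recent risk scores. This is a minimal illustration, assuming that the alert is raised on the mean of the window rather than on a single reading (to dampen sensor noise); the class name `RiskMonitor`, the window size, and the 0.66 threshold are hypothetical.

```python
from collections import deque


class RiskMonitor:
    """Keeps the latest risk scores and flags a sustained high-risk level."""

    def __init__(self, window_size=5, high_threshold=0.66):
        # deque with maxlen drops the oldest score automatically.
        self.scores = deque(maxlen=window_size)
        self.high_threshold = high_threshold

    def update(self, risk_score):
        """Add the newest score; return True when the windowed mean is high."""
        self.scores.append(risk_score)
        mean_score = sum(self.scores) / len(self.scores)
        return mean_score >= self.high_threshold


monitor = RiskMonitor(window_size=3)
alerts = [monitor.update(score) for score in [0.2, 0.3, 0.9, 0.9, 0.9]]
```

With this toy stream the first high reading does not trigger an alert on its own; the alert fires once high readings dominate the window.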
CHAPTER 5
SYSTEM DESIGN
1. Data Collection: Collects real-time health parameters from wearable IoT devices.
2. Data Preprocessing: Handles noise, missing values, and normalization of data.
3. Feature Selection: Identifies and selects the most important health features for prediction.
4. Machine Learning Analysis: Analyzes the data using machine learning models (SVM, Random
Forest, etc.).
5. Alert Generation: Generates real-time alerts and risk predictions based on analysis.
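The feature selection module can be illustrated with a correlation-based ranking, in line with the correlation analysis mentioned in the problem formulation. The sketch below ranks features by the absolute Pearson correlation with the target; the feature names and values are toy data invented for the example, and the report does not confirm that this exact ranking rule was used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no linear relationship to report
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, target):
    """Return feature names ordered from most to least correlated with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy columns (illustrative values, not taken from the project dataset).
columns = {
    "resting_bp": [130, 120, 125, 128, 122, 127],
    "st_depression": [0.0, 0.2, 1.5, 2.3, 3.0, 0.1],
}
target = [0, 0, 1, 1, 1, 0]
ranking = rank_features(columns, target)
```

Pearson correlation only captures linear relationships, so a feature with a strong non-linear effect could be ranked too low; tree-based importance scores are a common complement.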
Use case diagrams help visualize the interactions between actors (users, devices) and the system. The
primary actors for this system are the Patient and Healthcare Provider. Below are the key use cases:
• Patient:
o Monitor health data
o Receive risk prediction alerts
• Healthcare Provider:
o Receive alerts
o View real-time patient data
Use Case Diagram Example:
CHAPTER 6
IMPLEMENTATION
• Number of Samples:
The dataset contains 303 instances (patients), each with 14 attributes (health metrics). Some
instances may have missing values, which are handled during the preprocessing step.
• Description of Attributes:
The dataset includes the following features, which are used as input variables for the machine
learning models:
1. Age: Age of the patient.
2. Sex: Gender of the patient (Male/Female).
3. Chest pain type: Type of chest pain experienced (4 categories).
4. Resting blood pressure: Resting blood pressure (in mm Hg).
5. Serum cholesterol: Serum cholesterol (in mg/dl).
6. Fasting blood sugar: Whether the fasting blood sugar is > 120 mg/dl (binary).
7. Resting electrocardiographic results: Electrocardiographic results (3 categories).
8. Maximum heart rate: Maximum heart rate achieved during exercise.
9. Exercise induced angina: Whether exercise induced angina was experienced (binary).
10. ST depression: ST depression induced by exercise relative to rest.
11. Slope of peak exercise ST segment: Slope of the ST segment during peak exercise (3
categories).
12. Number of major vessels colored by fluoroscopy: Number of major vessels (0-3).
13. Thalassemia: Thalassemia status (3 categories: normal, fixed defect, or reversible defect).
14. Target variable: Whether the patient has heart disease (binary: 0 for no, 1 for yes).
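The report states that missing values are handled during preprocessing but does not say how. A minimal sketch of one common choice, median imputation, is given below; the column names `age` and `chol` and the four toy records are assumptions made for the example.

```python
from statistics import median


def impute_median(rows, column):
    """Return copies of the rows with None in `column` replaced by the observed median."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = median(observed)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


# Toy records with one missing cholesterol value (illustrative only).
rows = [
    {"age": 63, "chol": 233},
    {"age": 41, "chol": None},
    {"age": 57, "chol": 250},
    {"age": 52, "chol": 199},
]
cleaned = impute_median(rows, "chol")
```

The median is less sensitive to outliers than the mean, which matters for clinical measurements such as cholesterol. For categorical attributes the most frequent category would be a more suitable fill value.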
CHAPTER 7
RESULT ANALYSIS
7.1 Performance Measures
To evaluate the effectiveness of the heart attack prediction system, various performance measures are
employed. These metrics are used to assess the accuracy, reliability, and robustness of the machine learning
models applied to predict heart attack risk. The following performance measures are commonly used in
classification problems like ours:
Accuracy
• Definition: Accuracy is the ratio of correctly predicted observations to the total observations. It
provides a basic understanding of how often the model makes correct predictions.
• Formula: $\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$
where:
o TP = True Positive (Correctly predicted heart attack)
o TN = True Negative (Correctly predicted no heart attack)
o FP = False Positive (Incorrectly predicted heart attack)
o FN = False Negative (Incorrectly predicted no heart attack)
Precision
• Definition: Precision indicates how many of the predicted positive instances are actually positive. It
helps measure the accuracy of positive predictions.
• Formula: $\text{Precision} = \dfrac{TP}{TP + FP}$
Recall (Sensitivity)
• Definition: Recall, also known as sensitivity, measures the proportion of actual positives correctly
identified by the model. It shows how well the model can detect heart attack cases.
• Formula: $\text{Recall} = \dfrac{TP}{TP + FN}$
F1-Score
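• Definition: The F1-score is the harmonic mean of Precision and Recall. It is useful when the class
distribution is imbalanced, because it accounts for both false positives and false negatives.
• Formula: $\text{F1} = 2 \times \dfrac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$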
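The four measures defined above can all be computed from the counts TP, TN, FP, and FN. The sketch below is a minimal illustration; the counts passed in the usage line are made-up numbers for the example and are not results from this project.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

The guards against zero denominators return 0.0 by convention, which avoids a division error when a model predicts no positive cases at all.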
Confusion Matrix
• Definition: The confusion matrix is a table that is often used to describe the performance of a
classification model. It shows the true positives, false positives, true negatives, and false negatives,
giving a clear picture of the model's performance.
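• Structure:
                             Predicted: Heart attack    Predicted: No heart attack
  Actual: Heart attack       TP                         FN
  Actual: No heart attack    FP                         TN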
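The four cells of a binary confusion matrix can be counted directly from paired true and predicted labels. The sketch below assumes the label 1 means heart attack and 0 means no heart attack, matching the target variable of the dataset; the label lists are toy data for the example.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


# Toy labels (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
```

For a heart attack predictor the FN cell deserves the closest attention, since it counts at-risk patients the model failed to flag.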
1. Accuracy Comparison:
The bar chart below compares the accuracy of the models used:
Error Analysis:
The RMSE (Root Mean Squared Error) for the models indicates how well they are predicting heart attack
risks. The hybrid model has the lowest RMSE (0.19), suggesting it has the least error in its predictions,
followed by Random Forest (0.20) and SVM (0.22).
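For reference, RMSE can be computed as sketched below. The report does not state whether the error was measured on predicted probabilities or on hard class labels; this example assumes probabilities compared with 0/1 labels, and the inputs are toy values.

```python
import math


def rmse(y_true, y_score):
    """Root Mean Squared Error between true labels and predicted scores."""
    if len(y_true) != len(y_score):
        raise ValueError("inputs must have the same length")
    squared_errors = [(t - s) ** 2 for t, s in zip(y_true, y_score)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy example: a model that always predicts 0.5 (illustrative only).
baseline_error = rmse([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
```

An uninformative predictor that always outputs 0.5 scores an RMSE of 0.5 on 0/1 labels, which gives a reference point for reading the values reported above.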
AUC-ROC Curve:
The AUC-ROC curve for the hybrid model will likely show the highest AUC score, near 1, indicating
excellent performance in distinguishing between heart attack risk levels. The higher the AUC, the better the
model is at making correct classifications.
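The AUC can be computed without plotting the curve, because it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below uses that pairwise definition for the binary case (heart disease versus no heart disease); extending it to the three risk levels would need a one-vs-rest scheme, which is not shown. The labels and scores are toy values.

```python
def roc_auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and scores (illustrative only).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a dataset of a few hundred records; a rank-based formula would be preferable for larger data.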
CHAPTER 8
CONCLUSION, LIMITATION, AND FUTURE SCOPE
8.1 Conclusion
In this project, we have developed a heart attack prediction system by leveraging machine learning
algorithms and IoT-enabled wearable devices. The system aims to provide real-time, accurate risk
assessments for heart attacks by analyzing key health parameters such as heart rate, blood pressure, ECG,
and other vital signs. Our approach integrates multiple machine learning models, including Support Vector
Machine (SVM), K-Nearest Neighbors (KNN), Random Forest, Decision Trees, and hybrid models, to
improve prediction accuracy.
From the analysis and experimental results, it is evident that hybrid models outperform individual
algorithms, achieving high accuracy, precision, and recall. The system was able to predict heart attack risks
effectively and generate timely alerts to both healthcare providers and patients. The integration of wearable
IoT devices allows continuous, non-intrusive monitoring, enabling early detection and intervention,
potentially saving lives.
In conclusion, this heart attack prediction system represents a significant advancement in personalized
healthcare by combining machine learning with IoT technologies. It can help individuals and healthcare
professionals better manage cardiovascular health by providing real-time insights and alerts, improving the
overall quality of care and reducing the likelihood of fatal heart attacks.
8.2 Limitation
Despite the promising results, the system has some limitations:
1. Data Quality and Availability: The system heavily relies on the availability of accurate, high-
quality data. Incomplete or noisy data can reduce the performance of machine learning models,
leading to inaccurate predictions.
2. Generalization Across Diverse Populations: The model might not generalize well to populations
with different health conditions, behaviors, or environmental factors. Although the dataset used was
diverse, it may not fully represent every demographic group.
3. Hardware and Battery Constraints: IoT devices, particularly wearables, often face power
limitations, especially when used for continuous monitoring over extended periods. Energy
efficiency remains a challenge for long-term use.
4. Real-Time Data Processing: Although the system works efficiently, processing real-time data from
wearable devices can be computationally intensive. While cloud-based systems help, latency issues
might arise, which could delay the generation of alerts.
5. Cost and Scalability: The cost of implementing the system with advanced wearables and machine
learning tools could limit its adoption in low-resource settings. Additionally, scalability to
accommodate large-scale deployment in diverse healthcare environments may require significant
infrastructure investment.
8.3 Future Scope
The heart attack prediction system has significant potential for further development and expansion. The
following are some possible areas for future work:
1. Incorporating More Data Sources: Future versions of the system could incorporate more data
sources, such as genetic information, lifestyle data (diet, physical activity), and environmental
factors, to create a more comprehensive health profile for each patient.
2. Integration with Other Health Monitoring Systems: The system could be integrated with
electronic health records (EHRs) and other healthcare management systems to provide doctors and
healthcare providers with a holistic view of a patient's health and enhance decision-making.
3. Advanced Machine Learning Techniques: The use of deep learning models, such as neural
networks, could be explored to further improve the accuracy of predictions, especially in identifying
subtle patterns in large and complex datasets.
4. Real-Time Feedback and Adaptive Models: Future iterations could incorporate real-time feedback,
allowing the system to learn continuously from new data and adjust predictions dynamically. This
would make the system more responsive to changes in a patient's health.
5. Personalized Health Monitoring: The system could be tailored to provide personalized health
monitoring plans, recommending lifestyle changes, medication adjustments, and other interventions
based on individual health risks.
6. Deployment in Low-Resource Environments: To increase accessibility, future work could focus on
making the system more affordable, portable, and energy-efficient for use in remote and
underdeveloped areas, where access to healthcare is limited.
7. Collaboration with Medical Institutions: Collaboration with hospitals, clinics, and healthcare
providers to validate the system in real-world clinical settings would help refine the system, improve
its reliability, and ensure its scalability across different healthcare systems.
REFERENCES
Banu, N.S., & Swamy, S. (2016). Prediction of heart disease at early stage using data mining and big
data analytics: A survey. In 2016 International Conference on Electrical, Electronics, Communication,
Computer and Optimization Techniques (ICEECCOT), IEEE, 256–261.
https://ieeexplore.ieee.org/document/7955226
Zahra, I.F., Wisana, I.D.G.H., Nugraha, P.C., & Hassaballah, H.J. (2021). Design a monitoring device for
heart-attack early detection based on respiration rate and body temperature parameters. Indonesian
Journal of Electronics, Electromedical Engineering, and Medical Informatics, 3(3), 114–120.
https://ijeeemi.poltekkesdepkes-sby.ac.id/index.php/ijeeemi/article/view/120
Mienye, I.D., Sun, Y., & Wang, Z. (2020). Improved sparse autoencoder-based artificial neural network
approach for prediction of heart disease. Informatics in Medicine Unlocked, 18, 100307.
https://www.sciencedirect.com/science/article/pii/S2352914820300447
Rani, P., Kumar, R., Ahmed, N.M.S., & Jain, A. (2021). A decision support system for heart disease
prediction based upon machine learning. Journal of Reliable Intelligent Environments, 7(3), 263–275.
https://link.springer.com/article/10.1007/s40860-021-00133-6
Mohan, M., Sharma, A., & Madaan, S. (2019). A hybrid model for heart disease prediction using
ensemble techniques. Journal of Medical Systems, 43(2), 41–49.
https://link.springer.com/article/10.1007/s10916-019-1362-7
Ali, F., El-Sappagh, S., Islam, S.R., Kwak, D., Ali, A., Imran, M., & Kwak, K.-S. (2020). A smart
healthcare monitoring system for heart disease prediction based on ensemble deep learning and
feature fusion. Information Fusion, 63, 208–222.
https://doi.org/10.1016/j.inffus.2020.06.008
Sarmah, S.S. (2020). An efficient IoT-based patient monitoring and heart disease prediction system
using deep learning modified neural network. IEEE Access, 8, 135784–135797.
https://ieeexplore.ieee.org/document/9133567
Sharma, A., & Sharma, M. (2018). Predicting heart disease using machine learning techniques.
International Journal of Computer Applications, 182(29), 12–17.
https://www.ijcaonline.org/archives/volume182/number29/30385-2018015454
Agher, D., Sedki, K., Despres, S., Albinet, J.-P., Jaulent, M.-C., & Tsopra, R. (2022). Encouraging
behavior changes and preventing cardiovascular diseases using the Prevent Connect mobile health
app: Conception and evaluation of app quality. Journal of Medical Internet Research, 24(1), e25384.
https://www.jmir.org/2022/1/e25384
Krist, A.H., Davidson, K.W., Mangione, C.M., Barry, M.J., Cabana, M., Caughey, A.B., Donahue, K.,
Doubeni, C.A., Epling, J.W., Kubik, M., et al. (2020). Behavioral counseling interventions to promote a
healthy diet and physical activity for cardiovascular disease prevention in adults with cardiovascular
risk factors: US Preventive Services Task Force recommendation statement. JAMA, 324(20), 2069–2075.
https://www.scopus.com/home.uri