Predicting Heart Disease Through Machine Learning Methods
Abstract:- Heart diseases, including heart attacks, cause about 31% of global deaths, remaining a significant health threat despite preventability. Limited tech advancements and awareness, especially in developing nations, amplify this challenge. Machine learning offers promise in tackling this issue, with studies advocating ensemble methods for accurate predictive models. These models analyze extensive medical data to efficiently predict heart diseases, undergoing stages like data exploration, feature selection, model implementation, and comparative analysis. A model using Logistic Regression, Naive Bayes, and Random Forest initially identified top-performing models, later refined to CatBoost, RandomForest, and XGBoost through cross-validation and tuning. A hybrid model, combining Logistic Regression, CatBoost, and RandomForest, achieved a 97% accuracy, showcasing improved precision, recall, F1 score, and ROC AUC. This underscores machine learning's potential in enhancing predictive accuracy and refining strategies to combat heart diseases effectively.

Keywords:- Logistic Regression (LR), K-Nearest Neighbors (KNN), RandomForest (RF), CatBoost (CB), XGBoost (XB), Stochastic Gradient Descent (SGD), Cross-Validation (CV), Support Vector Machine (SVM), Hyperparameter Tuning (HT) and Voting Classifier (VC).

I. INTRODUCTION

In today's fast-paced world, the emphasis on self-care often gets overshadowed by the demands of daily life, leading to heightened stress levels and neglect of one's health. Even with the progress that medicine has made, diseases like cancer, heart disease, and tuberculosis still take a lot of lives each year. Globally, cardiovascular disease (CVD) is now the leading cause of death, accounting for about 31% of all deaths, according to the World Health Organisation (WHO). Over the span of 15 years, WHO reported an alarming 15.2 million deaths attributed to heart-related diseases, underscoring the persistent threat posed by these conditions. Notably, heart-related ailments inflicted a significant economic toll, amounting to around $237 billion in India alone between 2005 and 2015.

The heart, as a vital organ responsible for blood circulation, plays a crucial role in supplying oxygen and nutrients throughout the body. Any dysfunction in this essential organ severely impacts the functionality of other bodily organs, presenting a formidable challenge. Unhealthy dietary habits and the rapid pace of modern lifestyles contribute substantially to the heightened risk of heart-related diseases.

Leveraging machine learning and deep learning techniques to analyse diverse patient data within the medical field offers a promising avenue for assessing risks, identifying symptoms, and predicting heart-related diseases. Factors such as diabetes, smoking, excessive alcohol consumption, high cholesterol, high blood pressure, and obesity significantly elevate the risk of heart issues. Despite efforts to manage these factors, heart diseases can manifest regardless of gender or age.

The purpose of this project is to use machine learning and deep learning techniques to predict heart disease by doing a thorough examination of these risk variables. These kinds of predictive powers could transform healthcare and improve people's lives. Furthermore, the research delves into a range of heart disorders, such as Cardiomyopathy, Congenital Heart Disease, Heart Failure, and coronary artery disease, each with unique traits and implications for the cardiovascular system.

In the initial phase of the study, the dataset was loaded, and multiple machine learning algorithms were employed, including SGD, NB, RF, CB, XB, KNN, LR, and SVM. Performance metrics such as precision, recall, accuracy, F1 score, and ROC AUC were computed, identifying the top-performing models before hyperparameter adjustment.

Subsequently, cross-validation and hyperparameter tuning were performed, leading to the identification of another set of top-performing models with enhanced predictive capabilities. Most models exhibited noticeable improvements across various criteria following hyperparameter adjustment, particularly in precision, recall, accuracy, and F1 score.

Finally, a hybrid model combining LR, CB, and RF was developed using a voting classifier. This model demonstrated remarkable predictive performance, achieving high accuracy and impressive precision and recall scores. The balanced F1 score and outstanding ROC AUC further underscored the model's overall performance.

This comprehensive approach utilizing machine learning techniques highlights the potential to accurately predict heart disease, marking significant progress in early identification
and intervention against cardiovascular ailments. The integration of various algorithms and methodologies signifies the potential for impactful advancements in healthcare and enhanced patient outcomes.

II. LITERATURE SURVEY

In the paper "Heart disease identification from patients' social posts, machine learning solution on spark" by H. Ahmed, E.M.G. Younis, A. Hendawi, and A.A. Ali, Apache Spark and Apache Kafka are utilized alongside machine learning methods such as Decision Tree, Support Vector Machine, RF Classifier, and LR Classifier to create a real-time system for predicting heart disease from medical data streams. The methodology includes feature selection algorithms, machine learning algorithms, hyperparameter tuning, and cross-validation. However, limitations exist in terms of sample size, data quality, and generalizability to other populations [1].

S. Matin Malakouti's paper, "Heart disease classification based on ECG using machine learning models," explores the automated categorization of Electrocardiography (ECG) data using Gaussian NB, RF, LR, and Linear Discriminant Analysis. The study discusses the advantages and disadvantages of these methods, emphasizing the use of 10-fold cross-validation to reduce prediction variance and avoid biased assessment. However, the study's limitation lies in the challenges of accurately distinguishing between healthy and sick individuals using machine learning and deep learning methods [2].

Md Mamun Ali et al.'s paper, "Heart disease prediction using supervised machine learning algorithms: Performance analysis and comparison," investigates various machine learning classifiers for heart disease prediction. While the RF method achieved 100% accuracy, sensitivity, and specificity on a specific dataset, the study's reliance on a single dataset raises concerns about generalizability to other datasets [3].

L. Sharan Monica et al.'s paper, "Latest trends on heart disease prediction using machine learning and image fusion," aims to develop a program for reliable and instant disease diagnosis. The methodology involves exploratory data analysis, attribute selection, and the use of machine learning methods such as NB, decision trees, SVM, and artificial neural networks. Similar to previous studies, the reliance on a single dataset limits the generalizability of the findings [4].

Ivan Miguel Pires et al.'s paper, "Machine learning for the evaluation of the presence of heart disease," explores different machine learning techniques for detecting cardiac illness. Despite achieving high accuracy using Decision Tree and Support Vector Machine approaches, the paper lacks a detailed description of feature extraction, selection, and model training methods [5].

Jinny, S. V., & Mate, Y. V.'s paper, "Early prediction model for coronary heart disease using genetic algorithms, hyper-parameter optimization, and machine learning techniques," aims to identify heart diseases using machine learning methods and heart rate features. While the study uses advanced techniques such as genetic algorithms and hyper-parameter optimization, the absence of a thorough explanation of feature selection procedures is noted [6].

Katarya, R., & Meena, S. K.'s literature review paper explores the use of machine learning and deep learning techniques for heart disease analysis. The systematic review of existing literature aims to guide future research in the healthcare industry. However, the paper's reliance on secondary sources and lack of original research may limit its contributions [7].

Abeer Alsadoon's paper compares the accuracy of different machine learning models for heart disease prediction. While the study recommends specific models for classification, the lack of detail regarding feature selection is identified as a drawback [8].

Katarya, R., & Meena, S. K.'s paper delves into the application of machine learning for heart disease prediction, emphasizing the increasing prevalence of heart disease and the need for efficient data analysis in the medical sector. The study reviews various risk factors and employs algorithms such as LR, K-Nearest Neighbor, Support Vector Machine, Naïve Bayes, and Decision Trees for prediction and classification [9].

Finally, the paper by Naseri, A., Tax, D., van der Harst, P., Reinders, M., & van der Bilt explores the use of machine learning methods to detect atrial fibrillation and heart failure from wearable devices. While the study presents innovative methods for cardiovascular outcome prediction, data privacy concerns and limited sample size may impact the generalizability of the findings [11].

III. PROPOSED METHODOLOGY

Before implementing cross-validation and hyperparameter tuning, LR was the leading model, exhibiting commendable accuracy. However, following cross-validation and hyperparameter tuning, CB emerged as the top-performing model, showcasing superior accuracy. Throughout these processes, the RF algorithm consistently demonstrated strong performance both before and after tuning.

Given the robust performances of LR, CB, and RF individually, a hybrid model was crafted using a voting classifier, leveraging the strengths of these three algorithms.

The code demonstrates the creation and evaluation of a VC ensemble, amalgamating three distinct algorithms: RF Classifier, CB Classifier, and Logistic Regression. The 'voting' parameter is set to 'soft', indicating that the final prediction is determined by the weighted average probability of each classifier.

After training the VC on the given dataset, it's evaluated using various metrics. The achieved performance metrics are impressive: an accuracy of 97%, with a precision of 99%, recall of 95%, and an F1 score of 97%. Additionally, the Receiver Operating Characteristic (ROC) curve showcases an
Area Under the Curve (AUC) of 99.85%, signifying exceptional model discrimination ability across different thresholds.

The 'soft' voting method considers the probabilities predicted by each model, weighing them and making predictions based on these weighted probabilities. This tends to offer more nuanced decisions by taking into account the confidence levels of individual models. In contrast, 'hard' voting considers only the class labels predicted by each model and selects the majority class as the final prediction. The 'soft' approach can often lead to improved performance when models are well-calibrated and have reliable probability estimates.

IV. SYSTEM DESIGN & IMPLEMENTATION

To create a reliable predictive model for heart disease, the suggested methodology includes sophisticated machine learning algorithms, deliberate data preprocessing, and model validation. The steps in the methodology are as follows:

A. Data Collection and Preprocessing

Data Sourcing:
Obtaining a comprehensive dataset involves sourcing diverse patient information from various sources, including hospitals, research databases, or healthcare institutions. This dataset should encompass:

Demographics: Age, gender, ethnicity, etc.
Medical History: Pre-existing conditions (diabetes, hypertension), medication history.
Vital Signs: Blood pressure, heart rate, BMI.
Lab Results: Cholesterol levels, blood glucose, etc.

Data Cleaning:
Cleaning the dataset is essential to ensure data quality and consistency:

Handling Missing Values: Address missing values through imputation (mean, median, mode) or deletion based on the extent of missingness.
Outlier Treatment: Identify and handle outliers using statistical methods (e.g., Z-score, IQR) to prevent skewing of results.
Normalization/Standardization: To improve model performance and convergence, normalize or standardize numerical features to bring them to a common scale.

Data Split:
Partitioning the dataset into test, validation, and training sets is essential for building and assessing models:

Training Set: The predictive model is trained using the training set.
Validation Set: Used to evaluate model performance during training and adjust hyperparameters.
Test Set: Used to assess the performance of the finished model on unobserved data.

B. Exploratory Data Analysis (EDA):

Descriptive Analysis:
Understand the dataset's characteristics, distributions, and statistical summaries:

Central Tendency: Mean, median, mode of features.
Dispersion: Standard deviation, range, interquartile range (IQR).
Correlation Analysis: Identify relationships between variables (e.g., correlation matrix) to understand feature importance.

Visualization:
Utilize visual tools to gain deeper insights and identify potential patterns related to heart disease:

Histograms: Display distributions of numerical variables.
Heatmaps: Visualize correlations between features.
Scatter Plots: Explore relationships between two numerical variables.
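The preprocessing, splitting, and EDA steps described above can be outlined in code. The following is only a minimal sketch under stated assumptions: the file name ("heart.csv"), the label column name ("target"), and the particular choices of median imputation, IQR clipping, standardization, and an 80/20 split are illustrative, since the paper does not report the exact feature names or preprocessing parameters.

    # Minimal preprocessing and EDA sketch (file, column names, and parameter choices are assumed)
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    df = pd.read_csv("heart.csv")  # hypothetical file name

    # Handling missing values: impute numeric columns with the median
    df = df.fillna(df.median(numeric_only=True))

    # Outlier treatment: clip values outside 1.5 * IQR for each numeric column
    numeric_cols = df.select_dtypes(include="number").columns
    q1, q3 = df[numeric_cols].quantile(0.25), df[numeric_cols].quantile(0.75)
    iqr = q3 - q1
    df[numeric_cols] = df[numeric_cols].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr, axis=1)

    # Descriptive analysis: central tendency, dispersion, and correlations
    print(df.describe())
    corr = df.corr(numeric_only=True)

    # Visualization: histograms of numerical variables and a correlation heatmap
    df.hist(figsize=(12, 8))
    plt.matshow(corr)
    plt.colorbar()
    plt.show()

    # Data split and standardization ("target" is the assumed label column)
    X, y = df.drop(columns=["target"]), df["target"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

The resulting X_train, X_test, y_train, and y_test splits are reused by the later sketches in this paper.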
prone to categorization issues. The LR formula for establishing the probability that input X belongs in class 1 can be expressed as:

P(y = 1 | X) = 1 / (1 + e^(-(wX + b)))

Here b is the bias and w is the weight that is multiplied by input X.

K-Nearest Neighbors (KNN):
A flexible supervised machine learning technique for regression and classification applications is the KNN algorithm. It functions according to the similarity principle, which states that the majority class of a sample's KNN in the feature space determines its class. To ascertain the KNN for a new data point, KNN calculates the Euclidean distance between the new point and each point in the training set. A majority vote among these neighbours then determines the class of the new point. In regression tasks, KNN uses a weighted average or an average of the target values of its KNN to forecast the value of the incoming data point. In an n-dimensional feature space, the Euclidean distance between two points p and q is determined using the subsequent formula:

d(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2)

CatBoost (CB):
CB also incorporates robust handling of missing data and provides excellent accuracy by default, requiring minimal hyperparameter tuning, making it an efficient and user-friendly choice for predictive modeling tasks, especially in scenarios with complex datasets containing categorical features.

XGBoost (XB):
XB, an abbreviation for extreme Gradient Boosting, stands out as a highly efficient and accurate ensemble learning technique tailored for structured or tabular data. Belonging to the gradient boosting family, it constructs models sequentially, addressing the shortcomings of its predecessors. By integrating weak learners, typically decision trees, XB mitigates loss through the optimization of a predefined objective function. Its methodology entails a gradient descent algorithm, which computes gradients for updating model parameters. This algorithm aims to minimize a regularized objective, comprising both a loss function and a penalty term, thereby preventing overfitting and enhancing generalization. The final prediction of the XB model results from a weighted aggregation of predictions generated by individual trees within the ensemble. The objective function of XB incorporates a loss function (L) for error measurement and a regularization term (Ω) to manage model complexity, formulated as:

Objective = L(predictions, targets) + Ω(complexity)
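For reference, the classifiers discussed in this section can be instantiated as sketched below. The hyperparameter values shown are placeholders only; the paper does not list the pre-tuning settings used, so everything beyond the class names (from scikit-learn, xgboost, and catboost) is an illustrative assumption.

    # Illustrative instantiation of the models described above (hyperparameters are placeholders)
    from sklearn.linear_model import LogisticRegression, SGDClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.svm import SVC
    from sklearn.ensemble import RandomForestClassifier
    from xgboost import XGBClassifier
    from catboost import CatBoostClassifier

    models = {
        "LR": LogisticRegression(max_iter=1000),     # sigmoid of wX + b
        "KNN": KNeighborsClassifier(n_neighbors=5),  # Euclidean distance by default
        "NB": GaussianNB(),
        "SGD": SGDClassifier(),
        "SVM": SVC(probability=True),                # probabilities needed for ROC AUC
        "RF": RandomForestClassifier(n_estimators=100),
        "XB": XGBClassifier(eval_metric="logloss"),  # regularized boosting objective
        "CB": CatBoostClassifier(verbose=0),         # handles categorical features natively
    }

This models dictionary is the form assumed by the evaluation loop sketched later in Section IV.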
Model Development: Implement the selected algorithms using appropriate libraries (e.g., scikit-learn) to create predictive models.
Training: Train each model on the training dataset using appropriate parameters.

E. Model Assessment:
A crucial first step in evaluating the efficacy, precision, and resilience of machine learning models for heart disease prediction is model evaluation. A few crucial elements of model evaluation are as follows:

Accuracy:
Formula: (TP + TN) / (TP + TN + FP + FN)
Measures the proportion of correct predictions out of the total predictions made. In heart disease prediction, it reflects the overall correctness of identifying both healthy individuals and those with heart disease.

Precision:
Formula: TP / (TP + FP)
Indicates the accuracy of positive predictions. In heart disease prediction, it measures the proportion of correctly identified individuals with heart disease among all predicted positive cases. High precision means fewer false positives, reducing unnecessary interventions or treatments for individuals who are actually healthy.

Recall (Sensitivity):
Formula: TP / (TP + FN)
Evaluates how well the model can accurately recognise every positive case. When it comes to heart disease prediction, it measures the percentage of accurately diagnosed heart disease patients among all true positive cases. A high recall rate indicates that the model is successful in identifying heart disease patients, lowering the possibility of overlooking those who need medical attention.

Area Under Curve - Receiver Operating Characteristic (ROC-AUC):
The ROC Curve is a plot of True Positive Rate (Sensitivity) against False Positive Rate (1 - Specificity). How successfully the model can distinguish between the two groups (heart disease vs. no heart disease) is shown by the area under the ROC curve (AUC). A greater AUC in heart disease prediction indicates improved ability to distinguish between those with and without heart disease.

The code initializes an empty dictionary model_scores1 to store evaluation metrics for various machine learning models. After that, iterating through a dictionary of models, each model is assessed using X_test and y_test data after being trained using X_train and y_train data. Using the appropriate functions from scikit-learn, it computes evaluation metrics for each model, including precision, recall, accuracy, F1 score, and ROC AUC. These metrics are then appended to the model_scores1 dictionary along with the model's name. This process creates a structured collection of evaluation scores for each model, allowing easy comparison of their performance.

The code employs Python libraries like matplotlib, pandas, and scikit-learn to visualize and analyze the performance metrics of multiple machine learning models for classification tasks. Initially, it imports the necessary modules for plotting, data manipulation, and model evaluation. Assuming the existence of a populated DataFrame model_scores1 containing model performance metrics (precision, recall, accuracy, F1 score, ROC AUC) for various models, it converts this data into a pandas DataFrame scores_df. The subsequent section uses matplotlib to create a 2x3 subplot grid, plotting bar graphs for each metric (precision, recall, accuracy, F1 score, ROC AUC) against different model names on separate subplots, enabling visual comparison of model performances. It then identifies and prints the top-performing models based on each metric and displays their individual performance metrics like precision, recall, accuracy, F1 score, and ROC AUC. The code concludes by summarizing the overall analysis of the top models' performances, aiming to provide insights into the most effective models for the classification task at hand. This layout enables a comprehensive analysis and comparison of multiple models' performances, aiding in model selection and decision-making based on key evaluation metrics.
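A condensed sketch of the evaluation loop and the metric comparison described above is given below. It assumes the models dictionary and the X_train/X_test/y_train/y_test split from the earlier sketches; the exact structure and plotting layout of the original code may differ.

    # Sketch of the per-model evaluation loop and metric comparison (layout is illustrative)
    # `models`, X_train, X_test, y_train, y_test are assumed from the earlier sketches
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                                 f1_score, roc_auc_score)

    model_scores1 = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        # Probability of the positive class for ROC AUC (decision scores as a fallback)
        if hasattr(model, "predict_proba"):
            y_score = model.predict_proba(X_test)[:, 1]
        else:
            y_score = model.decision_function(X_test)
        model_scores1[name] = {
            "Accuracy": accuracy_score(y_test, y_pred),
            "Precision": precision_score(y_test, y_pred),
            "Recall": recall_score(y_test, y_pred),
            "F1 Score": f1_score(y_test, y_pred),
            "ROC AUC": roc_auc_score(y_test, y_score),
        }

    # Convert to a DataFrame and plot one bar chart per metric in a 2x3 grid
    scores_df = pd.DataFrame(model_scores1).T
    fig, axes = plt.subplots(2, 3, figsize=(15, 8))
    for ax, metric in zip(axes.flat, scores_df.columns):
        scores_df[metric].plot(kind="bar", ax=ax, title=metric)
    plt.tight_layout()
    plt.show()

    # Report the top-performing model for each metric
    print(scores_df.idxmax())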
G. Re-Model Evaluation after Cross-Validation and Hyperparameter Tuning
We evaluate the model performance on the test set using various metrics like precision, recall, accuracy, F1 score, and ROC AUC again after cross-validation and hyperparameter tuning. The reasons for performing model evaluation again after cross-validation are as follows:

Performance Evaluation on Test Set:
The initial assessment performed before any tuning provides a baseline. However, after cross-validation and hyperparameter tuning, the model might have changed significantly. Hence, it's essential to evaluate the tuned models on an unseen dataset (the test set) to get a realistic estimate of how well the models generalize to new, unseen data.

Comparison with Initial Results:
Comparing the performance metrics before and after tuning helps gauge the improvement achieved through hyperparameter tuning. It allows us to verify whether the changes made to the model indeed enhance its predictive capabilities.

Selecting the Best Model:
Post-tuning, this evaluation helps identify the top-performing models based on their performance on the unseen test data. It ensures that the best-performing model is selected for deployment or further consideration.

Providing Final Conclusions:
This evaluation assists in summarizing the outcomes of the entire process, emphasizing the improvements achieved through tuning and aiding in decision-making for model selection or next steps.

Therefore, re-evaluating the model on the test set post-tuning is a vital step to ensure an accurate understanding of the model's performance and to make informed decisions about which model(s) to proceed with.
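The tuning-then-re-evaluation workflow described above can be sketched as follows. The paper does not state which search routine or parameter grids were used, so GridSearchCV and the RF grid shown here are assumptions chosen purely for illustration; the key point is that cross-validation runs on the training data only, while the final check uses the held-out test set.

    # Illustrative cross-validated tuning followed by re-evaluation on the held-out test set
    # X_train, X_test, y_train, y_test are assumed from the earlier preprocessing sketch
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV
    from sklearn.metrics import accuracy_score, f1_score

    param_grid = {"n_estimators": [100, 200, 500], "max_depth": [None, 5, 10]}  # assumed grid
    search = GridSearchCV(RandomForestClassifier(random_state=42),
                          param_grid, cv=5, scoring="f1")
    search.fit(X_train, y_train)      # cross-validation is performed on the training data only

    best_rf = search.best_estimator_
    y_pred = best_rf.predict(X_test)  # the unseen test set gives the realistic post-tuning estimate
    print("Best params:", search.best_params_)
    print("Test accuracy:", accuracy_score(y_test, y_pred))
    print("Test F1 score:", f1_score(y_test, y_pred))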
H. Comparative Analysis and Evaluation:

Before Tuning:
The initial assessment of the models revealed varied performances. Models like LR and NB displayed commendable precision, recall, accuracy, and F1 Score, whereas KNN exhibited relatively lower performance in comparison. Notably, models such as SVM and SGD depicted suboptimal performance, evident from lower accuracy, recall, and F1 Score.

After Tuning (Cross-Validation & Hyperparameter Tuning):
Following cross-validation and hyperparameter tuning, there was a marked improvement across most models. LR showed significant enhancements in precision and F1 Score, indicating better predictive capabilities after tuning. Post-tuning, RF showed notable gains in ROC AUC, accuracy, and precision, indicating its increased robustness as a classifier. CB and XB retained their high-performance levels across multiple metrics, solidifying their positions as top-performing models even after tuning.

Comparative Analysis and Evaluation:
The tuning process had a considerable impact on refining the models' predictive abilities. Models that initially had weaker performance, like KNN, exhibited notable improvements in accuracy, precision, and F1 Score. Furthermore, the SVM and SGD models, which initially performed poorly, showed noticeable enhancements in multiple metrics post-tuning.

Best Model Selection:
The evaluation highlights that CB, after tuning, emerged as the most consistent and robust performer. It displayed noteworthy improvements in precision, recall, accuracy, F1 Score, and ROC AUC. These enhancements position CB as the top-performing model among the others, showcasing its suitability for this specific dataset and problem context.
I. Development of Hybrid Model

Introduction
The hybrid model we've constructed amalgamates the predictive strengths of three key models: LR, deemed the best performer before cross-validation and hyperparameter tuning; CB, identified as the superior model after this tuning; and RF, demonstrating consistent competence both before and after the tuning process. The inclusion of LR, CB, and RF within this hybrid framework leverages the varied strengths and diverse learning methodologies of these models. LR, recognized for its interpretability and simplicity, acts as a strong baseline, whereas CB, with its advanced boosting technique and optimized parameters post-tuning, bolsters predictive accuracy. Additionally, RF, exhibiting commendable performance across different scenarios, contributes to the ensemble's robustness and adaptability. This hybridization strategy aims to capture the collective prowess of these models, potentially enhancing predictive accuracy and resilience across a wide spectrum of datasets and real-world scenarios.

Hybrid Model Implementation
The code demonstrates the creation of a hybrid model using the VotingClassifier (VC) ensemble from Scikit-Learn. The objective is to merge the predictive capabilities of three distinct machine learning algorithms: RF Classifier, CB Classifier, and LR. Initially, individual instances of these classifiers are initialized with specific hyperparameters: an RF Classifier with 100 estimators, a CB Classifier with 100 iterations, and an LR instance. These models are integrated into a VC, which acts as a meta-estimator, combining the predictions of its constituent models. The 'soft' voting scheme, employed in this instance, weighs predictions based on the probabilities assigned by each model, ultimately enhancing the ensemble's predictive accuracy.

Subsequently, the VC is trained on the given training dataset (X_train and y_train) via the fit() function, which allows the ensemble to learn from the provided data. Through this process, the hybrid model learns to make predictions by aggregating the outputs of the individual classifiers, harnessing the diverse strengths and approaches of each model. Upon training completion, the voting_classifier instance is equipped to predict on new, unseen data, capitalizing on the collective intelligence derived from the constituent classifiers to potentially improve overall predictive performance.

Hybrid Model Performance
Evaluating the performance of the trained hybrid VotingClassifier model involves using various assessment metrics and visual aids. Following the training of the 'voting_classifier' on the dataset, the model undergoes testing using the test set (X_test) to generate predictions ('y_pred') for the target values. Evaluation metrics such as precision, recall, accuracy, and F1 score are calculated to assess the model's predictive accuracy, indicating its capability to accurately classify instances from the test data. Furthermore, the Receiver Operating Characteristic (ROC) curve and its associated Area Under the Curve (AUC) metric are determined, offering insights into the model's balance between true positive rate and false positive rate across varying threshold values. The resulting ROC curve visualization illustrates the model's discriminative performance, highlighting its ability to distinguish between classes effectively. This thorough assessment aids in comprehending the model's strengths and weaknesses, facilitating the interpretation and evaluation of its predictive abilities.
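A condensed sketch of the hybrid VotingClassifier construction, training, and evaluation described above is shown below. The estimator settings (100 trees, 100 CatBoost iterations, soft voting) follow the text; the remaining details (variable names, additional constructor arguments, plotting specifics) are illustrative assumptions rather than the paper's exact code.

    # Hybrid soft-voting ensemble of RF, CatBoost, and LR, as described in the text
    # X_train, X_test, y_train, y_test are assumed from the earlier preprocessing sketch
    import matplotlib.pyplot as plt
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                                 f1_score, roc_auc_score, roc_curve)
    from catboost import CatBoostClassifier

    voting_classifier = VotingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
            ("cb", CatBoostClassifier(iterations=100, verbose=0)),
            ("lr", LogisticRegression(max_iter=1000)),
        ],
        voting="soft",  # average the predicted class probabilities
    )
    voting_classifier.fit(X_train, y_train)

    y_pred = voting_classifier.predict(X_test)
    y_prob = voting_classifier.predict_proba(X_test)[:, 1]

    print("Accuracy:", accuracy_score(y_test, y_pred))
    print("Precision:", precision_score(y_test, y_pred))
    print("Recall:", recall_score(y_test, y_pred))
    print("F1 score:", f1_score(y_test, y_pred))
    print("ROC AUC:", roc_auc_score(y_test, y_prob))

    # ROC curve for the hybrid model
    fpr, tpr, _ = roc_curve(y_test, y_prob)
    plt.plot(fpr, tpr, label="Hybrid (LR + CB + RF)")
    plt.plot([0, 1], [0, 1], linestyle="--")
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.legend()
    plt.show()

Soft voting is used here because all three constituent classifiers expose calibrated-style class probabilities, which is the condition under which the text notes that soft voting tends to outperform hard voting.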
After undergoing cross-validation and hyperparameter tuning, a refined set of models emerged.

Metrics for Models after Tuning:
After hyperparameter tuning, there was a noticeable improvement in the performance of most models across different metrics. Models like RF, CB, XB, and LR improved their scores across metrics like precision, recall, accuracy, and F1 score, indicating better-tuned parameters and enhanced predictive capabilities.
Table 3: Performance Metrics for Hybrid Model

The development of a blended hybrid model, which merges LR, CB, and RF through a voting classifier, yielded outstanding predictive capabilities.

Hyperparameter tuning significantly improved the performance of the individual models, enhancing their predictive capabilities.

The hybrid model, leveraging the strengths of LR, CB, and RF, emerged as a powerful ensemble, showcasing exceptional performance across multiple evaluation metrics and indicating its potential for robust predictions on the heart disease dataset.

FUTURE WORK

Future advancements may include integrating hyperparameter optimization with emerging technologies like reinforcement learning, boosting adaptability. Additionally, the model's methodologies might evolve to address multi-objective optimization, considering factors like interpretability, fairness, and robustness in model optimization.

REFERENCES

[9]. Naseri, A., Tax, D., van der Harst, P., Reinders, M., & van der Bilt, I. (2023). Data-efficient machine learning methods in the ME-TIME study: Rationale and design of a longitudinal study to detect atrial fibrillation and heart failure from wearables. Cardiovascular Digital Health Journal.
[10]. Pires, I. M., Marques, G., Garcia, N. M., & Ponciano, V. (2020). Machine learning for the evaluation of the presence of heart disease. Procedia Computer Science, 177, 432-437.
[11]. Rimal, Y., Paudel, S., Sharma, N., & Alsadoon, A. (2023). Machine learning model matters its accuracy: a comparative study of ensemble learning and AutoML using heart disease prediction. Multimedia Tools and Applications, 1-18.