
Exploring the Applications of Machine Learning in Healthcare


Article · December 2019


DOI: 10.2174/2210327910666191220103417


2 authors:

Tausifa Jan Saleem Mohammad Ahsan Chishti


Indian Institute of Technology Delhi National Institute of Technology Srinagar

International Journal of Sensors, Wireless Communications and Control, 2020, 10, 458-472
REVIEW ARTICLE
ISSN: 2210-3279
eISSN: 2210-3287

Exploring the Applications of Machine Learning in Healthcare


BENTHAM
SCIENCE

Tausifa Jan Saleem1,* and Mohammad Ahsan Chishti2

1Department of Computer Science and Engineering, National Institute of Technology Srinagar, Jammu and Kashmir, India; 2Department of Information Technology, School of Engineering & Technology, Central University of Kashmir Ganderbal, Jammu and Kashmir, India
Abstract: The rapid progress in domains like machine learning and big data has created plenty of opportunities in data-driven applications, particularly healthcare. Incorporating machine intelligence in healthcare can result in breakthroughs like precise disease diagnosis, novel methods of treatment, remote healthcare monitoring, drug discovery, and curtailment in healthcare costs. The implementation of machine intelligence algorithms on massive healthcare datasets is computationally expensive. However, consequential progress in computational power during recent years has facilitated the deployment of machine intelligence algorithms in healthcare applications. Motivated to explore these applications, this paper presents a review of research works dedicated to the implementation of machine learning on healthcare datasets. The studies that were conducted have been categorized into the following groups: (a) disease diagnosis and detection, (b) disease risk prediction, (c) health monitoring, (d) healthcare-related discoveries, and (e) epidemic outbreak prediction. The objective of the research is to help the researchers in this field to get a comprehensive overview of the machine learning applications in healthcare. Apart from revealing the potential of machine learning in healthcare, this paper will serve as a motivation to foster advanced research in the domain of machine intelligence-driven healthcare.

ARTICLE HISTORY
Received: January 30, 2019
Revised: April 08, 2019
Accepted: July 12, 2019
DOI: 10.2174/2210327910666191220103417

Keywords: Machine learning, big data, healthcare, health monitoring, disease diagnosis, disease risk prediction.

1. INTRODUCTION

According to the McKinsey Global Institute, the healthcare sector in the US can save up to 300 billion dollars yearly by analyzing the massive quantity of medical data [1]. Although healthcare is being revolutionized by the voluminous magnitude of data about the patients, analyzing such a gigantic volume of data goes beyond the potential of humans. Machine learning offers a panacea to this problem by extracting the concealed patterns and hidden correlations from the data. It comprises a set of algorithms that learn statistical correlations from the data and perform prediction tasks for effective decision-making [2]. The implementation of machine intelligence algorithms on healthcare datasets is computationally expensive [3]. However, the rapid advancement in computational power in recent years has removed this inadequacy to a great extent [3].

Maintaining the details of every individual patient in the form of Electronic Health Records (EHRs) and the rapid progress in machine intelligence are surely going to be the key factors for a massive transformation of healthcare in the near future. An EHR contains information like the medical history of the patient, disease diagnostic results, treatment information, allergies the patient is prone to, medication details, etc. [4].

The rationale for this paper stems from the reality that the integration of machine learning and healthcare has resulted in diverse applications, including identifying the health complications a patient is suffering from, forecasting the risk of the disease a patient is likely to develop in future, examining and recuperating the personal health of an individual, optimization of the clinical decision-making process, early prediction of epidemic outbreaks, etc. A number of research works that implement machine learning on healthcare datasets have been carried out with good results. This paper presents a comprehensive review of research studies devoted to machine intelligence-driven healthcare applications.

The paper is structured as follows: Section 2 presents a detailed elucidation of machine learning techniques. Moreover, a list of performance metrics is provided in this section. Section 3 provides a review of machine learning applications in healthcare. A tabular summary of the explored research efforts is also presented in this section. Furthermore, a list of several publicly available healthcare datasets is provided in this section. Section 4 presents the concluding remarks.

*Address correspondence to this author at the Department of Computer Science and Engineering, National Institute of Technology Srinagar, Jammu and Kashmir, India; E-mail: [email protected]

2210-3287/20 $65.00+.00 © 2020 Bentham Science Publishers


Exploring the Applications of Machine Learning International Journal of Sensors, Wireless Communications and Control, 2020, Vol. 10, No. 4 459

[Figure: block diagram with the boxes Labelled Data, Machine Learning Technique, Classification/Prediction Model, Unlabelled Data, and Predicted Output.]

Fig. (1). A fundamental approach for machine learning.


Table 1. Performance measures of machine learning techniques.

Performance Measure: Accuracy
Definition: Accuracy is defined as the percentage of correct predictions (equation 1). It counts the data items of the positive class that are predicted as positive (True Positive) and the data items of the negative class that are predicted as negative (True Negative), and excludes positive class data items predicted as negative (False Negative) and negative class data items predicted as positive (False Positive).
Formula:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$
Where TP = True Positive, TN = True Negative, FP = False Positive, FN = False Negative.

Performance Measure: Specificity
Definition: Specificity represents the fraction of negative class data instances predicted as negative by the machine-learning model (True Negative) out of the sum of True Negative and False Positive data instances (equation 2).
Formula:
$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (2)$$

Performance Measure: Precision
Definition: Precision represents the fraction of positive class data instances predicted as positive by the machine-learning model (True Positive) out of the sum of True Positive and False Positive data instances (equation 3).
Formula:
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (3)$$

Performance Measure: Recall (Sensitivity)
Definition: Recall represents the fraction of positive class data instances predicted as positive by the machine-learning model (True Positive) out of the sum of True Positive and False Negative data instances (equation 4).
Formula:
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (4)$$

Performance Measure: Mean Absolute Error (MAE)
Definition: MAE is the average of the absolute differences between the actual values (Z) and the predicted values (Z´) (equation 5).
Formula:
$$\text{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| Z_i - Z'_i \right| \qquad (5)$$
Where N is the total number of data instances.

Performance Measure: Mean Absolute Percentage Error (MAPE)
Definition: Mean absolute percentage error is the average percentage of the absolute relative error $\left| (Z - Z') / Z \right|$ (equation 6).
Formula:
$$\text{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{Z_i - Z'_i}{Z_i} \right| \qquad (6)$$

Performance Measure: Root Mean Square Error (RMSE)
Definition: Root Mean Square Error is the square root of the mean squared difference between target values and predicted values (equation 7).
Formula:
$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( Z_i - Z'_i \right)^2} \qquad (7)$$

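The measures listed in Table 1 can be computed directly from the four confusion-matrix counts and from paired actual and predicted values. The following is a minimal sketch under stated assumptions: the function names `classification_metrics` and `regression_errors` are hypothetical, the counts and values are invented for illustration, and MAPE is expressed as a percentage.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, specificity, precision and recall from the counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


def regression_errors(actual, predicted):
    """Return MAE, MAPE (in percent) and RMSE for paired value lists."""
    n = len(actual)
    abs_err = [abs(z - zp) for z, zp in zip(actual, predicted)]
    mae = sum(abs_err) / n
    # MAPE is undefined when an actual value is zero; this sketch assumes none are.
    mape = 100.0 * sum(e / abs(z) for e, z in zip(abs_err, actual)) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    return {"mae": mae, "mape": mape, "rmse": rmse}


# Hypothetical counts and values, chosen only for illustration.
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
e = regression_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(m)
print(e)
```

With these invented counts, accuracy and recall differ noticeably (0.85 against 0.80), which is one reason several measures are usually reported together rather than accuracy alone.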
2. MACHINE LEARNING

The objective of machine learning is to learn from experience and to produce a model that can perform prediction tasks for effective decision making [5]. Fig. (1) illustrates the fundamental machine learning approach. Based on the learning model, machine learning approaches are classified into the following categories: classification, clustering, association rule mining, and dimensionality reduction. To assess the proficiency of machine learning techniques, various performance measures like accuracy, precision, recall, etc. are used. A list of these performance measures is given in Table 1. Equations 1-7 present the mathematical formulas for these performance measures.

2.1. Machine Learning for Classification

The objective of classification is to develop a classifier that learns the distribution of patterns in the set of labeled data. For that purpose, a number of techniques including linear regression, k-nearest neighbor, decision tree, random forest, naive Bayes, support vector machine, artificial neural network, etc. have been designed [6]. The following provides a description of these techniques.

2.1.1. Linear Regression

Regression is a supervised learning algorithm that is used to forecast a real-valued output from the correlations learned from the training data. Linear regression presumes a linear correlation between the input predictors (x) and the target output (y) [7]. The simple linear regression technique, in which there is a single input variable and a single output variable, is represented in terms of the following mathematical equation (8):

$$y = a + bx \qquad (8)$$

Where x is an independent input variable, y is a dependent output variable, and a & b are the coefficients.

The objective of the linear regression algorithm is to obtain optimal values for the coefficients a & b. This is realized by representing this problem as an error minimization problem in which the variation between the forecasted output and actual output is minimized by varying the values of a & b till the minimal value of the error is obtained (equation 9). The values of a & b at this minimal point of error are chosen as optimal values.

$$\min_{a,\,b} \; \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2 \qquad (9)$$

And this can be represented in terms of the following cost function (equation 10).

$$J = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2 \qquad (10)$$

Where $\hat{y}_i$ is the $i$-th predicted output, $y_i$ is the $i$-th actual output and $n$ represents the number of data items.

Hence, the objective is to minimize the cost function J. This is done with the help of an approach known as gradient descent. Gradient descent initializes a & b with some value and then changes their values iteratively to lessen the value of the cost function J. The optimal values of a & b are calculated with the help of the following update rules (equations 11 and 12).

$$a = a - \eta \, \frac{\partial J}{\partial a} \qquad (11)$$

$$b = b - \eta \, \frac{\partial J}{\partial b} \qquad (12)$$

Where η is the learning rate.

2.1.2. Decision Tree

The decision tree follows a greedy strategy to classify data items by arranging them based on attribute values. A decision tree is depicted with the help of a binary tree in which each node represents a feature of the data item to be classified and each edge depicts a value the node can take. The leaf nodes of the decision tree hold the output values [8]. Constructing a decision tree is basically a procedure of partitioning the input space by performing specific splits which are determined by a cost function. The attribute of the input space that best partitions the training data is chosen as the root node of the decision tree. In the case of regression modeling problems, the Sum of Squared Error is the cost function, and in the case of classification, the Gini index function is used as a cost function. The node with the minimum value of the cost function is chosen as a split point. The same procedure is repeated iteratively for each set of data until the training data is partitioned into subsets of the same class [9].

2.1.3. Random Forest

It is a supervised learning technique in which a myriad of decision trees are trained on different, randomly chosen subsets of the training set [10]. The forecasted label for a new data item is the mode of the outputs forecasted by the decision trees. However, in the case of regression, the mean of the outputs forecasted by the decision trees is taken as the output label.

2.1.4. Naive Bayes

It is a supervised learning technique for performing multi-class classification. It uses Bayes' theorem for determining the probability of a class given a data item. Bayes' theorem is given as,

$$P(C \mid z_1, \ldots, z_n) = \frac{P(z_1, \ldots, z_n \mid C)\, P(C)}{P(z_1, \ldots, z_n)} \qquad (13)$$

Where C denotes a class, $P(C \mid z_1, \ldots, z_n)$ is a posterior probability, $P(z_1, \ldots, z_n \mid C)$ is a likelihood, $P(C)$ is the prior probability and n denotes the number of features of data item Z.

In order to lower the complexity of the Bayesian classifier, a presumption of conditional independence is made over the training dataset [11], which is given as (equation 14):

$$P(z_1, \ldots, z_n \mid C) = \prod_{i=1}^{n} P(z_i \mid C) \qquad (14)$$

This algorithm proceeds by determining the posterior probability of all the classes given the data item using Bayes' theorem, and the data item is allocated to the class with the highest posterior probability [12]. The main advantage of the Naive Bayes classifier over other classifiers is the small training time and high scalability. Moreover, no optimization strategies are required. However, the presumption of conditional independence can reduce accuracy [9].

2.1.5. K-Nearest Neighbour (KNN)

KNN is a supervised learning technique in which outputs for new data instances are predicted by finding the 'K' most similar data instances in the dataset and taking the mode of their output values as the predicted output for the new data instance [12]. In the case of regression, the mean of the values is considered as the output for the new data instance. KNN does not require any kind of training; however, storing the entire data set is indispensable. Efficient data structures are used for storing training data in order to make lookup and matching for new patterns efficient. Distance measures like Euclidean distance, Hamming distance, and Manhattan distance are used for ascertaining the 'K' data items most similar to the new data instance. Euclidean distance is the most extensively used distance measure and is calculated as:

$$\text{Euclidean distance}(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (15)$$

Where X and Y are feature vectors represented as $X = \{x_1, \ldots, x_n\}$ & $Y = \{y_1, \ldots, y_n\}$.

Choosing an optimal value for 'K' is crucial for the accuracy of the algorithm. The computational cost of KNN grows with an increase in the magnitude of the training data set.
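The KNN procedure of Section 2.1.5 can be sketched in a few lines: compute the Euclidean distance (equation 15) from the new instance to every stored instance, keep the K closest, and return the mode of their labels (or the mean of their values for regression). This is an illustrative sketch only; the function names and the toy data are assumptions, and a linear scan is used instead of the efficient lookup structures mentioned in the text.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def knn_predict(train_x, train_y, query, k=3, regression=False):
    """Predict the output for `query` from its k nearest training instances."""
    # Sort the stored instances by distance to the query and keep the k closest.
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    outputs = [output for _, output in nearest]
    if regression:
        return sum(outputs) / len(outputs)        # mean of the neighbours' values
    return Counter(outputs).most_common(1)[0][0]  # mode of the neighbours' labels


# Toy data, invented for illustration only.
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_y = ["low", "low", "low", "high", "high", "high"]
label = knn_predict(train_x, train_y, (1.1, 1.0), k=3)
print(label)
```

Because every prediction scans the whole training set, the cost of this sketch grows linearly with the number of stored instances, which matches the remark above about the computational cost of KNN.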
2.1.6. Support Vector Machine (SVM)

SVM, a supervised learning technique, is based on the concept of maximizing the margin on each of the two sides of a hyperplane that splits the linearly separable input space into two classes [9]. The margin is the perpendicular distance from the hyperplane to the nearest points on each side of the hyperplane. Data points that lie on the margin of the hyperplane are termed support vectors. The solution of SVM, i.e., the optimal hyperplane, is represented in terms of support vectors. The idea is to choose the best hyperplane that can segregate the data points optimally into two classes. In other words, the optimal hyperplane is the one that has the highest margin. Suppose that the input space is optimally separated by the following line:

$$w^{T} x + b = 0$$

Where $w$ is the slope of the line and $b$ is the intercept.

For some data point $x_i$, if $w^{T} x_i + b > 0$, then $y_i = +1$.

Else if $w^{T} x_i + b < 0$, $y_i = -1$.

Hence, the prediction for a new unseen data point is ascertained depending on the aforementioned criteria.

2.1.7. Artificial Neural Network (ANN)

The artificial neuron is the elementary computational unit in an ANN. It accepts one or more inputs and computes their weighted sum, which is then passed as an input to a non-linear function called the activation function [13]. The activation function may be a threshold function, piecewise linear function, logistic function, Gaussian function, etc. ANNs can be viewed as a directed weighted graph with neurons as nodes and weights as directed edges. A feedforward ANN with at least one hidden layer is called a Multi-Layer Perceptron (MLP).

2.1.8. Deep Neural Network (DNN)

DNN offers an excellent solution for diverse classification and recognition tasks as it encapsulates various levels of abstraction. It consists of diverse architectures, which include unsupervised pre-trained networks, convolutional neural networks, recurrent neural networks and recursive neural networks. Unsupervised pre-trained networks are further grouped into auto-encoders, deep belief networks, and generative adversarial networks. Auto-encoders are utilized for dimensionality reduction of high dimensional data. Deep Belief Networks capture high-level representations of input data in an unsupervised manner [14]. Generative adversarial networks are a potent approach for probabilistic modeling; they learn to imitate any distribution of data, and generate prompt and precise inferences [15]. Convolutional Neural Networks work exceptionally well with image data and hence have massive potential for data analytics in healthcare applications [16]. Recurrent Neural Networks are utilized for time series forecasting. Recursive Neural Networks work well for learning sequences and tree structures, and hence play an important role in Natural Language Processing.

2.1.9. Hidden Markov Model (HMM)

HMM is a ubiquitous mathematical model for time series classification and is represented as (equation 16) [17]:

$$\mu = (H, O, A, B, \pi) \qquad (16)$$

Where $H = \{h_1, \ldots, h_n\}$ is the finite set of hidden states and $O = \{o_1, \ldots, o_m\}$ is the set of observations. It is possible that sometimes many states may generate the same observation (m<=n). Thus, it may be impossible to invert f(H) = O, f being a many-to-one function. A is the state transition matrix in which each element represents the transition probability from state $h_i$ to state $h_j$, i.e. (equation 17),

$$a_{ij} = P(h_j \mid h_i), \quad \text{where } 1 \le i, j \le n \qquad (17)$$

with the conditional probability taken according to Bayes' probability theorem.

B is the emission matrix in which each element represents the probability of generating observation $o_k$ when the model is in the state $h_i$ (equation 18):

$$b_{ik} = P(o_k \mid h_i), \quad \text{where } 1 \le i \le n,\; 1 \le k \le m \qquad (18)$$

$\pi$ is the initial state vector in which each element depicts the probability of that state being the starting state. The word 'Markov' in HMM refers to the property that the probability of getting into a subsequent state is decided only by the present state the system is in and not by the preceding states. Mathematically (equation 19), with $q_t$ denoting the state of the system at time $t$,

$$P(q_{t+1} \mid q_t, q_{t-1}, \ldots, q_1) = P(q_{t+1} \mid q_t) \qquad (19)$$

The word 'Hidden' in HMM refers to the property that it is impossible to know the current state of the system; the observer only has a probabilistic view of where it should be. Mathematically (equation 20),

[Equation (20) is illegible in the source text.] (20)

2.2. Machine Learning for Clustering

The clustering algorithm takes a set of unlabeled data, $X = \{x_1, \ldots, x_n\}$, as input and the output is the K clusters, $S = \{S_1, \ldots, S_K\}$, designated by their centroids, $C = \{c_1, \ldots, c_K\}$, which are calculated as follows (equation 21),

$$c_j = \frac{1}{|S_j|} \sum_{x_i \in S_j} x_i \qquad (21)$$

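Equation 21 defines each centroid as the mean of the data items assigned to its cluster. A minimal sketch of that calculation follows, assuming the cluster membership is already known; the function name `centroids`, the integer cluster labels, and the sample points are hypothetical choices made for illustration.

```python
def centroids(points, assignment, k):
    """Return the mean of the points assigned to each of the k clusters."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point, cluster in zip(points, assignment):
        counts[cluster] += 1
        for d in range(dim):
            sums[cluster][d] += point[d]
    # A cluster with no members has no defined mean; it is reported as None.
    return [
        tuple(value / counts[j] for value in sums[j]) if counts[j] else None
        for j in range(k)
    ]


# Invented two-dimensional points; the first two belong to cluster 0.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
assignment = [0, 0, 1]
result = centroids(points, assignment, k=2)
print(result)
```

Reporting an empty cluster as `None` is a design choice of this sketch; practical implementations often re-seed such a cluster instead.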
In clustering, the inter-cluster distance is maximized and the intra-cluster distance is minimized so that the data items in the same cluster have similar characteristics and data items in different clusters have highly disparate traits [6]. Sum of Squared Errors (SSE) and Peak Signal to Noise Ratio (PSNR) are used to assess the proficiency of the clustering outcome. PSNR is extensively used for clustering image data. The following provides a description of the most widely used clustering algorithm, i.e., K-Means clustering.

2.2.1. K-Means Clustering

K-Means clustering, an unsupervised learning technique, is utilized in scenarios with unlabeled data. The objective of this algorithm is to group data items into K clusters [18]. This algorithm proceeds by iteratively allocating each data item to one of the K clusters depending on feature similarity. The input fed to the K-means technique is the dataset and the value of K. The output produced consists of the centroids of all K clusters and the allocation of each data item to a particular cluster. The algorithm begins with initial values for the centroids, which can be randomly produced or randomly chosen from the dataset. Each centroid specifies a cluster. In the first step, each data item is allocated to its closest centroid based on the Euclidean distance measure given by (equation 22):

$$\underset{c_j \in C}{\arg\min} \; \text{dist}(c_j, x)^2 \qquad (22)$$

Where $C = \{c_1, \ldots, c_K\}$ is the set of centroids of the K clusters and $x$ is the data item. In the next step, the centroid of each cluster is recalculated by finding the mean of all the data items present in that cluster. The K-means clustering algorithm iterates between the aforementioned steps till a halting criterion is attained, which can be any of the following: (i) no data item is assigned to a different cluster, (ii) the maximum limit of iterations is reached, (iii) the sum of distances is minimized.

2.3. Machine Learning for Association Rule Mining

Given a large dataset, the fundamental objective of association rule mining is to imitate the human brain's feature extraction efficacy and extract associations from raw data [19]. An association rule consists of two parts: an antecedent and a consequent. An antecedent is any data item, and the consequent is a data item that occurs frequently with the antecedent. Support and confidence are used as a basis for the recognition of important associations among data items. Let $I = \{i_1, \ldots, i_n\}$ be the set of data items and $T = \{t_1, \ldots, t_m\}$ be the set of transactions in the dataset, where $t_j \subseteq I$. Support of a data item set A is defined as the fraction of transactions in the dataset that contain data item set A. Mathematically (equation 23):

$$\text{Support}(A) = \frac{\left| \{ t \in T : A \subseteq t \} \right|}{|T|} \qquad (23)$$

Confidence of an association rule ($A \rightarrow B$) is defined as (equation 24):

$$\text{Confidence}(A \rightarrow B) = \frac{\text{Support}(A \cup B)}{\text{Support}(A)} \qquad (24)$$

The following provides a description of the two most important algorithms for association rule mining, namely, Apriori and FP Growth:

2.3.1. Apriori Algorithm

Apriori is an extensively used algorithm for association rule mining. It is used for recognizing frequently occurring attribute-value relationships in the dataset [20]. It begins by recognizing individual data items occurring frequently in the dataset. In the first pass, the support of all the item sets of size one is calculated. Item sets with support less than a particular value (threshold) are discarded. In the next pass, the items that survived the first pass are coupled together to form item sets of size two, and their support is calculated. As in the previous pass, item sets with support less than the threshold are discarded. Likewise, in the subsequent passes, item sets that survived the preceding pass are combined and their support is calculated. Item sets with support less than the threshold are discarded. This process continues until all the item sets formed from the preceding pass have support less than the threshold.

2.3.2. FP Growth

Another procedure for association rule mining is FP Growth. Both Apriori and FP Growth take identical input and produce the same output. The input is the dataset and a threshold, and the output is the frequently occurring associations of data items [21]. However, the strategies followed by these algorithms are different. Apriori utilizes a breadth-first search approach to determine the set of frequently occurring data items and hence is quite expensive in terms of memory usage, whereas the FP Growth algorithm utilizes a depth-first search approach. The FP Growth algorithm works by building a tree known as an FP tree in order to compactly represent the data items so that frequently occurring associations of data items can be directly retrieved by recursive calls without scanning the data set multiple times.

2.4. Machine Learning for Dimensionality Reduction

Dimensionality reduction means filtering the extraneous and inappropriate features from the dataset in order to improve the efficiency of pattern recognition tasks. Table 2 provides a summary of some important dimensionality reduction techniques.

Table 2. Dimensionality reduction techniques.

Technique: Principal Component Analysis (PCA)
Description: PCA generates linear combinations of the original attributes. The new attributes formed are arranged according to their explained variance. The first Principal Component (PC1) elucidates the highest variance in the dataset, the second Principal Component (PC2) elucidates the second-most variance, and so on. Dimensionality is alleviated by restricting the number of principal components.
Pros: Flexible technique. Fast and simple to implement.
Cons: New principal components formed are not interpretable.
References: [22, 23]

Technique: Linear Discriminant Analysis (LDA)
Description: It also generates linear combinations of original attributes. However, contrary to PCA, LDA doesn't maximize the explained variance. Rather, it augments the separability between classes.
Pros: Being a supervised technique, LDA can improve the predictive performance of extracted features.
Cons: As with PCA, new features are not interpretable. Requires labeled data, which makes it more situational.
References: [22, 24]

Technique: Auto Encoder
Description: Auto-encoders are neural networks that are trained to regenerate their original inputs. The idea is to structure the hidden layer to have lesser neurons than the input/output layers. With the result, the hidden layer learns to build a smaller representation of the input.
Pros: Perform well for image and audio data.
Cons: Not used as general-purpose reduction techniques.
References: [22, 25]

Technique: Genetic Algorithm
Description: Accomplishes supervised feature selection.
Pros: Efficiently selects features from high dimensional data sets where exhaustive search is not feasible.
Cons: Highly complex.
References: [22, 26]
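To make the PCA entry of Table 2 concrete, the sketch below finds only the first principal component (PC1) and the fraction of variance it explains. It is a hedged illustration, not a reference implementation: it uses power iteration on the sample covariance matrix in place of a full eigendecomposition, the function name `first_principal_component` and the sample rows are hypothetical, and the all-ones starting vector is an assumption that fails if it happens to be orthogonal to PC1.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction of PC1, fraction of total variance it explains)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred attributes (d x d).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed starting vector for the power iteration
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the variance along PC1.
    variance_pc1 = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                       for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, variance_pc1 / total_variance


# Invented rows lying exactly on the line y = 2x, so PC1 explains all variance.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, explained = first_principal_component(rows)
print(direction, explained)
```

Keeping only the projection of each centred row onto `direction` would reduce these two attributes to one, which is the sense in which restricting the number of principal components lowers dimensionality.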

[Figure: "Machine Learning Applications in Healthcare" at the centre, surrounded by five categories: disease diagnosis and detection, disease risk prediction, health monitoring, healthcare related discoveries, and epidemic outbreak prediction.]

Fig. (2). Applications of machine learning in healthcare.

3. MACHINE LEARNING IN HEALTHCARE

Machine learning techniques have the ability to extract concealed patterns and hidden correlations from the massive healthcare datasets, thereby facilitating improved, economical, and more convenient healthcare services. This section reviews the applications of machine learning in healthcare. These applications are categorized into five groups: disease diagnosis and detection, disease risk prediction, health monitoring, healthcare-related discoveries, and epidemic outbreak prediction (Fig. 2).
3.1. Disease Diagnosis and Detection

Machine learning techniques have the potential of identifying the health complications a patient is suffering from. The following research efforts demonstrate the application of machine learning for diagnosing existing health ailments: Shashikant et al. [27] trained SVM with sequential minimization optimization on an India-centric dataset for diagnosis of heart disease. The study reveals that SVM is appropriate for heart disease diagnosis. In another study, Sumedh et al. [28] used SVM and Multilayer Perceptron for the diagnosis of liver disease. The study demonstrates that the multi-layer perceptron outperforms SVM in liver disease diagnosis. Moreover, the study proposes ways to augment the efficiency of these techniques. Bansal et al. [29] performed a comparative analysis of four techniques: J48, Naive Bayes, Random Forest and multilayer perceptron for diagnosis of dementia. The study reveals that J48 outperforms the other three techniques. Emanuele et al. [30] proposed a method for real-time brain cancer detection. The proposed method uses the K-means technique for recognizing tumor borders and hence can assist doctors during tumor removal operations. In another study, Rahib et al. [31] designed a CNN based model for the detection of chest disease. The study achieved an accuracy of 92.4%.

3.2. Disease Risk Prediction

Machine learning has been found to be quite effective in forecasting the risk of the disease a patient is likely to develop in the future. The following studies reveal the application of machine learning for disease risk prediction. Deepti [32] performed a comparative analysis of three techniques, decision tree, SVM, and naive Bayes, for predicting diabetes at the earliest. Results demonstrate that naive Bayes produces more accurate results than the other techniques. Ahmed et al. [33] designed a hybrid system based on linear regression and artificial neural networks for early prediction of chronic kidney disease. The study demonstrates that the proposed system outperforms all the existing state-of-the-art models. In the study [34], Jung-Gi et al. designed a hybrid model based on an adaptive network-based fuzzy inference system and linear discriminant analysis for the likelihood prediction of coronary heart disease. The experiments conducted in the study demonstrate that the proposed model outperforms all the existing methods. Samaneh et al. [35] conducted a review of machine learning techniques that have been used for the prediction of pregnant women experiencing hypertensive ailments. The study demonstrates that ensemble classifiers are the best fitting solution for the prediction of such disorders. Hiba et al. [39] carried out a comparative analysis of four machine learning techniques: SVM, decision tree, naive Bayes and K-nearest neighbor for breast cancer risk prediction. The experiments conducted in the study demonstrate that SVM outperforms the other three techniques. Richa et al. [40] conducted a performance analysis of various machine learning algorithms for the early prediction of Parkinson's disease. The experiments demonstrate that the combination of K-nearest neighbor and multi-layer perceptron outperforms other algorithms. Elaheh et al. [41] proposed a model based on magnetic resonance imaging for predicting the conversion of mild cognitive impairment to Alzheimer's disease at least one year prior to the actual medical diagnosis. In another study, Mahua et al. [42] proposed a technique called a cascaded convolutional neural network for early prediction of Alzheimer's disease. Yixue et al. [43] proposed a model based on a combination of the recurrent neural network, convolutional neural network, and deep belief network for multimodal disease risk prediction. The study demonstrates that the accuracy of the hybrid model surpasses several benchmark techniques. Fenglong et al. [44] proposed a method based on bidirectional recurrent neural networks for the prediction of upcoming health conditions of patients. The experiments conducted in the study demonstrate that the proposed method outperforms the benchmark disease prediction methods.

3.3. Health Monitoring

Health monitoring is crucial for examining and recuperating the personal health of an individual. This category of applications utilizes machine learning for providing crucial suggestions to individuals with health ailments and for notifying the healthcare professionals in case of an emergency. The following demonstrates the application of machine learning for health monitoring. Min et al. [45] proposed a healthcare system based on Edge Cognitive Computing for examining the health of individuals. The study demonstrates that the proposed system considerably enhances the survival rates of patients in case of an emergency. In another study, Pham et al. [46] proposed a home healthcare system that imparts contextual knowledge to healthcare professionals about a patient's activities. Zia Uddin et al. [47] proposed a system based on a recurrent neural network for the prediction of daily activities of an individual. The experiments conducted in the study demonstrate that the proposed system performs better than
cardiac arrest. The study concludes that machine learning is
conventional approaches. In [48], Hande et al. performed a
quite effective and efficient in early prediction of cardiac
comparative analysis of two machine learning techniques:
arrest and hence the espousal of machine learning techniques
hidden markov model and time windowed neural network
in healthcare scenarios is crucial. In another study, Fatma
for daily life behaviour monitoring. Experimental results
et al. [36] designed a system for cardiovascular risk
demonstrate that the hidden markov model performs better
assessment that monitors the patient continuously and
than time windowed neural networks in case of activity
maintains a record of the disease likelihood over time.
recognition. Gursel et al. [49] proposed a fall detection
Pattanpong et al. [37] analyzed the efficiency of a hybrid
system based on random forest and support vector machines.
model (based on LSTM and RNN) for stroke prediction.
A fall detection decision is generated 100 milliseconds prior
Experiments demonstrate that the model achieves an
to the fall, which stimulates the opening of body airbag,
accuracy of 99.98%. Mário et al. [38] proposed a model for
thereby preventing the ill consequences. Ghulam et al. [50]
forecasting the likelihood of postpartum depression in
proposed a system for health monitoring based on the facial expressions of the individual. The proposed system identifies facial expressions accurately. Srividya et al. [51] performed a comparative analysis of the following machine learning techniques for behavioural modelling for mental health: support vector machines, decision trees, naïve Bayes classifier, K-nearest neighbour classifier, logistic regression, and random forest. Experimental results demonstrate that random forest outperforms the other machine learning techniques. Alaa et al. [52] developed an EEG telemonitoring system for the detection of epileptic seizures that generates a notification in case of an emergency. In another study, Musaed et al. [53] proposed an approach based on a Deep CNN stacked autoencoder for epileptic seizure detection and monitoring with the help of EEG sensors. The proposed approach yields highly accurate results. Satoshi et al. [54] proposed a system based on a genetic algorithm that can be utilized to reduce the response time of health facilities in life-threatening emergency cases.

3.4. Healthcare-Related Discoveries

Another application of machine learning in healthcare lies in uncovering hidden correlations in massive healthcare datasets in order to improve and optimize the clinical decision-making process. The following works demonstrate this application. Rhonda et al. [55] utilized an association rule mining technique known as Apriori for determining disease co-occurrences. Mahmood et al. [56] utilized the Apriori algorithm for exploring the associations between correlated diseases across various age groups for both males and females. Bowei et al. [57] proposed an extension of the FP-Growth algorithm, called PNFP-Growth, for comprehensive mining of health data. The study demonstrates that the proposed approach efficiently extracts more patterns than the conventional FP-Growth algorithm. Moreover, the analysis conducted in the study demonstrates that at least 30% of people with hypertension have elevated systolic pressure and liver ailments. In another study, Hongming et al. [58] explored the potential of deep learning in drug discovery and concluded that deep learning is quite effective in addressing the issues in drug discovery. Byung-Jun et al. [59] provided a review of applications of the hidden Markov model in modelling and analyzing biological sequences. The study concludes that hidden Markov models offer a fitting solution for analyzing biological sequences. In the study [60], Jeena et al. proposed a method based on association rule mining that refines adverse drug reactions effectively.

3.5. Epidemic Outbreak Prediction

One of the crucial facets of public health monitoring is the early prediction of epidemic outbreaks so that timely actions are taken to control their spread. Machine learning has been found promising in such scenarios. The following studies demonstrate this application. Manish et al. [61] proposed an approach based on a random tree for the prediction of Ebola casualties. Abdul et al. [62] used an extreme learning approach to forecast the likelihood of a dengue hemorrhagic fever outbreak depending on weather conditions. Experimental results show that the approach yields accurate results. In another study, Zhijian et al. [63] demonstrate the application of a graph-structured recurrent neural network for forecasting epidemics. Zaheer et al. [64, 65] performed a comparative analysis of the following machine learning techniques for outbreak prediction of three diseases (pneumonia, malaria, and diarrhea): decision tree, logistic regression, naive Bayes, and random forest. The study investigates the reasons for disease outbreaks and hence can assist in the alleviation of such outbreaks.

Table 3 provides a summary of all the research efforts mentioned in this section, highlighting the application of the study, the dataset used, the number of samples and attributes in the dataset, the machine learning techniques used, and the results and inferences obtained. Table 4 provides a list of some commonly available healthcare datasets.
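As an illustration of the association rule mining referred to in Section 3.4, the following is a minimal sketch of an Apriori-style level-wise search for frequent itemsets, together with the confidence of a rule. It is not the implementation used in [55], [56], or [57]. The function names, the toy `records` list, and the 0.4 support threshold are assumptions made purely for illustration and do not come from any of the cited datasets.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is at least min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct single item.
    current = [frozenset([item]) for item in sorted({i for t in transactions for i in t})]
    result = {}
    k = 1
    while current:
        # Keep only the candidates that are frequent enough.
        frequent = {}
        for candidate in current:
            count = sum(1 for t in transactions if candidate <= t)
            if count / n >= min_support:
                frequent[candidate] = count / n
        result.update(frequent)
        # Build (k+1)-item candidates from pairs of frequent k-itemsets and
        # prune any candidate that has an infrequent k-item subset.
        k += 1
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(subset) in frequent for subset in combinations(union, k - 1)
            ):
                candidates.add(union)
        current = list(candidates)
    return result


def confidence(itemsets, antecedent, consequent):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    a = frozenset(antecedent)
    return itemsets[a | frozenset(consequent)] / itemsets[a]


# Hypothetical diagnosis records, one set of conditions per patient.
records = [
    {"hypertension", "diabetes"},
    {"hypertension", "diabetes", "obesity"},
    {"hypertension", "obesity"},
    {"diabetes"},
    {"hypertension", "diabetes"},
]

itemsets = frequent_itemsets(records, min_support=0.4)
rule_confidence = confidence(itemsets, {"hypertension"}, {"diabetes"})
```

In this toy example the pair of diabetes and obesity occurs in only one of five records, so it falls below the threshold and the three-item set is pruned without being counted. That pruning step is what keeps the level-wise search tractable on larger datasets.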
Table 3. Summary of applications of machine learning in healthcare.

| Work | Application | Dataset | Number of Samples | Number of Attributes | Technique Used | Results/Inferences |
|---|---|---|---|---|---|---|
| Ghumbre et al. [27] | Heart disease diagnosis | India Centric Dataset for heart disease | 214 | 19 | Support Vector Machine | Accuracy = 85.51%; Sensitivity = 84.60%; Specificity = 88.50% |
| Sontakke et al. [28] | Diagnosis of liver diseases | Indian Liver Patient Dataset (ILPD) | 583 | 10 | Support Vector Machine | Accuracy = 71%; Sensitivity = 71.5%; Specificity = 88.3% |
| | | | | | Multi-Layer Perceptron | Accuracy = 73.2%; Sensitivity = 73.3%; Specificity = 87.7% |
Table 3. Contd…

| Work | Application | Dataset | Number of Samples | Number of Attributes | Technique Used | Results/Inferences |
|---|---|---|---|---|---|---|
| Bansal et al. [29] | Dementia diagnosis | Brain MRI dataset from OASIS-Brains.org | 416 | - | J48 | Accuracy = 99.52% |
| | | | | | Random Forest | Accuracy = 99.28% |
| | | | | | Naive Bayes | Accuracy = 92.55% |
| | | | | | Multi-Layer Perceptron | Accuracy = 96.88% |
| Torti et al. [30] | Real-time brain cancer detection | Hyperspectral brain cancer dataset | 6 | 128 | Parallel K-means clustering | Speed-up = 126 |
| Rahib et al. [31] | Chest disease detection | Chest X-ray images | 120,120 | 1024 | Convolutional Neural Network | Accuracy = 92.4% |
| Deepti Sisodia [32] | Diabetes prediction | Pima Indians Diabetes Database (PIDD) | 768 | 8 | Naive Bayes | Accuracy = 76.30% |
| | | | | | Support Vector Machine | Accuracy = 65.1% |
| | | | | | Decision Tree | Accuracy = 73.82% |
| Abdelaziz et al. [33] | Chronic kidney disease prediction | CKD dataset | 200 | 16 | Linear regression + Artificial Neural Network | Accuracy = 97.8% |
| Jung-Gi Yang [34] | Risk assessment of coronary heart disease | Korean National Health and Nutrition Examinations Survey V dataset | 4826 | 7 | Adaptive Network-based Fuzzy Inference System + Linear Discriminant Analysis | Accuracy = 80.2% |
| Javan et al. [35] | Early prediction of cardiac arrest | - | - | - | Regression, Support Vector Machine, K-Nearest Neighbor, Artificial Neural Network, Decision Tree, Ensemble Classifier | This paper presents a comprehensive review to assess the proficiency of machine learning techniques in forecasting the risk of cardiac arrest. |
| Akbuluta et al. [36] | Cardiovascular risk assessment | CVDiMo dataset | 30 | 6 | Multiclass Logistic Regression | Accuracy = 86% |
| | | | | | Multiclass Neural Network | Accuracy = 86% |
| | | | | | Multiclass Decision Forest | Accuracy = 96% |
| Pattanapong et al. [37] | Stroke prediction | Electronic Health Records from Department of Medical Services, Ministry of Public Health, Thailand | 326,152 | - | Long Short Term Memory + Recurrent Neural Network | Accuracy = 99.98% |
Table 3. Contd…

| Work | Application | Dataset | Number of Samples | Number of Attributes | Technique Used | Results/Inferences |
|---|---|---|---|---|---|---|
| Moreira et al. [38] | Postpartum depression prediction | Maternity School Assis Chateaubriand pregnancy dataset | 205 | - | Support Vector Machine | Accuracy = 95.8% |
| | | | | | K-Nearest Neighbour | Accuracy = 94.5% |
| Asri et al. [39] | Breast cancer risk prediction | Wisconsin Breast Cancer dataset | 699 | 11 | C4.5 | Accuracy = 95.13% |
| | | | | | Support Vector Machine | Accuracy = 97.13% |
| | | | | | Naive Bayes | Accuracy = 95.99% |
| | | | | | K-Nearest Neighbor | Accuracy = 95.27% |
| Mathur et al. [40] | Parkinson disease prediction | Parkinson disease dataset from UCI | 195 | 24 | K-Nearest Neighbor + AdaBoost | Accuracy = 91.28% |
| | | | | | K-Nearest Neighbor + Bagging | Accuracy = 90.76% |
| | | | | | K-Nearest Neighbor + Multi-Layer Perceptron | Accuracy = 91.28% |
| Moradi et al. [41] | MRI-based Alzheimer’s conversion prediction | Alzheimer’s Disease Neuroimaging Initiative (ADNI) database | 825 | - | Low Density Separation | Accuracy = 74.74% |
| | | | | | Support Vector Machine | Accuracy = 69.15% |
| Manhua et al. [42] | Alzheimer’s disease diagnosis | ADNI database | 397 | - | Cascaded Convolutional Neural Network | Accuracy = 93.26% |
| Hao et al. [43] | Multimodal disease risk prediction | Dataset from Grade-A hospital of second class in Wuhan | 20,320,848 | 79 | Multimodal data-based Recurrent Convolutional Neural Network | Accuracy = 96%; Recall = 98.08% |
| Ma et al. [44] | Disease diagnosis prediction | Medicaid claims over the year 2011 | 147,810 | - | Attention-based bidirectional Recurrent Neural Network | Accuracy in case of Medicaid dataset = 84.75% |
| | | Diabetes dataset over the years 2012 and 2013 | 22,820 | - | | Accuracy in case of diabetes dataset = 83.18% |
| Chen et al. [45] | Smart healthcare system based on edge cognitive computing | Physiological data of users | - | - | Deep Learning | The proposed system monitors the health of the users and performs analysis in real-time. |
| Pham et al. [46] | Body hydration monitoring | - | - | - | Support Vector Machine + Radial Basis Function | Accuracy = 91.5% |
| Md. Zia Uddin [47] | Activity prediction system in smart healthcare | MHEALTH dataset | - | - | Recurrent Neural Network based multimodal system | Mean Prediction Performance = 99.69% |
Table 3. Contd…

| Work | Application | Dataset | Number of Samples | Number of Attributes | Technique Used | Results/Inferences |
|---|---|---|---|---|---|---|
| Alemdar et al. [48] | Daily life behaviour monitoring | ARAS datasets, Kasteren datasets | - | - | Hidden Markov Model | Accuracy = 95% |
| | | | | | Time Windowed Neural Network | Accuracy = 94% |
| Serpen et al. [49] | Real-time fall detection system | SisFall dataset | 32 | 12 | Random Forest, Support Vector Machine | A fall detection decision is generated 100 milliseconds prior to the fall, which triggers the opening of a body airbag, thereby preventing the ill consequences. |
| Muhammad et al. [50] | Facial healthcare monitoring system for improved healthcare | JAFFE dataset | 213 | - | Gaussian Mixture Model + Support Vector Machine | Accuracy = 99.9% for both the datasets |
| | | CK dataset | 408 | - | | |
| Srividya et al. [51] | Behavioural modelling for mental health | Dataset collected by analyzing high school/college students and working professionals in different organisations with less than 5 years of experience | 656 | 20 | Logistic Regression | Accuracy = 84% |
| | | | | | Naive Bayes | Accuracy = 73% |
| | | | | | Support Vector Machine | Accuracy = 89% |
| | | | | | Decision Tree | Accuracy = 81% |
| | | | | | K-Nearest Neighbor | Accuracy = 89% |
| | | | | | Random Forest | Accuracy = 90% |
| Abdellatif et al. [52] | EEG telemonitoring system | EEG dataset | 4096 samples per patient, 300 subjects | - | Fuzzy-based classification | Accuracy = 98% |
| Alhussein et al. [53] | Epileptic seizure detection and monitoring | CHB-MIT dataset obtained from Children’s Hospital Boston | 686 recordings from 23 epilepsy patients | - | Deep CNN stacked autoencoder model | Accuracy = 99.2%; Sensitivity = 93.5% |
| Sasaki et al. [54] | Optimize current and future health planning | Dataset provided by Niigata City Fire Bureau | 21,211 | - | Genetic algorithm | The proposed method can be utilized to reduce the response time of health facilities in life-threatening emergency cases. |
| Kost et al. [55] | Exploring disease co-occurrences | Vermont Uniform Hospital Discharge DataSet (VUHDDS) | 334,409 | 75 | Apriori | Co-occurrences of various diseases are investigated. |
| Rashid et al. [56] | Clinical observations | Medicare dataset | 1000 | - | Apriori | Associations between different correlated diseases are explored across various age groups for both males and females. |
| Wang et al. [57] | Comprehensive mining of health data | Heart disease dataset obtained from UCI Machine Learning Repository | 270 | 14 | Extended FP-Growth | The proposed method is capable of extracting a larger number of frequent patterns in an efficient manner. |
Table 3. Contd…

| Work | Application | Dataset | Number of Samples | Number of Attributes | Technique Used | Results/Inferences |
|---|---|---|---|---|---|---|
| Chen et al. [58] | Drug discovery | - | - | - | Deep Learning (Convolutional Neural Network, Recurrent Neural Network, Generative Adversarial Networks) | Deep learning has shown remarkable results in addressing issues like bioactivity prediction, de novo molecular design, synthesis prediction, and biological image analysis in drug discovery. |
| Byung-Jun Yoon [59] | Modelling and analyzing biological sequences | - | - | - | Hidden Markov Model | Hidden Markov models present a conceptual mathematical structure for modelling and analyzing biological sequences. |
| Reps et al. [60] | Refining adverse drug reactions | The Health Improvement Network (THIN) dataset | 258,397 | - | Association Rule Mining | The proposed method effectively refines the adverse drug reactions. |
| Pandey et al. [61] | Ebola casualties prediction | Dataset obtained from WHO sitrep | 4112 | - | Random Tree | Mean Absolute Error = 113.57; Root Mean Square Error = 352.97; Accuracy = 85.74% |
| Najar et al. [62] | Dengue hemorrhagic fever outbreak risk level prediction | Weather dataset obtained from the Central Bureau of Statistics Jakarta, and dengue hemorrhagic fever dataset obtained from Jakarta Health Agency, Indonesia | 84 | - | Extreme learning with binary activation function | Mean Absolute Error = 0.08698; Mean Absolute Percentage Error = 3.00536 |
| Zhijian et al. [63] | Epidemic forecasting | CDC dataset from 2013 to 2015 | - | - | Graph-structured recurrent neural network | Root Mean Square Error = 0.223 |
| Babar et al. [64] | Disease outbreak prediction in developing countries | Dataset obtained from Health Units (HUs) of Punjab province in Pakistan | - | 15 | Decision Tree | Precision for Pneumonia = 0.941; Precision for Malaria = 0.952; Precision for Diarrhoea = 0.919 |
| | | | | | Logistic Regression | Precision for Pneumonia = 0.926; Precision for Malaria = 0.92; Precision for Diarrhoea = 0.931 |
| | | | | | Naive Bayes | Precision for Pneumonia = 0.81; Precision for Malaria = 0.736; Precision for Diarrhoea = 0.802 |
| | | | | | Random Forest | Precision for Pneumonia = 0.914; Precision for Malaria = 0.914; Precision for Diarrhoea = 0.918 |
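Table 3 reports results as accuracy, sensitivity, specificity, precision, and several error measures (MAE, RMSE, MAPE). As a reading aid, the sketch below shows how these quantities are conventionally computed from confusion-matrix counts and from paired actual/predicted values. The function names and the example numbers are assumptions chosen for illustration only. They are not taken from any of the surveyed studies, which may define or average their metrics differently.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # share of positive predictions that are correct
    }


def regression_errors(actual, predicted):
    """MAE, RMSE, and MAPE (in percent) for paired actual/predicted values.

    MAPE divides by the actual values, so they must all be non-zero."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mape": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# Hypothetical counts and values, for illustration only.
diagnosis = classification_metrics(tp=40, fp=10, tn=45, fn=5)
forecast = regression_errors(actual=[10.0, 20.0, 40.0], predicted=[12.0, 18.0, 40.0])
```

Accuracy alone can look high when one class dominates a dataset, which is presumably why several rows of Table 3 also list sensitivity and specificity.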

Table 4. Healthcare datasets.

| Dataset | Provider | Description | Source |
|---|---|---|---|
| Diabetes dataset | Washington University | Dataset for diabetes diagnosis | https://archive.ics.uci.edu/ml/datasets/Diabetes |
| Heart disease dataset | Hungarian Institute of Cardiology; University Hospital, Zurich, Switzerland; University Hospital, Basel, Switzerland; V.A. Medical Center, Long Beach and Cleveland Clinic Foundation | Contains 303 instances with 75 attributes for heart disease diagnosis | https://archive.ics.uci.edu/ml/datasets/Heart+Disease |
| Thyroid Disease dataset | Garavan Institute, Sydney, Australia | Contains 7200 instances with 21 attributes for thyroid disease diagnosis | https://archive.ics.uci.edu/ml/datasets/Thyroid+Disease |
| Parkinson’s Telemonitoring dataset | University of Oxford | Consists of 5875 instances (with 26 attributes) from 42 subjects diagnosed with early stage Parkinson’s disease | https://archive.ics.uci.edu/ml/datasets/Parkinsons+Telemonitoring |
| Breast Cancer dataset | University of Wisconsin | Contains 569 instances with 32 attributes for breast cancer diagnosis | https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic) |
| OASIS-3 | Centers for Medicare & Medicaid Services | Contains MR and PET imaging data from 1098 subjects | http://www.oasis-brains.org/ |
| The Apnea-ECG database | Philipps-University, Marburg, Germany | Contains 70 instances of ECG recordings from sleep apnea patients | http://www.physionet.org/physiobank/database/apnea-ecg/ |
| MHEALTH dataset | University of Granada | Contains 120 instances (23 attributes) of body motion and vital signs from 10 subjects | https://archive.ics.uci.edu/ml/datasets/MHEALTH+Dataset |
| Dermatology dataset | Gazi University, Bilkent University, Ankara, Turkey | Contains 366 instances with 33 attributes for diagnosis of erythemato-squamous disease | http://archive.ics.uci.edu/ml/datasets/Dermatology |
| Liver Disorders dataset | BUPA Medical Research Ltd. | Contains 345 instances with 7 attributes for diagnosis of liver disorders | http://archive.ics.uci.edu/ml/datasets/Liver+Disorders |
| EEG dataset | Neurodynamics Laboratory, State University of New York Health Center, Brooklyn, New York | Contains 122 instances obtained from 64 electrodes placed on subjects’ scalps to inspect EEG correlates of genetic predisposition to alcoholism | http://archive.ics.uci.edu/ml/datasets/EEG+Database |
| Vertebral Column dataset | Department of Teleinformatics Engineering, Federal University of Ceará, Brazil | Dataset for classification of orthopedic patients (310 instances with 6 attributes) | http://archive.ics.uci.edu/ml/datasets/Vertebral+Column |
| Human Activity Recognition dataset | Università degli Studi di Genova, Genoa, Italy; Universitat Politècnica de Catalunya (BarcelonaTech), Vilanova i la Geltrú, Spain | Dataset built for physical activity recognition from 30 subjects (10299 instances with 561 attributes) | http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones |
| Daphnet Freezing of Gait dataset | University of Newcastle Upon Tyne, UK | Dataset for recognizing freezing of gait in patients suffering from Parkinson’s disease (237 instances with 9 attributes) | http://archive.ics.uci.edu/ml/datasets/Daphnet+Freezing+of+Gait |
CONCLUSION

While healthcare organizations are not yet convinced to rely completely on machine intelligence, the brisk progress in machine intelligence and big data has inculcated the belief that machine intelligence can assist healthcare professionals to make better decisions and accelerate the process of decision-making in healthcare systems. The growing accessibility of healthcare data and brisk advancements in computing power and data analytical paradigms have created the opportunity for researchers to implement machine learning algorithms on healthcare datasets. This paper presents a comprehensive review of research studies dedicated to the implementation of machine learning for healthcare applications. The research studies have been divided into the following categories: (a) disease diagnosis and detection, (b) disease risk prediction, (c) health monitoring, (d) healthcare-related discoveries, and (e) epidemic outbreak prediction. The presented literature demonstrates that machine learning techniques are highly effective in identifying the health complications a patient is suffering from, forecasting the risk of the disease a patient is likely to develop in future, examining and restoring the personal health of an individual, optimizing the clinical decision-making process, predicting epidemic outbreaks early, etc.

Finally, it is worthwhile to state that the advent of machine intelligence in healthcare will facilitate improved, economical, and more convenient healthcare services. Future work would be to investigate ways of reducing the computational complexity of machine learning models built on huge healthcare datasets so that real-time analytics is facilitated in healthcare systems.

CONSENT FOR PUBLICATION

Not applicable.

FUNDING

None.

CONFLICT OF INTEREST

The authors declare no conflict of interest, financial or otherwise.

ACKNOWLEDGEMENTS

Declared none.

REFERENCES

[1] Manyika J, et al. Big Data: The Next Frontier for Innovation, Competition, and Productivity. New York, NY, USA: McKinsey Global Institute 2011.
[2] Magoulas GD, Prentza A. Machine Learning in Medical Applications. ACAI ’99. Springer 2001; pp. 300-7. http://dx.doi.org/10.1007/3-540-44673-7_19
[3] Shishavan OR, Zois OS, Soyata T. Machine intelligence in healthcare and medical cyber physical systems: A survey. IEEE Access 2018; 6: 46420-46494.
[4] Gautam P, Ansari MD, Sharma SK. Enhanced security for electronic health care information using obfuscation and RSA algorithm in cloud computing. Int J Inf Secur Priv 2019; 13: 59-69. http://dx.doi.org/10.4018/IJISP.2019010105
[5] Sethi K, Jaiswal V, Ansari MD. Machine learning based support system for students to select stream. Recent Pat Comput Sci 2019; 13(3): 12.
[6] Tsai CW, Lai CF, Chiang MC, Yang LT. Data mining for internet of things: A survey. IEEE Comm Surv Tutor 2014; 16: 77-97. http://dx.doi.org/10.1109/SURV.2013.103013.00206
[7] Gandhi R. Introduction to machine learning algorithms: Linear regression. Available at: https://towardsdatascience.com/introduction-to-machine-learning-algorithms-linear-regression-14c4e325882a
[8] Brownlee J. Classification and regression trees for machine learning. Available at: https://machinelearningmastery.com/classification-and-regression-trees-for-machine-learning/
[9] Kotsiantis SB. Supervised machine learning: A review of classification techniques. Informatica 2007; 31: 249-68.
[10] Mahdavinejad MS, Rezvan M, Barekatain M, Adibi P, Barnaghi P, Sheth AP. Machine learning for internet of things data analysis: A survey. Digital Commun Netw 2018; 4: 161-75. http://dx.doi.org/10.1016/j.dcan.2017.10.002
[11] Gupta P. Naïve bayes in machine learning. Available at: https://towardsdatascience.com/naive-bayes-in-machine-learning-f49cc8f831b4
[12] Brownlee J. K-Nearest neighbors for machine learning. Available at: https://machinelearningmastery.com/k-nearest-neighbors-for-machine-learning/
[13] Jain AK, Mao J, Mohiuddin KM. Artificial neural networks: A tutorial. Comput IEEE 1996; 29: 31-44.
[14] Hinton GE. Deep belief networks. Scholarpedia 2009.
[15] A Beginner's Guide to Generative Adversarial Networks (GANs). Available at: https://skymind.ai/wiki/generative-adversarial-network-gan
[16] Mohammadi M, Al-Fuqaha A, Sorour S, Guizani M. Deep learning for IoT big data and streaming analytics: A survey. IEEE Comm Surv Tutor 2018; 20: 2923-60. http://dx.doi.org/10.1109/COMST.2018.2844341
[17] Esmael B, Arnaout A, Fruhwirth RK, Thonhauser G. Improving time series classification using hidden markov models. 2012 12th International Conference on Hybrid Intelligent Systems (HIS). Pune, India, 2012. http://dx.doi.org/10.1109/HIS.2012.6421385
[18] Trevino A. Introduction to K-means clustering. Available at: https://www.datascience.com/blog/k-means-clustering
[19] Association learning. Available at: https://deepai.org/machine-learning-glossary-and-terms/association-learning
[20] Hastie T, Friedman J, Tibshirani R. Unsupervised learning. The elements of statistical learning. Springer series in statistics. New York: Springer 2001.
[21] Fournier-Viger P, Chun-wei JL, Bay V, Tin TC, Ji Z, Hoai BL. A survey of item set mining. WIREs Data Mining and Knowledge Discovery 2017; 7: 1-18.
[22] Dimensionality reduction algorithms: Strengths and weaknesses. Available at: https://elitedatascience.com/dimensionality-reduction
[23] Vasan KK, Surendiran B. Dimensionality reduction using principal component analysis for network intrusion detection. Perspect Sci 2016; 8: 510-2.
[24] Lopes M. Is LDA a dimensionality reduction technique or a classifier algorithm? Available at: https://towardsdatascience.com/is-lda-a-dimensionality-reduction-technique-or-a-classifier-algorithm-eeed4de9953a
[25] Ayushman SS. Reducing dimensionality of data using neural networks. Available at: https://www.cse.iitk.ac.in/users/cs365/2015/_submissions/ayushmn/slides.pdf
[26] Raymer ML, Punch WF, Goodman ED, Kuhn LA, Jain AK. Dimensionality reduction using genetic algorithms. IEEE Trans Evol Comput 2000; 4: 164-71. http://dx.doi.org/10.1109/4235.850656
[27] Ghumbre SU, Ghatol AA. Heart disease diagnosis using machine learning algorithm. Proceedings of the International Conference on Information Systems Design and Intelligent Applications 2012 (INDIA 2012) held in Visakhapatnam, India, January 2012. http://dx.doi.org/10.1007/978-3-642-27443-5_25
[28] Sontakke S, Lohokare J, Dani R. 2017 International Conference on Emerging Trends & Innovation in ICT (ICEI). Pune, India, 2017. http://dx.doi.org/10.1109/ETIICT.2017.7977023
[29] Bansal D, Chhikkara R, Kavita K, Poonal G. Comparative analysis of various machine learning algorithms for detecting dementia. Procedia Comput Sci 2018; 132: 1497-502.
[30] Torti E, Florimbi G, Castelli F, et al. Parallel K-means clustering for brain cancer detection using hyperspectral images. Electronics. MDPI 2018; 2018: 7.
[31] Abiyev RH, Mohammad KSM. Deep convolutional neural networks for chest diseases detection. J Healthc Eng Hindawi 2018.
[32] Sisodia D, Sisodia DS. Prediction of diabetes using classification algorithms. Procedia Comput Sci 2018; 132: 1578-85. http://dx.doi.org/10.1016/j.procs.2018.05.122
[33] Abdelaziz A, Salama ASA, Riad M, Alia NM. A Machine Learning Model for Predicting of Chronic Kidney Disease Based Internet of Things and Cloud Computing in Smart Cities. Security in Smart Cities: Models, Applications, and Challenges, Lecture Notes in Intelligent Transportation and Infrastructure. Springer 2019.
[34] Yang JG, Kim JK, Kang UG, Lee YH. Coronary heart disease optimization system on adaptive-network based fuzzy inference system and linear discriminant analysis (ANFIS–LDA). Personal and Ubiquitous Computing. Springer 2013.
[35] Layeghian JS, Sepehri MM, Aghajani H. Toward analyzing and synthesizing previous research in early prediction of cardiac arrest using machine learning based on a multi-layered integrative framework. J Biomed Inform 2018; 88: 70-89.
[36] Akbuluta FP, Akan A. A smart wearable system for short-term cardiovascular risk assessment with emotional dynamics. Measurement, Elsevier 2018; 128: 237-46.
[37] Goyal M. Long short-term memory recurrent neural network for stroke prediction. International Conference on Machine Learning and Data Mining in Pattern Recognition, 2018.
[47] Zia Uddin Md. A wearable sensor-based activity prediction system to facilitate edge computing in smart healthcare system. J Parallel Distr Comp 2019; 123: 46-53.
[48] Alemdar H, Can T, Ersoy C. Daily life behaviour monitoring for health assessment using machine learning: Bridging the gap between domains. Pers Ubiquitous Comput 2015; 19: 303-15. http://dx.doi.org/10.1007/s00779-014-0823-y
[49] Serpen G, Rakibul HK. Real-time detection of human falls in progress: Machine learning approach. Procedia Comput Sci 2018; 140: 238-47.
[50] Ghulam M, Mansour A, Umar AS, Ahmed G, Alhamid MF. A facial-expression monitoring system for improved healthcare in smart cities. IEEE Access 2017; 5: 10871-81.
[51] Srividya M, Mohanavalli S, Bhalaji N. Behavioral modeling for mental health using machine learning algorithms. J Med Syst 2018; 42(5): 88.
[52] Abdellatif AA, Emam A, Carla FC, Amr M, Ali J, Rabab W. Edge based compression and classification for smart healthcare systems: Concept, implementation and evaluation. Expert systems with applications. Elsevier 2019; 117: 1-14.
[53] Alhussein M, Muhammad G, Shamim HM, Syed UA. Cognitive IoT-cloud integration for smart healthcare: Case study for epileptic seizure detection and monitoring. Mob Netw Appl 2018; 23: 1624-35.
[54] Sasaki S, Alexis JC, Hiroshi S, Chris B. Using genetic algorithms to optimise current and future health planning - The example of ambulance locations. Int J Health Geogr 2010; 2010: 9.
[55] Kost R, Littenberg B, Chen ES. Exploring generalized association rule mining for disease co-occurrences. AMIA Annu Symp Proc 2012; 2012: 1284-93.
[56] Rashid MA, Hoque MT, Sattar A. Association rules mining based clinical observations, 2014. Available at: https://arxiv.org/ftp/arxiv/papers/1401/1401.2571.pdf
[57] Wang B, Chen D, Shi B, et al. Comprehensive association rules mining of health examination data with an extended fp-growth method mobile network applications. Mob Netw Appl 2017; 22:
and Data Mining in Pattern Recognition, 2018. method mobile network applications. Mob Netw Appl 2017; 22:

l
[39] Moreira MWL, Rodrigues JJPC, Kumar N, Saleem K, Illin IV. 267-74.

a ad
Postpartum depression prediction through pregnancy data analysis [59] Chen H, Engkvist O, Wang Y, Olivecrona M, Blaschke T. The rise

n
for emotion-aware smart systems. Information Fusion, Elsevier of deep learning in drug discovery. Drug Discov Today 2018;

so plo
2019; 2019: 4723-31. 23(6): 1241-50.

r
http://dx.doi.org/10.1016/j.inffus.2018.07.001 [60] Yoon BJ. Hidden markov models and their applications in biologi-

e ru
[40] Asri H, Mousannif H, Al Moatassime H, Noel T. Using machine cal sequence analysis. Curr Genomic 2009; 10(6): 402-15.

p
learning algorithms for breast cancer risk prediction and diagnosis. http://dx.doi.org/10.2174/138920209789177575

r o
Procedia Comput Sci 2016; 83: 1064-9. [61] Reps JM, Aickelin U, Ma J, Zhang Y. Refining adverse drug reac-

o d
http://dx.doi.org/10.1016/j.procs.2016.04.224 tions using association rule mining for electronic healthcare data.

F te
[41] Mathur R, Pathak V, Bandil D. Parkinson disease prediction using 2014 IEEE International Conference on Data Mining Workshop.
machine learning algorithm emerging trends in expert applications Shenzhen, China, 2014.

u
and security, advances in intelligent systems and computing. http://dx.doi.org/10.1109/ICDMW.2014.53

i b
Springer 2019. [62] Pandey MK, Subbiah K. Performance Analysis of Time Series
[42]

t r
Moradi E, Antonietta P, Christian G, Heikki H, Jussi T. Machine

s
Forecasting Using Machine Learning Algorithms for Prediction of

i
learning framework for early MRI-based Alzheimer’s conversion Ebola Casualties. International Conference on Application of Com-

d
prediction in MCI subjects. Neuroimage Elsevier 2015; 104: 398- puting and Communication Technologies, 2018.
412. [63] Najar AM, Irawan MI, Adzkiya D. Extreme learning machine
[43]
e
Manhua L, Danni C, Kundong W, Yaping W. Multi-modality cas-

b
method for dengue hemorrhagic fever outbreak risk level predic-

t
caded convolutional neural networks for Alzheimer’s disease diag- tion. 2018 International Conference on Smart Computing and Elec-

No
nosis. Neuroinformatics 2018; 16: 295-308. tronic Enterprise (ICSCEE). Shah Alam, Malaysia, 2018.
[44] Hao Y, Usama M, Yang JM, Shamim H, Ahmed G. Recurrent http://dx.doi.org/10.1109/ICSCEE.2018.8538409
convolutional neural network based multimodal disease risk predic- [64] Li Z, Luo X, Wang B, Bertozzi AL, Xin J. A Study on Graph-
tion. Future Gener Comput Syst 2019; 92: 76-83. Structured Recurrent Neural Networks and Sparsification with Ap-
[45] Ma F, Radha C, Jing Z, Quenzeng Y, Tong S, Jing G. Dipole: plication to Epidemic Forecasting 2019. Available at
Diagnosis prediction in healthcare via attention-based bidirectional https://arxiv.org/abs/1902.05113
recurrent neural networks KDD’17. Canada: ACM 2017. [65] Babar Z, Mannan A, Kamiran F, Karim A. Understanding the Im-
[46] Chen M, Li W, Hao Y, Qian Y, Humar I. Edge cognitive compu- pact of Socio-Economic and Environmental Factors for Disease
ting based smart healthcare system. Future Gener Comput Syst Outbreak in Developing Countries. IEEE 15th International Confer-
2018; 86: 403-11. ence on Data Mining Workshops. Atlantic City, NJ, USA, 2015.
http://dx.doi.org/10.1016/j.future.2018.03.054 http://dx.doi.org/10.1109/ICDMW.2015.49
[47] Pham M, Mengistu Y, Do H, Sheng W. Delivering home healthcare
through a Cloud-based Smart Home Environment (CoSHE). Future
Generation Computer Systems, Elsevier 2018; 81: 129-40.
