2020 IEEE Region 10 Symposium (TENSYMP), 5-7 June 2020, Dhaka, Bangladesh

Application of Machine Learning on ECG Signal Classification Using Morphological Features

Anika Alim
Dept. of Electrical and Electronic Engineering
Independent University, Bangladesh
Dhaka, Bangladesh

Md. Kafiul Islam
Dept. of Electrical and Electronic Engineering
Independent University, Bangladesh
Dhaka, Bangladesh

Abstract: An electrocardiogram (ECG) is a simple test used to check the heart's electrical activity. Sensors attached to the skin detect the electrical signal produced by the heart each time it beats. Many people around the world suffer from cardiovascular diseases, so it is important to distinguish arrhythmic (abnormal) ECG signals from normal ones accurately. In this paper, ECG signals are classified by a support vector machine (SVM) and a neural network. The research is conducted on normal and arrhythmia ECG datasets obtained from the PhysioNet website. The raw ECG data are preprocessed using different filters, and features are then extracted based on morphological values of the waveform. Twelve features are extracted and used to train classifiers to separate normal from abnormal ECG data. For the SVM classifier the accuracy is around 87%, while for the artificial neural network, trained with MATLAB's pattern recognition app, the classification accuracy is around 90% to 93%. The accuracy varies with the number of hidden neurons. Diverse ECG databases are used to demonstrate the efficacy and robustness of the proposed morphological features, and the results are compared with existing state-of-the-art research works. This work can be developed further by incorporating deep learning for better performance, and it can eventually help to detect cardiac diseases.

Keywords: ECG, morphological features, SVM, ANN, classification.

I. INTRODUCTION
According to the World Health Organization (WHO), cardiovascular diseases are the number one cause of death around the world. Around 17.9 million people die every year from cardiovascular disease, which is 31% of all deaths globally [1]. Of all cardiovascular deaths, most are due to heart attacks and strokes. A large number of heart attacks and strokes can be prevented by early detection and proper treatment. ECG is one of the most commonly used tests to diagnose cardiac diseases. In recent years, many algorithms and methods have been developed for ECG classification, and machine learning has become popular and widely used for classifying biosignals such as ECG. In this work, we classify ECG signals with support vector machine (SVM) and artificial neural network (ANN) algorithms using a morphological feature extraction method. The main aim is to detect normal and abnormal ECG signals accurately and in a less complicated way, using machine learning algorithms along with a feature selection process in MATLAB R2018a.

II. LITERATURE REVIEW
A. Electrocardiogram (ECG)
An electrocardiogram (ECG), also known as an EKG, is a way to visualize the electrical activity of the heart over a period of time using electrodes placed on the skin. These electrodes capture the tiny electrical changes that result from cardiac muscle depolarization followed by repolarization during every cardiac cycle (heartbeat) [2]. The Dutch physician and physiologist Willem Einthoven invented the first practical electrocardiogram [3].

The normal ECG consists of a 'P' wave, a 'QRS' complex, and a 'T' wave. The QRS complex comprises three separate waves: the Q, R and S waves. The P wave is generated when the atria depolarize before contraction. The QRS complex is produced when the ventricles depolarize before contraction, i.e. as the depolarization wave spreads through the ventricles. Hence, both the P wave and the QRS complex are 'depolarization waves'. Finally, the T wave is generated as the ventricles recover from the depolarized state and is called the 'repolarization wave'. Thus, the electrocardiogram consists of both depolarization and repolarization waves [4].

Fig. 1. A normal ECG waveform for one cardiac cycle.

B. Support Vector Machine (SVM)
The support vector machine (SVM) is one of the most popular supervised machine learning algorithms and is very efficient for binary and linear classification problems. It can be used for both classification and regression analysis, although it is mostly used for classification [5]. An SVM sorts data into two categories: it builds its model from a series of labelled two-class training data, and its job is then to determine which class a new data point belongs to.

C. Artificial Neural Network (ANN)
ANNs were originally inspired by biological neural networks and are used to recognize patterns. In practice, an ANN is a multi-layer fully connected neural net whose architecture usually contains an input layer at the left, multiple hidden layers in the middle, and an output layer at the right. Every neuron (i.e. node) in one layer is connected to every neuron in the subsequent layer with a different weight. The input layer receives various types of information

978-1-7281-7366-5/20/$31.00 ©2020 IEEE


from the outside world and produces some output. This is the labelled data that the network intends to process or learn about, based on the weighted connections between its layers of neurons, in its training phase. Once the weights are known, the network calculates its output from the inputs using the same mathematical equation it learned during training [6].

III. MATERIALS AND METHODS
A. Dataset Used
The ECG dataset has been prepared with 50 normal and 50 arrhythmia ECG signals. These data are collected from the PhysioNet website [7], which offers free access to large collections of physiological and clinical data and related open-source software. Four databases are used for this project:
• Fantasia Database [8, 19]
• MIT-BIH Normal Sinus Rhythm Database [8]
• MIT-BIH Arrhythmia Database [8, 20]
• Sudden Cardiac Death Holter Database [8, 21]

B. Proposed Method
The downloaded raw data are not suitable for direct processing in MATLAB R2018a, so the first step is to prepare the dataset. Before feature extraction, the signal is pre-processed to remove unwanted noise. After feature selection, a new dataset is created. This dataset is fed to the SVM and to the neural network to train them. After training, we can classify normal and arrhythmia ECG signals.

Fig. 2. Process flow chart of the method used in this work.

C. Pre-processing
ECG signals are normally mixed with noise and artifacts such as baseline wander (due to muscle movement). For the highest accuracy this unwanted noise needs to be removed, so different types of filters are used for denoising. To remove power line interference, a 2nd order 50 Hz notch filter is used. High-frequency noise is removed using a 128th order 80 Hz FIR low pass filter. As four different databases are used, the low pass filter's order and cut-off frequency are changed according to each database's cut-off frequency.

D. Feature extraction
Feature extraction is very important for classification. For this project we have selected 12 features that help to identify normality and abnormality of an ECG signal. We use a morphological method for feature extraction: the 12 features are extracted from the signal using the algorithm found in [18]. These features are:
• Maximum heart rate: The normal resting heart rate of a healthy person is between 60 and 100 bpm.
• Average heart rate: Normally between 60 and 100 bpm.
• Minimum heart rate: Around 60 to 70 bpm.
• Total number of QRS: The total number of QRS complexes is not fixed; it depends on how long the ECG signal is.
• Number of irregular beats: A normal heart rhythm may contain a few irregular beats, for example from exercising or running. If the percentage of irregular beats is less than 10, it is normal.
• Percentage of irregular beats: Less than 10 for a normal heart rate.
• Number of episodes with consecutive beats: Has to be less than 25 for a normal heart rate.
• Average PR interval: The normal PR interval is between 120 and 200 ms.
• Average QRS interval: QRS duration is between 60 and 100 ms.
• Average QTc interval: QTc refers to the corrected QT interval. An average QTc < 430 ms is considered normal.
• Number of P absence: If the number of absent P waves, multiplied by 100 and divided by the total number of QRS complexes, is less than 8, the heart rate is considered normal.
• Number of consecutive P absence: If the number of consecutive absent P waves, multiplied by 100 and divided by the total number of QRS complexes, is less than 1, the heart rate is normal.

Fig. 3. QRS complex detection and extraction of relevant morphological features (normalized amplitude versus time in seconds, with the R-peaks, R-R interval, P-wave, Q and S points marked).

E. Preparing dataset
A dataset is created from the ECG signals of 100 patients, downloaded from the PhysioNet website. Of these 100 signals, 50 are normal and 50 are abnormal. Each is described by the 12 features, so the new dataset contains 100 ECG signals of different patients with 12 features each.
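The normal ranges quoted in the feature list under Feature extraction can be written down as a small rule table. The sketch below is only an illustration of those stated ranges: it is not the extraction algorithm of [18] and not the classifier used in this work, and every name in it (the dictionary keys, `percentage`, `out_of_range`) is hypothetical.

```python
# Normal ranges as stated in the feature list. All names are hypothetical
# and chosen for illustration; they do not come from [18].
NORMAL_RANGES = {
    "avg_heart_rate_bpm": (60, 100),   # inclusive range
    "avg_pr_interval_ms": (120, 200),
    "avg_qrs_interval_ms": (60, 100),
}

# Features that are described as normal when strictly below a limit.
UPPER_LIMITS = {
    "pct_irregular_beats": 10,
    "episodes_consecutive_beats": 25,
    "avg_qtc_ms": 430,
    "pct_p_absence": 8,
    "pct_consecutive_p_absence": 1,
}


def percentage(count, total_qrs):
    """Return count * 100 / total_qrs, as used for the P-absence features."""
    if total_qrs <= 0:
        raise ValueError("total_qrs must be positive")
    return count * 100 / total_qrs


def out_of_range(features):
    """List the feature names whose values fall outside the stated ranges."""
    flagged = []
    for name, (low, high) in NORMAL_RANGES.items():
        if not low <= features[name] <= high:
            flagged.append(name)
    for name, limit in UPPER_LIMITS.items():
        if not features[name] < limit:
            flagged.append(name)
    return flagged


# Made-up example record (not taken from any PhysioNet signal).
record = {
    "avg_heart_rate_bpm": 72,
    "avg_pr_interval_ms": 160,
    "avg_qrs_interval_ms": 90,
    "pct_irregular_beats": 2.5,
    "episodes_consecutive_beats": 3,
    "avg_qtc_ms": 410,
    "pct_p_absence": percentage(3, 400),
    "pct_consecutive_p_absence": percentage(0, 400),
}
print(out_of_range(record))  # prints []
```

A rule table like this only screens individual values; in the paper the 12 features are instead passed to trained classifiers, which can weigh them jointly.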
F. Training SVM
SVM is used as a classification method. After the dataset is prepared, it is fed to the SVM for training. The data are randomized and then divided into training data (80%) and test data (20%). We use a linear kernel function to map the training data into kernel space. MATLAB R2018a is used to train the SVM.

G. Training ANN
We use a two-layer feed-forward network with sigmoid hidden neurons and softmax output neurons, built with MATLAB's neural network pattern recognition app. The data are fed to the app as input data to present to the network and target data defining the desired network output. The app then randomly divides the 100 samples into a 70% training set, a 15% validation set and a 15% testing set. We then set different numbers of hidden neurons to get more accurate classification and train the network.

IV. RESULTS AND DISCUSSION
As the ECG signal is classified using two different classifiers, the results are discussed in two separate sections.

A. SVM classifier result
We have used a linear kernel function for the classification. The data are divided into a training set and a test set in MATLAB R2018a. The accuracy is around 87%, and it changes every time we retrain the classifier.

Fig. 4. Scatter diagram of the training and test sets for the SVM.

In this classification only 12 features are used to train the classifier. The existing studies on SVM are numerous and report higher accuracy than this project, but only a few of them have used these 12 features and four types of database. Table I provides the comparison with other studies.

TABLE I. COMPARISON WITH RELATED WORKS ON SVM
Related works | Database | Method | Performance
Ref [9] | MIT-BIH arrhythmia | Two different feature extraction methods: the wavelet transform and autoregressive (AR) modeling | 99.68% accuracy
Ref [10] | MIT-BIH arrhythmia | Discrete wavelet transform (DWT) and principal component analysis (PCA) | 99.6367% with LIBSVM
Ref [11] | MIT-BIH arrhythmia | DWT based feature extraction; the R-peaks are detected to determine the HRV signal features | 96% accuracy
Ref [12] | MIT-BIH arrhythmia | Two kinds of features: 1) ECG morphology features and 2) ECG wavelet features with QRS width | 97% accuracy for dataset 1, 91% for dataset 2
Ref [13] | INCART 12-lead arrhythmia | Principal Component Analysis (PCA) feature extraction | 76.83% and 98.33% accuracy for the MSVM and SIMCA classifiers respectively
Ref [14] | MIT-BIH arrhythmia | Beat classification and episode detection and classification | 95% and 94% accuracy
Ref [15] | MIT-BIH arrhythmia | Feature extraction includes frequency information, RR intervals, QRS morphology and AC power of QRS detail coefficients | 97.2% accuracy
Ref [16] | MIT-BIH arrhythmia | Morphology feature extraction (Wavelet-SVM method) | 97.59% accuracy
This work | Fantasia, MIT-BIH normal sinus rhythm, MIT-BIH arrhythmia, Sudden Cardiac Death Holter | Morphology feature extraction method (12 features) | 87% accuracy

Most of the previous studies have used two feature extraction methods, which makes detection more accurate. We have used only one feature extraction method and selected only 12 features. For this reason the process becomes simpler, although the accuracy is not as high. The accuracy can be improved if we find the most dominant features and train the classifier with those features.
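The randomized 80%/20% split described under Training SVM can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB code used in this work; the function name `shuffled_split` and the seeds are assumptions. With 100 signals the test set holds 20 of them, so each misclassified test signal shifts the test accuracy by 5 percentage points, and a different shuffle evaluates a different 20 signals. That may be one reason the reported accuracy changes between retraining runs.

```python
import random


def shuffled_split(n_samples, train_fraction, seed):
    """Shuffle sample indices and split them into train and test lists.

    Hypothetical helper for illustration; the seed makes a run repeatable.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# 100 signals, 80% for training and 20% for testing, as in the text.
train_idx, test_idx = shuffled_split(100, 0.8, seed=0)

# Smallest possible change in test accuracy, in percentage points.
accuracy_step = 100 / len(test_idx)
print(len(train_idx), len(test_idx), accuracy_step)  # prints 80 20 5.0

# A different seed gives a different shuffle, hence a different test set.
_, other_test_idx = shuffled_split(100, 0.8, seed=1)
print(sorted(test_idx)[:5], sorted(other_test_idx)[:5])
```

Fixing the seed, or averaging the accuracy over several shuffles, would make such results easier to compare between runs.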
B. ANN classifier result
The overall accuracy of the neural network is 93% with 24 hidden neurons. The accuracy and performance change with the number of hidden neurons. The whole dataset is divided into training, validation and test sets: the training set is 70% of the whole dataset, and the validation and test sets are 15% each.
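The network described under Training ANN (sigmoid hidden neurons, softmax output neurons) computes its output with one forward pass. The sketch below shows that computation for 12 input features, 24 hidden neurons and 2 output classes. It is an assumption-laden illustration, not MATLAB's implementation: the weights are random and untrained, the input vector is made up, and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic activation used by the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Turn output scores into class probabilities that sum to 1."""
    largest = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: inputs -> sigmoid hidden layer -> softmax outputs."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    scores = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    return softmax(scores)


def random_layer(n_out, n_in, rng):
    """Random (untrained) weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(0)
N_FEATURES, N_HIDDEN, N_CLASSES = 12, 24, 2
w_h, b_h = random_layer(N_HIDDEN, N_FEATURES, rng)
w_o, b_o = random_layer(N_CLASSES, N_HIDDEN, rng)

# Made-up, already normalized feature vector for one ECG signal.
probs = forward([0.5] * N_FEATURES, w_h, b_h, w_o, b_o)
print(probs)  # two probabilities (normal, abnormal) that sum to 1
```

Changing `N_HIDDEN` changes only the size of the two weight matrices, which is the setting varied in the experiments reported here.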

Fig. 5. Confusion matrix of the neural network for 24 hidden neurons.
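A confusion matrix of this kind reports the share of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN), and the overall accuracy is the TP and TN shares taken together. The sketch below recomputes the accuracy from the "All" column for 22 hidden neurons in Table II (TP=45%, TN=47%, FP=5%, FN=3%). Sensitivity and specificity are not reported in this paper; they are included only as a further illustration, and the function name is hypothetical.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute summary metrics from confusion-matrix counts or percentages."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of diseased subjects detected
        "specificity": tn / (tn + fp),  # share of healthy subjects cleared
    }


# Values from Table II, 22 hidden neurons, "All" column (in percent).
metrics = confusion_metrics(tp=45, tn=47, fp=5, fn=3)
print(round(metrics["accuracy"] * 100, 1))  # prints 92.0
print(round(metrics["sensitivity"], 4))     # prints 0.9375
```

The recomputed 92% matches the accuracy listed for that table entry.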


From the confusion matrix we get the values of true positive (TP), true negative (TN), false positive (FP), false negative (FN) and total accuracy. These terms are defined as follows [17]:
• True positive (TP): The prediction is positive, meaning the subject has the disease, and the prediction is true.
• True negative (TN): The prediction is negative, meaning the subject does not have the disease, and the prediction is true.
• False positive (FP): The prediction is positive but the subject doesn't actually have the disease (aka "Type-I error").
• False negative (FN): The prediction is negative but the subject actually has the disease (aka "Type-II error").

The performance of the neural network changes with the number of hidden neurons. The changes are given below.

TABLE II. CLASSIFICATION PERFORMANCE OF NEURAL NETWORK (PERCEPTRON) FOR DIFFERENT HIDDEN NEURONS
Hidden neurons | Subset | TP | TN | FP | FN | Accuracy
10 | Training | 34.3% | 42.4% | 18.6% | 5.7% | 75.7%
10 | Validation | 33.3% | 53.3% | 0% | 13.3% | 86.7%
10 | Test | 33.3% | 26.7% | 20% | 20% | 60%
10 | All | 34% | 41% | 16% | 9% | 75%
14 | Training | 42.9% | 44.3% | 7.1% | 5.7% | 87.1%
14 | Validation | 46.7% | 33.3% | 6.7% | 13.3% | 80%
14 | Test | 40% | 46.7% | 6.7% | 6.7% | 86.7%
14 | All | 43% | 43% | 7% | 7% | 75.7%
22 | Training | 40% | 50% | 7.1% | 2.9% | 90%
22 | Validation | 60% | 33.3% | 0% | 6.7% | 93.3%
22 | Test | 53.3% | 46.7% | 0% | 0% | 100%
22 | All | 45% | 47% | 5% | 3% | 92%
28 | Training | 12.9% | 45.7% | 40% | 1.4% | 58.6%
28 | Validation | 6.7% | 66.7% | 26.7% | 0% | 73.3%
28 | Test | 20% | 46.7% | 33.3% | 0% | 66.7%
28 | All | 49% | 13% | 37% | 1% | 62%

For 22 hidden neurons the accuracy is 92% and for 24 the accuracy is 93%. So for 21 to 25 hidden neurons the accuracy is 90% to 93%. Below 21 or above 25 hidden neurons the accuracy declines.

V. CONCLUSION
This paper proposed two different machine learning algorithms, SVM and ANN, for detecting normal and arrhythmia ECG signals using morphological features. The study is conducted on four different databases with 100 subjects downloaded from PhysioNet. The average accuracy found with SVM and ANN is around 87% and 93% (using 24 hidden neurons) respectively. The accuracy is good but can be further improved, and the proposed system may be developed for practical use. Finding the most dominant features (i.e. feature selection) to train the classifier may improve the accuracy. We can also try to classify using advanced neural networks such as a deep neural network (DNN) or a convolutional neural network (CNN). With a CNN we can directly use the raw signal for classification, which may reduce the computational complexity.

ACKNOWLEDGMENT
We acknowledge the support from the Dept. of EEE of Independent University, Bangladesh.

REFERENCES
[1] https://www.who.int/health-topics/cardiovascular-diseases/#tab=tab_1 (accessed on 9 February, 2020)
[2] S. T. Alam, M. M. Hossain, M. D. Rahman, M. K. Islam, "Towards Development of a Low Cost and Portable ECG Monitoring System for Rural/Remote Areas of Bangladesh", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 10, No. 5, pp. 24-32, 2018. DOI: 10.5815/ijigsp.2018.05.03
[3] https://en.wikipedia.org/wiki/Willem_Einthoven (accessed on 9 February, 2020)
[4] S Pithika, "Electrocardiogram (ECG): Definition and Characteristics | Veterinary Pharmacology," [online document], Available: http://www.biologydiscussion.com/veterinary-pharmacology/electrocardiogram-ecg-definition-and-characteristics-veterinary-pharmacology/74308 (accessed on December 17, 2019)
[5] S. Ray, "Understanding Support Vector Machine algorithm from examples (along with code)" 6 October, 2015. [Online document], https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/ (accessed on 20 December, 2019)
[6] B. Marr, "What Are Artificial Neural Networks - A Simple Explanation For Absolutely Anyone," 24 September, 2018, Available: https://www.forbes.com/ (accessed on December 19, 2019)
[7] https://physionet.org/ (accessed on 20 December, 2019)
[8] Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals (2003). Circulation. 101(23):e215-e220.
[9] Q. Zhao and L. Zhang, "ECG feature extraction and classification using wavelet transform and support vector machines," IEEE International Conference on Neural Networks and Brain (ICNN&B), vol. 2, pp. 1089-1092 (2015).
[10] D. Thanapatay, C. Suwansaroj and C. Thanawattano, "ECG beat classification method for ECG printout with Principle Components Analysis and Support Vector Machines", International Conference on Electronics and Information Engineering, vol. 1, pp. 72-75 (2010).
[11] C. Venkatesan, P. Karthigaikumar, A. Paul, S. Satheeskumaran, and R. Kumar, "ECG signal preprocessing and SVM classifier-based abnormality detection in remote healthcare applications," IEEE Access, vol. 6, pp. 9767-9773, (2018).
[12] Y. Bazi, N. Alajlan, H. AlHichri and S. Malek, "Domain adaptation methods for ECG classification", in: IEEE International Conference on Computer Medical Application (ICCMA), January 2013, pp. 1-4.
[13] N. Jannah and S. Hadjiloucas, "Comparison Between ECG Beat Classifiers Using Multiclass SVM and SIMCA with Time Domain PCA Feature Reduction", UKSim-AMSS 19th International Conference on Modeling & Simulation, pp. 126-131, (2017).
[14] M.G. Tsipouras, D.I. Fotiadis and D. Sideris, "Arrhythmia classification system based on the RR-interval signal", Artificial Intelligence in Medicine, vol. 33, pp. 237-250, (2005).
[15] Z. Zidelmala, A. Amirou, D. O. Abdeslam and J. Merckle, "ECG beat classification using a cost sensitive classifier", vol. 111 (3), pp. 570-577, (2013).
[16] M.K. Gautam, "Morphological Analysis of ECG Arrhythmias using SVM Methodology", International Journal of Computer Science and Information Security (IJCSIS), Vol. 15, No. 5, pp. 421-427, (May 2017)
[17] https://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/ (accessed on 27 December, 2019)
[18] https://github.com/redxlab/cardio24 (accessed on 15 March, 2020)
[19] Iyengar N, Peng C-K, Morin R, Goldberger AL, Lipsitz LA. Age-related alterations in the fractal scaling of cardiac interbeat interval dynamics. Am J Physiol 1996;271:1078-1084.
[20] Moody GB, Mark RG. The impact of the MIT-BIH Arrhythmia Database. IEEE Eng in Med and Biol 20(3):45-50 (May-June 2001). (PMID: 11446209)
[21] Greenwald SD. Development and analysis of a ventricular fibrillation detector. M.S. thesis, MIT Dept. of Electrical Engineering and Computer Science, 1986.
