1015 Final
Abstract— An electrocardiogram (ECG) is a simple test used to check the heart’s electrical activity. Sensors attached to the skin detect the electrical signal produced by the heart each time it beats. Many people around the world suffer from cardiovascular diseases, so it is important to distinguish arrhythmic (abnormal) ECG signals from normal ones accurately. In this paper, ECG signals are classified by a support vector machine (SVM) and a neural network. The research is conducted on normal and arrhythmia ECG datasets obtained from the PhysioNet website. The raw ECG data are preprocessed using different filters, and features are then extracted based on morphological values of the waveform. Twelve features are extracted and used to train classifiers to separate normal from abnormal ECG data. For the SVM classifier, the accuracy is around 87%, while for the artificial neural network, trained with MATLAB’s pattern recognition app, the classification accuracy is around 90% to 93%. The accuracy varies with the number of hidden neurons. Diverse ECG databases are used to demonstrate the efficacy and robustness of the proposed morphological features, and the results are compared with existing state-of-the-art research works. This work can be developed further by incorporating deep learning for better performance, and it can eventually help to detect cardiac diseases.

I. INTRODUCTION
According to the World Health Organization (WHO), cardiovascular diseases are the number one cause of death worldwide. Around 17.9 million people die every year from cardiovascular disease, which is 31% of all deaths globally [1]. Of all cardiovascular diseases, most deaths are due to heart attacks and strokes. A large number of heart attacks and strokes can be prevented by early detection and proper treatment. ECG is one of the most commonly used tests to diagnose cardiac diseases. In recent years, many algorithms and methods have been developed for ECG classification. Machine learning has become very popular and is widely used to classify biosignals such as ECG. In this work, we classify ECG signals with support vector machine (SVM) and artificial neural network (ANN) algorithms, together with a morphological feature extraction method. The main aim is to detect normal and abnormal ECG signals accurately and in a less complicated way, using the machine learning algorithms along with the feature selection process in MATLAB R2018a.

II. LITERATURE REVIEW
A. Electrocardiogram (ECG)
The electrocardiogram (ECG) is also known as EKG. It is a way to visualize the electrical activity of the heart over a period of time using electrodes placed on the skin. These electrodes capture the tiny electrical changes that result from cardiac muscle depolarization followed by repolarization during every cardiac cycle (heartbeat) [2]. The Dutch physician and physiologist Willem Einthoven invented the first practical electrocardiogram [3].

The normal ECG consists of a ‘P’ wave, a ‘QRS’ complex, and a ‘T’ wave. The QRS complex comprises three separate waves: the Q, R and S waves. The P wave is generated when the atria depolarize before contraction. The QRS complex is produced when the ventricles depolarize before contraction, i.e. as the depolarization wave spreads through the ventricles. Hence, both the P wave and the QRS complex are ‘depolarization waves’. Finally, the T wave is generated as the ventricles recover from the depolarized state and is called the ‘repolarization wave’. Thus, the electrocardiogram consists of both depolarization and repolarization waves [4].

Fig. 1. A normal ECG waveform for one cardiac cycle.

B. Support Vector Machine (SVM)
Support Vector Machine (SVM) is one of the most popular supervised machine learning algorithms and is very efficient for binary and linear classification problems. It can be used for both classification and regression analysis, but it is mostly utilized in classification problems [5]. An SVM observes data and sorts them into two categories. It is trained on a series of labelled two-class data to build the model. The job of an SVM algorithm is then to find out which class a new data point belongs to.

C. Artificial Neural Network (ANN)
ANN was originally inspired by biological neural networks and is used to recognize patterns. In practice, an ANN is a multi-layer fully-connected neural net whose architecture usually contains an input layer at the left, multiple hidden layers in the middle, and an output layer at the right. Every neuron (i.e. node) in one layer is connected to every neuron in the subsequent layer with different weights. The input layer receives various types of information
Fig. 2. Process flow chart of the method used in this work.

C. Pre-processing
ECG signals are normally mixed with noise and artifacts such as baseline wander (due to muscle movement). For the highest accuracy, this unwanted noise needs to be removed, so different types of filters are used for denoising. To remove power line interference, a 2nd-order 50 Hz notch filter is used. High-frequency content is removed using a 128th-order FIR low-pass filter (LPF) with an 80 Hz cutoff. As we use four different databases, the LPF’s order and cutoff frequency are changed according to the database’s cutoff frequency.

Fig. 3. QRS complex detection and extraction of relevant morphological features. (Plot of normalized amplitude against time in seconds, with the R-peaks, the R-R interval, the P-wave, and the Q and S points marked.)

E. Preparing dataset
A dataset is created from the ECG signals of 100 patients, downloaded from the PhysioNet website. Of these 100 signals, 50 are normal and 50 are abnormal. Each signal is described by the 12 features, so the new dataset contains 100 ECG signals from different patients with 12 features each.
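The filtering stage described under Pre-processing can be sketched as follows. This is a minimal illustration in plain Python, not the MATLAB code used in this work. The sampling rate `FS = 360.0`, the notch quality factor `q = 30`, the Hamming window, and all function names are assumptions made for the example. The notch uses the common biquad (audio-EQ-cookbook) form and the low-pass filter uses a windowed-sinc design, which may differ from the designs actually used.

```python
import cmath
import math


def notch_biquad(f0, fs, q=30.0):
    """2nd-order IIR notch at f0 Hz. Returns (b, a) with a[0] == 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)  # q is an assumed quality factor
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def fir_lowpass(order, fc, fs):
    """Windowed-sinc (Hamming) low-pass FIR with order + 1 taps, unity DC gain."""
    x = 2.0 * fc / fs  # cutoff as a fraction of the Nyquist frequency
    taps = []
    for n in range(order + 1):
        k = n - order / 2.0
        ideal = x if k == 0 else math.sin(math.pi * x * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / order)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def lfilter(b, a, signal):
    """Apply the difference equation sample by sample; a[0] is assumed to be 1."""
    out = []
    for n in range(len(signal)):
        acc = sum(b[i] * signal[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[j] * out[n - j] for j in range(1, len(a)) if n - j >= 0)
        out.append(acc)
    return out


def gain(b, a, f, fs):
    """Magnitude of the filter's frequency response at f Hz."""
    z = cmath.exp(-1j * 2.0 * math.pi * f / fs)
    num = sum(c * z ** i for i, c in enumerate(b))
    den = sum(c * z ** i for i, c in enumerate(a))
    return abs(num / den)


FS = 360.0  # assumed sampling rate in Hz; it differs between databases
notch_b, notch_a = notch_biquad(50.0, FS)
lpf_taps = fir_lowpass(128, 80.0, FS)


def denoise(signal):
    """Remove 50 Hz interference, then low-pass the result at 80 Hz."""
    return lfilter(lpf_taps, [1.0], lfilter(notch_b, notch_a, signal))
```

Because the cutoff is expressed relative to the sampling rate, the same two design functions can be called again with a different `fs`, `order` or `fc` when a database is sampled at another rate.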
F. Training SVM
SVM is used as a classification method. After the dataset is prepared, it is fed to the SVM for training. The data are randomized and then divided into training data and test data: 80% of the data is used for training and 20% for testing. We use a linear kernel function to map the training data into kernel space. MATLAB R2018a is used to train the SVM.

G. Training ANN
We use a two-layer feed-forward network with sigmoid hidden neurons and softmax output neurons. MATLAB’s neural network pattern recognition app is utilized for it. Data are fed to the app as input data to present to the network and target data defining the desired network output. The app then randomly divides the 100 samples into a 70% training set, a 15% validation set and a 15% testing set. We then set different numbers of hidden neurons to get a more accurate classification and train the network.

IV. RESULTS AND DISCUSSION
As the ECG signal is classified using two different classifiers, the results are discussed in two separate sections.

A. SVM classifier result
We have used a linear kernel function for the classification. The data are divided into a training set and a test set in MATLAB R2018a. The accuracy is around 87%, and it changes every time we retrain the classifier.

Fig. 4. Scatter diagram of training and test set during SVM

TABLE I. COMPARISON WITH RELATED WORKS ON SVM
Related works | Database | Method | Performance
Ref [9] | MIT-BIH Arrhythmia | Two feature extraction methods: the wavelet transform and autoregressive (AR) modeling | 99.68% accuracy
Ref [10] | MIT-BIH arrhythmia | Discrete wavelet transform (DWT) and principal component analysis (PCA) | Performance is 99.6367% with LIBSVM
Ref [11] | MIT-BIH arrhythmia | DWT-based feature extraction; the R-peaks are detected to determine the HRV signal features | 96% accuracy
Ref [12] | MIT-BIH arrhythmia | Two kinds of features: 1) ECG morphology features and 2) ECG wavelet features with QRS width | Accuracy: 97% for dataset 1, 91% for dataset 2
Ref [13] | INCART 12-lead arrhythmia | Principal Component Analysis (PCA) feature extraction | The accuracy is 76.83% and 98.33% for MSVM and SIMCA classifier respectively
Ref [14] | MIT-BIH arrhythmia | Beat classification and episode detection and classification | 95% and 94% accuracy
Ref [15] | MIT-BIH arrhythmia | Feature extraction includes frequency information, RR intervals, QRS morphology and AC power of QRS detail coefficients | Accuracy is 97.2%
Ref [16] | MIT-BIH arrhythmia | Morphology feature extraction (Wavelet-SVM method) | Accuracy is 97.59%
This work | Fantasia, MIT-BIH normal sinus rhythm, MIT-BIH arrhythmia, Sudden Cardiac Death Holter | Morphology feature extraction method (12 features) | 87% accuracy
In this classification only 12 features are used to train the classifier. Although the existing studies on SVM are extensive and report higher accuracy than this project, only a few of them have used these 12 features and four types of database. Table I provides the comparison with other studies.
Most of the previous studies have used two feature extraction methods, which makes the detection more accurate. We have used only one feature extraction method and selected only 12 features. As a result, the process becomes simpler, although the accuracy is not as high. The accuracy can be improved if we find the most dominant features and train the classifier with those features.
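The SVM training procedure (randomize, split 80/20, fit a linear classifier) can be sketched as below. This is an illustrative stand-in, not the MATLAB R2018a routine used in this work. It fits a linear SVM by stochastic subgradient descent on the regularized hinge loss, which is a different solver from the one a toolbox would use. The dataset is synthetic placeholder data with the same shape (100 samples, 12 features). The learning rate, regularization strength, epoch count and all function names are assumptions.

```python
import random


def make_placeholder_dataset(n_per_class=50, n_features=12, seed=0):
    """Synthetic stand-in for the 100 x 12 feature table (NOT real ECG features)."""
    rng = random.Random(seed)
    data = []
    for label in (+1, -1):  # +1 = normal, -1 = abnormal (arbitrary coding)
        for _ in range(n_per_class):
            x = [rng.gauss(label * 1.0, 0.3) for _ in range(n_features)]
            data.append((x, label))
    return data


def shuffle_split(data, train_fraction=0.8, seed=0):
    """Randomize the samples, then cut them into training and test parts."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def train_linear_svm(train, lr=0.01, lam=0.01, epochs=50):
    """Subgradient descent on the L2-regularized hinge loss (assumed settings)."""
    w = [0.0] * len(train[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in train:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:
                # Sample is inside the margin: pull the boundary away from it.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Sample is fine: only the regularizer shrinks the weights.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def accuracy(w, b, samples):
    return sum(predict(w, b, x) == y for x, y in samples) / len(samples)


dataset = make_placeholder_dataset()
train_set, test_set = shuffle_split(dataset)
w, b = train_linear_svm(train_set)
test_accuracy = accuracy(w, b, test_set)
```

With 100 samples, the 20% test set holds only 20 signals, so each misclassified signal moves the test accuracy by 5 percentage points. A different random split can therefore change the reported figure noticeably between retraining runs.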
B. ANN classifier result
The overall accuracy of the neural network is 93% with 24 hidden neurons. The accuracy and performance change with the number of hidden neurons. The whole dataset is divided into training, validation and test sets: the training set is 70% of the whole dataset, and the validation and test sets are 15% each.
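The network structure and data division described for the ANN can be sketched as follows. This is a minimal forward-pass illustration in plain Python, not the MATLAB pattern recognition app. It assumes 12 inputs, 24 hidden neurons and 2 output classes. The logistic form of the sigmoid, the random untrained weights and all function names are assumptions, and the app's own sigmoid-shaped hidden function may differ.

```python
import math
import random


def split_indices(n, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition range(n) into training, validation and test indices."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    m = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random placeholder weights; a real network would learn these in training."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, hidden, output):
    """Sigmoid hidden layer followed by a softmax output layer."""
    w1, b1 = hidden
    w2, b2 = output
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return softmax([sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w2, b2)])


rng = random.Random(0)
hidden_layer = init_layer(12, 24, rng)   # 12 features -> 24 hidden neurons
output_layer = init_layer(24, 2, rng)    # 24 hidden neurons -> 2 classes
probs = forward([0.1] * 12, hidden_layer, output_layer)
train_idx, val_idx, test_idx = split_indices(100)
```

The softmax output is a pair of class probabilities that sum to one, and the predicted class is the one with the larger probability. Changing the second argument of the first `init_layer` call (and the first argument of the second) corresponds to trying a different number of hidden neurons.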