2015 ACPR Heart Sounds

This conference paper presents a novel approach for classifying heart sounds using Discrete and Continuous Wavelet Transforms combined with Random Forests. The study highlights the limitations of human auscultation in detecting cardiac murmurs and proposes a method that improves classification accuracy by utilizing wavelet analysis to extract relevant features. The results indicate that the integrated approach reduces diagnostic errors and has potential applications in low-resource settings for cardiovascular disease detection.

Classification of heart sounds using discrete and continuous wavelet transform and random forests

Conference Paper · November 2015


DOI: 10.1109/ACPR.2015.7486584

All content following this page was uploaded by Prospero C. Naval on 05 March 2020.



Classification of Heart Sounds using Discrete and
Continuous Wavelet Transform and Random Forests

Christine C. Balili, Ma. Caryssa C. Sobrepeña, and Prospero C. Naval, Jr.


Computer Vision & Machine Intelligence Group
Department of Computer Science
College of Engineering
University of the Philippines-Diliman

ABSTRACT

Cardiac auscultation is a non-invasive and low-cost method for cardiovascular disease diagnosis through which a physician is able to identify heart sounds that could be indicative of cardiac pathologies. Studies have revealed that the ability of primary care physicians to accurately interpret heart sounds is rather poor due to the relative insensitivity of the human ear to the low frequency components present in these sounds. In this study, we propose an integrated approach to heart murmur classification using wavelet analysis and Random Forests. The segmentation step, which involves the detection of S1 and S2 heart sounds based on Shannon energy, produced lower errors in comparison with previous works using the same dataset. A Random Forest was used to classify the heart sounds into Normal, Murmur, Extrasystole, and Artifact. Time- and frequency-based features derived from Discrete and Continuous Wavelet Transforms were used as feature vectors to train and test the classifier. In comparison to previously published works using the same dataset, the total precision for the approach presented was higher for the noisy samples and within range of the findings from existing literature for the less noisy samples.

1. INTRODUCTION

Cardiovascular diseases (CVDs) remain the leading cause of death globally. The World Health Organization reported that an estimated 17.3 million people died from CVDs in 2008, making up 30% of all global deaths. Over 80% of the mortalities from CVDs occur in low and middle income countries because of the greater exposure of the population to risk factors. The WHO also pointed out that these countries often lack the benefit of prevention programmes and have less access to effective health care services that include early detection services [1].

Early detection and diagnosis are crucial in mitigating cardiovascular diseases and curbing the mortality caused by them. With the advancement of technology in medicine, a number of diagnostic methods such as electrocardiography (ECG), echocardiography, and ultrasound have helped in understanding the nature of CVDs. These techniques, despite providing more direct and accurate evidence of heart disease than auscultation, are costly, bulky, and operationally complex. Cardiac auscultation remains a basic, non-invasive, and cost-effective method in cardiac examination. Traditionally, this process is done using a stethoscope, which enables the physician to detect pathologic heart sounds [2].

A normal heart sound is composed of the first heart sound (S1) and the second heart sound (S2). A complete heart sound can be segmented into two periods: systole and diastole. The period between S1 and S2 is called systole, while the period from S2 to the next S1 marks diastole. During auscultation, a physician may detect cardiac murmurs within these periods. Cardiac murmurs are caused by audible vibrations brought about by increased turbulence from accelerated blood flow in vessels of the heart, flow through a narrow or dilated vessel or chamber, backward flow through an incompetent valve, or septal defects. Cardiac murmurs vary in intensity. They may sound like a whooshing or swishing, blowing, or clicking noise [3].

Cardiac murmurs may be indicative of an underlying CVD. The challenge for physicians is to identify whether a cardiac murmur is innocent or pathological. The accuracy of auscultation in diagnosing CVDs depends heavily on the auscultation proficiency of the physician. This often results in inaccurate and inconsistent auscultation findings. Consequently, patients are subjected to further examinations which entail greater costs. Furthermore, in low and middle income countries where access to proper health care is limited, specialized equipment and facilities for further cardiac examinations are often unavailable, especially in rural areas.

With the development of a system that aids physicians in the assessment of cardiac murmurs, we can reduce the cost of diagnosis and improve its accuracy. A prerequisite of this system is a good classification method that is robust enough to handle noisy data. The analysis of heart sounds using digital signal processing techniques remains a challenge because of their inherent complexity as non-stationary signals. Recognizing the limitations of solving this problem using frequency domain analysis, previous studies have used wavelet-based approaches in the study of heart sound signals.

Wavelet transforms are time-frequency representation methods that are suitable for analyzing non-stationary signals. The capability to provide both time and frequency information simultaneously in varying time and frequency resolutions (multiresolution analysis) is a notable advantage over fixed-resolution time-frequency analysis methods (e.g. Short-Time Fourier Transform). Two of the most frequently used wavelet transforms are the discrete and continuous wavelet transforms.

The Discrete Wavelet Transform was used in the segmentation of heart sounds to produce intensity envelopes of approximations of the original heart sound signal. The said algorithm has shown over 93% accuracy [4]. This segmentation framework later became the basis for feature extraction and classification of heart sounds. The features were obtained from both the original and selected subband signals derived from the decomposition and reconstruction process. The feature vector fed into an artificial neural network resulted in 74.4% accuracy for classifying physiological and pathological murmurs [9]. An assessment of candidate wavelets for heart signal analysis, based on the error calculated between the original and the reconstructed signal, found the discrete wavelet transform (DWT) more suitable for filtering cardiac murmurs without distorting S1 and S2 too much [7].

The Continuous Wavelet Transform (CWT) was used as the basis for the classification of innocent and organic cardiac murmurs. The matrix derived from the CWT was processed using Singular Value and QR Decomposition. From the resulting matrices, Shannon entropy and Gini index were utilised to extract the features. The feature vector was reduced through sequential forward floating selection before classification via classification and regression trees. The classification recorded an accuracy of 90% [10].

In the current study, we aim to develop a novel approach to the classification of heart sounds by extracting features using both Discrete and Continuous Wavelet Transform and feeding them to a Random Forest classifier. The approach is tested on a publicly available heart sound classification dataset [8].

2. METHODS

2.1 The Dataset

The heart sound data were obtained from the 2011 PASCAL Classifying Heart Sounds Challenge [8]. The data are grouped into two sets, A and B, which were gathered respectively from (a) the general public via the iStethoscope Pro iPhone app and (b) a clinical trial in hospitals using the DigiScope electronic stethoscope. Dataset A is divided into four categories: Normal (31 samples), Murmur (34), Extra Heart Sound (19), and Artifact (40). Dataset B has three categories: Normal (319), Murmur (93), and Extrasystole (46). The samples vary in length.

2.2 The Segmentation Procedure

The segmentation procedure proposed in [4] was adapted. To make this procedure more robust, we integrated the peak conditioning process from [5]. We implemented the procedure in MATLAB. Discrete wavelet decomposition and reconstruction was used as a preprocessing step. We experimented with different combinations of the detail and approximation coefficients obtained from this step to arrive at the one with the highest accuracy, and tested the chosen combination on datasets A and B. The whole segmentation procedure is represented by the first four steps in Figure 1.

Figure 1: Block Diagram of the Classification Process

2.2.1 Wavelet Decomposition and Reconstruction

A discrete wavelet decomposition enables us to conduct a multi-resolution analysis of a signal. Using this method, we are able to examine heart sound signals in different frequency bands. A fifth-level discrete wavelet decomposition of the original signal using an order six Daubechies filter was done to obtain the approximation and detail coefficients of the decomposition.

Figure 2: The structure of a fifth-level decomposition

The signal can be reconstructed using these coefficients such that:

$$S = d_1 + d_2 + d_3 + d_4 + d_5 + a_5$$

In this study, the fifth-level approximation coefficient and the second-level, third-level and fifth-level detail coefficients were used to reconstruct the signal.

2.2.2 Energy Calculation

Shannon energy is the primary feature used in the segmentation procedure. The reconstructed signal is split into 20 ms frames with 10 ms frame overlap. The average Shannon energy for each frame is calculated using the following formula [4]:

$$E_S = -\frac{1}{N} \sum_{i=1}^{N} x_{norm}^2(i) \, \log x_{norm}^2(i)$$

where $x_{norm}$ is the signal sample normalized to the maximum value of the reconstructed signal and $N$ is the number of samples in the current frame. The computed Shannon energy per frame is then normalized as follows:

$$E_{S,norm}(f) = \frac{E_S(f) - M(E_S)}{S(E_S)}$$

where $M(E_S)$ is the mean of $E_S$ and $S(E_S)$ is the standard deviation of $E_S$.

2.2.3 Thresholding and Peak Conditioning

Using the calculated normalized Shannon energy, peaks whose energies exceed the higher of two thresholds are selected as the possible S1 and S2. The best threshold values were determined using a Genetic Algorithm. The higher threshold used in this study is low enough to capture the significant peaks, but as a result it may also capture peaks that correspond to neither S1 nor S2. Extra peaks captured were discarded and missed peaks were detected using the following rules:

1. if the distance between two peaks is less than half of the average distance between all the peaks, the lower peak is discarded; and

2. if the distance between two peaks is greater than three-halves of the average distance between all the peaks, the relative maximum between the two peaks is also marked as a peak; this maximum, however, must also exceed a lower threshold.

2.2.4 Identification of S1 and S2

After all the peaks have been marked, the locations of S1 and S2 are decided using the following facts:

1. the diastole period (from S2 to S1) is generally longer than the systole period (from S1 to S2) and

2. the systole period is generally more constant than the diastole period.

The mean and standard deviation of the distance between peaks are considered in marking the locations of S1 and S2. The longest interval between two peaks is located first. This interval is automatically a diastole period, and its end points are marked as S2 and S1 respectively. Using this interval as a point of reference, the rest of the peaks are labelled S1 or S2.

2.3 Wavelet Feature Extraction

Each heart sound signal was segmented into four parts (S1, Systole, S2, and Diastole). We extracted features using Discrete and Continuous Wavelet Transform per component. Following [11], every reconstructed signal was used to calculate the standard deviation of the duration of all the heart cycles it contains, the standard deviation of the S1 and S2 peak values for all heart cycles, and the heart rate. A mean signal was calculated for each of the four components by averaging the values of each component. The S1 and S2 mean signals were each divided into 8 equal parts and the Shannon energy of each part was determined. The systole and diastole mean signals were divided into 24 and 48 parts respectively and were subjected to the same process. The systolic and diastolic phase mean signals were then passed through four 6th order Butterworth bandpass filters to capture their content at different frequencies. The passbands are 50-250 Hz, 100-300 Hz, 150-350 Hz, and 200-400 Hz. The total energy was calculated for each of the 8 outputs generated. A total of 100 features were extracted based on the Discrete Wavelet Transform (Table 1).

Table 1: Summary of Discrete Wavelet Transform features

Feature   Source
1-4       Standard Deviation of Cardiac Cycle Lengths, S1 & S2 Peak Values, and Heart Rate
5-12      Shannon Energies of the 8 parts of the S1 Mean Signal
13-36     Shannon Energies of the 24 parts of the Systolic Phase Mean Signal
37-44     Shannon Energies of the 8 parts of the S2 Mean Signal
45-92     Shannon Energies of the 48 parts of the Diastolic Phase Mean Signal
93-100    Total Energy of Systolic and Diastolic Phase Mean Signals after passing through four 6th order Butterworth bandpass filters (50-250 Hz, 100-300 Hz, 150-350 Hz and 200-400 Hz)

We also extracted features based on the Continuous Wavelet Transform. Based on the methodology described in [10], a total of 86 features were obtained using matrix decomposition methods. The continuous wavelet transform of a signal gives a coefficient matrix in which each row holds the CWT coefficients for one scale, while the number of columns corresponds to the length of the input. Singular Value Decomposition (SVD) and QR Decomposition (QRD) were applied to the coefficient matrices resulting from the CWT process with scales 45 to 160. Table 2 enumerates the features generated by this process.

Table 2: Summary of features extracted from CWT approach

Feature   Source
1-3       Maximum, minimum and mean CWT coefficients for Systolic Phase Mean Signal
4-13      First 10 singular values from the SVD of the CWT coefficient matrix for the Systolic Phase Mean Signal
14-23     The absolute values of the first 10 diagonal elements of R from the QRD of the CWT coefficient matrix for the Systolic Phase Mean Signal
24-33     The Shannon Entropies derived from the first 10 column vectors of Q based on the QRD of the CWT coefficient matrix for the Systolic Phase Mean Signal
34-86     Shannon Energies of the 48 parts of the Diastolic Phase Mean Signal

2.4 Random Forest Classification

A Random Forest is a classifier consisting of a collection of tree-structured classifiers $\{h(x, \Theta_k),\ k = 1, \ldots\}$ where the $\{\Theta_k\}$ are independent identically distributed random vectors and each tree casts a unit vote for the most popular class for input $x$. It is a classification method that works by creating an ensemble of decision trees at training time. To classify an object from an input vector, each tree in the forest gives a classification, which is essentially a vote. The forest chooses the class with the most votes. According to [12], Random Forests do not overfit as more trees are added and run efficiently on large databases.

A random forest composed of 70 trees was used to classify the heart sounds. The feature extraction process derived 3 sets of feature vectors using DWT, CWT, and a combination of both respectively. These feature vectors were used to train and test the classifier. The 2011 PASCAL Heart Sound Challenge [8] has specified a list of metrics to be used in evaluating the effectiveness of the classification algorithm.

Figure 3: An example of a segmentation result of a signal from dataset B. The signal was reconstructed using wavelet coefficients d2, d3, d5, a5
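
The energy calculation of Section 2.2.2 can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB implementation. The function name `shannon_energy_envelope`, the use of the natural logarithm, the population standard deviation, the 10 ms hop implied by the stated overlap, and the convention that a zero sample contributes zero energy are all assumptions.

```python
import math


def shannon_energy_envelope(signal, fs, frame_ms=20.0, hop_ms=10.0):
    """Normalized average Shannon energy per frame (sketch of Section 2.2.2)."""
    peak = max(abs(s) for s in signal)
    if peak == 0:
        raise ValueError("signal is all zeros")
    x = [s / peak for s in signal]            # normalize to the maximum value

    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = int(round(fs * hop_ms / 1000.0))    # 20 ms frames overlapping by 10 ms
    energies = []
    for start in range(0, len(x) - frame_len + 1, hop):
        total = 0.0
        for v in x[start:start + frame_len]:
            sq = v * v
            if sq > 0.0:                      # assumption: 0 * log(0) counts as 0
                total += sq * math.log(sq)
        energies.append(-total / frame_len)   # E_S for this frame
    if not energies:
        raise ValueError("signal is shorter than one frame")

    mean = sum(energies) / len(energies)
    std = math.sqrt(sum((e - mean) ** 2 for e in energies) / len(energies))
    if std == 0:
        raise ValueError("energy is constant; cannot normalize")
    return [(e - mean) / std for e in energies]


# Hypothetical 1 s test signal at 1 kHz: faint background plus one loud burst.
fs = 1000
signal = [math.sin(2 * math.pi * 50 * i / fs) * (1.0 if 400 <= i < 450 else 0.01)
          for i in range(fs)]
envelope = shannon_energy_envelope(signal, fs)
loudest_frame = envelope.index(max(envelope))
```

In this synthetic case the largest envelope value falls on a frame inside the burst, which is the property the thresholding step relies on.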

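
The two peak conditioning rules of Section 2.2.3 can be sketched as below. The paper does not spell out several details, so the following are assumptions made only for illustration: the average distance is computed once from the initially detected peaks, each rule is applied in a single left-to-right pass, and the "relative maximum" is taken to be the highest local maximum of the envelope strictly between the two peaks. The name `condition_peaks` and the example values are hypothetical.

```python
def condition_peaks(envelope, peaks, low_threshold):
    """Apply the two peak conditioning rules to candidate peak indices.

    envelope      -- normalized Shannon energy per frame
    peaks         -- frame indices whose energy exceeded the higher threshold
    low_threshold -- lower threshold that a recovered peak must exceed
    """
    peaks = sorted(peaks)
    if len(peaks) < 2:
        return peaks
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    avg_gap = sum(gaps) / len(gaps)

    # Rule 1: of two peaks closer than half the average distance,
    # keep only the higher one.
    kept = [peaks[0]]
    for p in peaks[1:]:
        if p - kept[-1] < 0.5 * avg_gap:
            if envelope[p] > envelope[kept[-1]]:
                kept[-1] = p
        else:
            kept.append(p)

    # Rule 2: where two peaks are more than 1.5 times the average distance
    # apart, recover the highest local maximum between them if it exceeds
    # the lower threshold.
    result = [kept[0]]
    for prev, nxt in zip(kept, kept[1:]):
        if nxt - prev > 1.5 * avg_gap:
            local_maxima = [i for i in range(prev + 1, nxt)
                            if envelope[i - 1] < envelope[i] >= envelope[i + 1]]
            if local_maxima:
                candidate = max(local_maxima, key=lambda i: envelope[i])
                if envelope[candidate] > low_threshold:
                    result.append(candidate)
        result.append(nxt)
    return result


# Hypothetical envelope: a spurious peak at frame 12 and a weak peak at frame 70.
envelope = [0.0] * 140
for index, value in [(10, 3.0), (12, 2.5), (40, 3.0), (70, 1.2), (100, 3.0), (130, 3.0)]:
    envelope[index] = value
detected = [i for i, e in enumerate(envelope) if e > 2.0]   # higher threshold
conditioned = condition_peaks(envelope, detected, low_threshold=1.0)
```

Here the spurious peak at frame 12 is dropped by the first rule and the weak peak at frame 70 is recovered by the second.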
3. RESULTS AND DISCUSSIONS


The segmentation procedure described in Section 2.2 was applied to the Normal heart sound training sets of datasets A and B. The coefficient set that produced the best results consisted of the detail coefficients d2, d3 and d5 and the approximation coefficient a5. An example of a segmentation result is shown in Figure 3.

The metrics for assessing the effectiveness of our proposed heart sound segmentation and classification approaches were based on the mechanics specified in [8]. The total segmentation error for dataset A is 3013147.36, with the average error per sample ranging from 17901.39 to 1446109.50. Dataset B has a total error of 67732.08, with an average error as low as 61.83 but as high as 24205.00. Even with fewer samples, the total error is much higher in dataset A. Since dataset A was gathered via an iPhone app and uploaded, there was probably less effort to reduce the noise. The noise may be due to the hardware constraints of the iPhone or to background noise while recording the signals. Dataset B, on the other hand, was recorded using a digital stethoscope in hospitals, where the gathering of data is facilitated more effectively. The noise in the data can significantly mask the peaks that actually correspond to S1 and S2. In comparison to the results of the top-performing entries submitted to the PASCAL challenge [13] [14], the approach described in this study recorded the lowest total error (Table 3).

Table 3: Summary of Segmentation Results

Dataset   Our Method   Gomes and Pereira [13]   Deng and Bentley [14]
A         3013147.36   -                        3394378.85
B         67732.08     72242.8                  75569.78

For classification, the precision of each class was measured per dataset. For dataset B, the sensitivity and specificity of both artifact and heart problem detection were obtained, while for dataset A, only the sensitivity and specificity of the artifact were determined. The Youden index is a single statistic that captures the performance of a diagnostic test by evaluating its ability to avoid failure. This index was evaluated for the artifact category in dataset A and for the heart sounds with murmur and extrasystole in dataset B. Murmurs and extrasystoles are considered problematic heart sounds. The problematic heart sound detection F-score was computed for dataset A. The Discriminant Power, which measures the ability of the algorithm to distinguish between positive and negative samples, was calculated for dataset B for problematic heart sounds.

Columns 2-4 of Table 4 summarize the results of the winning entries of the 2011 PASCAL Classifying Heart Sounds Challenge. Our approach that used the features obtained using the Discrete Wavelet Transform (DWT) registered the highest total precision and outperformed the winning entries. Features derived from the Continuous Wavelet Transform (CWT) as well as from combined DWT and CWT also recorded total precision scores that outperformed the winning entries. For dataset A, this is attributed to the precision of extrasystole, which is significantly higher compared to the rest. Artifact sensitivity was the highest using features extracted by CWT. For dataset B, the total precision was lower but is still well within the range of those found in the literature. This is due to the inability of our method to detect the extrasystole in dataset B. The approach that used CWT-based features registered the highest heart problem specificity on the same dataset. The results shown in columns 2-3 of Table 4 are from [13] (winning entry), which used the segmentation in [4] and the peak conditioning technique in [6], both of which were also adopted in this study. We extracted an extensive feature set that included features other than those derived from the distance between S1 and S2 primarily used by [13] and [14].

Table 4: Summary of Classification Results

Metrics                               [13] J48   [13] MLP   [14]    DWT Features   CWT Features   Hybrid
Dataset A
Precision of Normal                   0.25       0.35       0.46    0.48           0.5            0.47
Precision of Murmur                   0.47       0.67       0.31    0.62           0.5            0.59
Precision of Extrasystole             0.27       0.18       0.11    1.0            0.33           1.0
Precision of Artifact                 0.71       0.92       0.58    0.82           0.82           0.81
Artifact Sensitivity                  0.63       0.69       0.44    0.88           0.88           0.81
Artifact Specificity                  0.39       0.44       0.44    0.53           0.47           0.55
Youden Index of Artifact              0.01       0.13       -0.09   0.4            0.35           0.37
F-score                               0.20       0.20       0.14    0.23           0.21           0.26
Total Precision                       1.71       2.12       1.47    2.92           2.15           2.87
Dataset B
Precision of Normal                   0.72       0.70       0.77    0.73           0.71           0.73
Precision of Murmur                   0.32       0.30       0.37    0.57           0.33           0.63
Precision of Extrasystole             0.33       0.67       0.17    0              1.0            0
Heart Problem Detection Sensitivity   0.22       0.19       0.51    0.2            0.14           0.2
Heart Problem Detection Specificity   0.82       0.84       0.59    0.93           0.9            0.95
Youden Index of Heart Problem         0.04       0.02       0.01    0.14           0.04           0.15
Discriminant Power                    0.05       0.04       0.09    0.31           0.09           0.37
Total Precision                       1.37       1.67       1.31    1.3            2.04           1.36

4. CONCLUSION

In this paper, we presented an integrated approach for the classification of heart sounds using the dataset from the PASCAL Classifying Heart Sounds Challenge. As a precursor to classification, heart sound segmentation was accomplished by first computing the Shannon energy of the reconstructed heart sound signal. A Genetic Algorithm was implemented to determine the thresholds. The detected peaks were then subjected to a peak conditioning algorithm. The technique used in this study was found to generate lower total error compared to published works on the same dataset. We used three approaches to extract the feature vectors used to train and test the random forest classifier. We compared the results of the classification using the three feature vectors derived using the discrete wavelet transform, the continuous wavelet transform, and their combination respectively. The feature vector from the Discrete Wavelet Transform performed best in our experiments. In comparison with existing literature, our results exceeded the total precision of the winning approach in the aforementioned competition on dataset A (the noisier dataset) and were well within range of other published results on dataset B. The development of a robust algorithm for the classification of heart sounds will enable us to develop a system that aids physicians in diagnosing heart diseases using auscultation.

5. REFERENCES

[1] S. Mendis, P. Puska, B. Norrving, "Global atlas on cardiovascular diseases prevention and control", World Health Organization, 2011.

[2] M. Tavel, "Cardiac auscultation: a glorious past - but does it have a future?", Circulation, American Heart Association, Vol. 93, pp. 1250-1253, 1996.

[3] P. O'Gara, J. Loscalzo, "Approach to the patient with a heart murmur", Harrison's Principles of Internal Medicine, 18th ed., Vol. 1, D. Longo, A. Fauci, D. Kasper, S. Hauser, J. Jameson, J. Loscalzo, Eds.: The McGraw-Hill Companies, Inc., pp. 1-10, 2012.

[4] L. Huiying, L. Sakari, H. Iiro, "A heart sound segmentation algorithm using wavelet decomposition and reconstruction", in Proc. IEEE/MBS, 1997, pp. 1630-1633.

[5] G. Saha and P. Kumar, "An efficient heart sound segmentation algorithm for cardiac diseases", in IEEE India Annual Conference, 2004, pp. 344-348.

[6] C. Gupta, R. Palaniappan, S. Rajan, S. Swaminathan, and S. Krishnan, "Segmentation and classification of heart sounds", in Proc. CCECE/CCGEI, May 2005, pp. 1674-1677.

[7] L. Hamza Cherif, S. Debbal, F. Bereksi-Reguig, "Choice of the wavelet analyzing in the phonocardiogram signal analysis using the discrete and the packet wavelet transform", Expert Systems with Applications, Vol. 37, pp. 913-918, 2010.

[8] P. Bentley, G. Nordehn, M. Coimbra, S. Mannor, and R. Getz, "The PASCAL classifying heart sounds challenge 2011 results" [online], 2011, http://www.peterjbentley.com/heartchallenge/index.html (Accessed: 8 September 2014)

[9] L. Huiying, H. Iiro, "Heart Sound Feature Extraction Algorithm Based on Wavelet Decomposition and Reconstruction", Proceedings of the 20th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Vol. 20, no. 3, pp. 1539-1542, 1998.

[10] Y. Chen, S. Wang, C. Shen, "Matrix decomposition based feature extraction for murmur classification", Medical Engineering & Physics, Vol. 34, pp. 756-761, 2012.

[11] S. Pavlopoulos, A. Stasis, E. Loukis, "A decision tree based method for the differential diagnosis of Aortic Stenosis from Mitral Regurgitation using heart sounds", BioMedical Engineering OnLine, Vol. 3, no. 21, pp. 1-15, 2004.

[12] L. Breiman, "Random Forests", Machine Learning, Vol. 45, pp. 5-32, 2001.

[13] E. Gomes, E. Pereira, "Classifying heart sounds using peak location for segmentation and feature construction", 2012 (unpublished).

[14] Y. Deng, P. Bentley, "A Robust Heart Sound Segmentation and Classification Algorithm using Wavelet Decomposition and Spectrogram", 2012.
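
The Youden index and Discriminant Power reported in Table 4 can be illustrated with the short Python sketch below. The Youden index is the standard sensitivity + specificity - 1. For the Discriminant Power, the sketch assumes the form (sqrt(3)/pi) * (log X + log Y) with X = sensitivity / (1 - sensitivity) and Y = specificity / (1 - specificity), and assumes base-10 logarithms because that choice roughly reproduces the Table 4 values from the rounded sensitivities and specificities; the paper itself does not state the formula. The function names are hypothetical.

```python
import math


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity and specificity from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def discriminant_power(sensitivity, specificity):
    """Discriminant Power; the base-10 logarithm is an assumption (see lead-in).

    Undefined when sensitivity or specificity is exactly 0 or 1.
    """
    x = sensitivity / (1.0 - sensitivity)
    y = specificity / (1.0 - specificity)
    return (math.sqrt(3.0) / math.pi) * (math.log10(x) + math.log10(y))


# Hypothetical counts chosen to give the Hybrid column of dataset B in Table 4
# (heart problem detection sensitivity 0.2, specificity 0.95).
sens, spec = sensitivity_specificity(tp=20, fn=80, tn=95, fp=5)
j = youden_index(sens, spec)            # about 0.15
dp = discriminant_power(sens, spec)     # about 0.37
```

Both values agree with the corresponding Table 4 entries to two decimal places, which is the only evidence used here for the logarithm base.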
