SURF: Subject-Adaptive Unsupervised ECG Signal Compression for Wearable Fitness Monitors
Received July 28, 2017, accepted August 24, 2017, date of publication September 7, 2017, date of current version October 12, 2017.
Digital Object Identifier 10.1109/ACCESS.2017.2749758
ABSTRACT Recent advances in wearable devices allow non-invasive and inexpensive collection of
biomedical signals including electrocardiogram (ECG), blood pressure, respiration, among others.
Collection and processing of various biomarkers are expected to facilitate preventive healthcare through
personalized medical applications. Since wearables are based on size- and resource-constrained hardware,
and are battery operated, they need to run lightweight algorithms to efficiently manage energy and memory.
To accomplish this goal, this paper proposes SURF, a subject-adaptive unsupervised signal compressor
for wearable fitness monitors. The core idea is to perform a specialized lossy compression algorithm on
the ECG signal at the source (wearable device), to decrease the energy consumption required for wireless
transmission and thus prolong the battery lifetime. SURF leverages unsupervised learning techniques to build
and maintain, at runtime, a subject-adaptive dictionary without requiring any prior information on the signal.
Dictionaries are constructed within a suitable feature space, allowing the addition and removal of code words
according to the signal’s dynamics (for given target fidelity and energy consumption objectives). Extensive
performance evaluation results, obtained with reference ECG traces and with our own measurements from
a commercial wearable wireless monitor, show the superiority of SURF against state-of-the-art techniques,
including: 1) compression ratios up to 90-times; 2) reconstruction errors between 2% and 7% of the signal’s
range (depending on the amount of compression sought); and 3) reduction in energy consumption of up to
two orders of magnitude with respect to sending the signal uncompressed, while preserving its morphology.
SURF, with artifact-prone ECG signals, allows for typical compression efficiencies (CE) in the range
CE ∈ [40, 50], which means that the data rate of 3 kbit/s that would be required to send the uncompressed
ECG trace is lowered to 75 and 60 bit/s for CE = 40 and CE = 50, respectively.
INDEX TERMS Biomedical signal processing, data compression, energy efficiency, self-organizing feature
maps, unsupervised learning, wearable sensors.
I. INTRODUCTION
Wearables can be integrated into wireless body sensor networks (WBSN) [1] to update medical records via the Internet, thus enabling prevention, early diagnosis and personalized care. However, since they are required to be small and lightweight, they are also resource constrained in terms of energy, transmission capability, and memory.

In this article, we propose new data processing solutions for the long-term monitoring of quasi-periodic electrocardiography (ECG) signals. These biomedical traces are relatively easy to measure, but are at the same time extremely valuable for the aforementioned purposes. We consider the acquisition of such signals through wearable devices like smart watches or chest straps [2], [3] and are concerned with prolonging the battery time of these wearables through lossy signal compression. We consider scenarios where wireless transmission of ECG signals to some access point is required, so that the signal can be stored and made available through cloud servers to be analyzed by clinicians. Our approach consists of compressing the ECG time series right on the wearable
that of selected compression algorithms from the literature including neural networks [4], linear approximations [10], Fourier [11]–[13], Wavelet [14] transforms and compressive sensing (CS) [15], [16]. SURF surpasses all of them, achieving remarkable performance, especially at high compression ratios where the reconstruction error (Root Mean Square Error, RMSE) at the decompressor is kept below 7% of the signal's peak-to-peak amplitude, whereas the RMSE of other solutions becomes unacceptably high.

A thorough numerical analysis of SURF, carried out on the PhysioNet public dataset [17] and our own collected ECG traces from a Zephyr Bioharness 3 device, reveals the following:
i) SURF's dictionaries gracefully and effectively adapt to new subjects or to their new activities,
ii) the size of these dictionaries is kept bounded within 20 kbytes, making them amenable to implementation in wireless monitors,
iii) high compression efficiency is reached (reductions in the signal size from 50- to 96-fold),
iv) the original ECG time series are reconstructed at the receiver with high accuracy, i.e., obtaining peak-to-peak RMSEs within 7% and often smaller than 3%, and
v) compression allows saving energy at the transmitter, leading to reductions of up to two orders of magnitude at the highest compression ratios.

The remainder of this paper is structured as follows. In Section II, we discuss previous work on lossy compression for ECG signals. In Section III, we briefly review vector quantization, which we also exploit in our design. In Section IV, we introduce the self-organizing maps, and in Section V we describe an initial design based on them. This design is the same as that of [4] and is discussed here for the sake of completeness and for a better understanding of the more complex design of Section VI, where we describe in detail the SURF compression scheme. A thorough performance evaluation is carried out in Section VIII, comparing SURF against state-of-the-art solutions for reference and own collected ECG traces. Conclusions are drawn in Section IX.

The following Section III introduces some preliminary notions on vector quantization and Section IV summarizes the main operational principles of self-organizing maps, on which the algorithms that are proposed in this paper rest. This material can be skipped by an expert reader on these matters.

II. RELATED WORK
Compression algorithms for ECG signals can be grouped into three main categories: Direct Methods, Transformation Methods and Parameter Extraction Methods.

Direct methods, which include the Lightweight Temporal Compression (LTC) [10], the Amplitude Zone Time Epoch Coding (AZTEC) [18], and the Coordinate Reduction Time Encoding System (CORTES) [19], operate in the time domain and utilize prediction or interpolation algorithms to reduce redundancy in the input signal by examining subsequent time samples.

Transformation methods perform a linear orthogonal transformation. The most widely adopted techniques are the Fast Fourier Transform (FFT) [11], the Discrete Cosine Transform (DCT) [20], and the Discrete Wavelet Transform (DWT) [14]. The amount of compression they achieve depends on the number of transform coefficients that are selected, whereas their representation accuracy depends on how many and which coefficients are retained. Although these algorithms can provide high compression ratios, their computational complexity is often too high for wearable devices. Also, as we quantify below, these methods are in general outperformed by linear and dictionary-based approaches at high compression ratios.

Parameter extraction methods use Artificial Neural Networks (ANNs), Vector Quantization (VQ), and pattern recognition techniques. This is a field with limited investigation that has recently aroused great interest from the research community. Unlike direct and transformation methods, the rationale is to process the input time series to obtain some kind of knowledge (e.g., input data probability distribution, signal features, hidden topological structures) and utilize it to get compact and accurate signal representations. The algorithm that we propose in this paper belongs to this class. Other representative algorithms are [4], [9], [21]–[24]. In [21], a direct waveform Mean-Shape Vector Quantization (MSVQ) is tailored for single-lead ECG compression. Based on the observation that many short-length segments mainly differing in their mean values can be found in a typical ECG signal, the authors segment the ECG into vectors, subtract from each vector its mean value, and apply scalar quantization and vector quantization to the extracted means and zero-mean vectors, respectively. Differently from our approach, the segmentation procedure is carried out by fixing the vector length to a predetermined value. This avoids the computational burden of peak detection, but it does not take full advantage of the redundancy among adjacent heartbeats, which are in fact highly correlated. Moreover, in MSVQ, dictionaries are built through the Linde-Buzo-Gray (LBG) algorithm [25], without adapting its codewords at runtime. In [9], Sun et al. propose another vector quantization scheme for ECG compression, using the Gain-Shape Vector Quantization (GSVQ) approach. There, the ECG is segmented into vectors made up of samples between two consecutive signal peaks. Each extracted vector is normalized to a fixed length and divided by its gain to obtain the so-called shape vector. A codebook for the shape vectors is generated using the LBG algorithm. After this, each normalized vector is assigned the index of the nearest codeword in the dictionary and a residual vector is encoded to compensate for inaccuracies. The original length of each heartbeat, the gain, the index of the nearest codeword and the encoded (residual) stream are sent to the decoder. For the signal reconstruction at the receiver, the decoder retrieves the codeword from its local copy of the dictionary, performs a denormalization using the gain and the length, and adds the residual signal.
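To make the gain–shape decomposition of [9] concrete (the same beat-level normalization that our own designs reuse in Section V), the following sketch resizes a variable-length beat and separates it into length, offset, gain and shape. This is our own illustration rather than code from [9] or from SURF: the fixed length m = 200, the function names and the use of NumPy are assumptions, and the residual encoding of [9] is omitted.

```python
import numpy as np

def gain_shape_normalize(segment, m=200):
    """Resize a variable-length beat to m samples and split it into
    (length, offset, gain, shape), as in gain-shape VQ."""
    segment = np.asarray(segment, dtype=float)
    r = len(segment)                                  # original length (side information)
    grid = np.linspace(0.0, r - 1.0, m)
    resized = np.interp(grid, np.arange(r), segment)  # linear interpolation to length m
    offset = resized.mean()                           # mean value of the segment
    gain = np.sqrt(np.mean(resized ** 2))             # RMS gain of the segment
    shape = (resized - offset) / gain                 # zero-mean, unit-gain shape vector
    return r, offset, gain, shape

def gain_shape_denormalize(r, offset, gain, shape):
    """Invert the normalization and stretch the shape back to r samples."""
    m = len(shape)
    rec = shape * gain + offset
    return np.interp(np.linspace(0.0, m - 1.0, r), np.arange(m), rec)
```

Only the length, offset, gain and a codeword index (plus, in [9], a residual) need to be transmitted, since the decoder can invert these steps locally.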
SURF resembles [9] in the way signal segments are defined and in the adoption of the GSVQ approach. Indeed, the main ECG peaks are used to extract segments, which then constitute the recurrent patterns to be learned. [22] distinguishes itself from the previous schemes because it defines a codebook of ECG vectors adaptable in real time. The dictionary is implemented in a one-dimensional array with overlapped and linearly shifted codewords that are continuously updated and possibly removed according to their frequency of utilization. In particular, an input vector that does not find a matching codeword is added to the codebook, triggering the removal of the codeword with the least number of matches. However, no details are provided on ECG segmentation nor on how ECG segments with different lengths are to be processed. A compression approach based on vector quantization, where dictionaries are built and maintained at runtime, is presented in [4]. In this paper, time-adaptive self-organizing maps are utilized to reshape the dictionary as the signal statistics change. As we show in Section VIII, while this approach has excellent compression performance and gracefully adapts to non-stationary signals, it is not robust to artifacts, i.e., the quality of the dictionary degrades in the presence of sudden changes in the signal statistics or of previously unseen patterns. A compression scheme for quasi-periodic time series can be found in [24], where the authors target the lightweight compression of biomedical signals for constrained devices, as we do in this paper. They do not use a VQ approach but exploit sparse autoencoders and pattern recognition as a means to achieve dimensionality reduction and compactly represent the information in the original signal segments through shorter segments. Quantitative results assess the effectiveness of their approach in terms of compression ratio, reconstruction error and computational complexity. However, the scheme is based on a training phase that must be carried out offline and is thus not suitable for patient-centered applications featuring previously unseen signal statistics. A taxonomy describing most of these compression approaches, including their quantitative comparison, can be found in the survey paper [26].

Our present work improves upon previous research as neural network structures are utilized to build and adapt compact representations of biomedical signals at runtime, utilizing unsupervised learning. Our design uses multiple dictionaries to ensure robustness against artifacts, still obtaining very high compression ratios, and achieving small reconstruction errors at all times.

III. PRELIMINARIES ON VECTOR QUANTIZATION FOR SIGNAL COMPRESSION
Vector quantization is a technique originally conceived for lossy data compression but also applicable to clustering, pattern recognition and density estimation. VQ is a generalization of scalar quantization of a single random variable to quantization of a block (vector) of random variables [27]. Its motivation lies in the fundamental result of Shannon's rate-distortion theory, which states that better performance (i.e., lower distortion for a given rate or lower rate for a given distortion) can always be achieved by encoding vectors instead of scalars, even if the data source is memoryless or the data compression system is allowed to have memory.

Let $\mathbf{x} = [x_1, x_2, \ldots, x_m]^T \in \mathbb{R}^m$ be an $m$-dimensional input random vector. A vector quantizer is described by:
• A set of decision regions $I_j \subseteq \mathbb{R}^m$, $j = 1, \ldots, L$, such that $I_j \cap I_h = \emptyset$ for $j, h = 1, \ldots, L$, $j \neq h$, and the union of $I_j$ (with $j = 1, \ldots, L$) spans $\mathbb{R}^m$.
• A finite set of reproduction vectors (codewords) $\mathcal{C} = \{\mathbf{y}_j\}_{j=1}^{L}$, $\mathbf{y}_j = [y_{j1}, y_{j2}, \ldots, y_{jm}]^T \in I_j \subseteq \mathbb{R}^m$. This set is called codebook or dictionary. Each codeword $\mathbf{y}_j$ is assigned a unique index.
• A quantization rule $q(\cdot)$:
$$q(\mathbf{x}) = \mathbf{y}_j \quad \text{if } \mathbf{x} \in I_j. \qquad (1)$$
This means that the $j$-th decision region $I_j$ is associated with the $j$-th codeword $\mathbf{y}_j$ and that each vector $\mathbf{x}$ belonging to $I_j$ is mapped by (1) into $\mathbf{y}_j$.

A compression system based on VQ involves an encoder and a decoder. At the encoder, the output samples from the data source (e.g., samples from a waveform, pixels from an image) are grouped into blocks (vectors) and each of them is given as input to the VQ. The VQ maps each vector $\mathbf{x}$ onto codeword $\mathbf{y}_j$ according to (1). Compression is achieved since the index $j$ associated with $\mathbf{y}_j$ is transmitted to the decoder in place of the whole codeword. Because the decoder has exactly the same dictionary stored at the encoder, it can retrieve the codeword given its index through a table lookup. Note that, for correct decoding, the dictionary at the decoder shall be the same in use at the encoder at all times. We say that encoder and decoder are synchronized if this is the case and that they are out of synchronism otherwise.

The quality of reconstruction is measured by the average distortion between the quantizer input $\mathbf{x}$ and the quantizer output $\mathbf{y}_j$. A common distortion measure between a vector $\mathbf{x}$ and a codeword $\mathbf{y}_j$ is the Euclidean distance $d(\mathbf{x}, \mathbf{y}_j) = \|\mathbf{x} - \mathbf{y}_j\|$. The average distortion is measured through the root mean squared error (RMSE):
$$E[d(\mathbf{x}, \mathbf{y}_j)] = \sum_{j=1}^{L} \int_{I_j} \|\mathbf{a} - \mathbf{y}_j\| \, f_{\mathbf{x}}(\mathbf{a}) \, d\mathbf{a}, \qquad (2)$$
where $f_{\mathbf{x}}(\cdot)$ is the probability density function of the random vector $\mathbf{x}$. The design of an optimal VQ consists in finding the dictionary $\mathcal{C}$ and the partition of $\mathbb{R}^m$ that minimize the average distortion. It can be proved that an optimal VQ must satisfy the following conditions:
1) Nearest Neighbor Condition (NNC). Given the set of codewords $\mathcal{C}$, the optimal partition of $\mathbb{R}^m$ is the one returning the minimum distortion:
$$I_j = \{\mathbf{x} : d(\mathbf{x}, \mathbf{y}_j) \leq d(\mathbf{x}, \mathbf{y}_h), \; j \neq h\}. \qquad (3)$$
This condition implies that the quantization rule (1) can be equivalently defined as $q(\mathbf{x}) = \arg\min_{\mathbf{y}_j} d(\mathbf{x}, \mathbf{y}_j)$, i.e., the selected $\mathbf{y}_j$ is the nearest codeword to the input vector $\mathbf{x}$.
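As a minimal, self-contained illustration of the quantization rule (1) and of the nearest neighbor condition (3), the sketch below encodes a vector as the index of its closest codeword and decodes it by table lookup. It is our own example: the codebook is random and untrained, and serves only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)
L, m = 8, 4                               # codebook size and vector dimension
codebook = rng.normal(size=(L, m))        # C = {y_1, ..., y_L}, one codeword per row

def vq_encode(x, codebook):
    """Quantization rule (1) under the NNC (3): index of the nearest codeword."""
    distances = np.linalg.norm(codebook - x, axis=1)   # d(x, y_j) for all j
    return int(np.argmin(distances))

def vq_decode(index, codebook):
    """Decoder side: a simple table lookup recovers the codeword."""
    return codebook[index]

x = rng.normal(size=m)        # an input vector
j = vq_encode(x, codebook)    # only this integer index is transmitted
x_hat = vq_decode(j, codebook)
```

Only the integer index travels from encoder to decoder, which is where the compression gain of dictionary-based schemes comes from.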
the learning rate when the signal's statistics changes and for this reason is a more appealing technique with non-stationary signals. The TASOM has been introduced in [7], improving upon the basic SOM and preserving its properties in stationary and non-stationary settings. In a TASOM, each neuron $j$, $j = 1, \ldots, L$, has a synaptic-weight vector $\mathbf{w}_j \in \mathbb{R}^m$ with its own learning rate $\eta_j(n)$ and neighborhood width $\sigma_j(n)$, which are continuously adapted so as to allow a potentially unlimited training of the synaptic-weight vectors. The reader is referred to [7] for additional details.

V. A FIRST DESIGN: TASOM-BASED ECG COMPRESSION
In this section, we describe an initial design that uses the TASOM unsupervised learning algorithm. First, we identify as ECG segments the sequences of samples between consecutive ECG peaks and we use them to build a dictionary that stores typical segments and is maintained consistent and well representative through online updates. A diagram of the proposed technique is shown in Fig. 2. The ECG signal is first preprocessed through a third-order Butterworth filter to remove artifacts. Hence, the fast peak detection algorithm of [29] is employed to locate the signal peaks. Since the segments may have different lengths, after their extraction, a linear interpolation block resizes each segment from its actual length $r_x(n)$ to a fixed length $m$. The resized segment is termed $\mathbf{x}(n) = [x_1(n), \ldots, x_m(n)]^T$, whereas
$$e_x(n) = \frac{\sum_{k=1}^{m} x_k(n)}{m} \qquad (6)$$
is its offset and
$$g_x(n) = \left( \sum_{k=1}^{m} x_k(n)^2 / m \right)^{1/2} \qquad (7)$$
is its gain. The normalization module applies the following transformation to each entry of $\mathbf{x}(n)$:
$$x_k(n) \leftarrow \frac{x_k(n) - e_x(n)}{g_x(n)}, \quad k = 1, \ldots, m. \qquad (8)$$
The normalized segment feeds the dictionary manager, which uses it to update the dictionary, and the pattern matching module, which returns the best matching codeword from the dictionary and outputs its index. The segment's original length, offset, gain and codeword index are then sent to the receiver in place of the original samples.

The dictionary manager is the key block of the TASOM-based compressor. We designed it thinking of a communication scenario entailing a transmitting wearable device and a receiver, such as a smartphone. At any time instant $n$, two dictionaries are maintained at the transmitter: the current dictionary $\mathcal{C}^c(n)$, which is used to compress the input signal, and the updated dictionary $\mathcal{C}^u(n)$, which undergoes updating at each time instant through the TASOM algorithm and is maintained to track statistical changes in the input signal's distribution. As for the dictionaries, we consider a TASOM with $L$ neurons. When the compression scheme is used for the first time, a sufficient number $N$ of signal segments shall be provided as input to the TASOM to perform a preliminary training phase. This training allows the map to learn the subject signal's distribution. This may be accomplished the first time the subject wears the device. After this, a first subject-specific dictionary is available. It can be used for compression and can also be updated at runtime as more data is acquired. Let us assume that time is reset when the preliminary training ends, and assume $n = 0$ at such point. The current and updated dictionaries are $\mathcal{C}^c(0) = \{\mathbf{c}^c_1(0), \ldots, \mathbf{c}^c_L(0)\}$ and $\mathcal{C}^u(0) = \{\mathbf{c}^u_1(0), \ldots, \mathbf{c}^u_L(0)\}$, respectively. Their codewords $\mathbf{c}^{c/u}_j(0)$ represent the synaptic-weight vectors of the corresponding neural (TASOM) maps. At time $n = 0$, we have $\mathbf{c}^c_j(0) = \mathbf{c}^u_j(0) = \mathbf{w}_j(0)$, $j = 1, \ldots, L$. Let us also assume that the decompressor at the receiver is synchronized with the compressor, i.e., it owns a copy of $\mathcal{C}^c(0)$. From time 0 onwards, for any new segment $\mathbf{x}(n)$ ($n = 1, 2, \ldots$) the following procedure is followed:

Algorithm 1 TASOM-Based Compressor
1) Map $\mathbf{x}(n)$ onto the index of the best matching codeword in $\mathcal{C}^c(n)$, i.e., map $\mathbf{x}(n)$ onto the index $i_x(n)$ such that
$$i_x(n) = \arg\min_{j} \|\mathbf{x}(n) - \mathbf{c}^c_j(n)\|, \quad j = 1, \ldots, L. \qquad (9)$$
2) Let $d(n) = \|\mathbf{x}(n) - \mathbf{c}^c_i(n)\|$ be the distance between the current segment and the associated codeword, where we use index $i$ as a shorthand notation for $i_x(n)$. Use $\mathbf{x}(n)$ as the new input for the current iteration of the TASOM learning algorithm and obtain the new synaptic-weight vectors $\mathbf{w}_j(n)$, $j = 1, \ldots, L$.
3) Update $\mathcal{C}^u(n)$ by using the weights obtained in step 2, i.e., setting $\mathbf{c}^u_j(n) \leftarrow \mathbf{w}_j(n)$ for $j = 1, \ldots, L$.
4) Let $\varepsilon > 0$ be a tuning parameter. If $d(n)/\|\mathbf{x}(n)\| > \varepsilon$, then update $\mathcal{C}^c(n)$ by replacing it with $\mathcal{C}^u(n)$, i.e., $\mathcal{C}^c(n) \leftarrow \mathcal{C}^u(n)$ and, using (9), re-map $\mathbf{x}(n)$ onto the index $i_x(n)$ of the best matching codeword in the new dictionary $\mathcal{C}^c(n)$.
5) Send to the receiver the segment's original length $r_x(n)$, its offset $e_x(n)$, gain $g_x(n)$, and the codeword index $i_x(n)$. If $\mathcal{C}^c(n)$ has been modified in step 4, then also send $\mathcal{C}^u(n)$ (that in this case is equal to the new $\mathcal{C}^c(n)$).

Step 2 makes it possible to always maintain an updated approximation of the input segment distribution at the transmitter. With step 4, we check the validity of the approximation provided by the current dictionary (the one used for compression, which is also known at the receiver). The tunable parameter $\varepsilon$ is used to control the signal reconstruction fidelity at the decompressor: if $d(n)/\|\mathbf{x}(n)\| \leq \varepsilon$, codeword $\mathbf{c}^c_{i_x(n)}(n)$ is considered suitable to approximate the current segment, otherwise $\mathcal{C}^c(n)$ is replaced with the updated dictionary $\mathcal{C}^u(n)$ and the encoding mapping is re-executed. Note that the higher $\varepsilon$, the higher the error tolerance and the lower the number of updates of the current dictionary. On the contrary, a small $\varepsilon$ entails frequent dictionary updates: this regulates the actual representation error and also determines the maximum achievable compression efficiency.

At the receiver, the $n$-th ECG segment is reconstructed by picking the codeword with index $i_x(n)$ from the local dictionary, performing renormalization of such codeword with respect to offset $e_x(n)$ and gain $g_x(n)$ and stretching the codeword according to the actual segment length $r_x(n)$.

VI. THE SURF COMPRESSION SCHEME
In what follows, the TASOM-based compressor of the previous section is improved through the use of a more flexible neural network architecture, aiming at the following objectives.

O1) Objective 1 (Specializing the Dictionary to New Signal Areas): we recall that the number of neurons in the TASOM map remains fixed as time evolves and this entails that some further refinement of the dictionary, whenever the signal statistics undergoes major changes and new behaviors arise, may not always be possible. In fact, from our experiments we have seen that, at times, additional neurons may be beneficial to specialize the dictionary upon the occurrence of new patterns, while at the same time preserving what was previously learned. In that case, we do not want previous neurons to be involved in the refinement as we are exploring a new area in the input signal space, and we do not want to do this at the cost of getting lower accuracies in the portion of space that we have already inspected and successfully approximated. In our new design, we accomplish this through a GNG network [8]. This type of neural network incrementally learns the topological relations in a given input signal set, dynamically adding and removing neurons and connections, adapting itself to previously unseen input patterns.

O2) Objective 2 (Reducing Overhead): we aim at further reducing the overhead associated with maintaining and transmitting the dictionary. This is achieved through two techniques: 1) working within a suitable feature space, where a number of features much smaller than the size of each ECG segment suffices for its accurate representation; 2) selective dictionary update. In the TASOM-based approach, a dictionary is entirely replaced whenever any of its codewords is no longer capable of approximating ECG segments belonging to a given signal's area within a preset accuracy. Instead, in our new design codewords are selectively replaced by new ones that better approximate the portion of signal space that they are responsible for.

O3) Objective 3 (Coping With Artifacts): ECG signals gathered from wearable devices are prone to artifacts, caused, for example, by the body movements of the wearer. Dictionary-based approaches are particularly sensitive to artifacts as no existing codeword can adequately approximate them. Dictionary updates, attempting to bring the codewords closer to the new segments (i.e., the artifacts), are likely to result in a degraded representation accuracy for the recurring segments. However, these noisy segments must be accurately represented, as these may indicate anomalous behavior that could play an important role in the diagnosis of a disorder. Our new compressor successfully copes with this by: 1) sending features in place of full codewords whenever none of the current codewords provide a satisfactory match and 2) concurrently starting an assessment phase for the new pattern. In the assessment phase, a new neuron (codeword) is temporarily added to a local dictionary, which is only maintained at the source and is used for the evaluation of new (or anomalous) patterns. The permanent addition of such codeword to the main dictionary only occurs if further segments are found to match it, which means that the new segment has become recurrent.

Objectives O1, O2, and O3 are achieved through the SURF compression algorithm that we describe in detail next. It leverages a GNG neural structure to learn and maintain a set of prototypes in the signal's feature space in a totally unsupervised fashion. This neural network structure has a number $L(n)$ of neurons, where $n$ is the (discrete) time index, which is updated as $n \leftarrow n+1$ each time a new ECG segment is processed.

A diagram of the SURF algorithm is shown in Fig. 3.

FIGURE 3. Flow diagram of the SURF compression algorithm: the dictionaries are learned in the feature space. Codewords can be added or removed. When the distance between the best matching codeword in D1 and the current feature vector is higher than a threshold, a new codeword is added to D2. That codeword is in an assessment phase until it is either permanently added (to D1 and D3) or deleted (i.e., when no further matches occur). Dictionary D1 is used for (dictionary-based) compression, D3 for continuous learning. When a good match is found for an ECG segment, the compressor sends its length, offset and the index of the matching codeword. Otherwise, the segment's feature vector is sent along with its length and offset.

The signal is at first preprocessed through the same chain of Fig. 2, involving filtering to remove artifacts, ECG peak detection and segment extraction. After this, ECG segments are normalized, resized and their offset is removed. As different ECG segments may have different lengths, linear interpolation is used to resize them to a fixed length $m$. Let $\mathbf{x}(n) = [x_1(n), \ldots, x_m(n)]^T$ be the resized $m$-length ECG segment at time $n$. Offset removal is achieved through:
$$x_k(n) \leftarrow x_k(n) - e_x(n), \quad k = 1, \ldots, m, \qquad (10)$$
where $e_x(n)$ is defined in (6). After this, the normalized ECG segment $\mathbf{x}(n)$ is fed to a feature extraction block which reduces the dimensionality of $\mathbf{x}(n)$ through the computation of a number $f < m$ of features. This mapping is denoted by $\Psi : \mathbb{R}^m \to \mathbb{R}^f$ and we have: $\mathbf{y}(n) = \Psi(\mathbf{x}(n))$, where $\mathbf{y}(n) = [y_1(n), \ldots, y_f(n)]^T$. For our experimental results, this mapping corresponds to the DCT transform of $\mathbf{x}(n)$, retaining the first (low-pass filtering) $f$ coefficients in the transform (frequency) domain. We underline that our method is rather general and other transformation and coefficient selection methods can be applied.

At this point, the SURF dictionaries come into play. Differently from the TASOM approach, three dictionaries are maintained at the transmitter: D1) the current dictionary $\mathcal{C}^c(n) = \{\mathbf{c}^c_1(n), \ldots, \mathbf{c}^c_{L(n)}(n)\}$, D2) the reserved dictionary $\mathcal{C}^r(n) = \{\mathbf{c}^r_1(n), \ldots, \mathbf{c}^r_{R(n)}(n)\}$ and D3) the updated dictionary $\mathcal{C}^u(n) = \{\mathbf{c}^u_1(n), \ldots, \mathbf{c}^u_{L(n)}(n)\}$. D1 and D3 contain the same number of codewords at all times, whereas D2 contains $R(n)$ codewords, where in general $R(n) \ll L(n)$. D1 is used for compression at the source (transmitter) and has to be known by the decompressor at the receiver. This implies that any changes to D1 should be promptly communicated to the decompressor so that the dictionaries at source and at the receiver remain synchronized at all times. Instead, D2 and D3 only need to be maintained at the source (transmitter).

Dictionary D1: The current dictionary D1 contains the codewords which are currently in use. For each new feature segment $\mathbf{y}(n)$, the closest codeword $\mathbf{c}^c_{i^*}(n)$ in D1 is fetched ("pattern matching" in Fig. 3) by minimizing the distance $d(\mathbf{y}(n), \mathbf{c}^c_j(n)) = \|\mathbf{y}(n) - \mathbf{c}^c_j(n)\|$ over all codewords $\mathbf{c}^c_j(n) \in \mathcal{C}^c(n)$, i.e.,
$$i^* = \arg\min_{j} d(\mathbf{y}(n), \mathbf{c}^c_j(n)), \quad j = 1, \ldots, L(n). \qquad (11)$$
If $d(\mathbf{y}(n), \mathbf{c}^c_{i^*})$ is smaller than a preset error tolerance $\varepsilon_f > 0$ (here, $\varepsilon_f$ represents the error tolerance in the feature space, which must not be confused with the tolerance $\varepsilon$ in the signal space that was used for the TASOM-based compressor of Section V), the codeword $\mathbf{c}^c_{i^*}$ from D1 is deemed a good candidate to approximate the current ECG segment. In this case, we say that $\mathbf{y}(n)$ is matched by $\mathbf{c}^c_{i^*}$. Index $i^*$ is thus sent to the receiver in place of the entire feature set $\mathbf{y}(n)$. At the receiver side, a copy of D1 is maintained at all times and is used to retrieve $\mathbf{c}^c_{i^*}$ from its index.
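Before detailing D2 and D3, the following sketch summarizes the per-segment decision described so far: compute the feature vector, search D1 and either emit a codeword index or fall back to sending the features. It is a simplified rendition of ours, not the reference implementation: the SciPy DCT front end, the parameter values and all names are assumptions, and side information (length, offset) as well as the D2/D3 bookkeeping discussed next are omitted.

```python
import numpy as np
from scipy.fft import dct

def surf_encode_segment(x, D1, f=100, eps_f=0.5):
    """Encode one normalized, resized ECG segment x (length m).

    Returns ('index', i_star) when a D1 codeword matches within eps_f,
    otherwise ('features', y) so the receiver can still reconstruct
    the segment from its DCT features."""
    y = dct(np.asarray(x, dtype=float), norm='ortho')[:f]   # feature vector y(n)
    distances = np.linalg.norm(D1 - y, axis=1)              # d(y(n), c_j), as in (11)
    i_star = int(np.argmin(distances))
    if distances[i_star] <= eps_f:
        return 'index', i_star          # matched: transmit the codeword index only
    return 'features', y                # unmatched: transmit the feature vector itself
```

In the matched case only one index plus a few scalars of side information is transmitted, which is what drives the high compression efficiencies reported in Section VIII.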
Dictionary D2: If $d(\mathbf{y}(n), \mathbf{c}^c_{i^*}) > \varepsilon_f$, none of the codewords in D1 adequately approximates the current feature vector, which is then termed unmatched. Note that this may be a consequence of changes in the signal statistics, such as sudden variations in the subject's activity, of pathological (and often sporadic) ECG segments, or of measurement artifacts. In these cases, we check for a match in the reserved dictionary D2 ($\mathcal{C}^r(n)$). If a match occurs, the matching count of the matching codeword in D2 is increased by one. Otherwise, a new codeword is added to D2. This is achieved by adding a neuron to dictionary $\mathcal{C}^r(n)$ and using feature vector $\mathbf{y}(n)$ to initialize its synaptic-weight vector. We stress that the codewords in D2 are not yet ready for use in the signal compression, but they have to go through an assessment phase. D2 behaves as a buffer with maximum size $L_{\max}$: if a codeword in D2 is matched $\gamma$ times (with $\gamma$ being a preset parameter), it is removed from D2 and added to D1. If instead D2 gets full and a new codeword has to be added to it for assessment, the oldest codeword from D2 is deleted and the new one takes its place. The rationale behind the assessment phase is that new codewords are added to explore a new portion of the signal's feature space, and this exploration is prompted by the measurement of previously unseen patterns. Now, if these patterns are very unlikely to occur again it does not make any sense to add them to dictionary D1 and it is better to send the feature vector $\mathbf{y}(n)$ for these isolated instances. In turn, $\mathbf{y}(n)$ will be utilized to reconstruct the pattern at the receiver. Instead, if after their first appearance these become recurring patterns, it does make sense to add them to D1 (and to D3 for their continuous refinement). Note that the combined use of D1 and D2 makes it possible to specialize the dictionary to new signal areas (new patterns, i.e., objective O1) as well as to cope with artifacts (objective O3).

Dictionary D3: This dictionary has the same number of neurons as D1 but its codewords are updated for each new matched ECG segment. That is, when $d(\mathbf{y}(n), \mathbf{c}^c_{i^*}) < \varepsilon_f$ the feature vector $\mathbf{y}(n)$ is also used to update dictionary $\mathcal{C}^u(n)$.

As stated above, dictionaries D2 and D3 are continuously updated: D3 when a match occurs between $\mathbf{y}(n)$ and a codeword in D1, whereas D2 when no codeword in D1 matches $\mathbf{y}(n)$. In this case, if $\mathbf{y}(n)$ matches some codeword in D2, the corresponding matching count is increased, otherwise D2 is extended through the addition of a new codeword. Dictionaries D1 and D3 are initialized with $L(0)$ neurons, and $L(n)$ is always bounded, i.e., $L(n) \leq L_{\max}$ at all times, where $L_{\max}$ is a preset parameter to cope with memory constraints. At time 0, D2 is empty and the number of neurons therein is likewise bounded by $L_{\max}$.

Similarly to the TASOM-based approach, when the compression scheme is activated for the first time, a sufficient number $N$ of signal segments must be provided as input to perform a preliminary training phase. Such training allows the dictionaries to learn the subject signal's distribution. An observation is in order. Basically, the just described approach dynamically switches the compression strategy between a dictionary-based technique and a standard transform-based one (i.e., sending a number of DCT coefficients for the current segment). The dictionary is used when it approximates well the current ECG pattern. Otherwise, a DCT compression approach is exploited. Note that this makes it possible to achieve high accuracy at all times, while adaptively (and automatically) tuning the instantaneous compression rate (as a function of the characteristics of the current segment). Also, this allows refining the main dictionary by only including those patterns that have become recurrent. As we shall see, this provides excellent accuracy performance and resilience against artifacts, while retaining most of the benefits of dictionary-based schemes (very high compression rates).

For the formal description of the SURF algorithm, let us assume that time is reset when the preliminary training ends and assume $n = 0$ at such point. The codewords of D1 and D3 at time $n = 0$, $\mathcal{C}^c(0) = \{\mathbf{c}^c_1(0), \ldots, \mathbf{c}^c_{L(0)}(0)\}$ and $\mathcal{C}^u(0) = \{\mathbf{c}^u_1(0), \ldots, \mathbf{c}^u_{L(0)}(0)\}$, are set equal to the synaptic-weight vectors at the end of the initial training, i.e., $\mathbf{c}^c_j(0) = \mathbf{c}^u_j(0) = \mathbf{w}_j(0)$, $j = 1, \ldots, L(0)$. We also assume that the decompressor at the receiver is synchronized with the compressor, that is, it owns a copy of D1 ($\mathcal{C}^c(0)$). Also, for any codeword $\mathbf{c}$ belonging to any dictionary, if $d(\mathbf{y}(n), \mathbf{c}) < \varepsilon_f$ we say that $\mathbf{y}(n)$ is matched by $\mathbf{c}$. For the continuous update of the synaptic weight vectors (codewords) in dictionary D3, we apply the following Algorithm 2, which rests on the Hebbian learning theory in [30], [31].

Algorithm 2 Synaptic Weight Vector Update
At the generic time $n$, let $\mathbf{y}(n)$ and $i^*$ respectively be the current feature vector and the index associated with the best matching codeword in D1, i.e.,
$$d(\mathbf{y}(n), \mathbf{c}^u_{i^*}(n)) \leq d(\mathbf{y}(n), \mathbf{c}^u_j(n)), \quad j = 1, \ldots, L(n). \qquad (12)$$
We have that $i^*$ is the winning neuron in map (dictionary) D1 for this input (feature) vector $\mathbf{y}(n)$ and its synaptic weight vector is $\mathbf{w}_{i^*} = \mathbf{c}^u_{i^*}(n)$, with $\mathbf{w}_{i^*} \in \mathbb{R}^f$. The update rule for $\mathbf{w}_{i^*}$ is:
$$\mathbf{w}^{\text{new}}_{i^*} \leftarrow \mathbf{w}_{i^*} + \epsilon_b \, (\mathbf{y}(n) - \mathbf{w}_{i^*}). \qquad (13)$$
Moreover, when we have a match, an edge will be created in the neural map between $i^*$ and $i^{**}$, where $i^{**}$ is the second-closest neuron to the current input vector $\mathbf{y}(n)$. If $i^*$ and $i^{**}$ are already connected with an edge, no new edge will be added. After that, we update the synaptic weight vector of every neuron $j$ that is a neighbor of $i^*$, i.e., that is connected to it with an edge:
$$\mathbf{w}^{\text{new}}_j \leftarrow \mathbf{w}_j + \epsilon_n \, (\mathbf{y}(n) - \mathbf{w}_j), \qquad (14)$$
where $\epsilon_b$ and $\epsilon_n$ are constant learning rates. The new weight vectors of (13) and (14) correspond to the updated codewords for dictionary D3.

Keeping the above definitions and update rules into account, from time 0 onwards, for any new feature segment $\mathbf{y}(n)$ ($n = 1, 2, \ldots$) the following procedure is executed:

In the above algorithm, Step 2 checks whether the current segment is matched by one codeword in the current dictionary D1. If not, the current feature vector is tagged as an unknown pattern and is added to dictionary D2 to go through the assessment phase.
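A compact sketch of the winner/neighbor update in (12)–(14) is given below. It is our own simplification: edge aging, accumulated-error bookkeeping and the node insertion/removal rules of a full GNG are deliberately left out, and the learning-rate names follow the $\epsilon_b$, $\epsilon_n$ notation above.

```python
import numpy as np

def update_d3(weights, edges, y, eps_b=0.01, eps_n=0.005):
    """One winner/neighbor update step in the spirit of (12)-(14).

    weights: (L, f) array of D3 codewords (synaptic-weight vectors).
    edges:   set of frozenset({i, j}) pairs encoding the neural-map topology."""
    distances = np.linalg.norm(weights - y, axis=1)
    order = np.argsort(distances)
    i1, i2 = int(order[0]), int(order[1])        # winner i* and runner-up i**
    weights[i1] += eps_b * (y - weights[i1])     # (13): move the winner toward y(n)
    edges.add(frozenset((i1, i2)))               # link i* and i** (no-op if already linked)
    for j in range(len(weights)):                # (14): drag the winner's graph neighbors
        if j != i1 and frozenset((i1, j)) in edges:
            weights[j] += eps_n * (y - weights[j])
    return i1
```

Repeated calls with matched feature vectors progressively pull the D3 codewords toward the regions of the feature space that the wearer's ECG actually visits.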
into the corresponding number of clock cycles $N_{cc}$ and, from there, we derived the energy expenditure, as in [32]. For the energy consumption plots of Section VIII-A, we considered a Cortex M4-90LP [34] processor, whose number of clock cycles per operation is detailed in Table 7-1 of [35]. As for the energy consumption per clock cycle, $E_{cc}$, in active mode the Cortex M4-90LP consumes 10.94 µA with the MCU operating at 1 MHz and the supply voltage being +3 V:
$$E_{cc} = 10.94\,\mu\text{A} \times 3\,\text{V} / 1\,\text{MHz} = 32.82 \cdot 10^{-12}\,\text{J}. \qquad (18)$$

B. TRANSMISSION ENERGY
When ECG samples are measured using a Zephyr Bioharness 3 module [36], the sampling frequency is 250 Hz, and each ECG sample takes 12 bits. This amounts to a transmission rate for a continuously streamed (uncompressed) ECG signal of 3 kbit/s. This is the setup considered for the results in Section VIII-B, whereas in Section VIII-A the bitrate is 3.96 kbit/s, as the sampling rate is higher (360 Hz with 11 bits per sample). The raw ECG signal is then compressed using SURF and transmitted through the wireless channel. Next, we detail how we estimated the energy consumption associated with the transmission of data packets as they travel from the wearable device to the data receiver. Towards this end, we consider the energy consumption figure of the Bluetooth LE Texas Instruments CC2541 radio [37], whose energy consumption per transmitted bit is $E_{bit} = 300$ nJ/bit (18.2 mA at 3.3 V considering a physical layer bitrate of 2 Mbps and the radio in active mode). The procedure that we now describe can be applied to any other radio, by plugging in the corresponding $E_{bit}$.

The energy consumption for each transmitted packet is obtained as $E_{packet} = E_{bit} \times$ packet_size, where packet_size = header_size + payload_size. No energy consumption is accounted for when the radio is in idle mode (between packet transmissions). The packet transmission process follows the Bluetooth LE protocol in the connected mode (in our case, a point-to-point connection between only one master and only one slave). In Bluetooth LE, a data packet consists of the following fields: preamble (1 byte), access address (4 bytes), link layer header (2 bytes), L2CAP header (4 bytes), which are followed by 3 bytes of ATT command type/attribute ID, $L_{data}$ information bytes (containing application data), and the CRC field (3 bytes), see [38]. This leads to a total protocol overhead of header_size = 17 bytes. For our results, we picked a payload size of $L_{data}$ = payload_size = 105 unencoded information bytes (leading to a protocol overhead of (17/122) × 100 = 13.9%), although the numerical results can be promptly adapted for any other value. Each side communicates with the other on a given period called the Connection Interval (CI), whose minimum value is 7.5 milliseconds. Each communication instance between the master and the slave is called a communication event, subsequent communication events are separated by CI seconds and a maximum of $n_{\max}$ data packets can be transmitted within this period. The maximum number of packets per second $PPS_{\max}$ that can be exchanged between the two devices is thus $PPS_{\max} = n_{\max}/CI_{\min}$, with $CI_{\min}$ expressed in seconds, and the maximum throughput is obtained as
$$Thr_{\max} = PPS_{\max} \times \text{payload\_size}. \qquad (19)$$
Here, $n_{\max}$ depends on the operating system of the terminals; for example, at the time of writing, Android has $n_{\max} = 6$, whereas iOS has $n_{\max} = 4$. Using (19), the maximum throughput for a wireless ECG monitor connected with an Android terminal ($n_{\max} = 6$) is thus: $Thr_{\max} = PPS_{\max} \times \text{payload\_size} = (6/0.0075) \times 105 = 672$ kbit/s. This maximum throughput is more than enough to support the transmission of the raw ECG signal (3 to 4 kbit/s).

The number of transmitted packets is computed according to the number of information bits that are to be transmitted by the radio, segmenting the bitstream into multiple data packets according to a fixed payload length of 105 information bytes. The energy consumption associated with the transmission of a single data packet is $E_{packet}$, as per the above discussion. Finally, the total energy consumption is computed as the sum of processing and transmission energy.
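The figures just derived can be reproduced with a few lines. This is a back-of-the-envelope sketch of ours using the constants quoted in this section; rounding the bitstream up to whole packets is our assumption.

```python
import math

E_BIT = 300e-9        # J/bit, TI CC2541 radio in active mode
HEADER_BYTES = 17     # BLE protocol overhead per data packet
PAYLOAD_BYTES = 105   # unencoded information bytes per packet
CI_MIN = 0.0075       # minimum connection interval [s]
N_MAX = 6             # packets per connection event (Android)

E_packet = E_BIT * 8 * (HEADER_BYTES + PAYLOAD_BYTES)   # energy per packet [J]
pps_max = N_MAX / CI_MIN                                # maximum packets per second
thr_max = pps_max * PAYLOAD_BYTES * 8                   # eq. (19) in bit/s (672 kbit/s)

def tx_energy(info_bits):
    """Radio energy to deliver a compressed bitstream of info_bits bits."""
    n_packets = math.ceil(info_bits / (PAYLOAD_BYTES * 8))
    return n_packets * E_packet

# one second of raw ECG at 3 kbit/s vs. a 50-fold compressed version
print(tx_energy(3000), tx_energy(3000 / 50))
```

Total consumption is then obtained by adding the processing term $E_{cc} N_{cc}$ from (18) to the transmission energy returned by tx_energy.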
Two additional metrics are considered in the performance analysis, i.e., the Compression Efficiency (CE) and the Root Mean Square Error (RMSE). CE has been computed as the ratio between the total number of bits that would be required to transmit the full signal and those required for the transmission of the compressed bitstream. The RMSE is used to represent the reconstruction fidelity, and is computed as the root mean square error between the original and the compressed signals, normalized with respect to the signal's peak-to-peak amplitude (p2p), that is
$$\text{RMSE} = \frac{100}{\text{p2p}} \left( \frac{\sum_{i=1}^{K} (x_i - \hat{x}_i)^2}{K} \right)^{1/2}, \qquad (20)$$
where $K$ corresponds to the total number of samples in the ECG trace, and $x_i$ and $\hat{x}_i$ are the original sample $i$ and that reconstructed at the decompressor (receiver side), respectively. The SURF default parameters have been set as follows: $\epsilon_b = 0.01$, $\epsilon_n = 0.005$, $\alpha = 0.5$, $\beta = 0.995$, $\gamma = 3$, $L_{\max} = 10$, $\lambda = 200$ and $a_{\max} = 100$. These parameters were selected empirically and provide a good tradeoff between RMSE and overhead (memory and compression efficiency).

VIII. NUMERICAL RESULTS
In this section, we show quantitative results for the proposed signal compression algorithms, detailing their energy consumption, compression efficiency and reconstruction fidelity. In Section VIII-A, we first assess the performance of the considered compression algorithms for the reference ECG traces from the PhysioNet MIT-BIH arrhythmia database [17]. In Section VIII-B, we extend our analysis to (artifact-prone) ECG traces that we collected from a Zephyr BioHarness 3 wearable chest monitor.

A. PHYSIONET ECG TRACES
For the first set of graphs, we considered the following ECG traces from the MIT-BIH arrhythmia database [17]: 101, 112, 115, 117, 118, 201, 209, 212, 213, 219, 228, 231 and 232, which were sampled at a rate of 360 samples/s with 11-bit resolution. Note that not all the traces in the database are usable (some are very noisy due to heavy artifacts, probably caused by the disconnection of the sensing devices) and an educated selection has to be carried out for a meaningful performance analysis, as done in previous work [17], [39]. The above performance metrics were obtained for these ECG signals and their average values are shown in the following plots.

1) SURF-TD
It is the time domain variant of SURF, in which the feature vectors correspond to the original signal segments, i.e., $\mathbf{y}(n) = \mathbf{x}(n)$. If an input segment is unmatched, the corresponding DCT coefficients are transmitted to the receiver. So, in this case the DCT transform is only applied if a new pattern that the current dictionary is unable to approximate is detected. In this case, $f$ of its DCT coefficients are sent to reconstruct it at the receiver ($f = 200$ is used for the SURF-TD curve in Fig. 4).

2) SURF
It is the feature domain implementation that we have described in Section VI, for which we considered the following values for the feature space size $f \in \{50, 75, 100, 150, 200\}$ (see Figs. 4, 5 and 7).

FIGURE 4. SURF – RMSE vs compression efficiency.

From Fig. 4, we see that SURF achieves the highest CE, up to 90-fold for the considered PhysioNet signals, whereas time domain processing allows for maximum efficiencies of 60-fold. As expected, increasing $f$ entails a smaller RMSE at the cost of a smaller CE. However, we see that when $f$ increases beyond 100 the RMSE performance gets affected and starts decreasing. In these cases, SURF behaves similarly to its time domain counterpart. This is because dictionary construction in feature space allows for more robustness and generalization capabilities than working in the time domain, which may lead to overfitting codewords to specific signal examples. This means that an optimal value of $f$ can be identified, which in our case is around $f \approx 100$. Fig. 5 shows the total energy consumption (adding up processing and transmission) and we see that savings of almost two orders of magnitude with respect to the case where the signal is sent uncompressed ("no-compression") are possible. This is further discussed below.
TABLE 2. Energy breakdown [no. operations] and consumption [µJ] for SURF. RMSE = 3.6% and CE = 76.6.
FIGURE 12. Original and reconstructed signal in the presence of artifacts for LTC, TASOM and SURF. (a) LTC: CE = 22 and RMSE = 2%. (b) LTC: CE = 29 and
RMSE = 3%. (c) TASOM: CE = 34 and RMSE = 2%. (d) TASOM: CE = 49 and RMSE = 3%. (e) SURF: CE = 43 and RMSE = 2%. (f) SURF: CE = 53 and
RMSE = 3%.
presence of anomalous ECG segments (toward the middle of the plots). Remarkably, although all algorithms have the same average RMSE, LTC heavily affects the ECG morphology. TASOM does a better job, but its dictionary is unable to effectively represent the new (anomalous) patterns. SURF provides the best results as it preserves the signal morphology, while achieving the highest CE, i.e., up to CE = 53.
FIGURE 16. Average normalized RMSE versus training time for the updated dictionary D3: the dictionary is trained on a first subject for the first 55 minutes. After that, the ECG trace of a different subject is used as the input time series. The dictionary at first produces high errors, but then quickly adapts and converges to the steady-state RMSE for the second subject.

FIGURE 17. Efficiency regions for SURF compression with different radios and MCUs. Radios: CC2420 (250 kbit/s, power 0 dBm), CC2541 (2 Mbit/s, at maximum power 0 dBm), CC2541LP (low rate 500 kbit/s and power −20 dBm). MCUs: Cortex-M4 versions 40LP, 90LP, 180ULL.
the section. At time zero, the dictionary is initialized using random ECG segments from the first subject, whereas its subsequent training follows the GNG-based algorithms of Section VI. A few observations are in order. As expected, when the training starts the error is higher (the RMSE is higher than 4% for the first subject for $L_{\max} = 10$) but it decreases with time and converges to the steady-state error within 20 minutes. After 55 minutes, the signal is swapped with that of another subject; this may for example occur when the wireless ECG monitor is handed over to another patient. At this point, we observe a peak in the RMSE, which suddenly increases from 2.8% to 4.1%. However, D3 is retrained and in about 20 more minutes converges to the new steady-state RMSE for the second subject. This shows that SURF gracefully adapts to new wearers, progressively tuning its dictionaries to their ECG patterns. From this graph, we also see that the RMSE depends on the maximum number of codewords in the dictionary, $L_{\max}$: an increasing $L_{\max}$ leads to higher accuracies. As a last remark, we recall that the RMSE in Fig. 16 corresponds to the representation error of SURF dictionaries, but the actual RMSE of the full SURF algorithm is always within the preset error tolerance. In fact, according to the algorithms of Section VI, when the RMSE is higher than a preset threshold the dictionary is not used, but the feature vector associated with the current segment is sent as the compressed representation. In other words, SURF automatically switches between dictionary-based compression and feature-based (e.g., DCT) compression, meeting the preset representation accuracy at all times.

Fig. 17 shows the energy consumption associated with radio transmission and processing, identifying the region where compression provides energy savings and is therefore recommended. We obtained this plot as follows. Let $B$ and $\hat{B}$ respectively be the number of bits to send over the channel when no compression is applied and those to be sent when the signal is compressed. With $N_{cc}(B, \varepsilon_f)$ we indicate the number of clock cycles that are needed to run the compression algorithm, which depends on the number of bits $B$ in the original signal and on the compression error $\varepsilon_f$ (which dictates a certain compression factor). Compression is convenient when the following inequality holds:
$$E_{cc} N_{cc}(B, \varepsilon_f) + E^0_{tx} \hat{B} < E^0_{tx} B, \qquad (21)$$
which means that the energy for compression added to that for transmission of the compressed sequence (left hand side) must be smaller than the energy that would be required to send the uncompressed signal (right hand side). Solving this inequality for $E^0_{tx}$, we find the minimum $E^0_{tx}$ that allows compression to be energy efficient, that is:
$$E^{0,\min}_{tx} = \frac{E_{cc} N_{cc}(B, \varepsilon_f)}{B - \hat{B}}. \qquad (22)$$
The lines plotted in Fig. 17 correspond to $E^{0,\min}_{tx}$ computed for several values of $\varepsilon_f$, which in turn imply different compression efficiencies (CE in the figure). The region in this plot where compression is advantageous (energy efficient region) is that for which $E^0_{tx} > E^{0,\min}_{tx}$, which corresponds to the region above the curves. As seen from the plot, the energy efficient regions weakly depend on the compression parameters: the number of clock cycles is almost constant for different settings (changing $\varepsilon_f$), $B$ is also constant since it depends on the sampling rate of the ECG monitor, and the only variable that changes is $\hat{B}$. Most importantly, in the graph we have also reported the energy consumption figures ($E^0_{tx}$ and $E_{cc}$) of several radios and MCUs (each radio/MCU pair is indicated by a filled dot in the figure). All of them fall within the efficient region and, as expected, compression provides the highest gain when the radio is energy hungry (CC2420) and the processor is energy efficient (Cortex M4-40LP). Before applying the SURF algorithm to any architecture, one should make sure that the selected combination of radio and MCU operates within the energy efficient region of Fig. 17.
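A sketch of the break-even computation behind Fig. 17 is given below (ours; the clock-cycle count $N_{cc}$ and the example numbers in the last line are placeholders to be replaced with profiled values).

```python
def min_energy_per_bit(E_cc, N_cc, B, B_hat):
    """Eq. (22): smallest per-bit radio energy for which compressing the
    trace from B down to B_hat bits is worthwhile."""
    return E_cc * N_cc / (B - B_hat)

def compression_is_convenient(E_tx0, E_cc, N_cc, B, B_hat):
    """Eq. (21): processing plus compressed transmission beats raw transmission."""
    return E_cc * N_cc + E_tx0 * B_hat < E_tx0 * B

# example with the Cortex M4-90LP figure from (18) and made-up N_cc, B, B_hat
print(min_energy_per_bit(32.82e-12, 2.0e6, 3960, 99))
```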
[39] Y. Zigel, A. Cohen, and A. Katz, "ECG signal compression using analysis by synthesis coding," IEEE Trans. Biomed. Eng., vol. 47, no. 10, pp. 1308–1316, Oct. 2000.
[40] Z. Zhang, T.-P. Jung, S. Makeig, and B. D. Rao, "Compressed sensing for energy-efficient wireless telemonitoring of noninvasive fetal ECG via block sparse Bayesian learning," IEEE Trans. Biomed. Eng., vol. 60, no. 2, pp. 300–309, Feb. 2013.

MOHSEN HOOSHMAND received the M.Sc. degree in computer engineering from the Isfahan University of Technology, in 2011, and the Ph.D. degree from the University of Padova, in 2017. He is currently a Postdoctoral Fellow with the Biomedical and Clinical Informatics Laboratory, Department of Computational Medicine and Bioinformatics, and a member of the Michigan Center for Integrative Research in Clinical Care, University of Michigan, USA. His research interests include signal and medical image processing, machine learning and Internet of Things devices.

DAVIDE ZORDAN received the M.Sc. degree in telecommunications engineering and the Ph.D. degree from the University of Padova, Italy, in 2010 and 2014, respectively. He is currently a Postdoctoral Researcher with the Department of Information Engineering, University of Padova. His research interests include stochastic modeling and optimization, protocol design, and performance evaluation for wireless networks, in-network processing techniques including compressive sensing, energy efficient protocols and energy harvesting techniques for WSNs, and wearable IoT devices.

TOMMASO MELODIA (S'02–M'07–SM'16) received the Ph.D. degree in electrical and computer engineering from the Georgia Institute of Technology, Atlanta, GA, USA, in 2007. He is currently an Associate Professor with the Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA. He is also serving as the lead PI on multiple grants from U.S. federal agencies including the National Science Foundation, the Air Force Research Laboratory, the Office of Naval Research, and the Army Research Laboratory. He is the Director of Research for the PAWR Project Office, a public-private partnership that is developing four city-scale platforms for advanced wireless research in the U.S. His research focuses on modeling, optimization, and experimental evaluation of wireless networked systems, with applications to 5G networks and the Internet of Things, software-defined networking, and body area networks. He is the Technical Program Committee Chair for IEEE INFOCOM 2018. He is a recipient of the National Science Foundation CAREER award and of several other awards. He is an Associate Editor for the IEEE Transactions on Wireless Communications, the IEEE Transactions on Mobile Computing, the IEEE Transactions on Biological, Molecular, and Multi-Scale Communications, Computer Networks, and Smart Health.

MICHELE ROSSI (SM'–) is currently an Associate Professor with the Department of Information Engineering, University of Padova, Italy. In the last few years, he has been actively involved in EU projects on IoT technology and has collaborated with SMEs, such as Worldsensing (Barcelona, ES), in the design of optimized IoT solutions for smart cities, and with large companies, such as Samsung and Intel. He is the author of over 100 scientific papers published in international conferences, book chapters and journals and has been the recipient of four Best Paper Awards from the IEEE. His current research interests are centered around wireless sensor networks, the Internet of Things (IoT), green 5G mobile networks and wearable computing. In 2014, he was the recipient of a Samsung GRO award with a project entitled Boosting Efficiency in Biometric Signal Processing for Smart Wearable Devices. Since 2016, he has been collaborating with Intel on the design of IoT protocols exploiting cognition and machine learning, as part of the Intel Strategic Research Alliance Research and Development program. His research is also supported by the European Commission on green 5G mobile networks. He currently serves on the Editorial Board of the IEEE Transactions on Mobile Computing.