Detecting epileptic seizures using machine learning and interpretable features of human EEG
Received 31 August 2022 / Accepted 26 October 2022 / Published online 17 November 2022
© The Author(s), under exclusive licence to EDP Sciences, Springer-Verlag GmbH Germany, part of
Springer Nature 2022
Abstract Epilepsy is a neurological disorder distinguished by sudden and unexpected seizures. To
diagnose epilepsy, clinicians record the signals of brain electrical activity (electroencephalograms,
EEG) and extract segments with seizures. This enables characterizing the seizure type and finding the
onset zone, the brain area where seizures originate. The procedure requires manual EEG deciphering,
which is slow and calls for the assistance of machine learning (ML) algorithms. Traditionally, ML
handles this task in a supervised fashion: after training on representative data, it constructs a
boundary in the feature space that separates the classes. As the number of features grows, this
boundary becomes more complex and generalizes worse. The feature space of brain data is high
dimensional: a standard recording includes 30 signals and 50 frequencies, resulting in 1500 features.
Using additional time-domain features may enlarge the feature space further. Thus, selecting
appropriate features is a major part of successful classification. The selection procedure relies
either on a data-based mathematical approach (e.g., principal components, PCs) or on expert domain
knowledge of the data (explainable features, EFs). Here, we demonstrate the benefits of using EFs.
For the EEG data of 30 epileptic patients, we trained a RandomForest algorithm using PCs and EFs.
The feature importance analysis revealed that explainable features outperform principal components.
674 Eur. Phys. J. Spec. Top. (2023) 232:673–682
of the most common approaches to obtain this information is the electroencephalogram (EEG) study: the patients are monitored for a period of time, with occasional functional trials to stimulate the arousal of epileptiform activity [12]. While this method is fairly reliable, it has certain issues. Firstly, proper epilepsy diagnostics requires collecting data for a representative number of events, which is only possible during prolonged continuous EEG monitoring. Studies show that it is common to require more than three days of EEG recording to diagnose the nature of paroxysmal episodes [12]. This issue occurs partly due to the high variability of epileptic activity: the exact underlying cause of epilepsy is usually unknown and can include brain injury, stroke, tumor, congenital disabilities, etc. [13–15]. Secondly, the EEG approach relies heavily on data deciphering, which is commonly done manually in clinical practice [16]. Visual analysis requires much effort: an experienced specialist can spend hours reviewing the data of a single patient. Additionally, the human factor is involved, which can lead to an increased error rate under conditions of high workload and fatigue. Misdiagnosis can have a heavy impact on the patient’s physical and mental health and require its own treatment and rehabilitation. Thus, an expert requires assistance from automated systems for seizure detection [17]. While fully automated detection of epileptic seizures seems very attractive, even modern methods in this field still carry a high chance of misdiagnosis. The working solution here is partial automation, well known as the Clinical Decision Support System (CDSS) [18]. In a CDSS, the computer analyzes data and provides recommendations, and the medical expert makes the final decision.
A promising approach to automated epileptic seizure detection is machine learning (ML) [19, 20]. In the case of ML, seizure detection takes the form of a classifier that commonly distinguishes two classes in EEG data: “seizures” and “non-seizures” [21, 22]. A wide variety of ML techniques have been applied to this task, including support vector machine (SVM) [23–26], random forest [27–29], artificial neural network (ANN) [30, 31], k-nearest neighbors (kNN) [32, 33], and deep learning [34].
As mentioned above, epileptic activity can be highly variable, which leads to under-representation and a non-robust EEG footprint of an epileptic pattern. As a result, direct application of an ML classifier to a raw EEG dataset may not produce sufficiently meaningful patterns. Thus, in most cases the ML approach requires informative input features, which are commonly derived from the time and frequency domains of EEG data [35]. Extensive research on the time-frequency structure of epileptic EEG [14, 15] reveals some major time-domain features, for example, repeatability, regularity (periodicity), synchronicity and amplitude variation of EEG, that are considered able to differentiate an epileptic seizure from normal activity [36]. Various transformation techniques, including Fourier transformation (FT), discrete wavelet transformation (DWT), and continuous wavelet transformation (CWT) [37, 38], are applied to EEG data to provide time-domain based features, for example, line length, frequency and energy [39, 40]. However, this approach often leads to a great increase in the number of features, which, in turn, negatively affects computational costs, response time and performance.
ML commonly addresses classification in a supervised fashion: an algorithm is trained on a set of previously labeled data to estimate outputs for unlabeled data [41, 42]. In this form, machine learning is often used to diagnose neural activity in the brain [43, 44]. A review shows that the majority of existing seizure detection methods rely on supervised ML algorithms [45]. While this approach generally demonstrates higher performance, it can suffer from class imbalance and overfitting. The class imbalance originates from the rare nature of seizures and requires artificial balancing of “seizure” and “non-seizure” examples in the training set. One way to manage the imbalance is to construct a feature space in which the classes lie far apart. However, stretching this concept too far often leads to overfitting, meaning that the algorithm performs satisfactorily on the training data but fails to properly classify test data. Addressing this issue relies on constructing a feature subspace with biomarkers of seizures that are common to most patients. This reasoning leads us to the problem of feature selection and interpretability, which often occurs in ML. It is crucial to analyze the obtained feature space to find the most important features and to perform a feature reduction procedure. In this work, we aimed to propose an ML-based approach to epileptic EEG marking that uses a specific set of features and can possibly be applied in a CDSS.
2 Methods
Figure 1 illustrates the whole pipeline of the research. Each separate step is explained in detail further in the paper.
2.1 Participants
In the study, we used anonymized long-term EEG and video-monitoring data of 30 adult subjects (15 males and 15 females, age 33.4 ± 9.4) with a confirmed diagnosis of “focal epilepsy”. The experimental dataset was provided by the National Medical and Surgical Center named after N. I. Pirogov of the Russian Healthcare Ministry (Moscow, Russia). All subjects were patients of the Department of Neurology and Clinical Neurophysiology in 2017–2019. Medical procedures were held in the Center following the Helsinki Declaration and the Center’s medical regulations, and were approved by the local ethics committee. All patients provided written informed consent before the treatment. The data were collected during the patients’ regular daily routine and occasional standard physiological trials such as photic stimulation and hyperventilation [46]. The length of the monitoring varied from 8 to 57 h and depended on the patient’s individual condition [12]. Each patient had from one to five epileptic seizures during the monitoring. While all the patients were subjected to physiological trials, none of the seizures was triggered by this stimulation; i.e., all epileptic seizures were spontaneous. Recorded EEG and video-monitoring data of the patients were retrospectively analyzed by the experts from the Center, and all epileptic seizures were marked.
2.2 Data acquisition and preprocessing
EEG signals were recorded with a “Micromed” encephalograph (Micromed S.p.A., Italy). The dataset for each patient included 25 channels arranged in accordance with the international “10–20” system. The ground electrode was placed on the forehead, and reference electrodes were placed at the ears. The sampling rate of the EEG data was 128 Hz. The video monitoring system was used to track the patients’ states for easier data marking.
EEG signals are known to be highly susceptible to the influence of various external and internal noises, especially during prolonged recording [47]. In clinical monitoring, external noises usually emerge through poor contact of EEG electrodes, power grid and cell phone interference, etc. Internal noises (physiological artifacts) originate from physiological processes such as heartbeat, blinking, or breathing [48]. To deal with low- and high-frequency noises, we applied a band-pass filter with cutoff frequencies of 1 Hz and 60 Hz. Additionally, we used a 50-Hz notch filter to diminish power grid interference. We considered the frequency band 2–30 Hz, which includes all commonly studied waveforms (delta, theta, alpha, beta) and is often regarded as an effective frequency range of EEG [16]. To remove some undesired activity that can interfere in this frequency range (e.g., blinking artifacts), we used a standard procedure based on independent component analysis (ICA) [49].
Studies on epileptic EEG show that seizures manifest as “outliers” in EEG data [26, 50, 51]. However, outliers in data can also be caused by external interference such as mechanical impact on EEG electrodes, which is quite common in prolonged EEG recordings [52]. The existence of two types of outliers in the data can negatively affect the training of an ML classifier and its ability to distinguish the two classes. In our work, we removed outliers in normal but not in epileptic EEG activity. Such preprocessing requires preliminary data labeling and analysis, which contradicts the purpose of the classifier. Therefore, we removed outliers only from the training dataset, while validation and testing were performed on unaltered data.
To construct the feature space from EEG data, we performed time–frequency analysis of EEG signals using the CWT with the Morlet mother wavelet function [53, 54]. We considered wavelet power (WP), as it is a common CWT-based characteristic for describing the time-frequency structure of epileptic EEG [55, 56]:
$$W_n(f, t) = |w_n(f, t)|, \tag{1}$$
where $n = 1, 2, \ldots, N$ is the index of the EEG channel ($N = 25$ for the dataset used), $f$ and $t$ are the frequency and time point, and $w_n(f, t)$ are the CWT coefficients.
To reduce the obtained feature space, we considered two additional steps. The first step included averaging WP over the EEG channels. This approach is inspired by the spatial distribution of EEG activity during epileptic seizures. In generalized seizures, activity arises suddenly all over the brain, and all EEG signals are highly correlated [57]. In focal seizures, activity is localized in a few EEG channels near the focus; however, these channels stand out in terms of the time-frequency structure of the EEG signal, so even after averaging over the channels, WPs for normal and pathological activity differ significantly. While this approach eliminates the spatial distribution of EEG activity, it can help to differentiate normal and epileptic activity without knowledge of the focus location. We calculated the averaged WP (AWP) by averaging WP values over $N = 25$ EEG channels:
$$E(t) = \frac{1}{N} \sum_{n=1}^{N} W_n(f, t) \tag{2}$$
The second step further decreased the complexity of the data via “downsampling” of the AWP. We divided each EEG recording into 60-second intervals $T_m$, where $m = 1, 2, \ldots, M$, $M = L // 60$, $L$ is the length of the EEG recording in seconds, and “//” stands for integer division. The choice of this interval length is justified by the average duration of an epileptic seizure, from 30 to 120 s [58]. AWP values were calculated for each time interval $T_m$ and averaged over the whole length of the interval to obtain the “downsampled” AWP (DAWP):
$$e_m = \frac{1}{\Delta T} \int_{t \in T_m} E(t)\,dt, \tag{3}$$
where $\Delta T$ is the length of each interval $T_m$ ($\Delta T = 60$ s); as in Eq. (2), the dependence on the frequency $f$ is kept implicit.
3 Machine learning
3.1 Feature engineering
The initial feature space consisted of DAWP spectra, but we aimed to introduce several additional features. Figure 2 illustrates typical DAWP spectra of a single patient. The red curve corresponds to the epileptic seizure, and the blue curves are the spectra obtained at the neighbouring time points before and after the seizure. The green curve reflects the spectrum averaged across the whole recording of this patient. Extensive research on epileptic EEG reveals certain peculiarities of seizures in comparison to normal EEG [4, 14, 15, 59, 60], so introducing new features that capitalize on this difference can greatly help in seizure detection.
It is well known that epileptic seizures occur due to abnormal excessive or synchronous neuronal activity in the brain [61]. This abnormality suggests that EEG activity during a seizure is generally different. This means that basic properties of the EEG spectrum (the DAWP spectrum in our case), such as dominant frequencies, peak energy, and energy distribution across frequencies, should also differ between epileptic and normal activity [4].
According to Fig. 2, the DAWP spectrum has much higher power during a seizure. Moreover, it demonstrates a large deviation of the power between low and high frequencies. Therefore, we introduce two features capturing these properties:
• Mean: mean DAWP across the 2–30 Hz range
• Variance: variance of DAWP across the spectrum
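Equations (1)–(3) and the Mean and Variance features can be illustrated with a short NumPy sketch. It is a minimal illustration under stated assumptions, not the authors' code: the wavelet power array `wp` is assumed to be precomputed, the function names `dawp_spectra` and `mean_and_variance` are hypothetical, the integral in Eq. (3) is approximated by a discrete mean over the samples of each interval, and the population variance is used because the text does not specify the estimator.

```python
import numpy as np


def dawp_spectra(wp, fs=128, interval_s=60):
    """Downsampled averaged wavelet power (DAWP), following Eqs. (1)-(3).

    wp: array of shape (n_channels, n_freqs, n_samples) holding the wavelet
        power W_n(f, t) of every channel (assumed to be precomputed).
    Returns an array of shape (n_intervals, n_freqs): one spectrum per interval.
    """
    wp = np.asarray(wp, dtype=float)
    awp = wp.mean(axis=0)                   # Eq. (2): average over channels
    win = int(fs * interval_s)              # samples in one interval T_m
    n_intervals = awp.shape[1] // win       # M = L // 60
    if n_intervals == 0:
        raise ValueError("recording is shorter than one interval")
    trimmed = awp[:, :n_intervals * win]    # drop the incomplete tail
    # Eq. (3): time average over each interval, taken as a discrete mean
    dawp = trimmed.reshape(awp.shape[0], n_intervals, win).mean(axis=2)
    return dawp.T


def mean_and_variance(dawp):
    """Mean and Variance features: one value of each per interval."""
    # np.var is the population variance (an assumption, see the lead-in)
    return dawp.mean(axis=1), dawp.var(axis=1)


# Toy data only: 25 channels, 29 frequencies, and a low sampling rate
# (fs=4) so that three 60-s intervals stay small in memory.
rng = np.random.default_rng(0)
wp = rng.random((25, 29, 4 * 60 * 3 + 50))
dawp = dawp_spectra(wp, fs=4)
mean_feat, var_feat = mean_and_variance(dawp)
print(dawp.shape, mean_feat.shape, var_feat.shape)
```

Each row of the returned array is one DAWP spectrum, so any further per-interval feature reduces to a row-wise operation on it.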
Additional features that assess the similarity between normal and epileptic data can be introduced using cosine similarity. This approach treats the DAWP spectrum in each time interval $T_m$ as a vector, and it is especially popular in ML methods [62, 63]. We introduced the feature SimToMean as the cosine similarity between the DAWP spectrum at a given time interval $T_m$ and the mean DAWP spectrum for the patient (green curve in Fig. 2). We suppose that this feature, in addition to Mean and Variance, can capitalize on the contrast between seizure and normal EEG.
Epileptic seizures, in addition to being abnormal and excessive activity, also occur spontaneously [64]. This fact suggests that EEG activity during the seizure differs greatly from the activity before and after the seizure. To assess this difference, we introduced another cosine similarity-based feature, SimToNeigh. We calculated SimToNeigh as the mean cosine similarity between the DAWP spectrum at a given time interval $T_m$ and each of the DAWP spectra from the neighboring intervals ($T_{m-3}$, $T_{m-2}$, $T_{m-1}$, $T_{m+1}$, $T_{m+2}$, $T_{m+3}$); these spectra are marked in blue in Fig. 2.
A deep understanding of the spectrum structure can also improve seizure detection. Our recent research demonstrated that some parts of the spectrum are more prone to reflect epileptic activity. In [50], we reported that absence seizures in WAG/Rij rats induced a drastic increase of WP in the frequency range of 6–8 Hz, while there were no manifestations of such behavior at other frequencies. In the follow-up research [26, 51], we showed that epileptic seizures in human patients demonstrate similar behavior in the frequency range of 2–5 Hz and not in the rest of the spectrum (5–30 Hz). Figure 2 clearly illustrates this statement. During the seizure, wavelet power (red curve) reaches its highest values at low frequencies and rapidly decreases within the 2–5 Hz frequency range. For normal activity, the difference in wavelet power between the 2–5 Hz and 5–30 Hz ranges is much smaller (blue curves). Thus, the pronounced difference between low- and high-frequency EEG activity can be considered a marker of an epileptic seizure. Based on this conclusion, we introduced another feature, FreqDiff, as the difference between DAWPs averaged over low (2–5 Hz) and high (5–30 Hz) frequencies.
Thus, we derived five new features from the data: Mean, Variance, SimToMean, SimToNeigh, and FreqDiff. We aimed to use them along with the original DAWP spectra to construct the ML model. However, each DAWP spectrum contains many features, since the spectrum was calculated in the 2–30 Hz range with a 0.1 Hz step.
A large number of features negatively affects the time needed for ML model training. Moreover, DAWP values at neighboring frequencies, such as 2.1 and 2.2 Hz, are highly correlated, which leads to data redundancy. To lower the dimensionality of the feature set, we used principal component analysis (PCA) [65]. The analysis showed that the first two components (PCA0 and PCA1) account for 97.18% of the variance in the initial data. These principal components (PCs) are shown in Fig. 3. Red dots correspond to the epileptic seizures, and blue dots to the segments of normal activity. Although these PCs explain 97.18% of the variance, projecting the data onto the reconstructed feature space barely allows separating seizures from normal EEG.
Finally, correlation analysis showed a high correlation between Mean and PCA0, so we decided to remove Mean from the feature set. In the end, we used six features to construct the ML model: PCA0, PCA1, Variance, SimToMean, SimToNeigh, and FreqDiff.
3.2 Algorithm
We used RandomForest, a popular supervised learning algorithm, which builds a forest from an ensemble of decision trees and averages their outputs [66]. We chose RandomForest for the following advantages: (i) due to binning the variables, RandomForest is largely insensitive to outliers; (ii) it handles both linear and non-linear relationships; (iii) it balances the bias-variance trade-off, which helps to limit overfitting; (iv) it can automatically balance data sets when one class is more infrequent than another; (v) it provides feature importance, hence allowing interpretation.
In the usual tree construction (Classification and Regression Tree, CART), each node corresponds to a subset of data. Initially, the root node contains all data, and at each node, the algorithm searches through all variables to find the best split into two child nodes. The algorithm splits all the way down and then prunes the tree back to get the minimal test set error.
In RandomForest, the root node contains a bootstrap sample of the data of the same size as the original data. A different bootstrap sample is drawn for each tree. If the training set consists of $N$ samples and $M$ is the feature space dimension, then an integer $m$ is a fixed parameter ($m \ll M$, commonly $m \approx \sqrt{M}$). At each node, $m$ of the variables are selected at random. Only these variables are searched through for the best split. The largest tree possible is grown and is not pruned. The forest consists of $K$ trees. To classify a new object having coordinates $x$, $x$ is put down each of the $K$ trees. Each tree gives a classification for $x$. The forest chooses the classification having the most out of $K$ votes [67].
In our work, we built a forest of 500 trees using the “sklearn” library in Python. To control overfitting, we set restrictions on the growth of the trees. The minimum number of samples required to be at a leaf node was set to 3. Therefore, a split point at any depth was considered only if it left at least 3 training samples in each of the left and right branches. Finally, the maximum depth was set to 5; therefore, at most five successive splits were available along any path from the root to a leaf. Other parameters were left at their default values.
In training the ML model, we used a custom cross-validation function. In our case, this function is close to “leave-one-out” cross-validation, but applied across patients. The model is trained on 29 patients out of the total 30 and tested on the one remaining patient. This approach imitates the situation in medical practice when we have an ML algorithm trained on $K$ patients and need to diagnose a new, $(K + 1)$-th, patient; after that, we can retrain the algorithm on $(K + 1)$ patients and prepare it for the $(K + 2)$-th patient, etc.
In our work, we considered “seizure” to be the “positive” class, so our two-class classifier has 4 possible outcomes with the corresponding meanings:
• True Positive (TP): detected epileptic seizure, i.e., a seizure identified as a seizure.
• True Negative (TN): an episode of normal activity identified as normal activity.
• False Positive (FP): false detection, i.e., an episode of normal activity identified as a seizure.
• False Negative (FN): missed epileptic seizure, i.e., a seizure identified as an episode of normal activity.
In clinical practice, it is commonly important not to miss any seizures, since each episode of epileptic activity can be crucial for diagnostics. With this in mind, we chose Recall (Eq. (4)) as the main metric to evaluate the efficiency of the classifier, since Recall reflects the percentage of detected seizures, and a classifier with higher Recall can be considered more promising for clinical purposes. However, a possible application of an epileptic activity classifier includes preliminary EEG marking and reducing the workload of the human expert. This implies that the classifier marks some segments of the EEG recording as epileptic activity, and these segments are then examined by the expert. It is important that the classifier-marked segments include as many true seizures as possible, so Precision (Eq. (5)) is another important metric.
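The cosine-similarity features, the forest settings and the patient-wise cross-validation described in this section can be sketched as follows. This is a hedged illustration rather than the authors' implementation. The helper names (`cosine`, `similarity_features`, `leave_one_patient_out`) are hypothetical. Neighbours that fall outside the recording are simply skipped, 5 Hz is assigned to the high band, and FreqDiff is taken as low minus high. scikit-learn's `LeaveOneGroupOut` stands in for the custom cross-validation function, and the random seed is an arbitrary choice. Because Eqs. (4) and (5) are not reproduced in this text, Recall and Precision are assumed to follow the standard definitions TP/(TP + FN) and TP/(TP + FP). The PCA step and the outlier removal on the training set are omitted for brevity.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import LeaveOneGroupOut


def cosine(a, b):
    """Cosine similarity between two spectra treated as vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def similarity_features(dawp, freqs, n_neigh=3, split_hz=5.0):
    """Columns SimToMean, SimToNeigh, FreqDiff for one patient.

    dawp: (n_intervals, n_freqs) DAWP spectra, at least two intervals;
    freqs: (n_freqs,) frequencies in Hz.
    """
    dawp, freqs = np.asarray(dawp, dtype=float), np.asarray(freqs)
    mean_spectrum = dawp.mean(axis=0)
    low, high = freqs < split_hz, freqs >= split_hz  # 5 Hz -> high (assumed)
    rows = []
    for m, spec in enumerate(dawp):
        # up to n_neigh neighbours on each side; fewer at the recording edges
        neigh = [k for k in range(m - n_neigh, m + n_neigh + 1)
                 if k != m and 0 <= k < len(dawp)]
        rows.append([
            cosine(spec, mean_spectrum),
            np.mean([cosine(spec, dawp[k]) for k in neigh]),
            spec[low].mean() - spec[high].mean(),
        ])
    return np.array(rows)


def leave_one_patient_out(X, y, patient_ids, n_trees=500):
    """Train on all patients but one and test on the held-out patient."""
    scores = {}
    for train, test in LeaveOneGroupOut().split(X, y, groups=patient_ids):
        model = RandomForestClassifier(
            n_estimators=n_trees, max_depth=5, min_samples_leaf=3,
            random_state=0,  # fixed seed: an assumption for reproducibility
        )
        model.fit(X[train], y[train])
        pred = model.predict(X[test])
        scores[patient_ids[test][0]] = (
            recall_score(y[test], pred, zero_division=0),     # TP / (TP + FN)
            precision_score(y[test], pred, zero_division=0),  # TP / (TP + FP)
        )
    return scores


# Toy check on synthetic data: 4 "patients", 40 intervals each, 6 features
rng = np.random.default_rng(0)
patient_ids = np.repeat(np.arange(4), 40)
y = np.tile(np.r_[np.ones(8, dtype=int), np.zeros(32, dtype=int)], 4)
X = rng.normal(size=(160, 6)) + 5.0 * y[:, None]
for pid, (rec, prec) in leave_one_patient_out(X, y, patient_ids, n_trees=50).items():
    print(f"patient {pid}: recall={rec:.2f}, precision={prec:.2f}")
```

Grouping the folds by patient keeps all intervals of the held-out patient out of training, which mirrors the "new patient" scenario described above.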
4 Results
We used the developed classifier based on the RandomForest algorithm to classify all data in the EEG dataset. Results are presented in Table 1.
The classifier provides Recall = 78.67 ± 1.33% (mean ± standard error (SE)) and Precision = 5.33 ± 0.22%. These results are comparable to our previous work [26]. In [26], we proposed an unsupervised classifier for epileptic activity, which achieved Recall = 76.97 ± 4.4% and Precision = 12.7 ± 1.47% on a similar epileptic EEG dataset. Table 1 shows that Recall commonly takes one of two opposite values: 100% (23 subjects) or 0% (5 subjects), which was also the case in the previous work [26]. We theorize that the small number of seizures in the data (usually only one) can result in Recall = 100% or 0%, depending on whether this only seizure is detected or missed. However, there are some cases where multiple seizures were all detected (patients 24 and 30) or all missed (patient 22). The fact that we obtained such similar results with drastically different approaches (supervised RandomForest in this work and unsupervised SVM in [26]) may suggest that there are some peculiarities in the data itself, and that the explainable features used reflect them well.
These results again bring up the importance of data analysis and feature selection. In our work, we analyzed feature significance and ranked the features. Results are presented in Table 2.
Table 2 Feature ranking
Feature       Importance (%)
Variance      31.68
SimToNeigh    26.93
FreqDiff      24.65
PCA0           7.97
PCA1           7.04
SimToMean      1.73
Table 2 shows that the three most significant features (Variance, SimToNeigh and FreqDiff) together contribute 83.26% to classification. At the same time, the features PCA0 and PCA1, which account for 97.18% of the variance in the “raw” data, contribute only ∼15%. This is an important result: the most significant features are based on knowledge of EEG data and the peculiarities of seizure activity, while the features derived mathematically have low significance for classification.
5 Conclusion
In this paper, we demonstrated the importance of using explainable features for ML. For the EEG data of 30 patients, we trained a RandomForest classifier to distinguish between epileptic seizures and normal activity. RandomForest provides estimates of feature importance, hence enabling interpretation of the classification rule. We designed a set of features for the classifier, some of which were derived from the raw EEG data with a mathematical approach based on principal component analysis (PCA), while others were based on the known peculiarities of epileptic activity. As a result, the classifier demonstrated a recall of 77%, which is comparable with other models trained on these data. Finally, the RandomForest algorithm assigned importances of about 32, 27, and 25% to the three leading interpretable features, while the most informative principal components had importances of 8 and 7%, respectively. We believe that this result emphasizes the importance of using explainable ML features, and that these features are the first step towards a fully explainable ML algorithm.
Acknowledgements The study was supported by the Project 36-L-22 of the Priority 2030 program of Immanuel Kant Baltic Federal University. VM thanks the President Grant (MD-2824.2022.1.2) for support in formulating the research hypothesis. VG thanks the President Grant (MK-2603.2022.1.6 and MD-590.2022.1.2) for support in data analysis.
Author contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by OEK, SA, VVG, VM, SK, NU, and AEH. The first draft of the manuscript was written by VVG, VM, and AEH, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Data availability The data that support the findings of this study are available from the National Medical and Surgical Center named after N. I. Pirogov of the Russian Healthcare Ministry, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of the National Medical and Surgical Center named after N. I. Pirogov of the Russian Healthcare Ministry.
Declarations
Conflict of interest All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
References
1. W.H. Organization, G.C. against Epilepsy, P. for Neurological Diseases, N.W.H. Organization), I.B. for Epilepsy, W.H.O.D. of Mental Health, S. Abuse, I.B. of Epilepsy, I.L. against Epilepsy, Atlas: epilepsy care in the world (World Health Organization, 2005)
2. E. Beghi, The epidemiology of epilepsy. Neuroepidemiology 54(2), 185–191 (2020)
3. R.S. Fisher, C. Acevedo, A. Arzimanoglou, A. Bogacz, J.H. Cross, C.E. Elger, J. Engel Jr., L. Forsgren, J.A. French, M. Glynn et al., ILAE official report: a practical clinical definition of epilepsy. Epilepsia 55(4), 475–482 (2014)
4. R.D. Thijs, R. Surges, T.J. O’Brien, J.W. Sander, Epilepsy in adults. Lancet 393(10172), 689–701 (2019)
5. G. Motamedi, K. Meador, Epilepsy and cognition. Epilepsy Behav. 4, 25–38 (2003)
6. J.W. Sander, The use of antiepileptic drugs-principles and practice. Epilepsia 45, 28–34 (2004)
7. S. Ghosh, J.K. Sinha, T. Khan, K.S. Devaraju, P. Singh, K. Vaibhav, P. Gaur, Pharmacological and therapeutic approaches in the treatment of epilepsy. Biomedicines 9(5), 470 (2021)
8. P. Ryvlin, J.H. Cross, S. Rheims, Epilepsy surgery in children and adults. Lancet Neurol. 13(11), 1114–1126 (2014)
9. G.K. Bergey, Neurostimulation in the treatment of epilepsy. Exp. Neurol. 244, 87–95 (2013)
10. World Health Organization, Epilepsy: a public health imperative (World Health Organization, 2019)
11. C.E. Elger, C. Hoppe, Diagnostic challenges in epilepsy: seizure under-reporting and seizure detection. Lancet Neurol. 17(3), 279–288 (2018)
12. D.E. Friedman, L.J. Hirsch, How long does it take to make an accurate diagnosis in an epilepsy monitoring unit? J. Clin. Neurophysiol. 26(4), 213–217 (2009)
13. S.D. Shorvon, The etiologic classification of epilepsy. Epilepsia 52(6), 1052–1057 (2011)
14. E.M. Goldberg, D.A. Coulter, Mechanisms of epileptogenesis: a convergence on neural circuit dysfunction. Nat. Rev. Neurosci. 14(5), 337–349 (2013)
15. G.D. Hammer, S.J. McPhee, M.H. Education, Pathophysiology of Disease: An Introduction to Clinical Medicine (McGraw-Hill Education Medical, New York, 2014)
16. W.O. Tatum IV., Handbook of EEG Interpretation (Springer Publishing Company, New York, 2021)
17. S. Beniczky, S. Wiebe, J. Jeppesen, W.O. Tatum, M. Brazdil, Y. Wang, S.T. Herman, P. Ryvlin, Automated seizure detection using wearable devices: a clinical practice guideline of the international league against epilepsy and the international federation of clinical neurophysiology. Clin. Neurophysiol. 132(5), 1173–1184 (2021)
18. E.S. Berner, Clinical Decision Support Systems, vol. 233 (Springer, Berlin, 2007)
19. M. Mohri, A. Rostamizadeh, A. Talwalkar, Foundations of Machine Learning (MIT Press, Cambridge, 2018)
20. M.K. Siddiqui, R. Morales-Menendez, X. Huang, N. Hussain, A review of epileptic seizure detection using machine learning classifiers. Brain Inform. 7, 1–18 (2020)
21. A.T. Tzallas, M.G. Tsipouras, D.I. Fotiadis, Automatic seizure detection based on time-frequency analysis and artificial neural networks. Comput. Intell. Neurosci. 2007, 80510 (2007)
22. J. Birjandtalab, M.B. Pouyan, M. Nourani, in First International Workshop on Pattern Recognition, vol. 10011 (International Society for Optics and Photonics, 2016), p. 100110M
23. V. Chavakula, I.S. Fernández, J.M. Peters, G. Popli, W. Bosl, S. Rakhade, A. Rotenberg, T. Loddenkemper, Automated quantification of spikes. Epilepsy Behav. 26(2), 143–152 (2013)
24. M. Zabihi, Patient-specific epileptic seizure detection in long-term EEG recording in paediatric patients with intractable seizures. Master’s Thesis, Tampere University of Technology (2013)
25. O. Fasil, R. Rajesh, Time-domain exponential energy for epileptic EEG signal classification. Neurosci. Lett. 694, 1–8 (2019)
26. O.E. Karpov, V.V. Grubov, V.A. Maksimenko, S.A. Kurkin, N.M. Smirnov, N.P. Utyashev, D.A. Andrikov, N.N. Shusharina, A.E. Hramov, Extreme value theory inspires explainable machine learning approach for seizure detection. Sci. Rep. 12(1), 1–14 (2022)
27. C. Donos, M. Dümpelmann, A. Schulze-Bonhage, Early seizure detection algorithm based on intracranial EEG and random forest classification. Int. J. Neural Syst. 25(05), 1550023 (2015)
28. N.D. Truong, L. Kuhlmann, M.R. Bonyadi, J. Yang, A. Faulks, O. Kavehei, Supervised learning in automatic channel selection for epileptic seizure detection. Expert Syst. Appl. 86, 199–207 (2017)
29. K.D. Tzimourta, A.T. Tzallas, N. Giannakeas, L.G. Astrakas, D.G. Tsalikakis, P. Angelidis, M.G. Tsipouras, A robust methodology for classification of epileptic seizures in EEG signals. Heal. Technol. 9(2), 135–142 (2019)
30. L. Guo, D. Rivero, J. Dorado, J.R. Rabunal, A. Pazos, Automatic epileptic seizure detection in EEGs based on line length feature and artificial neural networks. J. Neurosci. Methods 191(1), 101–109 (2010)
31. J. Birjandtalab, V.N. Jarmale, M. Nourani, J. Harvey, in 2018 IEEE Biomedical Circuits and Systems Conference (BioCAS) (IEEE, 2018), pp. 1–4
32. S. Siuly, E. Kabir, H. Wang, Y. Zhang, Exploring sampling in the detection of multicategory EEG signals. Comput. Math. Methods Med. 2015, (2015)
33. S. Lahmiri, A. Shmuel, Accurate classification of seizure and seizure-free intervals of intracranial EEG signals from epileptic patients. IEEE Trans. Instrum. Meas. 68(3), 791–796 (2018)
34. J.H. Kang, Y.G. Chung, S.P. Kim, An efficient detection of epileptic seizure by differentiation and spectral analysis of electroencephalograms. Comput. Biol. Med. 66, 352–356 (2015)
35. B. Direito, J. Duarte, C. Teixeira, B. Schelter, M. Le Van Quyen, A. Schulze-Bonhage, F. Sales, A. Dourado, Feature selection in high dimensional EEG features spaces for epileptic seizure prediction. IFAC Proc. Vol. 44(1), 6206–6211 (2011)
36. V. Harpale, V. Bairagi, An adaptive method for feature selection and extraction for classification of epileptic EEG signal in significant states. J. King Saud Univ. Comput. Inf. Sci. 33(6), 668–676 (2021)
37. H.U. Amin, A.S. Malik, R.F. Ahmad, N. Badruddin, N. Kamel, M. Hussain, W.T. Chooi, Feature extraction and classification for EEG signals using wavelet transform and machine learning techniques. Australas. Phys. Eng. Sci. Med. 38(1), 139–149 (2015)
38. R. Esteller, J. Echauz, T. Tcheng, B. Litt, B. Pless, in 2001 Conference Proceedings of the 23rd Annual
International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 2 (IEEE, 2001), pp. 1707–1710
39. C. Guerrero-Mosquera, A.M. Trigueros, J.I. Franco, A. Navia-Vazquez, New feature extraction approach for epileptic EEG signal detection using time-frequency distributions. Med. Biol. Eng. Comput. 48(4), 321–330 (2010)
40. L. Logesparan, A.J. Casson, E. Rodriguez-Villegas, Optimal features for online seizure detection. Med. Biol. Eng. Comput. 50(7), 659–669 (2012)
41. G. Koller, E. Schürholz, T. Ziebart, R. Frankenberger, A. Neff, J.W. Bartsch, Clinical evaluation of pathognomonic salivary protease fingerprinting for oral disease diagnosis. J. Personal. Med. 1(9), 866 (2021)
42. S.P.K. Shiao, J. Grayson, C.H. Yu, Gene-metabolite interaction in the one carbon metabolism pathway: predictors of colorectal cancer in multi-ethnic families. J. Personal. Med. 8(3), 26 (2018)
43. A. Batmanova, A. Kuc, V. Maksimenko, A. Savosenkov, N. Grigorev, S. Gordleeva, V. Kazantsev, S. Korchagin, A.E. Hramov, Predicting perceptual decision-making errors using EEG and machine learning. Mathematics 10(17), 3153 (2022)
44. R. Islam, A.V. Andreev, N.N. Shusharina, A.E. Hramov, Explainable machine learning methods for classification of brain states during visual perception. Mathematics 10(15), 2819 (2022)
45. B. Abbasi, D.M. Goldenholz, Machine learning applications in epilepsy. Epilepsia 60(10), 2037–2047 (2019)
46. D. Kasteleijn-Nolst Trenité, G. Rubboli, E. Hirsch, A. Martins da Silva, S. Seri, A. Wilkins, J. Parra, A. Covanis, M. Elia, G. Capovilla et al., Methodology of photic stimulation revisited: updated European algorithm for visual stimulation in the EEG laboratory. Epilepsia 53(1), 16–24 (2012)
47. D.M. White, C.A. Van Cott, EEG artifacts in the intensive care unit setting. Am. J. Electroneurodiagn. Technol. 50(1), 8–25 (2010)
48. J.S. Ebersole, T.A. Pedley, Current Practice of Clinical Electroencephalography (Lippincott Williams & Wilkins, Philadelphia, 2003)
49. A. Hyvärinen, E. Oja, Independent component analysis: algorithms and applications. Neural Netw. 13(4–5), 411–430 (2000)
50. N.S. Frolov, V.V. Grubov, V.A. Maksimenko, A. Lüttjohann, V.V. Makarov, A.N. Pavlov, E. Sitnikova, A.N. Pisarchik, J. Kurths, A.E. Hramov, Statistical properties and predictability of extreme epileptic events. Sci. Rep. 9(1), 1–8 (2019)
51. O.E. Karpov, V.V. Grubov, V.A. Maksimenko, N. Utaschev, V.E. Semerikov, D.A. Andrikov, A.E. Hramov, Noise amplification precedes extreme epileptic events on human EEG. Phys. Rev. E 103(2), 022310 (2021)
52. M. Krauledat, G. Dornhege, B. Blankertz, K.R. Müller et al., Robustifying EEG data analysis by removing outliers. Chaos Complex. Lett. 2(3), 259–274 (2007)
53. A.E. Hramov, A.A. Koronovskii, V.A. Makarov, V.A. Maximenko, A.N. Pavlov, E. Sitnikova, Wavelets in Neuroscience (Springer, Berlin, 2021)
54. A. Aldroubi, M. Unser, Wavelets in Medicine and Biology (Routledge, London, 2017)
55. E. Sitnikova, A.E. Hramov, A.A. Koronovsky, G. van Luijtelaar, Sleep spindles and spike-wave discharges in EEG: their generic features, similarities and distinctions disclosed with Fourier transform and continuous wavelet analysis. J. Neurosci. Methods 180(2), 304–316 (2009)
56. A.N. Pavlov, A.E. Hramov, A.A. Koronovskii, E.Y. Sitnikova, V.A. Makarov, A.A. Ovchinnikov, Wavelet analysis in neurodynamics. Phys. Usp. 55(9), 845 (2012)
57. P. Gloor, R. Fariello, Generalized epilepsy: some of its cellular mechanisms differ from those of focal epilepsy. Trends Neurosci. 11(2), 63–68 (1988)
58. E. Trinka, J. Höfler, A. Zerbs, Causes of status epilepticus. Epilepsia 53, 127–138 (2012)
59. H. Adeli, Z. Zhou, N. Dadmehr, Analysis of EEG records in an epileptic patient using wavelet transform. J. Neurosci. Methods 123(1), 69–87 (2003)
60. E. Sitnikova, A.E. Hramov, V.V. Grubov, A.A. Ovchinnikov, A.A. Koronovsky, On-off intermittency of thalamo-cortical oscillations in the electroencephalogram of rats with genetic predisposition to absence epilepsy. Brain Res. 1436, 147–156 (2012)
61. R.S. Fisher, W.V.E. Boas, W. Blume, C. Elger, P. Genton, P. Lee, J. Engel Jr., Epileptic seizures and epilepsy: definitions proposed by the International League Against Epilepsy (ILAE) and the International Bureau for Epilepsy (IBE). Epilepsia 46(4), 470–472 (2005)
62. C. Luo, J. Zhan, X. Xue, L. Wang, R. Ren, Q. Yang, in International Conference on Artificial Neural Networks (Springer, 2018), pp. 382–391
63. K. Park, J.S. Hong, W. Kim, A methodology combining cosine similarity with classifier for text classification. Appl. Artif. Intell. 34(5), 396–411 (2020)
64. B. Schelter, J. Timmer, A. Schulze-Bonhage, Seizure Prediction in Epilepsy: From Basic Mechanisms to Clinical Applications (Wiley, New York, 2008)
65. H. Abdi, L.J. Williams, Principal component analysis. Wiley Interdiscipl. Rev. Comput. Stat. 2(4), 433–459 (2010)
66. G. Biau, E. Scornet, A random forest guided tour. TEST 25(2), 197–227 (2016)
67. L. Breiman, Manual on setting up, using, and understanding random forests v3.1. Statistics Department, University of California, Berkeley, CA, USA, vol. 1, no. 58, pp. 3–42 (2002)

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.