Seizure classification with selected frequency bands and EEG montages: a Natural Language Processing approach

Wang, Ziwei; Mengoni, Paolo

doi:10.1186/s40708-022-00159-3

Research
Open access
Published: 27 May 2022

Brain Informatics volume 9, Article number: 11 (2022)

Abstract

Individualized treatment is crucial for epileptic patients with different types of seizures. The differences among patients impact the drug choice as well as the surgery procedure. With the advance in machine learning, automatic seizure detection can ease the time-consuming and labor-intensive manual procedure for diagnosing seizures in the clinical setting. In this paper, we present a method for selecting electroencephalography (EEG) frequency bands (sub-bands) and montages (sub-zones) for classifier training that exploits Natural Language Processing of individual patients’ clinical reports. The proposed approach targets individualized treatment. We integrated the prior knowledge from patients’ reports into the classifier-building process, mimicking the thinking process of an experienced neurologist when diagnosing seizures using EEG. The keywords from clinical documents are mapped to the EEG data in terms of frequency bands and scalp EEG electrodes. The experimental data are from the Temple University Hospital EEG seizure corpus, and the dataset is divided into groups of patients with the same seizure type and the same recording electrode reference. The classifiers include Random Forest, Support Vector Machine and Multi-Layer Perceptron. The classification performance indicates that competitive results can be achieved with a small portion of the EEG data. Using the sub-zones selection for Generalized Seizures (GNSZ) on all three electrodes, data are reduced by nearly 50% while the performance metrics remain at the same level as with the whole frequency range and all zones. Moreover, when selecting by sub-zones and sub-bands together for GNSZ with Linked Ears reference, the data range is reduced to 0.3% of the whole range, and the performance deviates less than 3% from the results with the whole range of data. 
Results show that using the proposed approach may lead to more efficient implementations of the seizure classifier that can be executed on power-efficient devices for long-lasting real-time seizure detection.

1 Introduction

Epileptic seizure is one of the most common neurologic disorders and affects people of all age groups worldwide [1]. Epilepsy is characterized by unprovoked and recurrent seizures and is manifested as a brain spectrum disorder [2]. A seizure is a temporary interruption of normal electrical brain function, with burst alterations of neurologic regulation triggered by abnormal electrical discharge of neurons [3]. Treatment of seizures includes medicines and brain surgery, but medically intractable seizures that severely impact some patients’ quality of life still exist.

The diagnosis and monitoring of seizures can be performed with electroencephalography (EEG). EEG consists of neuro-electrophysiologic signals that represent brain activity, acquired from electrodes either implanted subdurally (intracranial EEG) or placed along the scalp (scalp EEG). Given the low-cost and non-invasive nature of scalp EEG, it is still a widely used tool for probing neural functions.

The gold standard for identifying seizures is visual recognition by a trained neurophysiologist, who inspects the EEG data for abnormal electrical morphology. This manual procedure is labor-intensive and time-consuming in the clinical setting. It is also subject to interference of the electrical signals by external noise and artifacts, and the subjective nature of such analysis can lead to disagreement among neurophysiologists.

Automated seizure detection from EEG recordings has been investigated by researchers since the 1970s [4]. Models are built to distinguish patterns in brain signals that are manifestations of epileptic seizures. The models are framed in two typical steps: feature engineering and classification of the ictal/inter-ictal (during seizure/in-between seizures) signals. For a real-world seizure detection problem, the machine learning classification models need to be built with cost in mind, avoiding intermediate feature computation steps that have a high computational cost. Frequency domain features have proved to be more computationally efficient than time domain [5] and time–frequency domain features.

Background EEG frequency band \((\alpha , \beta , \theta , \delta , \gamma )\) oscillations have been intensively studied in normal brain function. These frequency bands can be recorded during wakefulness or sleep: occipital alpha frequency activity \((\alpha )\) is observed in a relaxed state while the eyes are closed, frontal and central beta \((\beta )\) and gamma oscillations \((\gamma )\) during alert and vigilant mental states, theta frequency activity \((\theta )\) during sleep or memory tasks, and frontal delta rhythm \((\delta )\) during sleep. Nevertheless, the seemingly normal EEG background bands contain clear abnormalities, which have been shown to be a significant prognostic tool [6,7,8,9].

The selection of EEG channels/montages is widely studied [10] for faster detection and noise removal. The EEG montages refer to the electrodes located on the scalp that connect the patient to the recording device. Since brain signals conduct in a non-linear and dynamic manner, the recorded electrical voltage is significantly impacted by electrode locations [11]. Neurologists select the channels using their prior knowledge, and for an efficient approach it is vital to select the montages that carry the most discriminative information. It has been demonstrated that it is effective to select only a small number of montages for seizure detection [12, 13]. For computer scientists, the process of selecting montages requires additional steps of generation and evaluation of certain electrodes. The subset channels are generated from the whole set using various statistical measures [14,15,16].

In this study, we use a natural language processing approach for the efficient selection of frequency bands (sub-bands) and scalp EEG electrodes (sub-zones). By consolidating each patient’s clinical report, we aim to integrate the medics’ prior knowledge into the classifier-building process. In particular, we classify seizure ictal/inter-ictal phases with sub-bands and sub-zones selection from six designed inputs. The three types of frequency band inputs are: the whole frequency range provided in the data corpus, the background EEG frequency bands \((\alpha , \beta , \theta , \delta , \gamma )\), and the background bands selected based on keywords extracted from patients’ clinical reports by Natural Language Processing (NLP). We also introduce scalp EEG electrode reduction by using the electrode keywords (pre-frontal, frontal, temporal, parietal, occipital, central) extracted from each individual’s clinical reports.
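
As an illustration of the keyword idea described above, the following minimal Python sketch scans a report for band and zone keywords. It is not the extraction pipeline used in this work: the function name `extract_selection`, the whole-word case-insensitive matching rule, and the example sentence are assumptions made for illustration only. The keyword lists and the zone letters come from the text.

```python
import re

# Band and zone keywords as named in the text. The matching rule
# (whole word, case-insensitive) is an assumption of this sketch.
BAND_KEYWORDS = ("alpha", "beta", "theta", "delta", "gamma")
ZONE_KEYWORDS = {
    "pre-frontal": "Fp",
    "frontal": "F",
    "temporal": "T",
    "parietal": "P",
    "occipital": "O",
    "central": "C",
}


def extract_selection(report):
    """Return (sub_bands, sub_zones) mentioned in a clinical report string."""
    text = report.lower()
    bands = [b for b in BAND_KEYWORDS if re.search(rf"\b{b}\b", text)]
    zones = []
    for word, letter in ZONE_KEYWORDS.items():
        # The lookbehind keeps "frontal" from matching inside "pre-frontal"
        # and "temporal" from matching inside "centrotemporal".
        if re.search(rf"(?<![\w-]){re.escape(word)}\b", text):
            zones.append(letter)
    return bands, zones


# Hypothetical report sentence, not taken from the corpus.
example = "Posterior alpha rhythm with frontal theta slowing; sharp waves in the left temporal region."
bands, zones = extract_selection(example)
```

The returned band names and zone letters could then be used to restrict the frequency bins and electrodes passed to a classifier.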

The research questions that we aim to answer using our novel approach can be summarized as follows:

RQ1: How does the selection of frequency bands \((\alpha , \beta , \theta , \delta , \gamma )\) influence seizure classification?
RQ2: How does the selection of EEG electrodes (Fp, F, T, P, O, C) influence seizure classification?
RQ3: How can prior knowledge from experts be integrated to build individualized seizure classification models?

In this work, to answer the questions stated above, we (i) present the evidence of background EEG in brain functionality during seizures, (ii) illustrate that coupling selected background frequency bands with selected EEG electrodes can lead to better seizure classification results, and finally (iii) build a resource-efficient model targeting individualized seizure classification.

In Sect. 2, we provide background knowledge for the paper. In Sect. 3, we discuss related works. In Sect. 4, we introduce the publicly available dataset used in this work. In Sect. 5, we introduce the design thinking process, the methods including pre-processing using Short-Time Fourier Transform (STFT) and Natural Language Processing (NLP), and the machine learning algorithms. Sect. 6 reports the experiment results, which are discussed in Sect. 7. Finally, in Sect. 8, we draw conclusions and propose future works.

2 Background

In this section, we discuss the medical background of the research and different types of seizures. Specifically, we introduce the standardized “10-20 system” of EEG electrode placement, normal and abnormal EEG, and the classification of different types of seizures.

2.1 EEG electrode reference placement

Electrodes connect the EEG machine to patients for the recording of electric inputs generated from brain activities. The standardized electrode placement is represented in Fig. 1. It follows the international “10-20 system”, which was originally proposed in 1958 [17]. The name of each electrode consists of two symbols. The first symbol is an abbreviation letter pointing to one of the six underlying brain zones. The letters include F (frontal), Fp (pre-frontal), P (parietal), C (central sulcus), T (temporal), and O (occipital). Additional electrodes, denoted by the letter A, are placed behind the outer ear over the prominent bone process. The second symbol is a number (or the letter z when the electrode is on the mid-line) specifying the left or right brain cortex: electrodes located on the scalp’s right side are assigned even numbers, and odd numbers are used for electrodes on the left side. Smaller numbers denote positions closer to the mid-line, and larger numbers denote positions farther away. Note that electrodes P7 and P8 are placed over the posterior part of the temporal region, not the parietal region, and that electrodes F7 and F8 are close not only to the frontal cortex but also to the pole of the temporal lobe.
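
The naming convention above can be sketched in a few lines of Python. This is only an illustration: the function name `parse_electrode` and the returned labels are assumptions, and the letter gives the nominal zone only (as noted above, P7/P8 and F7/F8 lie over or near the temporal region despite their letters).

```python
import re

# Zone letters as listed in the text; "A" marks the electrodes behind the ear.
ZONES = {
    "Fp": "pre-frontal",
    "F": "frontal",
    "P": "parietal",
    "C": "central",
    "T": "temporal",
    "O": "occipital",
    "A": "ear",
}


def parse_electrode(name):
    """Split a 10-20 electrode name into (nominal zone, side)."""
    # "Fp" is listed before "F" so that names like "Fp1" are matched correctly.
    match = re.fullmatch(r"(Fp|F|P|C|T|O|A)(z|\d+)", name)
    if match is None:
        raise ValueError(f"not a recognised 10-20 electrode name: {name!r}")
    letter, suffix = match.groups()
    if suffix == "z":
        side = "midline"
    elif int(suffix) % 2 == 0:
        side = "right"  # even numbers are on the right side
    else:
        side = "left"  # odd numbers are on the left side
    return ZONES[letter], side
```

For example, `parse_electrode("Fp1")` gives the pre-frontal zone on the left side, and `parse_electrode("Cz")` gives the central zone on the mid-line.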

The commonly used reference schemes of EEG electrodes are categorized into two classes, namely unipolar and non-unipolar references [11]. Unipolar references construct a neutral record, including Average Reference (AR), Linked Ears reference (LE), and Reference Electrode Standardization Technique (REST).

AR assumes that neuroelectricity transmits isotropically on a perfect layered spherical head, and thus uses the average of a finite number of electrodes as a reference. LE reference is based on the assumption that, because the sites lack electrical activity, the average of the potentials recorded at the two ears is close to zero. REST is based on the fact that the same brain sources generate all EEG activities. Non-unipolar references are potential differences between electrodes, and include the bipolar and the Laplacian reference. The bipolar reference shows the first derivative of the potentials, which is the difference between the potentials of two nearby electrodes. The Laplacian reference shows the second derivative of the potentials, which is the difference between each electrode’s potential and the averaged potential of its nearest four neighbors.
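
The reference schemes described above (except REST, which requires a head model) can be illustrated with a short Python sketch operating on a single time sample. The function names, the dictionary layout, the ear electrode names A1 and A2, and the numeric values are all assumptions for illustration. Subtracting the mean of the two ear electrodes is one common way to realize a linked-ears reference and is not necessarily how the corpus recordings were produced.

```python
def average_reference(v):
    """Subtract the mean over all electrodes; v maps name -> potential."""
    mean = sum(v.values()) / len(v)
    return {ch: x - mean for ch, x in v.items()}


def linked_ears_reference(v, ears=("A1", "A2")):
    """Subtract the mean of the two ear electrodes from every scalp electrode."""
    ref = sum(v[e] for e in ears) / len(ears)
    return {ch: x - ref for ch, x in v.items() if ch not in ears}


def bipolar(v, a, b):
    """First spatial difference: potential of electrode a minus electrode b."""
    return v[a] - v[b]


def laplacian(v, ch, neighbours):
    """Electrode potential minus the average potential of its neighbours."""
    return v[ch] - sum(v[n] for n in neighbours) / len(neighbours)


# Made-up potentials (arbitrary units) for one time sample.
scalp = {"Fz": 2.0, "Cz": 8.0, "Pz": 4.0, "C3": 4.0, "C4": 7.0}
with_ears = dict(scalp, A1=1.0, A2=3.0)

ar = average_reference(scalp)
le = linked_ears_reference(with_ears)
bp = bipolar(scalp, "Cz", "C3")
lap = laplacian(scalp, "Cz", ["Fz", "Pz", "C3", "C4"])
```

In this toy sample the same Cz potential yields a different value under each scheme, which is why the dataset in this work is grouped by recording reference.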

The advantage of unipolar references is that changes can be observed directly, since the recorded value is the potential of the electrode itself. The main disadvantage is that they are sensitive to common noise and artifact activity. If one electrode is contaminated, interpretation of activity in the brain area can be difficult. Non-unipolar references are less affected by common noise because they are differences of potentials, but this may attenuate the abnormalities observed in the recordings. If the derivation is zero, e.g., caused by equal effects of cerebral activity around electrodes, the interpretation can be challenging.

2.2 Normal and abnormal EEG

Normal EEGs are measurable both qualitatively and quantitatively. Normal EEG activities appear when people are not affected by any disease. Seizure events consist of abnormal brain activities, formally known as inter-ictal epileptiform discharges (IED). The EEG of IED is characterized by unusual waveforms that deviate from the normal EEG in frequency, amplitude, morphology, localization, and reactivity. Figure 2 shows 10 s of normal and seizure EEG.

In Fig. 3, the five most common normal EEG activity frequency bands \(\alpha , \beta , \theta , \delta , \gamma\) are represented. Each band may have a different interpretation, which can be described as follows:

(i) Alpha rhythm \((\alpha )\): frequency between 8 and 12 Hz. It is more prominent in the occipital regions of an adult brain and can be observed during relaxed, eyes-closed wakefulness. When the eyes are open and the subject is mentally alert, alpha activity decreases in amplitude, demonstrating reactivity. Alpha variants are mixtures of the alpha rhythm with other rhythms; they have distinct morphology but otherwise exhibit the same reactivity and localization.
(ii) Beta rhythm \((\beta )\): frequency between 12 and 30 Hz. It is primarily seen in the frontal and central areas of the adult brain. In children, its frequency gradually increases with age. Beta activity is triggered by alertness and vigilance and suppressed by voluntary movements.
(iii) Theta rhythm \((\theta )\): frequency between 4 and 8 Hz. It is prominently seen in the central, parietal, and temporal parts of the left side scalp recording. Theta rhythm can reflect abnormal activity in adults during wakefulness and is frequently observed in adults during sleep.
(iv) Delta rhythm \((\delta )\): frequency between 0.5 and 4 Hz. It is most predominantly found frontally in adults and posteriorly in children. Delta waves are associated with the deepest stages of sleep and have a healing effect on the body and brain.
(v) Gamma rhythm \((\gamma )\): frequency between 30 and 50 Hz. It is seen in the cerebral cortex during cognitive and motor activities. Visual stimulation and meditation can increase the amplitude of gamma rhythms. It is often observed in the ictal phase of seizures and is prevalent at seizure onset. Besides epilepsy, altered gamma oscillations are regularly detected in brain disorders such as Alzheimer’s disease.
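
The band limits listed above can be written down as a small lookup table, which is a convenient starting point for selecting frequency bins from a spectrum. The Python sketch below is illustrative only: the names `BANDS_HZ`, `band_of` and `select_bins` are hypothetical, and treating each band as a half-open interval (lower edge included, upper edge excluded) is an assumption made here to resolve the shared boundaries at 4, 8, 12 and 30 Hz.

```python
# Band limits in Hz, as listed in the text.
BANDS_HZ = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 30.0),
    "gamma": (30.0, 50.0),
}


def band_of(freq_hz):
    """Return the band containing freq_hz, or None if it is outside all bands."""
    for name, (low, high) in BANDS_HZ.items():
        # Half-open interval [low, high): an assumption of this sketch.
        if low <= freq_hz < high:
            return name
    return None


def select_bins(freqs_hz, bands):
    """Indices of the frequency bins that fall inside any of the given bands."""
    return [i for i, f in enumerate(freqs_hz) if band_of(f) in bands]


# Example: keep only theta and alpha bins from a coarse frequency axis.
freqs = [0.0, 2.0, 6.0, 10.0, 20.0, 40.0, 60.0]
kept = select_bins(freqs, {"theta", "alpha"})
```

Here `kept` holds the positions of the 6 Hz and 10 Hz bins, so only those rows of a spectrogram would be passed on to a classifier.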

Abnormal EEG activity is often prevalent in people with neurological or other diseases and absent from normal individuals. IED is the abnormal synchronous electrical discharge that originates in an epileptic focus with a group of malfunctioning neurons [18]. Sharps and spikes are the prominent abnormal EEG waveforms and manifest as pointed peaks, serving as biological markers for either focal or generalized epileptogenesis. Spike waves are transients that typically last between 20 and 70 ms. Sharp waves are similar but last longer, with a typical duration of 70–200 ms. Besides duration, sharps and spikes can vary in other waveform properties, such as voltage and frequency. Their occurrence can be single or repetitive, and their distribution can be focal or general. The appearance of sharps and spikes is asymmetric, with the initial deflection usually having the sharper slope. They can be observed as isolated waveforms or can be followed by slow waves. The subtypes can be divided in multiple ways. For example, by localization, there are temporal, centrotemporal, occipital, and generalized spikes and frontal sharp waves; by frequency, spikes and sharps are associated with various frequency ranges, such as 6-Hz spike-and-wave, polyspikes, and 14- and 6-Hz positive bursts. The spike-and-slow-wave complex is the occurrence of a spike followed by a longer-duration slow wave, with varying frequency and amplitude and often distinct from the underlying background. A sharp wave can be the initial waveform rather than a spike. A sharp-and-slow-wave complex is identical to the spike-and-slow-wave complex, except that a sharp wave precedes the slower and broader wave. In these discharges, the slow wave that follows may symbolize inhibition and subsequent hyperpolarization of cortical neurons, which accompany the initial synchronous depolarization [19]. 
For epilepsy patients, the above abnormalities are routinely observed between seizure periods and suggest an underlying propensity toward seizures; nevertheless, the abnormalities during a seizure do not necessarily result in observable clinical behavior.
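
The duration criterion that separates spikes from sharp waves can be stated as a small rule. The Python sketch below is a toy illustration, not a detector: the function name `classify_transient` is hypothetical, and assigning the shared 70 ms boundary to sharp waves is an assumption, since the text gives 70 ms as both the upper limit for spikes and the lower limit for sharp waves.

```python
def classify_transient(duration_ms):
    """Label a pointed EEG transient by its duration in milliseconds."""
    if 20 <= duration_ms < 70:
        return "spike"
    # The 70 ms boundary is assigned to sharp waves by assumption.
    if 70 <= duration_ms <= 200:
        return "sharp wave"
    return "other"
```

In practice duration alone is not sufficient, since morphology, amplitude and the surrounding background also matter, as described above.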

2.3 Seizures types

Seizures and epilepsy are classified by the International League Against Epilepsy (ILAE) using modern terminology and concepts [20]. The two broader types are defined as generalized and focal seizures. Generalized seizures arise in neuronal networks distributed bilaterally, while focal seizures are limited to one hemisphere. Seizures may propagate from a partial to a generalized state, when the neuronal network is initially partly altered and may become completely dysfunctional at a later stage. Table 1 reports a selection of seizure categories, together with descriptions of their symptoms. For clarity of presentation, the list is partial, as it includes only the seizure types present in the dataset used in this work.

Table 1 Seizure type description
