Analyzing EEG Signals with Machine Learning for Diagnosing Alzheimer’s Disease

To achieve the greatest treatment impact, early and accurate diagnosis of Alzheimer’s disease (AD) is essential. In this paper we present a method for analyzing EEG signals with a machine learning approach in order to diagnose AD. We show how to extract features from EEG recordings for use with a machine learning algorithm that induces an AD classification model. The obtained results are promising: the best model reached an accuracy of approximately 86%. DOI: http://dx.doi.org/10.5755/j01.eee.18.8.2627


I. INTRODUCTION
Alzheimer's disease (AD) is a brain disorder characterized by a progressive dementia that occurs in middle or late life [1]. The pathologic characteristics are degeneration of specific nerve cells, presence of neuritic plaques, and neurofibrillary tangles. The presence of Alzheimer's disease in a patient is very difficult to diagnose, especially in the earlier stages of the disease [2]. On the other hand, the success of known treatments (which aim to alleviate the problems by improving cognitive performance and stopping or slowing the progression of symptoms) depends a great deal on the stage at which the disease is correctly identified and diagnosed [3]. Identification of individuals prior to the development of significant clinical symptoms is imperative in order to have the greatest treatment impact by maintaining cognitive abilities and preserving quality of life [4].
Clinical criteria for the diagnosis of Alzheimer's disease include insidious onset and progressive impairment of memory and other cognitive functions [1]. There are no obvious motor, sensory, or coordination deficits early in the disease. The diagnosis cannot be determined by laboratory tests. Consequently, AD is generally diagnosed using a combination of psychological and motor ability tests, which are not completely reliable.
As AD is a brain disorder, measurements of brain activity could be expected to reveal its presence. Accordingly, the analysis of electroencephalogram (EEG) signals has often been used to diagnose AD [5]. It is well known that EEG signals of AD patients are generally less synchronous than in age-matched control subjects. However, this effect is not always easily detectable, especially in patients in the pre-symptomatic phase, before clinical symptoms appear. Recent studies show that the accuracy of diagnosing AD in such patients is around 83% with balanced sensitivity and specificity [5].
Based on our good experience with symbolic machine learning (ML) in various domains [6], [7], we proposed to analyze EEG signals using an ML approach for diagnosing AD (to our knowledge, the first such attempt). The first important task is the identification and extraction of features of the EEG signals, which are to be used as attributes for an ML algorithm. The next important task is the selection or development of an appropriate ML algorithm for this case.
After intensive experiments, it turned out that an ML approach to the analysis of EEG signals can be a viable alternative to the known approaches to diagnosing AD. This requires selecting only a few highly informative features from the EEG signals and using an adaptive ML algorithm that is able to produce classification models balanced with regard to accuracy, sensitivity and specificity. Using genTrees, our own evolutionary algorithm for the induction of decision trees, we were able to achieve an accuracy of approximately 86%.

II. MEASURING EEG SIGNALS OF A PERSON'S BRAIN
The electroencephalogram (EEG) can be roughly defined as the mean electrical activity of the brain at different sites of the head. More specifically, it is the sum of the extracellular current flows of a large group of neurons. EEG recordings are obtained by placing electrodes of high conductivity (impedance < 5000 Ω) at different locations on the head [8]. The electric potentials can be recorded between pairs of active electrodes (bipolar recordings) or with respect to a supposedly passive electrode called the reference (monopolar recordings). These measurements are mainly performed on the surface of the head (scalp EEG) or by using special electrodes placed in the brain after a surgical operation (intracranial EEG).
In scalp (surface) EEG recordings, the most widely used placement of electrodes is the so-called 10-20 system, consisting of 20 electrodes (or sometimes fewer) uniformly distributed over the head, generally referenced to 2 electrodes on the earlobes [9], see Fig. 1. Normal scalp EEG recordings are usually taken with the subject relaxed.

III. ANALYSIS OF EEG SIGNALS
EEG signals require quantitative techniques that can be validly applied to time series exhibiting ranges of nonstationary behavior. Since the introduction of digital EEG recordings, spectral analysis based on the Fourier Transform has been by far the most used quantitative method for the analysis of EEG signals [8]. The Fourier Transform allows the separation and study of different EEG rhythms, a task difficult to perform visually when several rhythms occur simultaneously.
Since the Fourier Transform is based on comparing the signal with complex sinusoids that extend through the whole time domain, its main disadvantage is the lack of information about the time evolution of the frequencies. If an alteration occurs at some point in time, the whole Fourier spectrum is affected, so the method also requires the signal to be stationary. On many occasions, therefore, signals have time-varying features that cannot be resolved with the Fourier Transform. This problem can be resolved (at least to some extent) by using the Gabor transform, the so-called Short-Time Fourier Transform (STFT). The basic idea of the STFT is to slice the waveform of interest into a number of short segments and to analyze each of these segments, usually with the standard Fourier transform.
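Given digital EEG data x(t), the basic STFT is defined as

$$G_D(f, t) = \int_{-\infty}^{\infty} x(t')\, g_D^{*}(t' - t)\, e^{-i 2\pi f t'}\, dt',$$

where * denotes complex conjugation and g_D(t' - t) is a window of width D centered at t. The transform can be considered as an inner product between the signal and the complex sinusoids e^{i2πft'} modulated by the window g_D, i.e.

$$G_D(f, t) = \left\langle x(t'),\; g_D(t' - t)\, e^{i 2\pi f t'} \right\rangle,$$

with the inner product taken as conjugate-linear in its second argument, which gives the factor e^{-i2πft'} in the integral.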
Two main problems with the STFT are: (1) selecting an optimal window length for data segments that contain several different features may not be possible, and (2) the time-frequency trade-off: shortening the data segment to improve time resolution reduces frequency resolution. Shortening the data segment can also result in the loss of low frequencies that are no longer fully included in the segment. If the window is made smaller to improve the time resolution, the frequency resolution is degraded, and vice versa. This time-frequency trade-off has been equated to an uncertainty principle in which the product of frequency resolution (expressed as bandwidth, B) and time, T, must be greater than some minimum; specifically, BT ≥ 1/(4π).
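The trade-off described above can be illustrated with a minimal sketch of a windowed Fourier analysis. This is not the implementation used in this work: the Hann window, the 200 Hz sampling rate, the 50% overlap, the synthetic test signal and the function name `stft` are all assumptions made for illustration only.

```python
import numpy as np

def stft(x, fs, window_len, hop):
    """Return (freqs, times, power) for a Hann-windowed STFT of a 1-D signal."""
    window = np.hanning(window_len)
    starts = np.arange(0, len(x) - window_len + 1, hop)
    # One row per segment: slice, taper, then FFT along the last axis.
    segments = np.stack([x[s:s + window_len] * window for s in starts])
    spectrum = np.fft.rfft(segments, axis=1)
    freqs = np.fft.rfftfreq(window_len, d=1.0 / fs)
    times = (starts + window_len / 2) / fs          # segment centers in seconds
    return freqs, times, np.abs(spectrum).T ** 2    # shape: (n_freqs, n_segments)

fs = 200.0                       # assumed sampling rate in Hz
t = np.arange(2000) / fs         # 10 s of synthetic data
# Synthetic test signal: 6 Hz for the first 5 s, then 10 Hz.
x = np.where(t < 5, np.sin(2 * np.pi * 6 * t), np.sin(2 * np.pi * 10 * t))

for window_len in (100, 400):
    freqs, times, power = stft(x, fs, window_len, hop=window_len // 2)
    print(window_len, "samples:",
          "frequency step =", freqs[1] - freqs[0], "Hz,",
          "window duration =", window_len / fs, "s")
```

With these assumed settings, the 100-sample window spans 0.5 s but separates frequencies only in 2 Hz steps, while the 400-sample window resolves 0.5 Hz steps at the cost of a 2 s window.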
The spectrogram corresponding to one channel of the EEG recording, obtained after performing the STFT, is shown in Fig. 2. As can be seen from this figure, time-frequency maps of EEG signals are often sparse: most energy is contained in specific regions of the time-frequency map ("bumps"), corresponding to transient oscillations.

IV. SELECTION OF RELEVANT ATTRIBUTES FROM EEG SIGNALS
After the low-level signal pre-processing, the second stage of the proposed EEG analysis is feature extraction. The features should serve as unique descriptors of a person's brain activity and provide the input to the classification stage, that is, the attributes to be used with a selected ML algorithm. In addition, extracting features from raw EEG data dramatically reduces the dimensionality of the problem to a level acceptable for symbolic ML.
A known approach to feature extraction from EEG data is to use an algorithm for estimating the dominant frequency and power content in cases where it is assumed that there is only one dominant sinusoid in each channel of the filtered visual evoked potential signal [11]. These algorithms, such as MUSIC [12], belong to the class of subspace methods, also known as high-resolution or super-resolution methods, and are based on the eigenanalysis (eigendecomposition) of the data correlation matrix [11].
One of the problems with the above approach is that the extracted features are very difficult, if not impossible, to interpret without profound mathematical knowledge, which medical experts usually do not have. We therefore decided to use some basic and well-known statistical measures to extract a set of attributes, based on the domain knowledge of how the electrical activity in the brain is influenced by the presence of AD.
The study of different types of oscillations and rhythmicities of the brain, and of their relation to different pathologies and functions, has held the attention of researchers since the beginnings of EEG measurement. Brain oscillations have been divided into frequency bands that have been related to different brain states, functions or pathologies [13]. The characteristic oscillations are (Table I):
- delta rhythms (0.5-3.5 Hz) are characteristic of deep sleep stages; delta oscillations with certain specific morphologies, localizations and rhythmicities are correlated with different pathologies,
- theta rhythms (3.5-7.5 Hz) are enhanced during sleep and play an important role in infancy and childhood; in the awake adult, high theta activity is considered abnormal and is related to different brain disorders,
- alpha rhythms (7.5-12.5 Hz) appear spontaneously in normal adults during wakefulness, under conditions of relaxation and mental inactivity; they are best seen with eyes closed and are most pronounced in occipital locations,
- beta rhythms (12.5-30 Hz) are best defined in central and frontal locations; they have lower amplitude than alpha waves and are enhanced in states of expectancy or tension,
- gamma rhythms (30-60 Hz) are generally not of major interest with regard to the surface EEG.
Based on the above description, the most interesting and hopefully revealing are the delta and theta rhythms, which have already been correlated with different brain disorders, and the alpha rhythm, because it represents the awake state that is most easily measured without distractions (the subject is relaxed, without mental activity). The beta rhythm is not appropriate because the recording is influenced by the mental activity of the subject.
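As a rough illustration of how the Fourier Transform separates these rhythms, the following sketch computes the share of spectral power falling into each band listed above. The band limits are taken from the text; the 200 Hz sampling rate, the synthetic signal and the function name `band_powers` are assumptions for illustration and do not describe the processing used in this work.

```python
import numpy as np

# Frequency bands in Hz, as listed in the text above.
BANDS = {
    "delta": (0.5, 3.5),
    "theta": (3.5, 7.5),
    "alpha": (7.5, 12.5),
    "beta": (12.5, 30.0),
    "gamma": (30.0, 60.0),
}

def band_powers(x, fs):
    """Return the fraction of the total in-band spectral power in each rhythm."""
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    raw = {name: power[(freqs >= lo) & (freqs < hi)].sum()
           for name, (lo, hi) in BANDS.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

fs = 200.0                          # assumed sampling rate in Hz
t = np.arange(int(4 * fs)) / fs     # 4 s of synthetic data
# Synthetic signal: a strong 10 Hz (alpha) and a weaker 5 Hz (theta) component.
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)

rel = band_powers(x, fs)
dominant = max(rel, key=rel.get)    # band holding the largest share of power
print(dominant, {name: round(share, 3) for name, share in rel.items()})
```

Because the 5 Hz component has half the amplitude of the 10 Hz component, it carries a quarter of its power, so the alpha band dominates in this synthetic example.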
On this basis we defined 5 frequency bands (3-4 Hz, 5-6 Hz, 7-8 Hz, 9-10 Hz, and 11-12 Hz). First, within each frequency band the following measures were extracted (see Fig. 3): the mean, the median, the max and the min value over the frequency band.
Then, based on the max value within a frequency band (Fig. 4), a time slice was selected, from which the following measures were extracted (Fig. 5): the mean, the median, the max, and xmed 1/2 (a special kind of central tendency) over the selected time slice.
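The two extraction steps can be sketched as follows. This is only one possible reading of the description, under several assumptions that are not stated in the text: a 200 Hz sampling rate, a Hann-windowed STFT with 2 s windows, frequency-domain measures taken over all power values in the band, and time-domain measures taken over the absolute signal amplitude in the window where the band power peaks. The measure xmed 1/2 is not defined in the text and is therefore left out; all function and feature names are hypothetical.

```python
import numpy as np

BANDS_HZ = [(3, 4), (5, 6), (7, 8), (9, 10), (11, 12)]  # bands from the text

def spectrogram(x, fs, window_len, hop):
    """Hann-windowed STFT power; returns freqs, segment start samples, power."""
    window = np.hanning(window_len)
    starts = np.arange(0, len(x) - window_len + 1, hop)
    segments = np.stack([x[s:s + window_len] * window for s in starts])
    power = np.abs(np.fft.rfft(segments, axis=1)).T ** 2
    return np.fft.rfftfreq(window_len, d=1.0 / fs), starts, power

def channel_features(x, fs, window_len=400, hop=200):
    """Return a flat dict of per-band features for one EEG channel."""
    freqs, starts, power = spectrogram(x, fs, window_len, hop)
    features = {}
    for i, (lo, hi) in enumerate(BANDS_HZ, start=1):
        band = power[(freqs >= lo) & (freqs <= hi)]   # (band_freqs, n_segments)
        # Frequency-domain measures (assumed: over all power values in the band).
        features[f"b{i}_mean"] = band.mean()
        features[f"b{i}_median"] = np.median(band)
        features[f"b{i}_max"] = band.max()
        features[f"b{i}_min"] = band.min()
        # Time slice: the segment in which the band power is maximal.
        peak = int(np.argmax(band.max(axis=0)))
        segment = np.abs(x[starts[peak]:starts[peak] + window_len])
        # Time-domain measures over that slice; xmed 1/2 is omitted because
        # the text does not define it.
        features[f"t{i}_mean"] = segment.mean()
        features[f"t{i}_median"] = np.median(segment)
        features[f"t{i}_max"] = segment.max()
    return features

fs = 200.0                                  # assumed sampling rate in Hz
rng = np.random.default_rng(0)
t = np.arange(int(120 * fs)) / fs           # 120 s, the recording length in the text
x = rng.normal(size=t.size)                 # synthetic noise, not real EEG data
x[2000:2400] += 5 * np.sin(2 * np.pi * 7.5 * t[2000:2400])  # burst in band 3
features = channel_features(x, fs)
print(len(features), "features for one channel")
```

The sketch yields 7 values per band (35 per channel) rather than the 8 per band described above, because the fourth time-domain measure is not reproduced.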

V. CLASSIFICATION OF AD AND RESULTS
Using the described approach, attributes were extracted from the EEG recordings of 208 patients. Each EEG recording was 120 seconds long, with up to 100 Hz. The data were originally collected at the Hospital of Santa Cruz in cooperation with the University of La Laguna, Spain. The EEG data were collected systematically for a selected group of patients who reported problems possibly related to brain disorders. All patients were then carefully observed over a longer period of time until the final diagnosis of AD was determined with the highest confidence. The class distribution was highly imbalanced: 88% AD and 12% other vascular problems (no AD).
The first experiments showed very low classification performance. The AD dataset poses two problems: 1) too many attributes (640) relative to the number of instances (208), and 2) a highly imbalanced class distribution. To address the first issue, we decided to use only a subset of channels (each channel is represented by 40 attributes). To address the second issue, we decided to use our evolutionary decision tree induction algorithm genTrees, a robust and adaptable ML algorithm developed as the core of AREX [6].
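The effect of restricting the attribute set to a subset of channels can be sketched as below. The column layout (40 consecutive columns per channel, channels numbered from 1), the helper name `select_channels` and the random placeholder matrix are assumptions for illustration; channels 4 and 16 are used only because that combination is the one reported in the results.

```python
import numpy as np

N_CHANNELS, ATTRS_PER_CHANNEL = 16, 40   # figures from the text

def select_channels(X, channels):
    """Keep only the attribute columns belonging to the given 1-based channels."""
    cols = np.concatenate([
        np.arange((ch - 1) * ATTRS_PER_CHANNEL, ch * ATTRS_PER_CHANNEL)
        for ch in channels
    ])
    return X[:, cols]

# Placeholder matrix: 208 instances x 640 attributes (random, not real EEG data).
X = np.random.default_rng(0).normal(size=(208, N_CHANNELS * ATTRS_PER_CHANNEL))
X_sub = select_channels(X, channels=[4, 16])
print(X.shape, "->", X_sub.shape)        # (208, 640) -> (208, 80)
```

Under this layout, two channels reduce the problem from 640 to 80 attributes, which is below the number of instances (208).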
After a series of experiments, it turned out that the best results are obtained with a combination of the 4th and 16th EEG channels. All measurements were performed using 10-fold cross-validation, with approximately 80% of cases used for training and the remaining 20% for testing. The obtained results are presented in Fig. 6. Only genTrees provided a reasonably balanced classification model (sensitivity 88.89% and specificity 71.43%, not too far apart), while also achieving the highest accuracy (86.05%).
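For reference, the three reported measures follow from the confusion matrix of a classifier as shown in the sketch below. The counts used here are hypothetical: they are not given in this paper and were chosen only because they reproduce the reported percentages for a test set of 43 cases; the actual figures may instead be averages over folds.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of AD cases detected
        "specificity": tn / (tn + fp),   # share of non-AD cases recognized
    }

# Hypothetical counts, chosen only because they reproduce the reported rates.
m = classification_metrics(tp=32, fn=4, tn=5, fp=2)
print({name: round(100 * value, 2) for name, value in m.items()})
# {'accuracy': 86.05, 'sensitivity': 88.89, 'specificity': 71.43}
```

The example also shows why accuracy alone is misleading on an imbalanced dataset: with 36 AD cases out of 43, a model that labels every case as AD would still reach about 84% accuracy while having a specificity of 0%.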

VI. CONCLUSIONS
The obtained results show that the proposed method of analyzing EEG signals with an ML approach is a viable way of diagnosing AD. We have succeeded in determining relevant features within EEG recordings, which were used as attributes for inducing a classification model using different ML algorithms. It has also been shown that the use of a robust and adaptive ML algorithm makes it possible to induce well-balanced classification models, which are even more accurate (86%) than well-known synchrony measures (83%) [5].

Fig. 3. Selection of attributes in the frequency domain in one frequency band (the same is applied to all five frequency bands).
In this manner, 40 attributes were extracted from a single EEG channel: 20 from the frequency domain (4 for each of the 5 frequency bands) and another 20 from the time domain of a patient's EEG signal. Altogether, 640 attributes were extracted for 16 channels (16 times 40).

Fig. 4. Selection of attributes from the EEG signal, showing the characteristic points for the frequency domain (b5) and the time domain (t5); the same applies to bands 1 through 4.

Fig. 5. Selection of attributes in time domain at the time slice, where the Fourier power within the frequency band is maximal (the same applies to all five frequency bands).

Fig. 6. AD classification performance of different algorithms on the test set: accuracy (button height defined by std. dev.) with the range between sensitivity and specificity shown as a vertical line.
Manuscript received March 9, 2012; accepted May 12, 2012.

TABLE I. THE CHARACTERISTIC BRAIN OSCILLATIONS.