Fractional Delay Time Embedding of EEG Signals into High Dimensional Phase Space

Abstract: We propose a method for reconstruction of the high-dimensional phase space of the electroencephalography (EEG) signal. The method is based on the selection of positive trajectories from the phase space, distance-adaptive sampling of the negative trajectories from the phase space, classification of trajectories in the phase space, and reconstruction of a fuzzy state of the signal for classification of EEG signals.


I. INTRODUCTION
A brain-computer interface (BCI) is a communication system that translates brain activity into commands for a computer or other digital device [1]. Most BCI systems work by reading and interpreting cortically-evoked electropotentials via electroencephalogram (EEG) data. The frequencies of these brain waves range from 0.5 Hz to 100 Hz, and their characteristics change dynamically depending on the activity of the human brain [2].
BCI systems require correct classification of the EEG signals for useful operation. Since the EEG signal may be considered chaotic [3], [4], methods of nonlinear dynamics and chaos theory can be applied to the analysis and classification of EEG data. Nonlinearity of the EEG and other types of biosignals has been studied by applying nonlinear analysis methods such as Detrended Fluctuation Analysis (DFA) [5], Poincaré Plots [6], reconstruction of Local Phase Space, segmentation of the EEG signal into stationary fragments [7], and application of non-linear operators to the EEG time series [8].
Nonlinear analysis usually depends on the reconstruction of the phase-space geometry of the signal from a small number of observables. The first step is to embed the data in a higher-dimensional phase space. The embedding method is based on two theorems.
First, the Whitney Embedding Theorem [9] implies that 2M + 1 independent signals measured from a system can be considered as a map from the set of states to the (2M + 1)-dimensional space, thereby reconstructing the phase space.
Next, the Takens Embedding Theorem [10] states that instead of 2M + 1 generic signals, the time-delayed versions of a single generic signal suffice to embed the M-dimensional manifold.
This paper proposes a method for reconstruction of the high-dimensional phase space of the EEG signal. The first novelty of the method is the concept of fractional embedding, which extends linear embedding to fractional time delays. The second novelty is the reconstruction of the fuzzy state vector of the signal, which is useful for in-depth signal analysis and identification of signal landmarks [11].

A. Preliminaries
Binary classification assigns the members of a given set of objects to one of two groups on the basis of whether they have some property or not.
Let us have two datasets of signal values: a positive dataset $X_P$ and a negative dataset $X_N$. The aim is to classify a time series $X_u$ with an unknown label, sampled from $X$, i.e., to find the class label of $X_u$.

B. Method
The proposed method consists of the following stages:
1. Reconstruction of the high-dimensional phase space of the signal.
2. Selection of positive trajectories from the phase space.
3. Distance-adaptive sampling of the negative trajectories from the phase space.
4. Classification of trajectories in the phase space.
5. Reconstruction of a fuzzy state of the signal.
The stages are explained in detail in the following subsections.

C. Reconstruction of Phase Space
The $M$-dimensional phase space is constructed from time-delayed copies of the signal, where $\tau$ is the delay (lag), and $M$ is the dimensionality of the trajectory. We define the reconstruction of the phase space as a mapping $X \to Y$ parameterized by $\tau$, where $\tau$ is the delay parameter.
The classical method of delays allows only for integer delays, $\tau \in \mathbb{N}$. Our novelty is the modification of the classical delay-based method to allow for fractional delays. The signal value for a fractional delay $\tau$ is calculated using linear interpolation as in (1). This step allows for dimensionality reduction of the original datasets, because usually $T \gg M$. The result of this step is two datasets of signal trajectories: a positive dataset $Y_P$ and a negative dataset $Y_N$.
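The fractional delay embedding described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fractional_delay_embed`, the forward delay convention (coordinates taken at positions t, t + τ, ..., t + (M − 1)τ), and the use of `numpy.interp` for the linear interpolation are all choices made here for illustration.

```python
import numpy as np

def fractional_delay_embed(x, dim, tau):
    """Embed a 1-D signal into a dim-dimensional phase space.

    tau is the delay in samples and may be fractional; values between
    integer sample positions are obtained by linear interpolation.
    The forward convention t + k*tau is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    # Largest start index t such that t + (dim - 1) * tau stays inside the signal
    n_points = int(np.floor(len(x) - 1 - (dim - 1) * tau)) + 1
    if n_points < 1:
        raise ValueError("signal too short for this dim and tau")
    t = np.arange(n_points)
    # Fractional sample positions, shape (n_points, dim)
    pos = t[:, None] + tau * np.arange(dim)[None, :]
    # Linear interpolation between neighbouring integer samples
    values = np.interp(pos.ravel(), np.arange(len(x)), x)
    return values.reshape(n_points, dim)

# Synthetic stand-in for one EEG trial (896 samples), embedded with
# the M = 2, tau = 9.5 setting used for the phase space plots.
signal = np.sin(np.linspace(0.0, 20.0, 896))
trajectory = fractional_delay_embed(signal, dim=2, tau=9.5)
print(trajectory.shape)  # (886, 2)
```

With an integer `tau` the sketch reduces to the classical method of delays, since every interpolation position then falls exactly on a sample.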

D. Selection of Positive Trajectories
We assume that the positive dataset $Y_P$ is a mixed set of trajectories, which consists of a subset of trajectories that are similar to $Y_N$ as well as a subset of trajectories that are different from $Y_N$. The aim is to find a subset of positive trajectories $\hat{Y}_P$ such that the distance between $\hat{Y}_P$ and $Y_N$ is maximal. We also assume that for any dataset $X_i$ there exists at least one such positive trajectory, as described in (2), which selects the trajectory that maximizes $d(\cdot,\cdot)$, where $d(\cdot,\cdot)$ is a dissimilarity (distance) function between two trajectories.
Here we use the Euclidean distance, described in (3), as the dissimilarity function. We specify the selection step as a
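As an illustration of the selection step, the following Python sketch picks the positive trajectory that lies farthest from the negative dataset under the Euclidean distance. The function name `select_positive` is hypothetical, and the way a trajectory is compared with a whole set (distance to its nearest negative trajectory) is an assumption of this sketch rather than a definition taken from the text.

```python
import numpy as np

def select_positive(Y_pos, Y_neg):
    """Return (index, trajectory) of the row of Y_pos farthest from Y_neg.

    Assumption of this sketch: the distance between a trajectory and the
    negative set is the Euclidean distance to its nearest negative trajectory.
    """
    Y_pos = np.asarray(Y_pos, dtype=float)
    Y_neg = np.asarray(Y_neg, dtype=float)
    # Pairwise Euclidean distances, shape (n_pos, n_neg)
    diff = Y_pos[:, None, :] - Y_neg[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Distance of each positive trajectory to its nearest negative one
    nearest = dist.min(axis=1)
    best = int(np.argmax(nearest))
    return best, Y_pos[best]

# Toy trajectories in a 2-dimensional phase space
idx, traj = select_positive([[0, 0], [5, 5], [1, 1]], [[0, 1], [1, 0]])
print(idx, traj)
```

Applying the function once per trial would yield one selected trajectory per dataset $X_i$, in line with the assumption stated above.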

E. Distance-Adaptive Sampling of Negative Trajectories
The result of the selection step is a positive trajectory dataset $\hat{Y}_P = \{\hat{y}_1, \hat{y}_2, \ldots\}$, which is smaller than the negative dataset $Y_N$. Using such imbalanced datasets for classification is impractical, because the classification results become biased [12].
To balance the datasets, we perform distance-adaptive sampling of the negative dataset $Y_N$ as described in (4). The algorithm of the distance-adaptive sampling is as follows:

G. Reconstruction of a Fuzzy State of the Signal
The classification result of each trajectory is used to reconstruct the state vector $S(t)$ of the EEG signal as described in (5).
The results are interpreted in terms of fuzzy logic as a fuzzy probability of the state (positive or negative) of the signal at time $t$.
Finally, a classification label is assigned to $X_u$ as described in (6):
$$\mathrm{label}(X_u) = \begin{cases} 1, & \max_t S(t) \ge \theta, \\ 0, & \max_t S(t) < \theta, \end{cases} \qquad (6)$$
where $\theta$ is the threshold probability of the state.
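The final labelling rule can be illustrated with a short Python sketch. It assumes that the fuzzy state of the signal is available as an array of per-sample values in [0, 1]; the function name `assign_label`, the default threshold of 0.5, and the non-strict comparison are hypothetical choices made for this example only.

```python
import numpy as np

def assign_label(state, threshold=0.5):
    """Label a signal as positive (1) if its fuzzy state ever reaches the threshold.

    state: per-sample fuzzy probability of the positive state, values in [0, 1].
    threshold: hypothetical default; the text does not fix a value.
    """
    state = np.asarray(state, dtype=float)
    return 1 if state.max() >= threshold else 0

# A signal whose state peaks at 0.7 is labelled positive at threshold 0.5
print(assign_label([0.1, 0.7, 0.2]))
# A signal whose state never exceeds 0.2 is labelled negative
print(assign_label([0.1, 0.2, 0.1]))
```

Because only the maximum of the state matters, a single pronounced landmark is enough to label the whole signal as positive, which matches the role of landmarks described in the text.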

III. METHODS
For our experiments, datasets Ia and Ib ("self-regulation of SCPs") and dataset IV ("self-paced 1s") from the BBCI competition datasets (http://bbci.de/competition/) were used. The datasets were taken from healthy subjects. In the first two datasets the subject was asked to move a cursor up and down on a computer screen and received visual feedback while the subject's slow cortical potentials (SCPs) were recorded.
Dataset Ia [13] consists of 135 trials belonging to class 0 and 133 trials belonging to class 1.
Dataset Ib [13] consists of 100 trials belonging to class 0 and 100 trials belonging to class 1. Each trial consists of 896 samples from each of 6 channels. The sampling rate is 256 Hz, and the recording length is 3.5 s.
In dataset IV [14], the goal is to predict the laterality of upcoming finger movements (left vs. right hand) 130 ms before the key press. The sampling rate is 1 kHz. Downsampling to 256 Hz was performed before the experiments. The set consists of 159 trials belonging to class 1 and 157 trials belonging to class 0.
All datasets were randomly partitioned into 5 parts, and 5-fold cross-validation was used to evaluate the classification results.
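The evaluation protocol can be sketched as follows. This is a minimal Python illustration of a random 5-fold partition, not the authors' evaluation code: the generator name `five_fold_indices` and the fixed random seed are assumptions introduced here, and the trial count of 268 corresponds to dataset Ia (135 + 133 trials).

```python
import numpy as np

def five_fold_indices(n_trials, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a random n_folds partition."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_trials)          # random order of trial indices
    folds = np.array_split(perm, n_folds)     # near-equal parts
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != k]
        )
        yield train_idx, test_idx

# Dataset Ia has 135 + 133 = 268 trials
for train_idx, test_idx in five_fold_indices(268):
    print(len(train_idx), len(test_idx))
```

Each trial appears in exactly one test fold, so averaging the per-fold accuracy gives an estimate that uses every trial once for testing.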

IV. RESULTS
Figure 1 shows some examples of the EEG signals taken from Dataset Ib, channel 0.

V. RELATED WORK
The idea of embedding has been used previously for analysis of EEG data.
Athitsos et al. [15] proposed a method for approximate subsequence matching by using embedding. Embedding maps each database time series into a sequence of vectors, so that every step of every time series is mapped to a vector. Embedding is computed by applying full dynamic time warping between reference objects and each time series.
Klonowski et al. [3] estimated the embedding dimension of the EEG signals of the same subjects to be between 7 and 11.
Rohrbacker [16] found that the EEG data unfolded in a 22-dimensional space. Anderson et al. [17] used a time-embedding dimension of 50 for a neural network classifier.
Kroupi et al. [18] investigated how brain signals represented by the EEG data interacted when watching music clips. A non-linear measure was applied to determine the driver/driven relationship between these two modalities. The embedding dimensions used for the EEG segments ranged from 3 to 8.
Jianbo and Jing [19] used a recurrence time based method for seizure detection. Their empirical evidence shows that embedding dimensions from 3 to 6 are adequate, while time delays may be chosen from 2 to 6.
Yuan et al. [20] analysed the embedding dimension of normal and epileptic EEG time series. The study found that the embedding dimension reflects the variation in the degrees of freedom of the human brain as a nonlinear dynamic system during seizure. The embedding dimension changes during seizure and becomes different from that of normal EEG signals: the average value of the embedding dimension of normal EEG signals is 8, while the embedding dimension of the epileptic EEG signals is 17.

VI. CONCLUSIONS
The advantages of the proposed method are: 1) the reconstruction of the fuzzy state vector of the signal allows for in-depth analysis of the signal and identification of signal landmarks; and 2) fractional time embedding allows for more flexible embedding.
The disadvantages of the proposed method are: 1) the determination of the delay time $\tau$ and the dimension $M$ that achieve the best separation of trajectories from the positive and negative datasets is a computationally hard task if done by exhaustive search; 2) identification of signal landmarks can be time-consuming if performed manually by an expert; and 3) the method is susceptible to noise (this is the main problem of all embedding-based methods), so the accuracy of the method is hard to guarantee.
Further research will focus on a computationally feasible solution for finding the delay time and embedding dimension values that achieve the best separation of positive and negative signal trajectories, considering the time constraints of real-time BCI systems.

Fig. 1. EEG signal (Dataset Ib, channel 0): a) 20 positive signals, b) 20 negative signals.
The results of the fractional time delay embedding of the EEG signal into the high-dimensional phase space are shown in Fig. 2 (dataset Ib, channel 0). The plot shows that the positive dataset has a wider distribution of values than the negative dataset (compare Fig. 2(a) with Fig. 2(b)). Such extreme values (called landmarks [11]) can be exploited for signal classification.
Fig. 2. Phase space reconstruction of EEG signals (dimension M = 2, time delay τ = 9.5): a) 20 positive signals, b) 20 negative signals.