Algorithm for the Detection of Changes of the Correlation Structure in Multivariate Time Series

In this research, an improved algorithm for the detection of changes of the correlation structure in multivariate time series is proposed. The starting point of the technique is a covariance matrix whose entries are the largest entries of a cross-covariance matrix which is composed of all pairs of the time series reconstructed to an M -dimensional phase space. Principal component analysis is performed on this maximized cross-covariance matrix, and the overall degree of synchronization among multiple-channel signals is defined, by synchronization index, as the Shannon entropy of the eigenvalue spectrum. Throughout the experiment, the effectiveness of the proposed algorithm is validated with simulated data – a network of time series generated by autoregressive models and a network of coupled chaotic Roessler oscillators. DOI: http://dx.doi.org/10.5755/j01.eee.18.8.2625


I. INTRODUCTION
Spatially-extended complex dynamical systems may be thought of as being composed of numerous constituents (dynamically formed subsystems), each having its own dynamics.Typically, the relevant state variables of such systems can only be viewed through observation functions that project the high-dimensional state space onto an observation space of much lower dimension, resulting in a set of time series [1].In realistic situations, however, the system contains multiple interacting components, and is nonlinear, non-stationary and noisy, and one goal of data analysis is to detect, characterise, and possibly predict any events that can significantly affect the normal function of the system [2].Multivariate analyses of such time series may in fact help to give deeper insights into the collective dynamics of spatially-extended systems.For example, understanding brain function, during both physiological and pathophysiological conditions (such as, for example, epilepsy), requires the characterisation and quantification of the collective behaviour of neural networks that generate signals in different areas.Such real-world examples call for an effective method of analysing nonlinear and nonstationary multichannel data collected in a noisy environment.For realistic dynamical systems in the presence of noise, when the multichannel recordings are usually from signals to which quite different combinations of the intrinsic dynamical variables of the underlying system contributed, it is often useful to explore weaker forms of synchronization, such as phase synchronization [2].Over the last decade, a number of studies have been devoted to interactions in spatially-extended systems [1]- [8].In some studies, in particular, a zero-lag (or equal-time) correlation matrix, constructed from multivariate data sets, was analysed and several matrix-based methods to detect global changes in synchronization were proposed.The key point of these methods is that changes in the degree of synchronization between time series provoke level repulsions between Eigen states at both edges of the spectrum of the correlation matrix [4], [5].To quantify the degree of uniformity in the distribution of the eigenvalues, the Shannon entropy is often used as a robust approach.The methods were applied to EEG recordings from epilepsy patients and were demonstrated to be able to detect, for instance, statistically significant changes in the correlation structure of focal onset seizures [4], [8].Another way to study interactions in spatially-extended systems is based on a statistical analysis of multivariate phase synchronization phenomena by using the phase-coherence matrix [1,6] or by constructing a matrix based on the average phase synchronization times (APST) among all available pairs of channels [2], [6].It is argued that the APST can, in general, be significantly more sensitive to changes in the degree of synchronization than correlations [2].On the other hand, the matrices for phase coherence or the APST are based on the Hilbert transformation, on the assumption that time series are oscillatory.Multichannel data from a real system are to a considerable degree stochastic, with a broad power spectrum, as they are corrupted by both internal (e.g.dynamic) and external (e.g.measurement) noises, and the extracted instantaneous phases often have no physical meaning [4].Multivariate singular spectrum analysis (M-SSA) [9], [10] provides insight into the unknown or only partially known dynamics of the underlying system, by decomposing the delay-coordinate phase space of a given multivariate time series into a set of data-adaptive orthonormal components, and can greatly help phase synchronization analysis [9], also when lag synchronization occurs.However, the full multivariate singular spectrum of processes with broadband power spectrum is not concentrated into a single first largest eigenvalue, and the Shannon entropy of this singular spectrum is high even when all channels of a system are identical.Therefore the effectiveness of the entropy measure is low.Another remark is that M-SSA operates on a large covariance matrix, which is computed from the full augmented trajectory matrix whose size is equal to the product of the number of channels and the embedding dimension of the reconstructed phase space [9].As a result, M-SSA becomes computationally expensive, especially in a moving-window analysis of nonstationary data.
In this paper I present a simple and less computationally complex algorithm for the detection of changes of the correlation structure in multivariate time series by using the maximum linear cross correlation measure for lag synchronization.The starting point of the technique is a covariance matrix whose entries are the largest entries of a cross-covariance matrix which is composed of all pairs of the time series reconstructed to an M-dimensional phase space.

II. ALGORITHM BASED ON SINGULAR SPECTRUM ANALYSIS OF A MAXIMISED COVARIANCE MATRIX (MCM-SSA) Let
{ ( ) } be a multivariate time series with D channels of length N. It is assumed that each channel has been centred and normalized.Each channel can be reconstructed to an M-dimensional phase space by selecting the embedding dimension M and time delay .Each phase point in the phase space is thus defined by [11] where ( ) , and ( ) denotes the transpose of a real matrix.At the reconstructed phase space matrix with M rows and columns (called a trajectory matrix) is defined by: and encompasses M delayed versions of each channel.The total trajectory matrix of the set will be a concatenation of the component trajectory matrices computed for each channel, i.e.
[ ] .This full augmented trajectory matrix, which has DM rows of length can be used for multivariate singular spectrum analysis (M-SSA) [9].However, as mentioned above, the eigendecomposition of a large covariance matrix in a moving-window analysis of non-stationary data becomes computationally expensive.The covariance matrix can be rewritten as a block matrix: with cross-covariance matrix for all pairs of trajectory matrix in blocks, i.e. , , .Only the largest entries of the cross-covariance matrix ( ) are incorporated in the resulting covariance matrix ̂.These largest values denote the cross-covariance between channels of the multivariate time series which are aligned by phase (have a similar course in time).It is necessary that would be greater than the expected time delay between dependent channels.In order to reduce computational time, the resulting covariance matrix ̂ is calculated from the full matrix according to the following formula where .A further reduction of the computational time can be achieved by replacing the full covariance matrix by the reduced covariance matrix , where [ ] is a concatenation of the reduced component trajectory matrices , , which for each channel are defined by: Thus only zero-lagged and M-lagged versions of each channel are used.This allows us to capture all lagged correlations, and thereby to avoid multiple calculations of the covariance between lagged copies, which differ only in several time points.The resulting covariance matrix ̂ is calculated from the reduced covariance matrix according to the slightly corrected formula (4) ))).
A further approach is based on the singular value decomposition (SVD) of the maximized covariance matrix ̂ ̂ , to yield a diagonal matrix that contains the real eigenvalues of ̂, and a matrix E whose columns are the associated eigenvectors , { }.The form a new orthogonal basis in the embedding space of , and the corresponding give the variance in the direction of .The spectral decomposition in Equation ( 7) determines the directions of greatest variance, from largest to smallest, subject to the condition that each new direction be orthogonal to all preceding ones.The overall degree of synchronization among multiple-channel signals shall be denoted by the synchronization index , which will be defined as the well-known Shannon entropy of the eigenvalue spectrum where ∑ are the squared normalized eigenvalues.
indicates perfect synchronization and signals complete lack of synchronization.

III. SIMULATION RESULTS
We consider a network of time series generated by an autoregressive model (AR) as a prototypical model of a stochastic dynamical system, to compare the performance of the proposed algorithm, based on principal component analysis of a maximized cross-covariance matrix (MCM-PCA), to an M-SSA based method for the overall degree of synchronization estimation.The set of D=10 uncoupled time series were generated by means of the Matlab GARCH module and applying data smoothing.The data were generated repeatedly in 20 independent trials.The length of this multivariate time series was chosen to be 10,000 data points, the size of the moving window was chosen to be 1,000 data points and the time interval between two adjacent moving windows was 100 data points.The size of the moving window defines the time scale on which correlations are measured, and a compromise has to be made between the time scale given by and the influence of noise and random correlations.Hence, the choice of the length of strongly depends on the specific properties of the system under consideration, i.e. its typical time scales, the magnitude of noise contamination, and the sampling rate of the measurement.The signals inside the window were normalized channel-wise so that they had a mean of zero and unit variance.Because the proposed algorithm is applicable largely to multivariate time series, where lag synchronization occurs, we progressively inserted, for simplicity, lagged copies (with progressively increased lag ) of a single channel in additional to the 10 existing channels, thus increasing the overall degree of synchronization, rather than introducing couplings into these processes [4].The average (over all windows) synchronization index was calculated for each trial.In the following, we compare the proposed MCM-PCA algorithm with its M-SSA counterpart.The synchronization indices ( ) (for the MCM-PCA algorithm) and ( ) (for the M-SSA algorithm) over the number J of additional inserted single channel lagged copies were each calculated according to Equation (8).
In order to put the obtained results onto more general grounds, we considered also a prototypical model of nonstationary dynamical systems, a network of coupled chaotic Roessler oscillators under noise [2].The nonstationary nature of the system is manifested by the time-dependent coupling parameter.Data processing and analyses were performed using software written by Matlab (The MathWorks, Natick, MA).
Fig. 1 shows the increment of the overall degree of synchronization among multiple-channel signals over the number of single channel lagged copies inserted in addition to the existing ones.Note that the direction of change in the synchronization indices is opposite to this increment.We can see from this figure that the proposed approach appears to be more sensitive to changes in the degree of synchronization than the method based on M-SSA.To compare the effectiveness of the MCM-PCA algorithm and M-SSA based algorithm and consider approximately linear behavior in ( ) for , we have defined for our prototype model the following contrast measure [6] ( ) ( ) that characterizes the sensitivity of the algorithm to changes in the degree of synchronization.Further we have calculated the contrast measure for each trial and use ANOVA to compare this measure for both methods.ANOVA says there is a significant difference (p<0,001).Fig. 2 shows the time evolution of synchronization indices (i.e., Shannon entropy) for a network of coupled chaotic Roessler oscillators, where the dashed line indicate the full synchronized regime.From Fig. 2 it can be seen that the value of the contrast for the Shannon entropy defined by M-SSA based method is far less than those defined by MCM-PCA based method.

Fig. 1 .
Fig. 1.Evolution of the averaged over all windows and all trials synchronization indices and for a network of AR time series versus the number J of additional inserted single channel lagged copies.

Fig. 2 .
Fig. 2. Evolution of the synchronization indexes and for a network of 10 locally coupled chaotic Roessler oscillators; the full synchronized regime is marked by dotted line.

Fig. 3
Fig. 3 shows the averaged relative running time for computing synchronization indices according to the MCM-PCA and M-SSA algorithms versus the embedding dimension M, for a network of AR time series and a network of coupled chaotic Roessler oscillators of equivalent size.

Fig. 3 .
Fig. 3.The averaged relative running time for computing synchronization indices and according to the MCM-PCA and M-SSA algorithms respectively, for a network of 10 locally coupled chaotic Roessler oscillators and network of 10 AR time series, versus the embedding dimension M.