Multiple Physiological Signal-Based Human Stress Identification Using Non-Linear Classifiers

This paper describes human stress identification using multiple physiological signals. The Electrocardiogram (ECG), Electromyogram (EMG), Heart Rate Variability (HRV), Galvanic Skin Response (GSR), and Skin Temperature (ST) signals were acquired or derived from 40 subjects using mental arithmetic task-based stress-inducing stimuli. To compute the stress induced in the participating subjects, wavelet denoising, digital elliptic filtering, an ectopic beat removal algorithm, the Lomb-Scargle (LS) periodogram, the Fast Fourier Transform (FFT), and a startle detection algorithm were used to extract various features from the five physiological signals. K Nearest Neighbour (KNN) and Probabilistic Neural Network (PNN) are the nonlinear classifiers used to discriminate between the normal and stress states of the subjects. In order to strengthen the multiple evidence-based stress identification system, we investigated Higher-Order Statistical (HOS) features in HRV signals, which have been applied successfully in various cardiac fault detection applications. Similarly, to evaluate the efficacy of the EMG, GSR, and ST signals, the existing statistical features used in stress research were considered with a large number of data samples. The results indicate that the proposed HOS features of HRV performed well, with accuracy up to 93.75%. At the other extreme, 76.25%, 71.25%, 70.32%, and 75.32% were obtained for ECG, EMG, GSR, and ST, respectively. Finally, this study concludes that the multiple physiological signal-based subject-independent analysis and its algorithms give a reasonably improved detection rate. DOI: http://dx.doi.org/10.5755/j01.eee.19.7.2232


I. INTRODUCTION
Identifying human stress remains a major challenge in the computational field. Stress can be defined as a lack of coordination between the human body and mind, which can be controlled by relaxation as well as suitable management techniques. Human stress results in the loss of performance and emotional stability and, in some cases, might lead to [...] detection rate using a computerised Stroop colour word test [7], with the results indicating 90.02% in subject-dependent approaches. The features in the physiological signals have been found to differ between stressed individuals and normal human beings. However, the identification of a stressed state in comparison to the normal state is recent in this research.
In this work, we aimed to identify stressed subjects in a subject-independent analysis using the mental arithmetic task with the help of advanced signal processing for five physiological signals. The paper is organised into five sections. Section II gives a brief introduction to the physiological signals used in stress assessment research. Section III describes the proposed methodology and its related materials and methods. Section IV presents the results and their discussion. Finally, Section V concludes the paper.

A. Heart rate variability
HRV remains one of the dominant indicators of human stress compared to other physiological signals. Previously, several studies have considered HRV signals for the identification and analysis of stress [1], [2], [6], [7]. More often these studies have been laboratory-based rather than real-time. Our previous study concluded that the choice of dominant stimuli and physiological signals and the lack of signal processing methods are the major impediments in this work [8]. The High Frequency (HF) and Low Frequency (LF) ranges of HRV are frequently investigated and are among the reliable measures. Measuring how long the heart rate was elevated due to stress leads to computational problems. Because HRV is an unevenly sampled signal, its sampling frequency is unknown. Usually, the duration of the heart rate signal is important to the efficacy of the frequency band analysis. Therefore, the duration of the sample as well as the sampling frequency must be determined, or suitable methods are required to process the data. Our previous study was motivated to solve these two problems using an LS periodogram after applying the ectopic beat removal algorithm [9].
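As an illustration of why the LS periodogram suits unevenly sampled HRV data, the following minimal sketch evaluates the classic Lomb-Scargle power directly at the irregular beat times, so no resampling or assumed sampling frequency is needed. It is not the implementation used in this work; the function name `lomb_scargle`, the synthetic beat times, the 0.1 Hz test oscillation, and the frequency grid are all assumptions made for the example.

```python
import math

def lomb_scargle(times, values, freqs):
    """Classic Lomb-Scargle power at each frequency (Hz) for unevenly sampled data."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]  # remove the mean before fitting sinusoids
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # The offset tau makes the sine and cosine terms orthogonal at this frequency.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, c))
        ys = sum(a * b for a, b in zip(y, s))
        power.append(0.5 * (yc ** 2 / sum(v * v for v in c)
                            + ys ** 2 / sum(v * v for v in s)))
    return power

# Synthetic (hypothetical) uneven beat times in seconds and RR intervals
# carrying a 0.1 Hz oscillation; real HRV data would replace these.
beat_times = [0.8 * i + 0.3 * math.sin(i) for i in range(200)]
rr = [0.8 + 0.05 * math.sin(2 * math.pi * 0.1 * t) for t in beat_times]

freqs = [0.04 + 0.01 * k for k in range(47)]  # grid from 0.04 to 0.50 Hz
power = lomb_scargle(beat_times, rr, freqs)
peak_freq = freqs[power.index(max(power))]

# Band powers over the LF (0.04-0.15 Hz) and HF (0.15-0.5 Hz) ranges.
lf_power = sum(p for f, p in zip(freqs, power) if f < 0.15)
hf_power = sum(p for f, p in zip(freqs, power) if f >= 0.15)
```

In this synthetic case the spectral peak falls on the 0.1 Hz grid point, inside the LF band, even though the samples are not equally spaced.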

B. Electrocardiogram
The ECG is a well-known physiological signal, but only a limited number of studies have addressed stress using the ECG signal. So far, the HRV signal has been the dominant measure of stress. Recently, studies have started to use the ECG as a measure in emotion classification research [10]. Normally, useful information in an ECG exists below 100 Hz. Emotion assessment studies are done in two frequency ranges: 0-100 Hz and 0-10 Hz [10], [11]. Car driver stress and emotions were identified in a 10 s window of a 0-100 Hz signal, and classification accuracies of 79% and 76% using SVM and KNN classifiers, respectively, were reported [10]. Similarly, a 0-10 Hz signal was separated into eight equally spaced non-overlapping sub-bands, and a classification accuracy of 75% was obtained using an extended linear discriminant classifier [11]. Usually, in HRV, 0-0.5 Hz is considered for analysing stress because of its relationship with the autonomic nervous system (ANS).

C. Electromyogram
The Trapezius muscle in the shoulder is often used in the identification of involuntary activities such as stress [4], [6]. Muscles undergo voluntary and involuntary contractions. Voluntary muscle movement is finely controlled by the brain through the autonomic nervous system (ANS). Similarly, involuntary muscles such as cardiac muscles and blood vessels, as well as muscles in the digestive and reproductive systems, undergo involuntary movements that are poorly controlled by the brain. This involuntary muscle movement also generates an action potential during the stress state. This potential can be measured in the facial and Trapezius muscles [12]. Previously, Healey studied the 0-16 Hz range of the EMG signal for driver stress, and Lundeberg et al. studied the EMG signal at Trapezius locations of the shoulder and observed measurable changes [3], [6].

D. Galvanic skin response and skin temperature
GSR is one of the effective measures in stress assessment research. Previously, GSR has been measured in several laboratory-based studies [7], [13] and in real-time driver stress-level computation [6]. During stress, the GSR signal exhibits the startle response, which occurs only under extreme stimuli or demand. During stress assessment, an external or internal demand is created to evoke the startle potential [6]. The fingers and the arch and toes of the foot are ideal locations for measuring skin conductance. Usually, the GSR is measured across two fingers.
ST is the simplest measure used to estimate changes in affective states such as stress and emotions [7], [14]. ST measurement and analysis are easier than for other physiological signals [15], yet only a limited number of studies have investigated the relationship between stress and ST [7], [15]. In order to increase the evidence for stress measurement, skin temperature is used in this work to propose a physiological stress assessment system.

A. Protocol
Fig. 1 shows the proposed research methodology of this work. The mental arithmetic task is one of the efficient stimuli for inducing stress [2], [16], [17]. In this work, a mental arithmetic task is proposed that consists of arithmetic problems with audio distraction to increase stress through mental demand [2], [18]. The protocol consists of 4 levels: relaxing with soft music, low level (low difficulty), medium level (medium difficulty), and high level (high difficulty) with audio distraction. Each level has 30 arithmetic problems involving addition, subtraction, multiplication, division, or a combination of these. Each problem has 4 answers, and the subject clicks the correct answer using the mouse within the 10 s provided. If the subject fails to click the correct answer or does not click any answer within 10 s, the protocol automatically restarts from the beginning of that level. In this approach, arithmetic problems with audio distraction increase the mental demand and reduce cognition and memory. These complexities are the main reason for the induction of stress. Relaxation periods are provided at the initial and final stages of the stress-inducing task. For efficient classification, the initial relaxation and the high-level stress are considered to be the normal and stress states, respectively. The complete duration of the stress protocol is around 66 min.

B. Subjects and data acquisition
In total, 40 subjects (20 males and 20 females) aged between 20 and 25 years with similar educational and cultural backgrounds participated in this study. All the subjects were free from any history of medication or drug intake and from smoking habits. All the subjects provided written consent and confirmed that their participation in this research was voluntary. A detailed explanation of the protocol was given prior to the start of the task. Three surface replaceable ECG electrodes (Ag/AgCl) were placed according to the Einthoven triangle placement. Bipolar (Ag/AgCl) gold-plated reusable electrodes were placed on the left Trapezius muscle. Similarly, the thermistor-based skin temperature electrodes were placed under the armpit in order to avoid external temperature interference. Finally, the GSR electrode was placed on the hand between two fingers. Four physiological signals (ECG, EMG, GSR, and ST) were collected from the participating subjects during the entire experiment. The ECG, GSR, and ST signals were sampled at a frequency of 1 kHz using AD Instruments, Australia. The EMG signal was sampled at 500 Hz using PHYWE Instruments, Germany. The complete experimentation was free from fluctuations in temperature and lighting, human interference, and external sound. Finally, the subject was asked to report the effectiveness of the protocol after the experiments were completed. Fig. 2 shows the descriptive pattern of the multiple signals acquired in this research.

C. Wavelet denoising and digital elliptic filtering
A common preprocessing methodology is required when investigating multiple physiological signals, so that baseline wander, high-frequency noise, and power line noise are removed without deleting any of the useful frequency range. In this work, common preprocessing methods were found to be more useful for removing noise from the ECG and EMG signals. The wavelet transform is capable of denoising abnormal variations occurring at any frequency [19]. The complete algorithm does not depend on cutoff frequency ranges. Instead of a cutoff frequency, statistical estimate-based thresholding is used to remove noise at each level of the decomposition. In our previous work, the performance of wavelet denoising was tested on our ECG signal using different wavelet functions, and we found that the rigrsure thresholding rule is better than other methods. Similarly, three mother wavelet functions, namely "db4", "coif5", and "sym7", were investigated to determine the best among them [20]. The results show that the "coif5" wavelet performs well, and the same wavelet function is common to both ECG and EMG signals. The major advantage of the "coif5" wavelet function is the convergence of the ECG and EMG signals completely. The "coif5" mother wavelet, the rigrsure thresholding rule, and the soft thresholding method were utilised in our experiments. The ECG and EMG signals were decomposed up to 14 levels in order to search for the low-frequency signals. In contrast, the GSR and ST signals contain useful information in a particular frequency range only. Therefore, fourth-order low-pass digital IIR elliptic filters were applied to remove the unwanted information above 8 Hz in the GSR signal and above 1 Hz in the ST signal.
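The soft thresholding step mentioned above can be sketched as follows. This shows only the shrinkage applied to a set of wavelet detail coefficients; the "coif5" decomposition and the rigrsure threshold selection are not reproduced, and the function name `soft_threshold`, the coefficient values, and the threshold of 1.0 are hypothetical.

```python
import math

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient towards zero by `threshold`; zero out smaller ones."""
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        # Coefficients below the threshold are treated as noise and removed;
        # larger ones keep their sign but lose `threshold` in magnitude.
        shrunk.append(math.copysign(magnitude, c) if magnitude > 0 else 0.0)
    return shrunk

# Hypothetical detail coefficients from one decomposition level.
detail = [3.0, -0.5, -2.0, 0.2]
denoised = soft_threshold(detail, 1.0)
```

In a full pipeline the threshold would be estimated separately at each decomposition level, and the signal would then be reconstructed from the shrunk coefficients.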

D. Feature computation
To extract the features, the preprocessed signals were passed through the feature extraction methods. The ECG signal was directly analysed using the FFT, and simple statistical features such as mean, standard deviation, power, and energy were computed in both the time and frequency domains for the 0.05-0.15 Hz and 0.15-0.5 Hz bands [11]. However, HRV is the more dominant signal in stress assessment. HRV signals were derived from the acquired ECG signal using the "coif5" wavelet function-based modified Pan-Tompkins algorithm [21]. The derived HRV signal presented the problem of noise peaks and missed beats, to which the HF and LF power are particularly sensitive in stress assessment studies. To resolve these issues, ectopic beat removal was done using beat replacement and interpolation [22]. HRV is an unevenly sampled time signal, and its power spectrum in the frequency domain cannot be extracted by the FFT. Gilford et al. proposed the LS periodogram to extract the power of such a complex HRV signal [22]. In total, three frequency bands (0.04-0.15 Hz, 0.15-0.5 Hz, and 0.04-0.5 Hz) in the power spectrum were studied using HOS features, namely kurtosis, skewness, third-order cumulant, and fourth-order cumulant. The same features as for the ECG were used in the time and frequency domain analysis of the EMG signals for the frequency range of 0-16 Hz [6]. The latency, amplitude, rise time, and half recovery time represent the typical electrodermal activity. The latency of the electrodermal response (EDR) is the time difference between the stimulus and the onset of the rise of the response. EDR amplitude is the difference between the peak of the response and the baseline amplitude. EDR duration is the time between the onset of the response and the peak response. These features were computed from the GSR signal in the time domain over 0-8 Hz. Finally, the mean, minimum, maximum, and standard deviation of ST were calculated.
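The HOS features listed above can be illustrated with a short sketch. It assumes the zero-lag sample definitions, in which the third-order cumulant equals the third central moment and the fourth-order cumulant is the fourth central moment minus three times the squared variance; the exact definitions and normalisation used in this work may differ. The function name `hos_features` and the input values are hypothetical.

```python
def hos_features(samples):
    """Skewness, kurtosis, and zero-lag third- and fourth-order cumulants."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [v - mean for v in samples]
    m2 = sum(d ** 2 for d in centred) / n  # variance (second central moment)
    m3 = sum(d ** 3 for d in centred) / n  # third central moment
    m4 = sum(d ** 4 for d in centred) / n  # fourth central moment
    return {
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "cumulant3": m3,
        "cumulant4": m4 - 3.0 * m2 ** 2,
    }

# Hypothetical power values taken from one HRV frequency band.
band_power = [1.0, 2.0, 3.0, 4.0, 5.0]
features = hos_features(band_power)
```

For a symmetric input such as this one, the skewness and third-order cumulant are zero, and the fourth-order cumulant is negative because the values are spread more evenly than a Gaussian sample would be.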

E. Classification
Mapping the input pattern to the appropriate class is the main function of artificial classifiers. In this work, we concentrated on studying two simple nonlinear classifiers, namely K-nearest neighbour (KNN) and PNN. These classifiers work entirely differently from each other. KNN works based on the nearest neighbours of the testing features among the training features. The Euclidean distance is used in this study. A minimum k value is usually suggested for better discrimination. Therefore, k values from 1 to 10 were tested, and the classification result for each k value was obtained as the average over 10 random draws of shuffled data. Ten-fold cross validation of the KNN classifier was also done with the same procedure to compare the results. The PNN classifier works on a different principle; its architecture consists of input, pattern, summation, and output layers. It has a parallel structure, which makes it faster than back propagation neural networks. The operation of the PNN is based on the probability density function, and its performance depends on the spread factor. The spread factor close to 1 is more significant than the minimum value during the classification. Each spread value and its overall classification accuracy were obtained with a technique similar to that used for the k value of KNN. Table I shows a summary of the features investigated in each signal. A total of 40 subjects and 148 features were investigated; 70% of the data in each feature were used for testing and 30% for training in all the signals.
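The two decision rules can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used in this work: KNN takes a majority vote among the k training points closest in Euclidean distance, and the PNN is reduced to its core idea of averaging a Gaussian kernel per class and picking the class with the largest estimated density. The function names, the toy feature vectors, and the values k = 3 and spread = 0.3 are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def pnn_predict(train_x, train_y, query, spread):
    """Pick the class with the largest average Gaussian kernel response."""
    counts = Counter(train_y)
    sums = {}
    for x, y in zip(train_x, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        # Pattern layer: one Gaussian unit per training sample.
        sums[y] = sums.get(y, 0.0) + math.exp(-d2 / (2.0 * spread ** 2))
    # Summation layer averages per class; output layer takes the maximum.
    return max(sums, key=lambda label: sums[label] / counts[label])

# Toy two-feature training set with hypothetical values.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["normal", "normal", "normal", "stress", "stress", "stress"]

knn_label = knn_predict(train_x, train_y, (0.15, 0.1), k=3)
pnn_label = pnn_predict(train_x, train_y, (0.95, 1.0), spread=0.3)
```

A very small spread makes every kernel response vanish for queries far from the training points, which is one reason the spread factor has to be tuned alongside k.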

IV. RESULTS AND DISCUSSION
In this work, the overall classification accuracy is reported along with sensitivity and specificity. To obtain these measures, the True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) values were calculated. Sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and overall accuracy = (TP + TN) / (TP + TN + FP + FN) were computed. The overall classification accuracy of the dominant features of each signal is reported in Table II.
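As a worked example of these three measures, the sketch below computes them from a set of confusion-matrix counts. The counts are hypothetical and are not taken from the results of this study; the function name `confusion_metrics` is likewise an assumption.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and overall accuracy from raw counts."""
    sensitivity = tp / (tp + fn)                # share of stress samples detected
    specificity = tn / (tn + fp)                # share of normal samples detected
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all samples correct
    return sensitivity, specificity, accuracy

# Hypothetical counts for 40 stress and 40 normal samples.
sensitivity, specificity, accuracy = confusion_metrics(tp=36, fp=6, tn=34, fn=4)
```

With these counts the sensitivity is 36/40, the specificity is 34/40, and the accuracy is 70/80.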

A. Classification using nonlinear classifier
The comprehensive results of the dominant features and the classification results are compared across three classifiers, including 10-fold KNN. The maximum classification accuracy of 76.25% was obtained for the mean feature in the 0.15-0.5 Hz band of the ECG signal. The specificity and sensitivity are well balanced for this feature, at 75.61% and 76.92%, respectively, in the conventional KNN classifier when the k value is 7. The HRV signal is another prime measure in human stress assessment research. In total, 3 frequency bands were investigated, and the LF and HF bands of HRV produced the maximum detection rate with the second- and third-order cumulants (93.75% for both features). However, the specificity of the LF and HF bands is 90.7% and 88.89%, respectively, and the sensitivity of the LF and HF bands is 97.3% and 100%, respectively, when the k value is 5 in K-fold KNN. The HRV feature shows that the maximum accuracy is obtained when the k value is minimum. The classification of the EMG signal reached an accuracy of 71.25% in the K-fold KNN classifier, with high specificity and sensitivity when k is 7. However, the PNN also performs relatively well, with an accuracy of 70.82% when the spread factor is 0.03.
In GSR and ST classification, classification accuracies of 70.83% and 75.32%, respectively, were obtained with different specificities and sensitivities when the spread factor is 0.04 and the k value is 1.

B. Statistical analysis of dominant features
One-way ANOVA was performed on all the dominant features to identify the features that are statistically distinguishable. The mean in the 0.15-0.5 Hz band of the ECG differs significantly between the normal and stress states: F(1, 79) = 28.62, p<0.001, with normal (M=0.072, SD=0.016) and stress (M=0.050, SD=0.021). Similarly, the normalised second-order cumulant of the HRV feature differs significantly between normal and stress: F(1, 79) = 112.51, p<0.001, with normal state (M=0.086, SD=0.011) and stress state (M=0.05, SD=0.018). The normalised third-order cumulant of the HRV feature between normal and stress is also statistically significant: F( [...] Only the mean value and standard deviation are reported for the EMG, GSR, and ST features, because the probability of those features was p<0.05: EMG in normal (M=-0.053, SD=0.023) and stress (M=-0.060, SD=0.026) conditions; GSR in normal (M=0.060, SD=0.0146) and stress (M=-0.065, SD=0.145) conditions; ST in normal (M=-0.062, SD=0.005) and stress (M=0.064, SD=0.0053) conditions. Fig. 3 shows the elevations and drops of the dominant features in the normal and relaxed states over the 40 subjects in this study during the final segment of the task.
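The F statistic behind a one-way ANOVA of this kind can be sketched as the ratio of between-group to within-group mean squares. The example below is a minimal illustration with hypothetical feature values, not the data of this study; the function name `one_way_anova_f` is an assumption, and the p-value lookup from the F distribution is omitted.

```python
from statistics import fmean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values of one feature in the normal and stress states.
normal = [1.0, 2.0, 3.0]
stress = [2.0, 3.0, 4.0]
f_stat, df_between, df_within = one_way_anova_f(normal, stress)
```

A larger F value indicates that the difference between the state means is large relative to the spread within each state.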

C. Comparative analysis and discussion
Similar kinds of experimentation using classifiers have been carried out in the literature. In this investigation, our general aim is to identify stress. Therefore, we investigated several features instead of selecting optimum features. In [6], 29 statistical features from 4 physiological signals were estimated for car driver stress detection, including the mean of the EMG and the normalised mean and variance of respiration, heart rate, and skin conductivity. Video metric-based data sorting and classification produced 97.02% accuracy over all signals. However, these investigations discuss only the overall classification results rather than the individual signals and their feature performance. Later on, some studies began to investigate various features of single physiological signals, targeting classification instead of statistical analysis [13], [15]. Our present study involves a subject-independent analysis according to the protocol design. The efficient protocol, the suitable signal processing methodology, and the proposed as well as standard features are the main reasons for the improved accuracy. In existing investigations, the absence of long-term objectives, improper stimuli and signal selection, and the lack of efficient feature estimation and classification methods are the factors behind the poor results.

V. CONCLUSIONS
In this paper, we experimentally examined human stress computation using multiple physiological signals so as to locate the most dominant stress-relevant features in each signal. Dominant features were individually identified in the ECG, HRV, EMG, GSR, and ST signals. The maximum classification accuracies obtained were 93.75% and 91.67% in the basic HOS features of HRV. Similarly, GSR shows the lowest accuracy, 70.83%, compared to the other signals. The advanced signal processing methodology and the proposed HOS features in the HRV signal are the main reasons for the improved result. Similarly, the results for the EMG and GSR signals were improved. Notably, the overall result outperforms the subject-dependent studies. In future, a number of subjects with different age ranges will be considered in developing the stress identification system. Later on, it will be extended to a stress level classification system.

TABLE I.
FEATURES INVESTIGATED IN ALL THE SIGNALS.

TABLE II.
RESULTS OF MULTIPLE PHYSIOLOGICAL SIGNALS-BASED STRESS ASSESSMENT SYSTEM.