Deep Learning Framework with ECG Feature-Based Kernels for Heart Disease Classification

Abstract: Classifying heart disease with high accuracy can support physicians in making correct decisions for patients. This paper proposes a kernel size calculation based on the P, Q, R, and S waves of one heartbeat to enhance classification accuracy in a deep learning framework. In addition, Electrocardiogram (ECG) signals were filtered using a wavelet transform with the dmey wavelet, whose shape is close to that of one heartbeat. After filtering, each heartbeat was standardized to 300 samples for the calculation of kernel sizes, so that it contains most of the features of the heartbeat. With 103,459 heart rhythms from the MIT-BIH Arrhythmia Database, the proposed kernel size calculation is effective with seven convolutional layers followed by fully connected layers in a Deep Neural Network (DNN). In particular, for five types of heart disease, the classification accuracy is about 99.4 %. This indicates that the proposed kernel size calculation in the convolutional layers can achieve good classification performance and may be extended to classify other types of disease.


I. INTRODUCTION
Cardiovascular diseases (CVDs) are the leading cause of death worldwide. According to the World Health Organization (WHO), about 17.9 million people died from CVDs in 2016, representing about 31 % of all global deaths [1]. In heart disease diagnosis, Electrocardiogram (ECG) signals play an important role [2]. Therefore, accurately processed ECG signals can help physicians diagnose heart disease more accurately.
The difficulty is that ECG signals often contain different types of noise, such as Baseline Wander (BW), Power Line Interference (PLI), and artifacts [3]-[5], which can affect the evaluation of diagnostic results. For this reason, many methods for improving the quality of ECG signals have been proposed [6], [7]. In particular, a bandpass filter with a frequency range of 0.1 Hz-100 Hz was utilized to obtain the necessary information in the ECG signals [8]. Moreover, a Butterworth high-pass filter with a cut-off frequency of 0.5 Hz and a 12th-order Finite Impulse Response (FIR) filter with a cut-off frequency of 35 Hz at the 3 dB point were applied to remove the baseline wander and unwanted high-frequency noise in ECG signals [9]. As a result, one classifier reached a maximum accuracy of 98.6 % with the filtered ECG signals. In addition to these methods, the Wavelet Transform (WT) algorithm has been applied for filtering noise in ECG signals.
Manuscript received 10 September, 2020; accepted 5 December, 2020.
The WT has been utilized to eliminate noise in ECG signals [10], [11], such as BW noise, PLI noise, or random noise. In practice, the WT with a Daubechies wavelet function was applied for BW removal in ECG signals in order to classify heart disease [12]. The results showed that the classifier was more accurate with the filtered ECG signals than with the original ECG signals, reaching an accuracy of about 94.03 % with the filtered signals. Another study used the Continuous Wavelet Transform (CWT) for extracting features from ECG signals. It applied the Daubechies 5 (db5) wavelet function to transform each heartbeat into a two-dimensional (2D) image of CWT patterns, which were used as features for the classification of heart disease [13].
In addition, different wavelet functions, such as Daubechies, symlet, coiflet, biorthogonal, and reverse biorthogonal, with decomposition at level eight, were applied to remove noise in ECG signals before using classifiers [14]. Among the classifiers tested with these wavelet functions, the neural classifier with the Daubechies wavelet achieved the highest accuracy of 100 %. However, that paper only shows classification for two types of heart disease, and the normal and abnormal classes in the testing dataset contained only 24 and 20 heartbeats, respectively. In practice, noise filtering using the WT method is applied for feature extraction from ECG signals in order to improve the efficiency of heart disease classifiers [15]-[17]. Clearly, different wavelet families have been employed for filtering noise or decomposing ECG signals to enhance classification accuracy. In our research, the dmey wavelet family is chosen because its shape is similar to that of one heartbeat. With this selection, the classification accuracy of the DNN structure is expected to increase.
Many classifiers based on Convolutional Neural Networks (CNNs) have been utilized for classifying heart diseases in recent years, and the kernel sizes chosen in their convolutional layers differ and affect classification performance. In particular, different feature maps (C1/C2: 12/6, 6/6, 6/12, 3/6, and 6/3) were convolved with kernels of different sizes (3×9, 3×22, 5×9, and 5×11) [18]; a CNN with many convolutional layers was applied, in which the authors set a kernel size of 5 for all layers [19]; and kernels with different lengths were used across convolutional layers, where the first convolutional layer used 20 kernels of the same size of 661, producing 20 different feature maps, and was connected to a second convolutional layer using 50 kernels of the same size of 440 [20].
In [21], Guo, Sim, and Matuszewski proposed a densely connected CNN and gated recurrent unit network model for tackling inter-patient ECG classification. In this system, four convolutional layers with a single kernel size of eight were applied to classify heart disease. To derive this kernel size, the ECG signals were sampled at 180 Hz, so that a kernel of eight data points spans about 44 ms in time. This means that a QRS complex of the shortest duration can be captured (e.g., the QRS complex duration is around 60 ms). As a result, supraventricular (S) and ventricular (V) arrhythmias were classified with accuracies of 93.61 % and 93.71 %, respectively. Clearly, CNNs differ in how kernel sizes are calculated and in how many convolutional kernels are used to increase classification performance. In our article, the kernel sizes are calculated from a heartbeat with 300 sampling points and are used in seven convolutional layers of the DNN structure.
In recent years, Machine Learning (ML) algorithms have been widely applied in many fields and have achieved notable results [22]-[25]. For identifying heart disease from ECG signals, one such ML algorithm is the CNN, whose structure often consists of several convolutional layers and fully connected layers. The convolutional layers extract features from the ECG signals, and the fully connected layers use these extracted features to classify heart disease [26], [27]. In practice, a combination of convolutional layers and Long Short-Term Memory layers was employed to extract ECG features, which were then used as the input of the fully connected layers to identify five types of heart disease using the MIT-BIH Arrhythmia Database (AD) [28]. This classifier achieved an accuracy of 98.10 %. Furthermore, empirical mode decomposition for removing noise and a Faster Regions with a Convolutional Neural Network (Faster R-CNN) [29] were combined to identify five types of heart disease from the MIT-BIH AD ECG database [11]. Heartbeats segmented from the ECG signals were fed into the Faster R-CNN for feature extraction and classification. The accuracy of this heart disease classifier was 99.21 %.
For the study of heart disease classification, clean ECG databases play an important role because they lead to higher classification performance. In recent years, many heart disease databases have been published online for research purposes and have attracted researchers around the world [30]-[32]. From the MIT-BIH AD ECG database [33], only five types of heart disease can be collected according to the American Heart Association (AHA) standard, and these classes are imbalanced for classification algorithms. For instance, there are 90431 heartbeats for the Normal Beat (N) class, while the Fusion of Ventricular and Normal Beat (F) class has only 802 heartbeats. As a result, the classification performance often drops due to this imbalance. Similarly, in the MIT-BIH QT database [34], the N and F categories contain 81097 and 251 heartbeats, respectively. To tackle this problem, oversampling and undersampling methods were applied [35]. However, the underlying problem is that rebalancing the data can distort the real-world class distribution. Therefore, identifying diseases under such imbalance remains a big challenge. To improve the identification of these heart diseases, a deep learning framework with kernel size calculation is implemented in our article to produce higher classification performance.
Given the effect of noise in ECG signals and the imbalance among different diseases in the database, selecting an appropriate DNN with a suitable kernel calculation algorithm in the convolutional layers is very important. In DNN structures [12], [18], [36], one of the most important tasks is to select the kernel sizes in the convolutional layers and the number of convolutional layers for calculating the main features. In [20], the authors proposed an automatic segmentation of heart sound signals in a DNN, in which feature engineering tasks were not required. This method detects heartbeats in a fully automatic manner with an accuracy of 79.95 %.
In our article, the goal is to calculate the kernel sizes in the convolutional layers of the DNN in order to handle unbalanced datasets and increase classification accuracy. The main contributions of this article are the kernel size calculation based on the P, Q, R, and S waves of one heartbeat with 300 sampling points; the suitable selection of the dmey wavelet function; and the use of seven convolutional layers in the DNN for heart disease classification. The objectives of this paper can be summarized as follows: 1) selecting the most suitable wavelet function and wavelet analysis level for eliminating noise in the ECG signals; 2) standardizing the length of the heartbeat; 3) proposing an algorithm for calculating kernel sizes in the convolutional layers, combined with fully connected layers, to obtain the best feature extraction for heart disease classification with higher performance in the DNN structure; 4) evaluating classification performance.

II. METHODOLOGY
To classify heart disease with high performance, ECG signals should be pre-processed and separated into heartbeats, which are standardized to a length suitable for kernel size calculation. The standardized heartbeat with its P, Q, R, and S waves is then used to build the kernels in the convolutional layers of a Deep Neural Network (DNN). As shown in Fig. 1, with the MIT-BIH AD ECG database, the heart disease classification method includes the following steps: (a) pre-processing the ECG signals using a Wavelet Decomposition (WD) algorithm and a Wavelet Reconstruction (WR) algorithm with thresholds for eliminating BW noise and unwanted high frequencies, (b) standardizing the reconstructed ECG signals for calculating kernel sizes in the convolutional layers, and then developing a classifier based on the DNN structure, (c) evaluating the classification method using a confusion matrix.

A. ECG Datasets
In this article, the MIT-BIH AD ECG database was employed in the heart disease classification method [33]. The heartbeat types in this database are mapped into five groups according to the Association for the Advancement of Medical Instrumentation (AAMI) 2012 standard: Normal Beat (N), Supraventricular Ectopic Beat (S or SVEB), Ventricular Ectopic Beat (V or VEB), Fusion of the V and N Beats (F), and Unknown Beat Type (Q), as shown in Table I.
The database contains 48 recordings from 47 different subjects (recordings 201 and 202 are from one subject). The ECG signals were sampled at 360 Hz with 11-bit resolution. Four recordings (102, 104, 107, and 217) contain only paced beats and were removed [7]. Therefore, the ECG database used in this classification method has 44 recordings, as shown in Table II, while Table III describes the number of heartbeats in each of the five heart disease groups in the MIT-BIH AD ECG database.
The division of the heartbeat database into training and testing data is extremely important because it directly affects the measured accuracy of the DNN classifier. The classifier accuracy can be very high when the training and testing heartbeat datasets are obtained from the same ECG dataset [37]. To evaluate the performance of the heart disease classifier, we divided the ECG dataset into training and testing data with different percentages. In particular, the training/testing ratio ranged from 10 %/90 % to 90 %/10 %, as shown in Table IV, and the classification performance of each division is reported in Table VIII.

B. Pre-Processing ECG Signals
ECG signals usually include both information and noise, and the noise can affect the classification of heart disease. Therefore, eliminating the noise in the ECG signals is essential for improving classification performance.
In this paper, the WD and WR algorithms combined with soft and hard thresholds were applied as described in Fig. 2. In particular, the WD algorithm decomposes the ECG signals into approximation and detail coefficients through low-pass and high-pass filters followed by downsampling by two. That is, the approximation coefficients are obtained after the signal passes through the low-pass filter, and the detail coefficients are obtained using the high-pass filter. Let $x[n]$ be the ECG signal and $m$ the level of the wavelet decomposition. The approximation coefficients $a_m$ and detail coefficients $d_m$ are expressed as follows [28]:

$$a_m[n] = \sum_{k} h[2n - k]\, a_{m-1}[k],$$

$$d_m[n] = \sum_{k} g[2n - k]\, a_{m-1}[k],$$

in which $a_0[n] = x[n]$, $m$ is the level of the WD algorithm, and $g[2n - k]$ and $h[2n - k]$ denote the high-pass and low-pass filters, respectively.
The approximation and detail coefficients are used to eliminate noise in ECG signals based on their frequencies and thresholds. In particular, the ECG signal is sampled at 360 Hz, so by Nyquist's theorem its maximum frequency is 180 Hz. The frequency bands of the approximation and detail coefficients at each decomposition level are described in [38], [39]. For the approximation component $a_m$, which has very low frequency, a hard threshold $\lambda_a$ is applied at decomposition level $m$ to eliminate the noise using the following formula:

$$\hat{a}_m[n] = \begin{cases} a_m[n], & |a_m[n]| > \lambda_a \\ 0, & |a_m[n]| \le \lambda_a \end{cases}$$

The threshold $\lambda_a$ is calculated as the maximum value of the approximation coefficients obtained during the WD:

$$\lambda_a = \max_{n} |a_m[n]|.$$

Similarly, the detail coefficients $d_m$ are used to remove the unwanted high-frequency components with the soft threshold $\lambda_d$. In particular, the thresholded detail components are determined using the sign function:

$$\hat{d}_m[n] = \begin{cases} \operatorname{sign}(d_m[n])\,\bigl(|d_m[n]| - \lambda_d\bigr), & |d_m[n]| \ge \lambda_d \\ 0, & |d_m[n]| < \lambda_d \end{cases}$$

The soft threshold $\lambda_d$ for removing the unwanted components is determined using the following expressions [40]:

$$\lambda_d = \sigma \sqrt{2 \ln N}, \qquad \sigma = \frac{\operatorname{median}(|d_m|)}{0.6745},$$

where $N$ is the number of data points of the detail coefficients and $\sigma$ represents the estimated noise of the detail coefficients obtained with the median function.
In the WR algorithm, the thresholded approximation coefficients $\hat{a}_m$ pass through the low-pass filter and the thresholded detail coefficients $\hat{d}_m$ pass through the high-pass filter, with up-sampling by two. Therefore, the ECG signal is reconstructed level by level using the following equation:

$$\hat{a}_{m-1}[k] = \sum_{n} \bigl( h[2n - k]\, \hat{a}_m[n] + g[2n - k]\, \hat{d}_m[n] \bigr),$$

with the reconstructed signal given by $\hat{x}[n] = \hat{a}_0[n]$.
The reconstructed ECG (R-ECG) signal $\hat{x}[n]$ is then segmented into heartbeats for calculating kernel sizes in the convolutional layers. In this paper, the Pan-Tompkins algorithm is employed to determine the R_peak position of each heartbeat, and the heartbeats are segmented based on this R_peak information. Therefore, the location of the R_peak in one heartbeat is very important for identifying heart disease using a DNN classifier.
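The hard and soft thresholding steps described above can be illustrated with a minimal pure-Python sketch. It assumes the universal-threshold form (sigma times the square root of 2 ln N, with sigma estimated from the median absolute coefficient divided by 0.6745); the function names and the coefficient values are hypothetical and only serve as an illustration. The wavelet decomposition and reconstruction themselves are not shown here.

```python
import math
from statistics import median


def hard_threshold(coeffs, lam):
    """Keep coefficients whose magnitude exceeds lam; set the rest to zero."""
    return [c if abs(c) > lam else 0.0 for c in coeffs]


def soft_threshold(coeffs, lam):
    """Shrink every coefficient toward zero by lam, preserving its sign."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]


def universal_threshold(coeffs):
    """Assumed form: lam = sigma * sqrt(2 ln N), sigma = median(|d|) / 0.6745."""
    sigma = median(abs(c) for c in coeffs) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(len(coeffs)))


# Hypothetical coefficient values, for illustration only.
a8 = [0.8, -0.5, 0.3, 0.9]
d1 = [0.02, -0.01, 0.5, -0.03, 0.01, -0.6, 0.02, 0.0]

# With the threshold set to the largest magnitude, every approximation
# coefficient is zeroed, i.e. the baseline component is removed entirely.
a8_clean = hard_threshold(a8, max(abs(c) for c in a8))

# Small detail coefficients vanish; large ones are shrunk by the threshold.
d1_clean = soft_threshold(d1, universal_threshold(d1))
```

In this sketch the strict comparison in `hard_threshold` is what makes the approximation band vanish when the threshold equals its maximum magnitude; a non-strict comparison would leave the largest coefficient in place.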
In an ECG signal, the length of each heartbeat can differ, and this can affect the classification performance, so the obtained accuracy can vary [17], [22], [30]. According to statistics, the longest heartbeat in an ECG signal can last 1.2 seconds. Therefore, if the ECG signal is sampled at 360 Hz, the longest heartbeat contains 360 × 1.2 = 432 samples. This means that the R-ECG signal $\hat{x}[n]$ can be segmented to obtain each heartbeat x[n] with a maximum length of 432 samples.

C. Deep Neural Network Structure
In this paper, standardized heartbeats are fed into a DNN structure. In particular, the convolutional layers in the DNN structure extract features of the standardized heartbeats using kernel sizes designed for this purpose. Moreover, for classifying different types of heart disease, a Multilayer Perceptron (MLP) with fully connected layers is added to the DNN structure (see Fig. 3), in which the number of input neurons corresponds to the number of samples in one heartbeat x[n] and the size of the output layer corresponds to the number of heart disease types. In the hidden layers, the kernel sizes and the number of nodes need to be adjusted to obtain the best classification performance. Furthermore, with the kernel size calculation proposed in this paper, seven convolutional layers were chosen to extract the important features in the P, Q, R, and S waves of the heartbeat signal: the first three convolutional layers extract features in the QRS complex, and the remaining convolutional layers extract features in the individual P, Q, R, and S waves.
In the MLP, $b_j^m$ is the bias value of neuron $j$, and $w_{jk}^l$ denotes the weight between neuron $k$ in CNN layer $l$ and neuron $j$ in MLP layer $m$. The activation function of the hidden (fully connected) layers is tansig, and the activation function of the output layer is softmax. In (12), $N$ is the total number of neurons in the output layer, corresponding to the number of heart disease types. Moreover, the weights $w_{ki}^l$ and biases $b_{ki}^l$ in the DNN structure are updated based on the derivative of the cost function and the previous weights, with the learning rate $\eta$ and the momentum $\alpha$, as follows:

$$w_{ki}^l(t+1) = w_{ki}^l(t) - \eta \frac{\partial E}{\partial w_{ki}^l(t)} + \alpha \bigl[ w_{ki}^l(t) - w_{ki}^l(t-1) \bigr],$$

$$b_{ki}^l(t+1) = b_{ki}^l(t) - \eta \frac{\partial E}{\partial b_{ki}^l(t)} + \alpha \bigl[ b_{ki}^l(t) - b_{ki}^l(t-1) \bigr],$$

where $E$ denotes the cost function, and $w_{ki}^l(t+1)$ and $b_{ki}^l(t+1)$ are the weights and biases of the DNN structure at iteration $t + 1$, respectively. Likewise, $w_{ki}^l(t)$ and $b_{ki}^l(t)$ are the weights and biases at iteration $t$, and $w_{ki}^l(t-1)$ and $b_{ki}^l(t-1)$ are the weights and biases at iteration $t - 1$. The momentum $\alpha$ is added to help the network pass local minima and approach the global minimum.
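As an illustration of an update that uses the learning rate and the momentum together with the weights at iterations t and t - 1, the following minimal sketch applies the common form w(t+1) = w(t) - eta * dE/dw + alpha * (w(t) - w(t-1)) to a single scalar weight. This form, the toy quadratic cost, and the values of `eta` and `alpha` are assumptions chosen for illustration; they are not the training settings of the paper.

```python
def momentum_step(w_t, w_prev, grad, eta, alpha):
    """Return w(t+1) = w(t) - eta * grad + alpha * (w(t) - w(t-1))."""
    return w_t - eta * grad + alpha * (w_t - w_prev)


def minimise(grad_fn, w0, eta=0.1, alpha=0.5, steps=200):
    """Iterate the momentum update on one scalar weight, starting at w0."""
    w_prev, w_t = w0, w0
    for _ in range(steps):
        w_next = momentum_step(w_t, w_prev, grad_fn(w_t), eta, alpha)
        w_prev, w_t = w_t, w_next
    return w_t


# Toy cost E(w) = (w - 3)^2, whose derivative is 2 * (w - 3).
w_final = minimise(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

With these hypothetical settings the iterate settles at the minimiser of the toy cost; on a real network the same rule is applied independently to every weight and bias.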

D. Calculation of Kernel Size
Calculation of the kernel size is an important task in a deep learning framework with convolutional layers. In the DNN structure, the kernel sizes of the convolutional layers are calculated based on the P, Q, R, and S waves to produce features. ECG signals are periodic, and their characteristics are usually determined from the shape of the heartbeat signal, such as the QRS complex and the P_wave, as shown in Fig. 5. Therefore, a kernel size is determined based on the duration of the QRS complex, P_wave, and T_wave of one standardized heartbeat, whose durations in the time domain are described in detail in Table V [42].

In this article, the heartbeat x[n] is segmented from one R-ECG signal $\hat{x}[n]$, and then the convolutional kernel size is calculated from the durations of the P, Q, R, and S waves of the heartbeat x[n] shown in Table V, according to (15), in which $L_k$ is the kernel size, with k = 1, 2, …, 8, and F denotes the sampling frequency of the ECG signals. D is the duration of the P, Q, R, and S waves and is chosen for calculating the kernel size depending on the order of the convolutional layer in the DNN. The size of each convolutional kernel is calculated on this basis.
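One plausible reading of the relation between the kernel size, the sampling frequency F, and a wave duration D is that a kernel spans the number of samples covered by that duration. The sketch below assumes this relation (kernel size = round(F × D)); the function name and the example durations are hypothetical placeholders and are not the values of Table V.

```python
def kernel_size(duration_s, fs_hz):
    """Assumed relation: samples spanned by a wave lasting duration_s seconds."""
    return max(1, round(duration_s * fs_hz))


FS_HZ = 360  # sampling frequency of the MIT-BIH AD recordings

# Hypothetical durations in seconds; placeholders, not the Table V values.
example_durations = {"QRS complex": 0.10, "P wave": 0.08, "T wave": 0.16}

kernel_sizes = {
    name: kernel_size(duration, FS_HZ)
    for name, duration in example_durations.items()
}
```

Under the same assumption, `kernel_size(0.044, 180)` evaluates to 8, which agrees with the eight-point kernel spanning about 44 ms at 180 Hz described for [21] in the Introduction.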

E. Evaluation of Classification System
In this research, the performance of the DNN classifier for heart disease needs to be evaluated. Therefore, a multi-class confusion matrix is utilized, which is a square matrix of dimension C corresponding to the number of output nodes in the output layer. In particular, the overall accuracy (ACC), the overall sensitivity (SEN), and the overall positive predictive value (PPV) are calculated from the confusion matrix [32], as described in Table VI. The ACC, SEN, and PPV are defined from the entries of this matrix. High values of ACC, SEN, and PPV mean that the classification system produces high performance.
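To make the evaluation concrete, the following sketch computes overall ACC, SEN, and PPV from a C × C confusion matrix. It assumes one-vs-rest counts per class with rows as true labels and columns as predicted labels, averaged over the classes; this averaging convention, the function names, and the example matrix are assumptions for illustration and may differ from the definitions in Table VI.

```python
def per_class_counts(cm, i):
    """One-vs-rest TP, FP, FN, TN for class i (rows: true, columns: predicted)."""
    total = sum(sum(row) for row in cm)
    tp = cm[i][i]
    fn = sum(cm[i]) - tp
    fp = sum(row[i] for row in cm) - tp
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def overall_metrics(cm):
    """Class-averaged accuracy, sensitivity, and positive predictive value."""
    n = len(cm)
    acc = sen = ppv = 0.0
    for i in range(n):
        tp, fp, fn, tn = per_class_counts(cm, i)
        acc += (tp + tn) / (tp + tn + fp + fn)
        sen += tp / (tp + fn) if tp + fn else 0.0
        ppv += tp / (tp + fp) if tp + fp else 0.0
    return acc / n, sen / n, ppv / n


# Hypothetical 3-class confusion matrix, for illustration only.
cm = [[50, 2, 0],
      [3, 40, 1],
      [0, 1, 20]]
acc, sen, ppv = overall_metrics(cm)
```

With one-vs-rest counting, the true negatives of the large classes raise the per-class accuracy, so the averaged ACC is typically higher than the averaged SEN and PPV on imbalanced data.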

III. RESULTS
For the deep learning framework for heart disease classification, experimental results related to pre-processing, classification, and evaluation are presented. In particular, the ECG signals were pre-processed using the wavelet transform to eliminate noise, and then the length of each heartbeat was standardized around the R peak of the QRS complex of the filtered ECG signals. After pre-processing, the heartbeat data were separated into two datasets with different percentages for training and testing the classifier. The convolutional layers of the DNN structure were configured with the calculated kernel sizes, and the classification performance was evaluated using a confusion matrix.

A. Noise Elimination and Standardization of Heartbeat Length
The results of ECG signal decomposition and noise elimination by applying the hard and soft thresholds are shown in Fig. 6. In particular, the hard threshold was applied to the approximation $a_8$ to eliminate the BW noise, and the soft threshold was utilized to reduce the unwanted high-frequency content in the detail components $d_1$, $d_2$, and $d_3$. Fig. 6(a) shows the waveforms of the approximation and detail coefficients obtained by applying the WD algorithm at level eight with the dmey wavelet.
Fig. 6. Representation of the approximation and detail coefficients: (a) the coefficients obtained from an original ECG signal using the WD; (b) the coefficients obtained using the soft and hard thresholds.
In particular, the approximation coefficient $a_8$, which has the lowest frequency, is considered to be the BW noise and is completely eliminated by the hard threshold. The detail components $d_1$, $d_2$, and $d_3$, which have the highest frequencies, were largely filtered out by the soft threshold, as in Fig. 6(b).
The original ECG signal (blue) of the patient with code 222, collected from the MIT-BIH AD, and the reconstructed ECG signal (red), recovered from the filtered approximation and detail coefficients, are shown in Fig. 7. The red ECG signal is smoother between heartbeats, and this may result in better classification performance.
Heartbeats in a filtered ECG signal are located by the R_peak position of the QRS complex using the Pan-Tompkins algorithm [43] for calculating kernel sizes, as depicted in Fig. 8. In particular, the ECG signal shown with the blue line was filtered using the wavelet transform, and the red circle marks the detected R_peak position in one heartbeat. The filtered ECG signal was separated into heartbeats based on the R-positions. The length of each heartbeat (red) was standardized to 300 samples, including 150 samples before the R_peak, 149 samples after the R_peak, and 1 sample at the R_peak. In addition, the Pre_R and Pos_R intervals can differ depending on the ECG signal. For the two red heartbeats, although the R_peak of the second heartbeat is higher than that of the first one, this difference does not affect the classification performance.
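The 300-sample standardization around each R_peak (150 samples before, the peak itself, and 149 samples after) can be sketched as follows. The R-peak indices are taken as given, for example from a Pan-Tompkins detector that is not shown here; the function name, the rule of skipping beats near the record edges, and the synthetic signal are assumptions made for illustration.

```python
def segment_heartbeats(signal, r_peaks, pre=150, post=149):
    """Cut one fixed-length window per R peak: pre samples, the peak, post samples."""
    beats = []
    for r in r_peaks:
        start, stop = r - pre, r + post + 1
        if start < 0 or stop > len(signal):
            continue  # assumption: skip beats too close to the record edges
        beats.append(signal[start:stop])
    return beats


# Hypothetical signal and R-peak indices, for illustration only.
signal = [0.0] * 1000
r_peaks = [100, 400, 700, 950]
for r in r_peaks:
    signal[r] = 1.0

beats = segment_heartbeats(signal, r_peaks)
```

Every returned window has 150 + 1 + 149 = 300 samples with the R peak at index 150, so all heartbeats share the same alignment regardless of the original beat-to-beat interval.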

B. Classification Performance with the Calculated Kernel Sizes
Each R-ECG signal (in Table II) was segmented into heartbeats, and then all heartbeats of the R-ECG signals in the MIT-BIH AD dataset were divided into two groups for training and testing, as shown in Table IV. In this paper, the DNN structure with seven convolutional layers (in Fig. 3) was applied, in which seven kernel sizes were calculated based on the P, Q, R, S, and T waves. Only seven convolutional layers are proposed because the seven kernel sizes calculated from the waves of one heartbeat, from Layer-1 to Layer-7, capture almost all of the features of heart disease. Moreover, for the 8th and 9th layers, the kernel sizes are very small and do not noticeably affect the classification performance.
In addition to the kernel sizes, the activation functions of the convolutional layers and fully connected layers in the DNN are the ReLU function and the tansig function, respectively. In the output layer of the DNN model, the softmax function is employed to classify five types of heart disease. Moreover, the numbers of kernels in the seven convolutional layers are 32, 64, 32, 16, 16, 16, and 16, respectively; the max-pooling window is 4 samples in the first stage and 2 samples in the remaining ones. Detailed training parameters are described in Table VII.
From Table VIII, the DNN in the proposed deep learning framework achieved high classification performance. In particular, when about 50 % of the data was used for training the DNN, the accuracy of classifying five types of heart disease was more than 99 %. With 90 % of the dataset used for training, the PPV, SEN, and ACC are 98.14 %, 98.18 %, and 99.37 %, respectively.
Filtering noise and undesired frequencies from ECG signals using a suitable wavelet function can lead to higher classification performance. Therefore, in this paper, six wavelet functions (dmey, bior5.5, db4, sym1, bior1.3, and db1) were applied to determine the most suitable one. The classification performance based on three parameters (PPV, SEN, and ACC) is shown in Table IX. In particular, the first three wavelet functions (dmey, bior5.5, and db4), called "Group-1", have waveform shapes similar to that of a heartbeat, while the remaining wavelets (sym1, bior1.3, and db1), called "Group-2", differ from that of the heartbeat. Based on the three parameters, the Group-1 wavelet functions produce better classification performance.

C. Classification Performance with Different Heartbeat Lengths
The heartbeat length should be considered in relation to the classification performance because the features are contained in the P, Q, R, and S waves. Table X presents the heart disease classification performance for different heartbeat lengths reported in previous research. Among these studies, the classification accuracy with a heartbeat length of 300 ms is very high [11]. However, high accuracy depends not only on the length, but also on the quality of the datasets, the pre-processing methods, the type of classification system, and other factors. In our article, each heartbeat was segmented to 300 samples, which was found to be the most suitable length and produces the best classification performance. To find the suitable length of a heartbeat, six different heartbeat lengths were tested. The kernel sizes were calculated for each of the six lengths and used in the same DNN to evaluate the classification accuracy, as shown in Table XI. In this experiment, with the evaluation based on PPV, SEN, and ACC, the minimum length of 143 samples has the lowest classification accuracy, and the maximum one with 432 samples produces the highest performance. Therefore, the heartbeat length of 300 samples selected is optimal for calculating kernel sizes in the DNN to produce the highest classification accuracy.

D. Classification Performance with Different Convolutional Layers
In addition to applying the dmey wavelet and choosing the heartbeat length for the kernel sizes, the convolutional layers in the DNN were built from the kernel sizes in relation to the heartbeat features. Therefore, the number of convolutional layers chosen in the DNN may affect the classification performance. In this paper, we performed experiments with different numbers of convolutional layers (from one to nine) to evaluate the classification performance, as shown in Fig. 9. No more than nine convolutional layers were tested because the classification performance had already decreased.
Fig. 9 shows the classification performance of the DNN based on PPV, SEN, and ACC as the number of convolutional layers changes. In particular, the classification performance increases from one layer to seven layers, while the use of eight or nine convolutional layers produces lower values of PPV, SEN, and ACC. Accordingly, the DNN structure with seven convolutional layers was chosen for classifying heart diseases, matching the seven proposed kernel sizes based on the standardized heartbeat with 300 sampling points. In addition, each convolutional layer was combined with a ReLU layer and a max-pooling layer for feature extraction during training and classification. With the obtained experimental results, the DNN structure with seven convolutional layers in the deep learning framework produced high classification performance. This illustrates the effectiveness of processing the ECG signals, standardizing the heartbeats, calculating the kernel sizes, and the chosen DNN structure.
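As a rough worked example of how the pooling stages shrink a 300-sample heartbeat, the sketch below tracks the feature-map length through seven max-pooling stages with a window of 4 samples in the first stage and 2 samples in the others. It assumes length-preserving ("same"-padded) convolutions and non-overlapping pooling with the stride equal to the window size; the padding and stride settings of the actual network are not stated in this section, so the numbers are illustrative only.

```python
def lengths_after_pooling(input_len, pool_sizes):
    """Length after each max-pooling stage, assuming 'same'-padded convolutions
    and non-overlapping pooling (stride equal to the window size)."""
    lengths = []
    n = input_len
    for p in pool_sizes:
        n = n // p  # incomplete trailing windows are dropped
        lengths.append(n)
    return lengths


pools = [4] + [2] * 6  # seven stages: window of 4, then six windows of 2
lengths = lengths_after_pooling(300, pools)
# lengths is [75, 37, 18, 9, 4, 2, 1]
```

Under these assumed settings, a 300-sample heartbeat is reduced to a single value per feature map after the seventh stage, which is one way to see why further pooled stages would have little temporal content left to work with.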

IV. DISCUSSION
The results achieved in this research show that the DNN framework with convolutional layers is highly effective for heart disease classification. In particular, the dmey wavelet function, whose shape resembles a heartbeat, was chosen to filter the ECG signals before segmenting them into heartbeats. The heartbeat length was standardized to 300 samples for calculating seven different kernel sizes applied to seven convolutional layers in the DNN. The heart disease features calculated in the seven convolutional layers went through the dropout layer to eliminate unnecessary elements before reaching the flatten layer, which converts 2D data to 1D.
With the proposed deep learning framework for classifying five types of heart disease, high classification performance is very important. Ji, Zhang, and Xiao [11] used a combination of the MIT-BIH database and ECG data, and proposed an empirical mode decomposition method for pre-processing the ECG signals to obtain smooth ECG signals. The authors also proposed a Faster R-CNN structure for classifying five heart diseases, and the accuracy was 99.21 %. In addition, Oh, Ng, Tan, and Acharya [48] proposed a CNN-Long Short-Term Memory (CNN-LSTM) model for extracting features of heart disease from the MIT-BIH ECG database. In that research, each heartbeat in the ECG signals was segmented to a length of 260 samples and then standardized using a z-score method before being applied as the input of the classification model. The heart disease features extracted by the CNN-LSTM model were applied to an MLP with fully connected layers, and the resulting accuracy was 98.10 %.
In addition, Xia, Wulan, Wang, and Zhang [36] proposed the combination of the Stationary Wavelet Transform (SWT) and the DCNN in a classifier for heart diseases. In particular, a 10th-order elliptical band-pass filter with a passband of 0.5 Hz-50 Hz was utilized to remove BW, PLI, and muscle noise to obtain the filtered ECG signals. This signal was then decomposed using an SWT algorithm at level 6 to obtain six detail components and six approximation components. All these components were used as a 2D data input of the DCNN, which includes two convolutional layers, two max-pooling layers, two ReLU layers, one dropout layer, two fully connected layers, and one softmax layer. Tested on Atrial Fibrillation (AF) and non-AF data from the MIT-BIH database, the model produced a classification accuracy of 98.63 %.
In our study, we proposed a way to calculate the kernel sizes in the convolutional layers of the DNN so as to retain most of the features in heartbeats. Furthermore, we selected the dmey wavelet function, which has a heartbeat-like shape, for the best filtering efficiency compared with other wavelet functions, as shown in Table IX. After determining the kernel sizes, we determined the number of convolutional layers corresponding to them. The results in Fig. 9 led to the selection of seven convolutional layers, which gave the best classification accuracy in the DNN classifier.
In the previous studies [11], [28], the authors used the Daubechies wavelet function for filtering ECG signals. In our study, the dmey wavelet function was chosen because its shape is close to that of a heartbeat. Furthermore, the kernel sizes used in different CNNs are described in Table XII. The common point of [11], [36], [48] is that the kernel sizes were chosen to suit each particular classification system. In our study, the kernel sizes were determined based on a heartbeat length of 300 samples. From this calculation, the most suitable number of convolutional layers in the DNN is seven, and the resulting classification accuracy of our system for five types of heart disease is 99.37 %.

V. CONCLUSIONS
In this paper, the proposed kernel size calculation was applied to the convolutional layers in the deep learning framework to increase performance in classifying five types of heart disease. In addition, the dmey wavelet was chosen for filtering so that most features of each heartbeat, standardized to 300 samples, could be extracted. The results have shown that the classifier using the proposed kernel sizes achieves high accuracy in the DNN structure with seven convolutional layers and three fully connected ones. In particular, the classification of five types of heart disease in the DNN is highly effective, with an accuracy of 99.37 %, a sensitivity of 98.28 %, and a positive predictive value of 98.14 %. This means that the proposed kernel size calculation for convolutional layers can be applied to the development of other classifiers.