Automatic classification of cardiac beats using a CNN with kernels of various sizes

ABSTRACT


INTRODUCTION
Cardiovascular diseases (CVDs) have the highest mortality rates among dangerous illnesses [1], so considerable emphasis has been placed on heart health. The electrical impulse that stimulates the heart muscle normally originates in the sinoatrial node. Arrhythmia is a clinical cardiac condition in which the electrical impulse has an abnormal source or its conduction is irregular. Although not all arrhythmias are life threatening, a small number of them may cause sudden death [2]. Because it captures variations in cardiac electrical potential, the electrocardiogram (ECG) is one of the most important and widely used instruments for detecting cardiac arrhythmias. However, because of subtle, noisy waveforms and medical inter-relations, visual diagnosis can be incorrect. Computer-aided diagnosis (CAD) was developed, and is now widely used, to address the constraints of visual diagnosis. A large number of machine learning and deep learning techniques have been applied to the automated classification of heartbeats. Many research papers have used methods such as linear discriminant analysis (LDA) [3], support vector machines (SVM) [4], and artificial neural networks (ANN) to diagnose cardiac arrhythmias. In such traditional systems, feature extraction and selection are regarded as among the most important steps prior to classification. These approaches are prone to overfitting because they depend on hand-designed feature extractors applied to the raw signal after denoising and baseline wander removal [5]. As deep learning has matured, the difficulty of building networks has decreased, resulting in improved performance and other benefits [6]. Deep learning learns intrinsic properties of the data and builds them up through the hidden layers of successive neurons [7].
As a result, the inherent information is acquired without building a dedicated feature extraction procedure. Several ECG arrhythmia studies have used entirely deep learning-based techniques [8]. A deep neural network (DNN) with seven layers has been used to identify abnormal heartbeats [9]. Another design uses a patient-specific classifier based on a convolutional neural network (CNN) to classify heartbeats [10]. A recurrent neural network (RNN) has been used to identify sleep apnea [11]. The convolutional neural network is one of the most widely used deep learning structures. In the many CNN architectures proposed for heartbeat classification, a single convolution layer typically contains kernels of one size. Using kernels of varying sizes can increase feature variety.
This paper proposes an improved CNN approach for automated heartbeat classification. The ECG data is first preprocessed to obtain the noise-free ECG signal components, from which the heartbeats are extracted. The heartbeats are then fed directly into the CNN without feature extraction. In the proposed CNN, kernels of different sizes are used within a single convolution layer. In the pooling layers, max pooling is applied to every set of feature maps. The resulting feature maps are concatenated before being fed into the fully connected layers. The last fully connected layer, with a softmax function, outputs the results. The classification performance of the improved CNN is evaluated in compliance with the Association for the Advancement of Medical Instrumentation (AAMI) standard. A comparison experiment with a conventional CNN structure is conducted to determine the efficacy of the proposed design. The flowchart of the proposed CNN model is shown in Figure 1. The grouping scheme defined by the AAMI standard is shown in Table 1. The rest of the paper is organised as follows. The second part presents the data used in this study and explains the proposed approach in full. The third part covers the experiments and the analysis of the results. Finally, the fourth part summarises the findings and discussion, and gives the conclusion and the scope of future work. Figure 1 shows an overview of the proposed approach. In the training phase, after signal preprocessing, the ECG data are used to train the CNN classifier. ECG data obtained from the database is also used in the testing procedure.
In use, acquired ECG data is preprocessed and classified by the trained model, which supports the formation of clinical opinions. This research focuses on the training and testing of the proposed model.

PROPOSED METHOD
This research uses the MIT-BIH arrhythmia database from PhysioNet [12]. The data was collected between 1975 and 1979 by the BIH arrhythmia laboratory. In total, it contains about 109,000 annotated heartbeats. The data comes from forty-seven subjects in forty-eight recordings, each half an hour long. Each ECG recording contains two leads sampled at 360 Hz with eleven-bit resolution over a 10 mV range. The MIT-BIH arrhythmia database has fifteen different types of heartbeats [13]. The AAMI standard recommends grouping all heartbeats into five classes based on their physiological origin. The grouping of all heartbeats, after removing paced heartbeats, is given in Table 1. This research uses the signal from the modified limb lead II (MLII). The whole data is divided evenly into two sets, one for training and one for testing. The number of heartbeats in the two sets is the same for each class.
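The even, per-class division into a training set and a test set can be illustrated with a minimal sketch. The function name, the toy label list, the random shuffling, and the choice to drop one beat when a class has an odd count are all assumptions made for illustration; the paper does not state how the division was carried out.

```python
import random
from collections import defaultdict

def split_evenly_per_class(labels, seed=0):
    """Return (train_idx, test_idx) with each class split into two equal halves."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        half = len(indices) // 2
        train_idx.extend(indices[:half])
        # Assumption: if a class has an odd count, the leftover beat is dropped
        # so that both sets hold the same number of beats per class.
        test_idx.extend(indices[half:2 * half])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical toy labels (N, S, V, F), not counts from the database.
labels = ["N"] * 10 + ["S"] * 4 + ["V"] * 6 + ["F"] * 2
train_idx, test_idx = split_evenly_per_class(labels)
print(len(train_idx), len(test_idx))
```

Both index lists end up with the same number of beats for every class, which is the property the text describes.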
In the classic approach, the quality of the ECG waveform has the largest impact on the classification result. Powerline interference, baseline drift, motion artefacts, quantization noise, electrosurgical noise, and other types of noise may all affect the ECG signal. Both raw and filtered data are used to determine whether filtering is required before deep learning [14]-[20]. Dataset A contains the raw ECG data, while dataset B contains the filtered ECG data. In this work, the raw ECG waveform is denoised with the wavelet transform: the signal is decomposed into six levels using the Daubechies 6 (db6) wavelet. Baseline drift, at 0.1 Hz to 2.82 Hz, lies within the frequency range of the 6th level approximation sub-band. Frequency components above the maximum frequency of the 3rd level detail sub-band (47 Hz) are considered to carry little usable information. As a result, only the detail coefficients of the third to sixth sub-bands are kept. To segment the heartbeats, the heartbeat locations must be identified. This work does not describe a heartbeat detection approach, since many published methods, such as the Pan-Tompkins algorithm, give sufficiently accurate results. The annotations at the R-wave peaks in the MIT-BIH arrhythmia database are used directly as fiducial points [21]-[25].
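As a rough check on the sub-band frequencies quoted above, the nominal (idealized, dyadic) frequency range of each sub-band can be computed from the 360 Hz sampling rate. This is a sketch under the assumption of ideal half-band splitting at every level; real db6 filters overlap between bands, and the function name is hypothetical.

```python
def dwt_subbands(fs, levels):
    """Nominal frequency range in Hz of each DWT sub-band for sampling rate fs."""
    bands = {}
    for j in range(1, levels + 1):
        # Detail band at level j spans fs / 2^(j+1) to fs / 2^j.
        bands[f"D{j}"] = (fs / 2 ** (j + 1), fs / 2 ** j)
    # The final approximation band holds everything below the last detail band.
    bands[f"A{levels}"] = (0.0, fs / 2 ** (levels + 1))
    return bands

bands = dwt_subbands(fs=360, levels=6)
for name, (low, high) in bands.items():
    print(f"{name}: {low:.4f} to {high:.4f} Hz")
```

Under this idealization the 6th level approximation band ends at 2.8125 Hz, in line with the roughly 2.82 Hz upper limit given for baseline drift. The 3rd level detail band ends at 45 Hz, which is close to, but not identical with, the 47 Hz figure in the text.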

CONVOLUTIONAL NEURAL NETWORK
Convolutional neural networks are among the most important and efficient deep learning topologies. The CNN structure extends the multi-layer perceptron (MLP) in a way that imitates the human visual system. It is a feed-forward neural network with a deep structure built on convolution operations. CNNs perform well at implicit feature learning. Furthermore, the data is fed directly into the network, eliminating the need for separate processing and feature extraction. A CNN is built from three fundamental types of layers: convolution, pooling, and fully connected layers.
Convolutional layer: the convolutional layer is the core operational layer of a CNN. This layer learns features from the input samples. The convolution is computed between the input samples and the kernels. The convolution results are shifted by a bias, and a non-linear transformation is then applied. The convolution kernel mapping is shown in Figure 2. Figure 2(a) depicts a normal two-dimensional convolution kernel, whereas Figure 2(b) depicts a typical spatially partitioned two-dimensional convolution kernel. The kernel slides over the input samples and is convolved with each sample sub-region, and the resulting values are stored at the corresponding positions. A convolution layer usually uses several kernels of the same size.
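The sliding-kernel operation described above can be sketched for a one-dimensional signal as follows. The signal and kernel values are made up for illustration, no padding is applied, and, as in most CNN frameworks, the kernel is not flipped. The leaky rectifier slope of 0.01 is an assumption.

```python
def conv1d_valid(signal, kernel, bias=0.0, stride=1):
    """Slide the kernel over the signal without padding and return the feature map."""
    k = len(kernel)
    out = []
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        # Weighted sum of the window, shifted by the bias.
        out.append(sum(w * x for w, x in zip(kernel, window)) + bias)
    return out

def leaky_relu(values, slope=0.01):
    """Non-linear transformation: negative values are scaled by a small slope."""
    return [v if v > 0 else slope * v for v in values]

signal = [0.0, 1.0, 3.0, 1.0, 0.0, -1.0]  # hypothetical samples
kernel = [1.0, 0.0, -1.0]                 # hypothetical kernel of size 3
raw_map = conv1d_valid(signal, kernel)
feature_map = leaky_relu(raw_map)
print(raw_map)  # [-3.0, 0.0, 3.0, 2.0]
```

With stride one and no padding, an input of length 6 and a kernel of size 3 give a feature map of length 6 - 3 + 1 = 4.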
Subsampling layer: the pooling layer is also called a subsampling layer. It reduces the data size by subsampling the input data along its dimensions. Furthermore, pooling makes the output approximately invariant to local linear transformations of the input sequence, which improves network generalisation. As shown in the example in Figure 3, the pooling layer divides its input into non-overlapping sub-regions and computes a representative value for each region.
Fully connected layer: after several layers of convolution and pooling, the network has performed implicit feature learning, and the data dimension has been reduced enough to be processed by a feed-forward network. The fully connected layer is structured like a traditional multi-layer perceptron.
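The division of the pooling input into non-overlapping regions, each summarised by one representative value, can be sketched with max pooling on a one-dimensional feature map. The input values are made up, and dropping a trailing partial window is an assumption of this sketch.

```python
def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

pooled = max_pool1d([1.0, 4.0, 2.0, 0.5, 3.0, 3.5, 9.0])
print(pooled)  # [4.0, 2.0, 3.5]
```

Seven input values become three outputs, one per complete window of size two, so the feature map length is roughly halved.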
Heartbeat classification using an improved convolutional neural network: because ECG data is a one-dimensional signal, the network architecture for ECG heartbeats differs considerably from the traditional CNNs used in many image processing studies. This study uses a one-dimensional CNN. As stated previously, a single convolutional layer of a classical CNN uses kernels of one size, so the input data is convolved using windows of the same size. Varying the window size improves feature variety. The CNN structure is therefore improved by using kernels of various sizes. The proposed CNN structure is shown in Figure 4. The network has seven layers in total: two convolution layers, two pooling layers, and three fully connected layers. Each convolutional layer employs kernels of four different sizes. Table 2 summarises the structure of the improved CNN. The first and third layers are the convolutional layers. The first convolution layer uses kernels of four sizes (8, 10, 12, and 14), with 8 kernels of each size. The second convolution layer uses kernel sizes of 6, 8, 10, and 12 and has 64 kernels in total. The stride of both convolution layers is one. After each convolution layer, a max pooling layer of size two is applied to the resulting feature maps. The pooling layers further reduce the output size. The outputs of the fourth layer, 3585 neurons in total, are concatenated and passed to the fifth layer. Because there are four classes, the last layer has four neurons.
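The idea of one convolution layer holding kernels of several sizes, followed by max pooling and concatenation, can be sketched as below. This covers a single multi-kernel layer only; the 100-sample beat length, the random untrained weights, and the function names are assumptions, and the sketch does not attempt to reproduce how the two convolution layers of the proposed network are wired together.

```python
import random

def conv1d_valid(signal, kernel):
    """Stride-one convolution without padding."""
    k = len(kernel)
    return [sum(w * x for w, x in zip(kernel, signal[i:i + k]))
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def multi_kernel_layer(signal, kernel_sizes, kernels_per_size, rng):
    """Convolve with kernels of several sizes, pool each map, then concatenate."""
    features = []
    for size in kernel_sizes:
        for _ in range(kernels_per_size):
            # Untrained random weights, used only to show the data flow.
            kernel = [rng.uniform(-1, 1) for _ in range(size)]
            features.extend(max_pool1d(conv1d_valid(signal, kernel)))
    return features

rng = random.Random(0)
beat = [rng.uniform(-1, 1) for _ in range(100)]  # hypothetical 100-sample beat
features = multi_kernel_layer(beat, kernel_sizes=(8, 10, 12, 14),
                              kernels_per_size=8, rng=rng)
print(len(features))  # 1424
```

Each kernel size yields feature maps of a different length (46, 45, 44, and 43 values after pooling for this input length), which is why the maps are concatenated into one vector rather than stacked.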
The three fully connected layers (5th, 6th, and 7th) have 2506, 322, and 4 neurons, respectively, as determined empirically. Leaky rectified linear units (ReLU) are used as the activation functions of the two convolution layers and the first two fully connected layers. The activation function of the seventh layer is softmax. Its 4 output neurons correspond to the normal, supraventricular, ventricular, and fusion classes, respectively. Cross-entropy is used as the loss function when training the improved CNN. Furthermore, class weights are included in the loss function to reduce the effect of class imbalance. The CNN is trained with a batch size of 64 and a learning rate of 0.01. Training runs for 50 iterations.
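A softmax output combined with a class-weighted cross-entropy loss can be sketched as follows. The paper does not say how the weights are chosen, so the inverse-frequency weighting here is an assumption, as are the class counts, the logits, and the function names.

```python
import math

def softmax(logits):
    """Convert raw outputs into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(logits, target, class_weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    probs = softmax(logits)
    return -class_weights[target] * math.log(probs[target])

def inverse_frequency_weights(counts):
    """Assumed weighting scheme: rarer classes receive larger weights."""
    total = sum(counts)
    return [total / (len(counts) * c) for c in counts]

counts = [900, 40, 50, 10]          # hypothetical beats per class (N, S, V, F)
weights = inverse_frequency_weights(counts)
logits = [2.0, 0.5, 0.1, -1.0]      # hypothetical network outputs for one beat
loss_if_normal = weighted_cross_entropy(logits, target=0, class_weights=weights)
loss_if_fusion = weighted_cross_entropy(logits, target=3, class_weights=weights)
print(loss_if_normal < loss_if_fusion)
```

With weights of this kind, a misclassified beat from a rare class contributes more to the loss than one from the dominant normal class, which is the intended counterweight to class imbalance.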

RESULTS AND DISCUSSIONS
The proposed CNN model is tested on the MIT-BIH arrhythmia database from PhysioNet. For comparison, a further experiment is run with a classic CNN in which each convolution layer has kernels of a single size. The total number of kernels is the same as in the improved CNN model, and the remaining parameters of the two networks are identical. These parameters are listed in Table 3. As stated previously, both raw and filtered data are used. Both the improved CNN and the classic CNN are applied to datasets A and B, giving four configurations in total, as shown in Table 4. The networks are trained on the training set, which contains 50% of the data, and then evaluated on all samples in the test set. In this study, classification performance for each class is evaluated by accuracy (Acc), sensitivity (Se), and positive predictivity (PPV), defined in (1), (2), and (3).
Acc = (TP + TN) / (TP + TN + FP + FN) (1)
Se = TP / (TP + FN) (2)
PPV = TP / (TP + FP) (3)
Here, TP stands for true positive, TN for true negative, FP for false positive, and FN for false negative. A confusion matrix is used to record the classification results. Table 5 shows the confusion matrix of the ECG heartbeats for the second configuration. The diagonal cells contain the numbers of correctly classified heartbeats. The majority of heartbeats are classified correctly. However, two notable types of misclassification occur: 288 S class heartbeats and 120 ventricular heartbeats are incorrectly classified as normal. Table 6 shows the accuracy, sensitivity, and positive predictivity for the four configurations. The accuracy values for configurations 1, 2, 3, and 4 are 97.90%, 98.67%, 97.43%, and 98.2%, respectively. The second configuration, which uses raw data and multiple kernel sizes, achieves the highest accuracy. Its sensitivities are 99.76%, 80.24%, 96.21%, and 77.82%, and its positive predictivities are 98.99%, 97.46%, 95.88%, and 94.02%, for the normal, supraventricular, ventricular, and fusion classes, respectively. Except for the positive predictivity (PPV) of fusion beats, all of its indices are the highest among the configurations. The third configuration, which uses filtered data and a single kernel size, gives the lowest results. Across the four configurations, those with varied kernel sizes give clearly better results. Using several kernel sizes in a single layer extracts features at different scales, which increases feature diversity.
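The per-class metrics can be derived from a confusion matrix as in the sketch below, using the usual definitions Acc = (TP + TN) / (TP + TN + FP + FN), Se = TP / (TP + FN), and PPV = TP / (TP + FP). The three-class matrix is a made-up toy example, not the data of Table 5, and the function name is hypothetical.

```python
def per_class_metrics(confusion):
    """confusion[i][j] is the number of beats of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                         # rest of the row
        fp = sum(confusion[r][c] for r in range(n)) - tp    # rest of the column
        tn = total - tp - fn - fp
        metrics.append({
            "acc": (tp + tn) / total,
            "se": tp / (tp + fn),
            "ppv": tp / (tp + fp),
        })
    return metrics

# Hypothetical toy confusion matrix (rows: true class, columns: predicted class).
toy = [
    [90, 5, 5],
    [10, 30, 0],
    [0, 0, 60],
]
metrics = per_class_metrics(toy)
for cls, m in enumerate(metrics):
    print(cls, round(m["acc"], 4), round(m["se"], 4), round(m["ppv"], 4))
```

In this toy case the second class has a sensitivity of only 0.75 because a quarter of its beats are predicted as the first class, the same kind of error pattern as the S and V beats classified as normal in the text.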
This study thus demonstrates the benefit of varied kernel sizes. The comparison between the four configurations also suggests that filtering may cause some information loss and reduce signal quality. Deep learning networks therefore do not need the filtering step. The sensitivity for the supraventricular (S) and fusion (F) classes is not high. This is a result of the limited number of samples available for these two classes. A CNN has a large number of parameters to train, so more heartbeats are needed in the classes with the fewest samples.

CONCLUSION
This research proposes a convolutional neural network for the automatic classification of ECG heartbeats. A CNN can extract implicit features without additional feature extractors. After segmentation, the ECG heartbeats can be fed directly into the network. A CNN with seven layers was used in this work. In the proposed layout, each convolution layer uses multiple kernel sizes and is followed by a max pooling layer. The outputs of the last max pooling layer are concatenated and fed into the fully connected layers. The proposed approach achieved a peak accuracy of 98.67% on the classification problem defined by the AAMI standard. A comparison experiment with a classic CNN verified the usefulness of variable-size kernels. The impact of filtering was also discussed: the results indicate that a CNN does not need filtering, which may even degrade important information. Future research will explore building the CNN with other efficient architectures. Furthermore, training on a larger amount of ECG data could yield more accurate classification and diagnostic results.
Gomathy Sankaraiyer is currently working as associate professor and head of the Department of Electrical and Electronics Engineering, Adi Shankara Institute of Engineering and Technology, Kalady. She graduated from Calicut University and completed her postgraduate studies at Cochin University of Science and Technology, Kerala. She has ten years of industrial experience and seventeen years of academic experience at Adi Shankara Institute of Engineering and Technology, Kalady. She has published more than twenty-five papers and completed several projects. She is a senior member of IEEE and has won the Best Student Branch Counsellor award and the National SEEM award for the energy management and conservation strategies implemented in the institution. She is interested in energy management activities and has a passion for power electronics applications in the renewable sector. She can be contacted at email: gomathy.eee@adishanara.ac.in.
He has published around 45 papers in reputed international journals indexed by SCI, Scopus, Web of Science, and other major indexes, and has presented or published more than 146 papers in national and international journals and conferences. He has also contributed a book chapter. He serves as a board member, reviewer, speaker, session chair, and advisory and technical committee member for various colleges and conferences. He has attended various workshops, seminars, conferences, faculty development programmes, STTPs, and online courses. His areas of interest are smart antennas, digital signal processing, wireless communication, wireless networks, embedded systems, network security, optical communication, microwave antennas, electromagnetic compatibility and interference, wireless sensor networks, digital image processing, satellite communication, cognitive radio design, and soft computing techniques. He is a member of IEEE, ISTE, IEI, IETE, CSI, IAENG, SEEE, IEAE, INSC, IARDO, ISRPM, IACSIT, ICSES, SPG, SDIWC, IJSPR, and the EAI Community. He can be contacted at email: kannadhasan.ece@gmail.com.