Professor Subhas Chandra Mukhopadhyay
Exeley Inc. (New York)
Subject: Computational Science & Engineering, Engineering, Electrical & Electronic
eISSN: 1178-5608
Wisnu Jatmiko ^{*} / M. Anwar Ma’sum ^{*} / Hanif Arief Wisesa / Hadaiq Rolis Sanabila ^{*}
Keywords : Telehealth, ECG, LVQ, GLVQ, FNLVQ, FNLVQ-PSO, FNGLVQ, AM-GLVQ, SPIHT, FPGA
Citation Information : International Journal on Smart Sensing and Intelligent Systems. Volume 12, Issue 1, Pages 1-28, DOI: https://doi.org/10.21307/ijssis-2019-009
License : (BY-NC-ND-4.0)
Received Date : 02-April-2018 / Published Online: 29-November-2019
Technology is developed to benefit society. One application of technology in the healthcare sector is the telehealth monitoring system, which provides a new channel of communication between the doctor and the patient, even when the patient is in a very remote location. In this paper, we elaborate on the progress and challenges in the development of Tele-ECG in Indonesia, which includes data acquisition, feature extraction, data compression, classification algorithms, mobile and web system development, and small-device implementation on an FPGA board. Classification is conducted using LVQ, GLVQ, FNLVQ, FNLVQ-PSO, FNGLVQ, and AM-GLVQ. Compression is conducted using the SPIHT algorithm. Tele-ECG can assist in monitoring heartbeat anomalies and reduce the risk of heart attack. It could also be a solution for infrastructure discrepancies in healthcare.
Mobile technologies provide a medium for long-distance communication systems, in which users interact with each other without having to meet physically. Such systems are useful for critical and urgent matters, such as healthcare. Many studies on implementing these systems in healthcare services have been conducted (Dabiri et al., 2009; Feng et al., 2015). In this field, the systems are commonly known as telehealth monitoring systems. They focus primarily on the interaction between the doctor and the patient. The doctor can provide a consultation via the mobile device to monitor the patient’s condition. In addition, the patient can input his/her own condition, so that the doctor can review the progress of the patient’s treatment. Another common feature of these devices is a first-response classification system, which detects anomalies in the data obtained from the patient. Examples of such data are ECG (electrocardiogram), USG (ultrasound images), and EEG (electroencephalogram). The system classifies these data according to the disease that could possibly cause the symptoms. The data are then sent to the appropriate practitioner so that the patient receives a response as fast as possible. The architecture of a telehealth monitoring system can be seen in Figure 1.
At present, a myriad of telehealth monitoring systems have been researched and developed. One example is the tele-USG system, which is used to monitor the fetus of a pregnant mother. The mother can consult doctors about the progress and condition of her fetus by using a mobile device. The system was developed and implemented in our previous studies (Jatmiko et al., 2015a). Our previous research in this field also includes automatic detection of fetal organs, such as the fetal head and fetal humerus, in a USG image. The automatic detection of the organs was implemented in the Tele-USG system (Ma’sum et al., 2015a, 2015b; Jatmiko et al., 2015b).
One of the most interesting telehealth monitoring systems to be researched is Tele-ECG. This system is commonly researched because cardiovascular disease is one of the most common causes of death from non-communicable disease (World Health Organization, 2016). Therefore, many efforts have been made to address this problem, and they fall into several fields. In the field of ECG data quality and compression, Tobón et al. proposed an electrocardiogram quality index to maintain the quality of the ECG signals obtained for use in an automatic classification system (Tobon et al., 2016). The quality is maintained by measuring the change of the spectral image representation. In the field of ECG data recognition, an implementation of the telehealth monitoring system was conducted by Marcolino et al. (Soriano Marcolino et al., 2016). In several other studies, QRS algorithms were utilized to analyze the electrocardiogram data in a telehealth monitoring system (Hayn et al., 2012; Khamis et al., 2016).
In this study, we examine the complete telehealth monitoring systems for cardiovascular disease patients. We focus on transmitting electrocardiogram (ECG) data by the patients in the telehealth monitoring system. The analysis includes an application for a consultation between the doctors and the patients. This study also analyzes several compression algorithms that are utilized to compress the data, before sending to the server, to minimize the size of the data. Furthermore, this study also analyzes and tracks the development of classification algorithms for classifying and analyzing the ECG data for a first response inside the system. In addition, the ECG pre-processing methods are also included in this study. To complete this paper, we have also included several implementations of telehealth systems that focus on monitoring cardiovascular disease.
In this paper, we will elaborate on the progress and challenges of developing a smart tele-ECG device by examining several data pre-processing methods. Data pre-processing is crucial in this study for extracting the features of the heartbeat. These features are then used to process the data with various classification algorithms. This section is divided into three sub-sections. The first sub-section explains the anatomy of the ECG data. The second sub-section discusses the normalization of the beats, which involves removing the baseline wander, commonly known as baseline wander removal. The final sub-section explains the individual beat segmentation of the ECG data.
Before we explain the pre-processing stage of ECG data, we need to describe the ECG data themselves (Tawakal et al., 2012). A single beat in an ECG signal comprises several sections and points. The first point is the P point, which is the first small peak of the ECG signal. After the P point, there is a decline as the beat enters the Q point. The signal then rises to the R peak, which is the tallest peak in the beat, and declines again into the S point. Finally, the beat reaches the T point. The span covering the Q, R, and S points is called the QRS complex. The span from the end of the P wave (after its peak) to the Q point is called the PR segment, whereas the span from the start of the P wave (before its peak) to the Q point is called the PR interval. The QT interval is the span from the Q point to the end of the T wave (after its peak), and the ST segment is the span from the S point to the start of the T wave (before its peak). The distance between two R peaks is called the RR interval. An ECG beat is illustrated in Figure 2.
In the pre-processing stage, the data are usually normalized so that they are uniform. This normalization process is key to obtaining a correct classification later on. ECG data are generated continuously by the sensor, and during this continuous process occasional noise can affect the recording of the ECG beats. One example is low-frequency noise, which shifts the ECG data up and down. As a result, the acquired data do not lie completely on a straight, iso-centric line. This is a challenge that needs to be addressed in order to have uniform ECG data.
In a previous study conducted by Ima et al. (Setiawan et al., 2011), cubic spline interpolation is utilized in order to remove the baseline wander. The method will estimate the baseline from the ECG data so that the ECG data could be normalized and straightened on the iso-centric line. The PR segment was used in that study to generate the cubic spline, which, in turn, will estimate the baseline of the data. Therefore, it is essential to detect the QRS complex and PR segment from the data to conduct this process. The process of baseline wander removal can be seen in Figure 3.
In 2012, Isa et al. used the discrete wavelet transform to remove the baseline wander from the ECG signal (Isa et al., 2012a). The wavelet transform provides a multiresolution analysis that helps to separate the baseline wander from the ECG signal. This follows the observation of Sargolzaei et al. that the baseline spectrum lies below the spectrum of the ECG signal (Sargolzaei et al., 2009). They noted that the baseline wander can be obtained from the inverse wavelet transform of the approximation coefficients at the decomposition level where the coefficient energy reaches a local minimum.
The individual beat segmentation is a three-step process. The first process is the beat extraction process. This process will segment the continuous beats in an ECG data into individual beats. The next process involves removing the outliers on the beats. Finally, the features of the beat will be extracted and processed by the classification algorithm.
As mentioned previously, the ECG data obtained from the ECG sensors are continuous. An example of these continuous data can be seen in Figure 3, where the ECG data comprise many smaller individual beats. However, in order to extract the features of the beats, we need to split the continuous signal into individual beats. This process uses the cutoff technique, as in the study of Jatmiko et al. (2016). The method approximates each individual beat as 300 data points in length, centered on the R peak. This means that the individual beat starts at data point R − 150 and ends at data point R + 149. The continuous signal is then cut at these points in order to extract the individual beats. The cutoff technique can be seen in Figure 4.
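The cutoff technique can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of Jatmiko et al. (2016): it assumes that the R-peak indices have already been detected, and the function name `segment_beats` and the handling of peaks near the record boundaries (they are skipped) are our own assumptions.

```python
def segment_beats(signal, r_peaks, before=150, after=149):
    """Cut a continuous ECG signal into fixed-length beats centred on R peaks.

    Each beat spans samples R-before .. R+after inclusive (300 points with
    the defaults). R peaks too close to either end of the record are skipped
    (an assumption; the source does not say how such beats are handled).
    """
    beats = []
    for r in r_peaks:
        start, end = r - before, r + after + 1  # 'end' is exclusive
        if start < 0 or end > len(signal):
            continue  # not enough samples for a full 300-point window
        beats.append(signal[start:end])
    return beats
```

For a 1,000-sample record with hypothetical R peaks at indices 100, 400, 700, and 950, only the peaks at 400 and 700 have a full window, so two 300-point beats are returned.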
The ECG data contain various unnecessary beats that are captured by the sensor while recording. These unnecessary beats tend to disturb the classification process that is conducted at a later stage, so they need to be removed. They are referred to as outliers in the data. Because these outliers lie outside the distribution of the data, Ima et al. and Jatmiko et al. used a simple outlier removal procedure known as the interquartile range (IQR) technique. The technique identifies outliers by creating a boundary from the percentiles of the data. In the study of Jatmiko et al. (2016), the lower quartile (Q _{1}) was taken at the 25th percentile and the upper quartile (Q _{2}) at the 75th percentile. The interquartile range (IQR) can be calculated by using the following equation:
The lower extremity level and the higher extremity level for the boundary of the outliers can be calculated by using the following equation:
After Equation (2) has been calculated, it is applied to every feature of the data. However, it should be noted that the correlation between features is not considered. Afterwards, any beat located outside the extremity levels (below the lower level or above the higher level) is removed from the whole data set. The process of removing these outliers using IQR can be seen in Figure 5, and the ‘before and after’ data of the outlier removal process can be seen in Figure 6.
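The per-feature IQR filter described above can be sketched as follows. Two details are assumptions on our part and are not stated in the text: the fence multiplier `k = 1.5`, which is the conventional choice, and the linear-interpolation rule used to compute percentiles. All function names are hypothetical.

```python
def percentile(values, p):
    """Linearly interpolated percentile (p in 0..100) of a non-empty list."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def iqr_bounds(values, k=1.5):
    """Lower and upper extremity levels; k = 1.5 is an assumed multiplier."""
    q_low, q_high = percentile(values, 25), percentile(values, 75)
    iqr = q_high - q_low
    return q_low - k * iqr, q_high + k * iqr


def remove_outlier_beats(beats, k=1.5):
    """Drop every beat with at least one feature outside its own fence.

    Each feature (column) is treated independently, so correlations
    between features are ignored, as noted in the text.
    """
    n_features = len(beats[0])
    bounds = [iqr_bounds([b[j] for b in beats], k) for j in range(n_features)]
    return [b for b in beats
            if all(lo <= b[j] <= hi for j, (lo, hi) in enumerate(bounds))]
```

Because one out-of-range feature is enough to discard a beat, the filter becomes stricter as the number of features grows.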
Feature extraction is one of the key aspects of pre-processing techniques, as this process will select the features that will be used by the classification process later on. Therefore, it is essential to extract the best features from the data set, as selecting better features will lead to better outcome during the classification process. To extract the features, wavelet transformation is proven effective, as shown in the studies of Setiawan et al. (2011) and Jatmiko et al. (2016). The wavelet transform definition can be seen in the following equation:
In Equation (3), s denotes the scaling factor, and the basic wavelet $\psi(x)$ dilated by the scaling factor s is ${\psi}_{s}(x)=\frac{1}{s}\psi (\frac{x}{s})$ . Let $s = 2^{j}$, where j is an integer ($j \in \mathbb{Z}$); the wavelet transform is then called a dyadic wavelet transform (Zhao and Zhang, 2005). The Mallat algorithm (Mallat, 1989) can calculate the dyadic wavelet transform of a digital signal. The Mallat algorithm is shown in the following equations:
In Equation (4), $S_{2^{j}}$ denotes the smoothing operator, and the low-frequency coefficients are represented as $S_{2^{j}} f(n)$. These coefficients are an approximation of the original signal, $S_{2^{j}} f(n) = a_{j}$. In Equation (5), $W_{2^{j}} f(n)$ denotes the high-frequency coefficients, which carry the detail of the original signal, $W_{2^{j}} f(n) = d_{j}$. To capture the important information in the wavelet coefficients, it is crucial to select an appropriate mother wavelet and decomposition level. For ECG arrhythmia signals, Daubechies is selected as the mother wavelet, since it achieves a better performance, as described by Senhadji et al. (1995). In the studies of Setiawan et al. (2011) and Jatmiko et al. (2016), the order-8 Daubechies wavelet (db8) gave better results than the other mother wavelets. Those studies therefore used db8 as the mother wavelet and decomposed the individual beats at levels 1 to 5. The decomposition of Daubechies 8 can be seen in Figure 7.
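The multilevel structure of the decomposition, in which each level splits the previous approximation into a coarser approximation a_j and a detail d_j, can be illustrated with a short Python sketch. Note the substitutions: for brevity the sketch uses the two-tap Haar filters instead of db8, and a decimated filter bank whose input length must be divisible by 2 to the power of the number of levels (a 300-point beat would need padding first). The function names are hypothetical.

```python
import math


def haar_step(signal):
    """One filter-bank level with Haar filters (a stand-in for db8).

    Returns (approximation, detail), each half the input length.
    The input length must be even.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def decompose(signal, levels):
    """Return ([a1..aL], [d1..dL]) for an L-level decomposition."""
    approximations, details = [], []
    current = list(signal)
    for _ in range(levels):
        current, d = haar_step(current)  # split the previous approximation
        approximations.append(current)
        details.append(d)
    return approximations, details
```

Calling `decompose(beat, 5)` on a 32-sample input yields a1 to a5 and d1 to d5 with lengths 16, 8, 4, 2, and 1. Because the Haar filters are orthonormal, the energy of a5 plus that of d1 to d5 equals the energy of the input.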
The coefficients that represent the signal have to be selected properly in order to obtain the best features. In the studies of Setiawan et al. (2011) and Jatmiko et al. (2016), d2, d3, d4, and d5 represent the high-frequency components of the signal, whereas a1, a2, a3, a4, and a5 represent the signal approximation and hence carry the important features of the signal. Therefore, the study selects the coefficients for the beat. The decomposed ECG signal obtained by Jatmiko et al. (2016) can be seen in Figure 8. ECG data sets are also imbalanced, which could degrade the performance in recognizing a minor, but significant, class. The performance of oversampling on imbalanced data has been examined in a separate study (Sanabila et al., 2016).
In this section, we discuss two basic, early classification algorithms that are commonly used in classifying ECG signal data: the learning vector quantization (LVQ) algorithm and the generalized learning vector quantization (GLVQ) algorithm. LVQ is the earlier of the two. GLVQ provides several improvements in accuracy and in its cost function compared to LVQ.
The learning vector quantization (LVQ) algorithm, first proposed by Kohonen, is based on the self-organizing map (SOM) algorithm (Kohonen, 1990). It turns the self-organizing map into a supervised classification method by modifying the learning and update procedures and by assigning a class to each codebook. The basic function of the algorithm is to measure the distance between the codebooks and the input vector and to adjust the closest codebook accordingly. If the winning codebook and the input vector share the same class, then the winning codebook is pulled closer to the input vector. However, if they do not share the same class, then the winning codebook is pushed away. This is shown in Equations (7) and (8). The closest codebook vector is defined in the following equation (Figure 9):
In Equation (6), w _{ c } denotes the closest codebook vector, x is the input vector, and w _{ i } denotes the codebooks with a previously assigned class. These classes are denoted by i. The equations for pulling and pushing the codebook can be seen in Equations (7) and (8), respectively:
Equation (7) applies when x and the closest codebook are in the same class:
Equation (8) applies when x and the closest codebook are in different classes. The value of α decreases over time and 0 < α < 1.
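The pull and push behavior can be illustrated with a minimal Python sketch. It assumes the standard LVQ1 form of the rule, in which the winner moves by α(x − w_c) toward the input for a matching class and by the same amount away from it otherwise, and it uses the squared Euclidean distance to find the winner. The function names and the in-place update are our own choices.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def lvq1_step(codebooks, labels, x, x_label, alpha):
    """Apply one LVQ1 update in place and return the winner index.

    codebooks: list of codebook vectors; labels: their assigned classes.
    """
    # Winner = codebook closest to the input vector.
    c = min(range(len(codebooks)),
            key=lambda i: squared_distance(x, codebooks[i]))
    # Pull toward x if the classes match, push away otherwise.
    sign = 1.0 if labels[c] == x_label else -1.0
    codebooks[c] = [w + sign * alpha * (xi - w)
                    for w, xi in zip(codebooks[c], x)]
    return c
```

In practice, `alpha` would be decreased from one iteration to the next, as the text notes; the schedule itself is left to the caller here.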
According to Nascimento (2005), the LVQ algorithm has several known limitations, including overtraining problems and clustering goals that are not well specified. These drawbacks are addressed by the generalized learning vector quantization (GLVQ) algorithm, which is explained in the next section.
Sato and Yamada proposed an improvement of the learning vector quantization algorithm, called the generalized learning vector quantization (GLVQ) algorithm (Sato and Yamada, 1995). The advantage of this algorithm is that its update rules are derived by minimizing an explicit cost function, which also reduces the misclassification error rate. Let w _{1} be the closest reference vector that belongs to the same class as the input vector (C _{ x } = C _{ w1}), and let w _{2} be the closest reference vector that belongs to a different class (C _{ x } ≠ C _{ w2}). The misclassification error can be defined as Equation (9):
In Equation (9), the distance between the input vector x and the reference vector w _{1} is denoted by d _{1}, whereas the distance between the input vector x and the reference vector w _{2} is denoted by d _{2}. Equation (10) defines the cost function that is minimized during the learning process:
In Equation (10), f(ϕ(x)) is a monotonically increasing function. To minimize S, the steepest descent method is used. The update rules for w _{1} and w _{2} are defined in the following equation:
If we assume that the distance measure is Euclidean, Equation (11) can be formulated as Equation (12) and Equation (13):
The update rules can be derived as Equation (14) and Equation (15):
In the GLVQ algorithm, the sigmoid function that is used as the monotonic function can be defined as Equation (16) and Equation (17):
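To make the learning rule concrete, the following Python sketch performs one GLVQ update. It assumes the commonly used form of the algorithm: the relative distance ϕ = (d1 − d2)/(d1 + d2) with squared Euclidean distances, a plain sigmoid f with derivative f(1 − f), and constant factors absorbed into the learning rate α. These choices, and the function names, are assumptions rather than a transcription of Equations (9) to (17).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def glvq_step(w1, w2, x, alpha):
    """One GLVQ update.

    w1: nearest reference vector with the same class as x
    w2: nearest reference vector with a different class
    Returns (new_w1, new_w2, phi), where phi is the misclassification
    measure evaluated before the update. Assumes d1 + d2 > 0.
    """
    d1 = sum((xi - wi) ** 2 for xi, wi in zip(x, w1))
    d2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w2))
    phi = (d1 - d2) / (d1 + d2)
    f = sigmoid(phi)
    df = f * (1.0 - f)                       # derivative of the sigmoid
    g1 = alpha * df * d2 / (d1 + d2) ** 2    # step size for w1
    g2 = alpha * df * d1 / (d1 + d2) ** 2    # step size for w2
    new_w1 = [wi + g1 * (xi - wi) for xi, wi in zip(x, w1)]  # attract
    new_w2 = [wi - g2 * (xi - wi) for xi, wi in zip(x, w2)]  # repel
    return new_w1, new_w2, phi
```

A negative ϕ means that the input is closer to its own class, that is, it is classified correctly. Unlike LVQ, both the correct and the incorrect reference vectors are updated at every step.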
In this section, we examine several advanced algorithms that are used for classifying ECG signal data: fuzzy-neuro learning vector quantization (FNLVQ), fuzzy-neuro learning vector quantization with particle swarm optimization (FNLVQ-PSO), fuzzy-neuro generalized learning vector quantization (FNGLVQ), and adaptive multilayer generalized learning vector quantization (AM-GLVQ). These algorithms are developed from the early algorithms explained in the third section by modifying the structure, the membership function, or the update rule, or by adding an optimization function.
In previous research, Benyamin et al. improved the LVQ classifier by applying a fuzzy membership function (Kusumoputro et al., 2002). The new algorithm is called fuzzy-neuro learning vector quantization (FNLVQ). In that research, FNLVQ was deployed in an odor discrimination system. The FNLVQ architecture is shown in Figure 10. FNLVQ has three layers: an input layer, a hidden layer, and an output layer. The input layer represents the features of the data, the hidden layer represents the reference vectors (codebook) of the classifier, and the output layer represents the output class. As the reference vector uses a fuzzy membership function, each vector has three components, w _{ ij }(l), w _{ ij }, and w _{ ij }(r), which represent the min, mean, and max of the fuzzy triangle, respectively.
FNLVQ uses the winner-take-all approach in the training phase, as in the LVQ method. The reference vector is updated on the basis of the winner vector. As in LVQ, the winner vector is the vector closest to the input, that is, the vector with the highest similarity value. The similarity value in FNLVQ is the intersection of the input fuzzy triangle and the reference vector fuzzy triangle. The method of computing the similarity value of two fuzzy triangles is shown in Figure 11, where x is the input vector, h _{ x } is the fuzzy membership function of vector x, w _{ i } is the reference vector for class i, and h _{ wi } is the fuzzy membership function of reference vector w _{ i }.
Mathematically, the similarity value for feature j in class i, μ _{ ij }, is calculated by using the following equation:
Then, the similarity value for class i is calculated by using the following equation:
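The intersection of two fuzzy triangles can be computed in closed form, as the Python sketch below illustrates. It takes the similarity of one feature to be the height at which the facing legs of the two triangles cross (the max-min intersection), and it aggregates the features of a class by taking the minimum. The aggregation by minimum is an assumption on our part, and the function names are hypothetical.

```python
def triangle_similarity(x_tri, w_tri):
    """Height of the intersection of two triangular fuzzy numbers.

    Each triangle is (left, centre, right) with left <= centre <= right
    and a peak membership of 1 at the centre.
    """
    (xl, xc, xr), (wl, wc, wr) = x_tri, w_tri
    if xc == wc:
        return 1.0  # the peaks coincide
    if xc > wc:     # make x the triangle with the smaller centre
        (xl, xc, xr), (wl, wc, wr) = (wl, wc, wr), (xl, xc, xr)
    # The falling right leg of x meets the rising left leg of w.
    denom = (xr - xc) + (wc - wl)
    if denom <= 0.0:
        return 0.0  # both facing legs are vertical: no overlap
    return max(0.0, (xr - wl) / denom)


def class_similarity(x_tris, w_tris):
    """Similarity of an input to one class: minimum over features (assumed)."""
    return min(triangle_similarity(x, w) for x, w in zip(x_tris, w_tris))
```

For example, the triangles (0, 1, 2) and (1, 2, 3) cross at a membership of 0.5, while (0, 1, 2) and (3, 4, 5) do not overlap and give 0. The winner class is then the one with the largest class similarity.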
The update for reference vector is conducted by following three conditions. The first condition is that if the winner class label is the same as the input vector class label, then the reference vector will be updated by using the following equations:
Then, Equations (20)–(23) are followed by widening or narrowing of the fuzzy triangle using the following equations:
The second condition is that if the winner vector class is not the same as the input class label, then the reference vector will be updated using the following equations:
Then, Equations (26)–(29) are continued by fuzzy triangle adjustment (widening or narrowing) that is conducted by using the following equations:
The third and final condition is that if the winner class label is the same as the label of input class, but the similarity value is 0, then the reference vector will be updated by using Equations (20)–(23). Then, the fuzzy triangle is widened and narrowed by using the following equations:
In the equations above, α is the learning rate value and β is a constant value between 0 and 1. Concerning testing in FNLVQ, the process is conducted by computing the closest vector (winner vector) to the input vector. The predicted class from the classifier is this winner vector class label.
In previous research, Jatmiko et al. added an optimization process to FNLVQ training. The authors applied the particle swarm optimization (PSO) algorithm to train FNLVQ, which resulted in FNLVQ-PSO (Jatmiko et al., 2009). PSO is an optimization algorithm that uses a swarm of agents to find the optimum value (Kennedy, 2011). In FNLVQ-PSO, optimization is used to find the best reference vector. The fitness value used to measure the reference vector is the training error or matrix similarity analysis (MSA). The structure of FNLVQ-PSO is shown in Figure 12.
PSO is an optimization method inspired by the behavior of animal colonies. It is used to find an optimum value in a search space. Each particle moves and looks for the optimum point (position) using the following equations:
In the above equations, V _{ i }(t) is the current velocity of the particle, V _{ i }(t − 1) is the previous velocity of the particle, X _{ i }(t) is the current position of the particle, X _{ i }(t − 1) is the previous position of the particle, c1 and c2 are constants, γ is a constriction factor that has a value between 0 and 1, P _{ i }(t − 1) is the previous local best of the particle, and P _{ g }(t − 1) is the previous global best of all particles. The local best is the point with the optimum fitness value for a particle from the start to the current time (iteration). The global best is the point with the optimum fitness value for all particles from the start to the current time (iteration). In FNLVQ-PSO, each particle represents a reference vector (codebook), the same as the reference vector (codebook) in FNLVQ. During the training process, the classifier is optimized to find the optimum reference vector among all the candidates (particles). The training process in FNLVQ-PSO is conducted by using the following steps:
1. Initializing the reference vectors: the number of particles is given by user input. In FNLVQ-PSO, the particles are reference vectors. Therefore, this step yields several candidate reference vectors to choose from later.
2. Training the FNLVQ algorithm for each particle by computing the similarity value between the reference vector and the input vector.
3. Computing the matrix similarity analysis (MSA) for each particle: the MSA is an n × n matrix, where n is the number of output classes.
4. Computing MSA and fitness using the following equation:
5. Computing the local best for each particle and the global best for all particles.
6. Updating the reference vector using the following equation:
7. Updating the fuzzy triangle of each particle using the following equation:
where d _{ ij } is the distance between the particle and the input vector.
8. Repeating Steps 2–7 until the given number of iterations is reached or until the reference vector converges.
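The particle update at the core of these steps can be sketched in Python as follows. The sketch assumes a common constriction-factor form of PSO, V ← γ[V + c1·r1·(P_i − X) + c2·r2·(P_g − X)] followed by X ← X + V, with r1 and r2 drawn uniformly from [0, 1). It minimizes a generic fitness function rather than the FNLVQ training error, and the default parameter values are conventional choices, not values taken from the study.

```python
import random


def pso_minimize(fitness, dim, n_particles=10, iterations=50,
                 c1=2.05, c2=2.05, gamma=0.729, seed=0):
    """Minimise 'fitness' over R^dim; returns (best_x, best_fit, history)."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    p_best = [x[:] for x in xs]                 # local best positions
    p_fit = [fitness(x) for x in xs]
    g = min(range(n_particles), key=lambda i: p_fit[i])
    g_best, g_fit = p_best[g][:], p_fit[g]      # global best
    history = [g_fit]
    for _ in range(iterations):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][k] = gamma * (vs[i][k]
                                    + c1 * r1 * (p_best[i][k] - xs[i][k])
                                    + c2 * r2 * (g_best[k] - xs[i][k]))
                xs[i][k] += vs[i][k]
            fit = fitness(xs[i])
            if fit < p_fit[i]:                  # new local best
                p_best[i], p_fit[i] = xs[i][:], fit
                if fit < g_fit:                 # new global best
                    g_best, g_fit = xs[i][:], fit
        history.append(g_fit)
    return g_best, g_fit, history
```

In FNLVQ-PSO, each position vector would hold the components of a candidate reference vector, and `fitness` would be replaced by the training error derived from the MSA.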
FNGLVQ is an enhancement of GLVQ by adding the fuzzy membership function. In the previous work, FNGLVQ was proposed to classify heart diseases (Setiawan et al., 2011). The architecture of FNGLVQ is illustrated in Figure 13 (Setiawan et al., 2011).
First, the distance value in Equation (9) is redefined into d = 1 − μ, resulting in the following equation:
By applying fuzzy membership function, the reference vector is defined with a triangular function, w _{ ij }(w _{min,ij }, w _{mean,ij }, w _{max,ij }), which is defined as follows:
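As an illustration, the Python sketch below evaluates a triangular membership function for a crisp input feature and the resulting misclassification measure. Substituting d = 1 − μ into ϕ = (d1 − d2)/(d1 + d2) gives ϕ = (μ2 − μ1)/(2 − μ1 − μ2). Averaging the per-feature memberships into one similarity value per reference vector is an assumption on our part, and the function names are hypothetical.

```python
def triangular_membership(x, w_min, w_mean, w_max):
    """Membership of a crisp value x in the triangle (w_min, w_mean, w_max)."""
    if x <= w_min or x >= w_max:
        return 0.0
    if x <= w_mean:
        return (x - w_min) / (w_mean - w_min)   # rising leg
    return (w_max - x) / (w_max - w_mean)       # falling leg


def mean_membership(x, reference):
    """Similarity of a feature vector to one reference vector.

    'reference' is a list of (w_min, w_mean, w_max) triples, one per
    feature. Averaging over features is an assumed aggregation rule.
    """
    mus = [triangular_membership(xj, *tri) for xj, tri in zip(x, reference)]
    return sum(mus) / len(mus)


def misclassification(mu1, mu2):
    """phi with d = 1 - mu; undefined when mu1 = mu2 = 1."""
    return (mu2 - mu1) / (2.0 - mu1 - mu2)
```

With μ1 = 0.8 for the correct class and μ2 = 0.2 for the nearest wrong class, ϕ = −0.6, which is negative and therefore indicates a correct classification.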
Differentiating the membership function with respect to w _{mean} leads to three conditions, resulting in the learning formula below:
In these equations, w _{1} is the reference vector from the same class as the input vector C _{ x } = C _{ w1}, and w _{2} is the reference vector from a different class C _{ x } ≠ C _{ w2}. Update rules for w _{min} and w _{max} follow the equation below:
The value of α is between 0 and 1, and it decreases as the number of iterations (t) increases, as defined in the following equation:
To achieve a better performance, additional rules need to be defined to adjust w _{min} and w _{max}, as written in the conditions below:
If (μ1 > 0 or μ2 > 0), and ϕ < 0, then increase the fuzzy triangular width by using the following equations:
If the input class is recognized into the wrong class (ϕ ≥ 0), then decrease the triangular width using the following equations:
If μ1 = 0 and μ2 = 0, then the fuzzy triangles must be widened using the following equations (γ is a constant, with a value of 0.1 in this research):
Adaptive multilayer generalized learning vector quantization (AM-GLVQ) is an enhancement of GLVQ by using a multi-layer approach proposed by Imah et al. (2012). The idea is to modify GLVQ into a multi-layer structure. The architecture of the AM-GLVQ is shown in Figure 14.
In AM-GLVQ, the input vector is denoted as x. The input data in the Eigenspace are denoted as x’, as written in the following equation:
Therefore, we need to find the best values of transformation matrix T during the training process. Update rule for matrix T is defined in the following equations:
Given the large amount of ECG data to be processed, an efficient method is needed to transmit and store the data, so that the classification process is not interrupted. In this section, we discuss several methods, proposed by previous researchers, for compressing the ECG data so that they can be stored and transmitted efficiently. This paper primarily discusses compression methods based on set partitioning in hierarchical trees (SPIHT). This section has two main sub-sections. The first sub-section explains two-dimensional SPIHT compression, while the second sub-section discusses three-dimensional SPIHT compression.
In the present day, several researchers have proposed using two-dimensional SPIHT method to compress ECG data. In 2012, Isa et al. used SPIHT in order to compress the ECG data (Sato and Yamada, 1995; Isa et al., 2012a, 2012b). This method is a state-of-the-art lossless compression method. In this sub-section, we will discuss the compression stage of the algorithm. The stages that were proposed by Isa et al. in SPIHT coding can be seen in Figure 15. We will not discuss the 2D wavelet transform phase, as it has been discussed previously in this paper (please refer to the pre-processing section).
During the compression stage of SPIHT, the first step is the beat reordering method, which arranges the ECG beats so as to optimize the SPIHT coding. The beats are arranged on the basis of their similarities, as Isa et al. pointed out that a higher compression rate is more achievable with highly predictable data. This method is chosen because the ECG beat data, in their recorded order, do not have a highly predictable nature, even though the beats repeat according to the heart rate (Zhao and Chen, 2006).
In the beat reordering method, the initial step is to use fuzzy c-means clustering to group beats that have a high similarity. After the similar beats have been clustered, the beats of each cluster are ordered on the basis of their distance to the center of the cluster. Figure 16 shows the result before and after the beat reordering technique. As shown in the figure, after the beat reordering method has been applied to the ECG data, there are fewer noticeable peaks and less granulation in the data. The result is smoother ECG data, because the beat reordering method reduces the high-frequency components and the variance among adjacent beats. This method optimizes the process of SPIHT compression of ECG data.
Several works on ECG signal compression have been conducted previously (Lu et al., 2000; Moazami-Goudarzi et al., 2006; Linnenbank et al., 1992). In this section, we discuss SPIHT. Set partitioning in hierarchical trees (SPIHT) is a wavelet-based coding method that uses a set partitioning algorithm to code the transform coefficients based on the sub-band pyramid. In the SPIHT method, the coefficients carrying the most important information are sent first. A hierarchical quad-tree data structure is adopted on the wavelet-transformed signal. The low-frequency coefficient is the center of the wavelet-transformed signal, where the coefficients are ordered in a hierarchy that includes a sub-band parent–child relationship.
There are four main steps in the SPIHT method. The first is the initialization step, which empties the list of significant points (LSP). The list of insignificant points (LIP) and the list of insignificant sets (LIS) are also initialized. The threshold is set to $T_{0} = 2^{n}$, with $n = \lfloor \log_{2} (\max |c(i, j)|) \rfloor$, where c(i, j) denotes the coefficient at position (i, j). The next step is the sorting pass in the LIP, where the coefficients in the LIP are examined. The significant coefficients are transferred to the LSP, and their sign bits are encoded. The next phase is the sorting pass in the LIS. If an entry in the LIS is significant, a 1 is sent and its offspring are examined; if it is not significant, a 0 is sent instead. The final phase is the refinement pass, in which the previous entries of the LSP are examined. If an entry is significant with respect to the current threshold, a 1 is sent and its magnitude is reduced by the current threshold; otherwise, a 0 is sent.
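The role of the threshold can be illustrated with a simplified Python sketch. It computes the initial threshold as the largest power of two not exceeding the largest coefficient magnitude and then runs successive passes in which the threshold is halved. The sketch works on a flat list of integer coefficients and deliberately omits the set partitioning (the LIS and the tree structure), so it is a plain bit-plane coder rather than SPIHT itself. The refinement bit is emitted as the current bit of the magnitude, which is equivalent to the subtract-the-threshold description. The function names are hypothetical.

```python
def initial_threshold(coeffs):
    """Return (n, T0) with n = floor(log2(max |c|)) and T0 = 2**n."""
    c_max = max(abs(c) for c in coeffs)
    if c_max == 0:
        raise ValueError("all coefficients are zero")
    n = c_max.bit_length() - 1   # exact floor(log2) for positive integers
    return n, 2 ** n


def bitplane_passes(coeffs, passes):
    """Return a list of (threshold, newly_significant, refinement_bits)."""
    _, threshold = initial_threshold(coeffs)
    significant = []                           # plays the role of the LSP
    insignificant = list(range(len(coeffs)))   # plays the role of the LIP
    out = []
    for _ in range(passes):
        previously = list(significant)
        # Sorting pass: coefficients that reach the current threshold.
        newly = [i for i in insignificant if abs(coeffs[i]) >= threshold]
        insignificant = [i for i in insignificant if i not in newly]
        significant.extend(newly)
        # Refinement pass: current bit of the earlier significant entries.
        refinement = [(abs(coeffs[i]) // threshold) & 1 for i in previously]
        out.append((threshold, newly, refinement))
        threshold //= 2
        if threshold == 0:
            break
    return out
```

For the coefficients 34, −20, 9, and 3, the initial threshold is 32, and the first three passes mark the first, second, and third coefficients as significant in turn, so the largest magnitudes are transmitted first.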
In 2015, Jati et al. proposed an improvement over Isa et al.’s SPIHT coding by adding predictive coding (Linnenbank et al., 1992). The main objective of this predictive coding method is to reduce the variation of amplitudes in the 2D ECG signals. By applying the predictive coding method, the signal value will be relatively smaller than before. The flowchart of the SPIHT process that was conducted by Jati et al. can be seen in Figure 17.
To reduce the signal values, we first need to select the beat that will represent the group of clustered beats obtained from the beat reordering method. This selected beat serves as the benchmark beat. To select the benchmark beat, we can apply a statistic over all samples of the 2D ECG array (mean, min, or max), or we can use the most important beat in the ordered cluster. After the benchmark beat has been obtained, a new 2D ECG array is created by computing the difference between each real signal value and the corresponding benchmark value. Figure 18 shows the predictive coding process proposed by Jati et al.
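The residual computation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the method of Jati et al.: the benchmark beat is taken as the sample-wise mean over all beats (one of the options mentioned), each row of the 2D array is one beat, and the function names and sample values are hypothetical.

```python
def benchmark_beat(beats):
    """Sample-wise mean over all beats (assumed choice of benchmark)."""
    count = len(beats)
    return [sum(column) / count for column in zip(*beats)]

def residual_array(beats, benchmark):
    """New 2D array: each beat minus the benchmark beat."""
    return [[s - b for s, b in zip(beat, benchmark)] for beat in beats]

def reconstruct(residuals, benchmark):
    """Inverse step used by the decoder: residual plus benchmark."""
    return [[r + b for r, b in zip(row, benchmark)] for row in residuals]

# Hypothetical 2D ECG array: three beats, three samples each.
beats = [[10, 50, 20],
         [12, 54, 22],
         [8, 46, 18]]
bench = benchmark_beat(beats)
resid = residual_array(beats, bench)
```

In this toy array the largest magnitude drops from 54 in the original beats to 4 in the residuals, which is the amplitude reduction the predictive step aims for. The benchmark beat has to be available to the decoder so that the original array can be recovered.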
To accommodate the recent improvements in biomedical imaging, Isa et al. proposed a three-dimensional SPIHT compression (Isa et al., 2014). The three-dimensional SPIHT compression extends the two-dimensional SPIHT compression by modifying the spatial orientation tree concept. In the original 2D SPIHT compression, each node of a wavelet coefficient has four offspring, whereas in the proposed 3D SPIHT each node has eight offspring. Figure 19 shows the concept of the 3D SPIHT compression.
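The difference between the two spatial orientation trees can be illustrated with a short Python sketch. It uses the index-doubling rule commonly used in SPIHT for nodes outside the coarsest sub-band; the function name and the coordinate convention are assumptions, and the special handling of root nodes in the coarsest band is not shown.

```python
from itertools import product

def offspring(node):
    """Children of a tree node at the next finer scale.

    A 2D node (i, j) has 4 children and a 3D node (i, j, k) has 8,
    obtained by doubling each coordinate and adding 0 or 1.
    """
    return [tuple(2 * c + d for c, d in zip(node, delta))
            for delta in product((0, 1), repeat=len(node))]

children_2d = offspring((1, 2))      # 4 children
children_3d = offspring((1, 2, 3))   # 8 children
```

The same function covers both cases because the number of offspring is 2 raised to the number of dimensions.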
In Isa et al. (2014), the multi-lead ECG signal was constructed by combining two-dimensional residual arrays, which produces a three-dimensional residual array. The axes of the volumetric structure correspond to the samples, the ECG leads, and the heartbeats. The intra-beat, inter-beat, and inter-lead redundancies are de-correlated by the 3D SPIHT method. The 3D residual array of the 3D SPIHT method can be seen in Figure 20.
In a previous study, a smart Tele-ECG system was developed. The main purposes of the system are early detection of heart diseases and heart monitoring. Several works have been conducted on telehealth systems (Sudhamony et al., 2008; Hababeh et al., 2015). The architecture of the system is shown in Figure 21. The system comprises four components: the ECG sensor, the smartphone, the FPGA, and the server. The first component, the sensor, is used to acquire the heartbeat signal from the patient’s body. The transducer used in this sensor is commonly used in Indonesian hospitals. The second component, the smartphone, is used to visualize and analyze the heartbeat signal sent by the ECG sensor. The third component, the FPGA, was developed for ECG signal processing and classification. The FPGA implementation is intended as a step toward designing a chip dedicated to processing ECG data. The ECG sensor is connected to the smartphone or the FPGA through a Bluetooth network. The last component is the server, which is used for communication between the patient and the cardiologist. The mobile application of the smart Tele-ECG accommodates both the patient and the doctor. However, only the doctor has the privilege to verify the heartbeat signal sent by the patient.
In the previous research, we built an ECG sensor from electrical components. The ECG sensor used in this system has several components, as shown in Figure 22. The first component is the transducer that is attached to the patient’s body. The amplifier component amplifies the acquired raw signal. The main amplifier chip in the ECG sensor is the INA118. This amplifier can generate output signals up to 10 times the strength of the original input signal. In this first amplifier, we used an electronic circuit called the right-leg-driven loop. The circuit applies an inverted version of the interference at the right leg of the human body. Therefore, the circuit can decrease the interference generated by the human body. This circuit also serves as part of the human safety system. The detailed characteristics and functionality of the circuit are explained in the INA118 datasheet.
The second amplifier is used to amplify the signal produced by the previous components. This amplifier strengthens input signals in the range of 5 mV to 40 mV. The OP07 circuit is used for the second amplifier. The OP07 is configured as a non-inverting amplifier. In the sensor we built, the gain of the second amplifier is 100. The following component is a filter, which is used to separate the wanted signal from the unwanted signal. The ECG sensor has two types of filters: a low-pass filter (LPF) and a high-pass filter (HPF). The LPF passes signals with frequencies lower than the threshold, whereas the HPF passes signals with higher frequencies. In the sensor, the HPF is used to remove noise from the human body, whereas the LPF is used to remove noise that is already contained in the ECG signal.
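As a small worked example of the amplifier chain, the following Python sketch computes the output range of the second stage and the overall gain. It assumes ideal amplifiers, takes the first stage at its stated maximum gain of 10, and ignores saturation at the supply rails; the names are illustrative only.

```python
FIRST_STAGE_GAIN = 10     # INA118 stage, taken at the stated maximum
SECOND_STAGE_GAIN = 100   # OP07 non-inverting stage

def amplify_mv(signal_mv, gain):
    """Return the amplified signal level in millivolts (ideal amplifier)."""
    return signal_mv * gain

# Input range of the second stage as stated in the text: 5 mV to 40 mV.
second_stage_range_mv = (amplify_mv(5, SECOND_STAGE_GAIN),
                         amplify_mv(40, SECOND_STAGE_GAIN))
overall_gain = FIRST_STAGE_GAIN * SECOND_STAGE_GAIN
```

Under these assumptions the second stage maps 5 mV to 40 mV onto 0.5 V to 4 V, and the two stages together give a gain of 1000.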
Besides the single-lead ECG sensor, the system also used a multi-lead ECG sensor to acquire heartbeat signals from the human body. The main engine of the multi-lead ECG machine is similar to the engine in the single-lead sensor. The main modules in the multi-lead sensor are the amplifier, the filter, and the adder. Figure 23 shows the electronics of the single-lead ECG device.
In the previous research, the smartphone was the component for visualizing, classifying, and storing the user’s heartbeat data. The application is deployed as a mobile app on the Android operating system. The view of the Tele-ECG mobile application is shown in Figure 24. The application has several menus to support the users in heartbeat monitoring and early detection of heart diseases. The menus are retrieve heartbeat data from ECG, heartbeat history, user management, cardiologist information, hospital information, heart disease information, doctor’s special menu, settings, and help.
The main processes in the smartphone device are heartbeat signal acquisition, signal preprocessing, signal classification, and compression. The classification is used to predict the patient’s heart condition, that is, whether the patient’s heartbeat shows symptoms of heart disease. To classify the heartbeat signal, the system uses the neural network classifiers mentioned in the previous section: generalized learning vector quantization (GLVQ) and fuzzy-neuro generalized learning vector quantization (FNGLVQ). We selected these classifiers because of their simple structure, fast processing, and good performance. The computational time of these methods is shorter than that of other neural network methods, such as the multi-layer perceptron (MLP) or deep learning.
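The low cost of prediction with a GLVQ-type classifier can be seen in a minimal Python sketch: a beat is assigned the label of its nearest prototype, which requires only one distance computation per prototype. The prototype vectors, labels, and function names below are hypothetical, and the fuzzy similarity used by FNGLVQ is not shown. The relative distance $(d_1 - d_2)/(d_1 + d_2)$ is the standard GLVQ measure, where $d_1$ is the distance to the nearest prototype of the correct class and $d_2$ the distance to the nearest prototype of any other class.

```python
def squared_distance(x, w):
    """Squared Euclidean distance between a feature vector and a prototype."""
    return sum((a - b) ** 2 for a, b in zip(x, w))

def classify(x, prototypes):
    """Return the label of the nearest prototype; prototypes is a list of (label, vector)."""
    return min(prototypes, key=lambda p: squared_distance(x, p[1]))[0]

def relative_distance(x, label, prototypes):
    """GLVQ relative distance; negative values mean x is classified correctly."""
    d1 = min(squared_distance(x, w) for lab, w in prototypes if lab == label)
    d2 = min(squared_distance(x, w) for lab, w in prototypes if lab != label)
    return (d1 - d2) / (d1 + d2)

# Hypothetical two-feature prototypes for two beat classes.
prototypes = [("NOR", [1.0, 0.0]), ("PVC", [0.0, 1.0])]
beat = [0.9, 0.1]
predicted = classify(beat, prototypes)
mu = relative_distance(beat, "NOR", prototypes)
```

In this toy case the beat lies close to the NOR prototype, so the prediction is NOR and the relative distance is negative.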
The server has two modules: web services and a database. The Tele-ECG server was implemented using the Java platform and the Java EE framework. The web services provide create, read, update, and delete (CRUD) functionality and other services, such as saving a heartbeat, verifying a heartbeat, and registering user data. There are 11 services built into the server that allow users to carry out the telehealth functions of the system. Detailed information about the services provided by the Tele-ECG server is shown in Table 1.
In this section, we discuss the implementation of ECG signal classification on an FPGA. A field-programmable gate array (FPGA) is an integrated circuit that is designed to be configured after the manufacturing process. This device enables the programmer to add new features by changing the configuration even after the program has been installed on the board. In our previous research, we used the Spartan 3AN board from Xilinx (Xilinx, 2011). Several research studies on FPGAs that focused on analyzing ECG signals have been conducted (Figueiredo and Michael, 2013; Risman et al., 2014). In this section, we elaborate on the implementation of the FLVQ, FNLVQ, and AFNGLVQ algorithms in the FPGA (Setiawan et al., 2011; Suryana et al., 2012; Afif et al., 2015).
The design of the ECG classifier in the FPGA consists of three modules: the ROM module, the preprocess module, and the classification module. The ROM module serves as the initial data storage. The preprocess module decomposes the input data with the fourth-order Daubechies wavelet and prepares the data for the next stage, which is the classification module. The preprocess module consists of three submodules, namely the wavelet Daubechies 4 module, the float to fixed converter module, and the data normalization unit module. The wavelet Daubechies 4 module reduces 300 features per beat to 40 features per beat. The float to fixed converter module converts the floating-point result of the wavelet Daubechies module into fixed-point numbers, which are the input type for the classification module. The data normalization unit module normalizes the data using the Z-score. The last module is the classification module, which performs classification on the ECG data using FLVQ. The result of the classification stage is displayed on the LCD module. The illustration of the ECG classifier on the FPGA can be seen in Figure 25.
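The normalization step can be illustrated with a minimal Python sketch of the Z-score. The text does not state whether the mean and standard deviation are computed per beat or per feature across the data set; this sketch assumes per-beat statistics with the population standard deviation, and the function name is hypothetical.

```python
import math

def z_score(features):
    """Normalize a feature vector to zero mean and unit standard deviation."""
    count = len(features)
    mean = sum(features) / count
    variance = sum((f - mean) ** 2 for f in features) / count
    std = math.sqrt(variance)
    if std == 0:
        return [0.0] * count   # constant input: avoid division by zero
    return [(f - mean) / std for f in features]

normalized = z_score([2, 4, 4, 4, 5, 5, 7, 9])   # mean 5, standard deviation 2
```

For this example vector the normalized values are each feature's offset from 5, divided by 2.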
During the preprocessing stage, the signal is decomposed to half its length in each iteration. This decomposition process uses 16-bit floating-point numbers, consisting of a 1-bit sign, a 5-bit exponent, and a 10-bit mantissa. Each input signal is a vector of 300 features and is decomposed over three iterations to produce a vector of 40 features. The wavelet Daubechies architecture is shown in Figure 26. The features are padded before they are passed to the convolution stage. The signal is padded by mirroring at the head and tail of the input signal. As the main part of the wavelet Daubechies submodule, the convolution component convolves the padded input signal. The convolved signal is then downsampled to half its length and stored in the output RAM. The values kept in the output RAM serve as the input for the next decomposition operation.
The convolution unit consists of four components: the shift register (4 temporary registers), the Daubechies constants (4 hard-coded constants), a 16-bit FP multiplier that handles multiplication operations, and a 16-bit FP adder that handles addition operations. The convolution process initially separates 306 features into 19 chunks of data. The FP multiplier then multiplies each data chunk by the Daubechies constants. The results of the multiplication are summed to obtain the convolution result. Figure 27 shows the convolution architecture.
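One decomposition level (mirror padding, four-tap convolution through a shift register, and downsampling by two) can be sketched in Python as follows. This is a software illustration, not the hardware design: it uses Python floats instead of 16-bit floating point, computes only the low-pass branch, and assumes a pad of three samples on each side (inferred from the 300 to 306 figures in the text) with a mirroring convention that excludes the edge sample. The four constants are the standard Daubechies-4 scaling coefficients. The output lengths of this sketch need not match those of the hardware.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Standard Daubechies-4 low-pass coefficients (the four hard-coded constants).
DB4_LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
           (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]

def mirror_pad(signal, pad=3):
    """Mirror `pad` samples at the head and tail (edge sample not repeated)."""
    head = signal[pad:0:-1]
    tail = signal[-2:-pad - 2:-1]
    return head + signal + tail

def convolve_downsample(padded, taps=DB4_LOW):
    """Multiply-accumulate over a 4-entry shift register, then keep every 2nd output."""
    shift_register = [0.0] * len(taps)
    outputs = []
    for k, sample in enumerate(padded):
        shift_register = [sample] + shift_register[:-1]   # newest sample first
        if k >= len(taps) - 1:                            # register is full
            outputs.append(sum(c * x for c, x in zip(taps, shift_register)))
    return outputs[::2]

def decompose_once(signal):
    """One low-pass decomposition level of the sketch."""
    return convolve_downsample(mirror_pad(signal))

approx = decompose_once([1.0] * 300)
```

Because the four low-pass coefficients sum to the square root of 2, a constant input of 1.0 yields outputs equal to that value, which gives a quick sanity check of the constants.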
The classification module implements the FLVQ algorithm for classifying the data. This module consists of the input data RAM, the vector reference RAM, the expected value RAM, the neuron component, the similarity divider component, the updater component, and the state machine. The input data RAM stores the output of the preprocessing stage, that is, the result of the wavelet decomposition. The vector reference RAM stores the reference vectors that come from the output of the fuzzification process. The neuron component measures the similarity of each feature, whereas the similarity divider component determines the winner vector. The updater component then updates the winner vector by moving or changing the shape of the corresponding fuzzy triangle, depending on the update case. The output of the classification process is stored in the expected value RAM. The design of the FLVQ architecture in the FPGA is shown in Figure 28.
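The similarity computation and winner selection can be illustrated with a hedged Python sketch based on a triangular membership function. The parameter names `w_min`, `w_mean`, and `w_max` follow the notation used later for the reference vectors; taking the minimum over features as the class similarity is an assumption made for this sketch, and the update of the fuzzy triangles is not shown.

```python
def triangular_membership(x, w_min, w_mean, w_max):
    """Similarity of one feature to a fuzzy triangle (assumes w_min < w_mean < w_max)."""
    if x <= w_min or x >= w_max:
        return 0.0
    if x <= w_mean:
        return (x - w_min) / (w_mean - w_min)
    return (w_max - x) / (w_max - w_mean)

def class_similarity(features, triangles):
    """Aggregate feature similarities; the minimum is an assumed choice."""
    return min(triangular_membership(x, *tri) for x, tri in zip(features, triangles))

def winner(features, reference_vectors):
    """Return the class whose reference vector is most similar to the input."""
    return max(reference_vectors,
               key=lambda label: class_similarity(features, reference_vectors[label]))

# Hypothetical reference vectors: one (w_min, w_mean, w_max) triangle per feature.
reference_vectors = {
    "NOR": [(0.0, 0.5, 1.0), (0.0, 0.5, 1.0)],
    "PVC": [(0.5, 1.0, 1.5), (0.5, 1.0, 1.5)],
}
best = winner([0.5, 0.25], reference_vectors)
```

Here the input lies inside both NOR triangles but outside the PVC triangles, so NOR is the winner.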
We have also implemented FNGLVQ in the FPGA. Some additional modifications have been made, including the addition of a sigmoid derivative core unit, which expands the number. The arithmetic operations in this design use a 32-bit fixed-point format, which consists of a 16-bit integer part and a 16-bit fraction part. The main reasons for using the fixed-point rather than the floating-point format are expressiveness and resource usage. Moreover, using the fixed-point format can increase the computation speed and the efficiency. As a comparison, adding two 32-bit numbers takes 32 slices with a fixed-point unit, whereas a floating-point unit takes 500 slices when the operation runs in one clock cycle. The design of FNGLVQ in the FPGA is shown in Figure 29.
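The 32-bit fixed-point format with 16 integer bits and 16 fraction bits can be illustrated with a minimal Python sketch. The helper names are hypothetical, and overflow and wrap-around at 32 bits are not modeled, so this shows only the scaling and the quantization error, not the hardware behavior.

```python
FRACTION_BITS = 16
SCALE = 1 << FRACTION_BITS   # 65536

def to_fixed(value):
    """Convert a float to the nearest fixed-point integer representation."""
    return int(round(value * SCALE))

def from_fixed(raw):
    """Convert a fixed-point integer back to a float."""
    return raw / SCALE

def fixed_add(a, b):
    """Addition is a plain integer addition."""
    return a + b

def fixed_mul(a, b):
    """Multiplication needs a shift to restore the 16 fraction bits."""
    return (a * b) >> FRACTION_BITS

total = from_fixed(fixed_add(to_fixed(1.5), to_fixed(2.25)))
product_value = from_fixed(fixed_mul(to_fixed(1.5), to_fixed(2.0)))
quantization_error = abs(from_fixed(to_fixed(0.1)) - 0.1)
```

Addition reduces to an integer addition, which is consistent with the small resource usage reported for the fixed-point adder. The price is a quantization error of up to half of the smallest step, that is, $2^{-17}$ per stored value.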
The FNGLVQ algorithm is implemented as a state machine. The state machine defines the connectivity of the components in the design. The designed state machine is illustrated in Figure 30. Based on the experiment (Sudhamony et al., 2008), the performance of FNGLVQ in the FPGA is slightly degraded compared to the high-level implementation of FNGLVQ. The performance of FNGLVQ in the FPGA degrades by around 2%. The main reason for this degradation is the number format used in the FPGA, which is a fixed-point format. The floating-point format represents values with more detail and precision than the fixed-point format and thus produces better performance. However, floating-point numbers in an FPGA require substantial resource allocation and extra computational time. Fixed-point numbers give a slightly lower performance, but they are faster and require fewer resources than floating-point numbers.
In addition to FNGLVQ, we also implemented the AFNGLVQ algorithm on the FPGA board. The design of the AFNGLVQ algorithm is similar to that of FNGLVQ. The main difference is in the state machine. The state machine receives the input vectors and stores them in $w_{min}$, $w_{mean}$, and $w_{max}$. Furthermore, the state machine updates the values of the reference vector weights and stores them in the same three components, that is, $w_{min}$, $w_{mean}$, and $w_{max}$. The designs of AFNGLVQ and of the state machine during the training phase are depicted in Figures 31 and 32.
On the FPGA board, AFNGLVQ achieves better performance than FNGLVQ (Afif et al., 2015). The adaptive feature of AFNGLVQ has a beneficial impact on performance. However, it still has a slightly lower performance than the high-level implementation of AFNGLVQ because of the number format: floating-point numbers represent values with more detail and precision than fixed-point numbers. Nevertheless, the FPGA implementation achieves competitive results and is worth implementing in a micro device.
This section discusses the performance of the ECG telehealth system. The measurements cover the accuracy of the arrhythmia classification, the compression error, and the arrhythmia classification in the FPGA. This section summarizes the experiments that were conducted in the previous studies.
As mentioned previously, FNGLVQ is a GLVQ that has a fuzzy membership function. A study by Setiawan et al. utilized FNGLVQ to classify arrhythmias based on ECG sensor data (Setiawan et al., 2011). Setiawan et al. used the MIT–BIH arrhythmia dataset, which consists of 12 classes. These twelve classes are normal beat (NOR), right bundle branch block beat (RBBB), left bundle branch block beat (LBBB), paced beat (P), premature ventricular contraction beat (PVC), atrial premature beat (AP), fusion of paced and normal beat (fPN), fusion of ventricular and normal beat (fVN), nodal (junctional) escape beat (NE), nodal (junctional) premature beat (NP), aberrated atrial premature beat (aAP), and ventricular escape (VE). Before classification, several preprocessing steps were applied to the dataset: baseline wandering removal (BWR), outlier removal, and wavelet transform.
In the experiment, FNGLVQ was compared to LVQ, LVQ2.1, and GLVQ. The experiment was conducted using 150 epochs. The dataset was divided into 10 parts. Each part has a training set and a testing set with a 50:50 ratio. The experiment results are shown in Figure 33, which shows that parts 1 to 10 have a similar trend. FNGLVQ has the highest accuracy, followed by GLVQ, LVQ2.1, and LVQ. Overall, FNGLVQ achieved 95.52% accuracy, GLVQ achieved 93.36%, LVQ2.1 reached 87.41%, and LVQ reached 73.73%. This means that on the MIT–BIH arrhythmia dataset, FNGLVQ outperformed GLVQ, LVQ2.1, and LVQ by margins of 2.16, 8.11, and 21.79 percentage points, respectively. This suggests that adding fuzzification to arrhythmia classification improves the accuracy.
Fitria et al. continued the research of Setiawan et al. by investigating the impact of the round robin approach on the training process (Fitria et al., 2014). The investigation was applied to various LVQ-based classifiers: LVQ1, LVQ2, LVQ2.1, FNLVQ, FNLVQ-MSA, FNLVQ-PSO, GLVQ, and FNGLVQ. In the study, the authors used only 5 classes from the MIT–BIH dataset: normal beat (NOR), right bundle branch block (RBBB), left bundle branch block (LBBB), paced beat (PACE), and premature ventricular contraction (PVC).
The experiment result is shown in Figure 34. The figure shows that without the round robin approach, the classifiers achieved 74.62, 82.29, 86.55, 82.25, 83.66, 92.54, 88.09, and 94.07% for LVQ1, LVQ2, LVQ2.1, FNLVQ, FNLVQ-MSA, FNLVQ-PSO, GLVQ, and FNGLVQ, respectively. With the round robin approach, the classifiers achieved 74.78, 86.75, 98.04, 84.5, 86.12, 90.43, 98.12, and 94.31% for LVQ1, LVQ2, LVQ2.1, FNLVQ, FNLVQ-MSA, FNLVQ-PSO, GLVQ, and FNGLVQ, respectively. Figure 34 shows that the round robin approach improves the accuracy for most classifiers: for LVQ2, LVQ2.1, FNLVQ, FNLVQ-MSA, and GLVQ, it increases the classification accuracy by roughly 2 to 11 percentage points. For LVQ1 and FNGLVQ, the accuracy with round robin is only slightly higher than without it. For FNLVQ-PSO, however, the round robin approach decreased the accuracy.
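The per-classifier effect of the round robin approach follows directly from the accuracies quoted above and can be checked with a few lines of Python. The dictionary and variable names are illustrative; the numbers are the ones listed in the text.

```python
WITHOUT_ROUND_ROBIN = {
    "LVQ1": 74.62, "LVQ2": 82.29, "LVQ2.1": 86.55, "FNLVQ": 82.25,
    "FNLVQ-MSA": 83.66, "FNLVQ-PSO": 92.54, "GLVQ": 88.09, "FNGLVQ": 94.07,
}
WITH_ROUND_ROBIN = {
    "LVQ1": 74.78, "LVQ2": 86.75, "LVQ2.1": 98.04, "FNLVQ": 84.5,
    "FNLVQ-MSA": 86.12, "FNLVQ-PSO": 90.43, "GLVQ": 98.12, "FNGLVQ": 94.31,
}

# Accuracy change in percentage points for each classifier.
gains = {name: round(WITH_ROUND_ROBIN[name] - WITHOUT_ROUND_ROBIN[name], 2)
         for name in WITHOUT_ROUND_ROBIN}

for name, gain in gains.items():
    print(f"{name:10s} {gain:+.2f}")
```

The largest gains are for LVQ2.1 (11.49 points) and GLVQ (10.03 points), the changes for LVQ1 and FNGLVQ are below 0.3 points, and FNLVQ-PSO is the only classifier that loses accuracy (2.11 points).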
As mentioned in the previous section, Imah et al. developed an adaptive multilayer generalized learning vector quantization (AM-GLVQ), which integrates feature extraction and classification for arrhythmia heartbeat classification (Imah et al., 2012). In the study, Imah et al. also investigated the performance of the classifiers on a dataset with an unknown class and compared it with the normal case (without an unknown class). The authors used the MIT–BIH arrhythmia dataset, as used by Setiawan et al. There were 14 classes: the 12 classes explained in subsection A, with the addition of atrial escape beat (AE) and supraventricular premature beat (SP).
There are two scenarios: without an unknown class and with an unknown class. In this evaluation, AM-GLVQ is compared to backpropagation, GLVQ, LVQ, and SVM. The experiment results are shown in Figure 35. In both scenarios, AM-GLVQ achieves the highest accuracy among all the algorithms. AM-GLVQ reached 95.16 and 95.04% accuracy for the dataset without an unknown class and with an unknown class, respectively. As shown in Figure 35, the accuracy of many classifiers decreased in the scenario with the unknown class compared to the scenario without it. The reduction in accuracy varies from 3 to 4%. However, AM-GLVQ maintains its performance in the scenario with the unknown class.
This sub-section summarizes the performance of ECG compression from previous studies. To evaluate the performance of ECG compression, we used the percentage root-mean-square difference (PRD). The sub-section covers the 2D SPIHT proposed by Isa et al., the 3D SPIHT proposed by Isa et al., and the 2D SPIHT on an embedded device proposed by Jati et al. (Isa et al., 2012b, 2014; Jati et al., 2014). 2D SPIHT was tested on the MIT–BIH dataset, whereas 3D SPIHT was tested on the INCART dataset.
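For reference, the two evaluation quantities can be sketched in Python. The PRD below uses the common definition without mean removal; whether the cited studies subtract the signal mean or a baseline first is not stated here, so that choice is an assumption, as are the function names.

```python
import math

def prd(original, reconstructed):
    """Percentage root-mean-square difference between two equal-length signals."""
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    signal_energy = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(error_energy / signal_energy)

def compression_ratio(original_bits, compressed_bits):
    """Ratio of the original size to the compressed size."""
    return original_bits / compressed_bits

perfect = prd([3, 4], [3, 4])     # identical signals
worst = prd([3, 4], [0, 0])       # everything lost
ratio = compression_ratio(16000, 1000)
```

A PRD of 0 means a lossless reconstruction, and lower values indicate a smaller reconstruction error at a given compression ratio.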
The PRD values of the compression algorithms are shown in Table 2. Table 2 shows that 2D SPIHT and 3D SPIHT achieve good performance: their PRD values are less than 5, with compression ratios of 16 and 14 for 2D SPIHT and 3D SPIHT, respectively. The 2D SPIHT implemented in the embedded device has acceptable PRD values up to a compression ratio of 32. Above this ratio, the PRD value is more than 10.
In addition to the classification and compression, the responsiveness of the telehealth system was also evaluated in the previous study. The ECG telehealth server provides the following URLs: /RegisterDoctor, /RegisterPatient, /LookHistory, /UploadHistory, /GetHospitalData, /GetDoctorData, /VerifyHistory, /GetUnverifiedHistory, /RegisterAffiliation, /RegisterHospital, and /GetDoctorAffiliation. Each URL corresponds to a different function and service. In the previous study, we measured the responsiveness of each service.
The response times for the ECG telehealth services are shown in Table 3. In the table, the mean response time varies from 129.6 ms (register patient) to 556.4 ms (upload history). Upload history has the longest response time because this service is used by the patient to upload heartbeat data, which is relatively larger than the patient’s identity data. The average service response time is 251.3 ms.
Jatmiko et al. developed the architecture of FNGLVQ and GLVQ in the FPGA. The architecture of the classifiers and the state diagram are explained in the previous section. The classifiers that were implemented in the FPGA were evaluated using the MIT–BIH dataset. However, only five subsets of the MIT–BIH dataset were used in this experiment. The goal of the experiment was to measure the performance of the classifiers in the FPGA and compare it to the high-level language (MATLAB) version.
The result of the experiment is shown in Figure 36, which shows similar trends in all sets. The FPGA implementation of FNGLVQ has a higher accuracy than the FPGA implementation of GLVQ. However, its accuracy is lower than that of the MATLAB version of FNGLVQ. Overall, the GLVQ implementation in the FPGA achieved 68.3% accuracy, whereas the FNGLVQ implementation in the FPGA achieved 75.16% accuracy. The MATLAB implementation of FNGLVQ achieved 81.41% accuracy.
We have elaborated the progress and challenges in developing a Tele-ECG system for early detection and monitoring of heart disease. This includes the data acquisition, feature extraction, data compression, classification algorithm, web and mobile system implementation, and micro device implementation on the FPGA board. The Tele-ECG system provides a new form of interaction between doctors and patients by exploiting technology. It helps to monitor heartbeat anomalies and reduces the risk of heart attack. Early detection of heartbeat anomalies should be conducted so that the risk can be mitigated, which, in turn, can reduce mortality from heart attacks.
Several issues regarding Tele-ECG should be considered and explored, including security, privacy, and the effect of embedding a new device in a patient. In addition, researchers should consider the adoption of new technology in the medical field. Overall, Tele-ECG is a promising technique to diminish the risk of heart failure by giving an early warning, speeding up the diagnosis, improving patient treatment and long-term outcomes, and alleviating patient discomfort and travel time.
Nowadays, data growth is exceptional in every area, including the medical field. Based on specific medical data, anyone could learn patterns and draw conclusions regarding someone’s life. Therefore, security and privacy in telehealth systems should be considered. They involve several security risks, such as medical entity breach, appointment data privacy, and patient privacy. In future work, we will explore ECG data encryption in depth and employ differential privacy methods for medical records.
This study is supported by Insentif Riset Sistem Inovasi Nasional (INSINAS) Research Grant No. 5459/UN2.R3.1/HKP05.00/2018, entitled ‘Pengembangan Sistem Telehealth Cerdas Terintegrasi Berbasiskan Perangkat Portable dan Big Data Platform untuk Meningkatkan Pelayanan Kesehatan di Indonesia’ from the Ministry of Research and Higher Education, Republic of Indonesia.